You are on page 1of 132
Testing for Language Teachers Testing for Language Teachers SECOND EDITION ARTHUR HUGHES e Cambridge Language Teaching Library CAMBRIDGE CAMBRIDGE Drs ena Testing for Language ‘Teachers Second Ealion CAMBRIDGE LANGUAGE EEACHING TUSK ARY A sete anergy ngage te anal arm By sts whe have eapert Konrad tv kl In this tee Affect im Language Learning edited by Jame Arnold Approaches and Methods in Language Teaching Second Edition by Jack Richards and Theodore S. Rodgers ‘Beyond Training by Jack C. Richards ‘Classroom Decision Making edited by Michael Breen and Andrew Litlejob Collaborative Action Research for Eplish Language Teachers by Ame Burns Collaborative Language Learning and Teaching edited by David Neonat Communicative Language Teaching by Wiliam Littlewood Designing Tasks for the Communicative Classroom ly David Ninn Developing Reading Skills y Frongoise Grellet Developments in English for Specie Purpose by Tomy Dudley-Evane and ‘Maggio Jo St John Discourse Analysis for Language Teachers by Michael McCarthy Discourse and Language Education by Fvelys Hatch “The Dynamics ofthe Language Classroom by lan Tudor English for Academie Purposes by R.R. Jordan English for Specie Perpones by Tom Hlatchinson and Alan Waters Establishing Self-Actess by David Garder and Lindsay Miller Foreign and Second Language Learning by Willa Litlweood Language Leatng in Intercaleural Perspective edited by Michael Byram and Michael Fleming ‘The Language Teaching Matrix by Jack C. Richards ‘Language Test Construction and Esaluation by J Charles Alderson, ‘Caroline Clapham, and Dianne Wall ‘Learner-entrednes a Language Edcation by Lam Tudor ‘Managing Curricular Innovation by Naa Markee “Materials Development in Language Teaching edited by Brian Tomlinson “Motivational Strategies in the Language Clastoom by Zoltan Dorms: Psychology for Language Teachers Ly Mavion Wills and Robert I. Burden Research Methods ia Language Learning by David Nuean Second Language Teacher Education edited by Jack C. Richards and David Nase Society and the Language Classroom edited by Hywel Coleman ‘Teaching Languages to Young Learners by Lyrae Cameron Teacher Learning in Language Teaching dived by Donald Freeman and “Jack C Richards Understanding Rescarch in Second Language Learning by James Deam Broun Vocabolary: Description, Acquistion and Pedagogy edited hy Norbert Scott and "Michael McCarthy Vocabulary, Semantics, nd Language Education by Evelyn Hatch and (Chery Browne Voices From the Language Clasroom edie by Kathleen M. Bailey and avid Nico Testing for Language Teachers Second Edition Arthur Hughes UNIVERSITY PRESS 8 CAMBRIDGE Pao Yue-Kong Library PolyU + Hong Kong ‘The Pitt Building, Trumpington Street, Cambridge, United Kingdom ‘The Edinburgh Building, Cambridge crs anv, UK 40 West 2oth Street, New York, nvi00r 14211, USA 477 Williamstown Road, Port Melbourne, vic $307, Australia Ratz de Alarcon 15, 28014 Madrid, Spain Dock House, The Waterfront, Cape Town Scor, South Aftiea buepsfww-cambridgecong (© Cambridge Universy Press 1989, 2005, He ri bok isin ope. sebjsaay exception and to the provisions of relevant collective licensing agreements, W06 Jao reproduction of sy pare may tale place who the writen permission of Cambridge University Press. First published 1988 Second edition 2603, Printed in the United Kingdom at the University Press, Cambridge ‘Typeface Sabon 10.shx2. System QuarkxPress® [opKt] A catalogue record for this book is available fram the British Library Ian o 528 823250 hardback ISBN 0 521 484952 paperback For Vicky, Meg and Jake Contents Acknowledgements Preface | Teaching and testing 2 Testing as problem solving: an overview of the book Kinds of tests and testing. 4 Validity 5 Reliability Achieving beneficial backwash Stages of test development 8 Common test techniques 9 Testing writing 10. Testing oral abiliey 1 Testing reading 12 Testing lise 13 Testing grammar and vocabulary 14 Testing overall ability 15 Tests for young learners page ix 8 26 36 33 58 75 83 a3 6 172 186 199 Contents 16 ‘Test administration | Append 1 The statistical analysis of test data Appendix 2 Item banking Appendix 3 Questions on the New Zealand youth hostels passage Bibliography Subject Index Author Index ats 234 256 237 246 ase Acknowledgements TThe publishers and I are geateful co the authors, publishers and others. soho have given petiission for the use of copyright material identified in the text. It has not been possible to identify, or trace, sources of all the materials used and in such cases the publishers would welcome information from copyright owners. ‘American Council on the Teaching of Foreign Languages Inc. for extracts from ACTF Proficiency Guidelines: ARELS Examination Trust for extracts from examinations; A. Hughes for extracts from the New Bogazici University Language Proficiency Test (1984); ‘Cambridge University Press for M. Swan and C. Walter: Cambridge English Course 3, p16 (1988); Filmscan Lingual House for M. Garman and A. Hughes for English Cloze Exercises Chapter $ (1983); The Foreign Service Institute for the Testing Kit pp. 35-8 (1979), Hia per & Row; The Independent for N. Timmins: ‘Passive smoking comes tunder fire’, 14 March 1987; Language Learning, and J. W. Oller Jr and C. A. Conrad for the extract from ‘The cloze technique and ESL proficiency’; Macmillan London Ld for Colin Dexter: The Secret of Nnneve: ¥ (1986); The ©Guardian for §, Limb: ‘One-sided headache’, 9 October 1983; The Royal Society of Arts Examinations Board) University of Cambridge Local Examinations Syndicate (UCLES) for extracts from the examination in The Communicative Use of English as a Foreign Language; UCLES for the extract from Testpack 1 paper 35 UCLES for extracts from the Oxford Examination English as a Foreign Language, CSE, FCE, aud Young learners, ARELS handbooks and papers, Interagency Language Roundtable Speaking Levels, Oxford University Press for extract from Speaking by Martin Bygate © Oxford University Press 1987; TOEFLO. materials are reprinted by permission of Educational Testing Service, the copyright owner. However, che test questions and any other testing information are provided in their entirety by Cambridge University Press. No endorsement of this publication by Educational Acknorwledgements Testing Serie should be infeed Cle test by Chine Kein Bea and Ulich Raat 198, ssl y pemiion of Conan a area Kommosy Reena stoma Abi agen mad Sn nels Hansen ntonnal Poe a ‘rua and Hamnab i NFABDAQA exert ge 202 dluced by permission ofthe Assessment and Qualifeaione Allawoa” Preface The objective ofthis book isto help language teachers write better tests. Ictakes the view that tet construction is essentially a matter of problem solving, with every teaching situation setting a different testing problem. In order to arrive atthe best solution for any particular problem ~ the most appropriate test or testing system ~ it is not enough to have at ‘one's disposal a collection of test techniques from which to choose. I is also necessary to understand the principles of testing and how they can bre applied in practice. Ieis relatively staightforward to introduce and explain the desirable qualities of tests: validity, reliability, practicality, and beneficial back- wash; this last referring to the favourable effects testing can have on teaching and learning. Its much less easy to give realistic advice on how tw achieve them in teacher-made tests. One is tempted either to ignore the issue of to present as a model the not always appropriate methods ‘of large-scale testing organisations. In resisting this temptation, I have made recommendations that I hope teachers will find practical but ‘which I have also tried to justify in terms of language testing theory. Exemplifcation throughout the book is from the testing of English as 4 forcign language. This reflects both my own experience in. language testing and the fact that English will be the one language known by all readers. I trust that it will not prove too difficult for teachers of other languages to find or construct parallel examples of their own, Because the objective and general approach of the book remain those of the fist edition, much of the text remains. However, I have made changes throughout. These include the inclusion of greater detail in the ‘writing of specifications and the provision of outlines of training pro- ‘grammes for interviewers and rater, In response the now widespread Availabilty of powerful and relatively inexpensive computers, as well as ready access tothe Incernet, Ihave made extensive reference to resources fon the Ineernct and have written a completely new chapter on the stats tical treatment of test data, using an inexpensive program which is avail- able to readers via the book's own website www.cambridge.orgeett. Preface Links oll fhe meses menond in he Bok canbe found sere he increasing tendemy duoughout the world for language learning, (and testing) to begin ax primaryechool has ed me include a chapter on the testing of young leamncrs. While one might have resevations bout such testing (as opposed to other forms of assessment), since testing does take place, it seemed better to offer advice in the area rather than Simply ignore the issue, . publcaton of heft dite hasbeen he reat inceae ne name Of published articles and books. Many ~ perhaps most ~ of the articles have been of theoretical or techie! artare, no deeyelevant £0 the concerns of language teachers. Even where ther relevance is clear, inowder tn herp the seer aceretle to newrnmars tthe field, have Usually restricted references to them to the Turther reading section. ‘These Secons are intended to act as apie for those readers who wish to go more deeply into the isues raised inthe book, and also o provide an outline of the state of language texting today. They alo contain smmentions of muner of ec ans hh an ase lars) in preater depth tans possible inte present lume ‘imu acknowlsge the contributions of others: MA. and research smidente at Reading University, f00 numerous to mention by name, who have taught me much, suallyby asking question that {found aifcut to answer, my frends and colleagues, Paul Hache, Michel Garman, Don Portes, Tony Woods, who all read parts of the manuseripe of the first edvion and made many helpfal suggestions; Angela Hasselgren wh shared hugh onthe xing of og eres a pone ne ‘with copies ofthe materials used in the Norwegian EVA projet; my Tends Cy Weir and Russa Hoeayn, with whom e collaborated seeing prs me at ay my yo de te freed dieeaewades es 1 Teaching and testing Many language teachers harbour a deep mistrust of ests and of testers. The starting point for this book is the admission thar this mistrust is frequently well-founded. Ie cannot be denied that a great deal of lanuage testing is of very poor quality. Too often language ress have a hhaninfal effect on teaching and learning, and fail to measure accurately Whatever it is they are intended to measure, Backwash Te effect of testing on teaching and learning is known as backwash, sand can be harmful or beneficial Ia testis regarded as imporcant if the ‘takes are high, preparation for it can come to dominate all reaching wad earning activities. And ifthe tect content and resting rechniques are ae variance with the objectives ofthe course, there i likely to be harmful backwash, An instance of this would be where studenes are following an Tnglish cots that is meant to train them inthe language skills (includ ing: writing) necessary for university study in an English-speaking Country but where the language test that they have to take in order 00 fevadmiteed to a university does not test those skills ditecty. I the okill Pf writing, for example, is tested only by multiple choice items, then {here is great pressure to practise such items rather chan practise the skill of writing itself. This is clearly undesirable.) ‘We have just looked at a case of harmful backwash. However, back- swash can be positively heneficial. 1 was once involved in the develop- Trent of an English language tes for an English medium university 1 a ftom English-speaking country. The test was to be administered at the vad of an intensive year of English study there and would be used te steering which students would be allowed to go on to their under- faduate courses (taught ia English) and which would have o leave the Braversity. A rest was devised which was based directly on an analysis Of the English language needs of frst year undergraduate students, and Testing for language teachers which included tasks as similar as possible ro those which they would have to perform as undergraduates (reading textbook materials, taking shoies during lectures, and so on). The introduction of ths test in place of one which had been entirely ‘multiple choice, had an immediate effect on teaching: the syllabus wae redesigned, new books were chosen, classes were conducted different, ‘The result ofthese changes was that by the end of their year’s training, in circumstances made particularly difficult by greatly. increased rumbers and limited resources, the students reached a much higher standard in English than had ever been achieved in the university's history. This was a case of beneficial backwash. Davies (1968:5) once wrote that ‘the good testis an obedient servant since it follows and apes the teaching’. I find it difficult to agree. and ppethaps way Davies would as well. The proper relationship between ‘caching and testing is surely that of partnership. It is true that there may be occasions when the teaching programme is potentially good and appropriate but the testing is not; we are then likely to suffer from hhatmful backwash. This would seem to be the situation that led Davies in 1968 to confine texting to the role of servant tothe teaching. Dt ‘equally there may be occasions when teaching is poor or inappropriate annd when testing is able to exert beneficial influence: We consee expect testing only to follow teaching, Rather, we should demand of it that it is supportive of good teaching and, where necessary, exetts a corrective influence on bad teaching. If testing always had a beneficial backwash on teaching, ie would have a much better reputation among teachers. Chaprer 6 ofthis book is devored co a discussion of how bene. ficial backwash can be achieved. one 2s thing to be said about backwash in the present chapter i that it ean be viewed as part of something more general ~the impact of assessment. The term impact, as ite ern eccadonal memereorenn és not limited to the effects of assessment on learning and teaching but extends to the way in which assessment atfects society asa whole, and hhas been discussed in the context of the ethics of language testing (see Further Reading). Inaccurate tests ‘The second reason for mistrusting tests is that very often they fail to -tucasure_accurately whate they are intended to measure, ‘Teachers know this. Students” true abilires arr-not always reflected in the test scores that they obtain. To a certain extent this is inevitable, Language abilities are not easy to measure; we cannot expect a level of ‘Teaching and testing accuracy comparable to those of measurements in the physical sciences, thu we Gan expect sreaer accuracy than is frequently achieved. ‘Why are tests inaccurate? The causes of inaccuracy (and ways of minimising their effects) are identified and discussed in subsequent Chapters, but a short answer is possible here. There are two_main accuracy. The first of these concerns test content and test fechniques. To return to an earlier example, if we want to know how Well somcoie can write, there is absolutely no way We can get a really Accurate measure of thc aii by means of 3 mule choice et. Professional testers have expended great effort, and not a little money, in attempts to do it, but they have always failed. We may be able to get {mn apptoximate measure but that al, When testing cased out on 4 very larne scale, when the scoring of tens of thousands of composi- tons might aot com to ea practi proponon is understandable that potentially greater accuracy is sacrificed for reasons of economy tind convenience. But this does not give testing a good name! And it docs seta had example ‘While few teachers would wish co follow that particular example in fonder to test writing ability, the overwhelming practice in large-scale testing of sing mile choice tems does lad oimitaton in crcum Stances where such items aze not at all appropriate. What is more, d imitation tends to be of a very poor standard. Good multiple choice tems are notoriously iff o wee. A great deal of time and efor {as t0 BO into their construction. Too many multiple-choice tests are wen hee he cena casa ex genTe ee Js.a set of poor items that cannot possibly provide accurate measure: tment One othe principal aims ofthis book isto discourage the use of inappropriate techniques and to show that teacher-made tests can be superior in certain respects to their professional counterparts. "The scconed source of inaccuracy iz lack of reliability. This is a cech~ nical term that is explained in Chapter 5. For the moment its enough to say that a testis reliable iit measures consistently. On a reliable test You an be conden tha someone wil et more orks the sare so, ‘whether they happen to take it on one particular day or on the next, ‘heres on tunable the weve gue Hkly tbe contra Uilferent, depending on the day on which iti taken. Unrehabibty has two origins, The first is the interaction between the person taking the test and features ofthe test itself. Husman beings are not machines and wwe therefore canriot expect them to perform in exactly the same way on toro different occasions, whatever test they take. As a result, we expect Some variation in the scores a person gots on atest, depending on when they happen to take it, what mood they are in, how much sleep they had the night before. Fiowever, what we can do is ensure that the tests 3 ‘esting for language teachers themselves don't increase this variation by having unceae instructions, isbiguows questions or items that revul fa puesring on the part of the tert takers, Unless we minimise these features, we cannot have conf dines inthe scores that people obtain on text. “The sezond orgin of unreliablicy i o be found inthe scoring of a ‘cit Scoring can be unreliable in that equivalent test performances are accorded significantly different scores. For example, the same composi fiom may be piven very alfferen scores by diferent makers (or cren the same marker on diffrent occasions). Fortunately there ae ways Of minimising such eiferences in scoring, Most (but noe all) large testing ‘Organisations, to their eed take every precaution to make their tests, Snu the scoring of them, a lable as posible, and are generally highly Sccessfl in thi respect Small-scale esting, on the other hand tend 0 ii rete ian shold ener of hs ks es fo Show how to achieve greater ciability in testing. Advice on this i t0 found in Chapter 5- vinrenns wore ‘The need for tests So far this chapter has been concerned with understanding why tests are so mistrusted by many language teachers, and how this mistrust is ‘often justifed. One conclusion drawn fom this might be that we ‘would be better off without language tests. Teaching is, afterall, the primary activity: Hf testing comes in conflice with it chen itis testing that should go, especially when it has been admitted that so much testing provides inaccurate information. However, information about people's language ability is often very useful and sometimes necessary Itisdiffcultto imagine, for example, British and American universities accepting students from overseas without some knowledge of their proficiency in English, The same ie true for organisations hiring inter- preters or translators. They certainly need dependable measures of Tanguage ability. Within teaching systems, too, so long as tis thought Appropriate for individuals to be given a statement of what they have achieved in a second or foreign language, tests of some kind or another will be needed. They will also be needed in order to provide informa- tHon about the achievement of yroups wf leauness, without which it is difficult to see how rational educational decisions can be made. While for some purposes teachers informal assessments oftheir own students are both appropriate and sufficient, this is not true for the cases just ‘mentioned, Even without considering the possibility of bias, we have £0 recognise the need for a-common yardstick, which tests provide, in -ordet to make meaningful comparisons. 4 “Teaching and testing ‘Testing and assessment The focus of this book is on more or [es Formal testing. But testing is 1 oF course, the: only way in which information about people's (image ablity can be gathered. Its just one form of asessment, and iter methods wll often be more appropriate, Is helpful here ro make cl the ference erween formate and smmutbe awesment ‘Xetumene is frmative avhen teachers use it fo check on the progress Aver students, to see how far they have mastered what they should Ihave learned, and then use this origation to modify their future reach Ing plans Such assessment can aso be the basis for feedback to-the 108 Prt aformal tests oF quizzesmay have a part to play i formative aaeermait but 20 will simple observatony(of performance on learning STE ae) ani sey of penfoliogthat students have made GPRS hee Students chemselves may be encouraged 10 carry out elf- Soest in order to nontor thir progress and then modiy hei ‘own learning objectives. sarang azvestnent used at, say, the end of the term, semester, ‘Warn order to measure what has been achieved both by groups and Perea, Here tor the reasons given in the pervious section. (tral tetsjare usualy called for. However the results of such tests (a not Be fooked at in isolation. A complete view of what bas been vnpieved should include information from as many sources as posible Side world, the diferent pices of information from al sources, Ir iulng formal tests, shoul! be consistent with each other. hey are sens edible sources ofthese dscrepaiiesnced so be invetigate. What is to be done? | lieve that the teaching profession can make three contributions ¢0 ti improvemene of testing! they cam wrte better ests chenselves they Linn enlighten other people who ate involved in testing processes} and they can put pressure on professional testers and examining boards, to improve thei tests. This book aims to help them do all three. The fist aig easily understood. One would be surprised if a book with this {idle did nor artempe co help teachers write better tests. The second aim {s perhaps lese obvious, [tis based on the belief that the better all of the lkcholders in a test or testing system understand testing, the better the ‘eating willbe and, where relevant, che better it willbe tegrated with teaching, The stakeholders I have in mind include test takers, teachers, fest waters, school or college administrators, education authorities, SSimining bodiee and texting inettutions. The more they interact and 5 ‘Testing for language teachers ‘cooperate on the basis of shared knowledge and understanding, the better and more appropriate should be the testing in which they all have a sake. Teachers are probably in the best position to understand the issues, and then to share their knowledge with others. For the reader who doubts the relevance of the third aim, let this chapter end with 4 further reference to the testing of writing through multiple choice items. This was the practice followed by those respons- ible for TOEFL. (Test of English as a Forcign Language) — the test taken by most non-native speakers of English applying to North American universities. Over a period of many years they maintained that it was simply not possible to test the writing ability of hundreds of thousands, of candidates by means of a composition: it was impracticable and the results, anyhow, would be unreliable. Yer in 1986 a writing test (Test ‘of Written English), in which candidates accaally have to write for thirty minutes, was introduced as a supplement to TOEFL. The pein- ‘cipal reason given for this change was pressure from English language teachers who had finally convinced those responsible for the TOEFL of the overriding need for a writing task that would provide beneficial backwash. Reader activities 1, Think of tests with which you are familia (the tests may be inter- national oF local, written by professionals or by teachers). Whar do you think the backwash effect of each of them is? Harmful or bene- ficial? What are your reasons for coming to these conclusions? 2. Consider these tests again. Do you think that they give accurate or inaccurace information? What are your reasons for coming to these ‘conclusions? Further reading Rea-Dickens (1997) considers the relationship between stakeholders in language testing and Hamp-Lyons (1997) raises ethical concerns relat- ing to backwash, impact and validity. These two papers form part of a special issue of Language Testing (Volume 14, Number 3) devoted to ethics in language testing. For an early discussion of the ethics of language testing, sce Spolsky (1981). The International Language Testing Association has developed a code of ethics (adopted in 2000) which can be downloaded from the Internet (sec the book's website). Kunnan (2000) is concerned with fairness and validation in language Teaching and testing testing. Rea-Dickens and Gardner (2000) examine the concept and practice of formative assessment. Alderson and Clapham (1995) make ‘commendations for classroom assesment. Brown and Hudzon (1998) resent teachers with alterative ways of assessing language. Nitko {1989) offers advice on the designing of tests which are integrated with ‘nsteuction. Ross (1998) reviews research into self assessment. DeVicenzi (1995) gives advice to teachers on how to learn from standardised tests. Gipps (1990) and Raven (1991) draw attention tothe possible dangers of inappropriate assesment. For an account of how the introduetion of new test ean have a striking beneficial effec on reaching and learning, see Hughes (1988). Testing for language teachers matters and for successful problem solving. In the chapters on validity and reliability, simple statistical norions are presented in terms that it is Iiuped everyone should be able to grasp. Appendix 1 deals in some detail with the statistical analysis of test results. Even here, however, the temphasis is on interpretation rather than on calculation. In fac, given the computing power and statistics software that is readily available these days, there is no real need for any calculation on the part of language testers. They simply need to understand the ouput of the computer programs which they (or others) use. Appendix 1 attempts to develop this understanding and, just as important, show how valuable statistical information can be in developing better tests. Further reading The collection of critical reviews of nearly $0 English language tests (mostly British and American), edited by Alderson, Keahnke and Stansfield (1987), reveals how well professional test writers are thought to have solved their problems. A full understanding af the reviews wil depend to some degree on an assimilation of the content of Chapters 3, 4, and 5 of this book. Alderson and Buck (1993) and Alderson et ai (1995) investigate the test development procedures of certain British testing insticutions. 1. ‘Abilities is not being used here in any technical sense, Ie refers simply to ‘what people can do in, or with, 2 language. I could, for example, include the ability to converse fluently ina language, as well asthe ably to recite ‘grammatical rules if chat is something which we are interested in measur ig) Te does nor, howeves, refer co language aptitude, ee talent whieh people have in differing degres, for leaening languages. The measurement ‘ofthis talent in order to predict how well or how quickly individuals will Tear a foreign language, is beyond the scope of this book. The interested scader is referred 0 Pimsleur (1968), Carroll (1981), and Skehan (1986), Stemberg (1995), MacWhinney (1995), Spolsky (1995), Mislevy (1995), Matalin (1995), is of tests and testing “This chapter bogins by considering the purposes for which language testing i caried out, It goes on to make + number of distinctions between direct and indiret texting, between discrete point and integra: tive testing, between norm-ffereaced and criteion-eferenced testing, tnd between objective and subjective testing, Finally there are notes on Computes adap eng and common gage esting, Tess an be categorised according tothe types of information they provide, The categoraston wil prove wefl both in deciing whether In existng testis suitable fora particular purpose and in writing appro- pre ae tt hed ar scenery sac abe ch tve will discuss inthe following sections are: proficiency tests ahieve- mene tess, diagnostic tests, and placement tests Proficiency tests Profieny tes are designed o measure people’ bility ina ngage, regardless of any training they may have had in that language. The. content of a proficiency test, Heelers is aot based on the comient oF Objectives of language cow ‘People taking the test may have followed. Rather, cs based on a specification of what candidates have to be able to do in the language in order to be considered proficient. This raises the question of what we mean by the word ‘proficient. In the case of some proficiency tests, proficient” means having suffi ient command of the language fora particular purpose. An example of {iis would be ates designed to discover wether someone can funcion successfully asa United Nations translator, Another example would be atest used f0 determine whether a student's English is good enough to follow a course of study at a British university. Such a test may even attempt to take into account the level and kind of English needed to follow courses in particular subject areas. Ie might, for example, have ‘one form of the test for arts subjects, another for sciences, and so on. ‘Testing for language teachers ‘Whatever the particular purpose to which the language isto be put, this Ml be reflected inthe specfeation fest conent a a eal stage O 2 test’ development. “There are ther proficiency tex which, by contrast, do not have any cccupation or cours of study in mind. For them the concept of proficiency is more general, British examples of these would be the Cambridge Fist Certificate in English examination (FCE) and the Cambridge Certifeate of Proficiency in English examination (CPE). The function of soch tests 4s ta show whether candidates have reached a certain standard with respect to a set of specified abilities. The examining bodies responsible for such tests are independent of teaching institutions and so ean be relied on by potential employers, etc to make fair comparisons between Candidates from different insttitions and different countries. Though there ig no parscular purpose in mind for the language, these general proficiency tests should have detailed specifications saying just what it is that successful candidates Rave demonstrated that they can do. Each test should be seen to be based directly on these specifications. All users fof atest (teachers, students, employers etc.) can then judge whether the test is suitable for them, and can interpret tes results. Iris nor enough to have some vague notion of proficiency, however prestigious the testing body concerned. The Cambridge examinations referred to above are linked to levels in the ALTE (Association of Language Testers in Europe) framework, which draws heavily on the work of the Council of Europe (see Further Reading) Despite differences between them of content and level of difficulty, all proficiency tests have in common the fact that they are not based on ‘courses that eandidases may have previously taken. On the other hand, as we saw in Chapter 1, such tests may themselves exercise considerable influence over the method and content of language courses. Their back swash effece - for thie ip what i ir - may be heneficial or harmful. In my’ view, the effect of some widely used proficiency tests is more harmful than beneficial. However, the teachers of students who take such tests, and whose work suffers from a harmful backwash effect, may be able to exercise more influence over the testing organisations concerned than they realise, The supplementing of TOEFL with a writing test, referred ton Chapter 1, is a ease in point. ‘Achievement tests ‘Most teachers are unlikely to be responsible for proficiency tests. It is ‘much more probable that they will be involved in the preparation and tuse of achievement tests, In contrast to proficiency tests, achievement Kinds of tests and testing ses are directly related to language courses, their purpose bing_t0 ‘SSMEsh how suseesfl individeal students, groups of students, or the “SERS Ulemsclves have Been in achieving objectives. They ae of two sevement tests and progress achievement Ts. "rial agbievernent tests are those administered atthe end ofa course of study They may be written and administered by ministries of educa- Ton, oficial examining boards, or by members of teaching institutions cit i content of there teste must be related to the courses with ‘chic they are concerned, but the nature ofthis relationship is a matter Df dlsagreement amongst language testers. In the view of some testers, the content ofa final achievement test, shouldbe based diecly on a detailed course sllabus or on the books nd other materials used, This has been referred to as the. syllabus- aie et Sppraach, It hasan obvious appert since the vest only contains att sthougt har he students actual encoun nd hon “in be considered, in this respect atleast fittest. The disadvant ‘hate plaka bay designed, ofthe books and other ma Mare badlychosen, the results of a est can be very misleading, Sassou performance on the test may noe truly indicate soccessul SURSESie af couse objectives, For example, a course may have as an ‘Shyectve the develooment of conversational ability, but the cours elf Sind the test may require student only to ter carefully prepared state- iment about their home town, the weather, or whatever. Another Course may aim to develop a reading ability in German, but the test seat inn fll to the vocabulary the sulenre are knowen to have met. Ya another course is intended to prepare stadens for university study iMEnplsi, bur the slabs (and 50 the course and the test);may not include listening (with note taking) to English delivered in eerre style aie opace ofthe Lind tha the students wil have to deal with at univer- Sis In each ofthese examples ~all of them based on actual cases ~ cst results will faul to show what students have achieved in terms of course onthe lt h “The altemative approach is 0. et objectives of the course, This has a number of advantages. First, it compels course designers to be explicit about objectives. Secondly, it

You might also like