There are distinct historical phases in the nature of formal testing. These phases provide us with
another useful way to categorize formal tests and compare them. The phases are, not surprisingly,
linked to the view of language and language learning that had most currency at a particular time.
The tests from each phase are referred to as first, second and third generation tests.
First generation tests are those broadly associated with the grammar-translation approach to
language learning. Candidates are asked to complete various tasks such as compositions,
translations, or simple question and answer activities devoid of context, e.g. "Write about a
holiday you enjoyed (200 words)", with no indication of who the text is for or why it is being
written. The question types are probably not authentic either: you do not usually translate large
chunks of literature unless this is your job.
The question types aim to elicit integrative language, that is, language that requires a wide
range of language abilities. A composition, for example, will test grammar, vocabulary,
punctuation, spelling and discourse structure, so these tests subsume the testing of both accuracy
and fluency.
These testing techniques lead to subjective scoring, which requires an experienced tester to
judge the sample of language against their knowledge and experience of other similar samples.
This can lead to problems of reliability in marking.
The degree of agreement between two examiners about a mark for the same language sample
is known as inter-rater reliability. The degree of agreement between a single examiner's
markings of the same language sample on two separate occasions is known as intra-rater
reliability. Both inter- and intra-rater reliability are low in first generation tests. With first
generation test formats, it is common for two different examiners to mark the same test in very
different ways, or for the same examiner to mark the same test differently on two different
occasions, and these test types are severely criticized for this unreliability.
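The two kinds of agreement defined above can be illustrated with a small calculation. The following is only a minimal sketch: the function name `agreement_rate`, the one-mark tolerance and all of the marks are hypothetical, and simple agreement within a tolerance is just one of several possible ways of expressing rater reliability.

```python
def agreement_rate(marks_a, marks_b, tolerance=0):
    """Share of scripts on which two sets of marks differ by at most `tolerance`."""
    if len(marks_a) != len(marks_b) or not marks_a:
        raise ValueError("both mark lists must be non-empty and the same length")
    close = sum(1 for a, b in zip(marks_a, marks_b) if abs(a - b) <= tolerance)
    return close / len(marks_a)


# Hypothetical marks out of 20 for the same five compositions.
examiner_1 = [12, 15, 9, 18, 14]
examiner_2 = [10, 15, 13, 17, 11]        # a different examiner
examiner_1_later = [13, 15, 9, 16, 14]   # the first examiner on a later occasion

# Inter-rater: two examiners, same scripts.
inter = agreement_rate(examiner_1, examiner_2, tolerance=1)
# Intra-rater: one examiner, same scripts, two occasions.
intra = agreement_rate(examiner_1, examiner_1_later, tolerance=1)

print(f"inter-rater agreement: {inter:.0%}")
print(f"intra-rater agreement: {intra:.0%}")
```

With these invented marks, the two examiners agree to within one mark on two scripts out of five, while the single examiner is consistent on four out of five. A lower figure in either comparison is the kind of unreliability that first generation formats are criticized for.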
It was the reliance on subjective marking, and the associated problems of reliability, that led to
the development of the next generation of tests.
Later versions of second generation tests aimed to redress this problem by evolving techniques
that were both objective and integrative, such as the cloze test: a text from which words are
removed, either randomly or according to some linguistic criterion, and which the candidate must
complete. The cloze test is both objective and integrative, drawing on a wider range of language
abilities.
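The cloze technique described above can be sketched in a few lines. This is an illustration under stated assumptions rather than a standard procedure: it deletes every nth word at a fixed interval as a simple stand-in for the random or criterion-based deletion mentioned in the text, and the names `make_cloze` and `score_cloze`, the sample passage and the exact-word scoring rule are all hypothetical.

```python
def make_cloze(text, every_nth=5, lead_in=0):
    """Blank out every `every_nth` word after the first `lead_in` words.

    Returns the gapped text and the list of removed words (the answer key).
    """
    words = text.split()
    gapped, answers = [], []
    for i, word in enumerate(words):
        position = i - lead_in + 1  # 1-based word count after the lead-in
        if position > 0 and position % every_nth == 0:
            answers.append(word)
            gapped.append(f"({len(answers)})____")
        else:
            gapped.append(word)
    return " ".join(gapped), answers


def score_cloze(answers, responses):
    """Exact-word scoring: one point per response that matches the key."""
    return sum(
        1
        for key_word, given in zip(answers, responses)
        if key_word.strip(".,;:!?").lower() == given.strip().lower()
    )


passage = "The train to the airport leaves every ten minutes from the main station"
gapped_text, key = make_cloze(passage, every_nth=4)

print(gapped_text)
print(key)
print(score_cloze(key, ["the", "two", "Main"]))
```

Because every response is checked against a fixed key, any marker (or a machine) arrives at the same score, which is what makes the format objective. Completing each gap still draws on grammar, vocabulary and understanding of the surrounding text, which is what makes it integrative.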
Second generation test formats are not authentic either: real-world language use does not
normally extend to multiple choice conversations! The over-use of second generation testing
techniques can lead to mechanistic teaching of discrete language items, and to very little
language use.
The testing of integrative language, with the use of both objective and subjective testing formats,
has come together in third generation tests. These are the tests that have come on the back of
developments in communicative language teaching (CLT). Just as CLT strives to emulate real
language use, so communicative tests aim to do the same, and consist of test items that reflect
real language use.
One of the main issues in communicative language testing is the definition of communicative
language ability, i.e. the theoretical base on which to build a communicative test. Recent models
of communicative language ability propose that it consists of both knowledge of language and the
capacity for implementing that knowledge in communicative language use.
Examples of communicative language testing tasks include reading an authentic text and
transferring information from it, such as correcting some notes taken from it; writing a note with
instructions about some aspect of household organization; listening to an airport announcement
to find the arrival time of a plane; or giving someone spoken instructions for how to get to a
house.
Third generation tests can be compared against the strengths and weaknesses of the previous two.
Both the texts used and the tasks set aim to be authentic, and because they are authentic, all
third generation techniques are contextualized by their very nature. Candidates are asked to do
tasks which have a clear reference in reality. Third generation tests also assess integrative
language: the nature of the tasks in speaking and writing demands integrative language use.
Similarly, techniques for listening and reading demand global (integrative) comprehension as well
as comprehension of discrete items.
Finally, since communicative testing of the productive skills gives rise to samples of integrative
language that have to be assessed subjectively, much effort has been put into achieving greater
inter- and intra-examiner reliability. Often more than one assessor is used, and assessors attend
retraining meetings where they discuss the interpretation of descriptors of performance against
real samples of tests.
We will return to many of the points made briefly in this section, since they require deeper
coverage. This section should, however, provide some historical context for how formal testing
operates, whether the tests are second or third generation. Tests are not normally watertight
representations of one generation of testing; usually techniques are mixed and matched. The
strengths and weaknesses of the second and third generation tests make them suitable for
different testing purposes.
WEEK 02 - Material 3