
Language Assessment

Characteristics of a Good Test

What are the main characteristics of a good test?
1. Practicality
2. Validity
3. Reliability
4. Authenticity
5. Washback

In PAIRS or THREES, think of a test you have taken or constructed. As I go through the criteria, ask yourself how well the test meets each criterion.

Is the test PRACTICAL?
Practicality refers to the facilities available to test developers regarding the development, administration, and scoring procedures of a test.

How long is the test?
Is it easy to administer?
How many examiners do you need?
What equipment do you need (e.g. an audio-visual system)?

You need to take into account the CONTEXT in which the test is taken.

Is the test VALID?
• Face validity: Does it look like a proper test? (familiar tasks, do-able in the time limit, clear items, clear instructions, reasonable level of difficulty, correct language, no typos)

• Content validity: Does it test what it is supposed to? (e.g. if you want to test listening, should you ask students to write an essay?) Is the test related to the learning objectives in the syllabus or curriculum (constructive alignment)?

• Criterion-Related validity: Do student scores on this test correspond with scores they get on other (usually established and well-known) tests?

• Construct validity: Is the test based on an accepted theory of language proficiency and language learning? (see next slide)
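
One common way to check criterion-related validity is to correlate students' scores on the new test with their scores on an established test. The sketch below is only an illustration: the function name `pearson_r` and both score lists are hypothetical and do not come from the slides or the handout.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / (spread_x * spread_y)


# Hypothetical scores for the same six students on two tests
new_test = [55, 62, 70, 48, 81, 90]
established_test = [50, 65, 72, 45, 85, 88]

r = pearson_r(new_test, established_test)
print(f"Correlation with the established test: {r:.2f}")
```

A value close to 1 would suggest that the new test ranks students much as the established test does. What counts as "close enough" is a judgment call that depends on the context.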

Is the test RELIABLE?

• Inter-Rater reliability: Will different teachers marking the same student’s work give the same score?

• Intra-Rater reliability: Will the same teacher give the same score whether the test is marked first or last? Morning, noon or night? (fatigue)

• Test Administration reliability: Is the test administered under “fair” conditions to all test-takers (e.g. without noise or interruptions)?

An Experiment in Inter-Rater Reliability
http://www.youtube.com/watch?v=FtbrPGaINt0
Source: www.ehow.com

Step 1: Without Rubrics
See handout
Groups A, B and C: give a score out of 10
Groups D and E: give a grade A, B, C, D or F

Step 2: With Simple Rubrics
All Groups: Use the simple writing rubric and give a score.

Step 3: Reflection
See handout
1. Which way of scoring resulted in higher inter-rater reliability? Rubrics or no rubrics?
2. What are some things teachers can do to increase reliability?
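
For the reflection, one simple way to compare the two scoring methods is to count how often pairs of raters agree. The sketch below uses a basic pairwise agreement rate, not a formal statistic such as Cohen's kappa. The function name `pairwise_agreement`, the `tolerance` parameter and both score lists are hypothetical, and the scores are assumed to be on the same scale for both steps.

```python
from itertools import combinations


def pairwise_agreement(scores, tolerance=0):
    """Share of rater pairs whose scores differ by at most `tolerance`."""
    pairs = list(combinations(scores, 2))
    if not pairs:
        raise ValueError("need scores from at least two raters")
    agreeing = sum(1 for a, b in pairs if abs(a - b) <= tolerance)
    return agreeing / len(pairs)


# Hypothetical scores given by five raters to the same piece of writing
without_rubric = [4, 7, 9, 6, 5]
with_rubric = [6, 7, 7, 6, 7]

print("Without rubric:", pairwise_agreement(without_rubric, tolerance=1))
print("With rubric:", pairwise_agreement(with_rubric, tolerance=1))
```

A higher agreement rate for one method would point to higher inter-rater reliability for that method, at least for this one script and this group of raters.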

LEVELS & DESCRIPTORS (See handout)

Task Achievement
5: Fully satisfies all requirements of the task
4: Presents key features, but details may be missing
3: Adequately satisfies requirements of the task
2: Attempts to address the task but does not address all key features
1: Answer is barely related to the task

Coherence and Cohesion
5: Logical, sequential, easy to read
4: Good organisation overall; one or two detractors
3: Adequate organisation but with key features missing
2: Has very little control of organisational features
1: Not arranged sequentially with no clear progression

Lexical Resource
5: Uses a wide range of vocabulary conveying precise meanings
4: Uses a wide range of vocabulary but not always with precision
3: Uses an adequate range of vocabulary for the task
2: Uses only basic vocabulary, but accurately
1: Uses an extremely limited range of vocabulary

Grammatical Range and Accuracy
5: Sentences are error-free; uses a wide range of grammatical forms
4: Makes some minor errors in grammar; uses both simple and complex forms
3: Makes several errors in grammar but uses both simple and complex forms
2: Several errors throughout; simple forms predominate
1: Errors hinder communication and distort the meaning
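
The slides do not say how the four criterion levels are combined into one score. As a sketch only, the example below assumes the levels are simply added up, giving a total out of 20. The names `CRITERIA` and `total_score` and the sample levels are hypothetical.

```python
# The four criteria from the rubric, each rated from 1 (lowest) to 5 (highest)
CRITERIA = (
    "Task Achievement",
    "Coherence and Cohesion",
    "Lexical Resource",
    "Grammatical Range and Accuracy",
)


def total_score(levels):
    """Add up the level awarded for each criterion (assumed scoring rule)."""
    missing = [name for name in CRITERIA if name not in levels]
    if missing:
        raise ValueError(f"no level given for: {', '.join(missing)}")
    for name in CRITERIA:
        if levels[name] not in range(1, 6):
            raise ValueError(f"{name}: level must be a whole number from 1 to 5")
    return sum(levels[name] for name in CRITERIA)


# Hypothetical levels awarded by one rater to one script
sample = {
    "Task Achievement": 4,
    "Coherence and Cohesion": 3,
    "Lexical Resource": 4,
    "Grammatical Range and Accuracy": 3,
}
print(total_score(sample), "out of", 5 * len(CRITERIA))
```

Other rules, such as averaging the levels or weighting some criteria more heavily, would be equally possible. What matters for reliability is that every rater applies the same rule.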

Is the test AUTHENTIC?

Is the language in the test similar to “real world” situations?
Are items contextualised according to your culture?
Is it interactive?

Is there positive washback?

What is the effect of the test on teaching and learning?
If students “cram” for the test and if teachers “teach to the test”, will it help students to learn better?

To help you remember…

Please (Practicality)
Read (Reliability)
Words (Washback)
Very (Validity)
Accurately (Authenticity)
