
Principles of Language Assessment - Practicality, Reliability, Validity, Authenticity, and Washback
Rizki Rmd


A. Practicality
An effective test is practical. This means that it
- is not excessively expensive,
- stays within appropriate time constraints,
- is relatively easy to administer, and
- has a scoring/evaluation procedure that is specific and time-efficient.
A test that is prohibitively expensive is impractical. A test of language proficiency that takes a student five hours to complete is impractical: it consumes more time (and money) than necessary to accomplish its objective. A test that requires individual one-on-one proctoring is impractical for a group of several hundred test-takers and only a handful of examiners. A test that takes a few minutes for a student to take and several hours for an examiner to evaluate is impractical for most classroom situations.


B. Reliability

A reliable test is consistent and dependable. If you give the same test to the same student or matched students on two different occasions, the test should yield similar results. The issue of reliability of a test may best be addressed by considering a number of factors that may contribute to the unreliability of a test. Consider the following possibilities (adapted from Mousavi, 2002, p. 804): fluctuations in the student, in scoring, in test administration, and in the test itself.
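(Illustrative aside, not part of Brown's text: one simple way to quantify this kind of consistency is to correlate the scores from two administrations of the same test. The short Python sketch below uses invented scores and variable names, purely for illustration.)

# Hypothetical scores for the same ten students on two sittings of one test.
from statistics import correlation  # Pearson's r; available in Python 3.10+

first_sitting = [72, 65, 88, 54, 91, 77, 60, 83, 69, 75]
second_sitting = [70, 68, 85, 58, 93, 74, 63, 80, 71, 78]

r = correlation(first_sitting, second_sitting)
print(f"Test-retest correlation: {r:.2f}")  # values near 1.0 suggest consistent, dependable scores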

Student-Related Reliability
The most common learner-related issue in reliability is caused by temporary illness, fatigue, a "bad day", anxiety, and other physical or psychological factors, which may make an observed score deviate from one's "true" score. Also included in this category are such factors as a test-taker's "test-wiseness" or strategies for efficient test taking (Mousavi, 2002, p. 804).

Rater Reliability
Human error, subjectivity, and bias may enter into the scoring process. Inter-rater unreliability occurs when two or more scorers yield inconsistent scores on the same test, possibly for lack of attention to scoring criteria, inexperience, inattention, or even preconceived biases. In the story above about the placement test, the initial scoring plan for the dictations was found to be unreliable; that is, the two scorers were not applying the same standards.
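(Again as an illustrative aside with invented numbers, not from Brown: a teacher can get a rough sense of rater reliability by checking how far two scorers' marks for the same set of papers diverge, as in the Python sketch below.)

# Hypothetical marks given by two scorers to the same ten dictations (out of 20).
scorer_a = [15, 18, 12, 9, 20, 14, 16, 11, 17, 13]
scorer_b = [14, 18, 10, 9, 19, 15, 16, 8, 17, 12]

# Count the papers on which the two scorers differ by at most one point.
close = sum(abs(a - b) <= 1 for a, b in zip(scorer_a, scorer_b))
print(f"Agreement within 1 point: {close} of {len(scorer_a)} papers")
# Frequent or large gaps between scorers are a sign of inter-rater
# unreliability, i.e. the scorers are not applying the same standards.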
Test Administration Reliability
Unreliability may also result from the conditions in which the test is administered. I once witnessed the administration of a test of aural comprehension in which a tape recorder played items for comprehension, but because of street noise outside the building, students sitting next to windows could not hear the tape accurately. This was a clear case of unreliability caused by the conditions of the test administration. Other sources of unreliability are found in photocopying variations, the amount of light in different parts of the room, variations in temperature, and even the condition of desks and chairs.


Test Reliability
Sometimes the nature of the test itself can cause measurement errors. If a test is too long, test-takers may become fatigued by the time they reach the later items and hastily respond incorrectly. Timed tests may discriminate against students who do not perform well on a test with a time limit. We all know people (and you may be included in this category!) who know the course material perfectly but who are adversely affected by the presence of a clock ticking away. Poorly written test items (that are ambiguous or that have more than one correct answer) may be a further source of test unreliability.


C. Validity
By far the most complex criterion of an effective test, and arguably the most important principle, is validity, "the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment" (Gronlund, 1998, p. 226). A valid test of reading ability actually measures reading ability, not 20/20 vision, nor previous knowledge in a subject, nor some other variable of questionable relevance. To measure writing ability, one might ask students to write as many words as they can in 15 minutes, then simply count the words for the final score. Such a test would be easy to administer (practical), and the scoring quite dependable (reliable). But it would not constitute a valid test of writing ability without some consideration of comprehensibility, rhetorical discourse elements, and the organization of ideas, among other factors.


Content-Related Evidence
If a test actually samples the subject matter about which conclusions are to be drawn, and if it requires the test-takers to perform the behavior that is being measured, it can claim content-related evidence of validity, often popularly referred to as content validity (e.g., Mousavi, 2002; Hughes, 2003). You can usually identify content-related evidence observationally if you can clearly define the achievement that you are measuring.
Criterion-Related Evidence
A second form of evidence of the validity of a test may be found in what is called criterion-related evidence, also referred to as criterion-related validity, or the extent to which the "criterion" of the test has actually been reached. You will recall that in Chapter 1 it was noted that most classroom-based assessment with teacher-designed tests fits the concept of criterion-referenced assessment. In such tests, specified classroom objectives are measured, and implied predetermined levels of performance are expected to be reached (for example, 80 percent is considered a minimal passing grade).

Construct-Related Evidence
A third kind of evidence that can support validity, but one that does not play as large a role for classroom teachers, is construct-related validity, commonly referred to as construct validity. A construct is any theory, hypothesis, or model that attempts to explain observed phenomena in our universe of perceptions. Constructs may or may not be directly or empirically measured; their verification often requires inferential data.

Consequential Validity
As well as the above three widely accepted forms of evidence that may be introduced to support the validity of an assessment, two other categories may be of some interest and utility in your own quest for validating classroom tests. Messick (1989), Gronlund (1998), McNamara (2000), and Brindley (2001), among others, underscore the potential importance of the consequences of using an assessment. Consequential validity encompasses all the consequences of a test, including such considerations as its accuracy in measuring intended criteria, its impact on the preparation of test-takers, its effect on the learner, and the (intended and unintended) social consequences of a test's interpretation and use.
Face Validity
An important facet of consequential validity is the extent to which "students view the assessment as fair, relevant, and useful for improving learning" (Gronlund, 1998, p. 210), or what is popularly known as face validity. Face validity refers to the degree to which a test looks right, and appears to measure the knowledge or abilities it claims to measure, "based on the subjective judgment of the examinees who take it, the administrative personnel who decide on its use, and other psychometrically unsophisticated observers" (Mousavi, 2002, p. 244).

D. Authenticity
A fourth major principle of language testing is authenticity, a concept that is a little slippery to define, especially within the art and science of evaluating and designing tests. Bachman and Palmer (1996, p. 23) define authenticity as "the degree of correspondence of the characteristics of a given language test task to the features of a target language task", and then suggest an agenda for identifying those target language tasks and for transforming them into valid test items.
E. Washback
A facet of consequential validity, discussed above, is "the effect of testing on teaching and learning" (Hughes, 2003, p. 1), otherwise known among language-testing specialists as washback. In large-scale assessment, washback generally refers to the effects the tests have on instruction in terms of how students prepare for the test. "Cram" courses and "teaching to the test" are examples of such washback. Another form of washback that occurs more in classroom assessment is the information that "washes back" to students in the form of useful diagnoses of strengths and weaknesses. Washback also includes the effects of an assessment on teaching and learning prior to the assessment itself, that is, on preparation for the assessment.

F. Applying Principles to the Evaluation of Classroom Tests


The five principles of practicality, reliability, validity,
authenticity, and washback go a long way toward providing
useful guidelines for both evaluating an existing assessment
procedure and designing one on your own. Quizzes, tests,
final exams, and standardized proficiency tests can all be
scrutinized through these five lenses.
1. Are the test procedures practical?
2. Is the test reliable?
3. Does the procedure demonstrate content validity?
4. Is the procedure face valid and "biased for best"?
5. Are the test tasks as authentic as possible?
6. Does the test offer beneficial washback to the learner?

Reference:
Brown, H. Douglas. 2004. Language Assessment: Principles and Classroom Practices. New York: Longman.
