
Principles of Language

Assessment

By Estu Yogyaningsih


Hairmawan
Winda
Outline

1 Practicality

2 Validity

3 Reliability

4 Authenticity

5 Washback
Practicality
• Practicality: It is defined as the relationship between the
resources that will be required in the design, development, and
use of the test and the resources that will be available for these
activities.
Brown (2004:19) defines practicality in terms of:
1) Cost
2) Time
3) Administration
4) Scoring / Evaluation
Reliability
Reliability is the extent to which a test produces consistent scores across
different administrations to similar groups of examinees. Reliability is
synonymous with dependability, stability, consistency, predictability, and
accuracy.
Classical True Score
Measurement

Observed Score = Real Ability + Other Factors

X = Xt + Xe

X = Observed Score
Xt = True Score
Xe = Error Score
Continue…

• Accordingly, reliability is defined as the extent to which a test is
error free. In fact, your scores (observed scores) are only a
partial representation of your true score (real ability), and the
reason lies in the presence of factors other than the ability
being tested.
• Therefore, a reliable test is one in which true score variance is
high and error score variance is low. If the test is error free,
the true score will be equal to the observed score.
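The true score model above can be sketched numerically. Here is a minimal Python simulation (all numbers invented for illustration) in which each observed score is a true ability value plus random error; reliability, viewed as the share of observed-score variance attributable to true-score variance, approaches 1.0 as error shrinks:

```python
import random

random.seed(42)

# Classical true score model, X = Xt + Xe, with invented parameters:
# true abilities centred on 70 (sd 10), errors centred on 0 (sd 4).
true_scores = [random.gauss(70, 10) for _ in range(1000)]    # Xt: real ability
error_scores = [random.gauss(0, 4) for _ in range(1000)]     # Xe: other factors
observed = [t + e for t, e in zip(true_scores, error_scores)]  # X = Xt + Xe

def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

# One common way to express reliability: the proportion of observed-score
# variance that is true-score variance.
reliability = variance(true_scores) / variance(observed)
print(round(reliability, 2))
```

With these invented standard deviations the ratio comes out well above 0.8; making the error spread larger would drive it down, illustrating why a reliable test is one with low error variance.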
Reliability Falls within 4
Kinds

1. Student-Related Reliability: refers to psychological and physical factors, including "bad day" anxiety, illness, the test taker's "test-wiseness," and fatigue, which can make an observed score deviate from one's true score.

2. Rater Reliability falls into 2 categories:
A) Inter-Rater Reliability: two or more scorers yield inconsistent scores on the same test.
B) Intra-Rater Reliability: a single scorer scores inconsistently because of unclear scoring criteria, bias, and carelessness.

3. Test Administration Reliability: basically springs from the conditions in which the test is administered: a noisy classroom, the amount of light, uncomfortable chairs, etc.

4. Test Reliability: the test should fit into the time constraints, should not be too long or too short, and test items should be clear.
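Inter-rater reliability is commonly estimated as the correlation between two raters' scores for the same set of performances. A small Python sketch (the essay scores below are invented for illustration):

```python
# Hypothetical essay scores from two raters for the same ten test takers.
rater_a = [85, 78, 92, 70, 88, 75, 95, 68, 80, 73]
rater_b = [82, 80, 90, 72, 85, 78, 93, 70, 79, 75]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

# Values near 1.0 suggest the raters score consistently;
# low values signal inter-rater unreliability.
r = pearson(rater_a, rater_b)
print(round(r, 2))
```

The same computation can flag intra-rater problems if the two lists are one rater's scores for the same papers on two occasions.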
Validity
Validity is the degree of correspondence between the test content and the content of the
material to be tested. Ex: A valid test of reading ability actually measures the reading ability
itself, not previous knowledge.

Content Validity: If a test actually samples the subject matter about which
conclusions are to be drawn, it can claim content-related evidence of validity.
Ex: Speaking
Direct Testing: requires the test takers to perform the target task directly
(e.g., oral production).
Indirect Testing: learners are required to perform indirectly related tasks
(e.g., multiple choice).
Criterion Validity: Is the extent to which performance on a test is related to a
criterion which is an indicator of the ability being tested. The criterion may be
individuals' performance on another test or even a known standard.
Concurrent Validity: A test has concurrent validity if its results are supported by other
concurrent performance beyond the assessment itself.
Predictive Validity: It tends to predict a student's likelihood of future success.
Continue…
Construct Validity: Is the extent to which a test measures just the construct
it is supposed to measure.
Consequential Validity: It refers to the positive or negative consequences
of a particular test. Consequences include its impact on the preparation of
test takers, on learners, on society, and on washback as well.
Face Validity: It is the extent to which the measurement method “on its
face” appears to measure the particular ability. It is generally based on the
subjective judgment of the examinees.
Authenticity
Is the extent to which the tasks required on a given test
are similar to normal “real life” language use, in other
words, it is the degree of correspondence between tests,
tasks, and activities of target language use. Therefore,
the higher the correspondence, the more authentic the
test.
Authenticity may be present in the following ways:

• The language is as natural as possible.
• Items are contextualized rather than isolated.
• Topics are meaningful (relevant, interesting) for the learner.
• Some thematic organization to items is provided, such as through a
story line or episode.
• Tasks represent, or closely approximate, real-world tasks.
Washback/Backwash
Washback Effect: Generally, it is the influence of the nature of a
test on teaching and learning.
2 Kinds of Washback

1) Negative Washback 2) Positive Washback


Continue…

1. Negative Washback: Occurs when tests and testing techniques are at variance
with the objectives of the course. Tests which have negative washback are
considered to have a negative influence on teaching and learning.
Ex: Taking an English course to be trained in the 4 language skills while
the language test does not test those skills.

2. Positive Washback: Results when a testing procedure encourages "good"
teaching practices. Ex: The consequence of many reading comprehension
tests is a possible development of the reading skills.
THANK YOU
