
Validity and Reliability

Joanna Ochoa V.
Validity

 A valid test measures accurately what it is intended to measure.

 To assert that a test has construct validity, empirical evidence is needed.

 Subordinate forms of validity:

• Content validity
• Criterion-related validity
Content Validity
 The content of the test constitutes a representative sample of the skills it is supposed to measure = Content Validity

The test must include a proper sample of the relevant structures.

 Specification of the skills or structures that the test is meant to cover.

ATTENTION!
Content validation should be carried out while a test is being developed.
Criterion-related validity

Concurrent validity
• The test and the criterion are administered at the same time.
• Example: an oral exam, long vs. short version of the exam, random sampling.
• Level of agreement = correlation coefficient.
• Perfect agreement = coefficient of 1; complete lack of agreement = coefficient of zero.

Predictive validity
• The degree to which a test can predict candidates' future performance.
• Example: a proficiency test used to predict a student's ability to cope with a graduate course at a British university. Criterion measure: the student's English as perceived by his supervisor, or the outcome of the course.
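A minimal sketch of how the level of agreement can be computed, assuming Pearson correlation and invented scores (Python's statistics module, 3.10+):

```python
# Quantifying concurrent validity as the correlation between scores
# on the short test and on the full-length criterion.
# All scores are hypothetical.
from statistics import correlation  # Pearson's r

short_exam = [55, 62, 70, 48, 81, 66]   # the test
long_exam  = [58, 60, 74, 45, 85, 63]   # the criterion

r = correlation(short_exam, long_exam)
print(f"validity coefficient: {r:.2f}")
# 1 would mean perfect agreement; 0, no agreement at all.
```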
Validity in scoring
 Items and the way in which they are scored must be valid.

 Example: A reading test. (Should we consider grammar and spelling mistakes in the responses?)
A test is said to have face validity if it looks as if it measures what it claims to measure.

For example: a test meant to measure pronunciation ability would lack face validity if it did not require the candidate to speak.
How to make tests more valid

• Whenever feasible, use direct testing.
• The scoring must be related to what is being tested.
Reliability
 We have to construct, administer and score items in such a way that we obtain similar results in different situations.
The reliability coefficient
 To quantify the reliability of a test.

• Ideal reliability coefficient = 1: the test would always give the same results.
• Reliability coefficient of zero: sets of results unconnected with each other.
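A minimal sketch with invented scores: the reliability coefficient read as the correlation between two sets of results from the same test:

```python
# Reliability coefficient as the correlation between two sets of
# scores produced by the same test.
from statistics import correlation

first_sitting  = [40, 55, 63, 72, 48]
second_sitting = [42, 54, 65, 70, 50]

r = correlation(first_sitting, second_sitting)
print(f"reliability coefficient: {r:.2f}")
# A coefficient of 1 would mean perfect agreement between the two
# sittings; a coefficient of 0, sets of results unconnected.
```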
Test-retest method

Two sets of scores are required for comparison: a group of students takes the same test twice.

Problems:
1. Too soon (memorization of the answers)
2. Too late (forgetting)

Solutions:
• Alternate forms method
• Split-half method = only one administration of one test
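A sketch of the split-half idea with invented item scores; the Spearman-Brown step is the adjustment conventionally paired with this method, though the slide does not name it:

```python
# Split-half method: one administration, items divided into two
# halves, half-scores correlated, full-test reliability estimated.
from statistics import correlation

# One score per candidate on the odd- and even-numbered items.
odd_items  = [12, 18, 9, 15, 20]
even_items = [11, 17, 10, 14, 19]

r_half = correlation(odd_items, even_items)
r_full = (2 * r_half) / (1 + r_half)   # Spearman-Brown correction
print(f"half-test r = {r_half:.2f}, full-test estimate = {r_full:.2f}")
```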
Scorer reliability

 If the scoring of a test is not reliable, then the test results cannot be reliable
either.

 For example:

 The scorer reliability coefficient on a composition writing test = .92

 The reliability coefficient for the test = .84

Variability in the performance of individual candidates accounts for the difference between the two coefficients.
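A sketch, with invented marks, of how a scorer reliability coefficient such as the .92 above can be obtained: two scorers mark the same scripts independently and their marks are correlated:

```python
# Scorer reliability: correlation between two independent scorings
# of the same compositions.
from statistics import correlation, mean

scorer_a = [14, 11, 17, 9, 15, 12]
scorer_b = [13, 12, 16, 9, 14, 13]

print(f"scorer reliability: {correlation(scorer_a, scorer_b):.2f}")

# With multiple independent scoring, a candidate's reported mark is
# commonly the average of the independent marks.
final_marks = [mean(pair) for pair in zip(scorer_a, scorer_b)]
print(final_marks)
```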
How to make tests more reliable

• Take enough samples of behaviour
• Exclude items that fail to discriminate between weaker and stronger students
• Do not allow candidates too much freedom
• Write unambiguous items
• Provide clear and explicit instructions
• Ensure that tests are well laid out and perfectly legible
• Make candidates familiar with format and testing techniques
• Provide uniform conditions of administration

1. More items = more reliability
2. Too easy and too difficult items
3. Choice of questions
4. Unclear meaning of the items
5. The supposition that the students all understand the instructions
6. Institutional tests are badly typed
7. Unfamiliar aspects of the test
8. Precautions must be taken
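Point 1 can be made concrete with the general Spearman-Brown formula (the k = 2 case is the split-half correction sketched earlier); the starting reliability of .70 is invented:

```python
# Predicted reliability when a test is lengthened k times with
# comparable items (Spearman-Brown prophecy formula).
def spearman_brown(r: float, k: float) -> float:
    return (k * r) / (1 + (k - 1) * r)

r_now = 0.70                       # hypothetical current reliability
for k in (2, 3):
    print(f"{k}x items -> {spearman_brown(r_now, k):.2f}")
# 2x -> 0.82, 3x -> 0.88: more samples of behaviour, more reliability.
```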
Ways of obtaining scorer reliability

• Use items that permit scoring which is as objective as possible
• Make comparisons between candidates as direct as possible
• Provide a detailed scoring key
• Train scorers
• Agree acceptable responses and appropriate scores at the outset of scoring
• Identify candidates by number, not name
• Employ multiple, independent scoring
Reliability and validity

 To be valid, a test must provide consistently accurate measurements.

 A reliable test may not be valid at all.

 Example: writing test

 To make tests reliable, we must be wary of reducing their validity.
