CHAPTER 6
ITEM ANALYSIS AND VALIDATION
INTRODUCTION
The teacher prepares a draft of the test, which is then subjected to item analysis and validation. The teacher tries out the test on a group of students with similar characteristics, and each item is analyzed in terms of its level of difficulty and its ability to discriminate between those who know and those who do not know. The item analysis provides information that allows the teacher to decide whether to revise or replace an item. Finally, the final draft is subjected to validation if the intent is to use the test as a standard test for the particular unit or grading period.
LEARNING OUTCOMES
1 Explain the meaning of item analysis, item validity, reliability, item difficulty, and discrimination index.
2 Determine the validity and
1 ITEM DIFFICULTY
2 DISCRIMINATION INDEX
ITEM DIFFICULTY
The difficulty of an item, or item difficulty, is defined as the number of students who answer the item correctly divided by the total number of students. Thus:
Item difficulty = (number of students with the correct answer) / (total number of students)
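The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `item_difficulty` and the ten-student response list are hypothetical and do not come from the chapter.

```python
from typing import Sequence


def item_difficulty(responses: Sequence[bool]) -> float:
    """Return the proportion of students who answered the item correctly.

    `responses` holds one entry per student: True for a correct answer,
    False otherwise. The name and input format are assumptions made for
    this sketch.
    """
    if not responses:
        raise ValueError("at least one student response is required")
    # Number of correct answers divided by the total number of students.
    return sum(responses) / len(responses)


# Hypothetical class of 10 students, 6 of whom answered correctly.
responses = [True] * 6 + [False] * 4
print(item_difficulty(responses))  # 0.6
```

A value near 1 means most students answered correctly, and a value near 0 means few did.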
DISCRIMINATION INDEX
Example:
Obtain the index of discrimination of an item if the upper 25% of the class had a difficulty index of 0.60 (i.e., 60% of the upper 25% got the correct answer) while the lower 25% of the class had a difficulty index of 0.20.
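A possible way to work this example is sketched below. The slide's own formula did not survive in this text, so the sketch assumes the commonly used definition: the index of discrimination is the difficulty index of the upper group minus the difficulty index of the lower group. The function name `discrimination_index` is hypothetical.

```python
def discrimination_index(upper_difficulty: float, lower_difficulty: float) -> float:
    """Difference between upper-group and lower-group difficulty indices.

    Assumes the common definition D = DU - DL, where DU and DL are the
    proportions of the upper and lower groups answering correctly.
    """
    for value in (upper_difficulty, lower_difficulty):
        if not 0.0 <= value <= 1.0:
            raise ValueError("difficulty indices must lie between 0 and 1")
    return upper_difficulty - lower_difficulty


# Values from the example: upper 25% = 0.60, lower 25% = 0.20.
d = discrimination_index(0.60, 0.20)
print(round(d, 2))  # 0.4
```

Under this assumption the item has a discrimination index of 0.40: the upper group answered it correctly more often than the lower group did.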
VALIDITY
Validity is the extent to which a test measures what it purports to measure. It also refers to the appropriateness, correctness, meaningfulness, and usefulness of the specific decisions a teacher makes based on the test results.
THREE MAIN TYPES OF EVIDENCE THAT MAY BE COLLECTED
1 Content-Related Evidence of Validity
Content-related evidence of validity refers to the content and format of the instrument.
2 Criterion-Related Evidence of Validity
Criterion-related evidence of validity refers to the relationship between scores obtained using the instrument and scores obtained using one or more other tests (often called the criterion).
3 Construct-Related Evidence of Validity
Construct-related evidence of validity refers to the nature of the psychological construct or characteristic being measured by the test.
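For criterion-related evidence, the "relationship between scores" is often summarized with a correlation coefficient. The chapter does not name a specific statistic, so the sketch below assumes the Pearson correlation. The function name `pearson_r` and both score lists are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary in both lists")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores of five students on the teacher's test and on a criterion test.
test_scores = [12, 15, 9, 18, 14]
criterion_scores = [30, 34, 25, 40, 33]
print(round(pearson_r(test_scores, criterion_scores), 2))
```

A coefficient close to 1 would suggest that students who score high on the instrument also tend to score high on the criterion.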
RELIABILITY
Reliability     Interpretation
.70 - .80       Good for a classroom test; in the range of most. There are probably a few items which could be improved.
.60 - .70       Somewhat low. This test needs to be supplemented by other measures (e.g., more tests) to determine grades. There are probably some items which could be improved.
.50 - .60       Suggests a need for revision of the test, unless it is quite short (ten or fewer items). The test definitely needs to be supplemented by other measures (e.g., more tests) for grading.
.50 or below    Questionable reliability. This test should not contribute heavily to the course grade, and it needs revision.
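The table can be read as a simple lookup, sketched below. Several details are assumptions made for the sketch: the function name `interpret_reliability` is hypothetical, a coefficient of exactly .50 is placed in the ".50 or below" row, the other shared boundaries (.60, .70) are placed in the higher row, and coefficients above .80 return None because the table in this section does not cover them.

```python
from typing import Optional


def interpret_reliability(r: float) -> Optional[str]:
    """Map a reliability coefficient to the interpretation in the table.

    Boundary handling is an assumption; the table's ranges overlap at
    .50, .60, and .70. Values above .80 are not covered and give None.
    """
    if r <= 0.50:
        return "Questionable reliability; should not contribute heavily to the course grade."
    if r < 0.60:
        return "Suggests a need for revision, unless the test is quite short."
    if r < 0.70:
        return "Somewhat low; supplement with other measures to determine grades."
    if r <= 0.80:
        return "Good for a classroom test; a few items could probably be improved."
    return None  # Above .80: no row in this section's table.


print(interpret_reliability(0.75))
# Good for a classroom test; a few items could probably be improved.
```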
Believe. You're halfway
there. Keep hustling
to the finish line.
Thank You!