ITEM ANALYSIS
“THE BETTER THE ITEMS, THE BETTER THE TEST.”
ITEM ANALYSIS FOR CLASSROOM ASSESSMENT
The reliability of test scores and the validity of the interpretation of test scores depend on the quality of the items in the test.
Item analysis is the process of analyzing test items statistically to ascertain whether the items are functioning as intended.
Through item analysis, the test developer can identify flawed items that should be eliminated or revised, select the “best items” so that a shorter, more efficient version can be created for the final form, or construct alternate test forms that are similar in level of difficulty.
It is also the process of collecting, summarizing, and using information from students’ responses to make decisions about each item.
CLASSROOM USES OF ITEM ANALYSIS
Determining whether an item functions as you intended.
Feedback to students about their performance and as a basis for class discussion.
Feedback to the teacher about pupil difficulties.
Areas for curricular information
Revising assessment tasks
Improving item-writing skills
According to Oriondo, one method that can be used for item analysis is the U-L Index Method (Stocklein, 1957).
The steps are:
Score the papers and rank them from highest to lowest according to total score.
Separate the top 27% and the bottom 27% of the papers.
Symbols used for the upper (U) and lower (L) groups: cU and cL, or pU and pL.
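The ranking and splitting steps above can be sketched in a few lines of Python. This is only an illustration: the function name `split_upper_lower`, the made-up total scores, and the choice to round 27% of the papers to the nearest whole paper are assumptions, not part of the method as cited.

```python
def split_upper_lower(total_scores, fraction=0.27):
    """Return (upper, lower) lists of paper indices, ranked by total score."""
    # Rank paper indices from the highest to the lowest total score.
    ranked = sorted(range(len(total_scores)),
                    key=lambda i: total_scores[i], reverse=True)
    # Group size: 27% of the papers, rounded to a whole paper (assumption).
    k = max(1, round(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]


# Hypothetical total scores for ten papers.
scores = [18, 25, 9, 30, 14, 22, 11, 27, 16, 20]
upper, lower = split_upper_lower(scores)
print("upper group (paper indices):", upper)
print("lower group (paper indices):", lower)
```

With ten papers, each group holds three papers and the middle four are set aside.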
ITEM DIFFICULTY INDEX (p)
Item Difficulty Index (Constructed-Response Items)
Items that are correctly answered by every subject (p = 1.0) and items that are
missed by every subject (p = 0.0) provide no information about individual
differences and are of no value from a measurement perspective.
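As a minimal sketch of the idea, assuming items scored simply right (1) or wrong (0), p can be computed as the proportion of subjects who answered the item correctly. The function name `difficulty_index` and the response lists are hypothetical.

```python
def difficulty_index(item_scores):
    """Proportion of subjects who answered the item correctly (1 = correct, 0 = wrong)."""
    if not item_scores:
        raise ValueError("need at least one response")
    return sum(item_scores) / len(item_scores)


# Hypothetical responses of four subjects to three items.
all_correct = [1, 1, 1, 1]   # everyone correct -> p = 1.0
all_wrong = [0, 0, 0, 0]     # everyone wrong   -> p = 0.0
mixed = [1, 0, 1, 1]

print(difficulty_index(all_correct))
print(difficulty_index(all_wrong))
print(difficulty_index(mixed))
```

The first two items give the same score to every subject, so they cannot separate one subject from another.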
WHAT ARE DESIRABLE p VALUES?
With too many difficult items, few subjects answer them correctly, which reduces variability. With too many easy items, most subjects answer them correctly, which also reduces variability.
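ITEM DISCRIMINATION INDEX (D)
D = pU – pL
pU = proportion of the upper group answering the item correctly
pL = proportion of the lower group answering the item correctly
One common approach is to compare the top 27% versus the bottom 27% and exclude the middle 46%.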
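A short sketch of the formula D = pU – pL, taking pU and pL as the proportions of the upper and lower groups that answered correctly. The function name and parameters are illustrative assumptions. The counts are those of the good-item sample in this document (22 of 30 correct in the top group, 9 of 30 in the bottom group).

```python
def discrimination_index(correct_upper, n_upper, correct_lower, n_lower):
    """D = pU - pL, the difference in proportion correct between the two groups."""
    p_upper = correct_upper / n_upper
    p_lower = correct_lower / n_lower
    return p_upper - p_lower


# Good-item sample: 22 of 30 correct in the top group, 9 of 30 in the bottom group.
d_good = discrimination_index(22, 30, 9, 30)
print(round(d_good, 2))

# Major-problem sample: 4 of 30 correct in the top group, 8 of 30 in the bottom group.
d_bad = discrimination_index(4, 30, 8, 30)
print(round(d_bad, 2))
```

The rounded results match the D values reported with those two samples (0.43 and -0.13).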
Constructed-Response Items
Desirable D Values
As a general rule, it is suggested that items with D values of 0.30 or above are acceptable and items with D values below 0.30 should be reviewed and possibly revised or deleted.
Items with D values close to 0.0 do not contribute to the ability of the test to discriminate.
Items with negative D values do not measure the same characteristic as the overall test and actually detract from the test.
GUIDELINES FOR EVALUATING D VALUES
Index D            Interpretation
.40 and larger     Excellent
.30 - .39          Good
.11 - .29          Fair
.00 - .10          Poor
Negative values    Miskeyed or other major flaw
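The guideline table can be expressed as a small lookup function. This is a sketch, not a standard routine: the name `interpret_d` is hypothetical, and rounding D to two decimals before comparing is an assumption made so that values fall cleanly into the table's two-decimal bands.

```python
def interpret_d(d):
    """Map a discrimination index to the interpretation in the guideline table."""
    d = round(d, 2)  # assumption: the bands are stated to two decimals
    if d < 0:
        return "Miskeyed or other major flaw"
    if d >= 0.40:
        return "Excellent"
    if d >= 0.30:
        return "Good"
    if d >= 0.11:
        return "Fair"
    return "Poor"


for value in (0.43, 0.30, 0.14, 0.06, -0.13):
    print(value, interpret_d(value))
```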
DISTRACTER ANALYSIS
With multiple-choice items, an incorrect alternative is referred to as a distracter.
If the item has a marginal p or D and you want to revise it, distracter analysis may help.
Distracter analysis allows you to evaluate the percentage of people selecting each distracter to determine which ones are useful.
The discrimination index tells you if more people in the top group than the bottom group selected the correct answer (i.e., positive discrimination).
With distracter analysis you examine the individual distracters to see if they have negative discrimination (i.e., selected more by people in the bottom group).
DISTRACTER POWER ANALYSIS
Distracters should demonstrate negative discrimination.
If they have positive discrimination, there is likely some ambiguity present.
If a distracter is rarely or never selected, it is probably so obviously wrong that no one selects it; it is ineffective and should be revised.
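The checks described above can be sketched as follows. The function name `distracter_analysis`, the wording of the notes, and the decision to flag only distracters that nobody selected (the text says "rarely or never" without giving a cutoff) are assumptions. The counts are those of the item-needing-revision sample in this document, where option A is the key.

```python
def distracter_analysis(top_counts, bottom_counts, key):
    """Return {option: (discrimination, note)} for every option except the key."""
    n_top = sum(top_counts.values())
    n_bottom = sum(bottom_counts.values())
    report = {}
    for option, top in top_counts.items():
        if option == key:
            continue  # the keyed answer is not a distracter
        bottom = bottom_counts[option]
        # Same idea as D, applied to one distracter: top share minus bottom share.
        d = round(top / n_top - bottom / n_bottom, 2)
        if top + bottom == 0:
            note = "never selected: ineffective, revise"
        elif d > 0:
            note = "positive discrimination: possible ambiguity"
        elif d < 0:
            note = "negative discrimination: as desired"
        else:
            note = "no discrimination"
        report[option] = (d, note)
    return report


# Counts from the item-needing-revision sample (A is the correct answer).
top = {"A": 17, "B": 9, "C": 0, "D": 4}
bottom = {"A": 13, "B": 6, "C": 0, "D": 11}
report = distracter_analysis(top, bottom, key="A")
for option, (d, note) in report.items():
    print(option, d, note)
```

For that item, B attracts more of the top group than the bottom group, C is never chosen, and only D behaves as a distracter should.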
SAMPLE (GOOD ITEM)
p = 0.52, D = 0.43
Options                   A*    B    C    D
Number in Top Group       22    3    2    3
Number in Bottom Group     9    7    8    6
* Correct Answer
ITEM NEEDING REVISION
p = 0.50, D = 0.14
Options                   A*    B    C    D
Number in Top Group       17    9    0    4
Number in Bottom Group    13    6    0   11
* Correct Answer
MAJOR PROBLEM
p = 0.20, D = -0.13
Options                   A     B    C*   D
Number in Top Group       20    4    4    2
Number in Bottom Group    11    6    8    5
* Correct Answer
EXAMPLE OF RESULTS OF ITEM ANALYSIS
Item No.   Upper 27%   Lower 27%   Difficulty Index   Discrimination Index   Decision
1          14          12          0.81                0.13
2          10           6          0.51                0.25
3          11           7          0.57                0.25
4           9           2          0.35                0.43
5          12           6          0.57                0.37
6           6          14          0.63               -0.50
7          13           4          0.53                0.56
8           3          10          0.41               -0.44
9          13          12          0.78                0.06
10          8           6          0.44                0.12
Decision: * Rejected, ** Retained, *** Revised
An item is retained if p is between 0.10 and 0.90 and D is 0.30 or above.
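The retention rule can be applied to the difficulty and discrimination values in the results table with a short sketch like the one below. The function name `is_retained` is hypothetical, and treating "between 0.10 and 0.90" as inclusive of both endpoints is an assumption. The rule says nothing about how to choose between rejecting and revising an item, so the sketch does not attempt that.

```python
def is_retained(p, d, p_min=0.10, p_max=0.90, d_min=0.30):
    """True if p lies in [p_min, p_max] (inclusive, by assumption) and D >= d_min."""
    return p_min <= p <= p_max and d >= d_min


# (difficulty index, discrimination index) per item, copied from the results table.
items = {
    1: (0.81, 0.13), 2: (0.51, 0.25), 3: (0.57, 0.25), 4: (0.35, 0.43),
    5: (0.57, 0.37), 6: (0.63, -0.50), 7: (0.53, 0.56), 8: (0.41, -0.44),
    9: (0.78, 0.06), 10: (0.44, 0.12),
}

retained = [number for number, (p, d) in items.items() if is_retained(p, d)]
print("retained items:", retained)
```

Every item in the table has an acceptable p, so the decision here turns entirely on D: only items 4, 5, and 7 reach 0.30.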
QUALITATIVE ITEM ANALYSIS
Set the test aside and, after a break, carefully proof the test.
Have a colleague familiar with the content area review the test for errors.