
ITEM ANALYSIS

"THE BETTER THE ITEMS, THE BETTER THE TEST."
ITEM ANALYSIS FOR CLASSROOM ASSESSMENT

• The reliability of test scores and the validity of the interpretation of test scores depend on the quality of the items in the test.
• Item analysis is the process of analyzing test items statistically to ascertain whether the items are functioning as intended.
• Through item analysis, the test developer can identify flawed items that should be eliminated or revised, select the "best items" so that a shorter, more efficient version can be created for the final form, or construct alternate test forms that are similar in level of difficulty.
• It is also the process of collecting, summarizing, and using information from students' responses to make decisions about each item.
CLASSROOM USES OF ITEM ANALYSIS

• Determining whether an item functions as intended.
• Giving feedback to students about their performance and providing a basis for class discussion.
• Giving feedback to the teacher about pupil difficulties.
• Identifying areas for curricular improvement.
• Revising assessment tasks.
• Improving item-writing skills.
ITEM ANALYSIS

According to Oriondo, one method that can be used for item analysis is the U-L Index Method (Stocklein, 1957). The steps are:

1. Score the papers and rank them from highest to lowest according to total score.
2. Separate the top 27% and the bottom 27% of the papers.
3. Tally the responses made to each test item by each individual in the upper 27% group.
4. Tally the responses made to each test item by each individual in the lower 27% group.
5. Compute the percentage of the upper group that got the item right and call it U.
6. Compute the percentage of the lower group that got the item right and call it L.
7. Average the U and L percentages; the result is the difficulty index of the item (p).
8. Subtract the L percentage from the U percentage; the result is the discrimination index (D).
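The U-L steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `ul_index`, the answer-sheet layout, and the tiny data set are all hypothetical, and U and L are expressed as proportions (0 to 1) rather than percentages. The group size is rounded to the nearest whole paper, which is also an assumption.

```python
def ul_index(responses, key, fraction=0.27):
    """Return U, L, p and D for every item (hypothetical helper)."""
    # Step 1: score each paper and rank from highest to lowest total score.
    scored = [(sum(a == k for a, k in zip(paper, key)), paper) for paper in responses]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # Step 2: separate the top and bottom 27% of the papers.
    n = max(1, round(len(scored) * fraction))
    upper = [paper for _, paper in scored[:n]]
    lower = [paper for _, paper in scored[-n:]]

    results = []
    for i, correct in enumerate(key):
        # Steps 3-6: tally correct responses in each group, as proportions.
        u = sum(paper[i] == correct for paper in upper) / n
        l = sum(paper[i] == correct for paper in lower) / n
        # Step 7: difficulty is the average; step 8: discrimination is the difference.
        results.append({"item": i + 1, "U": u, "L": l, "p": (u + l) / 2, "D": u - l})
    return results


# Illustrative data only: 10 papers, 3 items, so each group holds 3 papers.
key = ["A", "B", "C"]
responses = [
    ["A", "B", "C"], ["A", "B", "C"], ["A", "B", "D"], ["A", "C", "C"],
    ["A", "B", "A"], ["B", "B", "C"], ["A", "D", "D"], ["C", "B", "A"],
    ["B", "A", "D"], ["A", "C", "D"],
]
results = ul_index(responses, key)
for row in results:
    print(row["item"], round(row["p"], 2), round(row["D"], 2))
```

With very small classes the 27% groups contain only a few papers, so the resulting indices are rough estimates at best.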
ITEM DIFFICULTY

The difficulty of an item (problem or question) may be determined in several ways:

• By the judgment of competent people who rank the items in order of difficulty.
• By how quickly the items can be answered or solved.
• By the number of examinees who get the item right.
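ITEM DIFFICULTY INDEX (p)

• Selected-Response Items

$$p = \frac{c_U + c_L}{n_U + n_L}$$

where
  $c_U$ = number of students in the upper group who answered the item correctly
  $c_L$ = number of students in the lower group who answered the item correctly
  $n_U = n_L$ = number of students in either the upper group or the lower group

or

$$p = \frac{p_U + p_L}{2}$$

where
  $p_U$ = proportion of the upper group that answered the item correctly
  $p_L$ = proportion of the lower group that answered the item correctly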
Item Difficulty Index (Constructed-Response Items)

Item Difficulty Index

• Ranges from 0.0 to 1.0.
• A large p indicates an easy item; a small p indicates a difficult item.
• Items that are correctly answered by every subject (p = 1.0) and items that are missed by every subject (p = 0.0) provide no information about individual differences and are of no value from a measurement perspective.
WHAT ARE DESIRABLE P VALUES?

If a test has too many difficult items, few subjects answer them correctly, which reduces the variability of scores. If it has too many easy items, most subjects answer them correctly, which also reduces variability.

Difficulty Index    Interpretation
0.00 - 0.10         Very Difficult
0.11 - 0.20         Difficult
0.21 - 0.80         Moderately Difficult
0.81 - 0.90         Easy
0.91 - 1.00         Very Easy
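
The interpretation bands in the table above can be expressed as a small lookup. This is only a sketch: the function name `interpret_difficulty` is hypothetical, and rounding p to two decimals before comparing is an assumption made so that values falling between bands (for example 0.105) land in one of them.

```python
def interpret_difficulty(p):
    """Map a difficulty index to the table's label (hypothetical helper)."""
    p = round(p, 2)  # assumption: the bands are defined to two decimal places
    if not 0.0 <= p <= 1.0:
        raise ValueError("difficulty index must lie between 0.0 and 1.0")
    if p <= 0.10:
        return "Very Difficult"
    if p <= 0.20:
        return "Difficult"
    if p <= 0.80:
        return "Moderately Difficult"
    if p <= 0.90:
        return "Easy"
    return "Very Easy"


for value in (0.05, 0.20, 0.52, 0.81, 0.95):
    print(value, interpret_difficulty(value))
```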
Desirable p Values

• In practice, a general recommendation is to use items with p values within a range of approximately 0.20 around the optimal value.

Other Considerations with p

• If you need to distinguish among the most talented individuals, it may be desirable to include some items with low p values to challenge the most gifted individuals.
ITEM DISCRIMINATION

• Item discrimination refers to how well an item can discriminate, or differentiate, among test takers who differ on the construct being measured by the test.
• On a test of reading comprehension, for example, item discrimination reflects an item's ability to distinguish between individuals with good and poor reading comprehension skills.
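ITEM DISCRIMINATION INDEX (D)

• Selected-Response Items

$$D = p_U - p_L$$

where
  $p_U$ = proportion of the upper group that answered the item correctly
  $p_L$ = proportion of the lower group that answered the item correctly

One common approach is to compare the top 27% with the bottom 27% and exclude the middle 46%.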
• Constructed-Response Items

Desirable D Values

• As a general rule, items with D values over 0.30 are acceptable, and items with D values below 0.30 should be reviewed and possibly revised or deleted.
• Items with D values close to 0.0 do not contribute to the ability of the test to discriminate.
• Items with negative D values do not measure the same characteristic as the overall test and actually detract from the test.

GUIDELINES FOR EVALUATING D VALUES

Index D             Interpretation
.40 and larger      Excellent
.30 - .39           Good
.11 - .29           Fair
.00 - .10           Poor
Negative values     Miskeyed or other major flaw

Source: Based on Hopkins (1998)
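
The guidelines for evaluating D values can likewise be sketched as a lookup. The function name `interpret_discrimination` is hypothetical, and rounding D to two decimals before comparing is an assumption so that every value falls into exactly one band of the table.

```python
def interpret_discrimination(d):
    """Map a discrimination index to the guideline label (hypothetical helper)."""
    d = round(d, 2)  # assumption: the bands are defined to two decimal places
    if d < 0:
        return "Miskeyed or other major flaw"
    if d <= 0.10:
        return "Poor"
    if d <= 0.29:
        return "Fair"
    if d <= 0.39:
        return "Good"
    return "Excellent"


for value in (0.43, 0.30, 0.14, 0.06, -0.13):
    print(value, interpret_discrimination(value))
```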
DISTRACTER ANALYSIS

• With multiple-choice items, an incorrect alternative is referred to as a distracter.
• If the item has a marginal p or D and you want to revise it, distracter analysis may help.
• Distracter analysis allows you to evaluate the percentage of people selecting each distracter to determine which ones are useful.
• The discrimination index tells you whether more people in the top group than in the bottom group selected the correct answer (i.e., positive discrimination).
• With distracter analysis, you examine the individual distracters to see whether they have negative discrimination (i.e., they are selected more often by people in the bottom group).

DISTRACTER POWER ANALYSIS

• Distracters should demonstrate negative discrimination.
• If they have positive discrimination, there is likely some ambiguity present.
• If a distracter is rarely or never selected, it is probably so obviously wrong that no one selects it; it is ineffective and should be revised.
SAMPLE (GOOD ITEM)

p = 0.52, D = 0.43

                          Options
                          A*    B     C     D
Number in Top Group       22    3     2     3
Number in Bottom Group    9     7     8     6

* Correct answer
ITEM NEEDING REVISION

p = 0.50, D = 0.14

                          Options
                          A*    B     C     D
Number in Top Group       17    9     0     4
Number in Bottom Group    13    6     0     11

* Correct answer
MAJOR PROBLEM

p = 0.20, D = -0.13

                          Options
                          A     B     C*    D
Number in Top Group       20    4     4     2
Number in Bottom Group    11    6     8     5

* Correct answer
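
A distracter analysis of the item needing revision can be sketched as follows. The function names and the 5% cut-off for "rarely selected" (`rare=0.05`) are assumptions introduced for illustration; the source gives no numeric threshold. The option counts are taken from the sample above.

```python
def distracter_analysis(top, bottom, key):
    """Selection rate and discrimination for each option (hypothetical helper)."""
    n_top, n_bottom = sum(top.values()), sum(bottom.values())
    report = {}
    for option in top:
        rate_top = top[option] / n_top
        rate_bottom = bottom[option] / n_bottom
        report[option] = {
            "is_key": option == key,
            "selected": (top[option] + bottom[option]) / (n_top + n_bottom),
            "discrimination": rate_top - rate_bottom,
        }
    return report


def flag_distracters(report, rare=0.05):
    """Flag distracters that look ineffective or ambiguous (assumed threshold)."""
    flags = {}
    for option, stats in report.items():
        if stats["is_key"]:
            continue  # the keyed answer is judged by D, not as a distracter
        if stats["selected"] <= rare:
            flags[option] = "rarely or never selected"
        elif stats["discrimination"] > 0:
            flags[option] = "positive discrimination"
    return flags


# Counts from the item needing revision (key = A).
top = {"A": 17, "B": 9, "C": 0, "D": 4}
bottom = {"A": 13, "B": 6, "C": 0, "D": 11}
report = distracter_analysis(top, bottom, key="A")
flags = flag_distracters(report)
print(flags)
```

Under these assumptions, option B is flagged for positive discrimination and option C for never being selected, while option D behaves as a distracter should.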
EXAMPLE OF RESULTS OF ITEM ANALYSIS

Decision codes: * Rejected, ** Retained, *** Revised

Item No.   Upper 27%   Lower 27%   Difficulty Index   Discrimination Index   Decision
1          14          12          0.81               0.13
2          10          6           0.51               0.25
3          11          7           0.57               0.25
4          9           2           0.35               0.43
5          12          6           0.57               0.37
6          6           14          0.63               -0.50
7          13          4           0.53               0.56
8          3           10          0.41               -0.44
9          13          12          0.78               0.06
10         8           6           0.44               0.12

An item is retained if its p is between 0.10 and 0.90 and its D is 0.30 or above.
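
The retention rule can be applied to the ten example items with a short filter. This is a sketch only: the name `is_retained` is hypothetical, the bounds on p are assumed to be inclusive, and the rule as stated does not say how to choose between rejecting and revising the remaining items, so the code does not attempt that.

```python
# (item number, difficulty index p, discrimination index D) from the example table
items = [
    (1, 0.81, 0.13), (2, 0.51, 0.25), (3, 0.57, 0.25), (4, 0.35, 0.43),
    (5, 0.57, 0.37), (6, 0.63, -0.50), (7, 0.53, 0.56), (8, 0.41, -0.44),
    (9, 0.78, 0.06), (10, 0.44, 0.12),
]


def is_retained(p, d, p_min=0.10, p_max=0.90, d_min=0.30):
    """Retention rule from the text; inclusive bounds are an assumption."""
    return p_min <= p <= p_max and d >= d_min


retained = [number for number, p, d in items if is_retained(p, d)]
print(retained)  # [4, 5, 7]
```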
QUALITATIVE ITEM ANALYSIS

• Set the test aside and, after a break, carefully proofread it.
• Have a colleague familiar with the content area review the test for errors.
• After administering the test, solicit feedback from examinees about ambiguous items.
