
Item Analysis

Because tests are used for instructional purposes and other important decisions, it is
important to conduct item analysis. Item analysis is the process of evaluating the quality of
the individual items and of the test as a whole. Item analysis “investigates the performance of
items considered individually in relation to the remaining items on the test” (Thompson &
Leviton, 1985). It is the process of examining the pupils’ responses to each test item.

The item analysis procedure provides the following information:


• The difficulty of the item
• The discriminating power of the item
• The effectiveness of the distracters

The key aim of item analysis is to improve the test as a whole and, eventually, to increase
its reliability and validity. The measures used for this purpose include item difficulty, item
discrimination, and the effectiveness of the distracters. There are many methods that can be
used for item analysis.

One method is the U-L Index Method (Stocklein, 1957). The steps are:

1. Score the papers and rank them from highest to lowest according to total score.
2. Separate the top 27% and the bottom 27% of the papers.
3. Tally the responses made to each test item by each individual in the upper 27% group.
4. Tally the responses made to each test item by each individual in the lower 27% group.
5. Compute the percentage of the upper group that got the item right and call it U.
6. Compute the percentage of the lower group that got the item right and call it L.
7. Average the U and L percentages; the result is the difficulty index of the item.
8. Subtract the L percentage from the U percentage; the result is the discrimination index of
the item.
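
The eight steps above can be sketched in a few lines of Python. This is only an illustrative
sketch under stated assumptions: the function name ul_index, the fraction parameter, the 0/1
scoring of each response, and rounding the group size to the nearest whole paper are choices
made here, not part of the published method.

```python
def ul_index(responses, fraction=0.27):
    """responses: one list per student, holding 1 (right) or 0 (wrong) per item."""
    # Step 1: rank the papers from highest to lowest total score.
    ranked = sorted(responses, key=sum, reverse=True)
    # Step 2: separate the top and bottom 27% (assumption: round to a whole paper).
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    results = []
    for item in range(len(ranked[0])):
        # Steps 3-6: proportion of each group that got the item right.
        u = sum(paper[item] for paper in upper) / n
        l = sum(paper[item] for paper in lower) / n
        # Step 7: difficulty index is the average of U and L.
        # Step 8: discrimination index is U minus L.
        results.append({"U": u, "L": l, "p": (u + l) / 2, "D": u - l})
    return results


# Hypothetical data: 10 students, 4 items.
scores = [
    [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 0], [0, 1, 1, 0],
    [1, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 0],
]
for number, row in enumerate(ul_index(scores), start=1):
    print(f"Item {number}: p = {row['p']:.2f}, D = {row['D']:.2f}")
```

With 10 papers, 27% rounds to 3 papers per group. Papers with tied total scores keep their
original order in this sketch, so ties at the group boundary may need a rule of your own.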

Item Difficulty/Difficulty Index

The difficulty index, denoted by p, is simply the proportion (or percentage) of the students
taking the test who answered the item correctly. It indicates how easy or how difficult an item
is: the larger the proportion getting an item right, the higher the difficulty index and the
easier the item. For example, an item answered correctly by 85% of the examinees has a
difficulty index of .85, whereas an item answered correctly by 50% of the examinees has a lower
difficulty index of .50 and is therefore the harder item. It is usually best when the difficulty
index is around .5, for it provides maximum differentiation. The best difficulty index, however,
is halfway between the lowest and highest expected score. When all of the items are extremely
difficult, the great majority of the test scores will be very low. When all items are extremely
easy, most test scores will be extremely high. We do not want items that are too easy or too
difficult.
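
One way to see why a difficulty index near .5 differentiates best is to look at the variance of
a right/wrong item, which is p(1 - p). This is a standard statistical result, not something
stated in the text above, and the function name item_variance is made up for this sketch.

```python
def item_variance(p):
    # Variance of an item scored 1 (right) or 0 (wrong) when a proportion p answers correctly.
    return p * (1 - p)


for p in (0.10, 0.30, 0.50, 0.70, 0.90):
    print(f"p = {p:.2f}  variance = {item_variance(p):.2f}")
```

The variance peaks at p = .5 (where it equals .25) and shrinks toward zero as items become very
easy or very difficult, which is when nearly everyone gets the same score on the item.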

Item Discrimination/Discrimination Index

A good item discriminates between those who do well on the test and those who do
poorly (Matlock-Hetzel, 1997). The discriminating power of a test item, denoted by D,
refers to the degree to which the item distinguishes those who know from those who do
not know. It can be measured by comparing the number of people with high test scores who
answered the item correctly with the number of people with low scores who answered the same
item correctly. The top and bottom 27% are used for analysis because this value has been shown
to maximize differences in the normal distribution while providing enough cases for
analysis (Wiersma & Jurs, 1990). Other methods use 30%, while others use 50% (no middle
group).

The higher the discrimination index, the better the item, because such a value indicates that
the item discriminates in favor of the upper group, which should get more items correct. If more
students in the lower group than in the upper group get an item correct, the item will have a
negative D value and is probably flawed. Similarly, if equal numbers of students in the upper
and lower groups get the item correct, the item cannot discriminate because its D value is 0.
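
The three cases just described (positive, negative, and zero D) can be illustrated with a small
sketch. It assumes the upper and lower groups are the same size; the function name and the
counts are hypothetical.

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    # D = proportion correct in the upper group minus proportion correct in the lower group.
    # Assumption: both groups contain group_size students.
    return (upper_correct - lower_correct) / group_size


print(discrimination_index(9, 3, 10))  # upper group does better: positive D
print(discrimination_index(3, 9, 10))  # lower group does better: negative D, item probably flawed
print(discrimination_index(6, 6, 10))  # equal numbers correct: D is 0, item cannot discriminate
```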

Computing the difficulty index and discrimination index of an item is a lot easier than
interpreting them. Remember that the purpose of the analysis is to determine which items are
good; that is, which items are to be retained or accepted, rejected or discarded, or revised.
The tables below will serve as a guide in interpreting the results of the item analysis.

Table for interpreting difficulty index (p)

Range of Difficulty Index (p)    Interpretation
0.00 to 0.20                     Very difficult
0.21 to 0.40                     Difficult
0.41 to 0.60                     Moderately difficult
0.61 to 0.80                     Easy
0.81 to 1.00                     Very easy
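
The difficulty table can be applied mechanically with a small lookup. This is a sketch only: the
function name interpret_difficulty is invented here, and rounding p to two decimal places before
looking it up is an assumption made because the table's ranges are stated to two decimals.

```python
def interpret_difficulty(p):
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    p = round(p, 2)  # assumption: the table's bands are read at two decimal places
    # (upper limit of band, interpretation), in the same order as the table
    bands = [
        (0.20, "Very difficult"),
        (0.40, "Difficult"),
        (0.60, "Moderately difficult"),
        (0.80, "Easy"),
        (1.00, "Very easy"),
    ]
    for upper_limit, label in bands:
        if p <= upper_limit:
            return label


print(interpret_difficulty(0.85))
print(interpret_difficulty(0.50))
```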

Table for interpreting discrimination index (D)

Range of Discrimination Index (D)    Interpretation
-1.00 to -0.60                       Questionable
-0.59 to -0.20                       Not discriminating
-0.19 to 0.20                        Moderately discriminating
0.21 to 0.60                         Discriminating
0.61 to 1.00                         Very discriminating
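
The discrimination table can be read off in the same mechanical way. Again this is only a
sketch: the function name interpret_discrimination is invented, and D is rounded to two decimal
places (an assumption) so that it falls cleanly into one of the table's ranges.

```python
def interpret_discrimination(d):
    if not -1.0 <= d <= 1.0:
        raise ValueError("D must be between -1 and 1")
    d = round(d, 2)  # assumption: the table's bands are read at two decimal places
    if d <= -0.60:
        return "Questionable"
    if d <= -0.20:
        return "Not discriminating"
    if d <= 0.20:
        return "Moderately discriminating"
    if d <= 0.60:
        return "Discriminating"
    return "Very discriminating"


print(interpret_discrimination(0.45))
print(interpret_discrimination(-0.30))
```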

In interpreting the results, we have to consider not only how easy or how difficult the
item is, but also its ability to discriminate between students who know and those who do not
know the answer. In other words, both the p values and D values are taken into consideration.
The decision rule is to retain or accept the items that are neither too easy nor too difficult
and that, at the same time, can discriminate bright from poor students.

After interpreting the difficulty and discrimination indices, the table below will help us
decide what to do with the test item.

Difficulty Index        Discrimination Index                  Suggested Action
Difficult               Not discriminating                    Discard/Reject
Difficult               Moderately discriminating             May need revision
Difficult               Discriminating/Very discriminating    Accept/Retain
Moderately difficult    Not discriminating                    Discard/Reject
Moderately difficult    Moderately discriminating             May need revision
Moderately difficult    Discriminating/Very discriminating    Accept/Retain
Easy                    Not discriminating                    Discard/Reject
Easy                    Moderately discriminating             Needs revision
Easy                    Discriminating/Very discriminating    Accept/Retain
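
The suggested-action table amounts to a lookup on the two interpretations. The sketch below
encodes only the combinations the table lists; the names SUGGESTED_ACTION and suggested_action
are hypothetical, and returning None for combinations the table does not cover (for example a
"Very easy" or "Questionable" item) is a choice made here, not a rule from the table.

```python
SUGGESTED_ACTION = {}
for difficulty in ("Difficult", "Moderately difficult", "Easy"):
    SUGGESTED_ACTION[(difficulty, "Not discriminating")] = "Discard/Reject"
    SUGGESTED_ACTION[(difficulty, "Moderately discriminating")] = "May need revision"
    SUGGESTED_ACTION[(difficulty, "Discriminating")] = "Accept/Retain"
    SUGGESTED_ACTION[(difficulty, "Very discriminating")] = "Accept/Retain"
# The table words this one cell more strongly than the others.
SUGGESTED_ACTION[("Easy", "Moderately discriminating")] = "Needs revision"


def suggested_action(difficulty, discrimination):
    # Returns None when the table has no row for this combination.
    return SUGGESTED_ACTION.get((difficulty, discrimination))


print(suggested_action("Moderately difficult", "Discriminating"))
print(suggested_action("Easy", "Not discriminating"))
```

The result is a starting point only; the final decision still rests on the judgment of the test
constructor.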

Note that whether an item is easy, moderately difficult, or difficult, there are always three
categories of discrimination index that can be obtained, which lead to three different actions.
For example, an easy item could be not discriminating, moderately discriminating, or
discriminating, and it could accordingly be rejected, revised, or retained.

However, care and caution must be exercised in using the table to interpret the results
of an item analysis. The judgment of the test constructor is very important. For example, what
should be done with an item that is easy and not discriminating? According to the table, we
should reject the item. But there are instances when that kind of item can be revised instead,
namely when it is the only item left to test a very important concept. In that case, we have no
choice but to revise or improve it. On the other hand, what should be done with an item that is
moderately difficult and discriminating? Normally that item should be retained because it has
good indices. But there are instances when that kind of item may be discarded or rejected. That
will happen if there are already enough items to test the particular concept or skill that it
assesses. The table below shows a sample result of an item analysis, illustrating the steps of
the U-L Index method. Study the results and focus on the interpretation of the test item.
