
Conducting Test Item Analysis

Conducting Item Analysis

After a test is administered and scored, it is usually put aside until it
is used again. Much of the effort in planning and constructing the test is
usually not evaluated. A better procedure is to analyze the test items and find
out whether each one has to be rejected, revised, or retained for future use.
This process is called item analysis. Item analysis is a process by which each
item in the test is carefully analyzed (e.g., for its difficulty, its
discriminating ability, and the effectiveness of each alternative). Recall from
the early part of this chapter that in building a test item bank, certain item
statistics should be sought so that teachers can build a pool of high-quality
test items in the subjects or courses they handle.
In a standardized test, item analysis is usually conducted right after
the item try-out. Item try-out refers to the field testing of the test items. In an
item try-out, the goal is to let students answer the test items and then examine
the students’ responses to each test item. This is not to be confused with the
actual, formal administration of the test.
There are different ways of conducting item analysis depending on
whether the test is criterion-referenced or norm-referenced. The perspectives
of test item analysis procedures for these two tests differ. A norm-referenced
test item analysis is a process of analysing each item based on its ability to
distinguish high- and low-performing students, which gives prime emphasis to
the difficulty of the test items. This is not the case in a criterion-referenced
test, where the difficulty of an item follows from the learning outcome it
measures. For example, if the learning outcome being measured by the item is
hard, the test item would naturally be hard; likewise, if the learning outcome
is easy, the test item would also be easy. The following are the steps
in conducting norm-referenced test item analysis.

Norm-Referenced Test Item Analysis

To illustrate the procedures in conducting a norm-referenced item analysis,
the following example is given.

Mr. Lucky administered a test in refrigeration and air-conditioning. The test is a
50-item multiple-choice test for a group of 40 high school students, and the first five
(5) items are shown in the following figure.
Sample Multiple Choice Test Items for Analysis

1. It refers to the process of transferring heat from one place to another.
a. Condensation
b. Evaporation
c. Compression
d. Refrigeration
2. If there is a glass of water with ice cubes and the temperature of the water is 10 degrees Celsius while the
ice cube has a temperature of 0 degrees Celsius, how does heat flow?
a. From water to ice cube
b. From ice cube to water
c. There was no flow of heat
d. None of the above
3. It refers to the smallest tubing in the refrigeration system.
a. Capillary
b. Condenser
c. Heat exchanger
d. Filter drier
4. It refers to a mechanical instrument used in troubleshooting electrical problems in a refrigerator.
a. Volt meter
b. Ammeter
c. Barometer
d. Multi-tester
5. What would be the mechanical defect in a refrigerator system if there is an inability to pump the maximum
pressure of 500 psi at its high side?
a. Leak back pressure
b. Oil pumping
c. Oily evaporator
d. Loose compression

The pre-item analysis procedures are as follows:

Step 1. Arrange the test papers from highest to lowest score.


Step 2. Select 27% of the papers from the lower group and 27% from the upper
group. For smaller classes, such as a group of only 20 students, you may just
divide the papers in half, with 10 test papers (students) belonging to the lower
group and 10 test papers (students) belonging to the upper group. In the example
(40 high school students), 27% would be 10.8, or 11 when rounded. We take the
bottom 11 test papers (lower group) and the top 11 test papers (upper group) and
set aside the middle 18 test papers.
Step 3. Tabulate the number of students in both the upper and lower groups
who selected each alternative.
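
Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the function name `split_groups` and the list of scores are hypothetical and do not come from Mr. Lucky's class, and the sketch ignores how tied scores at the cut-off should be handled.

```python
def split_groups(scores, fraction=0.27):
    """Return (upper, lower) score lists for the upper and lower groups.

    `fraction` is the share of papers taken from each end (0.27 by default;
    0.5 gives the half-and-half split suggested for small classes).
    """
    ranked = sorted(scores, reverse=True)    # Step 1: highest to lowest
    n = round(len(ranked) * fraction)        # Step 2: 27% of 40 is 10.8, rounded to 11
    if n == 0:
        return [], []
    return ranked[:n], ranked[-n:]


# Hypothetical scores for 40 students on a 50-item test (illustration only).
scores = [50 - i for i in range(40)]         # 50, 49, ..., 11

upper, lower = split_groups(scores)
middle_count = len(scores) - len(upper) - len(lower)
print(len(upper), len(lower), middle_count)  # 11 11 18
```

Calling `split_groups(scores, fraction=0.5)` on a class of 20 would give the two groups of 10 described in Step 2.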
A tabulation of the number of students who selected each alternative for the
first five items of the given test is shown below:

Table 1 Sample Tabulation of Students’ Responses

Item No.  Group (Upper and      Alternatives      No. of Students Who   Total
          Lower 27%)            A    B    C    D  Got the Item Right
1         Upper                 0    0    1   10          10             11
          Lower                 1    0    1    9           9             11
2         Upper                 8    1    1    1           8             11
          Lower                 4    2    2    3           4             11
3         Upper                 8    1    2    0           8             11
          Lower                 5    2    3    1           5             11
4         Upper                 1    0    0   10          10             11
          Lower                 0    1    0   10          10             11
5         Upper                 3    2    1    5           5             11
          Lower                 5    4    2    0           0             11

The table reveals the distribution of responses of the students. For
example, in item number 1, none of the students from the upper group selected
alternatives A and B, while from the lower group, none of the students selected
alternative B. Most of the students from both groups selected the keyed
response, which is alternative D.
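
As a rough sketch of Step 3, the tally for one item can be produced with Python's `collections.Counter`. The response lists below are rebuilt from the item 1 counts in Table 1 (the order of papers within each group is arbitrary), and the helper name `tabulate` is an assumption made for illustration.

```python
from collections import Counter

ALTERNATIVES = ("A", "B", "C", "D")


def tabulate(responses):
    """Count how many students chose each alternative, in A-B-C-D order."""
    counts = Counter(responses)
    return [counts[alt] for alt in ALTERNATIVES]


# Item 1 answers, reconstructed from the counts reported in Table 1.
upper_item1 = ["C"] + ["D"] * 10
lower_item1 = ["A", "C"] + ["D"] * 9
key = "D"                                   # keyed response for item 1

upper_counts = tabulate(upper_item1)
lower_counts = tabulate(lower_item1)
right_upper = upper_counts[ALTERNATIVES.index(key)]
right_lower = lower_counts[ALTERNATIVES.index(key)]

print(upper_counts, right_upper)            # [0, 0, 1, 10] 10
print(lower_counts, right_lower)            # [1, 0, 1, 9] 9
```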

1. Determining the Difficulty Index. In computing for the difficulty index
of each item, use the formula below:

Item Difficulty = R / T

Where:
R = Number of students who got the item right from both groups
T = Total number of students from both groups

Example. Compute for the difficulty index of the given first five test items.

Table 2 Difficulty Index of the Sample Test Items

Item No.  No. of Students Who Got the      Difficulty   Verbal Interpretation
          Item Right (from Both Groups)    Index
1                      19                     0.86      Very Easy
2                      12                     0.55      Ideal Difficulty
3                      13                     0.59      Ideal Difficulty
4                      20                     0.91      Very Easy
5                       5                     0.23      Difficult

Solutions:
Item no. 1: Difficulty = R1/T = 19/22 = 0.86
Item no. 2: Difficulty = R2/T = 12/22 = 0.55
Item no. 3: Difficulty = R3/T = 13/22 = 0.59
Item no. 4: Difficulty = R4/T = 20/22 = 0.91
Item no. 5: Difficulty = R5/T = 5/22 = 0.23

Guide in Interpreting the computed Difficulty Index


0.81 – 1.00 - Very Easy
0.61 – 0.80 - Easy
0.41 – 0.60 - Ideal Difficulty
0.21 – 0.40 - Difficult
0.00 – 0.20 - Very Difficult
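
The computation and the interpretation guide can be checked with a short Python sketch. The counts are taken from Table 1; the function names are illustrative, and rounding the index to two decimals before applying the guide is an assumption made here so that every value falls into one of the listed bands.

```python
# (right in upper group, right in lower group) per item, from Table 1
RIGHT_COUNTS = {1: (10, 9), 2: (8, 4), 3: (8, 5), 4: (10, 10), 5: (5, 0)}
GROUP_SIZE = 11                      # papers in each of the two groups


def difficulty_index(right_upper, right_lower, group_size):
    """Item Difficulty = R / T, where T counts both groups."""
    return (right_upper + right_lower) / (2 * group_size)


def interpret_difficulty(index):
    """Apply the interpretation guide to an index rounded to two decimals."""
    value = round(index, 2)
    if value >= 0.81:
        return "Very Easy"
    if value >= 0.61:
        return "Easy"
    if value >= 0.41:
        return "Ideal Difficulty"
    if value >= 0.21:
        return "Difficult"
    return "Very Difficult"


difficulty = {}
for item, (ru, rl) in RIGHT_COUNTS.items():
    p = difficulty_index(ru, rl, GROUP_SIZE)
    difficulty[item] = (round(p, 2), interpret_difficulty(p))
    print(item, ru + rl, *difficulty[item])
```

The printed rows should match Table 2, for example `1 19 0.86 Very Easy` and `5 5 0.23 Difficult`.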
It is evident that the larger the value of the difficulty index, the easier
the item is. This is the reason why some suggest that this measure be called
item ease and not item difficulty.
Items 1 and 4 should be either rejected outright or revised since they
appear to be very easy for the students, while items 2 and 3 are retained due
to their ideal difficulty. Item 5 might be revised, or it can be retained,
depending on the teacher’s judgment of whether that level of difficulty is
necessary. Keep in mind that the decision to reject or retain a particular item
is not based solely on its difficulty but on a combination of indices. An item
is rarely revised based only on the difficulty index. Items might have ideal
difficulty but the alternatives might have to be revised, as the plausibility
analysis later in this section will show. There are also situations in which a
difficult item is justified in order to discriminate among students’ relative
levels of learning.

2. Determining the Discrimination Index. In computing for the
discrimination index of each item, the formula below can be used:

Discrimination Index = (RU - RL) / (½T)

Where:
RU = Number of students in the upper group who answered the item correctly
RL = Number of students in the lower group who answered the item correctly
½T = One half of the total number of students included in the analysis (number of students in
one of the two groups)

Example. Compute for the discrimination index of the given first five test items.

Table 3 Discrimination Index of the Sample Test Items

Item No.  Upper   Lower   Difficulty   Discriminating   Verbal Interpretation
          Group   Group   Index        Index            (Discriminating Index)
1          10       9       0.86          0.09          Poor
2           8       4       0.55          0.36          Good
3           8       5       0.59          0.27          Moderate
4          10      10       0.91          0             Poor
5           5       0       0.23          0.45          High

Solutions:
Item No. 1: Discrimination Index = (RU1 - RL1) / ½T = (10 – 9) / 11 = 0.09
Item No. 2: Discrimination Index = (RU2 - RL2) / ½T = (8 – 4) / 11 = 0.36
Item No. 3: Discrimination Index = (RU3 - RL3) / ½T = (8 – 5) / 11 = 0.27
Item No. 4: Discrimination Index = (RU4 - RL4) / ½T = (10 – 10) / 11 = 0
Item No. 5: Discrimination Index = (RU5 - RL5) / ½T = (5 – 0) / 11 = 0.45


Guide in Interpreting the computed Discrimination Index
0.41 – 1.00 - High Discriminating Index
0.31 – 0.40 - Good Discriminating Index
0.21 – 0.30 - Moderate Discriminating Index
0.20 and below - Poor Discriminating Index
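
A minimal Python sketch of the same calculation is given below. The counts come from Table 1; the function names are illustrative, and rounding to two decimals before applying the guide is an assumption made so that values between the listed bands are classified consistently.

```python
# (right in upper group, right in lower group) per item, from Table 1
RIGHT_COUNTS = {1: (10, 9), 2: (8, 4), 3: (8, 5), 4: (10, 10), 5: (5, 0)}
GROUP_SIZE = 11                      # this is ½T: the size of one group


def discrimination_index(right_upper, right_lower, group_size):
    """Discrimination Index = (RU - RL) / (½T)."""
    return (right_upper - right_lower) / group_size


def interpret_discrimination(index):
    """Apply the interpretation guide to an index rounded to two decimals."""
    value = round(index, 2)
    if value >= 0.41:
        return "High"
    if value >= 0.31:
        return "Good"
    if value >= 0.21:
        return "Moderate"
    return "Poor"                    # 0.20 and below, including negative values


discrimination = {}
for item, (ru, rl) in RIGHT_COUNTS.items():
    d = discrimination_index(ru, rl, GROUP_SIZE)
    discrimination[item] = (round(d, 2), interpret_discrimination(d))
    print(item, ru, rl, *discrimination[item])
```

The results should agree with Table 3. A hypothetical item answered correctly by 3 upper-group and 6 lower-group students would give (3 - 6) / 11, a negative index that this sketch labels "Poor".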

Based on the discrimination indices, items 2, 3 and 5 can be retained
due to their acceptable discriminating index. An item’s discrimination index
refers to the degree to which it separates the upper and lower groups of
students. Based on the data, items 2, 3, and 5 possess this characteristic: in
each of them, more students from the upper group than from the lower group
answered the item correctly. Conversely, in items 1 and 4, an equal or almost
equal number of students from the lower group and the upper group got the item
right, which means that these items do not separate the upper and lower groups.
In extreme cases, a negative value for the discriminating index might
occur. This would mean that more students in the lower group than in the upper
group answered the item correctly. If this happens, the item should be revised
at the very least, as there might be a high degree of ambiguity in the test
item. Remember, however, that the data from item analysis tell us only which
specific items are functioning poorly; they do not tell us the reasons or
causes of the poor functioning.

3. Analysis of the Alternatives’ Effectiveness. Analyse the plausibility
(effectiveness) of the alternatives or the distracters. This can be done by using
the tabulation made in Step 3 and adding a row for the difference between the
number of students who selected each alternative from the upper and lower
groups.

Example: A table for analysing the effectiveness of the alternatives in the first five items
of the sample test is shown in Table 4. (The entries that correspond to the correct
answers are marked with an asterisk.)

Table 4 Plausibility of the Alternatives from the Sample Test Items

Item No.  Group (Upper and      Alternatives          Verbal Interpretation
          Lower 27%)            A     B     C     D
1         Upper                 0     0     1    10*  Options B and C are not
          Lower                 1     0     1     9*  Plausible; A and D are
          Difference           -1     0     0     1*  Plausible
2         Upper                 8*    1     1     1   All options are Plausible
          Lower                 4*    2     2     3
          Difference            4*   -1    -1    -2
3         Upper                 8*    1     2     0   All options are Plausible
          Lower                 5*    2     3     1
          Difference            3*   -1    -1    -1
4         Upper                 1     0     0    10*  Option B is Plausible; all
          Lower                 0     1     0    10*  the other options are not
          Difference            1    -1     0     0*  Plausible
5         Upper                 3     2     1     5*  All options are Plausible
          Lower                 5     4     2     0*
          Difference           -2    -2    -1     5*

Sample Solution for Item No. 1, Option A

Difference = UA – LA
= 0 – 1
= -1

Where:
UA = No. of students in the upper group who selected Option A
LA = No. of students in the lower group who selected Option A

Table 5 Guide in Determining the Plausibility or Effectiveness of the Alternatives

Keyed Response (Correct Alternative)         Distracters (Incorrect Alternatives)
Condition                  Interpretation    Condition                  Interpretation
Positive Difference        Plausible         Zero/Positive Difference   Not Plausible
Zero/Negative Difference   Not Plausible     Negative Difference        Plausible
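
The rules in Table 5 can be applied mechanically, as in the following Python sketch. The counts and keyed responses are taken from the tabulation of the five sample items; the dictionary layout and the function name `plausibility` are choices made for illustration only.

```python
ALTERNATIVES = "ABCD"

# Keyed response and upper/lower counts per alternative, from the tabulation.
ITEMS = {
    1: {"key": "D", "upper": [0, 0, 1, 10], "lower": [1, 0, 1, 9]},
    2: {"key": "A", "upper": [8, 1, 1, 1], "lower": [4, 2, 2, 3]},
    3: {"key": "A", "upper": [8, 1, 2, 0], "lower": [5, 2, 3, 1]},
    4: {"key": "D", "upper": [1, 0, 0, 10], "lower": [0, 1, 0, 10]},
    5: {"key": "D", "upper": [3, 2, 1, 5], "lower": [5, 4, 2, 0]},
}


def plausibility(key, upper, lower):
    """Return {alternative: (difference, is_plausible)} following Table 5."""
    result = {}
    for alt, u, l in zip(ALTERNATIVES, upper, lower):
        diff = u - l                     # Difference = upper count - lower count
        if alt == key:
            plausible = diff > 0         # keyed response: positive difference
        else:
            plausible = diff < 0         # distracter: negative difference
        result[alt] = (diff, plausible)
    return result


analysis = {n: plausibility(**data) for n, data in ITEMS.items()}
for n, result in analysis.items():
    not_plausible = [alt for alt, (_, ok) in result.items() if not ok]
    print(n, result, "not plausible:", not_plausible or "none")
```

For item 1 the sketch flags B and C as not plausible, and for item 4 it flags A, C and D, which agrees with the verbal interpretations in Table 4.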

In the resulting table of a simple plausibility analysis, it can be
deduced that some of the alternatives for items 1 and 4 are not plausible while
all the alternatives for items 2, 3 and 5 were found to be plausible. These results
are in line with earlier findings from the other indices that items 1 and 4 are
somewhat problematic. Simple plausibility analysis gives a hint as to why both
items obtained a poor discriminating index and were rated very easy. Teachers
can either revise the items, including their corresponding alternatives, or
discard the items entirely.
