PL4201 – Psychometrics and Psychological Testing
Item Analysis

Item Analysis in an Educational Setting

Sometimes, especially in an educational context, we are interested in the relation between an item and the overall test score (e.g., which items on a math test best discriminate between students who scored high and those who scored low on the test). We might use extreme groups (the top 25-33% versus the bottom 25-33% of the students) to examine some item statistics.

Example: We have 60 students in a class and we take the top and bottom 33% as extreme groups. Therefore, there are 20 students in each group. The table gives the number of students in each group who passed each item.

Item                      1    2    3    4    5    6
Top 33% (T)              15   20    3   10   17   12
Middle 33% (M)            9   20    1   11   10   11
Bottom 33% (B)            7   16    0   16   10   11
Difficulty (T + M + B)   31   56    4   37   37   34
Discrimination (T - B)    8    4    3   -6    7    1

Items 1 and 5 are good items. Items 2 and 3 have extreme difficulty levels: Item 2 is too easy and Item 3 is too difficult. Items 4 and 6 have low indices of discrimination, so they do not differentiate top from bottom students well; Item 4 even discriminates in the wrong direction, because more bottom students than top students passed it. One can also convert the frequencies to proportions and take the difference in proportions (e.g., for Item 4, T = 10/20 = .50 passing and B = 16/20 = .80 passing, so the difference is -.30).
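The arithmetic in the example can be reproduced with a short Python sketch. The pass counts come from the example above; the cutoffs used to label an item as "good" (`P_MIN`, `P_MAX`, `D_MIN`) are illustrative assumptions chosen for this sketch, not values given in these notes.

```python
# Pass counts per item (Items 1-6) for each third of the class, from the example.
top = [15, 20, 3, 10, 17, 12]
middle = [9, 20, 1, 11, 10, 11]
bottom = [7, 16, 0, 16, 10, 11]

GROUP_SIZE = 20   # students per group
CLASS_SIZE = 60   # students in the class

# Difficulty: total number of students passing the item (T + M + B).
difficulty = [t + m + b for t, m, b in zip(top, middle, bottom)]

# Discrimination: top-group passes minus bottom-group passes (T - B).
discrimination = [t - b for t, b in zip(top, bottom)]

# The same discrimination index expressed as a difference in proportions.
prop_diff = [(t - b) / GROUP_SIZE for t, b in zip(top, bottom)]

# ASSUMED cutoffs, for illustration only: keep items passed by 20-80% of
# the class whose proportion difference is at least .30.
P_MIN, P_MAX, D_MIN = 0.20, 0.80, 0.30

good_items = [
    item_no
    for item_no, (diff, d) in enumerate(zip(difficulty, prop_diff), start=1)
    if P_MIN <= diff / CLASS_SIZE <= P_MAX and d >= D_MIN
]

print("Difficulty:    ", difficulty)
print("Discrimination:", discrimination)
print("Good items:    ", good_items)
```

With these assumed cutoffs the sketch singles out Items 1 and 5, in line with the reading of the table above; different cutoffs would shift the borderline cases.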

One can also calculate the phi coefficient in the above example using the T and B groups. For Item 1:

Groups       Pass = 1      Pass = 0      Total
T (1)        15 (.375)      5 (.125)      .50
B (0)         7 (.175)     13 (.325)      .50
Total          .55           .45         1.00

NB: Proportions are in parentheses (e.g., for cell (1, 1), proportion = 15/40 = .375). Writing p_ij for the proportion in cell (1, 1), p_i and p_j for the row and column marginal proportions, and q = 1 - p, the phi correlation is

$$\phi = \frac{p_{ij} - p_i p_j}{\sqrt{(p_i q_i)(p_j q_j)}} = \frac{0.375 - (0.5)(0.55)}{\sqrt{(0.5)(0.5)(0.55)(0.45)}} = 0.40$$

Or, one can compute a biserial correlation between passing an item (1, 0) and the total score (a continuous scale) for the 60 students. The biserial correlation is used here because the items are assessing a continuous variable (math ability) that is artificially dichotomized (i.e., pass or fail on a math item).
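As a rough check on the two indices, the sketch below computes phi for Item 1 from the 2 x 2 counts and a biserial correlation using the textbook formula r_b = ((M1 - M0) / s) * (pq / y), where y is the height of the standard normal curve at the point that splits the area into p and q. The function names are made up for this sketch, and the `item` and `total` scores used for the biserial are hypothetical data invented for illustration, since the notes do not list individual total scores.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev


def phi_from_counts(a, b, c, d):
    """Phi for the 2 x 2 table [[a, b], [c, d]], via the proportion formula."""
    n = a + b + c + d
    p_ij = a / n        # proportion in cell (1, 1): top group and pass
    p_i = (a + b) / n   # row marginal: top group
    p_j = (a + c) / n   # column marginal: pass
    return (p_ij - p_i * p_j) / sqrt(p_i * (1 - p_i) * p_j * (1 - p_j))


def biserial(item, total):
    """Biserial r between a 0/1 item and a continuous total score."""
    passed = [t for x, t in zip(item, total) if x == 1]
    failed = [t for x, t in zip(item, total) if x == 0]
    p = len(passed) / len(item)
    q = 1 - p
    # Ordinate of the standard normal at the cut separating areas p and q.
    std_normal = NormalDist()
    y = std_normal.pdf(std_normal.inv_cdf(p))
    return (mean(passed) - mean(failed)) / pstdev(total) * (p * q / y)


# Item 1 from the table: T passes 15, fails 5; B passes 7, fails 13.
phi_item1 = phi_from_counts(15, 5, 7, 13)
print(round(phi_item1, 2))  # 0.4

# HYPOTHETICAL data: ten students' pass/fail on one item and their totals.
item = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
total = [28, 25, 24, 22, 21, 18, 17, 16, 12, 10]
r_bis = biserial(item, total)
print(round(r_bis, 2))
```

The population standard deviation (`pstdev`) is used here as one common convention; using the sample standard deviation instead would change the biserial value slightly.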

The phi coefficient and the biserial correlation indices for each item can be used to select good items for a subsequent test. In this context, we would want to select items with high indices. The underlying assumption in this analysis is that students who score high on the test overall should do well on most of the items (compared to low-scoring students). If an item functions in such a way that high-scoring students do not differ from low-scoring students, then one might want to investigate the item closely. As such, item analysis can be used to identify possible deficiencies in the items (i.e., too difficult, ambiguous phrasing, or two answer options that are both very likely in an MCQ) so as to either revise or discard them. Or, the teacher might want to clarify certain teaching materials again to ensure that the students understand them.

Item Analysis Exercise

Dr Smartypants gave a 30-item pop quiz on psychometrics to his class. All 30 questions were Multiple-Choice Questions in which students had 4 alternatives to choose from. Presented below in the table is a selection of 5 items from this pop quiz. Some relevant item statistics are shown in the table.

[Table: for each of Items 1 to 5, the alternatives A to D with the keyed answer marked *, the proportion endorsing each alternative, the proportion correct, the biserial r, and the index of discrimination. The cell values are scrambled in this copy and cannot be matched to items.]

Questions:
1. Can you identify problematic items? Explain why they are problematic.
2. Which item would you say is the "best" item? Why?
3. Which items would you choose for a revised test?
