
Item Analysis

By

Dr. Moawia Ahmed Elbadri



Item Analysis

Item analysis provides a way of measuring the quality of questions: seeing how appropriate they were for the respondents and how well they measured their ability.
Item Analysis

Item analysis is a process that examines student responses to individual test items (questions) in order to assess the quality of those items and of the test as a whole.
Parts of an A-type MCQ Item
(1-Best Answer (OBA))

[Figure: a near-perfect 1-Best Answer MCQ item]
The whole thing is called an Item
Purpose of Item
Analysis

Evaluates the quality of each item.


Rationale: the quality of the items determines the quality of the test (i.e., its reliability and validity).
Reliability and Validity

Reliability
is the "consistency" or "repeatability" of your measures.

It is the degree to which an assessment tool produces stable and consistent results.
Reliability

Test-retest reliability
is a measure of reliability obtained by
administering the same test twice over a
period of time to a group of individuals.
Reliability

Inter-rater reliability (inter-rater agreement)
is the degree of agreement among raters.
Validity

Test validity is the extent to which a test accurately measures what it is supposed to measure.
Reliability and Validity

You want your test to be both reliable and valid.
Item Analysis
Item analysis information can tell us:

if an item (i.e. the question) was too easy or too difficult (item difficulty)

how well it discriminated between high and low scorers on the test (item discrimination)

whether all of the alternatives (distractors) functioned as intended (distractor analysis)
Item Analysis
Difficulty

Discrimination

Reliability

Distractor analysis
Item Difficulty
Difficulty level (p, or percentage passing)
Item Difficulty

In item analysis, the first step is to find the difficulty value of each item, also called its index of difficulty.
Item Difficulty (p value)

Definition:
a measure of the proportion of students who answered a test item correctly
Item Difficulty

The item difficulty is usually expressed as a proportion (a value between 0 and 1) or as a percentage (%).
Interpretation of Item Difficulty
For medical school tests, where there is an emphasis on mastery, most items should have a p-value between 0.31 and 0.69.
Interpretation of the Difficulty Index

Diff. index    Interpretation
≤ 0.20         Very difficult (should be revised)
0.21 - 0.30    Difficult (retained in the Q. bank)
0.31 - 0.69    Average (retained in the Q. bank)
0.70 - 0.80    Easy (revised before re-use)
≥ 0.81         Very easy (discarded or carefully reviewed)

Optimum Difficulty*
*corrected for guessing

0.75     True-False
0.67     MCQ 3 alternatives
0.625    MCQ 4 alternatives
0.60     MCQ 5 alternatives
0.50     Essay test
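
The optimum values listed above are consistent with taking the midpoint between the chance score (1 divided by the number of alternatives) and a perfect score of 1.0. The slides do not state this rule, so treat it as an assumption inferred from the numbers. A minimal Python sketch under that assumption (the function name and the use of None for essay items are illustrative choices):

```python
from typing import Optional


def optimum_difficulty(n_alternatives: Optional[int]) -> float:
    """Midpoint between the chance score and a perfect score of 1.0.

    Assumption: this midpoint rule is inferred from the table values,
    it is not stated in the slides. n_alternatives is the number of
    answer choices; None stands for an open-ended item (such as an
    essay) where blind guessing cannot succeed.
    """
    chance = 0.0 if n_alternatives is None else 1.0 / n_alternatives
    return (1.0 + chance) / 2.0


formats = {
    "True-False": 2,
    "MCQ 3 alternatives": 3,
    "MCQ 4 alternatives": 4,
    "MCQ 5 alternatives": 5,
    "Essay test": None,
}
for name, k in formats.items():
    print(f"{name}: {optimum_difficulty(k):.3f}")
```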
Examples
Number of students who answered each item = 50

Item No.   No. Correct Answers   % Correct   Difficulty Level
1          15                    30          ???
2          25                    50          ???
3          35                    70          ???
4          45                    90          ???
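
One way to work through this exercise is to compute p as the number of correct answers divided by the number of students, and then look the value up in the interpretation table for the difficulty index. The sketch below assumes the open-ended rows of that table mean "0.20 or less" and "0.81 or more", and that p is rounded to two decimals before the lookup. The function names are illustrative.

```python
def difficulty_index(n_correct: int, n_students: int) -> float:
    """Proportion of students who answered the item correctly (p value)."""
    if n_students <= 0:
        raise ValueError("n_students must be positive")
    return n_correct / n_students


def difficulty_level(p: float) -> str:
    """Label a p value using the cut-offs of the interpretation table.

    Assumption: p is rounded to two decimals so that the table's
    ranges (0.21-0.30, 0.31-0.69, ...) leave no gaps.
    """
    p = round(p, 2)
    if p <= 0.20:
        return "Very difficult"
    if p <= 0.30:
        return "Difficult"
    if p <= 0.69:
        return "Average"
    if p <= 0.80:
        return "Easy"
    return "Very easy"


N_STUDENTS = 50
correct_answers = {1: 15, 2: 25, 3: 35, 4: 45}  # item no. -> no. correct
levels = {}
for item, n_correct in correct_answers.items():
    p = difficulty_index(n_correct, N_STUDENTS)
    levels[item] = difficulty_level(p)
    print(f"Item {item}: p = {p:.2f} ({p:.0%}), {levels[item]}")
```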
Discrimination Index (d)
Discrimination Index
distinguishes, for each item, between the performance of students who did well on the exam and students who did poorly.
Discrimination index
refers to the degree to which success or failure on an item indicates possession of the achievement being measured.
Formula: Item Discrimination

Student    Total Score (%)   Q-I   Q-II   Q-III
Ram        92                1     0      1
Swetha     90                0     0      1
Ajmal      85                1     0      1
John       80                1     0      1
Prabhu     78                1     0      1
Rajesh     70                1     1      0
Asif       65                1     1      1
Manish     55                1     0      0
Sam        48                1     1      0
Manu       45                0     0      0

Item   # Correct (Upper Gr)   # Correct (Lower Gr)   Difficulty Index   Discrimination Index
Q1     4                      4                      ?                  ?
Q2     0                      3                      ?                  ?
Q3     5                      1                      ?                  ?
Discrimination Index
D = U - L

U = (number in the upper group with a correct response) / (total number in the upper group)

L = (number in the lower group with a correct response) / (total number in the lower group)

The higher the value of D, the more adequately the item discriminates (the highest value is 1.0).
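
As an illustration, the formula D = U - L can be applied to the ten-student table above. The sketch below ranks the students by total score and assumes the upper and lower groups are simply the top and bottom halves (five students each), which matches the "# Correct" counts given for Q1 to Q3. The difficulty index is taken over both groups together. Variable names are illustrative.

```python
# (name, total score %, (Q-I, Q-II, Q-III)) transcribed from the table
students = [
    ("Ram", 92, (1, 0, 1)),
    ("Swetha", 90, (0, 0, 1)),
    ("Ajmal", 85, (1, 0, 1)),
    ("John", 80, (1, 0, 1)),
    ("Prabhu", 78, (1, 0, 1)),
    ("Rajesh", 70, (1, 1, 0)),
    ("Asif", 65, (1, 1, 1)),
    ("Manish", 55, (1, 0, 0)),
    ("Sam", 48, (1, 1, 0)),
    ("Manu", 45, (0, 0, 0)),
]


def discrimination_index(upper_correct, upper_total, lower_correct, lower_total):
    """D = U - L, with U and L the proportions correct in each group."""
    return upper_correct / upper_total - lower_correct / lower_total


# Rank by total score; assumption: top half = upper group, bottom half = lower
ranked = sorted(students, key=lambda s: s[1], reverse=True)
half = len(ranked) // 2
upper, lower = ranked[:half], ranked[-half:]

results = {}
for q, label in enumerate(("Q1", "Q2", "Q3")):
    u = sum(s[2][q] for s in upper)  # correct answers in the upper group
    l = sum(s[2][q] for s in lower)  # correct answers in the lower group
    p = (u + l) / (len(upper) + len(lower))  # difficulty over both groups
    d = discrimination_index(u, len(upper), l, len(lower))
    results[label] = (u, l, round(p, 2), round(d, 2))
    print(label, results[label])
```

In this data Q3 separates the two groups well, Q1 does not separate them at all, and Q2 is answered correctly only by lower-group students, which is the pattern that calls for checking the answer key.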

Interpretation of the Discrimination Index

Disc. index   Interpretation
< 0.2         Poorly discriminating (discarded or reviewed carefully before re-use)
0.0           Not discriminating (reject Q)
< 0.0         Badly discriminating (reject Q); check the key answer first
Examples
Number of students per group = 100

Item No.   Number of Correct Answers in Group   Item Discrimination Index
           Upper 1/4      Lower 1/4
1          90             20                    0.7
2          80             70                    0.1
3          100            0                     1
4          100            100                   0
5          50             50                    0
6          20             60                    -0.4
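
These example values can be checked with a few lines of Python. Because both groups have the same size here, D reduces to the difference in correct answers divided by the group size. The interpret function applies only the three rows of the interpretation table; the label returned for values of 0.2 and above is a placeholder added for this sketch, since the table gives no wording for that range.

```python
def discrimination_index(upper_correct: int, lower_correct: int, group_size: int) -> float:
    """D = U - L when the upper and lower groups have the same size."""
    return (upper_correct - lower_correct) / group_size


def interpret(d: float) -> str:
    """Apply the rows of the discrimination-index interpretation table."""
    if d < 0.0:
        return "Badly discriminating (reject, check the key answer first)"
    if d == 0.0:
        return "Not discriminating (reject)"
    if d < 0.2:
        return "Poorly discriminating (discard or review before re-use)"
    # Assumption: the table has no row for D >= 0.2, so this label is ours
    return "Not flagged by the table"


GROUP_SIZE = 100
examples = {1: (90, 20), 2: (80, 70), 3: (100, 0), 4: (100, 100), 5: (50, 50), 6: (20, 60)}

indices = {}
for item, (upper, lower) in examples.items():
    indices[item] = discrimination_index(upper, lower, GROUP_SIZE)
    print(f"Item {item}: D = {indices[item]:+.1f}  {interpret(indices[item])}")
```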


Item no.   Upper 27%   Lower 27%   Difficulty Index   Discrimination Index   Remarks
1          14          12          0.81               0.13                   Revised
2          10          6           0.50               0.25                   Retained
3          11          7           0.56               0.25                   Retained
4          9           2           0.34               0.44                   Retained
5          12          6           0.56               0.38                   Retained
6          6           14          0.63               -0.50                  Rejected
7          13          4           0.53               0.56                   Retained
8          3           10          0.41               -0.44                  Rejected
9          13          12          0.78               0.06                   Rejected
10         8           6           0.44               0.13                   Revised

No. of pupils tested: 60
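
The figures in this table can be reproduced if each 27% group is taken to contain 16 pupils (27% of 60 is 16.2). That group size is an inference, not something the slide states. Under it, the difficulty index is the share of correct answers across both groups and the discrimination index is the difference in correct answers divided by the group size. A small sketch that recomputes every row and compares it with the printed values, allowing for two-decimal rounding:

```python
N_PUPILS = 60
GROUP_SIZE = round(0.27 * N_PUPILS)  # assumption: 16 pupils per 27% group

# item no. -> (upper correct, lower correct, table difficulty, table discrimination)
table = {
    1: (14, 12, 0.81, 0.13),
    2: (10, 6, 0.50, 0.25),
    3: (11, 7, 0.56, 0.25),
    4: (9, 2, 0.34, 0.44),
    5: (12, 6, 0.56, 0.38),
    6: (6, 14, 0.63, -0.50),
    7: (13, 4, 0.53, 0.56),
    8: (3, 10, 0.41, -0.44),
    9: (13, 12, 0.78, 0.06),
    10: (8, 6, 0.44, 0.13),
}


def difficulty(upper: int, lower: int, group_size: int) -> float:
    """Proportion correct across the upper and lower groups combined."""
    return (upper + lower) / (2 * group_size)


def discrimination(upper: int, lower: int, group_size: int) -> float:
    """D = U - L for equal-sized groups."""
    return (upper - lower) / group_size


TOLERANCE = 0.006  # the table is rounded to two decimals
mismatches = []
for item, (upper, lower, p_table, d_table) in table.items():
    p = difficulty(upper, lower, GROUP_SIZE)
    d = discrimination(upper, lower, GROUP_SIZE)
    if abs(p - p_table) > TOLERANCE or abs(d - d_table) > TOLERANCE:
        mismatches.append(item)
    print(f"Item {item}: difficulty {p:.4f}, discrimination {d:+.4f}")

print("Rows that do not match the table:", mismatches)
```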
Reliability
Test analysis
Test reliability is a measure of the
accuracy, consistency, or precision of the
test scores.
Reliability Coefficient
This is a measure of the likelihood of
obtaining similar results if you re-administer
the exam to another group of similar students.

1. Kuder-Richardson 20
2. Kuder-Richardson 21
3. Cronbach's alpha

The most useful measure is generally the Kuder-Richardson Formula 20 (KR-20).
Reliability coefficients
Kuder-Richardson Formula 20 (KR-20): specific for MCQs.
It measures inter-item consistency, or how well your exam measures a single construct.
Reliability coefficients

Internal consistency reliability is a measure of how well the items on the test measure the same construct or idea.
Reliability coefficients
Kuder-Richardson Formula (KR-20)
Ranges from 0 to 1 (the higher the better)
A value above 0.5 is good on a teacher-made test
Best when a test measures a unified body of content
Lots of very difficult items or poorly written items can skew this
The higher the variability in scores, the higher the reliability
Interpreting KR-20
The KR-20 statistic is influenced by:
Number of test items on the exam
Student performance on each item
Variance for the set of student test scores
Range: 0.00 to 1.00
Values near 0.00: weak relationship among items on the test
Values near 1.00: strong relationship among items on the test
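
The three influences listed above are exactly the inputs of the standard KR-20 formula, KR-20 = (k / (k - 1)) * (1 - sum(p_i * q_i) / variance of total scores), where k is the number of items, p_i the proportion answering item i correctly and q_i = 1 - p_i. The slides do not print the formula, so the sketch below is a textbook version, not the presenter's own. It assumes items scored 0 or 1 and uses the population variance of the total scores (some texts use the sample variance instead). The response matrix is hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a matrix of 0/1 scores (one row per student, one column per item)."""
    n_students = len(responses)
    k = len(responses[0])
    if n_students < 2 or k < 2:
        raise ValueError("need at least two students and two items")

    # Student performance on each item: proportion correct p_i, and q_i = 1 - p_i
    p = [sum(row[i] for row in responses) / n_students for i in range(k)]
    sum_pq = sum(p_i * (1 - p_i) for p_i in p)

    # Variance of the students' total scores (population variance, an assumption)
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)
    if total_variance == 0:
        raise ValueError("all students have the same total score")

    return (k / (k - 1)) * (1 - sum_pq / total_variance)


# Hypothetical data: 5 students, 4 items, from strongest to weakest student
example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(f"KR-20 = {kr20(example):.2f}")
```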
Interpretation of KR-20 (reliability)

KR-20          Interpretation
≥ 0.9          Excellent reliability; at the level of the best standardized tests
0.71 - 0.89    Very good for a classroom test
0.61 - 0.70    Good; in the range of most classroom tests. Probably a few items could be improved.
0.51 - 0.60    Low reliability; test needs revision and supplementation
< 0.50         Questionable reliability


Distractor Analysis
Distractor analysis is an extension of item
analysis, using techniques that are similar
to item difficulty and item discrimination.

Distractor efficiency:
deals with the way a distractor lures test takers, especially those with low abilities.
Any distractor not picked by at least 5% of the students is NOT a good distractor.
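
The 5% rule is easy to apply once the number of students choosing each option is known. The sketch below is a minimal illustration: the function name, the option counts and the answer key are all hypothetical, and the threshold is taken as a share of all students who answered the item.

```python
def weak_distractors(option_counts, key, threshold=0.05):
    """Return the distractors chosen by fewer than `threshold` of the students.

    option_counts maps each option label to the number of students who
    chose it; key is the label of the correct answer.
    """
    total = sum(option_counts.values())
    if total == 0:
        raise ValueError("no responses recorded")
    return [
        option
        for option, count in option_counts.items()
        if option != key and count / total < threshold
    ]


# Hypothetical item answered by 100 students; "A" is the correct answer
counts = {"A": 60, "B": 25, "C": 12, "D": 3}
print(weak_distractors(counts, key="A"))
```

Here only option D falls below 5% of the class, so it would be the one to rewrite or replace.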
Distractor Analysis
An ideal item is one that all students in the upper group answer correctly and all students in the lower group answer wrongly.

The responses of the lower group should be evenly distributed among the incorrect alternatives.
Distractor Analysis
The ideal situation would be for each of the 4 distractors to be selected by an equal number of the students who did not get the answer correct.
Distractor Analysis & Item Discrimination
The item discrimination formula can also be used in distractor analysis.

The concept of upper and lower groups still applies, but the analysis and the expectation differ.
Distractor Analysis & Item Discrimination
Instead of expecting a positive (+) value, we should logically expect a negative (-) value, as more students from the lower group should select distractors.

Each distractor can have its own item discrimination value, in order to analyse how the distractors work and ultimately refine the effectiveness of the test item itself.
Item Analysis
Distractor Analysis
          Distractor A   Distractor B   Distractor C   Distractor D
Item 1    8              3              1              0
Item 2    2              8              2              0
Item 3    4              8              0              0
Item 4    1              3              8              0
Item 5    5              0              0              7
Distractor Analysis: Examples

Item 1                         A*    B     C     D     E     Omit
% of students in upper         20    5     0     0     0     0
% of students in the middle    15    10    10    10    5     0
% of students in lower         5     5     5     10    0     0

Item 2                         A     B     C     D*    E     Omit
% of students in upper         0     5     5     15    0     0
% of students in the middle    0     10    15    5     20    0
% of students in lower         0     5     10    0     10    0

(*) marks the correct answer.
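
A discrimination value can be computed for every option of these two items by applying D = U - L to each column. The sketch below assumes the percentages are shares of the whole class, so that the upper and lower rows each sum to their group's share (25% in both items), and it leaves out the Omit column because it is 0 throughout. The helper name is illustrative.

```python
def option_discrimination(upper, lower):
    """Per-option D = U - L, from the percentage of the class choosing each option.

    Assumption: each row's percentages sum to that group's share of the class,
    so dividing by the row sum gives the proportion within the group.
    """
    upper_share = sum(upper.values())
    lower_share = sum(lower.values())
    return {
        option: round(upper[option] / upper_share - lower[option] / lower_share, 2)
        for option in upper
    }


# Percentages transcribed from the two example items (Omit is 0 everywhere)
item1 = option_discrimination(
    upper={"A": 20, "B": 5, "C": 0, "D": 0, "E": 0},   # key: A
    lower={"A": 5, "B": 5, "C": 5, "D": 10, "E": 0},
)
item2 = option_discrimination(
    upper={"A": 0, "B": 5, "C": 5, "D": 15, "E": 0},   # key: D
    lower={"A": 0, "B": 5, "C": 10, "D": 0, "E": 10},
)
print("Item 1:", item1)
print("Item 2:", item2)
```

Under these assumptions the keyed option of each item gets a positive value and distractors that mainly attract the lower group get negative ones, as expected. Options with a value of zero, such as B in both items, attract the two groups equally, and option A of Item 2 is chosen by nobody at all.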