
Item Analysis: Improving Multiple Choice Tests

Crystal Ramsay
September 27, 2011
Schreyer Institute for Teaching Excellence
http://www.schreyerinstitute.psu.edu/

This workshop is designed to help you do three things:

• To interpret statistical indices provided by the university’s Scanning Operations
• To differentiate between well-performing items and poor-performing items
• To make decisions about poor-performing items

We give tests for 4 primary reasons.

• To find out if students learned what we intended
• To separate those who learned from those who didn’t
• To increase learning and motivation
• To gather information for adapting or improving instruction

Multiple choice items are composed of 4 basic components: the stem, the options, the distracters (incorrect options), and the key (the correct option).

Example:
Stem: The rounded filling of an internal angle between two surfaces of a plastic molding is known as the
Options:
A. rib. (distracter)
B. fillet. (key)
C. chamfer. (distracter)
D. gusset plate. (distracter)

An item analysis focuses on 4 major pieces of information provided in the test score report:

• Test score reliability
• Item difficulty
• Item discrimination
• Distracter information

Test score reliability is an index of the likelihood that scores would remain consistent over time if the same test were administered repeatedly to the same learners.

• Reliability coefficients range from .00 to 1.00.
• Ideal score reliabilities are >.80.
• Higher reliabilities mean less measurement error.

Now look at the test score reliability from your exam.
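
If you score a test yourself and want a comparable figure, a KR-20 coefficient (one common reliability index for right/wrong items) can be computed from a 0/1 score matrix. A minimal sketch, using made-up data rather than the Scanning Operations report:

```python
import numpy as np

def kr20(scores):
    """KR-20 reliability for a students-by-items matrix of 0/1 item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                         # number of items
    p = scores.mean(axis=0)                     # proportion correct per item
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - (p * (1 - p)).sum() / total_var)

# Hypothetical scores: 5 students x 4 items (1 = correct, 0 = incorrect)
data = [[1, 1, 0, 1],
        [1, 0, 0, 1],
        [1, 1, 1, 1],
        [0, 0, 0, 1],
        [1, 1, 1, 0]]
print(f"KR-20 = {kr20(data):.2f}")
```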


Item Difficulty is the percentage of students who answered an item correctly. It is represented in the Response Table as KEY-%.

RESPONSE TABLE - FORM A
ITEM NO.  OMIT%  A%   B%   C%   D%   E%   KEY  KEY-%  ITEM EFFECT
1         0      0    18   82   0    0    C    82      0.22
2         0      79   0    0    21   0    A    79      0.23
3         0      4    7    89   0    0    C    89     -0.12

Item difficulty ranges from 0% to 100%. Easier items have higher item difficulty values.

RESPONSE TABLE - FORM A
ITEM NO.  OMIT%  A%   B%   C%   D%   E%   KEY  KEY-%  ITEM EFFECT
4         0      0    4    96   0    0    C    96      0.18
5         0      100  0    0    0    0    A    100     0.00
6         0      0    0    5    0    95   E    95     -0.11

More difficult items have lower item difficulty values.

RESPONSE TABLE - FORM A
ITEM NO.  OMIT%  A%   B%   C%   D%   E%   KEY  KEY-%  ITEM EFFECT
8         0      0    43   0    57   0    D    57      0.46
9         0      7    4    0    75   14   D    75     -0.19
10        0      5    12   27   31   25   D    31      0.10
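
If you score your own exams, item difficulty is simply the share of students who chose the keyed option. A minimal sketch, using made-up responses and a made-up answer key:

```python
import numpy as np

# Hypothetical raw responses (rows = students, columns = items) and the answer key
responses = np.array([["C", "A", "C"],
                      ["B", "A", "C"],
                      ["C", "D", "C"],
                      ["C", "A", "B"]])
key = np.array(["C", "A", "C"])

difficulty = (responses == key).mean(axis=0) * 100  # percent answering each item correctly
for item, pct in enumerate(difficulty, start=1):
    print(f"Item {item}: {pct:.0f}% correct")
```
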
What counts as an ‘ideal’ item difficulty statistic depends on 2 factors:

• The number of alternatives for each item
• Your reason for asking the question

Sometimes we include very easy or very difficult items on purpose.

• Did I deliberately pose difficult items to challenge my students’ thinking?
• Did I deliberately pose easy items to test basic information or to boost students’ confidence?

Now look at the item difficulties from your exam. Which items were easier for your students? Which items were more difficult?

Item Discrimination is the degree to which students with high overall exam scores also got a particular item correct. It is represented as Item Effect because it tells how well an item ‘performed’.

RESPONSE TABLE - FORM A
ITEM NO.  OMIT%  A%   B%   C%   D%   E%   KEY  KEY-%  ITEM EFFECT
1         0      0    18   82   0    0    C    82      0.22
2         0      79   0    0    21   0    A    79      0.23
3         0      4    7    89   0    0    C    89     -0.12

Item discrimination ranges from -1.00 to 1.00 and should be >.2.

A well-performing item:

RESPONSE TABLE - FORM A
ITEM NO.  OMIT%  A%   B%   C%   D%   E%   KEY  KEY-%  ITEM EFFECT
8         0      0    43   0    57   0    D    57      0.46

A poor-performing item:

RESPONSE TABLE - FORM A
ITEM NO.  OMIT%  A%   B%   C%   D%   E%   KEY  KEY-%  ITEM EFFECT
6         0      0    0    5    0    95   E    95     -0.11
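
The Item Effect column behaves like a correlation between getting an item right and doing well on the exam overall. If you need a comparable figure for data you score yourself, one common approach is to correlate each item with the total score on the remaining items; the sketch below uses made-up 0/1 scores and may not match the exact formula Scanning Operations uses:

```python
import numpy as np

def item_effects(scores):
    """Correlate each item (0/1) with the total score on the remaining items."""
    scores = np.asarray(scores, dtype=float)
    effects = []
    for j in range(scores.shape[1]):
        item = scores[:, j]
        rest = scores.sum(axis=1) - item   # total score excluding this item
        effects.append(np.corrcoef(item, rest)[0, 1])
    return effects

# Hypothetical scores: 6 students x 3 items
data = [[1, 1, 0],
        [1, 0, 0],
        [1, 1, 1],
        [0, 0, 1],
        [1, 1, 1],
        [0, 0, 0]]
for item, effect in enumerate(item_effects(data), start=1):
    print(f"Item {item}: effect = {effect:.2f}")
```
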
What counts as an ‘ideal’ item discrimination statistic depends on 3 factors:

• Item difficulty
• Test heterogeneity
• Item quality

Item difficulty

• Very easy or very difficult items will have poor ability to discriminate among students.
• Yet… very easy or very difficult items may still be necessary to sample content taught.

Test heterogeneity

• A test that assesses many different topics will have a lower correlation with any one content-focused item.
• Yet… a heterogeneous item pool may still be necessary to sample content taught.

Item quality

• A poorly written item will have little ability to discriminate among students.
• And… there is no substitute for a well-written item or for testing what you teach!

Now look at the item effects from your exam. Which items on your exam performed ‘well’? Did any items perform ‘poorly’?

Distracter information can be analyzed to determine which distracters were effective and which ones were not.

RESPONSE TABLE - FORM A
ITEM NO.  OMIT%  A%   B%   C%   D%   E%   KEY  KEY-%  ITEM EFFECT
1         0      0    18   82   0    0    C    82      0.22
2         0      79   0    0    21   0    A    79      0.23
3         0      4    7    89   0    0    C    89     -0.12

Now look at the distracter information for items from your exam. What can you conclude about them?
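
With your own response data, a quick tally of how often each option was chosen shows which distracters are doing any work. A minimal sketch for a single item, using made-up choices:

```python
from collections import Counter

# Hypothetical data: the option each of 10 students chose on one item
responses = ["B", "C", "C", "A", "C", "D", "C", "B", "C", "A"]
key = "C"

counts = Counter(responses)
for option in "ABCD":
    share = 100 * counts[option] / len(responses)
    label = "key" if option == key else "distracter"
    print(f"{option} ({label}): {share:.0f}%")

# A distracter that no one chose is probably not pulling its weight
unused = [o for o in "ABCD" if o != key and counts[o] == 0]
if unused:
    print("Distracters chosen by no one:", ", ".join(unused))
```
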
Whether to retain, revise, or eliminate items depends on item difficulty, item discrimination, distracter information, and your instruction. Ultimately, it’s a judgment call that you have to make.
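
If a first pass helps before making that judgment call, a simple screening rule can flag items for a closer look. The cutoffs below (difficulty above 90% or below 30%, item effect below .2) are illustrative rules of thumb, not official thresholds:

```python
def flag_item(difficulty_pct, item_effect):
    """Suggest follow-up for one item; thresholds are illustrative, not official cutoffs."""
    notes = []
    if item_effect < 0:
        notes.append("negative discrimination: check the key and the distracters")
    elif item_effect < 0.2:
        notes.append("low discrimination: consider revising")
    if difficulty_pct > 90:
        notes.append("very easy: fine if intentional")
    elif difficulty_pct < 30:
        notes.append("very hard: fine if intentional")
    return notes or ["looks fine: retain"]

# Items 1, 3, and 6 from the sample report above
for no, diff, eff in [(1, 82, 0.22), (3, 89, -0.12), (6, 95, -0.11)]:
    print(f"Item {no}:", "; ".join(flag_item(diff, eff)))
```
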
What if I have a relatively short test or I give a test in a small class? I might not use the testing service for scoring. Is there a way I can understand how my items worked? Yes.

From: Suskie, L. (2009). Assessing student learning: A common sense guide (2nd ed.). San Francisco: Jossey-Bass.

Item 1       A    B*   C    D
Top 1/3           10
Bottom 1/3   1    4    3    2

Item 2       A*   B    C    D
Top 1/3      8    2
Bottom 1/3   7    3

Item 3       A    B    C*   D
Top 1/3           5    1    4
Bottom 1/3   2    4    4

Item 4       A*   B    C    D
Top 1/3      10
Bottom 1/3   9    1

1. Which item is the easiest?
2. Which item shows negative (very bad) discrimination?
3. Which item discriminates best between high and low scorers?
4. In Item 2, which distracter is most effective?
5. In Item 3, which distracter must be changed?
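
For a hand-scored table like the one above, difficulty and discrimination come straight from the top-third and bottom-third counts. A minimal sketch using Item 1’s counts (groups of 10, as in the example; the difficulty here reflects only the students in those two groups):

```python
def hand_item_analysis(top_correct, bottom_correct, group_size):
    """Difficulty and upper-lower discrimination from top/bottom group counts."""
    difficulty = (top_correct + bottom_correct) / (2 * group_size)  # proportion correct
    discrimination = (top_correct - bottom_correct) / group_size    # ranges -1.00 to 1.00
    return difficulty, discrimination

# Item 1 above: all 10 top-third students chose the key (B); 4 of the bottom third did
difficulty, discrimination = hand_item_analysis(top_correct=10, bottom_correct=4, group_size=10)
print(f"Difficulty: {difficulty:.0%}, Discrimination: {discrimination:.2f}")
```
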
Even after you consider reliability, difficulty, discrimination, and distracters, there are still a few other things to think about…

• Multiple course sections
• Student feedback
• Other item types


Resources
• For an excellent resource on item analysis:
  http://www.utexas.edu/academic/ctl/assessment/iar/students/report/itemanalysis.php
• For a more extensive list of item-writing tips:
  http://testing.byu.edu/info/handbooks/Multiple-Choice%20Item%20Writing%20Guidelines%20-%20Haladyna%20and%20Downing.pdf
  http://homes.chass.utoronto.ca/~murdockj/teaching/MCQ_basic_tips.pdf
• For a discussion about writing higher-level multiple choice items:
  http://www.ascilite.org.au/conferences/perth04/procs/pdf/woodford.pdf
