Development
Graham McMahon, MD, MMSc.
Sarah E. Peyre, EdD
Educational Research Methods Program
Learning Objectives
Understand the pros and cons of various question types
for written examinations
Learn how to determine
Item difficulty and
Item discrimination
Reliability
Standard Setting
Come to our Workshop!
Work in small groups to…
Review problematic multiple choice items
Establish validity and reliability for a test
Responses
A. Anti-nuclear antibody. (distractor)
B. Erythrocyte sedimentation rate. (distractor)
C. Serum concentration of creatine kinase. (correct response)
D. Serum concentration of angiotensin-converting enzyme. (distractor)
E. Urine microscopy. (distractor)
Tips for writing discriminant MCQs
Be sure that each item reflects a clearly defined learning
outcome
Stem
The stem of the item should be self-contained and written in clear and
precise language.
Avoid ‘trigger’ words (e.g. pin-rolling tremor)
Avoid negatives, 'except' phrasing, absolutes and qualifiers in question stems.
Responses
All answers should be plausible and homogeneous
Items need to be independent of one another
Answer choices should be similar in length and grammatical form
List answer choices in alphabetical or numerical order
Avoid ‘all of the above’ as a response
Avoid technical flaws (tense or plurality for example)
Pros and Cons of MCQs
Pros
Useful for measuring learning outcomes at almost any level
Easy to understand
Easy to score
Easily analyzed for effectiveness
Allow broad coverage efficiently
Cons
Good questions take a long time to write and are difficult to write
Constrain creative responses from learners
May have more than one correct answer
Item Analysis
Qualitative: looks at whether the content
matches the information, attitude,
characteristic or behavior being assessed
Quantitative:
Item difficulty
Item discrimination
Determining item difficulty
The percentage of participants who answered the item correctly
Item difficulty ranges from 0 to 100%
Low value = high difficulty
High value = low difficulty
[Figure: histogram of the number of students achieving each score, 0 to 100]
Difficulty bands (by % correct):
High (Difficult): <= 30%
Medium (Moderate): > 30% and < 80%
Low (Easy): >= 80%
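The difficulty calculation and the three bands above can be sketched in a few lines. This is a minimal illustration, not a standard tool: the function names and the list-of-booleans input format are assumptions made for the example, and the response data is invented.

```python
def item_difficulty(responses):
    """Return the percentage (0-100) of participants who answered correctly.

    `responses` is a list of booleans, one per participant
    (a hypothetical input format chosen for this sketch).
    """
    if not responses:
        raise ValueError("at least one response is required")
    return 100.0 * sum(responses) / len(responses)


def difficulty_band(percent_correct):
    """Map a percent-correct value to the bands listed on the slide."""
    if percent_correct <= 30:
        return "High (Difficult)"
    if percent_correct < 80:
        return "Medium (Moderate)"
    return "Low (Easy)"


# Illustrative data only: 12 of 40 participants answered correctly.
responses = [True] * 12 + [False] * 28
p = item_difficulty(responses)
print(p, difficulty_band(p))  # prints: 30.0 High (Difficult)
```

Note that a low percentage means a hard item, so the value is really an "easiness" figure even though it is called difficulty.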
Discrimination Index
The Discrimination Index distinguishes for each item
between the performance of students who did well on the
exam and students who did poorly.
Index of discrimination:
The % of people answering correctly in one extreme group (e.g. the top scorers)
minus the % answering correctly in the other extreme group (e.g. the lowest scorers)
Item discrimination scores can
range from -1.00 to +1.00
Example Item
100 test takers: 20 of the top 25 students were correct, but only 5 of the
lowest 25 students were correct.
DI = (20-5)/25 = 0.6

Item Discrimination (D) by Item Difficulty
Discrimination (D)    High      Med       Low
D <= 0%               review    review    review
0% < D < 30%          ok        review    ok
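As a rough check on the example item, the index can be computed directly from the counts in the two extreme groups. The function name and the validation rules are assumptions for this sketch; the counts (20 and 5 correct, groups of 25) come from the example.

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """Upper-group correct count minus lower-group correct count,
    divided by the size of one group. Assumes both groups are the same size."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    for count in (upper_correct, lower_correct):
        if not 0 <= count <= group_size:
            raise ValueError("counts must lie between 0 and group_size")
    return (upper_correct - lower_correct) / group_size


# Example item: 20 of the top 25 correct, 5 of the lowest 25 correct.
print(discrimination_index(20, 5, 25))   # prints 0.6 (15 / 25)

# The extremes show why the index ranges from -1.00 to +1.00.
print(discrimination_index(25, 0, 25))   # prints 1.0
print(discrimination_index(0, 25, 25))   # prints -1.0
```

A negative value means the weaker students outperformed the stronger ones on that item, which usually points to a miskeyed or ambiguous question.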
Item Analysis Report
Order ID and group number
percentages counts
Inter-rater reliability
How do I set a passing grade?
Standard Setting
Norm referenced: Z-scores
Number of standard deviations below the mean
Criterion referenced: Angoff Method
A panel of experts is asked to evaluate each item and
estimate the fraction of minimally competent
students who would answer each item correctly
Ratings are averaged across the experts for each item,
discussed and then summed to get the panel's raw cut score
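Both approaches reduce to simple arithmetic, sketched below. Several things here are assumptions rather than part of the method as presented: the function names, the choice of the sample standard deviation, the value of z (a policy decision), and all of the scores and ratings, which are invented. The panel discussion step is not modelled.

```python
from statistics import mean, stdev


def norm_referenced_cut(scores, z):
    """Cut score placed z standard deviations from the mean.

    A negative z puts the cut below the mean. Uses the sample
    standard deviation, which is an assumption of this sketch.
    """
    return mean(scores) + z * stdev(scores)


def angoff_cut(ratings):
    """Raw Angoff cut score.

    ratings[e][i] is expert e's estimate of the fraction of minimally
    competent students who would answer item i correctly.
    """
    n_items = len(ratings[0])
    if any(len(expert) != n_items for expert in ratings):
        raise ValueError("every expert must rate every item")
    # Average across experts for each item, then sum over items.
    item_means = [mean(expert[i] for expert in ratings) for i in range(n_items)]
    return sum(item_means)


# Invented data: three exam scores, cut set one SD below the mean.
print(norm_referenced_cut([70, 80, 90], z=-1.0))  # mean 80, SD 10

# Invented data: three experts rating a four-item test.
panel = [
    [0.50, 0.75, 0.25, 1.0],
    [0.50, 0.25, 0.75, 1.0],
    [0.50, 0.50, 0.50, 1.0],
]
print(angoff_cut(panel))  # item means 0.5, 0.5, 0.5, 1.0
```

The Angoff result is in raw-score units: a cut score of 2.5 on a four-item test means a minimally competent student is expected to get 2.5 items right.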
Thank you!
Welcome to Our
Workshop on Test
Development!
Establishing
Validity and Reliability for a Test
Mock Standard Setting
Item Creation
Consider beginning with the end in mind
What is it that you think the medical student should
demonstrate that he/she knows or knows how to do?
This should be an objective from your lesson plan.
[Diagram: Objectives, Learning Activities, Evaluation]
Item Stems: Clinical Vignettes
Things to consider:
Patient description (46-year-old female)
Functional disability (difficulty rising from a seated
position, but has no difficulty flexing her legs)
The question based on this item template:
A 46-year-old female has difficulty rising from a seated
position, but has no difficulty flexing her legs. Which of the
following muscles has been injured?
[Objective: Identify and explain the function of the muscles in
the…. ]
Item Creation
Lead-in: The most likely diagnosis is
Options: disorders, diseases
Objective: Describe the signs and symptoms of X. Compare and contrast the signs and symptoms of X, Y and Z.

Lead-in: Which of the following additional symptoms would you expect to be present?
Options: symptoms
Objective: same as above

Lead-in: The most likely cause is
Options: bacteria, toxins, medications, metabolic defects
Objective: List and explain the causes of X.

Lead-in: The most likely mechanism is
Options: disease mechanisms, pharmacologic mechanisms
Objective: Diagram and explain the mechanism of drug X.
Item Templates
Other considerations:
Age, gender, race, ethnicity
Site of care (ER, office visit)
Presenting complaint
presents for a routine physical exam
presents with a headache
Duration
Patient history, family history
There is no history of…
He has a history of…
Physical findings
Lab values, imaging studies, pathology reports
Treatment, subsequent findings
Item Creation
Add the lead-in (question) and the options
Which of the following pulmonary variables is most
likely to be lower than normal in this patient?
A. Alveolar-arterial PO2 difference
B. Compliance of the lung
C. Oncotic pressure of the alveolar fluid
D. Work of breathing
E. Residual volume
Item Creation: Taking Recall up
to Another Level
Recall question:
What area is supplied with blood by the posterior
inferior cerebellar artery?
Graham McMahon
gmcmahon@partners.org
Item Discrimination: Examples
Number of students per group = 100
Item No.    Correct in Upper 1/4    Correct in Lower 1/4    Item Discrimination Index
1           90                      20                       0.7
2           80                      70                       0.1
3           100                     0                        1
4           100                     100                      0
5           50                      50                       0
6           20                      60                      -0.4
Distracter Analysis: Examples
Item 1                         A*    B     C     D     E     Omit
% of students in upper ¼       20    5     0     0     0     0
% of students in the middle    15    10    10    10    5     0
% of students in lower ¼       5     5     5     10    0     0

Item 2                         A     B     C     D*    E     Omit
% of students in upper ¼       0     5     5     15    0     0
% of students in the middle    0     10    15    5     20    0
% of students in lower ¼       0     5     10    0     10    0
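One way to read these two tables is to screen each distracter mechanically. The sketch below takes the starred option as the key and applies two screening rules that are assumptions of this example, not rules stated in the slides: flag a distracter that nobody chose, and flag one that attracts more upper-group than lower-group students. The function name and dictionary layout are hypothetical; the percentages are copied from the tables, with the Omit column left out.

```python
def flag_distracters(key, upper, middle, lower):
    """Return {option: reason} for distracters worth a second look.

    Each of upper/middle/lower maps an option letter to the % of
    students in that group who chose it. `key` is the correct option.
    """
    flags = {}
    for option in upper:
        if option == key:
            continue  # only wrong options are distracters
        total = upper[option] + middle[option] + lower[option]
        if total == 0:
            flags[option] = "never chosen"
        elif upper[option] > lower[option]:
            flags[option] = "chosen more by upper group than lower group"
    return flags


item1 = flag_distracters(
    key="A",
    upper={"A": 20, "B": 5, "C": 0, "D": 0, "E": 0},
    middle={"A": 15, "B": 10, "C": 10, "D": 10, "E": 5},
    lower={"A": 5, "B": 5, "C": 5, "D": 10, "E": 0},
)
item2 = flag_distracters(
    key="D",
    upper={"A": 0, "B": 5, "C": 5, "D": 15, "E": 0},
    middle={"A": 0, "B": 10, "C": 15, "D": 5, "E": 20},
    lower={"A": 0, "B": 5, "C": 10, "D": 0, "E": 10},
)
print(item1)  # prints {}
print(item2)  # prints {'A': 'never chosen'}
```

Under these rules Item 1 raises no flags, while option A of Item 2 attracts no one and is a candidate for rewriting.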