Define clearly what you want to measure.
Generate an item pool.
Avoid exceptionally long items.
Keep the level of reading difficulty
appropriate for those who will complete the
test.
Avoid double-barreled items that convey
more than one idea at the same time.
Consider writing both positively and
negatively worded items.
Item Format
Form, plan, structure, arrangement, and layout of
individual test items.
I. Selected Response Format
Requires test takers to select a response from a
set of alternative responses.
II. Completion Items
Requires test takers to supply a response that
completes a given stimulus.
▪ Essay items – require test takers to respond to a question
by writing a composition; used to determine the depth of
the respondent's knowledge.
Dichotomous Format
Offers two response alternatives for each item
(e.g., true/false).
Polytomous/Polychotomous Format
Offers more than two response alternatives for
each item (e.g., multiple choice).
• Question – stem
• Correct choice – keyed response
• Incorrect choices – called “distractors”
R = Right responses
W = Wrong responses
n = number of choices for each item
*omitted responses are NOT part of the
computation
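These variables feed the classic correction-for-guessing score, Score = R – W/(n – 1), which the notes imply but never state; a minimal sketch under that assumption:

```python
def corrected_score(right: int, wrong: int, n_choices: int) -> float:
    """Classic correction for guessing: Score = R - W/(n - 1).

    Omitted responses are excluded: only right (R) and wrong (W)
    answers enter the computation, matching the note above.
    """
    return right - wrong / (n_choices - 1)

# Example: 40 right, 12 wrong, 8 omitted on a 4-choice test.
# Omits are ignored; Score = 40 - 12/3 = 36.0
print(corrected_score(right=40, wrong=12, n_choices=4))
```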
Likert Format
Respondents indicate degree of agreement with a
statement along ordered categories (e.g., strongly
disagree to strongly agree).
Category Format
Respondents rate a stimulus along a scale with a
larger number of choice points (e.g., a 10-point scale).
Guttman Scale
Items are ordered from weakest to strongest so that
agreement with a stronger item implies agreement
with all milder items below it.
For a test that measures achievement or ability, item
difficulty is defined by the proportion of people who get a
particular item correct.
For example, if 84% of the test takers answered item
number 1 correctly, then the item has a difficulty index of .84.
This definition, however, indicates the easiness of the item
rather than its difficulty.
Chance success must also be considered: on a four-option
multiple-choice question, a test taker has a .25 chance of
getting the correct response by guessing alone.
Item difficulty should generally range from 0.30 to 0.70.
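A minimal sketch of the computation, assuming each response is scored 1 (correct) or 0 (incorrect); the sample data are illustrative:

```python
def difficulty_index(scores: list[int]) -> float:
    """Proportion of test takers answering the item correctly (p)."""
    return sum(scores) / len(scores)

# Item 1: 21 of 25 examinees correct -> p = .84
item1 = [1] * 21 + [0] * 4
p = difficulty_index(item1)
print(f"p = {p:.2f}, within .30-.70: {0.30 <= p <= 0.70}")
```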
Optimal Item Difficulty (OID)
Suggests the best difficulty for an item based
on the number of response choices.
OID = (chance performance + 1) / 2
Chance performance – performance based on
guessing alone; computed by dividing 1 by the
number of choices for the item.
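A worked example under the formula above: for a four-option multiple-choice item, chance performance is 1/4 = .25, so OID = (.25 + 1)/2 = .625.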
Item Difficulty Index
The value that describes the difficulty of an item
on an ability test.
Point-Biserial Correlation
Used for correlating dichotomous and
continuous data.
Correlates whether those who got an item
correct tend to have high total scores as well.
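A minimal sketch of the computation using scipy's pointbiserialr; the response and score data are illustrative:

```python
from scipy.stats import pointbiserialr

# 1 = item answered correctly, 0 = incorrectly (dichotomous)
item = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
# Total test scores for the same examinees (continuous)
total = [48, 45, 30, 41, 28, 44, 33, 25, 46, 40]

r, p_value = pointbiserialr(item, total)
print(f"r_pb = {r:.2f}")  # positive r: correct answers go with high scores
```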
Item Characteristic Curve
Graphic representation of item difficulty and
discrimination.
Usually plots total test scores on the x-axis and
p (difficulty) and d (discrimination) on the y-axis.
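A minimal plotting sketch of the difficulty part of such a graph, assuming examinees are grouped by total score and the proportion passing the item (p) is plotted per group; all data are illustrative:

```python
import matplotlib.pyplot as plt

# Total-score groups (low to high) and the proportion in each
# group that answered the item correctly -- illustrative data.
score_groups = [10, 20, 30, 40, 50]
prop_correct = [0.10, 0.25, 0.55, 0.80, 0.95]

plt.plot(score_groups, prop_correct, marker="o")
plt.xlabel("Total test score")
plt.ylabel("Proportion answering item correctly (p)")
plt.title("Item characteristic curve")
plt.show()
```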
A frequency polygon is created after the test is
given to two groups: one group exposed to the
learning unit and another group not exposed to it.
Antimode – the score with the lowest frequency.
The antimode serves as the cut score (passing score)
for a criterion-referenced test.
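A minimal sketch of locating the antimode in the combined frequency distribution of the two groups; the data and the interior-search choice are illustrative assumptions:

```python
from collections import Counter

# Combined scores of exposed and unexposed groups (illustrative)
scores = [3, 4, 4, 5, 5, 5, 6, 9, 10, 10, 11, 11, 11, 12]

freq = Counter(scores)
# Antimode: the least frequent score between the two groups'
# clusters; search the interior of the score range.
lo, hi = min(scores), max(scores)
antimode = min(range(lo + 1, hi), key=lambda s: freq.get(s, 0))
print(f"cut score (antimode) = {antimode}")
```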
Item Fairness
The degree to which an item is biased.
Biased test items – items that favor one
particular group of examinees.
Bias can be tested using inferential statistics
comparing groups.
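A minimal sketch of one such inferential test, assuming a chi-square test of independence on correct/incorrect counts per group; this is one common choice, not the only one, and the counts are illustrative:

```python
from scipy.stats import chi2_contingency

# Rows: group A, group B; columns: correct, incorrect (illustrative)
table = [[80, 20],
         [55, 45]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
# A small p suggests item performance depends on group
# membership, flagging the item for bias review.
```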
Expert panels
Guide researchers/test developers in conducting a
sensitivity review (especially regarding cultural issues).
Sensitivity review – a study of test items, typically to
examine them for test bias and for the presence of
offensive language and stereotypes.
Test
conceptualization
Test Construction
Test Tryout
Item Analysis
Test Revision
Test development – an umbrella term for the entire
process of creating a test.
Test conceptualization – an early stage of test
development wherein idea for a particular
test is conceived.
Test construction – writing test items as well
as formatting items, scoring rules, and
otherwise designing and building a test.
Test tryout – administration of a test to a
representative sample of test takers under
conditions that simulate the conditions under
which the final version of the test will be
administered.
Item analysis – entails procedures, usually
statistical, designed to explore how individual
test items work, both in comparison with other
items in the test and in the context of the whole test.
Test revision – action taken to modify a test’s
content or format for the purpose of improving
the test’s effectiveness as a tool of assessment.
Stage wherein the following are determined:
Construct, Goal, User, Taker, Administration,
Format, Response, Benefits, Costs,
Interpretation
Determination of whether the test will be
norm-referenced or criterion-referenced.
Also called a pilot study or pilot research.
May take the form of interviews to determine
appropriate items for the test.
It may entail literature reviews, experimentation,
or any other effort the researcher undertakes to
determine which items might be included in the test.