
ITEM ANALYSIS AND VALIDATION
EDUC5. ASSESSMENT OF LEARNING
Group 5 | SMCB-PTC, 1st Semester, AY 2021-2022
16 November 2021
OBJECTIVES
At the end of this chapter, you will be able to:

• Explain the meaning of item analysis, item validity, reliability, item difficulty, and discrimination index;

• Determine the validity and reliability of a test item; and

• Determine the quality of a test item by its difficulty index and plausibility of options (for a selected-response test).


HOW DO TEACHERS PREPARE A TEST?

First draft → Pilot testing and item analysis → Revision/replacement → Validation

INTRODUCTION
• Item Analysis
• Item Difficulty
• Index of Difficulty
• Index of Discrimination
• Reliability
• Validity
• Discrimination

KEY TERMS

Discrimination
a measure based on the comparison of performance between stronger and weaker candidates in the exam as a whole
Source: Measuring Item Reliability Part 1 – Item Discrimination Index | Maxinity (blog post)
Part 1
ITEM ANALYSIS: DIFFICULTY INDEX &
DISCRIMINATION INDEX
There are two important characteristics of an item that will be of interest to a teacher:

DIFFICULTY INDEX
• The number of students who are able to answer the item correctly divided by the total number of students
• The difficulty index is usually expressed as a percentage (%)

DISCRIMINATION INDEX
• Measures the difficulty of an item with respect to those in the upper 25% of the class and how difficult it is with respect to the lower 25% of the class
• Discrimination Index = DU – DL
ITEM DIFFICULTY
• The number of students who are able to answer the item correctly divided by the total number of students
• The difficulty index is usually expressed as a percentage (%)

FORMULA:
Item Difficulty = (No. of Students with Correct Answer / Total No. of Students) x 100%
SAMPLE PROBLEM: What is the difficulty of an item if 25 students are unable to answer it
correctly while 75 answered it correctly?

Solution:
Item Difficulty = (75 / 100) x 100%
Item Difficulty = 0.75 x 100%
Item Difficulty = 75%
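The arithmetic above is easy to script. Here is a minimal Python sketch of the difficulty-index formula; the function name and input check are our own, not part of the source material:

```python
def difficulty_index(num_correct: int, num_students: int) -> float:
    """Share of students who answered the item correctly (0.0 to 1.0)."""
    if num_students <= 0:
        raise ValueError("num_students must be positive")
    return num_correct / num_students

# Sample problem: 75 of 100 students answered correctly.
print(f"{difficulty_index(75, 100):.0%}")  # -> 75%
```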



IN A NUTSHELL…

Difficulty Index measures how easy a test item is.


WHAT TO DO WITH DIFFICULTY INDEX?
Reject – Revise – Retain

DIFFICULTY INDEX   PERCENTAGE   DESCRIPTION      ACTION
0 – 0.20           0–20%        Very difficult   Reject
0.21 – 0.40        21–40%       Difficult        Revise
0.41 – 0.60        41–60%       Moderate         Retain
0.61 – 0.80        61–80%       Easy             Revise
0.81 – 1.00        81–100%      Very easy        Reject
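A small Python sketch of this Reject–Revise–Retain rule; the function name and boundary handling are our own reading of the table:

```python
def difficulty_action(p: float) -> str:
    """Map a difficulty index (0.0 to 1.0) to the table's action."""
    if p <= 0.20 or p > 0.80:
        return "Reject"  # very difficult or very easy
    if p <= 0.40 or p > 0.60:
        return "Revise"  # difficult or easy
    return "Retain"      # moderate (0.41 - 0.60)

print(difficulty_action(0.75))  # -> Revise
```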
PRACTICE EXERCISES: Compute the item difficulty of the following:
Item No.   Total No. of Students   Students w/ Correct Answers   Item Difficulty   Teacher’s Action
1          40                      20                            50%               Retain
2          45                      9                             20%               Reject
3          50                      34                            68%               Revise
4          55                      47                            85%               Reject
5          30                      28                            93%               Reject
WEAKNESS OF DIFFICULTY INDEX:
 It may not actually indicate that the item is difficult or easy.

 A student who does not know the subject matter will naturally be unable to answer the item correctly.

If such is the case, then how do we decide, on the basis of this index, whether the item is too difficult or too easy?
ARBITRARY RULES OF DISCRIMINATION INDEX
 Difficult items tend to discriminate between those who know and those who do not know the answer.

 Easy items cannot discriminate between the two groups of students.

We are, therefore, interested in deriving a measure that can tell us whether an item can discriminate between these two groups of students. Hence the discrimination index, DU – DL.
DISCRIMINATION INDEX

FORMULA:
Discrimination Index = DU – DL

SAMPLE PROBLEM: Obtain the discrimination index of an item if the upper 25% of the class had a difficulty of 0.60 (i.e., 60% of the upper 25% got the correct answer) while the lower 25% of the class had a difficulty index of 0.20.

SOLUTION:
Here, DU = 0.60 while DL = 0.20

Discrimination Index = 0.60 – 0.20
Discrimination Index = 0.40
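In Python, the same computation is a one-line subtraction; this sketch (our own naming) also shows how DU and DL come from raw counts:

```python
def group_difficulty(correct: int, group_size: int) -> float:
    """Difficulty within one group, e.g., the upper or lower 25% of the class."""
    return correct / group_size

def discrimination_index(du: float, dl: float) -> float:
    """DU minus DL, per the formula above."""
    return du - dl

# Sample problem: DU = 0.60, DL = 0.20
print(discrimination_index(0.60, 0.20))  # -> 0.4
```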


Theoretically, the index of discrimination can range from -1.00 (when DU = 0 and DL = 1) to 1.00 (when DU = 1 and DL = 0).

WHAT TO LOOK FOR IN THE DISCRIMINATION INDEX?

a. If DU-DL = -1

b. If DU-DL = 0

c. If DU-DL = 1
A. WHEN DU – DL = -1

 When the index of discrimination is equal to -1.00, this means that all of the lower 25% of students got the correct answer while the upper 25% got the wrong answer.

• In a sense, the item discriminates correctly between the two groups, but the item itself is highly questionable.
• Why would the bright ones get the wrong answer, and the poor ones get the right answer?
B. WHEN DU – DL = 0

 The same proportion of the upper 25% and the lower 25% got the correct answer.

• The item did not discriminate between the two groups of students.
C. WHEN DU – DL = 1

 When the index of discrimination is equal to 1.00, this means that all of the lower 25% failed to get the correct answer while all of the upper 25% got the correct answer.

• This is a perfectly discriminating item and is the ideal item that should be included in the test.
WHAT TO DO WITH DISCRIMINATION INDEX?

Retain / Reject

DISCRIMINATION INDEX (DU – DL)   ACTION
Negative (-1)                    Reject
Zero (0)                         Reject
Positive (+1)                    Retain
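The rule is simple enough to express in one Python function (a sketch; the name is our own):

```python
def discrimination_action(d: float) -> str:
    """Map a discrimination index (DU - DL) to the table's action."""
    return "Retain" if d > 0 else "Reject"

print(discrimination_action(0.40))   # -> Retain
print(discrimination_action(-0.25))  # -> Reject
```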
PRACTICE EXERCISES: Compute the discrimination index of the following:
Item   DU (Upper 25%)   DL (Lower 25%)   Disc. Index   Teacher’s Action
1      0.45             0.70             -0.25         Reject
2      0.50             0.25             0.25          Retain
3      0.30             0.40             -0.10         Reject
4      0.75             0.75             0             Reject
5      0.60             0.40             0.20          Retain
APPLICATION OF DIFFICULTY & DISCRIMINATION INDICES

SAMPLE PROBLEM:
Consider a multiple-choice test item for which the following data were obtained:

Multiple-Choice Item No. 1 (Correct Answer: B)

                        A   B    C    D    TOTAL
Total No. of Students   0   40   20   20   80 students took the test
Upper 25%               0   15   5    0    20 students
Lower 25%               0   5    10   5    20 students

Compute the difficulty and discrimination indices.


DIFFICULTY INDEX
= (No. of Students with Correct Answer / No. of Students) x 100%
= (40 students / 80 students) x 100%
= 0.5 x 100%
= 0.5 or 50%

DISCRIMINATION INDEX
= DU – DL
= 0.75 – 0.25
= 0.50
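The whole worked example can be reproduced from the response counts. A minimal Python sketch follows; the dictionaries and names are our own encoding of the table above:

```python
# Counts for Multiple-Choice Item No. 1; the key (correct answer) is B.
total = {"A": 0, "B": 40, "C": 20, "D": 20}  # all 80 examinees
upper = {"A": 0, "B": 15, "C": 5, "D": 0}    # upper 25% (20 students)
lower = {"A": 0, "B": 5, "C": 10, "D": 5}    # lower 25% (20 students)
key = "B"

difficulty = total[key] / sum(total.values())  # 40 / 80 = 0.50
du = upper[key] / sum(upper.values())          # 15 / 20 = 0.75
dl = lower[key] / sum(lower.values())          # 5 / 20 = 0.25

print(f"Difficulty = {difficulty:.0%}, Discrimination = {du - dl:.2f}")
```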
Things to Notice (from the same response table):
• “A” is not a good distracter (implausible distracter): nobody chose it.
• “C” and “D” have good appeal as distracters (plausible distracters): each drew more students from the lower 25% than from the upper 25%.
MORE SOPHISTICATED DISCRIMINATION INDEX

Item Discrimination
• the ability of an item to differentiate among students on the basis of how well they know the material being tested.

Item Discrimination Index provided by ScorePak®
• is a Pearson product-moment correlation between student responses to a particular item and total scores on all other items on the test.
• is equivalent to a point-biserial coefficient.
• provides an estimate of the degree to which an individual item is measuring the same thing as the rest of the items.
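ScorePak® itself is a scoring service, so the following is only a sketch of the correlation it describes: each item's 0/1 responses are correlated with the total score on all other items. The function name and toy data are our own:

```python
import numpy as np

def item_discrimination(responses: np.ndarray, item: int) -> float:
    """Pearson correlation between one item's 0/1 responses and the
    total score on all *other* items (a point-biserial coefficient)."""
    item_scores = responses[:, item]
    rest_scores = responses.sum(axis=1) - item_scores
    return float(np.corrcoef(item_scores, rest_scores)[0, 1])

# Toy data: 6 students x 4 items (1 = correct, 0 = wrong), made up.
data = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
])
print(round(item_discrimination(data, 0), 2))  # ~0.73: "good" per the classification below
```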
Things to Remember when Dealing with Discrimination Indices

• Values of the coefficients will tend to be lower for tests measuring a wide range of content areas.
• Items with low indices are often ambiguously worded.
• Items with negative indices should be examined.
• Tests with high internal consistency consist of items with mostly positive relationships with the total score.
• Values of the discrimination index will seldom exceed 0.50 because of the differing shapes of the item and total-score distributions.
• A good item is one that has good discriminating ability.

ScorePak® Classification of Item Discrimination
0.30 and above   “good”
0.10 – 0.30      “fair”
below 0.10       “poor”
ITEM ANALYSIS IN GENERAL

Item analysis provides the following information:
• the difficulty of the item;
• the discriminating power of the item; and
• the effectiveness of each alternative (how plausible the options are).
ITEM ANALYSIS IN GENERAL

Benefits derived from item analysis:
1. It provides useful information for class discussion of the test.
2. It provides data that help students improve their learning.
3. It provides insights and skills that lead to the preparation of better tests in the future.
Part 2
VALIDATION & VALIDITY
VALIDITY DEFINED

DEFINITION 1
the extent to which a test measures what it intends to
measure

DEFINITION 2
the appropriateness, correctness, meaningfulness, and
usefulness of the specific decisions a teacher makes based on
the test results.
TYPES OF VALIDITY

1. Content validity
2. Construct Validity
3. Criterion-related validity (Concurrent Validity)
4. Criterion-related validity (Predictive Validity)
5. Face Validity
1. CONTENT VALIDITY

• It relates to how adequately the content of the test samples the domain about which inferences are to be made (Calmorin, 2004).

• It is established through logical analysis; adequate sampling of test items is usually enough to assure that the test has content validity (Oriondo, 1984).

Example
A teacher wishes to validate a test in Mathematics. He requests experts in Mathematics to judge whether the items or questions measure the knowledge, skills, and values they are supposed to measure.
2. CONSTRUCT VALIDITY

• It is the extent to which a test measures a theoretical trait. This involves such tests as those of understanding and interpretation of data.

Example
A teacher might study whether an educational program increases artistic ability among preschool children. Construct validity is a measure of whether the research actually measures artistic ability (a slightly abstract label).
3. CONCURRENT VALIDITY

• It refers to the degree to which the test correlates with a criterion that is set up as an acceptable measure or standard other than the test itself. The criterion is always available at the time of testing.

4. PREDICTIVE VALIDITY

• It refers to the degree of accuracy with which a test predicts one’s performance at some subsequent outcome (Asaad, 2004).
5. FACE VALIDITY

• The test questions are said to have face validity when they appear to be related to the group being examined (Asaad, 2004).
• It is checked by simply examining whether the test looks like a good one; there is no common numerical method for face validity.

Example
Calculating the area of a rectangle when the given length and width are 4 ft and 6 ft, respectively.
FACTORS AFFECTING THE VALIDITY OF AN ASSESSMENT INSTRUMENT

1. Unclear directions
2. Reading vocabulary and sentence structure that are too difficult
3. Ambiguity
4. Inadequate time limits
5. Overemphasis of easy-to-assess aspects of the domain at the expense of important but hard-to-assess aspects
6. Test items inappropriate for the outcomes being measured
7. Poorly constructed test items
8. Test too short
9. Improper arrangement of items
10. Identifiable pattern of answers
Part 3
RELIABILITY
RELIABILITY DEFINED

Reliability refers to the consistency of the scores obtained—


how consistent they are for each individual from one
administration of the instrument to another and from one set
of items to another.
RELIABILITY AND VALIDITY

• If an instrument is unreliable, it cannot yield valid


outcomes.
• As reliability improves, validity may improve (or it may
not).
• If an instrument is shown scientifically to be valid, then it
is almost certain that it is also reliable.
RELIABILITY STANDARD

Reliability      Interpretation
0.90 and above   Excellent reliability
0.80 – 0.90      Very good for a classroom test
0.70 – 0.80      Good for a classroom test
0.60 – 0.70      Somewhat low
0.50 – 0.60      Needs revision of test
Below 0.50       Questionable reliability
TYPES OF RELIABILITY

1. Test-Retest Reliability
2. Parallel Forms Reliability
3. Inter-Rater Reliability
4. Internal Consistency Reliability (see the sketch after this list)
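The deck lists internal consistency as a type but gives no formula for it. As one common coefficient (our choice, not stated in the source), here is a minimal Cronbach’s alpha sketch in Python:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a (n_students x n_items) score matrix."""
    n_items = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()  # sum of per-item variances
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of total scores
    return n_items / (n_items - 1) * (1 - item_vars / total_var)

# Made-up 0/1 scores for 6 students on 4 items.
data = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
])
print(round(cronbach_alpha(data), 2))  # low alpha for this tiny toy sample
```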
1. TEST-RETEST RELIABILITY

• It is established by administering the same test twice to the same group of students, with a time interval between the two administrations, and then correlating the two sets of scores.

• A high correlation between the two administrations indicates that the scores are stable over time.

Example
A teacher administers the same Mathematics test to the same class two weeks apart and correlates the two sets of scores.
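Under that definition, the computation is just a correlation between two score vectors; a minimal sketch with made-up scores:

```python
import numpy as np

# Scores for the same six students on two administrations of the same test.
first = np.array([78, 85, 62, 90, 71, 88])   # first administration
second = np.array([75, 88, 65, 93, 70, 85])  # second administration (made up)

reliability = np.corrcoef(first, second)[0, 1]
print(round(reliability, 2))  # ~0.96: excellent per the reliability standard above
```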
