
Direction: Identify the following.

1. What is the colour of an apple?

2. How many dwarves does Cinderella have?

3. For how many years were we colonized by the Spanish?

4. To what continent does the Philippines belong?

5. What is 1=4?
Direction: Identify the following.
1. What creature has the largest eyes in the world?
2. Is light a form of wave or particle?
3. What do you call a system through which things are bought and
sold illegally?
4. In Mathematics, what do you call the operation that puts numbers
together and determines their sum?
5. What country doesn’t have a written constitution?
- Item Difficulty
- Discrimination Index

Item difficulty: the number of students who are able to answer the item
correctly, relative to the total number of students.

Item difficulty = no. of students with correct answer / total no. of students
Range of Difficulty Index    Interpretation      Action
0 – 0.25                     Difficult           Revise or discard
0.26 – 0.75                  Right difficulty    Retain
0.76 and above               Easy                Revise or discard
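The difficulty formula and the interpretation ranges can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the handling of values that fall between the printed ranges (for example 0.255) is an assumption, since the table does not say where they belong.

```python
def item_difficulty(num_correct, num_students):
    """Proportion of students who answered the item correctly."""
    if num_students <= 0:
        raise ValueError("num_students must be positive")
    return num_correct / num_students


def interpret_difficulty(index):
    """Return (interpretation, action) for a difficulty index.

    Assumption: values between two printed ranges go to the higher range.
    """
    if index <= 0.25:
        return ("Difficult", "Revise or discard")
    if index <= 0.75:
        return ("Right difficulty", "Retain")
    return ("Easy", "Revise or discard")


# Example: 40 of 100 students answered correctly.
p = item_difficulty(40, 100)
print(p, interpret_difficulty(p))
```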


Index of Discrimination: measures how difficult an item is with respect
to those in the upper 25% of the class as well as to those in the lower
25% of the class.
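Index of Discrimination = DU - DL

where DU is the difficulty index of the item for the upper 25% of the
class and DL is the difficulty index for the lower 25% of the class.

Given that:

DU (upper 25% of the class) = 0.60
DL (lower 25% of the class) = 0.20

Index of Discrimination = 0.60 - 0.20 = 0.40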
Index Range      Interpretation                               Action
-1.0 – -0.50     Can discriminate but item is questionable    Discard
-0.55 – 0.45     Non-discriminating                           Revise
0.46 – 1.0       Discriminating item                          Include


• 1.0 = the lower 25% of the class all failed to get the correct
answer while the upper 25% of the class all got the right answer

• -1.0 = the lower 25% of the class all got the correct answer
while the upper 25% of the class all failed to get the correct answer
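As a rough sketch, the index of discrimination and its interpretation could be computed as follows. The function names are hypothetical. The cut-offs of 0.46 and -0.55 are an assumed reading of the interpretation table, and scores are rounded to two decimals before comparison to avoid floating-point edge cases.

```python
def discrimination_index(du, dl):
    """DU - DL, where DU and DL are the proportions of correct answers
    in the upper and lower 25% of the class."""
    return du - dl


def interpret_discrimination(index):
    """Return (interpretation, action); cut-offs are assumed from the table."""
    index = round(index, 2)
    if not -1.0 <= index <= 1.0:
        raise ValueError("index must lie between -1.0 and 1.0")
    if index >= 0.46:
        return ("Discriminating item", "Include")
    if index >= -0.55:
        return ("Non-discriminating", "Revise")
    return ("Can discriminate but item is questionable", "Discard")


# Values from the example: DU = 0.60, DL = 0.20.
d = discrimination_index(0.60, 0.20)
print(f"{d:.2f}", interpret_discrimination(d))
```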
Consider a multiple-choice test for which the following data were
obtained.

Item no. 1 (* marks the correct answer)

Options       A     B*    C     D
Total         0     40    20    20
Upper 25%     0     15    5     0
Lower 25%     0     5     10    5
Difficulty index = 40/100 = 0.40 or 40%, within the range of a “good item”

Discrimination index:
DU = 15/20 = 0.75 or 75%
DL = 5/20 = 0.25 or 25%
Therefore:
Discrimination index = 0.75 - 0.25 = 0.50 or 50%

Conclusion: The item has “good discriminating power”.
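The same arithmetic can be reproduced from the option counts. The sketch below assumes that B is the keyed answer, that each 25% group contains 20 students (the sum of its option counts), and that the total number of students is 100, the denominator used in the worked example. Variable names are illustrative only.

```python
# Option counts for item no. 1, copied from the example.
upper = {"A": 0, "B": 15, "C": 5, "D": 0}   # upper 25% of the class
lower = {"A": 0, "B": 5, "C": 10, "D": 5}   # lower 25% of the class
key = "B"                                   # assumed keyed (correct) option

total_correct = 40      # students choosing B overall
total_students = 100    # denominator used in the worked example

difficulty = total_correct / total_students
du = upper[key] / sum(upper.values())   # 15 / 20
dl = lower[key] / sum(lower.values())   # 5 / 20
discrimination = du - dl

print(f"difficulty = {difficulty:.2f}")
print(f"DU = {du:.2f}, DL = {dl:.2f}, discrimination = {discrimination:.2f}")
```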
Item Discrimination
- Refers to the ability of an item to differentiate among students on
the basis of how well they know the material being tested.
- A Pearson product-moment correlation between student responses to a
particular item and total scores on all other items on the test.
- Provides an estimate of the degree to which an individual item is
measuring the same thing as the rest of the items.
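One way to picture the correlation-based definition is to correlate each student's score on one item (1 = correct, 0 = wrong) with that student's total on all other items. The sketch below does this with a hand-written Pearson formula. The response matrix is hypothetical data made up for illustration, and the function names are not from any particular library.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


def item_rest_correlation(responses, item):
    """Correlate one item's scores with the total score on all other items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)


# Hypothetical data: 6 students (rows) by 4 items (columns), 1 = correct.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

r = item_rest_correlation(responses, 0)
print(f"item-rest correlation for item 1: {r:.2f}")
```

A value near 1 suggests that students who do well on the rest of the test also tend to answer this item correctly.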
VALIDATION
Validation
Process of collecting and analysing evidence to support the
meaningfulness and usefulness of the test.

Validity
Extent to which a test measures what it purports to measure, or as
referring to the appropriateness, correctness, meaningfulness and
usefulness of the specific decisions a teacher makes based on the
test results.
Three main types of evidence
- Content-related
- Criterion-related
  • Concurrent Validity
  • Predictive Validity
- Construct-related
                Grade Point Average
Test Score      Very Good    Good    Needs Improvement
High            20           10      5
Average         10           25      5
Low             1            10      14
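Reading the table as a cross-tabulation of test-score level against grade point average, one possible way to use it is to convert each row of counts into proportions, which show how often each test-score level goes with each GPA category. The sketch below assumes that reading; the function name is hypothetical.

```python
# Counts copied from the table: test-score level -> GPA category -> students.
counts = {
    "High":    {"Very Good": 20, "Good": 10, "Needs Improvement": 5},
    "Average": {"Very Good": 10, "Good": 25, "Needs Improvement": 5},
    "Low":     {"Very Good": 1,  "Good": 10, "Needs Improvement": 14},
}


def row_proportions(table):
    """Convert each row of counts into proportions of that row's total."""
    result = {}
    for level, row in table.items():
        total = sum(row.values())
        result[level] = {category: n / total for category, n in row.items()}
    return result


proportions = row_proportions(counts)
for level, row in proportions.items():
    summary = ", ".join(f"{category}: {p:.0%}" for category, p in row.items())
    print(f"{level}: {summary}")
```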
RELIABILITY
Reliability

The consistency of the scores obtained: how consistent they are for
each individual from one administration of an instrument to another
and from one set of items to another.
Reliability      Interpretation
.90 and above    Excellent reliability; at the level of the best
                 standardized tests.
.80-.90          Very good for a classroom test.
.70-.80          Good for a classroom test; in the range of most. There
                 are probably a few items which could be improved.
.60-.70          Somewhat low. This test needs to be supplemented by
                 other measures (e.g., more tests) to determine
                 grades. There are probably some items which could be
                 improved.
.50-.60          Suggests need for revision of the test, unless it is
                 quite short (ten or fewer items). The test definitely
                 needs to be supplemented by other measures (e.g.,
                 more tests) for grading.
.50 or below     Questionable reliability. This test should not
                 contribute heavily to the course grade, and it needs
                 revision.
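The interpretation table can be treated as a simple lookup. In the sketch below the labels are shortened from the table, the function name is hypothetical, and a coefficient that sits exactly on a shared boundary (such as .80 or .50) is assigned to the higher band, which is an assumption because the printed ranges overlap.

```python
# (lower bound, short label) pairs, ordered from highest band to lowest.
BANDS = [
    (0.90, "Excellent reliability"),
    (0.80, "Very good for a classroom test"),
    (0.70, "Good for a classroom test"),
    (0.60, "Somewhat low"),
    (0.50, "Suggests need for revision of the test"),
]


def interpret_reliability(coefficient):
    """Return the short interpretation label for a reliability coefficient."""
    for lower_bound, label in BANDS:
        if coefficient >= lower_bound:
            return label
    return "Questionable reliability"


for value in (0.93, 0.72, 0.41):
    print(value, "->", interpret_reliability(value))
```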
