
Testing And Measurement

An Overview of Constructing a Test.

2012/09/04 Testing & Measurement


What is a Test?

1. A systematic procedure for obtaining a sample of student behavior.

2. A process of collecting information.


Types of Tests
 Criterion-Referenced Tests.
  Pre-tests
  Post-tests
 Norm-Referenced Tests.


Three Basic Concepts of a Test
 A Test Focuses on a Particular Domain.
 A Test is a Sample of Behavior, Products, Answers, or Performances from the Domain.
 A Test Permits the User to Make Inferences About the Larger Domain of Interest.


Why Do We Test?
 Diagnose student strengths and weaknesses.
 Monitor each student’s progress.
 Assign grades.
 Determine the teacher’s own instructional
effectiveness.
 Provide information to inform instructional
and curricular decisions.
 Help teachers clarify their instructional
intentions.
When Do We Test?
 Formative Testing/Assessment.
 To ascertain whether each objective has been met.
 To gauge if each student has grasped the concept taught.
 To determine if we have to modify our methods or strategies
used to get information across.

 Summative Testing/Assessment.
  At the end of the teaching of the syllabus.
  To assign a mark or grade.


Test Construction
 There should be clearly defined objectives
which can be measured.
 Each item must measure one objective only.
 However, an objective can be measured by a number of items.
 Items should be selected based on the skills to be measured.


Test Construction cont’d.
 Test items fall into two categories:
 Selected Response
  Multiple Choice
  True or False
  Matching
 Constructed Response
  Fill-in-the-Blanks
  Short Answer Responses
  Extended Responses


Multiple Choice

 Stem (statement and question)
  e.g. The scrub jay's numbers are dwindling so rapidly that some fear it soon may be found nowhere at all. What does the word dwindling mean?
 Key (the correct answer)
 Distractors (other possible choices)
Multiple Choice cont’d.
 All answer choices should be plausible and
homogeneous.
 Answer choices should be similar in length
and grammatical form.
 Choices should be in a logical (alphabetical or
numerical) order.
 Avoid using "All of the Above" options.
 Usually falls in level 1 or 2 of the taxonomy but can also fall in higher levels.
Multiple Choice Cont’d
 Rainfall data gleaned from ancient cypress trees shows
that the region's worst drought in 800 years peaked in
1587, the year the 120 men, women and children of the
Roanoke colony were last seen by Europeans.
 What does peaked mean?
 A. was sharp
 B. was at its height
 C. was mountainous
 D. was rising
Multiple Choice Cont’d
 Rainfall data gleaned from ancient cypress trees shows
that the region's worst drought in 800 years peaked in
1587, the year the 120 men, women and children of the
Roanoke colony were last seen by Europeans.
 What does peaked mean?

 A. was pale
 B. was at its height
 C. was hot
 D. was beautiful
Matching
 Matching items consist of two lists of words, phrases,
or images (often referred to as stems and responses).
 Answer choices should be short, related and arranged
in logical order.
 Responses should make sense and be similar in length and grammatical form.
 There should be more response options than stems.
 As a general rule, the stems should be longer than the responses.


True/False
 Statements should be completely true or
completely false.
 Statements should be simple and easy to
follow.
 Refrain from negatives, especially double negatives.
 Avoid absolutes such as "always" and "never."
 Items usually fall in level 1 of the taxonomy.
Fill in the Blanks.
 Questions should be direct and have a
definitive answer.
 It is recommended that no more than two (2)
blanks should be used per item.
 Blanks should come near the end of the
sentence.
 Incorrect spellings are usually accepted
depending on the objective.
Short Response
 Short response questions are more focused and
constrained than extended response questions.

 Short response includes tasks such as "write an example," "list three reasons," or "compare and contrast two techniques."


Extended Response
 Longer and more complex than short responses.
 Has no one correct answer.
 Usually carries more weight than the other questions.
 Allows for examination of higher-order cognitive skills.


Educational Objectives to be Tested
 Questions are constructed based on the educational objectives you want to test.
 Use of Bloom’s Taxonomy.
 Creation of a Table of Specifications.
 Every item should be accounted for in the
table.
 Should be a guideline to the development of
your marking scheme.
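As an illustration only, a Table of Specifications can be represented as a simple grid of content topics by cognitive levels, with every test item accounted for in some cell. The topics, level names, and item counts below are invented for the sketch, not taken from the slides.

```python
# Illustrative sketch: a Table of Specifications maps each content topic
# to the number of items testing it at each cognitive level.
# Topics, level names, and counts are invented for illustration.

table_of_specifications = {
    "Reliability":   {"knowledge": 2, "comprehension": 1, "application": 1},
    "Validity":      {"knowledge": 2, "comprehension": 2, "application": 0},
    "Item Analysis": {"knowledge": 1, "comprehension": 1, "application": 2},
}

# Every item on the test should appear somewhere in the table.
total_items = sum(sum(levels.values()) for levels in table_of_specifications.values())
print(total_items)  # 12 items in this invented example
```

A grid like this doubles as the guideline for the marking scheme, since it records how many marks sit at each level.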
Structure of Test Paper
 Start with your Selected Response questions.
 End with the Extended Response (Essay)
questions.
 Ensure complete questions fall on the same
page.
 The weighting for each question should be
shown on the paper.
Structure of Test Paper cont’d.
 Make sure that the font is adequate for your students.
 Allow for adequate spacing so the students can respond to the questions properly.
 Use bold letters for the negatives in any instructions.
Marking Scheme
 For every test paper there should be a marking
scheme.
 Each item must be identified on the mark
sheet.
 A breakdown of how the marks are allocated
should be done for each item.
 Should be presented if students challenge their
marks.
 Should accompany your scripts.
Analysis of Marks
 An analysis should be done to see how the
marks are distributed.
 Compare marks with those from previous tests.
 Make evaluations based on the students.
 Make evaluations based on yourself.
 Helps determine the reliability and validity of the test.


Reliability
 Degree to which the test consistently yields the same results.
 Test-Retest Reliability: consistency across two administrations to the same students.
 Split-Halves Reliability: consistency across the two halves of a single test.
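As a sketch of split-halves reliability, the snippet below correlates students' scores on the two halves of a test and applies the standard Spearman-Brown correction (a textbook step, not mentioned on the slide) to estimate full-test reliability. The half-test scores are invented for illustration.

```python
# Split-half reliability sketch: correlate scores on the two halves of a
# test (e.g. odd vs. even items), then apply the Spearman-Brown correction
# to estimate the reliability of the full-length test.
# The sample scores are invented.

from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_brown(half_r):
    """Estimate full-test reliability from the half-test correlation."""
    return 2 * half_r / (1 + half_r)

# Hypothetical half-test scores for six students.
odd_half = [8, 6, 9, 4, 7, 5]
even_half = [7, 6, 9, 5, 6, 4]
r_half = pearson(odd_half, even_half)
print(round(spearman_brown(r_half), 2))  # 0.95
```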



Validity
 Content Validity: degree to which the test measures what it is intended to measure.
 Criterion-Related Validity: degree to which the test predicts performance.


Test Bank
 Store questions that can be drawn on to formulate a test.
 Add more questions which you deem credible.
 Makes it easy for consistency to be maintained throughout the school.


Item Analysis
 Item Analysis is usually done after your test has been
administered.
 Focus is usually on Multiple Choice Items.
 How can you be sure that the items are appropriate --
not too difficult and not too easy? (Difficulty Index )
 How will you know if the test effectively
differentiates between students who do well on the
overall test and those who do not? (Discrimination
Index)
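One common way to compute a discrimination index (a standard rule of thumb, not worked through on these slides) is to compare an upper-scoring and a lower-scoring group on the overall test, often the top and bottom 27% of students. The item numbers below are invented for illustration.

```python
# Discrimination index (D), a common formulation:
# D = (correct in upper group - correct in lower group) / group size
# Values near 1 mean the item separates strong and weak students well;
# values near 0 (or negative) flag an item for review.

def discrimination_index(upper_correct, lower_correct, group_size):
    return (upper_correct - lower_correct) / group_size

# Hypothetical item: 9 of 10 high scorers but only 3 of 10 low scorers
# answered correctly, so the item discriminates well.
d = discrimination_index(9, 3, 10)
print(d)  # 0.6
```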



Finding the Difficulty Index
Question    A     B     C     D
Item #1     0     3    24*    3
Item #2    12*   13     3     2

* denotes the correct answer (30 students sat the test)

To find the difficulty index (p), divide the number of correct answers by the total number of students.
Hence for Item 1, p = 24 / 30 = .80
For Item 2, p = 12 / 30 = .40
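The calculation above fits in a couple of lines of Python; the function name is my own, and the counts come from the table.

```python
# Difficulty index (p): the proportion of students who answered an
# item correctly.

def difficulty_index(correct_count, total_students):
    """p = number of correct answers / total number of students."""
    return correct_count / total_students

print(difficulty_index(24, 30))  # Item 1: 0.8
print(difficulty_index(12, 30))  # Item 2: 0.4
```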
Difficulty Index Cont’d

An item is considered difficult when the difficulty index value is less than 30% (p < .30).

An item is considered easy when the index value is greater than 80% (p > .80).

Items falling outside the .30 to .80 range can be discarded or revised.
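The thresholds above can be expressed as a small helper; the function name and labels are my own.

```python
# Classify an item by its difficulty index, using the slide's thresholds:
# p < .30 is difficult, p > .80 is easy, and such items are candidates
# for revision or removal.

def classify_item(p):
    if p < 0.30:
        return "difficult"
    if p > 0.80:
        return "easy"
    return "acceptable"

print(classify_item(0.80))  # acceptable (Item 1 from the worked example)
print(classify_item(0.40))  # acceptable (Item 2)
```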



The End
 Thank You

