
WHAT ARE PAPER-AND-PENCIL TESTS?

Paper-and-pencil assessment refers to traditional student assessment formats such as written tests and
also to standardized tests that ask students to use pencils to fill in bubbles on a scannable answer sheet.

In the classroom, paper-and-pencil assessment frequently refers to tests scored objectively, which are
meant to measure memorized knowledge and lower levels of understanding, as compared with
performance-based assessment, which is meant to measure deeper understanding through skills and
ability.

Paper-and-pencil tests can be either selected-response or constructed-response types.

Selected-response items ask students to select the correct answer from a list of options included in the
item.

Selected-response tests are those that are composed of questions to which there is typically one best
answer. They are sometimes referred to as objective assessments (Suskie, 2018).

Selected-response types include:

a. True-false items
b. Multiple-choice items
c. Matching type

Constructed-response items ask students to write, or “construct,” the correct answer.

Constructed-response types include:

a. Enumeration
b. Completion
c. Essays

TYPES OF PAPER-AND-PENCIL TESTS

Some of the most commonly used selected-response tests include multiple choice, fill-in-the-blank, true-false, and/or matching questions/items.

Purpose for Using Selected-Response Tests

Nilson (2016) notes that these tests are good for assessing students’ ability to remember and
understand course concepts and materials, but cannot “measure students’ abilities to create, organize,
communicate, define problems, or conduct research” (p. 291).

The construction of valid test items begins with a Table of Specifications.

Why is there a need to plan a test and construct a table of specifications?

The table of specifications (TOS) is a tool used to ensure that a test or assessment measures the content
and thinking skills that the test intends to measure. ... That is, a TOS helps test constructors to focus on
the issue of response content, ensuring that the test or assessment measures what it intends to measure.
The primary purpose of a TOS is to ensure alignment between the items or elements of an assessment
and the content, skills, or constructs that the assessment intends to assess.

PLANNING A TEST AND CONSTRUCTION OF TABLE OF SPECIFICATIONS (TOS)

The important steps in planning for a test are:

1. Identifying test objectives/lesson outcomes

2. Deciding on the type of objective test to be prepared

3. Preparing a Table of Specifications (TOS)

4. Constructing the draft test items

5. Try-out and validation

Identifying Test Objectives

An objective test, if it is to be comprehensive, must cover the various levels of Bloom’s taxonomy. Each
objective consists of a statement of what is to be achieved, preferably by the students. The following are
typical objectives: knowledge/remembering, comprehension/understanding, application/applying,
analysis/analyzing, evaluation/evaluating, synthesis/synthesizing.

Deciding on the Type of Objective Test

The test objectives guide the kind of objective test that will be designed and constructed by the
teacher. The test to be formulated must be aligned with the lesson objective or learning outcome.
This is the principle of constructive alignment.


Constructive Alignment involves:

a. Thoughtfully determining intentions for what students should learn and how they will
demonstrate their achievement of these intended learning outcomes, and clearly communicating these
to students;

b. Designing teaching and learning activities so that students are optimally engaged in achieving
these learning outcomes; and

c. Creating assessments that will allow students to demonstrate their attainment of the learning
outcomes and allow instructors to discern how well these outcomes have been achieved.


Preparing a Table of Specifications

A Table of Specifications or TOS is a test map that guides the teacher in constructing a test. The TOS
ensures that there is balance between items that test lower-level thinking skills and those which test
higher order thinking skills (or alternatively, a balance between easy and difficult items) in the test.



Table 1: Table of specifications for a 30-item Economics test for SS2.

Subject matter                               Remembering  Understanding  Applying  Total
Consumer’s behavior & Price determination         2             4           3        9
Population                                        2             2           2        6
Money Inflation                                   1             3           2        6
Economics Systems                                 1             2           2        5
Principle of Economics                            1             2           1        4
Total                                             7            13          10       30

From the table, it can be seen that of the five subject matter areas, consumer behavior / price
determination received the highest number of items (9) and the principle of Economics the fewest (4).
Among the objectives, the understanding level had the most items (13) and the remembering level the
fewest (7). The distribution of items across the cells (that is, for each objective level and subject
matter area) reflects the emphasis and importance the teacher attaches to these areas. With a table of
specifications of this nature designed, the teacher then proceeds to construct the test items or
questions, which must be in line with what has been specified in the table.
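The arithmetic behind the table can be checked with a short sketch. The counts below are re-entered by hand from Table 1; the variable names and the percentage "emphasis" figure are illustrative assumptions, not part of the original handout.

```python
# Item counts from Table 1: (Remembering, Understanding, Applying) per subject matter area.
tos = {
    "Consumer's behavior & Price determination": (2, 4, 3),
    "Population": (2, 2, 2),
    "Money Inflation": (1, 3, 2),
    "Economics Systems": (1, 2, 2),
    "Principle of Economics": (1, 2, 1),
}
levels = ("Remembering", "Understanding", "Applying")

# Row totals: number of items per subject matter area.
row_totals = {area: sum(counts) for area, counts in tos.items()}

# Column totals: number of items per objective level.
column_totals = {
    level: sum(counts[i] for counts in tos.values())
    for i, level in enumerate(levels)
}

grand_total = sum(row_totals.values())

# Share of the whole test given to each area (an assumed way to express "emphasis").
emphasis = {area: round(100 * n / grand_total, 1) for area, n in row_totals.items()}

for area, n in row_totals.items():
    print(f"{area}: {n} items ({emphasis[area]}%)")
print(column_totals, grand_total)
```

A check like this is useful before writing items: if the row and column totals do not both add up to the planned test length, the table has an error.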


According to Kibler (1998), the purpose of a table of specifications is to ensure that the subject
matter content and the course objectives are adequately sampled by the test items. We need to develop a
table of specifications that guides item construction and takes into account the relative importance of
each component of the syllabus and each level of the cognitive domain. The TOS should be prepared before
testing. The teacher should develop the table of specifications in order to achieve content sampling and
item validity. These specifications may also help the teacher to be more effective. In other words, they
help the teacher in organizing teaching and learning, assessment and evaluation, as well as the
resources he or she plans to use during teaching and learning.

Constructing the Test Items

The actual construction of the test items follows the TOS. As a general rule, it is advised that the
number of items constructed in the draft should be double the desired number of items. For
instance, if there are five (5) knowledge-level items to be included in the final test form, then at least
ten (10) knowledge-level items should be included in the draft.
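As a small illustration of the doubling rule, the sketch below applies it to the per-level totals from Table 1. The names `final_items` and `DRAFT_FACTOR` are made up for this example.

```python
# Final item counts per cognitive level, taken from the Total row of Table 1.
final_items = {"Remembering": 7, "Understanding": 13, "Applying": 10}

DRAFT_FACTOR = 2  # the "double the desired number" rule of thumb

# Minimum number of draft items to write for each level.
draft_items = {level: DRAFT_FACTOR * n for level, n in final_items.items()}

print(draft_items, sum(draft_items.values()))
```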

Item Analysis and Try-out

The test draft is tried out on a group of pupils or students. The purpose of this try-out is to determine
(a) the item characteristics through item analysis, and (b) the characteristics of the test itself: validity,
reliability, and practicality.
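The handout does not say how item characteristics are computed. One common approach, assumed here, uses a difficulty index (the proportion of students who answer the item correctly) and a discrimination index (the proportion correct in the highest-scoring group minus the proportion correct in the lowest-scoring group). The function name, the group fraction, and the score data below are all hypothetical.

```python
def item_analysis(responses, group_fraction=0.27):
    """Compute difficulty and discrimination indices for each item.

    responses: list of per-student lists of 0/1 item scores (1 = correct).
    group_fraction: share of students placed in the upper and lower groups
                    (0.27 is a commonly used value, assumed here).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    # Rank students by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, round(group_fraction * n_students))
    upper, lower = ranked[:group_size], ranked[-group_size:]

    results = []
    for i in range(n_items):
        difficulty = sum(r[i] for r in responses) / n_students
        p_upper = sum(r[i] for r in upper) / group_size
        p_lower = sum(r[i] for r in lower) / group_size
        results.append(
            {"item": i + 1, "difficulty": difficulty, "discrimination": p_upper - p_lower}
        )
    return results


# Hypothetical try-out data: six students, three items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
report = item_analysis(scores, group_fraction=0.5)
for row in report:
    print(row)
```

Under these definitions, a difficulty index near 1 marks an item that almost everyone answered correctly. A discrimination index near zero or below marks an item that does not separate stronger from weaker students, which makes it a candidate for revision.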

CONSTRUCTING SELECTED-RESPONSE TYPE

A. TRUE AND FALSE TEST

In a traditional true/false question, students are asked to judge whether a factual statement is either
true or false. True/false questions are best suited to assessing surface level knowledge, but can be
crafted to assess higher order thinking.

True/False questions are quite popular because they are generally easy to write; one does not have to
think of lots of plausible but incorrect answers as with an MCQ. However, one has to be a little careful
with them: a true/false question should contain only a single statement and be one that is either clearly
true or false.

It is also arguable that they are unsuitable for summative assessment because a student could expect to
score about 50% simply by answering at random.
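The size of the guessing effect can be sketched with the binomial distribution. The test length (20 items) and pass mark (12 correct, or 60%) below are hypothetical values chosen only for illustration.

```python
from math import comb


def prob_at_least(n_items, min_correct, p_guess=0.5):
    """Probability of getting at least min_correct items right by blind guessing."""
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


n_items = 20                         # hypothetical test length
expected_score = n_items * 0.5       # average number correct from guessing alone
p_pass = prob_at_least(n_items, 12)  # hypothetical pass mark of 12 out of 20

print(expected_score, round(p_pass, 3))
```

With these assumed numbers, a student who guesses every item still reaches the pass mark roughly one time in four.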

Can a modified true-false test offset the effect of guessing?

By requiring students to explain their answers and disregarding a correct response if the explanation is
incorrect, a modified true-false test can lessen the effect of guessing.

Here are some rules of thumb in constructing true-false items.

GUIDELINES FOR CONSTRUCTING ALTERNATE-RESPONSE TEST

Rule 1: Do not give a hint (inadvertently) in the body of the question.

Example: The Philippines gained its independence in 1898 and therefore celebrated its centennial year
in 2000.

A centennial is a 100th anniversary, so the statement itself hints at the answer: it is FALSE because
100 years after 1898 is 1998, not 2000.

Rule 2: Avoid using the words “always”, “never”, “often”, and other words that tend to make a
statement either always true or always false.

Example: The sky is always blue.

The statement “the sky is always blue” is not necessarily true once you account for cloudy days
or the dark nighttime sky.

Rule 3: Avoid long sentences as these tend to be “true”. Keep sentences short.

Do not pad a true-or-false item with extra words in order to make it more challenging.
Ideally, a true-or-false statement should be a simple sentence without commas or semicolons.

Rule 4: Avoid trick statements with some minor misleading word or spelling anomaly, misplaced
phrases, etc. A wise student who does not know the subject matter may detect this strategy and thus
get the answer correct.

Example:

True or False. The Principle of our school is Mr. Albert P. Panadero.

The principal’s name may actually be correct but since the word is misspelled and the entire sentence
takes a different meaning, the answer would be false! This is an example of a tricky but utterly useless
item.

Rule 5: Avoid quoting verbatim from reference materials or textbooks. This practice sends the wrong
signal to the students that it is necessary to memorize the textbook word for word and thus,
acquisition of higher-level thinking skills is not given due importance.

Example:

A pronoun is a word that takes the place of a noun, a group of words acting as noun, or another
pronoun. (From Grammar and Composition Handbook of Glencoe McGraw Hill).

Rule 6: Avoid specific determiners or give-away qualifiers. Students quickly learn that strongly worded
statements are more likely to be false than true, for example, statements with “never”, “no”, “all”, or
“always”. Moderately worded statements are more likely to be true than false, so statements with
“many”, “often”, “sometimes”, “generally”, “frequently”, or “some” should also be avoided.

Example:

All types of cars have some type of engine.

True. Even though the absolute term “all” could tend to make this statement false, the qualifier “some”
makes it more general and allows for possibilities (“some type of engine” does not have to be
the familiar gasoline-driven engine).

Rule 7. With true or false questions, avoid a grossly disproportionate number of either true or false
statements or even patterns in the occurrence of true and false statements.

Example: TFTFTFTF
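A draft answer key can be screened for both problems with a short check. This is only a sketch: the function name `key_warnings` and the 60% imbalance threshold are assumptions made for illustration, not values given in the handout.

```python
def key_warnings(key, max_share=0.6):
    """Return warnings for a true-false answer key.

    key: string of 'T'/'F' answers in item order.
    max_share: assumed upper limit on the share of either answer.
    """
    warnings = []
    true_share = key.count("T") / len(key)
    if true_share > max_share or true_share < 1 - max_share:
        warnings.append(f"unbalanced key: {true_share:.0%} of answers are True")
    # Strict alternation: every answer differs from the one before it.
    if len(key) > 2 and all(a != b for a, b in zip(key, key[1:])):
        warnings.append("answers strictly alternate")
    return warnings


print(key_warnings("TFTFTFTF"))  # the alternating pattern from the example
print(key_warnings("TTFTFFTF"))  # balanced, no strict alternation
print(key_warnings("TTTTTTFT"))  # mostly True
```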

Rule 8: Avoid double negatives. They make a test item unclear and will confuse the student.

Example: Aspirin is not an illegal drug.

Hint: Cancel the negatives to turn the statement into a positive one, then select your answer.

The statement becomes “Aspirin is a legal drug,” so the answer is True.

Rules in Constructing Multiple Choice Tests

A generalization of the true-false test, the multiple-choice type of test offers the student more than
two (2) options per item to choose from. Each item in a multiple-choice test consists of two parts: (a) the
stem, and (b) the options. In the set of options, there is a “correct” or “best” option while all the others
are considered “distracters”.

Terminology for Multiple-choice Items

Before discussing the construction of such items, let's review the terminology commonly used to
describe the parts of multiple-choice questions. The list below defines the specific components of a
multiple-choice item.

Stem: A question or statement followed by a number of choices or alternatives that answer or complete
the question or statement

Alternatives: All the possible choices or responses to the stem

Distractors (foils): Incorrect alternatives

Correct answer: The correct alternative

The distracters are chosen in such a way that they are attractive to those who do not know the answer
or are guessing but, at the same time, have no appeal to those who actually know the answer. It is this
feature of multiple-choice tests that allows the teacher to test higher-order thinking skills even if the
options are clearly stated. As in true-false items, there are certain rules of thumb to be followed in
constructing multiple-choice tests.

Rule 1: Do not use unfamiliar words, terms and phrases. The ability of the item to discriminate or its
level of difficulty should stem from the subject matter rather than from the wording of the question.

Example: What would be the system reliability of a computer system whose slave and peripherals are
connected in parallel circuits and each one has a known time to failure probability of 0.05?

A student completely unfamiliar with the terms “slave” and “peripherals” may not be able to answer
correctly even if he or she knew the subject matter of reliability.

Rule 2: Do not use modifiers that are vague and whose meanings can differ from one person to the
next such as: much, often, usually, etc.

Example: Much of the process of photosynthesis takes place in the:

a. bark

b. leaf

c. stem

The qualifier “much” is vague and could have been replaced by a more specific qualifier such as “90% of
the photosynthetic process” or some similar phrase that would be more precise.

Rule 3: Avoid complex or awkward word arrangements. Also, avoid use of negatives in the stem as
this may add unnecessary comprehension difficulties.

Example:
(Poor) As President of the Republic of the Philippines, Corazon Cojuangco Aquino would stand next to
which President of the Philippine Republic subsequent to the 1986 EDSA Revolution?

(Better) Who was the President of the Philippines after Corazon C. Aquino?

Keep the stem simple, only including relevant information.

Example:

[Stem]: The purchase of the Louisiana Territory, completed in 1803 and considered one of Thomas
Jefferson's greatest accomplishments as president, primarily grew out of our need for

a. the port of New Orleans*

b. helping Haitians against Napoleon

c. the friendship of Great Britain

d. control over the Indians

CHANGE TO:

[Stem]: The purchase of the Louisiana Territory primarily grew out of our need for

a. the port of New Orleans*

b. helping Haitians against Napoleon

c. the friendship of Great Britain

d. control over the Indians

*an asterisk indicates the correct answer.

Any additional information that is irrelevant to the question, such as the phrase "completed in 1803…,"
can distract or confuse the student, thus providing an alternative explanation for why the item was
missed. Keep it simple.

Rule 4: Do not use negatives or double negatives as such statements tend to be confusing. It is best to
use simpler sentences rather than sentences that would require expertise in grammatical
construction.

Example:

(Poor) Which of the following will not cause inflation in the Philippine economy?

(Better) Which of the following will cause inflation in the Philippine economy?

(Poor) What does the statement “Development patterns acquired during the formative years are NOT
Unchangeable” imply?

A.

B.

C.

D.

(Better) What does the statement “Development patterns acquired during the formative years are
changeable” imply?

A.

B.

C.

D.

Once again, trying to determine which answer is NOT consistent with the stem requires more cognitive
load from the students and promotes the likelihood of more confusion. If that additional load or
confusion is unnecessary it should be avoided (Haladyna, Downing, & Rodriguez, 2002).

If you are going to use NOT or EXCEPT, the word should be highlighted in some manner so that students
recognize a negative is being used.

Rule 5: Each item stem should be as short as possible; otherwise, you risk testing more for reading and
comprehension skills.

Rule 6: Distracters should be equally plausible and attractive.

Example: The short story “May Day Eve” was written by which Filipino author?

a. Jose Garcia Villa

b. Nick Joaquin

c. Genoveva Edrosa Matute

d. Robert Frost

e. Edgar Allan Poe

If students can easily discount one or more distractors, then the chance of guessing correctly increases,
reducing the discriminability of that item.

If distracters had all been Filipino authors, the value of the item would be greatly increased. In this
particular instance, only the first three carry the burden of the entire item since the last two can be
essentially disregarded by the students.
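The effect on guessing can be made concrete with a little arithmetic, sketched below. It assumes a student who picks at random among the options that remain after ruling some out; the function name `guess_chance` is made up for this example.

```python
from fractions import Fraction


def guess_chance(n_options, n_eliminated=0):
    """Chance of picking the keyed answer at random after ruling out some distractors."""
    remaining = n_options - n_eliminated
    if remaining < 1:
        raise ValueError("at least the correct option must remain")
    return Fraction(1, remaining)


all_plausible = guess_chance(5)      # all five options look plausible
two_discounted = guess_chance(5, 2)  # the two non-Filipino authors are ruled out

print(all_plausible, two_discounted)
```

With five equally plausible options the chance of a lucky guess is 1 in 5. Once the two implausible options are discounted it rises to 1 in 3.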

Rule 7: All multiple-choice options should be grammatically consistent with the stem.

Example:

What is the dietary substance that is often associated with heart disease when found in high levels in
the blood?

a. glucose
b. cholesterol*

c. beta carotene

d. proteins

Change To

a. glucose

b. cholesterol*

c. beta carotene

d. protein

The distractor "proteins" is inconsistent with the stem; the stem is asking for a singular substance while
"proteins" is plural. It can be easy for the test writer to miss such inconsistencies. As a result, students
may more easily guess the correct answer without understanding the concept - a rival explanation.

Rule 8: The length, explicitness, or degree of technicality of alternatives should not be the
determinants of the correctness of the answer. The following is an example of this rule:

Example: If the three angles of two triangles are congruent, then the triangles are:

a. congruent whenever one of the sides of the triangles are congruent

b. similar

c. equiangular and therefore, must also be congruent

d. equilateral if they are equiangular

The correct choice, “b,” may be obvious from its length and explicitness alone. The other choices are
long and tend to explain why they must be the correct choices, leading students to suspect that they
are, in fact, not the correct answers!

Rule 9: Avoid stems that reveal the answer to another item.

Example:

One item on a test might be:

The electronic online catalog includes

a. books, videos, reference materials*

b. magazine articles and compact discs

c. newspaper clippings

d. only books

A later question on the same test asks:


Using the online catalog, which search term would you use to find a book by a specific writer?

a. title keyword

b. subject

c. author*

d. call number

After students see that online catalogs include books in the latter question, they can return to the first
question and rule out any alternatives that do not include books. It is relatively easy to miss such clues
when constructing a test since we construct many tests item by item. Thus, it is imperative to review the
entire test to check for clues.

Rule 10: Avoid alternatives that are synonymous with others or those that include or overlap others.

Example: What causes ice to transform from solid state to liquid state?

a. Change in temperature

b. Changes in pressure

c. Change in the chemical composition

d. Change in heat levels

The options a and d are essentially the same. Thus, a student who spots these identical choices would
right away narrow down the field of choices to a, b, and c. The last distracter would play no significant
role in increasing the value of the item.

Rule 11: Avoid presenting sequenced items in the same order as in the text.

Rule 12: Avoid use of assumed qualifiers that many examinees may not be aware of.

Rule 13: Avoid use of unnecessary words or phrases, which are not relevant to the problem at hand
(unless such discriminating ability is the primary intent of the evaluation). The item’s value is
particularly damaged if the unnecessary material is designed to distract or mislead. Such items test
the student’s reading comprehension rather than knowledge of the subject matter.

Example: The side opposite the thirty-degree angle in a right triangle is equal to half the length of the
hypotenuse. If the sine of a 30-degree angle is 0.5 and the hypotenuse is 5, what is the length of the side
opposite the 30-degree angle?

a. 2.5

b. 3.5

c. 5.5

d. 1.5

The statement about the sine of a 30-degree angle is quite unnecessary, since the first sentence already
gives the method for finding the length of the side opposite the thirty-degree angle. This is a case of a
teacher who wants to make sure that no student in his class gets the wrong answer!

Rule 14: Avoid use of non-relevant sources of difficulty such as requiring a complex calculation when
only knowledge of a principle is being tested.

Note that in the previous example, knowledge of the sine of the 30-degree angle would have led some
students to use the sine formula for calculation even if a simpler approach would have sufficed.
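A quick numerical check, sketched here with Python's standard `math` module, shows that the two routes in the triangle example lead to the same answer, which is why the sine value adds nothing to the item.

```python
import math

hypotenuse = 5

# Method 1: the rule stated in the stem (opposite side = half the hypotenuse).
by_rule = hypotenuse / 2

# Method 2: the sine formula, opposite = hypotenuse * sin(angle).
by_sine = hypotenuse * math.sin(math.radians(30))

# Compare with a tolerance, since floating-point sine is not exact.
print(by_rule, math.isclose(by_rule, by_sine))
```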

Rule 15: Pack the question in the stem.

Here is an example of a stem that poses no question. Avoid this by all means.

Example:

The Roman Empire_________.

a. Had no central government

b. Had no definite territory

c. Had no heroes

d. Had no common religion

Rule 16: Use the “None of the above” option only when the keyed answer is totally correct. When
choice of the “best” response is intended, “none of the above” is not appropriate, since the
implication has already been made that the correct response may be partially inaccurate.

Rule 17: Note that the use of “all of the above” may allow credit for partial knowledge. In a
multiple-option item that allows only one choice, a student who knew only that two (2) options were
correct could deduce the correctness of “all of the above”.

If a student recognizes that two of the four alternatives are true, the student knows that the answer is
all of the above without having to know whether the remaining alternative is true or not. Such guessing
requires some knowledge of the material, but not as extensive understanding as if they had to consider
all four of the alternatives.

Rule 18: Better still, use “none of the above” and “all of the above” sparingly; it is best not to use
them at all.

All of the above and none of the above have been misused as alternatives on some tests because
students have learned that all of the above or none of the above is almost always the right answer when
it is used on those tests. So, if you use all of the above or none of the above, do not always make it the
right or wrong answer. Generally, research has found more problems with the use of "all of the above"
than with "none of the above," but the common recommendation for both is to limit their use
(Haladyna, Downing, & Rodriguez, 2002).

Rule 19: Having compound response choices may purposefully increase the difficulty of an item.

The difficulty of a multiple-choice item may be controlled by varying the homogeneity or
degree of similarity of responses. The more homogeneous the responses, the more difficult the item.

Example:

(Less Homogeneous) Thailand is located in:

a. Southeast Asia

b. Eastern Europe

c. South America

d. East Africa

e. Central America

(More Homogeneous) Thailand is located next to:

a. Laos and Kampuchea

b. India and China

c. China and Malaya

d. Laos and China

e. India and Malaya
