Binary-Choice Test Items

When students select an answer from only two options, they are completing
a binary-choice item, also called alternative response. The most common
binary-choice item is the true/false question. Other types of options can be
right/wrong, correct/incorrect, yes/no, fact/opinion, agree/disagree, and so on. In
each case, the student selects one of two options. In this section, we will use the
terms binary-choice, alternative response, and true/false (TF) interchangeably.
Miller et al. (2009) note that binary-choice items are popular probably
because they are quick and easy to write, or at least they seem to be. It is true that
these items do take less time to write than good objective items of any other
format, but good binary-choice items are not that easy to write.
Before we proceed to discussing good practices in writing binary-choice
items, test your prior knowledge by accomplishing the following exercise. Use your
common sense to help you determine which are good items and which are poor.

Exercise: Put a G in the space next to the items you believe are good binary-choice
items and a P next to the items you feel are poor.
_____ 1. High-IQ children always get high grades in school.
_____ 2. Cognitive theorists believe that motivation to learn comes from extrinsic
factors.
_____ 3. If a plane crashed on the North and South Korean border, half of the
survivors would be buried in North Korea and half in South Korea.
_____ 4. The use of double negatives is not an altogether undesirable characteristic
of diplomats and academicians.
_____ 5. Prayer should not be outlawed in schools.
_____ 6. Of the objective items, true-false items are the least time-consuming to
construct.
_____ 7. The trend toward competency testing of high school graduates began in
the late 1970s and represents a big step forward for underachieving
learners.

In your notebooks, try to provide a brief explanation for each of your answers
in the exercise. Accomplishing the exercise before you read the answers and
explanations provided next will maximize your learning, prepare you for the
discussions that follow, and help you in the item-writing activity later.

Of the items presented in the exercise, only numbers two and six are
appropriately written. All the rest are poor items. Let’s discuss each below.
In item 1, the word always is an absolute. To some extent, alternative
response items depend on absolute judgments. However, statements of facts are
seldom completely true or completely false. Thus, an alert student will usually
answer “false” to items that include always, all, never, or only.
Remember that in classroom assessment, what we want to measure during
or after a learning experience is added knowledge or skill in students, not their
test-wiseness. So to avoid this problem, reduce the effects of guessing by avoiding the
use of absolute terms such as always, all, never, or only. Item 1 could be improved
by replacing always with a less absolute term like tend. Thus, item 1 might read:

High-IQ children tend to get high grades in school.

Prof Ed 221: Assessment in Learning 1 / jmmillare@usm.edu.ph

Item 2 is a good one. To answer the item correctly, the students would have
to know the perspectives of cognitive theorists about learning and motivation.
Item 3 is a trick question. “Survivors” of a plane crash are not buried! Chances
are that you never even noticed the word survivors and probably assumed the item
referred to fatalities. Trick items may have a place in tests of critical reading and
visual discrimination (in which case they would no longer be trick questions), but
seldom are they appropriate in the average classroom test. Rewritten, item 3 might
read:

If a plane crashes on the North and South Korean border, half the fatalities
would be buried in North Korea and half in South Korea.

Item 4 is also poor. First of all, it includes a double negative: not and
undesirable. Items with a single negative are confusing enough. Negating the first
negative with a second wastes space and test-taking time and also confuses most
students. If you want to say something, say it positively. The following revision
makes this item slightly more acceptable.

The use of double negatives is an altogether desirable trait of diplomats and
academicians.

We said slightly more acceptable because the item is still troublesome. The
word altogether is an absolute, and we now know we should avoid absolutes since
there usually are exceptions to the rules they imply. When we eliminate altogether,
the item reads:

The use of double negatives is a desirable trait of diplomats and academicians.

However, the item is still flawed because it states an opinion, not a fact. Is
the item true or false? The answer depends on who you ask. To most of us, the use
of double negatives is probably undesirable, for the reasons already stated. To some
diplomats, the use of double negatives may seem highly desirable. In short,
binary-choice statements should be factual. If you must use a binary-choice item to
measure knowledge of an opinionated position or statement, state the referent (the
person or group that made the statement or took the position), as illustrated in the
following revision:

According to the National Institute of Diplomacy, the use of double negatives is
a desirable trait of diplomats and academicians.

Item 5 further illustrates this point. It is deficient because it states an opinion.
It is neither obviously true nor obviously false. The following revision includes a
referent that makes it acceptable:

The Global Rights organization has taken the position that prayer should NOT
be outlawed in schools.

Notice the word NOT in the revised item 5. When you include a negative in a test item,
highlight it in italics, underlining, or uppercase letters so that the reader will not
overlook it. Remember that you intend to determine whether your students have
mastered your objective, not to ensure low test scores.
Item 6 represents a good item. It measures factual information, and the
phrase “Of the objective items” qualifies the item and limits it to a specific frame of
reference.
The last item is deficient because it is double-barreled: it is actually two items
in one. When do you mark true for a double-barreled item? When both parts of the
item are true? When one part is true? Or only when the most important part is true?
The point is that items should measure a single idea. Double-barreled items take too
much time to read and comprehend. To avoid this problem, simply construct two
items, as we have done here:

The trend toward competency testing of high school graduates began in the
late 1970s.

The trend toward competency testing represents a big step forward for
underachieving learners.

Better? Yes. Acceptable? Not quite. The second item is opinionated.
According to whom is this statement true or false? Let’s include a referent.

According to DepEd, the trend toward competency testing represents a big step
forward for underachieving learners.

The discussions that follow center on binary-choice item-writing guidelines
that apply to various learning targets.

Suggestions for Writing Binary-Choice Items

1. Write the item so that the answer options are consistent with the logic in
the sentence. The way the item is written will suggest a certain logic for
what type of response is most appropriate. For example, if you want to test
spelling knowledge, it doesn’t make much sense to use true/false questions;
it would be better to use correct/incorrect as options.

2. Avoid including two ideas in one statement unless cause-and-effect
relationships are being measured. For assessing recall knowledge, avoid two
or more facts or ideas in a single item. This is because one idea or fact may
be true and the other false, which introduces ambiguity and error.

Examples
Poor: T F Japan is easily affected by earthquakes because of the collision
between oceanic and continental plates.
Improved: T F Earthquakes in Japan are caused by the collision between
oceanic and continental plates.

An exception to this guideline is if the learning outcome measures the ability
to understand cause-and-effect relationships. (Examples are provided
separately in the section Assessing Reasoning and Deep Understanding.)

3. Avoid long, complex sentences. As noted earlier, a test item should indicate
whether a student has achieved the knowledge or understanding being
measured. Long, complex sentences tend to also measure reading
comprehension and therefore should be avoided in tests designed to
measure achievement.

Examples
Poor: T F A cup with hot water that has a spoon in it will cool more quickly
than a similar cup with the same amount of hot water that does not have a
spoon in it.
Improved: T F Hot water in a cup will cool more quickly if a spoon is
placed in the cup.

4. Avoid broad general statements if they are to be judged true or false. Most
broad generalizations are false unless qualified, and the use of qualifiers
provides clues to the answer.

Examples
Poor: T F The president of the Philippines is elected to that office.
Poor: T F The president of the Philippines is usually elected to that office.

In this example, the first version is generally true but must be marked false
because there are exceptions, such as when the vice president takes office in
the event of the president’s death or impeachment. In the second version, the
qualifier usually makes the statement true but provides a definite clue.
Words such as usually, generally, often, and sometimes are more likely to
appear in true statements, and absolute terms such as always, never, all,
none, and only are more likely to appear in false statements.

Although the influence of such clues sometimes can be offset by balancing
their use in TF statements, the simplest solution seems to be to avoid the use
of broad generalizations that are obviously false or must be qualified by
specific determiners.

5. Avoid insignificant or trivial facts and words. It is relatively easy to write
“difficult” binary-choice items that measure trivial knowledge. Avoid this by
beginning with what you believe are the most significant learning targets.

Examples
Poor: T F Jose Rizal’s dog was named Vincenzo.
Poor: T F Charles Darwin was 22 years old when he began his voyage
around the world.

Trivial statements cause students to direct their attention toward
memorizing details at the expense of more important ideas.

6. Avoid negative statements, especially double negatives. Statements that
include the words not or no are confusing to students and make items and
answers more difficult to understand. Careful reading and sound logic
become prerequisites for answering correctly. If the knowledge can be tested
only with a negatively worded statement, be sure to highlight the negative
word with boldface type, underlining, or all caps.

Examples
Poor: T F Philippine senators are not elected to six-year terms.
Improved: T F Philippine senators are elected to six-year terms.

7. If opinion is used, attribute it to some source, unless the ability to identify
opinion is being measured. Statements of opinion cannot be marked true or
false. Knowing whether some significant individual or group supports or
refutes a certain opinion, however, can be important from a learning
standpoint.

Attributing an opinion to some source makes it possible to use true/false and
measure knowledge/simple understanding concerning the beliefs held by an
individual or the values supported by an organization or institution. If opinion
statements are not attributed to any source (there are no referents
indicated), the fact/opinion binary-choice item type is more appropriate to
use. This time, the learning target is reasoning/deep understanding.
(Examples are provided separately in the section Assessing Reasoning and
Deep Understanding.)

8. True statements and false statements should be approximately equal in
length. There is a natural tendency for true statements to be longer because
such statements must be precisely phrased in order to be absolutely true.
This can be overcome by lengthening the false statements so that they
become similar in length to true statements. This way, the length of the
statement will be eliminated as a possible clue to the correct answer.

9. The number of true statements and false statements should be
approximately equal. Some students consistently mark statements “true”
when in doubt about an answer, whereas others consistently mark them
“false.” Neither response set should be favored by overloading the test with
items of one type.

In honoring this suggestion, the words approximately equal should be given
special attention. If a teacher consistently uses exactly the same number,
this will provide a clue to the student who is unable to answer some of the
test items. The best procedure seems to be to vary the percentage of true
statements somewhere between 40% and 60%. Under no circumstances
should the statements be all true or all false. Students who detect this as a
possibility can obtain perfect scores on the basis of one guess.

10. Do not try to trick students. Items that are written to “trick” students by
including a word that changes the meaning of an idea or by inserting some
trivial fact should be avoided. Trick items undermine your credibility,
frustrate students, and provide less valid measures of knowledge.
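
Suggestions 8 and 9 above lend themselves to a quick mechanical check of a draft test. The following is only a minimal sketch in Python, not a procedure from the module: the function name review_answer_key, the report keys, and the answer key assigned to the sample statements are all assumptions made for illustration. It reports the share of true statements against the 40% to 60% band and compares the average word count of true and false statements.

```python
from statistics import mean

def review_answer_key(items, low=0.40, high=0.60):
    """items: list of (statement, is_true) pairs. Returns a small report dict.

    The 40%-60% band follows suggestion 9; everything else here is illustrative.
    """
    if not items:
        raise ValueError("items must not be empty")
    # Word counts serve as a rough stand-in for statement length (suggestion 8).
    true_lengths = [len(s.split()) for s, is_true in items if is_true]
    false_lengths = [len(s.split()) for s, is_true in items if not is_true]
    share_true = len(true_lengths) / len(items)
    return {
        "share_true": share_true,
        "balanced": low <= share_true <= high,
        "mean_words_true": mean(true_lengths) if true_lengths else 0.0,
        "mean_words_false": mean(false_lengths) if false_lengths else 0.0,
    }

# Sample statements taken from this module; the True/False key is assumed.
sample = [
    ("Philippine senators are elected to six-year terms.", True),
    ("Negatively charged particles of electricity are called neutrons.", False),
    ("The word Mississippi has 11 letters.", True),
    ("An electric condenser is used to generate electricity.", False),
    ("Mechanical energy is turned into electricity by means of the generator.", True),
]
report = review_answer_key(sample)
print(report)
```

A check like this only flags a lopsided key or a large gap in average length. Whether a given statement is well written still requires the teacher's judgment.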

Assessing Application. Assessing application with binary-choice items is
essentially the same process as is used in multiple-choice items. Knowledge needs
to be used to answer questions that present novel situations. For example, the
following questions would test what students have learned about electricity and
resistance at the application level.

Examples
T F Other things being equal, an electric stove with greater resistance will
be hotter than a stove with less resistance.
T F Jon is building a new electric motor. His decision to use thicker wire
results in less resistance.

Assessing Reasoning and Deep Understanding. Binary-choice items can be
used to assess reasoning skills in several different ways. Students can be asked to
indicate whether a statement is a fact or an opinion.

Examples
If the statement is a fact, circle F; if it is an opinion, circle O.
F O Literature is ancient Rome’s most important legacy.
F O The word Mississippi has 11 letters.
F O The best way to wash a car is with a sponge.

If people are to think critically about a topic, they must first be able to
distinguish fact from opinion.
Additional reasoning skills can be assessed using the same approach by
developing some statements that are examples of the skill and some statements
that are not examples. This can be done with many of the critical thinking skills (e.g.,
identifying stereotypes, biased statements, emotional language, relevant data, and
verifiable data).

Examples
If the statement is an example of a stereotype, circle S; if it is not a
stereotype, circle N.

S N Ilocanos are thrifty people.
S N Women live longer than men.

If emotional language is used in the statement, circle E; if no emotional
language is used, circle N.
E N Health insurance reform is needed so that poor people with serious
injuries will be able to lead productive lives.
E N Health insurance is going to cost a lot of money.

Logic can be assessed by asking if one statement follows logically from
another. More specifically, such binary-choice items measure the ability to
recognize cause-and-effect relationships. They usually contain two true
propositions in one statement, and the student is to judge whether the relationship
between them is true or false.

Examples
If the second part of the sentence explains why the first part is true, circle T
for true; if it does not explain why the first part is true, circle F for false.
T F Food is essential because it tastes good.
T F Plants are essential because they provide oxygen.
T F Alex is tall because he has blue eyes.

Directions: In each of the following statements, both parts of the statement
are true. You are to decide whether the second part explains why the first
part is true. If it does, circle Yes. If it does not, circle No.
Yes No 1. Leaves are important because they shade the tree trunk.
Yes No 2. Whales are mammals because they are large.
Yes No 3. Some plants do not need sunlight because they get their food
from other plants.

Checklist for Writing and Reviewing Binary-Choice Items (McMillan, 2018; Miller
et al., 2009)
✓ Is this the most appropriate type of item to use?
✓ Is the type of answer logically consistent with the statement?
✓ Does the item contain a single idea?
✓ Are the statements briefly and clearly expressed?
✓ Is trivial knowledge being tested?
✓ Is the item stated positively?
✓ Are opinion statements attributed to some source?
✓ Have specific determiners (e.g., usually, always) been avoided?
✓ Are the true and false (or other binary) items approximately equal in length?
✓ Has a detectable pattern of answers been avoided?
✓ Is there an approximately equal number of true and false (or other binary)
items?
✓ Does the item try to trick students?
✓ If revised, are the items still relevant to the intended learning outcomes?
✓ Have the items been set aside for a time before reviewing them?
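
A few of the checklist questions (specific determiners, negative wording) can be pre-screened automatically before the teacher's own review. The sketch below is an illustration only: the function name screen_item and the flag labels are hypothetical, and the word lists are limited to the terms this module names. Simple word matching will produce false alarms, so the flags are prompts for a second look, not verdicts.

```python
import re

# Word lists limited to the terms named in this module.
ABSOLUTES = {"always", "all", "never", "none", "only"}
QUALIFIERS = {"usually", "generally", "often", "sometimes"}
NEGATIVES = {"not", "no"}

def screen_item(statement):
    """Return a list of wording flags for one binary-choice statement."""
    words = re.findall(r"[a-z']+", statement.lower())
    flags = []
    if any(w in ABSOLUTES for w in words):
        flags.append("absolute term")
    if any(w in QUALIFIERS for w in words):
        flags.append("qualifier")
    negative_count = sum(w in NEGATIVES for w in words)
    if negative_count == 1:
        flags.append("negative wording")
    elif negative_count >= 2:
        # Negating prefixes such as "un-" are NOT detected by this crude count.
        flags.append("possible double negative")
    return flags

print(screen_item("High-IQ children always get high grades in school."))
print(screen_item("The president of the Philippines is usually elected to that office."))
print(screen_item("Philippine senators are not elected to six-year terms."))
print(screen_item("Philippine senators are elected to six-year terms."))
```

Note that item 4 of the exercise would be flagged only for "not", because the second negative sits inside the word undesirable. This is one reason the remaining checklist questions still call for a careful human reading.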

Advantages of Binary-choice Test Items (McMillan, 2018; Miller et al., 2009):
1. They are efficient. Students can typically respond to 5-8 T/F items per minute
and about three T/F items in the time it takes to respond to two MC items.
2. Wide sampling of course material can be obtained. This is related to the first
advantage: a student can respond to many test items in a short time, and
this makes it possible to cover a wide range of content (see footnote 1).
3. Compared to MC items, T/F questions are easier to construct. They take less
time to write than good objective items of any other format; however, good
alternative response items are not that easy to write (see footnote 2).
4. The format is similar to what is asked in class, so students are familiar with
the thinking process involved in making binary choices.
5. Scoring is objective and quick.
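
To make the efficiency figures in advantage 1 concrete, the arithmetic can be sketched as follows. This is a rough illustration, not a planning rule from the sources: the function names are hypothetical, the 10-minute test section is an assumed example, and a steady answering pace is assumed throughout.

```python
def tf_items_answerable(minutes, items_per_minute):
    """Number of T/F items a student can answer at a steady pace."""
    return int(minutes * items_per_minute)

def mc_items_in_same_time(tf_items):
    """Convert a T/F count to MC items using the 3-to-2 ratio quoted above."""
    return tf_items * 2 // 3

for pace in (5, 8):  # low and high ends of the quoted 5-8 items per minute
    tf = tf_items_answerable(10, pace)  # assumed 10-minute test section
    print(pace, tf, mc_items_in_same_time(tf))
```

Under these assumptions, a 10-minute section holds roughly 50 to 80 T/F items, or about 33 to 53 MC items in the same time. In practice, the number of items is set by the learning targets in the test blueprint, not by time alone.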

Disadvantages/Limitations of Binary-choice Test Items (McMillan, 2018; Miller et
al., 2009):
1. They are not especially useful beyond the knowledge area. The exceptions to
this seem to be distinguishing between fact and opinion and identifying
cause and effect relationships. These two outcomes are probably the most
important measured by T/F items. Many of the learning outcomes measured
by T/F items can be measured more effectively by other forms of selection
items, especially the MC form.
2. They are susceptible to guessing. With every item, regardless of how well or
poorly written, the student has a 50% chance of guessing correctly even
without reading the item. This 50% chance becomes even greater if the
items are poorly constructed. Often more test-wise students are able to
score higher. A combination of some knowledge, guessing, and poorly
constructed items that give clues to the correct answer will allow some
students to score well, even though their level of knowledge is weak.
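
The effect of guessing described in limitation 2 can be quantified with the binomial distribution. The sketch below is an illustration under stated assumptions, not a computation from the sources: it assumes every item is answered by an independent blind guess with a 50% chance of success, and the function name prob_at_least is hypothetical.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of getting at least k of n items right by guessing.

    Assumes independent guesses with success probability p on every item.
    """
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Chance of reaching 70% correct by blind guessing on a short and a longer test.
print(round(prob_at_least(7, 10), 3))   # 10 items, at least 7 correct: 0.172
print(round(prob_at_least(14, 20), 3))  # 20 items, at least 14 correct: 0.058
```

Under these assumptions, a student has about a 17% chance of scoring 70% or better on a 10-item test by guessing alone, but only about a 6% chance on a 20-item test. Longer tests therefore dilute blind guessing, although clues in poorly constructed items raise the per-item chance above 50% and weaken this protection.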

Footnote 1: T/F types of items are not suitable for some types of subject matter. T/F
statements require course material that can be phrased so that the statements are
true or false without qualification or exception. There are areas in which such
absolutely true or false statements cannot be made. In some fields, such as the
social sciences, practically all significant statements require some qualification. In
some subject areas, only relatively trivial statements can be reduced to absolute
terms.

Footnote 2: Miller et al. (2009) note that the perceived ease of construction of T/F
items has probably resulted from the common practice of taking statements from
textbooks, changing half of them to false statements, and submitting the product to
students as a T/F test. Such test items are often so obvious that everyone gets them
correct or so ambiguous that even the better prepared students are confused. In short,
it is easy to construct poor T/F items. To construct clearly stated T/F items that
measure significant learning outcomes, however, requires much skill.

A common criticism of a T/F item is that a student may be able to recognize
a false statement as incorrect but still not know what is correct. For example, when
students answer the following item as false, it does not indicate whether they know
what negatively charged particles of electricity are called; all the answer tells us is
that they know they are not called neutrons.

T F Negatively charged particles of electricity are called neutrons.

This is a rather crude measure of knowledge because there is an infinite
number of things that negatively charged particles of electricity are not called. To
overcome such difficulties, some teachers prefer to have the students change all
false statements to true. When this is required, the part of the statement that it is
permissible to change should be indicated.
Directions: Read each of the following statements. If a statement is true,
circle the T. If a statement is false, circle the F and change the underlined
word to make the statement true. Place the new word in the blank space
before each number.
T F __________ 1. Particles of negatively charged electricity are called
neutrons.
T F __________ 2. Mechanical energy is turned into electricity by means
of the generator.
T F __________ 3. An electric condenser is used to generate electricity.
It is important to indicate the keywords to be changed. Otherwise, students
may change the entire statement. In addition to the increase in scoring difficulty,
this frequently leads to true statements that deviate considerably from the original
intent of the item. A clever student may even change false statements to true by
simply adding not in the appropriate place.
In sum, T/F items are most useful in situations where there are only two
possible alternatives (e.g., right, left; more, less; who, whom) and in special uses such
as distinguishing fact from opinion, cause from effect, superstition from scientific
belief, relevant from irrelevant information, valid from invalid conclusions, and the
like.

Activity 5D

Task Description: This activity is designed to test your ability to construct
binary-choice items according to the guidelines presented.

Task Instructions:
1. Based on the test blueprint constructed for Activity 5A, select two or three
instructional objectives that can be measured using binary-choice items.
2. Write a total of 10 binary-choice items. Each objective selected should have
3-5 items that appropriately measure the learning outcome targeted.
3. Indicate the level of thinking, the objective, and the directions for every set
of items.
4. Review the test items using the checklist provided in this module.
5. Make necessary revisions.
