
LESSON 4 – DESIGNING AND DEVELOPING ASSESSMENT

Introduction
The quality of assessment instruments is vital, since a teacher's evaluation of and
judgment about his/her students are based on the information obtained using these
instruments. Thus, to come up with more realistic assessments of student learning,
teachers must consider the various components of high-quality assessment.

A. General Principles of High-Quality Assessment


Ebel and Frisbie (1999), as cited by Garcia (2008), listed five basic principles that
should guide teachers in assessing the learning progress of the students and in developing
their own assessment tools.
• Measure all instructional objectives. The test that a teacher writes should match all
the learning objectives posed during instruction.
• Cover all the learning tasks. The teacher should construct a test that contains a wide
range of sampling items. It must not focus only on one type of objective. It must be
representative of all targeted learning outcomes.
• Use appropriate test items. The test items must be appropriate to measure learning
outcomes.
• Make the test valid and reliable. The teacher must construct a test that is valid so
that it measures what it is supposed to measure. The test is reliable when the
scores of the students remain the same or consistent when the teacher gives
the same test a second time.
• Use test to improve learning. The test scores should be utilized by the teacher
properly to improve learning by discussing the skills or competencies on the items that
have not been learned or mastered.

1. Clarity of the Learning Target. When a teacher plans for classroom instruction, the
learning target should be clearly stated and must be focused on students' learning
objectives rather than teacher activity. The learning outcomes must be Specific,
Measurable, Attainable, Realistic, and Time-bound (SMART). The performance task of the
students should also be clearly presented so that they can accurately demonstrate what
they are supposed to do and how the final product should be done. The evaluation
procedures, the criteria to be used, and the skills to be assessed should also be clearly
discussed with the students.
2. Appropriateness of Assessment Tools. The type of test should always match the
instructional objectives or learning outcomes of the subject matter posed during the
delivery of the instruction. Teachers should be skilled in choosing and developing
assessment methods appropriate for instructional decisions.
The kinds of assessment tools commonly used to assess the learning progress of the
students are:
a. Objective Test. It is a type of test that requires students to select the correct
response from several alternatives or to supply a word or short phrase to answer
a question or to complete a statement. It includes true-false, matching type, and
multiple choice questions. The word objective refers to the scoring: it indicates
that there is only one correct answer.
b. Subjective Test. It is a type of test that permits the student to organize and
present an original answer. It includes either short answer questions or long
general questions. This type of test has no specific answer. Hence, it is usually
scored on an opinion basis, although there will be certain facts and understanding
expected in the answer.
c. Performance Assessment. According to Mueller (2010), it is an assessment in
which students are asked to perform real-world tasks that demonstrate
meaningful application of essential knowledge and skills. It can appropriately
measure learning objectives which focus on the ability of the students to
demonstrate skills or knowledge in real-life situations.
d. Portfolio Assessment. It is an assessment that is based on the systematic,
longitudinal collection of student work created in response to specific known
instructional objectives and evaluated in relation to the same criteria.
Portfolio is a purposeful collection of student’s work that exhibits the student’s
efforts, progress and achievements in one or more areas over a period of time. It
measures growth and development of the students.
e. Oral Questioning. This method is used to collect assessment data by asking oral
questions. It is the most commonly used of all forms of classroom assessment,
assuming that the learner hears and shares the use of a common language with
the teacher during instruction. The ability of the student to communicate orally
is very relevant to this type of assessment. This is also a form of formative
assessment.
f. Observation Technique. Another method of collecting data is through
observation. The teacher will observe how the students carry out the activities
either observing a process or product. Observation techniques may be formal or
informal. Formal observations are planned in advance like when the teacher
assesses oral report or presentation in class while informal observation is done
spontaneously during instruction like observing the working behavior of students
while performing a laboratory experiment. The behavior of the students involved
in the performance is systematically monitored, described, classified, and
analyzed.
g. Self-report. The responses of the student may be used to evaluate both
performance and attitude. Assessment tools could include sentence completion,
Likert scales, checklists, or holistic scales.

B. Qualities of Assessment Tools


A good test must possess the following attributes or qualities:
1. Validity. It refers to the degree to which a test measures what it seeks to measure.
To determine whether the test constructed is valid or not, the teacher has to
answer the following questions:
a. Does the test adequately sample the intended content?
b. Does it test the behaviors/skills important to the content being tested?
c. Does it test all the instructional objectives of the content taken up in the class?
2. Reliability. It refers to the consistency of measurement. A test, therefore, is reliable
if it produces similar results when administered twice to the same group of students.
3. Fairness. Fairness means the test items should not have any biases. They should
not be offensive to any examinee subgroup. To ensure fairness, the teacher should
construct and administer the test in a manner that allows students an equal chance
to demonstrate their knowledge or skills.
4. Objectivity. It is the extent to which personal biases or the subjective judgment of
the test scorer is eliminated in checking the students' responses to the test items.
If two raters who assess the same student on the same test cannot agree on the
score, the test lacks objectivity and neither of the scores from the judges is valid.
5. Scorability. It means that the test is easy to score and check, as an answer key and
answer sheet are provided.
6. Adequacy. It means that the test should contain a wide range of sampling of items to
determine the educational outcomes or abilities so that the resulting scores are
representative of the total performance in the areas measured.
7. Administrability. The test is easy to administer as clear and simple instructions are
provided to students, proctors, and scorers. It should be administered uniformly to all
students so that the scores obtained will not vary due to factors other than
differences of the students’ knowledge and skills.
8. Practicality and Efficiency. This principle states that evaluation should be finished in
a specified period of time and applicable in a particular educational setting. It refers
to the teacher’s familiarity with the methods used, time required for the assessment,
complexity of the administration, ease of scoring, ease of interpretation of the test
results and the materials used that must be at the lowest cost.
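The notion of reliability described in item 2 above, that a test produces similar results when administered twice to the same group, is usually quantified as a test-retest correlation. The sketch below illustrates the idea in Python; the score lists and the helper function are made-up illustrations, not from this module.

```python
# Illustrative sketch: test-retest reliability as the correlation between two
# administrations of the same test to the same group. Scores are sample data.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x ** 0.5 * var_y ** 0.5)

first_admin  = [40, 35, 28, 45, 33, 38]   # hypothetical scores, first testing
second_admin = [42, 34, 30, 44, 35, 37]   # same students, second testing

r = pearson_r(first_admin, second_admin)
# A coefficient near 1.0 indicates consistent (reliable) measurement.
```

A coefficient close to 1.0 supports the claim of consistency; a low coefficient signals that the scores shift between administrations.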
C. Planning Test and Construction of Table of Specifications (TOS)
a. Steps in Developing Assessment Tools
1. Examine the instructional objectives and learning outcomes. The first step is to
examine and go back to the instructional objectives so that you can match them
with the test items to be constructed.
2. Make a table of specifications (TOS). The table of specifications is a chart or table
that details the content and cognitive level assessed on a test, as well as the
types and emphases of test items (Gareis and Grant, 2018). It is important in
addressing the validity and reliability of the test items; it provides the test
constructor a way to ensure that the assessment is based on the intended
learning outcomes and that the number of questions on the test is adequate to
ensure dependable results. It also serves as a guide in constructing a test and
determining the type of test items that you need to construct.
3. Construct the test items. The following are some guidelines in constructing test
items (Airasian, 1994):
a. Avoid wording that is ambiguous and confusing.
b. Use appropriate vocabulary and sentence structure.
c. Keep questions short and to the point.
d. Write the items such that they have one correct answer.
e. Do not provide clues to the answer.
4. Assemble the test items. When assembling test items, consider the following
guidelines:
a. Group all items that have similar format.
b. Arrange the test items from easy to difficult.
c. Space the items for easy reading.
d. Keep items and options in the same page.
e. Place the illustration near the description.
5. Check the assembled test items. Before reproducing the test, it is very important
to proofread the test items for some typographical and grammatical errors and
make necessary corrections if any.
6. Write directions. Test directions are very important in any written test as the
inability of the test taker to understand them affects the validity of the test.
Thus, the directions that students have to follow in answering the test should be
complete, clear, and concise. The method of answering has to be kept as simple
as possible. Test directions should also contain instructions on guessing.
7. Make the answer key. To facilitate checking of students’ answers, the teacher has
to provide a scoring key in advance. Be sure to check the answer key so that the
correct answers follow a fairly random sequence.
8. Analyze and improve the test items. Analyzing and improving the test items should
be done after checking, scoring and recording the test to examine the quality of
each item in the test.
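Step 7's advice that correct answers follow a fairly random sequence can be sketched in code. The function below spreads the keyed answers of a multiple-choice test roughly evenly across the option letters before shuffling their order; the function name and the four-letter option set are illustrative assumptions, not part of this module.

```python
# Illustrative sketch: building an answer key whose correct options are
# shuffled but roughly balanced, so no letter dominates and no pattern forms.
import random
from collections import Counter

def build_answer_key(num_items, options="ABCD", seed=None):
    """Assign each item a keyed option, roughly balanced across letters."""
    rng = random.Random(seed)
    # Repeat the full option cycle to cover all items, then shuffle the order.
    pool = (list(options) * (num_items // len(options) + 1))[:num_items]
    rng.shuffle(pool)
    return pool

key = build_answer_key(50, seed=1)
counts = Counter(key)
# With 50 items and 4 options, each letter is keyed 12 or 13 times.
```

In practice a teacher would then arrange each item's distracters so that the predetermined letter holds the correct answer.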

D. Table of Specifications
What is a Table of Specification?
A Table of Specifications (TOS), sometimes called a test blueprint, is a tool used by
teachers to design a test. It is a table that maps out the test objectives, content, or topics
covered by the test; the levels of cognitive behavior to be measured; the distribution of items;
and the test format. It helps ensure that the course's intended learning outcomes,
assessments, and instruction are aligned.
Generally, a TOS is prepared before a test is created. However, it is ideal to prepare
one even before the start of instruction. Teachers need to create a TOS for every test that
they intend to develop. The test TOS is important because it does the following:
● ensures that the instructional objectives and what the test captures match
● ensures that the test developer will not overlook details that are considered
essential to a good test
● makes developing a test easier and more efficient
● ensures that the test will sample all important content areas and processes
● is useful in planning and organizing
● offers opportunity for teachers and students to clarify achievement expectations
Preparing a Table of Specification
Here are the steps in developing TOS:
1. Determine the objectives of the test. In general, objectives are identified at
the start, when the teacher creates the syllabus. There are three types of
objectives:
▪ cognitive- designed to increase an individual’s knowledge, understanding,
and awareness
▪ affective-aim to change an individual’s attitude towards something
desirable
▪ psychomotor- designed to build physical or motor skills
In planning for assessment, choose only the objectives that can be best
captured by a written test. Some cognitive objectives are not meant for written tests,
such as measuring a student's fluency skills, much less psychomotor skills
like a student's balance or speed. These types of objectives are more appropriately
measured through performance-based assessments.

2. Determine the coverage of the test. Only topics or concepts that have been
covered in class and are relevant should be included in the test. Calculate the
weight for each topic. The weight assigned per topic in the test is based on the
relevance of and time spent to cover each topic during the discussion. The
percentage of time for a topic in a test is determined by dividing the time spent
for that topic during instruction by the total amount of time spent for all topics
covered in the test. See the example on the next page.
3. Determine the number of items for the whole test. The amount of time
available to the students should be considered. As a general rule, students are
given 30-60 seconds for each item in test formats with choices. For a 1-hour
class, this means that the test should not exceed 60 items. However, because
you need to give time for paper distribution and giving instructions, the number
of items should be less (around 50 items).
4. Determine the number of items per topic. Here, the weight per topic should
be considered.

Topic                 # of Sessions   Time Spent (mins)   % of Time (weight)   # of Items

Skeletal System       0.5             30                  10                   5
Muscular System       1.5             90                  30                   15
Circulatory System    1.0             60                  20                   10
Respiratory System    0.5             30                  10                   5
Reproductive System   0.5             30                  10                   5
XXX                   0.5             30                  10                   5
YYY                   0.5             30                  10                   5

TOTAL                 5 sessions      300 mins            100%                 50 items
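The arithmetic behind the table above can be sketched in a few lines of Python. The code simply reproduces the example's figures: each topic's weight is its instruction time divided by the total time, and its item count is that weight applied to the planned test length.

```python
# Sketch of the TOS arithmetic: weight per topic = topic minutes / total
# minutes; items per topic = weight applied to the planned number of items.
minutes_per_topic = {
    "Skeletal System": 30, "Muscular System": 90, "Circulatory System": 60,
    "Respiratory System": 30, "Reproductive System": 30, "XXX": 30, "YYY": 30,
}
total_items = 50                                    # ~50 items fit a 1-hour test
total_minutes = sum(minutes_per_topic.values())     # 300 minutes over 5 sessions

weights = {t: m / total_minutes * 100 for t, m in minutes_per_topic.items()}
items = {t: round(total_items * w / 100) for t, w in weights.items()}
# Muscular System: 90/300 = 30% of instruction time -> 15 of the 50 items.
```

With uneven topic times the rounded counts may not sum exactly to the planned total, in which case the teacher adjusts an item or two by hand.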


Formats of TOS

One-way TOS maps out the topics, test objectives, number of hours spent and format,
number, and placement of items. This is easy to develop because it works around the
objectives without considering the different levels of cognitive behaviors. However, this
cannot ensure that all levels of cognitive behaviors that should have been developed by the
course are covered in the test.

Topic      Test Objective             # of Hours   Relative     Number     Placement
                                      Spent        Weight (%)   of Items   of Items

Skeletal   recognize the importance   0.5 hours    10%          5          Multiple Choice
system     of the skeletal system                                          #1-5

XXX        XXX                        1.5          30%          15         Matching type
                                                                           #1-15
Two-way TOS reflects the content, time spent, number of items, and levels of
cognitive behavior targeted per test content, based on the theory behind cognitive
testing. For example, the common framework for testing at present in the DepEd Classroom
Assessment Policy is the Revised Bloom's Taxonomy (DepEd, 2015). One advantage of the
format is that it allows one to see the levels of cognitive skills and dimensions of
knowledge that are emphasized by the test. It also shows the framework of assessment
used in the development of the test. This is more complex than the one-way format.

Content   Time    # & % of   KD*   Level of Cognitive Behavior, Item Format, # and Placement of Items
          Spent   Items            Remember       Understand     Apply   Analyze   Evaluate     Create

XXX       0.5     5 (10%)    F     1.3 (#1-3)
                             C                    1.2 (#4-5)

YYY       1.5     15 (30%)   F     1.2 (#6-7)
                             C     1.2 (#8-9)     1.2 (#10-11)
                             P     1.2 (#12-13)   1.2 (#14-15)
                             M     1.3 (#16-17)                                    11.1 (#41)   11.1 (#42)

SCORING                            1 pt per item                 2 pts per item    3 pts per item

OVERALL   5       50 (100%)        20             20             10
TOTAL
Legend: KD - Knowledge Dimension (Factual, Conceptual, Procedural, Metacognitive)
Three-way TOS. This type of TOS reflects the features of one-way and two-way TOS.
One advantage of this format is that it challenges the test writer to classify objectives based
on the theory behind assessment. It also shows the variability of thinking skills targeted by
the test. However, it takes a much longer time to develop this type of TOS. See the format below:

Content   Learning    Time      # of      Level of Cognitive Behavior, Item Format, # and Placement of Items
          Objective   Spent     Items     Remember   Understand   Apply   Analyze   Evaluate   Create

XXX       xxx         0.5 hrs   5 (10%)   #1-3 (F)   #4-5 (C)

Scoring   -           -         -         1 point per item        2 pts per item    5 points per item

Overall Total                   50 (100%)

E. Developing Assessment Tools

Types of Paper-and-Pencil Tests


Development of paper-and-pencil tests requires careful planning and expertise
in terms of actual test construction. The more seasoned teachers can produce true-false
items that can test even higher order thinking skills and not just rote memory learning.
Essays are easier to construct than objective test types, but the difficulty in
scoring essay examinations deters teachers from using this particular form of
examination in actual practice.

• CONSTRUCTING SELECTED-RESPONSE TYPE


1. True-False Test
Binomial-choice or alternate response tests are tests that have only two
(2) options, such as true or false, right or wrong, yes or no, good or better, check
or cross out, and so on. A student who knows nothing of the content of the
examination would have a 50% chance of getting the correct answer by sheer
guesswork. Although correction-for-guessing formulas exist, it is best that the
teacher ensures that a true-false item is able to discriminate properly between
those who know the content and those who are just guessing. A modified
true-false test can offset the effect of guessing by requiring students to explain
their answer and to disregard a correct answer if the explanation is incorrect.
Here are some rules of thumb in constructing true-false items.
Rule 1. Do not give a hint (inadvertently) in the body of the question.
Rule 2. Avoid using the words "always", "never", "often", and other words
that tend to make statements either always true or always false.
Rule 3. Avoid long sentences as these tend to be “true”. Keep sentences
short.
Rule 4. Avoid trick statements with some minor misleading word or spelling
anomaly, misplaced phrases, etc. A wise student who does not know
the subject matter may detect this strategy and thus get the answer
correctly.
Rule 5. Avoid quoting verbatim from reference materials or textbooks. This
practice sends the wrong signal to the students that it is necessary to
memorize the textbook word for word and, thus acquisition of higher-level
thinking skills is not given due importance.
Rule 6. Avoid specific determiners or give-away qualifiers. Students quickly learn
that strongly worded statements are more likely to be false than true.
Rule 7. With true or false questions, avoid a grossly disproportionate number of
either true or false statements, or even patterns in the occurrence of true and
false statements.
Rule 8. Avoid double negatives. These make a test item unclear and will
definitely confuse the student.
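The correction-for-guessing formulas mentioned above are not spelled out in this module; the classical one, assumed here, is S = R - W / (k - 1), where R is the number right, W the number wrong, and k the number of options per item. The sketch below applies it to a true-false test and to a four-option multiple choice test.

```python
# Classical correction-for-guessing formula (an assumption, not stated in the
# text): S = R - W / (k - 1). Blind guessing then expects to gain nothing.

def corrected_score(right, wrong, options_per_item):
    """Score adjusted so that random guessing has an expected gain of zero."""
    return right - wrong / (options_per_item - 1)

# True-false (k = 2): every wrong answer cancels one right answer.
score_tf = corrected_score(right=30, wrong=10, options_per_item=2)   # 20.0
# Four-option multiple choice (k = 4): a wrong answer costs 1/3 point.
score_mc = corrected_score(right=30, wrong=9, options_per_item=4)    # 27.0
```

Omitted items are not counted as wrong, which is why such formulas are usually paired with directions telling students whether to guess.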

2. Multiple Choice Test


The multiple choice type of test offers the student more than two (2)
options per item to choose from. Each item in a multiple choice test consists of
two parts: the stem and the options. In the set of options, there is a "correct" or
"best" option, while all the others are considered "distracters". The distracters
are chosen in such a way that they are attractive to those who do not know the
answer or who are guessing. At the same time, a key feature of multiple choice
tests is that they allow the teacher to test higher order thinking skills even if the
options are clearly stated. As in true-false items, there are certain rules of thumb
to be followed in constructing multiple choice tests.
✓ Do not use unfamiliar words, terms and phrases.
✓ Do not use modifiers that are vague and whose meaning can differ from
one person to the next, such as: much, often, usually, etc.
✓ Avoid complex or awkward word arrangements. Also, avoid use of negatives
in the stem as this may add unnecessary comprehension difficulties.
✓ Do not use negatives or double negatives, as such statements tend to be
confusing. It is best to use simpler sentences rather than sentences that
would require expertise in grammatical construction.
✓ Each item stem should be as short as possible; otherwise you risk testing
more for reading and comprehension skills.
✓ Distracters should be equally plausible and attractive.
✓ All multiple-choice options should be grammatically consistent with the
stem.
✓ The length, explicitness, or degree of technicality of the alternatives should
not be determinants of the correctness of the answer.
✓ Avoid stems that reveal the answer to another item.
✓ Avoid alternatives that are synonymous with others or that include or
overlap others.
✓ Avoid presenting sequenced items in the same order as in the text.
✓ Avoid use of assumed qualifiers that examinees may not be aware of.
✓ Avoid use of unnecessary words or phrases which are not relevant to the
problem at hand (unless such discriminating ability is the primary intent of
the evaluation). The item’s value is particularly damaged if the unnecessary
material is designed to distract or mislead. Such items test the student's
reading comprehension rather than knowledge of the subject matter.
✓ Avoid use of non-relevant sources of difficulty such as requiring a complex
calculation when only knowledge of a principle is being tested.
✓ Pack the question in the stem. Here is an example of a question. Avoid it by
all means.
• The Roman Empire ______.
A. had no central government
B. had no definite territory
C. had no heroes
D. had no common religion
✓ Use the "none of the above" option only when the keyed answer is totally
correct. When a choice of the "best" response is intended, "none of the
above" is not appropriate, since the implication has already been made that
the correct response may be partially inaccurate.
✓ Note that use of "all of the above" may allow credit for partial knowledge.
In a multiple option item (allowing only one option choice), if a student only
knows that two (2) options are correct, he could then deduce the
correctness of "all of the above". This assumes you are allowed only one
correct choice.
✓ Better still, use "none of the above" and "all of the above" sparingly; it is
best not to use them at all.
✓ Having compound response choices may purposefully increase difficulty of an
item. The difficulty in a multiple-choice item may be controlled by varying
the homogeneity or degree of similarity of response. The more
homogeneous, the more difficult the item because they all look like the
correct answer.

3. Matching Type
The matching type items may be considered modified multiple choice type
items where the choices progressively reduce as one successfully matches the
items on the left with the items on the right.
✓ Match homogeneous, not heterogeneous, items. The items to match must
be homogeneous. If you want your students to match authors with their
literary works, one column must list authors and the second column must
list literary works. Do not insert, for instance, a nationality among the names
of authors; that makes a poor item since it is obviously not a possible answer.
✓ The stem (longer in construction than the options) must be in the first
column while options (usually shorter) must be in the second column.
✓ The options must be more in number than the stems to prevent the
student from arriving at the answer by mere process of elimination.
✓ Like any other test, the direction of the test must be given. The
examinees must know exactly what to do.

• CONSTRUCTING SUPPLY OR CONSTRUCTED RESPONSE TYPE


Another useful device for testing lower order thinking skills is the supply type of
tests. Like the multiple choice test, the items in this kind of test consist of a stem and a
blank where the students would write the correct answer.
Supply type tests depend heavily on the way the stems are constructed. These
tests allow for one and only one answer and hence often test only the student's
recall of knowledge.

1. Completion Type of Test


Guidelines for the formulation of a completion type of test
✓ Avoid over-mutilated sentences like the following:
The ____ produced by the _____ is used by the green _____ to
change the _____ and _____ into _____. This is called _____.

✓ Avoid open-ended items. There should be only one acceptable answer. An
open-ended item is not a good test item.
✓ The blank should be at the end or near the end of the sentence. The
question must first be asked before an answer is expected. As in the matching
type of test, the stem (where the question is packed) must be in the first
column.
✓ Ask questions on significant items, not on trivial matters.
✓ The length of the blanks must not suggest the answer, so it is better to
make the blanks uniform in size.
2. Essays
Essays, classified as non-objective tests, allow for the assessment of higher
order thinking skills. Such tests require students to organize their thoughts on a
subject matter in coherent sentences in order to inform an audience. In essay tests,
students are required to write one or more paragraphs on a specific topic.
Essay questions can be used to measure attainment of a variety of
objectives.
• Comparing
• Relating cause and effect
• Justifying
• Summarizing
• Generalizing
• Inferring
• Classifying
• Applying
• Analyzing
• Evaluating
• Creating

Types of Essays
1. Restricted Essay
It is also referred to as a short focused response. Examples are asking
students to "write an example," "list three reasons," or "compare and
contrast two techniques."

2. Non-Restricted/Extended Essays
Extended responses can be much longer and more complex than short
responses, but students are encouraged to remain focused and organized.

Guidelines for formulating and scoring of essay tests


Rule 1. Phrase the directions in such a way that students are guided on
the key concepts to be included. Specify how the students should respond.
Rule 2. Inform the students on the criteria to be used for grading their
essays. This rule allows the students to focus on relevant and substantive
materials rather than on peripheral and unnecessary facts and bits of
information.
Rule 3. Put a time limit on the essay test.
Rule 4. Decide on your essay grading system prior to getting the essays
of your students.
Rule 5. Evaluate all of the student’s answers to one question before
proceeding to the next question.
Rule 6. Evaluate answers to essay questions without knowing the
identity of the writer.
Rule 7. Whenever possible, have two or more persons grade each
answer.
Rule 8. Do not provide optional questions.
Rule 9. Provide information about the value/weight of the question
and how it will be scored.
Rule 10. Emphasize higher level thinking skills.
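Rules 2, 4, and 9 above, deciding the grading system and the weight of each criterion before scoring, can be sketched as a simple analytic rubric. The criterion names, weights, and rating scale below are assumed for illustration; they are not prescribed by this module.

```python
# Illustrative sketch (assumed rubric): an analytic essay rubric where each
# criterion has a predetermined weight and receives a 0-4 rating, then the
# ratings combine into a weighted percentage score.
rubric_weights = {"content": 0.5, "organization": 0.3, "mechanics": 0.2}

def essay_score(ratings, weights=rubric_weights, max_rating=4):
    """Weighted percentage score from per-criterion ratings (0..max_rating)."""
    return 100 * sum(weights[c] * r / max_rating for c, r in ratings.items())

score = essay_score({"content": 3, "organization": 4, "mechanics": 2})
# 100 * (0.5*0.75 + 0.3*1.0 + 0.2*0.5) = 77.5
```

Fixing the rubric in advance and sharing it with students is what lets the same essay receive the same grade from different raters, which is the objectivity quality described earlier.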
