Assessment in Learning
COURSE
4. Determine the number of items for the whole test. To determine the number of items to be
included in the test, the amount of time needed to answer the items is considered. As a general
rule, students are given 30-60 seconds for each item in test formats with choices. For a one-hour
class, this means that the test should not exceed 60 items. However, because you also need to
allow time for distributing the test papers/booklets and giving instructions, the number of items
should be fewer, perhaps just 50 items.
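The arithmetic behind this guideline can be sketched as follows. This is only an illustration, not a prescribed procedure: the function name `max_items` is hypothetical, and the 10-minute allowance for distribution and instructions is an assumption chosen so that the result matches the 50 items suggested above.

```python
def max_items(class_minutes, seconds_per_item, overhead_minutes=0):
    """Upper bound on the number of items that fit in the usable testing time."""
    usable_seconds = (class_minutes - overhead_minutes) * 60
    return usable_seconds // seconds_per_item

# One-hour class at 60 seconds per item, with no time set aside for logistics.
no_overhead = max_items(60, 60)
# Same class with an ASSUMED 10 minutes for distribution and instructions.
with_overhead = max_items(60, 60, overhead_minutes=10)

print(no_overhead)    # 60
print(with_overhead)  # 50
```

At 30 seconds per item the same hour could hold more items, so the 60-item ceiling corresponds to the slower end of the 30-60 second range.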
5. Determine the number of items per topic. To determine the number of items to be included
per topic, the weights per topic are considered. Thus, using the example above, for a 50-item final
test, Theories and Concepts, Humanistic Theories, Cognitive Theories, Behavioral Theories, and
Social Learning Theories will have 5 items each, Trait Theories will have 10 items, and
Psychoanalytic Theories will have 15 items.
Topic                        Percentage of Time (Weight)    No. of Items
Theories and Concepts                   10.0                      5
Psychoanalytic Theories                 30.0                     15
Trait Theories                          20.0                     10
Humanistic Theories                     10.0                      5
Cognitive Theories                      10.0                      5
Behavioral Theories                     10.0                      5
Social Learning Theories                10.0                      5
TOTAL                                  100.0                     50
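The allocation in the table can be reproduced with a short calculation. The sketch below assumes a 50-item test, which is the total shown in the table; the function name `items_per_topic` is hypothetical.

```python
# Topic weights (percentage of instructional time) taken from the table above.
weights = {
    "Theories and Concepts": 10.0,
    "Psychoanalytic Theories": 30.0,
    "Trait Theories": 20.0,
    "Humanistic Theories": 10.0,
    "Cognitive Theories": 10.0,
    "Behavioral Theories": 10.0,
    "Social Learning Theories": 10.0,
}
TOTAL_ITEMS = 50  # total shown in the table

def items_per_topic(weights, total_items):
    """Allocate items to each topic in proportion to its weight."""
    return {topic: round(total_items * pct / 100) for topic, pct in weights.items()}

allocation = items_per_topic(weights, TOTAL_ITEMS)
for topic, n_items in allocation.items():
    print(f"{topic}: {n_items}")
print("TOTAL:", sum(allocation.values()))
```

With these weights every product is a whole number. With other weights, rounding may make the allocations add up to slightly more or less than the intended total, so the sum should be checked and adjusted by hand.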
2. Two-Way TOS. A two-way TOS reflects not only the content, time spent, and number of items
but also the levels of cognitive behavior targeted per test content based on the theory behind
cognitive testing. For example, the common framework for testing at present in the DepEd
Classroom Assessment Policy is the Revised Bloom's Taxonomy (DepEd, 2015). One advantage
of this format is that it allows one to see the levels of cognitive skills and dimensions of
knowledge that are emphasized by the test. It also shows the framework of assessment used in
the development of the test. However, this format is more complex than the one-way format.
3. Three-Way TOS. This type of TOS reflects the features of one-way and two-way TOS. One
advantage of this format is that it challenges the test writer to classify objectives based on the
theory behind the assessment. It also shows the variability of thinking skills targeted by the test.
However, it takes much longer to develop this type of TOS.
Columns: Content; Learning Objective; Time Spent; No. of Items; Level of Cognitive Behavior
(R, U, AP, AN, E, C) and Knowledge Dimension*, Item Format, No. and Placement of Items
Content: Theories and Concepts
    Learning Objective: Recognize important concepts in personality theories
    Time Spent: 0.5 hours
    No. of Items: 5 (10.0%)
    R:  I.3, #1-3 (F)
    U:  I.2, #4-5 (C)
Content: Psychoanalytic Theories
    Learning Objective: Identify the different theories of personality under the psychoanalytic model
    Time Spent: 1.5 hours
    No. of Items: 15 (30.0%)
    R:  I.2, #6-7 (F)
    U:  I.2, #8-9 (C)
    AP: I.2, #10-11 (C); I.2, #12-13 (P)
    AN: I.2, #14-15 (P); I.3, #16-18 (M)
    E:  II.1, #41 (M)
    C:  II.1, #42 (M)
Etc.
Scoring: 1 point per item; 2 points per item; 3 points per item
OVERALL TOTAL: 50 items (100.0%), distributed as 20, 20, and 10
*Legend: KD = Knowledge Dimension (Factual, Conceptual, Procedural, Metacognitive);
I – Multiple Choice; II – Open-Ended
What are the general guidelines in choosing the appropriate test format?
Not every test is universally valid for every type of learning outcome. For example, if an
intended outcome for a Research Method 1 course is “to design and produce a research study
relevant to one’s field of study,” you cannot measure this outcome through a multiple-choice test
or a matching-type test.
To guide you in choosing the appropriate test format and designing fair and appropriate
yet challenging tests, you should ask the following important questions:
1. What are the objectives or desired learning outcomes of the subject/unit/lesson being assessed?
Deciding on what test format to use generally depends on your learning objectives or the
desired learning outcomes of the subject/unit/lesson. Desired learning outcomes (DLOs) are
statements of what learners are expected to do or demonstrate as a result of engaging in the
learning process.
2. What level of thinking is to be assessed (i.e., remembering, understanding, applying,
analyzing, evaluating, and creating)? Does the cognitive level of the test question match your
instructional objectives or DLOs?
The level of thinking to be assessed is also an important factor to consider when
designing your test, as this will guide you in choosing the appropriate test format. For example, if
you intend to assess how well your learners can identify important concepts discussed in
class (i.e., remembering or understanding level), a selected-response format such as a
multiple-choice test would be appropriate. However, if you intend to assess how well your students can
explain a concept or framework learned in class and apply it in another setting (i.e., applying
and/or analyzing level), you may consider giving constructed-response test formats such as
essays.
When constructing classroom assessment tools, it is important that all levels of cognitive
behavior are represented, from Remembering (R), Understanding (U), Applying (Ap),
Analyzing (An), and Evaluating (E) to Creating (C), while also taking into consideration the
Knowledge Dimensions, i.e., Factual (F), Conceptual (C), Procedural (P), and Metacognitive (M).
3. Is the test matched and aligned with the course’s DLOs and the course contents or learning
activities?
The assessment tasks should be aligned with the instructional activities and DLOs. Thus,
it is important that you are clear about what DLOs are to be addressed by your test and what
course activities or tasks are to be implemented to achieve the DLOs.
For example, if you want learners to articulate and justify their stand on ethical
decision-making and social responsibility practices in business (i.e., the DLO), then an essay test and a class
debate are appropriate measures and tasks for this learning outcome. A multiple-choice test may be used,
but only if you intend to assess learners’ ability to recognize what is an ethical versus an unethical
decision-making practice. In the same manner, matching-type items may be appropriate if you
want to know whether your students can differentiate and match the different approaches or
terms to their definitions.
Good: (1) Who was the Philippine president during Martial Law?
(2) Who was the first president of the Commonwealth of the Philippines?
3. Word the stem positively and avoid negatives, such as NOT and EXCEPT, and especially double
negatives in the stem. If the negative word is necessary, underline or capitalize the word for emphasis.
4. Refrain from making the stem too wordy or containing too much information unless the
problem/question requires the facts presented to solve the problem.
Faulty: What does DNA stand for, and what is the organic chemical of complex molecular
structure found in all cells and viruses and codes genetic information for the transmission of
inherited traits?
Good: As a chemical compound, what does DNA stand for?
5. Do not use unfamiliar words, terms, and phrases. The ability of the item to discriminate or its
level of difficulty should stem from the subject matter rather than from the wording of the
question.
Example: What would be the system reliability of a computer system whose slave and peripherals
are connected in parallel circuits and each one has a known time-to-failure probability of 0.05?
A student completely unfamiliar with the terms “slave” and “peripherals” may not be able
to answer correctly even if they know the subject matter of reliability.
6. Do not use modifiers that are vague and whose meanings can differ from one person to the
next, such as “much,” “often,” and “usually.”
Example: Much of the process of photosynthesis takes place in the:
a. bark b. leaf c. stem
The qualifier “much” is vague and could have been replaced by more specific qualifiers
like: “90% of the photosynthetic process” or some similar phrase that would be more precise.
7. Avoid complex or awkward word arrangements. Also, avoid use of negatives in the stem as
this may add unnecessary comprehension difficulties.
Faulty: As President of the Republic of the Philippines, Corazon Cojuangco Aquino would stand
next to which President of the Philippine Republic after the 1986 EDSA Revolution?
Good: Who was the President of the Philippines after Corazon C. Aquino?
8. Do not use negatives or double negatives, as such statements tend to be confusing. It is best to
use simpler sentences rather than sentences that would require expertise in grammatical
construction.
Faulty: (1) Which of the following will not cause inflation in the Philippine economy?
(2) What does the statement “Development patterns acquired during the formative years
are NOT unchangeable” imply?
Good: (1) Which of the following will cause inflation in the Philippine economy?
(2) What does the statement “Development patterns acquired during the formative years
are changeable” imply?
9. Each item stem should be as short as possible; otherwise, you risk testing more for reading and
comprehension skills.
Options:
1. Provide three (3) to five (5) options per item, with only one being the correct or best
answer/alternative.
2. Write options that are parallel or similar in form and length to avoid giving clues about the
correct answer.
e. It is an area that one or more individual organisms defend against competition from
other organisms.
Example: Which experimental gas law describes how the pressure of gas tends to increase as the
volume of the container decreases? (i.e., “The absolute pressure exerted by a given mass of an
ideal gas is inversely proportional to the volume it occupies.”)
Faulty: Who among the following has become the President of Philippine Senate?
Good: Who was the first ever President of the Philippine Senate?
Example: The short story “May Day Eve” was written by which Filipino author?
If distracters had all been Filipino authors, the value of the item would be greatly
increased. In this particular instance, only the first three carry the burden of the entire item since
the last two can be essentially disregarded by the student.
The matching test item format requires learners to match a word, sentence, or phrase in
one column (i.e., premise) to a corresponding word, sentence, or phrase in a second column (i.e.,
response). It is most appropriate when you need to measure the learners’ ability to identify
the relationship or association between similar items. Matching items work best when the course
content has many parallel concepts. While the matching-type test format is generally used for simple
recall of information, you can find ways to make it applicable or useful in assessing higher levels of
thinking such as applying and analyzing.
The following are the general guidelines in writing good and effective matching-type tests:
1. Clearly state in the directions the basis for matching the stimuli with the responses.
Faulty: Directions: Match the following.
Good: Directions: Column I is a list of countries while Column II presents the continents where
these countries are located. Write the letter of the continent corresponding to the country on the
line provided in Column I.
Item #1’s instruction is less preferred as it does not detail the basis for matching the stem
and the response options.
2. Ensure that the stimuli are longer, and the responses are shorter.
Faulty:
    A              B
    Bangladesh     a. Green background with red circle in the center
    Indonesia      b. One red strip on top and white strip at the bottom
    Japan          c. Red background with white five-petal flower in the center
    Singapore      d. Red background with large yellow circle in the center
    Thailand       e. Red background with large yellow pointed star in the center
                   f. White background with large red circle in the center
Good:
    A                                                              B
    Green background with red circle in the center                a. Bangladesh
    One red strip on top and white strip at the bottom            b. Indonesia
    Red background with white five-petal flower in the center     c. Japan
    Red background with large yellow circle in the center         d. Singapore
    Red background with large yellow pointed star in the center   e. Thailand
                                                                   f. Vietnam
Item #2 is a better version because the descriptions are presented in the first column
while the response options are in the second column. The stems are also longer than the options.
3. For each item, include only topics that are related with one another and share the same
foundation of information.
Good: On the line to the left of each country in Column I, write the letter of the country’s capital
presented in Column II.
Column I Column II
1. Indonesia a. Bandar Seri Begawan
2. Malaysia b. Bangkok
3. Philippines c. Jakarta
4. Thailand d. Kuala Lumpur
e. Manila
Item #1 is considered an unacceptable item because its response options are not parallel
and include different kinds of information that can provide clues to the correct/wrong answers.
On the other hand, item #2 details the basis for matching, and the response options only include
related concepts.
4. Make the response option short, homogeneous, and arranged in logical order.
5. Include response options that are reasonable and realistic and similar in length and
grammatical form.
Faulty:
    A                    B
    History              a. Studies the production and distribution of goods/services
    Political Science    b. Study of politics and power
    Psychology           c. Study of Society
    Sociology            d. Understand role of mental functions in social behavior
                         e. Uses narrative to examine and analyze past events
Good:
    A                                          B
    1. Study of living things                  a. Biology
    2. Study of mind and behavior              b. History
    3. Study of politics and power             c. Political Science
    4. Study of recorded events in the past    d. Psychology
    5. Study of society                        e. Sociology
                                               f. Zoology
Item #1 is less preferred because the response options are not consistent in terms of their
length and grammatical form.
Example: Match the following fractions with their corresponding decimal equivalents:
Faulty:
A B
1/4 a. 0.25
5/4 b. 0.28
7/25 c. 0.90
9/10 d. 1.25
Good:
A B
1/4 a. 0.25
5/4 b. 0.28
7/25 c. 0.90
9/10 d. 1.25
e. 0.09
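A matching item like the one above can be checked mechanically before it is given to learners. The sketch below is only an illustration; the helper name `answer_key` is hypothetical. It confirms that each fraction in the "Good" version has exactly one equivalent decimal among the responses, and it lists any response that is left over as a distractor.

```python
from fractions import Fraction

# Premises and responses copied from the "Good" version of the item above.
premises = ["1/4", "5/4", "7/25", "9/10"]
responses = {"a": "0.25", "b": "0.28", "c": "0.90", "d": "1.25", "e": "0.09"}

def answer_key(premises, responses):
    """Map each premise to the letters of all responses with exactly the same value."""
    return {
        premise: [letter for letter, value in responses.items()
                  if Fraction(value) == Fraction(premise)]
        for premise in premises
    }

key = answer_key(premises, responses)
print(key)

# Responses that no premise uses serve as distractors.
used = {letter for letters in key.values() for letter in letters}
distractors = sorted(set(responses) - used)
print(distractors)
```

In the "Faulty" version, the number of responses equals the number of premises, so the last match can be found by elimination. The extra response in the "Good" version removes that clue.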
True or false items are used to measure learners’ ability to identify whether a statement or
proposition is correct/true or incorrect/false. They are best used when learners’ ability to judge or
evaluate is one of the desired learning outcomes of the course.
There are different variations of the true or false items. These include the following:
2. Yes-No Variation. In this format, the learner has to choose yes or no, rather than true or false.
Ex. The following are kinds of tests. Circle Yes if it is an authentic test and No if it is not.
3. A-B Variation. In this format, the learner has to choose A or B, rather than true or false.
Ex. Indicate which of the following are traditional or authentic tests by circling A if it is a
traditional test and B if it is authentic.
                               Traditional    Authentic
    Multiple Choice Test            A              B
    Debates                         A              B
    End-of-the Term Project         A              B
    True or False Test              A              B
Because true or false test items are prone to guessing, as learners are asked to choose
between two options, utmost care should be exercised in writing true or false items. The
following are the general guidelines in writing true or false items:
Faulty: The presidential system of government, where the president is only the head of state or
government, is adopted by the United States, Chile, Panama, and South Korea.
Good: The presidential system, where the president is only the head of state or government, is
adopted by Chile.
Item #1 is of poor quality because, while the description is right, the countries given are
not all correct. While South Korea has a presidential system of government, it also has a prime
minister who governs alongside the president.
Faulty: Education is a continuous process of higher adjustment for human beings who have
evolved physically and mentally, which is free and conscious of God, as manifested in nature
around the intellectual, emotional, and humanity of man.
Good: Education is the process of facilitating learning or the acquisition of knowledge, skills,
values, beliefs, and habits.
Item #1 is somewhat confusing, especially for younger learners because there are many
ideas in one statement.
Double negatives are sometimes confusing and could result in wrong answers, not
because the learner does not know the answer but because of how the test items are presented.
Faulty: The news and information posted on the CNN website is always accurate.
Good: The news and information posted on the CNN website is usually accurate.
Absolute words such as “always” and “never” restrict possibilities and make a statement
true 100% of the time. They are also a hint for a “false” answer.
Faulty: If an object is accelerating, a net force must be acting on it, and the acceleration of an
object is directly proportional to the net force applied to the object.
Faulty: Esprit de corps among soldiers is important in the face of hardships and opposition in
fighting the terrorists.
Good: Military morale is important in the face of hardships and opposition in fighting the terrorists.
Students may have a difficult time understanding the statement, especially if the term
“esprit de corps” has not been discussed in class. Using unfamiliar words would likely lead to
guessing.
7. Avoid lifting statements from the textbook and other learning materials.
The following are the general guidelines in writing good fill-in-the-blank or completion
test items:
In item #1, the word “core” is not the significant word. The item is also prone to many
and varied interpretations, resulting in many possible answers.
2. Do not omit too many words from the statement such that the intended meaning is lost.
Item #1 is prone to many and varied answers. For example, a student may answer the
question based on the capitals of these countries or based on the continent where they are located.
Item #2 is preferred because it is more specific and requires only one correct answer.
Faulty: Ferdinand Marcos declared martial law in 1972. Who was the president during that period?
Item #1 already gives a clue that Ferdinand Marcos was the president during this time
because only the president of a country can declare martial law.
Faulty: The government should start using renewable energy sources for generating electricity
such as ______.
Good: The government should start using renewable sources of energy by using turbines called ______.
Item #1 has many possible answers because the statement is very general (e.g., wind,
solar, biomass, geothermal, and hydroelectric). Item #2 is more specific and only requires one
correct answer (i.e., wind).
The word “an” in item #1 provides a clue that the correct answer starts with a vowel.
6. If possible, put the blank at the end of the statement rather than at the beginning.
In item #1, learners may need to read the sentence to the end before they can recognize
the problem, and then re-read it before answering the question. On the other hand, in item
#2, learners can already identify the context of the problem by reading through the sentence only
once, without having to go back and re-read it.
There are two types of essay test: (1) extended-response essay and (2) restricted-response essay.
Extended-Response: Requires much longer and complex responses.
    Example: How do the leopard and the tiger differ? Support your answer with details and
    information from the article.
Restricted-Response: Is much more focused and restrained.
    Example: Tina is preparing for a demonstration to display at her school's science fair. She
    needs to show the effect of salt on the buoyancy of an egg.
    Part A: Identify at least two other actions that would make Tina's demonstration better.
    Part B: Explain why each action would improve the demonstration.
The following are the general guidelines in constructing good essay questions:
1. Clearly define the intended learning outcome to be assessed by the essay test.
To design effective essay questions or prompts, the specific intended learning outcomes
must first be identified. If the intended learning outcomes to be assessed lack clarity and
specificity, the questions or prompts may assess something other than what they intend to assess.
Appropriate directive verbs that most closely match the ability that learners should demonstrate
must be used in the prompts. These include verbs such as compose, analyze, interpret, explain,
and justify, among others.
2. Refrain from using essay tests for intended learning outcomes that are better assessed by other
kinds of assessment.
Some intended learning outcomes can be efficiently and reliably assessed by selected-type
tests rather than by essay tests. In the same manner, there are intended learning outcomes that
are better assessed using other authentic assessments, such as performance tests, rather than by
essay tests. Thus, it is important to take into consideration the limitations of essay tests when
planning and deciding what assessment method to employ for an intended learning outcome.
3. Clearly define and situate the task within a problem situation as well as the type of thinking
required to answer the test.
Essay questions or prompts should provide clear and well-defined tasks to the learners. It
is important to carefully choose the directive verb, to write clearly the object or focus of the
directive verb, and to delimit the scope of the task. Having clear and well-defined tasks will
guide learners on what to focus on when answering the prompts, thus avoiding responses that
contain ideas that are unrelated or irrelevant, too long, or focusing only on some part of the task.
Emphasizing the type of thinking required to answer the question will also guide students on the
extent to which they should be creative, deep, complex, and analytical in addressing and
responding to the questions.
4. Present tasks that are fair, reasonable, and realistic to the students.
Essay questions should contain tasks or questions that students will be able to do or
address. These include those that are within the level of instruction/training, expertise, and
experience of the students.
5. Be specific in the prompts about the time allotment and criteria for grading the response.
Essay prompts and directions should indicate the approximate time given to the students
to answer the essay questions to guide them on how much time they should allocate for each
item, especially if several essay questions are presented. How the responses are to be graded or
rated should also be clarified to guide the students on what to include in their responses.
Problem-solving test items are used to measure learners' ability to solve problems that
require quantitative knowledge and competencies and/or critical thinking skills. These items
present a problem situation or task that will require learners to demonstrate work procedures or
come up with a correct solution. Full or partial credit can be assigned to the answers, depending
on the answers or solutions required.
There are different variations of the quantitative problem-solving item. These include the
following:
1. One answer choice - This type of question contains four or five options, and students are
required to choose the best answer.
Example: What is the mean of the following score distribution: 32, 44, 56, 69, 75, 77, 95, 96?
A. 68 B. 69 C. 72 D. 74 E. 76
2. All possible answer choices - This type of question has four or five options, and students are
required to choose all of the options that are correct.
Example: Consider the following score distribution: 12, 14, 14, 14, 17, 24, 27, 28, 30. Which of
the following is/are the correct measure/s of central tendency? Indicate all possible
answers.
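The answer options for this item are not shown here, but the three common measures of central tendency for the distribution can be computed as a check on the answer key. The sketch below uses Python's standard `statistics` module; treating mean, median, and mode as the measures in question is an assumption.

```python
from statistics import mean, median, multimode

# Score distribution from the example above.
scores = [12, 14, 14, 14, 17, 24, 27, 28, 30]

score_mean = mean(scores)        # arithmetic average
score_median = median(scores)    # middle value of the sorted scores
score_modes = multimode(scores)  # most frequent value(s), returned as a list

print("mean:", score_mean)
print("median:", score_median)
print("mode(s):", score_modes)
```

For this distribution the mean is 20, the median is 17, and the only mode is 14, so an option is correct only if it states one of these values for the matching measure.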
3. Type-In answer - This type of question does not provide options to choose from. Instead, the
learners are asked to supply the correct answer. The teacher should inform the learners at the
start how their answers will be rated. For example, the teacher may require just the correct
answer or may require learners to present the step-by-step procedures in coming up with their
answers. On the other hand, for non-mathematical problem solving, such as a case study, the
teacher may present a rubric showing how their answers will be rated.
Example: Compute the mean of the following score distribution: 32, 44, 56, 69, 75, 77, 95, 96.
Indicate your answer in the blank provided.
In this case, the learners will only need to give the correct answer without having to show
the procedures for computation.
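When only the final answer is required, rating a type-in response reduces to comparing it with the key. The sketch below computes the key for the distribution above and checks a response against it. The helper name `is_correct` and its optional tolerance are hypothetical; whether any tolerance is allowed is up to the teacher.

```python
# Score distribution from the type-in example above.
scores = [32, 44, 56, 69, 75, 77, 95, 96]
answer_key = sum(scores) / len(scores)  # arithmetic mean
print(answer_key)

def is_correct(response, key, tolerance=0.0):
    """Return True if a typed-in numeric response matches the key within the tolerance."""
    try:
        return abs(float(response) - key) <= tolerance
    except ValueError:
        # Non-numeric input cannot be a correct numeric answer.
        return False

print(is_correct("68", answer_key))    # True
print(is_correct("66.9", answer_key))  # False
```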
Example: Lillian, a 55-year-old accountant, has been suffering from frequent dizziness, nausea,
and lightheadedness. During the interview, Lillian was obviously restless and sweating.
She reported feeling so stressed and fearful of anything without any apparent reason. She
could not sleep or eat well. She also started to withdraw from family and friends, as she
experienced frequent panic attacks. She also said that she was constantly worrying about
everything at work and at home. What might be Lillian's problem? What should she do to
alleviate all her symptoms?
Problem-solving test items are a good test format as they minimize guessing, measure
instructional objectives that focus on higher cognitive levels, and cover an extensive amount of
content or topics. However, they require more time for teachers to construct, read, and correct,
and are prone to rater bias, especially when scoring rubrics/criteria are not available. It is
therefore important that good-quality problem-solving test items are constructed.
The following are some of the general guidelines in constructing good problem-solving test items:
Faulty: Tricia was 135.6 lbs. when she started with her zumba/aerobics exercises. After three
months of attending the sessions three times a week, her weight was down to 122.8 lbs.
About how many lbs. did she lose after three months? Write your final answer in the space
provided and show your computations. [This question asks, "about how many" and does
not indicate whether learners need to give the exact weight or whether they need to round
off their answer and to what extent.]
Good: Tricia was 135.6 lbs. when she started with her zumba/aerobics exercises. After three months
of attending the sessions three times a week, her weight was down to 122.8 lbs. How
many lbs. did she lose after three months? Write your final answer in the space provided
and show your computations. Write the exact weight; do not round off.
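The difference between the two versions shows up as soon as the answer is computed. In the sketch below, the exact loss and the rounded loss are two different numbers, and the faulty wording does not say which one is expected. `Decimal` is used so that the subtraction is exact rather than subject to floating-point error.

```python
from decimal import Decimal

start_weight = Decimal("135.6")  # lbs. at the start of the sessions
end_weight = Decimal("122.8")    # lbs. after three months

exact_loss = start_weight - end_weight
rounded_loss = round(exact_loss)  # rounded to the nearest whole pound

print(exact_loss)    # 12.8
print(rounded_loss)  # 13
```

Under the "Good" wording only 12.8 is acceptable, while the "Faulty" wording leaves both 12.8 and 13 defensible.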
2. Be specific and clear about the type of response required from the students.
Faulty: ASEANA Bottlers, Inc. has been producing and selling Tutti Fruity juice in the Philippines,
aside from their Singapore market. The sales for the juice in the Singapore market were
S$5 million more than those of their Philippine market in 2016, S$3 million more in 2017,
and S$4.5 million more in 2018. If the sales in the Philippine market in 2018 were PHP35 million,
what were the sales in the Singapore market during that year? [This is a faulty question
because it does not specify in what currency the answer should be presented.]
Good: ASEANA Bottlers, Inc. has been producing and selling Tutti Fruity juice in the Philippines,
aside from their Singapore market. The sales for the juice in the Singapore market were
S$5 million more than those of their Philippine market in 2016, S$3 million more in 2017,
and S$4.5 million more in 2018. If the sales in the Philippine market in 2018 were PHP35 million,
what were the sales in the Singapore market during that year? Provide the answer in Singapore
dollars (S$1 = PHP36.50). [This is a better item because it specifies in what currency the
answer should be presented, and the exchange rate is given.]
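The effect of leaving the currency unspecified can be seen by working the item out in both currencies. The sketch below rests on two assumptions that should be checked against the original item: that the exchange rate reads S$1 = PHP36.50, and that the S$4.5 million figure for 2018 is the amount by which Singapore sales exceeded Philippine sales, as in the two earlier years. The variable names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

PHP_PER_SGD = Decimal("36.50")      # ASSUMED exchange rate: S$1 = PHP36.50
ph_sales_php = Decimal("35000000")  # Philippine sales in 2018, in PHP
sg_excess_sgd = Decimal("4500000")  # ASSUMED: Singapore exceeded Philippine sales by S$4.5 million

cents = Decimal("0.01")

# Answer in Singapore dollars: convert Philippine sales to S$, then add the excess.
sg_sales_sgd = (ph_sales_php / PHP_PER_SGD + sg_excess_sgd).quantize(cents, rounding=ROUND_HALF_UP)

# Answer in Philippine pesos: convert the excess to PHP, then add Philippine sales.
sg_sales_php = (ph_sales_php + sg_excess_sgd * PHP_PER_SGD).quantize(cents, rounding=ROUND_HALF_UP)

print(sg_sales_sgd)  # 5458904.11
print(sg_sales_php)  # 199250000.00
```

Both figures describe the same sales, but a learner who answers in pesos would be marked wrong against a key in Singapore dollars unless the item states the required currency.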
Faulty: VCV Consultancy Firm was commissioned to conduct a survey on the voters'
preferences in Visayas and Mindanao for the upcoming presidential election. In Visayas,
65% are for Liberal Party (LP) candidate, while 35% are for the Nationalist Party (NP)
candidate. In Mindanao, 70% of the voters are Nationalists, while 30% are LP supporters.
A survey was conducted among 200 voters for each region. What is the probability that
the survey will show a greater percentage of Liberal Party supporters in Mindanao than in
the Visayas region? [This question is undesirable because it does not specify the basis for
grading the answer.]
Good: VCV Consultancy Firm was commissioned to conduct a survey on voters' preferences in
Visayas and Mindanao for the upcoming presidential election. In Visayas, 65% are for
Liberal Party (LP) candidate, while 35% are for the Nationalist Party (NP) candidate. In
Mindanao, 70% of the voters are Nationalists while 30% are LP supporters. A survey was
conducted among 200 voters for each region.
What is the probability that the survey will show a greater percentage of Liberal Party
supporters in Mindanao than in the Visayas region? Please show your solutions to support your
answer. Your answer will be graded as follows: