
Chapter 2
Principles of High-Quality Assessment

OBJECTIVES
At the end of the chapter, the learners are expected to:
identify what constitutes high-quality assessments;
list down the productive and unproductive uses of tests; and
classify the various types of tests.

Characteristics of High-Quality Assessments


High-quality assessments provide results that demonstrate and improve targeted student learning. They also inform instructional decision making. To ensure the quality of any test, the following criteria must be considered:
1. Clear and Appropriate Learning Targets
When designing a good assessment, start by asking if the learning targets are on the right level of difficulty to be able to motivate students and if there is an adequate balance among the different types of learning targets.
A learning target is a clear description of what students know and are able to do. Learning targets are categorized by Stiggins and Conklin (1992) into five:
a. Knowledge learning target is the ability of the student to master a substantive subject matter.
b. Reasoning learning target is the ability to use knowledge and solve problems.
c. Skill learning target is the ability to demonstrate achievement-related skills like conducting experiments, playing basketball, and operating computers.
d. Product learning target is the ability to create achievement-related products such as written reports, oral presentations, and art products.
e. Affective learning target is the attainment of affective traits such as attitudes, values, interests, and self-efficacy.
2. Appropriateness of Assessment Methods
Once the learning targets have been identified, match them with their corresponding methods by considering the strengths of various methods in measuring different targets.

Table 2.1
MATCHING LEARNING TARGETS WITH ASSESSMENT METHODS

Target        Objective   Essay   Performance-Based   Oral Question
Knowledge         5         4             3                 4
Reasoning         2         5             4                 4
Skills            1         3             5                 2
Products          1         1             5                 2
Affect            1         2             4                 4

Ratings indicate how well the method measures the target, from 1 (poor match) to 5 (excellent match).

3. Validity
This refers to the extent to which a test serves its purpose, or how well it measures what it intends to measure. Validity is a characteristic that pertains to the appropriateness of the inferences, uses, and results of the test or any other method utilized to gather data.
There are factors that influence the validity of a test, namely: appropriateness of test items, directions, reading vocabulary and sentence structure, pattern of answers, and arrangement of items.
a. How Validity is Determined
Validity is always determined by professional judgment. However, there are different types of evidence to use in determining validity. The following major sources of information can be used to establish validity:
i. Content-related validity determines the extent to which the assessment is representative of the domain of interest. Once the content domain is specified, review the test items to be assured that there is a match between the intended inferences and what is on the test. A test blueprint or table of specifications will help further delineate what targets should be assessed and what is important from the content domain.
ii. Criterion-related validity determines the relationship between an assessment and another measure of the same trait. It provides validity by relating an assessment to some valued measure (criterion) that can either provide an estimate of current performance (concurrent criterion-related evidence) or predict future performance (predictive criterion-related evidence). A short code sketch of this idea appears after this list.
iii. Construct-related validity determines whether an assessment is a meaningful measure of an unobservable trait or characteristic like intelligence, reading comprehension, honesty, motivation, attitude, learning style, and anxiety.
iv. Face validity is determined on the basis of the appearance of an assessment: whether, based on a superficial examination of the test, there seems to be a reasonable measure of the objectives of a domain.
v. Instructional-related validity determines to what extent
the domain of content in the test is taught in class.
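To make criterion-related evidence concrete, the Python sketch below estimates predictive validity as the correlation between entrance test scores and later grades. This is a minimal sketch using only the standard library; the students' scores and the data themselves are hypothetical, not drawn from any actual test.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two paired lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: entrance test scores and end-of-term averages
# for the same eight students.
entrance_scores = [78, 85, 62, 90, 70, 88, 55, 95]
final_averages  = [82, 88, 75, 91, 80, 86, 70, 93]

# A high coefficient would count as predictive criterion-related
# evidence that the entrance test forecasts later performance.
print(f"predictive validity coefficient: {pearson(entrance_scores, final_averages):.2f}")
```

Correlating the test with a criterion measured at the same time (for example, another established test of the same trait) would instead give concurrent criterion-related evidence.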
b. Test Validity Enhancers
The following are suggestions for enhancing the validity of
classroom assessments:
i. Prepare a table of specifications (TOS).
ii. Construct appropriate test items.
iii. Formulate directions that are brief, clear, and concise.
iv. Consider the reading vocabulary of the examinees. The test should not be made up of jargon.
v. Make the sentence structure of your test items simple.
vi. Never have an identifiable pattern of answers.
vii. Arrange the test items from easy to difficult.
viii. Provide adequate time for students to complete the assessment.
ix. Use different methods to assess the same thing.
x. Use the test only for intended purposes.
4. Reliability
This refers to the consistency with which a student may be expected to perform on a given test. It means the extent to which a test is dependable, self-consistent, and stable.



There are factors that affect test reliability. These include (1) the scorer's inconsistency because of his/her subjectivity, (2) limited sampling because of the incidental inclusion of some materials in the test and the accidental exclusion of others, (3) changes in the individual examinee himself/herself and his/her instability during the examination, and (4) the testing environment.

a. How Reliability is Determined
There are various ways of establishing test reliability. These are the length of the test, the difficulty of the test, and the objectivity of the scorer. There are also four methods of estimating the reliability of a good measuring instrument; a short code sketch illustrating these estimates follows the list.
i. Test-Retest Method or Test of Stability. The same measuring instrument is administered to the same group of subjects. The scores of the first and second administrations of the test are correlated, and the correlation coefficient is determined. The limitations of this method are: memory effects may operate when the time interval is short; likewise, factors such as unlearning and forgetting may occur when the time interval is long, resulting in a low correlation of the test. Another limitation of the method is that other varying environmental conditions may affect the correlation of the test regardless of the time interval separating the two administrations.
ii. Parallel-Forms Method or Test of Equivalence. Parallel or equivalent forms of a test may be administered to the same group of subjects and the paired observations correlated. The two forms of the test must be constructed in a manner that the content, type of item, difficulty, instructions for administration, and several others are similar but not identical.
iii. Split-Half Method. The test in this method may only be administered once, but the test items are divided into two halves. The common procedure is to divide a test into odd and even items. The two halves of the test must be similar but not identical in content, difficulty, means, and standard deviations.
iv. Internal-Consistency Method. This method is used with psychological tests, which are constructed as dichotomously scored items. The testee either passes or fails an item. The reliability coefficient in this method is determined by the Kuder-Richardson formula.
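The Python sketch below illustrates two of these estimates on hypothetical scores, using only the standard library. The split-half coefficient is stepped up to full test length with the Spearman-Brown correction, a standard companion to the split-half method even though the text above does not name it; the KR-20 computation follows the Kuder-Richardson formula for dichotomous items.

```python
from statistics import mean, pvariance

def pearson(x, y):
    """Pearson correlation between two paired lists of scores
    (the same helper as in the earlier validity sketch)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """Split-half method: correlate odd-item and even-item totals,
    then apply the Spearman-Brown correction for full test length."""
    odd = [sum(student[0::2]) for student in item_scores]
    even = [sum(student[1::2]) for student in item_scores]
    r_half = pearson(odd, even)
    return 2 * r_half / (1 + r_half)   # Spearman-Brown correction

def kr20(item_scores):
    """Kuder-Richardson formula 20 for items scored 1 (pass) or 0 (fail)."""
    k = len(item_scores[0])                      # number of items
    totals = [sum(student) for student in item_scores]
    var_total = pvariance(totals)                # variance of total scores
    sum_pq = 0.0
    for i in range(k):
        p = mean(student[i] for student in item_scores)  # proportion passing item i
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical responses of five students to a six-item test.
scores = [
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
]
print(f"split-half (Spearman-Brown): {split_half_reliability(scores):.2f}")
print(f"KR-20: {kr20(scores):.2f}")
```

For the test-retest and parallel-forms methods, the same pearson() function applied to the two sets of total scores gives the stability or equivalence coefficient.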
b. The Concept of Error in Assessment
The concept of error in assessment is critical to the understanding of reliability. Conceptually, whenever something is assessed, an observed score or result is produced. This observed score is the sum of the true score, or real ability or skill, plus some degree of error.

Observed Score = True Score + Error

Thus, an observed score can be higher or lower than the true score, depending on the nature of the error. The sources of error are reflected in Table 2.2, and a brief simulation after the table illustrates the model.

Table 2.2
SOURCES OF ERROR

Internal Error External Error


Health Directions
Mood Luck
Motivation Item ambiguity
Test-taking skills Heat in the room

Anxiety Lighting
Fatigue Sample of items
General ability Observer differences and bias
Test interpretation and scoring
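The Python sketch below makes the observed-score model concrete with hypothetical true scores and the standard library only. It adds random error of increasing size to a fixed set of true scores; as the error grows, two simulated administrations of the same test agree less, so the test-retest correlation generally drops.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation, as in the earlier sketches."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

random.seed(1)  # fixed seed so the illustration is repeatable
true_scores = [40, 35, 50, 28, 45, 33, 48, 38]  # hypothetical true abilities

def administer(error_sd):
    """One simulated administration: Observed Score = True Score + Error."""
    return [t + random.gauss(0, error_sd) for t in true_scores]

for error_sd in (1, 5, 10):
    r = pearson(administer(error_sd), administer(error_sd))
    print(f"error SD = {error_sd:>2}: test-retest r = {r:.2f}")
```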

c. Test Reliability Enhancers


The following should be considered in enhancing the
reliability of classroom assessments:
i. Use a sufficient number of items or tasks. A longer test is more reliable.
ii. Use independent raters or observers who can provide similar scores to the same performances.
iii. Make sure the assessment procedures and scoring are objective.
iv. Continue the assessment until the results are consistent.
v. Eliminate or reduce the influence of extraneous events or factors.
vi. Assess the difficulty level of the test.
vii. Use shorter assessments more frequently rather than a few long assessments.



5. Fairness
This pertains to the intent that each question should be made as clear as possible to the examinees and that the test is absent of any biases. An example of a bias in an intelligence test is an item about a person or object that has not been part of the cultural and educational context of the test taker. In mathematical tests, for instance, the reading difficulty level of an item can be a source of unfairness. Identified elements of fairness are the student's knowledge of the learning targets before instruction, the opportunity to learn, the attainment of prerequisite knowledge and skills, unbiased assessment tasks and procedures, and teachers who avoid stereotypes.
6. Positive Consequences
These enhance the overall quality of assessment, particularly the
effect of assessments on the students' motivation and study habits.

7. Practicality and Efficiency


Assessments need to take into consideration the teacher's
familiarity with the method, the time required, the complexity of
administration, the ease of scoring and interpretation, and the cost
to be able to determine an assessment's practicality and efficiency.
Administrability requires that a test must be administered with ease,
clarity, and uniformity. Directions must be specific so that students
and teachers will understand what they must do exactly. Scorability
demands that a good test should be easy to score. Test results should
readily be available to both students and teachers for remedial and
follow-up measures.

Productive Uses of Tests


Learning Analysis. Tests are used to identify the reasons or causes why students do not learn and the solutions to help them learn. Ideally, a test should be designed to determine what students do not know so that the teachers can take appropriate actions.
Improvement of Curriculum. Poor performance in a test may indicate that the teacher is not explaining the material effectively, the textbook is not clear, the students are not properly taught, and the students do not see the meaningfulness of the materials. When only a few students have difficulties, the teacher can address them separately and extend special help. If the entire class does poorly, the curriculum needs to be revised or special units need to be developed for the class to continue.
Improvement of Teacher. In a reliable grading system, the class average is the grade the teacher has earned.



Improvement of Instructional Materials. Tests measure how effective instructional materials are in bringing about intended changes.
Individualization. Effective tests always indicate differences in students' learning. These can serve as bases for individual help.
Selection. When enrollment or any other opportunity is limited, a test can be used to screen those who are more qualified.
to which category a student
Placement. Tests can be used to determine
belongs.
Guidance and Counseling. Results from appropriate tests, particularly standardized tests, can help teachers and counselors guide students in assessing future academic and career possibilities.

Research. Tests can be feedback tools to find effective methods of teaching and learn more about students, their interests, goals, and achievements.
Selling and Interpreting the School to the Community. Effective tests help the community understand what the students are learning, since test items are representative of the content of instruction. Tests can also be used to diagnose general schoolwide weaknesses and strengths that require community or government support.
Identification of Exceptional Children. Tests can reveal exceptional students inside the classroom. More often than not, these students are overlooked and left unattended.
Evaluation of Learning Program. Ideally, tests should evaluate the effectiveness of each element in a learning program, not just blanket the information of the total learning environment.

Unproductive Uses of Tests


Grading. Tests should not be used as the only determinants in grading a student. Most tests do not accurately reflect a student's performance or true abilities. Poor performance on a certain task may not only indicate failure but lack or absence of the needed foundations as well.
Labeling. It is often a serious disservice to label a student, even if the label is positive. Negative labels may lead the students to believe the label and act accordingly. Positive labels, on the other hand, may lead the students to underachieve or avoid standing out as different, or become overconfident and not exert effort anymore.
Threatening. Tests lose their validity when used as disciplinary measures.
Unannounced Testing. Surprise tests are generally not recommended. More often than not, they are the scapegoats of teachers who are unprepared, upset by an unruly class, or reprimanded by superiors. Studies reveal that students perform at a slightly higher level when tests are announced; unannounced tests create anxiety on the part of the students, particularly those who are already fearful of tests; unannounced tests do not give students adequate time to prepare; and surprise tests do not promote efficient learning or higher achievement.
Ridiculing. This means using tests to deride students.
Tracking. Students are grouped according to deficiencies as revealed by tests without continuous reevaluation, thus locking them into categories.
Allocating Funds. Some schools exploit tests to solicit funding.

Classifications of Tests
Throughout the years, psychologists and educators have cooperatively produced new and better tests and scales that measure the students' performance with greater accuracy. These tests may be classified according to:
1. Administration
a. Individual - given orally and requires the examinee's constant attention, since the manner of answering may be as important as the score. An example of this is the Wechsler Adult Intelligence Scale, one of the three individually administered intelligence scales. Another is a PowerPoint presentation used as a performance test in a speech class.
b. Group - for measuring cognitive skills to measure achievement. Most tests in schools are considered group tests, where different test takers can take the tests as a group.
2. Scoring
a. Objective - independent scorers agree on the number of points the answer should receive, e.g., multiple choice and true or false.
b. Subjective - answers can be scored through various ways. These are then given different values by scorers, e.g., essays and performance tests.
3. Sort of Response being Emphasized
a. Power - allows examinees a generous time limit to be able to answer every item. The questions are difficult, and this difficulty is what is emphasized.
b. Speed - with severely limited time constraints, but the items are easy and only a few examinees are expected to make errors.
4. Types of Response the Examinees must Make
a. Performance - requires students to perform a task. This is usually administered individually so that the examiner can count the errors and measure the time the examinee spends on each task.
b. Paper-and-pencil - examinees are asked to write on paper.
5. What is Measured
a. Sample - a limited, representative test designed to measure the total behavior of the examinee, although no test can exhaustively measure all the knowledge of an individual.
b. Sign test - a diagnostic test designed to obtain diagnostic signs that suggest some form of remediation is needed.
6. Nature of the Groups being Compared
a. Teacher-made test - for use within the classroom and contains the subject being taught by the same teacher who constructed the test.
b. Standardized test - constructed by test specialists working with curriculum experts and teachers.
Other Types of Tests
1. Mastery tests measure the level of learning of a given set of materials
and the level attained.
2. Discriminatory tests distinguish the differences between students or groups of students. They indicate the areas where students need help.
3. Recognition tests require students to choose the right answer from a
given set of responses.
4. Recall tests require students to supply the correct answer from their
memory.
5. Specific recall tests require short responses that are fairly objective.
6. Free recall tests require students to construct their own complex responses. There are no right answers, but a given answer might be better than another.
7. Maximum performance tests require students to obtain the best score possible.
8. Typical performance tests measure the typical, usual, or average performance.
9. Written tests depend on the ability of the students to understand, read, and write.
10. Oral examinations depend on the examinees' ability to speak. Logic is also required.
11. Language tests require instructions and questions to be presented in words.

12. Non-language tests are administered by means of pantomime, painting, or signs and symbols, e.g., Raven's Progressive Matrices or the Abstract Reasoning Tests.
13. Structured tests have very specific, well-defined instructions and expected outcomes.
14. Projective tests present ambiguous stimulus or questions designed to
elicit highly individualized responses.
15. Product tests emphasize only the final answer.
16. Process tests focus on how the examinees attack, solve, or work out a problem.
17. External reports are tests where a ratee is evaluated by another
person.
18. Internal reports are self-evaluation.
19. Open book tests depend on one's understanding and ability to
express one's ideas and evaluate concepts.
20. Closed book tests depend heavily on the memory of the examinees.
21. Non-learning format tests determine how much information the
students know.
22. Learning format tests require the students to apply previously learned materials.
23. Convergent format tests purposely lead the examinees to one best answer.
24. Divergent format tests lead the examinees to several possible
answers.
25. Scale measurements distribute ratings along a continuum.
26. Test measurements refer to the items being dichotomous, or either right or wrong, but not both.

27. Pretests measure how much is known about a material before it is presented.

28. Posttests measure how much has been learned after a learning material has been given.

29. Sociometrics reveal the interrelationship among members or the social structure of a group.

30. Anecdotal records reveal episodes of behavior that may indicate a profile of the students.

Table 2.3
COMPARISON BETWEEN TEACHER-MADE TESTS
AND STANDARDIZED TESTS

Directions for administration and scoring
Teacher-made test: Usually, no uniform directions are specified.
Standardized test: Specific instructions standardize the administration and scoring procedures.

Sampling content
Teacher-made test: Both content and sampling are determined by the classroom teacher.
Standardized test: Content is determined by curriculum and subject matter experts. It involves intensive investigations of existing syllabi, textbooks, and programs. Sampling of content is done systematically.

Construction
Teacher-made test: May be hurriedly done because of time constraints; often no test blueprints, item tryouts, item analysis, or revision; quality of the test may be quite poor.
Standardized test: Uses meticulous construction procedures that include constructing objectives and test blueprints, and employing item tryouts, item analysis, and item revisions.

Norms
Teacher-made test: Only local classroom norms are available.
Standardized test: In addition to local norms, standardized tests typically make available national, school district, and school building norms.

Purpose and use
Teacher-made test: Best suited for measuring particular objectives set by the teacher and for intraclass comparisons.
Standardized test: Best suited for measuring broader curriculum objectives and for interclass, school, and national comparisons.

- Review Exercises -
1. Explain why validity implies reliability but not the reverse.
2. Generate some other qualities that you believe contribute to making good assessments.
3. List down your personal experiences of unfair assessments.
