
Activity Sheet No. 15
Appropriateness of Assessment Methods

1. Written-Response Instruments
These include objective tests (multiple choice, true or false, matching type, or short answer tests), essays, examinations, and checklists.
Examples:
1. Objective test – appropriate for the various levels of the hierarchy of educational objectives.
2. Essay – when properly planned, can test the students' grasp of high-level cognitive skills, particularly in the areas of application, analysis, synthesis, and evaluation.
2. Product-Rating Scales
These scales measure products that are frequently rated in education, such as book reports, maps, charts, diagrams, notebooks, essays, and creative endeavors of all sorts.
Example:
1. Classic "Handwriting" Scale – used in the California Achievement Test, Form W. Prototype handwriting specimens of pupils are moved along the scale until the quality of the handwriting sample is most similar to a prototype specimen.
3. Performance Tests
One of these is the performance checklist, which consists of a list of behaviors that make up a certain type of performance. It is used to determine whether or not an individual behaves in a certain way when asked to complete a particular task.
Example: Performance Checklist in Solving a Mathematics Problem
Behaviors:
1. Identifies the given information
2. Identifies what is being asked
3. Uses a variable to replace the unknown
4. Formulates the equation
5. Performs the algebraic operations
6. Obtains the answer
7. Checks whether the answer makes sense
4. Oral Questioning
An appropriate assessment method when the objectives are:
a. to assess the students' stock knowledge; and
b. to determine the students' ability to communicate ideas in coherent verbal sentences.
5. Observation and Self-Report
These are useful supplementary assessment methods when used in conjunction with oral questioning and oral tests.

Reference:
Santos, Rosita. (2007). Assessment of learning 1. Quezon City, Metro Manila: Lorimar Publishing Inc.
Activity Sheet No. 16
Validity
Definition
The degree to which a test measures what it is supposed to measure. The quality of a test depends on its validity. It is the most central and essential quality in the development, interpretation, and use of educational measures. It is the most important quality of a good measuring instrument; it refers to the degree to which a test measures what it intends to measure.

Example
A Mathematics test is administered twice to a group of first year high school students. The answer of Student A to Item 7, "How many meters are there in 9 kilometers?", is 9,000 meters, and in the second administration his answer to Item 7 is still the same, 9,000 meters. Hence, the student's answer is valid because there is truthfulness in his answer (Calmorin, 2004).

References:
Asaad, Abubakar S. (2004). Measurement and evaluation concepts and application (third edition). 856 Mecañor Reyes St., Sampaloc, Manila: Rex Bookstore Inc.
Calmorin, L. (2004). Measurement and evaluation, 3rd ed. Mandaluyong City: National Bookstore, Inc.
Raagas, Ester L. (2010). Measurement (assessment) and education concept and application (third edition). Karsuagan, Cagayan De Oro City.
Activity Sheet No. 17
Factors Affecting Validity
Factors and how each factor affects validity:

1. Inappropriateness of the test items – Measuring understanding, thinking skills, and other complex types of achievement with test forms that are appropriate only for measuring factual knowledge will invalidate the results (Asaad, 2004).

2. Directions of the test items – Directions that do not clearly state how the students should respond to the items and record their answers will tend to lessen the validity of the test items (Asaad, 2004).

3. Reading vocabulary and sentence structure – Vocabulary and sentence structures that do not match the level of the students will result in the test measuring reading comprehension or intelligence rather than what it intends to measure (Asaad, 2004).

4. Level of difficulty of the test items – When the test items are too easy or too difficult, they cannot discriminate between the bright and the poor students; thus, they will lower the validity of the test (Asaad, 2004).
5. Poorly constructed test items – Test items which unintentionally provide clues to the answer will tend to measure the students' alertness in detecting clues, and the important aspects of student performance that the test is intended to measure will be affected (Asaad, 2004).

6. Length of the test items – A test should have a sufficient number of items to measure what it is supposed to measure. If a test is too short to provide a representative sample of the performance to be measured, validity will suffer accordingly (Asaad, 2004).

7. Arrangement of the test items – Test items should be arranged in increasing difficulty. Placing difficult items early in the test may cause mental blocks and may take up too much of the students' time, preventing them from reaching items they could easily answer. Improper arrangement may therefore also affect validity by having a detrimental effect on students' motivation (Asaad, 2004).

8. Pattern of the answers – A systematic pattern of correct answers makes the answers easier to guess, and this will again lower the validity of the test (Asaad, 2004).

9. Ambiguity – Ambiguous statements in test items contribute to misinterpretation and confusion. Ambiguity sometimes confuses the bright students more than the poor students, causing the items to discriminate in a negative direction (Asaad, 2004).

Reference:
Asaad, Abubakar S. (2004). Measurement and Evaluation Concepts and Application (Third Edition). Manila: Rex Bookstore Inc.
Activity Sheet No. 18
Types of Validity: Content Validity
Definition
It is related to how adequately the content of the test samples the domain about which inferences are to be made. Content validity is established through logical analysis; an adequate sampling of test items is usually enough to assure that a test has content validity.

Sample Illustration
A teacher wishes to validate a test in Mathematics. He requests experts in Mathematics to judge whether the items or questions measure the knowledge, skills, and values they are supposed to measure.

References:
Calmorin, Laurentina. (2004). Measurement and evaluation, 3rd ed. Mandaluyong City: National Bookstore Inc.
Activity Sheet No. 19
Types of Validity: Face Validity
Definition
Test questions are said to have face validity when they appear to be related to the group being examined. This is determined by examining the test to find out whether it looks like a good one; there is no common numerical method for establishing face validity.

Sample Illustration
Calculating the area of a rectangle when its given length and width are 4 feet and 6 feet, respectively.

References:
Asaad, Abubakar S. (2004). Measurement and evaluation concepts and application (third edition). 856 Mecañor Reyes St., Sampaloc, Manila: Rex Bookstore Inc.
Raagas, Ester L. (2010). Measurement (assessment) and education concept and application (third edition). Karsuagan, Cagayan De Oro City.
Activity Sheet No. 20
Types of Validity: Construct Validity
Definition
Construct validity is the extent to which a test measures a theoretical trait or construct. This involves such tests as those of understanding and the interpretation of data.

Sample Illustration
A teacher might want to determine whether an educational program increases artistic ability among pre-school children. Construct validity is a measure of whether the research actually measures artistic ability, a slightly abstract label.

Reference:
Calmorin, Laurentina. (2004). Measurement and evaluation, 3rd ed. Mandaluyong City: National Bookstore Inc.
Activity Sheet No. 21
Types of Validity: Criterion-Related Validity (Predictive Validity)
Definition
It refers to the degree of accuracy of how a test predicts one's performance at some subsequent outcome.

Sample Illustration
Mr. Celso wants to know the predictive validity of the test he administered in the previous year by correlating the scores with the grades the same students obtained at a later date. Their scores and grades are presented below:

Grade (x)   Test (y)   xy      x²      y²
89          40         3560    7921    1600
85          37         3145    7225    1369
90          45         4050    8100    2025
79          25         1975    6241    625
80          27         2160    6400    729
82          35         2870    6724    1225
92          41         3772    8464    1681
87          38         3306    7569    1444
81          29         2349    6561    841
84          37         3108    7056    1369
______      ______     ______  ______  ______
849         354        30295   72261   12908

r = [10(30295) – (849)(354)] / √{[10(72261) – (849)²] [10(12908) – (354)²]}

r = 0.92

A 0.92 coefficient of correlation indicates that his test has high predictive validity.
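
To verify the arithmetic above, the short Python sketch below (not part of the original activity sheet; the function and variable names are illustrative) applies the same raw-score Pearson r formula to Mr. Celso's data:

def pearson_r(x, y):
    # Raw-score Pearson r: [n*Sxy - Sx*Sy] / sqrt([n*Sx2 - (Sx)^2][n*Sy2 - (Sy)^2])
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = ((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)) ** 0.5
    return numerator / denominator

grades = [89, 85, 90, 79, 80, 82, 92, 87, 81, 84]  # Grade (x)
tests = [40, 37, 45, 25, 27, 35, 41, 38, 29, 37]   # Test (y)
print(round(pearson_r(grades, tests), 2))          # prints 0.92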

Reference:
Asaad, Abubakar S. (2004). Measurement and evaluation concepts and
application (third edition). 856 Mecañor Reyes St., Sampaloc, Manila. Rex
Bookstore Inc.
Activity Sheet No. 22
Types of Validity: Criterion-Related Validity (Concurrent Validity)
Definition
It refers to the degree to which the test correlates with a criterion, which is set up as an acceptable measure or standard other than the test itself. The criterion is always available at the time of testing.

Sample Illustration
Grade (x)   Test (y)   xy      x²      y²
34          30         1020    1156    900
40          37         1480    1600    1369
35          25         875     1225    625
49          37         1813    2401    1369
50          45         2250    2500    2025
38          29         1102    1444    841
37          35         1295    1369    1225
47          40         1880    2209    1600
38          35         1330    1444    1225
43          39         1677    1849    1521
______      ______     ______  ______  ______
411         352        14722   17197   12700

r = [10(14722) – (411)(352)] / √{[10(17197) – (411)²] [10(12700) – (352)²]}

r = 0.83

A 0.83 coefficient of correlation indicates that the test has high concurrent validity.
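
The same coefficient can be cross-checked with Python's standard library (a sketch, not part of the source; statistics.correlation requires Python 3.10 or later):

import statistics

grades = [34, 40, 35, 49, 50, 38, 37, 47, 38, 43]  # Grade (x), the criterion
tests = [30, 37, 25, 37, 45, 29, 35, 40, 35, 39]   # Test (y)

r = statistics.correlation(grades, tests)  # Pearson product-moment correlation
print(round(r, 2))                         # prints 0.83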

Reference:
Asaad, Abubakar S. (2004). Measurement and evaluation concepts and
application (third edition). 856 Mecañor Reyes St., Sampaloc, Manila. Rex
Bookstore Inc.
Activity Sheet No. 23
Reliability
Definition
Reliability is a factor of validity. It is defined as the consistency of test results.

General Example
For teacher-made tests, a reliability index of 0.50 and above is acceptable. If you create a quiz to measure students' ability to solve quadratic equations, you should be able to assume that if a student gets some items correct, he or she will get other similar items correct.

References:
Buendicho, F. (2010). Assessment of students learning 1. Manila: Rex Bookstore Inc.
Rico, A. (2011). Assessment of students learning (a practical approach). Manila: Anvil Publishing Inc.

Activity Sheet No. 24
Factors Affecting Reliability

Factors and how each factor affects reliability:
a. Length of the test – A longer test provides a more adequate sample of the behavior being measured and is less disturbed by chance factors such as guessing.
b. Moderate item difficulty – Items of moderate difficulty spread the scores over a greater range than a test composed of only difficult or only easy items.
c. Objectivity – Objective scoring eliminates the biases, opinions, or judgments of the person who checks the test.
d. Heterogeneity of the student group – Reliability is higher when test scores are spread out over a range of abilities.
e. Limited time – A test in which speed is a factor is more reliable than one conducted with a longer time allowance.

References:

Asaad, Abubakar S. (2004). Measurement and evaluation concepts and application (third edition). 856 Mecañor Reyes St., Sampaloc, Manila: Rex Bookstore Inc.
Calmorin, Laurentina. (2004). Measurement and evaluation, 3rd ed. Mandaluyong City: National Bookstore Inc.
Activity Sheet No. 25
Methods of Establishing Reliability: Test-Retest Method

Definition
In this method, the same test is administered twice to the same group of students within any time interval.

Estimate of Reliability
Measure of stability

Name of Statistical Tool (Formula)
Pearson product-moment correlation:
r = [n(Σxy) – (Σx)(Σy)] / √{[n(Σx²) – (Σx)²] [n(Σy²) – (Σy)²]}

Sample Illustration
1st administration (x)   2nd administration (y)   xy      x²      y²
34                       30                       1020    1156    900
40                       37                       1480    1600    1369
35                       25                       875     1225    625
49                       37                       1813    2401    1369
50                       45                       2250    2500    2025
38                       29                       1102    1444    841
37                       35                       1295    1369    1225
47                       40                       1880    2209    1600
38                       35                       1330    1444    1225
43                       39                       1677    1849    1521
______                   ______                   ______  ______  ______
411                      352                      14722   17197   12700

r = [10(14722) – (411)(352)] / √{[10(17197) – (411)²] [10(12700) – (352)²]}

r = 0.83

Reference:
Asaad, Abubakar S. (2004). Measurement and evaluation concepts and
application (third edition). 856 Mecañor Reyes St., Sampaloc, Manila. Rex
Bookstore Inc.
Activity Sheet No. 26
Methods of Establishing Reliability: Equivalent/Parallel Form Method

Definition
In this method, there are two sets of tests which are similar in content, type of items, difficulty, and other aspects, administered in close succession to the same group of students.

Name of Statistical Tool (Formula)
Pearson product-moment correlation:
r = [n(Σxy) – (Σx)(Σy)] / √{[n(Σx²) – (Σx)²] [n(Σy²) – (Σy)²]}

Sample Illustration
1st Test (x)   2nd Test (y)   xy      x²      y²
34             30             1020    1156    900
40             37             1480    1600    1369
35             25             875     1225    625
49             37             1813    2401    1369
50             45             2250    2500    2025
38             29             1102    1444    841
37             35             1295    1369    1225
47             40             1880    2209    1600
38             35             1330    1444    1225
43             39             1677    1849    1521
______         ______         ______  ______  ______
411            352            14722   17197   12700

r = [10(14722) – (411)(352)] / √{[10(17197) – (411)²] [10(12700) – (352)²]}

r = 0.83

Reference:
Asaad, Abubakar S. (2004). Measurement and evaluation concepts and application (third edition). 856 Mecañor Reyes St., Sampaloc, Manila: Rex Bookstore Inc.
Activity Sheet No. 27
Methods of Establishing Reliability: Split-Half Method

Definition
In this method, a test is administered once and the results are broken down into two halves (commonly the odd-numbered and the even-numbered items).

Estimate of Reliability
Internal consistency

Name of Statistical Tool (Formula)
Pearson product-moment correlation between the two halves:
r_oe = [n(Σxy) – (Σx)(Σy)] / √{[n(Σx²) – (Σx)²] [n(Σy²) – (Σy)²]}
Spearman-Brown formula for the whole test:
r_t = 2r_oe / (1 + r_oe)

Sample Illustration
1st half (x)   2nd half (y)   xy      x²      y²
34             30             1020    1156    900
40             37             1480    1600    1369
35             25             875     1225    625
49             37             1813    2401    1369
50             45             2250    2500    2025
38             29             1102    1444    841
37             35             1295    1369    1225
47             40             1880    2209    1600
38             35             1330    1444    1225
43             39             1677    1849    1521
______         ______         ______  ______  ______
411            352            14722   17197   12700

r_oe = [10(14722) – (411)(352)] / √{[10(17197) – (411)²] [10(12700) – (352)²]}

r_oe = 0.83

r_t = 2(0.83) / (1 + 0.83) = 0.91
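
A minimal Python sketch of the split-half procedure (not from the source; the two half-test score lists are taken from the illustration above): correlate the halves, then apply the Spearman-Brown correction.

import statistics

first_half = [34, 40, 35, 49, 50, 38, 37, 47, 38, 43]   # scores on one half (x)
second_half = [30, 37, 25, 37, 45, 29, 35, 40, 35, 39]  # scores on the other half (y)

r_oe = statistics.correlation(first_half, second_half)  # correlation of the halves, about 0.83
r_t = 2 * r_oe / (1 + r_oe)                             # Spearman-Brown: reliability of the whole test
print(round(r_oe, 2), round(r_t, 2))                    # prints 0.83 0.91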

Reference:
Asaad, Abubakar S. (2004). Measurement and evaluation concepts and
application (third edition). 856 Mecañor Reyes St., Sampaloc, Manila. Rex
Bookstore Inc.
Activity Sheet No. 28
Methods of Establishing Reliability: Internal Consistency Methods

Definition
This is the last method of establishing the reliability of a test. Like the split-half method, the test is administered only once. This method assumes that all items are of equal difficulty (Asaad, 2004).

Estimate of Reliability
This measures the homogeneity (the pattern of the percentages of correct and wrong responses of the students) of the instrument.

Statistical Tool
Kuder-Richardson Formula 21 (Asaad, 2004) and Kuder-Richardson Formula 20 (Gabuyo, 2012).

Sample Illustration
Pupil   Score (X)   X - x̄    (X - x̄)²
A       32           3.2      10.24
B       36           7.2      51.84
C       36           7.2      51.84
D       22          -6.8      46.24
E       38           9.2      84.64
F       15         -13.8     190.44
G       43          14.2     201.64
H       25          -3.8      14.44
I       18         -10.8     116.64
J       23          -5.8      33.64
Total   288                  801.60

Mean: x̄ = ΣX / N = 288 / 10 = 28.8
Variance: S² = Σ(X - x̄)² / (N - 1) = 801.60 / 9 = 89.07 (Calmorin, 2004); equivalently, S² = [N(ΣX²) - (ΣX)²] / [N(N - 1)] (Gabuyo, 2012)
Number of items: K = 50

Kuder-Richardson Formula 21:
KR21 = [K / (K - 1)][1 - x̄(K - x̄) / (K·S²)] = [50/49][1 - 28.8(50 - 28.8) / (50 × 89.07)] = 0.88

The reliability index of 0.88 was obtained. This means that the results of the test are reliable (Asaad, 2004).
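
A minimal Python sketch of the KR-21 computation (not from the source; K = 50 items and the pupil scores are taken from the illustration above):

import statistics

scores = [32, 36, 36, 22, 38, 15, 43, 25, 18, 23]  # pupils A to J
k = 50                                             # number of items in the test

mean = statistics.mean(scores)          # 28.8
variance = statistics.variance(scores)  # sample variance with N - 1, about 89.07

# Kuder-Richardson Formula 21 assumes all items are of equal difficulty.
kr21 = (k / (k - 1)) * (1 - (mean * (k - mean)) / (k * variance))
print(round(kr21, 2))                   # prints 0.88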

Reference/s:
Asaad, Abubakar S. (2004). Measurement and Evaluation Concepts and
Application (Third Edition).Manila: Rex Bookstore Inc.
Calmorin, L. (2004). Measurement and evaluation, 3rd ed. Mandaluyong City:
National Bookstore, Inc.
Gabuyo, Y. (2013). Assessment of Learning 1 (Textbook & Reviewer). Manila, Philippines: Rex Bookstore.
