GOOD TEST
Learning Outcomes: At the end of the
chapter, you must have:
• enumerated the different ways of establishing
validity and reliability of different assessment
tools
• identified the different factors affecting the
validity and reliability of the test
• computed and interpreted the validity and
reliability coefficients
VALIDITY
• A good test must first of all be valid.
• Validity refers to the extent to which a test measures
what it purports to measure. This is related to the
purpose of the test. If the purpose of the test is to
determine the competency in adding two-digit
numbers, then the test items will be about addition of
these two-digit numbers. Thus, if the objective matches
the test items prepared, the test is said to be valid.
There are different ways of establishing
validity.
• FACE VALIDITY
• is done by examining the physical appearance of the instrument.
• CONTENT VALIDITY
• is done through a careful and critical examination of the objectives
of assessment so that they reflect the curricular objectives.
For instance, the teacher wishes to evaluate a test in English. She
requests experts in English to validate whether the test items measure the
knowledge, skills, and values they are supposed to measure as stated in the
course content/syllabus.
• CRITERION-RELATED VALIDITY
• is established statistically such that a set of scores revealed by
the measuring instrument is correlated with the scores
obtained in another external predictor or measure. It has two
types: concurrent and predictive validity.
• Concurrent validity – describes the present status of the individual
by correlating the sets of scores obtained from two measures
given concurrently.
For instance, the teacher administers a Mathematics achievement
test to a group of mathematics students. The results of the test are
correlated with an acceptable Mathematics test which has previously
been proven valid. If the correlation is high, the Mathematics test
that the teacher constructed is valid.
• Predictive validity – describes the future performance of
an individual by correlating the sets of scores obtained
from two measures given at a longer time interval.
For instance, the teacher wishes to estimate how well a
student may do in graduate courses on the basis of how
well he has done on the tests he took in his
undergraduate courses. The criterion measure against which
the test scores are validated becomes available only after
a long time interval.
• CONSTRUCT-RELATED VALIDITY
• This is the extent to which the test measures a theoretical,
unobservable trait or quality, such as understanding, math
achievement, performance anxiety, and the like, on the basis
of evidence gathered over a period of time. It is established
through intensive study of the test or measurement
instrument using convergent/divergent validation and factor
analysis.
• Convergent validity – is a type of construct validation wherein a
test has high correlation with another test that measures the
same construct.
• Divergent validity – is a type of construct validation wherein
a test has low correlation with a test that measures a
different construct. In this case, a high validity occurs only
when there is a low correlation coefficient between the tests
that measure different traits. The correlation coefficient in this
instance is also called the validity coefficient.
• Factor analysis – is another method of assessing the
construct validity of a test using complex statistical
procedures.
Factors Affecting Validity
• Poorly constructed items
• Unclear directions
• Ambiguous test items
• Too difficult vocabulary
• Unintended clues
• Complicated syntax
• Inadequate time limit
• Inappropriate level of difficulty
• Improper arrangement of test items
RELIABILITY
• Another characteristic of a good test is reliability.
Reliability refers to the consistency of test scores.
Test scores may vary under different conditions.
The reliability of test scores is usually reported by
a reliability coefficient. A reliability coefficient is
also a correlation coefficient.
TEST-RETEST METHOD
• In this method, the same test is administered twice to
the same group of students with a time interval
between the two administrations. The two sets of test
scores are correlated using the Pearson Product-Moment
Correlation Coefficient or the Spearman rho formula, and this
correlation provides a measure of stability. It indicates
how stable or consistent the test results are over a period of
time. The formulas are:
• Pearson Formula

r = [nΣXY − (ΣX)(ΣY)] / √{[nΣX² − (ΣX)²][nΣY² − (ΣY)²]}

where X is the first set of scores, Y is the second set of scores, and n is
the number of cases.
• Spearman rho Formula

ρ = 1 − 6ΣD² / [n(n² − 1)]

where ρ stands for Spearman rho; ΣD² is the sum of the squared differences
between ranks; and n is the number of cases.
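The two formulas can be sketched in Python as follows; the function names `pearson_r` and `spearman_rho` are illustrative rather than from the chapter, and the ranking helper assumes the usual average-rank convention for ties.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r = [nΣXY − (ΣX)(ΣY)] / √{[nΣX² − (ΣX)²][nΣY² − (ΣY)²]}."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    return (n * sxy - sx * sy) / sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

def spearman_rho(x, y):
    """Spearman rho = 1 − 6ΣD² / [n(n² − 1)], with average ranks for ties."""
    def ranks(scores):
        # rank 1 = highest score; tied scores share the average of their ranks
        order = sorted(scores, reverse=True)
        return [(2 * order.index(s) + 1 + order.count(s)) / 2 for s in scores]
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# A perfectly consistent retest gives a coefficient of 1.0 under both formulas
print(pearson_r([10, 20, 30], [12, 22, 32]))    # 1.0
print(spearman_rho([10, 20, 30], [12, 22, 32]))  # 1.0
```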
• EQUIVALENT FORMS METHOD
• It is also known as the Parallel or Alternate Forms method. In this
method, two different but equivalent forms of the test are
administered to the same group of students within a close
time interval. The two forms of the test must be
constructed so that the content, type of test items, difficulty,
and instructions for administration are similar but not
identical.
For instance, a Form A item asks, “How many meters are there in 8
kilometers?” while the corresponding Form B item asks, “How many kilometers
are there in 8 000 meters?” The results of the test scores are correlated
using the Pearson Product-Moment Correlation Coefficient, and this correlation
provides a measure of equivalence of the tests.
•TEST-RETEST WITH EQUIVALENT FORMS
METHOD
•It is done by giving two equivalent forms of the test with an
increased time interval between the forms. The
results of the test scores are correlated using the
Pearson Product-Moment Correlation Coefficient, and this
correlation provides measures of both stability and
equivalence of the tests.
• SPLIT-HALF METHOD
• In this method, the test is administered once and the two
equivalent halves of the test are scored separately. The common
procedure is to divide the test into odd-numbered and even-
numbered items. The two halves of the test must be similar
but not identical in content, number of items, and difficulty.
This provides two scores for each student. The scores
obtained on the two halves are correlated using the Pearson r. The
result is the reliability coefficient for a half test. Since this
reliability holds only for a half test, the reliability coefficient
for the whole test is estimated by using the Spearman-Brown
formula:

r_wt = 2r_ht / (1 + r_ht)

Where: r_wt = reliability of the whole test
       r_ht = reliability of the half test
Student  First Test (X)  Second Test (Y)  Rank of X  Rank of Y  Difference between Ranks (D)  D²
1        12              20               10         7           3                            9
2        20              22               5          5           0                            0
3        19              23               6          4           2                            4
4        17              20               7          7           0                            0
5        25              25               1          1.5        -0.5                          0.25
6        22              20               3          7          -4                            16
7        15              19               9          9           0                            0
8        16              18               8          10         -2                            4
9        23              25               2          1.5         0.5                          0.25
10       21              24               4          3           1                            1
                                                                                 ΣD² = 34.5
• Analysis: The reliability coefficient using the
Spearman rho is 0.79, which means that the test has
high reliability. The scores of the 10 students on the
test administered twice with a one-week interval are
consistent. Hence, the test is good for a classroom
test, but a few items probably need to be
improved.
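The Spearman rho computation above can be checked with a short script; the ranking helper assumes the usual average-rank convention for tied scores.

```python
first  = [12, 20, 19, 17, 25, 22, 15, 16, 23, 21]  # first administration
second = [20, 22, 23, 20, 25, 20, 19, 18, 25, 24]  # one week later

def ranks(scores):
    # rank 1 = highest score; tied scores share the average of their ranks
    order = sorted(scores, reverse=True)
    return [(2 * order.index(s) + 1 + order.count(s)) / 2 for s in scores]

d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(first), ranks(second)))
n = len(first)
rho = 1 - 6 * d2 / (n * (n ** 2 - 1))
print(d2, round(rho, 2))  # 34.5 0.79
```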
Example 3: Prof. Quinto administered a test to her 10 students in a Filipino class. The test
was given only once. The scores of the students on the odd (O) and even (E) items below
were gathered. Using the split-half method, is the test reliable? Show the complete
solution using the Pearson r and Spearman-Brown formulas.
Student Odd Even
1 15 20
2 19 17
3 20 24
4 25 21
5 20 23
6 18 22
7 19 25
8 26 24
9 20 18
10 18 17
Step 1: Compute the Pearson r for the two halves of the test.

Student  First Test (X)  Parallel Test (Y)  XY     X²     Y²
1        15              20                 300    225    400
2        19              17                 323    361    289
3        20              24                 480    400    576
4        25              21                 525    625    441
5        20              23                 460    400    529
6        18              22                 396    324    484
7        19              25                 475    361    625
8        26              24                 624    676    576
9        20              18                 360    400    324
10       18              17                 306    324    289
Total    200             211                4 249  4 096  4 533
Step 2: Use Spearman-Brown Formula to get the reliability of the whole test.
• Analysis: The reliability coefficient using the Spearman-Brown formula is 0.50, which
means the test has questionable reliability. Hence, the test items should be revised.
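Prof. Quinto's split-half solution can be verified with a short script; variable names are illustrative.

```python
from math import sqrt

odd  = [15, 19, 20, 25, 20, 18, 19, 26, 20, 18]  # odd-item scores (X)
even = [20, 17, 24, 21, 23, 22, 25, 24, 18, 17]  # even-item scores (Y)

n = len(odd)
sx, sy = sum(odd), sum(even)
sxy = sum(x * y for x, y in zip(odd, even))
sx2 = sum(x * x for x in odd)
sy2 = sum(y * y for y in even)

# Pearson r: reliability of the half test
r_half = (n * sxy - sx * sy) / sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

# Spearman-Brown correction: reliability of the whole test
r_whole = 2 * r_half / (1 + r_half)

# r_half ≈ 0.33 and r_whole ≈ 0.50, matching the analysis above
print(round(r_half, 2), round(r_whole, 2))
```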
Prof. Madela administered a 40-item test in English to his Grade VI pupils
in Mayondon Elementary School. Below are the scores of 15 pupils. Find the
reliability using the Kuder-Richardson 21 formula.
Student Score
1 16
2 25
3 35
4 39
5 25
6 18
7 19
8 22
9 33
10 36
11 20
12 17
13 26
14 35
15 39
Solve the variance and the mean of the scores using the table below.
Student  Score (X)  X²
1 16 256
2 25 625
3 35 1 225
4 39 1 521
5 25 625
6 18 324
7 19 361
8 22 484
9 33 1 089
10 36 1 296
11 20 400
12 17 289
13 26 676
14 35 1 225
15 39 1 521
Variance: s² = [nΣX² − (ΣX)²] / [n(n − 1)] = [15(11 917) − (405)²] / [15(14)] ≈ 70.14
Mean: X̄ = ΣX / n = 405 / 15 = 27
Solve the reliability coefficient using the Kuder-Richardson 21 formula
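A minimal sketch of the KR-21 computation, assuming the sample variance (n − 1 denominator); if the population variance is used instead, the coefficient comes out slightly lower.

```python
scores = [16, 25, 35, 39, 25, 18, 19, 22, 33, 36, 20, 17, 26, 35, 39]
k = 40           # number of items in the test
n = len(scores)  # number of pupils

mean = sum(scores) / n  # 405 / 15 = 27.0

# sample variance (assumed convention): s² = [nΣX² − (ΣX)²] / [n(n − 1)]
var = (n * sum(s * s for s in scores) - sum(scores) ** 2) / (n * (n - 1))

# Kuder-Richardson 21: KR21 = [k / (k − 1)] × [1 − M(k − M) / (k·s²)]
kr21 = (k / (k - 1)) * (1 - mean * (k - mean) / (k * var))

print(mean, round(var, 2), round(kr21, 2))  # 27.0 70.14 0.9
```

A coefficient near 0.90 would indicate high reliability for a classroom test.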