B. Types of Validity
B.1. Content Validity. A type of validation that refers to the relationship between
the test and the instructional objectives, so that the test measures what it is
supposed to measure. Things to remember about this type of validity:
B.1.a. The evidence of the content validity of a test is found in the Table
of Specifications.
B.1.b. This is the most important type of validity for a classroom teacher.
B.1.c. There is no coefficient for content validity. It is determined by
expert judgment, not empirically.
B.3. Construct Validity. A type of validation that refers to the extent to which a
test measures theoretical, unobservable qualities such as intelligence, math
achievement, test anxiety, etc., over a period of time, on the basis of gathered
evidence. It is established through intensive study of the test or measuring
instrument, using convergent/divergent validation and factor analysis.
B.3.a. Convergent validity is a type of construct validation wherein a test
has a high correlation with another test that measures the same construct.
B.3.b. Divergent validity is a type of construct validation wherein a test
has a low correlation with another test that measures a different construct.
B.3.c. Factor analysis is a complex statistical procedure for establishing test
validity.
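The convergent/divergent distinction above can be illustrated numerically: a valid test of a construct should correlate highly with an established test of the same construct and weakly with a test of an unrelated one. The sketch below is illustrative only — the score lists are hypothetical, and `pearson` is a helper defined here, not a library function.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores of five students on a new math test and two reference tests.
new_math = [10, 12, 14, 16, 18]
established_math = [11, 13, 15, 17, 19]  # same construct
spelling = [5, 9, 3, 8, 6]               # different construct

r_convergent = pearson(new_math, established_math)  # expect a high correlation
r_divergent = pearson(new_math, spelling)           # expect a low correlation
```

A high `r_convergent` together with a low `r_divergent` is evidence that the new test measures the intended construct and not something else.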
Remember, predictive validity involves a time interval. For example, the SAT
score before college is correlated with the GPA in college. Concurrent validity does not
involve a time interval. A test is administered and its relationship to a well-established
test measuring the same behavior is determined.
C. Factors Affecting Validity of a Test
C.1. The test itself: poor construction, unclear directions, ambiguous test items,
overly difficult vocabulary, inadequate time limits, an inappropriate level of
difficulty, unintended clues
C.2. Personal factors influencing students’ responses to the test.
C.3. Validity is always specific to a particular group.
D. Computing and Interpreting Validity Coefficient
The validity coefficient is the computed value of rxy. In theory, the validity
coefficient, like the correlation coefficient, ranges from 0 to 1. In practice, most
validity coefficients are small, ranging from 0.3 to 0.5; few exceed 0.6 to 0.7.
Example: Teacher A develops a 45-item test and wants to determine whether it is
valid. He takes another test that is already acknowledged for its validity and uses it
as a criterion. He administers the two tests to his 15 students. The following
table shows the results of the two tests. Is the test developed by Teacher A valid?
Find the validity coefficient using the Pearson r and the coefficient of determination.
Take Action: Exercises
1. What type of validity can go with the following procedure?
a. Matching test items with objectives.
b. Correlating a test of mechanical skills after training with on-the-job performance
ratings.
c. Correlating the short form of an IQ test with the long form.
d. Correlating a paper-pencil test of musical talent with ratings from a live audition
completed after the test.
e. Correlating a test of reading ability with a test of mathematical ability.
2. Compute the validity coefficient using the Pearson r and the coefficient of
determination. Interpret the results.
Scores of students in a teacher-made test (X): 25, 22, 18, 18, 16, 14, 12, 8, 6, 6
Scores of the same students on the criterion test (Y): 22, 23, 25, 28, 31, 32, 34, 42, 44, 48
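One way to check a hand computation of Exercise 2 is to code the raw-score Pearson formula directly in terms of the sums the exercise asks for (n, Σx, Σy, Σxy, Σx², Σy²). This is a sketch, not part of any library; the data are the two score lists given above.

```python
import math

def pearson(x, y):
    """Pearson r via the raw-score formula using n, Σx, Σy, Σxy, Σx², Σy²."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    num = n * sxy - sx * sy
    den = math.sqrt(n * sx2 - sx ** 2) * math.sqrt(n * sy2 - sy ** 2)
    return num / den

x = [25, 22, 18, 18, 16, 14, 12, 8, 6, 6]     # teacher-made test
y = [22, 23, 25, 28, 31, 32, 34, 42, 44, 48]  # criterion test

r = pearson(x, y)    # validity coefficient, ≈ -0.97
r_squared = r * r    # coefficient of determination, ≈ 0.94
```

Note that for these data r is strongly negative: students scoring high on the teacher-made test score low on the criterion, so as a measure of the same trait the teacher-made test would not be judged valid despite the large magnitude of r.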
C.2. Alternate Forms or Equivalence. If there are two equivalent forms of a test,
these forms can be used to estimate the reliability of the test. Both forms are given to
a group of students, and the correlation between the two sets of scores is determined.
This estimate eliminates the problems of memory and practice involved in test-retest
estimates. Large differences in students’ scores on two forms that supposedly
measure the same behavior/trait would indicate an unreliable test. To use this method,
two equivalent forms of the test must be available, and they must be administered
under conditions as nearly equivalent as possible. One major problem with this
method is that it takes a great deal of time and effort to develop one good test, let
alone two. Hence, this method is often used by test publishers, who create two forms
of their test for other reasons (e.g., to maintain test security).
C.3. Internal Consistency. If the test in question is designed to measure a single
basic concept, it is reasonable to assume that people who get one item right will be
more likely to get other, similar items right; in that case, the test has internal
consistency. One approach to determining a test’s internal consistency, called split
halves, involves splitting the test into equivalent halves and determining the
correlation between them. This can be done by assigning all items in the first half of
the test to one form and all items in the second half to the other form.
However, this method is appropriate only when items of varying difficulty are spread
evenly across the test. Frequently, they are not. In these cases, the better approach is to
divide the test items by placing all odd-numbered items into one half and all even-
numbered items into the other half. When this latter approach is used, the reliability is
more commonly called the odd-even reliability. To compute the internal consistency
of the whole test, the Spearman-Brown prophecy formula (a correction formula) is
used:

rw = 2rb / (1 + rb)

where: rw is the reliability of the whole test
       rb is the correlation between the two halves of the test
Exercise: The test was given once. The scores of the students on the odd- and
even-numbered items are gathered. Using the split-half method, is the test reliable?
Show the complete solution. To find the correlation between the odd and even halves,
solve for Σx, Σy, Σxy, Σx², and Σy². Then use the formula rw = 2rb / (1 + rb)
to solve for the reliability of the whole test.
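The whole split-half procedure can be sketched end to end: correlate the two halves, then step the half-test correlation up with the Spearman-Brown formula. The odd/even score columns below are hypothetical stand-ins, since the exercise's score table is not reproduced here; `pearson` and `spearman_brown` are local helpers, not library functions.

```python
import math

def pearson(x, y):
    """Pearson correlation between the two half-test score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_brown(rb):
    """Step the half-test correlation rb up to whole-test reliability rw."""
    return 2 * rb / (1 + rb)

# Hypothetical data: each student's total on odd items and on even items.
odd_half = [10, 8, 6, 4, 2]
even_half = [9, 8, 5, 4, 3]

rb = pearson(odd_half, even_half)  # correlation between the halves
rw = spearman_brown(rb)            # estimated reliability of the whole test
```

For example, a half-test correlation of rb = 0.60 steps up to rw = 2(0.60)/(1 + 0.60) = 0.75, which is why the correction matters: the split-half correlation alone understates the reliability of the full-length test.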