
CONTENT VALIDITY EVIDENCE

Content validity evidence refers to evidence of the relationship between a test's content and the construct it is intended to measure. In other words, the content of an instrument, which includes items, tasks, and/or questions as well as administration and scoring procedures, must clearly represent the content domain that the instrument is designed to assess. For example, a college student would probably be most concerned with the content validity of a final exam: the student would want to be assured that the test questions represent the content that was covered in the classroom and in the class materials.

Collecting evidence of content validity is an imprecise and ongoing activity that does not require
complex calculations or statistical analyses, but rather a systematic process of evaluating instrument
content. The content validation process begins at the outset of instrument construction and follows
a rational approach to ensure that the content matches the test specifications. The first step in the
process is clearly delineating the construct or the content domain to be measured. The definition of
the construct determines the subject matter and the items to be included on the instrument.

Once the construct or content domain has been defined, the second step is to develop a table of
specifications. A table of specifications, or a test blueprint, is a two-dimensional chart that guides
instrument development by listing the content areas that will be covered on a test, as well as the
number (proportion) of tasks or items to be allocated to each content area. The content areas reflect
essential knowledge, behaviors, or skills that represent the construct of interest. Test developers
decide on relevant content areas from a variety of sources, such as the research literature,
professional standards, or even other tests that measure that same construct. The content areas of instruments measuring achievement or academic abilities often come from educational or accreditation standards, school curricula, course syllabi, textbooks, and other relevant materials. For example, in constructing the KeyMath-3, a norm-referenced achievement test measuring essential math skills in K-12 students, test developers created a table of specifications with content areas that reflected essential math content, national mathematics curricula, and national math standards. For personality and clinical inventories, content areas may be derived from the theoretical and empirical knowledge about personality traits or various mental health problems. For example, the content on the Beck Depression Inventory-II (BDI-II) was designed to be consistent with the diagnostic criteria for depression in the Diagnostic and Statistical Manual of Mental Disorders-Fourth Edition (DSM-IV; American Psychiatric Association, 1994). In employment assessment, content areas may reflect the elements, activities, tasks, and duties related to a particular job; for example, Figure 6.2 displays a table of specifications for an instrument designed to predict sales performance.
Instrument manuals should always provide clear statements about the source(s) of content areas
that are represented on the test.
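
To make the blueprint idea concrete, the short Python sketch below represents a hypothetical table of specifications as a mapping from content areas to item counts and derives the proportion of the test allocated to each area. The content areas and numbers are invented for illustration only; they are not taken from the KeyMath-3, from Figure 6.2, or from any published instrument.

```python
# Minimal sketch of a table of specifications (test blueprint) as a
# two-dimensional layout: content areas by item allocations. All content
# areas and counts below are hypothetical.

blueprint = {
    # content area     -> number of items allocated to it
    "Number sense":       20,
    "Measurement":        15,
    "Geometry":           10,
    "Problem solving":    25,
}

total_items = sum(blueprint.values())

# Report each content area with its item count and the proportion of the
# total test devoted to it, as a blueprint chart would.
for area, n_items in blueprint.items():
    print(f"{area:<17} {n_items:>3} items  ({n_items / total_items:.0%} of test)")
```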

After developing the table of specifications with the identified content areas, test developers write the actual test items. Thus, the next step in the content validation process involves recruiting multiple outside consultants (i.e., a panel of experts) who review the test items to determine if they do in fact represent the content domain. The panel of experts can consist of both content experts in the field and lay experts. For example, to evaluate the content validity of an instrument assessing depression, two groups of experts should participate (a simple way of summarizing their item-level judgments is sketched after this list):
1. People who have published in the field or who have worked with depressed individuals, and
2. People who are depressed (Rubio, 2005).
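
As a rough illustration of how a panel's item-level judgments might be summarized, the Python sketch below tallies, for each item, the proportion of experts who judged it relevant to the defined content domain. The items, ratings, and the 80% retention threshold are hypothetical; the process described above is qualitative, and no particular summary statistic is prescribed.

```python
# Minimal sketch of summarizing an expert panel's review of test items.
# Each expert (content expert or lay expert) judges whether an item
# represents the defined content domain. Items, ratings, and the 0.80
# threshold are hypothetical.

ratings = {
    # item    -> one True/False judgment per expert (True = relevant)
    "Item 1": [True, True, True, True, False],
    "Item 2": [True, False, False, True, False],
    "Item 3": [True, True, True, True, True],
}

for item, judgments in ratings.items():
    agreement = sum(judgments) / len(judgments)
    decision = "retain" if agreement >= 0.80 else "revise or drop"
    print(f"{item}: {agreement:.0%} of experts judged it relevant -> {decision}")
```
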
EVIDENCE BASED ON RESPONSE PROCESSES

The term response processes refers to the actions, thought processes, and emotional traits that the
test taker invokes in responding to a test. Evidence based on response processes, sometimes viewed
as part of content validity, provides validity support that examinees used intended psychological
processes during test taking. Differences in response processes may reveal sources of variance that
are irrelevant to the construct being measured. For example, if a test is designed to measure an
individual's ability to solve mathematical problems, the individual's mental processes should reflect
problem solving, rather than remembering an answer he or she had already memorized.
Understanding how test content affects test taker response processes is very important for creating
tests that assess the intended content domain and for avoiding unnecessary cognitive or emotional
demands on student performance. Methods of examining response processes include think-aloud
protocols (i.e., test takers think their responses out loud during the performance of a test item or
task), posttest interviews (i.e., test takers explain reasons for their responses after taking the test),
measurements of brain activity, and eye movement studies.

FACE VALIDITY

Content validity should not be confused with face validity, which refers to the superficial appearance of what a test measures from the perspective of the test taker or other untrained observer (Urbina, 2004, p. 168). In other words, does a test appear to be measuring what it claims to be measuring? Does the
test look valid to examinees who take it, to those who administer the test, and to other technically
untrained observers? Although not a measure of validity itself, face validity is a desirable feature of a
test. Anastasi and Urbina (1997) described how a test designed for children and extended for adult use was initially met with resistance and criticism because of its lack of face validity. To the adult test
takers, the test appeared silly, childish, and inappropriate; therefore, it lacked face validity. As a
result, the adults cooperated poorly with the testing procedure, thereby affecting the outcome of
the test.

CRITERION-RELATED VALIDITY EVIDENCE


Criterion-related validity evidence involves examining the relationships between test results and external variables that are thought to be a direct measure of the construct. It focuses on the effectiveness of test results in predicting a particular performance or behavior. For this purpose, test results, known as the predictor variable, are checked against a criterion, which is a direct and independent measure that the test is designed to predict or be correlated with. For example, a test user might want evidence that a particular aptitude test can predict job performance. The aptitude test is the predictor variable, and job performance is the criterion to which test results are compared; if test results can accurately predict job performance, the test has evidence of criterion-related validity. The modern conceptualization from the 1999 Standards that corresponds to this category is evidence based on relations to other variables (AERA et al., 1999).

There are two forms of criterion-related validity evidence. Concurrent validity refers to the degree to
which a predictor variable is related to some criterion at the same time (concurrently). For example,
a test that measures depressed mood should have concurrent validity with a current diagnosis of
depression. Predictive validity is the degree to which a test score estimates some future level of
performance. For example, the SAT, which is used to forecast how well a high school student will perform in college, should have predictive validity with college success. Whereas validity in general
focuses on whether a test is valid or not, criterion-related validity emphasizes what exactly a test is
valid for (Shultz & Whitney, 2005). Thus, test scores may be valid in predicting one criterion, but not
another. For example, an intelligence test may be a good predictor of academic performance, but
may be a poor predictor of morality (p. 101). Therefore, the chosen criterion must be appropriate to
the intended purpose of the test.
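
As a minimal illustration, the Python sketch below treats criterion-related validity evidence as the correlation between a predictor (aptitude test scores) and a criterion (job performance ratings). The data are fabricated purely for illustration; a concurrent study would gather both measures at the same time, whereas a predictive study would collect the criterion later, but the computation is the same.

```python
# Minimal sketch of criterion-related validity evidence: correlate a
# predictor (test scores) with a criterion measure. The data below are
# fabricated for illustration only.
from statistics import correlation  # Pearson's r; Python 3.10+

aptitude_scores = [52, 61, 47, 70, 58, 66, 49, 63]          # predictor variable
job_performance = [3.1, 3.8, 2.9, 4.4, 3.5, 4.0, 2.7, 3.9]  # criterion

# In a concurrent design both measures are collected at the same time;
# in a predictive design the criterion is collected later. Either way,
# the validity coefficient is the correlation between the two.
r = correlation(aptitude_scores, job_performance)
print(f"Criterion-related validity coefficient: r = {r:.2f}")
```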
