1. Content Validity
2. Criterion-Related Validity
3. Construct Validity
¡ Content Validity is determined by the
degree to which the questions, tasks or
items on a test are representative of the
universe of behavior the test is designed to
sample. In theory, content validity is really
nothing more than a sampling issue.
§ Do the items adequately sample the
content domain of the construct or
constructs that the test purports to
measure?
§ Systematically examine test content to
determine whether it covers a
representative sample of the behavior
domain to be measured.
§ Content validity is usually established by a
careful examination of items by a panel of
experts in a given field. In effect, the test
developer asserts that “a panel of experts
reviewed the domain specification carefully
and judged the following test questions to
possess content validity.”
---------------------------------------------------------------------------------------------------------------------
Please read carefully through the domain specification for this test. Next, please indicate
how well you feel each item reflects the domain specification. Judge a test item solely
on the basis of match between its content and the content defined by the domain
specification. Please use the four-point rating scale shown below:
Test: Sales Potential Test
Criterion: Peso amount of goods sold in the preceding year
Characteristics of a Good Criterion
1. Reliable
¡ It is a useful index of what the test measures
§ Example: The validity of the USTET can be studied
by computing the correlation (r) between
entrance exam scores and grade point averages
for a representative sample of students. The
resulting correlation coefficient is called
a validity coefficient.
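A validity coefficient of this kind can be computed directly from paired scores. The sketch below is only an illustration: the scores and grade point averages are made-up numbers (not USTET data), and `pearson_r` is a helper name chosen here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical data: entrance exam scores (the test) and
# grade point averages (the criterion) for the same five students.
entrance_scores = [70, 75, 80, 85, 90]
grade_point_averages = [2.0, 2.4, 2.9, 3.1, 3.6]

validity_coefficient = pearson_r(entrance_scores, grade_point_averages)
print(f"validity coefficient r = {validity_coefficient:.2f}")
```

With real data the sample would need to be far larger and representative of the students the test is meant for.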
2. Appropriate for the test under investigation
¡ All criterion measures should be described
accurately, and the rationale for choosing them
as relevant criteria should be made explicit.
§ Example: In the case of “interest tests,” it is
sometimes unclear whether the criterion measure
should indicate satisfaction, success, or continuance
in the activities under question. The choice between
these subtle variants in the criterion must be made
carefully, based on an analysis of what the interest
test purports to measure.
3. Free of contamination from the test itself.
§ Criterion contamination is the term applied to a
criterion measure that has been based, at least in
part, on predictor measures.
¡ Example
§ Name of Test: “Inmate Violence Potential Test”, which
predicts a prisoner’s potential for violence in
the cell block.
§ Criterion: Ratings from fellow inmates, guards, and
other staff in order to come up with a number that
represents each inmate’s violence potential.
§ Validation: Asking guards to rate each inmate on
their violence potential.
§ Concurrent Validity – relationship between
test scores and an external criterion that is
measured at approximately the same time.
o Example 1: A test for determining skills in
logical reasoning is administered to a group of
students. Scores on this test are compared
with scores on another test on logical
reasoning of already known validity. If r is
high, then the test has concurrent validity.
o Example 2: An arithmetic achievement test
would possess concurrent validity if its scores
could be used to predict, with reasonable
accuracy, the current standing of students in a
mathematics course.
o Example 3: A personality inventory would
possess concurrent validity if diagnostic
classifications derived from it roughly matched
the opinions of psychiatrists or clinical
psychologists.
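For a case like Example 3, where both the test and the criterion produce categories rather than scores, one simple check is the proportion of cases on which the two agree. This is a minimal sketch with made-up classifications; the function name and labels are hypothetical.

```python
def agreement_rate(test_labels, clinician_labels):
    """Proportion of cases where both sources give the same classification."""
    matches = sum(t == c for t, c in zip(test_labels, clinician_labels))
    return matches / len(test_labels)

# Hypothetical classifications for six clients:
# one list derived from the inventory, one from clinicians' opinions.
inventory_labels = ["anxious", "depressed", "anxious", "none", "depressed", "none"]
clinician_labels = ["anxious", "depressed", "none", "none", "depressed", "none"]

rate = agreement_rate(inventory_labels, clinician_labels)
print(f"agreement: {rate:.0%}")
```

Raw agreement does not correct for matches that would occur by chance, so it is best read as a rough first look.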
§ Predictive Validity – relationship between
test scores and an external criterion that is
measured somewhat later.
o Example 1: When scores on a math aptitude
test correlate highly with the final grades of
students in math, the aptitude test is said to
have high predictive validity.
o Example 2: An employment test can be
validated against supervisor ratings after six
months on the job.
¡ Construct Validity is a judgment about the
appropriateness of inferences drawn from
test scores regarding individual standings on
a variable called a construct (Cohen and
Swerdlik, 2009). It is the extent to which the
test may be said to measure a theoretical
construct or trait (Anastasi, 1996).
¡ What is a construct?
§ A construct is an unobservable trait that is known
to exist. It is a theoretical, intangible quality in
which individuals differ (Gregory, 2011).
¡ Examples of constructs:
§ IQ
§ Anxiety
§ Leadership Ability
§ Hostility
§ Motivation
§ Neuroticism
§ Self Esteem
§ Scholastic Aptitude
§ Depression
¡ How can we be sure a test measures these if
we can’t directly measure them?
§ Each construct is developed to explain and
organize observed response consistencies. It
derives from established interrelationships among
behavioral outcomes.
§ A test designed to measure a construct must
estimate the existence of an inferred, underlying
characteristic (e.g., leadership ability) based on a
limited sample of behavior. Construct validity
concerns how appropriate these inferences about
the underlying construct are.
¡ All psychological constructs possess two
characteristics in common:
§ 1. There is no single external referent sufficient to
validate the existence of the construct; that is, the
construct cannot be operationally defined.
§ 2. Nonetheless, a network of interlocking
suppositions can be derived from existing theory
about the construct.
¡ Example: PSYCHOPATHY
§ Description: A personality constellation
characterized by antisocial behavior (lying, stealing,
occasional violence), lack of guilt or shame, and
impulsivity.
§ Characteristic # 1: No single behavioral
characteristic or outcome is sufficient to determine
who is strongly psychopathic and who is not.
§ Characteristic # 2: A network of interlocking
suppositions can be derived from existing theory
about psychopathy.
¡ Characteristic # 1: On average, we
might expect psychopaths to be frequently
imprisoned, but so are many common
criminals. Furthermore, many successful
psychopaths somehow avoid apprehension
altogether. Psychopathy cannot be gauged
only by scrapes with the law.
¡ Characteristic # 2: The fundamental problem in
psychopathy is presumed to be a deficiency in the
ability to feel emotional arousal – whether empathy,
guilt, fear of punishment, or anxiety under stress. A
number of predictions follow from this appraisal. For
example, psychopaths should lie convincingly, have a
greater tolerance for physical pain, and get into
trouble because of their lack of behavioral inhibition.
Thus to validate a measure of psychopathy, we
would need to check out a number of different
expectations based on our theory of psychopathy.
¡ Construct validation requires the gradual
accumulation of information from a variety
of sources.
¡ The crucial point to understand about
construct validity is that “no criterion or
universe of content is accepted as entirely
adequate to define the quality to be
measured.”
¡ Approaches to Construct Validity
¡ 1. Developmental Changes
§ Age Differentiation – a major criterion
employed in validating traditional IQ tests.
§ Since abilities increase with age, it is logical
that test scores will also improve with age.
§ Stanford-Binet Test – checked against
chronological age to determine whether
scores show a progressive increase with
advancing age
§ Theory: Intelligence increases with age.
§ This is just one source of evidence for construct
validity, and it is NOT conclusive. In other words,
evidence from developmental changes alone is
NOT sufficient to establish construct validity.
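The age-differentiation check can be sketched as a comparison of average scores across age groups. The scores below are made up for illustration (they are not Stanford-Binet data), and the function name is hypothetical.

```python
from statistics import mean

# Hypothetical raw scores grouped by chronological age in years.
scores_by_age = {
    6: [12, 14, 13],
    8: [18, 17, 20],
    10: [24, 26, 23],
    12: [30, 29, 31],
}

def means_increase_with_age(groups):
    """True if the mean score rises from each age group to the next older one."""
    ages = sorted(groups)
    means = [mean(groups[age]) for age in ages]
    return all(earlier < later for earlier, later in zip(means, means[1:]))

print(means_increase_with_age(scores_by_age))
```

A progressive increase is consistent with the theory but does not by itself show that the test measures intelligence, since many other characteristics also increase with age.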
¡ 2. Correlations with Other Tests
§ Correlations between a new test and similar earlier
tests are sometimes cited as evidence that the new
test measures approximately the same general area
of behavior as the other test.
¡ Correlations should be moderately high, but not too
high. If the new test correlates too highly with an
already available test and offers no added advantage
such as brevity or ease of administration, then the new
test represents needless duplication.
3. Factor Analysis
§ A refined statistical technique for analysing
interrelationships of behavior data.
§ Factor analysis reduces a large number of
variables to a few underlying factors.
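The idea of reducing several variables to a few factors can be illustrated with a small sketch. Two caveats: the correlation matrix is made up, and the code uses principal component extraction by power iteration as a simple stand-in, which is related to but not the same as a full factor analysis. All names here are hypothetical.

```python
import math

# Hypothetical correlations among four subtests:
# two verbal subtests (rows 0-1) and two numerical subtests (rows 2-3).
R = [
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]

def top_component(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    n = len(matrix)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * w[i] for i in range(n)), v

def extract_components(matrix, count):
    """Extract `count` components, removing each from the matrix before the next."""
    n = len(matrix)
    residual = [row[:] for row in matrix]
    found = []
    for _ in range(count):
        value, vector = top_component(residual)
        found.append((value, vector))
        residual = [
            [residual[i][j] - value * vector[i] * vector[j] for j in range(n)]
            for i in range(n)
        ]
    return found

components = extract_components(R, 2)
# Loadings: correlation of each subtest with each extracted component.
loadings = [[math.sqrt(value) * x for x in vector] for value, vector in components]
# The total variance of a correlation matrix equals the number of variables.
explained = sum(value for value, _ in components) / len(R)
print(f"variance explained by two components: {explained:.0%}")
```

In this made-up case, two components summarize most of the variance in four subtests. The first loads on all four subtests, and the second (unrotated) contrasts the verbal pair with the numerical pair.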
4. Internal Consistency
§ The criterion is none other than the total
score on the test itself.
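One common way to use the total score as the criterion is to correlate each item with it. The sketch below assumes made-up responses from five people to three items; the names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical responses: one row per person, one column per item (scored 1-5).
responses = [
    [1, 2, 3],
    [2, 2, 5],
    [3, 4, 1],
    [4, 4, 4],
    [5, 5, 2],
]

# The criterion is each person's total score on the test itself.
totals = [sum(person) for person in responses]

# Correlate every item's scores with the total scores.
item_total = [
    pearson_r([person[i] for person in responses], totals)
    for i in range(len(responses[0]))
]
for number, r in enumerate(item_total, start=1):
    print(f"item {number}: r with total = {r:.2f}")
```

In this example the third item barely tracks the total score, which would flag it for review. Because each item also contributes to the total, these correlations are somewhat inflated, especially on short tests.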
5. Convergent and Divergent Validation
§ The test should correlate highly with other
similar tests, and correlate weakly with
dissimilar tests.
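A minimal sketch of this comparison, assuming made-up scores for five people on a new anxiety scale, an established anxiety scale (similar), and a vocabulary test (dissimilar). All names and numbers are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical scores for the same five people on three measures.
new_anxiety_scale = [10, 12, 15, 18, 20]
established_anxiety_scale = [11, 13, 14, 19, 21]
vocabulary_test = [30, 25, 32, 27, 31]

convergent_r = pearson_r(new_anxiety_scale, established_anxiety_scale)
divergent_r = pearson_r(new_anxiety_scale, vocabulary_test)

print(f"correlation with similar test:    {convergent_r:.2f}")
print(f"correlation with dissimilar test: {divergent_r:.2f}")
```

Evidence for construct validity comes from the pattern as a whole: a high correlation with the similar measure together with a low correlation with the dissimilar one.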