Inference: a result reached through logical deduction.
Validation: the process of gathering and evaluating evidence about validity.
Validation studies: research carried out to establish the validity of a test by finding out how well the test measures what it is trying to measure.
Predictive validity: the capability of a test score to predict behavior, as indicated by a criterion measure.
Criterion: the standard against which a test or a test score is evaluated.
Criterion contamination: term applied to a criterion measure that has been based, at least in part, on predictor measures.
Base rate: the extent to which a particular trait, behavior, characteristic, or attribute exists in the population.
Hit rate: the proportion of people a test accurately identifies as possessing or exhibiting a particular trait, behavior, characteristic, or attribute.
Miss rate: the proportion of people the test fails to identify as having, or not having, a particular characteristic or attribute.
False positive: an error in measurement in which the test indicates that the testtaker possesses a trait, ability, or behavior when in fact the testtaker does not possess it at all.
Discriminant evidence: data showing only a weak link between test scores and other variables with which, in theory, the scores should not be correlated.
Multitrait-multimethod matrix: a method used to assess construct validity by simultaneously examining convergent and discriminant evidence in a table of correlations among traits and methods.
Factor analysis: a group of mathematical procedures designed to identify the factors, or specific variables, on which people may differ.
Rankings: the ordinal sequencing of people, items, or concepts based on an assigned value.
Halo effect: describes the fact that, for some raters, some ratees can do no wrong.
Fairness: the extent to which a test is used in an impartial, just, and equitable way.
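The base rate, hit rate, miss rate, and false positive entries above can be made concrete with a small tally. The following is a minimal Python sketch under stated assumptions: the function name classification_rates and the ten-testtaker data set are made up for illustration, the criterion is treated as a simple yes/no judgment, and the hit rate is read here as the share of all testtakers the test classifies correctly (so that hit rate and miss rate sum to 1). Other readings of "hit rate" are possible.

```python
def classification_rates(test_positive, has_trait):
    """Tally yes/no test results against a yes/no criterion.

    test_positive[i] is True when the test says person i has the trait.
    has_trait[i] is True when the criterion says person i really has it.
    """
    if len(test_positive) != len(has_trait) or not has_trait:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(has_trait)
    pairs = list(zip(test_positive, has_trait))
    true_pos = sum(1 for t, c in pairs if t and c)
    false_pos = sum(1 for t, c in pairs if t and not c)   # test says yes, criterion says no
    false_neg = sum(1 for t, c in pairs if not t and c)   # test says no, criterion says yes
    true_neg = sum(1 for t, c in pairs if not t and not c)

    return {
        # Share of the group that actually has the trait.
        "base_rate": (true_pos + false_neg) / n,
        # Assumption: a "hit" is any correct classification.
        "hit_rate": (true_pos + true_neg) / n,
        # A "miss" is any incorrect classification.
        "miss_rate": (false_pos + false_neg) / n,
        "false_positives": false_pos,
        "false_negatives": false_neg,
    }


# Hypothetical data for ten testtakers (not from any real study).
test_positive = [True, True, True, False, False, False, False, True, False, False]
has_trait = [True, True, False, False, False, True, False, True, False, False]

rates = classification_rates(test_positive, has_trait)
for name, value in rates.items():
    print(f"{name}: {value}")
```

In this made-up group, one person is flagged by the test without having the trait (a false positive) and one person with the trait is not flagged (a false negative).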
CATEGORIES OF VALIDITY
1. There are some tests that are universally valid for all time, all uses, and with all types of testtaker
populations. (F)
2. Ecological Momentary Assessment (EMA) has the limitations of retrospective self-report. (F)
3. A personality test where respondents are asked to report what they see in inkblots may be
perceived as a test with low face validity. (T)
4. For an employment test to be content-valid, its content must be a representative sample of the
job-related skills required for employment. (T)
5. If test scores are obtained at about the same time as the criterion measures are obtained,
measures of the relationship between the test scores and the criterion do not provide evidence of
concurrent validity. (F)
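Criterion-related evidence, whether the criterion is measured at about the same time as the test or later, is commonly summarized as a correlation between test scores and criterion measures (a validity coefficient). The following is a minimal Python sketch of that calculation using the Pearson correlation. The function name pearson_r, the test scores, and the supervisor ratings are all hypothetical and serve only to illustrate the arithmetic.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical employment-test scores and supervisor ratings (the criterion)
# collected at about the same time for six employees.
test_scores = [52, 61, 70, 45, 88, 76]
supervisor_ratings = [3.1, 3.4, 4.0, 2.8, 4.6, 4.1]

validity_coefficient = pearson_r(test_scores, supervisor_ratings)
print(f"validity coefficient: {validity_coefficient:.2f}")
```

A coefficient near 1 or -1 indicates a strong linear relationship between test and criterion, while a value near 0 indicates little or none. How large a coefficient must be to count as adequate evidence depends on the purpose of the test.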
CHARACTERISTICS OF A CRITERION
1. A criterion should be relevant to the matter at hand. The criterion should be supported by a
set of data that is representative of the population and of what the test is trying to
measure. For example, a test meant to tell testtakers whether they have the same skill set
as professional programmers should use, as its norms, the scores of successful programmers
who have already taken the test.
2. A criterion should be valid for the purpose for which it is used. This means that the
criterion should be reviewed by qualified experts to confirm that it actually measures the
targeted construct. Evidence must also be provided to show that the criterion is valid.
3. A criterion should not be contaminated. The basic explanation of this characteristic is that
the criterion measure should not itself be based on, or influenced by, the predictor measures.
What questions must a test user ask to determine the validity of a test? List at least 3 of
them below.
1. What were the characteristics of the sample used in the validation study?
2. Will this test consistently yield information similar to that of other widely accepted
tests?
3. How well do those characteristics match the people for whom an administration of
the test is contemplated?
EVIDENCE OF CONSTRUCT VALIDITY
Explain how each of the following is used to establish construct validity. Fill in the table
below.
TEST BIAS
Highlight 5 major points that relate to Test Bias and explain each one.
2. Severity Error is a type of rating error in which the ratings are overly negative or harsh.
Its opposite, in which ratings are overly positive, is a leniency error. Often, such ratings
arise because something is consistently or consensually viewed as good or bad by many people.
3. Central Tendency Error occurs when the rater avoids giving extremely good or extremely bad
ratings. Such raters tend to stay neutral, in the middle of the rating continuum.
4. Halo effect occurs when raters think that a ratee can do no wrong. As a result, a rater
gives someone a score that is higher than he or she deserves. This happens because the
rater fails to discriminate among the individual's distinct behaviors.
5. To avoid these rating errors, test users may opt to use rankings. Instead of using
an absolute scale, the rater measures individuals against one another. The rater selects
first, second, and third choices, and so forth.
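The ranking approach in point 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name rank_ratees, the ratee names, and the scores are hypothetical, and ties are broken alphabetically purely to keep the output deterministic.

```python
def rank_ratees(ratings):
    """Convert absolute ratings into ranks, where rank 1 is the highest rating.

    ratings maps each ratee's name to the score a rater gave.
    Assumption: ties are broken alphabetically by name.
    """
    ordered = sorted(ratings.items(), key=lambda item: (-item[1], item[0]))
    return {name: position for position, (name, _score) in enumerate(ordered, start=1)}


# Hypothetical scores from a lenient rater: everyone lands near the top of a 10-point scale.
lenient_ratings = {"Ana": 9.5, "Ben": 9.0, "Cruz": 9.8, "Dee": 9.2}

ranks = rank_ratees(lenient_ratings)
for name, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(rank, name)
```

Even though the absolute scores are bunched together, the ranking still forces the rater to order the individuals against one another as first, second, third, and so on.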