Chap 06 - Characteristics of Effective Tests
Characteristics of Effective Selection Techniques
© 2013 Cengage Learning
Reliability
• The extent to which a score from a test is
consistent and free from errors of measurement
• Methods of Determining Reliability
– Test-retest (temporal stability)
– Alternate forms (form stability)
– Internal reliability (item stability)
– Scorer reliability
Test-Retest Reliability
• Measures temporal stability
• Administration
– Same applicants
– Same test
– Two testing periods
• Scores at time one are correlated with
scores at time two
• Correlation should be above .70
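As a sketch (not from the slides), the test-retest correlation is simply the Pearson correlation between the two administrations; the applicant scores below are made up for illustration:

```python
# Test-retest reliability as the Pearson correlation between scores
# from two administrations of the same test (scores are hypothetical).

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

time1 = [82, 75, 90, 68, 77, 85]   # same applicants, first administration
time2 = [80, 78, 88, 70, 74, 86]   # same test, second administration

r = pearson_r(time1, time2)
print(round(r, 2))                 # 0.95 — above the .70 guideline
```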
Test-Retest Reliability
Problems
• Sources of measurement errors
– Characteristic or attribute being measured
may change over time
– Reactivity
– Carry-over effects
• Practical problems
– Time consuming
– Expensive
– Inappropriate for some types of tests
Internal Reliability
• Defines measurement error strictly in terms
of consistency or inconsistency in the
content of the test.
• Used when it is impractical to administer
two separate forms of a test.
• With this form of reliability, the test is
administered only once; it measures item
stability.
Spearman-Brown Formula

corrected reliability = (2 × split-half correlation) ÷ (1 + split-half correlation)
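A minimal sketch of applying the formula (the split-half correlation of .60 is a made-up value):

```python
# Spearman-Brown correction: estimates full-test reliability from the
# correlation between the two halves of a split-half design.

def spearman_brown(split_half_r):
    return (2 * split_half_r) / (1 + split_half_r)

# A split-half correlation of .60 implies a full-test reliability of .75
print(round(spearman_brown(0.60), 2))  # 0.75
```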
• Kuder-Richardson Formula
– Used for tests with dichotomous items (e.g., yes-no, true-false)
Interrater Reliability
• Used when human judgment of performance is
involved in the selection process
• Refers to the degree of agreement between two or
more raters
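As a simple illustration (ratings are made up), the most basic index of interrater reliability is the percentage of ratings on which two raters agree:

```python
# Percent agreement between two raters on the same applicants.
rater1 = ["pass", "pass", "fail", "pass", "fail", "pass"]
rater2 = ["pass", "fail", "fail", "pass", "fail", "pass"]

agreement = sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)
print(round(agreement, 2))  # 5 of 6 ratings agree -> 0.83
```

Percent agreement is the simplest index; chance-corrected statistics such as Cohen's kappa are often preferred in practice.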
Reliability: Conclusions
• The higher the reliability of a selection test,
the better; reliability should be .70 or
higher
• Reliability can be affected by many factors
• If a selection test is not reliable, it is
useless as a tool for selecting individuals
Validity
• Definition
The degree to which inferences from scores on
tests or assessments are justified by the
evidence
• Common Ways to Measure
– Content Validity
– Criterion Validity
– Construct Validity
Content Validity
• The extent to which test items sample the
content that they are supposed to measure
Criterion Validity
• Criterion validity refers to the extent to which a
test score is related to some measure of job
performance called a criterion
• Established using one of the following research
designs:
– Concurrent Validity
– Predictive Validity
– Validity Generalization
Concurrent Validity
• Uses current employees
Predictive Validity
• Correlates test scores with future behavior
• Reduces the problem of range restriction
• May not be practical
Validity Generalization
• Validity Generalization is the extent to which a test
found valid for a job in one location is valid for the
same job in a different location
• The key to establishing validity generalization is
meta-analysis and job analysis
Construct Validity
• The extent to which a test actually measures
the construct that it purports to measure
• Is concerned with inferences about test
scores
• Determined by correlating scores on a test
with scores from other tests
Face Validity
• The extent to which a test appears to be job
related
• Reduces the chance of legal challenge
• Increasing face validity
Utility
The degree to which a selection
device improves the quality of a
personnel system, above and
beyond what would have occurred
had the instrument not been used.
Utility Analysis
Taylor-Russell Tables
• Estimates the percentage of future employees
that will be successful
• Three components
– Validity
– Base rate (successful employees ÷ total employees)
– Selection ratio (hired ÷ applicants)
Taylor-Russell Example
• Suppose we have
– a test validity of .40
– a selection ratio of .30
– a base rate of .50
• Using the Taylor-Russell
Tables what percentage of
future employees would be
successful?
Taylor-Russell Table (Base Rate = 50%)

                    Selection Ratio
r      .05  .10  .20  .30  .40  .50  .60  .70  .80  .90  .95
.00    .50  .50  .50  .50  .50  .50  .50  .50  .50  .50  .50
.10    .58  .57  .56  .55  .54  .53  .53  .52  .51  .51  .50
.20    .67  .64  .61  .59  .58  .56  .55  .54  .53  .52  .51
.30    .74  .71  .67  .64  .62  .60  .58  .56  .54  .52  .51
.40    .82  .78  .73  .69  .66  .63  .61  .58  .56  .53  .52
.50    .88  .84  .76  .74  .70  .67  .63  .60  .57  .54  .52
.60    .94  .90  .84  .79  .75  .70  .66  .62  .59  .54  .52
.70    .98  .95  .90  .85  .80  .75  .70  .65  .60  .55  .53
.80    1.0  .99  .95  .90  .85  .80  .73  .67  .61  .55  .53
.90    1.0  1.0  .99  .97  .92  .86  .78  .70  .62  .56  .53
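As a sketch, looking up the slide's example in the base-rate-.50 table is a simple row/column index; only the two rows needed here are encoded:

```python
# Taylor-Russell lookup: rows are validity coefficients, columns are
# selection ratios, values are the expected proportion of successful hires.
ratios = [.05, .10, .20, .30, .40, .50, .60, .70, .80, .90, .95]
table_br50 = {  # base rate = .50; only two rows shown for illustration
    .40: [.82, .78, .73, .69, .66, .63, .61, .58, .56, .53, .52],
    .50: [.88, .84, .76, .74, .70, .67, .63, .60, .57, .54, .52],
}

def proportion_successful(validity, selection_ratio):
    return table_br50[validity][ratios.index(selection_ratio)]

# Slide example: validity .40, selection ratio .30, base rate .50
print(proportion_successful(.40, .30))  # 0.69
```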
[Scatterplot: Criterion (y-axis, 1–10) plotted against Test Score (x-axis, 1–10) for 30 employees, divided into quadrants I–IV by a criterion cutoff and a test-score cutoff.]

21 ÷ 30 = .70
15 ÷ 30 = .50
Brogden-Cronbach-Gleser Utility
Formula
• Estimates utility as the amount of money an
organization would save if it used the test to
select employees.

Savings = (n)(t)(r)(SDy)(m) − cost of testing
• n= Number of employees hired per year
• t= average tenure
• r= test validity
• SDy=standard deviation of performance in dollars
• m=mean standardized predictor score of selected
applicants
Components of Utility
Selection ratio
The ratio of the number of openings to the
number of applicants
Validity coefficient
Base rate of current performance
The percentage of employees currently on the
job who are considered successful.
SDy
The difference in performance (measured in dollars)
between a good and an average worker (workers one
standard deviation apart)
Calculating m
• For example, we administer a test of mental ability
to a group of 100 applicants and hire the 10 with
the highest scores. The average score of the 10
hired applicants was 34.6, the average test score of
the other 90 applicants was 28.4, and the standard
deviation of all test scores was 8.3. The desired
figure would be:
• (34.6 - 28.4) ÷ 8.3 = 6.2 ÷ 8.3 = .75
Calculating m
• You administer a test of mental ability to a group
of 150 applicants, and hire 35 with the highest
scores. The average score of the 35 hired
applicants was 35.7, the average test score of the
other 115 applicants was 24.6, and the standard
deviation of all test scores was 11.2. The desired
figure would be:
– (35.7 - 24.6) ÷ 11.2 = 11.1 ÷ 11.2 = .99
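The arithmetic on both slides can be sketched as one small function (following the slides' computation: difference between the mean score of those hired and those not hired, divided by the SD of all scores):

```python
# m as computed on the slides: (mean of hired - mean of not hired) / SD of all scores.
def m_value(mean_hired, mean_not_hired, sd_all):
    return (mean_hired - mean_not_hired) / sd_all

print(round(m_value(34.6, 28.4, 8.3), 2))   # first example: 0.75
print(round(m_value(35.7, 24.6, 11.2), 2))  # second example: 0.99
```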
Example
– Suppose:
• we hire 10 auditors per year
• the average person in this position stays 2 years
• the validity coefficient is .40
• the average annual salary for the position is $30,000, so
SDy is estimated as 40% of salary, or $12,000
• we have 50 applicants for 10 openings, so the selection
ratio is .20 and m = 1.40
• testing costs $10 per applicant
– Our utility would be:
(10 × 2 × .40 × $12,000 × 1.40) − (50 × $10) =
$134,400 − $500 = $133,900
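The example above can be sketched directly from the Brogden-Cronbach-Gleser formula:

```python
# Brogden-Cronbach-Gleser utility, plugging in the example's numbers.
def utility(n, t, r, sd_y, m, cost_of_testing):
    # n hires/year * tenure * validity * SDy * m, minus testing cost
    return n * t * r * sd_y * m - cost_of_testing

# 10 hires/year, 2-year tenure, validity .40, SDy $12,000, m = 1.40,
# testing cost = 50 applicants at $10 each.
savings = utility(n=10, t=2, r=0.40, sd_y=12000, m=1.40, cost_of_testing=50 * 10)
print(round(savings))  # 133900
```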
Definitions
• Measurement Bias
– Technical aspects of the test
– A test is biased if there are group differences in
test scores (e.g., race, gender) that are unrelated
to the construct being measured (e.g., integrity)
• Predictive Bias
– A test is fair if people of equal probability of
success on a job have an equal chance of being
hired
Adverse Impact
Occurs when the selection rate for one group is
less than 80% of the rate for the group with the
highest selection rate

Example 1 (no adverse impact: .33 ÷ .40 ≈ .83)
                      Male  Female
Number of applicants   50     30
Number hired           20     10
Selection ratio       .40    .33

Example 2 (adverse impact: .20 ÷ .50 = .40)
                      Male  Female
Number of applicants   40     20
Number hired           20      4
Selection ratio       .50    .20
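The four-fifths rule can be sketched as a quick check over the two applicant pools shown above:

```python
# Four-fifths (80%) rule: a group shows adverse impact if its selection
# rate is below 80% of the highest group's selection rate.
def adverse_impact(hired_by_group, applicants_by_group):
    rates = {g: hired_by_group[g] / applicants_by_group[g] for g in hired_by_group}
    highest = max(rates.values())
    return {g: rate < 0.80 * highest for g, rate in rates.items()}

# First example: .33 / .40 is about .83, so no adverse impact
print(adverse_impact({"M": 20, "F": 10}, {"M": 50, "F": 30}))
# Second example: .20 / .50 = .40, adverse impact against females
print(adverse_impact({"M": 20, "F": 4}, {"M": 40, "F": 20}))
```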
Top-Down Selection
Advantages
• Higher quality of selected applicants
• Objective decision making
Disadvantages
• Less flexibility in decision making
• Adverse impact = less workforce diversity
• Ignores measurement error
• Assumes test score accounts for all the variance in
performance (Zedeck, Cascio, Goldstein & Outtz, 1996).
Passing Scores
Applicant Sex Score
Omar M 98
Eric M 80
Mia F 70 (passing score)
Morris M 69
Tammy F 58
Drew M 40
Passing Scores
Advantages
• Increased flexibility in decision making
• Less adverse impact against protected
groups
Disadvantages
• Lowered utility
• Can be difficult to set
Top-Down Banding
Traditional Bands
• Based on expert judgment
• Administrative ease
• Examples: college grading system, levels of
job qualifications
Expectancy Bands
Band   Test Score   Probability of Success
A      522–574      85%
D      0–418        56%
SEM Bands
“Ranges of Indifference”
SEM Banding
• Compromise between top-down selection and passing scores
• Based on the concept of the standard error of measurement
• To compute it, you need the standard deviation and the
reliability of the test

Standard error = SD × √(1 − reliability)
Advantages of Banding
• Helps reduce adverse impact, increase
workforce diversity, and increase perceptions of
fairness (Zedeck et al., 1996).
• Allows you to consider secondary criteria
relevant to the job (Campion et al., 2001).
Disadvantages of Banding
(Campion et al., 2001)
Banding Example

• Sample Test Information
– Reliability = .80
– Mean = 72.85
– Standard deviation = 9.1

• The Standard Error
Standard error = SD × √(1 − reliability)
= 9.1 × √(1 − .80)
= 9.1 × √.20
= 9.1 × .447
= 4.07

• The Band
Band = Standard error × 1.96
Band = 4.07 × 1.96 = 7.98 ≈ 8

• Example 1
– We have four openings
– We would like to hire more females

• Example 2
– Reliability = .90
– Standard deviation = 12.8
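Following the slide's computation, the band width can be sketched as one function (z = 1.96 for a 95% band):

```python
# SEM band width: z * SD * sqrt(1 - reliability).
def band_width(sd, reliability, z=1.96):
    sem = sd * (1 - reliability) ** 0.5
    return z * sem

print(round(band_width(9.1, 0.80), 2))   # slide example: 7.98, about 8 points
print(round(band_width(12.8, 0.90), 2))  # Example 2: 7.93
```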
Focus on Ethics
Diversity Efforts