
Jean Phillips & Stanley Gully

Copyright © 2012 Pearson Education, Inc. publishing as Prentice Hall


 Effective measurement and data analytics can
result in a competitive edge
 Improperly assessing and measuring
candidate characteristics can lead to:
◦ Systematically hiring the wrong people
◦ Offending and losing good candidates
◦ Exposing your company to legal action
 There are many legal issues involved with
candidate assessment and measurement

 Measurement is the process of assigning numbers
according to some rule or convention to aspects of
people, jobs, job success, or aspects of the staffing
system
 Measures enable improvement of the staffing
system by identifying patterns useful for understanding
and predicting relevant processes and outcomes
 The measures relevant to staffing are those that assess:
◦ The characteristics of the job, which enables the creation of
job requirements and job rewards matrices
◦ Aspects of the staffing system such as the number of days a
job posting is run, where it is run, and the recruiting message
◦ The characteristics of job candidates such as ability or
personality
◦ Staffing outcomes, such as performance or turnover

 The numerical outcomes of measurement are
data
 There are two types of data:
◦ Predictive data is information about measures used to
make projections about outcomes.
◦ Criterion data is information about important outcomes
of the staffing process.
 Traditionally, this data includes measurement of employee
job success, which is the organization’s unique definition
of success and performance in the job and in the firm.
 Criterion data should also include all outcome data that is
relevant to the evaluation of the effectiveness of the
staffing system against its goals. This may include
measures of job success, time-to-hire, promotion rates,
and tenure rates as well as job and company engagement,
fit with company values, and willingness to help other
employees.
 Nominal: numbers are assigned to discrete labels
or categories (e.g., race, gender, college major)
 Ordinal: attributes are ranked in ascending or
descending order (e.g., ranking from best to worst
performance)
 Interval: zero point is arbitrary but distance
between scores has meaning (e.g., intelligence or
interview scores)
 Ratio: distance between scores has meaning and
there is a true zero point (e.g., salary, typing
speed)

 Scoring: The process of assigning numerical
values during measurement
 Raw scores: the unadjusted scores on a
measure
◦ Criterion-referenced measures: measures in
which the scores have meaning in and of
themselves
◦ Norm-referenced measures: measures in which
the scores have meaning only in comparison to
the scores of other respondents
 Normal curve: a symmetrical, bell-shaped
curve representing the distribution of a
characteristic
 Percentile score: a raw score that has been
converted into an expression of the percentage of
people whose score falls at or below that score
 Central tendency: describes the midpoint or center
of data
◦ Mean: the average of the scores
◦ Median: the middle score, or the point below which 50 percent of
the scores fall
◦ Mode: the most commonly observed score (bimodal = two modes)
 Variability: describes the spread of the data around
the midpoint
◦ Range: the difference between the highest and lowest observed scores
◦ Outlier: score much higher or lower than most of the scores in a
distribution
◦ Variance: a mathematical measure of spread based on squared
deviations of scores from the mean
◦ Standard deviation: positive square root of the variance;
conceptually similar to the average distance from the mean of a
set of scores
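The definitions above can be tried out on a small data set. The following is a minimal sketch using Python's standard `statistics` module; the scores are hypothetical and `percentile_score` is an illustrative helper name, not something from the chapter.

```python
import statistics

# Hypothetical raw assessment scores for nine candidates (illustrative only).
scores = [55, 60, 60, 65, 70, 75, 80, 85, 98]

# Central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
modes = statistics.multimode(scores)      # two entries would mean bimodal

# Variability
score_range = max(scores) - min(scores)
variance = statistics.variance(scores)    # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)        # positive square root of the variance


def percentile_score(raw, all_scores):
    """Percentage of scores that fall at or below the given raw score."""
    at_or_below = sum(1 for s in all_scores if s <= raw)
    return 100 * at_or_below / len(all_scores)
```

Note that `statistics.variance` and `statistics.stdev` use the sample (n - 1) formulas; `pvariance` and `pstdev` would be the choice if the scores were treated as the whole population.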
 Standard scores: converted raw scores that
indicate where a person’s score lies in
comparison to a referent group
◦ Indicate how many standard deviation units
the individual’s score is above or below the mean
of the referent group
 A standard score is negative when the
target individual’s raw score is below the
referent group’s mean, and positive when
the target individual’s raw score is above
the referent group’s mean

z score = (individual’s raw score – referent group mean) /
(referent group standard deviation)
Meaningfully combining the raw scores would be difficult.
Combining the z scores is easy and results in a single number
reflecting how each candidate did on both of the assessments
relative to the other candidates.
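As a rough illustration of why z scores combine more easily than raw scores, the sketch below standardizes two assessments that use different scales and adds the results. The candidate labels, scores, and equal weighting of the two assessments are all assumptions made for the example.

```python
import statistics


def z_scores(raw_scores):
    """Convert raw scores to z scores using the group's own mean and SD."""
    mean = statistics.mean(raw_scores)
    sd = statistics.stdev(raw_scores)  # sample standard deviation
    return [(x - mean) / sd for x in raw_scores]


# Hypothetical candidates scored on two assessments with different scales.
candidates = ["A", "B", "C", "D"]
interview = [30, 40, 50, 60]          # e.g., points out of 60
knowledge_test = [88, 70, 94, 76]     # e.g., percent correct

z_interview = z_scores(interview)
z_knowledge = z_scores(knowledge_test)

# Equal-weight composite: add each candidate's two z scores.
composite = {
    name: zi + zk
    for name, zi, zk in zip(candidates, z_interview, z_knowledge)
}

# Candidates ordered from highest to lowest composite score.
ranked = sorted(composite, key=composite.get, reverse=True)
```

Here the referent group is the applicant pool itself; if published norms were used instead, the mean and standard deviation would come from the norm group rather than from the candidates' own scores.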
 When making candidate selection decisions, it is
often assumed that in the applicant pool, the
distribution of applicant fit with the job reflects the
normal curve. A large burden is then placed on the
selection system to accurately identify which
candidates are in the far right tail of the normal
curve.
 However, many of the most desirable people for
the position are likely to be actively and happily
employed elsewhere and are semi-passive job
seekers at best. In this case, the distribution of
applicant fit with the job might resemble the A
distribution shown on the next slide.

 If done strategically, sourcing and recruiting can
discourage poor fits from applying and increase
the number of high quality passive and semi-
passive candidates who apply.
 This shifts the curve to reflect a distribution like
that shown by the B distribution.
 The B distribution clearly reduces the burden on
the selection system to identify quality candidates
and significantly increases the likelihood of
identifying a high-quality candidate.

 The linear correlation coefficient, also called “Pearson’s
r” or the “bivariate correlation,” is a single number
that ranges from -1 to +1 and reflects the
direction (positive or negative) and magnitude
(strength) of the relationship between two
variables.
◦ A value of r = 0 indicates that values of one measure are
not linearly related to values of the other measure (but they
are not necessarily independent).
◦ A value of r = +1 means that there is a perfectly linear,
positive relationship between the two measures; as values
of one measure increase, values of the other measure
increase exactly the same amount in standard deviations.
◦ A value of r = -1 means that there is a perfectly linear,
negative (inverse) relationship between the two measures; as values of
one measure increase, values of the other measure decrease
exactly the same amount in standard deviations.
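The three cases above (r = 0, r = +1, r = -1) can be checked with a small from-scratch calculation. This is a minimal sketch with made-up data; `pearson_r` is an illustrative function name.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical test scores and later job performance ratings.
test_scores = [10, 20, 30, 40, 50]
performance = [2, 3, 5, 4, 6]

r = pearson_r(test_scores, performance)
r_perfect = pearson_r(test_scores, [2 * s + 1 for s in test_scores])
r_inverse = pearson_r(test_scores, [-s for s in test_scores])

# A U-shaped relationship gives r = 0 even though y depends entirely on x.
r_curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
```

The last case shows why r = 0 does not imply independence: the second variable is the square of the first, yet the linear correlation is zero.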
 Scatter plot: graphical illustration of the
relationship between two variables
◦ Each point on the chart corresponds to how a
person scored on a measure and how he or she
performed on the job

Would this test be useful in making hiring decisions?

 Relating store size with staffing levels
 Relating seniority in a firm with job
performance
 Relating the time to fill a job with new-hire
quality
 Relating quality of new hires with business
performance and customer satisfaction

 Sampling error: When you use statistics,
including correlations, to draw inferences or
conclusions, you have to be concerned about
sampling error. Sampling error is the
variability in sample correlations due to
chance.
 You can address sampling error through
statistical significance testing procedures.

 Statistical significance: the degree to which the
observed relationship is not likely due to sampling
error.
◦ This is a minimum requirement for establishing a
meaningful relationship
 Practical significance: the observed relationship is large
enough to be of value in a practical sense.
◦ In a large enough sample, a very small correlation
would be statistically significant but the relationship
may not be strong enough to justify the expense and
time of using the predictor
 An inexpensive assessment system may be useful even
if the correlation is small.
 Alternatively, if an assessment method that correlated
.15 with job success was expensive, took a long time to
administer, and was only moderately liked by job
candidates, it may not be worth using even if it is a
statistically significant predictor of job success.
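To see how sample size alone can turn a small correlation into a statistically significant one, the sketch below uses the standard t test for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sample sizes of 20 and 1,000 are hypothetical, and the cutoff of 1.96 is a large-sample approximation to the two-tailed .05 critical value (the exact value for a small sample is somewhat larger).

```python
import math


def t_statistic(r, n):
    """t statistic for testing whether a sample correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Approximate two-tailed critical value at the .05 level for large samples.
CRITICAL_T = 1.96

# The same correlation of .15 observed in a small and a large sample.
t_small_sample = t_statistic(0.15, 20)
t_large_sample = t_statistic(0.15, 1000)

significant_small = abs(t_small_sample) > CRITICAL_T
significant_large = abs(t_large_sample) > CRITICAL_T
```

The correlation is identical in both cases; only the large sample clears the significance threshold. Whether a predictor with r = .15 is worth its cost remains a separate, practical question.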

 Multiple regression: a statistical technique that
predicts an outcome using one or more predictor
variables. It identifies the ideal weights to
assign each predictor to maximize the
validity of a set of predictors. The analysis is
based on each predictor’s correlation with
the outcome and the degree to which the
predictors are themselves intercorrelated
 Multiple regression examines the effect of
each predictor variable after statistically
controlling for the effects of other
predictors in the equation
Job success (predicted) = Constant + (b1 * Test score 1) + (b2 * Test score 2)
+ (b3 * Test score 3) + …
Job success (predicted) = 10 + (2 * Interview) + (1 * Personality)
+ (.2 * Job knowledge)

If someone scores 50 on the interview, 27 on the personality test, and
20 on the job knowledge test, what is the predicted job success
score?

Job success (predicted) = 10 + (2 * 50) + (1 * 27) + (.2 * 20)

Job success (predicted) = 141

141 is then compared with the predicted job success scores of other
candidates to determine who should be selected
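The worked example can be reproduced in a few lines. This sketch hard-codes the constant and weights from the example (10, 2, 1, and .2); the second candidate's scores are hypothetical and are included only to show the comparison step.

```python
def predicted_job_success(interview, personality, job_knowledge):
    """Apply the constant and regression weights from the worked example."""
    constant = 10
    return constant + 2 * interview + 1 * personality + 0.2 * job_knowledge


# The candidate from the example: 50, 27, and 20 on the three measures.
score = predicted_job_success(interview=50, personality=27, job_knowledge=20)

# Hypothetical second candidate, added only to illustrate the comparison.
other = predicted_job_success(interview=45, personality=30, job_knowledge=40)

# The candidate with the higher predicted score would be preferred.
preferred = "first" if score > other else "second"
```

In practice the weights would be estimated from data on past hires rather than typed in by hand.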

 Reliability refers to how dependably or
consistently a measure assesses a particular
characteristic
 Measurement error influences reliability.
 Measurement error can be random or
systematic.
 To evaluate a measure’s reliability, you
should consider:
◦ The type of measure
◦ The type of reliability estimate reported
◦ The context in which the measure will be used

 All of these factors, as well as others, can
influence reliability. That is why tests or
assessment tools should be standardized in
their use.
◦ Temporary physical or psychological state
◦ Environmental factors
◦ Version, or form, of the measure
◦ Different evaluators

 Random error: error that is not due to any
consistent cause
 Systematic error: error that occurs because
of consistent and predictable factors
 Deficiency error: error that occurs when you
fail to measure important aspects of the
attribute you would like to measure
 Contamination error: error that occurs when
other factors unrelated to whatever is being
assessed affect the observed scores

 Test-retest reliability reflects the repeatability of
scores over time and the stability of the underlying
construct being measured
 Alternate or parallel form reliability indicates how
consistent scores are likely to be if a person
completes two or more forms of the same measure
 Internal consistency reliability indicates the extent
to which items on a given measure assess the same
construct
 Inter-rater reliability indicates how consistent
scores are likely to be if the responses are scored
by two or more raters using the same item, scale,
or instrument

 Cronbach’s alpha is commonly used as a measure of the internal
consistency of psychometric tests.
 Not robust against missing data.
$$r_{XX} = \left(\frac{k}{k-1}\right)\left(1 - \frac{\sum_{i=1}^{k} S_i^2}{S_X^2}\right)$$
k: the number of items
S_i^2: sample variance for item i
S_X^2: sample variance of the total test scores
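The coefficient defined by k, the item variances, and the total-score variance above is commonly known as Cronbach's alpha. A minimal sketch of the calculation follows; the item responses are invented for illustration, and complete data are assumed (no missing responses).

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of respondent scores."""
    k = len(item_scores)
    # Sample variance of each item across respondents.
    item_variances = [statistics.variance(item) for item in item_scores]
    # Total test score per respondent, then the variance of those totals.
    totals = [sum(responses) for responses in zip(*item_scores)]
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 3 items (rows) answered by 5 respondents (columns).
items = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 5, 5],
    [2, 2, 4, 5, 4],
]
alpha = cronbach_alpha(items)
```

Because every respondent must have a score on every item for the totals to be comparable, a single missing response distorts the result, which is one way to read the "not robust against missing data" caution.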

 The standard error of measurement (SEM) is the
margin of error that you should expect in an
individual score because of the imperfect reliability
of the measure. It represents the spread of scores
you might have observed had you tested the same
person repeatedly.
 The confidence interval is the range around a
person’s earned score (the score plus or minus a
multiple of the SEM) within which their “true” score
is expected to lie at a chosen level of confidence.
 The lower the standard error, the more accurate
the measurements.
◦ If the SEM is 0, then each observed score is that person’s
true score

$$\mathrm{SEM} = s_X \sqrt{1 - r_{XX}}$$
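As a quick numerical illustration, the SEM is the standard deviation of the observed scores multiplied by the square root of one minus the reliability. The sketch below applies that to a hypothetical test (standard deviation 10, reliability .84) and builds an approximate 95% band around an earned score of 75, assuming normally distributed measurement error (hence the multiplier 1.96).

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD of the observed scores times sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)


def confidence_interval(score, sem, z=1.96):
    """Band around an earned score; z = 1.96 gives roughly 95% confidence."""
    return (score - z * sem, score + z * sem)


# Hypothetical test: standard deviation of 10 and reliability of .84.
sem = standard_error_of_measurement(sd=10, reliability=0.84)
low, high = confidence_interval(score=75, sem=sem)

# With perfect reliability the SEM is 0 and the band collapses to the score.
sem_perfect = standard_error_of_measurement(sd=10, reliability=1.0)
```

With these numbers the SEM is about 4 points, so two candidates whose scores differ by only a point or two cannot be confidently separated.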

 Validity refers to how well a measure assesses a given
construct and the degree to which you can make specific
conclusions or predictions based on observed scores.
 Validity can tell you what you may conclude or predict about
someone based on his or her score on a measure, thus
indicating the measure’s usefulness.
 Validity will tell you how useful a measure is for a particular
situation; reliability will tell you how consistent scores from
that measure will be.
 You cannot draw valid conclusions unless you are sure that
the measure is reliable. Even when a measure is reliable, it
may not be valid.
◦ You might be able to measure a person’s shoe size reliably but it
may not be useful as a predictor of job performance.
 Any measure used in staffing needs to be both reliable and
valid for the situation.

Figure 8-7

 Validation is the cumulative and ongoing
process of establishing the job relatedness
of a measure
 There are three types of validation
processes:
◦ Content-related validation: Demonstrating that
the content of a measure assesses important job-
related behaviors
◦ Construct-related validation: Demonstrating that
a measure assesses the construct, or
characteristic, it claims to measure
◦ Criterion-related validation: Demonstrating that
there is a statistical relationship between scores
from a measure and the criterion, usually some
aspect of job success
 Face validity is a subjective assessment of
how well items seem to be related to the
requirements of the job.
 Face validity is often important to job
applicants who tend to react negatively to
assessment methods if they perceive them
to be unrelated to the job or not face valid.
 Even if a measure seems face valid, if it
does not predict job performance, then it
should not be used.

 A validity coefficient is a number between 0
and +1 that indicates the magnitude of the
relationship between a predictor (such as
test scores) and the criterion (such as a
measure of actual job success).
 The validity coefficient is the absolute value
of the correlation between the predictor and
criterion.
 Validity coefficients rarely exceed .40 in
staffing contexts.

 If perfect prediction is 1.000:
◦ Development centre is 0.650
◦ Work sample test is 0.550
◦ Ability tests are 0.525
◦ Assessment centre is 0.450
◦ Personality tests are 0.425
◦ Bio-data analysis is 0.375
◦ Structured interviews are 0.350
◦ Typical interviews are 0.166
◦ References are 0.133
◦ The use of graphology, astrology: lower than zero
 Consider:
◦ The level of adverse impact associated with your
assessment tool
◦ The number of applicants compared to the
number of openings
◦ The number of currently successful employees
◦ The cost of a hiring error
◦ The cost of the selection tool
◦ The probability of hiring a qualified applicant
without using a scored assessment tool

 Applicants—a valid assessment system can result
in adverse impact by differentially selecting people
from various protected groups, have low face
validity, and result in lawsuits.
 Organization’s time and cost—a valid assessment
system can have an unacceptably long time to fill
or cost per hire; identify high-quality candidates who
demand high salaries, increasing payroll costs; and be
cumbersome, difficult, or complex to use.

 Future recruits—a system can be valid but if the system
is too long or onerous then applicants, particularly
high-quality applicants, are more likely to drop out of
consideration; word that a firm is using time-
consuming selection practices could reduce the number
of applications; a valid system could result in
differential selection rates and reduce the number of
applicants from a particular gender, ethnicity, or
background; and valid systems can still be viewed as
unfair, resulting in fewer future applicants.
 Current employees—a valid assessment system may
favor external applicants or not give all qualified
employees an equal chance of applying for an internal
position; employees may question its fairness.

 Validity generalization: the degree to which
evidence of validity obtained in one situation
can be generalized to another situation
without further study
 Based on meta-analysis
 No guarantee that the same validity will be
found in any specific workplace
 Legal acceptability not yet established

 Examine available validation evidence supporting using the measure
for specific purposes. Evaluate the procedures used in the validation
studies and the results of those studies, and consider the definition of
job success used in them.
 Identify the possible valid uses of the measure. The purposes for
which the measure can legitimately be used should be described, as
well as the performance criteria that can be predicted validly.
 Establish the similarity of the sample group(s) on which the measure
was developed with the group(s) with which you would like to use the
measure. For example, what were the race, ethnicity, and age of the sample?
 Confirm job similarity. A job analysis should be performed to verify
that your job and the original job are substantially similar in terms of
ability requirements and work behavior.
 Examine adverse impact evidence. Reports from outside studies must
be considered for each protected group that is part of your labor
market. If this information is not available for an otherwise qualified
measure, an internal study should be conducted, if feasible.

 Measures should be used in a purposeful manner
 Use a variety of tools
 Use measures that are unbiased and fair to all groups
 Use measures that are reliable and valid
 Use measures that are appropriate for the target
population
 Ensure that administration staff are properly trained
 Ensure suitable and uniform assessment conditions
 Maintain assessment instrument security
 Maintain confidentiality of results
 Interpret scores properly

 All assessment tools are subject to errors, both in
measuring a characteristic, such as verbal ability,
and in predicting job success criteria, such as job
performance.
◦ Do not expect any measure or procedure to measure a
personal trait or ability with perfect accuracy for every
single person.
◦ Do not expect any measure or procedure to be completely
accurate in predicting job success.
 Selection errors occur when you fail to hire
someone who would have been successful at the
job (false negatives) or you hire someone who is
not successful at the job (false positives).
◦ Selection errors cannot be completely avoided in any
assessment program or method, but they can be reduced.

 Appropriately using professionally developed
measures, even imperfect ones, enables organizations
to make more effective staffing decisions than does
the use of simple observations or random decision
making.
 The practice of using a variety of measures and
procedures to more fully assess people is referred
to as the whole-person approach to assessment,
and will help reduce the number of selection errors
and boost the effectiveness of your overall decision
making.
 Standardization: the consistent administration and
use of a measure
 Norms: reflect the distribution of scores of a large
number of people whose scores on an assessment
method are to be compared. The standardization
sample is the group of respondents whose scores
are used to establish norms. These norms become
the comparison scores for determining the relative
performance of future respondents.

 Objectivity refers to the amount of judgment or bias
involved in scoring an assessment measure.
 The scoring of objective measures is free of personal
judgment or bias.
◦ Multiple-choice exams and the number of words typed
in a minute are objective measures.
 Subjective measures contain items for which the score
can be influenced by the attitudes, biases, and personal
characteristics of the person doing the scoring (e.g.,
essay or interview questions).
◦ Whenever hiring decisions are subjective, it is also a
good idea to involve multiple people in the hiring
process, preferably of diverse gender and race, to
generate a more defensible decision.
 Because they produce the most accurate measurements,
it is best to use standardized, objective measures
whenever possible.
 Conduct a job analysis to identify the important KSAOs and
competencies required of a successful employee.
 Identify reliable and valid methods of measuring these KSAOs
and competencies, and create a system for measuring and
collecting the resulting data.
 Examine the data collected from each measure to ensure that
it has an appropriate mean and standard deviation.
 Use correlation or regression analysis to evaluate any
redundancies among the measures and to assess how well
the group of measures predicts job success.
 Consider adverse impact and the cost of the measures in
evaluating each measure.
 After the final set of measures is identified, develop
selection rules to determine which scores are passing.
 Periodically reevaluate the usefulness and effectiveness of the
system to ensure that it is still predicting job success without
adverse impact.

 It is sometimes useful to compare an
organization’s staffing data with those of other similar
organizations
 Comparative dimensions can include:
◦ Application rates
◦ Average starting salaries
◦ Average time to fill
◦ Average cost per hire

 Determinants of effectiveness of an assessment method
include
◦ Validity (whether assessment predicts job success)
◦ Return on investment (ROI) (whether assessment generates
a financial return that exceeds the cost associated with
using it)
◦ Applicant reactions (perceptions of job relatedness and
fairness)
◦ Usability (willingness and ability of people in the
organization to use the method consistently and correctly)
◦ Adverse impact (whether the method can be used without
discriminating against members of a protected class)
◦ Selection ratio (whether the method has a low selection
ratio)

 What types of measures of job candidates are
most likely to be high in terms of their reliability
and validity? Does this make them more useful?
Why or why not?
 How would you explain to your supervisor that
the correlation between interview scores and new
hire quality is low and persuade him or her to
consider a new job applicant evaluation method?

 What correlation would you need to see
before you were willing to use an expensive
assessment test?
 When would it be acceptable to use a
measure that predicts job success but that
has adverse impact?
 What do staffing professionals need to know
about measurement?

Teddy-bear maker Fuzzy Hugs pursues a high-
quality, low-cost strategy and can’t afford to hire
underperforming manufacturing employees given its
lean staffing model. Fuzzy Hugs has identified an
assessment system that has high validity and
predicts job success well, but that is also very
expensive and results in fairly high levels of adverse
impact. The company is concerned about maintaining
a diverse workforce, and wants to avoid legal trouble.
The assessment tools it identified that had lower
adverse impact had substantially lower validity as
well, and were almost as expensive.
◦ The company asks your professional advice about whether
it should use the new assessment system. What advice do
you give?

 This chapter’s “Develop Your Skills” feature
gave you some tips on assessing job
candidates. Based on what you read in this
chapter, what are three additional tips that
you would add to the list?

The opening vignette described how Valtera
developed an assessment system to enable its
retail merchandising client to hire better
employees to enhance the execution of its
low-cost and high-service strategy. Reread
the vignette and answer the following
questions:
◦ What other measures do you think could be
considered for the job given the company’s high
service quality goal? Why?
◦ If you applied to this company and were tested
on reading, math word problems, situational
judgment, and personality measures for a retail
sales position, how would you respond? Would
you think that these methods were fair?
 Unreliability will weaken (attenuate) observed
correlations
 You can correct observed correlations for
unreliability.
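One widely used correction for unreliability divides the observed correlation by the square root of the product of the two measures' reliabilities. The sketch below assumes that standard formula applies here; the reliabilities and the observed correlation are hypothetical values.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation expected if both measures were perfectly reliable.

    r_xy: observed predictor-criterion correlation
    r_xx: reliability of the predictor
    r_yy: reliability of the criterion
    """
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical values: observed r of .30, reliabilities of .80 and .60.
observed = 0.30
corrected = correct_for_attenuation(observed, r_xx=0.80, r_yy=0.60)
```

The corrected value is always at least as large as the observed one, since reliabilities cannot exceed 1. It describes a hypothetical relationship between error-free measures, not the accuracy you would get from the test as actually administered.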

 Organizations do not hire randomly, and they tend to
keep only the best employees
 Restricted variability of employees in an
organization will tend to attenuate observed
correlations
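A common way to adjust for this kind of range restriction is the correction often attributed to Thorndike (Case II), which rescales the observed correlation using the ratio of the unrestricted to the restricted standard deviation of the predictor. The sketch below assumes that formula is the relevant one here; the numbers are hypothetical.

```python
import math


def correct_for_range_restriction(r, sd_unrestricted, sd_restricted):
    """Estimate the correlation in the full applicant pool from a restricted sample.

    r: correlation observed among current employees (restricted group)
    sd_unrestricted: predictor SD among all applicants
    sd_restricted: predictor SD among those actually hired and retained
    """
    u = sd_unrestricted / sd_restricted
    return (r * u) / math.sqrt(1 - r ** 2 + (r ** 2) * (u ** 2))


# Hypothetical values: r = .25 among employees, applicant SD 1.5 times larger.
observed = 0.25
corrected = correct_for_range_restriction(observed, sd_unrestricted=15, sd_restricted=10)
```

When the two standard deviations are equal the function returns the observed correlation unchanged; the more selective the hiring, the larger the upward adjustment.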

All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, electronic,
mechanical, photocopying, recording, or otherwise, without the prior written
permission of the publisher. Printed in the United States of America.
