BES3149 PSYCHOMET 1

Understanding Test Utility

Defining test utility


- Usefulness or practical value of testing to improve efficiency
- Benefits that testing brings to decision making
→ Depends on the extent to which a test’s use increases the accuracy of the
inferences and decisions we wish to make, over and above what it would be if
we used other available tools

Factors affecting test utility
1. Psychometric Soundness
- Reliability and validity of a test
- Normally, a valid test is most likely to be useful
→ But there are other factors that must be considered in determining a test’s utility
→ Selection ratio = number of people hired divided by number of applicants
a. Lower selection ratio = validity matters more for test utility
b. Higher selection ratio = validity matters less for test utility
2. Costs
- Disadvantages, losses, or expenses in both economic and non-economic terms
- Usual meaning = economic
- If testing is to be conducted, then it may be necessary to allocate funds to purchase:
a. Particular test
b. Supply of blank test protocols
c. Computerized test processing, scoring and interpretation from the test publisher or
some independent service
- Costs associated with testing may also come in other forms
a. Payment to professional personnel and staff associated with test administration,
scoring, and interpretation
b. Facility rental, mortgage, and/or other charges related to the usage of the test facility
c. Insurance, legal, accounting, licensing, and other routine costs of doing business
- Other economic costs are not easy to compute, especially those related to the use of
ineffective tests/instruments or the implications of cost-cutting measures
→ Cost-cutting can cause problems with reliability
3. Benefits
- Profits, gain, or advantages derived from the use of a particular test
- While testing can have some cost to the company, the economic benefits can be
tremendous in terms of
a. Increase in quantity and quality of worker performance
b. Decrease in competency gaps (which require training), accidents, and employee turnover
4. Index of Utility
- Practical value of the information derived from scores on a test
- More than practical value, though, test utility tells us whether the use of test scores
actually helps in making “better” decisions
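
To make the selection ratio listed under psychometric soundness concrete, here is a minimal Python sketch. The function name and the two hiring scenarios are hypothetical, invented only for illustration.

```python
def selection_ratio(n_hired: int, n_applicants: int) -> float:
    """Selection ratio = number of people hired / number of applicants."""
    if n_applicants <= 0:
        raise ValueError("n_applicants must be positive")
    if not 0 <= n_hired <= n_applicants:
        raise ValueError("n_hired must be between 0 and n_applicants")
    return n_hired / n_applicants


# Hypothetical scenarios (numbers are assumptions, not data from the notes).
few_openings = selection_ratio(5, 100)    # low ratio: validity matters more
many_openings = selection_ratio(90, 100)  # high ratio: validity matters less
```

A ratio near 0 means the employer can be very selective; a ratio near 1 means almost every applicant is hired, so the test has little room to change the outcome.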

How test utility is determined
- How do professionals in the field of testing and assessment balance variables such as
psychometric soundness, benefits, and costs?
- How do they come to a judgement regarding the utility of a specific test?
- How do they decide that the benefits outweigh the costs and that a test or intervention
indeed has utility?
- There are formulas into which values can be filled, and there are tables
in which values can be looked up

[Figure: example of a Taylor-Russell table, labeled with base rate, validity, and selection ratio]


How to conduct utility analysis
1. Use of Expectancy Data
- An expectancy table provides an indication of the likelihood that a
testtaker will score within some interval of scores on a criterion
measure (ex: passing, acceptable, or failing)
- Can provide helpful information to decision-makers

- In a Taylor-Russell table:
→ Selection ratio = proportion of job applicants to be hired (0.05 to 0.95)
→ Validity of tests = possible validity coefficients for a test intended for
employee selection (0.0 to 1.0)

→ When selection ratios are low, even with the use of a test with .15 validity,
success rates are still expected to be higher than the initial base rate
→ Low selection ratio = validity matters more
▪ Limitations of Taylor-Russell Tables
→ The relationship between the predictor (ex: test) and the criterion (ex:
rating of performance on the job) must be linear (ex: as test scores
increase, performance improves)
→ If performance “levels off” at some point regardless of the test score
obtained, Taylor-Russell Tables would not be appropriate
→ It is potentially difficult to identify a criterion score that will separate
“successful” from “unsuccessful” employees
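
The kind of value a Taylor-Russell table reports can be approximated numerically. The sketch below is not taken from the notes: it assumes that test scores and job performance follow a standard bivariate normal distribution with correlation equal to the validity coefficient (the linear relationship the tables require), and it integrates numerically with a simple midpoint rule. The function name and the `steps` parameter are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def success_rate_among_selected(base_rate, validity, selection_ratio, steps=4000):
    """Estimate the proportion of selected applicants who turn out successful.

    Assumption: predictor and criterion are standard bivariate normal with
    correlation `validity`, and hiring is strictly top-down on the test.
    """
    if not -1 < validity < 1:
        raise ValueError("validity must be strictly between -1 and 1")
    if not (0 < base_rate < 1 and 0 < selection_ratio < 1):
        raise ValueError("base_rate and selection_ratio must be between 0 and 1")

    x_cut = STD_NORMAL.inv_cdf(1 - selection_ratio)  # test cutoff (z-score)
    y_cut = STD_NORMAL.inv_cdf(1 - base_rate)        # criterion cutoff (z-score)
    residual_sd = sqrt(1 - validity ** 2)

    upper = 8.0  # the normal density beyond z = 8 is negligible
    width = (upper - x_cut) / steps
    joint = 0.0  # accumulates P(selected and successful)
    for i in range(steps):
        x = x_cut + (i + 0.5) * width
        p_success_given_x = 1 - STD_NORMAL.cdf((y_cut - validity * x) / residual_sd)
        joint += STD_NORMAL.pdf(x) * p_success_given_x * width
    return joint / selection_ratio


# Base rate .50, validity .50, selection ratio .50 gives about .67.
example = success_rate_among_selected(0.50, 0.50, 0.50)
```

With a validity of 0 the result equals the base rate, which is the sense in which a test with no validity adds nothing to selection.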

2. Use of Brogden-Cronbach-Gleser Formula (BCG)


- Used to calculate the dollar/peso amount of a utility gain resulting from the
use of a particular selection instrument under specified conditions
- Utility gain = an estimate of the benefit (monetary or otherwise)
of using a particular test or selection method

BCG Formula: utility gain = (N)(T)(r_xy)(SD_y)(Z_m) – (N)(C)

N (first term) – number of applicants selected per year
T – average length of time in the position (tenure)
r_xy – validity coefficient
SD_y – standard deviation of performance (in dollars/pesos) of employees
Z_m – mean (standardized) score on the test for selected applicants
N (second term) – number of applicants
C – cost of the test per applicant
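
The BCG formula translates directly into code. This is a minimal sketch: the parameter names are hypothetical stand-ins for the symbols above (the formula's two different N values are split into `n_selected` and `n_applicants`), and the example figures are invented for illustration.

```python
def bcg_utility_gain(n_selected, tenure_years, validity, sd_performance,
                     mean_z_selected, n_applicants, cost_per_applicant):
    """Utility gain = (N)(T)(r_xy)(SD_y)(Z_m) - (N)(C)."""
    # First term: estimated benefit from the people actually hired.
    benefit = (n_selected * tenure_years * validity
               * sd_performance * mean_z_selected)
    # Second term: cost of testing every applicant, hired or not.
    testing_cost = n_applicants * cost_per_applicant
    return benefit - testing_cost


# Hypothetical inputs: 10 hires who stay 2 years, validity .40,
# SD of performance 10,000 pesos, mean z of selected applicants 1.0,
# 100 applicants tested at 50 pesos each.
gain = bcg_utility_gain(10, 2, 0.40, 10_000, 1.0, 100, 50)
# benefit = 80,000 and testing cost = 5,000, so the gain is 75,000 pesos
```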
3. Decision Theory and Test Utility
- Decision theory was recommended to determine test utility by Cronbach
and Gleser
- To illustrate decision theory, we need to recall five terms: base rate, hit
rate, miss rate, false positive, and false negative

Base rate = number or percentage of applicants deemed to be successful
Hit rate = how efficiently the test identifies qualified applicants
Miss rate = extent to which the test fails to identify qualified applicants
False positive = the test says the applicant is qualified, but the applicant is not
False negative = the test says the applicant is not qualified, but the applicant is

- Tests are often “assumed” to be perfect predictors of future performance
→ Those who score above the cutoff score = expected to be successful on the job
→ Those who did not meet the cutoff score = predicted to be unsuccessful
- However, tests are not perfect predictors of future performance
→ There will always be “misses”
- High selection ratio = cutoff score can be set lower = expect more false
positives (people who were hired but would fail on the criterion measure)
- Low selection ratio = cutoff score must be set higher = expect more false
negatives (people who were not hired but would have eventually proven to be successful
on the job)
- Decision theory provides guidelines for setting optimal cutoff scores
→ In certain professions, like airline pilots and surgeons, having false negatives
would be preferable to having false positives
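
The four decision outcomes can be tallied for any cutoff score. In this sketch, the function name and the data are hypothetical, and the hit rate is taken to be the proportion of applicants classified correctly (with the miss rate as its complement), which is one common reading of those terms rather than a definition given in these notes.

```python
def classify_outcomes(test_scores, succeeded, cutoff):
    """Count decision outcomes when everyone scoring >= cutoff is selected."""
    if len(test_scores) != len(succeeded) or not test_scores:
        raise ValueError("need two non-empty lists of equal length")
    counts = {"true_positive": 0, "false_positive": 0,
              "true_negative": 0, "false_negative": 0}
    for score, was_successful in zip(test_scores, succeeded):
        selected = score >= cutoff
        if selected and was_successful:
            counts["true_positive"] += 1
        elif selected:
            counts["false_positive"] += 1   # predicted success, actually failed
        elif was_successful:
            counts["false_negative"] += 1   # predicted failure, actually succeeded
        else:
            counts["true_negative"] += 1
    total = len(test_scores)
    correct = counts["true_positive"] + counts["true_negative"]
    return {
        **counts,
        "base_rate": sum(succeeded) / total,  # proportion who are successful
        "hit_rate": correct / total,          # assumption: all correct decisions
        "miss_rate": 1 - correct / total,
    }


# Hypothetical applicants: test score and whether they succeeded on the job.
scores = [90, 80, 70, 60, 50, 40]
success = [True, True, False, True, False, False]
lenient = classify_outcomes(scores, success, cutoff=55)
strict = classify_outcomes(scores, success, cutoff=75)
```

In this invented data set the lenient cutoff produces one false positive and no false negatives, while the strict cutoff produces no false positives and one false negative, which is the trade-off the notes describe.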

Some practical considerations


1. Pool of job applicants
- Utility estimates assume that there is a steady supply of viable applicants to occupy the
positions at stake
- Some professions have few qualified applicants (and those who are qualified may not
accept an offer)
2. Complexity of the job
- Experts disagree as to whether it is appropriate to apply the same
utility models to jobs of varying complexity (ex: a highly complex job may have more
stringent standards of successful performance)
3. Cutoff score used
- A (usually numerical) reference point derived as a result of a judgement and used
to divide a set of data into two or more classifications
a. Relative cut score
→ Also known as norm-referenced cut score
→ A reference point in a distribution of test scores, used to divide a set of data into
two or more classifications
→ Based on norm-related considerations rather than on the relationship of test scores
to a criterion
→ This type of cut score is set with reference to the performance of a group or some
target segment of a group
b. Fixed cut score
→ Also known as absolute cut score
→ Typically set with reference to a judgement concerning a minimum level of
proficiency required to be included in a particular classification
- Some cut scores can also be multiple cut scores or multiple hurdles
a. Multiple cut scores
→ Refer to the use of two or more cut scores with reference to one predictor for the
purpose of categorizing testtakers
→ Example: different cut scores are set to be equivalent to ratings of A, B, C, D
b. Multiple hurdles
→ Achievement of one cutoff score is necessary to proceed to the next stage in the
evaluation process
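
Multiple cut scores and multiple hurdles differ in a way that is easy to see in code. The sketch below uses hypothetical cut scores (70, 80, 90) and hypothetical function names; nothing here comes from an actual test.

```python
from bisect import bisect_right

# Hypothetical cut scores on ONE predictor, mapping scores to ratings.
GRADE_CUTS = [70, 80, 90]
GRADES = ["D", "C", "B", "A"]


def rating_from_score(score):
    """Multiple cut scores: several cut points on a single predictor."""
    # bisect_right counts how many cut scores the score has reached.
    return GRADES[bisect_right(GRADE_CUTS, score)]


def stages_cleared(stage_scores, stage_cutoffs):
    """Multiple hurdles: each stage must be passed to reach the next one."""
    cleared = 0
    for score, cutoff in zip(stage_scores, stage_cutoffs):
        if score < cutoff:
            break  # later stages are never evaluated
        cleared += 1
    return cleared


rating = rating_from_score(85)                       # "B"
hurdles = stages_cleared([80, 65, 90], [70, 70, 70]) # 1: fails the second stage
```

Note that in the hurdles example the strong third-stage score never counts, because the applicant is screened out at the second stage.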

Methods of setting cut scores


1. Angoff method
- A way to set fixed cut scores that entails averaging the judgements of experts
→ Determines how often a minimally qualified performer would answer a test item
correctly
→ A panel of experts is chosen to review test items and estimate the probability that
a minimally qualified performer would answer each item correctly
→ This simple technique has wide appeal and works well, as long as the experts
agree
→ Its weakness shows when there is low inter-rater reliability and major disagreement
regarding how certain populations of testtakers should respond to items
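
As a rough sketch of the averaging step in the Angoff method: each expert's probability estimates are averaged per item, and the item averages are summed to give an expected raw score for a minimally qualified performer. Summing over items is the usual way the method is applied, but it is an assumption here since the notes mention only the averaging. The function name and ratings are hypothetical.

```python
from statistics import fmean


def angoff_cut_score(ratings):
    """ratings[j][i] = expert j's estimated probability that a minimally
    qualified performer answers item i correctly."""
    if not ratings or not ratings[0]:
        raise ValueError("need at least one expert and one item")
    n_items = len(ratings[0])
    if any(len(expert) != n_items for expert in ratings):
        raise ValueError("every expert must rate every item")
    # Average across experts for each item, then sum across items.
    item_means = [fmean(item_ratings) for item_ratings in zip(*ratings)]
    return sum(item_means)


# Two hypothetical experts rating a three-item test.
panel = [
    [0.50, 0.75, 1.00],
    [0.50, 0.25, 0.50],
]
cut_score = angoff_cut_score(panel)  # item means 0.5, 0.5, 0.75 -> 1.75
```

The second item shows the method's weak point: the experts' estimates (0.75 versus 0.25) disagree sharply, and the average hides that disagreement.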
2. Known Groups method
- Also known as the method of contrasting groups
- Collects data on a predictor of interest from groups known to possess, and
not to possess, a trait, attribute, or ability of interest
→ Based on data analysis, a cut score is set on the test that best discriminates the two
groups’ test performance
→ Main problem = the resulting cutoff score is inherently affected by the
composition of the contrasting groups
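
One simple way to find the score that "best discriminates" two known groups is to try every observed score as a cutoff and keep the one that classifies the most people correctly. This criterion is an assumption made for illustration; other analyses could be used. The function name and scores are hypothetical.

```python
def contrasting_groups_cut_score(scores_with_trait, scores_without_trait):
    """Return (cut score, proportion correctly classified).

    People scoring >= cut are classified as having the trait.
    """
    candidates = sorted(set(scores_with_trait) | set(scores_without_trait))
    if not candidates:
        raise ValueError("need at least one score")
    total = len(scores_with_trait) + len(scores_without_trait)
    best_cut, best_correct = None, -1
    for cut in candidates:
        correct = (sum(s >= cut for s in scores_with_trait)
                   + sum(s < cut for s in scores_without_trait))
        if correct > best_correct:  # ties keep the lower cut score
            best_cut, best_correct = cut, correct
    return best_cut, best_correct / total


# Hypothetical scores for a group known to have the trait and one known not to.
with_trait = [72, 75, 80, 85]
without_trait = [50, 55, 60, 73, 74]
cut, accuracy = contrasting_groups_cut_score(with_trait, without_trait)
# cut is 75; 8 of the 9 people are classified correctly
```

Changing who is in either group changes the result, which is the main problem noted above.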

Take note

- Other factors that must be considered in using certain tests:


a. Cost efficiency
b. Time factor
c. How a “valid and reliable” test compares to another “valid and reliable” test
d. How useful is a test for diagnosis? For treatment? For classifying patients?
e. How well an admissions test can “trim down” the number of applicants to only
a few qualified applicants
f. Whether adding another test to our battery will help
g. Having a test versus not having a test
- Validity is more commonly used as a possible determinant of utility
