Psychological Bulletin, 1991, Vol. 110, No. 3, 486-498
Copyright 1991 by the American Psychological Association, Inc. 0033-2909/91/$3.00

Costs and Benefits of Judgment Errors: Implications for Debiasing


Hal R. Arkes
Ohio University

Some authors questioned the ecological validity of judgmental biases demonstrated in the laboratory. One objection to these demonstrations is that evolutionary pressures would have rendered such maladaptive behaviors extinct if they had any impact in the "real world." I attempt to show that even beneficial adaptations may have costs. I extend this argument to propose three types of judgment errors--strategy-based errors, association-based errors, and psychophysically based errors--each of which is a cost of a highly adaptive system. This taxonomy of judgment behaviors is used to advance hypotheses as to which debiasing techniques are likely to succeed in each category.

I am grateful to Bruce Carlson and Robyn Dawes for their helpful suggestions on an earlier draft of this article. Daniel Kahneman and two anonymous reviewers also provided very constructive comments.

Correspondence concerning this article should be addressed to Hal R. Arkes, Department of Psychology, Ohio University, Athens, Ohio 45701.

During the last two decades, cognitive psychologists documented many types of judgment and decision-making errors. Spurred in good measure by the work of Tversky and Kahneman (1974), the research area has come to be known as "judgment under uncertainty." The relatively poor performance of subjects in many of these judgment experiments has caused some researchers to question the ecological validity of such studies (e.g., Berkeley & Humphreys, 1982; Edwards, 1983; Funder, 1987; Hogarth, 1981; Phillips, 1983). The reasoning seems to be that because subjects' performance is so poor in these experiments, it may not be representative of their behavior in more naturalistic environments in which people seem quite competent.

One purpose of the present article is to argue that even successful adaptations can have costs. This position is common in biology. Its application to the areas of judgment and decision making will, I hope, help explain why particular judgment behaviors persist despite their obvious drawbacks in some situations. My goal is to show that the costs of otherwise beneficial cognitive adaptations are the consequence of appropriate responses to environmental demands.

A second goal of this article is to propose a taxonomy of judgment behaviors based on the nature of the adaptational costs. The use of this taxonomy may help suggest what type of debiasing techniques may be effective in each category of judgment behavior.

Costs and Benefits From an Evolutionary Perspective

To examine maladaptive judgment behaviors, we need to consider what makes any characteristic adaptive. Viewed from an evolutionary perspective, adaptive behaviors contribute to reproductive success, although the route from good judgment performance to reproductive success may be quite indirect. Because it is not obvious how serious judgment errors could possibly be adaptive, there is reason for psychologists to assume that these maladaptive behaviors exist mainly in artificial laboratory environments and not in naturalistic ecologies.

However, Archer (1988) pointed out that successful adaptations have costs as well as benefits. Consider first a physiological example. In dangerous situations, the body mobilizes for a fight or flight response. This is highly adaptive. However, prolonged stress will result in serious physical deterioration. This is a maladaptive long-term consequence of a response that is generally beneficial. Hence, stomach ulceration is not grounds for deeming the alarm reaction to be maladaptive.

A phylogenetic example is the emergence of upright gait. Although upright gait has resulted in epidemic levels of lower back pain in humans, it has resulted in substantial adaptive consequences (e.g., freeing the hands for tool use). The benefit outweighs the cost.

A psychological example is provided by the costs and benefits of expertise (Arkes & Freedman, 1984; Arkes & Harkness, 1980). Experts have substantial background knowledge that they can draw on to instantiate the missing slots of incomplete schemata. They subsequently demonstrate a tendency to recall the instantiated information as having been presented when in fact it was not. For example, Arkes and Harkness (1980) showed that speech therapy students who made a diagnosis of Down's syndrome tended to remember having seen the symptom "fissured tongue." However, this common symptom of Down's syndrome had not been presented in the actual list of symptoms. Because nonexperts have less background knowledge, they are less likely to make this type of error. Thus, even though we all agree that expertise is beneficial, it does have its costs. In an analogous way, the presence of widespread, maladaptive judgment strategies is not necessarily contrary to the principles of evolution. (See also Einhorn & Hogarth, 1981, p. 58.)

I divide the judgment and decision-making errors documented in the literature into three broad categories. Each category of errors is a cost of an otherwise adaptive system. First, I present a brief overview before describing each category in more detail.

Strategy-based errors occur when subjects use a suboptimal strategy; the extra effort required to use a more sophisticated strategy is a cost that often outweighs the potential benefit of
enhanced accuracy. Hence, decision makers remain satisfied with the suboptimal strategy in low-stakes situations.

Association-based errors are costs of an otherwise highly adaptive system of associations within semantic memory. The automaticity of such associations, generally of enormous benefit, becomes a cost when judgmentally irrelevant or counterproductive semantic associations are brought to bear on the decision or judgment.

Psychophysically based errors result from the nonlinear mapping of physical stimuli onto psychological responses. Such errors represent costs incurred in less frequent stimulus ranges where very high and very low stimulus magnitudes are located. These costs are more than offset by sensitivity gains in the more frequent stimulus ranges located in the central portion of the stimulus spectrum.

I now present a more detailed description of each of these three categories of judgment errors.

Three Types of Judgment Errors

Strategy-Based Judgment Errors

Despite the fact that poor judgment performance has been demonstrated in a large number of situations (Kahneman, Slovic, & Tversky, 1982), evidence exists that some suboptimal behaviors may be adaptive in a larger sense. Suppose a person adopts a quick and dirty strategy to solve a problem. Because it is quick, it is easy to execute. This is a benefit. Because it is dirty, it results in more errors than a more meticulous strategy. This is a cost. Although the choice of this strategy may result in fewer correct answers compared with the other strategy, this cost may be outweighed by the benefit of time and effort saved.

Thorngate (1980) and Johnson and Payne (1985) compared the performance of various decision strategies with regard to their ability to select alternatives with the highest expected value. Some strategies were quite rudimental. For example, some completely ignored the probability of an outcome and only considered the average payoff for each possible choice. It was found that many of the 10 decision strategies performed well. Even the primitive ones selected alternatives with the highest expected value under some circumstances and almost never selected alternatives with the lowest. The use of such elementary strategies would be drastically less taxing to the human information-processing system than would the use of more complete but complicated ones. Hence, if a suboptimal strategy were to be used, the large savings in cognitive effort might far outweigh the small loss in potential outcomes. This point was stressed a number of years ago by Beach and Mitchell (1978).

When subjects know that the stakes are high, they often can change from a suboptimal strategy to a better one. It is worth it for them to do so. For example, Harkness, DeBono, and Borgida (1985) asked undergraduates to perform a covariation estimation task. Female subjects examined data that described other women whom Tom did or did not want to date and whether each of these women possessed a particular characteristic. The presented data could be summarized as entries into a 2 × 2 matrix, such as the one presented in Figure 1. For example, the rows of the matrix might be labeled "The woman has a good sense of humor" and "The woman does not have a good sense of humor." The columns might be labeled "Tom does want to date this woman" and "Tom does not want to date this woman." Subjects considered these data to decide how much Tom's liking for a woman covaried with a characteristic such as her sense of humor. Using a procedure developed by Shaklee and Tucker (1980), Harkness et al. (1985) were able to determine the complexity of the strategy used by the female subjects as they performed this covariation estimation task. In some groups, the strategy used was rather elemental. This was not the case, however, if the man whose preferences were being examined was someone the female subject would be going out with for the next 3 to 5 weeks. In this group, the women used complex covariation estimation strategies significantly more often. This finding is consistent with Payne's (1982) description of contingent decision behavior. The decision behavior is contingent on such factors as the reward for high levels of accuracy. Researchers who are optimistic about human judgment and decision-making performance point out that sensitivity to such factors as incentive, task complexity (Billings & Marcus, 1983; Paquette & Kida, 1988), and time pressure (Christensen-Szalanski, 1980; Payne, Bettman, & Johnson, 1988) is highly adaptive.

                                   Tom does              Tom does not
                                   want to date          want to date
                                   this woman.           this woman.

    The woman has a good           cell a                cell b
    sense of humor.                8                     4

    The woman does not have        cell c                cell d
    a good sense of humor.         8                     4

Figure 1. Example of information presented to subjects in the study by Harkness, DeBono, and Borgida (1985). (From "Personal Involvement and Strategies for Making Contingency Judgments: A Stake in the Dating Game Makes a Difference" by A. R. Harkness, K. G. DeBono, and E. Borgida, 1985, Journal of Personality and Social Psychology, 49, p. 25. Copyright 1985 by the American Psychological Association. Adapted by permission.)

Association-Based Judgment Errors

Experiments in which semantic memory has been primed have become very common during the last 20 years. The principal result of such studies is that priming causes an activation of concepts related to the prime. A number of models, such as HAM (Anderson & Bower, 1973) and ACT* (Anderson, 1983) among many others, posited spreading activation as a fundamental characteristic of semantic memory.

Consider a study by Kubovy (1977). When subjects were asked to report "the first digit that comes to mind," only 2.2%
chose the digit 1. When subjects were asked to report the "first one-digit number that comes to mind," 18% chose 1. This result is consistent with the tenets of spreading activation. The second group of subjects is much more likely to respond with a 1 because that digit was primed by the request to report a one-digit number.

If we consider the first group to be the control group because they were asked the question in a more neutral manner, should we then consider the response of the second group to be a manifestation of bias? Have these subjects made a judgment error?

The relatively high probability of reporting the digit 1 as the first one-digit number to come to mind is a consequence of the spreading activation characteristic of semantic memory. The fact that related concepts in semantic memory influence each other through this activation is essential to normal cognitive functioning.¹ The benefits of spreading activation include some of the most fundamental cognitive tasks: stimulus generalization, inference, and transfer of training, for example. These substantial benefits of spreading activation are accompanied by a cost, which Kubovy (1977) demonstrated, namely, the inability of humans to prevent associated items from influencing their cognition even when those related items are irrelevant or counterproductive to judgmental accuracy.

An experiment by Gilovich (1981) serves as an example from the judgment literature. Newspaper sportswriters rated the potential of various hypothetical college players to become professional football players. If a college player was said to have come from the same hometown as a current professional football player, the college player was rated much higher than if he grew up in some other town. If we assume that one's hometown has little to do with one's potential as a professional football player, then we must attribute the higher rating to the fact that the mere association between the to-be-rated player and the successful professional player was responsible for the higher rating.

Another example is provided by Gregory, Cialdini, and Carpenter (1982). People who were instructed to imagine experiencing certain events subsequently rated those events as more likely to occur to them compared with subjects who did not previously imagine them. Gregory et al. (1982) explained their results in terms of availability (Tversky & Kahneman, 1973). Through the activity of imagining, items are made more available in long-term memory. As a result such items are judged to be more probable. Whereas Kubovy (1977) increased the availability of the digit 1 by mentioning it in an unobtrusive manner, Gregory et al. were able to increase the availability of scenarios by blatantly asking subjects to imagine their occurrence. In both cases the experimenters exploited the normal working of the memory system to heighten the retrieval of some item, thereby creating an "error."

Application of this same principle can result in other judgment errors, as when Tversky and Kahneman (1973) presented a list of famous women and not-so-famous men to a group of subjects. Although the list contained more men than women, the subjects erroneously claimed that the list contained more women. The notoriety of the women heightened their availability in memory and, thus, their retrievability. The many judgment errors that have been demonstrated in this manner are certainly not adaptive if what we mean by "adaptive" implies correspondence to reality (e.g., the actual number of men and women in the list). However, I believe that such errors are a consequence of the normal operation of long-term memory: Priming content with related items or by asking the person to perform a cognitive activity will result in heightened retrievability, which can result in nonveridical estimates of frequency and probability.

Again, such errors are a cost of a memory system whose principles of association and retrieval produce benefits far in excess of these costs.² I very briefly enumerate several such errors.

Explanation bias. Ross, Lepper, Strack, and Steinmetz (1977) asked subjects to read a short scenario about a person and then explain why this individual might have eventually done some specified behavior, such as contributing money to the Peace Corps or committing suicide. Subjects were assured that the to-be-explained outcome was entirely hypothetical, because it was not known what actually happened to this individual. Participants subsequently rated the probability that the person actually did each of several behaviors. Ross et al. (1977) found that subjects assigned higher probabilities to the outcome that they had explained. Making an option more available can make the option seem more probable.

Hindsight bias. A judgment error closely related to availability is the hindsight bias (Fischhoff, 1975). In hindsight we tend to exaggerate the likelihood that we would have been able to predict the event beforehand. Of course, the event that did occur and its possible causes are far more available than events that never occurred. For example, after the election has taken place, subjects say that they had assigned higher probability to the winner's prospects than they actually had assigned before the election occurred (Powell, 1988).

Ignoring P(D|~H). Consider a farmer who wishes to determine if there is a relation between cloud seeding and rain. Available evidence includes the entries in a 2 × 2 table, the rows of which are "seeding" and "no seeding" and the columns of which are "rain" and "no rain." Of course, the farmer needs to examine the numbers of entries in each cell to arrive at a correct conclusion. However, many investigators found that subjects who are trying to determine the relation between cloud seeding and rain often do not consider as relevant the evidence in the "no-seeding" row (e.g., Arkes & Harkness, 1983). Similarly, students are usually astonished to learn that to determine the relation between a symptom and a disease one needs to collect data on the likelihood of a symptom when the disease is not present. Fischhoff and Beyth-Marom (1983) deemed to be a "meta-bias" the general tendency to ignore data when the hypothesis is not true or when the possible antecedent cause is not present.

¹ Ratcliff and McKoon (1981) questioned the validity of spreading activation theories of semantic memory. They did not question the validity of the findings that have spawned such theories, however. I believe the judgment errors I attribute to associative mechanisms could be explained by either the spreading activation or compound cue (Ratcliff & McKoon, 1981) theories.

² Tversky and Kahneman (1974) also noted that heuristics such as availability have benefits as well as costs.
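
The farmer's 2 × 2 table can be made concrete with a small numerical illustration. The following is a minimal Python sketch, not a procedure taken from any of the studies cited; the function name `delta_p` and the cloud-seeding counts are hypothetical and chosen only for illustration. It compares the probability of rain with and without seeding (a difference often written as ΔP) and shows how the "seeding" row alone can look persuasive even when the "no seeding" row reveals no relation at all.

```python
def delta_p(a, b, c, d):
    """Difference between two conditional probabilities in a 2 x 2 table.

    a: cause present, outcome present    b: cause present, outcome absent
    c: cause absent,  outcome present    d: cause absent,  outcome absent
    """
    p_outcome_given_cause = a / (a + b)
    p_outcome_given_no_cause = c / (c + d)
    return p_outcome_given_cause - p_outcome_given_no_cause


# Hypothetical counts of days (illustrative only, not data from any study).
seeding_rain, seeding_dry = 30, 10
no_seeding_rain, no_seeding_dry = 30, 10

# Looking only at the "seeding" row suggests that seeding works.
p_rain_given_seeding = seeding_rain / (seeding_rain + seeding_dry)

# The "no seeding" row shows that rain is just as likely without seeding.
contingency = delta_p(seeding_rain, seeding_dry, no_seeding_rain, no_seeding_dry)

print(f"P(rain | seeding) = {p_rain_given_seeding:.2f}")
print(f"Delta P           = {contingency:.2f}")
```

With these hypothetical counts, rain follows seeding on 75% of days, yet the difference between the two rows is zero. Applied to the cell counts shown in Figure 1 (8, 4, 8, 4), the same function also returns zero: in that example Tom's preference does not covary with sense of humor.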

The reason why this meta-bias exists is that the hypothesized cause, but not its absence, primes relevant data. If I believe that disease D causes symptoms S, it seems obvious to ascertain the status of S when D is present. The absence of D does not prime S; as a result, many people do not believe that the status of S needs to be ascertained in such cases.

Confirmation bias. I define confirmation bias as a selective search, recollection, or assimilation of information in a way that lends spurious support to a hypothesis under consideration. Some authors put this term in quotation marks to denote that it refers to a rather loosely related group of findings. (Fischhoff & Beyth-Marom, 1983, even suggested abandoning the term because of its imprecise referent.)

Confirmation bias was demonstrated by Chapman and Chapman (1967) using the Draw-a-Person Test. This is a projective instrument in which a patient draws a picture of a person, and a clinician then examines the picture for particular cues that supposedly are associated with various types of psychopathology. Because there was negligible evidence favorable to the validity of this technique, the Chapmans thought that associations between the features of the drawings and the purported diagnosis must be entirely illusory. To test this hypothesis, drawings of people were randomly paired with personality traits presumably characteristic of the person who did the drawings. Clinicians and undergraduates who viewed these drawings perceived correlations between certain drawing features and the personality traits of the person who drew the figure. For example, subjects claimed that drawings containing large eyes were frequently done by people who were said to be suspicious. Drawings with muscular figures were frequently said to be done by men who were concerned about their manliness. The Chapmans concluded that because there was no real correlation between these drawing features and personality traits, subjects must be relying on preexisting associations in perceiving this association.

To test this hypothesis, the Chapmans performed a follow-up study. Undergraduates were asked to rate the strength of semantic association between the body parts emphasized in the various drawings and the personality traits said to be characteristic of the drawers. In this follow-up study, the subjects rated eyes as closely associated with suspiciousness, for example. This is precisely the illusory correlation detected by subjects in the first study: They had incorrectly reported that suspiciousness was characteristic of the persons who drew figures with large eyes. This is an illustration of the confirmation bias because the subjects assimilated the evidence in a biased way based on their preconceived association between eyes and suspiciousness. This study is quite similar to the one by Gilovich (1981) in that a prior association results in an inappropriate consideration of the evidence. In this case, the inappropriate consideration serves to bolster the prior association.

Pseudodiagnosticity. Bayes's theorem may be expressed in the following way:

$$P(H_1 \mid D_i) = \frac{P(H_1)\,P(D_i \mid H_1)}{P(H_1)\,P(D_i \mid H_1) + P(H_2)\,P(D_i \mid H_2)},$$

where H and D signify the hypotheses and data, respectively, and the subscript i indexes a set of data. Assume that the two hypotheses are mutually exclusive and exhaustive.

Suppose subjects are given the choice of examining one pair of data to determine $P(H_1)$. They may choose to learn $P(D_1 \mid H_1)$ and $P(D_1 \mid H_2)$; they may choose $P(D_2 \mid H_1)$ and $P(D_2 \mid H_2)$; or they may choose $P(D_1 \mid H_1)$ and $P(D_2 \mid H_1)$. It may be seen by examining Bayes's theorem that to infer the probability of $H_1$, the choice of either of the first two pairs would be helpful. Choosing the last pair will generally not provide diagnostic information. However, the members of the last pair, $P(D_1 \mid H_1)$ and $P(D_2 \mid H_1)$, are both strongly cued when $H_1$ is considered. This may be why these two nondiagnostic data are selected by so many subjects in a study by Doherty, Mynatt, Tweney, and Schiavo (1979). These investigators termed this nonoptimal judgment behavior pseudodiagnosticity.

Overconfidence. One of the most robust findings in the judgment and decision-making literature is overconfidence (Lichtenstein, Fischhoff, & Phillips, 1982). Koriat, Lichtenstein, and Fischhoff (1980) suggested that a primary reason for unwarranted confidence is that subjects can generate supporting reasons for their decisions much more readily than contradictory ones. The supporting reasons are more strongly cued. For example, suppose I am asked whether Oslo or Leningrad is further north, and I answer "Oslo." Now I am asked to assign a confidence level to my answer. To complete this task, I search my semantic memory for the information that made Oslo seem like the correct answer. Items pertaining to Oslo's nearby glaciers and fjords are much more strongly cued than information concerning Oslo's summer warmth. The evidence I am most likely to retrieve thus is an unrepresentative sample of all available evidence, and my confidence is thereby inappropriately inflated. Because of the increase in the confidence with which an opinion is held, the process leading to overconfidence is related to the confirmation bias.

Representativeness. To appreciate the nature of this heuristic, it may be instructive first to consider the term overgeneralization (Slobin, 1971). We admire the intelligence of the child who generalizes the past tense verb ending "ed" to the unfamiliar verb "revel," thereby making "reveled." We think less highly of the child who generalizes the same past tense ending to the verb "do," thereby making "doed." We call the latter behavior overgeneralization, even though it seems to be a manifestation of the same very fundamental principle we call generalization. Of course, those of us who are aware of the existence of irregular verbs can be arrogant with children about what constitutes overgeneralization of an inferential strategy outside its domain of appropriate application. Overgeneralization is a judgment error, but again I think this is a small cost of an otherwise adaptive associationistic system.

The representativeness heuristic (Tversky & Kahneman, 1974) provides an example of such overgeneralization. This heuristic refers to the fact that people often judge probabilities on the basis of similarity, or representativeness. For example, in judging whether Instance A belongs to Class B, people often rely on the extent to which A seems representative of B. Of course, the probability that A belongs to B can be influenced by many factors that have no bearing on representativeness. For example, basing one's decision solely on representativeness will result in the underutilization of base rates (Kahneman & Tversky, 1973), thereby resulting in errors.
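
The roles that the second hypothesis and the base rate play in Bayes's theorem can be illustrated numerically. The following is a minimal Python sketch under the assumption, stated in the text, of two mutually exclusive and exhaustive hypotheses. The function name `posterior_h1` and every probability below are hypothetical values chosen for illustration; none of them come from Doherty et al. (1979) or Kahneman and Tversky (1973).

```python
def posterior_h1(prior_h1, p_d_given_h1, p_d_given_h2):
    """P(H1 | D) for two mutually exclusive, exhaustive hypotheses."""
    prior_h2 = 1.0 - prior_h1
    numerator = prior_h1 * p_d_given_h1
    return numerator / (numerator + prior_h2 * p_d_given_h2)


# Hypothetical values: the datum is very likely under H1 in every case.
p_d_given_h1 = 0.9

# Pseudodiagnosticity: P(D | H1) alone does not fix the posterior.
# The answer depends on how likely the same datum is under H2.
diagnostic = posterior_h1(0.5, p_d_given_h1, 0.1)      # datum rare under H2
nondiagnostic = posterior_h1(0.5, p_d_given_h1, 0.9)   # datum equally likely under H2

# Base rates: the same diagnostic datum with a low prior for H1.
low_base_rate = posterior_h1(0.01, p_d_given_h1, 0.1)

print(f"diagnostic datum, even prior:     {diagnostic:.2f}")
print(f"nondiagnostic datum, even prior:  {nondiagnostic:.2f}")
print(f"diagnostic datum, prior of .01:   {low_base_rate:.2f}")
```

With P(D|H1) held at .9, the posterior for H1 is about .90 when the datum is rare under H2 and stays at the prior of .50 when the datum is equally likely under H2. This is why a pair of likelihoods that are both conditioned on H1 generally cannot settle the question. The last line shows the base-rate point: with a prior of .01, the same diagnostic datum yields a posterior of only about .08.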

Theories of category classification as old as that of Hull (1920) are based on the principle that decisions concerning the category membership of an exemplar are based on the degree of similarity between the exemplar and the category. More recent models, such as the feature-comparison model of Smith, Shoben, and Rips (1974), share this assumption. For example, to the extent cardinal shares features with the category clergyman, it is likely to be deemed a member of that category. If cardinal shares fewer features with the category bird than the category clergyman, it is deemed likely to belong to the latter category even though there are many more birds than clergymen in the world. This feature-matching process ignores base rates of the two categories; hence, it is prone to error.

Judgments of similarity follow one of the most fundamental principles of cognition: stimulus generalization. It is highly adaptive that we associate items to other items with which they are related. Even a task as basic as classic conditioning requires this. The fact that cardinal is more closely associated with clergyman than bird may bode very poorly for our consideration of base rates. However, I consider this cost to be an overgeneralization of a process that serves us very well in other contexts. Thus, I suggest that the manifestation of the representativeness heuristic is another example of a cost of an otherwise highly adaptive associationistic system.

Psychophysically Based Errors

From psychophysical power functions (Stevens, 1957) to prospect theory's S-shaped curve (Kahneman & Tversky, 1979), from the original Weber-Fechner log function to economists' law of diminishing returns, many theorists postulated an asymptotic curve relating external stimuli (e.g., mass, cash, light intensity) and the psychological responses to those stimuli. Figure 2 depicts the value function of prospect theory (Kahneman & Tversky, 1979), which represents one such nonlinear curve.

[Figure 2 plot: vertical axis labeled VALUE; horizontal axis running from LOSSES to GAINS.]

Figure 2. The value function of prospect theory (Kahneman & Tversky, 1979). (See text for discussion. From "Prospect Theory: An Analysis of Decision Under Risk" by D. Kahneman and A. Tversky, 1979, Econometrica, 47, p. 279. Copyright 1979 by Basil Blackwell Ltd. Adapted by permission.)

A system that translated physical intensity in a linear manner onto psychological response would impose an immense cost on any transduction system. Extreme stimuli, which occur relatively infrequently, would have to be coded with as great a level of discriminability as the more frequent middle-range stimuli. Any nonlinear system with an asymptote at the extreme end (or ends) would have the benefit of eliminating the structures and processes needed to discriminate small changes in rare events, such as very intense sounds or extremely heavy weights. Of course, sacrificing discriminability at the ends of the continuum has a cost.

An experiment by Dinnerstein (1965) illustrates this point. Dinnerstein presented subjects with a series of weights and found that subjects' ability to discriminate was finest at the center of the range of stimuli. Then a weight was introduced that was either above or below all the others. This caused the region of maximal discriminability to either rise or drop depending on whether the new weight was heavy or light. This result occurred even though the new weight was not included in the range of stimuli to be rated. This study demonstrates the adaptation of the nervous system to the available stimulus array, an adaptation that allows the perceiver to extract the optimal amount of useful information from each situation. Of course, extracting the optimal amount of useful information is highly adaptive even though diminished sensitivity at the extremes is a cost.

Several judgment errors may be a manifestation of this particular cost. I briefly enumerate several.

Sunk cost effect. Economic decisions should be made based on the anticipated costs and benefits that will result from the choice of each of the alternative courses of action. Note that future costs and benefits are relevant; prior (sunk) costs are not. A judgment error occurs when sunk costs are used as a basis for decision making.

The sunk cost effect is manifested in a willingness to continue spending after an investment of time, effort, or money has already been made (Arkes & Blumer, 1985). Persons who have already invested substantial amounts and who have not yet realized compensatory returns are at Point B in Figure 2. Persons in that situation are not very sensitive to further losses; a small subsequent expenditure of funds will therefore cause negligible psychological disutility. Hence, such persons are willing to "throw good money after bad" in a desperate attempt to recoup their sunk cost, even though such behavior may be irrational. If a particular project is a poor idea, the fact that it has already wasted a lot of money does not make it a better idea. Yet the sunk cost effect has been shown to be powerful (Arkes & Blumer, 1985).

Psychophysics of spending. Once a person has decided to purchase a new car, for example, he or she is located in the
asymptotic region of a curve describing the psychophysics of spending. Persons in this situation would be more willing to pay $235 for a car radio compared with their willingness to buy a radio for $235 if they had not purchased a car. We have good discriminability (i.e., the curve is steep) in the region of a few hundred dollars on either side of our current state. Once we are several thousand dollars away from our current state, discriminability drops (i.e., the curve flattens), and we no longer object to extravagant additional expenditures (Christensen, 1989).

Reflection effect. Tversky and Kahneman (1981, p. 453) demonstrated how framing the outcomes of the same gamble as losses or as gains can lead to different decisions. They referred to this phenomenon as the reflection effect, which is illustrated in their following well-known example (p. 453):

    Imagine that the U.S. is preparing for the outbreak of an unusual Asian disease, which is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimates of the consequences of the programs are as follows:

    If Program X is adopted, 200 people will be saved.

    If Program Y is adopted, there is a one-third probability that 600 people will be saved and a two-thirds probability that no people will be saved.

The benefit of saving 200 lives is located at Point X in Figure 2. The benefit of saving 600 lives is located at Point Y. Program Y represents a relatively small gain in value over Program X. Two hundred lives saved is so great a benefit that the additional lives that might be saved under Program Y are too small an additional benefit to warrant the risk of saving no one. Hence, about three fourths of the subjects chose Program X.

Other subjects were asked to consider two other programs:

    If Program B is adopted, 400 people will die.

    If Program A is adopted, there is a one-third probability that nobody will die and a two-thirds probability that 600 people will die.

The loss of 400 lives is located at Point B in Figure 2. The loss of 600 lives is located at Point A. Because the loss of 400 lives is so terrible, the loss of 200 additional lives represents only a small additional loss in value. Hence, about three fourths of the subjects chose Program A. The potential extra loss in value by choosing that program was more than offset by the chance of saving everyone.

It is easy to see that Program X, which is generally endorsed, is the same as Program B, which is generally shunned. If the value function were strictly linear, this inconsistency would not occur. By asking some subjects to consider the problem in terms of lives gained, Tversky and Kahneman (1981) exploited the small superiority of Y over X, a superiority which is not sufficient to warrant additional risk. By asking other subjects to consider the problem in terms of lives lost, Tversky and Kahneman (1981) exploited the small inferiority of A over B, an inferiority small enough to warrant additional risk. The wording or framing of the problem directs subjects to different portions of the nonlinear curve.

Anchoring. The terms anchoring and anchoring and adjustment were popularized in the judgment and decision-making literature by Tversky and Kahneman (1974), although they were discussed earlier (e.g., Slovic & Lichtenstein, 1971).

First, let us consider the psychophysical research pertaining to "induction illusions" or "context effects." We know from many experiments on adaptation level theory (Helson, 1964) that when a medium-size circle is placed in a group of much larger ones, the medium one is perceived as small. This constellation of circles induces an adaptation level that is approximately at the mean value of the circles' areas, and the medium circle has an area that is below this mean. When this same medium-sized circle is placed in a context of much smaller circles, it is perceived as large. Now its area is above the adaptation level. The shift in the adaptation level is consistent with the principle discussed previously by Dinnerstein (1965): It is best to have the adaptation level change location to locate maximal discriminability near the center of the stimulus continuum.

Note that this adaptation is congruent with the relation between the physical and psychological dimensions depicted in Figure 2. Point O is the current state, which is in the area of maximal discriminability. The asymptotes are in areas of diminished discriminability.

Consider an analogous judgment experiment (Sutherland, Dunn, & Boyd, 1983). Sixty-four hospitalized patients rated five different health states using three different methods. In the first method, subjects assigned values to these five health states on a scale anchored by perfect health and death. In the second method, perfect health was replaced on the high end of the scale by a health state each rater had rated less desirable. Thus, the high end of the scale was no longer quite as high. In the third method, death was replaced on the low end of the scale by a health state each rater had rated more desirable. Thus, the low end of the scale was no longer quite as low. Relative to the values assigned to the various health states using the first method, the values assigned to the very same health states using the second method were lower, and those assigned using the third method were higher. This study showed that a patient's rating of possible health states was strongly influenced by the context in which such stimuli were considered. Such context-dependent effects appear to induce inconsistent ratings just as the medium-sized circle was rated differently depending on the size of the circles with which it could be compared. However, this is a small cost of an otherwise beneficial adaptation designed to extract the optimal amount of useful information out of each situation.

Figure 3. Stimulus used in study by Restle (1971).

Another group of judgment studies is more closely related to a context phenomenon demonstrated by Restle (1971). Subjects were presented with a drawing like that in Figure 3. Restle
varied the length of the horizontal test line (H), the length of the vertical center line (C), which crossed the test line, and the length of the identical vertical end lines (E). It was expected that judgments of the length of H should decrease as C or E increased. It was easy for Restle to determine the influence of E and C on H by ascertaining the slope of the function relating E to H and C to H.

Restle (1971) presented subjects with one of two possible sets of instructions. One group was told, "Pay attention to the vertical lines at the ends of the test line, and use them as a frame of reference to help you in your judgments." These subjects were also warned, "Try to disregard the center vertical line." Other subjects were told just the opposite: They were to use the center vertical line as a frame of reference and to ignore the end lines. The result was that the line that subjects were told to attend to was much more influential on the subjects' judgment of the test stimulus than was the line they were told to ignore.

This study differs from the study involving the circles in that instructions are used to direct the subject's attention to the reference point, which serves as the context for the ensuing judgment. Flexibility of frames of reference, which introductory psychology students first appreciate when they view a Necker cube, is essential to recognize the same object in different contexts. However, this immense benefit has a cost, and many anchoring and adjustment studies illustrate this cost.

In the most famous such study, Tversky and Kahneman (1974) asked subjects to estimate the percentage of African countries in the United Nations. Subjects spun a "rigged" spinner, which landed on either 10% or 65% as a starting point. Subjects were then asked to adjust the starting number to the level they thought was appropriate to answer the question correctly. The median estimate for those who started with 10% was 25%, whereas the median estimate for those who started with 65% was 45%.

By directing the subject's attention to a starting point or anchor, Tversky and Kahneman (1974) did something analogous to what Restle (1971) asked his subjects to do. When 10% is presented as the anchor, a context is induced that contains low numbers. Adjustments upward move toward the area of maximal discriminability in the central region of the spectrum. Because such adjustments in this region are perceived to be quite significant, subjects often refrain from making them as large as would be warranted. This results in the insufficient adjustment observed by Tversky and Kahneman (1974). Of course, the opposite result occurs when the subject's attention is drawn to the large anchor at the beginning of the experiment.

Many other studies illustrate the influence of the anchor in analogous judgment situations (e.g., Northcraft & Neale, 1987). It is true that different anchors, together with the subsequent insufficient adjustment, result in different final estimates. I suggest that this "irrationality" is a worthwhile cost to achieve context-dependent judgment behavior.

Multiple Causes

To this point I have identified various biases, for example, the hindsight bias, as belonging to one of the three categories of judgment errors. However, some phenomena we term biases may have more than one cause. Hence, it would not be appropriate to categorize the bias as belonging to a category of judgment error. Instead, the various causes of the bias may be categorized according to the taxonomy presented previously.

Perhaps the best example of this situation is the conjunction fallacy. Tversky and Kahneman (1983) suggested that the representativeness heuristic is one basis for this fallacy. However, they also pointed out in an earlier article (Tversky & Kahneman, 1974) that the anchoring and adjustment heuristic may play a role in the manifestation of this fallacy in some instances (e.g., Bar-Hillel, 1973). Tversky and Kahneman (1983, p. 312) contended that speakers' conformity with Gricean (1975) conversational rules could hinder appreciation of the probabilistic law relevant to the consideration of conjunctions. Worse yet, Yates and Carlson (1986) suggested that individual subjects may use multiple procedures in arriving at their answers on different conjunction problems depending on the presence of various factors. Thus, it would be a mistake to place the conjunction fallacy itself into only one of the categories of judgment errors. It would be proper, however, to place each of the various causes of the fallacy into one of the categories. Thus, the taxonomy does not divide the judgment errors into mutually exclusive categories. I suggest that the causes can be so divided. Whether the categories are exhaustive with regard to the causes of judgment errors remains to be determined.

To return to the example of the conjunction fallacy, anchoring and adjustment is a psychophysically based error. Representativeness is an association-based error as is the overgeneralization of Gricean principles to probability estimates. If the environment contains cues that foster one of these "incriminating" behaviors, then the fallacy will occur.

Debiasing

Strategy-Based Errors

Bias may not be an appropriate term to use to describe suboptimal behaviors in this category, and thus debiasing would not be an appropriate term to use to describe the adoption of strategies that result in higher accuracy levels. Suboptimal behaviors occur in this category because the effort or cost of a more diligent judgment performance is greater than the anticipated benefit. The way to improve judgment within this category is to raise the cost of using the suboptimal judgment strategy. Typically this results in the judge's utilization of the currently available data in a much more thorough way, an obviously superior strategy.

Consider first the study by Harkness et al. (1985) alluded to earlier and depicted in Figure 1. The investigators identified four covariation strategies. The first, the Cell A strategy, consists of noting how many times Tom liked a woman who had a good sense of humor. The covariation between Tom's liking for the woman and her sense of humor is based on the number of times he wanted to date such a person. The second strategy, A minus B, consists of comparing Cells A and B. To the extent A exceeds B, Tom is judged to like women with a sense of humor. Note that these two strategies do not use Cells C and D. The third strategy, sum of diagonals, compares the sum of A and D
with that of B and C. To the extent the former sum exceeds the latter, Tom is judged to like women with a sense of humor. The final strategy, conditional probability, is the normative assessment of covariation. These final two strategies use the data in all four cells. Harkness et al. found that 6 of the 11 women who were given information about Tom but who would not be dating him used one of the two primitive strategies. None of the 11 women who thought they would be going out with Tom used these simple strategies; they all used one of the two sophisticated covariation estimation strategies. If a subject in a low-stakes judgment environment were using only a subset of the available data, it would be obvious how to improve one's judgment should the stakes increase: Use more data.

An analogous finding is exemplified in a study by Petty and Cacioppo (1984). Undergraduates were exposed to three or nine arguments that were all of either high or low quality. The arguments related to an issue that would be of importance to a group of undergraduates: The institution of a new policy in one year under which all students had to pass a comprehensive exam in their major field to graduate from the university. Needless to say, this was the "high-involvement" group. The "low-involvement" subjects also evaluated either three or nine strong or weak arguments, but the issue concerned the adoption of this new policy at a time long after this group of students would graduate. Petty and Cacioppo found that the low-involvement subjects were more persuaded by nine arguments than by three arguments. The strength of the argument was not a significant factor. For the high-involvement subjects, the strength of the arguments rather than their mere number was significant. If subjects are not concerned with a proposition, merely counting the arguments in support of it might be sufficient. If the stakes are raised, an obvious strategy is available the benefits of which are substantial: Consider the merits of the arguments.

In both the Harkness et al. (1985) and the Petty and Cacioppo (1984) studies, the presence of higher stakes resulted in less superficial treatment of the data available as the basis of a decision. Tetlock and Kim (1987) found that the same end could be accomplished through slightly different means. All subjects were presented with the responses of an actual test taker to the first 16 items of a personality test. Based on these responses, subjects were first asked to write a brief personality sketch of the test taker. Subjects were then asked to predict how the test taker would answer 16 new items. One group of subjects was told beforehand that they would be interviewed by the experimenter to learn how the subjects went about making their predictions. These accountability subjects wrote more complex personality sketches, were more accurate in their predictions of how the test takers would answer the next 16 items, and expressed less overconfidence in their predictions. Knowing that they would be held accountable for their predictions raised the stakes for the subjects, which caused them to interact with the stimulus materials in a less superficial way.

Cursory interactions with currently available data cause strategy-based errors, and incentive promotes the adoption of a more thorough strategy.

Association-Based Errors

The influence of incentives in eliminating association-based errors is negligible, as illustrated in an experiment by Fischhoff, Slovic, and Lichtenstein (1977). In this study, subjects assigned confidence levels to their answers to two-option questions, such as "Aden was occupied in 1839 by the (a) British or (b) French?" If the analysis of this situation by Koriat et al. (1980) is correct, subjects search for reasons to support their answer. This search instills a high (and inappropriate) level of confidence. Fischhoff et al. (1977, Experiment 4) wanted to find out how intransigent this overconfidence was. Subjects were asked to wager actual money based on the confidence levels they had assigned to their answers. About 93% of the subjects agreed to wager in a game that would have been biased in their favor if their confidence levels had been appropriate for their level of accuracy. I assume that the prospect of winning or losing substantial amounts of cash based on their stated confidence levels would cause subjects to scrutinize the basis of these stated levels. This additional, highly motivated scrutiny apparently led the vast majority of subjects to conclude that their stated confidence levels were justified. Nevertheless, 36 of the 39 subjects would have lost money in this game, because their high levels of confidence were not justified.

Incentives are not effective in debiasing association-based errors because motivated subjects will merely perform the suboptimal behavior with more enthusiasm. An even more assiduous search for confirmatory evidence will not lower one's overconfidence to an appropriate confidence level.³

³ Thaler (1986), among others, also noted that increasing the incentive for rational behavior does not always result in heightened rationality. This presents a problem for economists who hope that the irrationalities documented by psychologists in questionnaire studies will disappear when financial incentives for rational behavior are introduced.

Fischhoff (1975) and others tried a direct approach to debias the hindsight effect: Tell the subjects about the bias and then warn them not to succumb to it. If the mechanism responsible for the hindsight bias is the memorial priming of an outcome by its actual occurrence, then exhortations to prevent this priming will generally not be effective because the priming of associations between related items probably occurs automatically (Neely, in press; Ratcliff & McKoon, 1981). That is, priming is unconscious and occurs with negligible capacity usage (Posner & Snyder, 1975). It would be difficult for subjects to abort a cognitive process that occurs outside of their awareness. "Please prevent associated items from influencing your thinking" would be a curious entreaty unlikely to accomplish much debiasing.

There is a long history of research in cognitive psychology that demonstrates that the occurrence of automatic processes can be maladaptive. The most commonly cited example is the Stroop effect (Stroop, 1935). Subjects are shown words and are asked to name as quickly as possible the color of the ink in which the word is printed. When the word itself is the name of some color, such as red, and the ink is a different color, subjects experience difficulty in suppressing the tendency to announce red rather than the color of the ink. The activation of a word in semantic memory is "too automatic" for the subject to perform the Stroop task with facility.

In an analogous way, association-based judgment errors are a
small cost of an otherwise adaptive association-based semantic memory system. These errors occur when items semantically related to the judgment influence it even when their influence is not conducive to increased accuracy. To diminish an association-based judgment error, neither the introduction of incentives nor entreaties to perform well will necessarily cause subjects to shift to a new judgment behavior. Instead, it will be more helpful to instruct the subjects in the use of a behavior that will add or alter associations.

Instructions to perform a debiasing behavior. On the basis of earlier research by Slovic and Fischhoff (1977) and by Koriat et al. (1980), Arkes, Faust, Guilmette, and Hart (1988) presented neuropsychologists with a small case history and then asked them to state the probability that each of three possible diagnoses was correct. The estimates of these subjects comprised the foresight estimates. Other neuropsychologists were told that one of the diagnoses was correct and that they should estimate the probability they would have assigned to the three diagnoses if they did not know which one was correct. These hindsight subjects exhibited a bias by assigning a higher probability level to the "correct" diagnosis than did the foresight subjects. However, hindsight subjects who had to state one reason supporting each of the diagnoses before making their probability estimates manifested no hindsight bias.

The behavior of considering evidence supportive of an outcome that did not occur is unlikely to be performed by subjects--whatever their motivation--unless they are asked to do so. The consequence of performing this behavior is lowering the inappropriate confidence one has in the accuracy of one's responses and reducing the magnitude of the hindsight effect. Koriat et al. (1980) found that this technique was effective in reducing the overconfidence people generally have in their answers to general knowledge questions, and Hoch (1985) found the same technique was able to lower overconfidence in forecasts made by business students.

Note that this "consider the opposite" strategy (Lord, Lepper, & Preston, 1984) attempts to debias by priming stimuli other than the ones that would normally be accessed. Once this priming occurs, new causal skids are greased. The consequent influence of these new factors will occur according to the same mechanisms that led to the bias (e.g., hindsight, confirmation, overconfidence) in the first place. If the occurring event cued its own causal chains, then considering the nonoccurring event ought to accomplish the analogous result, thereby reducing the bias.

Another type of debiasing has been effective against overconfidence, a bias that I have postulated is a consequence of cuing mainly supportive evidence. Murphy and Winkler (1974) found that weather forecasters have outstanding accuracy-confidence calibration. For example, there is rain on 90% of the days on which meteorologists say there is a 90% chance of rain. However, Wagenaar and Keren (1986) showed that meteorologists were very overconfident in their answers to general knowledge questions. This suggests that these professionals have not learned some general debiasing strategy like "consider the opposite" which they can then apply to domains outside their area of expertise. Instead, the absence of overconfidence for this and a very few other select groups of professionals in their area of expertise is due to the fact that they get rapid feedback on a very large number of predictions the confidence level of which is carefully recorded. Of course, daily feedback on the appropriateness of one's confidence is a debiasing technique almost never available to most people.

Cuing a debiasing behavior. Rather than instructing people in a different judgment behavior, it is possible to merely cue such a behavior. An example is provided by the research program of Nisbett, Krantz, Jepson, and Kunda (1983), who were interested in discovering independent variables that would foster the use of an appropriate statistical inference technique by college students. Tversky and Kahneman (1974) showed that many subjects did not use such inference techniques in many instances; therefore, the subjects' judgments were incorrect. Hence, the imposition on subjects of any of the effective independent variables discovered by Nisbett et al. for these tasks would constitute debiasing. For example, subjects in their third study were presented with the story of David, a high school senior who had to choose between a small liberal arts college and an Ivy League university. Several of David's friends who were attending one of the two schools provided information that seemed to favor quite strongly the liberal arts college. However, a visit by David to each school provided him with contrary information. Should David rely on the advice of his many friends (a large sample) or on his own 1-day impressions of each school (a very small sample)? Other subjects were given the same scenario with the addition of a paragraph that made them "explicitly aware of the role of chance in determining the impression one may get from a small sample" (Nisbett et al., 1983, p. 353). Namely, David drew up a list for each school of the classes and activities that might interest him during his visit there, and then he blindly dropped a pencil on the list, choosing to do those things on the list where the pencil point landed. These authors found that if the chance factors that influenced David's personal evidence base were made salient in this way, subjects would be more likely to answer questions about the scenario in a probabilistic manner (i.e., rely on the large sample provided by many friends) than if the chance factors were not made salient. Such hints, rather than blatant instruction, can provide routes to a debiasing behavior in some problems.

Confidence as a second-order judgment. If I estimate the potential of college football players, the proportion of men in a list of people, or the merit of program X in combating an Asian disease, I would be performing a first-order judgment task. If I am called on to express my confidence in any of those judgments, then I am performing a second-order judgment task. By stating my confidence, I am rendering a judgment about my first-order judgment.

Debiasing techniques aimed at the first-order judgment can also have salutary effects on overconfidence. For example, Tetlock and Kim (1987) found that subjects who knew that they would be held accountable for their predictions and thus were in a high-stakes situation wrote complex personality sketches of the person whose test they were reviewing. They also made more accurate predictions concerning these test takers than did subjects in a low-stakes situation. This study was used to illustrate the fact that incentives can improve strategy-based judgments. However, Tetlock and Kim found that the accountability group also was less overconfident in their judgments than the control group. I hypothesize that this result was because accountability subjects more thoroughly studied all available evidence compared with the control group. As a consequence, the usual finding--overconfidence--was ameliorated. Subjects typically have to be instructed to consider evidence that is contrary to their decision to bring their confidence down to reasonable levels (Koriat et al., 1980). However, because confidence is a second-order judgment, attempts at improving the first-order judgment may also have a beneficial effect in debiasing overconfidence. (See also Arkes, Christensen, Lai, & Blumer, 1987, Experiment 2).

Psychophysically Based Errors

Techniques effective in debiasing psychophysically based errors are much different than those effective in debiasing association-based errors. However, we must first consider what constitutes bias.

Suppose that I am in line at the betting window before the last race of the day at a race track. The man in front of me bemoans his terrible luck on the prior 11 races and decides to put all his remaining funds ($50) on a long shot on the last race. Because he lost all 11 of the prior races, we assume that he was at point B in Figure 2. The loss of $50 would not represent a significant decrease in utility. The gain of several thousand would represent an enormous increase. Calculations based on the curve in Figure 2 might indicate that his behavior was "rational"; he was maximizing expected utility. If the upper portion of Figure 2 described the relation between objective light intensity and his subjective brightness judgments, it would not have been sensible for me to say, "Sir, I respectfully point out that your judgments do not increase in a linear fashion with objective intensity. Therefore, your judgment is biased, and you might do well to correct your responses." For the same reason, it would not have been sensible for me to point out that his betting behavior was biased. Given the psychophysics of his situation, his behavior followed in an unbiased manner. Because there was no bias, there was no warrant to debias.

Kahneman and Tversky (1984) pointed out that when subjects are presented with their inconsistent answers on the two versions of the Asian disease problem, many do not want to resolve the inconsistency by changing one of their answers. They can be made to realize that the inconsistency is present. However, they apparently do not consider their answers to be nonnormative or "biased." This suggests how difficult debiasing may be on psychophysically based errors.

Suppose a person's own psychophysical function is not used as a basis for making a decision. Instead, a benevolent other may wish to impose a "normative" framework, thereby changing the original person's response. An example might be an accountant who, through his or her professional training, realizes that the manifestation of the sunk cost effect will have adverse effects on the economic well-being of everyone in a company. If he or she wants to debias the sunk cost effect, what avenues are possible?

Incentives, which are effective in debiasing strategy-based errors, are ineffective in debiasing psychophysically based errors. Sunk cost reasoning has been used to justify continued funding of multibillion dollar "lemons" (Arkes & Blumer, 1985). In addition, it has been used to justify continued spending on the exceedingly expensive B-2 bomber (Staff, 1989, p. 8). If saving billions of dollars is not a sufficient monetary incentive, then we may conclude that the sunk cost effect is not particularly vulnerable to this type of debiasing.

It is not clear how changing the judgment behavior to add associations, a technique effective in debiasing association-based errors, would even be applied here. I am unaware of any such attempts.

To debias psychophysically based errors, at least four techniques may be effective, however.

First, because the curve relating objective gains and losses to subjective gains and losses cannot be changed, debiasing may occur when new gains or losses are added to those currently under consideration. This will change the location of the possible outcomes on the curve. For example, Northcraft and Neale (1986) asked subjects to consider spending more money on a project that appeared to be doomed to failure. Subjects in the control group tended to continue to spend, thereby manifesting the sunk cost effect. Other subjects were informed of the presence of opportunity costs. This term refers to the fact that money spent on the doomed project is unavailable for use on much more promising ventures. These superior investments represent lost opportunities if the funds are spent elsewhere. Revealing to subjects the presence of this huge additional cost made the choice of the sunk cost option much less attractive, and fewer subjects succumbed to the sunk cost effect. This situation can be understood by referring again to Figure 2. Consider the subjects who had not been made aware of the opportunity cost of a further small investment in a hopeless cause. Should the investment prove to be unsuccessful, the location of the subjects would shift from point B (their current position) to point A, a small loss in utility. However, those subjects who had been made aware of the opportunity cost of continued investment in a lost cause would realize that such behavior would actually result in a shift from point B to point Q. This represents a larger loss of psychological utility. Presenting subjects with information concerning opportunity costs decreased subjects' willingness to continue investing.

A second way to modify psychophysical judgment behaviors is to change the concatenation of related items. For example, Thaler (1985) asked subjects to decide whether Mr. A or Mr. B was happier. Mr. A won two lotteries, one for $50 and one for $25. Mr. B won a single lottery of $75. The large majority of subjects thought that Mr. A would be happier. Mr. A would receive two separate winnings, and because of the concavity of the value function in the region of gains, the sum of the values of the two winnings would be greater than the value of their sum, that is, v(25) + v(50) > v(25 + 50). Thaler (1985) pointed out that late-night mail-order advertisements take advantage of this principle by tossing in a tool set, knives, and other separable items to make the gain look particularly large. Adding more of the same product would merely push the potential buyer along the asymptote where value increases quite slowly. Thus, segregating versus integrating gains can cause changes in one's willingness to make purchases.
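
The inequality v(25) + v(50) > v(25 + 50) can be checked numerically. The following is a minimal sketch only: it assumes a power-law, S-shaped value function, and the function names (`value`, `segregated`, `integrated`), the exponent 0.88, and the loss-aversion coefficient 2.25 are illustrative assumptions that are not taken from Thaler (1985) or from Figure 2.

```python
# Sketch of the segregation-versus-integration comparison for gains.
# ASSUMPTION: a power-law, S-shaped value function. The exponent (0.88) and
# the loss-aversion coefficient (2.25) are illustrative choices only.

def value(x, alpha=0.88, loss_aversion=2.25):
    """Subjective value of an objective gain (x >= 0) or loss (x < 0)."""
    if x >= 0:
        return x ** alpha                      # concave in the region of gains
    return -loss_aversion * (-x) ** alpha      # convex and steeper for losses


def segregated(outcomes):
    """Total value when each outcome is evaluated separately, then summed."""
    return sum(value(x) for x in outcomes)


def integrated(outcomes):
    """Value when the outcomes are first added and evaluated as one amount."""
    return value(sum(outcomes))


if __name__ == "__main__":
    mr_a = segregated([50, 25])   # two lottery wins, valued one at a time
    mr_b = integrated([50, 25])   # a single lottery win of $75
    print(f"Mr. A, v(50) + v(25) = {mr_a:.2f}")
    print(f"Mr. B, v(75)         = {mr_b:.2f}")
    print("Segregated gains feel larger:", mr_a > mr_b)
```

Any value function that is strictly concave for gains and passes through the origin gives the same ordering; only the size of the difference depends on the assumed parameters.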

A third technique is to change one's reference point. When someone is 100,000 calories in arrears on a diet, one is at point A. Efforts to move to the right and upward on the scale will not result in much improvement in the immediate future. However, if one begins the diet anew, then one is transposed to point O. Here improvement is easier to achieve thanks to the shape of the curve in the region of the origin. Maxims like "Today is the first day of the rest of your life" use this principle. (A related economic analysis can be found in Loewenstein, 1988.)

Fourth, one can reframe losses as gains (or gains as losses) as was accomplished by Tversky and Kahneman (1981) in their Asian disease example. (Also see McNeil, Pauker, Sox, & Tversky, 1982.)

Note that these techniques do nothing to alter the shape of the psychophysical curve. Psychophysically based judgment errors occur because the relation between external stimuli and psychological responses to those stimuli is nonlinear. Because the shape of the curve depicting this relation is a given, debiasing consists of changing either the location of the options or the location of one's reference point on the curve.

Training

At least one type of debiasing lies outside the categorization scheme just presented. Its success is not due to its ability to counteract the cognitive behaviors characteristic of strategy-based, association-based, or psychophysically based judgment errors.

This type of debiasing is professional training. When examining the financial state of a company, an accountant is unlikely to fall prey to the sunk cost effect. Standard accounting procedures simply allow no place for the consideration of sunk costs. From a psychological perspective, this is not a very interesting instance of debiasing. However, it is instructive that quite specific professional training may be necessary for debiasing to be successful. Arkes and Blumer (1985) showed that taking a course or two in general economics did not inoculate students against the sunk cost effect.

Another example of this same type of debiasing is more general and therefore much more interesting. Lehman, Lempert, and Nisbett (1988) showed that graduate training can influence subjects' statistical reasoning. For example, Lehman et al. showed that the importance of control groups is more likely to be apprehended by advanced psychology graduate students than by advanced chemistry students. The superiority of the psychology students may represent a result similar to the presumed superiority of the accountants in resisting the sunk cost effect. Namely, professional training in the techniques of psychological research heightened their awareness of the importance of control groups.

Instruction in standard accounting procedures or scientific methodology represent examples of providing people with the tools designed to reach a normatively appropriate answer. Edwards and von Winterfeldt (1986) pointed out that "if the problem is important and the tools are available people will use them and thus get right answers" (p. 679). Indeed, training involves giving certain (usually self-selected) people precisely those tools needed to arrive at correct answers. The decision to be trained professionally or to seek someone who is so trained is a meta-strategy that will ameliorate some judgment errors.

Conclusion

In his excellent review of the debiasing literature, Fischhoff (1982, p. 444) suggested that clarifying and exploiting the cognitive processes underlying debiasing are major theoretical and practical tasks. The debiasing literature currently contains a desultory catalog of techniques that work, techniques that do not work, and techniques that work on some tasks but not others. The purpose of this article was to divide judgment behaviors into three broad categories based on functionalist criteria, namely the bases for their costs and benefits. With this taxonomy, it is then possible to hypothesize which variables are likely to be effective in debiasing judgment errors within each category.

Strategy-based errors occur when the cost of extra effort outweighs the potential benefit of extra accuracy. Given this premise, debiasing should occur when the benefits of accurate judgment are increased.

Association-based errors are costs of an otherwise highly adaptive system of associations within semantic memory. Errors occur when semantically related but judgmentally harmful associations are brought to bear on the task. Debiasing requires the performance of a behavior that will activate different associations.

Psychophysically based errors are due to the nonlinear relation between external stimuli and the subjective responses to those stimuli. Debiasing therefore requires changing the location of one's position on the curve depicting this relation or the position of one or more of the options.

Twenty years of extremely creative research have documented the presence of many judgment shortcomings. It is hoped that this taxonomy will help in the search for techniques with which we will be able to debias such errors.

References

Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Anderson, J. R., & Bower, G. H. (1973). Human associative memory. Washington, DC: Winston.
Archer, J. (1988). The sociobiology of bereavement: A reply to Littlefield and Rushton. Journal of Personality and Social Psychology, 55, 272-278.
Arkes, H. R., & Blumer, C. (1985). The psychology of sunk cost. Organizational Behavior and Human Decision Processes, 35, 125-140.
Arkes, H. R., Christensen, C., Lai, C., & Blumer, C. (1987). Two methods of reducing overconfidence. Organizational Behavior and Human Performance, 39, 133-144.
Arkes, H. R., Faust, D., Guilmette, T. J., & Hart, K. (1988). Eliminating hindsight bias. Journal of Applied Psychology, 73, 305-307.
Arkes, H. R., & Freedman, M. R. (1984). A demonstration of the costs and benefits of expertise in recognition memory. Memory & Cognition, 12, 84-89.
Arkes, H. R., & Harkness, A. R. (1980). The effect of making a diagnosis on the subsequent recognition of symptoms. Journal of Experimental Psychology: Human Learning and Memory, 6, 568-575.
Arkes, H. R., & Harkness, A. R. (1983). Estimates of contingency between two dichotomous variables. Journal of Experimental Psychology: General, 112, 117-135.
Bar-Hillel, M. (1973). On the subjective probability of compound events. Organizational Behavior and Human Performance, 9, 396-406.
Beach, L. R., & Mitchell, T. R. (1978). A contingency model for the selection of decision strategies. Academy of Management Review, 3, 439-449.
Berkeley, D., & Humphreys, P. (1982). Structuring decision problems and the "bias heuristic." Acta Psychologica, 50, 201-252.
Billings, R. S., & Marcus, S. A. (1983). Measures of compensatory and noncompensatory models of decision behavior: Process tracing versus policy capturing. Organizational Behavior and Human Performance, 31, 331-352.
Chapman, L., & Chapman, J. (1967). Genesis of popular but erroneous psychodiagnostic observations. Journal of Abnormal Psychology, 72, 193-204.
Christensen, C. (1989). The psychophysics of spending. Journal of Behavioral Decision Making, 2, 69-80.
Christensen-Szalanski, J. J. J. (1980). A further examination of the selection of problem-solving strategies: The effects of deadlines and analytic aptitudes. Organizational Behavior and Human Performance, 25, 107-122.
Dinnerstein, D. (1965). Intermanual effects of anchors on zones of maximal sensitivity in weight-discrimination. American Journal of Psychology, 78, 66-74.
Doherty, M. E., Mynatt, C. R., Tweney, R. D., & Schiavo, M. D. (1979). Pseudodiagnosticity. Acta Psychologica, 43, 111-121.
Edwards, W. (1983). Human cognitive capabilities, representativeness, and ground rules for research. In P. C. Humphreys, O. Svenson, & A. Vari (Eds.), Analysing and aiding decision processes (pp. 507-513). Amsterdam: North-Holland.
Edwards, W., & von Winterfeldt, D. (1986). On cognitive illusions and their implications. In H. R. Arkes & K. R. Hammond (Eds.), Judgment and decision making: An interdisciplinary reader (pp. 642-679). Cambridge, England: Cambridge University Press.
Einhorn, H. J., & Hogarth, R. M. (1981). Behavioral decision theory: Processes of judgment and choice. Annual Review of Psychology, 32, 53-88.
Fischhoff, B. (1975). Hindsight ≠ foresight: The effect of outcome knowledge on judgment under uncertainty. Journal of Experimental Psychology: Human Perception and Performance, 1, 288-299.
Fischhoff, B. (1982). Debiasing. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 422-444). Cambridge, England: Cambridge University Press.
Fischhoff, B., & Beyth-Marom, R. (1983). Hypothesis evaluation from a Bayesian perspective. Psychological Review, 90, 239-260.
Fischhoff, B., Slovic, P., & Lichtenstein, S. (1977). Knowing with certainty: The appropriateness of extreme confidence. Journal of Experimental Psychology: Human Perception and Performance, 3, 552-564.
Funder, D. C. (1987). Errors and mistakes: Evaluating the accuracy of social judgment. Psychological Bulletin, 101, 75-90.
Gilovich, T. (1981). Seeing the past in the present: The effect of associations to familiar events on judgments and decisions. Journal of Personality and Social Psychology, 40, 797-808.
Gregory, W. L., Cialdini, R. B., & Carpenter, K. M. (1982). Self-relevant scenarios as mediators of likelihood estimates and compliance: Does imagining make it so? Journal of Personality and Social Psychology, 43, 88-99.
Grice, H. P. (1975). Logic and conversation. In D. Davidson & G. Harman (Eds.), The logic of grammar (pp. 64-75). Encino, CA: Dickinson.
Harkness, A. R., DeBono, K. G., & Borgida, E. (1985). Personal involvement and strategies for making contingency judgments: A stake in the dating game makes a difference. Journal of Personality and Social Psychology, 49, 22-32.
Helson, H. (1964). Adaptation-level theory: An experimental and systematic approach to behavior. New York: Harper.
Hoch, S. J. (1985). Counterfactual reasoning and accuracy in predicting personal events. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 719-731.
Hogarth, R. M. (1981). Beyond discrete biases: Functional and dysfunctional aspects of judgmental heuristics. Psychological Bulletin, 90, 197-217.
Hull, C. L. (1920). Quantitative aspects of the evolution of concepts. Psychological Monographs, 28(1, Whole No. 123).
Johnson, E. J., & Payne, J. W. (1985). Effort and accuracy in choice. Management Science, 31, 395-414.
Kahneman, D., Slovic, P., & Tversky, A. (Eds.). (1982). Judgment under uncertainty: Heuristics and biases. Cambridge, England: Cambridge University Press.
Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80, 237-251.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263-291.
Kahneman, D., & Tversky, A. (1984). Choices, values, and frames. American Psychologist, 39, 341-350.
Koriat, A., Lichtenstein, S., & Fischhoff, B. (1980). Reasons for confidence. Journal of Experimental Psychology: Human Learning and Memory, 6, 107-118.
Kubovy, M. (1977). Response availability and the apparent spontaneity of numerical choices. Journal of Experimental Psychology: Human Perception and Performance, 3, 359-364.
Lehman, D. R., Lempert, R. O., & Nisbett, R. E. (1988). The effects of graduate training on reasoning: Formal discipline and thinking about everyday-life events. American Psychologist, 43, 431-442.
Lichtenstein, S., Fischhoff, B., & Phillips, L. D. (1982). Calibration of probabilities: The state of the art to 1980. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 306-354). Cambridge, England: Cambridge University Press.
Loewenstein, G. F. (1988). Frames of mind in intertemporal choice. Management Science, 34, 200-214.
Lord, C. G., Lepper, M. R., & Preston, E. (1984). Considering the opposite: A corrective strategy for social judgment. Journal of Personality and Social Psychology, 47, 1231-1243.
McNeil, B. J., Pauker, S. J., Sox, H. C., Jr., & Tversky, A. (1982). On the elicitation of preferences for alternative therapies. New England Journal of Medicine, 306, 1259-1262.
Murphy, A. H., & Winkler, R. L. (1974). Subjective probability forecasting experiments in meteorology: Some preliminary results. Bulletin of the American Meteorological Society, 55, 1206-1216.
Neely, J. H. (in press). Semantic priming effects in visual word recognition: A selective review of current findings and theories. In D. Besner & G. Humphreys (Eds.), Basic processes in reading: Visual word recognition. Hillsdale, NJ: Erlbaum.
Nisbett, R. E., Krantz, D. H., Jepson, C., & Kunda, Z. (1983). The use of statistical heuristics in everyday inductive reasoning. Psychological Review, 90, 339-363.
Northcraft, G. B., & Neale, M. A. (1986). Opportunity costs and the framing of resource allocation decisions. Organizational Behavior and Human Decision Processes, 37, 348-356.
Northcraft, G. B., & Neale, M. A. (1987). Experts, amateurs, and real estate: An anchoring-and-adjustment perspective on property pricing decisions. Organizational Behavior and Human Decision Processes, 39, 84-97.
Paquette, L., & Kida, T. (1988). The effect of decision strategy and task complexity on decision performance. Organizational Behavior and Human Decision Processes, 41, 128-142.
Payne, J. W. (1982). Contingent decision behavior. Psychological Bulletin, 92, 382-402.
Payne, J. W., Bettman, J. R., & Johnson, E. J. (1988). Adaptive strategy selection in decision making. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 534-552.
Petty, R. E., & Cacioppo, J. T. (1984). The effects of involvement on responses to argument quantity and quality: Central and peripheral routes to persuasion. Journal of Personality and Social Psychology, 46, 69-81.
Phillips, L. D. (1983). A theoretical perspective on heuristics and biases in probabilistic thinking. In P. C. Humphreys, O. Svenson, & A. Vari (Eds.), Analysing and aiding decision processes (pp. 525-543). Amsterdam: North-Holland.
Posner, M. I., & Snyder, C. R. R. (1975). Attention and cognitive control. In R. L. Solso (Ed.), Information processing and cognition: The Loyola symposium (pp. 55-85). Hillsdale, NJ: Erlbaum.
Powell, J. L. (1988). A test of the knew-it-all-along effect in the 1984 presidential and statewide elections. Journal of Applied Social Psychology, 18, 760-773.
Ratcliff, R., & McKoon, G. (1981). Automatic and strategic priming in recognition. Journal of Verbal Learning and Verbal Behavior, 20, 204-215.
Restle, F. (1971). Instructions and the magnitude of an illusion: Cognitive factors in the frame of reference. Perception and Psychophysics, 9, 31-32.
Ross, L., Lepper, M. R., Strack, F., & Steinmetz, J. L. (1977). Social explanation and social expectation: Effects of real and hypothetical explanations of subjective likelihood. Journal of Personality and Social Psychology, 35, 817-829.
Shaklee, H., & Tucker, D. (1980). A rule analysis of judgments of covariation between events. Memory and Cognition, 8, 459-467.
Slobin, D. I. (1971). Psycholinguistics. Glenview, IL: Scott, Foresman.
Slovic, P., & Fischhoff, B. (1977). On the psychology of experimental surprises. Journal of Experimental Psychology: Human Perception and Performance, 3, 544-551.
Slovic, P., & Lichtenstein, S. (1971). Comparison of Bayesian and regression approaches to the study of information processing in judgment. Organizational Behavior and Human Performance, 6, 649-744.
Smith, E. E., Shoben, E. J., & Rips, L. J. (1974). Structure and process in semantic memory: A feature model for semantic decisions. Psychological Review, 81, 214-241.
Staff. (1989, September 4). Don't B-2 sure. The New Republic, pp. 7-8.
Stevens, S. S. (1957). On the psychophysical law. Psychological Review, 64, 153-181.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643-662.
Sutherland, H. J., Dunn, V., & Boyd, N. F. (1983). The measurement of values for states of health with linear analog scales. Medical Decision Making, 3, 477-487.
Tetlock, P. E., & Kim, J. I. (1987). Accountability and judgment processes in a personality prediction task. Journal of Personality and Social Psychology, 52, 700-709.
Thaler, R. (1985). Mental accounting and consumer choice. Marketing Science, 4, 199-214.
Thaler, R. (1986). The psychology and economics conference handbook: Comments on Simon, on Einhorn and Hogarth, and on Tversky and Kahneman. Journal of Business, 59, S279-S284.
Thorngate, W. (1980). Efficient decision heuristics. Behavioral Science, 25, 219-225.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207-232.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124-1131.
Tversky, A., & Kahneman, D. (1981). The framing of decisions and the rationality of choice. Science, 211, 453-458.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293-315.
Wagenaar, W., & Keren, G. B. (1986). Does the expert know? The reliability of predictions and confidence ratings of experts. In E. Hollnagel, G. Mancini, & D. Woods (Eds.), Intelligent decision support in process environments (pp. 87-103). Berlin: Springer-Verlag.
Yates, J. F., & Carlson, B. W. (1986). Conjunction errors: Evidence for multiple judgment procedures, including "signed summation." Organizational Behavior and Human Decision Processes, 37, 230-253.

Received December 6, 1989
Revision received July 10, 1990
Accepted February 12, 1991
