ABSTRACT
Poker players make strategic decisions on the basis of imperfect information, which are
informed by their assessment of the probability they will hold the best set of cards
among all players at the conclusion of the hand. Exact mental calculations of this
probability are impossible—therefore, players must use judgment to estimate their
chances. In three studies, 69 moderately experienced poker players estimated the
probability of obtaining the best cards among all players, based on the limited
information that is known in the early stages of a hand. Although several of the
conditions typically associated with well-calibrated judgment did not apply, players’
judgments were generally accurate. The correlation between judged and true probabilities
was r > .8 for over five-sixths of the participants, and when judgments were
averaged across players and within hands this correlation was .96. Players slightly
overestimated their chance of obtaining the best cards, mainly where this probability
was low to moderate (<.7). Probability estimates were slightly too strongly related to
the strength of the two cards that a player holds (known only to themselves), and
insufficiently influenced by the number of opponents. Seemingly, players show somewhat
insufficient regard for the cards that other players could be holding and the
potential for opponents to acquire a strong hand. The results show that even when
judgment heuristics are used to good effect in a complex probability estimation task,
predictable errors can still be observed at the margins of performance.
Copyright © 2009 John Wiley & Sons, Ltd.
INTRODUCTION
Poker is a game of chance and skill where players bet on the value of cards. The strategies that maximise
profit have been distilled from the experiences of professional players (e.g. Harrington & Robertie, 2006),
and have been subjected to statistical analysis where appropriate (Sklansky, 1994). Poker is a complex game,
with an element of uncertainty, which lends itself to the study of probability judgment when it is informed by
specialist knowledge and experience.
* Correspondence to: Dr Tim Rakow, Department of Psychology, University of Essex, Colchester, CO4 3SQ, UK.
E-mail: timrakow@essex.ac.uk
Texas Hold ‘em Poker (hereafter poker) is currently the most popular version of poker, which, encouraged
by an explosion of internet play, now boasts many high-stakes tournaments offering over $1 million in prize
money. For the benefit of readers unfamiliar with poker, we first explain the basics of the game. Games are
typically played with a maximum of 10 people per table. A hand of poker begins with each player being dealt
two cards face down (for their eyes only) from a regular deck of 52 cards (see Figure 1a). These two cards are
known as the ‘starting hand’. Five ‘community cards’ are then dealt face up. In the final stage of the hand each
player creates their best hand (in the hope of beating the other players) by selecting the highest value
combination of five cards from any of the seven cards that are visible to them (his/her own two starting cards
plus the five community cards).1 Valuable hands include two-, three- or four-of-a-kind, flushes (five cards of
the same suit), and straights (five cards in sequence)—the rules prescribe greater value to less likely
combinations of cards (e.g. a flush beats three-of-a-kind). Figure 1(b) shows the ranking of hands that
determines the winner. The community cards are dealt in three rounds, which are referred to as the ‘flop’
(three cards dealt), the ‘turn’ (one additional card) and the ‘river’ (one final card revealed). Each round of card
dealing is followed by a round of betting: first after the starting hands are dealt but before any community
cards are known (the ‘pre-flop’ round of betting), then after the flop, then after the turn, and finally after the
river. Bets accumulate in the centre of the table in ‘the pot’, which can be won by any player. Players may also
‘fold’ (i.e. withdraw from the current hand) at any point in the betting. Once a player has folded they no longer
contribute to the pot, but they are no longer eligible to win it—so betting can be thought of as paying to retain
the opportunity to win. A player wins if all other players fold or if he holds the best hand among players who
remain after the river. Thus a player can sometimes bluff his way to victory if confident betting leads other
players (with better hands) to fold before the betting ceases. Therefore, the winning player may not always be
the player holding the best cards, as a player who folded may actually have held the best cards.
Explicit probability judgments are not required in poker—however, at many points in the game a player
may be guided by his beliefs about the probability of particular events or outcomes. The game requires a
player to judge the chance of his hand beating each of his opponents’ unknown hands—or, more subjectively,
to judge how likely it is that an opponent is bluffing or will continue to bet. Based on such judgments, a player
must decide when to fold or bet. Thus, some of the skill in poker revolves around assessing whether these
chances justify continuing to bet—an assessment that can be made at several junctures as the hand is played
out (see Figure 1a). Whilst a player may not consciously assess these chances for each and every hand,
players must make a judgment on some level in order to make decisions about whether to continue to play. In
this paper, we focus on the most objective of these judgments: assessing the probability of obtaining the best
cards among all players given particular cards.
In estimating the probability of obtaining the best cards against a given number of unknown opponents
there is a finite number of possible outcomes. However, with 1326 unique starting hand combinations, 19 600
(3-card) flop combinations and 2 118 760 (5-card) community card combinations to consider, there is a huge
range of possible outcomes that simply overwhelms human cognition. Incapable of these calculations, people
must implement a strategy that requires limited cognitive resources but maintains a degree of accuracy. What
kind of strategy might this be? The following three paragraphs outline three classes of psychologically
plausible strategy that poker players could adopt when assessing probabilities: specific judgment heuristics,
processes of hypothesis evaluation, and memory-based processes based on encoded frequencies. These
classes of strategy are not mutually exclusive (e.g. hypothesis evaluation may involve heuristic reasoning)
and there may be individual differences in their use (e.g. preferences among strategies or skill in applying
them may vary with expertise). Nonetheless, they serve to map out what we should expect on the basis of
some of the key theories of probability judgment. Our studies do not provide an unequivocal test of these
strategies. Rather, these approaches are reviewed here as they provide a framework for understanding the task
and interpreting our results.
1 For simplicity we use masculine pronouns when referring to poker players from this point forward, so ‘he’ should be read as ‘he or she’,
and ‘his’ should be read as ‘his or her’. This choice of language is also in keeping with the observation that almost all of our
poker-playing participants were male.
Copyright © 2009 John Wiley & Sons, Ltd. Journal of Behavioral Decision Making, 23, 496–526 (2010)
DOI: 10.1002/bdm
Figure 1a. The game of Texas Hold ‘Em Poker. Upper row shows community cards; lower rows show annotated
outcomes for two players, indicating the best hand of five cards that each player holds at each stage of the game.
Hand strength will change as the hand is played fully. The winner is the player holding the best hand after the final round
of betting after the river (unless all but one of the players have folded prior to this). In the second example, the 3€ in the
starting hand becomes redundant, as there are higher cards to accompany the pair of kings. 1, 2, 3 and 4 refer to betting rounds.
Figure 1b. Poker hand rankings listed from best (1) to worst (10). In any case where 2 or more opponents have the same
hand ranking, the hand made up from higher ranking cards wins.
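
The combination counts quoted earlier in this section (1326 starting hands, 19 600 flops and 2 118 760 sets of five community cards) can be checked with binomial coefficients. The sketch below is illustrative only and assumes the counts are taken from one player's perspective, so that the flop and the community cards are drawn from the 50 cards he cannot see; the variable names are ours.

```python
from math import comb

# Two starting cards chosen from the full 52-card deck.
starting_hands = comb(52, 2)

# Assumption: counts are from the player's own perspective, so the
# community cards come from the 50 cards left after his two starting cards.
flops = comb(50, 3)             # three-card flop
community_boards = comb(50, 5)  # all five community cards

print(starting_hands, flops, community_boards)
```

These figures ignore the opponents' hidden cards, which multiply the number of distinct situations further.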
J. Liley and T. Rakow Probability Estimation in Poker
Several heuristics could simplify the process of probability assessment in poker, one of which is the
simulation heuristic (Kahneman & Tversky, 1982). Mental simulation can generate an evaluation of the
tendency of one’s model of the situation to produce different outcomes. Just as the availability heuristic uses
ease of recall to judge the relative frequency of past events or the size of existing sets (Tversky & Kahneman,
1973), the simulation heuristic uses the ease with which future possibilities can be constructed to assess their
probability. Whilst a poker player may be unable to simulate all possible hands in his mind, the simulation
heuristic allows him to reach an estimation of the chances of obtaining the best cards using a sample of
simulations as the basis for judgment. For instance, a player may consider some of the cards that his
opponents could be holding and/or some of the cards that could be dealt, and evaluate his chances by
considering the proportion of simulated opponents’ hands that are weaker than his own hand. For example, a
player holding a pair of 10s when a flop of 10^2|7€ is on the table will find it hard to simulate opponents’
hands or future community cards that can yield a hand stronger than his own. In contrast, an opponent holding
3^8 against this flop will quickly simulate possible hands for his opponents that will beat his hand—and,
accordingly, will judge his chances to be more modest. Anchoring and adjustment could also be implemented
usefully to judge the probability of obtaining the best cards for a particular hand (Tversky & Kahneman,
1974). For instance, even before the cards are dealt a player knows the potential number of players in the
hand, which could be used to calculate an equiprobability anchor (Teigen, 2001). For example, with five
players, each has a 20% chance before the cards are dealt (assuming equal levels of skill); this 20% prior
probability could then be revised as the cards are revealed and players fold. Additionally, the first
individuating information that a player receives is his two starting cards. The perceived strength of these cards
could be used to provide an initial estimate of the chances of obtaining the best cards (perhaps drawing on
past experience)—which again would adjust as the hand is played out.
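
As a brief illustration of the equiprobability anchor described above, the following sketch divides 100% equally among all players in the hand. The function name is hypothetical, and the calculation assumes nothing beyond equal chances for every player.

```python
def equiprobability_anchor(n_opponents):
    """Percentage chance per player if all players are equally likely to win."""
    n_players = n_opponents + 1  # the opponents plus the player himself
    return 100.0 / n_players

for n in (1, 3, 5, 7, 9):
    print(n, round(equiprobability_anchor(n), 2))
```

For 1, 3, 5, 7 and 9 opponents this gives 50.00, 25.00, 16.67, 12.50 and 10.00, the anchor values listed in Table 1.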
Probability estimation in poker can be viewed as a process of evaluating successive hypotheses on the
basis of accumulating evidence, which will inevitably depend upon which cues players attend to, and how
they use this information. Kahneman and Lovallo (1993) discuss these attentional features and propose two
modes of forecasting: the inside view and the outside view, which Lagnado and Sloman (2004) characterised
with respect to probability judgment. The inside view is a singular mode of thinking and focuses on evidence
for the most salient outcome, while ignoring other, less obvious outcomes. An example of adopting an
inside view whilst judging the probability of obtaining the best cards would be to focus on one’s own starting
hand and how hand strength may increase with future community cards, without considering the cards
opponents may be holding. Viewed according to support theory (Brenner, Koehler, & Rottenstreich, 2002;
Rottenstreich & Tversky, 1997; Tversky & Koehler, 1994) this would correspond to over-reliance on the
support (evidence) for the focal hypothesis that I hold the best cards in comparison to under-weighting the
support for the alternate hypothesis that one of my opponents holds the best cards. The outside view
represents a more distributional mode of thinking, which considers a wider set of possibilities, including the
less immediately salient ones. An ‘outside judge’ in poker may consider what cards other players might be
holding, how the community cards could affect the strength of each opponent’s hand relative to his own hand
and what additional community cards may result in a strong hand for each of his opponents. Outside
judgment is usually achieved through statistical analyses and is less reliant on heuristics. Alternatively, an
outside judge may still rely on mental simulation but simulate alternative outcomes in a more systematic and
extensive manner. Dougherty, Gettys, and Thomas (1997) have shown that generating multiple causal
scenarios for alternative hypotheses decreases the perceived likelihood of the focal hypothesis—which
counteracts the typical tendency towards overestimation of focal hypotheses. Thus, whatever strategies an
outside judge adopts, the outside view requires more careful and effortful thinking (in considering complex
rules or a greater number of possibilities). The inside view seems to be the default position in most situations.
For instance, Koriat, Lichtenstein, and Fischhoff (1980) showed that probability judgments became
somewhat more appropriate when participants were actively encouraged to think of additional possibilities
(reasons why an answer could be wrong). In fact, many studies imply that less obvious possibilities fail to
come to mind when a problem is first considered unless they are prompted (e.g. Fischhoff, Slovic, &
Lichtenstein, 1978; Tversky & Koehler, 1994).
Another way to bypass intractable probability calculation is simply to rely upon memory and experience.
In a given situation, past experience of the number of wins and losses in equivalent situations can inform
assessments of the probability of obtaining the best cards in the current situation. Hasher and Zacks (1979)
demonstrated that people have a good facility for tracking small frequencies within a short time frame, and
others have argued that humans are generally well adapted to the task of logging event frequencies and using
them to assess probability or make inferences (Gigerenzer, Hoffrage, & Kleinbölting, 1991). However, one
challenge that the poker player faces is in constructing the appropriate reference class of events. For instance,
if my starting hand is a pair of sixes—what past experiences should I recruit? All previous instances where I
held a pair of sixes? But perhaps none of these cases included hands with the same number of opponents that I
face now. Given the number of starting hand and flop combinations, and the variable number of opponents,
players will often find themselves in unique circumstances. In these cases, even excellent memory for past
instances provides only a first approximation to the current probability of obtaining the best cards.
Tasks that rely upon specialist knowledge, an example of which we consider in this paper, have provided a
valuable context for studying the calibration of probability judgment (see Koehler, Brenner, & Griffin, 2002).
Meteorologists have frequently been observed to have very good, sometimes near-perfect,
calibration of probability judgments for a variety of weather events (e.g. Murphy & Winkler, 1977).
In other words, among a collection of meteorological events each of which is assigned a subjective
probability of X%, approximately X% of these events do occur. Horse-race odds have been shown to be well
calibrated (Hoerl & Fallin, 1974), and groups of executives in banking and the pharmaceutical industry have
been able to provide accurate subjective probabilities when forecasting interest rates or estimating the chance
that a project succeeds (Balthasar, Boschi, & Menke, 1978; Kabus, 1976). In contrast, physicians’ diagnostic
and prognostic probability judgments are notoriously variable in quality. A number of studies find poor
calibration in probability judgments for diagnosis for a number of diseases (e.g. Christensen-Szalanski &
Bushyhead, 1981; Poses, Cebul, Wigton, Centor, Collings, & Fleishli, 1992). This is often attributed to the
lack of feedback that doctors receive on their judgments, which contrasts with the continual feedback that
meteorologists receive (Bolger & Wright, 1994), and which also applies to the poker players that we study.
However, Koehler et al. (2002) note that feedback cannot be the sole determinant of the quality of probability
judgment, as there exists considerable variability among different studies in medicine where feedback
characteristics are similar. Others have suggested that values (e.g. the perceived severity of the outcome
event) can contaminate physicians’ probability judgments. For instance, Arkes et al. (1995) reported that
doctors overestimated the chances of terminally ill patients dying within 2, or within 6, months—a
pessimistic bias that these authors suggest could be attributed to avoiding giving false hope to patients.
However, just as for gamblers in a horse-race betting market, whose laying of bets implies good calibration
(Johnson & Bruce, 2001), there is no rational motivation for poker players to adopt either a pessimistic or an
optimistic bias in judging their chances, as either stance is associated with failing to maximise financial gains
(either from under-betting or from over-betting). Notably, Keren (1987, Experiment 1) found that national
and international class bridge players (who receive prompt feedback) also showed superb calibration when
asked to judge the chances of making a contract in tournament play. Thus, there are instances where repeated
feedback and appropriate incentives seem to lead to good judgment.
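
The notion of calibration used in this section (events assigned a probability of X% should occur about X% of the time) can be made concrete with a small sketch. It is our illustration, not a procedure taken from the studies cited: the function name, the equal-width binning and the toy data are all assumptions.

```python
from collections import defaultdict

def calibration_table(forecasts, outcomes, n_bins=10):
    """Group probability forecasts (0..1) into equal-width bins.

    Returns one (mean forecast, observed frequency, count) tuple per
    non-empty bin. Good calibration means the first two values are close.
    """
    bins = defaultdict(list)
    for p, occurred in zip(forecasts, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p = 1.0 goes in the top bin
        bins[index].append((p, occurred))
    table = []
    for index in sorted(bins):
        pairs = bins[index]
        mean_forecast = sum(p for p, _ in pairs) / len(pairs)
        observed = sum(o for _, o in pairs) / len(pairs)
        table.append((mean_forecast, observed, len(pairs)))
    return table

# Toy data (invented): five events judged at 20%, five judged at 80%.
forecasts = [0.2] * 5 + [0.8] * 5
outcomes = [1, 0, 0, 0, 0] + [1, 1, 1, 1, 0]
for row in calibration_table(forecasts, outcomes):
    print(row)
```

In this toy case one of the five 20% events and four of the five 80% events occur, so the forecasts are perfectly calibrated.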
STUDY 1
This first study is an initial exploration of the ability of moderately experienced players to judge the
probability of holding the best cards at the end of the hand using only the information that is available to them
in the early stages of a hand. Such judgments are not a formal requirement of the game of poker, but a player
whose beliefs about this are inaccurate (even if they are not explicitly stated) may lose money by over-betting,
or fail to exploit opportunities by under-betting. We consider the first two stages of a hand of poker
(Figure 1a). First, when one’s own starting cards and the number of opponents are known, we examine
judging the probability of obtaining the best cards among all players for different starting hands and for a
varying number of opponents. Second, when the first set of community cards (the three ‘flop’ cards) is also
known, we examine judging the probability of obtaining the best cards for varying combinations of starting
hands and flops (holding number of opponents constant), or for varying flops and number of opponents (for a
given pair of starting cards).
Method
Participants
Thirty-six poker players (35 males) were recruited with an average age of 20.9 years (range 18–28 years).
Most were university students and at least 60% were members of the University of Essex Poker Society.
Players reported knowing how to play poker for a median of 29 months (inter-quartile range, IQR of 16–42),
and playing online poker a median of 15 times a month (IQR 0–20) and live (i.e. face-to-face) poker a median
of 4 times a month (IQR 1–7). Most of this play would be for real stakes, though not necessarily large stakes.
Apparatus
Probability judgment tasks were presented as a sequence of three pencil and paper tasks. Publicly available
computer simulation software (Poker Pro Labs™, 2007) was used to obtain the true probabilities of the
judgment tasks. The ‘random card’ feature on another publicly available poker game simulator
(TheHendonMob.com, 2007) was used to randomise the 3-card flops for two of the tasks (the Flop and
Jack-Ten tasks described below).
Materials
Three different tasks were used to assess probability judgment accuracy. Playing card images were used to
provide the information about the hands.
The Pre-flop task tested the accuracy of estimating probabilities with five different starting hands against 1,
3, 5, 7 and 9 opponents. No community cards were present. The starting hands selected for analysis were
chosen by the experimenter (JL) and were A€A|, K J , 6|6^, 3|4| and 3^8 . These were selected
in an attempt to provide a range of true probabilities (SD = 20.63, range = 78.9%) and to provide a range of
hands for which quality typically would be perceived as good, bad or intermediate.
The Flop task tested the accuracy of estimating probabilities with the same five starting hands as used in
the Pre-flop task against two hand-picked 3-card flops and three randomly dealt flops, which were different
for each starting hand. Each hand was against 5 opponents.
The Jack-Ten task tested the accuracy of estimating probabilities with the same starting hand (J 10€) in
combination with two hand-picked flops and three randomly dealt flops that were different to those in the
Flop task. Accuracy was tested for each flop and starting hand combination for 1, 3, 5, 7 and 9 opponents.
The purpose of the Jack-Ten task was to see how estimates vary with the number of opponents so it was
necessary to choose a starting hand that was unlikely to be viewed as especially strong or weak, so that a
reasonable proportion of the variance in the task came from the different sets of flop cards. The hand-picked
flops were included to ensure a variety of different scenarios and thus a spread of true probabilities (for the
Flop task, SD = 29.0%, range = 86.4%; for the Jack-Ten task, SD = 32.4%, range = 90.8%).
This gave a total of 75 hands to be judged (25 per task)—see the Appendix for exact details of each hand.
Each task was presented on a single page, with the 25 hands set out in a 5 × 5 grid configuration (e.g. starting
hands by number of opponents in the Pre-flop task). Demographic information was requested, which included
sex, date-of-birth, when poker was first learnt, average frequency of playing online and live poker, and degree
scheme (i.e. major). Five independent judges rated the degree schemes for mathematical content (coded high
vs. low, according to the consensus of the judges), as this should predict mathematical skill.
Design
The study followed a within-subjects design with some additional correlational analyses. The dependent
variable was the probability judgments of the participants. These were compared against the actual
probability of obtaining the best cards, which was obtained by simulating each hand 10 million times. The
very large number of runs for the simulations ensured that this value would differ barely, if at all, from an
analytically derived probability (though such calculations are essentially intractable for many of the hands
presented)—therefore we refer to this value as the ‘true probability’. The independent variables vary between
tasks. In the Pre-flop task the independent variables are the starting hands and the number of opponents faced.
The independent variables in the Flop task are the starting hands and their flops—the number of opponents
remains constant. The independent variables in the Jack-Ten task are the flops and the number of
opponents; the starting hand remains fixed. The order in which the hands were presented in each task was
determined by fixed randomisation. The order of presentation for the three tasks was counterbalanced (six
possible task orders).
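
To illustrate how a simulated ‘true probability’ of this kind can be obtained, the sketch below deals random community cards and opponent hands and records how often the player holds the best cards. It is a minimal illustration, not the Poker Pro Labs software used in the study. The card notation (e.g. "As" for the ace of spades), the function names, the much smaller default number of trials, and the rule that tied hands share credit equally are all our assumptions; the paper does not state how ties were scored.

```python
import random
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"
DECK = [r + s for r in RANKS for s in "cdhs"]  # e.g. "As" = ace of spades


def rank5(cards):
    """Score a 5-card hand as a sortable tuple; a higher tuple is a better hand."""
    values = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    counts = Counter(values)
    # Ranks ordered by how often they occur, then by value (pairs before kickers).
    ordered = sorted(counts, key=lambda v: (counts[v], v), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({c[1] for c in cards}) == 1
    straight = len(counts) == 5 and values[0] - values[4] == 4
    if values == [12, 3, 2, 1, 0]:  # A-2-3-4-5: the ace plays low
        straight, ordered = True, [3, 2, 1, 0, -1]
    if straight and flush:
        category = 8
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return category, ordered


def best7(cards):
    """Best 5-card score available from seven cards (21 combinations)."""
    return max(rank5(combo) for combo in combinations(cards, 5))


def best_cards_probability(hole, n_opponents, board=(), trials=20000, seed=0):
    """Estimate the chance of holding the best cards when no player folds."""
    rng = random.Random(seed)
    unseen = [c for c in DECK if c not in hole and c not in board]
    missing = 5 - len(board)  # community cards still to be dealt
    credit = 0.0
    for _ in range(trials):
        drawn = rng.sample(unseen, missing + 2 * n_opponents)
        full_board = list(board) + drawn[:missing]
        rest = drawn[missing:]
        mine = best7(list(hole) + full_board)
        theirs = [best7(rest[2 * i:2 * i + 2] + full_board)
                  for i in range(n_opponents)]
        top = max(theirs)
        if mine > top:
            credit += 1
        elif mine == top:  # assumption: tied players share the win equally
            credit += 1 / (1 + theirs.count(top))
    return credit / trials
```

For example, `best_cards_probability(("As", "Ah"), 1)` estimates the pre-flop chance for a pair of aces against one opponent, and passing `board=("Td", "2s", "7c")` conditions on a flop. The standard error of a simulated proportion p over N runs is sqrt(p(1 − p)/N), which for 10 million runs is at most about 0.00016 (0.016 percentage points); this is consistent with the statement that the simulated value would differ barely from an analytically derived one.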
Procedure
Participants provided probability estimates for each task as percentages. All participants were familiar with
the basic rules of poker and hand rankings that determine the winner (see Figure 1b). For each task,
participants were told to estimate the chance of winning if the hand was played out fully to its conclusion with
all players remaining in the game (i.e. the probability of obtaining the best cards). All opponents’ hands were
unknown. Participants were told explicitly how the true probabilities had been calculated (play for each hand
was simulated 10 million times to give an accurate estimate of the true probability). Task order was fully
counterbalanced, and each ‘batch’ of six participants received one of the six possible task orders (randomised
within batch). Each task required 25 probability estimates to be made. The tasks were completed ‘unaided’
(i.e. participants did not have calculators, or books on poker strategy or theory, to hand), and
participants typically took about 25–30 minutes to complete the three tasks and to provide demographic
information.
Results
In order to assess estimation accuracy, the signed difference (judged probability minus true probability) and
the unsigned difference (magnitude of the signed difference) were calculated for each judged value. Two
aggregate measures reflecting judgment accuracy were obtained from these signed and unsigned differences:
bias and absolute deviation. Bias is the mean of the signed differences, and absolute deviation is the mean of
the unsigned differences. In the items analysis below, this averaging was performed across participants and
within hands—therefore, the aggregate measures reflect how accurately each hand was judged. In the
subsequent analysis of individual participants, this averaging was performed across hands and within
participants—therefore, the aggregate measures reflect how well each participant performed. Absolute
deviation served as the primary indicator of accuracy—a score of zero indicates perfect judgment, though a
non-zero score does not indicate the direction of error. Negative values of the absolute deviation are not
possible—the greater the (positive) value the less accurate the judgment in absolute terms. Bias, which can be
positive or negative, gave a measure of (absolute) overestimation or underestimation, respectively.
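
The two accuracy measures defined above can be expressed in a few lines. The sketch below is illustrative: the function name and the three example judgments are invented, not data from the study.

```python
def bias_and_absolute_deviation(judged, true):
    """Return (bias, absolute deviation) for paired judged and true values.

    Bias is the mean signed difference (judged minus true); absolute
    deviation is the mean of the unsigned differences.
    """
    signed = [j - t for j, t in zip(judged, true)]
    bias = sum(signed) / len(signed)
    absolute_deviation = sum(abs(d) for d in signed) / len(signed)
    return bias, absolute_deviation

# Invented example (percentages): one overestimate, one underestimate, one exact.
judged = [60.0, 30.0, 20.0]
true = [50.0, 40.0, 20.0]
print(bias_and_absolute_deviation(judged, true))
```

Here the over- and underestimate cancel, so bias is zero while the absolute deviation is about 6.7 percentage points. This is why absolute deviation, rather than bias, serves as the primary indicator of accuracy.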
Items analysis
We analysed each of the 75 hands for which participants supplied probability judgments by calculating the
mean probability judgment, absolute deviation and the bias for each hand. Note that a bias of zero for a given
hand does not necessarily equate to no bias on the part of individuals—merely that participants who
overestimated the hand are ‘balanced out’ by those who underestimated it. The true and mean judged
probability for each hand are given in the Appendix.
We calculated the correlation between judged and true probability for each participant. There was always a
good match between judged and true probabilities (in fact, an excellent match in most cases) with correlations
ranging from .64 to .95 (median of .88, IQR of .85–.92). Figure 2(b) illustrates these correlations for four
example participants: two participants representing the lower quartile by performance, and two representing
the upper quartile by performance. Fit lines are shown (solid lines): a quadratic function plotted when this is a
significantly better fit to the data than a linear one (otherwise a linear function is shown). Examination of
these example participants indicates that the most accurate participants provided judgments that matched
very well to the true probabilities, and that even participants with below-average accuracy provided
appropriate judgments for the majority of hands. The relationship between the mean judged and true
probabilities was very strong, r = .96, p < .001 (Figure 2a). The same relationship was found using median
judgment in place of mean judgment (r = .96).
Figure 2. Study 1: Scatter-plots to illustrate the accuracy of probability estimates. (Dotted line is the identity line; solid
line is the best fit line.) (a) Mean estimated probability (averaged within hands across participants) plotted against true
probability. (b) Example participants representing the: (i) Lower quartile (left) and (ii) upper quartile (right) for accuracy.
Example participants were determined according to the correlation between judged and true probabilities (upper pair of
participants) and absolute deviation (lower pair)
Figure 2 illustrates that, in comparison to most studies of probability judgment, participants’ judgments
were very well calibrated. Figure 2(a) shows a general tendency to overestimate (i.e. positive bias), with many
points sitting a little below the identity line. A two-step hierarchical regression with true probability as the
dependent variable was used to determine that a quadratic function of mean judged probability was a better fit
than a straight line for these data (linear → quadratic, significant R² change = .01, F(1,72) = 8.86, p = .004).
The overall regression was significant, R² = .93, F(2,72) = 490.6, p < .001. This function, which is shown in
Figure 2(a), illustrates that overestimation is greatest for hands where the probability of obtaining the best
cards is low or moderate (i.e. 10–50% chance), but, on average, is minimal for hands where this probability is
high (>70%). Unsurprisingly, linear regression with bias as the dependent variable confirmed this pattern:
bias is significantly better described by a quadratic function of true probability than by a linear function
(linear → quadratic, significant R² change = .05, F(1,72) = 4.80, p = .032). This inverted-U-shaped function
gives maximum expected bias of 10.1% when the true probability is in the range 10–21% (i.e. a relatively flat
maximum), and a bias of zero for a true probability of 74%.
Figure 3 shows the absolute deviation for each hand, plotted as a function of the true probability. Again,
linear regression with absolute deviation as the dependent variable found that a quadratic function of true
probability was a better fit to these data than a linear one (linear → quadratic, significant R² change = .26,
F(1,72) = 25.8, p < .001). This quadratic function accounted for 26% of the variance in mean absolute
deviation, F(2,72) = 12.9, p < .001. Thus, in absolute terms, judgments deviated least from the true value
near the endpoints of the probability scale (especially the upper region), and deviated more from the true
value for hands with an intermediate true probability. However, if accuracy is assessed relative to the true
probability, then judgments for hands with low true probabilities are clearly the least accurate. For instance,
for a true probability of 10% the expected judgement is double what it should be and the expected mean
absolute deviation is 122% of the true value. In contrast, for a true probability of 50%, the expected
judgement is 1.13 times what it should be and the expected mean absolute deviation is 34% of the true value.
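
The ‘linear → quadratic’ comparisons reported in this section are tests of the change in R² when a squared term is added to a regression. The sketch below shows the standard F statistic for such a change. The function name is ours, and the inputs are the rounded values reported above for absolute deviation, with the linear model's R² inferred as .26 − .26 = .00, which is an assumption on our part.

```python
def f_change(r2_reduced, r2_full, n, k_full, k_added=1):
    """F statistic for the increase in R² when k_added predictors are added.

    n is the number of cases and k_full the number of predictors in the
    larger model. Returns (F, (df_numerator, df_denominator)).
    """
    df_residual = n - k_full - 1
    f = ((r2_full - r2_reduced) / k_added) / ((1 - r2_full) / df_residual)
    return f, (k_added, df_residual)

# 75 hands; quadratic model has two predictors (true probability and its square).
f, df = f_change(r2_reduced=0.00, r2_full=0.26, n=75, k_full=2)
print(round(f, 1), df)
```

With these rounded inputs the statistic is about 25.3 on (1, 72) degrees of freedom, close to the reported F(1,72) = 25.8; the small gap presumably reflects rounding of the published R² values.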
Figure 3. Study 1: Accuracy (absolute deviation) by true probability and task for the 75 hands, with quadratic fit line
plotted
Copyright © 2009 John Wiley & Sons, Ltd. Journal of Behavioral Decision Making, 23, 496–526 (2010)
DOI: 10.1002/bdm
J. Liley and T. Rakow Probability Estimation in Poker 505

Table 1. Study 1: Mean (SD) judgment, absolute deviation and bias by number of opponents (Pre-flop and Jack-Ten tasks
only)
                                Number of opponents
Measure (%)                     1              3              5              7              9
Probability judgment            59.7 (22.6)    48.3 (23.4)    39.6 (22.5)    32.9 (22.1)    27.3 (21.6)
Absolute deviation              9.47 (2.66)    12.36 (3.19)   13.30 (3.39)   12.52 (4.18)   11.37 (5.18)
Bias                            –1.53 (4.71)   7.76 (5.72)    7.98 (6.25)    6.42 (5.71)    4.18 (5.67)
Equiprobability anchor values   50.00          25.00          16.67          12.50          10.00
Summary measures are obtained for each hand by first averaging across participants within each hand; means and
standard deviations are for these summary measures across several hands.

Figure 3 also illustrates a tendency for participants to be least accurate on the Flop task (higher absolute
deviations). The mean (SD) absolute deviation was 12.4 (4.0) for the Pre-flop task, 16.9 (5.8) for the Flop task
and 11.2 (3.7) for the Jack-Ten task. A single-factor between-groups ANOVA (using data for hands as the
cases) confirmed that there was a significant effect of task upon the mean absolute deviation, F(2,72) = 10.4,
MSe = 21.2, p < .001 (R² = 0.22). A Tukey’s HSD post hoc test showed that the mean absolute deviation on
the Flop task was significantly higher (i.e. worse) than that for the other two tasks.²
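The between-task ANOVA can be approximately reconstructed from the reported means and standard deviations. The Python sketch below is only an illustration: the function name `anova_from_summary` is hypothetical, and it assumes equal group sizes of 25 hands per task (75 hands across three tasks), which is inferred from the degrees of freedom rather than stated in this passage.

```python
def anova_from_summary(means, sds, n_per_group):
    """One-way ANOVA from group means and SDs, assuming equal group sizes.

    Returns F, the mean squared error, R-squared and the degrees of freedom.
    """
    k = len(means)
    grand_mean = sum(means) / k
    ss_between = n_per_group * sum((m - grand_mean) ** 2 for m in means)
    df_between = k - 1
    df_within = k * (n_per_group - 1)
    # With equal n, the pooled error variance is the mean of the group variances.
    ms_error = sum(sd ** 2 for sd in sds) / k
    ss_within = ms_error * df_within
    f_value = (ss_between / df_between) / ms_error
    r_squared = ss_between / (ss_between + ss_within)
    return f_value, ms_error, r_squared, (df_between, df_within)


# Mean (SD) absolute deviation for the Pre-flop, Flop and Jack-Ten tasks.
f_value, ms_error, r_squared, df = anova_from_summary(
    means=[12.4, 16.9, 11.2], sds=[4.0, 5.8, 3.7], n_per_group=25
)
print(df, round(f_value, 2), round(ms_error, 2), round(r_squared, 3))
```

Under these assumptions the sketch gives an F of roughly 10.7 on (2, 72) degrees of freedom, an error mean square of about 21.1 and an R² of about .23, in line with the reported F(2,72) = 10.4, MSe = 21.2 and R² = 0.22. Exact agreement is not expected because the inputs are rounded to one decimal place.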
The tasks require participants to integrate information about different components of the game of poker:
the starting hand, the community cards and the number of opponents. The last of these variables was varied
systematically in the Pre-flop and Jack-Ten tasks—and the results for these 50 hands are shown in Table 1.
Table 1 shows a tendency towards smaller absolute deviation for 1 or 9 opponents. One-way ANOVA
indicated that there was no significant effect of the number of opponents upon absolute deviation,
F(4,45) = 1.50, MSe = 14.5, p = .219 (R² = .12); however, a polynomial contrast showed that the quadratic
trend was significant, F(1,45) = 4.72, p = .035. Bias also follows an inverted-U-shaped function of the
number of opponents, with the least bias for 1 or 9 opponents, and the greatest bias for 5 opponents. One-way
ANOVA showed that there was a significant effect of number of opponents upon bias, F(4,45) = 4.87,
MSe = 31.8, p = .002 (R² = .30). A Tukey HSD post hoc test showed that the mean bias for
hands facing 1 opponent was significantly different from that for hands facing 3, 5 and 7 opponents, and a
polynomial contrast showed that the quadratic trend was significant, F(1,45) = 13.87, p = .001.
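The quadratic trend tests can be illustrated with the standard orthogonal polynomial contrast for five equally spaced levels. The following Python sketch is an assumption-laden reconstruction, not the authors' procedure: the function name `contrast_f` is hypothetical, the coefficients (2, −1, −2, −1, 2) are the textbook quadratic weights, and the group size of 10 hands per number of opponents is inferred from the 50 hands and the F(4,45) degrees of freedom.

```python
def contrast_f(means, coefficients, n_per_group, ms_error):
    """F statistic (1 numerator df) for a planned contrast with equal group sizes."""
    estimate = sum(c * m for c, m in zip(coefficients, means))
    ss_contrast = n_per_group * estimate ** 2 / sum(c * c for c in coefficients)
    return ss_contrast / ms_error


# Orthogonal quadratic weights for five equally spaced levels (1, 3, 5, 7, 9 opponents).
QUADRATIC = (2, -1, -2, -1, 2)

# Means from Table 1; error mean squares from the reported one-way ANOVAs.
bias_means = [-1.53, 7.76, 7.98, 6.42, 4.18]
abs_dev_means = [9.47, 12.36, 13.30, 12.52, 11.37]

f_bias = contrast_f(bias_means, QUADRATIC, n_per_group=10, ms_error=31.8)
f_abs_dev = contrast_f(abs_dev_means, QUADRATIC, n_per_group=10, ms_error=14.5)
print(round(f_bias, 2), round(f_abs_dev, 2))
```

Both values land within a few hundredths of the reported quadratic trends (13.87 for bias and 4.72 for absolute deviation), which suggests that the assumed group size and contrast weights match the analysis closely.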
Table 1 shows that judged values are not especially close to the equiprobability anchors. Therefore, if
participants do anchor on these values, they also adjust away from them to a considerable degree. Moreover,
adjustments from these potential anchors are variable in size: standard deviations are large, and, for instance,
the average probability judgment is 10% above the equiprobability anchor for 1 opponent but 23% above the
anchor for 3 opponents.
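The equiprobability anchors and the size of the adjustments away from them follow from simple arithmetic. A minimal Python sketch, assuming the anchor is 100/(number of opponents + 1) as the Table 1 values imply; the names `equiprobability_anchor` and `adjustment` are illustrative only.

```python
# Mean probability judgments (%) by number of opponents, from Table 1.
judged = {1: 59.7, 3: 48.3, 5: 39.6, 7: 32.9, 9: 27.3}


def equiprobability_anchor(n_opponents):
    """Chance (%) of holding the best cards if every player were equally likely to."""
    return 100.0 / (n_opponents + 1)


# Percentage points by which the mean judgment exceeds the anchor.
adjustment = {k: judged[k] - equiprobability_anchor(k) for k in judged}

for k in sorted(judged):
    print(k, round(equiprobability_anchor(k), 2), round(adjustment[k], 1))
```

The adjustments are in percentage points. They are about 10 points for 1 opponent and about 23 points for 3 opponents, matching the figures quoted in the text, and remain above 17 points for 5, 7 and 9 opponents.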
² Independent-samples t-tests were conducted to see if there was any difference in absolute deviation between the
handpicked hands and the random hands. No significant differences were found for the Flop task or the Jack-Ten task.
These tasks were not combined for this analysis, as there may have been some effect of varying numbers of opponents
in the Jack-Ten task.
judgments that ranked hands more correctly. The relationship between absolute deviation and amount of time
since the participant first learned to play poker was also moderate, r = –.32, p = .056: more experienced
players made more accurate judgments. All other correlations were weak and non-significant (all |r| < .23, all
p > .184). Perhaps surprisingly, club rankings based on 13 weekly tournaments did not significantly
predict task accuracy, so it is possible that the ability to assess the probability of achieving the best hand is not
crucial to becoming successful at poker. However, we should note that these rankings were only available for
22 participants, and, with club players attending for differing numbers of weeks, are not easily combined into
a reliable indicator for comparing performance across individuals.
Discussion
One of the most striking features of this data set is that participants’ judgments were generally well calibrated,
exhibiting just a small tendency to overestimation (mainly at the lower end and in the mid-range of the
response scale). We will reserve detailed discussion of the reasons for this uncommonly high level of
performance until the General Discussion, as Studies 2 and 3 throw additional light on the process by which
judgments are made. Therefore, in discussing this study, we focus on understanding why participants found
some hands easier to judge than others.
Figure 2 shows that participants are most accurate when the chance of obtaining the best cards is relatively
high. It is important to note that these situations cannot simply be identified by using just one of the three
components of the game (starting hand, community cards or number of opponents). For instance, for many
situations with 1 opponent there was only a moderate chance of obtaining the best cards, and holding a strong
starting hand (e.g. A|A€) was no guarantee of a high probability of obtaining the best cards once the flop is
dealt. Thus the successful identification of high probability situations (and appropriately discriminating these
from other situations) is evidence of successful information integration of at least two, or very likely more,
sources of information.
Performance seemed to be dramatically worse for some hands on the Flop task than for hands on the other
tasks with similar true probabilities. Three (out of the four) hands with the highest absolute deviations on the
Flop task had true probabilities of 43.6, 44.8 and 46.8%. All were overestimated by a margin of more than
20% on average. Each of these three hands had the same starting hand of A|A€. These cards (a pair of aces
from any two suits) are the best starting hand that can be dealt. Participants may have overestimated their
chances because they focused on the perceived strength of this starting hand and failed to account for
situations where they may lose. This is consistent with the inside view (Kahneman & Lovallo, 1993). Data
from the Pre-flop task support this, as the highest absolute deviations (i.e. the greatest absolute discrepancies
from the true values) are for starting hands of A|A€, K J and 6^6|. These are the best three starting
hands of the five starting hands tested in this task and are generally regarded as good cards amongst
experienced poker players. Therefore, it seems that the tendency for overestimation is greatest when a player
is holding a strong starting hand.
However, overestimation was not restricted to situations where the starting cards were strong. For
example, a starting hand of 3^8 with a flop of 10 7|8€ was one of the most overestimated hands in the
Flop task with an absolute deviation of 22.1% and a bias of 21.6%. This starting hand is among the
worst a player can be dealt due to its low value cards and poor drawing possibilities (e.g. straights and flushes
are unlikely). A pair has been made, but the chance of improvement with additional community cards (i.e. the
turn and the river) is low. The true probability for this hand is 14.5%, and, although participants were often
positively biased for hands with low true probabilities, judgments for this hand were especially inaccurate in
comparison to other hands with a similar true probability. Consistent with the inside view, players may be
using a positive test strategy (Klayman & Ha, 1987; Mussweiler & Strack, 1999) to evaluate their simulations
of future card draws. Focussing on how additional cards can improve the chances of winning whilst failing to
consider how these same cards might assist other players could explain the general tendency to overestimate.
Accuracy in the Flop task was significantly worse than for the other two tasks. The Flop task required
participants to make judgments based on different combinations of starting hands and 3-card flops. However,
the lower performance in this task cannot be attributed to difficulties in integrating information concerning
the flop cards, as this was also a requirement of the Jack-Ten task. The number of opponents was fixed at 5
for the Flop task. One might assume that this meant that the chances of obtaining the best cards were not as
extreme as for the other tasks where players sometimes faced 1 or 9 opponents. However, the Jack-Ten and
Flop tasks both had the same number of hands where the true probability fell between either 0–15 or 85–
100%. One possibility is that by varying the number of opponents in the Jack-Ten and Pre-flop tasks, this
information became more salient—encouraging participants to adjust more effectively for the number of
opponents. The number of opponents is a relevant, if indirect, source of base-rate information (cf. Cohen,
1981), and it is a common finding that such information is often underutilised when its importance is not
stressed (Koehler, 1996). Thus, participants may be more likely to adjust estimates to take account of the
5 opponents that they face when they are also asked to consider 1, 3, 7 and 9 opponents (as in the Jack-Ten and
Pre-flop tasks) than when the number of opponents is fixed at 5 across the task (mean absolute deviation of
15.2% in the Flop task versus 10.6% with 5 opponents in the other two tasks). Thus although Table 1 indicates
that it may be slightly harder to judge hands with 5 opponents than with 1 opponent (perhaps because
1 opponent is a more common occurrence in actual play than 5 opponents), the greater difficulty of the Flop
task is not solely attributable to the fact that 5 opponents were faced—though differences between tasks in
salience of those 5 opponents may be important.
STUDY 2
Despite the overall high level of performance, Study 1 provides several converging pieces of evidence of a
tendency to rely slightly too strongly on the value of the starting hand to the detriment of fully incorporating
other information (the number of opponents or the flop cards). For instance, overestimation is greatest for
strong starting hands and when the number of opponents is plausibly less salient. This could be viewed in
terms of insufficient adjustment from a self-generated anchor (Epley & Gilovich, 2001; Tversky &
Kahneman, 1974): anchoring on the value of the starting hand and adjusting insufficiently for the flop cards or
the number of opponents. It could also be viewed as the result of sub-optimal simulation associated with an
‘inside view’ or a positive test strategy (Kahneman & Lovallo, 1993; Klayman & Ha, 1987): focussing on
what might enhance the value of one’s own hand with insufficient regard to how this might affect the strength
of one’s opponents’ hands (and the fact that there are multiple opponents). Study 2 builds on the groundwork
of Study 1 by directly testing for anchoring on starting card strength, consistent with over-reliance on the
‘inside view’.
Method
Participants
Twenty-one male poker players from the University of Essex participated; five of them indicated that they had
participated in Study 1. Participant characteristics were similar to those in Study 1, including age (mean of 20.7, range
18–25 years), frequency of live play (median of 6 times per month, IQR 4–12) and online play (median of
10 times per month, IQR 1–20). Participants in this study had generally been playing slightly longer than
those in Study 1: median of 36 months (IQR 23–51).
giving four ‘sets’ of five cards to examine. Each set of five cards was then divided into two subsets in various
ways to create several starting hand and 3-card flop combinations (shown in Table 2). This created 14 hands
for which participants were to assess the probability of obtaining the best cards against 5 opponents.
Set 1 (A|A€3^3€3 ) had three variants: starting hands A|A€, A|3 and 3€3 . (For instance: with a
starting hand of A|A€ the flop was 3^3€3 , whereas, with a starting hand of A|3 the flop was
A€3^3€.)
Set 2 (A|Q K€J 10€) had four variants: starting hands A|K€, A|Q , K€J and J 10€.
Set 3 (J J^10 10€2|) had three variants: starting hands J J^, J^10 and 10 10€.
Set 4 (3 K J 9 10^) had four variants: starting hands K 3 , 10^9 , 10^3 and 9 3 .
The cards used as starting hands were selected to provide a range that varied in strength. The starting hand
and flop combinations provided a spread of true probabilities (SD = 26.6%, range = 64.4%).
Results
All task data were complete, though three participants failed to provide complete demographic data. A
typographic error in preparing the materials meant that one of the hands in Set 2 was not an exact
re-arrangement of the other hands in the set. With the starting hand as A|K€ the flop was presented as
Q|J 10€ when it should have been presented as Q J 10€. The true probability for the hand presented was
85.5%, when it would have been 76.2% for the intended flop. The data were treated in the same way as in
Study 1.
Table 2. Study 2: Comparison of judged probabilities (organised by strength of starting hand)
Stronger starting hand                              Weaker starting hand                                Strong–weak starting hand
Starting hand     Flop       Mean      True         Starting hand     Flop       Mean      True         True         Mean judged      t(20): True
[pre-flop         cards      judgment  probability  [pre-flop         cards      judgment  probability  probability  probability      vs. judged
probability]                 (%)       (%)          probability]                 (%)       (%)          difference   difference (SD)  difference
Set 1
A|A€ [49.3]       3€3 3^     87.1      74.6         A|3 [17.9]        3€3^A€     91.6      94.5         19.9         4.4 (14.5)       4.88
Discussion
Study 2 confirmed the finding of Study 1 that moderately experienced poker players could provide accurate
judgments of their chances of obtaining the best cards when their opponents’ cards are unknown and two
community cards remain to be drawn. In keeping with Study 1, judgments were tainted by a degree of
overestimation. Moreover, Study 2 also sheds light on the ‘margins’ of performance: illuminating why, in
some instances, performance, even if good, is less than perfect. We illustrate this by reference to the pair-wise
comparison of different hands.
³ In calculating and using this formal measure of starting hand strength we are not presuming that players know these values. However, it
is likely that players’ perception of starting hand strength will have a strong rank correlation with these calculated values, and that
training in poker strategy means that skilful players could have an approximate interval scaling of starting hand strength that corresponds
closely to this. This therefore serves as a valuable measure for assessing the cues that players attend to.
STUDY 3
Studies 1 and 2 show remarkably accurate probability judgments by poker players. However, we conducted
one further study to determine whether we had unwittingly helped our participants to provide a well-
calibrated set of judgments. In actual play, a poker player need only consider the strength of his cards, or his
chances of winning, for one hand at a time. However, in Study 1 participants were presented with 25 hands on
a single page, and in Study 2 participants saw three or four hands per page. One possibility is that this
encouraged explicit comparisons between hands—and that simply by ranking the hands in a somewhat
appropriate manner and spreading their judgments over the range of possible responses participants were able
to provide fairly accurate assessments. Moreover, in Study 1, participants may have been ‘cued’ to consider
the number of opponents as a relevant variable by the fact that the number of opponents was listed explicitly
for the Pre-flop and Jack-Ten tasks. Therefore, Study 3 used a more ecologically valid presentation format for
the task information (that mimicked online poker), made no explicit reference to variation in the number of
opponents, and showed participants only one hand at a time. The key findings of Studies 1 and 2 were then
re-examined in this new data set, collected using an information presentation format that more closely approximated
actual poker play.
Method
Participants
28 male poker players from the University of Essex Poker Society with a mean age of 20.2 years (range 18–
24) participated. The median time since learning to play was 28 months (IQR 14–42), and the median
frequency of play was 6 times per month (IQR 4–10) for live play and 10 times per month (IQR 4–20) for
online play. Eleven participants reported participating in either Study 1 or 2—no one had participated in both
studies.
Figure 4. Study 3: presentation of task information: (a) Pre-flop judgments and (b) post-flop judgments presented on two
successive pages.
making their judgment. Thus, Study 3 represents a more ecologically valid task than Studies 1 and 2 in three
respects: (1) Task-relevant information is presented as it is in actual online play, (2) information becomes
available sequentially as in actual play, and (3) hands are considered one at a time, so removing the possibility
that simultaneous presentation of hands enhances judgment by making key variables (e.g. number of
opponents) more salient or allowing explicit ranking among hands. Note also that this procedure meant that
participants were required to switch between pre-flop and post-flop judgments—just as they would be if they
were making a variety of judgments over the course of a session of play. This also diminished the similarity
between successive judgments, making it harder to make useful direct comparisons between successive
hands.
use of anchoring on the previous judgment for identical cards, which would hardly ever occur in actual play,
and (c) in general, a reasonable mixing of the order of hands occurred between participants.
Participants were instructed to work through the booklets in the order that they were presented to them (i.e.
as randomised by the experimenter) and to work through each booklet in order. They were asked to work
carefully through the booklets, providing an estimate each time it was asked for. Participants were asked not
to go back and change answers once they had moved on to the next scenario. This was to further discourage
artificially inflating accuracy by allowing explicit comparison or ranking among hands. After completing the
task, participants provided demographic information (age, sex and degree scheme) and information about
their poker playing (when they learned, and frequency of live and online play). Data were collected at club
poker tournaments, with players completing the task when they had finished playing or during breaks
between play. Participants took approximately 10–30 minutes to complete the task, and received a fixed
participation fee of UK£4 (approximately US$6 at the time of the study).
Results
Five participants missed out one estimate each, which was presumably the consequence of turning two pages
at once, or not realising that the back cover of a booklet sometimes showed a hand. All remaining data were
complete, and were analysed as per Studies 1 and 2.
Figure 5. Study 3: Scatter-plots to illustrate the accuracy of probability estimates (dotted line is the identity line
for reference): (a) Mean estimated probability (averaged within hands across participants) plotted against true probability.
(b) Example individual participants representing the: (i) Lower quartile (left) and (ii) upper quartile (right) for accuracy.
Example participants were determined according to the correlation between judged and true probabilities (upper pair of
participants) and absolute deviation (lower pair)
Moreover, the negative t-values generally have greater magnitude than in Study 2—thus there are several
clear-cut instances of hands with weaker starting cards being over-estimated relative to other hands, contrary
to the general pattern of Study 2. Overall, with equal numbers of effects in each direction, the results shown in
Table 3 are equivocal with respect to over/under-estimation on the basis of starting hand strength.
Discussion
Study 3 provides strong evidence that the good-to-excellent accuracy in probability judgment seen in Studies
1 and 2 was not simply an artefact of the presentation format used in those studies. Using a presentation
format based on online poker and asking participants to consider only one judgment at a time, we found
similar levels of accuracy to Study 1 as assessed by the correlation between judged and true probabilities
across hands and by absolute deviation. In fact, judgment bias was slightly lower in this study.
In relation to re-examining the primary research question of Study 2, these data are equivocal. With
respect to Set 1 (A|A€3^3€3 variants) and Set 3 (J J^10 10€2| variants) we find general
concordance with what would be predicted if participants place too much weight on their starting hands—but
the effect is less marked than in Study 2. Therefore, like the findings regarding bias discussed above,
judgments for these cards, whilst not perfect, were slightly more appropriate than in previous studies. With
respect to Set 4 (3 K J 9 10^ variants), we again found results contrary to the pattern observed in Sets 1
and 3—and this discrepancy was more marked than in Study 2. It is therefore clear that overweighting the
starting hands is certainly not a universal feature. That said, every significant effect in Set 4 seems to be
largely driven by the overestimation of 10^3 paired with a flop of K 9 J. So, as discussed in Study 2, we
should retain the working hypothesis that ‘attractive’ flops may have an effect on judgment that is similar to
what we have generally seen with strong starting cards. Certainly, this would be consistent with Windschitl,
Kruger and Simms (2003) who found that competitors fail to fully appreciate that factors that assist their
performance (such as a flop offering good prospects for a highly ranked hand at the conclusion of the hand)
may not improve their chances of winning when such factors also assist their opponents.

Starting hand     Flop       Mean      True         Starting hand     Flop       Mean      True         True         Mean judged      t(27): True
[pre-flop         cards      judgment  probability  [pre-flop         cards      judgment  probability  probability  probability      vs. judged
probability]                 (%)       (%)          probability]                 (%)       (%)          difference   difference (SD)  difference
Set 3
10 10€ [30.1]     J^J 2|     41.5      38.2         J^10 [22.2]       10€2|J     69.9      63.1         24.9         28.4 (19.7)      –0.95
J J^ [33.8]       10€10 2|   50.7      39.2         J^10 [22.2]       10€2|J     69.9      63.1         23.9         19.2 (18.6)      +1.33
J J^ [33.8]       10€10 2|   50.7      39.2         10 10€ [30.1]     J^J 2|     41.5      38.2         –1.0         –9.2 (21.8)      +1.99
Set 4
3 K [19.4]        J 9 10^    35.3      40.0         10^3 [11.7]       K 9 J      21.0      11.8         –28.2        –14.3 (15.4)     –4.77
9 10^ [19.6]      3 J K      29.1      26.4         9 3 [14.9]        K J 10^    32.1      32.7         6.3          3.0 (19.5)       +0.89
9 10^ [19.6]      3 J K      29.1      26.4         10^3 [11.7]       K 9 J      21.0      11.8         –14.6        –8.1 (15.3)      –2.23
9 3 [14.9]        K J 10^    32.1      32.7         10^3 [11.7]       K 9 J      21.0      11.8         –20.9        –11.2 (16.5)     –3.02
3 K [19.4]        J 9 10^    35.3      40.0         9 3 [14.9]        K J 10^    32.1      32.7         –7.3         –3.1 (19.0)      –1.16
9 10^ [19.6]      3 J K      29.1      26.4         3 K [19.4]        J 9 10^    35.3      40.0         13.6         6.2 (20.4)       +1.93
p < .05; p < .01; p < .001.
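The t-values in the Study 3 comparison table appear consistent with a one-sample t-test of participants' judged differences against the true difference for each pair of hands. The Python sketch below illustrates that reading; it is an inference from the reported numbers, not a method stated in this passage. The function name `t_true_vs_judged` is hypothetical, and differences are taken as the weaker hand's value minus the stronger hand's value, an assumption based on the arithmetic of the table rows.

```python
import math


def t_true_vs_judged(true_diff, mean_judged_diff, sd_judged_diff, n):
    """One-sample t statistic comparing judged differences with the true difference."""
    standard_error = sd_judged_diff / math.sqrt(n)
    return (true_diff - mean_judged_diff) / standard_error


N_PARTICIPANTS = 28  # Study 3, hence t(27)

# (pair, true difference, mean judged difference, SD of judged differences)
rows = [
    ("Set 3: pair of tens vs jack-ten", 24.9, 28.4, 19.7),
    ("Set 3: pair of jacks vs jack-ten", 23.9, 19.2, 18.6),
    ("Set 3: pair of jacks vs pair of tens", -1.0, -9.2, 21.8),
    ("Set 4: king-three vs ten-three", -28.2, -14.3, 15.4),
]

t_values = [t_true_vs_judged(true, judged, sd, N_PARTICIPANTS)
            for _, true, judged, sd in rows]
for (label, *_), t in zip(rows, t_values):
    print(f"{label}: t(27) = {t:+.2f}")
```

On this reading, a positive t means the judged difference fell short of the true difference, and a negative t means the judged difference was more positive than the true one, so the weaker starting hand was overestimated relative to the stronger one. The four values computed here agree with the tabled magnitudes to within about 0.01.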
GENERAL DISCUSSION
folded, I usually know whether my cards would have beaten the winner’s hand (because the winning player’s
cards are shown unless a player wins by all other players folding). However, I never know when another
player would have won if they had not folded without showing their cards.⁴ Therefore, players’ feedback
concerning how frequently they end up with the best hand can be viewed as a biased sample of information. If
players fail to adjust for this bias, they will overestimate the chances of obtaining the strongest hand around
the table. Many studies imply that people have difficulty identifying when their experience is biased, or, if
they are aware of this, that they adjust insufficiently for sample bias (Fiedler, 2000; Juslin & Fiedler, 2006).
A third explanation for success in meteorologists’ forecasts is training in probabilistic thinking. Our
student participants came disproportionately (though certainly not exclusively) from more numerate
disciplines. However, studying a numerate academic discipline is no guarantee of training in probabilistic
thinking. Moreover, we found no difference in accuracy between those from more/less mathematical degree
schemes. Some participants may have read books on the theory and strategy of poker (e.g. Harrington &
Robertie, 2006)—but there is no a priori reason to suppose that our participants could be considered experts
in probabilistic reasoning.
Fourth, a failure to take account of base-rate information (i.e. the relative frequency of the target event) is a
contributory factor in many examples of poor probability judgment (e.g. Rakow, Harvey, & Finer, 2003).
Seemingly participants adopt a case-based approach: making intuitive judgments based on the individual
characteristics of the instance at hand, and neglecting to take account of the class or classes that the instance
belongs to, from which base rate information can be estimated or derived (see Griffin & Brenner, 2004).⁵ In
contrast, historical weather records provide meteorologists with explicit base rate information for the weather
event that they are assessing. Our poker players did not receive explicit base rate information. The
mean true probability was 36.4% across the 75 hands in Study 1 and 37.2% across the 35 hands in Study 3, but we gave
participants no indication of this base rate for either study. One could argue that this is a good example of a
task where there is more than one relevant base rate (Cohen, 1981). Players should also have regard to what
we might term the a priori base rate: the probability of holding the best cards at the conclusion of the hand
based solely on the number of opponents (i.e. the ‘equiprobability anchor’—e.g. 50% with 1 opponent, or
16.7% with 5 opponents). The average value of this base rate was 20.8% in Study 1; however, neither this
figure nor the a priori base rate for individual hands was explicitly stated (though participants could have
chosen to calculate this themselves). The success of our participants in providing accurate probability
judgments implies that they did have a broadly appropriate regard for base rate information, even though this
information was not explicit. However, as discussed in Study 1, it may be that the a priori base rate is more
salient—and therefore incorporated more effectively into judgments—when it varies from hand to hand as in
the Pre-flop and Jack-Ten tasks, and, albeit less explicitly, in Study 3. Indeed, several studies have found that
manipulating base rates within subjects reduces base-rate neglect (e.g. Birnbaum & Mellers, 1983; Fischhoff,
Slovic, & Lichtenstein, 1979—for a review see Koehler, 1996).
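The average a priori base rate of 20.8% quoted for Study 1 can be reproduced under a plausible reading of the design. The Python sketch below assumes 10 hands at each of 1, 3, 5, 7 and 9 opponents across the Pre-flop and Jack-Ten tasks (50 hands) plus 25 Flop hands at 5 opponents. This breakdown is inferred from the hand counts and degrees of freedom reported for Study 1 and is not stated explicitly here; the names `anchor` and `hands` are illustrative.

```python
def anchor(n_opponents):
    """A priori base rate (%): every player equally likely to hold the best cards."""
    return 100.0 / (n_opponents + 1)


# Assumed Study 1 design: one list entry (number of opponents) per hand.
varied_tasks = [k for k in (1, 3, 5, 7, 9) for _ in range(10)]  # Pre-flop + Jack-Ten
flop_task = [5] * 25                                            # Flop task, fixed at 5
hands = varied_tasks + flop_task

mean_anchor = sum(anchor(k) for k in hands) / len(hands)
print(len(hands), round(mean_anchor, 1))
```

Under this assumed design the mean anchor across the 75 hands rounds to 20.8%, well below the mean true probability of 36.4% reported for the same hands.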
⁴ To adapt the words of Donald Rumsfeld: ‘There are known knowns: Things that we know have occurred. And there are known
counterfactuals: Things that we know would have occurred. But there are also unknown counterfactuals: Things that we don’t know
would have occurred’.
⁵ Note that ‘case-based reasoning’ has a particular meaning here (the evaluation of evidence based on the features of an individual case
as opposed to distributional features), which differs from the usage of the phrase ‘case-based’ in some other contexts.
The fifth explanation for successful probability judgment that Koehler et al. (2002) discuss is the
availability of accurate cues for judgment combined with the presence of moderate base rates (event
probabilities around 50%). Having genuinely predictive cues allows meteorologists to discriminate
effectively between situations where an event is more or less likely to occur. In the case of poker, the
information upon which judgments ought to rely (the starting cards, the community cards and the number of
players) is accessible and ought to be obvious, something which may not be the case for many diagnostic or
prognostic judgments in medicine. However, exactly how this information might best be segmented or
combined to generate relevant cues is a far from trivial problem. For instance, in Study 2 we illustrated how
the chance of making the strongest hand at the table does not depend solely upon the pool of cards that a
player can build their hand from—but depends on how that pool is divided between the player’s hand and the
community cards, and therefore upon how opposing players can use those cards, and consequently also upon
the number of opponents. Our design allows us to consider two ‘isolated cues’: the equiprobability anchor
derived from the number of players, and the starting hand strength. For Study 1, the correlation between the
equiprobability anchor and the true probability was r = .39. To provide a formal measure of card strength
we determined the probability of obtaining the best cards against 1 opponent for each starting hand
(i.e. before the flop is dealt). These values were also moderately correlated with the true probability (r = .38).
Thus, when treated in isolation these cues are clearly helpful, but are not powerful enough to explain the very
strong correlations between judged and true probabilities that most of our participants achieved: each of these
cues accounts for around 15% of the variance in true probability, yet most participants’ judgments could
account for more than 70% of this variance. This implies that poker players do not simply rely upon simple
cue information, but combine or aggregate information effectively from different sources to make
appropriate judgments. It therefore seems likely that they have used information in a configural fashion, much
as Busey and Vanderkolk (2005) report in the case of experts in fingerprint identification. Koehler et al.
(2002) point out that in most of the examples of very good calibration in full-range tasks the event base rate is
close to 50%. This is certainly true for Keren (1987) where bridge players had a 55% chance of making a
contract, for the most accurate set of physicians’ judgments reported by Arkes et al. (1995) where patients had
a 45% chance of surviving 6 months, and for the highly accurate predictions of sports writers obtained by
Onkal and Ayton (1997, reported in Koehler et al., 2002) with 56% of home team wins in a set of English
soccer matches. Thus again, our data are unusual in that successful probability judgments are made despite
moderately low base rates (36–37%). Consequently, not only were poker players’ probability judgments
unusually good—they were surprisingly good given the task characteristics.
Thus, although some features of the task environment may have been conducive to good judgment, no
single one of these five possible reasons provides a definitive explanation of why our participants succeeded
(in probability judgment) when others have frequently failed. A sixth possibility may have contributed to the
success of our participants—or, perhaps more correctly, to the failure of others. Our task lent itself to the use
of simulation to generate accurate estimates of the true probabilities for each hand. This is not normally
possible in calibration research, where one usually relies only upon observed outcomes, using the relatively
small set of outcomes for which judgments were made.
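To illustrate the kind of simulation referred to here, the following is a minimal Monte Carlo sketch in Python. It is not the calculator software used for the studies. It assumes that "obtaining the best cards" means strictly beating every opponent at showdown with all players remaining in the hand (counting ties as failures is our assumption), and it writes tens as T rather than 10. The names rank5, best7 and win_probability are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"
DECK = [r + s for r in RANKS for s in "cdhs"]


def rank5(cards):
    """Return a key that orders five-card hands by poker strength."""
    vals = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    counts = Counter(vals)
    # Ranks ordered by multiplicity, then by rank (e.g. trips before the pair).
    groups = sorted(counts, key=lambda v: (counts[v], v), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({c[1] for c in cards}) == 1
    straight = len(counts) == 5 and vals[0] - vals[4] == 4
    if vals == [12, 3, 2, 1, 0]:  # A-2-3-4-5: the ace plays low
        straight, groups = True, [3, 2, 1, 0, -1]
    if straight and flush:
        category = 8
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return (category, groups)


def best7(cards):
    """Best five-card hand that can be made from seven cards."""
    return max(rank5(five) for five in combinations(cards, 5))


def win_probability(hole, n_opponents, trials=2000, seed=0):
    """Estimate P(strictly best hand) pre-flop against random opponents."""
    rng = random.Random(seed)
    rest = [c for c in DECK if c not in hole]
    wins = 0
    for _ in range(trials):
        drawn = rng.sample(rest, 5 + 2 * n_opponents)
        board = drawn[:5]
        mine = best7(list(hole) + board)
        others = [best7(drawn[5 + 2 * i:7 + 2 * i] + board)
                  for i in range(n_opponents)]
        if mine > max(others):  # ties are counted as not winning
            wins += 1
    return wins / trials
```

Calling win_probability(("As", "Ac"), 1) gives an estimate that can be set against the corresponding pre-flop entry in the Appendix. The sampling error shrinks as the number of trials grows, which is the advantage over relying on a small set of observed outcomes.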
Copyright # 2009 John Wiley & Sons, Ltd. Journal of Behavioral Decision Making, 23, 496–526 (2010)
DOI: 10.1002/bdm
recognising that the support for his hand is weak, he may underestimate the strength of support for his
opponents’ hands.
How might the evaluation of the strength of evidence or the support for a hypothesis proceed in the early
stages of a hand of poker? Three quarters of our Study 1 participants produced judgments that correlated
r > .85 with the true probabilities (and three-quarters r > .81 in Study 3)—this could only be achieved by
evaluating and integrating multiple sources of information. However, inevitably, poker players were not
perfect in their judgment: the correlation between judged and true probabilities was always less than 1, and
there was a tendency towards overestimation for low and moderate probabilities. According to the strength-
weight model, these imperfections can be seen as deriving from a tendency to anchor on the value of the
starting hands, but to adjust (somewhat insufficiently) for the a priori base rate. According to support theory,
these imperfections can equivalently be seen as a tendency to evaluate the strength of one’s own hand, whilst
paying insufficient regard to the potential strength of one’s opponents’ hands. At least four lines of evidence
from our studies support this conclusion. First, in Study 1, overestimation was greatest for the
strongest starting hands. Second, for rearrangements of the same five cards in Study 2, overestimation
relative to other hands was predicted by differences in starting hand strength (though this was not so in Study
3). Third, accuracy was worst in the Flop task where the a priori base rate was not salient. Fourth, analysis of
Study 1 reveals that probability judgments across the 75 hands were more strongly related to starting hand
strength than they should have been, and less strongly related to the equiprobability anchor than they should
have been. This is shown by the fact that the correlation between starting hand strength and true probability is
.38, but the correlation between starting card strength and the mean judged probability is slightly higher at .49
(see Footnote 3). In contrast, the correlation between the equiprobability anchor and the true probability is
.39, but the correlation between the equiprobability anchor and the mean judged probability is slightly lower at
.32. This finding is replicated for the 35 hands used in Study 3. Here the correlation between card strength and
true probability is .37, but the correlation between starting card strength and mean judged probability is
higher at .50. The correlation between the equiprobability anchor and true probability is .41, but the
correlation between the equiprobability anchor and mean judged probability is again lower at .24. Equivalent
results are obtained from analysis of each individual participant: the median correlation between judgment
and starting hand strength was .45 in Study 1 and .42 in Study 3, whereas the median correlation between
judgment and the equiprobability anchor was .29 in Study 1 and .21 in Study 3. Analysis of these correlation
coefficients for individuals shows that most participants’ judgments were more closely related to starting
hand strength than they were to the equiprobability anchor (78% of participants in Study 1 and 79% in Study
3)—even though these cues are similarly related to the true probability. Note that there is little apparent
difference in cue usage between the less and more ecologically valid versions of the task (Study 1 vs. Study
3), though the slightly lower weight given to the equiprobability anchor in Study 3 could imply a little less
regard for the number of opponents when this variable is not explicitly mentioned.
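The cue-usage comparison above can be sketched for the 25 pre-flop hands listed in the Appendix. This is an illustration only: it takes the equiprobability anchor to be 1/(number of opponents + 1), which is our reading of "derived from the number of players", it measures starting hand strength as the true probability against one opponent, and it uses the Study 1 mean estimates. Because it covers only a subset of the 75 Study 1 hands, its correlations need not match those reported in the text. The names pearson and cue_correlations are hypothetical.

```python
from math import sqrt

# (starting hand, opponents, true %, Study 1 mean estimate %), pre-flop task.
PRE_FLOP = [
    ("6c6d", 1, 62.7, 62.5), ("6c6d", 3, 31.5, 47.8), ("6c6d", 5, 20.2, 36.0),
    ("6c6d", 7, 15.6, 26.5), ("6c6d", 9, 13.3, 19.5),
    ("3d8h", 1, 34.8, 30.8), ("3d8h", 3, 15.9, 21.3), ("3d8h", 5, 10.1, 15.6),
    ("3d8h", 7, 7.4, 11.6), ("3d8h", 9, 6.0, 8.3),
    ("KhJh", 1, 61.5, 62.6), ("KhJh", 3, 37.0, 50.3), ("KhJh", 5, 27.6, 40.6),
    ("KhJh", 7, 22.1, 32.2), ("KhJh", 9, 18.5, 25.7),
    ("3c4c", 1, 35.7, 41.8), ("3c4c", 3, 20.5, 32.3), ("3c4c", 5, 15.2, 26.1),
    ("3c4c", 7, 12.6, 19.5), ("3c4c", 9, 11.0, 14.9),
    ("AsAc", 1, 84.9, 83.7), ("AsAc", 3, 63.8, 71.8), ("AsAc", 5, 49.3, 61.1),
    ("AsAc", 7, 38.9, 52.5), ("AsAc", 9, 31.3, 44.6),
]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cue_correlations(rows=PRE_FLOP):
    """Correlate each isolated cue with the true and the judged probability."""
    # Starting hand strength: true probability against a single opponent.
    strength_of = {hand: true for hand, n, true, _ in rows if n == 1}
    anchor = [100 / (n + 1) for _, n, _, _ in rows]  # assumed anchor definition
    strength = [strength_of[hand] for hand, _, _, _ in rows]
    true = [t for _, _, t, _ in rows]
    judged = [j for _, _, _, j in rows]
    return {
        "strength_vs_true": pearson(strength, true),
        "strength_vs_judged": pearson(strength, judged),
        "anchor_vs_true": pearson(anchor, true),
        "anchor_vs_judged": pearson(anchor, judged),
    }
```

Over-reliance on a cue would show up as a judged correlation that exceeds the corresponding true correlation, and under-reliance as the reverse.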
The previous paragraph gives an account of why our poker players’ judgments are sub-optimal—but, in a
sense, it also provides the key to understanding why they were not especially far from being optimal. Players
may anchor on (or overweight) the strength of their current hand or their starting hand—but this information
is a valid cue—it is not an arbitrary anchor—and the fact that they can do so shows a measure of expertise in
assessing hands. Players may have insufficient regard for the base rate or take insufficient note of alternative
possibilities (e.g. in their mental simulations)—but they do nonetheless make some appropriate adjustment
for these factors, which participants in some other studies seem to have taken far too little notice of
(e.g. Buehler, Griffin, & Ross, 2002; Kahneman & Tversky, 1973). There are a few features of the game of
poker that might have encouraged participants to pay some (if slightly insufficient) attention to these factors
that people tend to neglect in other situations. First, poker is a zero-sum competitive game. If I win, all my
competitors lose—if one of my opponents wins, then I lose. It is likely that this makes it more natural to
evaluate the support for the hypothesis that I win relative to the support for the alternate hypothesis that one of
my opponents wins without overlooking significant amounts of support for the alternate hypothesis. Second,
as already discussed, the a priori base rate ought to be fairly salient—I can see the number of players
remaining in the game, and informal or formal calculations of whether it is worth betting (i.e. paying) to
remain in the hand will probably encourage some reflection on this component of the game (for a description
of ‘counting outs’ and estimating ‘pot odds’ see Harrington & Robertie, 2006, pp. 119–144). Furthermore, the
zero-sum nature of the game ought to make it clear why the number of opponents is relevant to the calculation
of the probability of obtaining the best cards—imbuing the a priori base rate with causal properties, which
often decreases the degree of base rate neglect (Tversky & Kahneman, 1982). Third, even if a player does not
attend consciously to this base rate, as soon as he simulates outcomes for multiple opponents (ideally for all of
his opponents), and makes some adjustment to his beliefs on that basis, then he is making an adjustment
equivalent to some adjustment for base rate. Similarly, even if a player fails to simulate possible opponents’
hands, a mechanical adjustment on the basis of his base rate probability of obtaining the best cards is
equivalent to incorporating the fact that with more opponents in the game there are more chances for at least
one of them to acquire a strong hand.
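The last point can be made concrete with a deliberately crude calculation. Suppose, as a simplifying assumption of ours that is not part of the studies, that each opponent independently fails to beat the player with probability p1, the probability of holding the best cards against a single opponent. The probability of holding the best cards against n opponents is then approximately p1 raised to the power n. The sketch below sets this approximation beside the true pre-flop probabilities for AsAc from the Appendix. The function names are hypothetical.

```python
# True pre-flop probabilities for AsAc by number of opponents (Appendix).
ACES_TRUE = {1: 0.849, 3: 0.638, 5: 0.493, 7: 0.389, 9: 0.313}


def independent_opponents_estimate(p_one, n_opponents):
    """P(best hand) if each opponent were an independent trial (assumption)."""
    return p_one ** n_opponents


def compare(true_by_n=ACES_TRUE):
    """Map each number of opponents to (approximation, true probability)."""
    p_one = true_by_n[1]
    return {n: (independent_opponents_estimate(p_one, n), true)
            for n, true in true_by_n.items()}
```

For these values the approximation falls below the true probability as opponents are added. That is unsurprising, because opponents' hands are not independent when all players share the community cards, but it shows why each additional opponent must lower the probability of holding the best cards.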
Future research
Texas Hold ’em Poker offers valuable possibilities for investigating behavioural decision making and
expertise. We considered only the first stages of the game, and there is considerable scope for extending the
tasks that we used to examine how beliefs update as new information becomes available as hands are played
out (Hogarth & Einhorn, 1992). Also, by asking players to judge the probability of holding the strongest hand
assuming that all players remain in the game, we by-passed some of the more subjective or intuitive (and,
arguably, most interesting) features of the game: confidence, bluffing and betting behaviour. An
understanding of these features would require different kinds of studies, at least some involving actual play.
The findings from our judgment task provide clear predictions concerning betting behaviour. Overestimation
was greatest when starting hands were strong—therefore, we would predict a tendency to adopt sub-optimal
betting strategies when holding such hands. However, it would be important to investigate whether betting
decisions can indeed be predicted from players’ explicitly stated beliefs.
sum, the game has enough regularity to be comprehensible, but enough complexity and uncertainty to be
interesting. This is presumably why the game has attracted millions of aficionados, and why it is a fruitful
area for decision research.
APPENDIX
Appendix A1.
                        Number of  True         Mean estimate   Mean estimate
Starting hand Flop      opponents  probability  (Study 1)       (Study 3)
Pre-flop task
6c6d          -         1          62.7         62.5            56.9
6c6d          -         3          31.5         47.8
6c6d          -         5          20.2         36.0
6c6d          -         7          15.6         26.5
6c6d          -         9          13.3         19.5            24.8
3d8h          -         1          34.8         30.8            31.0
3d8h          -         3          15.9         21.3
3d8h          -         5          10.1         15.6
3d8h          -         7          7.4          11.6            10.5
3d8h          -         9          6.0          8.3
KhJh          -         1          61.5         62.6            58.8
KhJh          -         3          37.0         50.3
KhJh          -         5          27.6         40.6
KhJh          -         7          22.1         32.2            31.7
KhJh          -         9          18.5         25.7
3c4c          -         1          35.7         41.8
3c4c          -         3          20.5         32.3            20.1
3c4c          -         5          15.2         26.1
3c4c          -         7          12.6         19.5            14.9
3c4c          -         9          11.0         14.9
AsAc          -         1          84.9         83.7            78.3
AsAc          -         3          63.8         71.8
AsAc          -         5          49.3         61.1
AsAc          -         7          38.9         52.5
AsAc          -         9          31.3         44.6            48.2
Flop task
AsAc          3s3d3h    5          74.6         85.7
AsAc          Kh6s9d    5          46.8         68.3
AsAc          10c4hQh   5          43.6         65.2
AsAc          2s8cAh    5          88.4         86.9
AsAc          Jd5c9s    5          44.8         68.2
3d8h          10h7s8c   5          14.5         36.1
3d8h          6h7c2c    5          5.4          20.0
3d8h          Qh7d7h    5          3.7          15.5
3d8h          Ah3h8s    5          54.3         57.2
3d8h          5h6h7h    5          26.0         52.1
KhJh          10h9sQh   5          87.2         88.0            86.8
KhJh          10d3h9h   5          48.0         52.6            49.1
KhJh          6c4c5s    5          5.6          28.2            15.8
KhJh          Ac4d9s    5          9.7          23.6            17.2
KhJh          Js5h8h    5          59.9         63.0            64.1
3c4c          8c5c6c    5          65.3         69.3
3c4c          KhJdJs    5          2.0          11.1
3c4c          As5c7h    5          27.1         24.0
3c4c          Js2s6c    5          16.8         18.9
3c4c          10d3h8d   5          12.0         24.7
6c6d          AsAc6h    5          83.8         82.9
6c6d          8c6h7d    5          53.0         69.9
6c6d          Ah3h9h    5          6.3          28.2
6c6d          Qc9sKs    5          7.3          23.0
6c6d          Jd5c7s    5          12.8         27.2
Jack-10 task
10sJh         Kh4d7h    1          41.2         38.5
10sJh         Kh4d7h    3          17.9         24.6
10sJh         Kh4d7h    5          12.0         17.3
10sJh         Kh4d7h    7          9.0          12.3
10sJh         Kh4d7h    9          7.2          7.5
10sJh         Ad6c6d    1          36.2         36.6            31.5
10sJh         Ad6c6d    3          11.8         24.1            19.1
10sJh         Ad6c6d    5          6.0          16.7            12.5
10sJh         Ad6c6d    7          3.5          12.0            11.9
10sJh         Ad6c6d    9          2.3          7.3             11.8
10sJh         2cJd10h   1          89.5         85.0
10sJh         2cJd10h   3          74.2         74.1
10sJh         2cJd10h   5          63.1         63.4
10sJh         2cJd10h   7          54.8         57.0
10sJh         2cJd10h   9          48.4         50.3
10sJh         QhAcKs    1          93.1         94.6
10sJh         QhAcKs    3          89.0         87.8
10sJh         QhAcKs    5          85.5         81.1
10sJh         QhAcKs    7          82.4         76.4
10sJh         QhAcKs    9          79.6         71.9
10sJh         10d3dQh   1          73.2         61.5            49.1
10sJh         10d3dQh   3          43.4         48.6            34.4
10sJh         10d3dQh   5          27.5         38.4            26.7
10sJh         10d3dQh   7          18.9         29.5            22.8
10sJh         10d3dQh   9          14.0         23.4            21.9
Cards: J = jack, Q = queen, K = king, A = ace.
Suits: c = clubs, d = diamonds, h = hearts, s = spades.
This information is given for Study 2 hands in Table 2 (Study 2) and Table 3 (Study 3).
ACKNOWLEDGEMENTS
This paper represents an equal collaboration between the two authors. The authors thank Steve Avons,
Michael Dowling and James O’Geran for comments on earlier versions of this work. The final version of this
paper was completed while Tim Rakow was a Visiting Fellow in the School of Psychology at the University
of New South Wales, and supported by a Short Visit Grant from The Royal Society.
REFERENCES
Ariely, D., Au, W. T., Bender, R. H., Budescu, D. V., Dietz, C. B., & Gu, H., et al. (2000). The effects of averaging
subjective probability estimates between and within judges. Journal of Experimental Psychology: Applied, 6, 130–147.
Arkes, H. R., Dawson, N. V., Speroff, T., Harrell, F. E., Alzola, C., & Phillips, R., et al. (1995). The covariance
decomposition of the probability score and its use in evaluating prognostic estimates. Medical Decision Making, 15,
120–131.
Balthasar, H. U., Boschi, R. A. A., & Menke, M. M. (1978). Calling the shots in R&D. Harvard Business Review, 56
(May–June), 151–160.
Birnbaum, M., & Mellers, B. A. (1983). Bayesian Inference: Combining base rate opinions of sources who vary in
credibility. Journal of Personality and Social Psychology, 45, 792–803.
Bolger, F., & Wright, G. (1994). Assessing the quality of expert judgment: Issues and analysis. Decision Support Systems,
11, 1–24.
Brenner, L. A. (2003). A random support model of the calibration of subjective probabilities. Organizational Behavior
and Human Decision Processes, 90, 87–110.
Brenner, L. A., Koehler, D. J., & Rottenstreich, Y. (2002). Remarks on support theory: Recent advances and future
directions. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of intuitive
judgment (pp. 489–509). Cambridge: Cambridge University Press.
Buehler, R., Griffin, D., & Ross, M. (2002). Inside the planning fallacy: The causes and consequences of optimistic time
predictions. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of intuitive
judgment (pp. 250–270). Cambridge: Cambridge University Press.
Busey, T. A., & Vanderkolk, J. R. (2005). Behavioral and electrophysiological evidence for configural processing in
fingerprint experts. Vision Research, 45, 431–448.
Christensen-Szalanski, J. J. J., & Bushyhead, J. B. (1981). Physicians’ use of probabilistic information in a real clinical
setting. Journal of Experimental Psychology: Human Perception and Performance, 7, 928–935.
Cohen, L. J. (1981). Can human irrationality be experimentally determined? Behavioral and Brain Sciences, 4, 317–331.
Dougherty, M. R. P., Gettys, C. F., & Thomas, R. P. (1997). The role of mental simulation in judgments of likelihood.
Organizational Behavior and Human Decision Processes, 70, 135–148.
Epley, N., & Gilovich, T. (2001). Putting the adjustment back in the anchoring and adjustment heuristic. Psychological
Science, 12, 391–396.
Ericsson, K. A., & Smith, J. (1991). Prospects and limits of the empirical study of expertise: An introduction. In K. A.
Ericsson, & J. Smith (Eds.), Toward a general theory of expertise. Cambridge: Cambridge University Press.
Fiedler, K. (2000). Beware of samples! A cognitive-ecological sampling approach to judgment biases. Psychological
Review, 107, 659–676.
Fischhoff, B., Slovic, P., & Lichtenstein, S. (1978). Fault trees: Sensitivity of estimated failure probabilities to problem
representation. Journal of Experimental Psychology: Human Perception and Performance, 4, 330–344.
Fischhoff, B., Slovic, P., & Lichtenstein, S. (1979). Subjective sensitivity analysis. Organizational Behavior and Human
Decision Processes, 23, 339–359.
Gigerenzer, G., Hoffrage, U., & Kleinbölting, H. (1991). Probabilistic mental models: A Brunswikian theory of
confidence. Psychological Review, 98, 506–528.
Griffin, D., & Brenner, L. (2004). Perspectives on probability judgment calibration. In D. J. Koehler, & N. Harvey (Eds.),
Blackwell handbook of judgment and decision making (pp. 177–199). Oxford: Blackwell.
Griffin, D., & Tversky, A. (1992). The weighting of evidence and the determinants of confidence. Cognitive Psychology,
24, 411–435.
Harrington, D., & Robertie, B. (2006). Harrington on Hold ’em: Expert strategy for no-limit tournaments, Vol. I: Strategic
play (1st ed.). Henderson, NV: Two Plus Two Publishing LLC.
Hasher, L., & Zacks, R. (1979). Automatic and effortful processes in memory. Journal of Experimental Psychology:
General, 108(3), 356–388.
Hoerl, A. E., & Fallin, H. K. (1974). Reliability of subjective evaluations in a high incentive situation. Journal of the Royal
Statistical Society, 137, 227–230.
Hogarth, R. M., & Einhorn, H. J. (1992). Order effects in belief updating: The belief adjustment model. Cognitive
Psychology, 24, 1–55.
Hogarth, R. M., & Karelaia, N. (2007). Heuristic and linear models of judgment: Matching rules and environments.
Psychological Review, 114, 733–758.
Johnson, J. E. V., & Bruce, A. C. (2001). Calibration of subjective probability judgments in a naturalistic setting.
Organizational Behavior and Human Decision Processes, 85, 265–290.
Juslin, P., & Fiedler, K. (2006). Information sampling and adaptive cognition. New York: Cambridge University Press.
Kabus, I. (1976). You can bank on uncertainty. Harvard Business Review, 54 (May–June), 95–105.
Kahneman, D., & Lovallo, D. (1993). Timid choices and bold forecasts: A cognitive perspective on risk taking.
Management Science, 39, 17–31.
Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80, 237–251.
Kahneman, D., & Tversky, A. (1982). The simulation heuristic. In D. Kahneman, P. Slovic, & A. Tversky (Eds.),
Judgment under uncertainty: Heuristics and biases (pp. 201–210). Cambridge: Cambridge University Press.
Keren, G. (1987). Facing uncertainty in the game of bridge: A calibration study. Organizational Behavior and Human
Decision Processes, 39, 98–114.
Klayman, J., & Ha, Y.-W. (1987). Confirmation, disconfirmation, and information in hypothesis testing. Psychological
Review, 94, 211–228.
Koehler, J. J. (1996). The base rate fallacy reconsidered: Descriptive, normative and methodological challenges.
Behavioral and Brain Sciences, 19, 1–53.
Koehler, D. J., Brenner, L., & Griffin, D. (2002). The calibration of expert judgment: Heuristics and biases beyond the
laboratory. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of intuitive
judgment (pp. 686–715). Cambridge: Cambridge University Press.
Koriat, A., Lichtenstein, S., & Fischhoff, B. (1980). Reasons for confidence. Journal of Experimental Psychology: Human
Learning and Memory, 6, 107–118.
Lagnado, D. A., & Sloman, S. A. (2004). Inside and Outside Probability Judgment. In D. J. Koehler, & N. Harvey (Eds.),
Blackwell handbook of judgment and decision making (pp. 157–176). Oxford: Blackwell Publishing.
Lichtenstein, S., Fischhoff, B., & Phillips, L. D. (1982). Calibration of probabilities: The state of the art to 1980. In D.
Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 306–334).
Cambridge: Cambridge University Press.
Murphy, A. H., & Winkler, R. L. (1977). Can weather forecasters formulate reliable probability forecasts in meteorology:
Some preliminary results. National Weather Digest, 2, 2–9.
Mussweiler, T., & Strack, F. (1999). Hypothesis testing and semantic priming in the anchoring paradigm: A selective
accessibility model. Journal of Experimental Social Psychology, 35, 136–164.
Poker Pro Labs™ (2007). Holdem Poker Calculator (Program Version 1.0.25). Downloaded from:
http://www.pokerprolabs.com/holdem_poker_calculator/index.html on 12 October 2007.
Poses, R. M., Cebul, R. D., Wigton, R. S., Centor, R. M., Collings, M., & Fleishli, G. (1992). Controlled trial using
computerized feedback to improve physicians’ diagnostic judgments. Academic Medicine, 67, 345–347.
Rakow, T., Harvey, N., & Finer, S. (2003). Improving calibration without training: The role of task information. Applied
Cognitive Psychology, 17, 419–441.
Rottenstreich, Y., & Tversky, A. (1997). Unpacking, repacking, and anchoring: Advances in support theory.
Psychological Review, 104, 406–415.
Shanteau, J. (1992). Competence in experts: The role of task characteristics. Organizational Behavior and Human
Decision Processes, 53, 252–266.
Shanteau, J., & Thomas, R. P. (2000). Fast and frugal heuristics: What about unfriendly environments? Behavioral and
Brain Sciences, 23, 762.
Sklansky, D. (1994). The theory of poker. Henderson, NV: Two Plus Two Publishing LLC.
Stewart, T., Heideman, K., Moninger, W., & Reagan-Cirincione, P. (1992). Effects of improved information on the
components in skill in weather forecasting. Organizational Behavior and Human Decision Processes, 53, 107–134.
Teigen, K. H. (2001). When equal chances = good chances: Verbal probabilities and the equiprobability effect.
Organizational Behavior and Human Decision Processes, 85, 77–108.
TheHendonMob.com (2007). Poker Calculator. Accessed from: http://www.thehendonmob.com/pokercalc/index.html
on 12 October 2007.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive
Psychology, 4, 207–232.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124–1131.
Tversky, A., & Kahneman, D. (1982). Evidential impact of base rates. In D. Kahneman, P. Slovic, & A. Tversky (Eds.),
Judgment under uncertainty: Heuristics and biases (pp. 153–160). Cambridge: Cambridge University Press.
Tversky, A., & Koehler, D. J. (1994). Support Theory: A nonextensional representation of subjective probability.
Psychological Review, 101, 547–567.
Windschitl, P. D., Kruger, J., & Simms, E. N. (2003). The influence of egocentrism and focalism on people’s optimism in
competitions: When what affects us equally affects me more. Journal of Personality and Social Psychology, 85,
389–408.
Authors’ biographies:
James Liley was an undergraduate psychology student at the University of Essex when this research was conducted. He is
currently employed as a Statistical Officer for the UK Government Statistical Service.
Tim Rakow is a Senior Lecturer in Psychology at the University of Essex. His research interests include probability
judgment, decisions from experience and risk communication.
Authors’ address:
James Liley and Tim Rakow, Department of Psychology, University of Essex, Colchester, CO4 3SQ, UK.