
Journal of Behavioral Decision Making

J. Behav. Dec. Making, 23: 335–352 (2010)


Published online 8 July 2009 in Wiley Online Library
(wileyonlinelibrary.com) DOI: 10.1002/bdm.658

Bluffing and Betting Behavior in a Simplified Poker Game
DARRYL A. SEALE* and STEVEN E. PHELAN
University of Nevada Las Vegas, Nevada, USA

ABSTRACT

A pure-strategy, simplified poker (PSP) game is proposed, in which two players draw from a small, discrete set of hands. Equilibrium strategies of the game are described
and an experiment is conducted where 120 subjects played the PSP against a computer,
which was programmed to play either the equilibrium solution or a fictitious play (FP)
learning algorithm designed to take advantage of poor play. The results show that
players did not adopt the cutoff-type strategies predicted by the equilibrium solution;
rather they made considerable "errors" by: Betting when they should have checked,
checking when they should have bet, and calling when they should have folded. There is
no evidence that aggregate performance improved over time in either condition
although considerable individual differences were observed among subjects. Behavioral learning theory (BLT) cannot easily explain these individual differences, and cognitive learning theory (CLT) is introduced to explain the apparent anomalies.
Copyright © 2009 John Wiley & Sons, Ltd.

key words experiment; fictitious play; poker; equilibrium solution; learning algorithm

BLUFFING IN A SIMPLIFIED POKER GAME

Poker has fascinated game theorists from the very birth of the field (Borel, 1938; von Neumann &
Morgenstern, 1947). Von Neumann and Morgenstern, in particular, were able to derive an analytical solution
for a simplified poker game that required players to draw a continuous value from a deck in the range [0, 1]
and then bet on whether they had the highest value. Counter-intuitively, the solution required players to bet on the highest and lowest values but make no bet (i.e., check) on intermediate values. Studies on the same game with human subjects find that they seldom (if ever) play the optimal strategy, appearing instead to engage in a strategy best characterized as a form of payoff matching (Bearden, Schulz-
Mahlendorf, & Huettel, 2005; Rapoport, Erev, Abraham, & Olson, 1997).
Game theory has a general presumption that even when humans do not play optimal strategies, they can
learn them (e.g., Roth & Erev, 1995). Previous studies on probability matching have demonstrated that the

* Correspondence to: Darryl A. Seale, College of Business, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV
89154-6009, USA. E-mail: dseale@unlv.nevada.edu




behavior can be extinguished in favor of the optimal strategy given enough motivation, feedback on
outcomes, and learning time (Shanks, Tunney, & McCarthy, 2002; Vulkan, 2000). However, prior research on
gambling behavior (specifically blackjack) has found that regular casino players do not learn to play the
optimal strategy over time, despite playing several hours a day and investing relatively large sums of money
(Wagenaar, 1988).
The question of whether players can learn to play an optimal strategy in a simplified poker game (and
under what conditions) is thus an empirical question. In this paper, human subjects play a computer opponent
playing either the optimal solution or a more aggressive strategy that attempts to exploit sub-optimal play.
Our game simplifies the decision tasks observed in previous experimental studies of bluffing in poker games
by requiring only pure strategies in equilibrium rather than mixed strategies. Games with pure strategy
equilibria give decision makers their best chance to learn optimal play and aid experimenters in characterizing
deviations from such play. In line with previous research, we expect that players in the pure-strategy,
simplified poker game (PSP) will initially undertake some form of payoff matching but will then learn to play
the optimal strategy over time. Players should also learn faster in the PSP game than in other simplified poker
games which involve mixed strategies, and should also learn faster when facing an opponent who explicitly
exploits their errors.
In the next section we discuss the mechanics of simplified poker games and previous findings. We then
describe the specifics of the PSP game and its solution. This is followed by a description of the experimental
methods and the results of those experiments. The paper concludes with a discussion of the results, including
limitations and future research directions.

POKER AND GAME THEORY

To date, only the simplest forms of poker games have been solved. One game proposed by Borel, and another
by von Neumann and Morgenstern, have received the most attention in the literature. Borel (1938) proposed
and solved a two-player game where, following an ante of one unit, players are informed of the strength of
their hands—a random number drawn from the interval [0, 1]. The first player, designated as P1, can bet or
fold following inspection of her hand. If she folds the second player, P2, wins the pot (both ante amounts). If
P1 bets, P2 has the option of calling or folding. It is interesting to note that the solution to the Borel game,
which depends only on the ratio of ante and bet amounts, does not require a player to bluff with her worst
hands. The solution establishes a single threshold for P1 where hands below the threshold are folded and
hands above the value are bet. In addition, the value of this game is negative; it favors P2. This simplified
version of poker has been extended by Sakaguchi and Sakai (1981), who generalized the game to allow for
negative dependence between the values of the players' hands; by Bellman and Blackwell (1949), who showed that "bluffing" need not be conducted by mixing between pure strategies; and by Karlin and Restrepo
(1957), who considered multiple stages of betting.
The two-player poker game proposed by von Neumann and Morgenstern (1947), hereafter referred to as
the von Neumann game, is similar to that of Borel’s except that P1 chooses between bet or check, rather than
between bet or fold. If P1 checks, P2 is not required to make a decision, rather, both hands are revealed and the
highest hand wins the pot. If P1 bets, then P2 decides between fold and call (see Figure 1 for an example of
betting tree for the von Neumann game). The solution to the von Neumann game, similar to that of Borel’s,
depends only on the ratio of ante to bet amounts. The optimal policy requires P1 to establish two cutoff points,
a and b, where P1 bets with hands valued less than a or greater than b. The optimal policy for P2 is to establish
a single cutoff point c, where 0 < a < c < b < 1, and where P2 folds with hands valued less than c and calls
otherwise. Thus, P1 should establish three separate zones for bluffing, checking and betting, and P2 should
establish two zones, one for folding and one for calling (see Figure 2). Perhaps the most notable result of the
von Neumann and Morgenstern solution is that P1 is required to bluff (bet) with her worst hands, while


Figure 1. Betting tree for the von Neumann game

Figure 2. Bet, check, fold, and call zones for P1 and P2

checking on stronger hands. The rationale for bluffing is explained by two possible motives: The first is to
give a false impression of strength when holding a weak hand, and the second is to give a false impression of
weakness when holding a strong hand. If P1 was known not to bluff, then any bet on her part would imply
strength and would not be called by P2. This would make it difficult for P1 to make money, even on strong
hands.
The von Neumann game has been extended by Newman (1959), who allowed for arbitrarily high bets; by
Friedman (1971), who gave P2 an added strategy to raise; by Cutler (1975), who incorporated an unlimited
number of raises for the players, provided that the raise equals the size of the pot (pot-limit poker); and by
Ferguson, Ferguson, and Gawargy (2004), who allowed (i) P2 to bet if P1 checks, and (ii) P1 to raise if P1
initially checked and P2 bet. The PSP game employed in the present study differs from the von Neumann in
one key respect: Poker hands are dealt, without replacement, from a "deck" containing only seven ordered
cards {2, 3, 4, 5, 6, 7, 8}, rather than assuming hands are randomly distributed over a [0, 1] interval. With ante
of one unit and bet of two units, the solution1 to the PSP game is for P1 to bluff (bet) with {2}, check with
{3, 4, 5, 6} and bet with {7, 8}. The optimal strategy for P2 is to fold with {2, 3, 4, 5} and call with {6, 7, 8}.
Following optimal play, the value of the game is 0.0952; the game favors P1.
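The stated value of 0.0952 can be verified by brute force. The sketch below is our own illustration, not code from the study; it enumerates all 42 ordered deals and applies the equilibrium strategies under the stated ante of one chip and bet of two chips:

```python
# Exhaustive check of the PSP equilibrium value, assuming an ante of 1
# and a bet of 2 as stated in the text. Deals all 42 ordered hands
# (c1 to P1, c2 to P2) without replacement and applies the pure
# equilibrium strategies: P1 bets with {2, 7, 8} and checks otherwise;
# P2 calls with {6, 7, 8} and folds otherwise.
from itertools import permutations

DECK = range(2, 9)          # cards 2..8
ANTE, BET = 1, 2

def p1_payoff(c1, c2):
    """Net chips won by P1 in one hand under equilibrium play."""
    if c1 in (2, 7, 8):                 # P1 bets (bluff on the 2)
        if c2 in (6, 7, 8):             # P2 calls: showdown for ante + bet
            return ANTE + BET if c1 > c2 else -(ANTE + BET)
        return ANTE                     # P2 folds: P1 wins the antes
    # P1 checks: immediate showdown for the antes
    return ANTE if c1 > c2 else -ANTE

deals = list(permutations(DECK, 2))     # 42 equally likely deals
value = sum(p1_payoff(c1, c2) for c1, c2 in deals) / len(deals)
print(round(value, 4))                  # prints 0.0952; the game favors P1
```

The value comes out to exactly 4/42 chips per hand in P1's favor, which rounds to the 0.0952 reported in the text.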
Analytical models of poker where only P1 receives a card have also been proposed. Reiley, Urbancic, and Walker (2008) introduced "stripped-down poker" as a classroom exercise to introduce basic game theory
topics, such as extensive and strategic form solution techniques, mixed-strategy Nash equilibrium and
signaling. Gardner (1995) introduced a similar version of this game, termed "liar's poker," as a pedagogical

1 The optimal strategy considers all 42 (7 × 6) unique and equally likely hands that can occur in the PSP when cards are drawn without replacement. Details on the derivation of the discrete solution to the simplified poker game are available from the primary author.


exercise to describe mixed strategies and bluffing. He also extended the solution to three types of cards, and
introduced a ‘‘one-card stud poker’’ game where each player draws a card. Other analytical models of poker
have been studied by Kuhn (1950), Nash and Shapley (1950), and Friedman (1971), among others.
There are two studies in the experimental literature that connect with ours. In one study (Rapoport et al.,
1997) pairs of subjects, with one assigned the role of P1 and the other the role of P2, played a number of
repetitions of a "simplified poker game." After both subjects anted a single chip, only P1 drew a hand, which
was either a High card or a Low card from a well-shuffled deck containing 150 cards. After considering her
hand, P1 could either fold (in which case the pot went to P2) or bet, by adding chip(s) to the pot. If P1 decided
to bet, P2 could either call or fold. If P1 bet with a Low (High) card and P2 called, the pot went to P2 (P1). The
equilibrium policies depend on the proportion of High and Low cards, and the ratio of the ante to bet amounts.
When each card value is equally likely, with an ante of one unit and a bet of B units, the equilibrium solution is for P1 to always bet High cards, and to bet Low cards with probability B/(B + 2). P2 should call with probability 2/(B + 2). The value of the game favors P1.
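These equilibrium probabilities can be checked numerically: at the stated bluffing and calling rates, each player is indifferent between her options, which is what makes the strategy pair an equilibrium. The sketch below is our own check, with payoffs measured in chips:

```python
# Numerical check of the mixed equilibrium reported for the Rapoport
# et al. game: with ante 1 and bet B, P1 bluffs Low cards with
# probability B/(B+2) and P2 calls with probability 2/(B+2). At these
# rates, each player's two options have equal expected value.
def p2_call_vs_fold(B, bluff_p):
    """P2's expected payoffs (call, fold) conditional on facing a bet,
    with High and Low cards equally likely a priori."""
    p_high = 1 / (1 + bluff_p)          # posterior that P1 holds High
    ev_call = p_high * -(1 + B) + (1 - p_high) * (1 + B)
    return ev_call, -1.0                # folding always costs the ante

def p1_bet_vs_fold_low(B, call_p):
    """P1's expected payoffs (bet, fold) when holding a Low card."""
    ev_bet = call_p * -(1 + B) + (1 - call_p) * 1
    return ev_bet, -1.0

for B in (1, 2, 5):
    bluff_p, call_p = B / (B + 2), 2 / (B + 2)
    assert abs(p2_call_vs_fold(B, bluff_p)[0] - (-1.0)) < 1e-9
    assert abs(p1_bet_vs_fold_low(B, call_p)[0] - (-1.0)) < 1e-9
print("indifference conditions hold")
```

Because bluffing a Low card yields the same expected loss as folding it, P1 can mix at exactly B/(B + 2) without giving P2 an exploitable pattern, and symmetrically for P2's calling rate.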
Rapoport et al. reported that, compared to the equilibrium solution, subjects did not bluff enough in the
role of P1, and called too often in the role of P2. Additionally, players did not tend to adopt best reply policies,
and individual differences were considerable. The Rapoport et al. poker game is clearly the simplest of those
reviewed, but differs in several key respects from the poker game adopted in the present study. First, subjects
met face-to-face, and their decisions could have easily been influenced by the mere presence or certain
unspecified visual aspects of their opponent. Second, P1 chose between fold or bet, not between check or bet
as in the original von Neumann game. Finally, only P1 received a card, indicating one-sided or asymmetric
uncertainty.
A second experimental study (Bearden et al., 2005), based on the von Neumann game, compares closely to
the approach taken in this paper. First, two subjects each ante a single chip into the pot. Then, each player
draws a card in the range [0, 1, ..., 1000]. P1 then has the option to bet another B chips or check after seeing her
card. If P1 bets then P2 has the option to call or fold. If P1 checks, or P1 bets and P2 calls, the pot is awarded to
the player with the highest card. The value of B was set at two chips in the first experiment and six chips in the
second.
In general, the authors found that "... there is a tendency for the subjects to bet and call with a higher probability as the value of their hands increases" (p. 16) with most subjects tending to bet or call above some
threshold. Contrary to theoretical predictions, P1 did not bluff on low cards and set their upper thresholds too
low, whereas P2 placed their thresholds too high. There were also significant individual differences in the
results with a small proportion of players actually bluffing on low cards (although not to the predicted
degree). In the higher stakes experiment, the subjects were more conservative than predicted but still did not
follow the optimal policy to any greater degree.
In the Bearden et al. study both P1 and P2 were human subjects (in fact a cohort of four subjects were
randomly matched against each other each round). This complicates comparison with the present study as it
opens the possibility that the subjects were adapting to each others’ play in a best response fashion rather than
converging to the optimal solution. Bearden et al. conducted a third experiment where subjects were assigned
to the role of P1 and played against a computer following a fixed policy as P2. Subjects were told the
computer would call bets above a given threshold. The threshold was fixed at 400, 600, and 800 in three
different experimental conditions. Contrary to the previous studies, subjects showed a much greater tendency
to bluff in the 600 and 800 Conditions but there was virtually no bluffing in the 400 Condition. There were
also considerable individual differences with around 40% of subjects engaging in no bluffing at all in any
condition.
Clearly, there are many versions of simplified poker games that have been studied analytically and/or
experimentally. In the present study, we adopt a discrete version of the von Neumann game that yields a pure-
strategy solution that requires P1 to bet with both her best and worst hands. And unlike the poker game of
Bearden et al., where subjects played against each other, subjects in the PSP game played against the


computer, programmed to play either the equilibrium strategy (ES) or a more aggressive fictitious play (FP)
strategy designed to take advantage of poor play. These modest changes in design allow us to examine the
extent to which (i) subjects' decisions are in proportion to predicted levels, (ii) decision behavior approaches equilibrium play with experience in the game, (iii) individual differences are observed, and (iv) probability matching, behavioral learning theory (BLT), or cognitive learning theory (CLT) aids in explaining the observed results.

LEARNING IN SIMPLIFIED POKER GAMES

In the first two experiments of Bearden et al., subjects did not play the optimal strategy derived from rational
choice theory, appearing to use some sort of probability matching instead. Probability matching has been
extensively studied in binary choice tasks (Shanks et al., 2002; Vulkan, 2000). In these tasks, subjects are
presented with two stimuli, S1 and S2, in a series of repeated trials. At the start of each trial subjects are asked
to guess which stimulus will appear next. The experimenter manipulates the situation so that S1 is more likely to appear (say, 75% of the time). It can be demonstrated mathematically that a utility-maximizing subject
should always choose S1 to maximize the number of correct guesses. However, subjects are commonly
observed to choose a stimulus in proportion to its probability of presentation, choosing S1 about 75% of the
time. Similarly, in the Bearden et al. poker game, subjects are observed to bet on a card in proportion to its
chances of winning (i.e., the probability of betting on a card rose uniformly across the range from 0 to 1000).
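A quick simulation makes the cost of matching concrete: always choosing S1 is correct about 75% of the time, while matching (guessing S1 on a random 75% of trials, independently of the outcomes) is correct only about 0.75² + 0.25² = 62.5% of the time. The sketch below is illustrative only:

```python
# Illustration of why probability matching is sub-optimal in the
# binary guessing task described above, where S1 appears 75% of
# the time. Always guessing S1 is correct on 75% of trials;
# matching is correct on only 0.75*0.75 + 0.25*0.25 = 62.5%.
import random

random.seed(1)
P_S1, TRIALS = 0.75, 100_000
outcomes = [random.random() < P_S1 for _ in range(TRIALS)]   # True = S1

maximize = sum(outcomes) / TRIALS                    # always guess S1
matching = sum(o == (random.random() < P_S1) for o in outcomes) / TRIALS

print(f"maximizing: {maximize:.3f}, matching: {matching:.3f}")
# maximizing is close to 0.75, matching close to 0.625
```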
There has been a great deal of theorizing about why subjects do not behave rationally in these types of
tasks. The dominant explanation maintains that probability matching should be extinguished over time given
the right incentives, feedback, and experience with the task (Shanks et al., 2002; Sprinkle, 2000; Vulkan,
2000). These predictions are consistent with a broader perspective among experimental game theorists that
rejects hyper-rationality but views human subjects as capable of reinforcement learning that enables them to
converge on the normative solution over time. Game theorists have increasingly sought to model these
learning processes in formal models (Camerer, 2003; Erev & Roth, 1998; Smith, 1991).
The issue of how subjects determine which learning model to apply has received much less attention in the
game theory literature (Fudenberg & Levine, 1998; Sonsino, 1997). Behavioral researchers have known for
many years that subjects will utilize various heuristics or "mental models" to attempt to solve a task;
heuristics which in many cases may not lead to an optimal solution (Kahneman, Slovic, & Tversky, 1982).
The choice of heuristic is influenced, in turn, by the framing of a problem, its representativeness, and the
ready availability of a similar situation. For instance, the re-framing of the problem in Bearden et al.’s third
experiment, by providing information on the opponent’s cutoff point, had the effect of dramatically
increasing the amount of bluffing (although not to optimal levels). Similarly, the emergence of probability
matching found in the first two experiments might best be explained by the availability heuristic. Subjects are
likely to be familiar with a range of card games where the highest card wins and are thus more likely to
assume their probability of winning increases with the strength of the card (Levi & Pryor, 1987; Tversky &
Kahneman, 2003). While familiar to professional poker players, the art of bluffing on low cards may not form
part of a typical subject’s behavioral repertoire and must instead be learned (Ferguson & Ferguson, 2003).
In essence, subjects that hold different cognitive models of the game will exhibit different patterns of play
(Erev & Roth, 1999). Klein and Baxter (2006) make a distinction between behavioral learning and cognitive
learning. BLT refers to the optimization of play within a selected model, while CLT refers to the selection of
the right model. In the Bearden et al. case, a subject would engage in behavioral learning when attempting to
maximize payoffs in a probability matching model. Cognitive learning would involve testing whether a
probability matching model offered the best alternative among a universe of possible models. Rational choice
theory has tended to focus on the ability to optimize a given model while ignoring both the search for
alternative cognitive models and the rules for starting and stopping such a search (Gigerenzer, 2001).


CLT is also well positioned to explain individual differences in performance. Subjects who begin with (or
acquire) a superior cognitive model of the game are likely to outperform those trying to optimize an inferior
model. There is also evidence that cognitive ability is a factor in determining relative performance, in part
because cognitive ability leads to a greater ability to form new cognitive models (Stanovich & West, 2000;
West & Stanovich, 2003).
The PSP is an interesting game to study the interplay between cognitive and behavioral learning because
the optimal strategy can be calculated and requires a combination of betting on high cards and bluffing on low
cards. While betting on high cards is somewhat intuitive, bluffing on low cards is counter-intuitive to many
players. However, bluffing is only required of P1 not P2. BLT would predict that P1 and P2 should learn at the
same rate because they are being presented with the same number of trials and similar feedback while CLT
would predict that P2 should learn the optimal strategy faster (and more often) because there is no
requirement to develop a new cognitive model.

METHOD

Design
The current study undertook three modifications of the Bearden et al. model that tended to favor behavioral
learning. First, subjects only played a computer opponent. This eliminated the possibility that a learning
pattern was caused by adaptations to an opponent’s idiosyncratic play. Second, the number of possible poker
hands was reduced from 1000 to 7; thus players had only seven states of the world to learn. This range was
chosen, in part, because it resulted in a pure-strategy solution to the game. The final modification was to
manipulate the strategy of the computer opponent. For one-half of the subjects, the computer played the ES
while the other half of the subjects played a computer opponent utilizing FP (described below). The FP
algorithm was programmed to exploit weakness in its opponent’s strategy, effectively delivering greater
losses for poor strategies and providing a greater incentive for subjects to adopt equilibrium play in the PSP
game. Thus, BLT would predict a faster rate of learning for subjects in the FP conditions than in the ES conditions, whereas CLT would predict a faster rate of learning for subjects assigned to the role of P2 than to the role of P1, regardless of condition.

Subjects
One hundred twenty subjects, with approximately equal proportions of males and females, participated in the
experiment. Most were undergraduate business students between 18 and 25 years of age, motivated to
participate by the promise of extra credit applied to their course grade, and a potential cash award of $25.
Their chance of winning the award depended, in part, on their cumulative earnings across all games.2

Procedure
Subjects were recruited from the College of Business Research Subject Pool. After reviewing an
advertisement for the study, subjects responded via email to the experimenter, indicating their interest in
participation. The experimenter randomly assigned subjects to one of four conditions: In Condition P1-FP,
subjects were assigned to the role of P1 and the computer played the FP algorithm. In Condition P1-
ES, subjects were again assigned the role of P1 and the computer played ES. In Conditions FP-P2 and ES-P2,

2 A cash award of $25 was given to one subject in each of the four conditions. To determine who won the award in each condition we divided each player's earnings by total player earnings for that condition. These quotients, which necessarily sum to one, were used as each player's probability of winning the award. We then listed the players by subject number (although listing them in any order would do) to create a cumulative probability distribution over the [0, 1] interval, where each player's range corresponded to their chance of winning. We then selected a random number on [0, 1] using Excel, and identified the player whose range contained the random number as the winner of the cash award.
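The lottery procedure described in footnote 2 amounts to sampling one player with probability proportional to earnings. The following is a hypothetical sketch of that procedure; the player labels and earnings are invented for illustration:

```python
# Sketch of the prize-lottery procedure in footnote 2: each player's
# chance of winning the $25 award is proportional to his or her share
# of total earnings in the condition. Player names and chip counts
# here are illustrative, not data from the study.
import random

def pick_winner(earnings, rng=random.random):
    """Select one player with probability proportional to earnings."""
    total = sum(earnings.values())
    draw = rng() * total                 # uniform point on [0, total)
    cumulative = 0.0
    for player, amount in earnings.items():
        cumulative += amount
        if draw < cumulative:
            return player
    return player                        # guard against rounding error

random.seed(7)
print(pick_winner({"s01": 520, "s02": 480, "s03": 610}))
```

Listing the players in any order works, as the footnote notes, because only the width of each player's interval on the cumulative scale matters.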


subjects were assigned to the role of P2 and the computer, in the role of P1, adopted the FP algorithm or ES,
respectively. Although the subjects knew their role at the beginning of the experiment, and that this role
would not change, they did not know how the computer would play, only that it would try to win as many
hands as possible by, in some cases, adapting to the play of its opponent.
Subjects were then forwarded the experimental instructions, informed consent, and instructions for
logging in to the software, via return email (see Appendix). Each subject was given a unique login code that
identified their assigned condition, player-number and role in the experiment. They were told to bring the
instructions with them to any on-campus computer lab. After logging in to the software, subjects confirmed
receiving the instructions and informed consent document, and understanding their rights as a human subject,
by entering their name and student identification number. The instructions fully explained the game, which is
played by two players (P1 and P2). One of the players was the subject, the other was the computer playing a
programmed strategy. At the beginning of each hand, both players anted one chip, then the computer dealt a
single card to each player from a "deck" containing only seven possible cards—from the 2 through 8 of Hearts. Subjects were told that each player "knows only the value of the card that he/she was dealt, not the value of card dealt to the other player," and that the computer does not cheat or "peek" at the subject's card.
After the cards were dealt, P1 had the option to check or bet. If P1 checked, then both cards were turned face
up and the player with the highest card value won the pot. If P1 decided to bet, then she added two more chips
to the pot and the computer informed P2 that P1 had decided to bet. P2 could then fold or call. If P2 folded,
neither card was turned up, and P1 won the pot. Alternatively, Player 2 could call by adding two chips to the
pot. If P2 called, both cards were turned face up and the player with the highest card value won the pot. Ties
were not possible as the cards were dealt without replacement.
Subjects did not learn their player-role until they logged in to the software. They also did not know that the
game would last 200 trials, only that they would "play a large number of hands over a computer network"
and that they would receive feedback at the end of each hand. At the beginning of the experiment, each player
was given an endowment3 of 500 chips, which could be used for ante and bet or call decisions. The computer
kept track of the number of chips and reported this at the beginning of each hand.
At the conclusion of the experiment, the subjects were thanked for their participation and reminded that
extra credit for their participation would be posted shortly. They were also told that if they won one of the
$25 prizes, they would be contacted by email. One hundred forty-six subjects signed up for the experimental
sessions, but only 120 subjects (82%) showed up. However, all 120 who started playing the PSP completed
the 200 trials.

Fictitious play algorithm


The FP process, introduced by Brown (1951), is an iterative method for finding approximate equilibrium
solutions for discrete, two-person zero-sum games in strategic form. Brown asked us to imagine how two
statisticians, ignorant of game-theoretic techniques and engaged in repeated play of a zero-sum game, would
choose their strategies. He argued that both players would track their opponent's play, use this history to
compute an expected value for each strategy, and then choose the strategy that yields the greatest expected
value. This simple technique requires no special knowledge of calculus, linear programming or differential
equations; only that players maintain a relative frequency distribution of their opponent's past choices and select a
best reply strategy. Although Robinson (1951) proved that iterations of this type must converge to the
equilibrium solution for any two-person zero-sum game, there is little evidence that FP has ever been a
popular solution technique (Seale & Burnett, 2006). FP has, however, been widely embraced as a model of

3 The endowment of 500 chips essentially guaranteed that subjects, even with completely irrational play, would not exhaust their betting resources. Given that a subject could wager no more than three chips per round, the endowment would last 166 rounds, assuming they lost every round. In actual play, across all conditions, the worst-performing subject lost 114 chips; he/she ended the game with 386 chips.


learning. See, for instance, Young (1993) and Young (1998) for a general understanding of FP; Milgrom and Roberts (1991) for an example of adaptive FP; Van Der Genugten (2000) for a weakened form of FP; and Kaniovski and Young (1995), who describe a model of FP where players sample randomly and independently from their memory of past observations. In the present study, we adopted the standard version of FP because
it is among the most parsimonious learning models and readily adaptable to each player-role in the PSP game.
In the role of P1, the FP algorithm begins by playing the equilibrium solution, and tracks, by card,
the number of times its opponent calls. Although the algorithm does not know the value of the card when its
opponent folds (and is not programmed to cheat or peek), this is easily estimated.4 Thus, the algorithm can,
over a series of repeated plays, estimate for each card the probability its opponent calls or folds. Knowing the
card dealt to itself, a simple Bayesian calculation yields the probability that the opponent will call or fold, and
therefore, the expected value of the FP algorithm’s choices of betting or checking.
In the role of P2, the FP algorithm’s task is even easier; it learns on which cards P1 has checked (of course,
it also learns on which cards P1 bets, provided the algorithm calls), then knowing the number of cards in the
deck, can estimate the likelihood that P1 bets on each card. Over a series of repeated plays, the algorithm
updates these estimated probabilities then uses them to compute an expected payoff for its call or fold
decision. In each role, and for each opponent, the algorithm begins by playing ES. It deviates from
equilibrium play only when expected value calculations direct it to do so. Recall that optimal play (ES)
requires P1 to bet with her worst and best hands. Accordingly, we would expect P1 to bet with {2, 7, 8}, and
check with {3, 4, 5, 6}. P2 should fold with {2, 3, 4, 5} and call with {6, 7, 8}.
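The P2 version of the algorithm can be sketched as follows. This is our reconstruction under stated assumptions (an equilibrium-seeded frequency table, ties broken toward folding, and a net payoff of three chips on a called bet), not the authors' implementation:

```python
# A minimal sketch of a fictitious-play (FP) opponent in the role of
# P2. It keeps per-card estimates of how often P1 bets, seeded with
# the equilibrium strategy (bet on {2, 7, 8}), and best-replies to
# those estimates. Ties are broken toward folding, since the
# equilibrium seed leaves P2 exactly indifferent on middle cards.
DECK = tuple(range(2, 9))
ANTE, BET = 1, 2

class FictitiousP2:
    def __init__(self):
        # pseudo-counts [times bet, times observed] per card, ES seed
        self.counts = {c: [1, 1] if c in (2, 7, 8) else [0, 1]
                       for c in DECK}

    def observe(self, p1_card, p1_bet):
        """Update after a hand in which P1's card was revealed."""
        self.counts[p1_card][0] += int(p1_bet)
        self.counts[p1_card][1] += 1

    def call(self, own_card):
        """Best reply to the estimated betting frequencies."""
        # posterior over P1's card given a bet: P(c1 | bet) is
        # proportional to the estimated P(bet | c1)
        w = {c: self.counts[c][0] / self.counts[c][1]
             for c in DECK if c != own_card}
        total = sum(w.values())
        if total == 0:
            return False                  # a bet was never expected
        ev_call = sum(w[c] / total *
                      ((ANTE + BET) if own_card > c else -(ANTE + BET))
                      for c in w)
        return ev_call > -ANTE            # folding forfeits the ante

p2 = FictitiousP2()
for card in (3, 4, 5):                    # opponent seen over-betting
    for _ in range(10):
        p2.observe(card, True)
print([c for c in DECK if p2.call(c)])    # calling range widens
```

Against an opponent observed bluffing the middle cards, the best reply widens the calling range below the equilibrium cutoff of 6, which is how the FP opponent punishes the over-betting the subjects actually exhibited.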

RESULTS

The results are organized into four sections. The first section investigates the aggregate proportions of bluff,
check and bet decisions for P1, and fold and call decisions for P2. This section also examines the number of
decisions inconsistent with equilibrium predictions by condition and block, with Block 1 defined as the first
100 repeated games, and Block 2 as the second set of 100 games. The second section explores probability-
matching behavior by analyzing the proportion of bet and call decisions by value of card dealt. Next, we
consider exploitability, or the cost of errors observed in the decision strategies subjects employed. The final
section examines several measures of individual differences. Together, these results show that subjects made
a considerable number of non-equilibrium decisions in the PSP. Subjects, assigned to the role of P1, regularly
failed to bluff on the lowest-valued card and bet too often on intermediate-valued cards. In the role of P2,
subjects frequently called on intermediate-valued cards when they should have folded. The results also show
that players did not approach equilibrium play with repeated experience in the PSP, despite half of the
subjects playing against an FP algorithm designed to take advantage of poor play. Finally, the results reveal
considerable individual differences, evident in the number of non-equilibrium choices, degree of
exploitability, and differences in strategy profiles observed.

Aggregate decisions
To verify that players' hands were dealt randomly, the number of times each card value appeared was counted
and submitted to a condition by player-type by card (4 × 2 × 7) ANOVA. As expected, none of the main
effects or interactions was significant. To examine decision behavior by subjects assigned to the role of P1, we
computed the proportion of bluff, check, and bet decisions consistent with equilibrium predictions. These
results, which are plotted in Figure 3, show that subjects' bluffing behavior (betting on a 2) in Condition P1-ES
was consistent with equilibrium predictions only 52% of the time. Their consistency in checking decisions
was even worse, with subjects checking on only 38% of the instances of drawing a 3, 4, 5, or 6. Betting on high
cards (7, 8) showed the most consistency with equilibrium predictions; subjects bet over 90% of the time
required by the equilibrium solution. Similar patterns emerged for subjects in Condition P1-FP. The
proportions of bluffing (35%), checking (41%), and betting (94%) decisions were similarly inconsistent with
equilibrium predictions. T-tests (at a 95% confidence interval, one observation per subject per type of
decision) of the sample proportions indicated that none were consistent with equilibrium predictions.

Figure 3. Proportion of decisions consistent with equilibrium predictions by type of decision and condition

4 Recall that players' cards are revealed only if P1 checks, or if P1 bets and P2 calls.
We conducted similar analyses to examine folding and calling behavior for subjects assigned to the role of
P2. In Condition ES-P2 subjects folded 59% of the time and called 95% of the time the equilibrium solution
predicted they should. Similar proportions were observed for subjects in Condition FP-P2, where subjects
folded on 62% and called on 92% of the occasions they were predicted to do so. T-tests (at a 95% confidence
interval, one observation per subject per type of decision) of each of these sample proportions indicated that
none were consistent with equilibrium predictions.
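The per-subject proportion tests reported above can be illustrated with a one-sample t statistic. The data below are hypothetical stand-ins (the paper does not report per-subject proportions), clustered near the 59% aggregate folding consistency, compared against the equilibrium prediction of always making the consistent choice.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(xs, mu):
    """One-sample t statistic for observations xs against a hypothesized mean mu."""
    se = stdev(xs) / sqrt(len(xs))
    return (mean(xs) - mu) / se

# Hypothetical per-subject folding-consistency proportions (30 subjects),
# clustered near the reported 59% aggregate; equilibrium play implies 1.0
props = [0.59 + 0.01 * ((i % 5) - 2) for i in range(30)]
t_stat = one_sample_t(props, mu=1.0)  # strongly negative: rejects equilibrium play
```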


Figure 4. Ten-period moving average number of decisions inconsistent with equilibrium play by condition

To understand whether subject behavior changed over time, we submitted the proportion of bluff, check, and bet
decisions for P1, and fold and call decisions for P2, to five separate condition by block (2 × 2) ANOVAs, with
block as a repeated factor. None of the ANOVAs reported significant effects for condition or block, or for the
block by condition interaction. The interpretation of the ANOVAs is clear: Subjects' decision behavior was no
different when paired with a programmed strategy playing ES or with a more aggressive FP strategy. In addition,
there is no evidence that subjects' play changed with experience in the PSP game; subjects failed to "learn" or
approach equilibrium play with repetitions of the game.
To further illustrate decision behavior over time, we plot in Figure 4 the 10-period moving average
proportion of decisions inconsistent with equilibrium play. Overall, approximately 45% of the decisions
subjects made as P1 and 27% of the decisions made as P2 were inconsistent with equilibrium predictions.
These levels of inconsistent play are significantly lower (Z = 22.468, p < 0.001) for subjects in the role of P2.
Much of this difference can be explained by the more intuitive structure of the optimal policy, displayed
earlier in Figure 2. The optimal policy for P2 is less complex than that for P1, requiring the player to establish
a single cutoff threshold between fold and call decisions.
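The smoothing used in Figure 4 is a simple trailing moving average; a minimal sketch follows (how the figure handles the first few trials, where fewer than ten observations exist, is our assumption).

```python
def moving_average(xs, window=10):
    """Trailing moving average; early points average over however many
    observations are available (an assumption about Figure 4's construction)."""
    out = []
    for i in range(len(xs)):
        lo = max(0, i - window + 1)
        span = xs[lo:i + 1]
        out.append(sum(span) / len(span))
    return out

# Example: a 0/1 series of equilibrium-inconsistent decisions
series = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
smooth = moving_average(series, window=5)
```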

Probability matching
To examine the prevalence of probability matching behavior, we plot the proportions of bet and call decisions
by condition and card in Figure 5. The proportions of bet decisions appear in the top panel of the figure, and
the proportions of call decisions appear in the bottom panel. With the exception of card 2, the likelihood of
betting increased monotonically in the value of the subjects' card in both conditions. The shapes of these
distributions are consistent with a probability matching explanation: Subjects were more likely to bet as their
likelihood of winning (as measured by the strength of their card) increased. The non-monotonic result for 2
deserves additional comment. If subjects holding a 2 chose to check rather than bet (bluff), they would lose
their ante with certainty, as both players' cards would be revealed. To avoid a certain loss, players must bluff
and hope their computer opponent folds. Inspection of subject-level plots by card, not reported here, revealed
that several subjects understood this necessity to bluff or face certain loss. The proportions of call decisions
also increased monotonically (with the exception of 2 in Condition FP-P2) in the value of card for P2. Again,
the shapes of these distributions are consistent with a probability matching explanation, as subjects were more
likely to increase their wager as the strength of their hand increased.

Figure 5. Proportion of bet or call decisions by condition and card value
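The monotone pattern in Figure 5 can be mimicked by a simple probability-matching rule, our own illustrative construction rather than a model from the paper, in which the bet probability equals the card's chance of winning a showdown.

```python
import random

def prob_matching_bet(card, rng):
    """Bet with probability equal to the card's chance of winning a showdown:
    a card c in 2..8 beats (c - 2) of the 6 possible opposing cards."""
    p_win = (card - 2) / 6
    return rng.random() < p_win

rng = random.Random(42)
freq = {c: sum(prob_matching_bet(c, rng) for _ in range(10_000)) / 10_000
        for c in range(2, 9)}
# Betting frequency climbs smoothly from 0 on a 2 to 1 on an 8, the monotone
# shape seen in Figure 5, instead of the equilibrium's sharp cutoffs.
```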

Exploitability
The analyses reported above indicate that subjects, in the roles of both P1 and P2, made a considerable
number of non-equilibrium decisions in the PSP, and that the frequency of these "errors" did not decrease
with repeated play. To better understand the impact of these errors, we examine the cost of deviations from
equilibrium play, or exploitability.5 Our measure of exploitability, Vg, assumes a best-response strategy
adapted to the known distribution of subjects' past play. We interpret Vg as the "observed value" of the PSP
game. Recall that the "theoretical value" of the PSP is 0.0952; players in the role of P1 (P2) are expected
to win (lose) 0.0952 units with each repetition of the game. Aggregate decision behavior for subjects assigned
to the role of P1 decreased the observed value of the game to −0.1758 for Condition P1-FP and to
−0.2249 for Condition P1-ES. When subjects were assigned to the role of P2, their aggregate decisions
increased the observed value of the PSP for their computer opponents to 0.2501 in Condition ES-P2 and to
0.2303 in Condition FP-P2. T-tests (at the 95% confidence interval, one observation per subject per condition)
of the sample means show that the observed value of the PSP game is significantly different from the
theoretical value of 0.0952.

5 We thank an anonymous reviewer for suggesting this analysis.

Figure 6. Distribution of Vg for subjects in the roles of Player 1 and Player 2
To determine whether exploitability differed between conditions or changed over time, we computed Vg, by
condition, for both the first and second blocks of 100 trials. We then conducted two condition by block (2 × 2)
ANOVAs, one for subjects in the role of P1 and one for subjects in the role of P2, with block as a repeated
factor. Neither the within-subject nor the between-subject effects were significant. Consistent with findings
reported earlier, subjects' decision behavior was not affected by their computer opponent playing either ES or
FP, and there was no change in the impact of their decisions over time. We conclude the analysis of
exploitability by comparing the distributions of Vg in Figure 6. Consistent with the ANOVA results reported
above, data are collapsed across block and condition. The distribution for P1 appears as the dark-shaded bars
on the left side of the figure, while the distribution for P2 appears as the light-shaded bars on the right side.
The theoretical value of the game is represented by the vertical dashed line. It should not be surprising that the
distribution of Vg for P1 is farther from the theoretical value of the game than the distribution for P2, as the
optimal policy for P1 is less intuitive than that for P2.
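Our reading of the Vg computation can be sketched as follows (function names are ours; payoffs follow the game's ante of 1 chip and bet/call of 2 chips). Fixing P1's empirical betting frequencies and letting P2 best-respond to that distribution reproduces the theoretical value of roughly 0.0952 for the equilibrium cutoffs, and shows how a never-bluff strategy is exploitable down to a negative value.

```python
from itertools import permutations

CARDS = range(2, 9)

def game_value(bet_prob):
    """Expected payoff per hand to P1, given the empirical probability that
    P1 bets with each card, when P2 best-responds to that distribution."""
    def p2_calls(d):
        # After a bet, P2 calls iff the expected value beats folding (-1)
        ev = mass = 0.0
        for c in CARDS:
            if c == d:
                continue
            w = bet_prob[c]              # relative chance P1 holds c and bet
            ev += w * (3 if d > c else -3)
            mass += w
        return mass > 0 and ev / mass > -1
    total = count = 0
    for c, d in permutations(CARDS, 2):  # (P1 card, P2 card), equally likely
        p = bet_prob[c]
        showdown = 1 if c > d else -1    # net to P1 at an ante-only showdown
        if p2_calls(d):
            total += p * 3 * showdown + (1 - p) * showdown
        else:
            total += p * 1 + (1 - p) * showdown
        count += 1
    return total / count

es = {c: 1.0 if c in (2, 7, 8) else 0.0 for c in CARDS}     # equilibrium cutoffs
no_bluff = {c: 1.0 if c in (7, 8) else 0.0 for c in CARDS}  # never bluffs a 2
```

With these inputs, `game_value(es)` recovers 2/21 ≈ 0.0952, while `game_value(no_bluff)` is negative: failing to bluff turns a favorable game into a losing one against a best-responding opponent.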

Individual differences
CLT predicts that there should be individual differences among subjects, due to heterogeneity in the "mental
models" held by subjects, and that these mental models will be revised at different rates, contrary to the
predictions of BLT. To examine this hypothesis, we conducted a one-way analysis of variance on the data for
each condition, with the percentage of decisions in concordance with the Nash equilibrium as the dependent
variable and block as the repeated measure across subjects. The results of this analysis for each condition are
shown in Table 1.
The results show no statistically significant change in behavior between blocks under any condition, but
statistically significant differences between subjects across all conditions. The presence of an interaction
between block and subject indicates differences in subject behavior across blocks. A plot of the change in


accuracy across blocks for each subject in Condition P1-FP is presented in Figure 7. As indicated by the
interaction term, there is no consistent pattern among subjects. Some subjects show increased accuracy,
others no change, and others a decrease in accuracy over time as subjects search for an effective strategy in the
PSP. Similar patterns were exhibited in the other three conditions.

Table 1. Repeated measures ANOVA of accuracy by condition

Effect                   P1-ES    P1-FP    ES-P2    FP-P2
Mean (total)             55.1%    54.9%    73.1%    75.7%
Mean (Block 1)           55.8%    53.8%    72.2%    72.0%
Mean (Block 2)           55.4%    55.8%    73.9%    74.8%
No. of observations      6000     6000     2605     2559
F(block)                 0.20     2.54     2.21     1.99
F(subject, n = 30)       10.28    3.78     8.57     15.56
F(subject × block)       1.91     2.89     0.90     3.19

p < 0.001.
Substantial individual differences were also noted in several measures not reported here, including the
subjects’ actual earnings and degree of exploitability. We also examined individual ‘‘strategy profiles’’ for
each subject, which showed the propensity of bet or call decisions by value of card drawn. Although many of
these strategy profiles tended to support probability matching behavior, we also observed strategy profiles
that were somewhat irrational (i.e., showed a higher propensity for betting on low cards than on high cards)
and/or were difficult to classify. Several subjects, when informally questioned about their strategies at the
conclusion of the experiment, reported that they were deliberately attempting to fool their computer
opponent. Perhaps they expected that their irrational play would be met with irrational play by their computer
opponent. Instead, irrational play was ignored when the computer played ES, and exploited when the
computer played FP.

Figure 7. Interaction of subject and block for condition P1-FP

DISCUSSION

Probability matching behavior was evident in both the aggregate data and the strategy profiles of many
individual subjects. In addition, subjects' decisions were far from optimal and did not improve over
200 repetitions of play in either the ES or FP conditions. Most errors for subjects assigned to the role of
P1 involved failing to bluff on the lowest-valued card and betting on intermediate-valued cards when they
should have checked. Most of the errors for subjects in the role of P2 were due to calling on
intermediate-valued cards when they should have folded. These deviations from Nash equilibrium play
resulted in a substantial degree of exploitability that was not affected by condition or experience in the PSP
game. Finally, the static results at the population level concealed some significant individual differences.
Some players were significantly more accurate than others, and some showed significant improvements over
time. Others, however, became less accurate over time.
Despite large differences in experimental design between our study and those of Bearden et al. and Rapoport
et al., the results are very similar. In all three studies, players failed to bluff at near-optimal levels in the role of
P1 and called too frequently in the role of P2. The studies also report high levels of individual differences that
did not diminish over time. And, consistent with Bearden et al., we found aggregate subject behavior
consistent with a probability matching explanation.
Our results provide little support for either behavioral or cognitive learning theories. BLT’s prediction that
subjects would learn faster in the FP conditions, where poor play was exploited, was not supported.
Heterogeneity was observed between subjects, as predicted by CLT, but CLT's prediction that subjects
assigned to the role of P2 would learn faster was also not supported. Regardless of condition or role, there was
no evidence of learning, as subjects' behavior failed to approach equilibrium predictions with repeated play.
In retrospect, the seven-card strategy space, combined with stochastic outcomes, proved to be a difficult
environment for subject learning.
The results have several implications for how experimental game theorists approach the theory of adaptive
learning in games. In the face of compelling evidence, game theorists have reluctantly given up the notion
that humans are computationally capable of instantaneously solving even moderately complex games
(Gigerenzer, 2001). The fallback position was that humans could learn to improve their play over time to
reach an equilibrium point (Smith, 1991). This implies that behavior will eventually reach equilibrium—an
important assumption in standard economic models.
The current study suggests that games with counter-intuitive elements might take longer to learn than
predicted, because subjects must adjust their mental models to incorporate these counter-intuitive elements
into their solution sets, and that subjects might not make these adjustments at the same time. Moreover, our
results indicate that lower levels of performance (against an FP algorithm) may not lead to faster adjustment
and that some subjects may even regress away from an optimal solution. The latter result is readily explained
by CLT as a search among mental models, some of which may yield better results in the short term, but fail in
the longer term. The result of declining accuracy is more difficult to explain using BLT, which posits that
performance should always increase in the face of more experience, feedback, or motivation. The most
dangerous implication of this study (for economists) is that subjects may never reach an optimal solution if
they do not stumble upon the correct mental model.


Of course, a study such as this suffers from a number of limitations. It is possible that subjects were not
sufficiently motivated, came to the experiment with prior beliefs about optimal play in bluffing and betting
situations, were not given enough time or feedback, or did not understand the instructions properly (Shanks
et al., 2002; Stanovich & West, 2000). Outcomes were also determined by a stochastic process—for
example, in some cases P1 might bet a 7 and win, while in other cases she might bet and lose. Two hundred trials
may simply not be adequate for inexperienced subjects to approach equilibrium behavior in a sequential
game with a moderately large strategy space.
There are clearly abundant opportunities for future research in this area. Simple extensions include
increasing the number of trials, incentives, or training on the current task. It would also be interesting to
explore other classes of games with counter-intuitive solutions to see if they display similar experimental
results. Finally, further research on mental models is required, not only to understand how the initial search
and selection of a mental model is conducted but also to examine how these models are evaluated, updated,
refined, or abandoned.

APPENDIX

Login code: __ __ __ __ __ __ __ __
University of Nevada Las Vegas
Department of Management

Instructions for simplified poker


Welcome to the experiment on simplified poker. The instructions are simple. If you follow them carefully and
make good decisions, you will earn your extra credit points and may be in the running for a cash bonus of $25.
Your chances of earning the cash bonus are explained below in the "Payment" section of these instructions.

Description of the task


This simplified poker game is played by two players, called Player 1 and Player 2, who play a large number of
hands over a computer network. At the beginning of each hand, both players "ante" one chip, then the
computer deals a single card to each player from a "deck" containing only seven possible cards (see card
values, below):
Each player knows only the value of the card that he/she was dealt, not the value dealt to the other player.
After the cards are dealt, Player 1 has the option to check or bet. If Player 1 checks, both players turn their
cards face up and the player with the highest card wins the pot. If Player 1 decides to bet, she adds two more
chips to the pot and the computer informs Player 2 that Player 1 has decided to bet. Player 2 may then fold or
call. If Player 2 folds, neither card is turned up, and Player 1 wins the pot. Alternatively, Player 2 may call by
adding two chips to the pot. If Player 2 calls, both cards are turned face up and the player with the highest card
wins the pot.
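The rules above can be written out compactly; the sketch below is ours (function names and the True/False strategy encoding are assumptions), with payoffs expressed in chips won per hand.

```python
import random

DECK = [2, 3, 4, 5, 6, 7, 8]

def play_hand(p1_bets, p2_calls, rng=random):
    """Play one hand; returns (chips won by P1, chips won by P2).
    p1_bets/p2_calls map a card value to a bet/call decision (True/False)."""
    c1, c2 = rng.sample(DECK, 2)            # one distinct card each
    if not p1_bets(c1):                     # P1 checks: showdown for the antes
        return (1, -1) if c1 > c2 else (-1, 1)
    if not p2_calls(c2):                    # P1 bet, P2 folds: P1 takes the pot
        return (1, -1)
    # P1 bet and P2 called: showdown for the enlarged pot
    return (3, -3) if c1 > c2 else (-3, 3)

# The cutoff strategies described in the text
p1_es = lambda c: c in (2, 7, 8)            # bet worst and best hands
p2_es = lambda c: c >= 6                    # call with 6, 7, 8; fold otherwise
```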
When you log in to the software, the computer will inform you whether you are Player 1 or Player 2. This
assignment will not change during the course of the experiment, and the computer will become the "other
player". For example, if you are assigned as Player 2, the computer will play as Player 1; if you are assigned
as Player 1, the computer will play as Player 2. In either case, you will be playing against a programmed
strategy that, in some cases, is trying to adapt to the play of its opponent and win as many hands as possible.
Although the computer shuffles and deals the cards, and keeps track of your bankroll, it is not programmed to
cheat; it does not know what card you were dealt and does not "peek" at cards that were not turned up.
Consider this a fair game.


At the beginning of the experiment the computer will give you a "bankroll" of 500 chips and keep track of
your ante, and bet or call wagers. After you have completed 200 hands of play, the computer will tell you that
the experiment is over and report the balance of your bankroll. If you have a problem with the software, please
email Professor Darryl Seale at dseale@unlv.nevada.edu, or call 702-895-3365.

Logging in to the simplified poker software


1. Go to any UNLV computer lab and log in with your user ID and password.
2. Using the "My Computer" icon, go to L:\classes.
3. Click on the folder labeled "Darryl Seale."
4. Click on the folder labeled "Poker."
5. Double-click on the file listed as Simplified-Poker.exe.
6. When prompted, enter your name, course information, and the "login code" listed on the first page of these
instructions.
Payment
Near the end of the semester, we will report to your professor that you have earned credit for participating in
this study. In addition, four players will be selected and paid a cash bonus of $25. Your chance of being
selected for the cash bonus will depend, in part, on how many points you earn during the poker game. The
four winners will be contacted using the contact information you provided when you first logged in to the
poker software.
I sincerely thank you for your participation.
Good Luck!

REFERENCES

Bearden, J. N., Schulz-Mahlendorf, W., & Huettel, S. (2005). An experimental study of Von Neumann’s two-person [0,1]
poker. Unpublished Manuscript Obtained From Primary Author, Durham, NC: Duke University.
Bellman, R., & Blackwell, D. (1949). Some two-person games involving bluffing. Proceedings of the National Academy
of Sciences, 35, 600–605.
Borel, E. (1938). Traité du Calcul des Probabilités et ses Applications, Volume IV, Fascicule 2, Applications aux jeux
de hasard. Paris: Gauthier-Villars.
Brown, G. (1951). Iterative solution of games by fictitious play. Activity Analysis of Production and Allocation, 13, 374–
376.
Camerer, C. F. (2003). Behavioral game theory: Experiments in strategic interaction. Princeton: Princeton University
Press.
Cutler, W. (1975). An optimal strategy for pot-limit poker. American Mathematical Monthly, 82, 368–376.
Erev, I., & Roth, A. (1998). Predicting how people play games: Reinforcement learning in experimental games with
unique, mixed strategy equilibria. The American Economic Review, 88, 848–881.
Erev, I., & Roth, A. E. (1999). On the role of reinforcement learning in experimental games: The cognitive game-theoretic
approach. Games and human behavior: Essays in honor of Amnon Rapoport (pp. 53–77). Mahwah, NJ: Lawrence
Erlbaum Associates, Inc.
Ferguson, C., & Ferguson, T. (2003). On the Borel and Von Neumann poker models. Game Theory and Applications, 9,
17–32.
Ferguson, C., Ferguson, T., & Gawargy, C. (2004). Uniform (0, 1) two-person poker models. Paper Presented at the 11th
International Symposium on Dynamic Games and Applications.
Friedman, L. (1971). Optimal bluffing strategies in poker. Management Science, 17, 764–771.
Fudenberg, D., & Levine, D. (1998). The theory of learning in games. Cambridge, Mass: MIT Press.
Gardner, R. (1995). Games for business and economics. New York: John Wiley & Sons.
Gigerenzer, G. (2001). The adaptive toolbox. In G. Gigerenzer , & R. Selten (Eds.), Bounded rationality: The adaptive
toolbox (pp. 37–50). Cambridge, Mass: MIT Press.


Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgment under uncertainty: Heuristics and biases. New York, NY:
Cambridge University Press.
Kaniovski, Y., & Young, H. (1995). Learning dynamics in games with stochastic perturbations. Games and Economic
Behavior, 11, 330–363.
Karlin, S., & Restrepo, R. (1957). Multistage poker models. Contributions to the theory of games (Vol. 3). Princeton, New
Jersey: Princeton University Press.
Klein, G., & Baxter, H. (2006). Cognitive Transformation Theory: Contrasting Cognitive and Behavioral Learning.
Orlando, FL. Paper presented at the InterService/Industry Training, Simulation & Education Conference.
Kuhn, H. W. (1950). A simplified two-person poker. H. W. Kuhn , & A. W. Tucker (Eds.), Contributions to the Theory of
Games, I (pp. 97–103). Annals of Mathematical Studies, Number 24. Princeton, New Jersey: Princeton University
Press.
Levi, A., & Pryor, J. (1987). Use of the availability heuristic in probability estimates of future events: The effects of
imagining outcomes versus imagining reasons. Organizational Behavior and Human Decision Processes, 40, 219–234.
Milgrom, P., & Roberts, J. (1991). Adaptive and sophisticated learning in normal form games. Games and Economic
Behavior, 3, 82–100.
Nash, J. F., & Shapley, L. S. (1950). A simple three-person poker game. H. W. Kuhn , & A. W. Tucker (Eds.),
Contributions to the theory of games, I (pp. 105–116). Annals of Mathematical Studies, Number 24. Princeton:
New Jersey: Princeton University Press.
Newman, D. (1959). A model for "real" poker. Operations Research, 7, 557–560.
Rapoport, A., Erev, I., Abraham, E., & Olson, D. (1997). Randomization and adaptive learning in a simplified poker game.
Organizational Behavior and Human Decision Processes, 69, 31–49.
Reiley, D. H., Urbancic, M. B., & Walker, M. (2008). Stripped-down poker: A classroom game to illustrate equilibrium
bluffing. Journal of Economic Education, 39, 323–341.
Robinson, J. (1951). An iterative method of solving a game. Annals of Mathematics, 54, 296–301.
Roth, A., & Erev, I. (1995). Learning in extensive-form games: Experimental data and simple dynamic models in the
intermediate term. Games and Economic Behavior, 8, 164–212.
Sakaguchi, M., & Sakai, S. (1981). Partial information in a simplified two person poker. Mathematica Japonica, 26, 695–
705.
Seale, D., & Burnett, J. (2006). Solving large games with simulated fictitious play. International Game Theory Review, 8,
437.
Shanks, D., Tunney, R., & McCarthy, J. (2002). A re-examination of probability matching and rational choice. Journal of
Behavioral Decision Making, 15, 233–250.
Smith, V. (1991). Rational choice: The contrast between economics and psychology. Journal of Political Economy, 99,
877–897.
Sonsino, D. (1997). Learning to learn, pattern recognition, and Nash equilibrium. Games and Economic Behavior, 18,
286–331.
Sprinkle, G. (2000). The effect of incentive contracts on learning and performance. The Accounting Review, 75, 299–326.
Stanovich, K., & West, R. (2000). Individual differences in reasoning: Implications for the rationality debate? Behavioral
and Brain Sciences, 23, 645–726.
Tversky, A., & Kahneman, D. (2003). Availability: A heuristic for judging frequency and probability. In B. J. Baars, W. P.
Banks, & J. B. Newman (Eds.), Essential sources in the scientific study of consciousness. Cambridge, Mass: MIT Press.
Van Der Genugten, B. (2000). A weakened form of fictitious play in two-person zero-sum games. International Game
Theory Review, 2, 307–328.
Von Neumann, J., & Morgenstern, O. (1947). Theory of games and economic behavior (2nd ed.). Princeton, NJ: Princeton
University Press.
Vulkan, N. (2000). An economist’s perspective on probability matching. Journal of Economic Surveys, 14, 101–118.
Wagenaar, W. (1988). Paradoxes of gambling behaviour. Hove, UK: Lawrence Erlbaum Associates.
West, R., & Stanovich, K. (2003). Is probability matching smart? Associations between probabilistic choices and
cognitive ability. Memory & Cognition, 31, 243–251.
Young, H. P. (1993). The evolution of conventions. Econometrica, 61, 57–84.
Young, H. P. (1998). Individual strategy and social structure: An evolutionary theory of institutions. Princeton, New
Jersey: Princeton University Press.

Authors’ biographies:
Darryl A. Seale is Professor of Management at the University of Nevada Las Vegas. He received his PhD and master's
degrees from the University of Arizona and his MBA from Penn State University. His research interests are decision
making and behavioral game theory.


Steven E. Phelan is Associate Professor of Strategic Management and Director of the University of Nevada Las Vegas
Center for Entrepreneurship. He received his PhD from La Trobe University (Australia) in 1998. His interests include
competitive dynamics, organizational architecture, and entrepreneurial competence.

Authors’ address:
Darryl A. Seale and Steven E. Phelan, Department of Management, University of Nevada Las Vegas, 4505 Maryland
Parkway, Las Vegas, NV 89154-6009, USA.

