Professional Documents
Culture Documents
3 Player
3 Player
,M414
1a)/899-87|
Dilemma
Peter S. Fader*
Dilemma
Assistant Professor of Marketing, Wharton School of Business, University of Pennsylvania, Philadelphia, PA 19104
* Professor of
Management
Management, Massachusetts
Institute of
Technology, Cambridge,
MA 02139
MIX
LIBRARIES
ABSTRACT
The presence
of a third party can afTect attempts by
two players
to
cooperate in a threeis
may
e
,
find
it
advantageous
somewhere between
full
(i
defection
We
examine
this
phenomenon
in tourney
1,
44 algorithms in tourney
2).
seems
also
to
be a key
indicator of success
We
examine other
Detailed tournament
to be
how
these
outside players will affect the level of cooperation that the superpowers might achieve.
Consider the dramatic effect of outside players on OPEC. After a decade of highly profitable
cooperation (collusion) the cartel collapsed, partially because of increased production by
non-OPEC
to
be the sole cooperator. Early in 1986, the Saudis finally gave up their attempts to maintain
cooperation and began increasing output and offering price discounts in an attempt to "punish" the
United Kingdom. Oil prices dropped as low as $8/barrel. Recently, the Saudis and
attempted
to stabilize prices at
OPEC have
In another example, firms in the U.S. microelectronics industries, adversely aiTected by the
growing influence and economic power of foreign competition, have formed the Microelectronics and
to
advantage relative
may
Whether or not it
in the social
good
to
it is
important
to
understand
This paper addresses the role of implicit coalitions in a repeated, generalized prisoner's dilemma (GPD). The classical prisoner's dilemma, a two-player, two-act game, captures the essential confiict
(i.e
more
(price
The
more
number of
R&D investment).
GPD and
in a
motivating implicit coalitions, we describe two competitive strategy tournaments in the spirit of
Axelrod (1980a,b). Results of the tournaments illustrate the importanceof implicit coalitions
repeated GPD.
its
robustness
across a series of environments which vary from very "nice" to very "nasty."
FORMALIZATION
We begin with a review of the classic prisoner's dilemma (PD) and then motivate and formalize our generalizations. We illustrate our formalization using the GPD game used in the first
competitive strateg>' tournament.
Over
its
in
economics, political science, sociology, and psycholog>'. See Axelrod (1984) for a review of these and
other applications of the
PD
(C) or defect (D). If both players
If
mutual cooperation, he receives the temptation payoff of /, while the cooperating player gets the
"sucker's payoff" of s. If both choose to defect, then each receives the punishment payoff of p.
Table
2x2 PD.
Table
1:
Classic Prisoner's
Dilemma'
Player 2
C2
r,
D2
3
Player
These three
combined
repeated,
into one
compound
inequality:
l>r>p>s. This
to
is
is
discourage oscillations:
4)
Continued cooperation
is
2r>t + s.
among N players we
use the
PD framework PD paradigm
to
balance
Thus, we state a
set of conditions
players and which, in the case of two players, reduce to the classical
PD conditions
Several researchers have proposed and analyzed N-player PD's, usually to study the behavior of
large groups or entire communities. For example, each person finds
it
is
often
Hamburger
Goehring and Kahan (1976). Taylor (1976), Dawes (1980), and Schelling (1973). The primary
mode
to
of analysis for
many
players are the payoff functions, C(n) and Din), which describe the payofTs
to
many
players
we are more concerned with games involving fewer players and more
we wish
to
when
them
We seek to determine,
some
must deal with
among other
To address these
continuous actions
We define
symmetric'
First
the
...
A^) be
thepayofi"to player
if the
A"*,
to
D and C
in the classical
PD. Each
payofTs by choosing A^, regardless of the actions of the other players. At the
other extreme,
if all
same
maximize joint
payofTs.
This
analysis.
is
For example, positive linear transformations which vary by player do not affect our
By symmetric we mean
.A
..
P.{
..
.A
...
.A^,
...
).
first
A!^ is fixed
means more
prices, or
we consider the
and assume
A"^ is less
for player
1.
It
As long as A, >
A", player
increases
its
aP
< An alternative
interpretation
is
for
A >
,
A'^.
that unilateral
t
we
implicitly
assume
when
2)
A''.
1:
>0
dA
J
for
;=
2.3, ..,N.
it
that
applies for
all feasible
actions by all of
competitors competitors
is
Mutual cooperation
same amount,
all
+
dAj
+
dA^
+
dAj^
>
A ^ A^
J
for all
action,
we have
assumed
4)
We wish
an analogy
to 2r
>< -f
s.
we choose
a simple one by
making it unattractive
take
is,
,A,^)+
P^tA^, A^,
...
A^^-a)
<
A', j
1,2,...,N.
An Example
Our
first
in
to profits
and the
we chose
of consumer
response
were
chosen
to
understand.
1):
we used was
(in
terms of player
Pj = 3375A-^-^A^^Af^{Aj-l)
- 480
(1)
A little calculus (taking the derivative of (1) and setting it equal to zero) yields the
payoff maximizing action, A"
non-cooperative
1.40,
which
is
A,
1
= A2 = A3, we can
1.50.
through 4 hold
and
A", the
(1)
becomes the
classical
PD.
Suppose, for the sake of illustration, that players 2 and 3 are committed to choose the same action as
each other. Table 2 shows the possible payoffs under this restriction.
Table 2: Example payoffs when players 2 choose identical actions.
Players 2 and 3
and3
A:
Player
1
(i e.
maximize own
score)
to
maximize share
of
In the
is
calculated to be A'
15/11
1.36
In an
oligopoly,
Note that
to
IMPLICIT COALITIONS
Suppose player
A''.
3 in a
repeated three-player
strateg>' of consistently
choosing
two-player game. However, in the three-player game, maintaining two-way cooperation at A" also
may not be
A').]
A"*)
1 1
<
12
= P,{A\ A^
to A"^
and
A', players
A"^
and
it
some other
action,
somewhere between
and
If they
to
choose an
action which maximizes their joint payoff against the defecting third-player
A"".
defined (for
player
1) as:
P,{A\
In the
example above.
A""
13/9
GPD game
in variant, just
In
...
and
N
3,
Table 3 illustrates the impact of implicit coalitions. Notice that for a fixed action by player
best cooperative response by players
1
the
and
is
players
and 2
is itself a
two-player
Abreu (1986) has recently proposed a class of strategies known as "carrot and stick" strategies which use severe punishments (as low, as P' and even lower! as a credible threat to enforce maximally collusive behavior.
Table 3: Illustration of payoffs when actions are limited to A\ A", and A"^. (The first, second, and third lines refer to
the payoffs to players 1,2,
and 3,
respectively.)
A, = A'
A,
= A'
A,=A'
A, = A''
-A"
A, = A''
Finally,
we acknowledge
that
in the three
ranges of
competitors' actions
We define
and
/"^f
ranges
[A'^,
A"] and
[A"", A'],
h(A^.A^)
ifA^.A^SA^
if
GENERIC;
A/^) =
A"
f,(A^,A)
ifA^.A^^A-^
the
GPD,
it is
(In the
the payoffs in any one round depend only on the actions in that round, but each player can observe the
previous actions by their competitors.) Unfortunately, as Axelrod (1981) showed for the classical PD,
there
is
may be best.
COALITION
is
competitors are
known
to
be playing
COALITION
Axelrod 1980a, b) pioneered a methodology aimed at identifying strategies
(
PD
to
simple or complex; some participants even created strategies that tried to identify opponent's
strategies
and then
of all,
TIT
and
in
each
key properties that helped distinguish the most successful strategies, such as niceness which means
"never be the
first to defect."
A second tournament was run soon after the first tournament's results were tallied
geographic origins. The winner, once again, was TIT
victory
This time
Axelrod received 62 entries from participants representing a wide range of ages, disciplines, and
was no fiuke
the importance of niceness, Axelrod also identified pivotal
provocability
(i.e..
never
let
(i.e.,
do not
pays
to cooperate.
Axelrod's tournaments have provoked praise and criticism, but they have raised a
interesting ideas.
number of
GPD.
MITCSl:
In
THEFIRSTGPDTOURNAMENT
first
for
MIT Competitive
Strategy Tournament), similar in design to Axelrod's second tournament, but featuring the
Each game
in the tourney
with three programmed strategies choosing actions each period from a continuous range As in
Axelrod's tournament, each possible grouping of entries engaged in five repeated games, and the
overall winner
games
in
which
it
participated. Contestants were given full information about the payoff function used,
and
in every
access to the past actions of all three players in that game. Entries were
By July
1985,
game
theorists,
submitted the best and most creative entries they had found after running their own mini-
GPD
contains a wide variety of creative efforts from some very strategically-minded people
Description of Entries
TIT
FOR TAT.
Six
strategies recognized implicit coalitions and incorporated the implicit coalition action, A"" into their
algorithms.
We label
fit
into the
GENERIC framework,
Most algorithms
of ways, including:
fj
of the previous decisions of each competitor, not just their most recent actions
tried to incorporate continuous alternatives, but participants did so in a variety
MIN:
Start at A'. In
all
MAX:
Start at A'. In
all
AVG:
Start at A'. In
all
to
Two MAXs,
extremely
action only
if
both competitors do so
(pronounced
MAXCUM)
rounds
it
it
TIT
AVG,
MXCM is both provocable and forgiving since follows any actions made by the leading is able to ignore inefTective strategy. However, MXCM is distinguished from AVG because
it
strategies. In contrast,
AVG will always give equal weight to the actions of both competitors,
to
generalize TIT
until
any
it
goes to
A''
for all
subsequent rounds.
We label
this
XTRM.
strategies (eg, always cooperate rALL-A'j, always defect
random (RND)
randomly
[A'',
A'] or used a
constant action, or
RND.
Beyond these general descriptions
of strategy types, the entries differed due to specific tactics or
Following Axelrod (1984), strategies which start the game cooperatively and are never the
cut action below A' are termed nice as opposed to nasty strategies
, ,
first to
first to defect.
Self-awareness allows strategies to consider the previous decisions of all three players (not just
the two competitors)
to
effects.
Many
no way
to increase
payoffs by choosing actions outside of this range. Strategies that were willing to go below A'' or above
Some
strategies tried to induce cooperation by raising actions slightly above the level specified by
10
MIN
have action-raising
initiati%'e
,
that
is,
an attempt
to
Finally, several strategies occasionally used the envious action A' to try to outscore their
own
score).
These descriptions are admittedly vague. The actual implementation of some of these features
can vary greatly from strategy
to strategy.
we ignore
fine-grain difTerences
among algorithms,
since the
it
mere presence of a particular feature was generally more important was implemented.
MITCSl RESULTS
AND INTERPRETATION
Table 4 presents a summary of the strategies and their performances. The strategies are ranked
by their average score per round. For comparison, mutual cooperation pays 20 units per period
each player, while each period of mutual defection
approximately 12 units
to
(i.e. all
to
A'')
yields
COALITION.
to
The
top four algorithms in the tournament recognized the coalition property, and all six IC strategies
continuous nature of the actions. Most entrants used simple heuristics (eg MIN,
MAX, AVG) to
(i.e.
address this problem with varying degrees of success. The standard averaging strategy
nice,
6'"'
AVG with no self-awareness, envy, or action-raising/cutting initiative) finished in easily beating standard MAX (ranked W^) and standard MIN (12'").
bounded
Several of the descriptive features were highly influential. First and foremost
is
place,
niceness,
reconfirming the findings of Axelrod. Nice algorithms were able to reap great benefits by avoiding
the short-term temptation to defect.
The
make
competitor responded
would return
to
standard
make
still
Other important features were boundedness and lack of envy. Only one successful algorithm ever
exhibited envious behavior, but that strategy (ranked 2-) would only go to A'
if
Perhaps
if
this
second-ranked strategy
11
Table 4
Official
MlTCSl
Results.
Rank
Notes
(1)
to
Table 4:
Default features include niceness, no self-awareness, bounded actions, no action-raising or -cutting initiative, and no en\-y Exceptions are noted as N for nastiness, S for self-awareness, U for unbounded actions, R for action-raising initiative, C for action-cutting initiative, and E for envy.
.
(2)
(MIT) denotes an algorithm entered by a member of the MIT community which was not eligible to win the tournament. Post-tournament testing indicates that the inclusion of these entries does not affect the ordering of the top algorithms.
Weighted average of all three players' actions, using cumulative scores as weights. Mimics previous move of one opponent on odd rounds, other opponent on even rounds.
Geometric mean of opponents' previous actions.
Stays at A'
for
to A'.
(7)
(8)
Random walk
Mimics actions of one opponent chosen at random at start of game; actions limited Uniform random variable between A andA^. (10) Only algorithm to introduce action increases above A^ (11) Only algorithm to introduce action cuts below A'.
(9)
to
range XA'.A"^].
it
to
win the
tournament.
The value
comparing standard
(ranked 6- and
12-, respectively) to their equivalent but unbounded counterparts (ranked 19- and 22,
respectively).
to
round
to
MIN.
For example, some strategies, unlike TIT
FOR TAT, considered their own previous decisions in determining future actions, e.g.
across
all
averaging
who used
it
as a ratchet on actions.
example, only
competitors.
Little
their
initiative.
Some
of
the nice, bounded entries were able to encourage cooperation and discourage cheating with
appropriate rewards and penalties, but these successes were counterbalanced by the unsuccessful
strategies that brought on their
own demise by
much
at the
wrong
times.
Table 4 seems
to depict
it
in nearly
on a payoff-per-round basis
is
equivalent
to a
An Alternative Champion
The winning algorithm, COALITION, was
the only highly-ranked strategy that did not
acknowledge the continuity of actions. Apparently, none of its top rivals could use the continuous of
13
Old
Rank
With
this in
mind, we sought
to
COALEN'C
it
was unique
to the payofTfunction
could be
Soon after we completed the analysis of MITCSl, we announced a second tournament, MITCS2,
with the following payofTfunction:
n,
= 200{8-6A^ + A^ + A^){A^-l)-l80.
to a linear
(2)
Equation
(2)
corresponds
demand function
in economics.
As before,
MITCSl
full
= A^ = A^ = A')
full
defection (A^
= A^ = Aj = A'^) pays
The key
action
(A'')
difference between
is
now depend on
14 +
A +A
(3)
Af^
^
and
A
"
12
13
+ A
:;
'c
(A)
For example,
if player
(3),
3 chooses
''
= A "=
1.440.
However,
if
player 3 chooses
then
A/ =
and
Aj''
= A/= 1.441.
of COALITION
and
of three entries
was matched
games
of 200 rounds^
Tournament Results
By fall
MITCS2. Five
strategies carried
over (some with slight modifications) from MITCSl. These strategies were included again because they led to interesting pricing behavior in the original tournament. Finally, suggestions from other
individuals
field of 44
who did
not wish to officially participate led to 6 more submissions, thus rounding out the
unique entries.
in
Table
6,
game
all
games.
15
Table
6.
MITCS2
N=
Official Results
(see
Notes
to
Table
6,
next page)
Strategy
Rank
1
Entrant
Robert
E.
L.
Type
Nasty Bound
Description
Marks
Bishop;
COALENC
(26+A^ + Aj)/20
Robert
Tony Haig
3
Paul
R.
Pudaite
John Hulland
Neil
5 6
7 8
Bergmann
Tony Haig
James M.
Lattin
1.4
1,4 1.4
17.096
17.091
Original
A""
13/9)
= (13+A)/10;
17.085 17.084
17.075
A" = (13+A^)/10
A"^ A"'
= (13+Aj)/10 =
(13 +A^)/10; looks
17.065
A" =
1.44
(MITCSl #7)'
Scott A. Neslin Scott A. Neslin
4
A"
1.4 1.4
17.064
17.063
Standard
increases
AVG.S AVG.S
10
11
17.042
MXCM.S
L.
17.025
1.4
Mimics previous
A""
12
13
Robert
Bishop
COALENC AVG
AVG.S AVG.S
AVG.S
16.993 16.989
13/9; uses
A"
Ad
Gradually
shifts
14
15
(MITCSl #5)
Weighted average
16 17
18 19
Karel Xajman
(MITCSl #6i
AVG
James M.
Lattin
(MITCSl #11)
Karel Najman
20
21
MAX AVG
COALITION MIN
1.4
16.926 16.908
MAX: maximum
Unbounded
AVG
= 85/59
Terry Elrod
1.4 1.4
16.907 16.887
Same
Plays
22
23
James
Laltm
(MITCSl #21)
Neil
COALENC
COALITION MIN MXCM.S MIN
N
1.4
16.765
24
25 26 27
28
Bergmann
16.754
16.707
COALITION
A" =
Chris Jones
Standard MIN:
minimum
(MITCSl #10)
Karel Najman
Ad
16.679
16.567 16.187
Mimic previous
Unbounded MIN
Play
A'^ until
1
(MITCSl #17)
29 30
31
anyone cuts
price; play
A thereafter
James M. Lattin
Karel Najman
MAX
ALL- 1 44
32
33
(MITCSl #26)
MXCM
XTRM
ALL-A'^.R
N N N N N N
N
16.162
Start at
.4
15.992
1.4
Weighted
Start at
1
MAX
1.44
Always choose
Start at
1
a'
1.4
.4
34
35 36
37
Choose
.4 for
A'
(MITCSl #35)
Pel*r
J.
1.4 1.4
if
Brock
Pudaite
AVG
ALL-A*^
ALL-A'' ALL-A"*
and lA + A
>/2
Paul
R.
38 39 40
41
Karel Najman
Paul R Pudaite
Peter
J.
Brock
Lattin
James M.
Peter
J.
ALL1.39
ALL-A-^.C
42
43
Brock
(MITCSl #32)
(MITCSl #36)
RANDOM
ALLA'
44
N N N N N N N N
ChooseA =
Start at
1
(6
+ A+A)/6
to
maximizejoint
profits
13.979
5;
thereafter choose
13.854
SUrt
A"
1.39
13.734
13.213
Start at
.4;
thereafter choose
1
A
players
1
Always choose
.39
)/6 to
13.089
1.333
ChooseA
= (7A -A
hurt non-ALL-A
1
12.817
.333 and
.5
11.754
16
Notes
'
to
Table 6:
stratg>' types are defined in the descriptions
to strategies
The basic
,
and
in the text.
,
The
to action-cutting initiative
Firm
is
assumed
to
entry in
refers to its
ranking
in that
tournament.
(i.e.,
One
pattern that
COALENC
method
The winning
of coalition action.
uses
(4) to
player 3 and averages this action with an A"" calculated against player
1.50
2.
if A^
COALENC with A'' = (1.44 + 1.45)/2 = compared to a A" of 1.44 that (4) would suggest (and most COALENC entries would use).
andA^ =
1.40, this
1.445, as
At first glance
this
may seem
like
an inefficient
rule, since
it
action slightly above the "optimal" A". But notice which routine
came
A" (4)
13/9
1.444...
This
is
will
exceed the
A" suggested by
below
1.444...
A pattern
emerges: the top two strategies consistently choose higher coalition actions than any of the other
COALENC entries.
.440.
we examine
phenomenon
further, but
now we
briefly
COALITION,
as compared to
its
sterling performance in
in
MITCSl.
to the different
mix of strategies
MITCS2 compared
to
MITCSl:
COALITION. This
is
particularly true
is
moderate
levels.
due
to the
A" to
rely on,
flexible
and forgiving
successful coalition.
for a
MITCS2
recognized this idea and used one of two lower bounds - fixed at 1.40 or floating
The
results in Table 6
show no
entries
#4 and #5
games
same except
constituent
17
all too
surprising, after
all,
1
when
action cutting
is
severe enough to
.40
anyway
algorithms. In
MITCS2, just
as in
MITCSl,
AVG (entry
MAX (#19),
MIX
(# 25). The value of having bounded actions can be seen once again by
to
to
MITCSl, but
this is only
because of the smaller number of extreme action cutters. Only three entries
fact, the
entries
from the
A" is
e.g.,
less likely
#7, often
will be
mistaken
for a defector
higher
A"
s will
To generate a
slightly
more magnanimous
strategy,
If
we included another
potentially beneficial
its
A" calculation.
own previous
action
stable.
No
longer would different subsets of players seek different coalition actions, environments
MITCSl
Our new
strategy,
named CE A VG3
an
(for coalition
coalition actions),
is still
CEAVG3
new
39-^A,-(-A-^A,
A/^=
^
(5)
30
for the
those of entry #1, this small increase combined with the moderating influence of the lagged A^ term
helps the
new
entries.
Table
new
strategy.)
18
Table
entry
8:
it
When we extend
the
PD to the GPD we find the possibility of implicit coalitions, that is, coalitions
We also expect,
that use the continuous nature of the action space will do better than those that do not.
We tested our conjectures in two three-player GPD's as described by equations (1) and (2). Our
methodology was that of computer tournaments.
In both tournaments, implicit coalitions proved to be the key feature that distinguished the
most
MITCSl
a simple discrete
COALITION
strategy-
won and
MITCS2, several
different variants of
did surprisingly well, especially considering the differences in the payoff function from
entries,
magnanimity seemed
to
best algorithms.
Finally,
in a variety of hypothetical
environments,
although at least one alternative algorithm did well in the nastiest of environments.
Despite the fine showings of CEAVG3 and the other
GPD
is
a rich and complex problem and our tournaments only begin to tap its
complexity. Nonetheless,
we do feel
GPD.
Beyond the obvious questions of more than
As
to recognize
possibilities for
game theory
21
References
Abreu, Dilip (1986), "Extremal Equilibria of Oligopolistic Supergames," Journal ofEconomic Theory,
39:191-225.
Axelrod, Robert (1980a), "EfTective Choice in the Prisoner's Dilemma," Journal of Conflict Resolution, 24 (March), 3-25.
(1980b), "More Effective Choice in the Prisoner's Dilemma," Journa/o/"Con/7ic/ Resolution, 24 (September). 379-403.
(1981),
Reuiew, 75:306-318.
{1984),
Barnaby, Frank (1987), "The Nuclear Arsenal in the Middle East," Technology Review, 90 (MayJune), 27-35.
3\. 169-193.
Supergames," i?ei;ieu;o/"coriomic
Goehring, Dwight J. and James P. Kahan (1976), "The Uniform N-Person Prisoner's Dilemma Game," Journal of Conflict Resolution, 20 (March), 1 1 1-128.
Griffin,
Management, MIT.
Hamburger, Henry and Company.
(1919),
Games
Lambin, Jean J (1976), Advertising, Competition, and Market Conduct New York: American Elsevier Publishing Co.
,
in Oligopoly
Over Time.
Philippe Naert, and Alain Bultez (1975), "Optimal Marketing Behavior European Economic Review, 6(1), 105-128
Schelling,
in Oligopoly,"
Thomas C. (1973), "Hockey Helmets, Concealed Weapons, and Daylight Saving: Binary Choices with Externalities," Journal of Conflict Resolution, 17:381-428.
A Study of
New York
Wiley.
Simon, Hermann (1979), "Dynamics of Price Elasticity and Brand Life Cycle: Journal of Marketing Research, 16 (November), 439-452
Taylor, Michael (1916), Anarchy
Telser, Lester G. (1962), "The
An
Empirical Study,"
Demand for Brand Goods as Estimated from Consumer Panel Data," Review of Economics and Statistics, 44 (August), 300-324.
L5^^:L',
? ^'^2 L)
22
Date Due
%^
Lib-26-fi7
MIT LIBRARIES
111