
REPEATED GAMES

• Up to now we have focused our attention on games
where there is no repeated interaction over time
among players.
• Nevertheless, in many situations players interact
repeatedly for a long period.

Examples:
— Competition among firms.
— Supply chain relationships.
— Interactions among colleagues in the
workplace.
ONE-SHOT GAMES VERSUS REPEATED GAMES

• In repeated games, players may condition their
strategies on the history of the game (what
happened in the past matters!).
• In repeated games there exist punishments!
• More sophisticated strategies can be played
and more equilibria exist with respect to
one-shot games!
A MOTIVATING EXAMPLE: INDUCING COOPERATION
        L       C       R       Z
T      5,5     3,6     0,0     0,0
M      6,3     4,4     0,0     0,0
B      0,0     0,0     1,1    -1,-1
W      0,0     0,0    -1,1    -2,-2

What are the pure-strategy NE of this one-shot game?
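
A minimal Python sketch (payoffs copied from the table above) that finds the pure-strategy NE by enumerating profiles and checking mutual best responses:

```python
# Brute-force search for pure-strategy Nash equilibria of the 4x4 bimatrix game above.
payoffs = {  # (row, col) -> (payoff of player 1, payoff of player 2)
    ("T", "L"): (5, 5), ("T", "C"): (3, 6), ("T", "R"): (0, 0), ("T", "Z"): (0, 0),
    ("M", "L"): (6, 3), ("M", "C"): (4, 4), ("M", "R"): (0, 0), ("M", "Z"): (0, 0),
    ("B", "L"): (0, 0), ("B", "C"): (0, 0), ("B", "R"): (1, 1), ("B", "Z"): (-1, -1),
    ("W", "L"): (0, 0), ("W", "C"): (0, 0), ("W", "R"): (-1, 1), ("W", "Z"): (-2, -2),
}
rows, cols = ["T", "M", "B", "W"], ["L", "C", "R", "Z"]

def is_pure_ne(r, c):
    u1, u2 = payoffs[(r, c)]
    best_row = all(payoffs[(rr, c)][0] <= u1 for rr in rows)  # no profitable row deviation
    best_col = all(payoffs[(r, cc)][1] <= u2 for cc in cols)  # no profitable column deviation
    return best_row and best_col

print([(r, c) for r in rows for c in cols if is_pure_ne(r, c)])
# Expected output: [('M', 'C'), ('B', 'R')]
```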
INDUCING COOPERATION

• Suppose players play this game twice.
• After each repetition, players observe what every
player has played.
• Players’ payoff is the sum of what they get in
each round (discount factor equal to one).
• Is there a SPNE in which the “good” action profile
(T,L) is played in some round?
INDUCING COOPERATION
• Player 1 (row) plays T in the first round and plays
M in the second if and only if (T,L) was played in
the first round; otherwise he plays B.
• Player 2 (column) plays L in the first round and
plays C in the second if and only if (T,L) was
played in the first round; otherwise he plays R.

Let us check whether the previous strategy profile is a SPNE;
two cases have to be analyzed (a numerical check is sketched below):

Case 1: players played (T,L) at round 1.

Case 2: players did not play (T,L) at round 1.
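
In Case 2 the prescribed second-round play (B,R) is a stage-game NE, so no one-shot deviation is profitable. In Case 1, cooperation yields 5+4=9 to each player, while the best first-round deviation yields 6 followed by the punishment payoff 1. A minimal Python sketch of the comparison:

```python
# Numerical check of the proposed SPNE in the twice-repeated game
# (payoffs from the table above; no discounting).
coop_total_p1 = 5 + 4   # (T,L) in round 1, then (M,C) in round 2
coop_total_p2 = 5 + 4

dev_total_p1 = 6 + 1    # player 1 deviates to M in round 1 (gets 6),
                        # punishment (B,R) in round 2 (gets 1)
dev_total_p2 = 6 + 1    # player 2 deviates to C in round 1, then (B,R)

print(coop_total_p1 >= dev_total_p1, coop_total_p2 >= dev_total_p2)  # True True
```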
INDUCING COOPERATION
• The previous one is not the unique SPNE of the
game.
• Player 1 always plays M and player 2 always
plays C.
• This is also a SPNE of the game (without any
cooperation!).
Remark: If the one-shot game has a NE, there
exists a SPNE of the repeated game in which
players play this stage-game equilibrium in every
repetition.
REPEATED GAMES
• Consider a game G (which we’ll call the stage game
or the constituent game).
• Let the set of players be I={1,…,n}. We call a player’s
stage-game choices actions rather than strategies
and reserve “strategy” for choices in the repeated
game.
• Let Ai be the set of actions available to player i and A
the set of action profiles, A = ×i∈I Ai.
• Let G be played several times (perhaps an infinite
number of times) and award each player a payoff
which is the sum (perhaps discounted) of the
payoffs she got in each period from playing G.

• Then this sequence of stage games is itself a
game: a repeated game or a supergame.
• Two statements are implicit when we say that in each
period we’re playing the same stage game:
• A) For each player, the set of actions available to her
in any period of the game G is the same regardless of
which period it is and regardless of what actions have
taken place in the past.
• B) The payoffs to the players from the stage game in
any period depend only on the action profile for G
which was played in that period, and this stage-game
payoff to a player for a given action profile of G is
independent of the period in which it is played.
• A profile of actions played at time t is denoted by
a^t = (a^t_1, …, a^t_n).
• We want to be able to condition the players’
stage-game action choices in later periods upon
actions taken earlier by other players. To do this
we need the concept of a history: a description of
all the actions taken up through the previous
period. We define the history at time t as
h^t = (a^1, …, a^(t-1)).
STRATEGIES AND HISTORIES
• A strategy for player i at time t is a specification of an
action for each possible history.
• It will be useful to distinguish a particularly simple
class of repeated-game strategies, open-loop
strategies: a player plays the same action at any
time, irrespective of what has been played in the
past, i.e. irrespective of the history of the game.
• When a player’s strategy depends on the history,
we say that it is a closed-loop strategy.
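
As an illustration (with hypothetical action labels C and NC), a repeated-game strategy can be represented as a function mapping the history of past action profiles to a stage-game action; a minimal Python sketch:

```python
# Illustrative sketch: a repeated-game strategy maps a history (the list of
# past action profiles) to a stage-game action. Action labels are hypothetical.

def open_loop_strategy(history):
    # Ignores the history entirely: always plays the same action.
    return "C"

def closed_loop_strategy(history):
    # Conditions on the history: cooperates only if nobody has ever defected.
    return "C" if all(profile == ("C", "C") for profile in history) else "NC"

print(closed_loop_strategy([]))                         # 'C'  (empty history)
print(closed_loop_strategy([("C", "C"), ("C", "NC")]))  # 'NC' (a defection occurred)
```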
Remark: A sequence of open-loop stage-game Nash
equilibrium strategy profiles is a SPNE of the repeated
game.

• Playing the same Nash equilibrium strategy of the
stage game in every round is a SPNE of the repeated
game.

• However, players can play more sophisticated
strategies, as we have seen in our motivating
example.
FINITELY REPEATED GAMES
• Consider a game that is repeated a finite number of times
T, where the number of repetitions is common
knowledge.

Proposition: In a subgame-perfect equilibrium of the
repeated game, the last-period play after any history must
be a Nash equilibrium of the stage game.

Corollary: Suppose the stage game has a unique Nash
equilibrium. Then every subgame-perfect equilibrium
strategy profile of the repeated game involves a repetition
in every period of the Nash-equilibrium stage-game strategy
profile.
FINITELY REPEATED GAMES
A motivating example: the centipede game

[Game tree: players 1 and 2 alternate choosing between continue (c) and
stop (s); stopping at the successive decision nodes yields payoffs
(1,1), (0,3), (2,2), …, (97,100), (99,99), (98,101), while continuing to
the end yields (100,100).]

This game is NOT a repeated game, but it
conveys the intuition of why SPNE may be too
demanding an equilibrium concept and why it may
not predict players’ behavior well.
PRISONER’S DILEMMA
Consider the following prisoner’s dilemma repeated T times, with T
finite and greater than or equal to two. How many SPNE does this
repeated game have?

                     Bart
Homer          Non Coop.    Coop.
Non Coop.        0, 0       10, -3
Coop.           -3, 10       5, 5
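A quick Python check (payoffs from the table above, Homer as row player) that Non Coop. strictly dominates Coop. in the stage game, so the stage game has a unique NE and, by the corollary above, the unique SPNE of the finitely repeated game is to play (Non Coop., Non Coop.) in every period:

```python
# Homer's stage-game payoffs (the game is symmetric, so the same holds for Bart).
u_homer = {("NC", "NC"): 0, ("NC", "C"): 10, ("C", "NC"): -3, ("C", "C"): 5}

# NC strictly dominates C: it gives a higher payoff against either action of Bart.
dominates = all(u_homer[("NC", b)] > u_homer[("C", b)] for b in ("NC", "C"))
print(dominates)  # True
```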


INFINITELY REPEATED GAMES

• Cooperation is never possible in finite games like
the previous one because forward-looking players
anticipate the end of the game and never
cooperate.
• Infinitely repeated games are an easy way to
model repeated interaction while avoiding the
paradox of finite games in which each stage game
has a unique NE.
• In many real-world situations players interact as if
their interaction were a never-ending story.
INFINITELY REPEATED GAMES
• Since there is no end to the relationship,
cooperation may be sustained by future
punishment.
• Players trade off the current benefit of
non-cooperative behavior against the cost of a
future punishment.
• Players’ payoff is the discounted sum of the
stage-game payoffs, with discount factor δ < 1.
INFINITELY REPEATED GAMES

• We cannot apply backward induction in infinitely
repeated games, and they cannot be represented
in extensive form.
• Nevertheless, we can define SPNE and apply the
sequential-rationality logic.
• Consider the browser game with two possible
prices and let δ be the discount factor.
EXAMPLE: BROWSER GAME
Two firms compete à la Bertrand with a homogeneous product.
There are 100 customers. Each firm can set either price L=11 or H=20;
the marginal cost of production is zero.

Consider the following strategy for each firm:
set p=20 at t=0 and, for all t>0, set p=20 if p1=p2=20 at every t'<t;
set p=11 otherwise.
Is this strategy profile a SPNE of the game?
EXAMPLE: BROWSER GAME CONT’D
Case 1: if at time t-1 a firm set p=11, then p=11 at time t
is the unique NE of the stage game, so playing p=11
forever is optimal.

Case 2: if there have been no deviations, deviating is not
a best response iff
(20·50)/(1-δ) ≥ 11·100 + (δ/(1-δ))·11·50,
that is
δ ≥ 2/11
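
A minimal Python sketch of the comparison above (the printed checks use illustrative values on either side of the threshold δ = 2/11 ≈ 0.18):

```python
# Collusion condition in the browser game
# (100 customers, prices 11 or 20, zero marginal cost).
def collusion_is_best_response(delta):
    coop = (20 * 50) / (1 - delta)                         # share the market at p = 20 forever
    deviate = 11 * 100 + (delta / (1 - delta)) * 11 * 50   # undercut once, then p = 11 forever
    return coop >= deviate

print(collusion_is_best_response(0.20))  # True  (above the threshold 2/11)
print(collusion_is_best_response(0.15))  # False (below the threshold 2/11)
```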
PRISONER’S DILEMMA
If the discount factor is high enough, there always exists a strategy
sustaining cooperation (with the strongest punishment: no cooperation
forever).

                     Bart
Homer          Non Coop.    Coop.
Non Coop.        0, 0       10, -3
Coop.           -3, 10       5, 5


TRIGGER STRATEGY
• The trigger strategy punishes non-cooperation with
a never-ending (costly) punishment:

(1) The player plays C(oop) at t=1.

(2) She plays C at every t'>1 if both players played C
at each previous time t<t'; otherwise she plays NC
for all t ≥ t'.

For which discount factor is this strategy profile a SPNE?

5/(1-δ) ≥ 10 + (δ/(1-δ))·0, that is
δ ≥ 1/2
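
The same kind of numerical check for the trigger strategy in this prisoner’s dilemma (a minimal sketch):

```python
# Payoffs: 5 from mutual cooperation, 10 from a unilateral deviation,
# 0 in every period of the (never-ending) punishment phase.
def trigger_sustains_cooperation(delta):
    return 5 / (1 - delta) >= 10 + (delta / (1 - delta)) * 0

print(trigger_sustains_cooperation(0.50))  # True  (threshold delta = 1/2)
print(trigger_sustains_cooperation(0.49))  # False
```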
TRIGGER STRATEGY CONT’D
• The trigger strategy is not the only strategy that can
sustain cooperation, but it is the one with the
harshest punishment. It follows:
• Remark: If the discount factor is not high enough
to sustain cooperation when players adopt a
trigger strategy, then no strategy sustains
cooperation in equilibrium.
• Of course, there may exist other, less punitive
strategies that sustain cooperation.
• It may in fact be interesting to find the least costly
(in terms of punishment) strategy that sustains
cooperation.
• Typically, to sustain cooperation, strategies
prescribe a punishment for T periods, after which
players cooperate again.
Example: Suppose that the discount factor is 3/4. What
is the minimum number T of punishing periods that
sustains cooperation?
Compare the players’ payoff in case of cooperation and in
case of deviation from cooperation. Cooperating is a
best response iff

5/(1-δ) ≥ 10 + (δ^T/(1-δ))·5

Substituting δ = 3/4 we get (3/4)^T ≤ 1/2, which for
integer T requires

T ≥ 3.
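
A minimal Python sketch that searches for the smallest integer T satisfying the condition above at δ = 3/4:

```python
# Smallest integer number of punishing periods T satisfying
# 5/(1-delta) >= 10 + (delta**T/(1-delta))*5 at delta = 3/4.
delta = 3 / 4
T = 1
while 5 / (1 - delta) < 10 + (delta ** T / (1 - delta)) * 5:
    T += 1
print(T)  # 3
```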
COLLUSION WITH DEMAND SHOCKS
• Two firms compete à la Cournot an infinite
number of times. Let δ be the discount factor, let
market demand be Q = 12 - P, with zero marginal
cost and a homogeneous good.
• The (unique) NE of the one-shot game is the
Cournot-Nash equilibrium.
• What is the quantity that maximizes joint profits?
What quantity should colluding firms produce?
• What is the trigger strategy of the infinitely
repeated game? What is the minimum discount
factor that sustains collusion?
• In the Cournot one-shot game each firm produces
q=4 (for a total production equal to 8) and makes
profits equal to 16.
• If firms collude they can jointly produce the
monopoly quantity (equal to 6), that is q=3 each,
with each firm making profits equal to 18.
• The trigger strategy is that each firm produces
q=3 at time t if each firm produced q=3 at time t-1;
otherwise it produces q=4 forever.
• The optimal deviation from the collusive strategy is
to produce q=(12-3)/2=4.5 (selling at p=4.5).
• Therefore colluding is a best response if and only
if
20.25 + (δ/(1-δ))·16 ≤ 18/(1-δ)
• It follows that collusion is sustained by any
δ ≥ 9/17 ≈ 0.53
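
A minimal Python sketch collecting the numbers above (Cournot profit, collusive profit, optimal deviation) and the implied threshold discount factor:

```python
# Cournot example with inverse demand P = 12 - Q and zero marginal cost.
cournot_profit = 4 * (12 - 8)                   # q = 4 each, Q = 8, P = 4  -> 16
collusive_profit = 3 * (12 - 6)                 # q = 3 each, Q = 6, P = 6  -> 18
q_dev = (12 - 3) / 2                            # best response to the rival's q = 3 -> 4.5
deviation_profit = q_dev * (12 - 3 - q_dev)     # P = 4.5 -> 20.25

# Collusion is a best response iff 18/(1-d) >= 20.25 + 16*d/(1-d),
# i.e. d >= (20.25 - 18)/(20.25 - 16) = 9/17.
threshold = (deviation_profit - collusive_profit) / (deviation_profit - cournot_profit)
print(cournot_profit, collusive_profit, deviation_profit, round(threshold, 3))
# 16 18 20.25 0.529
```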
• Suppose now that each firm can only produce a
quantity equal to 3 or 4 and that the discount
factor is 0.7.
• Each firm only observes the equilibrium market
price, not the quantity produced by the other
firm.
• Trigger strategy: each firm produces 3 if the
equilibrium price of the previous period was 6; if
the market price was lower, it produces a quantity
equal to 4 forever.
• Suppose now that with exogenous probability
equal to 1/10 in each period there may be a
negative demand shock, in which case P = 11 - Q.
• What is the best (profit-maximizing) collusive
strategy for the firms?
• What is the minimum number of punishing periods
that sustains collusion?
• Three periods of punishment are enough! In fact, for δ = 0.7,

20.25 + 16·δ·(1 + δ + δ²) ≤ 18 + 18·δ·(1 + δ + δ²)
• The optimal punishment strategy is to sell q=4 for
three periods after having observed a low price
and then to shift back to the cooperative strategy
(q=3).
• Remark: in a colluding market you may observe
cyclical prices when there is imperfect information
about the other players’ strategies.
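
A small illustrative simulation (assumed ingredients: shock probability 1/10, the three-period punishment described above, quantities restricted to 3 or 4) showing how punishment phases triggered by demand shocks generate price cycles; the realization differs across runs:

```python
# Even when both firms follow the collusive strategy, a negative demand shock
# lowers the price and triggers a punishment phase, so observed prices cycle
# between 6 (collusion), lower shock prices, and 4 (punishment).
import random

punishment_left = 0
prices = []
for t in range(30):
    q = 4 if punishment_left > 0 else 3     # each firm's quantity this period
    shock = random.random() < 0.1           # demand is P = 11 - Q with prob. 0.1
    price = (11 if shock else 12) - 2 * q
    prices.append(price)
    if punishment_left > 0:
        punishment_left -= 1                # punishment lasts exactly three periods
    elif price < 6:
        punishment_left = 3                 # low price observed -> punish for 3 periods
print(prices)
```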
MORE ON COLLUSION
• Consider the previous game (without any exogenous
shocks) and suppose the (common) discount factor is
equal to 0.5.
• We claimed that colluding at the monopoly price is not
possible even if firms play a trigger strategy.
• Should we conclude that firms necessarily compete à
la Cournot because they are unable to collude?
• The answer is no. Suppose firms agree on producing
a quantity equal to 3.5 each.
• In equilibrium each firm makes profits equal to
3.5×5 = 17.5.
MORE ON COLLUSION
• If a firm deviates from this collusive agreement, it
produces (given that the other firm produces 3.5) the
quantity that maximizes its current profits, that is
4.25, as can easily be computed from the reaction
curve.
• It can easily be checked that the deviation is not
profitable:

18.06 + (0.5/(1-0.5))·16 ≤ (1/(1-0.5))·17.5
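
A minimal Python sketch of the partial-collusion check (note that the one-period deviation profit is 4.25 × 4.25 = 18.0625):

```python
# Partial collusion at q = 3.5 each, with P = 12 - Q and delta = 0.5.
delta = 0.5
collusive_profit = 3.5 * (12 - 7)               # q = 3.5 each -> P = 5 -> 17.5
q_dev = (12 - 3.5) / 2                          # reaction curve -> 4.25
deviation_profit = q_dev * (12 - 3.5 - q_dev)   # P = 4.25 -> 18.0625
cournot_profit = 16                             # punishment phase payoff

deviate = deviation_profit + (delta / (1 - delta)) * cournot_profit
collude = collusive_profit / (1 - delta)
print(deviate, collude, deviate <= collude)     # 34.0625 35.0 True
```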
MORE ON COLLUSION
• Why does this happen?
• By increasing the quantity produced in each
period, firms make a single-period deviation less
profitable, and therefore they reduce the incentive
to deviate.
• Firms do not necessarily collude at the monopoly
price, but at the highest price that allows them to
sustain collusion (and possibly to avoid being
caught by the Antitrust Authority!).
IS MORE INFORMATION ALWAYS BETTER (FOR
CONSUMERS)?

• The Internet can be considered the “market” where
information is almost costless.
• Consumers are (almost) perfectly informed about
the prices of every seller (search engines,
price-comparison websites, etc.).
• Hence, for homogeneous goods sold by more
than one seller, the Bertrand model suggests that
prices are equal to marginal costs and, more
generally, that we should observe fierce price
competition.
• Books are typically homogeneous goods.
• However, this is not the case. The book market is an
oligopolistic market (in the US two big sellers, Amazon and
Barnes&Noble) and data suggest collusion.
• Why are these two competitors able to play
“cooperatively”?
