Dynamic Bargaining Game Analysis

Dynamic Games: Bargaining
Carlos Hurtado
Department of Economics
University of Illinois at Urbana-Champaign
hrtdmrt2@illinois.edu
June 27th, 2016
C. Hurtado (UIUC - Economics) Game Theory

On the Agenda
1 Finite Horizon Bilateral Bargaining
2 Rubinstein Bargaining Outcome
3 Exercises
4 Infinitely Repeated Games
5 How Players evaluate payoffs in infinitely repeated games?
6 An Infinitely Repeated Prisoner’s Dilemma
7 Exercises

Finite Horizon Bilateral Bargaining
On the Agenda
3 Exercises
7 Exercises

I Why are economists interested in bilateral bargaining?
It is a foundational or fundamental economic relationship on which the theory of

markets should be built.
I Today we are going to study a model of alternating offers bargaining that was
formulated separately by Dale Stahl and Ariel Rubenstein.
I The model uses a repited version of the ultimatum game where two players have a
continuous dollar to divide.
I There are several stages on the game, so we will use a discount factor δi
I How do we interpret a discount factor δi ?
Either as the time cost of money or the probability of continuing to another stage
(if all players share the same discount factor).
C. Hurtado (UIUC - Economics) Game Theory 1 / 31


I Ultimatum Game: Two players have a continuous dollar to divide.
Player 1 proposes to divide the dollar at x ∈ [0, 1], where he will keep x and player
2 will receive 1 − x .
Player 2 can choose to either Accept this division of the dollar, or Reject it, in
which case each player receives 0.

I Ultimatum Game:
Players: I = {1, 2}
Actions: A1 = {x |x ∈ [0, 1]} and A2 = {A, R}
Strategies: S1 = x and S2 (x ) ∈ A2 = {A, R}
Payoffs: (x , A) → (x , 1 − x ) or (x , R) → (0, 0)

I Any division (p, 1 − p) of the dollar can be sustained as a Nash equilibrium:
1 : s1 = p

R if 1 − x < 1 − p
2 : s2 (p) =
A if 1 − x ≥ 1 − p
I Notice that 2’s strategy specifies how he responds to any offer, not just the one
that 1 actually makes in equilibrium.
I We might interpret this equilibrium as "2 demands at least 1 − p, and 1 offers 2

the minimal amount that 2 will accept."
I In the case of p < 1, however, player 2’s strategy involves a not believable threat
to reject a positive amount 1 − p in favor of 0. Not rational behavior1 .
1
Rationality requires that player 2 accept any non negative offer. If we assume that 2 accepts only positive offers, there is
not best response for 1: he wants to choose x < 1 as large as possible, which is not well-defined

Theorem
There exist a unique SPNE such that Player 1’s offer is p=1 and Player 2’s strategy is
Accept with probability one, that is:
P1 : p=1
P2 : A
Proof: If Player 1 offers p < 1 he is leaving money on the table

If p < 1, then ∃ > 0 such that p + = 1.
Notice that p < p + 2 < 1 and 1 − p − 2 > 0
Then s̃1 = p + 2 is accepted and strictly dominates s1 = p. Hence, in any SPNE
Player 1 offers p = 1.
R and A is a best response because Player 2 always gets payoff of 0.
However, assume that Player 1 offers p = 1 and Player 2 Accepts with probability
π < 1.
The expected payoff of Player 1 is then 1 · π < 1.
Player 1 could offer π < s̃1 = π + 1−π
2
< 1 and Player 2 Accepts because
1 − π+1
2
> 0.
Player 1 could deviate of offering p = 1, a suboptimal strategy.
Finite horizon: bargaining for 2 stages

I Now consider a 2-stage analysis where player 1 proposes to divide the dollar at a
level x1 . Player 2 Accept this division of the dollar, or Reject it. If Rejection occurs
player 2 proposes to divid the dollar at a level x2 . as if they where playing the
Ulitmatum game. Notice that this is not a repeated game, as there is only one
dollar to be divided over time (not a new one for each stage).

I Nash equilibrium has two useful implications that are worth noting:
1. If i makes j an offer that j will accept, then the best response property of a Nash
equilibrium requires that i offer j the minimal amount that j will accept, according
to j’s strategy.
2. Player j’s rule for which offers he accepts at a given stage is based upon the
discounted value in that stage of rejecting the offer: j should reject an offer that
gives him y in stage t if and only if the present value in stage t of what he will get
by rejecting exceeds y. The present value of continuing to the next stage thus
determines a player’s cutoff for accepting/rejecting offers.

I Any division of the dollar in either stage 1 or stage 2 can be sustained by a Nash
equilibrium (Why?).
I Subgame perfection determines a unique division of the dollar (Why?).

I Consider the following 2 stage bargaining game in which $1 is to be split between
players 1 and 2. Each player i has discount factor δi ∈ (0, 1)
- Determine the subgame perfect equilibrium of this game

(Hint: use backward induction).
Subgame Perfect Nash Equilibrium:

I Player 1: Stage 1 plays x1 = 1 − δ2 ; Stage 2: accept any x2 .
I Player 2: Stage 1 accept if and only if 1 − x1 ≥ δ2 ; Stage 2: play x2 = 0


players 1 and 2. Each player i has discount factor δi ∈ (0, 1). Determine the
subgame perfect equilibrium of this game.


Subgame Perfect Nash Equilibrium:
I Player 1:
Stage 0: plays x0 = 1 − δ22 ;
Stage 1: plays x1 = 1 − δ2 ;
Stage 2: accept any x2 .
I Player 2:
Stage 0: accept if and only if 1 − x0 ≥ δ22 ;
Stage 1: accept if and only if 1 − x1 ≥ δ2 ;
Stage 2: play x2 = 0

Rubinstein Bargaining Outcome
On the Agenda
3 Exercises
7 Exercises

I Two players bargain over a pie of size 1
I Agreement: X = {(x1 , x2 ) ∈ R2 |x1 + x2 = 1; xi ≥ 0 ∀i}
I Bargaining Procedure:
- Players can take actions only at T = {1, 2, · · · }
- At each t ∈ T player i proposes x ∈ X , player j chooses A or R
- If player j chooses A bargaining ends
- If player j chooses R bargaining continues with j propose
I The game is infinite in two dimensions: 1) Continuum of choices 2) Unbounded

paths.

I Ariel Rubinstein retired yesterday, Jun 26th, 2017. He started teaching in October
1976 (as a Ph.D. student at the Hebrew U.) for a period of 41 years. He is the one
that rediscovered Nash.
Theorem
Every bargaining game of alternating offers in which the players’ preferences are well
behave has a unique subgame perfect Nash equilibrium. In this equilibrium Player 1
proposes to Player 2 exactly the present value that he will get when is his turn, and
accepts an offer from Player 2 that is at leas as good as what we gets if he continues the
game. Hence, any subgame perfect equilibrium must have the players reach an
agreement in the first round.

Exercises
On the Agenda
3 Exercises
7 Exercises

Exercises
Exercises
players 1 and 2. Each player i has discount factor δi ∈ (0, 1). Determine the
subgame perfect equilibrium of this game.

Infinitely Repeated Games
On the Agenda
3 Exercises
7 Exercises

I If a game has a unique Nash equilibrium, then its finite repetition has a unique
SPNE (Exercise).
I Our intuition, however, is that long-term relationships may be fundamentally

different from one-shot meetings.
I Infinitely repeated games also model a long-term relationship in which the players
do not know a priori when they will stop repeating the game: there is no
pre-ordained number of repetitions.
I Recall the terminology: The game that is being repeated is the stage game.
I The stages of the game are t = 0, 1, 2, .... An infinitely repeated game is also
sometimes called a supergame.

How Players evaluate payoffs in infinitely repeated games?
On the Agenda
3 Exercises
7 Exercises


I A player receives an infinite number of payoffs in the game corresponding to the
infinite number of plays of the stage game.
I We need a way to calculate a finite payoff from this infinite stream of payoffs in
order that a player can compare his strategies in the infinitely repeated game.
I The most widely used approach is discounted payoffs:
- Let ρit denote the payoff that player i receives in the t-th stage of the game.
- Player i evaluates an infinite sequence of payoffs as a sum of discounted values.
∞
X
δit ρit
t=0
- This will be a finite number as long as (|ρit |)∞

t=0 is bounded above.
- This discounted sum is typically modified by putting (1 - δi ) in front (for reasons
that will be clear in a moment).
∞
X
(1 − δi ) δit ρit
t=0
- This is a renormalization of utility that doesn’t change player i’s ranking of any
two infinite sequences of payoffs.

I We all know that:
an − b n (a − b) an−1 + an−2 b + an−3 b 2 + . . . + ab n−2 + b n−1

=
n−1
X
= (a − b) an−j−1 b j
j =0
I Set a = x and b = 1:
xn − 1 (x − 1) x n−1 + x n−2 + x n−3 + . . . + a + 1

=
n−1
X
= (x − 1) xj
j =0
I Hence:
n−1
X xn − 1 1 − xn
xj = =
x −1 1−x
j =0

I Note that, if |x | < 1:

lim x n = 0
n→∞
I Then:
n−1 ∞
X X 1
lim xj = xj =
n→∞ 1−x
j =0 j =0
I In particular, the (1 − δi ) insures that player i evaluates the sequence in which he

receives a constant c in each period as c:
∞
X 1
(1 − δi ) δit c = c(1 − δi ) =c
1 − δi
t=0

I An alternative approach is limiting average payoffs:

Pn−1
t=0
ρit
lim
n→∞ n
I The existence of this limit is sometimes a problem.
I The advantage of this formula, however, is that it is easy to calculate the limiting
payoff if the sequence of payoffs eventually reaches some constant payoff.

An Infinitely Repeated Prisoner’s Dilemma
On the Agenda
3 Exercises
7 Exercises


I We will analyze the game using discounted payoffs. Consider the following version
of the prisoner’s dilemma:
1/2 c nc
c 2,2 -3,3
nc 3,-3 -2,-2
I Here, c refers to "cooperate" while nc refers to "don’t cooperate".
I A theorem that we stated in the beginning of the class implies that there is a
unique SPNE in the finite repetition of this game, namely nc, nc in each and every
stage.
I This remains an SPNE outcome of the infinitely repeated game:
player 1 : nc in every stage
player 2 : nc in every stage
I Given the other player’s strategy, playing nc maximizes player i’s payoff in each
stage of the game and hence maximizes his discounted payoff (and also his average
payoff, if that is how he’s calculating his return in the infinite game).
I This isn’t a very interesting equilibrium. Why bother with infinite repetition if this
is all that we can come up with?
I In particular, we ask "Can the players sustain (c, c) as the outcome in each and
every stage of the game as a noncooperative equilibrium?"
I Consider the following strategy as played by both players:
1. Play c to start the game and as long as both players play c.

2. If any player ever chooses nc, then switch to nc for the rest of the game.
I This is a trigger strategy in the sense that bad behavior (i.e., playing nc) by either
player triggers the punishment of playing nc in the remainder of the game.
I It is sometimes also called a "grim" trigger strategy to emphasize how unforgiving

it is: if either player ever chooses nc, then player i will punish his opponent forever.

I Does the use of this trigger strategy define an SPNE?
I Playing c in any stage does not maximize a player’s payoff in that stage (nc is the
best response within a stage).
I Suppose player i starts with this strategy and considers deviating in stage k to
receive a payoff of 3 instead of 2.
I Thereafter, his opponent chooses nc, and so he will also choose nc in the
remainder of the game.
I The use of trigger strategies therefore defines a Nash equilibrium if and only if the
equilibrium payoff of 2 in each stage is at least as large as the payoff from
deviating to nc in stage k and ever thereafter:
∞
" k−1 ∞
#
X X X
(1 − δi ) δit · 2 ≥ (1 − δi ) δit · 2 + δik · 3 + δit · (−2)
t=0 t=0 t=k+1

I We have:
∞
" k−1 ∞
#
X X X
(1 − δi ) δit ·2 ≥ (1 − δi ) δit ·2+ δik ·3+ δit · (−2)
t=0 t=0 t=k+1
∞
X ∞
X
δit · 2 ≥ δik · 3 + δit · (−2)
t=k t=k+1
∞ ∞
X X
δit · 2 ≥ 3 + δi δit · (−2)
t=0 t=0
2 2δi
≥ 3−
1 − δi 1 − δi
5δi ≥ 1
1
δi ≥
5

I Deviating from the trigger strategy produces a one-time bonus of changing one’s
stage payoff from 2 to 3. The cost, however, is a lower payoff ever after.
I We see that the one-time bonus is worthwhile for player i only if his discount
factor is low (δi < 1/5), so that he doesn’t put much weight upon the low payoffs
in the future.
I When each δi ≥ 1/5, do the trigger strategies define a subgame perfect Nash
equilibrium (in addition to being a Nash equilibrium)?
I The answer is Yes!
I A subgame of the infinitely repeated game is determined by a history, or a finite
sequence of plays of the game.
I There are two kinds of histories to consider:
1. If each player chose c in each stage of the history, then the trigger strategies
remain in effect and define a Nash equilibrium in the subgame.
2. If some player has chosen nc in the history, then the two players use the
strategies (nc,nc) in every stage.

I Whichever of the two kinds of history we have, the strategies define a Nash
equilibrium in the subgame. The trigger strategies therefore define a subgame
perfect Nash equilibrium whenever they define a Nash equilibrium.
I Recall the fundamental importance of the Prisoner’s Dilemma: it illustrates quite

simply the contrast between self-interested behavior and mutually beneficial
behavior.
I The play of (nc, nc) instead of (c, c) represents the cost of noncooperative
behavior in comparison to what the two players can achieve if they instead were
able to cooperate.
I What we’ve shown is that that the cooperative outcome can be sustained as a
noncooperative equilibrium in a long-term relationship provided that the players
care enough about future payoffs!

A General Analysis
I We will calculate a lower bound on δi that is sufficient to insure that player i will
not deviate from si∗ .
I Suppose player i deviates from si∗ in stage k. We make two observations:
1. Player −i switches to s−i in each stage t > k. Player i’s best response is to
choose si in each stage after the k-th stage (recall our assumption that
(s1 , s2 ) is a Nash equilibrium).
2. The maximal payoff that player i can gain in the k-th stage is di (by
assumption).

A General Analysis
I The following inequality is therefore necessary and sufficient for player i to prefer
his trigger strategy to the deviation that we are considering:
∞
" k−1 ∞
#
X X X
(1 − δi ) δit · πi∗ ≥ (1 − δi ) δit · πi∗ + δik · di + δit · πi
t=0 t=0 t=k+1
∞ ∞
X X
δit · πi∗ ≥ δik · di + δit · πi
t=k t=k+1
∞ ∞
X X
δit · πi∗ ≥ d i + δi δit · πi
t=0 t=0
πi∗ πi δi
≥ di +
1 − δi 1 − δi
πi∗ ≥ di (1 − δi ) + δi πi
di − πi∗
δi ≥
di − π

A General Analysis
I As in the previous example, we have obtained a lower bound on δi that is sufficient

to insure that player i will not deviate from his trigger strategy given that the
other player uses his trigger strategy.
I Several observations:
1. The analysis focuses on a single player at a time and exclusively on his
payoffs. The bound thus extends immediately to stage games with n > 2
players. The assumption that there are two players has no role in the above
analysis.
2. Notice that any player who deviates from the "better" strategies (s1∗ , s2∗ )
triggers the switch by both players to the Nash equilibrium strategies (s1 , s2 ).
This is unfair in the sense that both players suffer from the bad behavior of
one of the two players (it is part of the definition of equilibrium).
3. If di ≤ πi∗ , then player i has no incentive to deviate from si∗ (he doesn’t even
get a one-stage "bonus" from ending the play of (s1∗ , s2∗ ) for the rest of the
game). We thus don’t have to worry about player i’s willingness to stick to
his trigger strategy regardless of the value of his discount factor.

Exercises
On the Agenda
3 Exercises
7 Exercises

Exercises
Exercises
I Consider the following stage game:
1\2 L C R
T 1,-1 2,1 1,0
I
M 3,4 0,1 -3,2
B 4,-5 -1,3 1,1
a. Find the unique pure strategy Nash equilibrium.
b. Write down a trigger strategy where the outcome of the game is (M, L).
c. Find a lower bound on δi that is sufficient to insure that player i will not deviate
from his trigger strategy given that the other player uses his trigger strategy.

Dynamic Bargaining Game Analysis

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Dynamic Bargaining Game Analysis

Uploaded by

Copyright:

Available Formats

Dynamic Games: Bargaining

June 27th, 2016

C. Hurtado (UIUC - Economics) Game Theory

1 Finite Horizon Bilateral Bargaining

2 Rubinstein Bargaining Outcome

4 Infinitely Repeated Games

5 How Players evaluate payoffs in infinitely repeated games?

6 An Infinitely Repeated Prisoner’s Dilemma

C. Hurtado (UIUC - Economics) Game Theory

1 Finite Horizon Bilateral Bargaining

2 Rubinstein Bargaining Outcome

4 Infinitely Repeated Games

5 How Players evaluate payoffs in infinitely repeated games?

6 An Infinitely Repeated Prisoner’s Dilemma

C. Hurtado (UIUC - Economics) Game Theory

Finite Horizon Bilateral Bargaining

I Why are economists interested in bilateral bargaining?

It is a foundational or fundamental economic relationship on which the theory of

I How do we interpret a discount factor δi ?

C. Hurtado (UIUC - Economics) Game Theory 1 / 31

Finite Horizon Bilateral Bargaining

C. Hurtado (UIUC - Economics) Game Theory 2 / 31

Finite Horizon Bilateral Bargaining

Actions: A1 = {x |x ∈ [0, 1]} and A2 = {A, R}

Strategies: S1 = x and S2 (x ) ∈ A2 = {A, R}

C. Hurtado (UIUC - Economics) Game Theory 3 / 31

Finite Horizon Bilateral Bargaining

I Any division (p, 1 − p) of the dollar can be sustained as a Nash equilibrium:

I We might interpret this equilibrium as "2 demands at least 1 − p, and 1 offers 2

Finite Horizon Bilateral Bargaining

Proof: If Player 1 offers p < 1 he is leaving money on the table

Finite horizon: bargaining for 2 stages

C. Hurtado (UIUC - Economics) Game Theory 6 / 31

Finite horizon: bargaining for 2 stages

C. Hurtado (UIUC - Economics) Game Theory 7 / 31

Finite horizon: bargaining for 2 stages

Finite horizon: bargaining for 2 stages

- Determine the subgame perfect equilibrium of this game

Finite horizon: bargaining for 2 stages

Subgame Perfect Nash Equilibrium:

C. Hurtado (UIUC - Economics) Game Theory 10 / 31

Finite horizon: bargaining for 3 stages

C. Hurtado (UIUC - Economics) Game Theory 11 / 31

Finite horizon: bargaining for 3 stages

C. Hurtado (UIUC - Economics) Game Theory 12 / 31

Finite horizon: bargaining for 3 stages

Subgame Perfect Nash Equilibrium:

Stage 0: plays x0 = 1 − δ22 ;

Stage 2: accept any x2 .

Stage 0: accept if and only if 1 − x0 ≥ δ22 ;

Stage 1: accept if and only if 1 − x1 ≥ δ2 ;

C. Hurtado (UIUC - Economics) Game Theory 13 / 31

1 Finite Horizon Bilateral Bargaining

2 Rubinstein Bargaining Outcome

4 Infinitely Repeated Games

5 How Players evaluate payoffs in infinitely repeated games?

6 An Infinitely Repeated Prisoner’s Dilemma

C. Hurtado (UIUC - Economics) Game Theory

Rubinstein Bargaining Outcome

I Two players bargain over a pie of size 1

I Agreement: X = {(x1 , x2 ) ∈ R2 |x1 + x2 = 1; xi ≥ 0 ∀i}

- Players can take actions only at T = {1, 2, · · · }

- At each t ∈ T player i proposes x ∈ X , player j chooses A or R