Professional Documents
Culture Documents
Handout 10 Dynamic Programming Nov14
Handout 10 Dynamic Programming Nov14
1. Introduction
xt = h(st )
st+1 = g(st , xt )
Vj+1 (s) = max{r(s, x) + βVj (s′ )}, s.t. s′ = g(s, x), s given
x
f (y)
For a function f : R 7→ R,
this definition merely states
that for every z between x
and y, the point (z, f (z)) on
the graph of f is above the
straight line joining the points
(x, f (x)) and (y, f (y)). f (x) (1 − t)f (x) + tf (y)
x y
f
y=x
I A point x∗ is a x∗1
fixed-point of function f
if it satisfies f (x∗ ) = x∗ .
I Notice that x
f (f (. . . f (x∗ ) . . . )) =
x∗ .
x∗0
f
A mapping f : X 7→ X
from a metric space X into f (t)
itself is said to be a strong
|f (x) − f (y)|
contraction with modulus δ,
if 0 ≤ δ < 1 and
x0 = 6
6
x1 = f (x0 ) = 1 + 2 =4
4
x2 = f (x1 ) = 1 + 2 =3
3
x3 = f (x2 ) = 1 + 2 = 2.5
2.5
x4 = f (x3 ) = 1 + 2 = 2.25
..
.
x0 x
If we keep iterating, we will get arbitrarily close to the solution
x∗ = 2.
when the states and controls can be defined in such a way that
only x appears in the transition equation, i.e.,
s′ = g(x) ⇒ gs (s, x∗ ) = 0,
for each s.
3. Iterate over j to convergence on steps 1 and 2.
with s0 given at t = 0
I ϵt is a sequence of i.i.d. r.v. : P[ϵt ≤ e] = F (e) for all t
I ϵt+1 is realized at t + 1, after xt has been chosen at t.
I At time t:
I st is known
I st+j is unknown (j ≥ 1)
The consumer
I is endowed with A0 units of the consumption good,
I does not have income
I can save in a bank deposit, which yields a interest rate r.
The budget constraint is
At+1 = R(At − ct )
Consumer problem:
∞
∑
V (A0 ) = max β t u(ct ) (objective)
{ct ,At+1 }∞
t=0 t=0
At+1 = R(At − ct ) ∀t = 0, 1, 2, . . .
(budget constraint)
Bellman equation
{ }
V (A) = max′ u(c) + βV (A′ ) + λ[R(A − c) − A′ ]
c, A
I This says that the maximum lifetime utility the consumer can
get must be equal to the sum of current utility plus the
discounted value of the lifetime utility he will get starting next
period.
so the FOCs is
u′ (c)
− + βV ′ (A′ ) = 0 ⇒ u′ (c) = βRV ′ (A′ )
R
The envelope condition is:
V ′ (A) = λR = u′ (c)
The problem is
so the FOCs is
u′ (c) − βRV ′ (A′ ) = 0
but in this case the envelope condition is not useful:
Euler equation
I This says that at the optimum, if the consumer gets one more
unit of the good, he must be indifferent between consuming it
now (getting u′ (c)) or saving it (which increases next-period
assets by R) an consuming it later, getting a discounted value
of βRu′ (c′ ).
I Notice that this is the say result we found on Lecture 8
(Applications of consumer theory), in the two-period
intertemporal consumption problem!
∑
T
U (c0 , c1 , . . . , cT ) = β t ln ct
t=0
At+1 = R(At − ct )
∑
T
V0 (A0 ) = max β t ln c∗t
{ct }
t=0
subject to At+1 = R(At − ct )
Consumer problem:
∑
T
V0 (A0 ) = max β t ln ct (objective)
{c,A}
t=0
At+1 = R(At − ct ) ∀t = 0, . . . , T
(budget constraint)
AT +1 ≥ 0 (leave no debts)
AT +1 = R(AT − cT ), AT +1 ≥ 0
A
We need to find cT and AT +1 . Substitute cT = AT − TR+1 in the
objective function:
[ ]
AT +1
max ln AT − subject to AT +1 ≥ 0
AT +1 R
c∗T −1 = 1
1+β AT −1 ⇒ A∗T = Rβ
1+β AT −1
{ {
1 = RβcT −2 VT′ −1 (AT −1 ) AT −1 = Rβ(1 + β)cT −2
⇒
AT −1 = R(AT −2 − cT −2 ) AT −1 = R(AT −2 − cT −2 )
It follows that
R(β+β 2 )
c∗T −2 = 1
A
1+β+β 2 T −2
⇒ A∗T −1 = A
1+β+β 2 T −2
where
θT −2 = (β + 2β 2 ) ln R + (β + 2β 2 ) ln β − (1 + β + β 2 ) ln(1 + β + β 2 )
t c∗t A∗t+1
T AT 0AT
1 1
T −1 1+β AT −1 Rβ 1+β AT −1
1 1+β
T −2 A
1+β+β 2 T −2 Rβ 1+β+β 2 AT −2
We could guess that after k iterations:
1 k−1
T −k A
1+β+···+β k T −k Rβ 1+β+···+β
1+β+···+β k
AT −k
1−β 1 − βk
= AT −k Rβ AT −k
1 − β k+1 1 − β k+1
1−β k
Since AT −k+1 = Rβ 1−β k+1 AT −k , setting k = T, T − 1:
1 − βT
A1 = Rβ A0
1 − β T +1
1 − β T −1
A2 = Rβ A1
1 − βT
1 − β T −1
= (Rβ)2 A0
1 − β T +1
Iterating in this fashion we find that
1 − β T +1−t
At = (Rβ)t A0
1 − β T +1
1−β
c∗t = At
1 − β T +1−t
[ ]
1−β t1 − β
T +1−t
= (Rβ) A0
1 − β T +1−t 1 − β T +1
1−β
= (Rβ)t A0
1 − β T +1
ϕ
That is
ln c∗t = t ln(Rβ) + ln ϕ
∑
T ∑
T
V0 (A0 ) ≡ β t
ln c∗t = β t (t ln(Rβ) + ln ϕ)
t=0 t=0
∑
T ∑
T
= ln(Rβ) β t t + ln ϕ βt
t=0 t=0
( )
β 1− βT 1 − β T +1
= − Tβ T
ln(Rβ) + ln ϕ
1−β 1−β 1−β
( )
β 1 − βT
− Tβ T
ln(Rβ) + . . .
1−β 1−β
=
1 − β T +1 1−β 1 − β T +1
+ ln + ln A0
1−β 1−β T +1 1−β
( )
β 1 − βT 1 − β T +1 1−β 1 − β T +1
V0 (A0 ) = − T βT ln(Rβ)+ ln + ln A0
1−β 1−β 1−β 1−β T +1 1−β
1 β ln R + β ln β + (1 − β) ln(1 − β)
V0 (A0 ) = ln A0 +
1−β (1 − β)2
©Randall Romero Aguilar, PhD EC-3201 / 2019.II 66
The policy function
Policy function
t Vt (A)
T ln A
T −1 (1 + β) ln A + θT −1
T −2 (1 + β + β 2 ) ln A + θT −2
..
.
( )
1 β 1 − βT 1 1−β
0 ln A + − T βT ln(Rβ) + ln
1−β 1−β 1−β 1−β 1 − β T +1
Notice that the value function changes each period, but only
because each period the remaining horizon becomes one period
shorter.
©Randall Romero Aguilar, PhD EC-3201 / 2019.II 68
Time-invariant value function
or simply { }
V (A) = max′ ln c + βV (A′ )
c,A
That is, if
β ln R + β ln β + (1 − β) ln(1 − β)
θ=
(1 − β)2
but the term in square brackets must be zero from the FOC.
In this model
I the consumer is endowed with k0 units of a good that can be
used either for consumption or for the production of additional
good
I we refer to “capital” to the part of the good that is used for
future production
I capital fully depreciates with the production process.
I The lifetime utility of the consumer is again
∑
U (c0 , c1 , . . . , c∞ ) = ∞ t
t=0 β ln ct ,
I The production function is y = Ak α , where A > 0 and
0 < α < 1 are parameters.
I The budget constraint is ct + kt+1 = Aktα .
Consumer problem:
∞
∑
V (k0 ) = max ∞ β t ln ct (objective)
{ct ,kt+1 }t=0
t=0
Notice how this result is similar to the one we got in the savings
model: the return for giving up one unit of current consumption is:
*
The j subscript refers
©Randall Romero Aguilar, PhD
to an iteration, not to EC-3201
the horizon.
/ 2019.II 81
Starting from V0 = 0, the problem becomes:
{ }
V1 (k) = max
′
ln(Ak α − k ′ ) + β × 0
k
V1 (k) = ln c∗ + β × 0
= ln (Ak α )
= ln A + α ln k
V2 (k) = ln(c∗ ) + β ln A + αβ ln k ′∗
= ln(1 − θ1 ) + ln(Ak α ) + β[ln A + α ln θ1 + α ln(Ak α )]
= (1 + αβ) ln(Ak α ) + β ln A + [ln(1 − θ1 ) + αβ ln θ1 ]
= (1 + αβ) ln(Ak α ) + ϕ2
The FOC is
1 αβ(1 + αβ)
=
Ak α − k ′ k′
αβ + α2 β 2
k ′∗ = Ak α = θ2 Ak α
1 + αβ + α2 β 2
j V (k)
1 (1) ln (Ak α )
2 (1 + αβ) ln (Ak α ) + ϕ2
3 (1 + αβ + α2 β 2 ) ln (Ak α ) + ϕ3
Vj (k) = (1 + αβ + . . . + αj β j ) ln (Ak α ) + ϕj
V (k) = 1
1−αβ ln (Ak α ) + Φ
I Notice that we did not formally prove that the ϕj sequence
actually converges (so far we are just assuming it does.)
I FOC is
{
1 αβ k ′∗ = αβAk α = αβy
′
= ⇒
Ak − k
α (1 − αβ)k ′ c∗ = (1 − αβ)Ak α = (1 − αβ)y
Therefore
(1−αβ) ln(1−αβ)+β ln[A(αβ)α ]
(1 − β)Φ= 1−αβ
αβ
β ln A + ln (αβ) + ln(1 − αβ)(1−αβ)
Φ=
(1 − β)(1 − αβ)
k ′∗ (k) = αβAk α
c∗ (k) = (1 − αβ)Ak α
1 β ln A+ln(αβ)αβ +ln(1−αβ)(1−αβ)
V (k) = ln (Ak α ) +
1 − αβ (1−β)(1−αβ)
k ′∗ (y) = αβy
c∗ (y) = (1 − αβ)y
1 β ln A+ln(αβ)αβ +ln(1−αβ)(1−αβ)
V (y) = ln y +
1 − αβ (1−β)(1−αβ)
where { }
∑
w̄ := (1 − β) c + β V (w′ )ϕ(w′ )
w′
I Here w̄ is a constant depending on β, c and the wage
distribution called the reservation wage.
I The agent should accept if and only if the current wage offer
exceeds the reservation wage.
I Clearly, we can compute this reservation wage if we can
compute the value function.