Session 16:
Numerical Dynamic Programming
Agenda
Discrete-time dynamic programming.
Continuous-time dynamic programming.
Methods for finite-state problems:
Value function iteration.
Policy function iteration.
Gaussian acceleration methods.
Methods for continuous-state problems:
Discretization.
Parametric approximation methods.
Projection methods.
Discrete-Time Dynamic Programming
The objective is to maximize the expected NPV of payoffs
$$E\left[\sum_{t=0}^{T} \pi(x_t, u_t, t) + W(x_{T+1})\right].$$
In the autonomous case the period payoff is $\beta^t \pi(x, u)$, where $\beta \in [0, 1)$ is the
discount factor, and neither $F(\cdot)$ nor $D(\cdot)$ depends explicitly on $t$.
The value function $V(x)$ satisfies the Bellman equation
$$V(x) = \max_{u \in D(x)} \pi(x, u) + \beta\, E\left[V(x^+) \mid x, u\right].$$
Continuous-Time Dynamic Programming: Deterministic Case
The state at time $t$ is $x(t) \in X \subseteq \mathbb{R}^n$: continuous states.
The objective is to maximize the NPV of payoffs
$$\int_0^T e^{-\rho t}\, \pi(x, u, t)\, dt + W(x(T))$$
subject to the law of motion
$$\dot{x} = f(x, u, t), \quad x(0) = x_0.$$
The Bellman equation is
$$\rho V(x, t) - V_t(x, t) = \max_{u \in D(x, t)} \pi(x, u, t) + \sum_{i=1}^{n} V_{x_i}(x, t)\, f_i(x, u, t)$$
with terminal condition $V(x, T) = W(x)$.
In the autonomous infinite-horizon case the Bellman equation becomes
$$\rho V(x) = \max_{u \in D(x)} \pi(x, u) + \sum_{i=1}^{n} V_{x_i}(x)\, f_i(x, u).$$
Continuous-Time Dynamic Programming: Stochastic Case
Continuous states. The shocks follow a Brownian motion $z$.
The objective is to maximize the expected NPV of payoffs
$$E\left[\int_0^T e^{-\rho t}\, \pi(x, u, t)\, dt + W(x(T))\right]$$
subject to the law of motion
$$dx = f(x, u, t)\, dt + \sigma(x, u, t)\, dz, \quad x(0) = x_0.$$
The Bellman equation is
$$\rho V(x, t) - V_t(x, t) = \max_{u \in D(x, t)} \pi(x, u, t) + \sum_{i=1}^{n} V_{x_i}(x, t)\, f_i(x, u, t) + \frac{1}{2} \operatorname{tr}\!\left(\sigma(x, u, t)\, \sigma(x, u, t)^{\top} V_{xx}(x, t)\right)$$
with terminal condition $V(x, T) = W(x)$, where $\operatorname{tr}(A)$ is the trace of the matrix $A$.
In the autonomous infinite-horizon case the Bellman equation becomes
$$\rho V(x) = \max_{u \in D(x)} \pi(x, u) + \sum_{i=1}^{n} V_{x_i}(x)\, f_i(x, u) + \frac{1}{2} \operatorname{tr}\!\left(\sigma(x, u)\, \sigma(x, u)^{\top} V_{xx}(x)\right).$$
Finite-State Problems
The set of states is $X = \{x_1, x_2, \ldots, x_n\}$. Time is discrete.
The law of motion is a controlled discrete-time, finite-state, first-order
Markov process, where $q_{ij}^t(u)$ is the probability that the state transits
from $x_i$ to $x_j$ if the control is $u$ at time $t$.
Finite-horizon case: Let $V_i^t = V(x_i, t)$, $i = 1, \ldots, n$, $t = 0, \ldots, T+1$. The
Bellman equation is
$$V_i^t = \max_{u \in D(x_i, t)} \pi(x_i, u, t) + \beta \sum_{j=1}^{n} q_{ij}^t(u)\, V_j^{t+1}$$
with terminal condition $V_i^{T+1} = W(x_i)$.
This is a recursive system of nonlinear equations. Solve backwards from $t = T+1$
to $t = 0$ for $V_i^t$, $i = 1, \ldots, n$.
Infinite-horizon case: Let $V_i = V(x_i)$, $i = 1, \ldots, n$. The Bellman equation
is
$$V_i = \max_{u \in D(x_i)} \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, V_j.$$
This is a system of nonlinear equations. The contraction mapping theorem
ensures existence and uniqueness of a solution.
Finite-State Problems: Value Function Iteration
Define the operator $T$ pointwise by
$$(TV)_i = \max_{u \in D(x_i)} \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, V_j, \quad i = 1, \ldots, n.$$
Value function iteration:
Initialization: Choose initial guess $V^0$ and stopping criterion $\epsilon$.
Step 1: Compute $V^{l+1} = TV^l$.
Step 2: If $\|V^{l+1} - V^l\| < \epsilon$, stop; otherwise, go to step 1.
The sequence $\{V^l\}_{l=0}^{\infty}$ converges linearly at rate $\beta$ to $V^*$, and
$\|V^{l+1} - V^*\| \le \beta\, \|V^l - V^*\|$. Hence,
$$\|V^l - V^*\| \le \frac{\|V^{l+1} - V^l\|}{1 - \beta}.$$
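As a concrete illustration, here is a minimal sketch of value function iteration in Python; the two-state payoffs, transition probabilities, and discount factor below are invented for the example, not taken from the text:

```python
import numpy as np

beta = 0.9                                    # discount factor (illustrative)
pi = np.array([[1.0, 0.5],                    # pi[i, u]: payoff in state i under control u
               [0.0, 2.0]])
q = np.array([[[0.9, 0.1], [0.2, 0.8]],       # q[i, u, j]: transition probability i -> j under u
              [[0.6, 0.4], [0.1, 0.9]]])

def T(V):
    """Bellman operator: (TV)_i = max_u pi(x_i, u) + beta * sum_j q_ij(u) V_j."""
    return np.max(pi + beta * np.einsum('iuj,j->iu', q, V), axis=1)

V = np.zeros(2)                               # initial guess V^0
eps = 1e-8
for _ in range(1000):
    V_new = T(V)
    done = np.max(np.abs(V_new - V)) < eps * (1 - beta) / beta
    V = V_new
    if done:                                  # this criterion guarantees ||V - V*|| < eps
        break
```

The stopping criterion uses the factor $(1-\beta)/\beta$ so that the error bound on $\|V^l - V^*\|$ translates the observed step size into a guarantee on the true error.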
To ensure $\|V^{l+1} - V^*\| < \epsilon$, use the stopping criterion
$\|V^{l+1} - V^l\| < \epsilon (1 - \beta)/\beta$.
Finite-State Problems: Policy Function Iteration
Define the policy operator $U$ pointwise by
$$(UV)_i = \arg\max_{u \in D(x_i)} \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, V_j, \quad i = 1, \ldots, n.$$
Let $U_i = U(x_i)$, $i = 1, \ldots, n$, $Q_U = (q_{ij}(U_i))_{i,j}$, and $\Pi_U = (\pi(x_i, U_i))_i$. Then
the value $V_U$ of following policy $U$ forever satisfies the system of linear
equations
$$V_U = \Pi_U + \beta Q_U V_U \iff V_U = (I - \beta Q_U)^{-1} \Pi_U.$$
Policy function iteration (a.k.a. Howard improvement):
Initialization: Choose initial guess $V^0$ and stopping criterion $\epsilon$. (Or:
Choose $U^0$ instead of $V^0$ and go to step 2.)
Step 1: Compute $U^{l+1} = UV^l$.
Step 2: Solve $(I - \beta Q_{U^{l+1}})\, V^{l+1} = \Pi_{U^{l+1}}$ to obtain $V^{l+1}$.
Step 3: If $\|V^{l+1} - V^l\| < \epsilon$, stop; otherwise, go to step 1.
Step 2 computes the value of following policy $U^{l+1}$ forever.
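A minimal sketch of policy function iteration on an invented two-state, two-action problem (all numbers illustrative); step 2 is an exact linear solve:

```python
import numpy as np

beta = 0.9
pi = np.array([[1.0, 0.5], [0.0, 2.0]])       # pi[i, u] (illustrative)
q = np.array([[[0.9, 0.1], [0.2, 0.8]],       # q[i, u, j] (illustrative)
              [[0.6, 0.4], [0.1, 0.9]]])
n = 2

V = np.zeros(n)
for _ in range(100):
    # Step 1: policy improvement, U^{l+1}(i) = argmax_u pi(x_i,u) + beta sum_j q_ij(u) V_j
    U = np.argmax(pi + beta * np.einsum('iuj,j->iu', q, V), axis=1)
    # Step 2: policy evaluation, solve (I - beta Q_U) V = Pi_U exactly
    Q_U = q[np.arange(n), U]                  # transition matrix under policy U
    Pi_U = pi[np.arange(n), U]                # payoff vector under policy U
    V_new = np.linalg.solve(np.eye(n) - beta * Q_U, Pi_U)
    done = np.max(np.abs(V_new - V)) < 1e-10  # Step 3
    V = V_new
    if done:
        break
```

On finite-state problems the iteration typically terminates in a handful of outer steps, since the policy space is finite.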
Finite-State Problems: Policy Function Iteration
Modified policy iteration with $k$ steps: Replace step 2 by
Step 2a: Set $W^0 = V^l$.
Step 2b: Compute $W^{j+1} = \Pi_{U^{l+1}} + \beta Q_{U^{l+1}} W^j$, $j = 0, \ldots, k$.
Step 2c: Set $V^{l+1} = W^{k+1}$.
Step 2 now computes the value of following policy $U^{l+1}$ for $k+1$ periods.
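The k-step evaluation can be sketched as follows (same kind of invented two-state problem; k and all numbers illustrative):

```python
import numpy as np

beta, k = 0.9, 20
pi = np.array([[1.0, 0.5], [0.0, 2.0]])       # illustrative data
q = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.6, 0.4], [0.1, 0.9]]])
n = 2

V = np.zeros(n)
for _ in range(200):
    U = np.argmax(pi + beta * np.einsum('iuj,j->iu', q, V), axis=1)
    Q_U, Pi_U = q[np.arange(n), U], pi[np.arange(n), U]
    W = V.copy()                              # Step 2a: W^0 = V^l
    for _ in range(k + 1):                    # Step 2b: W^{j+1} = Pi_U + beta Q_U W^j
        W = Pi_U + beta * Q_U @ W
    done = np.max(np.abs(W - V)) < 1e-10
    V = W                                     # Step 2c: V^{l+1} = W^{k+1}
    if done:
        break
```

Each inner step is a cheap matrix-vector product, so this trades the exact linear solve of step 2 for k+1 applications of the policy's expected-value operator.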
The sequence $\{V^l\}_{l=0}^{\infty}$ converges linearly to $V^*$ and
$$\|V^{l+1} - V^*\| \le \min\left\{\beta,\; \frac{1 - \beta^k}{1 - \beta}\, \|U^l - U^*\| + \beta^{k+1}\right\} \|V^l - V^*\|.$$
The rate approaches $\beta^{k+1}$ as $U^l$ approaches $U^*$: accelerated convergence.
Finite-State Problems: Gaussian Acceleration Methods
Idea: The Bellman equation is a system of nonlinear equations. Treat it
as such!
Pre-Gauss-Jacobi iteration (a.k.a. value function iteration):
$$V_i^{l+1} = \max_{u \in D(x_i)} \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, V_j^l, \quad i = 1, \ldots, n.$$
Gauss-Jacobi iteration:
$$V_i^{l+1} = \max_{u \in D(x_i)} \frac{\pi(x_i, u) + \beta \sum_{j \ne i} q_{ij}(u)\, V_j^l}{1 - \beta\, q_{ii}(u)}, \quad i = 1, \ldots, n.$$
Pre-Gauss-Seidel iteration:
$$V_i^{l+1} = \max_{u \in D(x_i)} \pi(x_i, u) + \beta \sum_{j < i} q_{ij}(u)\, V_j^{l+1} + \beta \sum_{j \ge i} q_{ij}(u)\, V_j^l, \quad i = 1, \ldots, n.$$
Gauss-Seidel iteration:
$$V_i^{l+1} = \max_{u \in D(x_i)} \frac{\pi(x_i, u) + \beta \sum_{j < i} q_{ij}(u)\, V_j^{l+1} + \beta \sum_{j > i} q_{ij}(u)\, V_j^l}{1 - \beta\, q_{ii}(u)}, \quad i = 1, \ldots, n.$$
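A minimal sketch of the Gauss-Seidel iteration on an invented two-state, two-action problem: within each sweep, already-updated values are reused for $j < i$, and the self-transition $q_{ii}(u)$ is absorbed into the denominator.

```python
import numpy as np

beta = 0.9
pi = np.array([[1.0, 0.5], [0.0, 2.0]])       # pi[i, u] (illustrative)
q = np.array([[[0.9, 0.1], [0.2, 0.8]],       # q[i, u, j] (illustrative)
              [[0.6, 0.4], [0.1, 0.9]]])
n, n_controls = 2, 2

V = np.zeros(n)
for _ in range(500):
    V_old = V.copy()
    for i in range(n):                        # sweep the states in increasing order
        vals = []
        for u in range(n_controls):
            other = sum(q[i, u, j] * V[j] for j in range(n) if j != i)
            vals.append((pi[i, u] + beta * other) / (1 - beta * q[i, u, i]))
        V[i] = max(vals)                      # V[j] for j < i was already updated this sweep
    if np.max(np.abs(V - V_old)) < 1e-10:
        break
```

The fixed point is the same $V^*$ as for value function iteration; only the path to it (and typically the number of sweeps) changes.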
Finite-State Problems: Gaussian Acceleration Methods
Idea: The Gauss-Seidel methods depend on the ordering of states. Exploit it!
Figure: Downwind (solid) and upwind (dashed) directions. Source: Judd, K. (1998), Figure 12.1.
Upwind Gauss-Seidel: In iteration $l$, first order the state space such that
$q_{i,i+1}(U_i^l) \ge q_{i+1,i}(U_{i+1}^l)$, $i = 1, \ldots, n$. Then traverse the state space in
decreasing order.
Simulated upwind Gauss-Seidel: In iteration $l$, first simulate the Markov
process under $U^l$. Then traverse the simulated states in decreasing order.
Alternating sweep Gauss-Seidel: In iteration $l$, traverse the state space
in increasing (decreasing) order if $l$ is odd (even).
Continuous-State Problems: Discretization
Specify a finite-state problem that is similar to the continuous-state problem under consideration.
Example: Optimal growth. The Bellman equation is
$$V(k) = \max_{c \in [0,\, k + f(k)]} u(c) + \beta V(k + f(k) - c).$$
Replace the set of states $[0, \infty)$ by $K = \{k_1, k_2, \ldots, k_n\}$. Choose $K$ large
enough so that the initial and the steady state are contained in it.
To ensure landing on a point in $K$, take the control to be next period's
state and rewrite the Bellman equation as
$$V(k) = \max_{k^+ \in K} u(k + f(k) - k^+) + \beta V(k^+).$$
Remarks:
Easy and robust.
Sometimes requires reformulating the problem and/or altering the set
of states and controls.
Requires a large number of points, particularly if the state space is
multidimensional.
Inefficient approximation to smooth problems.
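A minimal sketch of the discretized growth problem, assuming a hypothetical net production function f(k) = k^0.3 - 0.1k and log utility (both chosen only for illustration):

```python
import numpy as np

beta = 0.95
f = lambda k: k**0.3 - 0.1 * k                # production net of depreciation (hypothetical)
u = lambda c: np.log(c)                       # utility (hypothetical)
K = np.linspace(0.1, 4.0, 200)                # grid K replacing the state space [0, inf)

# Consumption implied by each (k, k+) pair; infeasible choices get -inf payoff
C = K[:, None] + f(K)[:, None] - K[None, :]
R = np.where(C > 0, u(np.maximum(C, 1e-12)), -np.inf)

V = np.zeros(len(K))
for _ in range(2000):
    # V(k) = max_{k+ in K} u(k + f(k) - k+) + beta V(k+), vectorized over the grid
    V_new = np.max(R + beta * V[None, :], axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new
policy = K[np.argmax(R + beta * V[None, :], axis=1)]  # next period's capital k+
```

Because the control is next period's state, every iterate lands exactly on a grid point; no interpolation is needed.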
Continuous-State Problems: Parametric Approximation Methods
Approximate the value function using the family of functions $\hat{V}(x; a)$
and use methods for finite-state problems to choose the parameters $a$.
Parametric dynamic programming with value function iteration:
Initialization: Choose a functional form for $\hat{V}(x; a)$, where $a \in \mathbb{R}^m$,
and a set of points $X = \{x_i\}_{i=1}^n$, where $n \ge m$. Choose initial guess
$a^0$ and stopping criterion $\epsilon$.
Step 1 (maximization step): Compute
$$v_i = \max_{u \in D(x_i)} \pi(x_i, u) + \beta \int \hat{V}(x^+; a^l)\, dF(x^+, x_i, u), \quad i = 1, \ldots, n.$$
Step 2 (fitting step): Compute $a^{l+1}$ such that $\hat{V}(x; a^{l+1})$ approximates
the Lagrange data $\{(x_i, v_i)\}_{i=1}^n$.
Step 3: If $\|\hat{V}(x; a^{l+1}) - \hat{V}(x; a^l)\| < \epsilon$, stop; otherwise, go to step 1.
Three interconnected components:
Numerical integration.
Maximization.
Function approximation (CompEcon toolbox: help cetools).
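A minimal sketch of parametric dynamic programming, assuming a degree-5 polynomial family for $\hat{V}$ and a hypothetical deterministic growth model (so the integral in step 1 drops out); the fitting step is an ordinary least-squares polynomial fit:

```python
import numpy as np

beta = 0.95
f = lambda k: k**0.3                          # production (hypothetical)
u = lambda c: np.log(c)                       # utility (hypothetical)
grid = np.linspace(0.2, 3.0, 15)              # n = 15 fit points
kp = np.linspace(0.2, 3.0, 200)               # choice grid for the next state
Vhat = lambda k, a: np.polyval(a, k)          # polynomial family, m = 6 coefficients

a = np.zeros(6)                               # initial guess a^0
for _ in range(1000):
    # Maximization step: v_i = max_{k+} u(k + f(k) - k+) + beta * Vhat(k+; a^l)
    C = grid[:, None] + f(grid)[:, None] - kp[None, :]
    vals = np.where(C > 0,
                    u(np.maximum(C, 1e-12)) + beta * Vhat(kp, a)[None, :],
                    -np.inf)
    v = np.max(vals, axis=1)
    # Fitting step: least-squares polynomial through the data {(k_i, v_i)}
    a_new = np.polyfit(grid, v, 5)
    done = np.max(np.abs(Vhat(grid, a_new) - Vhat(grid, a))) < 1e-8
    a = a_new
    if done:
        break
```

Here n = 15 points pin down m = 6 coefficients by least squares; with n = m the fitting step would be exact interpolation instead.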
Continuous-State Problems: Parametric Approximation Methods
The computable approximation $\hat{T}$ to the contraction mapping $T$ may be
neither contractive nor monotonic.
Shape-preserving methods:
Figure: Dynamic programming and the shape of the value function. Source: Judd, K. (1998), Figure 12.2.
Linear spline ($C^0$).
Schumaker shape-preserving spline ($C^1$; use the envelope theorem to obtain
the Hermite data $\{(x_i, v_i, v_i')\}_{i=1}^n$).
Bilinear and simplicial interpolation ($C^0$).
Could use policy function iteration instead of value function iteration,
but Gauss-Seidel methods are harder to adapt.
Continuous-State Problems: Projection Methods
The Bellman equation is a functional equation.
Approximate the value function using the family of functions $\hat{V}(x; a)$
and choose the parameters $a$ such that $\hat{V}(x; a)$ almost satisfies the
Bellman equation.
The residual function is defined pointwise by
$$R(x; a) = -\hat{V}(x; a) + \max_{u \in D(x)} \pi(x, u) + \beta \int \hat{V}(x^+; a)\, dF(x^+, x, u).$$
Special case: Suppose the FOC ensures optimality. Then the value
function $V(x)$ and the optimal policy function $U(x)$ satisfy
$$V(x) = \pi(x, U(x)) + \beta \int V(x^+)\, dF(x^+, x, U(x)),$$
$$0 = \pi_u(x, U(x)) + \beta \int V(x^+)\, dF_u(x^+, x, U(x)).$$
Approximate the value function using the family of functions $\hat{V}(x; a)$ and
the optimal policy function using $\hat{U}(x; b)$.
Continuous-State Problems: Projection Methods
The residual function is defined pointwise by
$$R(x; a, b) = \begin{pmatrix} R_1(x; a, b) \\ R_2(x; a, b) \end{pmatrix} = \begin{pmatrix} -\hat{V}(x; a) + \pi(x, \hat{U}(x; b)) + \beta \int \hat{V}(x^+; a)\, dF(x^+, x, \hat{U}(x; b)) \\ \pi_u(x, \hat{U}(x; b)) + \beta \int \hat{V}(x^+; a)\, dF_u(x^+, x, \hat{U}(x; b)) \end{pmatrix}.$$
Even more special case: Suppose the FOC ensures optimality and can
be solved in closed form for $U(x)$. Then the value function $V(x)$ satisfies
$$V(x) = \pi(x, U(x)) + \beta \int V(x^+)\, dF(x^+, x, U(x)).$$
Approximate the value function using the family of functions $\hat{V}(x; a)$.
The residual function is defined pointwise by
$$R(x; a) = -\hat{V}(x; a) + \pi(x, U(x)) + \beta \int \hat{V}(x^+; a)\, dF(x^+, x, U(x)).$$
Projection methods are natural for continuous-time problems.
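A minimal sketch of this last case via collocation, assuming the textbook log-utility/Cobb-Douglas/full-depreciation growth model, whose FOC yields the policy in closed form, U(k) = alpha*beta*k^alpha (the model, nodes, and basis are assumptions for illustration). Because $\hat{V}$ is linear in $a$, setting $R = 0$ at the collocation nodes reduces to a linear system:

```python
import numpy as np

# Log utility, Cobb-Douglas production, full depreciation: the FOC gives
# the optimal policy in closed form, U(k) = alpha*beta*k^alpha.
alpha, beta = 0.3, 0.95
U = lambda k: alpha * beta * k**alpha
pi = lambda k: np.log(k**alpha - U(k))        # period payoff pi(k, U(k))

# Approximate V with a polynomial in log k; collocation uses n = m = 3 nodes
nodes = np.array([0.2, 0.5, 1.0])
basis = lambda k: np.column_stack([np.ones_like(k), np.log(k), np.log(k)**2])

# R(k; a) = -Vhat(k; a) + pi(k) + beta * Vhat(U(k); a) is linear in a,
# so R = 0 at the nodes is the linear system A a = -pi(nodes)
A = -basis(nodes) + beta * basis(U(nodes))
a = np.linalg.solve(A, -pi(nodes))
```

For this model the exact value function is $c_0 + \frac{\alpha}{1 - \alpha\beta} \ln k$, which lies in the span of the basis, so the collocation solution recovers it and the quadratic coefficient comes out numerically zero.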