# Lecture Notes 7: Dynamic Programming

In these notes, we deal with a fundamental tool of dynamic macroeconomics: dynamic programming. Dynamic programming is a very convenient way of writing a large class of dynamic problems in economic analysis, as most of the properties of this tool are now well established and understood.[^1] In these lectures, we will not deal with all the theoretical properties attached to this tool, but will rather give some recipes for solving economic problems using dynamic programming. In order to understand the method, we first deal with deterministic models before extending the analysis to stochastic ones. We begin, however, with some preliminary definitions and theorems that justify the whole approach.
## 7.1 The Bellman equation and associated theorems

### 7.1.1 A heuristic derivation of the Bellman equation
Let us consider the case of an agent who has to decide on the path of a set of control variables, $\{y_t\}_{t=0}^{\infty}$, in order to maximize the discounted sum of its future payoffs, $u(y_t, x_t)$, where $x_t$ is a vector of state variables assumed to evolve according to

$$x_{t+1} = h(x_t, y_t), \quad x_0 \text{ given.}$$

We finally make the assumption that the model is Markovian.

[^1]: For a mathematical exposition of the problem see Bertsekas [1976]; for a more economic approach see Lucas, Stokey and Prescott [1989].
The optimal value our agent can derive from this maximization process is given by the value function

$$V(x_t) = \max_{\{y_{t+s} \in D(x_{t+s})\}_{s=0}^{\infty}} \sum_{s=0}^{\infty} \beta^s u(y_{t+s}, x_{t+s}) \tag{7.1}$$

where $D$ is the set of all feasible decisions for the variables of choice. Note that the value function is a function of the state variable only: since the model is Markovian, only the current state is necessary to take decisions, such that the whole path can be predicted once the state variable is observed. Therefore, the value in $t$ is only a function of $x_t$. (7.1) may now be rewritten as

$$V(x_t) = \max_{y_t \in D(x_t),\, \{y_{t+s} \in D(x_{t+s})\}_{s=1}^{\infty}} u(y_t, x_t) + \sum_{s=1}^{\infty} \beta^s u(y_{t+s}, x_{t+s}) \tag{7.2}$$
Making the change of variable $k = s - 1$, (7.2) rewrites as

$$V(x_t) = \max_{y_t \in D(x_t),\, \{y_{t+1+k} \in D(x_{t+1+k})\}_{k=0}^{\infty}} u(y_t, x_t) + \sum_{k=0}^{\infty} \beta^{k+1} u(y_{t+1+k}, x_{t+1+k})$$

or

$$V(x_t) = \max_{y_t \in D(x_t)} u(y_t, x_t) + \beta \max_{\{y_{t+1+k} \in D(x_{t+1+k})\}_{k=0}^{\infty}} \sum_{k=0}^{\infty} \beta^k u(y_{t+1+k}, x_{t+1+k}) \tag{7.3}$$
Note that, by definition, we have

$$V(x_{t+1}) = \max_{\{y_{t+1+k} \in D(x_{t+1+k})\}_{k=0}^{\infty}} \sum_{k=0}^{\infty} \beta^k u(y_{t+1+k}, x_{t+1+k})$$

such that (7.3) rewrites as

$$V(x_t) = \max_{y_t \in D(x_t)} u(y_t, x_t) + \beta V(x_{t+1}) \tag{7.4}$$

This is the so-called Bellman equation that lies at the core of dynamic programming theory. With this equation are associated, in each and every period $t$, a set of optimal policy functions for $y$ and $x$, which are defined by

$$\{y_t, x_{t+1}\} \in \operatorname*{Argmax}_{y \in D(x)}\; u(y, x) + \beta V(x_{t+1}) \tag{7.5}$$
Our problem is now to solve (7.4) for the function $V(x_t)$. This problem is particularly complicated, as we are not solving for just a point that would satisfy the equation; we are interested in finding a function that satisfies the equation. A simple procedure to find a solution would be the following:

1. Make an initial guess on the form of the value function, $V_0(x_t)$.

2. Update the guess using the Bellman equation, such that

   $$V_{i+1}(x_t) = \max_{y_t \in D(x_t)} u(y_t, x_t) + \beta V_i(h(x_t, y_t))$$

3. If $V_{i+1}(x_t) = V_i(x_t)$, then a fixed point has been found and the problem is solved; if not, we go back to 2 and iterate on the process until convergence.

In other words, solving the Bellman equation just amounts to finding the fixed point of the Bellman equation or, introducing an operator notation, finding the fixed point of the operator $T$ such that

$$V_{i+1} = T V_i$$

where $T$ stands for the list of operations involved in the computation of the Bellman equation. The problem is then that of the existence and uniqueness of this fixed point. Luckily, mathematicians have provided conditions for the existence and uniqueness of a solution.
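Before turning to the theory, the guess-and-update procedure above is easy to put on a computer. Below is a minimal Python illustration (the codes in these notes are in Matlab) of iterating $V_{i+1} = TV_i$ on a grid, for a toy growth model with log utility and full depreciation; all names and parameter values are illustrative, not taken from the text.

```python
import numpy as np

# Toy growth model: u(c) = log(c), k' = k**alpha - c (full depreciation).
# All parameter values are illustrative.
alpha, beta = 0.3, 0.95
grid = np.linspace(0.05, 0.5, 200)      # grid for the state variable k
V = np.zeros(grid.size)                 # step 1: initial guess V_0 = 0

for it in range(1000):
    # step 2: for every (k, k') pair on the grid, the implied consumption...
    c = grid[:, None] ** alpha - grid[None, :]
    u = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -1e10)  # penalize infeasible pairs
    # ...and the updated value V_{i+1}(k) = max_{k'} u(k, k') + beta * V_i(k')
    TV = (u + beta * V[None, :]).max(axis=1)
    # step 3: stop when the fixed point is (numerically) reached
    if np.abs(TV - V).max() < 1e-8:
        break
    V = TV
```

By the contraction property established below, the sup-norm distance between successive iterates shrinks by a factor $\beta$ every pass, so the loop exits well before the iteration cap.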
### 7.1.2 Existence and uniqueness of a solution

**Definition 1** A metric space is a set $S$, together with a metric $\rho: S \times S \longrightarrow \mathbb{R}_+$, such that for all $x, y, z \in S$:

1. $\rho(x, y) \geqslant 0$, with $\rho(x, y) = 0$ if and only if $x = y$;
2. $\rho(x, y) = \rho(y, x)$;
3. $\rho(x, z) \leqslant \rho(x, y) + \rho(y, z)$.
**Definition 2** A sequence $\{x_n\}_{n=0}^{\infty}$ in $S$ converges to $x \in S$ if for each $\varepsilon > 0$ there exists an integer $N_\varepsilon$ such that

$$\rho(x_n, x) < \varepsilon \text{ for all } n \geqslant N_\varepsilon$$

**Definition 3** A sequence $\{x_n\}_{n=0}^{\infty}$ in $S$ is a Cauchy sequence if for each $\varepsilon > 0$ there exists an integer $N_\varepsilon$ such that

$$\rho(x_n, x_m) < \varepsilon \text{ for all } n, m \geqslant N_\varepsilon$$

**Definition 4** A metric space $(S, \rho)$ is complete if every Cauchy sequence in $S$ converges to a point in $S$.

**Definition 5** Let $(S, \rho)$ be a metric space and $T: S \longrightarrow S$ be a function mapping $S$ into itself. $T$ is a contraction mapping (with modulus $\beta$) if, for some $\beta \in (0, 1)$,

$$\rho(Tx, Ty) \leqslant \beta \rho(x, y) \text{ for all } x, y \in S.$$
We then have the following remarkable theorem that establishes the existence and uniqueness of the fixed point of a contraction mapping.

**Theorem 1 (Contraction Mapping Theorem)** If $(S, \rho)$ is a complete metric space and $T: S \longrightarrow S$ is a contraction mapping with modulus $\beta \in (0, 1)$, then

1. $T$ has exactly one fixed point $V \in S$ such that $V = TV$;
2. for any $V_0 \in S$, $\rho(T^n V_0, V) \leqslant \beta^n \rho(V_0, V)$, with $n = 0, 1, 2, \ldots$

Since we are endowed with all the tools we need to prove the theorem, we shall do it.
**Proof:** In order to prove 1., we shall first prove that if we select any sequence $\{V_n\}_{n=0}^{\infty}$, such that for each $n$, $V_n \in S$ and

$$V_{n+1} = T V_n$$

then this sequence converges, and it converges to some $V \in S$. In order to show convergence of $\{V_n\}_{n=0}^{\infty}$, we shall prove that $\{V_n\}_{n=0}^{\infty}$ is a Cauchy sequence. First of all, note that the contraction property of $T$ implies that

$$\rho(V_2, V_1) = \rho(TV_1, TV_0) \leqslant \beta \rho(V_1, V_0)$$

and therefore

$$\rho(V_{n+1}, V_n) = \rho(TV_n, TV_{n-1}) \leqslant \beta \rho(V_n, V_{n-1}) \leqslant \ldots \leqslant \beta^n \rho(V_1, V_0)$$
Now consider two terms of the sequence, $V_m$ and $V_n$, with $m > n$. The triangle inequality implies that

$$\rho(V_m, V_n) \leqslant \rho(V_m, V_{m-1}) + \rho(V_{m-1}, V_{m-2}) + \ldots + \rho(V_{n+1}, V_n)$$

therefore, making use of the previous result, we have

$$\rho(V_m, V_n) \leqslant \left( \beta^{m-1} + \beta^{m-2} + \ldots + \beta^n \right) \rho(V_1, V_0) \leqslant \frac{\beta^n}{1 - \beta}\, \rho(V_1, V_0)$$

Since $\beta \in (0, 1)$, $\beta^n \to 0$ as $n \to \infty$, so for each $\varepsilon > 0$ there exists $N_\varepsilon \in \mathbb{N}$ such that $\rho(V_m, V_n) < \varepsilon$ for all $m, n \geqslant N_\varepsilon$. Hence $\{V_n\}_{n=0}^{\infty}$ is a Cauchy sequence and it therefore converges. Further, since we have assumed that $S$ is complete, $V_n$ converges to some $V \in S$.
We now have to show that $V = TV$ in order to complete the proof of the first part. Note that, for each $\varepsilon > 0$, the triangle inequality implies

$$\rho(V, TV) \leqslant \rho(V, V_n) + \rho(V_n, TV)$$

But since $V_n \to V$ and $\rho(V_n, TV) = \rho(TV_{n-1}, TV) \leqslant \beta \rho(V_{n-1}, V)$, we have

$$\rho(V, TV) \leqslant \rho(V, V_n) + \rho(V_n, TV) \leqslant \frac{\varepsilon}{2} + \frac{\varepsilon}{2}$$

for large enough $n$; as $\varepsilon$ is arbitrary, $V = TV$.
Hence, we have proven that $T$ possesses a fixed point and have therefore established its existence. We now have to prove uniqueness. This can be obtained by contradiction. Suppose there exists another function, say $W \in S$, that satisfies $W = TW$. Then the definition of the fixed point implies

$$\rho(V, W) = \rho(TV, TW)$$

but the contraction property implies

$$\rho(V, W) = \rho(TV, TW) \leqslant \beta \rho(V, W)$$

which, as $\beta < 1$, implies $\rho(V, W) = 0$ and so $V = W$. The limit is then unique.

Proving 2. is straightforward, as

$$\rho(T^n V_0, V) = \rho(T^n V_0, TV) \leqslant \beta \rho(T^{n-1} V_0, V)$$

but we have $\rho(T^{n-1} V_0, V) = \rho(T^{n-1} V_0, TV)$, such that

$$\rho(T^n V_0, V) = \rho(T^n V_0, TV) \leqslant \beta \rho(T^{n-1} V_0, V) \leqslant \beta^2 \rho(T^{n-2} V_0, V) \leqslant \ldots \leqslant \beta^n \rho(V_0, V)$$

which completes the proof. □
This theorem is of great importance, as it establishes that any operator that possesses the contraction property exhibits a unique fixed point, which therefore provides some rationale for the algorithm we were designing in the previous section. It also ensures that, whatever the initial condition for this algorithm, if the Bellman operator satisfies a contraction property, simple iterations will deliver the solution. It therefore remains to provide conditions for the Bellman operator to be a contraction. These are provided by the following theorem.

**Theorem 2 (Blackwell's Sufficiency Conditions)** Let $X \subseteq \mathbb{R}^n$ and let $B(X)$ be the space of bounded functions $V: X \longrightarrow \mathbb{R}$ with the uniform metric. Let $T: B(X) \longrightarrow B(X)$ be an operator satisfying

1. (Monotonicity) Let $V, W \in B(X)$; if $V(x) \leqslant W(x)$ for all $x \in X$, then $TV(x) \leqslant TW(x)$.

2. (Discounting) There exists some constant $\beta \in (0, 1)$ such that for all $V \in B(X)$ and $a \geqslant 0$, we have

   $$T(V + a) \leqslant TV + \beta a$$

Then $T$ is a contraction with modulus $\beta$.
**Proof:** Let us consider two functions $V, W \in B(X)$, and note that, by definition of the uniform metric,

$$V \leqslant W + \rho(V, W)$$

Monotonicity first implies that

$$TV \leqslant T(W + \rho(V, W))$$

and discounting that

$$TV \leqslant TW + \beta \rho(V, W)$$

since $\rho(V, W) \geqslant 0$ plays the same role as $a$. We therefore get

$$TV - TW \leqslant \beta \rho(V, W)$$

Likewise, starting from $W \leqslant V + \rho(V, W)$, we end up with

$$TW - TV \leqslant \beta \rho(V, W)$$

Consequently, we have

$$|TV - TW| \leqslant \beta \rho(V, W)$$

so that

$$\rho(TV, TW) \leqslant \beta \rho(V, W)$$

which defines a contraction. This completes the proof. □
This theorem is extremely useful, as it gives us simple tools to check whether a problem is a contraction, and therefore permits us to check whether the simple algorithm we defined is appropriate for the problem at hand.

As an example, let us consider the optimal growth model, for which the Bellman equation writes

$$V(k_t) = \max_{c_t \in C} u(c_t) + \beta V(k_{t+1})$$

with $k_{t+1} = F(k_t) - c_t$. In order to save on notation, let us drop the time subscript and denote the next period capital stock by $k'$, such that, plugging the law of motion of capital into the utility function, the Bellman equation rewrites as

$$V(k) = \max_{k' \in K} u(F(k) - k') + \beta V(k')$$

Let us now define the operator $T$ as

$$(TV)(k) = \max_{k' \in K} u(F(k) - k') + \beta V(k')$$

We would like to know whether $T$ is a contraction, and therefore whether there exists a unique function $V$ such that

$$V(k) = (TV)(k)$$

In order to achieve this task, we just have to check whether $T$ is monotonic and satisfies the discounting property.
1. Monotonicity: Let us consider two candidate value functions, $V$ and $W$, such that $V(k) \leqslant W(k)$ for all $k \in K$. What we want to show is that $(TV)(k) \leqslant (TW)(k)$. In order to do that, let us denote by $\bar{k}'$ the optimal next period capital stock under $V$, that is

   $$(TV)(k) = u(F(k) - \bar{k}') + \beta V(\bar{k}')$$

   But now, since $V(k) \leqslant W(k)$ for all $k \in K$, we have $V(\bar{k}') \leqslant W(\bar{k}')$, such that it should be clear that

   $$(TV)(k) \leqslant u(F(k) - \bar{k}') + \beta W(\bar{k}') \leqslant \max_{k' \in K} u(F(k) - k') + \beta W(k') = (TW)(k)$$

   Hence we have shown that $V(k) \leqslant W(k)$ implies $(TV)(k) \leqslant (TW)(k)$, and have therefore established monotonicity.
2. Discounting: Let us consider a candidate value function, $V$, and a positive constant $a$:

   $$\begin{aligned}
   (T(V + a))(k) &= \max_{k' \in K} u(F(k) - k') + \beta (V(k') + a) \\
   &= \max_{k' \in K} u(F(k) - k') + \beta V(k') + \beta a \\
   &= (TV)(k) + \beta a
   \end{aligned}$$

   Therefore, the Bellman equation satisfies discounting in the case of the optimal growth model.

Hence, the optimal growth model satisfies Blackwell's sufficient conditions for a contraction mapping, and therefore the value function exists and is unique. We are now in a position to design a numerical algorithm to solve the Bellman equation.
## 7.2 Deterministic dynamic programming

### 7.2.1 Value function iteration

The contraction mapping theorem gives us a straightforward way to compute the solution to the Bellman equation: iterate on the operator $T$, such that $V_{i+1} = T V_i$, up to the point where the distance between two successive value functions is small enough. Basically, this amounts to applying the following algorithm:
1. Decide on a grid, $X$, of admissible values for the state variable $x$,

   $$X = \{x_1, \ldots, x_N\}$$

   formulate an initial guess for the value function, $V_0(x)$, and choose a stopping criterion $\varepsilon > 0$.

2. For each $x_\ell \in X$, $\ell = 1, \ldots, N$, compute

   $$V_{i+1}(x_\ell) = \max_{x' \in X} u(y(x_\ell, x'), x_\ell) + \beta V_i(x')$$

3. If $\|V_{i+1}(x) - V_i(x)\| < \varepsilon$, go to the next step; else go back to 2.

4. Compute the final solution as

   $$y^\star(x) = y(x, x')$$

   and

   $$V^\star(x) = \frac{u(y^\star(x), x)}{1 - \beta}$$
In order to better understand the algorithm, let us consider a simple example and go back to the optimal growth model, with

$$u(c) = \frac{c^{1-\sigma} - 1}{1 - \sigma}$$

and

$$k' = k^\alpha - c + (1 - \delta) k$$

Then the Bellman equation writes

$$V(k) = \max_{0 \leqslant c \leqslant k^\alpha + (1-\delta)k} \frac{c^{1-\sigma} - 1}{1 - \sigma} + \beta V(k')$$

From the law of motion of capital, we can determine consumption as

$$c = k^\alpha + (1 - \delta) k - k'$$

such that, plugging this result into the Bellman equation, we have

$$V(k) = \max_{0 \leqslant k' \leqslant k^\alpha + (1-\delta)k} \frac{(k^\alpha + (1-\delta)k - k')^{1-\sigma} - 1}{1 - \sigma} + \beta V(k')$$
Now, let us define a grid of $N$ feasible values for $k$, such that we have

$$K = \{k_1, \ldots, k_N\}$$

and an initial value function $V_0(k)$, that is, a vector of $N$ numbers that relates each $k_\ell$ to a value. Note that this may be anything we want, as we know, by the contraction mapping theorem, that the algorithm will converge. But if we want it to converge fast enough, it may be a good idea to impose a good initial guess. Finally, we need a stopping criterion.

Then, for each $\ell = 1, \ldots, N$, we compute the feasible values that can be taken by the quantity on the right hand side of the Bellman equation,

$$V_{\ell, h} \equiv \frac{(k_\ell^\alpha + (1-\delta)k_\ell - k_h)^{1-\sigma} - 1}{1 - \sigma} + \beta V(k_h) \quad \text{for } h \text{ feasible}$$
It is important to understand what "$h$ feasible" means. Indeed, we only compute consumption when it is positive and smaller than total resources, which restricts the number of possible values for $k'$. Namely, we want $k'$ to satisfy

$$0 \leqslant k' \leqslant k^\alpha + (1 - \delta) k$$

which puts a lower and an upper bound on the index $h$. When the grid of values is uniform, that is, when $k_h = \underline{k} + (h-1) d_k$, where $d_k$ is the increment in the grid, the upper bound can be computed as

$$\overline{h} = E\left[ \frac{k_\ell^\alpha + (1 - \delta) k_\ell - \underline{k}}{d_k} \right] + 1$$

where $E[\cdot]$ denotes the integer part.
Then we find

$$V_{\ell, h^\star} = \max_{h = 1, \ldots, \overline{h}} V_{\ell, h}$$

and set

$$V_{i+1}(k_\ell) = V_{\ell, h^\star}$$

and keep in memory the index $h^\star = \operatorname*{Argmax}_{h = 1, \ldots, \overline{h}} V_{\ell, h}$, such that we have

$$k'(k_\ell) = k_{h^\star}$$
In figures 7.1 and 7.2, we report the value function and the decision rules obtained from the deterministic optimal growth model with $\alpha = 0.3$, $\beta = 0.95$, $\delta = 0.1$ and $\sigma = 1.5$. The grid for the capital stock is composed of 1000 data points ranging from $(1 - \Delta_k) k^\star$ to $(1 + \Delta_k) k^\star$, where $k^\star$ denotes the steady state capital stock and $\Delta_k = 0.9$. The algorithm[^2] then converges in 211 iterations and 110.5 seconds on a 800MHz computer when the stopping criterion is $\varepsilon = 10^{-6}$.
Figure 7.1: Deterministic OGM (Value function, Value iteration). [Plot of the value function against the capital stock $k_t$.]
Matlab Code: Value Function Iteration

```matlab
sigma = 1.5;             % utility parameter
delta = 0.1;             % depreciation rate
beta  = 0.95;            % discount factor
alpha = 0.30;            % capital elasticity of output
nbk   = 1000;            % number of data points in the grid
crit  = 1;               % convergence criterion
epsi  = 1e-6;            % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1)); % steady state capital stock
dev   = 0.9;             % maximal deviation from steady state
kmin  = (1-dev)*ks;      % lower bound on the grid
kmax  = (1+dev)*ks;      % upper bound on the grid
dk    = (kmax-kmin)/(nbk-1);       % implied increment
kgrid = linspace(kmin,kmax,nbk)';  % builds the grid
v     = zeros(nbk,1);    % value function
tv    = zeros(nbk,1);    % updated value function
dr    = zeros(nbk,1);    % decision rule (will contain indices)
while crit>epsi;
   for i=1:nbk
      %
      % compute indexes for which consumption is positive
      %
      tmp  = (kgrid(i)^alpha+(1-delta)*kgrid(i)-kmin);
      imax = min(floor(tmp/dk)+1,nbk);
      %
      % consumption and utility
      %
      c    = kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid(1:imax);
      util = (c.^(1-sigma)-1)/(1-sigma);
      %
      % find value function
      %
      [tv(i),dr(i)] = max(util+beta*v(1:imax));
   end;
   crit = max(abs(tv-v)); % compute convergence criterion
   v    = tv;             % update the value function
end
%
% Final solution
%
kp   = kgrid(dr);
c    = kgrid.^alpha+(1-delta)*kgrid-kp;
util = (c.^(1-sigma)-1)/(1-sigma);
```

Figure 7.2: Deterministic OGM (Decision rules, Value iteration). [Plots of the next period capital stock and consumption against $k_t$.]

[^2]: This code and those that follow are not efficient from a computational point of view, as they are intended to have you understand the method without adding any coding complications. Much faster implementations can be found in the accompanying matlab codes.
### 7.2.2 Value iteration with interpolation

A possible improvement of the method is to use a much looser grid on the capital stock but a pretty fine grid on the control variable (consumption in the optimal growth model). Then the next period value of the state variable can be computed much more precisely. However, because of this precision and the fact that the state grid is rougher, it may be the case that the computed optimal value for the next period state variable does not lie on the grid, such that the value function is unknown at this particular value. Therefore, we use some interpolation scheme to get an approximation of the value function at this value. One advantage of this approach is that it involves fewer function evaluations and is usually less costly in terms of CPU time. The algorithm is then as follows:

1. Decide on a grid, $X$, of admissible values for the state variable $x$,

   $$X = \{x_1, \ldots, x_N\}$$

   Decide on a grid, $Y$, of admissible values for the control variable $y$,

   $$Y = \{y_1, \ldots, y_M\} \quad \text{with } M \gg N$$

   Formulate an initial guess for the value function, $V_0(x)$, and choose a stopping criterion $\varepsilon > 0$.
2. For each $x_\ell \in X$, $\ell = 1, \ldots, N$, compute

   $$x'_{\ell, j} = h(y_j, x_\ell) \quad \forall j = 1, \ldots, M$$

   Compute an interpolated value function at each $x'_{\ell, j} = h(y_j, x_\ell)$, denoted $\bar{V}_i(x'_{\ell, j})$, and then

   $$V_{i+1}(x_\ell) = \max_{y \in Y} u(y, x_\ell) + \beta \bar{V}_i(x'_{\ell, j})$$
3. If $\|V_{i+1}(x) - V_i(x)\| < \varepsilon$, go to the next step; else go back to 2.

4. Compute the final solution as

   $$V^\star(x) = \frac{u(y^\star, x)}{1 - \beta}$$

I now report the Matlab code for this approach, when we use cubic spline interpolation for the value function, 20 nodes for the capital stock and 1000 nodes for consumption. The algorithm converges in 182 iterations and 40.6 seconds, starting from the initial condition for the value function

$$V_0(k) = \frac{\left((c^\star / y^\star)\, k^\alpha\right)^{1-\sigma} - 1}{1 - \sigma}$$
Matlab Code: Value Function Iteration with Interpolation

```matlab
sigma = 1.5;             % utility parameter
delta = 0.1;             % depreciation rate
beta  = 0.95;            % discount factor
alpha = 0.30;            % capital elasticity of output
nbk   = 20;              % number of data points in the K grid
nbc   = 1000;            % number of data points in the C grid
crit  = 1;               % convergence criterion
epsi  = 1e-6;            % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev   = 0.9;             % maximal deviation from steady state
kmin  = (1-dev)*ks;      % lower bound on the K grid
kmax  = (1+dev)*ks;      % upper bound on the K grid
kgrid = linspace(kmin,kmax,nbk)';  % builds the K grid
cmin  = 0.01;            % lower bound on the C grid
cmax  = kmax^alpha;      % upper bound on the C grid
c     = linspace(cmin,cmax,nbc)';  % builds the C grid
v     = zeros(nbk,1);    % value function
Tv    = zeros(nbk,1);    % updated value function
dr    = zeros(nbk,1);    % decision rule (will contain indices)
util  = (c.^(1-sigma)-1)/(1-sigma);
while crit>epsi;
   for i=1:nbk;
      kp = kgrid(i)^alpha+(1-delta)*kgrid(i)-c;  % implied next period capital
      vi = interp1(kgrid,v,kp,'spline');         % interpolated value function
      [Tv(i),dr(i)] = max(util+beta*vi);
   end
   crit = max(abs(Tv-v)); % compute convergence criterion
   v    = Tv;             % update the value function
end
%
% Final solution
%
c    = c(dr);                           % optimal consumption
kp   = kgrid.^alpha+(1-delta)*kgrid-c;  % next period capital stock
util = (c.^(1-sigma)-1)/(1-sigma);
v    = util/(1-beta);
```
Figure 7.3: Deterministic OGM (Value function, Value iteration with interpolation). [Plot of the value function against $k_t$.]
Figure 7.4: Deterministic OGM (Decision rules, Value iteration with interpolation). [Plots of the next period capital stock and consumption against $k_t$.]

### 7.2.3 Policy iterations: Howard improvement

The simple value iteration algorithm has the attractive feature of being particularly simple to implement. However, it is a slow procedure, especially for infinite horizon problems, since it can be shown that it converges at the rate $\beta$, which is usually close to 1! Further, it computes unnecessary quantities during the algorithm, which slows down convergence. Often, computation speed is really important, for instance when one wants to perform a sensitivity analysis of the results obtained in a model using different parameterizations. Hence, we would like to be able to speed up convergence. This can be achieved by relying on Howard's improvement method. This method actually iterates on policy functions rather than on the value function. The algorithm may be described as follows:
1. Set an initial feasible decision rule for the control variable, $y = f_0(x)$, and compute the value associated with this guess, assuming that this rule is operative forever:

   $$V(x_t) = \sum_{s=0}^{\infty} \beta^s u(f_i(x_{t+s}), x_{t+s})$$

   taking care of the fact that $x_{t+1} = h(x_t, y_t) = h(x_t, f_i(x_t))$, with $i = 0$. Set a stopping criterion $\varepsilon > 0$.

2. Find a new policy rule, $y = f_{i+1}(x)$, such that

   $$f_{i+1}(x) \in \operatorname*{Argmax}_{y}\; u(y, x) + \beta V(x')$$

   with $x' = h(x, y)$.

3. Check whether $\|f_{i+1}(x) - f_i(x)\| < \varepsilon$; if yes then stop, else go back to 2.
Note that this method differs fundamentally from the value iteration algorithm in at least two dimensions:

i. one iterates on the policy function rather than on the value function;

ii. the decision rule is used forever, whereas it is assumed to be used for only two consecutive periods in the value iteration algorithm. It is precisely this last feature that accelerates convergence.
Note that, when computing the value function, we actually have to solve a linear system of the form

$$V_{i+1}(x_\ell) = u(f_{i+1}(x_\ell), x_\ell) + \beta V_{i+1}(h(x_\ell, f_{i+1}(x_\ell))) \quad \forall x_\ell \in X$$

for $V_{i+1}(x_\ell)$, which may be rewritten as

$$V_{i+1}(x_\ell) = u(f_{i+1}(x_\ell), x_\ell) + \beta Q V_{i+1}(x_\ell) \quad \forall x_\ell \in X$$

where $Q$ is an $(N \times N)$ matrix with

$$Q_{\ell j} = \begin{cases} 1 & \text{if } x_j = h(x_\ell, f_{i+1}(x_\ell)) \\ 0 & \text{otherwise} \end{cases}$$

Note that, although it is a big matrix, $Q$ is sparse, which can be exploited in solving the system to get

$$V_{i+1}(x) = (I - \beta Q)^{-1} u(f_{i+1}(x), x)$$
We apply this algorithm to the same optimal growth model as in the previous section and report the value function and the decision rules at convergence in figures 7.5 and 7.6. The algorithm converges in only 18 iterations and 9.8 seconds, starting from the same initial guess and using the same parameterization!
Matlab Code: Policy Iteration

```matlab
sigma = 1.50;            % utility parameter
delta = 0.10;            % depreciation rate
beta  = 0.95;            % discount factor
alpha = 0.30;            % capital elasticity of output
nbk   = 1000;            % number of data points in the grid
crit  = 1;               % convergence criterion
epsi  = 1e-6;            % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev   = 0.9;             % maximal deviation from steady state
kmin  = (1-dev)*ks;      % lower bound on the grid
kmax  = (1+dev)*ks;      % upper bound on the grid
dk    = (kmax-kmin)/(nbk-1);       % implied increment
kgrid = linspace(kmin,kmax,nbk)';  % builds the grid
v     = zeros(nbk,1);    % value function
kp0   = kgrid;           % initial guess on k(t+1)
dr    = zeros(nbk,1);    % decision rule (will contain indices)
%
% Main loop
%
while crit>epsi;
   for i=1:nbk
      %
      % compute indexes for which consumption is positive
      %
      imax = min(floor((kgrid(i)^alpha+(1-delta)*kgrid(i)-kmin)/dk)+1,nbk);
      %
      % consumption and utility
      %
      c    = kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid(1:imax);
      util = (c.^(1-sigma)-1)/(1-sigma);
      %
      % find new policy rule
      %
      [v1,dr(i)] = max(util+beta*v(1:imax));
   end;
   %
   % decision rules
   %
   kp = kgrid(dr);
   c  = kgrid.^alpha+(1-delta)*kgrid-kp;
   %
   % update the value
   %
   util = (c.^(1-sigma)-1)/(1-sigma);
   Q    = sparse(nbk,nbk);
   for i=1:nbk;
      Q(i,dr(i)) = 1;
   end
   Tv   = (speye(nbk)-beta*Q)\util;
   crit = max(abs(kp-kp0));
   v    = Tv;
   kp0  = kp;
end
```
Figure 7.5: Deterministic OGM (Value function, Policy iteration). [Plot of the value function against $k_t$.]
As illustrated in the particular example of the optimal growth model, the policy iteration algorithm only requires a few iterations. Unfortunately, we have to solve the linear system

$$(I - \beta Q) V_{i+1} = u(y, x)$$

which may be particularly costly when the number of grid points is large. Therefore, a number of researchers have proposed to replace the matrix inversion by an additional iteration step, leading to the so-called modified policy iteration with $k$ steps, which replaces the linear problem by the following iteration scheme:

1. Set $J_0 = V_i$.
Figure 7.6: Deterministic OGM (Decision rules, Policy iteration). [Plots of the next period capital stock and consumption against $k_t$.]
2. Iterate $k$ times on

   $$J_{i+1} = u(y, x) + \beta Q J_i, \quad i = 0, \ldots, k-1$$

3. Set $V_{i+1} = J_k$.

When $k \longrightarrow \infty$, $J_k$ tends toward the solution of the linear system.
### 7.2.4 Parametric dynamic programming

This last technique I will describe borrows from approximation theory, using either orthogonal polynomials or spline functions. The idea is actually to make a guess for the functional form of the value function and iterate on the parameters of this functional form. The algorithm then works as follows:

1. Choose a functional form for the value function, $\bar{V}(x; \Theta)$, a grid of interpolating nodes $X = \{x_1, \ldots, x_N\}$, a stopping criterion $\varepsilon > 0$, and an initial vector of parameters $\Theta_0$.
2. Using the conjecture for the value function, perform the maximization step in the Bellman equation, that is, compute $w_\ell = T(\bar{V}(x_\ell, \Theta_i))$:

   $$w_\ell = T(\bar{V}(x_\ell, \Theta_i)) = \max_{y} u(y, x_\ell) + \beta \bar{V}(x', \Theta_i)$$

   $$\text{s.t. } x' = h(y, x_\ell), \quad \text{for } \ell = 1, \ldots, N$$
3. Using the approximation method you have chosen, compute a new vector of parameters, $\Theta_{i+1}$, such that $\bar{V}(x, \Theta_{i+1})$ approximates the data $(x_\ell, w_\ell)$.

4. If $\|\bar{V}(x, \Theta_{i+1}) - \bar{V}(x, \Theta_i)\| < \varepsilon$ then stop, else go back to 2.
First note that, for this method to be implementable, we need the payoff function and the value function to be continuous. The approximation function may be a combination of polynomials, neural networks, or splines. Note that during the optimization step we may have to rely on a numerical maximization algorithm, and the approximation method may involve numerical minimization in order to solve a non-linear least-squares problem of the form

$$\Theta_{i+1} \in \operatorname*{Argmin}_{\Theta} \sum_{\ell=1}^{N} \left( w_\ell - \bar{V}(x_\ell; \Theta) \right)^2$$

This algorithm is usually much faster than value iteration, as it does not require iterating on a large grid. As an example, I will once again focus on the optimal growth problem we have been dealing with so far, and I will approximate the value function by

$$\bar{V}(k; \Theta) = \sum_{i=0}^{p} \theta_i\, \varphi_i\!\left( 2\, \frac{k - \underline{k}}{\overline{k} - \underline{k}} - 1 \right)$$

where $\{\varphi_i(\cdot)\}_{i=0}^{p}$ is a set of Chebychev polynomials. In the example, I set $p = 10$ and used 20 nodes. Figures 7.7 and 7.8 report the decision rule and the value function in this case, and table 7.1 reports the parameters of the approximation function. The algorithm converged in 242 iterations, but took much less time than value iteration.
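The fitting step (step 3 of the algorithm) reduces to an ordinary least-squares problem whenever the approximation is linear in $\Theta$, as it is here. Below is a minimal Python sketch (the codes in these notes are in Matlab); the target function standing in for the data $w_\ell$ and all parameter values are illustrative.

```python
import numpy as np

# Fit coefficients theta of a Chebychev expansion to data (x_l, w_l),
# as in step 3 of the algorithm. Nodes and target are illustrative.
p, nbk = 10, 20
kmin, kmax = 0.5, 5.0
rk = -np.cos((2 * np.arange(1, nbk + 1) - 1) * np.pi / (2 * nbk))  # nodes in [-1, 1]
k = kmin + (rk + 1) * (kmax - kmin) / 2                            # mapped to [kmin, kmax]
w = np.log(k)                                 # smooth stand-in for the w_l data

# Design matrix of Chebychev polynomials T_0 ... T_p via the recurrence
# T_i(x) = 2 x T_{i-1}(x) - T_{i-2}(x).
X = np.ones((nbk, p + 1))
X[:, 1] = rk
for i in range(2, p + 1):
    X[:, i] = 2 * rk * X[:, i - 1] - X[:, i - 2]

theta, *_ = np.linalg.lstsq(X, w, rcond=None)  # least-squares step Theta_{i+1}
fit = X @ theta                                # fitted values at the nodes
```

Because the nodes are Chebychev nodes, the least-squares fit of a smooth function converges very quickly in the polynomial order.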
Figure 7.7: Deterministic OGM (Value function, Parametric DP). [Plot of the value function against $k_t$.]

Figure 7.8: Deterministic OGM (Decision rules, Parametric DP). [Plots of the next period capital stock and consumption against $k_t$.]
Table 7.1: Value function approximation

| Coefficient | Value |
|---|---|
| $\theta_0$ | 0.82367 |
| $\theta_1$ | 2.78042 |
| $\theta_2$ | -0.66012 |
| $\theta_3$ | 0.23704 |
| $\theta_4$ | -0.10281 |
| $\theta_5$ | 0.05148 |
| $\theta_6$ | -0.02601 |
| $\theta_7$ | 0.01126 |
| $\theta_8$ | -0.00617 |
| $\theta_9$ | 0.00501 |
| $\theta_{10}$ | -0.00281 |
Matlab Code: Parametric Dynamic Programming

```matlab
sigma = 1.50;            % utility parameter
delta = 0.10;            % depreciation rate
beta  = 0.95;            % discount factor
alpha = 0.30;            % capital elasticity of output
nbk   = 20;              % # of data points in the grid
p     = 10;              % order of polynomials
crit  = 1;               % convergence criterion
iter  = 1;               % iteration
epsi  = 1e-6;            % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev   = 0.9;             % maximal dev. from steady state
kmin  = (1-dev)*ks;      % lower bound on the grid
kmax  = (1+dev)*ks;      % upper bound on the grid
rk    = -cos((2*[1:nbk]'-1)*pi/(2*nbk)); % interpolating nodes
kgrid = kmin+(rk+1)*(kmax-kmin)/2;       % mapping to [kmin,kmax]
%
% Initial guess for the approximation
%
v     = (((kgrid.^alpha).^(1-sigma)-1)/((1-sigma)*(1-beta)));
X     = chebychev(rk,p);
th0   = X\v;
Tv    = zeros(nbk,1);
kp    = zeros(nbk,1);
%
% Main loop
%
options     = foptions;
options(14) = 1e9;
while crit>epsi;
   k0 = kgrid(1);
   for i=1:nbk
      param = [alpha beta delta sigma kmin kmax p kgrid(i)];
      kp(i) = fminu('tv',k0,options,[],param,th0);
      k0    = kp(i);
      Tv(i) = -tv(kp(i),param,th0);
   end;
   theta = X\Tv;
   crit  = max(abs(Tv-v));
   v     = Tv;
   th0   = theta;
   iter  = iter+1;
end
```
Matlab Code: Extra Functions

```matlab
function res=tv(kp,param,theta);
alpha = param(1);
beta  = param(2);
delta = param(3);
sigma = param(4);
kmin  = param(5);
kmax  = param(6);
n     = param(7);
k     = param(8);
kp    = sqrt(kp.^2);                   % insures positivity of k'
v     = value(kp,[kmin kmax n],theta); % computes the value function
c     = k.^alpha+(1-delta)*k-kp;       % computes consumption
d     = find(c<=0);                    % find negative consumption
c(d)  = NaN;                           % get rid of negative c
util  = (c.^(1-sigma)-1)/(1-sigma);    % computes utility
util(d) = -1e12;                       % utility = low number for c<0
res   = -(util+beta*v);                % compute -TV (we are minimizing)

function v = value(k,param,theta);
kmin = param(1);
kmax = param(2);
n    = param(3);
k    = 2*(k-kmin)/(kmax-kmin)-1;       % map k into [-1,1]
v    = chebychev(k,n)*theta;

function Tx=chebychev(x,n);
X  = x(:);
lx = size(X,1);
if n<0; error('n should be a positive integer'); end
switch n;
case 0;
   Tx=[ones(lx,1)];
case 1;
   Tx=[ones(lx,1) X];
otherwise
   Tx=[ones(lx,1) X];
   for i=3:n+1;
      Tx=[Tx 2*X.*Tx(:,i-1)-Tx(:,i-2)];
   end
end
```
A potential problem with the use of Chebychev polynomials is that they do not impose any assumption on the shape of the value function, which we know to be concave and strictly increasing in this case. This is why Judd [1998] recommends using shape-preserving methods, such as Schumaker approximation, in this case. Judd and Solnick [1994] have successfully applied this latter technique to the optimal growth model and found that the approximation was very good and dominated a lot of other methods (they actually get the same precision with 12 nodes as that achieved with a 1200 data point grid using a value iteration technique).
## 7.3 Stochastic dynamic programming

In a large number of problems, we have to deal with stochastic shocks (just think of a standard RBC model), and dynamic programming techniques can obviously be extended to deal with such problems. I will first show how we can obtain the Bellman equation, before addressing some important issues concerning the discretization of shocks. Then I will describe the implementation of the value iteration and policy iteration techniques for the stochastic case.
### 7.3.1 The Bellman equation

The stochastic problem differs from the deterministic problem in that we now have a . . . stochastic shock! The problem then defines a value function which has as arguments the state variable $x_t$ but also the stationary shock $s_t$, whose sequence $\{s_t\}_{t=0}^{+\infty}$ satisfies

$$s_{t+1} = \Phi(s_t, \varepsilon_{t+1}) \tag{7.6}$$

where $\varepsilon$ is a white noise process. The value function is therefore given by

$$V(x_t, s_t) = \max_{\{y_{t+\tau} \in D(x_{t+\tau}, s_{t+\tau})\}_{\tau=0}^{\infty}} E_t \sum_{\tau=0}^{\infty} \beta^\tau u(y_{t+\tau}, x_{t+\tau}, s_{t+\tau}) \tag{7.7}$$

subject to (7.6) and

$$x_{t+1} = h(x_t, y_t, s_t) \tag{7.8}$$
Since $y_t$, $x_t$ and $s_t$ are all either perfectly observed or decided in period $t$, they are known in $t$, such that we may rewrite the value function as

$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t),\, \{y_{t+\tau} \in D(x_{t+\tau}, s_{t+\tau})\}_{\tau=1}^{\infty}} u(y_t, x_t, s_t) + E_t \sum_{\tau=1}^{\infty} \beta^\tau u(y_{t+\tau}, x_{t+\tau}, s_{t+\tau})$$
or

$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \max_{\{y_{t+\tau} \in D(x_{t+\tau}, s_{t+\tau})\}_{\tau=1}^{\infty}} E_t \sum_{\tau=1}^{\infty} \beta^\tau u(y_{t+\tau}, x_{t+\tau}, s_{t+\tau})$$

Using the change of variable $k = \tau - 1$, this rewrites as

$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \max_{\{y_{t+1+k} \in D(x_{t+1+k}, s_{t+1+k})\}_{k=0}^{\infty}} \beta E_t \sum_{k=0}^{\infty} \beta^k u(y_{t+1+k}, x_{t+1+k}, s_{t+1+k})$$
It is important at this point to recall that

$$\begin{aligned}
E_t(X(\omega_{t+\tau})) &= \int X(\omega_{t+\tau}) f(\omega_{t+\tau} | \omega_t)\, d\omega_{t+\tau} \\
&= \iint X(\omega_{t+\tau}) f(\omega_{t+\tau} | \omega_{t+\tau-1}) f(\omega_{t+\tau-1} | \omega_t)\, d\omega_{t+\tau}\, d\omega_{t+\tau-1} \\
&= \int \cdots \int X(\omega_{t+\tau}) f(\omega_{t+\tau} | \omega_{t+\tau-1}) \cdots f(\omega_{t+1} | \omega_t)\, d\omega_{t+\tau} \cdots d\omega_{t+1}
\end{aligned}$$
which has as a corollary the law of iterated projections, such that the value function rewrites as

$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \max_{\{y_{t+1+k} \in D(x_{t+1+k}, s_{t+1+k})\}_{k=0}^{\infty}} \beta E_t E_{t+1} \sum_{k=0}^{\infty} \beta^k u(y_{t+1+k}, x_{t+1+k}, s_{t+1+k})$$
or

$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \max_{\{y_{t+1+k} \in D(x_{t+1+k}, s_{t+1+k})\}_{k=0}^{\infty}} \beta \int \left[ E_{t+1} \sum_{k=0}^{\infty} \beta^k u(y_{t+1+k}, x_{t+1+k}, s_{t+1+k}) \right] f(s_{t+1} | s_t)\, ds_{t+1}$$
Note that, because each value of the shock defines a particular mathematical object, the maximization of the integral corresponds to the integral of the maximizations; the max operator and the integral are therefore interchangeable, so that we get

$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \beta \int \left[ \max_{\{y_{t+1+k}\}_{k=0}^{\infty}} E_{t+1} \sum_{k=0}^{\infty} \beta^k u(y_{t+1+k}, x_{t+1+k}, s_{t+1+k}) \right] f(s_{t+1} | s_t)\, ds_{t+1}$$
By definition, the term under the integral corresponds to V(x_{t+1}, s_{t+1}), such that the value function rewrites

V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \beta \int V(x_{t+1}, s_{t+1}) f(s_{t+1}|s_t) \, ds_{t+1}
or equivalently

V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \beta E_t V(x_{t+1}, s_{t+1})

which is precisely the Bellman equation for the stochastic dynamic programming problem.
7.3.2 Discretization of the shocks
A very important problem that arises whenever we deal with value iteration or policy iteration in a stochastic environment is the discretization of the space spanned by the shocks. Indeed, a continuous support for the stochastic shocks is infeasible on a computer, which can only deal with discrete supports. We therefore have to transform the continuous problem into a discrete one, with the constraint that the asymptotic properties of the continuous and the discrete processes should be the same. The question we therefore face is: does there exist a discrete representation for s which is equivalent to its continuous original representation? The answer to this question is yes. In particular, as soon as we deal with (V)AR processes, we can use a very powerful tool: Markov chains.
Markov Chains: A Markov chain is a sequence of random values whose probability distribution at a given date depends only upon the value attained at the previous date. We will restrict ourselves to discrete-time Markov chains, in which the state changes at discrete time instants, indexed by an integer variable t. At each time step t, the Markov chain is in a state, denoted by s ∈ S ≡ {s_1, . . . , s_M}. S is called the state space.
The Markov chain is described in terms of transition probabilities π_ij. A transition probability should read as follows: if the state happens to be s_i in period t, the probability that the next state is s_j is π_ij.
We therefore get the following definition

Definition 6 A Markov chain is a stochastic process with a discrete state space S, such that the conditional distribution of s_{t+1} is independent of all previously attained states given s_t:

\pi_{ij} = \text{Prob}(s_{t+1} = s_j \,|\, s_t = s_i), \quad s_i, s_j \in S.
The important assumption we shall make concerning Markov processes is that the transition probabilities, π_ij, apply as soon as state s_i is reached, no matter the history of the shocks nor how the system got to state s_i. In other words there is no hysteresis. From a mathematical point of view, this corresponds to the so–called Markov property

P(s_{t+1} = s_j \,|\, s_t = s_i, s_{t-1} = s_{i_{t-1}}, \ldots, s_0 = s_{i_0}) = P(s_{t+1} = s_j \,|\, s_t = s_i) = \pi_{ij}

for all periods t, all states s_i, s_j ∈ S, and all sequences {s_{i_n}}_{n=0}^{t-1} of earlier states. Thus, the probability distribution of the next state s_{t+1} depends only on the current realization of s.
The transition probabilities π_ij must of course satisfy

1. \pi_{ij} \geq 0 for all i, j = 1, . . . , M

2. \sum_{j=1}^{M} \pi_{ij} = 1 for all i = 1, . . . , M
All of the elements of a Markov chain model can then be encoded in a transition probability matrix

\Pi = \begin{pmatrix} \pi_{11} & \cdots & \pi_{1M} \\ \vdots & \ddots & \vdots \\ \pi_{M1} & \cdots & \pi_{MM} \end{pmatrix}
Note that [\Pi^k]_{ij} then gives the probability that s_{t+k} = s_j given that s_t = s_i. In the long run, we obtain the steady state equivalent for a Markov chain: the invariant distribution or stationary distribution.
Definition 7 A stationary distribution for a Markov chain is a distribution π such that

1. \pi_j \geq 0 for all j = 1, . . . , M

2. \sum_{j=1}^{M} \pi_j = 1

3. \pi' = \pi' \Pi, i.e. \pi_j = \sum_{i=1}^{M} \pi_i \pi_{ij}
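The stationary distribution can be computed numerically by iterating π′ ← π′Π from any initial distribution; a short Python sketch with a hypothetical two-state chain:

```python
import numpy as np

# Hypothetical 2-state transition matrix (each row sums to one)
PI = np.array([[0.9, 0.1],
               [0.2, 0.8]])

# Iterate pi' <- pi' PI until it no longer changes (power iteration)
pi = np.array([0.5, 0.5])
for _ in range(10000):
    pi_new = pi @ PI
    if np.abs(pi_new - pi).max() < 1e-12:
        pi = pi_new
        break
    pi = pi_new
print(pi)   # invariant distribution
```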
Gaussian–quadrature approach to discretization Tauchen and Hussey [1991] provide a simple way to discretize VAR processes relying on gaussian quadrature.^3 I will only present the case of an AR(1) process of the form

s_{t+1} = \rho s_t + (1-\rho)\bar{s} + \varepsilon_{t+1}

where \varepsilon_{t+1} \sim N(0, \sigma^2). This implies that

\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left(-\frac{1}{2}\left(\frac{s_{t+1} - \rho s_t - (1-\rho)\bar{s}}{\sigma}\right)^2\right) ds_{t+1} = \int f(s_{t+1}|s_t) \, ds_{t+1} = 1
which illustrates the fact that s is a continuous random variable. Tauchen and Hussey [1991] propose to replace the integral by

\int \Phi(s_{t+1}; s_t, \bar{s}) f(s_{t+1}|\bar{s}) \, ds_{t+1} \equiv \int \frac{f(s_{t+1}|s_t)}{f(s_{t+1}|\bar{s})} f(s_{t+1}|\bar{s}) \, ds_{t+1} = 1
where f(s_{t+1}|\bar{s}) denotes the density of s_{t+1} conditional on the fact that s_t = \bar{s} (which amounts to the unconditional density), which in our case implies that

\Phi(s_{t+1}; s_t, \bar{s}) \equiv \frac{f(s_{t+1}|s_t)}{f(s_{t+1}|\bar{s})} = \exp\left(-\frac{1}{2}\left[\left(\frac{s_{t+1} - \rho s_t - (1-\rho)\bar{s}}{\sigma}\right)^2 - \left(\frac{s_{t+1} - \bar{s}}{\sigma}\right)^2\right]\right)

^3 Note that we already discussed this issue in lecture notes #4.
then we can use the standard linear transformation and impose z_t = (s_t - \bar{s})/(\sigma\sqrt{2}) to get

\frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} \exp\left(-\left[(z_{t+1} - \rho z_t)^2 - z_{t+1}^2\right]\right) \exp\left(-z_{t+1}^2\right) dz_{t+1}
for which we can use a Gauss–Hermite quadrature. Assume then that we have the nodes z_i and weights ω_i, i = 1, . . . , n; the quadrature leads to the formula

\frac{1}{\sqrt{\pi}} \sum_{j=1}^{n} \omega_j \Phi(z_j; z_i; \bar{s}) \simeq 1
in other words we might interpret the quantity \omega_j \Phi(z_j; z_i; \bar{s})/\sqrt{\pi} as an "estimate" \hat{\pi}_{ij} \equiv \text{Prob}(s_{t+1} = s_j | s_t = s_i) of the transition probability from state i to state j. But remember that the quadrature is just an approximation, such that \sum_{j=1}^{n} \hat{\pi}_{ij} = 1 will generally not hold exactly. Tauchen and Hussey therefore propose the following modification:
\hat{\pi}_{ij} = \frac{\omega_j \Phi(z_j; z_i; \bar{s})}{\sqrt{\pi}\, \varsigma_i} \quad \text{where} \quad \varsigma_i = \frac{1}{\sqrt{\pi}} \sum_{j=1}^{n} \omega_j \Phi(z_j; z_i; \bar{s}).
Matlab Code: Discretization of AR(1)
function [s,p]=tauch_huss(xbar,rho,sigma,n)
% xbar : mean of the s process
% rho : persistence parameter
% sigma : volatility
% n : number of nodes
%
% returns the states s and the transition probabilities p
%
[xx,wx] = gauss_herm(n); % nodes and weights for s
s = sqrt(2)*sigma*xx+xbar; % discrete states
x = xx(:,ones(n,1));
z = x';
w = wx(:,ones(n,1))';
%
% computation
%
p = (exp(z.*z-(z-rho*x).*(z-rho*x)).*w)./sqrt(pi);
sx = sum(p')';
p = p./sx(:,ones(n,1));
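For readers outside Matlab, the same routine can be sketched in Python; this is a translation of the code above, with gauss_herm replaced by NumPy's Gauss–Hermite rule:

```python
import numpy as np

def tauch_huss(xbar, rho, sigma, n):
    """Tauchen-Hussey discretization of s' = rho*s + (1-rho)*xbar + eps."""
    # Gauss-Hermite nodes and weights for the weight function exp(-x^2)
    xx, wx = np.polynomial.hermite.hermgauss(n)
    s = np.sqrt(2.0) * sigma * xx + xbar        # discrete states
    z = xx[None, :]                             # next-period nodes z_j (columns)
    x = xx[:, None]                             # current nodes z_i (rows)
    w = wx[None, :]
    # p_ij = omega_j * Phi(z_j; z_i) / sqrt(pi), then renormalize each row
    p = np.exp(z * z - (z - rho * x) ** 2) * w / np.sqrt(np.pi)
    p = p / p.sum(axis=1, keepdims=True)
    return s, p

s, p = tauch_huss(0.0, 0.8, 0.12, 5)
```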
7.3.3 Value iteration
As in the deterministic case, the convergence of the simple value function iteration procedure is ensured by the contraction mapping theorem. This is however a bit more subtle than in the deterministic case, as we now have to deal with the convergence of a probability measure, which goes far beyond this introduction to dynamic programming.^4 The algorithm is basically the same as in the deterministic case, up to the delicate point that we have to deal with the expectation. The algorithm then writes as follows
1. Decide on a grid, X, of admissible values for the state variable x,

X = \{x_1, \ldots, x_N\}

and for the shocks, s,

S = \{s_1, \ldots, s_M\} \text{ together with the transition matrix } \Pi = (\pi_{ij})

formulate an initial guess for the value function V_0(x) and choose a stopping criterion ε > 0.

2. For each x_\ell \in X, \ell = 1, \ldots, N, and s_k \in S, k = 1, \ldots, M, compute

V_{i+1}(x_\ell, s_k) = \max_{x' \in X} \; u(y(x_\ell, s_k, x'), x_\ell, s_k) + \beta \sum_{j=1}^{M} \pi_{kj} V_i(x', s_j)

3. If \|V_{i+1}(x, s) - V_i(x, s)\| < \varepsilon go to the next step, else go back to 2.

4. Compute the final solution as

y^\star(x, s) = y(x, x'(x, s), s)
As in the deterministic case, I will illustrate the method relying on the optimal growth model. In order to better understand the algorithm, let us consider a simple example and go back to the optimal growth model, with

u(c) = \frac{c^{1-\sigma} - 1}{1 - \sigma}

^4 The interested reader should then refer to Lucas et al. [1989] chapter 9.

and

k' = \exp(a) k^{\alpha} - c + (1-\delta) k

where a' = \rho a + \varepsilon'. Then the Bellman equation writes

V(k, a) = \max_{c} \frac{c^{1-\sigma} - 1}{1 - \sigma} + \beta \int V(k', a') \, d\mu(a'|a)
From the law of motion of capital we can determine consumption as

c = \exp(a) k^{\alpha} + (1-\delta) k - k'

such that plugging this result in the Bellman equation, we have

V(k, a) = \max_{k'} \frac{(\exp(a) k^{\alpha} + (1-\delta) k - k')^{1-\sigma} - 1}{1 - \sigma} + \beta \int V(k', a') \, d\mu(a'|a)
A first problem that we encounter is that we would like to be able to evaluate the integral involved by the rational expectation. We therefore have to discretize the shock. Here, I will consider that the technology shock can be accurately approximated by a 2 state Markov chain, such that a can take on 2 values a_1 and a_2 (a_1 < a_2). I will also assume that the transition matrix is symmetric, such that

\Pi = \begin{pmatrix} \pi & 1-\pi \\ 1-\pi & \pi \end{pmatrix}
a_1, a_2 and π are selected such that the process reproduces the conditional first and second order moments of the AR(1) process:

First order moments
\pi a_1 + (1-\pi) a_2 = \rho a_1
(1-\pi) a_1 + \pi a_2 = \rho a_2

Second order moments
\pi a_1^2 + (1-\pi) a_2^2 - (\rho a_1)^2 = \sigma_\varepsilon^2
(1-\pi) a_1^2 + \pi a_2^2 - (\rho a_2)^2 = \sigma_\varepsilon^2

From the first two equations we get a_1 = -a_2 and \pi = (1+\rho)/2. Plugging these two results in the last two equations, we get a_1 = -\sqrt{\sigma_\varepsilon^2/(1-\rho^2)}.
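These moment conditions are easy to verify numerically; a small Python check with the parameter values used below (ρ = 0.8, σ_ε = 0.12):

```python
import numpy as np

rho, se = 0.8, 0.12                       # persistence and innovation volatility
ppi = (1 + rho) / 2                       # symmetric transition probability
a1 = -np.sqrt(se**2 / (1 - rho**2))       # low state
a2 = -a1                                  # high state
PI = np.array([[ppi, 1 - ppi],
               [1 - ppi, ppi]])
a = np.array([a1, a2])

# Conditional means reproduce the AR(1): E[a'|a_i] = rho * a_i
print(PI @ a)
# Conditional variances reproduce sigma_eps^2
print(PI @ a**2 - (rho * a)**2)
```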
Hence, we will actually work with a value function of the form

V(k, a_k) = \max_{c} \frac{c^{1-\sigma} - 1}{1 - \sigma} + \beta \sum_{j=1}^{2} \pi_{kj} V(k', a_j)
Now, let us define a grid of N feasible values for k such that we have

K = \{k_1, \ldots, k_N\}

and an initial value function V_0(k) — that is, a vector of N numbers that relates each k_ℓ to a value. Note that this may be anything we want, as we know — by the contraction mapping theorem — that the algorithm will converge. But if we want it to converge fast enough, it may be a good idea to impose a good initial guess. Finally, we need a stopping criterion.

In figures 7.9 and 7.10, we report the value function and the decision rules obtained for the stochastic optimal growth model with α = 0.3, β = 0.95, δ = 0.1, σ = 1.5, ρ = 0.8 and σ_ε = 0.12. The grid for the capital stock is composed of 1000 data points ranging from 0.2 to 6. The algorithm then converges in 192 iterations when the stopping criterion is ε = 1e−6.
Figure 7.9: Stochastic OGM (Value function, Value iteration) — value function plotted against k_t.

Figure 7.10: Stochastic OGM (Decision rules, Value iteration) — next period capital stock and consumption plotted against k_t.
Matlab Code: Value Function Iteration
sigma = 1.50; % utility parameter
delta = 0.10; % depreciation rate
beta = 0.95; % discount factor
alpha = 0.30; % capital elasticity of output
rho = 0.80; % persistence of the shock
se = 0.12; % volatility of the shock
nbk = 1000; % number of data points in the grid
nba = 2; % number of values for the shock
crit = 1; % convergence criterion
epsi = 1e-6; % convergence parameter
iter = 0; % iteration counter
%
% Discretization of the shock
%
p = (1+rho)/2;
PI = [p 1-p;1-p p];
ab = 0;
a1 = exp(-se*se/(1-rho*rho));
a2 = exp(se*se/(1-rho*rho));
A = [a1 a2];
%
% Discretization of the state space
%
kmin = 0.2;
kmax = 6;
kgrid = linspace(kmin,kmax,nbk)';
c = zeros(nbk,nba);
util = zeros(nbk,nba);
v = zeros(nbk,nba);
Tv = zeros(nbk,nba);
dr = zeros(nbk,nba);
while crit>epsi;
for i=1:nbk
for j=1:nba;
c = A(j)*kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid;
neg = find(c<=0);
c(neg) = NaN;
util(:,j) = (c.^(1-sigma)-1)/(1-sigma);
util(neg,j) = -1e12;
end
[Tv(i,:),dr(i,:)] = max(util+beta*(v*PI')); % column j of v*PI' is sum_j' PI(j,j')v(:,j')
end;
crit = max(max(abs(Tv-v)));
v = Tv;
iter = iter+1;
end
kp = kgrid(dr);
for j=1:nba;
c(:,j) = A(j)*kgrid.^alpha+(1-delta)*kgrid-kp(:,j);
end
7.3.4 Policy iterations
As in the deterministic case, we may want to accelerate the simple value iteration using Howard improvement. The stochastic case does not differ that much from the deterministic case, apart from the fact that we now have to deal with different decision rules, which implies the computation of different Q matrices. The algorithm may be described as follows

1. Set an initial feasible set of decision rules for the control variable, y = f_0(x, s_k), k = 1, \ldots, M, and compute the value associated with this guess, assuming that this rule is operative forever, taking care of the fact that x_{t+1} = h(x_t, y_t, s_t) = h(x_t, f_i(x_t, s_t), s_t) with i = 0. Set a stopping criterion ε > 0.

2. Find a new policy rule, y = f_{i+1}(x, s_k), k = 1, \ldots, M, such that

f_{i+1}(x, s_k) \in \text{Argmax}_{y} \; u(y, x, s_k) + \beta \sum_{j=1}^{M} \pi_{kj} V(x', s_j)

with x' = h(x, f_i(x, s_k), s_k).

3. Check if \|f_{i+1}(x, s) - f_i(x, s)\| < \varepsilon; if yes then stop, else go back to 2.
When computing the value function we actually have to solve a linear system of the form

V_{i+1}(x_\ell, s_k) = u(f_{i+1}(x_\ell, s_k), x_\ell, s_k) + \beta \sum_{j=1}^{M} \pi_{kj} V_{i+1}(h(x_\ell, f_{i+1}(x_\ell, s_k), s_k), s_j)

for V_{i+1}(x_\ell, s_k) (for all x_\ell \in X and s_k \in S), which may be rewritten as

V_{i+1}(x_\ell, s_k) = u(f_{i+1}(x_\ell, s_k), x_\ell, s_k) + \beta \Pi_{k\cdot} Q V_{i+1}(x_\ell, \cdot) \quad \forall x_\ell \in X
where Q is an (N × N) matrix

Q_{\ell j} = \begin{cases} 1 & \text{if } x_j = h(x_\ell, f_{i+1}(x_\ell, s_k), s_k) \\ 0 & \text{otherwise} \end{cases}

Note that although it is a big matrix, Q is sparse, which can be exploited in solving the system, to get

V_{i+1}(x, s) = (I - \beta \Pi \otimes Q)^{-1} u(f_{i+1}(x, s), x, s)
We apply this algorithm to the same optimal growth model as in the previous
section and report the value function and the decision rules at convergence in
ﬁgures 7.11 and 7.12. The algorithm converges in only 17 iterations, starting
from the same initial guess and using the same parameterization!
Figure 7.11: Stochastic OGM (Value function, Policy iteration) — value function plotted against k_t.

Figure 7.12: Stochastic OGM (Decision rules, Policy iteration) — next period capital stock and consumption plotted against k_t.
Matlab Code: Policy Iteration
sigma = 1.50; % utility parameter
delta = 0.10; % depreciation rate
beta = 0.95; % discount factor
alpha = 0.30; % capital elasticity of output
rho = 0.80; % persistence of the shock
se = 0.12; % volatility of the shock
nbk = 1000; % number of data points in the grid
nba = 2; % number of values for the shock
crit = 1; % convergence criterion
epsi = 1e-6; % convergence parameter
iter = 0; % iteration counter
%
% Discretization of the shock
%
p = (1+rho)/2;
PI = [p 1-p;1-p p];
ab = 0;
a1 = exp(-se*se/(1-rho*rho));
a2 = exp(se*se/(1-rho*rho));
A = [a1 a2];
%
% Discretization of the state space
%
kmin = 0.2;
kmax = 6;
kgrid = linspace(kmin,kmax,nbk)';
c = zeros(nbk,nba);
util = zeros(nbk,nba);
v = zeros(nbk,nba);
Ev = zeros(nbk,nba); % expected value function
dr0 = repmat([1:nbk]',1,nba); % initial guess
kp0 = kgrid(dr0); % implied initial decision rule
dr = zeros(nbk,nba); % decision rule (will contain indices)
%
% Main loop
%
while crit>epsi;
for i=1:nbk
for j=1:nba;
c = A(j)*kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid;
neg = find(c<=0);
c(neg) = NaN;
util(:,j) = (c.^(1-sigma)-1)/(1-sigma);
util(neg,j) = -inf;
Ev(:,j) = v*PI(j,:)';
end
[v1,dr(i,:)] = max(util+beta*Ev);
end;
%
% decision rules
%
kp = kgrid(dr);
Q = sparse(nbk*nba,nbk*nba);
for j=1:nba;
c = A(j)*kgrid.^alpha+(1-delta)*kgrid-kp(:,j);
%
% update the value
%
util(:,j) = (c.^(1-sigma)-1)/(1-sigma);
Q0 = sparse(nbk,nbk);
for i=1:nbk;
Q0(i,dr(i,j)) = 1;
end
Q((j-1)*nbk+1:j*nbk,:) = kron(PI(j,:),Q0);
end
Tv = (speye(nbk*nba)-beta*Q)\util(:);
crit = max(max(abs(kp-kp0)));
v = reshape(Tv,nbk,nba);
kp0 = kp;
iter = iter+1;
end
c = zeros(nbk,nba);
for j=1:nba;
c(:,j) = A(j)*kgrid.^alpha+(1-delta)*kgrid-kp(:,j);
end
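The core of the Howard step — solving V = u + βQV exactly for a fixed policy — can be illustrated in Python on a tiny hypothetical example (the utility values and policy indices below are made up):

```python
import numpy as np

nbk, nba, beta = 5, 2, 0.95
PI = np.array([[0.9, 0.1], [0.1, 0.9]])
# Hypothetical stacked utility u(f(x,s), x, s), one column per shock state
u = np.arange(10, dtype=float).reshape(nbk, nba)
# Hypothetical policy: dr[i, j] = grid index of x' chosen at (x_i, s_j)
dr = np.array([[0, 1], [1, 2], [2, 3], [3, 3], [4, 4]])

# Build Q: row (j*nbk + i) puts probability PI[j, j'] on column (j'*nbk + dr[i, j])
n = nbk * nba
Q = np.zeros((n, n))
for j in range(nba):
    for i in range(nbk):
        for jp in range(nba):
            Q[j * nbk + i, jp * nbk + dr[i, j]] = PI[j, jp]

# Value of following this policy forever: V = (I - beta*Q)^{-1} u
V = np.linalg.solve(np.eye(n) - beta * Q, u.T.ravel())
```

In practice Q is sparse, exactly as the Matlab code exploits with `sparse`/`speye`; a dense solve is used here only to keep the sketch short.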
7.4 How to simulate the model?
At this point, no matter whether the model is deterministic or stochastic, we
have a solution which is actually a collection of data points. There are several ways to simulate the model:
• one may simply use the pointwise decision rule and iterate on it;
• one may use an interpolation scheme (least squares methods, spline interpolations, . . . ) in order to obtain a continuous decision rule.
Here I will pursue the second route.
7.4.1 The deterministic case
After having solved the model by either value iteration or policy iteration, we
have in hand a collection of points, X, for the state variable and a decision
rule that relates the control variable to this collection of points. In other words
we have Lagrange data which can be used to get a continuous decision rule.
This decision rule may then be used to simulate the model.
In order to illustrate this, let's focus once again on the deterministic optimal growth model and let us assume that we have solved the model by either value iteration or policy iteration. We therefore have a collection of data points for the capital stock, K = {k_1, . . . , k_N}, which forms our grid, and the associated decision rule for the next period capital stock, k'_i = h(k_i), i = 1, . . . , N; the consumption level can be deduced from the next period capital stock using the law of motion of capital together with the resource constraint. Also note that we have the upper and lower bounds of the grid, k_1 = \underline{k} and k_N = \overline{k}. We will use Chebychev polynomials to approximate the rule, in order to avoid multicollinearity problems in the least squares algorithm. In other words we will get a decision rule of the form

k' = \sum_{\ell=0}^{n} \theta_\ell T_\ell(\varphi(k))
where \varphi(k) is a linear transformation that maps [\underline{k}, \overline{k}] into [−1, 1]. The algorithm works as follows

1. Solve the model by value iteration or policy iteration, such that we have a collection of data points for the capital stock, K = {k_1, . . . , k_N}, and the associated decision rule for the next period capital stock, k'_i = h(k_i), i = 1, . . . , N. Choose a maximal order, n, for the Chebychev polynomial approximation.

2. Apply the transformation

x_i = 2 \frac{k_i - \underline{k}}{\overline{k} - \underline{k}} - 1

to the data and compute T_\ell(x_i), \ell = 1, \ldots, n, and i = 1, \ldots, N.

3. Perform the least squares approximation such that

\Theta = (T(x)' T(x))^{-1} T(x)' h(k)
For the parameterization we have been using so far, setting

\varphi(k) = 2 \frac{k - \underline{k}}{\overline{k} - \underline{k}} - 1

and for an approximation of order 7, we obtain

\theta = \{2.6008, 2.0814, -0.0295, 0.0125, -0.0055, 0.0027, -0.0012, 0.0007\}

such that the decision rule for the capital stock can be approximated as

k_{t+1} = 2.6008 + 2.0814 T_1(\varphi(k_t)) - 0.0295 T_2(\varphi(k_t)) + 0.0125 T_3(\varphi(k_t)) - 0.0055 T_4(\varphi(k_t)) + 0.0027 T_5(\varphi(k_t)) - 0.0012 T_6(\varphi(k_t)) + 0.0007 T_7(\varphi(k_t))
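The same least-squares fit can be done in Python with NumPy's Chebyshev utilities; a sketch on stand-in data (the quadratic rule below is made up, so the degree-7 fit recovers it exactly):

```python
import numpy as np
from numpy.polynomial import chebyshev

kmin, kmax, N, n = 0.2, 6.0, 50, 7
k = np.linspace(kmin, kmax, N)
kp = 0.2 + 0.95 * k - 0.03 * k**2          # stand-in for the computed rule h(k)

x = 2 * (k - kmin) / (kmax - kmin) - 1     # map [kmin, kmax] into [-1, 1]
theta = chebyshev.chebfit(x, kp, n)        # least-squares Chebyshev coefficients

# Evaluate the continuous rule at an off-grid point
k0 = 1.234
kp0 = chebyshev.chebval(2 * (k0 - kmin) / (kmax - kmin) - 1, theta)
```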
Then the model can be used in the usual way, such that in order to get the transitional dynamics of the model we just have to iterate on this backward looking finite–difference equation. Consumption, output and investment can be computed using their definitions:

i_t = k_{t+1} - (1-\delta) k_t
y_t = k_t^{\alpha}
c_t = y_t - i_t

For instance, figure 7.13 reports the transitional dynamics of the model when the economy is started 50% below its steady state. In the figure, the plain line represents the dynamics of each variable, and the dashed line its steady state level.
Figure 7.13: Transitional dynamics (OGM) — capital stock, investment, output and consumption plotted against time (50 periods).
Matlab Code: Transitional Dynamics
n = 8; % Order of approximation+1
transk = 2*(kgrid-kmin)/(kmax-kmin)-1; % transform the data
%
% Computes the matrix of Chebychev polynomials
%
Tk = [ones(nbk,1) transk];
for i=3:n;
Tk = [Tk 2*transk.*Tk(:,i-1)-Tk(:,i-2)];
end
b = Tk\kp; % Performs OLS
k0 = 0.5*ks; % initial capital stock (ks = steady state)
nrep = 50; % number of periods
k = zeros(nrep+1,1); % initialize dynamics
%
% iteration loop
%
k(1) = k0;
for t=1:nrep;
trkt = 2*(k(t)-kmin)/(kmax-kmin)-1;
Tk = [1 trkt];
for i=3:n;
Tk = [Tk 2*trkt.*Tk(:,i-1)-Tk(:,i-2)];
end
k(t+1) = Tk*b;
end
y = k(1:nrep).^alpha;
i = k(2:nrep+1)-(1-delta)*k(1:nrep);
c = y-i;
Note: At this point, I assume that the model has already been solved and that the capital grid has been stored in kgrid and the decision rule for the next period capital stock is stored in kp.
Another way to proceed would have been to use spline approximation, in which case the matlab code would have been (for the simulation part)
Matlab Code: OGM simulation (cubic spline interpolation)
k(1) = k0;
for t=1:nrep;
k(t+1) = interp1(kgrid,kp,k(t),'cubic');
end
y = k(1:nrep).^alpha;
i = k(2:nrep+1)-(1-delta)*k(1:nrep);
c = y-i;
7.4.2 The stochastic case
As in the deterministic case, it will often be a good idea to represent the solution as a continuous decision rule, such that we should apply the same methodology as before. The only departure from the previous experiment is that, since we have approximated the stochastic shock by a Markov chain, we have as many decision rules as there are states in the Markov chain. For instance, in the optimal growth model we solved in section 7.3, we have 2 decision rules which are given by^5
k_{1,t+1} = 2.8630 + 2.4761 T_1(\varphi(k_t)) - 0.0211 T_2(\varphi(k_t)) + 0.0114 T_3(\varphi(k_t)) - 0.0057 T_4(\varphi(k_t)) + 0.0031 T_5(\varphi(k_t)) - 0.0014 T_6(\varphi(k_t)) + 0.0009 T_7(\varphi(k_t))

k_{2,t+1} = 3.2002 + 2.6302 T_1(\varphi(k_t)) - 0.0543 T_2(\varphi(k_t)) + 0.0235 T_3(\varphi(k_t)) - 0.0110 T_4(\varphi(k_t)) + 0.0057 T_5(\varphi(k_t)) - 0.0027 T_6(\varphi(k_t)) + 0.0014 T_7(\varphi(k_t))
The only difficulty we have to deal with in the stochastic case is to simulate the Markov chain. Simulating a series of length T for the Markov chain can be achieved easily:

1. Compute the cumulative distribution of the Markov chain (Π^c) — i.e. the cumulative sum of each row of the transition matrix. Hence, Π^c_{ij} corresponds to the probability that the economy is in a state lower or equal to j given that it was in state i in period t − 1.

2. Set the initial state and simulate T random numbers from a uniform distribution: {p_t}_{t=1}^{T}. Note that these numbers, drawn from a U_{[0;1]}, can be interpreted as probabilities.

3. Assume the economy was in state i in t − 1; find the index j such that

\Pi^c_{i(j-1)} < p_t \leq \Pi^c_{ij}

then s_j is the state of the economy in period t.
^5 The code to generate these two rules is exactly the same as for the deterministic version of the model since, when computing b=Tk\kp, matlab will take care of the fact that kp is a (N × 2) matrix, such that b will be a ((n + 1) × 2) matrix, where n is the approximation order.
Matlab Code: Simulation of Markov Chain
s = [-1;1]; % 2 possible states
PI = [0.9 0.1;
0.1 0.9]; % Transition matrix
s0 = 1; % Initial state
T = 1000; % Length of the simulation
p = rand(T,1); % Simulated uniform draws
j = zeros(T,1);
j(1) = s0; % Initial state
cum_PI = [zeros(2,1) cumsum(PI,2)]; % Cumulative distribution of each row
%
for k=2:T;
j(k) = find((p(k)<=cum_PI(j(k-1),2:end)) & (p(k)>cum_PI(j(k-1),1:end-1)));
end;
chain = s(j); % simulated values
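The three steps above can also be sketched in Python, using searchsorted on each row's cumulative distribution:

```python
import numpy as np

def simulate_chain(PI, s, i0, T, seed=0):
    """Simulate a Markov chain over states s, starting from state index i0."""
    rng = np.random.default_rng(seed)
    cPI = np.cumsum(PI, axis=1)            # cumulative distribution of each row
    idx = np.empty(T, dtype=int)
    idx[0] = i0
    u = rng.random(T)                      # uniform draws interpreted as probabilities
    for t in range(1, T):
        # first j with u_t <= cPI[previous state, j]
        idx[t] = np.searchsorted(cPI[idx[t - 1]], u[t])
    return s[idx], idx

s = np.array([-1.0, 1.0])
PI = np.array([[0.9, 0.1], [0.1, 0.9]])
chain, idx = simulate_chain(PI, s, 0, 10000)
```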
We are then in position to simulate our optimal growth model. Figure 7.14
reports a simulation of the optimal growth model we solved before.
Figure 7.14: Simulated OGM — capital stock, investment, output and consumption plotted against time (200 periods).
Matlab Code: Simulating the OGM
n = 8; % Order of approximation+1
transk = 2*(kgrid-kmin)/(kmax-kmin)-1; % transform the data
%
% Computes approximation
%
Tk = [ones(nbk,1) transk];
for i=3:n;
Tk=[Tk 2*transk.*Tk(:,i-1)-Tk(:,i-2)];
end
b=Tk\kp;
%
% Simulate the markov chain
%
long = 200; % length of simulation
[a,id] = markov(PI,A,long,1); % markov is a function:
% a contains the values for the shock
% id the state index
k0=2.6; % initial value for the capital stock
k=zeros(long+1,1); % initialize
y=zeros(long,1); % some matrices
k(1)=k0; % initial capital stock
for t=1:long;
% Prepare the approximation for the capital stock
trkt = 2*(k(t)-kmin)/(kmax-kmin)-1;
Tk = [1 trkt];
for i = 3:n;
Tk = [Tk 2*trkt.*Tk(:,i-1)-Tk(:,i-2)];
end
k(t+1) = Tk*b(:,id(t)); % use the appropriate decision rule
y(t) = A(id(t))*k(t).^alpha; % computes output
end
i=k(2:long+1)-(1-delta)*k(1:long); % computes investment
c=y-i; % computes consumption
Note: At this point, I assume that the model has already been solved and that the capital
grid has been stored in kgrid and the decision rule for the next period capital stock is stored
in kp.
7.5 Handling constraints
In this example I will deal with the problem of a household that is constrained in her borrowing and who faces uncertainty on the labor market. This problem has been extensively studied by Aiyagari [1994].

We consider the case of a household who determines her consumption/saving plans in order to maximize her lifetime expected utility, which is characterized by the function:^6

E_t \sum_{\tau=0}^{\infty} \beta^\tau \left[ \frac{c_{t+\tau}^{1-\sigma} - 1}{1-\sigma} \right] \quad \text{with } \sigma \in (0,1) \cup (1, \infty) \qquad (7.9)
where 0 < β < 1 is a constant discount factor and c denotes the consumption bundle. In each and every period, the household enters the period with a level of assets a_t carried over from the previous period, for which she receives a constant real interest rate r. She may be employed, in which case she receives the aggregate real wage w, or unemployed, in which case she just receives unemployment benefits, b, which are assumed to be a fraction of the real wage: b = γw, where γ is the replacement ratio. These revenues are then used to consume and purchase assets on the financial market. Therefore, the budget constraint in t is given by

a_{t+1} = (1+r) a_t + \omega_t - c_t \qquad (7.10)

where

\omega_t = \begin{cases} w & \text{if the household is employed} \\ \gamma w & \text{otherwise} \end{cases} \qquad (7.11)
We assume that the household's state on the labor market is a random variable. Further, the probability of being (un)employed in period t is greater if the household was also (un)employed in period t − 1. In other words, the state on the labor market, and therefore ω_t, can be modeled as a 2 state Markov chain with transition matrix Π. In addition, the household is subject to a borrowing constraint that states that she cannot borrow more than a threshold level \underline{a}:

a_{t+1} \geq \underline{a} \qquad (7.12)

^6 E_t(·) denotes the mathematical conditional expectation. Expectations are conditional on information available at the beginning of period t.
In this case, the Bellman equation writes

V(a_t, \omega_t) = \max_{c_t \geq 0} u(c_t) + \beta E_t V(a_{t+1}, \omega_{t+1})

s.t. (7.10)–(7.12), which may be solved by one of the methods we have seen before. Anyway, implementing any of these methods while taking the constraint into account is straightforward since, plugging the budget constraint into the utility function, the Bellman equation can be rewritten

V(a_t, \omega_t) = \max_{a_{t+1}} u((1+r) a_t + \omega_t - a_{t+1}) + \beta E_t V(a_{t+1}, \omega_{t+1})

s.t. (7.11)–(7.12).
But, since the borrowing constraint should be satisfied in each and every period, we have

V(a_t, \omega_t) = \max_{a_{t+1} \geq \underline{a}} u((1+r) a_t + \omega_t - a_{t+1}) + \beta E_t V(a_{t+1}, \omega_{t+1})
In other words, when defining the grid for a, one just has to take care of the fact that the minimum admissible value for a is \underline{a}. Positivity of consumption — which should be taken into account when searching for the optimal value of a_{t+1} — finally imposes a_{t+1} < (1+r) a_t + \omega_t, such that the Bellman equation writes

V(a_t, \omega_t) = \max_{\underline{a} \leq a_{t+1} < (1+r) a_t + \omega_t} u((1+r) a_t + \omega_t - a_{t+1}) + \beta E_t V(a_{t+1}, \omega_{t+1})

The matlab code borrow.m solves and simulates this problem.
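Since borrow.m is not reproduced here, a minimal Python value-iteration sketch of this constrained problem may help (all parameter values below are hypothetical); the constraint only shows up through the grid starting at \underline{a} and the feasibility mask on consumption:

```python
import numpy as np

beta, sigma, r = 0.95, 1.5, 0.02
w, gamma, abar = 1.0, 0.5, -2.0            # wage, replacement ratio, borrowing limit
omega = np.array([w, gamma * w])           # income when employed / unemployed
PI = np.array([[0.9, 0.1], [0.4, 0.6]])    # hypothetical employment transition matrix

na = 200
agrid = np.linspace(abar, 10.0, na)        # the grid starts at the borrowing limit
v = np.zeros((na, 2))
for _ in range(5000):
    EV = v @ PI.T
    Tv = np.empty_like(v)
    for j in range(2):
        # c(a_i, a') = (1+r) a_i + omega_j - a'; a' >= abar holds by construction
        c = (1 + r) * agrid[:, None] + omega[j] - agrid[None, :]
        util = np.where(c > 0,
                        (np.maximum(c, 1e-12)**(1 - sigma) - 1) / (1 - sigma),
                        -1e12)             # rule out negative consumption
        Tv[:, j] = (util + beta * EV[:, j][None, :]).max(axis=1)
    if np.abs(Tv - v).max() < 1e-6:
        v = Tv
        break
    v = Tv
```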
Bibliography

Aiyagari, S.R., Uninsured Idiosyncratic Risk and Aggregate Saving, Quarterly Journal of Economics, 1994, 109 (3), 659–684.

Bertsekas, D., Dynamic Programming and Stochastic Control, New York: Academic Press, 1976.

Judd, K. and A. Solnick, Numerical Dynamic Programming with Shape–Preserving Splines, Manuscript, Hoover Institution, 1994.

Judd, K.L., Numerical Methods in Economics, Cambridge, Massachusetts: MIT Press, 1998.

Lucas, R., N. Stokey, and E. Prescott, Recursive Methods in Economic Dynamics, Cambridge (MA): Harvard University Press, 1989.

Tauchen, G. and R. Hussey, Quadrature Based Methods for Obtaining Approximate Solutions to Nonlinear Asset Pricing Models, Econometrica, 1991, 59 (2), 371–396.
Index
Bellman equation, 1, 2
Blackwell’s suﬃciency conditions, 6
Cauchy sequence, 4
Contraction mapping, 4
Contraction mapping theorem, 4
Discounting, 6
Howard Improvement, 15, 37
Invariant distribution, 30
Lagrange data, 41
Markov chain, 28
Metric space, 3
Monotonicity, 6
Policy functions, 2
Policy iterations, 15, 37
Stationary distribution, 30
Transition probabilities, 29
Transition probability matrix, 30
Value function, 2
Value iteration, 32
Contents
7 Dynamic Programming 1
7.1 The Bellman equation and associated theorems . . . . . . . . . 1
7.1.1 A heuristic derivation of the Bellman equation . . . . . 1
7.1.2 Existence and uniqueness of a solution . . . . . . . . . . 3
7.2 Deterministic dynamic programming . . . . . . . . . . . . . . . 8
7.2.1 Value function iteration . . . . . . . . . . . . . . . . . . 8
7.2.2 Taking advantage of interpolation . . . . . . . . . . . . 13
7.2.3 Policy iterations: Howard Improvement . . . . . . . . . 15
7.2.4 Parametric dynamic programming . . . . . . . . . . . . 20
7.3 Stochastic dynamic programming . . . . . . . . . . . . . . . . . 26
7.3.1 The Bellman equation . . . . . . . . . . . . . . . . . . . 26
7.3.2 Discretization of the shocks . . . . . . . . . . . . . . . . 28
7.3.3 Value iteration . . . . . . . . . . . . . . . . . . . . . . . 32
7.3.4 Policy iterations . . . . . . . . . . . . . . . . . . . . . . 37
7.4 How to simulate the model? . . . . . . . . . . . . . . . . . . . . 41
7.4.1 The deterministic case . . . . . . . . . . . . . . . . . . . 41
7.4.2 The stochastic case . . . . . . . . . . . . . . . . . . . . . 44
7.5 Handling constraints . . . . . . . . . . . . . . . . . . . . . . . . 48
List of Figures
7.1 Deterministic OGM (Value function, Value iteration) . . . . . . 11
7.2 Deterministic OGM (Decision rules, Value iteration) . . . . . . 12
7.3 Deterministic OGM (Value function, Value iteration with inter-
polation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
7.4 Deterministic OGM (Decision rules, Value iteration with inter-
polation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
7.5 Deterministic OGM (Value function, Policy iteration) . . . . . 19
7.6 Deterministic OGM (Decision rules, Policy iteration) . . . . . . 20
7.7 Deterministic OGM (Value function, Parametric DP) . . . . . . 22
7.8 Deterministic OGM (Decision rules, Parametric DP) . . . . . . 22
7.9 Stochastic OGM (Value function, Value iteration) . . . . . . . . 35
7.10 Stochastic OGM (Decision rules, Value iteration) . . . . . . . . 35
7.11 Stochastic OGM (Value function, Policy iteration) . . . . . . . 38
7.12 Stochastic OGM (Decision rules, Policy iteration) . . . . . . . . 39
7.13 Transitional dynamics (OGM) . . . . . . . . . . . . . . . . . . . 43
7.14 Simulated OGM . . . . . . . . . . . . . . . . . . . . . . . . . . 46
List of Tables
7.1 Value function approximation . . . . . . . . . . . . . . . . . . . 23

We ﬁnally make the assumption that the model is markovian. The optimal value our agent can derive from this maximization process is given by the value function V (xt ) =
{yt+s ∈D (xt+s )}∞ s=0 ∞ s=0

max

β s u(yt+s , xt+s )

(7.1)

where D is the set of all feasible decisions for the variables of choice. Note that the value function is a function of the state variable only, as since the model is markovian, only the past is necessary to take decisions, such that all the path can be predicted once the state variable is observed. Therefore, the value in t is only a function of xt . (7.1) may now be rewritten as V (xt ) =
{yt ∈D (xt ),{yt+s ∈D (xt+s )}∞ s=1 ∞ s=1

max

u(yt , xt ) +

β s u(yt+s , xt+s )

(7.2)

making the change of variable k = s − 1, (7.2) rewrites V (xt ) = or V (xt ) =
{yt ∈D (xt ) {yt ∈D (xt ),{yt+1+k ∈D (xt+1+k )}∞ k=0

max

u(yt , xt ) +

β k+1 u(yt+1+k , xt+1+k )

k=0 ∞ k=0

max

u(yt , xt ) + β

{yt+1+k ∈D (xt+1+k )}∞ k=0

max

β k u(yt+1+k , xt+1+k ) (7.3)

Note that, by deﬁnition, we have V (xt+1 ) =
∞ {yt+1+k ∈D (xt+1+k )}k=0

max

∞ k=0

β k u(yt+1+k , xt+1+k )

such that (7.3) rewrites as V (xt ) = max u(yt , xt ) + βV (xt+1 )
yt ∈D (xt )

(7.4)

This is the so–called Bellman equation that lies at the core of the dynamic programming theory. With this equation are associated, in each and every period t, a set of optimal policy functions for y and x, which are deﬁned by {yt , xt+1 } ∈ Argmax u(y, x) + βV (xt+1 )
y∈D (x)

(7.5)

2

Our problem is now to solve (7.4) for the function V (xt ). This problem is particularly complicated as we are not solving for just a point that would satisfy the equation, but we are interested in ﬁnding a function that satisﬁes the equation. A simple procedure to ﬁnd a solution would be the following 1. Make an initial guess on the form of the value function V0 (xt ) 2. Update the guess using the Bellman equation such that Vi+1 (xt ) = max u(yt , xt ) + βVi (h(yt , xt ))
yt ∈D (xt )

3. If Vi+1 (xt ) = Vi (xt ), then a ﬁxed point has been found and the problem is solved, if not we go back to 2, and iterate on the process until convergence. In other words, solving the Bellman equation just amounts to ﬁnd the ﬁxed point of the bellman equation, or introduction an operator notation, ﬁnding the ﬁxed point of the operator T , such that Vi+1 = T Vi where T stands for the list of operations involved in the computation of the Bellman equation. The problem is then that of the existence and the uniqueness of this ﬁxed–point. Luckily, mathematicians have provided conditions for the existence and uniqueness of a solution.

7.1.2 Existence and uniqueness of a solution

Definition 1 A metric space is a set S, together with a metric ρ : S × S → R₊, such that for all x, y, z ∈ S:

1. ρ(x, y) ≥ 0, with ρ(x, y) = 0 if and only if x = y,

2. ρ(x, y) = ρ(y, x),

3. ρ(x, z) ≤ ρ(x, y) + ρ(y, z).

Definition 2 A sequence {x_n}_{n=0}^∞ in S converges to x ∈ S if for each ε > 0 there exists an integer N_ε such that ρ(x_n, x) < ε for all n ≥ N_ε.

Definition 3 A sequence {x_n}_{n=0}^∞ in S is a Cauchy sequence if for each ε > 0 there exists an integer N_ε such that ρ(x_n, x_m) < ε for all n, m ≥ N_ε.

Definition 4 A metric space (S, ρ) is complete if every Cauchy sequence in S converges to a point in S.

Definition 5 Let (S, ρ) be a metric space and T : S → S be a function mapping S into itself. T is a contraction mapping (with modulus β) if for some β ∈ (0, 1), ρ(Tx, Ty) ≤ βρ(x, y) for all x, y ∈ S.

We then have the following remarkable theorem that establishes the existence and uniqueness of the fixed point of a contraction mapping.

Theorem 1 (Contraction Mapping Theorem) If (S, ρ) is a complete metric space and T : S → S is a contraction mapping with modulus β ∈ (0, 1), then

1. T has exactly one fixed point V ∈ S such that V = TV,

2. for any V₀ ∈ S, ρ(Tⁿ V₀, V) ≤ βⁿ ρ(V₀, V), with n = 0, 1, 2, ...

Since we are endowed with all the tools we need to prove the theorem, we shall do it.
Proof: In order to prove 1., we shall first prove that if we select any sequence {V_n}_{n=0}^∞ such that, for each n, V_n ∈ S and V_{n+1} = TV_n, this sequence converges, and that it converges to V ∈ S. In order to show convergence of {V_n}_{n=0}^∞, we shall prove that {V_n}_{n=0}^∞ is a Cauchy sequence. First of all, note that the contraction property of T implies that

   ρ(V₂, V₁) = ρ(TV₁, TV₀) ≤ βρ(V₁, V₀)

and therefore

   ρ(V_{n+1}, V_n) = ρ(TV_n, TV_{n−1}) ≤ βρ(V_n, V_{n−1}) ≤ ... ≤ βⁿ ρ(V₁, V₀)

Now consider two terms of the sequence, V_m and V_n, m > n. The triangle inequality implies

   ρ(V_m, V_n) ≤ ρ(V_m, V_{m−1}) + ρ(V_{m−1}, V_{m−2}) + ... + ρ(V_{n+1}, V_n)

therefore, making use of the previous result, we have

   ρ(V_m, V_n) ≤ (β^{m−1} + β^{m−2} + ... + βⁿ) ρ(V₁, V₀) ≤ (βⁿ/(1 − β)) ρ(V₁, V₀)

Since β ∈ (0, 1), βⁿ → 0 as n → ∞, so that for each ε > 0 there exists N_ε ∈ N such that ρ(V_m, V_n) < ε. Hence {V_n}_{n=0}^∞ is a Cauchy sequence and it therefore converges. Further, since we have assumed that S is complete, V_n converges to V ∈ S.

We now have to show that V = TV in order to complete the proof of the first part. Note that, for each ε > 0, and for V₀ ∈ S, the triangle inequality implies

   ρ(V, TV) ≤ ρ(V, V_n) + ρ(V_n, TV)

But since {V_n}_{n=0}^∞ is a Cauchy sequence, we have

   ρ(V, TV) ≤ ρ(V, V_n) + ρ(V_n, TV) ≤ ε/2 + ε/2

for large enough n, therefore V = TV. Hence, we have proven that T possesses a fixed point and therefore have established its existence.

We now have to prove uniqueness. This can be obtained by contradiction. Suppose there exists another function, say W ∈ S, that satisfies W = TW. Then the definition of the fixed point implies

   ρ(V, W) = ρ(TV, TW)

but the contraction property implies

   ρ(V, W) = ρ(TV, TW) ≤ βρ(V, W)

which, as β < 1, implies ρ(V, W) = 0 and so V = W. The limit is then unique.

Proving 2. is straightforward, as

   ρ(Tⁿ V₀, V) = ρ(Tⁿ V₀, TV) ≤ βρ(T^{n−1} V₀, V)

but we have ρ(T^{n−1} V₀, V) = ρ(T^{n−1} V₀, TV), such that

   ρ(Tⁿ V₀, V) ≤ βρ(T^{n−1} V₀, V) ≤ β² ρ(T^{n−2} V₀, V) ≤ ... ≤ βⁿ ρ(V₀, V)

which completes the proof. □
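The geometric bound in part 2. is easy to check numerically. A small sketch of my own (not from the notes), using the scalar contraction Tv = c + βv, whose unique fixed point is c/(1 − β):

```python
import numpy as np

beta, c = 0.9, 1.0
T = lambda v: c + beta*v        # contraction with modulus beta
v_star = c/(1.0 - beta)         # its unique fixed point

v0, v = 0.0, 0.0
errors, bounds = [], []
for n in range(1, 31):
    v = T(v)
    errors.append(abs(v - v_star))             # rho(T^n v0, v_star)
    bounds.append(beta**n * abs(v0 - v_star))  # beta^n * rho(v0, v_star)

# the bound of part 2. holds at every step (up to floating-point noise)
ok = all(e <= b + 1e-12 for e, b in zip(errors, bounds))
```

For this affine map the bound actually holds with equality, which illustrates why convergence at rate β can be painfully slow when β is close to one.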

This theorem is of great importance, as it establishes that any operator that possesses the contraction property will exhibit a unique fixed point, which therefore provides some rationale for the algorithm we were designing in the previous section. It also insures that, whatever the initial condition for this algorithm, if the value function satisfies a contraction property, simple iterations will deliver the solution. It therefore remains to provide conditions for the value function to be a contraction. These are provided by the following theorem.

Theorem 2 (Blackwell's Sufficiency Conditions) Let X ⊆ R and B(X) be the space of bounded functions V : X → R with the uniform metric. Let T : B(X) → B(X) be an operator satisfying

1. (Monotonicity) Let V, W ∈ B(X); if V(x) ≤ W(x) for all x ∈ X, then TV(x) ≤ TW(x) for all x ∈ X,

2. (Discounting) There exists some constant β ∈ (0, 1) such that for all V ∈ B(X) and a ≥ 0, we have T(V + a) ≤ TV + βa,

then T is a contraction with modulus β.

Proof: Let us consider two functions V, W ∈ B(X) satisfying 1. and 2., and such that V ≤ W + ρ(V, W). Monotonicity first implies that TV ≤ T(W + ρ(V, W)), and discounting that

   TV ≤ TW + βρ(V, W)

since ρ(V, W) ≥ 0 plays the same role as a. We therefore get

   TV − TW ≤ βρ(V, W)

Likewise, if we now consider that W ≤ V + ρ(V, W), we end up with

   TW − TV ≤ βρ(V, W)

Consequently, we have |TV − TW| ≤ βρ(V, W), so that ρ(TV, TW) ≤ βρ(V, W), which defines a contraction. This completes the proof. □

This theorem is extremely useful, as it gives us simple tools to check whether a problem is a contraction, and therefore permits us to check whether the simple algorithm we defined is appropriate for the problem we have in hand. As an example, let us consider the optimal growth model, for which the Bellman equation writes

   V(k_t) = max_{c_t∈C} u(c_t) + βV(k_{t+1})

with k_{t+1} = F(k_t) − c_t. In order to save on notations, let us drop the time subscript and denote the next period capital stock by k′, such that the Bellman equation rewrites, plugging the law of motion of capital in the utility function,

   V(k) = max_{k′∈K} u(F(k) − k′) + βV(k′)

Let us now define the operator T as

   (TV)(k) = max_{k′∈K} u(F(k) − k′) + βV(k′)

We would like to know if T is a contraction, and therefore if there exists a unique function V such that V(k) = (TV)(k). In order to achieve this task, we just have to check whether T is monotonic and satisfies the discounting property.

1. Monotonicity: Let us consider two candidate value functions, V and W, such that V(k) ≤ W(k) for all k ∈ K. What we want to show is that (TV)(k) ≤ (TW)(k). In order to do that, let us denote by k̃′ the optimal next period capital stock, that is

   (TV)(k) = u(F(k) − k̃′) + βV(k̃′)

But now, since V(k) ≤ W(k) for all k ∈ K, we have V(k̃′) ≤ W(k̃′), such that it should be clear that

   (TV)(k) ≤ u(F(k) − k̃′) + βW(k̃′) ≤ max_{k′∈K} u(F(k) − k′) + βW(k′) = (TW)(k)

Hence we have shown that V(k) ≤ W(k) implies (TV)(k) ≤ (TW)(k), and therefore established monotonicity.

2. Discounting: Let us consider a candidate value function, V, and a positive constant a. Then

   (T(V + a))(k) = max_{k′∈K} u(F(k) − k′) + β(V(k′) + a)
                 = max_{k′∈K} u(F(k) − k′) + βV(k′) + βa
                 = (TV)(k) + βa

Therefore, the Bellman equation satisfies discounting in the case of the optimal growth model.

Hence, the optimal growth model satisfies Blackwell's sufficient conditions for a contraction mapping, and therefore the value function exists and is unique. We are now in a position to design a numerical algorithm to solve the Bellman equation.

7.2 Deterministic dynamic programming

7.2.1 Value function iteration

The contraction mapping theorem gives us a straightforward way to compute the solution to the Bellman equation: iterate on the operator T, such that V_{i+1} = TV_i, up to the point where the distance between two successive value functions is small enough. Basically, this amounts to applying the following algorithm:

1. Decide on a grid X = {x₁, ..., x_N} of admissible values for the state variable x, formulate an initial guess for the value function V₀(x), and choose a stopping criterion ε > 0.

2. For each x_ℓ ∈ X, ℓ = 1, ..., N, compute

   V_{i+1}(x_ℓ) = max_{x′∈X} u(y(x_ℓ, x′), x_ℓ) + βV_i(x′)

3. If ‖V_{i+1}(x) − V_i(x)‖ < ε, go to the next step; else go back to 2.

4. Compute the final solution as y*(x) = y(x, x′) and

   V*(x) = u(y*(x), x)/(1 − β)

In order to better understand the algorithm, let us consider a simple example and go back to the optimal growth model, with

   u(c) = (c^{1−σ} − 1)/(1 − σ)

and

   k′ = k^α − c + (1 − δ)k

Then the Bellman equation writes

   V(k) = max_{0 ≤ c ≤ k^α+(1−δ)k} (c^{1−σ} − 1)/(1 − σ) + βV(k′)

From the law of motion of capital we can determine consumption as

   c = k^α + (1 − δ)k − k′

such that, plugging this result in the Bellman equation, we have

   V(k) = max_{0 ≤ k′ ≤ k^α+(1−δ)k} ((k^α + (1 − δ)k − k′)^{1−σ} − 1)/(1 − σ) + βV(k′)

Now, let us define a grid of N feasible values for k, such that we have

   K = {k₁, ..., k_N}

and an initial value function V₀(k), that is, a vector of N numbers that relate each k_ℓ to a value. Note that this initial guess may be anything we want, as we know, by the contraction mapping theorem, that the algorithm will converge; but if we want it to converge fast enough it may be a good idea to impose a good initial guess. Finally, we need a stopping criterion.

Then, for each ℓ = 1, ..., N, we compute the feasible values that can be taken by the quantity on the right hand side of the value function,

   V_{ℓ,h} ≡ ((k_ℓ^α + (1 − δ)k_ℓ − k_h)^{1−σ} − 1)/(1 − σ) + βV(k_h), for h feasible

It is important to understand what "h feasible" means. Indeed, we only compute consumption when it is positive and smaller than total output, that is, we want k_h to satisfy

   0 ≤ k_h ≤ k_ℓ^α + (1 − δ)k_ℓ

which puts a lower and an upper bound on the index h, and therefore restricts the number of possible values for k′. When the grid of values is uniform, that is when k_h = k̲ + (h − 1)d_k, where d_k is the increment in the grid, the upper bound can be computed as

   h̄ = E[(k_ℓ^α + (1 − δ)k_ℓ − k̲)/d_k] + 1

Then we find

   V_{i+1}(k_ℓ) = max_{h=1,...,h̄} V_{ℓ,h}

and keep in memory the index h* = Argmax_{h=1,...,h̄} V_{ℓ,h}, such that we have the decision rule k′_ℓ = k_{h*}.

In figures 7.1 and 7.2, we report the value function and the decision rules obtained from the deterministic optimal growth model with α = 0.3, β = 0.95, δ = 0.1 and σ = 1.5.

The grid for the capital stock is composed of 1000 data points ranging from (1 − ∆_k)k* to (1 + ∆_k)k*, where k* denotes the steady state and ∆_k = 0.9. The algorithm² then converges in 211 iterations and 110.5 seconds on a 800Mhz computer when the stopping criterion is ε = 1e−6.

² This code, and those that follow, are not efficient from a computational point of view, as they are intended to have you understand the method without adding any coding complications. Much faster implementations can be found in the accompanying matlab codes.

[Figure 7.1: Deterministic OGM (Value function, Value iteration)]

[Figure 7.2: Deterministic OGM (Decision rules, Value iteration)]

Matlab Code: Value Function Iteration

sigma = 1.50;                     % utility parameter
delta = 0.10;                     % depreciation rate
beta  = 0.95;                     % discount factor
alpha = 0.30;                     % capital elasticity of output
nbk   = 1000;                     % number of data points in the grid
crit  = 1;                        % convergence criterion
epsi  = 1e-6;                     % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev   = 0.9;                      % maximal deviation from steady state
kmin  = (1-dev)*ks;               % lower bound on the grid
kmax  = (1+dev)*ks;               % upper bound on the grid
dk    = (kmax-kmin)/(nbk-1);      % implied increment
kgrid = linspace(kmin,kmax,nbk)'; % builds the grid
v     = zeros(nbk,1);             % value function
dr    = zeros(nbk,1);             % decision rule (will contain indices)

while crit>epsi;
   for i=1:nbk
      %
      % compute indexes for which consumption is positive
      %
      tmp  = (kgrid(i)^alpha+(1-delta)*kgrid(i)-kmin);
      imax = min(floor(tmp/dk)+1,nbk);
      %
      % consumption and utility
      %
      c    = kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid(1:imax);
      util = (c.^(1-sigma)-1)/(1-sigma);
      %
      % find value function
      %
      [tv(i),dr(i)] = max(util+beta*v(1:imax));
   end;
   crit = max(abs(tv-v));  % Compute convergence criterion
   v    = tv;              % Update the value function
end
%
% Final solution
%
kp   = kgrid(dr);
c    = kgrid.^alpha+(1-delta)*kgrid-kp;
util = (c.^(1-sigma)-1)/(1-sigma);
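For readers without Matlab, here is a Python transcription of the grid-search algorithm above. It is a sketch of my own: the parameters are those of the text, but the grid is coarsened to 150 points so it runs quickly, and infeasible choices are handled with a large negative utility rather than an index bound.

```python
import numpy as np

sigma, delta, beta, alpha = 1.5, 0.1, 0.95, 0.3
nbk, epsi = 150, 1e-6
ks = ((1 - beta*(1-delta))/(alpha*beta))**(1/(alpha-1))  # steady state capital
dev = 0.9
kgrid = np.linspace((1-dev)*ks, (1+dev)*ks, nbk)

v = np.zeros(nbk)                  # initial guess for the value function
crit = 1.0
while crit > epsi:
    tv = np.empty(nbk)
    dr = np.empty(nbk, dtype=int)
    for i, k in enumerate(kgrid):
        c = k**alpha + (1-delta)*k - kgrid      # consumption for each k'
        # infeasible choices (c <= 0) get a very low utility
        util = np.where(c > 0,
                        (np.maximum(c, 1e-12)**(1-sigma) - 1)/(1-sigma),
                        -1e12)
        val = util + beta*v
        dr[i] = np.argmax(val)
        tv[i] = val[dr[i]]
    crit = np.max(np.abs(tv - v))               # convergence criterion
    v = tv

kp = kgrid[dr]                     # decision rule for next-period capital
```

As in the Matlab version, convergence is governed by the modulus β = 0.95, so a few hundred iterations are needed.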

7.2.2 Taking advantage of interpolation

A possible improvement of the method is to have a much looser grid on the capital stock but a pretty fine grid on the control variable (consumption in the optimal growth model). Then the next period value of the state variable can be computed much more precisely. However, because of this precision and the fact that the grid is rougher, it may be the case that the computed optimal value for the next period state variable does not lie on the grid, such that the value function is unknown at this particular value. Therefore, we use an interpolation scheme to get an approximation of the value function at this value. One advantage of this approach is that it involves fewer function evaluations and is usually less costly in terms of CPU time. The algorithm is then as follows:

1. Decide on a grid X = {x₁, ..., x_N} of admissible values for the state variable x. Decide on a grid Y = {y₁, ..., y_M} of admissible values for the control variable y, with M ≫ N. Formulate an initial guess for the value function V₀(x) and choose a stopping criterion ε > 0.

2. For each x_ℓ ∈ X, ℓ = 1, ..., N, compute x′_{ℓ,j} = h(y_j, x_ℓ) for all j = 1, ..., M. Compute an interpolated value function Ṽ_i(x′_{ℓ,j}) at each x′_{ℓ,j}, and then

   V_{i+1}(x_ℓ) = max_{y∈Y} u(y, x_ℓ) + βṼ_i(x′_{ℓ,j})

3. If ‖V_{i+1}(x) − V_i(x)‖ < ε, go to the next step; else go back to 2.

4. Compute the final solution as

   V*(x) = u(y*, x)/(1 − β)

I now report the matlab code for this approach when we use cubic spline interpolation for the value function, 20 nodes for the capital stock and 1000 nodes for consumption. The algorithm converges in 182 iterations and 40.6 seconds, starting from the initial condition for the value function

   v₀(k) = (((c*/y*) k^α)^{1−σ} − 1)/(1 − σ)

Matlab Code: Value Function Iteration with Interpolation

sigma = 1.50;                     % utility parameter
delta = 0.10;                     % depreciation rate
beta  = 0.95;                     % discount factor
alpha = 0.30;                     % capital elasticity of output
nbk   = 20;                       % number of data points in the K grid
nbc   = 1000;                     % number of data points in the C grid
crit  = 1;                        % convergence criterion
epsi  = 1e-6;                     % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev   = 0.9;                      % maximal deviation from steady state
kmin  = (1-dev)*ks;               % lower bound on the K grid
kmax  = (1+dev)*ks;               % upper bound on the K grid
k     = linspace(kmin,kmax,nbk)'; % builds the K grid
cmin  = 0.01;                     % lower bound on the C grid
cmax  = kmax^alpha;               % upper bound on the C grid
c     = linspace(cmin,cmax,nbc)'; % builds the C grid
util  = (c.^(1-sigma)-1)/(1-sigma);
v     = zeros(nbk,1);             % value function
dr    = zeros(nbk,1);             % decision rule (will contain indices)

while crit>epsi;
   for i=1:nbk;
      kp = k(i).^alpha+(1-delta)*k(i)-c;   % implied next-period capital
      vi = interp1(k,v,kp,'spline');       % interpolated value function
      [Tv(i),dr(i)] = max(util+beta*vi);
   end
   crit = max(abs(Tv-v));  % Compute convergence criterion
   v    = Tv;              % Update the value function
end
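A Python sketch of the same idea, with one simplification to flag: it uses piecewise-linear interpolation (numpy's interp, which also clamps off-grid points to the boundary values) instead of the cubic spline of the Matlab code, and my own bounds on the consumption grid.

```python
import numpy as np

sigma, delta, beta, alpha = 1.5, 0.1, 0.95, 0.3
nbk, nbc, epsi = 20, 1000, 1e-6
ks = ((1 - beta*(1-delta))/(alpha*beta))**(1/(alpha-1))
kgrid = np.linspace(0.1*ks, 1.9*ks, nbk)       # coarse grid on the state

v = np.zeros(nbk)
crit = 1.0
while crit > epsi:
    tv = np.empty(nbk)
    for i, k in enumerate(kgrid):
        y = k**alpha + (1-delta)*k
        c = np.linspace(1e-6, y - kgrid[0], nbc)  # fine grid on consumption
        kp = y - c                                 # implied next-period capital
        vp = np.interp(kp, kgrid, v)               # interpolated value function
        util = (c**(1-sigma) - 1)/(1-sigma)
        tv[i] = np.max(util + beta*vp)
    crit = np.max(np.abs(tv - v))
    v = tv
```

With only 20 state nodes, each iteration is much cheaper than in the previous section, which is the whole point of the interpolation trick.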

[Figure 7.3: Deterministic OGM (Value function, Value iteration with interpolation)]

[Figure 7.4: Deterministic OGM (Decision rules, Value iteration with interpolation)]

7.2.3 Policy iterations: Howard improvement

The simple value iteration algorithm has the attractive feature of being particularly simple to implement. However, it is a slow procedure, especially for infinite horizon problems, since it can be shown that this procedure converges at the rate β, which is usually close to 1! Further, it computes unnecessary quantities during the algorithm, which slows down convergence. Often, computation speed is really important, for instance when one wants to perform a sensitivity analysis of the results obtained in a model using different parameterizations. Hence, we would like to be able to speed up convergence. This can be achieved by relying on Howard's improvement method. This method actually iterates on policy functions rather than iterating on the value function. The algorithm may be described as follows:

1. Set an initial feasible decision rule for the control variable y = f₀(x) and compute the value associated to this guess, assuming that this rule is operative forever:

   V(x_t) = Σ_{s=0}^∞ β^s u(f_i(x_{t+s}), x_{t+s})

taking care of the fact that x_{t+1} = h(x_t, f_i(x_t)), with i = 0. Set a stopping criterion ε > 0.

2. Find a new policy rule y = f_{i+1}(x) such that

   f_{i+1}(x) ∈ Argmax_y u(y, x) + βV(x′)

with x′ = h(x, f_i(x)).

3. Check if ‖f_{i+1}(x) − f_i(x)‖ < ε; if yes then stop, else go back to 2.

Note that this method differs fundamentally from the value iteration algorithm in at least two dimensions:

(i) one iterates on the policy function rather than on the value function;

(ii) the decision rule is used forever, whereas it is assumed that it is used only two consecutive periods in the value iteration algorithm. This is precisely the feature that accelerates convergence.

Note that, when computing the value function, we actually have to solve a linear system of the form

   V_{i+1}(x_ℓ) = u(f_{i+1}(x_ℓ), x_ℓ) + βV_{i+1}(h(x_ℓ, f_{i+1}(x_ℓ)))  for all x_ℓ ∈ X

which may be rewritten as

   V_{i+1}(x_ℓ) = u(f_{i+1}(x_ℓ), x_ℓ) + βQV_{i+1}(x_ℓ)  for all x_ℓ ∈ X

where Q is an (N × N) matrix with

   Q_{ℓj} = 1 if x_j ≡ h(f_{i+1}(x_ℓ), x_ℓ), and 0 otherwise.

This system is solved as

   V_{i+1}(x) = (I − βQ)⁻¹ u(f_{i+1}(x), x)

Note that although Q is a big matrix, it is sparse, which can be exploited in solving the system. We apply this algorithm to the same optimal growth model as in the previous section and report the value function and the decision rules at convergence in figures 7.5 and 7.6. The algorithm converges in only 18 iterations and 9.8 seconds, starting from the same initial guess and using the same parameterization!

Matlab Code: Policy Iteration

sigma = 1.50;                     % utility parameter
delta = 0.10;                     % depreciation rate
beta  = 0.95;                     % discount factor
alpha = 0.30;                     % capital elasticity of output
nbk   = 1000;                     % number of data points in the grid
crit  = 1;                        % convergence criterion
epsi  = 1e-6;                     % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev   = 0.9;                      % maximal deviation from steady state
kmin  = (1-dev)*ks;               % lower bound on the grid
kmax  = (1+dev)*ks;               % upper bound on the grid
devk  = (kmax-kmin)/(nbk-1);      % implied increment
kgrid = linspace(kmin,kmax,nbk)'; % builds the grid
v     = zeros(nbk,1);             % value function
kp0   = kgrid;                    % initial guess on k(t+1)
dr    = zeros(nbk,1);             % decision rule (will contain indices)
%
% Main loop
%
while crit>epsi;
   for i=1:nbk
      %
      % compute indexes for which consumption is positive
      %
      imax = min(floor((kgrid(i)^alpha+(1-delta)*kgrid(i)-kmin)/devk)+1,nbk);
      %
      % consumption and utility
      %
      c    = kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid(1:imax);
      util = (c.^(1-sigma)-1)/(1-sigma);
      %
      % find new policy rule
      %
      [v1,dr(i)] = max(util+beta*v(1:imax));
   end;
   %
   % decision rules
   %
   kp   = kgrid(dr);
   c    = kgrid.^alpha+(1-delta)*kgrid-kp;
   %
   % update the value
   %
   util = (c.^(1-sigma)-1)/(1-sigma);
   Q    = sparse(nbk,nbk);
   for i=1:nbk;
      Q(i,dr(i)) = 1;
   end
   Tv   = (speye(nbk)-beta*Q)\util;
   crit = max(abs(kp-kp0));
   v    = Tv;
   kp0  = kp;
end

[Figure 7.5: Deterministic OGM (Value function, Policy iteration)]

As experimented in the particular example of the optimal growth model, the policy iteration algorithm only requires a few iterations. Unfortunately, we have to solve the linear system

   (I − βQ)V_{i+1} = u(y, x)

which may be particularly costly when the number of grid points is important. Therefore, a number of researchers have proposed to replace the matrix inversion by an additional iteration step, leading to the so-called modified policy iteration with k steps, which replaces the linear problem by the following iteration scheme:

1. Set J₀ = V_i.

[Figure 7.6: Deterministic OGM (Decision rules, Policy iteration)]

2. Iterate k times on J_{i+1} = u(y, x) + βQJ_i, i = 0, ..., k.

3. Set V_{i+1} = J_k.

When k → ∞, J_k tends toward the solution of the linear system.

7.2.4 Parametric dynamic programming

The last technique I will describe borrows from approximation theory, using either orthogonal polynomials or spline functions. The idea is actually to make a guess for the functional form of the value function and iterate on the parameters of this functional form. The algorithm then works as follows:

1. Choose a functional form for the value function Ṽ(x, Θ), a grid of interpolating nodes X = {x₁, ..., x_N}, a stopping criterion ε > 0 and an initial vector of parameters Θ₀.

2. Using the conjecture for the value function, perform the maximization step in the Bellman equation, that is, compute

   w_ℓ = T(Ṽ(x_ℓ, Θ_i)) = max_y u(y, x_ℓ) + βṼ(x′_ℓ, Θ_i)

subject to x′_ℓ = h(y_ℓ, x_ℓ), for ℓ = 1, ..., N.

3. Using the approximation method you have chosen, compute a new vector of parameters Θ_{i+1} such that Ṽ(x, Θ_{i+1}) approximates the data (x_ℓ, w_ℓ).

4. If ‖Ṽ(x, Θ_{i+1}) − Ṽ(x, Θ_i)‖ < ε then stop, else go back to 2.

First note that, for this method to be implementable, we need the payoff function and the value function to be continuous. The approximation function may be either a combination of polynomials, neural networks or splines. Note that during the optimization step we may have to rely on a numerical maximization algorithm, and the approximation method may involve numerical minimization in order to solve a non-linear least-squares problem of the form

   Θ_{i+1} ∈ Argmin_Θ Σ_{ℓ=1}^N (w_ℓ − Ṽ(x_ℓ, Θ))²

This algorithm is usually much faster than value iteration, as it may not require iterating on a large grid. As an example, I will once again focus on the optimal growth problem we have been dealing with so far, and I will approximate the value function by

   Ṽ(k, Θ) = Σ_{i=0}^p θ_i φ_i(2(k − k̲)/(k̄ − k̲) − 1)

where {φ_i(.)}_{i=0}^p is a set of Chebychev polynomials. In the example, I set p = 10 and used 20 nodes. Figures 7.7 and 7.8 report the decision rule and the value function in this case, and table 7.1 reports the parameters of the approximation function. The algorithm converged in 242 iterations, but took much less time than value iteration.
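The approximation in step 3 rests on fitting a Chebychev expansion at the interpolating nodes. A minimal Python sketch of that building block (the recursion mirrors the Matlab chebychev function below; the fitted function is a hypothetical stand-in for the value function, not taken from the notes):

```python
import numpy as np

p, nbk = 10, 20
# Chebychev interpolating nodes on [-1, 1], as in the Matlab code below
rk = -np.cos((2*np.arange(1, nbk+1) - 1) * np.pi / (2*nbk))

def chebmat(x, p):
    """Matrix whose columns are T_0(x), ..., T_p(x), via the usual recursion."""
    X = np.ones((len(x), p+1))
    if p >= 1:
        X[:, 1] = x
    for i in range(2, p+1):
        X[:, i] = 2*x*X[:, i-1] - X[:, i-2]
    return X

# fit a smooth stand-in function at the nodes, then check the fit off the nodes
f = lambda x: np.log(2.5 + x)
theta = np.linalg.lstsq(chebmat(rk, p), f(rk), rcond=None)[0]
xx = np.linspace(-1.0, 1.0, 201)
err = np.max(np.abs(chebmat(xx, p) @ theta - f(xx)))
```

For a function this smooth, a degree-10 expansion is accurate to many digits, which is why the parametric approach can afford so few nodes.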

[Figure 7.7: Deterministic OGM (Value function, Parametric DP)]

[Figure 7.8: Deterministic OGM (Decision rules, Parametric DP)]

Table 7.78042 -0.66012 0.23704 -0.00501 -0.82367 2.00281 23 .05148 -0.01126 -0.1: Value function approximation θ0 θ1 θ2 θ3 θ4 θ5 θ6 θ7 θ8 θ9 θ10 0.00617 0.02601 0.10281 0.

Matlab Code: Parametric Dynamic Programming

sigma = 1.50;                     % utility parameter
delta = 0.10;                     % depreciation rate
beta  = 0.95;                     % discount factor
alpha = 0.30;                     % capital elasticity of output
nbk   = 20;                       % # of data points in the grid
n     = 10;                       % order of polynomials
crit  = 1;                        % convergence criterion
iter  = 1;                        % iteration
epsi  = 1e-6;                     % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev   = 0.9;                      % maximal dev. from steady state
kmin  = (1-dev)*ks;               % lower bound on the grid
kmax  = (1+dev)*ks;               % upper bound on the grid
rk    = -cos((2*[1:nbk]'-1)*pi/(2*nbk));  % Interpolating nodes
kgrid = kmin+(rk+1)*(kmax-kmin)/2;        % mapping
%
% Initial guess for the approximation
%
v     = (((kgrid.^alpha).^(1-sigma)-1)/((1-sigma)*(1-beta)));
X     = chebychev(rk,n);
th0   = X\v;
Tv    = zeros(nbk,1);
kp    = zeros(nbk,1);
%
% Main loop
%
options     = foptions;
options(14) = 1e9;
while crit>epsi;
   k0 = kgrid(1);
   for i=1:nbk
      param = [alpha beta delta sigma kmin kmax n kgrid(i)];
      kp(i) = fminu('tv',k0,options,[],param,th0);
      k0    = kp(i);
      Tv(i) = -tv(kp(i),param,th0);
   end;
   theta = X\Tv;
   crit  = max(abs(Tv-v));
   v     = Tv;
   th0   = theta;
   iter  = iter+1;
end

Matlab Code: Extra Functions

function res=tv(kp,param,theta);
alpha = param(1);
beta  = param(2);
delta = param(3);
sigma = param(4);
kmin  = param(5);
kmax  = param(6);
n     = param(7);
k     = param(8);
kp    = sqrt(kp.^2);                    % insures positivity of k'
v     = value(kp,[kmin kmax n],theta);  % computes the value function
c     = k.^alpha+(1-delta)*k-kp;        % computes consumption
d     = find(c<=0);                     % find negative consumption
c(d)  = NaN;                            % get rid off negative c
util  = (c.^(1-sigma)-1)/(1-sigma);     % computes utility
util(d) = -1e12;                        % utility = low number for c<0
res   = -(util+beta*v);                 % compute -TV (we are minimizing)

function v = value(k,param,theta);
kmin  = param(1);
kmax  = param(2);
n     = param(3);
k     = 2*(k-kmin)/(kmax-kmin)-1;
v     = chebychev(k,n)*theta;

function Tx=chebychev(x,n);
X  = x(:);
lx = size(X,1);
if n<0; error('n should be a positive integer'); end
switch n;
case 0;
   Tx = [ones(lx,1)];
case 1;
   Tx = [ones(lx,1) X];
otherwise
   Tx = [ones(lx,1) X];
   for i=3:n+1;
      Tx = [Tx 2*X.*Tx(:,i-1)-Tx(:,i-2)];
   end
end

A potential problem with the use of Chebychev polynomials is that they do not put any assumption on the shape of the value function, which we know to be concave and strictly increasing in this case. This is why Judd [1998]

recommends to use shape-preserving methods, such as Schumaker approximation, in this case. Judd and Solnick [1994] have successfully applied this latter technique to the optimal growth model and found that the approximation was very good and dominated a lot of other methods (they actually get the same precision with 12 nodes as the one achieved with a 1200 data point grid using a value iteration technique).

7.3 Stochastic dynamic programming

In a large number of problems, we have to deal with stochastic shocks (just think of a standard RBC model). Dynamic programming techniques can obviously be extended to deal with such problems. I will first show how we can obtain the Bellman equation, before addressing some important issues concerning the discretization of shocks. Then I will describe the implementation of the value iteration and policy iteration techniques for the stochastic case.

7.3.1 The Bellman equation

The stochastic problem differs from the deterministic problem in that we now have a stochastic shock. The problem then defines a value function which has as argument the state variable x_t but also the stationary shock s_t, whose sequence {s_t}_{t=0}^∞ satisfies

   s_{t+1} = Φ(s_t, ε_{t+1})    (7.6)

where ε is a white noise process. The value function is therefore given by

   V(x_t, s_t) = max_{{y_{t+τ}∈D(x_{t+τ},s_{t+τ})}_{τ=0}^∞} E_t Σ_{τ=0}^∞ β^τ u(y_{t+τ}, x_{t+τ}, s_{t+τ})    (7.7)

subject to (7.6) and

   x_{t+1} = h(x_t, y_t, s_t)    (7.8)

Since y_t, x_t and s_t are either perfectly observed or decided in period t, they are known in t, such that we may rewrite the value function as

   V(x_t, s_t) = max_{y_t∈D(x_t,s_t)} u(y_t, x_t, s_t) + max_{{y_{t+τ}∈D(x_{t+τ},s_{t+τ})}_{τ=1}^∞} E_t Σ_{τ=1}^∞ β^τ u(y_{t+τ}, x_{t+τ}, s_{t+τ})

Using the change of variable k = τ − 1, this rewrites

   V(x_t, s_t) = max_{y_t∈D(x_t,s_t)} u(y_t, x_t, s_t) + max_{{y_{t+1+k}∈D(x_{t+1+k},s_{t+1+k})}_{k=0}^∞} β E_t Σ_{k=0}^∞ β^k u(y_{t+1+k}, x_{t+1+k}, s_{t+1+k})

It is important at this point to recall that

   E_t(X(ω_{t+τ})) = ∫ X(ω_{t+τ}) f(ω_{t+τ}|ω_t) dω_{t+τ}
                   = ∫∫ X(ω_{t+τ}) f(ω_{t+τ}|ω_{t+τ−1}) f(ω_{t+τ−1}|ω_t) dω_{t+τ} dω_{t+τ−1}
                   = ∫ ... ∫ X(ω_{t+τ}) f(ω_{t+τ}|ω_{t+τ−1}) ... f(ω_{t+1}|ω_t) dω_{t+τ} ... dω_{t+1}

which has as a corollary the law of iterated projections, so that we get

   V(x_t, s_t) = max_{y_t∈D(x_t,s_t)} u(y_t, x_t, s_t) + β E_t [ max_{{y_{t+1+k}∈D(x_{t+1+k},s_{t+1+k})}_{k=0}^∞} E_{t+1} Σ_{k=0}^∞ β^k u(y_{t+1+k}, x_{t+1+k}, s_{t+1+k}) ]

or

   V(x_t, s_t) = max_{y_t∈D(x_t,s_t)} u(y_t, x_t, s_t) + β ∫ [ max_{{y_{t+1+k}∈D(x_{t+1+k},s_{t+1+k})}_{k=0}^∞} E_{t+1} Σ_{k=0}^∞ β^k u(y_{t+1+k}, x_{t+1+k}, s_{t+1+k}) ] f(s_{t+1}|s_t) ds_{t+1}

Note that, because each value of the shock defines a particular mathematical object, the maximization of the integral corresponds to the integral of the maximizations; therefore the max operator and the summation are interchangeable.

By definition, the term under the integral corresponds to V(x_{t+1}, s_{t+1}), such that the value function rewrites

   V(x_t, s_t) = max_{y_t∈D(x_t,s_t)} u(y_t, x_t, s_t) + β ∫ V(x_{t+1}, s_{t+1}) f(s_{t+1}|s_t) ds_{t+1}

or equivalently

   V(x_t, s_t) = max_{y_t∈D(x_t,s_t)} u(y_t, x_t, s_t) + β E_t V(x_{t+1}, s_{t+1})

which is precisely the Bellman equation for the stochastic dynamic programming problem.

7.3.2 Discretization of the shocks

A very important problem that arises whenever we deal with value iteration or policy iteration in a stochastic environment is that of the discretization of the space spanned by the shocks. Indeed, the use of a continuous support for the stochastic shocks is infeasible for a computer, which can only deal with discrete supports. We therefore have to transform the continuous problem into a discrete one, with the constraint that the asymptotic properties of the continuous and the discrete processes should be the same. The question we therefore face is: does there exist a discrete representation for s which is equivalent to its continuous original representation? The answer to this question is yes. In particular, as soon as we deal with (V)AR processes, we can use a very powerful tool: Markov chains.

Markov chains: A Markov chain is a sequence of random values whose probabilities at a time interval depend upon the value of the number at the previous time. We will restrict ourselves to discrete-time Markov chains, indexed by an integer variable t, in which the state changes at certain discrete time instants. At each time step t, the Markov chain is in a state, denoted by s ∈ S ≡ {s₁, ..., s_M}. S is called the state space.

The Markov chain is described in terms of transition probabilities π_{ij}. This transition probability should read as follows: if the state happens to be s_i in period t, the probability that the next state is equal to s_j is π_{ij}. We therefore get the following definition.

Definition 6 A Markov chain is a stochastic process with a discrete indexing S, such that the conditional distribution of s_{t+1} is independent of all previously attained states given s_t:

   π_{ij} = Prob(s_{t+1} = s_j | s_t = s_i),  s_i, s_j ∈ S.

The important assumption we shall make concerning Markov processes is that the transition probabilities π_{ij} apply as soon as state s_i is activated, no matter the history of the shocks nor how the system got to state s_i. In other words, there is no hysteresis. From a mathematical point of view, this corresponds to the so-called Markov property

   P(s_{t+1} = s_j | s_t = s_i, s_{t−1} = s_{i_{t−1}}, ..., s₀ = s_{i₀}) = P(s_{t+1} = s_j | s_t = s_i) = π_{ij}

for all periods t, all states s_i, s_j ∈ S, and all sequences {s_{i_n}}_{n=0}^{t−1} of earlier states. The transition probabilities must of course satisfy

1. π_{ij} ≥ 0 for all i, j = 1, ..., M,

2. Σ_{j=1}^M π_{ij} = 1 for all i = 1, ..., M.

All the elements of a Markov chain model can then be encoded in a transition probability matrix

   Π = ( π_{11} ... π_{1M} ; ... ; π_{M1} ... π_{MM} )

Note that Π^k_{ij} then gives the probability that s_{t+k} = s_j given that s_t = s_i. In the long run, we obtain the steady state equivalent for a Markov chain: the invariant distribution or stationary distribution.

Definition 7 A stationary distribution for a Markov chain is a distribution π such that

1. π_j ≥ 0 for all j = 1, ..., M,

2. Σ_{j=1}^M π_j = 1,

3. π′Π = π′.

Gaussian-quadrature approach to discretization: Tauchen and Hussey [1991] provide a simple way to discretize VAR processes relying on Gaussian quadrature.³ I will only present the case of an AR(1) process of the form

   s_{t+1} = ρs_t + (1 − ρ)s̄ + ε_{t+1}

where ε_{t+1} ∼ N(0, σ²). This implies that

   (1/(σ√(2π))) ∫_{−∞}^{∞} exp{ −(1/2)((s_{t+1} − ρs_t − (1 − ρ)s̄)/σ)² } ds_{t+1} = ∫ f(s_{t+1}|s_t) ds_{t+1} = 1

which illustrates the fact that s is a continuous random variable. Tauchen and Hussey [1991] propose to replace the integral by

   ∫ Φ(s_{t+1}; s_t, s̄) f(s_{t+1}|s̄) ds_{t+1} ≡ ∫ (f(s_{t+1}|s_t)/f(s_{t+1}|s̄)) f(s_{t+1}|s̄) ds_{t+1} = 1

where f(s_{t+1}|s̄) denotes the density of s_{t+1} conditional on the fact that s_t = s̄ (which amounts to the unconditional density), and where

   Φ(s_{t+1}; s_t, s̄) ≡ f(s_{t+1}|s_t)/f(s_{t+1}|s̄) = exp{ −(1/2)[ ((s_{t+1} − ρs_t − (1 − ρ)s̄)/σ)² − ((s_{t+1} − s̄)/σ)² ] }

³ Note that we already discussed this issue in lecture notes # 4.

Then we can use the standard linear transformation and impose z_t = (s_t − s̄)/(σ√2) to get

   (1/√π) ∫_{−∞}^{∞} exp{ z_{t+1}² − (z_{t+1} − ρz_t)² } exp(−z_{t+1}²) dz_{t+1} = 1

for which we can use a Gauss-Hermite quadrature.

Then we can use the standard linear transformation and impose z_t = (s_t − s̄)/(σ√2) to get

(1/√π) ∫_{−∞}^{∞} exp( −[ (z_{t+1} − ρ z_t)² − z_{t+1}² ] ) exp( −z_{t+1}² ) dz_{t+1}

for which we can use a Gauss–Hermite quadrature. Assume then that we have the quadrature nodes z_i and weights ω_i; the quadrature leads to the formula

(1/√π) Σ_{j=1}^n ω_j Φ(z_j, z_i, s̄) ≃ 1

in other words, we might interpret the quantity (1/√π) ω_j Φ(z_j, z_i, s̄) as an "estimate" π̂_ij ≡ Prob(s_{t+1} = s_j | s_t = s_i) of the transition probability from state i to state j. But remember that the quadrature is just an approximation, such that it will generally be the case that Σ_{j=1}^n π̂_ij = 1 will not hold exactly. Tauchen and Hussey therefore propose the following modification:

π_ij = ω_j Φ(z_j, z_i, s̄) / (√π ς_i)   where   ς_i = (1/√π) Σ_{j=1}^n ω_j Φ(z_j, z_i, s̄)

Matlab Code: Discretization of an AR(1) Process

function [s,p]=tauch_huss(xbar,rho,sigma,n)
% xbar  : mean of the s process
% rho   : persistence parameter
% sigma : volatility
% n     : number of nodes
%
% returns the states s and the transition probabilities p
%
[xx,wx] = gauss_herm(n);        % nodes and weights for the quadrature
s  = sqrt(2)*sigma*xx+xbar;     % discrete states
x  = xx(:,ones(n,1));
z  = x';
w  = wx(:,ones(n,1))';
%
% computation
%
p  = (exp(z.*z-(z-rho*x).*(z-rho*x)).*w)./sqrt(pi);
sx = sum(p')';
p  = p./sx(:,ones(n,1));
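The same construction can be sketched in Python with a hard-coded 3-node Gauss–Hermite rule (nodes 0 and ±√(3/2), weights 2√π/3 and √π/6, for the weight function exp(−z²)); the ρ, s̄ and σ values below are illustrative:

```python
# Sketch of the Tauchen-Hussey step with a hard-coded 3-node
# Gauss-Hermite rule; parameter values are illustrative.
import math

rho, sbar, sigma = 0.8, 0.0, 0.12
nodes = [-math.sqrt(1.5), 0.0, math.sqrt(1.5)]
wts   = [math.sqrt(math.pi)/6, 2*math.sqrt(math.pi)/3, math.sqrt(math.pi)/6]

# states in levels: s = sqrt(2)*sigma*z + sbar
states = [math.sqrt(2)*sigma*z + sbar for z in nodes]

def Phi(zj, zi):
    # f(s'|s_i)/f(s'|sbar), written in the transformed variable z
    return math.exp(zj*zj - (zj - rho*zi)**2)

P = []
for zi in nodes:
    row  = [w * Phi(zj, zi) / math.sqrt(math.pi) for zj, w in zip(nodes, wts)]
    zeta = sum(row)                   # the normalising constant  sigma_i
    P.append([p / zeta for p in row])

for row in P:
    assert abs(sum(row) - 1.0) < 1e-12   # rows now sum to one exactly
print(P[1])
```

Note how, before the normalisation by ς_i, the middle row already sums to one (at z_i = 0 the ratio Φ is identically one, so the quadrature is exact there); the outer rows are where the correction matters.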

7.3.3 Value iteration

As in the deterministic case, the convergence of the simple value function iteration procedure is insured by the contraction mapping theorem. This is however a bit more subtle than that, as we have to deal with the convergence of a probability measure, which goes far beyond this introduction to dynamic programming.⁴ The algorithm is basically the same as in the deterministic case, up to the delicate point that we have to deal with the expectation. The algorithm then writes as follows:

1. Decide on a grid X = {x_1, ..., x_N} of admissible values for the state variable x and a grid S = {s_1, ..., s_M} for the shocks, together with the transition matrix Π = (π_ij). Formulate an initial guess for the value function V_0(x) and choose a stopping criterion ε > 0.

2. For each x_ℓ ∈ X, ℓ = 1, ..., N, and s_k ∈ S, k = 1, ..., M, compute

V_{i+1}(x_ℓ, s_k) = max_{x′ ∈ X} u(y(x_ℓ, x′, s_k), x_ℓ, s_k) + β Σ_{j=1}^M π_kj V_i(x′, s_j)

3. If ‖V_{i+1}(x, s) − V_i(x, s)‖ < ε go to the next step, else go back to 2.

4. Compute the final solution as y⋆(x, s) = y(x, x′(x, s), s).

In order to better understand the algorithm, let us consider a simple example and go back to the optimal growth model, with

u(c) = (c^{1−σ} − 1)/(1 − σ)

⁴ The interested reader should then refer to Lucas et al. [1989], chapter 9.
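Steps 1 to 3 can be sketched in a few lines of Python. This is only a sketch: the grid is much coarser than the one used below, the two technology levels are illustrative stand-ins, and the tolerance is loose so that the loop runs quickly:

```python
# Sketch of stochastic value-function iteration on the optimal growth
# model; coarse grid, illustrative shock levels, loose tolerance.
alpha, beta, delta, sigma = 0.3, 0.95, 0.1, 1.5
A  = [0.9, 1.1]                       # illustrative 2-state technology levels
PI = [[0.9, 0.1], [0.1, 0.9]]         # pi = (1+rho)/2 = 0.9 for rho = 0.8
K  = [0.2 + i * (6.0 - 0.2) / 49 for i in range(50)]

def u(c):
    return (c**(1 - sigma) - 1) / (1 - sigma)

# resources available in state (k, a): a*k^alpha + (1-delta)*k
res = [[a * k**alpha + (1 - delta) * k for a in A] for k in K]

V = [[0.0, 0.0] for _ in K]
for it in range(2000):
    # expected continuation value of choosing k'_ip in current state j
    EV = [[PI[j][0] * V[ip][0] + PI[j][1] * V[ip][1] for j in range(2)]
          for ip in range(len(K))]
    TV = [[0.0, 0.0] for _ in K]
    for i in range(len(K)):
        for j in range(2):
            best = -float("inf")
            for ip, kp in enumerate(K):
                c = res[i][j] - kp
                if c <= 0:
                    break             # grid is increasing: no larger k' is feasible
                best = max(best, u(c) + beta * EV[ip][j])
            TV[i][j] = best
    crit = max(abs(TV[i][j] - V[i][j]) for i in range(len(K)) for j in range(2))
    V = TV
    if crit < 1e-4:
        break
print(it, crit)
```

The only change relative to the deterministic loop is the expectation term `EV`, computed from the transition matrix before each maximization sweep.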

We therefore have to discretize the shock. we have V (k. a )dµ(a |a) From the law of motion of capital we can determine consumption as c = exp(a)k α + (1 − δ)k − k such that plugging this results in the Bellman equation. a) = max c c1−σ − 1 +β 1−σ V (k . such that Π= π 1−π 1−π π a1 . I will also assume that the transition matrix is symmetric.and k = exp(a)k α − c + (1 − δ)k where a = ρa + ε . . such that a can take on 2 values a1 and a2 (a1 < a2 ). Then the Bellman equation writes V (k. we get a1 = 33 2 σε /(1 − ρ2 ). a2 and π are selected such that the process reproduces the conditional ﬁrst and second order moments of the AR(1) process First order moments πa1 + (1 − π)a2 = ρa1 (1 − π)a1 + πa2 = ρa2 Second order moments 2 πa2 + (1 − π)a2 − (ρa1 )2 = σε 1 2 2 + πa2 − (ρa )2 = σ 2 (1 − π)a1 2 ε 2 From the ﬁrst two equations we get a1 = −a2 and π = (1 + ρ)/2. a) = max k (exp(a)k α + (1 − δ)k − k )1−σ − 1 +β 1−σ V (k . I will consider that the technology shock can be accurately approximated by a 2 state markov chain. Here. Plugging these last two results in the two last equations. a )dµ(a |a) A ﬁrst problem that we encounter is that we would like to be able to evaluate the integral involved by the rational expectation.

Hence, since the shock is discrete, we will actually work with a value function of the form

V(k, a_k) = max_c (c^{1−σ} − 1)/(1 − σ) + β Σ_{j=1}^2 π_kj V(k′, a_j)

Now, as in the deterministic case, let us define a grid of N feasible values for k, such that we have K = {k_1, ..., k_N}, and an initial value function V_0(k) — that is a vector of N numbers that relate each k to a value. Note that this may be anything we want, as we know — by the contraction mapping theorem — that the algorithm will converge. But, if we want it to converge fast enough, it may be a good idea to impose a good initial guess. Finally, we need a stopping criterion.

In figures 7.9 and 7.10, we report the value function and the decision rules obtained for the stochastic optimal growth model with α = 0.3, β = 0.95, δ = 0.1, σ = 1.5, ρ = 0.8 and σ_ε = 0.12. The grid for the capital stock is composed of 1000 data points ranging from 0.2 to 6. The algorithm then converges in 192 iterations when the stopping criterion is ε = 1e−6.

Figure 7.9: Stochastic OGM (Value function, Value iteration)

Figure 7.10: Stochastic OGM (Decision rules, Value iteration) — panels: next period capital stock, consumption

Matlab Code: Value Function Iteration

sigma = 1.50;                       % utility parameter
delta = 0.10;                       % depreciation rate
beta  = 0.95;                       % discount factor
alpha = 0.30;                       % capital elasticity of output
rho   = 0.80;                       % persistence of the shock
se    = 0.12;                       % volatility of the shock
nbk   = 1000;                       % number of data points in the grid
nba   = 2;                          % number of values for the shock
crit  = 1;                          % convergence criterion
epsi  = 1e-6;                       % convergence parameter
%
% Discretization of the shock
%
p     = (1+rho)/2;
PI    = [p 1-p;1-p p];
a1    = exp(-se/sqrt(1-rho*rho));   % exp(a1), a1 = -sigma_e/sqrt(1-rho^2)
a2    = exp(se/sqrt(1-rho*rho));
A     = [a1 a2];
%
% Discretization of the state space
%
kmin  = 0.2;
kmax  = 6;
k     = linspace(kmin,kmax,nbk)';
c     = zeros(nbk,nba);
util  = zeros(nbk,nba);
v     = zeros(nbk,nba);
Tv    = zeros(nbk,nba);
dr    = zeros(nbk,nba);
iter  = 0;
while crit>epsi;
   for i=1:nbk
      for j=1:nba;
         c           = A(j)*k(i)^alpha+(1-delta)*k(i)-k;
         neg         = find(c<=0);
         c(neg)      = NaN;
         util(:,j)   = (c.^(1-sigma)-1)/(1-sigma);
         util(neg,j) = -1e12;
      end
      [Tv(i,:),dr(i,:)] = max(util+beta*(v*PI'));
   end
   crit = max(max(abs(Tv-v)));
   v    = Tv;
   iter = iter+1;
end

kp = k(dr);
for j=1:nba;
   c(:,j) = A(j)*k.^alpha+(1-delta)*k-kp(:,j);
end

7.3.4 Policy iterations

As in the deterministic case, we may want to accelerate the simple value iteration using Howard improvement. The stochastic case does not differ that much from the deterministic case, apart from the fact that we now have to deal with different decision rules — one per state of the shock — which implies the computation of different Q matrices. The algorithm may be described as follows:

1. Set an initial feasible set of decision rules for the control variable, y = f_0(x, s_k), k = 1, ..., M, and compute the value associated to this guess, assuming that this rule is operative forever, taking care of the fact that x_{t+1} = h(x_t, y_t, s_t) with y_t = f_i(x_t, s_t), i = 0. Set a stopping criterion ε > 0.

2. Find a new policy rule y = f_{i+1}(x, s_k), k = 1, ..., M, such that

f_{i+1}(x, s_k) ∈ Argmax_y u(y, x, s_k) + β Σ_{j=1}^M π_kj V_i(x′, s_j)

with x′ = h(x, y, s_k).

3. Check if ‖f_{i+1}(x, s) − f_i(x, s)‖ < ε; if yes then stop, else go back to 2.

When computing the value function, we actually have to solve a linear system of the form

V_{i+1}(x, s_k) = u(f_{i+1}(x, s_k), x, s_k) + β Σ_{j=1}^M π_kj V_{i+1}(h(x, f_{i+1}(x, s_k), s_k), s_j)   for all x ∈ X

which may be rewritten, state by state, as

V_{i+1}(x, s_k) = u(f_{i+1}(x, s_k), x, s_k) + β Π_{k·} Q V_{i+1}(x, ·)   ∀x ∈ X

where Q is an (N × N) matrix with

Q_{ℓj} = 1 if x_j = h(x_ℓ, f_{i+1}(x_ℓ, s_k), s_k), and 0 otherwise

Note that although it is a big matrix, Q is sparse, which can be exploited in solving the system, to get

V_{i+1}(x, s) = (I − β Π ⊗ Q)^{−1} u(f_{i+1}(x, s), x, s)

We apply this algorithm to the same optimal growth model as in the previous section and report the value function and the decision rules at convergence in figures 7.11 and 7.12. The algorithm converges in only 17 iterations, starting from the same initial guess and using the same parameterization!

Figure 7.11: Stochastic OGM (Value function, Policy iteration)

Figure 7.12: Stochastic OGM (Decision rules, Policy iteration) — panels: next period capital stock, consumption

Matlab Code: Policy Iteration

sigma = 1.50;                       % utility parameter
delta = 0.10;                       % depreciation rate
beta  = 0.95;                       % discount factor
alpha = 0.30;                       % capital elasticity of output
rho   = 0.80;                       % persistence of the shock
se    = 0.12;                       % volatility of the shock
nbk   = 1000;                       % number of data points in the grid
nba   = 2;                          % number of values for the shock
crit  = 1;                          % convergence criterion
epsi  = 1e-6;                       % convergence parameter
%
% Discretization of the shock
%
p     = (1+rho)/2;
PI    = [p 1-p;1-p p];
a1    = exp(-se/sqrt(1-rho*rho));   % exp(a1), a1 = -sigma_e/sqrt(1-rho^2)
a2    = exp(se/sqrt(1-rho*rho));
A     = [a1 a2];
%
% Discretization of the state space
%
kmin  = 0.2;
kmax  = 6;
kgrid = linspace(kmin,kmax,nbk)';
c     = zeros(nbk,nba);
util  = zeros(nbk,nba);

v    = zeros(nbk,nba);
Ev   = zeros(nbk,nba);              % expected value function
dr0  = repmat([1:nbk]',1,nba);      % initial guess for the rule: k' = k
dr   = zeros(nbk,nba);              % decision rule (will contain indices)
kp0  = kgrid(dr0);
iter = 0;
%
% Main loop
%
while crit>epsi;
   iter = iter+1;
   %
   % update the expected value function
   %
   for j=1:nba;
      Ev(:,j) = v*PI(j,:)';
   end
   %
   % one maximization step: find the new decision rule
   %
   for i=1:nbk
      for j=1:nba;
         c      = A(j)*kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid;
         neg    = find(c<=0);
         c(neg) = NaN;
         u      = (c.^(1-sigma)-1)/(1-sigma);
         u(neg) = -inf;
         [v1,dr(i,j)] = max(u+beta*Ev(:,j));
      end
   end
   %
   % evaluate the new rule: solve the (sparse) linear system
   %
   kp = kgrid(dr);
   Q  = sparse(nbk*nba,nbk*nba);
   for j=1:nba;
      Q0 = sparse(nbk,nbk);
      for i=1:nbk;
         Q0(i,dr(i,j)) = 1;
      end
      Q((j-1)*nbk+1:j*nbk,:) = kron(PI(j,:),Q0);
      c         = A(j)*kgrid.^alpha+(1-delta)*kgrid-kp(:,j);
      util(:,j) = (c.^(1-sigma)-1)/(1-sigma);
   end
   Tv   = (speye(nbk*nba)-beta*Q)\util(:);
   v    = reshape(Tv,nbk,nba);
   crit = max(max(abs(kp-kp0)));
   kp0  = kp;
end
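To see the mechanics without the sparse linear algebra, here is a hedged pure-Python sketch of the same idea on a coarse grid. It replaces the exact solve of (I − βΠ⊗Q)V = u with many evaluation sweeps under the fixed rule, and the shock levels are illustrative:

```python
# Sketch of policy iteration on the stochastic optimal growth model.
# Repeated evaluation sweeps stand in for the exact sparse solve;
# grid size and shock levels are illustrative, not those of the text.
alpha, beta, delta, sigma = 0.3, 0.95, 0.1, 1.5
A  = [0.9, 1.1]                      # illustrative 2-state technology levels
PI = [[0.9, 0.1], [0.1, 0.9]]
K  = [0.2 + i * (6.0 - 0.2) / 49 for i in range(50)]

def u(c):
    return (c**(1 - sigma) - 1) / (1 - sigma)

dr = [[i, i] for i in range(len(K))]     # initial feasible rule: k' = k
V  = [[0.0, 0.0] for _ in K]
for step in range(50):
    # policy evaluation: sweep the fixed-rule Bellman operator
    for _ in range(200):
        for i, k in enumerate(K):
            for j, a in enumerate(A):
                ip = dr[i][j]
                c  = a * k**alpha + (1 - delta) * k - K[ip]
                V[i][j] = u(c) + beta * (PI[j][0]*V[ip][0] + PI[j][1]*V[ip][1])
    # policy improvement: one maximization step
    new = []
    for i, k in enumerate(K):
        row = []
        for j, a in enumerate(A):
            best, arg = -float("inf"), dr[i][j]
            for ip, kp in enumerate(K):
                c = a * k**alpha + (1 - delta) * k - kp
                if c <= 0:
                    continue
                val = u(c) + beta * (PI[j][0]*V[ip][0] + PI[j][1]*V[ip][1])
                if val > best:
                    best, arg = val, ip
            row.append(arg)
        new.append(row)
    if new == dr:
        break
    dr = new
print(step)
```

As in the text, the expensive part is evaluating the candidate rule; the improvement step itself is a single maximization sweep, which is why so few outer iterations are needed.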

7.4 How to simulate the model?

At this point, no matter whether the model is deterministic or stochastic, we have a solution which is actually a collection of data points. In other words, we have Lagrange data which can be used to get a continuous decision rule. This decision rule may then be used to simulate the model.

7.4.1 The deterministic case

After having solved the model by either value iteration or policy iteration, we have in hand a collection of points, X = {k_1, ..., k_N}, for the state variable — which form our grid — and a decision rule that relates the control variable to this collection of points. There are several ways to simulate the model:

• one may simply use the pointwise decision rule and iterate on it,
• or one may use an interpolation scheme (either least square methods, spline interpolations, ...) in order to obtain a continuous decision rule.

Here I will pursue the second route. In order to illustrate this, let's focus once again on the deterministic optimal growth model, and let us assume that we have solved the model by either value iteration or policy iteration. We therefore have a collection of data points for the capital stock, K = {k_1, ..., k_N}, the associated decision rule for the next period capital stock, k′_i = h(k_i), i = 1, ..., N, and the consumption level, which can be deduced from the next period capital stock using the law of motion of capital together with the resource constraint. Also note that we have the upper and lower bounds of the grid, k_1 = k̲ and k_N = k̄. We will use Chebychev polynomials to approximate the rule, in order to avoid multicollinearity problems in the least square algorithm. In other words, we will get a decision rule of the form

k′ = Σ_{ℓ=0}^n θ_ℓ T_ℓ(ϕ(k))
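Such a fit can be sketched as follows (here ϕ denotes the linear map of [k̲, k̄] into [−1, 1]). Sampling at Chebychev nodes makes the normal equations diagonal, so no matrix solve is needed; the rule `h` below is a stand-in analytic function, not an actual solved decision rule:

```python
# Sketch: least-squares Chebychev fit of a rule h(k), sampled at
# Chebychev nodes (where discrete orthogonality makes the normal
# equations diagonal).  h is a stand-in for a solved decision rule.
import math

kmin, kmax, n, m = 0.2, 6.0, 7, 50      # order n, m sample nodes

def h(k):                                # hypothetical "solved" rule
    return 0.9 * k**0.3 + 0.9 * k

def T(l, x):                             # Chebychev polynomial T_l
    return math.cos(l * math.acos(x))

xs = [math.cos(math.pi * (i + 0.5) / m) for i in range(m)]   # nodes in [-1,1]
ks = [(x + 1) * (kmax - kmin) / 2 + kmin for x in xs]        # back to [kmin,kmax]
ys = [h(k) for k in ks]

theta = []
for l in range(n + 1):
    c = sum(y * T(l, x) for x, y in zip(xs, ys)) / m
    theta.append(c if l == 0 else 2 * c)

def approx(k):
    x = 2 * (k - kmin) / (kmax - kmin) - 1
    x = max(-1.0, min(1.0, x))           # guard acos against rounding
    return sum(t * T(l, x) for l, t in enumerate(theta))

err = max(abs(approx(k) - h(k)) for k in ks)
print(err)
```

The coefficients θ_ℓ decay quickly for a smooth rule, which is exactly the pattern visible in the estimated coefficients reported below.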

where ϕ(k) is a linear transformation that maps [k̲, k̄] into [−1, 1]. The algorithm works as follows:

1. Solve the model by value iteration or policy iteration, such that we have a collection of data points for the capital stock, K = {k_1, ..., k_N}, and the associated decision rule for the next period capital stock, k′_i = h(k_i), i = 1, ..., N.

2. Choose a maximal order, n, for the Chebychev polynomials approximation.

3. Apply the transformation x_i = 2 (k_i − k̲)/(k̄ − k̲) − 1, i = 1, ..., N, to the data and compute T(x_i).

4. Perform the least square approximation, such that Θ = (T(x)′T(x))^{−1} T(x)′ h(k).

For the parameterization we have been using so far, setting

ϕ(k) = 2 (k − k̲)/(k̄ − k̲) − 1

and for an approximation of order 7, we obtain θ = {2.6008, 2.0814, −0.0295, 0.0125, −0.0055, 0.0027, −0.0012, 0.0007}, such that the decision rule for the capital stock can be approximated as

k_{t+1} = 2.6008 + 2.0814 T_1(ϕ(k_t)) − 0.0295 T_2(ϕ(k_t)) + 0.0125 T_3(ϕ(k_t)) − 0.0055 T_4(ϕ(k_t)) + 0.0027 T_5(ϕ(k_t)) − 0.0012 T_6(ϕ(k_t)) + 0.0007 T_7(ϕ(k_t))

Then the model can be used in the usual way, such that in order to get the transitional dynamics of the model we just have to iterate on this backward-

looking finite-differences equation. Consumption, output and investment can be computed using their definitions:

i_t = k_{t+1} − (1 − δ) k_t
y_t = k_t^α
c_t = y_t − i_t

For instance, figure 7.13 reports the transitional dynamics of the model when the economy is started 50% below its steady state. In the figure, the plain line represents the dynamics of each variable, and the dashed line its steady state level.

Figure 7.13: Transitional dynamics (OGM) — panels: capital stock, output, consumption, investment

Matlab Code: Transitional Dynamics

n      = 8;                            % Order of approximation+1
transk = 2*(kgrid-kmin)/(kmax-kmin)-1; % transform the data
%
% Computes the matrix of Chebychev polynomials
%

Tk = [ones(nbk,1) transk];
for i=3:n;
   Tk = [Tk 2*transk.*Tk(:,i-1)-Tk(:,i-2)];
end
b    = Tk\kp;            % Performs OLS
k0   = 0.5*ks;           % initial capital stock (50% below steady state ks)
nrep = 50;               % number of periods
k    = zeros(nrep+1,1);  % initialize dynamics
%
% iteration loop
%
k(1) = k0;
for t=1:nrep;
   trkt = 2*(k(t)-kmin)/(kmax-kmin)-1;
   Tk   = [1 trkt];
   for i=3:n;
      Tk = [Tk 2*trkt.*Tk(:,i-1)-Tk(:,i-2)];
   end
   k(t+1) = Tk*b;
end
y = k(1:nrep).^alpha;
i = k(2:nrep+1)-(1-delta)*k(1:nrep);
c = y-i;

Note: At this point, I assume that the model has already been solved, that the capital grid has been stored in kgrid, and that the decision rule for the next period capital stock is stored in kp.

Another way to proceed would have been to use spline approximation, in which case the matlab code would have been (for the simulation part)

Matlab Code: OGM simulation (cubic spline interpolation)

k(1) = k0;
for t=1:nrep;
   k(t+1) = interp1(kgrid,kp,k(t),'cubic');
end
y = k(1:nrep).^alpha;
i = k(2:nrep+1)-(1-delta)*k(1:nrep);
c = y-i;

7.4.2 The stochastic case

As in the deterministic case, once the model has been solved, we have in hand a collection of points for the state variable and the associated decision rules, such that we should apply the same

methodology as before. The only departure from the previous experiment is that, since we have approximated the stochastic shock by a Markov chain, we have as many decision rules as we have states in the Markov chain. The code to generate these rules is exactly the same as for the deterministic version of the model since, when computing b=Tk\kp, matlab will take care of the fact that kp is a (N × 2) matrix, such that b will be a ((n + 1) × 2) matrix, where n is the approximation order. For instance, in the optimal growth model we solved in section 7.3, we have 2 decision rules which are given by

k_{1,t+1} = 3.2002 + 2.6302 T_1(ϕ(k_t)) − 0.0543 T_2(ϕ(k_t)) + 0.0235 T_3(ϕ(k_t)) − 0.0110 T_4(ϕ(k_t)) + 0.0057 T_5(ϕ(k_t)) − 0.0027 T_6(ϕ(k_t)) + 0.0014 T_7(ϕ(k_t))

k_{2,t+1} = 2.8630 + 2.4761 T_1(ϕ(k_t)) − 0.0211 T_2(ϕ(k_t)) + 0.0114 T_3(ϕ(k_t)) − 0.0057 T_4(ϕ(k_t)) + 0.0031 T_5(ϕ(k_t)) − 0.0014 T_6(ϕ(k_t)) + 0.0009 T_7(ϕ(k_t))

The only difficulty we have to deal with in the stochastic case is to simulate the Markov chain. Simulating a series of length T for the Markov chain can be achieved easily:

1. Compute the cumulative distribution of the Markov chain, Π^c — i.e. the cumulative sum of each row of the transition matrix. Hence, Π^c_ij corresponds to the probability that the economy is in a state lower or equal to j, given that it was in state i in period t − 1.

2. Set the initial state and simulate T random numbers from a uniform distribution: {p_t}_{t=1}^T. Note that these numbers, drawn from a U[0,1], can be interpreted as probabilities.

3. Assume the economy was in state i in t − 1; find the index j such that Π^c_{i(j−1)} < p_t ⩽ Π^c_ij. Then s_j is the state of the economy in period t.

Matlab Code: Simulation of a Markov Chain

s      = [-1;1];                     % 2 possible states
PI     = [0.9 0.1;0.1 0.9];          % Transition matrix
s0     = 1;                          % Initial state
T      = 1000;                       % Length of the simulation
p      = rand(T,1);                  % Simulated uniform draws
cum_PI = [zeros(2,1) cumsum(PI')'];  % cumulative distribution
j      = zeros(T,1);
j(1)   = s0;
for k=2:T;
   j(k) = find((p(k)<=cum_PI(j(k-1),2:end))&(p(k)>cum_PI(j(k-1),1:end-1)));
end
chain = s(j);                        % simulated values

Figure 7.14 reports a simulation of the optimal growth model we solved before.

Figure 7.14: Simulated OGM — panels: capital stock, output, consumption, investment
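The same three steps can be sketched in Python, using the transition matrix of the Matlab example above; `next` picks the first cumulated probability that exceeds the uniform draw:

```python
# Sketch of the three simulation steps: cumulate each row of PI, draw
# uniforms, and pick the first index whose cumulative probability
# reaches the draw.
import random

PI  = [[0.9, 0.1],
       [0.1, 0.9]]
cum = [[sum(row[:j + 1]) for j in range(len(row))] for row in PI]   # Pi^c

random.seed(1)
T, state = 1000, 0
chain = [state]
for _ in range(T - 1):
    p = random.random()
    state = next(j for j, c in enumerate(cum[state]) if p <= c)
    chain.append(state)

# with a symmetric chain, each state is visited roughly half the time
print(sum(chain) / T)
```

Because the rows of the cumulative matrix end at 1, the `next(...)` call always finds an index, so no special handling of the last column is needed.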

Matlab Code: Simulating the OGM

n      = 8;                            % Order of approximation+1
transk = 2*(kgrid-kmin)/(kmax-kmin)-1; % transform the data
%
% Computes approximation
%
Tk = [ones(nbk,1) transk];
for i=3:n;
   Tk = [Tk 2*transk.*Tk(:,i-1)-Tk(:,i-2)];
end
b = Tk\kp;
%
% Simulate the markov chain
%
long = 200;                     % length of simulation
[a,id] = markov(PI,A,long,1);
% markov is a function:
%  a contains the values for the shock
%  id the state index
k0   = 2;                       % initial value for the capital stock
k    = zeros(long+1,1);         % initialize
y    = zeros(long,1);
k(1) = k0;
for t=1:long;
   % Prepare the approximation for the capital stock
   trkt = 2*(k(t)-kmin)/(kmax-kmin)-1;
   Tk   = [1 trkt];
   for i=3:n;
      Tk = [Tk 2*trkt.*Tk(:,i-1)-Tk(:,i-2)];
   end
   k(t+1) = Tk*b(:,id(t));      % use the appropriate decision rule
   y(t)   = A(id(t))*k(t)^alpha;% computes output
end
i = k(2:long+1)-(1-delta)*k(1:long);  % computes investment
c = y-i;                              % computes consumption

Note: At this point, I assume that the model has already been solved, that the capital grid has been stored in kgrid, and that the decision rule for the next period capital stock is stored in kp.

7.5 Handling constraints

In this example, I will deal with the problem of a household that is constrained on her borrowing and who faces uncertainty on the labor market. This problem has been extensively studied by Ayiagari [1994]. We consider the case of a household who determines her consumption/saving plans in order to maximize her lifetime expected utility, which is characterized by the function⁶

E_t Σ_{τ=0}^∞ β^τ (c_{t+τ}^{1−σ} − 1)/(1 − σ)   with σ ∈ (0, 1) ∪ (1, ∞)   (7.9)

where 0 < β < 1 is a constant discount factor and c denotes the consumption bundle. In each and every period, the household enters the period with a level of assets a_t carried from the previous period, for which she receives a constant real interest rate r. She may also be employed and receive the aggregate real wage w, or may be unemployed, in which case she just receives unemployment benefits b, which are assumed to be a fraction of the real wage: b = γw, where γ is the replacement ratio. These revenues are then used to consume and purchase assets on the financial market. Therefore, the budget constraint in t is given by

a_{t+1} = (1 + r) a_t + ω_t − c_t   (7.10)

where

ω_t = w if the household is employed, γw otherwise   (7.11)

We assume that the household's state on the labor market is a random variable. Further, the probability of being (un)employed in period t is greater if the household was also (un)employed in period t − 1. Therefore, the state on the labor market, and therefore ω_t, can be modeled as a 2-state Markov chain with transition matrix Π. In addition, the household is submitted to a

⁶ E_t(.) denotes mathematical conditional expectations. Expectations are conditional on information available at the beginning of period t.

borrowing constraint, which states that she cannot borrow more than a threshold level a̲:

a_{t+1} ⩾ a̲   (7.12)

In this case, the Bellman equation writes

V(a_t, ω_t) = max_{c_t} u(c_t) + β E_t V(a_{t+1}, ω_{t+1})
s.t. (7.10)–(7.12)

Plugging the budget constraint into the utility function, the Bellman equation can be rewritten

V(a_t, ω_t) = max_{a_{t+1} ⩾ a̲} u((1 + r) a_t + ω_t − a_{t+1}) + β E_t V(a_{t+1}, ω_{t+1})
s.t. (7.11)–(7.12)

Positivity of consumption — which should be taken into account when searching the optimal value for a_{t+1} — finally imposes a_{t+1} ⩽ (1 + r) a_t + ω_t, such that the Bellman equation writes

V(a_t, ω_t) = max_{a̲ ⩽ a_{t+1} ⩽ (1+r) a_t + ω_t} u((1 + r) a_t + ω_t − a_{t+1}) + β E_t V(a_{t+1}, ω_{t+1})

This problem may be solved by one of the methods we have seen before, and implementing any of these methods taking the constraint into account is straightforward: when defining the grid for a, one just has to take care of the fact that the minimum admissible value for a is a̲. The matlab code borrow.m solves and simulates this problem.
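The point that the constraint reduces to a choice of grid can be made concrete in a few lines. The parameter values below are illustrative, not Ayiagari's calibration:

```python
# Sketch: imposing the borrowing constraint simply by building the
# asset grid on [abar, amax] and discarding choices with c <= 0
# (parameter values are illustrative).
abar, amax, N = 0.0, 10.0, 21
r, w, gamma = 0.02, 1.0, 0.5

grid = [abar + i * (amax - abar) / (N - 1) for i in range(N)]

def feasible(a, omega):
    """Next-period asset choices compatible with a' >= abar and c > 0."""
    return [ap for ap in grid if (1 + r) * a + omega - ap > 0]

# an employed household with no assets can still save a little;
# an unemployed one has fewer admissible choices
print(len(feasible(0.0, w)), len(feasible(0.0, gamma * w)))  # → 2 1
```

Since every grid point already satisfies a′ ⩾ a̲, the only choice-specific check left inside the maximization is positivity of consumption, exactly as in the unconstrained algorithms.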


Bibliography

Ayiagari, S.R., Uninsured Idiosyncratic Risk and Aggregate Saving, Quarterly Journal of Economics, 1994, 109 (3), 659–684.

Bertsekas, D., Dynamic Programming and Stochastic Control, New York: Academic Press, 1976.

Judd, K.L., Numerical Methods in Economics, Cambridge, Massachussets: MIT Press, 1998.

Judd, K.L. and A. Solnick, Numerical Dynamic Programming with Shape-Preserving Splines, Manuscript, Hoover Institution, 1994.

Lucas, R., N. Stokey and E. Prescott, Recursive Methods in Economic Dynamics, Cambridge (MA): Harvard University Press, 1989.

Tauchen, G. and R. Hussey, Quadrature Based Methods for Obtaining Approximate Solutions to Nonlinear Asset Pricing Models, Econometrica, 1991, 59 (2), 371–396.

Index

Bellman equation
Blackwell's sufficiency conditions
Cauchy sequence
Contraction mapping
Contraction mapping theorem
Discounting
Howard Improvement
Invariant distribution
Lagrange data
Markov chain
Metric space
Monotonicity
Policy functions
Policy iterations
Stationary distribution
Transition probabilities
Transition probability matrix
Value function
Value iteration

Contents

7 Dynamic Programming
7.1 The Bellman equation and associated theorems
  7.1.1 A heuristic derivation of the Bellman equation
  7.1.2 The Bellman equation
  7.1.3 Existence and uniqueness of a solution
7.2 Deterministic dynamic programming
  7.2.1 Value function iteration
  7.2.2 Taking advantage of interpolation
  7.2.3 Policy iterations: Howard Improvement
  7.2.4 Parametric dynamic programming
7.3 Stochastic dynamic programming
  7.3.1 The Bellman equation
  7.3.2 Discretization of the shocks
  7.3.3 Value iteration
  7.3.4 Policy iterations
7.4 How to simulate the model?
  7.4.1 The deterministic case
  7.4.2 The stochastic case
7.5 Handling constraints


List of Figures

7.1 Deterministic OGM (Value function, Value iteration)
7.2 Deterministic OGM (Decision rules, Value iteration)
7.3 Deterministic OGM (Value function, Value iteration with interpolation)
7.4 Deterministic OGM (Decision rules, Value iteration with interpolation)
7.5 Deterministic OGM (Value function, Policy iteration)
7.6 Deterministic OGM (Decision rules, Policy iteration)
7.7 Deterministic OGM (Value function, Parametric DP)
7.8 Deterministic OGM (Decision rules, Parametric DP)
7.9 Stochastic OGM (Value function, Value iteration)
7.10 Stochastic OGM (Decision rules, Value iteration)
7.11 Stochastic OGM (Value function, Policy iteration)
7.12 Stochastic OGM (Decision rules, Policy iteration)
7.13 Transitional dynamics (OGM)
7.14 Simulated OGM


List of Tables

7.1 Value function approximation