
Dynamic Programming

In these notes, we will deal with a fundamental tool of dynamic macroeconomics: dynamic programming. Dynamic programming is a very convenient way of writing a large set of dynamic problems in economic analysis, as most of the properties of this tool are now well established and understood.¹ In these lectures, we will not deal with all the theoretical properties attached to this tool, but will rather give some recipes for solving economic problems using dynamic programming. In order to understand the problem, we will first deal with deterministic models, before extending the analysis to stochastic ones. Beforehand, however, we shall give some preliminary definitions and theorems that justify the whole approach.

7.1 The Bellman equation and associated theorems

7.1.1 A heuristic derivation of the Bellman equation

Let us consider the case of an agent that has to decide on the path of a set of control variables, {y_t}_{t=0}^∞, in order to maximize the discounted sum of its future payoffs, u(y_t, x_t), where x_t is a state variable assumed to evolve according to

\[
x_{t+1} = h(x_t, y_t), \quad x_0 \text{ given.}
\]

We finally make the assumption that the model is Markovian.

¹For a mathematical exposition of the problem see Bertsekas [1976]; for a more economic approach see Stokey and Lucas (with Prescott) [1989].

The optimal value our agent can derive from this maximization process is given by the value function

\[
V(x_t) = \max_{\{y_{t+s} \in D(x_{t+s})\}_{s=0}^{\infty}} \sum_{s=0}^{\infty} \beta^s u(y_{t+s}, x_{t+s}) \tag{7.1}
\]

where D is the set of all feasible decisions for the variables of choice. Note that the value function is a function of the state variable only: since the model is Markovian, only the current state is needed to take decisions, such that the whole path can be predicted once the state variable is observed. Therefore, the value in t is only a function of x_t. (7.1) may now be rewritten as

\[
V(x_t) = \max_{y_t \in D(x_t),\, \{y_{t+s} \in D(x_{t+s})\}_{s=1}^{\infty}} u(y_t, x_t) + \sum_{s=1}^{\infty} \beta^s u(y_{t+s}, x_{t+s}) \tag{7.2}
\]

Making the change of variable k = s − 1, (7.2) rewrites as

\[
V(x_t) = \max_{y_t \in D(x_t),\, \{y_{t+1+k} \in D(x_{t+1+k})\}_{k=0}^{\infty}} u(y_t, x_t) + \sum_{k=0}^{\infty} \beta^{k+1} u(y_{t+1+k}, x_{t+1+k})
\]

or

\[
V(x_t) = \max_{y_t \in D(x_t)} u(y_t, x_t) + \beta \max_{\{y_{t+1+k} \in D(x_{t+1+k})\}_{k=0}^{\infty}} \sum_{k=0}^{\infty} \beta^{k} u(y_{t+1+k}, x_{t+1+k}) \tag{7.3}
\]

Note that, by definition, we have

\[
V(x_{t+1}) = \max_{\{y_{t+1+k} \in D(x_{t+1+k})\}_{k=0}^{\infty}} \sum_{k=0}^{\infty} \beta^{k} u(y_{t+1+k}, x_{t+1+k})
\]

such that (7.3) rewrites as

\[
V(x_t) = \max_{y_t \in D(x_t)} u(y_t, x_t) + \beta V(x_{t+1}) \tag{7.4}
\]

This is the so-called Bellman equation that lies at the core of dynamic programming theory. With this equation are associated, in each and every period t, a set of optimal policy functions for y and x, which are defined by

\[
\{y_t, x_{t+1}\} \in \operatorname*{Argmax}_{y \in D(x)}\; u(y, x) + \beta V(x_{t+1}) \tag{7.5}
\]

Our problem is now to solve (7.4) for the function V(x_t). This problem is particularly complicated, as we are not solving for just a point that would satisfy the equation; we are interested in finding a function that satisfies the equation. A simple procedure to find a solution would be the following:

1. Make an initial guess on the form of the value function V_0(x_t).

2. Update the guess using the Bellman equation, such that

\[
V_{i+1}(x_t) = \max_{y_t \in D(x_t)} u(y_t, x_t) + \beta V_i(h(x_t, y_t))
\]

3. If V_{i+1}(x_t) = V_i(x_t), then a fixed point has been found and the problem is solved; if not, we go back to 2 and iterate on the process until convergence.

In other words, solving the Bellman equation just amounts to finding the fixed point of the Bellman equation or, introducing an operator notation, finding the fixed point of the operator T, such that

\[
V_{i+1} = T V_i
\]

where T stands for the list of operations involved in the computation of the Bellman equation. The problem is then that of the existence and uniqueness of this fixed point. Luckily, mathematicians have provided conditions for the existence and uniqueness of a solution.
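The iteration V_{i+1} = T V_i can be sketched in a few lines. Here is a minimal Python illustration (the operator T(v) = 0.5v + 1 is a made-up contraction with modulus 0.5, not part of the notes), which stops once two successive iterates are close:

```python
def fixed_point(T, v0, tol=1e-10, max_iter=1000):
    """Iterate v_{i+1} = T(v_i) until successive iterates are close."""
    v = v0
    for _ in range(max_iter):
        v_next = T(v)
        if abs(v_next - v) < tol:
            return v_next
        v = v_next
    raise RuntimeError("no convergence")

# A contraction on R with modulus 0.5: T(v) = 0.5*v + 1, fixed point v = 2.
v_star = fixed_point(lambda v: 0.5 * v + 1.0, 0.0)
```

In the dynamic-programming application, v is a whole vector of values on a grid and T performs the maximization step, but the fixed-point logic is exactly this one.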

7.1.2 Existence and uniqueness of a solution

Definition 1 A metric space is a set S, together with a metric ρ : S × S −→ R_+, such that for all x, y, z ∈ S:

1. ρ(x, y) ≥ 0, with ρ(x, y) = 0 if and only if x = y,

2. ρ(x, y) = ρ(y, x),

3. ρ(x, z) ≤ ρ(x, y) + ρ(y, z).

Definition 2 A sequence {x_n}_{n=0}^∞ in S converges to x ∈ S if for each ε > 0 there exists an integer N_ε such that

ρ(x_n, x) < ε for all n ≥ N_ε.

Definition 3 A sequence {x_n}_{n=0}^∞ in S is a Cauchy sequence if for each ε > 0 there exists an integer N_ε such that

ρ(x_n, x_m) < ε for all n, m ≥ N_ε.

Definition 4 A metric space (S, ρ) is complete if every Cauchy sequence in S converges to a point in S.

Definition 5 Let (S, ρ) be a metric space and T : S −→ S be a function mapping S into itself. T is a contraction mapping (with modulus β) if for some β ∈ (0, 1), ρ(Tx, Ty) ≤ βρ(x, y) for all x, y ∈ S.

We then have the following remarkable theorem that establishes the existence and uniqueness of the fixed point of a contraction mapping.

Theorem 1 (Contraction Mapping Theorem) If (S, ρ) is a complete metric space and T : S −→ S is a contraction mapping with modulus β ∈ (0, 1), then

1. T has exactly one fixed point V ∈ S such that V = TV,

2. for any V_0 ∈ S, ρ(T^n V_0, V) ≤ β^n ρ(V_0, V), with n = 0, 1, 2, . . .

Since we are endowed with all the tools we need to prove the theorem, we shall do it.

Proof: In order to prove 1., we shall first prove that if we select any sequence {V_n}_{n=0}^∞ such that for each n, V_n ∈ S and

V_{n+1} = T V_n

this sequence converges, and that it converges to V ∈ S. In order to show convergence of {V_n}_{n=0}^∞, we shall prove that {V_n}_{n=0}^∞ is a Cauchy sequence. First of all, note that the contraction property of T implies that

ρ(V_2, V_1) = ρ(TV_1, TV_0) ≤ βρ(V_1, V_0)

and therefore

ρ(V_{n+1}, V_n) = ρ(TV_n, TV_{n−1}) ≤ βρ(V_n, V_{n−1}) ≤ . . . ≤ β^n ρ(V_1, V_0)

Now consider two terms of the sequence, V_m and V_n, with m > n. The triangle inequality implies that

ρ(V_m, V_n) ≤ ρ(V_m, V_{m−1}) + ρ(V_{m−1}, V_{m−2}) + . . . + ρ(V_{n+1}, V_n)

therefore, making use of the previous result, we have

\[
\rho(V_m, V_n) \leq \left( \beta^{m-1} + \beta^{m-2} + \ldots + \beta^{n} \right) \rho(V_1, V_0) \leq \frac{\beta^{n}}{1-\beta}\, \rho(V_1, V_0)
\]

Since β ∈ (0, 1), β^n → 0 as n → ∞, so for each ε > 0 there exists N_ε ∈ N such that ρ(V_m, V_n) < ε for all m > n ≥ N_ε. Hence {V_n}_{n=0}^∞ is a Cauchy sequence and it therefore converges. Further, since we have assumed that S is complete, V_n converges to V ∈ S.

We now have to show that V = TV in order to complete the proof of the first part. Note that, for each ε > 0, the triangle inequality implies

ρ(V, TV) ≤ ρ(V, V_n) + ρ(V_n, TV)

But since {V_n}_{n=0}^∞ converges to V, and ρ(V_n, TV) = ρ(TV_{n−1}, TV) ≤ βρ(V_{n−1}, V), we have

ρ(V, TV) ≤ ρ(V, V_n) + ρ(V_n, TV) ≤ ε/2 + ε/2 = ε

for large enough n; therefore V = TV.

Hence, we have proven that T possesses a fixed point and have therefore established its existence. We now have to prove uniqueness. This can be obtained by contradiction. Suppose there exists another function, say W ∈ S, that satisfies W = TW. Then the definition of the fixed point implies

ρ(V, W) = ρ(TV, TW)

but the contraction property implies

ρ(V, W) = ρ(TV, TW) ≤ βρ(V, W)

which, as β < 1, implies ρ(V, W) = 0 and so V = W. The fixed point is then unique.

Proving 2. is straightforward, as

ρ(T^n V_0, V) = ρ(T^n V_0, TV) ≤ βρ(T^{n−1} V_0, V)

but we have ρ(T^{n−1} V_0, V) = ρ(T^{n−1} V_0, TV), such that

ρ(T^n V_0, V) = ρ(T^n V_0, TV) ≤ βρ(T^{n−1} V_0, V) ≤ β^2 ρ(T^{n−2} V_0, V) ≤ . . . ≤ β^n ρ(V_0, V)

which completes the proof. □
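Both parts of the theorem are easy to check numerically on a toy example. The sketch below (a hypothetical affine operator on R² under the sup metric; the numbers are my own, not from the text) iterates a contraction and verifies the geometric bound of part 2:

```python
# Hypothetical contraction on R^2 under the sup metric, modulus beta = 0.6:
# (Tv)_j = beta * v_j + b_j, with fixed point v_j = b_j / (1 - beta).
beta = 0.6
b = [1.0, -2.0]

def T(v):
    return [beta * x + bj for x, bj in zip(v, b)]

def rho(v, w):
    # sup (uniform) metric
    return max(abs(x - y) for x, y in zip(v, w))

v_star = [bj / (1.0 - beta) for bj in b]   # exact fixed point: [2.5, -5.0]
v = [10.0, 10.0]                           # arbitrary V0
rho0 = rho(v, v_star)
bounds_hold = True
for n in range(1, 30):
    v = T(v)
    # part 2 of the theorem: rho(T^n V0, V) <= beta^n * rho(V0, V)
    if rho(v, v_star) > beta ** n * rho0 + 1e-12:
        bounds_hold = False
```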

This theorem is of great importance, as it establishes that any operator that possesses the contraction property will exhibit a unique fixed point, which therefore provides some rationale for the algorithm we were designing in the previous section. It also insures that whatever the initial condition for this algorithm, if the operator satisfies a contraction property, simple iterations will deliver the solution. It therefore remains to provide conditions for the Bellman operator to be a contraction. These are provided by the following theorem.

Theorem 2 (Blackwell's Sufficiency Conditions) Let X ⊆ R^ℓ and B(X) be the space of bounded functions V : X −→ R with the uniform metric. Let T : B(X) −→ B(X) be an operator satisfying

1. (Monotonicity) Let V, W ∈ B(X); if V(x) ≤ W(x) for all x ∈ X, then TV(x) ≤ TW(x);

2. (Discounting) There exists some constant β ∈ (0, 1) such that for all V ∈ B(X) and a ≥ 0, we have

T(V + a) ≤ TV + βa

Then T is a contraction with modulus β.

Proof: Let us consider two functions V, W ∈ B(X) satisfying 1. and 2., and note that

V ≤ W + ρ(V, W)

Monotonicity first implies that

TV ≤ T(W + ρ(V, W))

and discounting

TV ≤ TW + βρ(V, W)

since ρ(V, W) ≥ 0 plays the same role as a. We therefore get

TV − TW ≤ βρ(V, W)

Likewise, since W ≤ V + ρ(V, W), we end up with

TW − TV ≤ βρ(V, W)

Consequently, we have

|TV − TW| ≤ βρ(V, W)

so that

ρ(TV, TW) ≤ βρ(V, W)

which defines a contraction. This completes the proof. □

This theorem is extremely useful, as it gives us simple tools to check whether a problem is a contraction, and it therefore permits us to check whether the simple algorithm we defined earlier is appropriate for the problem at hand.

As an example, let us consider the optimal growth model, for which the Bellman equation writes

\[
V(k_t) = \max_{c_t \in C} u(c_t) + \beta V(k_{t+1})
\]

with k_{t+1} = F(k_t) − c_t. In order to save on notation, let us drop the time subscript and denote the next-period capital stock by k′, such that the Bellman equation rewrites, plugging the law of motion of capital into the utility function,

\[
V(k) = \max_{k' \in K} u(F(k) - k') + \beta V(k')
\]

Let us now define the operator T as

\[
(TV)(k) = \max_{k' \in K} u(F(k) - k') + \beta V(k')
\]

We would like to know whether T is a contraction and therefore whether there exists a unique function V such that

V(k) = (TV)(k)

In order to achieve this task, we just have to check whether T is monotonic and satisfies the discounting property.

1. Monotonicity: Let us consider two candidate value functions, V and W, such that V(k) ≤ W(k) for all k ∈ K. What we want to show is that (TV)(k) ≤ (TW)(k). In order to do that, let us denote by k̄ the optimal next-period capital stock, that is,

\[
(TV)(k) = u(F(k) - \bar{k}) + \beta V(\bar{k})
\]

But now, since V(k) ≤ W(k) for all k ∈ K, we have V(k̄) ≤ W(k̄), such that it should be clear that

\[
(TV)(k) \leq u(F(k) - \bar{k}) + \beta W(\bar{k}) \leq \max_{k' \in K} u(F(k) - k') + \beta W(k') = (TW)(k)
\]

Hence we have shown that V(k) ≤ W(k) implies (TV)(k) ≤ (TW)(k), and have therefore established monotonicity.

2. Discounting: Let us consider a candidate value function, V, and a positive constant a:

\[
(T(V + a))(k) = \max_{k' \in K} u(F(k) - k') + \beta (V(k') + a)
= \max_{k' \in K} u(F(k) - k') + \beta V(k') + \beta a
= (TV)(k) + \beta a
\]

Therefore, the Bellman equation satisfies discounting in the case of the optimal growth model.

Hence, the optimal growth model satisfies Blackwell's sufficient conditions for a contraction mapping, and therefore the value function exists and is unique. We are now in a position to design a numerical algorithm to solve the Bellman equation.
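Both properties can also be verified numerically on a discretized version of the operator T. The sketch below (coarse grid, candidate functions V and W, and the constant a are my own illustrative choices) checks monotonicity and discounting on a small capital grid with the parameterization used later in these notes:

```python
# Numerically check Blackwell's conditions for the discretized Bellman
# operator of the optimal growth model (illustrative sketch).
alpha, beta, delta, sigma = 0.30, 0.95, 0.10, 1.5
grid = [0.5 + 0.5 * i for i in range(8)]      # coarse capital grid

def u(c):
    return (c ** (1 - sigma) - 1) / (1 - sigma)

def T(v):
    Tv = []
    for k in grid:
        y = k ** alpha + (1 - delta) * k
        # feasible next-period stocks keep consumption strictly positive
        vals = [u(y - kp) + beta * v[h] for h, kp in enumerate(grid) if kp < y]
        Tv.append(max(vals))
    return Tv

V = [0.0] * len(grid)
W = [1.0 + 0.1 * i for i in range(len(grid))]   # W >= V everywhere
TV, TW = T(V), T(W)
monotone = all(tv <= tw for tv, tw in zip(TV, TW))

a = 2.0
TVa = T([v + a for v in V])
discounting = all(abs(x - (y + beta * a)) < 1e-12 for x, y in zip(TVa, TV))
```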

7.2 Deterministic dynamic programming

7.2.1 Value function iteration

The contraction mapping theorem gives us a straightforward way to compute the solution to the Bellman equation: iterate on the operator T, such that V_{i+1} = TV_i, up to the point where the distance between two successive value functions is small enough. Basically, this amounts to applying the following algorithm:

1. Decide on a grid, X, of admissible values for the state variable x,

X = {x_1, . . . , x_N}

formulate an initial guess for the value function V_0(x), and choose a stopping criterion ε > 0.

2. For each x_ℓ ∈ X, ℓ = 1, . . . , N, compute

\[
V_{i+1}(x_\ell) = \max_{x' \in X} u(y(x_\ell, x'), x_\ell) + \beta V_i(x')
\]

where y(x_ℓ, x′) denotes the control that moves the state from x_ℓ to x′ through the law of motion h.

3. If ‖V_{i+1}(x) − V_i(x)‖ < ε, go to the next step; else go back to 2.

4. Compute the final solution as

y*(x) = y(x, x′*)

and

V*(x) = u(y*(x), x) / (1 − β)
In order to better understand the algorithm, let us consider a simple example and go back to the optimal growth model, with

\[
u(c) = \frac{c^{1-\sigma} - 1}{1 - \sigma}
\]

and

\[
k' = k^{\alpha} - c + (1 - \delta) k
\]

Then the Bellman equation writes

\[
V(k) = \max_{0 \leq c \leq k^{\alpha} + (1-\delta)k} \frac{c^{1-\sigma} - 1}{1 - \sigma} + \beta V(k')
\]

From the law of motion of capital we can determine consumption as

\[
c = k^{\alpha} + (1 - \delta)k - k'
\]

such that, plugging this result into the Bellman equation, we have

\[
V(k) = \max_{0 \leq k' \leq k^{\alpha} + (1-\delta)k} \frac{\left(k^{\alpha} + (1-\delta)k - k'\right)^{1-\sigma} - 1}{1 - \sigma} + \beta V(k')
\]

Now, let us define a grid of N feasible values for k, such that we have

K = {k_1, . . . , k_N}

and an initial value function V_0(k) — that is, a vector of N numbers that relate each k_ℓ to a value. Note that this may be anything we want, as we know — by the contraction mapping theorem — that the algorithm will converge; but if we want it to converge fast enough, it may be a good idea to impose a good initial guess. Finally, we need a stopping criterion.

Then, for each ℓ = 1, . . . , N, we compute the feasible values that can be taken by the quantity on the right-hand side of the value function,

\[
V_{\ell,h} \equiv \frac{\left(k_\ell^{\alpha} + (1-\delta)k_\ell - k_h\right)^{1-\sigma} - 1}{1 - \sigma} + \beta V_i(k_h) \quad \text{for } h \text{ feasible}
\]

It is important to understand what "h feasible" means. Indeed, we only compute consumption when it is positive and smaller than total output, which restricts the number of possible values for k′. Namely, we want k′ to satisfy

\[
0 \leq k' \leq k_\ell^{\alpha} + (1-\delta)k_\ell
\]

which puts a lower and an upper bound on the index h. When the grid of values is uniform — that is, when k_h = k̲ + (h − 1)d_k, where d_k is the increment in the grid — the upper bound can be computed as

\[
\overline{h}_\ell = E\!\left[ \frac{k_\ell^{\alpha} + (1-\delta)k_\ell - \underline{k}}{d_k} \right] + 1
\]

where E[·] denotes the integer part.

Then we find

\[
V^{\star}_{\ell} = \max_{h=1,\ldots,\overline{h}_\ell} V_{\ell,h}
\]

and set

V_{i+1}(k_ℓ) = V*_ℓ

and keep in memory the index h*_ℓ = Argmax_{h=1,...,h̄_ℓ} V_{ℓ,h}, such that we have

k′(k_ℓ) = k_{h*_ℓ}

In figures 7.1 and 7.2, we report the value function and the decision rules obtained from the deterministic optimal growth model with α = 0.3, β = 0.95, δ = 0.1 and σ = 1.5. The grid for the capital stock is composed of 1000 data points ranging from (1 − ∆_k)k* to (1 + ∆_k)k*, where k* denotes the steady state and ∆_k = 0.9. The algorithm² then converges in 211 iterations and 110.5 seconds on a 800MHz computer when the stopping criterion is ε = 1e−6.

Figure 7.1: Deterministic OGM (Value function, Value iteration)
[figure: value function plotted against k_t]

²This code and those that follow are not efficient from a computational point of view, as they are intended to have you understand the method without adding any coding complications. Much faster implementations can be found in the accompanying Matlab codes.

Figure 7.2: Deterministic OGM (Decision rules, Value iteration)
[figure: next-period capital stock and consumption plotted against k_t]

Matlab Code: Value Function Iteration

sigma = 1.5; % utility parameter
delta = 0.1; % depreciation rate
beta = 0.95; % discount factor
alpha = 0.30; % capital elasticity of output
nbk = 1000; % number of data points in the grid
crit = 1; % convergence criterion
epsi = 1e-6; % convergence parameter
ks = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev = 0.9; % maximal deviation from steady state
kmin = (1-dev)*ks; % lower bound on the grid
kmax = (1+dev)*ks; % upper bound on the grid
dk = (kmax-kmin)/(nbk-1); % implied increment
kgrid = linspace(kmin,kmax,nbk)'; % builds the grid
v = zeros(nbk,1); % value function
tv = zeros(nbk,1); % updated value function
dr = zeros(nbk,1); % decision rule (will contain indices)
while crit>epsi;
  for i=1:nbk
    %
    % compute indices for which consumption is positive
    %
    tmp = (kgrid(i)^alpha+(1-delta)*kgrid(i)-kmin);
    imax = min(floor(tmp/dk)+1,nbk);
    %
    % consumption and utility
    %
    c = kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid(1:imax);
    util = (c.^(1-sigma)-1)/(1-sigma);
    %
    % find value function
    %
    [tv(i),dr(i)] = max(util+beta*v(1:imax));
  end;
  crit = max(abs(tv-v)); % compute convergence criterion
  v = tv; % update the value function
end
%
% Final solution
%
kp = kgrid(dr);
c = kgrid.^alpha+(1-delta)*kgrid-kp;
util = (c.^(1-sigma)-1)/(1-sigma);
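For readers who prefer another language, the same grid-search algorithm can be transcribed almost line by line. Here is a pure-Python sketch on a deliberately smaller grid (50 points instead of 1000) so that it runs in seconds; the Matlab code above remains the reference implementation:

```python
sigma, delta, beta, alpha = 1.5, 0.1, 0.95, 0.30
nbk, epsi = 50, 1e-6                       # small grid so the sketch runs fast
ks = ((1 - beta * (1 - delta)) / (alpha * beta)) ** (1 / (alpha - 1))
dev = 0.9
kmin, kmax = (1 - dev) * ks, (1 + dev) * ks
dk = (kmax - kmin) / (nbk - 1)             # implied increment
kgrid = [kmin + i * dk for i in range(nbk)]

def util(c):
    return (c ** (1 - sigma) - 1) / (1 - sigma)

v = [0.0] * nbk                            # initial guess for the value function
dr = [0] * nbk                             # decision rule (grid indices)
crit = 1.0
while crit > epsi:
    tv = [0.0] * nbk
    for i, k in enumerate(kgrid):
        y = k ** alpha + (1 - delta) * k   # resources available in state k
        best, best_h = -1e18, 0
        for h in range(nbk):
            c = y - kgrid[h]
            if c <= 0:                     # grid increasing: further k' infeasible
                break
            val = util(c) + beta * v[h]
            if val > best:
                best, best_h = val, h
        tv[i], dr[i] = best, best_h
    crit = max(abs(a - b) for a, b in zip(tv, v))
    v = tv

kp = [kgrid[h] for h in dr]                # decision rule k' = g(k)
```

As in the Matlab version, the decision rule is stored as grid indices, and the inner loop only scans the feasible part of the grid.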

7.2.2 Taking advantage of interpolation

A possible improvement of the method is to use a much looser grid on the state variable (the capital stock) but a pretty fine grid on the control variable (consumption in the optimal growth model). Then the next-period value of the state variable can be computed much more precisely. However, because of this precision and the fact that the grid is coarser, it may be the case that the computed optimal value for the next-period state variable does not lie on the grid, such that the value function is unknown at this particular value. Therefore, we use an interpolation scheme to get an approximation of the value function at this value. One advantage of this approach is that it involves fewer function evaluations and is usually less costly in terms of CPU time. The algorithm is then as follows:

1. Decide on a grid, X, of admissible values for the state variable x,

X = {x_1, . . . , x_N}

Decide on a grid, Y, of admissible values for the control variable y,

Y = {y_1, . . . , y_M} with M ≫ N

formulate an initial guess for the value function V_0(x), and choose a stopping criterion ε > 0.

2. For each x_ℓ ∈ X, ℓ = 1, . . . , N, compute

x′_{ℓ,j} = h(y_j, x_ℓ) ∀j = 1, . . . , M

Compute an interpolated value function V̄_i(x′_{ℓ,j}) at each x′_{ℓ,j} = h(y_j, x_ℓ), and then

\[
V_{i+1}(x_\ell) = \max_{y_j \in Y} u(y_j, x_\ell) + \beta \bar{V}_i(x'_{\ell,j})
\]

3. If ‖V_{i+1}(x) − V_i(x)‖ < ε, go to the next step; else go back to 2.

4. Compute the final solution as

V*(x) = u(y*, x) / (1 − β)

I now report the Matlab code for this approach, using cubic spline interpolation for the value function, 20 nodes for the capital stock and 1000 nodes for consumption. The algorithm converges in 182 iterations and 40.6 seconds, starting from the initial condition for the value function

\[
v_0(k) = \frac{\left((c^{\star}/y^{\star})\, k^{\alpha}\right)^{1-\sigma} - 1}{1 - \sigma}
\]

Matlab Code: Value Function Iteration with Interpolation

sigma = 1.5; % utility parameter
delta = 0.1; % depreciation rate
beta = 0.95; % discount factor
alpha = 0.30; % capital elasticity of output
nbk = 20; % number of data points in the K grid
nbc = 1000; % number of data points in the C grid
crit = 1; % convergence criterion
epsi = 1e-6; % convergence parameter
ks = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev = 0.9; % maximal deviation from steady state
kmin = (1-dev)*ks; % lower bound on the grid
kmax = (1+dev)*ks; % upper bound on the grid
kgrid = linspace(kmin,kmax,nbk)'; % builds the K grid
cmin = 0.01; % lower bound on the grid
cmax = kmax^alpha; % upper bound on the grid
c = linspace(cmin,cmax,nbc)'; % builds the C grid
v = zeros(nbk,1); % value function
tv = zeros(nbk,1); % updated value function
dr = zeros(nbk,1); % decision rule (will contain indices)
util = (c.^(1-sigma)-1)/(1-sigma);
while crit>epsi;
  for i=1:nbk;
    kp = kgrid(i).^alpha+(1-delta)*kgrid(i)-c;
    vi = interp1(kgrid,v,kp,'spline');
    [tv(i),dr(i)] = max(util+beta*vi);
  end
  crit = max(abs(tv-v)); % compute convergence criterion
  v = tv; % update the value function
end
%
% Final solution
%
copt = c(dr); % optimal consumption (dr indexes the C grid)
kp = kgrid.^alpha+(1-delta)*kgrid-copt;
util = (copt.^(1-sigma)-1)/(1-sigma);
v = util/(1-beta);
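The same idea can be sketched in pure Python. The snippet below is my own illustrative sketch: it uses piecewise-linear interpolation instead of the cubic spline above, flat extrapolation outside the capital grid, and a small consumption grid, so the numbers differ slightly from the Matlab run:

```python
sigma, delta, beta, alpha = 1.5, 0.1, 0.95, 0.30
nbk, nbc, epsi = 20, 100, 1e-5
ks = ((1 - beta * (1 - delta)) / (alpha * beta)) ** (1 / (alpha - 1))
kmin, kmax = 0.1 * ks, 1.9 * ks
kgrid = [kmin + i * (kmax - kmin) / (nbk - 1) for i in range(nbk)]
cmin, cmax = 0.01, kmax ** alpha
cgrid = [cmin + j * (cmax - cmin) / (nbc - 1) for j in range(nbc)]

def util(c):
    return (c ** (1 - sigma) - 1) / (1 - sigma)

def interp(x, xs, ys):
    """Piecewise-linear interpolation with flat extrapolation off the grid."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    lo, hi = 0, len(xs) - 1
    while hi - lo > 1:                 # bisection: xs is sorted
        mid = (lo + hi) // 2
        if xs[mid] <= x:
            lo = mid
        else:
            hi = mid
    w = (x - xs[lo]) / (xs[hi] - xs[lo])
    return (1 - w) * ys[lo] + w * ys[hi]

v = [0.0] * nbk
crit = 1.0
while crit > epsi:
    tv = []
    for k in kgrid:
        y = k ** alpha + (1 - delta) * k
        best = -1e18
        for c in cgrid:
            kp = y - c
            if kp < kmin:              # cgrid increasing: further c infeasible
                break
            best = max(best, util(c) + beta * interp(kp, kgrid, v))
        tv.append(best)
    crit = max(abs(a - b) for a, b in zip(tv, v))
    v = tv
```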

Figure 7.3: Deterministic OGM (Value function, Value iteration with interpolation)
[figure: value function plotted against k_t]

7.2.3 Policy iterations: Howard Improvement

Figure 7.4: Deterministic OGM (Decision rules, Value iteration with interpolation)
[figure: next-period capital stock and consumption plotted against k_t]

The simple value iteration algorithm has the attractive feature of being particularly simple to implement. However, it is a slow procedure, especially for infinite horizon problems, since it can be shown that this procedure converges at the rate β, which is usually close to 1! Further, it computes unnecessary quantities during the algorithm, which slows down convergence. Often, computation speed is really important, for instance when one wants to perform a sensitivity analysis of the results obtained in a model using different parameterizations. Hence, we would like to be able to speed up convergence. This can be achieved by relying on Howard's improvement method. This method actually iterates on policy functions rather than on the value function. The algorithm may be described as follows:

1. Set an initial feasible decision rule for the control variable, y = f_0(x), and compute the value associated with this guess, assuming that this rule is operative forever:

\[
V(x_t) = \sum_{s=0}^{\infty} \beta^{s} u(f_i(x_{t+s}), x_{t+s})
\]

taking care of the fact that x_{t+1} = h(x_t, y_t) = h(x_t, f_i(x_t)), with i = 0. Set a stopping criterion ε > 0.

2. Find a new policy rule, y = f_{i+1}(x), such that

\[
f_{i+1}(x) \in \operatorname*{Argmax}_{y}\; u(y, x) + \beta V(x')
\]

with x′ = h(x, f_i(x)).

3. Check if ‖f_{i+1}(x) − f_i(x)‖ < ε; if yes then stop, else go back to 2.

Note that this method differs fundamentally from the value iteration algorithm in at least two dimensions:

i) one iterates on the policy function rather than on the value function;

ii) the decision rule is used forever, whereas it is assumed to be used for only two consecutive periods in the value iteration algorithm. It is precisely this last feature that accelerates convergence.

Note that when computing the value function we actually have to solve a linear system of the form

\[
V_{i+1}(x_\ell) = u(f_{i+1}(x_\ell), x_\ell) + \beta V_{i+1}(h(x_\ell, f_{i+1}(x_\ell))) \quad \forall x_\ell \in X
\]

for V_{i+1}(x_ℓ), which may be rewritten as

\[
V_{i+1}(x_\ell) = u(f_{i+1}(x_\ell), x_\ell) + \beta (Q V_{i+1})(x_\ell) \quad \forall x_\ell \in X
\]

where Q is an (N × N) matrix with

Q_{ℓj} = 1 if x_j = h(x_ℓ, f_{i+1}(x_ℓ)), and 0 otherwise.

Note that although it is a big matrix, Q is sparse (each row contains a single 1), which can be exploited in solving the system, to get

\[
V_{i+1}(x) = (I - \beta Q)^{-1} u(f_{i+1}(x), x)
\]
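A compact pure-Python sketch of the full Howard algorithm may look as follows. The grid size, the identity initial policy, and the naive dense linear solver are my own illustrative choices; the sparse solve described in the text is the efficient route for large grids:

```python
alpha, beta, delta, sigma = 0.30, 0.95, 0.10, 1.5
nbk = 30
ks = ((1 - beta * (1 - delta)) / (alpha * beta)) ** (1 / (alpha - 1))
kgrid = [0.1 * ks + i * (1.8 * ks) / (nbk - 1) for i in range(nbk)]

def util(c):
    return (c ** (1 - sigma) - 1) / (1 - sigma)

def solve_linear(A, b):
    """Naive Gaussian elimination with partial pivoting (fine for small N)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for cc in range(col, n + 1):
                M[r][cc] -= f * M[col][cc]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][cc] * x[cc] for cc in range(r + 1, n))) / M[r][r]
    return x

dr = list(range(nbk))          # initial policy: keep the capital stock unchanged
for it in range(100):
    # policy evaluation: solve (I - beta*Q)V = u exactly; Q has one 1 per row
    u_pol = [util(k ** alpha + (1 - delta) * k - kgrid[dr[i]])
             for i, k in enumerate(kgrid)]
    A = [[(1.0 if i == j else 0.0) - beta * (1.0 if j == dr[i] else 0.0)
          for j in range(nbk)] for i in range(nbk)]
    v = solve_linear(A, u_pol)
    # policy improvement: greedy policy against the evaluated V
    new_dr = []
    for i, k in enumerate(kgrid):
        y = k ** alpha + (1 - delta) * k
        vals = [(util(y - kp) + beta * v[h], h)
                for h, kp in enumerate(kgrid) if y - kp > 0]
        new_dr.append(max(vals)[1])
    if new_dr == dr:           # policy unchanged: Howard iteration has converged
        break
    dr = new_dr

kp = [kgrid[h] for h in dr]
```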

We apply this algorithm to the same optimal growth model as in the previous section and report the value function and the decision rules at convergence in figures 7.5 and 7.6. The algorithm converges in only 18 iterations and 9.8 seconds, starting from the same initial guess and using the same parameterization!

Matlab Code: Policy Iteration

sigma = 1.50; % utility parameter
delta = 0.10; % depreciation rate
beta = 0.95; % discount factor
alpha = 0.30; % capital elasticity of output
nbk = 1000; % number of data points in the grid
crit = 1; % convergence criterion
epsi = 1e-6; % convergence parameter
ks = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev = 0.9; % maximal deviation from steady state
kmin = (1-dev)*ks; % lower bound on the grid
kmax = (1+dev)*ks; % upper bound on the grid
dk = (kmax-kmin)/(nbk-1); % implied increment
kgrid = linspace(kmin,kmax,nbk)'; % builds the grid
v = zeros(nbk,1); % value function
kp0 = kgrid; % initial guess on k(t+1)
dr = zeros(nbk,1); % decision rule (will contain indices)
%
% Main loop
%
while crit>epsi;
  for i=1:nbk
    %
    % compute indices for which consumption is positive
    %
    imax = min(floor((kgrid(i)^alpha+(1-delta)*kgrid(i)-kmin)/dk)+1,nbk);
    %
    % consumption and utility
    %
    c = kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid(1:imax);
    util = (c.^(1-sigma)-1)/(1-sigma);
    %
    % find new policy rule
    %
    [v1,dr(i)] = max(util+beta*v(1:imax));
  end;
  %
  % decision rules
  %
  kp = kgrid(dr);
  c = kgrid.^alpha+(1-delta)*kgrid-kp;
  %
  % update the value
  %
  util = (c.^(1-sigma)-1)/(1-sigma);
  Q = sparse(nbk,nbk);
  for i=1:nbk;
    Q(i,dr(i)) = 1;
  end
  Tv = (speye(nbk)-beta*Q)\util; % solve (I-beta*Q)V = u for V
  crit = max(abs(kp-kp0));
  v = Tv;
  kp0 = kp;
end

Figure 7.5: Deterministic OGM (Value function, Policy iteration)
[figure: value function plotted against k_t]

As experimented in the particular example of the optimal growth model, the policy iteration algorithm only requires a few iterations. Unfortunately, we have to solve the linear system

\[
(I - \beta Q) V_{i+1} = u(y, x)
\]

which may be particularly costly when the number of grid points is important. Therefore, a number of researchers have proposed to replace the matrix inversion by an additional iteration step, leading to the so-called modified policy iteration with k steps, which replaces the linear problem by the following iteration scheme:

1. Set J_0 = V_i.

2. Iterate k times on

\[
J_{j+1} = u(y, x) + \beta Q J_j, \quad j = 0, \ldots, k-1
\]

3. Set V_{i+1} = J_k.

When k −→ ∞, J_k tends toward the solution of the linear system.

Figure 7.6: Deterministic OGM (Decision rules, Policy iteration)
[figure: next-period capital stock and consumption plotted against k_t]
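A tiny sketch of the k-step evaluation, on a made-up three-state problem (the states, payoffs u and policy map g are hypothetical): iterating J ← u + βQJ many times approaches the exact policy value (I − βQ)^{-1}u.

```python
beta = 0.95

def evaluate_policy_k_steps(u, g, v0, k):
    """Approximate policy evaluation: iterate J <- u + beta * J(g(.)) k times.
    Q is never formed: since each row of Q has a single 1, (QJ)(i) = J(g(i))."""
    J = v0[:]
    for _ in range(k):
        J = [u[i] + beta * J[g[i]] for i in range(len(u))]
    return J

# Hypothetical fixed policy: every state maps to state 0.
u = [1.0, 0.5, 0.0]
g = [0, 0, 0]
# Exact values: V(0) = 1/(1-beta) = 20, V(1) = 0.5 + 20*beta = 19.5, V(2) = 19.
J = evaluate_policy_k_steps(u, g, [0.0, 0.0, 0.0], 500)
```

With k = 500 the remaining error is of order β^k, which is negligible here; in practice a much smaller k already gives a good enough evaluation step.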

7.2.4 Parametric dynamic programming

This last technique I will describe borrows from approximation theory, using either orthogonal polynomials or spline functions. The idea is actually to make a guess for the functional form of the value function and iterate on the parameters of this functional form. The algorithm then works as follows:

1. Choose a functional form for the value function V̄(x; Θ), a grid of interpolating nodes X = {x_1, . . . , x_N}, a stopping criterion ε > 0 and an initial vector of parameters Θ_0.

2. Using the conjecture for the value function, perform the maximization step in the Bellman equation, that is, compute w_ℓ = T(V̄(x_ℓ, Θ_i)):

\[
w_\ell = T(\bar{V}(x_\ell, \Theta_i)) = \max_{y}\; u(y, x_\ell) + \beta \bar{V}(x', \Theta_i)
\quad \text{s.t. } x' = h(y, x_\ell), \text{ for } \ell = 1, \ldots, N
\]

3. Using the approximation method you have chosen, compute a new vector of parameters Θ_{i+1} such that V̄(x, Θ_{i+1}) approximates the data (x_ℓ, w_ℓ).

4. If ‖V̄(x, Θ_{i+1}) − V̄(x, Θ_i)‖ < ε, then stop; else go back to 2.

First note that for this method to be implementable, we need the payoff function and the value function to be continuous. The approximation function may be a combination of polynomials, neural networks or splines. Note that during the optimization problem we may have to rely on a numerical maximization algorithm, and the approximation method may involve numerical minimization in order to solve a non-linear least-squares problem of the form

\[
\Theta_{i+1} \in \operatorname*{Argmin}_{\Theta} \sum_{\ell=1}^{N} \left( w_\ell - \bar{V}(x_\ell; \Theta) \right)^2
\]
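Step 3 is just a least-squares fit. With Chebychev polynomials evaluated at Chebychev nodes (the node choice used in the Matlab code below), the basis is discretely orthogonal, so the normal equations are diagonal and the fit has a closed form. Here is a small Python sketch, fitting exp as a hypothetical stand-in for the data (x_ℓ, w_ℓ):

```python
import math

def cheb_basis(x, p):
    """Chebychev polynomials T_0..T_p at x (recurrence T_{i+1} = 2x T_i - T_{i-1})."""
    T = [1.0, x]
    for _ in range(2, p + 1):
        T.append(2 * x * T[-1] - T[-2])
    return T[: p + 1]

def cheb_fit(f, p, m):
    """Least-squares Chebychev coefficients of f on [-1,1] using m > p nodes.

    At the nodes x_k = -cos((2k-1)pi/(2m)) the basis is discretely orthogonal,
    so each coefficient solves independently: theta_j = <f, T_j> / <T_j, T_j>.
    """
    nodes = [-math.cos((2 * k - 1) * math.pi / (2 * m)) for k in range(1, m + 1)]
    theta = []
    for j in range(p + 1):
        num = sum(f(x) * cheb_basis(x, p)[j] for x in nodes)
        den = sum(cheb_basis(x, p)[j] ** 2 for x in nodes)
        theta.append(num / den)
    return theta

def cheb_eval(theta, x):
    return sum(t * T for t, T in zip(theta, cheb_basis(x, len(theta) - 1)))

# Degree-10 fit on 20 nodes, as in the parameterization used below.
theta = cheb_fit(math.exp, 10, 20)
```

The value function lives on [k̲, k̄] rather than [−1, 1], which is why the approximation below first maps k linearly into [−1, 1].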

This algorithm is usually much faster than value iteration, as it may not require iterating on a large grid. As an example, I will once again focus on the optimal growth problem we have been dealing with so far, and I will approximate the value function by

\[
\bar{V}(k; \Theta) = \sum_{i=0}^{p} \theta_i\, \varphi_i\!\left( 2\, \frac{k - \underline{k}}{\overline{k} - \underline{k}} - 1 \right)
\]

where {φ_i(·)}_{i=0}^p is a set of Chebychev polynomials. In the example, I set p = 10 and used 20 nodes. Figures 7.7 and 7.8 report the value function and the decision rules in this case, and table 7.1 reports the parameters of the approximation function. The algorithm converged in 242 iterations, but took much less time than value iteration.

Figure 7.7: Deterministic OGM (Value function, Parametric DP)
[figure: value function plotted against k_t]

Figure 7.8: Deterministic OGM (Decision rules, Parametric DP)
[figure: next-period capital stock and consumption plotted against k_t]

Table 7.1: Value function approximation

θ_0    0.82367
θ_1    2.78042
θ_2   -0.66012
θ_3    0.23704
θ_4   -0.10281
θ_5    0.05148
θ_6   -0.02601
θ_7    0.01126
θ_8   -0.00617
θ_9    0.00501
θ_10  -0.00281

Matlab Code: Parametric Dynamic Programming

sigma = 1.50; % utility parameter
delta = 0.10; % depreciation rate
beta = 0.95; % discount factor
alpha = 0.30; % capital elasticity of output
nbk = 20; % # of data points in the grid
p = 10; % order of polynomials
crit = 1; % convergence criterion
iter = 1; % iteration
epsi = 1e-6; % convergence parameter
ks = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev = 0.9; % maximal dev. from steady state
kmin = (1-dev)*ks; % lower bound on the grid
kmax = (1+dev)*ks; % upper bound on the grid
rk = -cos((2*[1:nbk]'-1)*pi/(2*nbk)); % interpolating (Chebychev) nodes
kgrid = kmin+(rk+1)*(kmax-kmin)/2; % mapping to [kmin,kmax]
%
% Initial guess for the approximation
%
v = (((kgrid.^alpha).^(1-sigma)-1)/((1-sigma)*(1-beta)));
X = chebychev(rk,p);
th0 = X\v;
Tv = zeros(nbk,1);
kp = zeros(nbk,1);
%
% Main loop
%
options = foptions;
options(14) = 1e9;
while crit>epsi;
  k0 = kgrid(1);
  for i=1:nbk
    param = [alpha beta delta sigma kmin kmax p kgrid(i)];
    kp(i) = fminu('tv',k0,options,[],param,th0);
    k0 = kp(i);
    Tv(i) = -tv(kp(i),param,th0);
  end;
  theta = X\Tv;
  crit = max(abs(Tv-v));
  v = Tv;
  th0 = theta;
  iter = iter+1;
end

Matlab Code: Extra Functions

function res=tv(kp,param,theta);
alpha = param(1);
beta = param(2);
delta = param(3);
sigma = param(4);
kmin = param(5);
kmax = param(6);
n = param(7);
k = param(8);
kp = sqrt(kp.^2); % insures positivity of k'
v = value(kp,[kmin kmax n],theta); % computes the value function
c = k.^alpha+(1-delta)*k-kp; % computes consumption
d = find(c<=0); % find negative consumption
c(d) = NaN; % get rid of negative c
util = (c.^(1-sigma)-1)/(1-sigma); % computes utility
util(d) = -1e12; % utility = low number for c<=0
res = -(util+beta*v); % compute -TV (we are minimizing)

function v = value(k,param,theta);
kmin = param(1);
kmax = param(2);
n = param(3);
k = 2*(k-kmin)/(kmax-kmin)-1; % map to [-1,1]
v = chebychev(k,n)*theta;

function Tx=chebychev(x,n);
X = x(:);
lx = size(X,1);
if n<0; error('n should be a positive integer'); end
switch n;
case 0;
  Tx = [ones(lx,1)];
case 1;
  Tx = [ones(lx,1) X];
otherwise
  Tx = [ones(lx,1) X];
  for i=3:n+1;
    Tx = [Tx 2*X.*Tx(:,i-1)-Tx(:,i-2)];
  end
end

A potential problem with the use of Chebychev polynomials is that they do not put any assumption on the shape of the value function, which we know to be concave and strictly increasing in this case. This is why Judd [1998] recommends using shape-preserving methods, such as Schumaker approximation, in this case. Judd and Solnick [1994] have successfully applied this latter technique to the optimal growth model and found that the approximation was very good and dominated a lot of other methods (they actually get the same precision with 12 nodes as that achieved with a 1200 data-point grid using a value iteration technique).

7.3 Stochastic dynamic programming

In a large number of problems, we have to deal with stochastic shocks — just think of a standard RBC model — and dynamic programming techniques can obviously be extended to deal with such problems. I will first show how we can obtain the Bellman equation before addressing some important issues concerning the discretization of shocks. Then I will describe the implementation of the value iteration and policy iteration techniques for the stochastic case.

7.3.1 The Bellman equation

The stochastic problem differs from the deterministic problem in that we now have a . . . stochastic shock! The problem then defines a value function which has as arguments the state variable x_t but also the stationary shock s_t, whose sequence {s_t}_{t=0}^{+∞} satisfies

\[
s_{t+1} = \Phi(s_t, \varepsilon_{t+1}) \tag{7.6}
\]

where ε is a white noise process. The value function is therefore given by

\[
V(x_t, s_t) = \max_{\{y_{t+\tau} \in D(x_{t+\tau}, s_{t+\tau})\}_{\tau=0}^{\infty}} E_t \sum_{\tau=0}^{\infty} \beta^{\tau} u(y_{t+\tau}, x_{t+\tau}, s_{t+\tau}) \tag{7.7}
\]

subject to (7.6) and

\[
x_{t+1} = h(x_t, y_t, s_t) \tag{7.8}
\]
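Once the shock process is discretized as a finite Markov chain (the discretization issue mentioned above), the conditional expectation E_t reduces to a probability-weighted sum. A minimal sketch with a hypothetical two-state chain (the transition probabilities and continuation values are made-up numbers):

```python
# Two-state Markov chain discretization of the shock s_t (hypothetical numbers):
# P[s][s'] = Prob(s_{t+1} = s' | s_t = s)
P = [[0.9, 0.1],
     [0.2, 0.8]]

def expected_value(V_next, s):
    """E[V(x', s_{t+1}) | s_t = s], for continuation values V_next[s'] at a given x'."""
    return sum(P[s][sp] * V_next[sp] for sp in range(len(P[s])))

# Continuation values at a fixed next-period state x' in the two shock states.
V_next = [1.0, 3.0]
ev_low = expected_value(V_next, 0)    # 0.9*1 + 0.1*3
ev_high = expected_value(V_next, 1)   # 0.2*1 + 0.8*3
```

In a stochastic value-iteration code, this sum simply replaces the term βV(k′) of the deterministic case, state by state.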

Since y_t, x_t and s_t are all either perfectly observed or decided in period t, they are known in t, such that we may rewrite the value function as

\[
V(x_t, s_t) = \max_{y_t \in D(x_t, s_t),\, \{y_{t+\tau} \in D(x_{t+\tau}, s_{t+\tau})\}_{\tau=1}^{\infty}} u(y_t, x_t, s_t) + E_t \sum_{\tau=1}^{\infty} \beta^{\tau} u(y_{t+\tau}, x_{t+\tau}, s_{t+\tau})
\]

or
$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \max_{\{y_{t+\tau} \in D(x_{t+\tau}, s_{t+\tau})\}_{\tau=1}^{\infty}} E_t \sum_{\tau=1}^{\infty} \beta^{\tau} u(y_{t+\tau}, x_{t+\tau}, s_{t+\tau})$$

Using the change of variable $k = \tau - 1$, this rewrites
$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \max_{\{y_{t+1+k} \in D(x_{t+1+k}, s_{t+1+k})\}_{k=0}^{\infty}} \beta E_t \sum_{k=0}^{\infty} \beta^{k} u(y_{t+1+k}, x_{t+1+k}, s_{t+1+k})$$

It is important at this point to recall that
$$E_t(X(\omega_{t+\tau})) = \int X(\omega_{t+\tau}) f(\omega_{t+\tau}|\omega_t)\, d\omega_{t+\tau}$$
$$= \int\!\!\int X(\omega_{t+\tau}) f(\omega_{t+\tau}|\omega_{t+\tau-1}) f(\omega_{t+\tau-1}|\omega_t)\, d\omega_{t+\tau}\, d\omega_{t+\tau-1}$$
$$= \int \ldots \int X(\omega_{t+\tau}) f(\omega_{t+\tau}|\omega_{t+\tau-1}) \ldots f(\omega_{t+1}|\omega_t)\, d\omega_{t+\tau} \ldots d\omega_{t+1}$$

which has as a corollary the law of iterated projections, such that the value function rewrites
$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \max_{\{y_{t+1+k} \in D(x_{t+1+k}, s_{t+1+k})\}_{k=0}^{\infty}} \beta E_t E_{t+1} \sum_{k=0}^{\infty} \beta^{k} u(y_{t+1+k}, x_{t+1+k}, s_{t+1+k})$$

or
$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \max_{\{y_{t+1+k} \in D(x_{t+1+k}, s_{t+1+k})\}_{k=0}^{\infty}} \beta \int \left[ E_{t+1} \sum_{k=0}^{\infty} \beta^{k} u(y_{t+1+k}, x_{t+1+k}, s_{t+1+k}) \right] f(s_{t+1}|s_t)\, ds_{t+1}$$

Note that, because each value of the shock defines a particular mathematical object, the maximization of the integral corresponds to the integral of the maximizations: the max operator and the integral are interchangeable, so that we get
$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \beta \int \max_{\{y_{t+1+k}\}_{k=0}^{\infty}} \left[ E_{t+1} \sum_{k=0}^{\infty} \beta^{k} u(y_{t+1+k}, x_{t+1+k}, s_{t+1+k}) \right] f(s_{t+1}|s_t)\, ds_{t+1}$$

By definition, the term under the integral corresponds to $V(x_{t+1}, s_{t+1})$, such that the value rewrites
$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \beta \int V(x_{t+1}, s_{t+1}) f(s_{t+1}|s_t)\, ds_{t+1}$$

or equivalently
$$V(x_t, s_t) = \max_{y_t \in D(x_t, s_t)} u(y_t, x_t, s_t) + \beta E_t V(x_{t+1}, s_{t+1})$$
which is precisely the Bellman equation for the stochastic dynamic programming problem.
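For a finite-state shock, the law of iterated projections invoked above can be checked numerically: conditional expectations are just matrix products against the transition matrix. A small Python illustration (the chain and payoff are arbitrary, not from the notes):

```python
import numpy as np

# transition matrix of a 2-state chain and a payoff X defined on the states
PI = np.array([[0.9, 0.1],
               [0.2, 0.8]])
X = np.array([1.0, -1.0])

# E_t[X_{t+2}]: two-step-ahead expectation via the two-step transition matrix
direct = (PI @ PI) @ X
# E_t[E_{t+1}[X_{t+2}]]: apply the one-step conditional expectation twice
iterated = PI @ (PI @ X)

assert np.allclose(direct, iterated)  # law of iterated expectations
```

For a Markov chain the law of iterated expectations thus reduces to associativity of the matrix product.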

7.3.2 Discretization of the shocks

A very important problem that arises whenever we deal with value iteration or policy iteration in a stochastic environment is the discretization of the space spanned by the shocks. Indeed, a continuous support for the stochastic shocks is infeasible on a computer, which can only deal with discrete supports. We therefore have to transform the continuous problem into a discrete one, with the constraint that the asymptotic properties of the continuous and the discrete processes should be the same. The question we therefore face is: does there exist a discrete representation for s which is equivalent to its continuous original representation? The answer to this question is yes. In particular, as soon as we deal with (V)AR processes, we can use a very powerful tool: Markov chains.

Markov Chains: A Markov chain is a sequence of random values whose probabilities at a time interval depend upon the value of the number at the previous time. We will restrict ourselves to discrete-time Markov chains, in which the state changes at certain discrete time instants, indexed by an integer variable t. At each time step t, the Markov chain is in a state, denoted by $s \in S \equiv \{s_1, \ldots, s_M\}$. $S$ is called the state space.

The Markov chain is described in terms of transition probabilities $\pi_{ij}$. This transition probability should read as follows: if the state happens to be $s_i$ in period t, the probability that the next state is equal to $s_j$ is $\pi_{ij}$.

We therefore get the following definition.

Definition 6 A Markov chain is a stochastic process with a discrete state space S, such that the conditional distribution of $s_{t+1}$ is independent of all previously attained states given $s_t$:
$$\pi_{ij} = \text{Prob}(s_{t+1} = s_j | s_t = s_i), \quad s_i, s_j \in S.$$

The important assumption we shall make concerning Markov processes is that the transition probabilities $\pi_{ij}$ apply as soon as state $s_i$ is activated, no matter the history of the shocks nor how the system got to state $s_i$. In other words, there is no hysteresis. From a mathematical point of view, this corresponds to the so-called Markov property
$$P(s_{t+1} = s_j | s_t = s_i, s_{t-1} = s_{i_{t-1}}, \ldots, s_0 = s_{i_0}) = P(s_{t+1} = s_j | s_t = s_i) = \pi_{ij}$$
for all periods t, all states $s_i, s_j \in S$, and all sequences $\{s_{i_n}\}_{n=0}^{t-1}$ of earlier states. Thus, the probability of the next state $s_{t+1}$ depends only on the current realization of s.

The transition probabilities $\pi_{ij}$ must of course satisfy

1. $\pi_{ij} \geq 0$ for all $i, j = 1, \ldots, M$

2. $\sum_{j=1}^{M} \pi_{ij} = 1$ for all $i = 1, \ldots, M$

All of the elements of a Markov chain model can then be encoded in a transition probability matrix
$$\Pi = \begin{pmatrix} \pi_{11} & \ldots & \pi_{1M} \\ \vdots & \ddots & \vdots \\ \pi_{M1} & \ldots & \pi_{MM} \end{pmatrix}$$

Note that $\Pi^k_{ij}$ then gives the probability that $s_{t+k} = s_j$ given that $s_t = s_i$. In the long run, we obtain the steady state equivalent for a Markov chain: the invariant, or stationary, distribution.

Definition 7 A stationary distribution for a Markov chain is a distribution $\pi$ such that

1. $\pi_j \geq 0$ for all $j = 1, \ldots, M$

2. $\sum_{j=1}^{M} \pi_j = 1$

3. $\pi' = \pi' \Pi$
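Condition 3 says the stationary distribution is a left eigenvector of $\Pi$ associated with the unit eigenvalue, which can be computed directly. A NumPy sketch (the transition matrix is illustrative):

```python
import numpy as np

PI = np.array([[0.9, 0.1],
               [0.3, 0.7]])  # rows sum to one

# left eigenvector of PI for the unit eigenvalue: solve PI' pi = pi
vals, vecs = np.linalg.eig(PI.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
pi = pi / pi.sum()  # normalize so the probabilities sum to one

assert np.allclose(pi @ PI, pi)  # stationarity: pi' PI = pi'
```

Iterating $\pi' \leftarrow \pi' \Pi$ from any initial distribution converges to the same vector when the chain is ergodic.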

Gaussian-quadrature approach to discretization Tauchen and Hussey [1991] provide a simple way to discretize VAR processes relying on Gaussian quadrature.^3 I will only present the case of an AR(1) process of the form
$$s_{t+1} = \rho s_t + (1-\rho)\bar{s} + \varepsilon_{t+1}$$
where $\varepsilon_{t+1} \sim N(0, \sigma^2)$. This implies that

$$\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left\{-\frac{1}{2}\left(\frac{s_{t+1} - \rho s_t - (1-\rho)\bar{s}}{\sigma}\right)^2\right\} ds_{t+1} = \int f(s_{t+1}|s_t)\, ds_{t+1} = 1$$

which illustrates the fact that s is a continuous random variable. Tauchen and Hussey [1991] propose to replace the integral by
$$\int \Phi(s_{t+1}; s_t, \bar{s}) f(s_{t+1}|\bar{s})\, ds_{t+1} \equiv \int \frac{f(s_{t+1}|s_t)}{f(s_{t+1}|\bar{s})} f(s_{t+1}|\bar{s})\, ds_{t+1} = 1$$

where $f(s_{t+1}|\bar{s})$ denotes the density of $s_{t+1}$ conditional on the fact that $s_t = \bar{s}$ (which amounts to the unconditional density), which in our case implies that
$$\Phi(s_{t+1}; s_t, \bar{s}) \equiv \frac{f(s_{t+1}|s_t)}{f(s_{t+1}|\bar{s})} = \exp\left\{-\frac{1}{2}\left[\left(\frac{s_{t+1} - \rho s_t - (1-\rho)\bar{s}}{\sigma}\right)^2 - \left(\frac{s_{t+1} - \bar{s}}{\sigma}\right)^2\right]\right\}$$

^3 Note that we already discussed this issue in lecture notes # 4.

Then we can use the standard linear transformation and impose $z_t = (s_t - \bar{s})/(\sigma\sqrt{2})$ to get
$$\frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} \exp\left\{-\left[(z_{t+1} - \rho z_t)^2 - z_{t+1}^2\right]\right\} \exp\left\{-z_{t+1}^2\right\} dz_{t+1}$$

for which we can use a Gauss–Hermite quadrature. Assume then that we have the quadrature nodes $z_j$ and weights $\omega_j$, $j = 1, \ldots, n$; the quadrature leads to the formula
$$\frac{1}{\sqrt{\pi}} \sum_{j=1}^{n} \omega_j \Phi(z_j; z_i; \bar{s}) \simeq 1$$

In other words, we might interpret the quantity $\omega_j \Phi(z_j; z_i; \bar{s})/\sqrt{\pi}$ as an estimate $\hat{\pi}_{ij} \equiv \text{Prob}(s_{t+1} = s_j | s_t = s_i)$ of the transition probability from state i to state j. But remember that the quadrature is just an approximation, such that $\sum_{j=1}^{n} \hat{\pi}_{ij} = 1$ will generally not hold exactly. Tauchen and Hussey therefore propose the following modification:
$$\hat{\pi}_{ij} = \frac{\omega_j \Phi(z_j; z_i; \bar{s})}{\sqrt{\pi}\, \varsigma_i} \quad \text{where} \quad \varsigma_i = \frac{1}{\sqrt{\pi}} \sum_{j=1}^{n} \omega_j \Phi(z_j; z_i; \bar{s})$$

Matlab Code: Discretization of AR(1)

function [s,p]=tauch_huss(xbar,rho,sigma,n)
% xbar  : mean of the s process
% rho   : persistence parameter
% sigma : volatility
% n     : number of nodes
%
% returns the states s and the transition probabilities p
%
[xx,wx] = gauss_herm(n);         % nodes and weights for s
s       = sqrt(2)*sigma*xx+xbar; % discrete states
x       = xx(:,ones(n,1));
z       = x';
w       = wx(:,ones(n,1))';
%
% computation
%
p  = (exp(z.*z-(z-rho*x).*(z-rho*x)).*w)./sqrt(pi);
sx = sum(p')';
p  = p./sx(:,ones(n,1));
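For cross-checking, the same procedure can be sketched in Python with NumPy's Gauss–Hermite routine (a sketch that mirrors the construction above, not a drop-in replacement):

```python
import numpy as np

def tauch_huss(xbar, rho, sigma, n):
    """Tauchen-Hussey discretization of s' = rho*s + (1-rho)*xbar + eps."""
    z, w = np.polynomial.hermite.hermgauss(n)  # Gauss-Hermite nodes/weights
    s = np.sqrt(2.0) * sigma * z + xbar        # discrete states
    zi = z[:, None]                            # today's (scaled) state
    zj = z[None, :]                            # tomorrow's (scaled) state
    phi = np.exp(zj**2 - (zj - rho * zi)**2)   # density ratio f(s'|s)/f(s'|sbar)
    p = phi * w[None, :] / np.sqrt(np.pi)      # raw quadrature weights
    return s, p / p.sum(axis=1, keepdims=True) # renormalize each row

s, p = tauch_huss(0.0, 0.8, 0.12, 5)
assert np.allclose(p.sum(axis=1), 1.0)  # each row is a distribution
```

With $\rho = 0$ every row collapses to the same unconditional distribution, which is a quick sanity check on the density ratio.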

7.3.3 Value iteration

As in the deterministic case, the convergence of the simple value function iteration procedure is insured by the contraction mapping theorem. This is however a bit more subtle than that, as we have to deal with the convergence of a probability measure, which goes far beyond this introduction to dynamic programming.^4 The algorithm is basically the same as in the deterministic case, up to the delicate point that we have to deal with the expectation. The algorithm then writes as follows

1. Decide on a grid, X, of admissible values for the state variable x,
$$X = \{x_1, \ldots, x_N\},$$
and for the shocks, s,
$$S = \{s_1, \ldots, s_M\},$$
together with the transition matrix $\Pi = (\pi_{ij})$. Formulate an initial guess for the value function $V_0(x)$ and choose a stopping criterion $\varepsilon > 0$.

2. For each $x_\ell \in X$, $\ell = 1, \ldots, N$, and $s_k \in S$, $k = 1, \ldots, M$, compute
$$V_{i+1}(x_\ell, s_k) = \max_{x' \in X} u(y(x_\ell, s_k, x'), x_\ell, s_k) + \beta \sum_{j=1}^{M} \pi_{kj} V_i(x', s_j)$$

3. If $\|V_{i+1}(x, s) - V_i(x, s)\| < \varepsilon$, go to the next step; else go back to 2.

4. Compute the final solution as
$$y^*(x, s) = y(x, x'(x, s), s)$$

As in the deterministic case, I will illustrate the method relying on the optimal growth model. In order to better understand the algorithm, let us go back to that model, with
$$u(c) = \frac{c^{1-\sigma} - 1}{1 - \sigma}$$
and
$$k' = \exp(a) k^{\alpha} - c + (1 - \delta) k$$
where $a' = \rho a + \varepsilon'$. Then the Bellman equation writes
$$V(k, a) = \max_{c} \frac{c^{1-\sigma} - 1}{1 - \sigma} + \beta \int V(k', a')\, d\mu(a'|a)$$
From the law of motion of capital we can determine consumption as
$$c = \exp(a) k^{\alpha} + (1 - \delta) k - k'$$
such that, plugging this result into the Bellman equation, we have
$$V(k, a) = \max_{k'} \frac{(\exp(a) k^{\alpha} + (1 - \delta) k - k')^{1-\sigma} - 1}{1 - \sigma} + \beta \int V(k', a')\, d\mu(a'|a)$$

^4 The interested reader should then refer to Lucas et al. [1989], chapter 9.

A first problem that we encounter is that we would like to be able to evaluate the integral involved by the rational expectation. We therefore have to discretize the shock. Here, I will consider that the technology shock can be accurately approximated by a 2-state Markov chain, such that a can take on 2 values, $a_1$ and $a_2$ ($a_1 < a_2$). I will also assume that the transition matrix is symmetric, such that
$$\Pi = \begin{pmatrix} \pi & 1-\pi \\ 1-\pi & \pi \end{pmatrix}$$

$a_1$, $a_2$ and $\pi$ are selected such that the process reproduces the conditional first- and second-order moments of the AR(1) process:

First-order moments
$$\pi a_1 + (1-\pi) a_2 = \rho a_1$$
$$(1-\pi) a_1 + \pi a_2 = \rho a_2$$

Second-order moments
$$\pi a_1^2 + (1-\pi) a_2^2 - (\rho a_1)^2 = \sigma_\varepsilon^2$$
$$(1-\pi) a_1^2 + \pi a_2^2 - (\rho a_2)^2 = \sigma_\varepsilon^2$$

From the first two equations we get $a_1 = -a_2$ and $\pi = (1+\rho)/2$. Plugging these two results into the last two equations, we get $a_1 = -\sqrt{\sigma_\varepsilon^2/(1-\rho^2)}$ (recall that $a_1 < a_2$).
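These restrictions are easy to verify numerically; a quick check of the algebra, with the parameter values used for the simulations below:

```python
import numpy as np

rho, se = 0.8, 0.12
p = (1 + rho) / 2                    # transition probability
a1 = -np.sqrt(se**2 / (1 - rho**2))  # low state (a1 = -a2)
a2 = -a1

# conditional first-order moments of the chain match rho*a_i
assert np.isclose(p * a1 + (1 - p) * a2, rho * a1)
assert np.isclose((1 - p) * a1 + p * a2, rho * a2)
# conditional second-order moments match the innovation variance
assert np.isclose(p * a1**2 + (1 - p) * a2**2 - (rho * a1)**2, se**2)
assert np.isclose((1 - p) * a1**2 + p * a2**2 - (rho * a2)**2, se**2)
```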

Hence, we will actually work with a value function of the form
$$V(k, a_k) = \max_{c} \frac{c^{1-\sigma} - 1}{1 - \sigma} + \beta \sum_{j=1}^{2} \pi_{kj} V(k', a_j)$$

Now, let us define a grid of N feasible values for k, such that we have
$$K = \{k_1, \ldots, k_N\},$$
and an initial value function $V_0(k)$, that is, a vector of N numbers that relates each $k_\ell$ to a value. Note that this may be anything we want, as we know, by the contraction mapping theorem, that the algorithm will converge. But if we want it to converge fast enough, it may be a good idea to impose a good initial guess. Finally, we need a stopping criterion.

In figures 7.9 and 7.10, we report the value function and the decision rules obtained for the stochastic optimal growth model with $\alpha = 0.3$, $\beta = 0.95$, $\delta = 0.1$, $\sigma = 1.5$, $\rho = 0.8$ and $\sigma_\varepsilon = 0.12$. The grid for the capital stock is composed of 1000 data points ranging from 0.2 to 6. The algorithm then converges in 192 iterations when the stopping criterion is $\varepsilon = 10^{-6}$.

Figure 7.9: Stochastic OGM (Value function, Value iteration)

[Figure: value functions plotted against $k_t$]

Figure 7.10: Stochastic OGM (Decision rules, Value iteration)

[Figure: next period capital stock and consumption plotted against $k_t$]

Matlab Code: Value Function Iteration

sigma = 1.50; % utility parameter
delta = 0.10; % depreciation rate
beta  = 0.95; % discount factor
alpha = 0.30; % capital elasticity of output
rho   = 0.80; % persistence of the shock
se    = 0.12; % volatility of the shock
nbk   = 1000; % number of data points in the grid
nba   = 2;    % number of values for the shock
crit  = 1;    % convergence criterion
epsi  = 1e-6; % convergence parameter
iter  = 0;    % iteration counter
%
% Discretization of the shock
%
p  = (1+rho)/2;
PI = [p 1-p;1-p p];
a1 = -sqrt(se*se/(1-rho*rho)); % a1 = -a2 (see the moment conditions above)
A  = exp([a1 -a1]);
%
% Discretization of the state space
%
kmin  = 0.2;
kmax  = 6;
kgrid = linspace(kmin,kmax,nbk)';
c     = zeros(nbk,nba);
util  = zeros(nbk,nba);
v     = zeros(nbk,nba);
Tv    = zeros(nbk,nba);
dr    = zeros(nbk,nba);
while crit>epsi;
   for i=1:nbk
      for j=1:nba;
         c           = A(j)*kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid;
         neg         = find(c<=0);
         c(neg)      = NaN;
         util(:,j)   = (c.^(1-sigma)-1)/(1-sigma);
         util(neg,j) = -1e12;
      end
      [Tv(i,:),dr(i,:)] = max(util+beta*(v*PI')); % column j: E[V(k',.)|a_j]
   end;
   crit = max(max(abs(Tv-v)));
   v    = Tv;
   iter = iter+1;
end
kp = kgrid(dr);
for j=1:nba;
   c(:,j) = A(j)*kgrid.^alpha+(1-delta)*kgrid-kp(:,j);
end
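The same algorithm can be written compactly in vectorized form. The sketch below (in Python, on a deliberately coarse grid so that it runs quickly) mirrors the loop above:

```python
import numpy as np

# stochastic optimal growth model, value-function iteration on a coarse grid
alpha, beta, delta, sigma = 0.3, 0.95, 0.1, 1.5
rho, se = 0.8, 0.12
p = (1 + rho) / 2
PI = np.array([[p, 1 - p], [1 - p, p]])
a = np.sqrt(se**2 / (1 - rho**2))
A = np.exp(np.array([-a, a]))              # shock values
k = np.linspace(0.2, 6.0, 200)             # coarse grid for speed
# consumption for every (k today, shock, k tomorrow) triple
c = (A[None, :, None] * k[:, None, None]**alpha
     + (1 - delta) * k[:, None, None] - k[None, None, :])
u = np.where(c > 0, (np.maximum(c, 1e-12)**(1 - sigma) - 1) / (1 - sigma), -1e12)

v = np.zeros((k.size, A.size))
for _ in range(5000):
    Ev = v @ PI.T                          # Ev[k', j] = E[V(k', a') | a_j]
    Tv = (u + beta * Ev.T[None, :, :]).max(axis=2)
    if np.abs(Tv - v).max() < 1e-6:
        v = Tv
        break
    v = Tv
dr = (u + beta * (v @ PI.T).T[None, :, :]).argmax(axis=2)
kp = k[dr]                                 # next-period capital rule
```

On this coarse grid the value function comes out increasing in both the capital stock and the shock, as theory predicts.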

7.3.4 Policy iterations

As in the deterministic case, we may want to accelerate the simple value iteration using Howard improvement. The stochastic case does not differ that much from the deterministic case, apart from the fact that we now have to deal with different decision rules, which implies the computation of different Q matrices. The algorithm may be described as follows

1. Set an initial feasible set of decision rules for the control variable, $y = f^0(x, s_k)$, $k = 1, \ldots, M$, and compute the value associated with this guess, assuming that this rule is operative forever, taking care of the fact that $x_{t+1} = h(x_t, y_t, s_t) = h(x_t, f^i(x_t, s_t), s_t)$ with $i = 0$. Set a stopping criterion $\varepsilon > 0$.

2. Find a new policy rule, $y = f^{i+1}(x, s_k)$, $k = 1, \ldots, M$, such that
$$f^{i+1}(x, s_k) \in \text{Argmax}_y\ u(y, x, s_k) + \beta \sum_{j=1}^{M} \pi_{kj} V(x', s_j)$$
with $x' = h(x, f^i(x, s_k), s_k)$.

3. Check if $\|f^{i+1}(x, s) - f^i(x, s)\| < \varepsilon$; if yes then stop, else go back to 2.

When computing the value function we actually have to solve a linear system of the form
$$V^{i+1}(x_\ell, s_k) = u(f^{i+1}(x_\ell, s_k), x_\ell, s_k) + \beta \sum_{j=1}^{M} \pi_{kj} V^{i+1}(h(x_\ell, f^{i+1}(x_\ell, s_k), s_k), s_j)$$
for $V^{i+1}(x_\ell, s_k)$ (for all $x_\ell \in X$ and $s_k \in S$), which may be rewritten as
$$V^{i+1}(x_\ell, s_k) = u(f^{i+1}(x_\ell, s_k), x_\ell, s_k) + \beta \Pi_{k\cdot} Q V^{i+1}(x_\ell, \cdot) \quad \forall x_\ell \in X$$
where Q is an (N × N) matrix with
$$Q_{\ell j} = \begin{cases} 1 & \text{if } x_j = h(f^{i+1}(x_\ell), x_\ell) \\ 0 & \text{otherwise} \end{cases}$$
Note that although it is a big matrix, Q is sparse, which can be exploited in solving the system, to get
$$V^{i+1}(x, s) = (I - \beta \Pi \otimes Q)^{-1} u(f^{i+1}(x, s), x, s)$$

We apply this algorithm to the same optimal growth model as in the previous

section and report the value function and the decision rules at convergence in

ﬁgures 7.11 and 7.12. The algorithm converges in only 17 iterations, starting

from the same initial guess and using the same parameterization!

Figure 7.11: Stochastic OGM (Value function, Policy iteration)

[Figure: value functions plotted against $k_t$]

Figure 7.12: Stochastic OGM (Decision rules, Policy iteration)

[Figure: next period capital stock and consumption plotted against $k_t$]

Matlab Code: Policy Iteration

sigma = 1.50; % utility parameter
delta = 0.10; % depreciation rate
beta  = 0.95; % discount factor
alpha = 0.30; % capital elasticity of output
rho   = 0.80; % persistence of the shock
se    = 0.12; % volatility of the shock
nbk   = 1000; % number of data points in the grid
nba   = 2;    % number of values for the shock
crit  = 1;    % convergence criterion
epsi  = 1e-6; % convergence parameter
iter  = 0;    % iteration counter
%
% Discretization of the shock
%
p  = (1+rho)/2;
PI = [p 1-p;1-p p];
a1 = -sqrt(se*se/(1-rho*rho)); % a1 = -a2 (see the moment conditions above)
A  = exp([a1 -a1]);
%
% Discretization of the state space
%
kmin  = 0.2;
kmax  = 6;
kgrid = linspace(kmin,kmax,nbk)';
util  = zeros(nbk,nba);
v     = zeros(nbk,nba);
Ev    = zeros(nbk,nba); % expected value function
dr    = zeros(nbk,nba); % decision rule (will contain indices)
kp0   = zeros(nbk,nba); % previous decision rule (levels)
%
% Main loop
%
while crit>epsi;
   for i=1:nbk
      for j=1:nba;
         c           = A(j)*kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid;
         neg         = find(c<=0);
         c(neg)      = NaN;
         util(:,j)   = (c.^(1-sigma)-1)/(1-sigma);
         util(neg,j) = -inf;
         Ev(:,j)     = v*PI(j,:)';
      end
      [v1,dr(i,:)] = max(util+beta*Ev);
   end;
   %
   % decision rules
   %
   kp = kgrid(dr);
   Q  = sparse(nbk*nba,nbk*nba);
   for j=1:nba;
      c = A(j)*kgrid.^alpha+(1-delta)*kgrid-kp(:,j);
      %
      % update the value
      %
      util(:,j) = (c.^(1-sigma)-1)/(1-sigma);
      Q0 = sparse(nbk,nbk);
      for i=1:nbk;
         Q0(i,dr(i,j)) = 1;
      end
      Q((j-1)*nbk+1:j*nbk,:) = kron(PI(j,:),Q0);
   end
   Tv   = (speye(nbk*nba)-beta*Q)\util(:);
   crit = max(max(abs(kp-kp0)));
   v    = reshape(Tv,nbk,nba);
   kp0  = kp;
   iter = iter+1;
end
c = zeros(nbk,nba);
for j=1:nba;
   c(:,j) = A(j)*kgrid.^alpha+(1-delta)*kgrid-kp(:,j);
end
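For comparison, here is a compact Python sketch of the same Howard improvement, with an exact policy-evaluation step via a dense linear solve; the block structure of the matrix T mirrors the $(I - \beta \Pi \otimes Q)$ system above (illustrative coarse grid, not the 1000-point grid of the text):

```python
import numpy as np

alpha, beta, delta, sigma = 0.3, 0.95, 0.1, 1.5
rho, se = 0.8, 0.12
p = (1 + rho) / 2
PI = np.array([[p, 1 - p], [1 - p, p]])
a = np.sqrt(se**2 / (1 - rho**2))
A = np.exp(np.array([-a, a]))
k = np.linspace(0.2, 6.0, 100)
N, M = k.size, A.size
c = (A[None, :, None] * k[:, None, None]**alpha
     + (1 - delta) * k[:, None, None] - k[None, None, :])
u = np.where(c > 0, (np.maximum(c, 1e-12)**(1 - sigma) - 1) / (1 - sigma), -1e12)

v = np.zeros((N, M))
dr = np.zeros((N, M), dtype=int)
for it in range(100):
    # improvement step: greedy policy with respect to the current value
    new_dr = (u + beta * (v @ PI.T).T[None, :, :]).argmax(axis=2)
    if it > 0 and np.array_equal(new_dr, dr):
        break
    dr = new_dr
    # evaluation step: solve the linear system for the value of this policy
    T = np.zeros((N * M, N * M))
    rows = np.arange(N)
    for j in range(M):
        for jp in range(M):
            T[j * N + rows, jp * N + dr[:, j]] = beta * PI[j, jp]
    uf = np.stack([u[rows, j, dr[:, j]] for j in range(M)])  # (M, N)
    v = np.linalg.solve(np.eye(N * M) - T, uf.ravel()).reshape(M, N).T

# at convergence v is a fixed point of the Bellman operator
Tv = (u + beta * (v @ PI.T).T[None, :, :]).max(axis=2)
assert np.abs(Tv - v).max() < 1e-6
```

As in the Matlab experiment, the policy stabilizes after only a handful of improvement steps, far fewer than the contraction rate $\beta$ would suggest for plain value iteration.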

7.4 How to simulate the model?

At this point, no matter whether the model is deterministic or stochastic, we have a solution which is actually a collection of data points. There are several ways to simulate the model:

• One may simply use the pointwise decision rule and iterate on it;

• or one may use an interpolation scheme (least squares methods, spline interpolations, ...) in order to obtain a continuous decision rule.

Here I will pursue the second route.

7.4.1 The deterministic case

After having solved the model by either value iteration or policy iteration, we have in hand a collection of points, X, for the state variable and a decision rule that relates the control variable to this collection of points. In other words we have Lagrange data, which can be used to obtain a continuous decision rule. This decision rule may then be used to simulate the model.

In order to illustrate this, let us focus once again on the deterministic optimal growth model, and let us assume that we have solved the model by either value iteration or policy iteration. We therefore have a collection of data points for the capital stock, $K = \{k_1, \ldots, k_N\}$, which form our grid, the associated decision rule for the next period capital stock, $k'_i = h(k_i)$, $i = 1, \ldots, N$, and the consumption level, which can be deduced from the next period capital stock using the law of motion of capital together with the resource constraint. Also note that we have the lower and upper bounds of the grid, $k_1 = \underline{k}$ and $k_N = \overline{k}$. We will use Chebychev polynomials to approximate the rule in order to avoid multicollinearity problems in the least squares algorithm. In other words we will get a decision rule of the form
$$k' = \sum_{\ell=0}^{n} \theta_\ell T_\ell(\varphi(k))$$

where $\varphi(k)$ is a linear transformation that maps $[\underline{k}, \overline{k}]$ into [-1, 1]. The algorithm works as follows

1. Solve the model by value iteration or policy iteration, such that we have a collection of data points for the capital stock, $K = \{k_1, \ldots, k_N\}$, and the associated decision rule for the next period capital stock, $k'_i = h(k_i)$, $i = 1, \ldots, N$. Choose a maximal order, n, for the Chebychev polynomial approximation.

2. Apply the transformation
$$x_i = 2\,\frac{k_i - \underline{k}}{\overline{k} - \underline{k}} - 1$$
to the data and compute $T_\ell(x_i)$, $\ell = 1, \ldots, n$, and $i = 1, \ldots, N$.

3. Perform the least squares approximation, such that
$$\Theta = (T(x)' T(x))^{-1} T(x)' h(k)$$

For the parameterization we have been using so far, setting
$$\varphi(k) = 2\,\frac{k - \underline{k}}{\overline{k} - \underline{k}} - 1$$
and for an approximation of order 7, we obtain
$$\theta = \{2.6008, 2.0814, -0.0295, 0.0125, -0.0055, 0.0027, -0.0012, 0.0007\}$$
such that the decision rule for the capital stock can be approximated as
$$k_{t+1} = 2.6008 + 2.0814\, T_1(\varphi(k_t)) - 0.0295\, T_2(\varphi(k_t)) + 0.0125\, T_3(\varphi(k_t)) - 0.0055\, T_4(\varphi(k_t)) + 0.0027\, T_5(\varphi(k_t)) - 0.0012\, T_6(\varphi(k_t)) + 0.0007\, T_7(\varphi(k_t))$$
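In Python, the least-squares step can be sketched with NumPy's Chebyshev utilities. The rule h below is a hypothetical stand-in for the grid decision rule (deliberately a low-order polynomial in the mapped variable, so the degree-7 fit recovers it almost exactly):

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

kmin, kmax = 0.2, 6.0
k = np.linspace(kmin, kmax, 1000)        # grid (Lagrange data abscissae)
x = 2 * (k - kmin) / (kmax - kmin) - 1   # phi(k): map [kmin,kmax] -> [-1,1]
h = 2.6 + 2.08 * x - 0.03 * x**2         # hypothetical decision rule values

theta = cheb.chebfit(x, h, 7)            # least-squares Chebychev coefficients
approx = cheb.chebval(x, theta)          # evaluate sum_l theta_l T_l(x)

assert np.max(np.abs(approx - h)) < 1e-8 # exact up to rounding
```

`chebfit` solves exactly the normal equations of step 3, with the Chebychev basis keeping the regression well conditioned.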

Then the model can be used in the usual way, such that in order to get the transitional dynamics of the model we just have to iterate on this backward-looking finite-difference equation. Consumption, output and investment can be computed using their definitions:
$$i_t = k_{t+1} - (1-\delta) k_t$$
$$y_t = k_t^{\alpha}$$
$$c_t = y_t - i_t$$
For instance, figure 7.13 reports the transitional dynamics of the model when the economy is started 50% below its steady state. In the figure, the plain line represents the dynamics of each variable, and the dashed line its steady state level.

Figure 7.13: Transitional dynamics (OGM)

[Figure: four panels plotting the capital stock, investment, output and consumption against time]

Matlab Code: Transitional Dynamics

n      = 8;                            % Order of approximation+1
transk = 2*(kgrid-kmin)/(kmax-kmin)-1; % transform the data
%
% Computes the matrix of Chebychev polynomials
%
Tk = [ones(nbk,1) transk];
for i=3:n;
   Tk = [Tk 2*transk.*Tk(:,i-1)-Tk(:,i-2)];
end
b    = Tk\kp;           % Performs OLS
k0   = 0.5*ks;          % initial capital stock
nrep = 50;              % number of periods
k    = zeros(nrep+1,1); % initialize dynamics
%
% iteration loop
%
k(1) = k0;
for t=1:nrep;
   trkt = 2*(k(t)-kmin)/(kmax-kmin)-1;
   Tk   = [1 trkt];
   for i=3:n;
      Tk = [Tk 2*trkt.*Tk(:,i-1)-Tk(:,i-2)];
   end
   k(t+1) = Tk*b;
end
y = k(1:nrep).^alpha;
i = k(2:nrep+1)-(1-delta)*k(1:nrep);
c = y-i;

Note: At this point, I assume that the model has already been solved and that the capital

grid has been stored in kgrid and the decision rule for the next period capital stock is stored

in kp.

Another way to proceed would have been to use spline approximation, in

which case the matlab code would have been (for the simulation part)

Matlab Code: OGM simulation (cubic spline interpolation)

k(1) = k0;
for t=1:nrep;
   k(t+1) = interp1(kgrid,kp,k(t),'cubic');
end
y = k(1:nrep).^alpha;
i = k(2:nrep+1)-(1-delta)*k(1:nrep);
c = y-i;

7.4.2 The stochastic case

As in the deterministic case, it will often be a good idea to represent the solution as a continuous decision rule, such that we should apply the same methodology as before. The only departure from the previous experiment is that, since we have approximated the stochastic shock by a Markov chain, we have as many decision rules as we have states in the Markov chain. For instance, in the optimal growth model we solved in section 7.3, we have 2 decision rules, which are given by^5
$$k_{1,t+1} = 2.8630 + 2.4761\, T_1(\varphi(k_t)) - 0.0211\, T_2(\varphi(k_t)) + 0.0114\, T_3(\varphi(k_t)) - 0.0057\, T_4(\varphi(k_t)) + 0.0031\, T_5(\varphi(k_t)) - 0.0014\, T_6(\varphi(k_t)) + 0.0009\, T_7(\varphi(k_t))$$
$$k_{2,t+1} = 3.2002 + 2.6302\, T_1(\varphi(k_t)) - 0.0543\, T_2(\varphi(k_t)) + 0.0235\, T_3(\varphi(k_t)) - 0.0110\, T_4(\varphi(k_t)) + 0.0057\, T_5(\varphi(k_t)) - 0.0027\, T_6(\varphi(k_t)) + 0.0014\, T_7(\varphi(k_t))$$

The only difficulty we have to deal with in the stochastic case is to simulate the Markov chain. Simulating a series of length T for the Markov chain can be achieved easily:

1. Compute the cumulative distribution of the Markov chain ($\Pi^c$), i.e. the cumulative sum of each row of the transition matrix. Hence, $\Pi^c_{ij}$ corresponds to the probability that the economy is in a state lower than or equal to j, given that it was in state i in period t - 1.

2. Set the initial state and simulate T random numbers from a uniform distribution: $\{p_t\}_{t=1}^{T}$. Note that these numbers, drawn from a $U_{[0,1]}$, can be interpreted as probabilities.

3. Assuming the economy was in state i in t - 1, find the index j such that
$$\Pi^c_{i(j-1)} < p_t \leq \Pi^c_{ij}$$
Then $s_j$ is the state of the economy in period t.

^5 The code to generate these two rules is exactly the same as for the deterministic version of the model since, when computing b=Tk\kp, Matlab will take care of the fact that kp is an (N × 2) matrix, such that b will be an ((n + 1) × 2) matrix, where n is the approximation order.

Matlab Code: Simulation of Markov Chain

s  = [-1;1];         % 2 possible states
PI = [0.9 0.1;
      0.1 0.9];      % Transition matrix
s0 = 1;              % Initial state (index)
T  = 1000;           % Length of the simulation
p  = rand(T,1);      % Simulated uniform draws
j  = zeros(T,1);
j(1) = s0;           % Initial state
cPI  = cumsum(PI,2); % Cumulative distribution of the chain
%
for t=2:T;
   j(t) = find(p(t)<=cPI(j(t-1),:),1);
end;
chain = s(j);        % simulated values
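The three steps above translate directly into Python, with `searchsorted` performing the bracketing of step 3 on the cumulated rows (a sketch with illustrative inputs):

```python
import numpy as np

def simulate_chain(PI, s, i0, T, rng):
    """Simulate T periods of a Markov chain with transition matrix PI,
    state values s and initial state index i0."""
    cPI = np.cumsum(PI, axis=1)       # step 1: cumulative distribution
    u = rng.random(T)                 # step 2: uniform draws
    idx = np.empty(T, dtype=int)
    idx[0] = i0
    for t in range(1, T):             # step 3: bracket the draw
        idx[t] = np.searchsorted(cPI[idx[t - 1]], u[t])
    return s[idx], idx

rng = np.random.default_rng(0)
s = np.array([-1.0, 1.0])
# with an identity transition matrix the chain never leaves its initial state
_, idx = simulate_chain(np.eye(2), s, 1, 50, rng)
assert np.all(idx == 1)
```

Degenerate transition matrices (identity, pure switching) give deterministic paths and are a convenient way to test the bracketing step.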

We are then in position to simulate our optimal growth model. Figure 7.14

reports a simulation of the optimal growth model we solved before.

Figure 7.14: Simulated OGM

[Figure: simulated paths of the capital stock, investment, output and consumption against time]

Matlab Code: Simulating the OGM

n      = 8;                            % Order of approximation+1
transk = 2*(kgrid-kmin)/(kmax-kmin)-1; % transform the data
%
% Computes approximation
%
Tk = [ones(nbk,1) transk];
for i=3:n;
   Tk = [Tk 2*transk.*Tk(:,i-1)-Tk(:,i-2)];
end
b = Tk\kp;
%
% Simulate the markov chain
%
long   = 200;                 % length of simulation
[a,id] = markov(PI,A,long,1); % markov is a function:
                              % a contains the values for the shock
                              % id the state index
k0 = 2.6;                     % initial value for the capital stock
k  = zeros(long+1,1);         % initialize
y  = zeros(long,1);           % some matrices
k(1) = k0;                    % initial capital stock
for t=1:long;
   % Prepare the approximation for the capital stock
   trkt = 2*(k(t)-kmin)/(kmax-kmin)-1;
   Tk   = [1 trkt];
   for i=3:n;
      Tk = [Tk 2*trkt.*Tk(:,i-1)-Tk(:,i-2)];
   end
   k(t+1) = Tk*b(:,id(t));        % use the appropriate decision rule
   y(t)   = A(id(t))*k(t).^alpha; % computes output
end
i = k(2:long+1)-(1-delta)*k(1:long); % computes investment
c = y-i;                             % computes consumption

Note: At this point, I assume that the model has already been solved and that the capital

grid has been stored in kgrid and the decision rule for the next period capital stock is stored

in kp.


7.5 Handling constraints

In this example I will deal with the problem of a household who is constrained in her borrowing and who faces uncertainty on the labor market. This problem has been extensively studied by Aiyagari [1994].

We consider the case of a household who determines her consumption/saving plans in order to maximize her lifetime expected utility, which is characterized by the function:^6
$$E_t \sum_{\tau=0}^{\infty} \beta^{\tau} \left[ \frac{c_{t+\tau}^{1-\sigma} - 1}{1-\sigma} \right] \quad \text{with } \sigma \in (0,1) \cup (1,\infty) \qquad (7.9)$$

where $0 < \beta < 1$ is a constant discount factor and c denotes the consumption bundle. In each and every period, the household enters the period with a level of assets $a_t$ carried over from the previous period, for which she receives a constant real interest rate r. She may be employed, in which case she receives the aggregate real wage w, or unemployed, in which case she just receives unemployment benefits, b, which are assumed to be a fraction of the real wage: $b = \gamma w$, where $\gamma$ is the replacement ratio. These revenues are then used to consume and purchase assets on the financial market. Therefore, the budget constraint in t is given by
$$a_{t+1} = (1+r) a_t + \omega_t - c_t \qquad (7.10)$$
where
$$\omega_t = \begin{cases} w & \text{if the household is employed} \\ \gamma w & \text{otherwise} \end{cases} \qquad (7.11)$$

We assume that the household's state on the labor market is a random variable. Further, the probability of being (un)employed in period t is greater if the household was also (un)employed in period t - 1. In other words, the state on the labor market, and therefore $\omega_t$, can be modeled as a 2-state Markov chain with transition matrix $\Pi$. In addition, the household is subject to a borrowing constraint which states that she cannot borrow more than a threshold level $\underline{a}$:
$$a_{t+1} \geq \underline{a} \qquad (7.12)$$

^6 $E_t(\cdot)$ denotes the mathematical conditional expectation. Expectations are conditional on information available at the beginning of period t.

In this case, the Bellman equation writes
$$V(a_t, \omega_t) = \max_{c_t \geq 0} u(c_t) + \beta E_t V(a_{t+1}, \omega_{t+1})$$
s.t. (7.10)–(7.12), which may be solved by one of the methods we have seen before. Implementing any of these methods while taking the constraint into account is straightforward since, plugging the budget constraint into the utility function, the Bellman equation can be rewritten as
$$V(a_t, \omega_t) = \max_{a_{t+1}} u((1+r) a_t + \omega_t - a_{t+1}) + \beta E_t V(a_{t+1}, \omega_{t+1})$$
s.t. (7.11)–(7.12).

But, since the borrowing constraint should be satisfied in each and every period, we have
$$V(a_t, \omega_t) = \max_{a_{t+1} \geq \underline{a}} u((1+r) a_t + \omega_t - a_{t+1}) + \beta E_t V(a_{t+1}, \omega_{t+1})$$
In other words, when defining the grid for a, one just has to take care of the fact that the minimum admissible value for a is $\underline{a}$. Positivity of consumption, which should be taken into account when searching for the optimal value of $a_{t+1}$, finally imposes $a_{t+1} < (1+r) a_t + \omega_t$, such that the Bellman equation writes
$$V(a_t, \omega_t) = \max_{\underline{a} \leq a_{t+1} < (1+r) a_t + \omega_t} u((1+r) a_t + \omega_t - a_{t+1}) + \beta E_t V(a_{t+1}, \omega_{t+1})$$
The Matlab code borrow.m solves and simulates this problem.
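A minimal value-iteration sketch of this constrained problem, where the constraint is imposed simply by starting the asset grid at $\underline{a}$ (illustrative parameter values, not those of borrow.m):

```python
import numpy as np

beta, sigma, r = 0.95, 1.5, 0.03
w, gamma = 1.0, 0.5
omega = np.array([w, gamma * w])         # employed / unemployed income
PI = np.array([[0.9, 0.1], [0.4, 0.6]])  # persistent employment status
a_low = -0.5                             # borrowing limit (underline a)
a = np.linspace(a_low, 10.0, 300)        # grid starts at the constraint
c = ((1 + r) * a[:, None, None] + omega[None, :, None] - a[None, None, :])
u = np.where(c > 0, (np.maximum(c, 1e-12)**(1 - sigma) - 1) / (1 - sigma), -1e12)

v = np.zeros((a.size, omega.size))
for _ in range(5000):
    Ev = v @ PI.T                        # E[V(a', .) | current status]
    Tv = (u + beta * Ev.T[None, :, :]).max(axis=2)
    if np.abs(Tv - v).max() < 1e-6:
        v = Tv
        break
    v = Tv
ap = a[(u + beta * (v @ PI.T).T[None, :, :]).argmax(axis=2)]

assert np.all(ap >= a_low)               # borrowing constraint holds
```

The positivity restriction on consumption is enforced through the large negative penalty on infeasible choices, so the selected $a_{t+1}$ always leaves $c_t > 0$.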


Bibliography

Aiyagari, S.R., Uninsured Idiosyncratic Risk and Aggregate Saving, Quarterly
Journal of Economics, 1994, 109 (3), 659–684.

Bertsekas, D., Dynamic Programming and Stochastic Control, New York:

Academic Press, 1976.

Judd, K. and A. Solnick, Numerical Dynamic Programming with Shape–

Preserving Splines, Manuscript, Hoover Institution 1994.

Judd, K.L., Numerical methods in economics, Cambridge, Massachussets:

MIT Press, 1998.

Lucas, R., N. Stokey, and E. Prescott, Recursive Methods in Economic Dy-

namics, Cambridge (MA): Harvard University Press, 1989.

Tauchen, G. and R. Hussey, Quadrature Based Methods for Obtaining Ap-

proximate Solutions to Nonlinear Asset Pricing Models, Econometrica,

1991, 59 (2), 371–396.


Index

Bellman equation, 1, 2

Blackwell’s suﬃciency conditions, 6

Cauchy sequence, 4

Contraction mapping, 4

Contraction mapping theorem, 4

Discounting, 6

Howard Improvement, 15, 37

Invariant distribution, 30

Lagrange data, 41

Markov chain, 28

Metric space, 3

Monotonicity, 6

Policy functions, 2

Policy iterations, 15, 37

Stationary distribution, 30

Transition probabilities, 29

Transition probability matrix, 30

Value function, 2

Value iteration, 32


Contents

7 Dynamic Programming 1

7.1 The Bellman equation and associated theorems . . . . . . . . . 1

7.1.1 A heuristic derivation of the Bellman equation . . . . . 1

7.1.2 Existence and uniqueness of a solution . . . . . . . . . . 3

7.2 Deterministic dynamic programming . . . . . . . . . . . . . . . 8

7.2.1 Value function iteration . . . . . . . . . . . . . . . . . . 8

7.2.2 Taking advantage of interpolation . . . . . . . . . . . . 13

7.2.3 Policy iterations: Howard Improvement . . . . . . . . . 15

7.2.4 Parametric dynamic programming . . . . . . . . . . . . 20

7.3 Stochastic dynamic programming . . . . . . . . . . . . . . . . . 26

7.3.1 The Bellman equation . . . . . . . . . . . . . . . . . . . 26

7.3.2 Discretization of the shocks . . . . . . . . . . . . . . . . 28

7.3.3 Value iteration . . . . . . . . . . . . . . . . . . . . . . . 32

7.3.4 Policy iterations . . . . . . . . . . . . . . . . . . . . . . 37

7.4 How to simulate the model? . . . . . . . . . . . . . . . . . . . . 41

7.4.1 The deterministic case . . . . . . . . . . . . . . . . . . . 41

7.4.2 The stochastic case . . . . . . . . . . . . . . . . . . . . . 44

7.5 Handling constraints . . . . . . . . . . . . . . . . . . . . . . . . 48


List of Figures

7.1 Deterministic OGM (Value function, Value iteration) . . . . . . 11

7.2 Deterministic OGM (Decision rules, Value iteration) . . . . . . 12

7.3 Deterministic OGM (Value function, Value iteration with inter-

polation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

7.4 Deterministic OGM (Decision rules, Value iteration with inter-

polation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

7.5 Deterministic OGM (Value function, Policy iteration) . . . . . 19

7.6 Deterministic OGM (Decision rules, Policy iteration) . . . . . . 20

7.7 Deterministic OGM (Value function, Parametric DP) . . . . . . 22

7.8 Deterministic OGM (Decision rules, Parametric DP) . . . . . . 22

7.9 Stochastic OGM (Value function, Value iteration) . . . . . . . . 35

7.10 Stochastic OGM (Decision rules, Value iteration) . . . . . . . . 35

7.11 Stochastic OGM (Value function, Policy iteration) . . . . . . . 38

7.12 Stochastic OGM (Decision rules, Policy iteration) . . . . . . . . 39

7.13 Transitional dynamics (OGM) . . . . . . . . . . . . . . . . . . . 43

7.14 Simulated OGM . . . . . . . . . . . . . . . . . . . . . . . . . . 46


List of Tables

7.1 Value function approximation . . . . . . . . . . . . . . . . . . . 23


(7.1) may now be rewritten as

V(x_t) = max_{y_t ∈ D(x_t), {y_{t+s} ∈ D(x_{t+s})}_{s=1}^∞} u(y_t, x_t) + Σ_{s=1}^∞ β^s u(y_{t+s}, x_{t+s})    (7.2)

Making the change of variable k = s − 1, (7.2) rewrites as

V(x_t) = max_{y_t ∈ D(x_t), {y_{t+1+k} ∈ D(x_{t+1+k})}_{k=0}^∞} u(y_t, x_t) + Σ_{k=0}^∞ β^{k+1} u(y_{t+1+k}, x_{t+1+k})

or

V(x_t) = max_{y_t ∈ D(x_t)} u(y_t, x_t) + β max_{{y_{t+1+k} ∈ D(x_{t+1+k})}_{k=0}^∞} Σ_{k=0}^∞ β^k u(y_{t+1+k}, x_{t+1+k})    (7.3)

Note that, by definition, we have

V(x_{t+1}) = max_{{y_{t+1+k} ∈ D(x_{t+1+k})}_{k=0}^∞} Σ_{k=0}^∞ β^k u(y_{t+1+k}, x_{t+1+k})

such that (7.3) rewrites as

V(x_t) = max_{y_t ∈ D(x_t)} u(y_t, x_t) + βV(x_{t+1})    (7.4)

This is the so-called Bellman equation that lies at the core of dynamic programming theory. With this equation are associated, in each and every period t, a set of optimal policy functions for y and x, which are defined by

{y_t, x_{t+1}} ∈ Argmax_{y ∈ D(x)} u(y, x) + βV(x_{t+1})    (7.5)

Our problem is now to solve (7.4) for the function V(x_t). This problem is particularly complicated, as we are not solving for just a point that would satisfy the equation; we are interested in finding a function that satisfies the equation. A simple procedure to find a solution would be the following:

1. Make an initial guess on the form of the value function V_0(x_t).

2. Update the guess using the Bellman equation, such that

V_{i+1}(x_t) = max_{y_t ∈ D(x_t)} u(y_t, x_t) + βV_i(h(y_t, x_t))

3. If V_{i+1}(x_t) = V_i(x_t), then a fixed point has been found and the problem is solved; if not, go back to 2 and iterate on the process until convergence.

In other words, solving the Bellman equation just amounts to finding the fixed point of the Bellman equation or, introducing an operator notation, finding the fixed point of the operator T, such that

V_{i+1} = T V_i

where T stands for the list of operations involved in the computation of the Bellman equation. The problem is then that of the existence and the uniqueness of this fixed point. Luckily, mathematicians have provided conditions for the existence and uniqueness of a solution.
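The iterative procedure just described can be sketched in a few lines. The following is a minimal Python illustration (the codes in this chapter are otherwise in Matlab) on a toy discrete problem; the payoff function u, the grid size N and the discount factor are purely illustrative assumptions:

```python
# Fixed-point iteration on a Bellman operator T for a toy problem:
# states x in {0,...,N-1}, controls y = next state, payoff u(y, x),
# so (T V)(x) = max_y [ u(y, x) + beta * V(y) ].

beta = 0.95
N = 11  # number of grid points (illustrative)

def u(y, x):
    # hypothetical concave payoff of moving from state x to state y
    return -((y - 0.7 * x) ** 2)

def T(V):
    """One application of the Bellman operator on the vector V."""
    return [max(u(y, x) + beta * V[y] for y in range(N)) for x in range(N)]

def solve(eps=1e-10):
    V = [0.0] * N            # initial guess V0
    while True:
        TV = T(V)
        if max(abs(a - b) for a, b in zip(TV, V)) < eps:
            return TV        # (approximate) fixed point V = T V
        V = TV

V = solve()
# at the returned V, applying T once more changes nothing (up to eps)
```

Because T is a contraction (as shown below), the iteration converges from any initial guess, here in a few hundred steps.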

7.1.2 Existence and uniqueness of a solution

Definition 1 A metric space is a set S, together with a metric ρ : S × S −→ R_+, such that for all x, y, z ∈ S:

1. ρ(x, y) ≥ 0, with ρ(x, y) = 0 if and only if x = y,

2. ρ(x, y) = ρ(y, x),

3. ρ(x, z) ≤ ρ(x, y) + ρ(y, z).
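The relevant example in what follows is the uniform (sup) metric on bounded functions, ρ(V, W) = sup_x |V(x) − W(x)|. As a quick numerical sanity check of the three axioms — on an illustrative grid of sample points, not a proof — one can run:

```python
# The uniform (sup) metric rho(f, g) = sup_x |f(x) - g(x)|, approximated
# on a finite sample grid over [0, 1] (the grid is an assumption made
# purely for illustration).

grid = [i / 100.0 for i in range(101)]

def rho(f, g):
    """Sup metric between two functions, approximated on the grid."""
    return max(abs(f(t) - g(t)) for t in grid)

f = lambda t: t ** 2
g = lambda t: 1.0 - t
h = lambda t: 0.5 * t

assert rho(f, f) == 0.0                      # rho(x, x) = 0
assert rho(f, g) == rho(g, f)                # symmetry
assert rho(f, h) <= rho(f, g) + rho(g, h)    # triangle inequality
```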

Definition 2 A sequence {x_n}_{n=0}^∞ in S converges to x ∈ S if, for each ε > 0, there exists an integer N_ε such that

ρ(x_n, x) < ε for all n ≥ N_ε

Definition 3 A sequence {x_n}_{n=0}^∞ in S is a Cauchy sequence if, for each ε > 0, there exists an integer N_ε such that

ρ(x_n, x_m) < ε for all n, m ≥ N_ε

Definition 4 A metric space (S, ρ) is complete if every Cauchy sequence in S converges to a point in S.

Definition 5 Let (S, ρ) be a metric space and T : S −→ S be a function mapping S into itself. T is a contraction mapping (with modulus β) if for β ∈ (0, 1),

ρ(T x, T y) ≤ βρ(x, y), for all x, y ∈ S.

We then have the following remarkable theorem that establishes the existence and uniqueness of the fixed point of a contraction mapping.

Theorem 1 (Contraction Mapping Theorem) If (S, ρ) is a complete metric space and T : S −→ S is a contraction mapping with modulus β ∈ (0, 1), then

1. T has exactly one fixed point V ∈ S such that V = T V,

2. for any V_0 ∈ S, ρ(T^n V_0, V) ≤ β^n ρ(V_0, V), with n = 0, 1, 2, . . .

Since we are endowed with all the tools we need to prove the theorem, we shall do it.

Proof: In order to prove 1., we shall first prove that if we select any sequence {V_n}_{n=0}^∞ such that for each n, V_n ∈ S and V_{n+1} = T V_n, this sequence converges, and that it converges to V ∈ S. In order to show convergence of {V_n}_{n=0}^∞, we shall prove that {V_n}_{n=0}^∞ is a Cauchy sequence. First of all, note that the contraction property of T implies that

ρ(V_2, V_1) = ρ(T V_1, T V_0) ≤ βρ(V_1, V_0)


and therefore

ρ(V_{n+1}, V_n) = ρ(T V_n, T V_{n−1}) ≤ βρ(V_n, V_{n−1}) ≤ . . . ≤ β^n ρ(V_1, V_0)

Now consider two terms of the sequence, V_m and V_n, m > n. The triangle inequality implies that

ρ(V_m, V_n) ≤ ρ(V_m, V_{m−1}) + ρ(V_{m−1}, V_{m−2}) + . . . + ρ(V_{n+1}, V_n)

therefore, making use of the previous result, we have

ρ(V_m, V_n) ≤ (β^{m−1} + β^{m−2} + . . . + β^n) ρ(V_1, V_0) ≤ (β^n / (1 − β)) ρ(V_1, V_0)

Since β ∈ (0, 1), β^n → 0 as n → ∞, such that for each ε > 0 there exists N_ε ∈ N such that ρ(V_m, V_n) < ε. Hence {V_n}_{n=0}^∞ is a Cauchy sequence and it therefore converges. Further, since we have assumed that S is complete, V_n converges to V ∈ S.

We now have to show that V = T V in order to complete the proof of the first part. Note that, for each ε > 0, and for V_0 ∈ S, the triangle inequality implies

ρ(V, T V) ≤ ρ(V, V_n) + ρ(V_n, T V)

But since {V_n}_{n=0}^∞ is a Cauchy sequence, we have

ρ(V, T V) ≤ ρ(V, V_n) + ρ(V_n, T V) ≤ ε/2 + ε/2

for large enough n, therefore V = T V. Hence, we have proven that T possesses a fixed point and therefore have established its existence.

We now have to prove uniqueness. This can be obtained by contradiction. Suppose there exists another function, say W ∈ S, that satisfies W = T W. Then the definition of the fixed point implies

ρ(V, W) = ρ(T V, T W)

but the contraction property implies

ρ(V, W) = ρ(T V, T W) ≤ βρ(V, W)

which, as β ∈ (0, 1), implies ρ(V, W) = 0 and so V = W. The limit is then unique.

Proving 2. is straightforward, as

ρ(T^n V_0, V) = ρ(T^n V_0, T V) ≤ βρ(T^{n−1} V_0, V) ≤ β² ρ(T^{n−2} V_0, V) ≤ . . . ≤ β^n ρ(V_0, V)

since ρ(T^{n−1} V_0, V) = ρ(T^{n−1} V_0, T V), which completes the proof.
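The geometric convergence rate established in part 2 is easy to check numerically. A small Python sketch, using an illustrative affine contraction on R² (with the sup metric) whose fixed point is known in closed form:

```python
# Iterating a contraction T with modulus beta from any V0 satisfies
# rho(T^n V0, V) <= beta^n * rho(V0, V).  T below is an illustrative
# affine contraction on R^2; its fixed point solves v = beta*v + c.

beta = 0.8

def T(v):
    # each component is pulled toward its own fixed point at rate beta
    return (beta * v[0] + 1.0, beta * v[1] - 2.0)

def rho(a, b):
    # sup metric on R^2
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

V = (1.0 / (1 - beta), -2.0 / (1 - beta))  # exact fixed point

v = (10.0, -7.0)   # arbitrary initial condition V0
d0 = rho(v, V)
for n in range(1, 30):
    v = T(v)
    assert rho(v, V) <= beta ** n * d0 + 1e-12  # geometric bound holds
```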

This theorem is of great importance as it establishes that any operator that possesses the contraction property will exhibit a unique fixed point, which therefore provides some rationale to the algorithm we were designing in the previous section. It also insures that, whatever the initial condition for this algorithm, if the value function satisfies a contraction property, simple iterations will deliver the solution.

It therefore remains to provide conditions for the value function to be a contraction. These are provided by the following theorem.

Theorem 2 (Blackwell’s Sufficiency Conditions) Let X ⊆ R and B(X) be the space of bounded functions V : X −→ R with the uniform metric. Let T : B(X) −→ B(X) be an operator satisfying

1. (Monotonicity) Let V, W ∈ B(X); if V(x) ≤ W(x) for all x ∈ X, then T V(x) ≤ T W(x).

2. (Discounting) There exists some constant β ∈ (0, 1) such that for all V ∈ B(X) and a ≥ 0, we have T(V + a) ≤ T V + βa.

Then T is a contraction with modulus β.

Proof: Let us consider two functions V, W ∈ B(X) satisfying 1. and 2., and such that V ≤ W + ρ(V, W). Monotonicity first implies that T V ≤ T(W + ρ(V, W)), and discounting that T V ≤ T W + βρ(V, W), since ρ(V, W) ≥ 0 plays the same role as a. We therefore get

T V − T W ≤ βρ(V, W)

Likewise, if we now consider that W ≤ V + ρ(V, W), we end up with

T W − T V ≤ βρ(V, W)

Consequently, we have |T V − T W| ≤ βρ(V, W), so that ρ(T V, T W) ≤ βρ(V, W), which defines a contraction. This completes the proof.
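Both conditions can be checked numerically for a discretized Bellman-type operator. The Python sketch below uses an illustrative payoff function and grid; it is a sanity check of the two conditions, not a proof:

```python
# Blackwell's conditions checked numerically for a discrete operator
# (T V)(x) = max_y [ u(y, x) + beta * V(y) ] on a small grid.
# The payoff u, the grid size N and beta are illustrative assumptions.

beta = 0.9
N = 8

def u(y, x):
    return -abs(y - x) - 0.1 * y  # hypothetical payoff

def T(V):
    return [max(u(y, x) + beta * V[y] for y in range(N)) for x in range(N)]

V = [float(i) for i in range(N)]
W = [float(i) + 0.5 for i in range(N)]     # V <= W pointwise

# 1. Monotonicity: V <= W implies T V <= T W
assert all(tv <= tw for tv, tw in zip(T(V), T(W)))

# 2. Discounting: for a constant a >= 0, T(V + a) = T V + beta * a here
a = 2.0
Va = [v + a for v in V]
assert all(abs(t1 - (t0 + beta * a)) < 1e-12
           for t1, t0 in zip(T(Va), T(V)))
```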

This theorem is extremely useful as it gives us simple tools to check whether a problem is a contraction, and it therefore permits to check whether the simple algorithm we defined is appropriate for the problem at hand. As an example, let us consider the optimal growth model, for which the Bellman equation writes

V(k_t) = max_{c_t ∈ C} u(c_t) + βV(k_{t+1})

with k_{t+1} = F(k_t) − c_t. In order to save on notations, let us drop the time subscript and denote the next period capital stock by k′, such that, plugging the law of motion of capital in the utility function, the Bellman equation rewrites

V(k) = max_{k′ ∈ K} u(F(k) − k′) + βV(k′)

Let us now define the operator T as

(T V)(k) = max_{k′ ∈ K} u(F(k) − k′) + βV(k′)

We would like to know if T is a contraction and therefore if there exists a unique function V such that V(k) = (T V)(k). In order to achieve this task, we just have to check whether T is monotonic and satisfies the discounting property.

1. Monotonicity: Let us consider two candidate value functions, V and W, such that V(k) ≤ W(k) for all k ∈ K. What we want to show is that (T V)(k) ≤ (T W)(k). In order to do that, let us denote by k̃′ the optimal next period capital stock, that is

(T V)(k) = u(F(k) − k̃′) + βV(k̃′)

But now, since V(k) ≤ W(k) for all k ∈ K, we have V(k̃′) ≤ W(k̃′), such that it should be clear that

(T V)(k) ≤ u(F(k) − k̃′) + βW(k̃′) ≤ max_{k′ ∈ K} u(F(k) − k′) + βW(k′) = (T W)(k)

Hence we have shown that V(k) ≤ W(k) implies (T V)(k) ≤ (T W)(k), and therefore established monotonicity.

2. Discounting: Let us consider a candidate value function, V, and a positive constant a.

(T(V + a))(k) = max_{k′ ∈ K} u(F(k) − k′) + β(V(k′) + a)
             = max_{k′ ∈ K} u(F(k) − k′) + βV(k′) + βa
             = (T V)(k) + βa

Therefore, the Bellman equation satisfies discounting in the case of the optimal growth model. Hence, the optimal growth model satisfies Blackwell’s sufficient conditions for a contraction mapping, and therefore the value function exists and is unique. We are now in position to design a numerical algorithm to solve the Bellman equation.

7.2 Deterministic dynamic programming

7.2.1 Value function iteration

The contraction mapping theorem gives us a straightforward way to compute the solution to the Bellman equation: iterate on the operator T, such that V_{i+1} = T V_i, up to the point where the distance between two successive value functions is small enough. Basically this amounts to applying the following algorithm:

1. Decide on a grid, X, of admissible values for the state variable x,

X = {x_1, . . . , x_N}

formulate an initial guess for the value function V_0(x) and choose a stopping criterion ε > 0.

2. For each x_ℓ ∈ X, ℓ = 1, . . . , N, compute

V_{i+1}(x_ℓ) = max_{x′ ∈ X} u(y(x_ℓ, x′), x_ℓ) + βV_i(x′)

3. If |V_{i+1}(x) − V_i(x)| < ε, go to the next step, else go back to 2.

4. Compute the final solution as

y*(x) = y(x, x′) and V*(x) = u(y*(x), x) / (1 − β)

In order to better understand the algorithm, let us consider a simple example and go back to the optimal growth model, with

u(c) = (c^{1−σ} − 1) / (1 − σ)

and

k′ = k^α − c + (1 − δ)k

Then the Bellman equation writes

V(k) = max_{0 ≤ c ≤ k^α + (1−δ)k} (c^{1−σ} − 1)/(1 − σ) + βV(k′)

From the law of motion of capital we can determine consumption as

c = k^α + (1 − δ)k − k′

such that, plugging this result in the Bellman equation, we have

V(k) = max_{0 ≤ k′ ≤ k^α + (1−δ)k} ((k^α + (1 − δ)k − k′)^{1−σ} − 1)/(1 − σ) + βV(k′)

Now, let us define a grid of N feasible values for k such that we have

K = {k_1, . . . , k_N}

and an initial value function V_0(k) — that is, a vector of N numbers that relate each k_ℓ to a value. Note that this may be anything we want, as we know — by the contraction mapping theorem — that the algorithm will converge; but if we want it to converge fast enough, it may be a good idea to impose a good initial guess. Finally, we need a stopping criterion.

Then, for each ℓ = 1, . . . , N, we compute the feasible values that can be taken by the quantity on the right hand side of the value function,

V_{ℓ,h} ≡ ((k_ℓ^α + (1 − δ)k_ℓ − k_h)^{1−σ} − 1)/(1 − σ) + βV(k_h) for h feasible

It is important to understand what “h feasible” means. Indeed, we only compute consumption when it is positive and smaller than total output, which restricts the number of possible values for k′. Namely, we want k′ to satisfy

0 ≤ k′ ≤ k_ℓ^α + (1 − δ)k_ℓ

which puts a lower and an upper bound on the index h. When the grid of values is uniform — that is, when k_h = k_min + (h − 1)d_k, where d_k is the increment in the grid — the upper bound can be computed as

h̄ = E[(k_ℓ^α + (1 − δ)k_ℓ − k_min)/d_k] + 1

Then we find V_ℓ* = max_{h=1,...,h̄} V_{ℓ,h} and set V_{i+1}(k_ℓ) = V_ℓ*, keeping in memory the index h_ℓ* = Argmax_{h=1,...,h̄} V_{ℓ,h}, such that we have k′(k_ℓ) = k_{h_ℓ*}.

In figures 7.1 and 7.2, we report the value function and the decision rules obtained from the deterministic optimal growth model with α = 0.3, β = 0.95,

δ = 0.1 and σ = 1.5. The grid for the capital stock is composed of 1000 data points ranging from (1 − ∆_k)k* to (1 + ∆_k)k*, where k* denotes the steady state and ∆_k = 0.9. The algorithm then converges in 211 iterations and 110.5 seconds on a 800Mhz computer when the stopping criterion is ε = 1e−6. (This code and those that follow are not efficient from a computational point of view, as they are intended to have you understand the method without adding any coding complications. Much faster implementations can be found in the accompanying matlab codes.)

[Figure 7.1: Deterministic OGM (Value function, Value iteration)]

Matlab Code: Value Function Iteration

sigma = 1.50;                % utility parameter
delta = 0.10;                % depreciation rate
beta  = 0.95;                % discount factor
alpha = 0.30;                % capital elasticity of output
nbk   = 1000;                % number of data points in the grid
crit  = 1;                   % convergence criterion
epsi  = 1e-6;                % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev   = 0.9;                 % maximal deviation from steady state
kmin  = (1-dev)*ks;          % lower bound on the grid
kmax  = (1+dev)*ks;          % upper bound on the grid

[Figure 7.2: Deterministic OGM (Decision rules, Value iteration)]

dk    = (kmax-kmin)/(nbk-1);      % implied increment
kgrid = linspace(kmin,kmax,nbk)'; % builds the grid
v     = zeros(nbk,1);             % value function
dr    = zeros(nbk,1);             % decision rule (will contain indices)

while crit>epsi;
   for i=1:nbk
      %
      % compute indexes for which consumption is positive
      %
      tmp  = (kgrid(i)^alpha+(1-delta)*kgrid(i)-kmin);
      imax = min(floor(tmp/dk)+1,nbk);
      %
      % consumption and utility
      %
      c    = kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid(1:imax);
      util = (c.^(1-sigma)-1)/(1-sigma);
      %
      % find value function
      %
      [tv(i),dr(i)] = max(util+beta*v(1:imax));
   end;
   crit = max(abs(tv-v));         % Compute convergence criterion
   v    = tv;                     % Update the value function
end
%
% Final solution
%
kp   = kgrid(dr);
c    = kgrid.^alpha+(1-delta)*kgrid-kp;
util = (c.^(1-sigma)-1)/(1-sigma);
v    = util/(1-beta);

7.2.2 Taking advantage of interpolation

A possible improvement of the method is to have a much looser grid on the capital stock but a pretty fine grid on the control variable (consumption in the optimal growth model). Then the next period value of the state variable can be computed much more precisely. However, because of this precision and the fact that the grid is rougher, it may be the case that the computed optimal value for the next period state variable does not lie in the grid, such that the value function is unknown at this particular value. Therefore, we use an interpolation scheme to get an approximation of the value function at this value. One advantage of this approach is that it involves fewer function evaluations and is usually less costly in terms of CPU time. The algorithm is then as follows:

1. Decide on a grid, X, of admissible values for the state variable x, X = {x_1, . . . , x_N}.

2. Decide on a grid, Y, of admissible values for the control variable y, Y = {y_1, . . . , y_M} with M ≫ N; formulate an initial guess for the value function V_0(x) and choose a stopping criterion ε > 0.

3. For each x_ℓ ∈ X, ℓ = 1, . . . , N, compute x′_{ℓ,j} = h(y_j, x_ℓ) ∀j = 1, . . . , M; compute an interpolated value function at each x′_{ℓ,j}, Ṽ_i(x′_{ℓ,j}); then

V_{i+1}(x_ℓ) = max_{y ∈ Y} u(y, x_ℓ) + βṼ_i(x′_{ℓ,j})

4. If |V_{i+1}(x) − V_i(x)| < ε, go to the next step, else go back to 3.

5. Compute the final solution as

V*(x) = u(y*(x), x) / (1 − β)

I now report the matlab code for this approach when we use cubic spline interpolation for the value function, 20 nodes for the capital stock and 1000 nodes for consumption. The algorithm converges in 182 iterations and 40.6 seconds starting from the initial condition for the value function

v_0(k) = (((c*/y*) k^α)^{1−σ} − 1)/(1 − σ)

Matlab Code: Value Function Iteration with Interpolation

sigma = 1.50;                % utility parameter
delta = 0.10;                % depreciation rate
beta  = 0.95;                % discount factor
alpha = 0.30;                % capital elasticity of output
nbk   = 20;                  % number of data points in the K grid
nbc   = 1000;                % number of data points in the C grid
crit  = 1;                   % convergence criterion
epsi  = 1e-6;                % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev   = 0.9;                 % maximal deviation from steady state
kmin  = (1-dev)*ks;          % lower bound on the grid
kmax  = (1+dev)*ks;          % upper bound on the grid
kgrid = linspace(kmin,kmax,nbk)';  % builds the K grid
cmin  = 0.01;                % lower bound on the C grid
cmax  = kmax^alpha;          % upper bound on the C grid
c     = linspace(cmin,cmax,nbc)';  % builds the C grid
util  = (c.^(1-sigma)-1)/(1-sigma);
v     = zeros(nbk,1);        % value function
dr    = zeros(nbk,1);        % decision rule (will contain indices)

while crit>epsi;
   for i=1:nbk;
      kp = kgrid(i)^alpha+(1-delta)*kgrid(i)-c;
      vi = interp1(kgrid,v,kp,'spline');
      [Tv(i),dr(i)] = max(util+beta*vi);
   end;
   crit = max(abs(Tv-v));    % Compute convergence criterion
   v    = Tv;                % Update the value function
end

[Figure 7.3: Deterministic OGM (Value function, Value iteration with interpolation)]

[Figure 7.4: Deterministic OGM (Decision rules, Value iteration with interpolation)]

7.2.3 Policy iterations: Howard Improvement

The simple value iteration algorithm has the attractive feature of being particularly simple to implement. However, it is a slow procedure, especially for infinite horizon problems, since it can be shown that this procedure converges at the rate β, which is usually close to 1! Further, it computes unnecessary quantities during the algorithm, which slows down convergence. Often, computation speed is really important, for instance when one wants to perform a sensitivity analysis of the results obtained in a model using different parameterizations. Hence, we would like to be able to speed up convergence. This can be achieved relying on Howard's improvement method, which actually iterates on policy functions rather than iterating on the value function. The algorithm may be described as follows:

1. Set an initial feasible decision rule for the control variable, y = f_0(x), and compute the value associated to this guess, assuming that this rule is operative forever:

V(x_t) = Σ_{s=0}^∞ β^s u(f_i(x_{t+s}), x_{t+s})

taking care of the fact that x_{t+1} = h(x_t, y_t) = h(x_t, f_i(x_t)), with i = 0. Set a stopping criterion ε > 0.

2. Find a new policy rule y = f_{i+1}(x) such that

f_{i+1}(x) ∈ Argmax_y u(y, x) + βV(x′)

with x′ = h(x, f_i(x)).

3. Check if |f_{i+1}(x) − f_i(x)| < ε; if yes then stop, else go back to 2.

Note that this method differs fundamentally from the value iteration algorithm in at least two dimensions:

(i) one iterates on the policy function rather than on the value function;

(ii) the decision rule is used forever, whereas it is assumed that it is used for only two consecutive periods in the value iteration algorithm. It is precisely this last feature that accelerates convergence.

Note that when computing the value function we actually have to solve a linear system of the form

V_{i+1}(x_ℓ) = u(f_{i+1}(x_ℓ), x_ℓ) + βQV_{i+1}(x′_ℓ) ∀x_ℓ ∈ X

for V_{i+1}(x_ℓ), where Q is an (N × N) matrix with

Q_{ℓj} = 1 if x′_j ≡ h(f_{i+1}(x_ℓ), x_ℓ), 0 otherwise.

Note that although it is a big matrix, Q is sparse, which can be exploited in solving the system, to get

V_{i+1}(x) = (I − βQ)^{−1} u(f_{i+1}(x), x)

We apply this algorithm to the same optimal growth model as in the previous section and report the value function and the decision rules at convergence in figures 7.5 and 7.6. The algorithm converges in only 18 iterations and 9.8 seconds, starting from the same initial guess and using the same parameterization!
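The policy evaluation step V = (I − βQ)^{−1} u can be illustrated on a small discrete problem. Below is a pure-Python sketch (the chapter's own implementation is in Matlab) with a hypothetical payoff function and grid size:

```python
# Policy iteration (Howard improvement) on a small discrete problem.
# Policy evaluation solves the linear system V = u_f + beta * Q V,
# i.e. V = (I - beta*Q)^{-1} u_f, where Q(x, x') = 1 when the policy
# moves the state from x to x'.  Payoff u and N are illustrative.

beta = 0.95
N = 6

def u(y, x):
    return -((y - 0.5 * x) ** 2) + x  # hypothetical payoff

def solve_linear(A, b):
    """Gauss-Jordan elimination for the (small) evaluation system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

f = [0] * N  # initial feasible decision rule: always move to state 0
while True:
    # policy evaluation: V = (I - beta*Q)^{-1} u_f
    A = [[(1.0 if i == j else 0.0) - beta * (1.0 if f[i] == j else 0.0)
          for j in range(N)] for i in range(N)]
    b = [u(f[i], i) for i in range(N)]
    V = solve_linear(A, b)
    # policy improvement
    f_new = [max(range(N), key=lambda y: u(y, x) + beta * V[y])
             for x in range(N)]
    if f_new == f:
        break
    f = f_new
# at convergence, V satisfies the Bellman equation under the policy f
```

As in the text, convergence typically takes only a handful of policy updates, at the cost of one linear solve per iteration.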

Matlab Code: Policy Iteration

sigma = 1.50;                % utility parameter
delta = 0.10;                % depreciation rate
beta  = 0.95;                % discount factor
alpha = 0.30;                % capital elasticity of output
nbk   = 1000;                % number of data points in the grid
crit  = 1;                   % convergence criterion
epsi  = 1e-6;                % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev   = 0.9;                 % maximal deviation from steady state
kmin  = (1-dev)*ks;          % lower bound on the grid
kmax  = (1+dev)*ks;          % upper bound on the grid
devk  = (kmax-kmin)/(nbk-1);
kgrid = linspace(kmin,kmax,nbk)'; % builds the grid
v     = zeros(nbk,1);        % value function
kp0   = kgrid;               % initial guess on k(t+1)
dr    = zeros(nbk,1);        % decision rule (will contain indices)
%
% Main loop
%
while crit>epsi;
   for i=1:nbk
      %
      % compute indexes for which consumption is positive
      %
      imax = min(floor((kgrid(i)^alpha+(1-delta)*kgrid(i)-kmin)/devk)+1,nbk);
      %
      % consumption and utility
      %
      c    = kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid(1:imax);
      util = (c.^(1-sigma)-1)/(1-sigma);
      %
      % find new policy rule
      %
      [v1,dr(i)] = max(util+beta*v(1:imax));
   end;
   %
   % decision rules
   %
   kp = kgrid(dr);
   c  = kgrid.^alpha+(1-delta)*kgrid-kp;
   %
   % update the value
   %
   util = (c.^(1-sigma)-1)/(1-sigma);
   Q    = sparse(nbk,nbk);
   for i=1:nbk;
      Q(i,dr(i)) = 1;
   end
   Tv   = (speye(nbk)-beta*Q)\util;
   crit = max(abs(kp-kp0));
   v    = Tv;
   kp0  = kp;
end

[Figure 7.5: Deterministic OGM (Value function, Policy iteration)]

As experimented in the particular example of the optimal growth model, the policy iteration algorithm only requires a few iterations. Unfortunately, we have to solve the linear system

(I − βQ)V_{i+1} = u(y, x)

which may be particularly costly when the number of grid points is important. Therefore, a number of researchers have proposed to replace the matrix inversion by an additional iteration step, leading to the so-called modified policy iteration with k steps, which replaces the linear problem by the following iteration scheme:

1. Set J_0 = V_i

[Figure 7.6: Deterministic OGM (Decision rules, Policy iteration)]

2. Iterate k times on

J_{i+1} = u(y, x) + βQJ_i, i = 0, . . . , k

3. Set V_{i+1} = J_k.

When k −→ ∞, J_k tends toward the solution of the linear system.

7.2.4 Parametric dynamic programming

This last technic I will describe borrows from approximation theory, using either orthogonal polynomials or spline functions. The idea is actually to make a guess for the functional form of the value function and iterate on the parameters of this functional form. The algorithm then works as follows:

1. Choose a functional form for the value function V̂(x, Θ), a grid of interpolating nodes X = {x_1, . . . , x_N}, a stopping criterion ε > 0 and an initial vector of parameters Θ_0.

2. Using the conjecture for the value function, perform the maximization step in the Bellman equation, that is, compute w_ℓ = T(V̂(x_ℓ, Θ_i)):

w_ℓ = T(V̂(x_ℓ, Θ_i)) = max_y u(y, x_ℓ) + βV̂(x′_ℓ, Θ_i)

subject to x′_ℓ = h(y, x_ℓ), for ℓ = 1, . . . , N.

3. Using the approximation method you have chosen, compute a new vector of parameters Θ_{i+1} such that V̂(x_ℓ, Θ_{i+1}) approximates the data (x_ℓ, w_ℓ).

4. If |V̂(x, Θ_{i+1}) − V̂(x, Θ_i)| < ε then stop, else go back to 2.

First note that for this method to be implementable, we need the payoff function and the value function to be continuous. The approximation function may be either a combination of polynomials, neural networks, or splines. Note that during the maximization step we may have to rely on a numerical maximization algorithm, and the approximation method may involve numerical minimization in order to solve a non-linear least-squares problem of the form:

Θ_{i+1} ∈ Argmin_Θ Σ_{ℓ=1}^N (w_ℓ − V̂(x_ℓ, Θ))²

This algorithm is usually much faster than value iteration, as it may not require iterating on a large grid. As an example, I will once again focus on the optimal growth problem we have been dealing with so far, and I will approximate the value function by

V̂(k, Θ) = Σ_{i=0}^p θ_i φ_i(2 (k − k_min)/(k_max − k_min) − 1)

where {φ_i(.)}_{i=0}^p is a set of Chebychev polynomials. In the example, I set p = 10 and used 20 nodes. Figures 7.7 and 7.8 report the decision rule and the value function in this case, and table 7.1 reports the parameters of the approximation function. The algorithm converged in 242 iterations, but took much less time than value iterations.
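Step 3 — fitting the parameters by least squares on Chebychev polynomials — can be sketched in Python as follows, here fitting an illustrative target function (e^x) rather than actual value-function data:

```python
# Least-squares fit with Chebychev polynomials: given data (x_l, w_l),
# find theta minimizing sum_l (w_l - sum_i theta_i T_i(x_l))^2.
# Pure-Python normal-equations sketch on [-1, 1]; the target exp(x)
# and the degree p are illustrative assumptions.

import math

def cheb(x, n):
    """Chebychev polynomials T_0..T_n at x, via the recurrence
    T_i(x) = 2x T_{i-1}(x) - T_{i-2}(x)."""
    T = [1.0, x]
    for i in range(2, n + 1):
        T.append(2.0 * x * T[i - 1] - T[i - 2])
    return T[:n + 1]

p = 5
# Chebychev nodes, as in the text's choice of interpolating nodes
nodes = [-math.cos((2 * l - 1) * math.pi / (2 * 20)) for l in range(1, 21)]
w = [math.exp(x) for x in nodes]   # data to fit (illustrative)

# normal equations A theta = b with A_ij = sum_l T_i(x_l) T_j(x_l)
Phi = [cheb(x, p) for x in nodes]
A = [[sum(r[i] * r[j] for r in Phi) for j in range(p + 1)]
     for i in range(p + 1)]
b = [sum(r[i] * wl for r, wl in zip(Phi, w)) for i in range(p + 1)]

# solve the normal equations by Gauss-Jordan elimination
n = p + 1
M = [A[i][:] + [b[i]] for i in range(n)]
for col in range(n):
    piv = max(range(col, n), key=lambda r: abs(M[r][col]))
    M[col], M[piv] = M[piv], M[col]
    for r in range(n):
        if r != col:
            fct = M[r][col] / M[col][col]
            M[r] = [a - fct * c for a, c in zip(M[r], M[col])]
theta = [M[i][n] / M[i][i] for i in range(n)]

approx = lambda x: sum(t * T for t, T in zip(theta, cheb(x, p)))
# the degree-5 fit tracks exp(x) closely over [-1, 1]
```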

Figure 7.5 2 2.5 Consumption k 1 2 t k 3 t 4 5 22 .5 0 0 Next period capital stock 2 1.5 5 Figure 7.7: Deterministic OGM (Value function.5 kt 3 3.5 1 1. Parametric DP) 4 3 2 1 0 −1 −2 −3 −4 0 0.8: Deterministic OGM (Decision rules. Parametric DP) 5 4 3 1 2 1 0 0 1 2 3 4 5 0.5 4 4.

Table 7.78042 -0.66012 0.23704 -0.00501 -0.82367 2.00281 23 .05148 -0.01126 -0.1: Value function approximation θ0 θ1 θ2 θ3 θ4 θ5 θ6 θ7 θ8 θ9 θ10 0.00617 0.02601 0.10281 0.

Matlab Code: Parametric Dynamic Programming

sigma = 1.50;                % utility parameter
delta = 0.10;                % depreciation rate
beta  = 0.95;                % discount factor
alpha = 0.30;                % capital elasticity of output
nbk   = 20;                  % # of data points in the grid
p     = 10;                  % order of polynomials
crit  = 1;                   % convergence criterion
iter  = 1;                   % iteration
epsi  = 1e-6;                % convergence parameter
ks    = ((1-beta*(1-delta))/(alpha*beta))^(1/(alpha-1));
dev   = 0.9;                 % maximal dev. from steady state
kmin  = (1-dev)*ks;          % lower bound on the grid
kmax  = (1+dev)*ks;          % upper bound on the grid
rk    = -cos((2*[1:nbk]'-1)*pi/(2*nbk)); % Interpolating nodes
kgrid = kmin+(rk+1)*(kmax-kmin)/2;       % mapping
%
% Initial guess for the approximation
%
v   = (((kgrid.^alpha).^(1-sigma)-1)/((1-sigma)*(1-beta)));
X   = chebychev(rk,p);
th0 = X\v;
Tv  = zeros(nbk,1);
kp  = zeros(nbk,1);
%
% Main loop
%
options=foptions;
options(14)=1e9;
while crit>epsi;
   k0 = kgrid(1);
   for i=1:nbk
      param = [alpha beta delta sigma kmin kmax p kgrid(i)];
      kp(i) = fminu('tv',k0,options,[],param,th0);
      k0    = kp(i);
      Tv(i) = -tv(kp(i),param,th0);
   end
   theta = X\Tv;
   crit  = max(abs(Tv-v));
   v     = Tv;
   th0   = theta;
   iter  = iter+1;
end

^(1-sigma)-1)/(1-sigma).i-2)]. This is why Judd [1998] 25 . Tx=[Tx 2*X.^alpha+(1-delta)*k-kp.n)*theta. kmin = param(5).error(’n should be a positive integer’).theta). X=x(:).i-1)-Tx(:. c(d) = NaN. n = param(3). kmax = param(2).1).1) X].theta). beta = param(2). k = 2*(k-kmin)/(kmax-kmin)-1. v = chebychev(k.Matlab Code: Extra function res=tv(kp. lx=size(X. for i=3:n+1. Tx=[ones(lx. end end A potential problem with the use of Chebychev polynomials is that they do not put any assumption on the shape of the value function that we know to be concave and strictly increasing in this case. kmax = param(6). case 1. c = k. alpha = param(1). otherwise Tx=[ones(lx. v = value(kp. if n<0. sigma = param(4). Functions % % % % % % % % insures positivity of k’ computes the value function computes consumption find negative consumption get rid off negative c computes utility utility = low number for c<0 compute -TV (we are minimizing) function Tx=chebychev(x.n). Tx=[ones(lx. k = param(8).theta).param. res = -(util+beta*v). case 0.1)].[kmin kmax n].^2). d = find(c<=0). kp = sqrt(kp. kmin = param(1). function v = value(k.param. util(d) = -1e12. delta = param(3).*Tx(:.end switch n. n = param(7). util = (c.1) X].

Judd and Solnick [1994] have successfully applied this latter technic to the optimal growth model and found that the approximation was very good and dominated a lot of other methods (they actually get the same precision with 12 nodes as the one achieved with a 1200 data points grid using a value iteration technic).1 The Bellman equation The stochastic problem diﬀers from the deterministic problem in that we now have a . 7. st )+Et β τ u(yt+τ . we have to deal with stochastic shocks — just think of a standard RBC model — dynamic programming technics can obviously be extended to deal with such problems.6) and xt+1 = h(xt . I will ﬁrst show how we can obtain the Bellman equation before addressing some important issues concerning discretization of shocks. st+τ ) (7.st+τ )}∞ τ =0 max Et β τ u(yt+τ . xt+τ . whose +∞ sequence {st }t=0 satisﬁes st+1 = Φ(st . yt . stochastic shock! The problem then deﬁnes a value function which has as argument the state variable xt but also the stationary shock st .st+τ )}∞ } τ =1 (7. εt+1 ) ∞ τ =0 (7. st ) = {yt+τ ∈D (xt+τ . The value function is therefore given by V (xt . xt and st are either perfectly observed or decided in period t ∞ τ =1 max u(yt . xt+τ .3.6) where ε is a white noise process. st ) they are known in t. Then I will describe the implementation of the value iteration and policy iteration technics for the stochastic case.st ).3 Stochastic dynamic programming In a large number of problems.recommends to use shape–preserving methods such as Schumaker approximation in this case. . st ) = {yt ∈D (xt .8) Since both yt . . 7. st+τ ) 26 . xt . such that we may rewrite the value function as V (xt .{yt+τ ∈D (xt+τ .7) subject to (7.

. so that we get V (xt . st ) max βEt Et+1 ∞ k=0 + or V (xt . .st ) max u(yt . f (ωt+1 |ωt )dωt+τ . st+1+k ) yt ∈D (xt . this rewrites V (xt . X(ωt+τ )f (ωt+τ |ωt+τ −1 ) . st ) = yt ∈D (xt . xt+1+k . st+1+k ) It is important at this point to recall that Et (X(ωt+τ )) = = = X(ωt+τ )f (ωt+τ |ωt )dωt+τ X(ωt+τ )f (ωt+τ |ωt+τ −1 )f (ωt+τ −1 |ωt )dωt+τ dωt+τ −1 .st+1+k )}∞ k=0 β k u(yt+1+k .st ) max u(yt . xt+τ . st+1+k )f (st+1 |st )dst+1 Note that because each value for the shock deﬁnes a particular mathematical object the maximization of the integral corresponds to the integral of the maximization.or V (xt . xt . . xt+1+k . st ) ∞ k=0 +β {yt+1+k }k=0 max ∞ Et+1 β k u(yt+1+k . such that the value function rewrites V (xt ..st+1+k )}k=0 max βEt ∞ k=0 β k u(yt+1+k . st ) = {yt+1+k ∈D (xt+1+k . xt . st )+ {yt+τ ∈D (xt+τ .st ) max u(yt . xt . therefore the max operator and the summation are interchangeable. dωt+1 which as corollary the law of iterated projections. . xt+1+k . st ) = yt ∈D (xt . st+1+k )f (st+1 |st )dst+1 27 .st+1+k )}∞ k=0 β k u(yt+1+k . st+τ ) Using the change of variable k = τ − 1. st ) = yt ∈D (xt+τ . xt+1+k .st+τ ) max u(yt . st )+ ∞ {yt+1+k ∈D (xt+1+k . xt . xt .st ) max u(yt . st ) = yt ∈D (xt . st ) max β Et+1 ∞ k=0 + {yt+1+k ∈D (xt+1+k ..st+τ )}∞ τ =1 max Et ∞ τ =1 β τ u(yt+τ .

indexed by an integer variable t. The question we therefore face is: does there exist a discrete representation for s which is equivalent to its continuous original representation? The answer to this question is yes. st+1 ) max u(yt . st ) + βEt V (xt+1 . the Markov chain is in a state. xt .st ) which is precisely the Bellman equation for the stochastic dynamic programming problem. . sM }. W will restrict ourselves to discrete-time Markov chains. st ) = or equivalently V (xt . In particular as soon as we deal with (V)AR processes. S is called the state space.By deﬁnition.3. 28 . st+1 )f (st+1 |st )dst+1 yt ∈D (xt . st+1 ). the term under the integral corresponds to V (xt+1 . . xt . . st ) = max u(yt . We therefore have to transform the continuous problem into a discrete one with the constraint that the asymptotic properties of the continous and the discrete processes should be the same. such that the value rewrites V (xt . we can use a very powerful tool: Markov chains. Indeed. .2 Discretization of the shocks A very important problem that arises whenever we deal with value iteration or policy iteration in a stochastic environment is that of the discretization of the space spanned by the shocks. 7. At each time step t. Markov Chains: A Markov chain is a sequence of random values whose probabilities at a time interval depends upon the value of the number at the previous time. denoted by s ∈ S ≡ {s1 . the use of a continuous support for the stochastic shocks is infeasible for a computer that can only deals with discrete supports.st ) yt ∈D (xt . st ) + β V (xt+1 . in which the state changes at certain discrete time instants.

We therefore get the following definition:

Definition 6. A Markov chain is a stochastic process with a discrete indexing S, such that the conditional distribution of s_{t+1} is independent of all previously attained states given s_t:

π_ij = Prob(s_{t+1} = s^j | s_t = s^i), s^i, s^j ∈ S.

The Markov chain is described in terms of the transition probabilities π_ij. This transition probability should read as follows: if the state happens to be s^i in period t, the probability that the next state is equal to s^j is π_ij. The important assumption we shall make concerning Markov processes is that the transition probabilities π_ij apply as soon as state s^i is activated, no matter the history of the shocks nor how the system got to state s^i. In other words, there is no hysteresis. From a mathematical point of view, this corresponds to the so-called Markov property

P(s_{t+1} = s^j | s_t = s^i, s_{t−1} = s^{i_{t−1}}, …, s_0 = s^{i_0}) = P(s_{t+1} = s^j | s_t = s^i) = π_ij

for all periods t, all states s^i, s^j ∈ S, and all sequences {s^{i_n}}_{n=0}^{t−1} of earlier states. Thus, the probability of the next state s_{t+1} only depends on the current realization of s.

The transition probabilities π_ij must of course satisfy

1. π_ij ≥ 0 for all i, j = 1, …, M
2. Σ_{j=1}^M π_ij = 1 for all i = 1, …, M

All of the elements of a Markov chain model can then be encoded in a transition probability matrix

Π = ( π_11 … π_1M ; … ; π_M1 … π_MM )
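These two conditions, and the chaining of one-step probabilities into multi-step ones, can be verified in a few lines of Python (the transition matrix below is a made-up example):

```python
import numpy as np

# Hypothetical 3-state transition matrix (rows: current state, cols: next)
PI = np.array([[0.7, 0.2, 0.1],
               [0.3, 0.5, 0.2],
               [0.1, 0.1, 0.8]])

# Conditions on transition probabilities
assert (PI >= 0).all()                   # pi_ij >= 0
assert np.allclose(PI.sum(axis=1), 1.0)  # each row sums to one

# Prob(s_{t+2} = s^j | s_t = s^i): chain the one-step probabilities
PI2 = PI @ PI                            # two-step transition matrix
```

The entry PI2[0, 0] is 0.7·0.7 + 0.2·0.3 + 0.1·0.1, i.e. the probability of returning to state 1 after two steps, summed over all intermediate states.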

Note that Π^k then gives the k-step-ahead probabilities: (Π^k)_ij is the probability that s_{t+k} = s^j given that s_t = s^i. In the long run, we obtain the steady state equivalent for a Markov chain: the invariant distribution, or stationary distribution.

Definition 7. A stationary distribution for a Markov chain is a distribution π such that

1. π_j ≥ 0 for all j = 1, …, M
2. Σ_{j=1}^M π_j = 1
3. π = Π′π

Gaussian-quadrature approach to discretization

Tauchen and Hussey [1991] provide a simple way to discretize VAR processes relying on Gaussian quadrature.³ I will only present the case of an AR(1) process of the form

s_{t+1} = ρ s_t + (1 − ρ) s̄ + ε_{t+1}

where ε_{t+1} ∼ N(0, σ²). This implies that

(1/(σ√(2π))) ∫_{−∞}^{∞} exp( −½ ((s_{t+1} − ρ s_t − (1 − ρ) s̄)/σ)² ) ds_{t+1} = ∫ f(s_{t+1}|s_t) ds_{t+1} = 1

which illustrates the fact that s is a continuous random variable. Tauchen and Hussey [1991] propose to replace the integral by

∫ Φ(s_{t+1}, s_t, s̄) f(s_{t+1}|s̄) ds_{t+1} ≡ ∫ (f(s_{t+1}|s_t)/f(s_{t+1}|s̄)) f(s_{t+1}|s̄) ds_{t+1} = 1

where f(s_{t+1}|s̄) denotes the density of s_{t+1} conditional on the fact that s_t = s̄ (which amounts to the unconditional density), which in our case implies that

Φ(s_{t+1}, s_t, s̄) ≡ f(s_{t+1}|s_t)/f(s_{t+1}|s̄) = exp( −½ [ ((s_{t+1} − ρ s_t − (1 − ρ) s̄)/σ)² − ((s_{t+1} − s̄)/σ)² ] )

³ Note that we already discussed this issue in lecture notes # 4.
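The stationary distribution solves π = Π′π, and can be obtained either by brute-force iteration or as the unit-eigenvalue eigenvector of Π′. A Python sketch on a made-up two-state chain:

```python
import numpy as np

PI = np.array([[0.9, 0.1],
               [0.3, 0.7]])   # made-up transition matrix

# Iterate pi <- PI' pi from an arbitrary starting distribution
pi = np.array([0.5, 0.5])
for _ in range(2000):
    pi = PI.T @ pi

# Cross-check: eigenvector of PI' associated with the unit eigenvalue
vals, vecs = np.linalg.eig(PI.T)
stat = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
stat = stat / stat.sum()   # normalize so probabilities sum to one
```

For this matrix the invariant distribution is (0.75, 0.25): the chain spends three quarters of the time in the more persistent first state.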

Then we can use the standard linear transformation and impose z_t = (s_t − s̄)/(σ√2) to get

(1/√π) ∫_{−∞}^{∞} exp( −[ (z_{t+1} − ρ z_t)² − z_{t+1}² ] ) exp( −z_{t+1}² ) dz_{t+1}

for which we can use a Gauss–Hermite quadrature. Assume then that we have the quadrature nodes z_i and weights ω_i, i = 1, …, n; the quadrature leads to the formula

(1/√π) Σ_{j=1}^n ω_j Φ(z_j, z_i, x̄) ≃ 1

In other words, we might interpret the quantity (1/√π) ω_j Φ(z_j, z_i, x̄) as an "estimate" π̂_ij ≡ Prob(s_{t+1} = s^j | s_t = s^i) of the transition probability from state i to state j. But remember that the quadrature is just an approximation, such that it will generally be the case that Σ_{j=1}^n π̂_ij = 1 will not hold exactly. Tauchen and Hussey [1991] therefore propose the following modification:

π_ij = ω_j Φ(z_j, z_i, x̄) / (√π ς_i),  where ς_i = (1/√π) Σ_{j=1}^n ω_j Φ(z_j, z_i, x̄)

Matlab Code: Discretization of an AR(1) Process

function [s,p]=tauch_huss(xbar,rho,sigma,n)
% xbar  : mean of the x process
% rho   : persistence parameter
% sigma : volatility
% n     : number of nodes
%
% returns the states s and the transition probabilities p
%
[xx,wx] = gauss_herm(n);         % nodes and weights for the quadrature
s  = sqrt(2)*sigma*xx+xbar;      % discrete states
x  = xx(:,ones(n,1));
z  = x';
w  = wx(:,ones(n,1))';
%
% computation
%
p  = (exp(z.*z-(z-rho*x).*(z-rho*x)).*w)./sqrt(pi);
sx = sum(p')';
p  = p./sx(:,ones(n,1));
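For readers working outside Matlab, a line-by-line Python translation of the routine may serve as a cross-check; numpy's hermgauss plays the role of the gauss_herm routine used above:

```python
import numpy as np

def tauch_huss(xbar, rho, sigma, n):
    """Tauchen-Hussey discretization of s' = rho*s + (1-rho)*xbar + eps,
    eps ~ N(0, sigma^2), on n Gauss-Hermite nodes."""
    nodes, weights = np.polynomial.hermite.hermgauss(n)  # quadrature rule
    s = np.sqrt(2.0) * sigma * nodes + xbar              # discrete states
    z = nodes[None, :]                                   # next-period node z_j
    x = nodes[:, None]                                   # current node z_i
    # unnormalized pi_ij = w_j * Phi(z_j, z_i) / sqrt(pi)
    p = weights[None, :] * np.exp(z**2 - (z - rho * x)**2) / np.sqrt(np.pi)
    p = p / p.sum(axis=1, keepdims=True)                 # renormalize each row
    return s, p

s, p = tauch_huss(0.0, 0.8, 0.12, 5)
```

Each row of p sums to one by construction of the renormalization step, and the states are symmetric around the mean.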

7.3.3 Value iteration

As in the deterministic case, the convergence of the simple value function iteration procedure is insured by the contraction mapping theorem. This is however a bit more subtle than that, as we have to deal with the convergence of a probability measure, which goes far beyond this introduction to dynamic programming.⁴ The algorithm is basically the same as in the deterministic case, up to the delicate point that we have to deal with the expectation. It writes as follows:

1. Decide on a grid, X = {x_1, …, x_N}, of admissible values for the state variable x, and a grid, S = {s^1, …, s^M}, for the shocks, together with the transition matrix Π = (π_ij). Formulate an initial guess for the value function V_0(x) and choose a stopping criterion ε > 0.

2. For each x_ℓ ∈ X, ℓ = 1, …, N, and s^k ∈ S, k = 1, …, M, compute

V_{i+1}(x_ℓ, s^k) = max_{x′ ∈ X} u(y(x_ℓ, x′), x_ℓ, s^k) + β Σ_{j=1}^M π_kj V_i(x′, s^j)

3. If ‖V_{i+1}(x, s) − V_i(x, s)‖ < ε, go to the next step, else go back to 2.

4. Compute the final solution as y*(x, s) = y(x, s).

In order to better understand the algorithm, let us consider a simple example and go back to the optimal growth model, with

u(c) = (c^{1−σ} − 1)/(1−σ)

⁴ The interested reader should then refer to Lucas et al. [1989], chapter 9.

and

k′ = exp(a) k^α − c + (1 − δ) k

where a′ = ρ a + ε′. The Bellman equation then writes

V(k, a) = max_c (c^{1−σ} − 1)/(1−σ) + β ∫ V(k′, a′) dμ(a′|a)

From the law of motion of capital, we can determine consumption as

c = exp(a) k^α + (1 − δ) k − k′

such that, plugging this result in the Bellman equation, we get

V(k, a) = max_{k′} ((exp(a) k^α + (1 − δ) k − k′)^{1−σ} − 1)/(1−σ) + β ∫ V(k′, a′) dμ(a′|a)

A first problem that we encounter is that we would like to be able to evaluate the integral involved by the rational expectation. We therefore have to discretize the shock. Here, I will consider that the technology shock can be accurately approximated by a 2-state Markov chain, such that a can take on 2 values, a_1 and a_2 (a_1 < a_2). I will also assume that the transition matrix is symmetric, such that

Π = ( π  1−π ; 1−π  π )

a_1, a_2 and π are selected such that the process reproduces the conditional first- and second-order moments of the AR(1) process:

First-order moments
π a_1 + (1 − π) a_2 = ρ a_1
(1 − π) a_1 + π a_2 = ρ a_2

Second-order moments
π a_1² + (1 − π) a_2² − (ρ a_1)² = σ_ε²
(1 − π) a_1² + π a_2² − (ρ a_2)² = σ_ε²

From the first two equations we get a_1 = −a_2 and π = (1 + ρ)/2. Plugging these results in the last two equations, we get a_2 = σ_ε/√(1 − ρ²), and therefore a_1 = −σ_ε/√(1 − ρ²).
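It is straightforward to verify numerically that this choice of a_1, a_2 and π reproduces the conditional moments. A Python sketch with ρ = 0.8 and σ_ε = 0.12:

```python
import numpy as np

rho, se = 0.8, 0.12
p  = (1 + rho) / 2                 # pi
a2 = se / np.sqrt(1 - rho**2)      # a1 = -a2
a1 = -a2
PI = np.array([[p, 1 - p], [1 - p, p]])
a  = np.array([a1, a2])

# Conditional first-order moments: E[a'|a_i] should equal rho * a_i
m1 = PI @ a
# Conditional variance: E[a'^2|a_i] - (rho a_i)^2 should equal se^2
m2 = PI @ a**2 - (rho * a)**2
```

Both checks hold exactly: the two-state chain matches the conditional mean and conditional variance of the AR(1) process in each state.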

Hence, we will actually work with a value function of the form

V(k, a^k) = max_c (c^{1−σ} − 1)/(1−σ) + β Σ_{j=1}^2 π_kj V(k′, a^j)

Now, let us define a grid of N feasible values for k, such that we have K = {k_1, …, k_N}, and an initial value function V_0(k) — that is, a vector of N numbers that relate each k_ℓ to a value. Note that this may be anything we want, as we know — by the contraction mapping theorem — that the algorithm will converge; but if we want it to converge fast enough, it may be a good idea to impose a good initial guess. Finally, we need a stopping criterion.

In figures 7.9 and 7.10, we report the value function and the decision rules obtained for the stochastic optimal growth model with α = 0.3, β = 0.95, δ = 0.1, σ = 1.5, ρ = 0.8 and σ_ε = 0.12. The grid for the capital stock is composed of 1000 data points ranging from 0.2 to 6. The algorithm converges in 192 iterations when the stopping criterion is ε = 1e−6.

Figure 7.9: Stochastic OGM (Value function, Value iteration)

Figure 7.10: Stochastic OGM (Decision rules, Value iteration)

Matlab Code: Value Function Iteration

sigma = 1.50;   % utility parameter
delta = 0.10;   % depreciation rate
beta  = 0.95;   % discount factor
alpha = 0.30;   % capital elasticity of output
rho   = 0.80;   % persistence of the shock
se    = 0.12;   % volatility of the shock
nbk   = 1000;   % number of data points in the grid
nba   = 2;      % number of values for the shock
crit  = 1;      % convergence criterion
epsi  = 1e-6;   % convergence parameter
%
% Discretization of the shock
%
p     = (1+rho)/2;
PI    = [p 1-p;1-p p];
a1    = exp(-se/sqrt(1-rho*rho));
a2    = exp(se/sqrt(1-rho*rho));
A     = [a1 a2];
%
% Discretization of the state space
%
kmin  = 0.2;
kmax  = 6;
kgrid = linspace(kmin,kmax,nbk)';
c     = zeros(nbk,nba);
util  = zeros(nbk,nba);
v     = zeros(nbk,nba);
Tv    = zeros(nbk,nba);
dr    = zeros(nbk,nba);
iter  = 1;
while crit>epsi;
   for i=1:nbk
      for j=1:nba;
         c           = A(j)*kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid;
         neg         = find(c<=0);
         c(neg)      = NaN;
         util(:,j)   = (c.^(1-sigma)-1)/(1-sigma);
         util(neg,j) = -1e12;
      end
      [Tv(i,:),dr(i,:)] = max(util+beta*(v*PI'));
   end;
   crit = max(max(abs(Tv-v)));
   v    = Tv;
   iter = iter+1;
end
kp = kgrid(dr);
for j=1:nba;
   c(:,j) = A(j)*kgrid.^alpha+(1-delta)*kgrid-kp(:,j);
end
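The loop above translates almost verbatim to Python/NumPy. The sketch below uses a deliberately coarse grid (200 points instead of 1000) so that it runs in seconds; the results therefore only approximate those reported above:

```python
import numpy as np

sigma, delta, beta, alpha, rho, se = 1.5, 0.1, 0.95, 0.3, 0.8, 0.12
nbk, epsi = 200, 1e-6                 # coarse grid for speed
p  = (1 + rho) / 2
PI = np.array([[p, 1 - p], [1 - p, p]])
A  = np.exp([-se / np.sqrt(1 - rho**2), se / np.sqrt(1 - rho**2)])
k  = np.linspace(0.2, 6.0, nbk)

# c[i, l, j]: consumption with capital k_i, next capital k_l, shock j
c = A[None, None, :] * k[:, None, None]**alpha \
    + (1 - delta) * k[:, None, None] - k[None, :, None]
# penalize infeasible (non-positive consumption) choices
u = np.where(c > 0,
             (np.maximum(c, 1e-12)**(1 - sigma) - 1) / (1 - sigma), -1e12)

v = np.zeros((nbk, 2))
while True:
    Ev = v @ PI.T                     # Ev[l, j] = E[v(k_l, a') | a = a_j]
    Tv = (u + beta * Ev[None, :, :]).max(axis=1)
    if np.abs(Tv - v).max() < epsi:
        break
    v = Tv
dr = (u + beta * (v @ PI.T)[None, :, :]).argmax(axis=1)  # policy indices
```

The value function is increasing in the technology shock, as one would expect, and the fixed point satisfies the stopping criterion by construction.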

7.3.4 Policy iterations

As in the deterministic case, we may want to accelerate the simple value iteration using Howard improvement. The stochastic case does not differ that much from the deterministic case, apart from the fact that we now have to deal with different decision rules — one per state of the Markov chain — which implies the computation of different Q matrices. The algorithm may be described as follows:

1. Set an initial feasible set of decision rules for the control variable, y = f_0(x, s^k), k = 1, …, M, and compute the value associated to this guess, assuming that this rule is operative forever, taking care of the fact that x_{t+1} = h(x_t, f_i(x_t, s_t), s_t), with i = 0. Set a stopping criterion ε > 0.

2. Find a new policy rule, y = f_{i+1}(x, s^k), k = 1, …, M, such that

f_{i+1}(x, s^k) ∈ Argmax_y u(y, x, s^k) + β Σ_{j=1}^M π_kj V_i(x′, s^j)

with x′ = h(x, f_{i+1}(x, s^k), s^k).

3. Check if ‖f_{i+1}(x, s) − f_i(x, s)‖ < ε; if yes then stop, else go back to 2.

When computing the value function, we actually have to solve a linear system of the form

V_{i+1}(x_ℓ, s^k) = u(f_{i+1}(x_ℓ, s^k), x_ℓ, s^k) + β Σ_{j=1}^M π_kj V_{i+1}(h(x_ℓ, f_{i+1}(x_ℓ, s^k), s^k), s^j)

for V_{i+1}(x_ℓ, s^k) (for all x_ℓ ∈ X and s^k ∈ S), which may be rewritten as

V_{i+1}(x_ℓ, s^k) = u(f_{i+1}(x_ℓ, s^k), x_ℓ, s^k) + β Π_k Q V_{i+1}(·, ·) ∀x_ℓ ∈ X

where Q is an (N × N) matrix with

Q_ℓj = 1 if x_j ≡ h(f_{i+1}(x_ℓ), x_ℓ), 0 otherwise

Note that, although it is a big matrix, Q is sparse, which can be exploited in solving the system, to get

V_{i+1}(x, s) = (I − βΠ ⊗ Q)⁻¹ u(f_{i+1}(x, s), x, s)

We apply this algorithm to the same optimal growth model as in the previous section and report the value function and the decision rules at convergence in figures 7.11 and 7.12. The algorithm converges in only 17 iterations, starting from the same initial guess and using the same parameterization!

Figure 7.11: Stochastic OGM (Value function, Policy iteration)

Figure 7.12: Stochastic OGM (Decision rules, Policy iteration)

Matlab Code: Policy Iteration

sigma = 1.50;   % utility parameter
delta = 0.10;   % depreciation rate
beta  = 0.95;   % discount factor
alpha = 0.30;   % capital elasticity of output
rho   = 0.80;   % persistence of the shock
se    = 0.12;   % volatility of the shock
nbk   = 1000;   % number of data points in the grid
nba   = 2;      % number of values for the shock
crit  = 1;      % convergence criterion
epsi  = 1e-6;   % convergence parameter
%
% Discretization of the shock
%
p     = (1+rho)/2;
PI    = [p 1-p;1-p p];
a1    = exp(-se/sqrt(1-rho*rho));
a2    = exp(se/sqrt(1-rho*rho));
A     = [a1 a2];
%
% Discretization of the state space
%
kmin  = 0.2;
kmax  = 6;
kgrid = linspace(kmin,kmax,nbk)';
c     = zeros(nbk,nba);
util  = zeros(nbk,nba);

v    = zeros(nbk,nba);
Ev   = zeros(nbk,nba);           % expected value function
dr0  = repmat([1:nbk]',1,nba);   % initial guess for the decision rule
dr   = zeros(nbk,nba);           % decision rule (will contain indices)
kp0  = kgrid(dr0);
iter = 1;
%
% Main loop
%
while crit>epsi;
   for j=1:nba;
      Ev(:,j) = v*PI(j,:)';
   end
   for i=1:nbk
      for j=1:nba;
         c           = A(j)*kgrid(i)^alpha+(1-delta)*kgrid(i)-kgrid;
         neg         = find(c<=0);
         c(neg)      = NaN;
         util(:,j)   = (c.^(1-sigma)-1)/(1-sigma);
         util(neg,j) = -inf;
      end
      [v1,dr(i,:)] = max(util+beta*Ev);
   end
   %
   % decision rules
   %
   kp = kgrid(dr);
   %
   % update the value
   %
   c = zeros(nbk,nba);
   Q = sparse(nbk*nba,nbk*nba);
   for j=1:nba;
      c(:,j)    = A(j)*kgrid.^alpha+(1-delta)*kgrid-kp(:,j);
      util(:,j) = (c(:,j).^(1-sigma)-1)/(1-sigma);
      Q0 = sparse(nbk,nbk);
      for i=1:nbk;
         Q0(i,dr(i,j)) = 1;
      end
      Q((j-1)*nbk+1:j*nbk,:) = kron(PI(j,:),Q0);
   end
   Tv   = (speye(nbk*nba)-beta*Q)\util(:);
   v    = reshape(Tv,nbk,nba);
   crit = max(max(abs(kp-kp0)));
   kp0  = kp;
   iter = iter+1;
end
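The key step of the algorithm — solving the linear system V = (I − βΠ ⊗ Q)⁻¹u for the value of a fixed policy — can be checked against brute-force iteration on a small made-up problem, in Python:

```python
import numpy as np

beta, N, M = 0.95, 4, 2
PI = np.array([[0.8, 0.2], [0.4, 0.6]])
u  = np.random.default_rng(0).uniform(size=N * M)  # stacked payoffs u(x_l, s_k)
nxt = np.array([1, 2, 3, 3])                       # made-up policy: index of x' = h(x_l)

Q = np.zeros((N, N))
Q[np.arange(N), nxt] = 1.0                         # Q_lj = 1 iff x_j = h(x_l)

# Direct solve of (I - beta * PI kron Q) V = u ...
V = np.linalg.solve(np.eye(N * M) - beta * np.kron(PI, Q), u)

# ...versus applying the fixed policy forever: V <- u + beta (PI kron Q) V
W = np.zeros(N * M)
for _ in range(3000):
    W = u + beta * np.kron(PI, Q) @ W
```

Both give the same value, since Π ⊗ Q is row-stochastic and the iteration contracts at rate β; the direct solve is what makes Howard improvement fast.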

7.4 How to simulate the model?

At this point, no matter whether the model is deterministic or stochastic, we have a solution which is actually a collection of data points: a grid, K = {k_1, …, k_N}, for the state variable — note that we have the upper and lower bounds of the grid, k_1 = k̲ and k_N = k̄ — and the associated decision rule for the next period capital stock, k′_i = h(k_i), i = 1, …, N; the consumption level can then be deduced from the next period capital stock using the law of motion of capital together with the resource constraint. In other words, we have Lagrange data which can be used to get a continuous decision rule, which may then be used to simulate the model. There are several ways to simulate the model:

• one may simply use the pointwise decision rule and iterate on it,
• or one may use an interpolation scheme (either least square methods, spline interpolations, …) in order to obtain a continuous decision rule.

7.4.1 The deterministic case

Here I will pursue the second route and, in order to illustrate it, focus once again on the deterministic optimal growth model, assuming that we have solved the model by either value iteration or policy iteration. We will use Chebychev polynomials to approximate the rule, in order to avoid multicollinearity problems in the least square algorithm. In other words, we will get a decision rule of the form

k′ = Σ_{ℓ=0}^n θ_ℓ T_ℓ(φ(k))

where φ(k) is a linear transformation that maps [k̲, k̄] into [−1, 1]. The algorithm works as follows:

1. Solve the model by value iteration or policy iteration, such that we have a collection of data points for the capital stock, K = {k_1, …, k_N}, and the associated decision rule for the next period capital stock, k′_i = h(k_i), i = 1, …, N.

2. Choose a maximal order, n, for the Chebychev polynomials approximation.

3. Apply the transformation

x_i = 2 (k_i − k̲)/(k̄ − k̲) − 1, i = 1, …, N

to the data and compute the Chebychev polynomials T_ℓ(x_i), ℓ = 0, …, n.

4. Perform the least square approximation, such that

Θ = (T(x)′T(x))⁻¹ T(x)′ h(k)

For the parameterization we have been using so far, setting

φ(k) = 2 (k − k̲)/(k̄ − k̲) − 1

and for an approximation of order 7, we obtain

θ = {2.6008, 2.0814, −0.0295, 0.0125, −0.0055, 0.0027, −0.0012, 0.0007}

such that the decision rule for the capital stock can be approximated as

k_{t+1} = 2.6008 + 2.0814 T₁(φ(k_t)) − 0.0295 T₂(φ(k_t)) + 0.0125 T₃(φ(k_t)) − 0.0055 T₄(φ(k_t)) + 0.0027 T₅(φ(k_t)) − 0.0012 T₆(φ(k_t)) + 0.0007 T₇(φ(k_t))

The model can then be used in the usual way, such that in order to get the transitional dynamics of the model we just have to iterate on this backward-looking finite–difference equation. Consumption, output and investment can be computed using their definitions:

i_t = k_{t+1} − (1 − δ) k_t
y_t = k_t^α
c_t = y_t − i_t

For instance, figure 7.13 reports the transitional dynamics of the model when the economy is started 50% below its steady state. In the figure, the plain line represents the dynamics of each variable, and the dashed line its steady state level.

Figure 7.13: Transitional dynamics (OGM)
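The least-squares step of the algorithm is an ordinary regression on Chebychev basis functions, which numpy provides directly. A Python sketch on the grid bounds used in these notes, [0.2, 6]; the "decision rule" being fitted here is made up for illustration:

```python
import numpy as np

kmin, kmax, n = 0.2, 6.0, 7
k = np.linspace(kmin, kmax, 1000)
h = 1.5 + 0.8 * k - 0.02 * k**2            # made-up decision rule h(k)

x = 2 * (k - kmin) / (kmax - kmin) - 1     # phi(k): map [kmin, kmax] -> [-1, 1]
T = np.polynomial.chebyshev.chebvander(x, n)  # columns T_0(x), ..., T_n(x)
theta, *_ = np.linalg.lstsq(T, h, rcond=None)  # OLS step: Theta = (T'T)^-1 T'h

approx = T @ theta
```

Because the made-up rule is a quadratic, it lies exactly in the span of T₀, T₁, T₂, so the fit recovers it up to floating-point error — a useful sanity check before fitting an actual value-iteration output.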

Matlab Code: Transitional Dynamics

n      = 8;                                % Order of approximation+1
transk = 2*(kgrid-kmin)/(kmax-kmin)-1;     % transform the data
%
% Computes the matrix of Chebychev polynomials
%
Tk = [ones(nbk,1) transk];
for i=3:n;
   Tk = [Tk 2*transk.*Tk(:,i-1)-Tk(:,i-2)];
end
b    = Tk\kp;            % Performs OLS
k0   = 0.5*ks;           % initial capital stock (50% below steady state)
nrep = 50;               % number of periods
k    = zeros(nrep+1,1);  % initialize dynamics
k(1) = k0;
%
% iteration loop
%
for t=1:nrep;
   trkt = 2*(k(t)-kmin)/(kmax-kmin)-1;
   Tk   = [1 trkt];
   for i=3:n;
      Tk = [Tk 2*trkt*Tk(i-1)-Tk(i-2)];
   end
   k(t+1) = Tk*b;
end
y = k(1:nrep).^alpha;
i = k(2:nrep+1)-(1-delta)*k(1:nrep);
c = y-i;

Another way to proceed would have been to use a spline approximation, in which case the matlab code would have been (for the simulation part)

Matlab Code: OGM simulation (cubic spline interpolation)

k(1) = k0;
for t=1:nrep;
   k(t+1) = interp1(kgrid,kp,k(t),'cubic');
end
y = k(1:nrep).^alpha;
i = k(2:nrep+1)-(1-delta)*k(1:nrep);
c = y-i;

7.4.2 The stochastic case

Note: At this point, I assume that the model has already been solved and that the capital grid has been stored in kgrid and the decision rule for the next period capital stock in kp. As in the deterministic case, once the model has been solved, it is a good idea to represent the solution as a continuous decision rule, such that we should apply the same

methodology as before. The only departure from the previous experiment is that, since we have approximated the stochastic shock by a Markov chain, we have as many decision rules as we have states in the Markov chain. For instance, in the optimal growth model we solved in section 7.3.3, we have 2 decision rules, which are given by⁵

k_{1,t+1} = 3.2002 + 2.6302 T₁(φ(k_t)) − 0.0543 T₂(φ(k_t)) + 0.0235 T₃(φ(k_t)) − 0.0110 T₄(φ(k_t)) + 0.0057 T₅(φ(k_t)) − 0.0027 T₆(φ(k_t)) + 0.0014 T₇(φ(k_t))

k_{2,t+1} = 2.8630 + 2.4761 T₁(φ(k_t)) − 0.0211 T₂(φ(k_t)) + 0.0114 T₃(φ(k_t)) − 0.0057 T₄(φ(k_t)) + 0.0031 T₅(φ(k_t)) − 0.0014 T₆(φ(k_t)) + 0.0009 T₇(φ(k_t))

The only difficulty we have to deal with in the stochastic case is to simulate the Markov chain. Simulating a series of length T for the Markov chain can be achieved easily:

1. Compute the cumulative distribution of the Markov chain, Π^c — i.e. the cumulative sum of each row of the transition matrix. Π^c_ij corresponds to the probability that the economy is in a state lower or equal to j, given that it was in state i in period t−1.

2. Set the initial state and simulate T random numbers from a uniform distribution: {p_t}_{t=1}^T. Note that these numbers, drawn from a U[0,1], can be interpreted as probabilities.

3. Assume the economy was in state i in t−1; find the index j such that

Π^c_{i(j−1)} < p_t ≤ Π^c_{ij}

then s^j is the state of the economy in period t.

⁵ The code that generates these two rules is exactly the same as for the deterministic version of the model since, when computing b=Tk\kp, matlab takes care of the fact that kp is a (N × 2) matrix, such that b will be a ((n+1) × 2) matrix, where n is the approximation order.

We are then in position to simulate our optimal growth model. Figure 7.14 reports a simulation of the model we solved before.

Figure 7.14: Simulated OGM

Matlab Code: Simulation of the Markov Chain

s      = [-1;1];                       % 2 possible states
PI     = [0.9 0.1;0.1 0.9];            % Transition matrix
s0     = 1;                            % Initial state
T      = 1000;                         % Length of the simulation
sim    = rand(T,1);                    % Simulated uniform draws
cpi    = size(PI,2);
cum_PI = [zeros(cpi,1) cumsum(PI,2)];  % cumulative distribution of each row
j      = zeros(T,1);
j(1)   = s0;
for k=2:T;
   j(k) = find((sim(k)<=cum_PI(j(k-1),2:cpi+1))&(sim(k)>cum_PI(j(k-1),1:cpi)));
end;
chain  = s(j);                         % simulated values
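The same simulation can be written compactly in Python, using the cumulated rows of Π and searchsorted to locate the next state (same illustrative two-state chain as in the Matlab snippet):

```python
import numpy as np

s   = np.array([-1.0, 1.0])              # 2 possible states
PI  = np.array([[0.9, 0.1], [0.1, 0.9]])  # transition matrix
T, s0 = 1000, 0                           # length and initial state index

rng   = np.random.default_rng(1234)
p     = rng.uniform(size=T)               # step 2: uniform draws
cumPI = np.cumsum(PI, axis=1)             # step 1: cumulative rows

j = np.empty(T, dtype=int)
j[0] = s0
for t in range(1, T):
    # step 3: smallest j with p_t <= PI^c_{ij}, i the previous state
    j[t] = np.searchsorted(cumPI[j[t - 1]], p[t])
chain = s[j]
```

With a stay probability of 0.9 in each state, the simulated series is strongly persistent: the realized fraction of periods where the state repeats should be close to 0.9.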

Matlab Code: Simulating the OGM

n      = 8;                               % Order of approximation+1
transk = 2*(kgrid-kmin)/(kmax-kmin)-1;    % transform the data
%
% Computes approximation
%
Tk = [ones(nbk,1) transk];
for i=3:n;
   Tk = [Tk 2*transk.*Tk(:,i-1)-Tk(:,i-2)];
end
b = Tk\kp;
%
% Simulate the markov chain
%
long   = 200;                 % length of simulation
[a,id] = markov(PI,A,long,1); % markov is a function:
                              % a contains the values for the shock
                              % id the state index
k0     = 2;                   % initial value for the capital stock
k      = zeros(long+1,1);     % some matrices
y      = zeros(long,1);
k(1)   = k0;
for t=1:long;
   % Prepare the approximation for the capital stock
   trkt = 2*(k(t)-kmin)/(kmax-kmin)-1;
   Tk   = [1 trkt];
   for i=3:n;
      Tk = [Tk 2*trkt*Tk(i-1)-Tk(i-2)];
   end
   k(t+1) = Tk*b(:,id(t));       % use the appropriate decision rule
   y(t)   = A(id(t))*k(t)^alpha; % computes output
end
i = k(2:long+1)-(1-delta)*k(1:long);  % computes investment
c = y-i;                              % computes consumption

Note: At this point, I assume that the model has already been solved and that the capital grid has been stored in kgrid and the decision rule for the next period capital stock is stored in kp.

7.5 Handling constraints

In this example I will deal with the problem of a household who is constrained on her borrowing and who faces uncertainty on the labor market. This problem has been extensively studied by Ayiagari [1994]. We consider the case of a household who determines her consumption/saving plans in order to maximize her lifetime expected utility, which is characterized by the function⁶

E_t Σ_{τ=0}^∞ β^τ (c_{t+τ}^{1−σ} − 1)/(1−σ),  σ ∈ (0, 1) ∪ (1, ∞)   (7.9)

where 0 < β < 1 is a constant discount factor and c denotes the consumption bundle. In each and every period, the household enters the period with a level of assets a_t carried over from the previous period, for which she receives a constant real interest rate r. She may be employed, in which case she receives the aggregate real wage w, or unemployed, in which case she just receives unemployment benefits b, which are assumed to be a fraction of the real wage, b = γw, where γ is the replacement ratio. These revenues are then used to consume and purchase assets on the financial market. Therefore, the budget constraint in t is given by

a_{t+1} = (1 + r) a_t + ω_t − c_t   (7.10)

where

ω_t = w if the household is employed, γw otherwise.   (7.11)

We assume that the household's state on the labor market is a random variable. Further, the probability of being (un)employed in period t is greater if the household was also (un)employed in period t − 1, such that the state on the labor market, and therefore ω_t, can be modeled as a 2-state Markov chain with transition matrix Π. In addition, the household is submitted to a

⁶ E_t(.) denotes mathematical expectations conditional on information available at the beginning of period t.

borrowing constraint, which states that she cannot borrow more than a threshold level a̲:

a_{t+1} ≥ a̲   (7.12)

In this case, plugging the budget constraint into the utility function, the Bellman equation writes

V(a_t, ω_t) = max_{c_t} u(c_t) + β E_t V(a_{t+1}, ω_{t+1})
   s.t. (7.10)–(7.12) and c_t ≥ 0

or

V(a_t, ω_t) = max_{a_{t+1}} u((1 + r) a_t + ω_t − a_{t+1}) + β E_t V(a_{t+1}, ω_{t+1})   s.t. (7.11)–(7.12)

Positivity of consumption — which should be taken into account when searching the optimal value for a_{t+1} — imposes a_{t+1} < (1 + r) a_t + ω_t, and since the borrowing constraint should be satisfied in each and every period, the Bellman equation finally writes

V(a_t, ω_t) = max_{a̲ ≤ a_{t+1} ≤ (1+r) a_t + ω_t} u((1 + r) a_t + ω_t − a_{t+1}) + β E_t V(a_{t+1}, ω_{t+1})

In other words, when defining the grid for a, one just has to take care of the fact that the minimum admissible value for a is a̲. This is then a standard dynamic programming problem, which may be solved by any of the methods we have seen before; implementing them while taking the constraint into account is straightforward. The matlab code borrow.m solves and simulates this problem.
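As an illustration of how the two constraints enter, here is a compact value-iteration sketch of the household problem in Python. The parameter values, wage, transition matrix and grid are all made up, and this is only a sketch — not the borrow.m code:

```python
import numpy as np

beta, sigma, r = 0.95, 1.5, 0.03
w, gamma = 1.0, 0.5
omega = np.array([gamma * w, w])            # unemployed / employed income
PI = np.array([[0.6, 0.4], [0.2, 0.8]])     # made-up labor-market chain
a_low, na = -1.0, 200                       # borrowing limit and grid size
a = np.linspace(a_low, 10.0, na)            # grid starts at the limit a_low

# c[i, l, j]: consumption with assets a_i, choice a'_l, labor state j
c = (1 + r) * a[:, None, None] + omega[None, None, :] - a[None, :, None]
# c > 0 enforced by penalizing infeasible choices
u = np.where(c > 0,
             (np.maximum(c, 1e-12)**(1 - sigma) - 1) / (1 - sigma), -1e12)

v = np.zeros((na, 2))
for _ in range(2000):
    Ev = v @ PI.T
    Tv = (u + beta * Ev[None, :, :]).max(axis=1)
    if np.abs(Tv - v).max() < 1e-6:
        v = Tv
        break
    v = Tv
ap = a[(u + beta * (v @ PI.T)[None, :, :]).argmax(axis=1)]  # a' decision rule
```

By construction the chosen a' never falls below the borrowing limit (the grid starts there), implied consumption is strictly positive, and the value function is increasing in assets.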


Bibliography

Ayiagari, S.R., Uninsured Idiosyncratic Risk and Aggregate Saving, Quarterly Journal of Economics, 1994, 109 (3), 659–684.

Bertsekas, D., Dynamic Programming and Stochastic Control, New York: Academic Press, 1976.

Judd, K.L., Numerical Methods in Economics, Cambridge, Massachusetts: MIT Press, 1998.

Judd, K.L. and A. Solnick, Numerical Dynamic Programming with Shape-Preserving Splines, Manuscript, Hoover Institution, 1994.

Stokey, N.L., R.E. Lucas and E.C. Prescott, Recursive Methods in Economic Dynamics, Cambridge (MA): Harvard University Press, 1989.

Tauchen, G. and R. Hussey, Quadrature Based Methods for Obtaining Approximate Solutions to Nonlinear Asset Pricing Models, Econometrica, 1991, 59 (2), 371–396.

Index

Bellman equation
Blackwell's sufficiency conditions
Cauchy sequence
Contraction mapping
Contraction mapping theorem
Discounting
Howard Improvement
Invariant distribution
Lagrange data
Markov chain
Metric space
Monotonicity
Policy functions
Policy iterations
Stationary distribution
Transition probabilities
Transition probability matrix
Value function
Value iteration

Contents

7 Dynamic Programming
7.1 The Bellman equation and associated theorems
7.1.1 A heuristic derivation of the Bellman equation
7.1.2 The Bellman equation
7.1.3 Existence and uniqueness of a solution
7.2 Deterministic dynamic programming
7.2.1 Value function iteration
7.2.2 Taking advantage of interpolation
7.2.3 Policy iterations: Howard Improvement
7.2.4 Parametric dynamic programming
7.3 Stochastic dynamic programming
7.3.1 The Bellman equation
7.3.2 Discretization of the shocks
7.3.3 Value iteration
7.3.4 Policy iterations
7.4 How to simulate the model?
7.4.1 The deterministic case
7.4.2 The stochastic case
7.5 Handling constraints


List of Figures

7.1 Deterministic OGM (Value function, Value iteration)
7.2 Deterministic OGM (Decision rules, Value iteration)
7.3 Deterministic OGM (Value function, Value iteration with interpolation)
7.4 Deterministic OGM (Decision rules, Value iteration with interpolation)
7.5 Deterministic OGM (Value function, Policy iteration)
7.6 Deterministic OGM (Decision rules, Policy iteration)
7.7 Deterministic OGM (Value function, Parametric DP)
7.8 Deterministic OGM (Decision rules, Parametric DP)
7.9 Stochastic OGM (Value function, Value iteration)
7.10 Stochastic OGM (Decision rules, Value iteration)
7.11 Stochastic OGM (Value function, Policy iteration)
7.12 Stochastic OGM (Decision rules, Policy iteration)
7.13 Transitional dynamics (OGM)
7.14 Simulated OGM


List of Tables

7.1 Value function approximation
