Professional Documents
Culture Documents
Exercise 8
Exercise 8
Exercise 8.1
The problem is
ẋ = −x + u − 1, x(0) = 1,
∫ 1
J = x2 (1) + u2 dt.
0
Solve the optimal control with the help of the HJB equation. Trial function: V (t, x) = a(t)x2 +
2b(t)x + c(t).
Exercise 8.2
The system is
ẋ1 = x2
ẋ2 = −x1 + x2 + u,
a) Write the system equation in matrix form ẋ = Ax + Bu∫, where A and B are matrices and
x, u vectors. Write also the cost in the matrix form J = [x′ Qx + u′ Ru]dt, where Q, R are
matrices.
when instead of the function J the function V (x, t) = ert J(x, t) is used.
b) What role has the factor e−rt in the cost functional? Characterize its role when t varies
between zero and innity. The parameter r > 0.
by using the trial function V (x, t) = Ax2 for the present value of the optimal cost. Apply the
HJB equation of part a).
The amount of savings in the beginning is S . All income is interest income (interest is constant).
The change in capital x is modeled with the equation ẋ(t) = αx(t) − r(t), where α > 0 is the
interest and r is the consumption rate. The utility achieved from momentary consumption r is
U (r), where U is the utility; let U (r) = r1/2 .
The utility that will be achieved in the future is discounted: the utility achieved today (t = 0)
is more valuable now than in the future. The discount factor is e−βt (β > α/2), thus the present
value of the maximized total utility is
∫ T
J= e−βt U (r)dt (2)
0
b) Solve the HJB equation by using the trial function J(x, t) = f (t)g(x)
Solution
where α > 0 is the interest and r(t) is the consumption rate which acts as a control variable.
The natural constraint is
r(t) ≥ 0, ∀ t ∈ [0, T ].
The utility achieved from consuming is
√
U (r(t)) = r(t),
which is discounted into present value with the discount factor β > α/2, then the present value
of the total utility is ∫ T
J= e−βt U (r(t)) dt.
0
The aim is to maximize J , so that in the end all capital is used.
a) Let us form the HJB equation. The Hamiltonian for this problem is
√ [ ]
H (x(t), r(t), t) = e−βt r(t) + Jx∗T (x(t), t) αx(t) − r(t) .
By writing the partial derivative with regard to r and setting it equal to zero, the following
equation is attained
∂H 1 e−βt
= √ − Jx∗ (x(t), t) = 0,
∂r 2 r∗ (t)
from which we get the maximum
e−2βt
r∗ (t) = [ ]2 ≥ 0,
4 Jx∗ (x(t), t)
because the second partial derivative is negative. Now the HJB equation can be formed:
e−2βt e−2βt
0 = Jt∗ (x(t), t) + + αJ ∗
(x(t), t)x(t) −
2Jx∗ (x(t), t) x
4Jx∗ (x(t), t)
thus
e−2βt
Jt∗ (x(t), t) + + αJx∗ (x(t), t)x(t) = 0.
4Jx∗ (x(t), t)
This is fullled for all t ∈ [0, T ] if only f (t) satises the ordinary dierential equation
e−2βt 1
Af ′ (t) + + αAf (t) = 0.
2Af (t) 2
By multiplying both sides with eαt and organizing the terms the equation can be written
in the following form
1
αeαt f 2 + 2eαt f f ′ = − 2 e(α−2β)t .
| {z } A
d
dt
(eαt f 2 )
From these only the positive sign is possible, otherwise J ∗ ≤ 0. We have achieved a
candidate for the cost-to-go -function
√
√ 1
J ∗ (x(t), t) = A x(t) e−2βt + Ce−αt .
A2 (2β − α)
The integration constant C should be chosen so that the boundary condition of the HJB
equation
J ∗ (x(T ), T )) = 0
is satised. Hence, we get the condition for C :
1
C= e(α−2β)T .
A2 (α − 2β)
The exponential expression in the root expression is always non-negative, when t ∈ [0, T ].
Thus, the function is well-dened and satises the boundary condition of the HJB equa-
tion.
Inserting J ∗ into the optimal control expression previously calculated, we get the optimal
feedback control
2β − α
r∗ (t) = x(t).
1 − e(2β−α)(t−T )
c) Inserting the optimal feedback control into the state equation, we get the dierential
equation for the optimal trajectory
( 2β − α )
ẋ(t) = α − x(t).
1 − e(2β−α)(t−T )
By separating the equation and integrating both sides, we get the form
∫ ∫
x(t)
dx t ( 2β − α )
= α− dt
S x 0 1 − e(2β−α)(t−T )
hence ∫ t
x dt
log = αt − (2β − α) .
S 0 1− e(2β−α)(t−T )
We nd the following formula from the integral cookbook
∫
dt 1( )
kt
= kt − log(a + bekt ) ,
a + be ak
by using it and simplifying, we get the nal result
1 − e(2β−α)(t−T )
x(t) = Se2(α−β)t .
1 − e−(2β−α)T
Figure 1: The optimal consumption trajectories for Exercise 8.4, with dierent values for the
discount factor β when T = 1, S = 1 and α = 0.95.
Note that x(T ) = 0 so the end condition is also satised. This is natural, because if there
would be unused capital in the end, the total utility can't be optimized. In the Figure 1 are
the trajectories for dierent values of β , when T = 1, S = 1 and α = 0.95.
Exercise 8.5 (home assignment)
∫0.04[ ]
1 1 2 1 2
J = x2 (0.04) + x (t) + u (t) dt.
2 4 2
0
The admissible state and control values are not constrained by any boundaries. Find the
optimal control law by using the Hamilton-Jacobi-Bellman equation.