Homework 3: Answer

Victor M. Chan Ortiz
Optimal Control

CINVESTAV - GDL, Homework 3, March 17, 2020

1. Find the curve $x^*(t)$ that minimizes the functional:

\[ J(x) = \int_0^1 \left[ \tfrac{1}{2}\dot{x}^2(t) + 5x(t)\dot{x}(t) + x^2(t) + 5x(t) \right] dt \]

and passes through points $x(0) = 1$ and $x(1) = 3$.

Answer:
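First, we define the Euler equation, to obtain $x^*(t)$:

\[ \frac{\partial g(x^*(t), \dot{x}^*(t), t)}{\partial x} - \frac{d}{dt}\left[\frac{\partial g(x^*(t), \dot{x}^*(t), t)}{\partial \dot{x}}\right] = 0 \]

Defining the function $g$:

\[ g(x^*(t), \dot{x}^*(t), t) = \tfrac{1}{2}(\dot{x}^*(t))^2 + 5x^*(t)\dot{x}^*(t) + (x^*(t))^2 + 5x^*(t) \]

Differentiating with respect to $x(t)$ and $\dot{x}(t)$:

\[ \frac{\partial g(x^*(t), \dot{x}^*(t), t)}{\partial x} = 5\dot{x}^*(t) + 2x^*(t) + 5 \]

\[ \frac{\partial g(x^*(t), \dot{x}^*(t), t)}{\partial \dot{x}} = \dot{x}^*(t) + 5x^*(t) \]

Replacing derivatives:

\[ 5\dot{x}^*(t) + 2x^*(t) + 5 - \frac{d}{dt}\left[\dot{x}^*(t) + 5x^*(t)\right] = 0 \]

\[ 5\dot{x}^*(t) + 2x^*(t) + 5 - \left[\ddot{x}^*(t) + 5\dot{x}^*(t)\right] = 0 \]

\[ 2x^*(t) + 5 - \ddot{x}^*(t) = 0 \]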
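Applying the Laplace transform:

\[ s^2 X(s) - s\,x(0) - \dot{x}(0) - 2X(s) - \frac{5}{s} = 0 \]

With $x(0) = 1$ and the initial slope $\dot{x}(0)$ still unknown:

\[ s^2 X(s) - s - \dot{x}(0) - 2X(s) - \frac{5}{s} = 0 \]

\[ X(s) = \frac{s^2 + \dot{x}(0)\,s + 5}{s(s - \sqrt{2})(s + \sqrt{2})} \]

Expanding by partial fractions:

\[ X(s) = \frac{A}{s} + \frac{B}{s + \sqrt{2}} + \frac{C}{s - \sqrt{2}} \]

The first coefficient does not depend on $\dot{x}(0)$. The other two do, so they are written in terms of two constants $C_1$ and $C_2$ that the boundary conditions will fix:

\[ A = -\frac{5}{2} \qquad B = \frac{5 + \sqrt{2}}{4}\,C_1 \qquad C = \frac{5 - \sqrt{2}}{4}\,C_2 \]

Replacing coefficients:

\[ X(s) = -\frac{5}{2}\,\frac{1}{s} + \frac{5 + \sqrt{2}}{4}\,\frac{C_1}{s + \sqrt{2}} + \frac{5 - \sqrt{2}}{4}\,\frac{C_2}{s - \sqrt{2}} \]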
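Applying the inverse Laplace transform:

\[ x^*(t) = -\frac{5}{2} + \left(\frac{5 + \sqrt{2}}{4}\right) C_1 e^{-\sqrt{2}\,t} + \left(\frac{5 - \sqrt{2}}{4}\right) C_2 e^{\sqrt{2}\,t} \tag{1} \]

To satisfy the boundary conditions $x(0) = 1$ and $x(1) = 3$, we must find the values of the constants $C_1$ and $C_2$:

\[ x^*(0) = -\frac{5}{2} + \left(\frac{5 + \sqrt{2}}{4}\right) C_1 e^{-(0)\sqrt{2}} + \left(\frac{5 - \sqrt{2}}{4}\right) C_2 e^{(0)\sqrt{2}} = 1 \]

\[ x^*(1) = -\frac{5}{2} + \left(\frac{5 + \sqrt{2}}{4}\right) C_1 e^{-(1)\sqrt{2}} + \left(\frac{5 - \sqrt{2}}{4}\right) C_2 e^{(1)\sqrt{2}} = 3 \]

Simplifying:

\[ 14 = C_1 (5 + \sqrt{2}) + C_2 (5 - \sqrt{2}) \]

\[ 22 = C_1 (5 + \sqrt{2})\, e^{-\sqrt{2}} + C_2 (5 - \sqrt{2})\, e^{\sqrt{2}} \]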
Constant values:

\[ C_1 = 1.4335 \qquad C_2 = 1.3400 \]

Replacing in equation (1):

\[ x^*(t) = -2.5 + 2.2987\, e^{-\sqrt{2}\,t} + 1.2013\, e^{\sqrt{2}\,t} \tag{2} \]

Equation (2) is the solution of the Euler equation that satisfies both boundary conditions.
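
As a numerical cross-check of the result for Problem 1, the boundary conditions can be imposed directly on the general solution $x(t) = -5/2 + a\,e^{-\sqrt{2}t} + b\,e^{\sqrt{2}t}$ of $\ddot{x} = 2x + 5$. The sketch below is only an illustration: the coefficient names `a`, `b` and the helper functions are assumptions introduced here, not part of the original answer.

```python
import math

ROOT = math.sqrt(2.0)  # characteristic root of x'' = 2x + 5


def boundary_coefficients(x0=1.0, x1=3.0):
    """Return (a, b) in x(t) = -5/2 + a*exp(-ROOT*t) + b*exp(ROOT*t).

    The boundary conditions x(0) = x0 and x(1) = x1 give the linear system
        a + b                       = x0 + 5/2
        a*exp(-ROOT) + b*exp(ROOT)  = x1 + 5/2
    which is solved here by elimination.
    """
    rhs0 = x0 + 2.5
    rhs1 = x1 + 2.5
    em, ep = math.exp(-ROOT), math.exp(ROOT)
    b = (rhs1 - rhs0 * em) / (ep - em)
    a = rhs0 - b
    return a, b


def x_star(t, a, b):
    """Evaluate the candidate extremal at time t."""
    return -2.5 + a * math.exp(-ROOT * t) + b * math.exp(ROOT * t)


def euler_residual(t, a, b, h=1e-4):
    """Central-difference estimate of x'' - 2x - 5 (should be close to zero)."""
    xdd = (x_star(t + h, a, b) - 2.0 * x_star(t, a, b) + x_star(t - h, a, b)) / h**2
    return xdd - 2.0 * x_star(t, a, b) - 5.0


a, b = boundary_coefficients()
print(f"a = {a:.4f}, b = {b:.4f}")
print(f"x(0) = {x_star(0.0, a, b):.4f}, x(1) = {x_star(1.0, a, b):.4f}")
print(f"Euler residual at t = 0.5: {euler_residual(0.5, a, b):.2e}")
```

The coefficients `a` and `b` correspond to the products $\frac{5+\sqrt{2}}{4}C_1$ and $\frac{5-\sqrt{2}}{4}C_2$, so they can be compared directly with the numbers multiplying the exponentials in the final expression.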
2. One important calculus of variations problem that we did not discuss in class has the same basic form, but with constraints that are given by an integral, called isoperimetric constraints:

\[ \min J = \int_{t_0}^{t_f} g[x, \dot{x}, t]\, dt \]

such that:

\[ \int_{t_0}^{t_f} e[x, \dot{x}, t]\, dt = C \]

where we will assume that $t_f$ is free but $x(t_f)$ is fixed.

(a) Use the same approach followed in the notes to find the necessary and boundary conditions for this optimal control problem. In particular, augment the constraint to the cost using a constant Lagrange multiplier vector $\nu$, and show that these conditions can be written in the form:

\[ \frac{\partial g_a}{\partial x} - \frac{d}{dt}\left(\frac{\partial g_a}{\partial \dot{x}}\right) = 0 \]

\[ g_a(t_f) - \frac{\partial g_a}{\partial \dot{x}}\,\dot{x}(t_f) = 0 \]

\[ \int_{t_0}^{t_f} e[x, \dot{x}, t]\, dt = C \]

where $g_a = g + \nu^T e$.

(b) Use the results of part (a) to clearly state the differential equations and corresponding boundary conditions that must be solved to find the curve $y(x)$ of a specified length $L$ with endpoints on the x-axis (i.e., at $x = 0$ and $x = x_f$) that encloses the maximum area, so that:

\[ J = \int_0^{x_f} y\, dx \qquad \text{and} \qquad \int_0^{x_f} \sqrt{1 + \dot{y}^2}\, dx = L \]

with $x_f$ free.
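Answer a):
We look for the necessary conditions when $t_0$, $x(t_0) = x_0$ and $x(t_f) = x_f$ are specified, $t_f$ is free, and $g_a = g + \nu^T e$. The increment of the augmented functional is:

\[ \Delta J = \int_{t_0}^{t_f + \delta t_f} [g + \nu^T e]\, dt - \int_{t_0}^{t_f} [g^* + \nu^T e^*]\, dt \]

\[ = \int_{t_0}^{t_f} \left\{ [g + \nu^T e] - [g^* + \nu^T e^*] \right\} dt + \int_{t_f}^{t_f + \delta t_f} [g + \nu^T e]\, dt \]

The first integrand can be expanded about $x^*(t)$, $\dot{x}^*(t)$ in a Taylor series to give:

\[ \Delta J = \int_{t_0}^{t_f} \left\{ \left[\frac{\partial}{\partial x}(g^* + \nu^T e^*)\right] \delta x(t) + \left[\frac{\partial}{\partial \dot{x}}(g^* + \nu^T e^*)\right] \delta \dot{x}(t) \right\} dt + o(\delta x(t), \delta \dot{x}(t)) + \int_{t_f}^{t_f + \delta t_f} [g + \nu^T e]\, dt \]

The second integral can be written:

\[ \int_{t_f}^{t_f + \delta t_f} [g + \nu^T e]\, dt = [g + \nu^T e](t_f)\, \delta t_f + o(\delta t_f) \]

Substituting this expression for the second integral:

\[ \Delta J = \int_{t_0}^{t_f} \left\{ \left[\frac{\partial}{\partial x}(g^* + \nu^T e^*)\right] \delta x(t) + \left[\frac{\partial}{\partial \dot{x}}(g^* + \nu^T e^*)\right] \delta \dot{x}(t) \right\} dt + [g + \nu^T e](t_f)\, \delta t_f + o(\cdot) \]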
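Integrating by parts the term that contains $\delta \dot{x}(t)$:

\[ \Delta J = \left[\frac{\partial}{\partial \dot{x}}(g^* + \nu^T e^*)\right]_{t_f} \delta x(t_f) + [g + \nu^T e](t_f)\, \delta t_f + \int_{t_0}^{t_f} \left\{ \frac{\partial}{\partial x}(g^* + \nu^T e^*) - \frac{d}{dt}\left[\frac{\partial}{\partial \dot{x}}(g^* + \nu^T e^*)\right] \right\} \delta x(t)\, dt \tag{1} \]

We have also used the fact that $\delta x(t_0) = 0$. Now we express $g + \nu^T e$ at $t_f$ in terms of $g^* + \nu^T e^*$:

\[ [g + \nu^T e](t_f) = [g^* + \nu^T e^*](t_f) + \frac{\partial}{\partial x}[g^* + \nu^T e^*]\, \delta x(t_f) + \frac{\partial}{\partial \dot{x}}[g^* + \nu^T e^*]\, \delta \dot{x}(t_f) + o(\cdot) \]

This term is multiplied by $\delta t_f$, so only its zeroth-order part contributes to first order:

\[ [g + \nu^T e](t_f)\, \delta t_f = [g^* + \nu^T e^*](t_f)\, \delta t_f + o(\cdot) \]

Replacing in equation (1):

\[ \Delta J = \left[\frac{\partial}{\partial \dot{x}}(g^* + \nu^T e^*)\right]_{t_f} \delta x(t_f) + [g^* + \nu^T e^*](t_f)\, \delta t_f + \int_{t_0}^{t_f} \left\{ \frac{\partial}{\partial x}(g^* + \nu^T e^*) - \frac{d}{dt}\left[\frac{\partial}{\partial \dot{x}}(g^* + \nu^T e^*)\right] \right\} \delta x(t)\, dt \tag{2} \]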
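$\delta x(t_f)$, which is neither zero nor free, depends on $\delta t_f$. The variation of $J$, $\delta J$, consists of the first-order terms in the increment $\Delta J$; therefore, the dependence of $\delta x(t_f)$ on $\delta t_f$ must be linearly approximated. Because $x(t_f)$ is fixed:

\[ \delta x(t_f) + \dot{x}^*(t_f)\, \delta t_f = 0 \;\rightarrow\; \delta x(t_f) = -\dot{x}^*(t_f)\, \delta t_f \]

Replacing this term in equation (2):

\[ \delta J(x^*, \delta x) = 0 = \left\{ -\left[\frac{\partial}{\partial \dot{x}}(g^* + \nu^T e^*)\right]_{t_f} \dot{x}^*(t_f) + [g^* + \nu^T e^*](t_f) \right\} \delta t_f + \int_{t_0}^{t_f} \left\{ \frac{\partial}{\partial x}(g^* + \nu^T e^*) - \frac{d}{dt}\left[\frac{\partial}{\partial \dot{x}}(g^* + \nu^T e^*)\right] \right\} \delta x(t)\, dt \]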
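From the last equation we obtain the necessary condition and the boundary condition:

\[ \frac{\partial}{\partial x}(g^* + \nu^T e^*) - \frac{d}{dt}\left[\frac{\partial}{\partial \dot{x}}(g^* + \nu^T e^*)\right] = 0 \]

\[ [g^* + \nu^T e^*](t_f) - \left[\frac{\partial}{\partial \dot{x}}(g^* + \nu^T e^*)\right]_{t_f} \dot{x}^*(t_f) = 0 \]

With:

\[ g_a = g^* + \nu^T e^* \]

Replacing the last terms:

\[ \frac{\partial g_a}{\partial x} - \frac{d}{dt}\left(\frac{\partial g_a}{\partial \dot{x}}\right) = 0 \]

\[ g_a(t_f) - \frac{\partial g_a}{\partial \dot{x}}(t_f)\, \dot{x}^*(t_f) = 0 \]

These two conditions are completed by the isoperimetric constraint itself, which determines the multiplier $\nu$. In the problem of finding the curve of fixed length that encloses the maximum area, this is the length constraint:

\[ \int_{t_0}^{t_f} e[x, \dot{x}, t]\, dt = C \]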
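Answer b):
Here $x$ is the independent variable and $y(x)$ is the curve, so $y$ takes the place of $x(t)$ in part (a). To apply those results directly, the derivation keeps the symbols $t$ and $t_f$ for the independent variable and its free end point $x_f$. First, we identify the functions $g$ and $e$:

\[ g = y \qquad e = \sqrt{1 + \dot{y}^2} \]

\[ \frac{\partial g}{\partial y} = 1 \qquad \frac{\partial e}{\partial y} = 0 \]

\[ \frac{\partial g}{\partial \dot{y}} = 0 \qquad \frac{\partial e}{\partial \dot{y}} = \frac{\dot{y}}{\sqrt{1 + \dot{y}^2}} \]

With the results of part (a):

\[ g_a = y + \nu \sqrt{1 + \dot{y}^2} \qquad \frac{\partial g_a}{\partial y} = 1 \qquad \frac{\partial g_a}{\partial \dot{y}} = \nu \frac{\dot{y}}{\sqrt{1 + \dot{y}^2}} \]

Using the first condition:

\[ \frac{\partial g_a}{\partial y} - \frac{d}{dt}\left(\frac{\partial g_a}{\partial \dot{y}}\right) = 0 \]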
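\[ 1 - \frac{d}{dt}\left(\nu \frac{\dot{y}}{\sqrt{1 + \dot{y}^2}}\right) = 0 \;\rightarrow\; 1 = \nu \frac{d}{dt}\left(\frac{\dot{y}}{\sqrt{1 + \dot{y}^2}}\right) \;\rightarrow\; \frac{1}{\nu} = \frac{d}{dt}\left(\frac{\dot{y}}{\sqrt{1 + \dot{y}^2}}\right) \]

\[ \frac{1}{\nu} = \frac{\sqrt{1 + \dot{y}^2}\, \ddot{y} - \dot{y}\,(1 + \dot{y}^2)^{-1/2}\, \dot{y}\, \ddot{y}}{1 + \dot{y}^2} \]

\[ \frac{1}{\nu} = \frac{\ddot{y}}{(1 + \dot{y}^2)^{3/2}} \]

So that:

\[ \ddot{y} = \frac{1}{\nu} (1 + \dot{y}^2)^{3/2} \]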
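Using the second condition:

\[ g_a(t_f) - \frac{\partial g_a(t_f)}{\partial \dot{y}}\, \dot{y}^*(t_f) = 0 \]

\[ y(t_f) + \nu \sqrt{1 + \dot{y}^2(t_f)} - \nu \frac{\dot{y}(t_f)}{\sqrt{1 + \dot{y}^2(t_f)}}\, \dot{y}(t_f) = 0 \]

\[ y(t_f) = \frac{-\nu}{\sqrt{1 + \dot{y}^2(t_f)}} \]

If $\nu \neq 0$ and $y(t_f) = 0$, replacing these values in the last equation gives:

\[ |\dot{y}(t_f)| \rightarrow \infty \]

that is, the curve meets the x-axis vertically at the free end point. Finally, the isoperimetric constraint must also be satisfied:

\[ \int_{t_0}^{t_f} e[x, \dot{x}, t]\, dt = C \]

Replacing $e$ in this equation:

\[ \int_{t_0}^{t_f} \sqrt{1 + \dot{y}^2(t)}\, dt = L \]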
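
The conditions of Problem 2(b) can be checked numerically on a candidate curve. The sketch below assumes the classical solution of this problem, a semicircle of radius $R = L/\pi$ centred at $(R, 0)$ with multiplier $\nu = -R$. That candidate is an assumption introduced here for illustration, not something derived in the answer, and all function names are hypothetical.

```python
import math


def semicircle(x, radius):
    """Return (y, y', y'') on the upper semicircle centred at (radius, 0).

    Only valid strictly inside 0 < x < 2*radius, where y > 0.
    """
    y = math.sqrt(radius**2 - (x - radius) ** 2)
    dy = -(x - radius) / y
    ddy = -(radius**2) / y**3
    return y, dy, ddy


def euler_residual(x, radius, nu):
    """Residual of y'' = (1/nu) * (1 + y'^2)^(3/2)."""
    _, dy, ddy = semicircle(x, radius)
    return ddy - (1.0 + dy**2) ** 1.5 / nu


def end_condition_residual(x, radius, nu):
    """Residual of y = -nu / sqrt(1 + y'^2), the form taken by the second condition."""
    y, dy, _ = semicircle(x, radius)
    return y + nu / math.sqrt(1.0 + dy**2)


def length_and_area(radius, n=2000):
    """Polyline length and trapezoid-rule area of the curve sampled at n + 1 points."""
    h = 2.0 * radius / n
    xs = [i * h for i in range(n + 1)]
    # max(..., 0.0) guards against tiny negative values caused by rounding at the ends
    ys = [math.sqrt(max(radius**2 - (x - radius) ** 2, 0.0)) for x in xs]
    length = sum(math.hypot(h, ys[i + 1] - ys[i]) for i in range(n))
    area = sum(0.5 * (ys[i] + ys[i + 1]) * h for i in range(n))
    return length, area


L = 2.0             # prescribed length (arbitrary illustrative value)
R = L / math.pi     # radius of the assumed semicircle, so x_f = 2R
NU = -R             # multiplier value that satisfies the equations as written

length, area = length_and_area(R)
print(f"Euler residual at x = 0.3R: {euler_residual(0.3 * R, R, NU):.2e}")
print(f"end-condition residual at x = 1.9R: {end_condition_residual(1.9 * R, R, NU):.2e}")
print(f"length = {length:.5f} (target {L}), area = {area:.5f}")
```

On the semicircle the relation $y = -\nu/\sqrt{1 + \dot{y}^2}$ holds at every point, not only at the end point, which is consistent with the slope becoming unbounded where $y = 0$.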
3. Consider the unstable second order system

\[ \dot{x}_1 = x_2 \]

\[ \dot{x}_2 = x_2 + u \]

and performance index

\[ J = \int_0^{\infty} (R_{xx} x_1^2 + R_{uu} u^2)\, dt \]

(a) For $R_{xx}/R_{uu} = 1$ show analytically (i.e. not using Matlab) that the steady-state LQR gains are:

\[ K = \begin{bmatrix} 1 & \sqrt{3} + 1 \end{bmatrix} \]

and that the closed-loop poles are at $s = -(\sqrt{3} \pm j)/2$.

(b) Using the steady-state regulator gains from Matlab in each case, plot (on one graph) the closed-loop locations for a range of possible values of $R_{xx}/R_{uu}$. Show the pole locations for the expensive control problem. Do you see any trends here?
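Answer a):
We have the following system:

\[ \dot{x} = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u \]

The general form of the cost function is:

\[ J = \frac{1}{2} x^T(t_f) H x(t_f) + \frac{1}{2} \int_{t_0}^{t_f} \left[ x^T(t) Q(t) x(t) + u^T(t) R(t) u(t) \right] dt \]

The performance measure to be minimized is:

\[ J = \int_0^{\infty} (R_{xx} x_1^2 + R_{uu} u^2)\, dt \]

To bring the performance measure closer to the general form, we define:

\[ R_{xx} = 2R_x \qquad R_{uu} = 2R_u \]

From this, $Q$ and $R$ are deduced, knowing that, up to a constant factor that does not change the optimal control, the integrand is:

\[ g(x, u) = R_x\, x^T Q x + R_u\, R u^2 \]

\[ x^T Q x = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_1^2 \]

\[ Q = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \qquad R = 1 \]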
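Now, applying the equation to get the gains $K$:

\[ \dot{K}(t) = -K(t)A(t) - A^T(t)K(t) - Q(t) + K(t)B(t)R^{-1}(t)B^T(t)K(t) \]

\[ \begin{bmatrix} \dot{k}_{11} & \dot{k}_{12} \\ \dot{k}_{21} & \dot{k}_{22} \end{bmatrix} = -\begin{bmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{bmatrix} - \begin{bmatrix} R_x & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} \frac{1}{R_u} \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{bmatrix} \]

\[ = -\begin{bmatrix} R_x & k_{11} + k_{12} \\ k_{11} + k_{21} & k_{12} + k_{21} + 2k_{22} \end{bmatrix} + \frac{1}{R_u} \begin{bmatrix} k_{12} k_{21} & k_{12} k_{22} \\ k_{21} k_{22} & k_{22}^2 \end{bmatrix} \]

\[ = -\begin{bmatrix} R_x - \dfrac{k_{12} k_{21}}{R_u} & k_{11} + k_{12} - \dfrac{k_{12} k_{22}}{R_u} \\ k_{11} + k_{21} - \dfrac{k_{21} k_{22}}{R_u} & k_{12} + k_{21} + 2k_{22} - \dfrac{k_{22}^2}{R_u} \end{bmatrix} \]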
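The matrix $K$ is symmetric ($k_{12} = k_{21}$). The problem sets $R_{xx}/R_{uu} = 1$, and with the substitution $R_{xx} = 2R_x$ and $R_{uu} = 2R_u$ this also means $R_x/R_u = 1$. To simplify the calculations, we choose:

\[ R_x = R_u = 1 \]

We are looking for the value of $K$ in steady state, so $\dot{K} = 0$:

\[ \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = -\begin{bmatrix} 1 - k_{12} k_{21} & k_{11} + k_{12} - k_{12} k_{22} \\ k_{11} + k_{21} - k_{21} k_{22} & k_{12} + k_{21} + 2k_{22} - k_{22}^2 \end{bmatrix} \]

Solving the system of equations with $k_{21} = k_{12}$:

\[ 0 = 1 - k_{12}^2 \]

\[ 0 = k_{11} + k_{12} - k_{12} k_{22} \]

\[ 0 = 2k_{12} + 2k_{22} - k_{22}^2 \]

The first equation gives $k_{12} = \pm 1$, and only $k_{12} = 1$ leads to a real $k_{22}$ in the third equation. With $k_{12} = 1$:

\[ k_{22} = 1 \pm \sqrt{3} \qquad k_{11} = k_{22} - 1 = \pm\sqrt{3} \]

The steady-state solution must be positive semidefinite, which selects the positive roots:

\[ k_{11} = \sqrt{3} \qquad k_{12} = 1 \qquad k_{22} = 1 + \sqrt{3} \]

So:

\[ K = \begin{bmatrix} \sqrt{3} & 1 \\ 1 & 1 + \sqrt{3} \end{bmatrix} \]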
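Now the gains in steady state are:

\[ R^{-1} B^T K = \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} \sqrt{3} & 1 \\ 1 & 1 + \sqrt{3} \end{bmatrix} = \begin{bmatrix} 1 & 1 + \sqrt{3} \end{bmatrix} \]

Calculating closed-loop poles. First:

\[ u^*(t) = -R^{-1}(t) B^T(t) K(t) x(t) \;\rightarrow\; u^*(t) = -\begin{bmatrix} 1 & 1 + \sqrt{3} \end{bmatrix} x(t) \]

Substituting in original system:

\[ \dot{x}(t) = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} x(t) - \begin{bmatrix} 0 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 1 + \sqrt{3} \end{bmatrix} x(t) = \begin{bmatrix} 0 & 1 \\ -1 & -\sqrt{3} \end{bmatrix} x(t) \]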
The closed-loop poles are determined by:

\[ \det(sI - A + B R^{-1} B^T K) = 0 \]

\[ \begin{vmatrix} s & -1 \\ 1 & s + \sqrt{3} \end{vmatrix} = s^2 + \sqrt{3}\, s + 1 = 0 \]

\[ s_{1,2} = -\frac{\sqrt{3} \pm j}{2} \]
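
The result of Problem 3(a) can be verified numerically without any toolbox: substitute $K$ into the steady-state Riccati equation and compute the eigenvalues of the closed-loop matrix. The sketch below uses plain Python lists for the 2x2 algebra; the helper names `matmul` and `transpose` are illustrative assumptions, and $R_x = R_u = 1$ as chosen in the answer.

```python
import cmath
import math


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*X)]


A = [[0.0, 1.0], [0.0, 1.0]]
B = [[0.0], [1.0]]
Q = [[1.0, 0.0], [0.0, 0.0]]   # state weight with R_x = 1
R_INV = 1.0                    # inverse control weight with R_u = 1

s3 = math.sqrt(3.0)
K = [[s3, 1.0], [1.0, 1.0 + s3]]   # candidate steady-state Riccati solution

# Steady-state Riccati residual: K A + A^T K + Q - K B R^{-1} B^T K
KA = matmul(K, A)
ATK = matmul(transpose(A), K)
KB = matmul(K, B)
KBBTK = matmul(KB, transpose(KB))  # equals K B B^T K because K is symmetric
residual = [[KA[i][j] + ATK[i][j] + Q[i][j] - R_INV * KBBTK[i][j]
             for j in range(2)] for i in range(2)]

# Feedback gain R^{-1} B^T K (a row vector)
gain = [R_INV * v for v in matmul(transpose(B), K)[0]]

# Closed-loop matrix A - B*gain and its eigenvalues from trace and determinant
Acl = [[A[i][j] - B[i][0] * gain[j] for j in range(2)] for i in range(2)]
tr = Acl[0][0] + Acl[1][1]
det = Acl[0][0] * Acl[1][1] - Acl[0][1] * Acl[1][0]
disc = cmath.sqrt(tr * tr - 4.0 * det)
poles = [(tr + disc) / 2.0, (tr - disc) / 2.0]

print("Riccati residual:", residual)
print("gain:", gain)
print("poles:", poles)
```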
Answer b):
The larger $R_{xx}/R_{uu}$ is, the larger the gains $K$ become, and the closed-loop poles move away from the imaginary axis. The opposite happens when $R_{xx}/R_{uu}$ is very small (the expensive control problem): the gains decrease and the poles move back toward the imaginary axis.

Figure 1: LQR Gains.

Figure 2: Evolution of the poles.

Figure 3: Real part of the poles versus $R_{xx}/R_{uu}$.
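
The trend described in Problem 3(b) can also be reproduced without Matlab. The sketch below repeats the algebra of part (a) with $R_x = \rho$ and $R_u = 1$, which gives $k_{12} = \sqrt{\rho}$ and $k_{22} = 1 + \sqrt{1 + 2\sqrt{\rho}}$. This closed form is an extension made here for illustration, not a result stated in the answer, and the function names are hypothetical.

```python
import cmath
import math


def lqr_gain(rho):
    """Steady-state gain [k12, k22] for R_x = rho and R_u = 1 (rho > 0).

    From the Riccati equations: 0 = rho - k12^2 and 0 = 2*k12 + 2*k22 - k22^2,
    taking the positive roots.
    """
    k12 = math.sqrt(rho)
    k22 = 1.0 + math.sqrt(1.0 + 2.0 * k12)
    return k12, k22


def closed_loop_poles(rho):
    """Roots of s^2 + (k22 - 1)*s + k12 = 0, the closed-loop characteristic polynomial."""
    k12, k22 = lqr_gain(rho)
    a1 = k22 - 1.0
    disc = cmath.sqrt(a1 * a1 - 4.0 * k12)
    return (-a1 + disc) / 2.0, (-a1 - disc) / 2.0


for rho in (1e-4, 1e-2, 1.0, 1e2, 1e4):
    k12, k22 = lqr_gain(rho)
    p1, p2 = closed_loop_poles(rho)
    print(f"rho = {rho:8.0e}   K = [{k12:.4f}, {k22:.4f}]   poles = {p1:.4f}, {p2:.4f}")
```

Under this closed form, the product of the poles is $\sqrt{\rho}$, so for large $\rho$ their magnitude grows like $\rho^{1/4}$. As $\rho \to 0$ the characteristic polynomial tends to $s^2 + s$, so one pole approaches the origin and the other approaches $s = -1$, the mirror image of the open-loop unstable pole at $s = 1$.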
4. Find the Hamiltonian and then solve the necessary conditions to compute the optimal control and state trajectory that minimize:

\[ J = \int_0^1 u^2\, dt \]

for the system $\dot{x} = -2x + u$ with initial state $x(0) = 2$ and terminal state $x(1) = 0$. Plot the optimal control and state response using Matlab.

Answer:
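The Hamiltonian is given by:

\[ H(x(t), u(t), P(t), t) = g(x(t), u(t), t) + P^T(t)\, a(x(t), u(t), t) \]

\[ H = u^2(t) + P(t)\left[-2x(t) + u(t)\right] \]

With the necessary conditions:

\[ \dot{x}^*(t) = \frac{\partial H}{\partial P} = -2x^*(t) + u^*(t) \]

\[ \dot{P}^*(t) = -\frac{\partial H}{\partial x} = 2P^*(t) \]

\[ \frac{\partial H}{\partial u} = 2u^*(t) + P^*(t) = 0 \]

Now, solving the costate differential equation:

\[ \frac{dP}{dt} = 2P \;\rightarrow\; \int_{P(0)}^{P(t)} \frac{dP}{P} = 2\int_0^t dt \;\rightarrow\; \ln P(t) - \ln P(0) = 2t \]

\[ \ln \frac{P(t)}{P(0)} = 2t \;\rightarrow\; \frac{P(t)}{P(0)} = e^{2t} \]

Replacing $P(t) = P(0)e^{2t}$ in $u^*(t) = -\tfrac{1}{2}P^*(t)$:

\[ u^*(t) = -\frac{1}{2} P(0)\, e^{2t} \]

Replacing $u^*(t)$ in $\dot{x}^*(t)$:

\[ \dot{x}^*(t) = -2x^*(t) - \frac{1}{2} P(0)\, e^{2t} \]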
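Solving the last differential equation:

\[ sX(s) - x(0) = -2X(s) - \frac{P(0)}{2}\, \frac{1}{s - 2} \]

\[ 2(s + 2)\, X(s) = 4 - \frac{P(0)}{s - 2} \]

\[ X(s) = \frac{4s - 8 - P(0)}{2(s - 2)(s + 2)} \]

\[ X(s) = -\frac{1}{2} \left[ \frac{P(0) - 4s + 8}{(s - 2)(s + 2)} \right] \]

\[ X(s) = -\frac{1}{2} \left[ \frac{A}{s + 2} + \frac{B}{s - 2} \right] \]

\[ A = -\frac{P(0) + 16}{4} \qquad B = \frac{P(0)}{4} \]

\[ X(s) = -\frac{1}{8} \left[ -\frac{P(0) + 16}{s + 2} + \frac{P(0)}{s - 2} \right] \]

\[ x(t) = \frac{1}{8} \left[ (P(0) + 16)\, e^{-2t} - P(0)\, e^{2t} \right] \]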
To obtain the value of $P(0)$, we use the terminal condition $x(1) = 0$:

\[ 0 = \frac{1}{8} \left[ (P(0) + 16)\, e^{-2} - P(0)\, e^{2} \right] \]

\[ 0 = P(0) + 16 - P(0)\, e^{4} \]

\[ P(0)(e^{4} - 1) = 16 \]

\[ P(0) = \frac{16}{e^{4} - 1} = 0.2985 \]

So, the solution is:

\[ x^*(t) = 2.0373\, e^{-2t} - 0.0373\, e^{2t} \]

\[ u^*(t) = -0.1493\, e^{2t} \]
Figure 4: States x∗ (t) evolution.
Figure 5: Input u∗ (t) evolution.
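
The closed-form answer to Problem 4 can be checked by simulation: drive $\dot{x} = -2x + u$ with $u^*(t) = -\tfrac{1}{2}P(0)e^{2t}$ from $x(0) = 2$ and confirm that the state reaches $x(1) = 0$. The sketch below is a minimal check in Python rather than the Matlab plots requested by the problem; the function names and the step count are assumptions made here.

```python
import math

P0 = 16.0 / (math.exp(4.0) - 1.0)  # initial costate obtained from x(1) = 0


def u_opt(t):
    """Optimal control u*(t) = -P(0) e^{2t} / 2."""
    return -0.5 * P0 * math.exp(2.0 * t)


def x_opt(t):
    """Closed-form optimal state x*(t)."""
    return ((P0 + 16.0) * math.exp(-2.0 * t) - P0 * math.exp(2.0 * t)) / 8.0


def dynamics(t, x):
    """System x' = -2x + u driven by the optimal control."""
    return -2.0 * x + u_opt(t)


def simulate(n=1000):
    """Integrate the driven system from x(0) = 2 to t = 1 with classical RK4."""
    h = 1.0 / n
    t, x = 0.0, 2.0
    for _ in range(n):
        k1 = dynamics(t, x)
        k2 = dynamics(t + h / 2.0, x + h / 2.0 * k1)
        k3 = dynamics(t + h / 2.0, x + h / 2.0 * k2)
        k4 = dynamics(t + h, x + h * k3)
        x += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += h
    return x


print(f"P(0) = {P0:.4f}")
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"t = {t:4.2f}   x* = {x_opt(t): .4f}   u* = {u_opt(t): .4f}")
print(f"simulated x(1) = {simulate():.2e}")
```

The tabulated values of `x_opt` and `u_opt` are the data one would plot for the state and input responses.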
5. Consider a disturbance rejection problem that minimizes:

\[ J = \frac{1}{2} x(t_f)^T H x(t_f) + \frac{1}{2} \int_{t_0}^{t_f} \left[ x^T(t) R_{xx} x(t) + u(t)^T R_{uu}(t) u(t) \right] dt \tag{1} \]

subject to

\[ \dot{x}(t) = A(t)x(t) + B(t)u(t) + w(t) \tag{2} \]

To handle the disturbance term, the optimal control should consist of both a feedback term and a feed-forward term (assume $w(t)$ is known).

\[ u^*(t) = -K(t)x(t) + u_{fw}(t) \tag{3} \]

Using the Hamilton-Jacobi-Bellman equation, show that a possible optimal value function is of the form:

\[ J^*(x(t), t) = \frac{1}{2} x^T(t) P(t) x(t) + b^T(t) x(t) + \frac{1}{2} c(t) \tag{4} \]

where:

\[ K(t) = R_{uu}^{-1} B^T(t) P(t) \qquad u_{fw} = -R_{uu}^{-1}(t) B^T(t) b(t) \tag{5} \]

In the process demonstrate that the conditions that must be satisfied are:

\[ -\dot{P}(t) = A^T(t)P(t) + P(t)A(t) + R_{xx}(t) - P(t)B(t)R_{uu}^{-1}(t)B^T(t)P(t) \]

\[ \dot{b}(t) = -\left[A(t) - B(t)R_{uu}^{-1}(t)B^T(t)P(t)\right]^T b(t) - P(t)w(t) \]

\[ \dot{c}(t) = b^T(t)B(t)R_{uu}^{-1}(t)B^T(t)b(t) - 2b^T(t)w(t) \]

with boundary conditions: $P(t_f) = H$, $b(t_f) = 0$, $c(t_f) = 0$.

Answer:
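First, we identify the integrand of the cost and the system dynamics:

\[ g(x, u, t) = \frac{1}{2} \left[ x^T R_{xx} x + u^T R_{uu} u \right] \]

\[ a(x, u, w, t) = Ax(t) + Bu(t) + w(t) \]

The Hamiltonian is given by:

\[ H = \frac{1}{2} \left[ x^T R_{xx} x + u^T R_{uu} u \right] + J_x^{*T} \left[ Ax(t) + Bu(t) + w(t) \right] \]

The optimal control must satisfy:

\[ \frac{\partial H}{\partial u} = R_{uu} u + B^T J_x^* = 0 \]

\[ \frac{\partial^2 H}{\partial u^2} = R_{uu} > 0 \]

Thus:

\[ u^* = -R_{uu}^{-1} B^T J_x^* \]

Replacing $u^*$ in the H-J-B equation:

\[ 0 = J_t^* + \frac{1}{2} \left[ x^T R_{xx} x + u^{*T} R_{uu} u^* \right] + J_x^{*T} \left[ Ax + Bu^* + w \right] \]

\[ 0 = J_t^* + \frac{1}{2} \left[ x^T R_{xx} x + (-R_{uu}^{-1} B^T J_x^*)^T R_{uu} (-R_{uu}^{-1} B^T J_x^*) \right] + J_x^{*T} \left[ Ax + Bu^* + w \right] \]

\[ 0 = J_t^* + \frac{1}{2} \left[ x^T R_{xx} x + J_x^{*T} B R_{uu}^{-1} B^T J_x^* \right] + J_x^{*T} \left[ Ax + Bu^* + w \right] \]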
We know that $R_{xx}$ and $R_{uu}$ are symmetric. The boundary value is:

\[ J^*(x(t_f), t_f) = \frac{1}{2} x^T(t_f) H x(t_f) \]

Let us assume a solution that satisfies the differential equation and the boundary conditions.

\[ J^*(x, t) = \frac{1}{2} x^T(t) P(t) x(t) + b^T(t) x(t) + \frac{1}{2} c(t) \tag{6} \]

If $t = t_f$:

\[ P(t_f) = H \qquad b(t_f) = 0 \qquad c(t_f) = 0 \]

From equation (6), taking $P(t)$ symmetric, we can get:

\[ J_x^* = P(t)x(t) + b(t) \]

\[ J_t^* = \frac{1}{2} x^T(t) \dot{P}(t) x(t) + \dot{b}^T(t) x(t) + \frac{1}{2} \dot{c}(t) \]
Replacing in control input:

\[ u^* = -R_{uu}^{-1} B^T \left[ P(t)x(t) + b(t) \right] = -R_{uu}^{-1} B^T P(t) x(t) - R_{uu}^{-1} B^T b(t) \]

Using the definitions in (5):

\[ K(t) = R_{uu}^{-1}(t) B^T(t) P(t) \qquad u_{fw}(t) = -R_{uu}^{-1}(t) B^T(t) b(t) \]

Control input:

\[ u^*(t) = -K(t)x(t) + u_{fw}(t) \]

Substituting the input in the Hamiltonian:

\[ H = \frac{1}{2} \left[ x^T R_{xx} x + J_x^{*T} B R_{uu}^{-1} B^T J_x^* \right] + J_x^{*T} \left[ Ax + B(-R_{uu}^{-1} B^T J_x^*) + w \right] \]

\[ H = \frac{1}{2} x^T R_{xx} x - \frac{1}{2} J_x^{*T} B R_{uu}^{-1} B^T J_x^* + J_x^{*T} \left[ Ax + w \right] \]
Now substituting the input in the HJB equation we get:

\[ 0 = J_t^* + \frac{1}{2} x^T R_{xx} x - \frac{1}{2} J_x^{*T} B R_{uu}^{-1} B^T J_x^* + J_x^{*T} \left[ Ax + w \right] \]

\[ 0 = \frac{1}{2} x^T \dot{P} x + \dot{b}^T x + \frac{1}{2} \dot{c} + \frac{1}{2} x^T R_{xx} x - \frac{1}{2} (Px + b)^T B R_{uu}^{-1} B^T (Px + b) + (Px + b)^T \left[ Ax + w \right] \]

\[ 0 = \frac{1}{2} x^T \dot{P} x + \dot{b}^T x + \frac{1}{2} \dot{c} + \frac{1}{2} x^T R_{xx} x - \frac{1}{2} x^T P B R_{uu}^{-1} B^T P x - \frac{1}{2} x^T P B R_{uu}^{-1} B^T b - \frac{1}{2} b^T B R_{uu}^{-1} B^T P x - \frac{1}{2} b^T B R_{uu}^{-1} B^T b + x^T P A x + x^T P w + b^T A x + b^T w \]

We can see that:

\[ x^T P B R_{uu}^{-1} B^T b = b^T B R_{uu}^{-1} B^T P x \]

Thus the two cross terms combine into one:

\[ 0 = \frac{1}{2} x^T \dot{P} x + \dot{b}^T x + \frac{1}{2} \dot{c} + \frac{1}{2} x^T R_{xx} x - \frac{1}{2} x^T P B R_{uu}^{-1} B^T P x - x^T P B R_{uu}^{-1} B^T b - \frac{1}{2} b^T B R_{uu}^{-1} B^T b + x^T P A x + x^T P w + b^T A x + b^T w \]

Writing $x^T P A x = \frac{1}{2} x^T (PA + A^T P) x$ and $b^T A x = x^T A^T b$, and collecting terms by powers of $x$:

\[ 0 = \frac{1}{2} x^T \left[ \dot{P} + R_{xx} - P B R_{uu}^{-1} B^T P + PA + A^T P \right] x + x^T \left[ \dot{b} - P B R_{uu}^{-1} B^T b + Pw + A^T b \right] + \frac{1}{2} \left[ \dot{c} - b^T B R_{uu}^{-1} B^T b + 2 b^T w \right] \]
The conditions to satisfy the last equation are:

\[ -\dot{P}(t) = A^T(t)P(t) + P(t)A(t) + R_{xx}(t) - P(t)B(t)R_{uu}^{-1}(t)B^T(t)P(t) \]

\[ \dot{b}(t) = -\left[A(t) - B(t)R_{uu}^{-1}(t)B^T(t)P(t)\right]^T b(t) - P(t)w(t) \]

\[ \dot{c}(t) = b^T(t)B(t)R_{uu}^{-1}(t)B^T(t)b(t) - 2b^T(t)w(t) \]
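
The three conditions of Problem 5 can be sanity-checked on a scalar example: integrate $P$, $b$ and $c$ backward from their terminal values, then simulate the closed loop forward and compare the accumulated cost with the prediction $J^* = \tfrac{1}{2}P(t_0)x_0^2 + b(t_0)x_0 + \tfrac{1}{2}c(t_0)$. The sketch below is only an illustration; the numerical values ($A = B = 1$, unit weights, constant disturbance $w = 0.5$, $t_f = 1$) and all names are assumptions chosen here, not data from the problem.

```python
def rk4(f, t, z, h, n):
    """Integrate z' = f(t, z) with n classical RK4 steps of size h (h may be negative)."""
    for _ in range(n):
        k1 = f(t, z)
        k2 = f(t + h / 2.0, [zi + h / 2.0 * ki for zi, ki in zip(z, k1)])
        k3 = f(t + h / 2.0, [zi + h / 2.0 * ki for zi, ki in zip(z, k2)])
        k4 = f(t + h, [zi + h * ki for zi, ki in zip(z, k3)])
        z = [zi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
        t += h
    return z


# Hypothetical scalar data for illustration
A, B = 1.0, 1.0          # unstable scalar plant x' = A x + B u + w
RXX, RUU = 1.0, 1.0      # state and control weights
H_TERM = 0.0             # terminal weight H
W = 0.5                  # known constant disturbance w(t)
T0, TF, X0 = 0.0, 1.0, 1.0
N = 2000                 # number of integration steps


def pbc_rhs(t, z):
    """Right-hand sides of the conditions on P, b and c (scalar case)."""
    P, b, c = z
    dP = -(2.0 * A * P + RXX - P * B / RUU * B * P)
    db = -(A - B / RUU * B * P) * b - P * W
    dc = b * B / RUU * B * b - 2.0 * b * W
    return [dP, db, dc]


def forward_rhs(t, z):
    """Closed-loop state and running cost, with P and b propagated alongside."""
    P, b, x, cost = z
    u = -(B / RUU) * (P * x + b)          # u* = -K x + u_fw
    dP, db, _ = pbc_rhs(t, [P, b, 0.0])
    return [dP, db, A * x + B * u + W, 0.5 * (RXX * x * x + RUU * u * u)]


# Backward pass from the terminal conditions P(tf) = H, b(tf) = 0, c(tf) = 0
P0, b0, c0 = rk4(pbc_rhs, TF, [H_TERM, 0.0, 0.0], -(TF - T0) / N, N)
predicted = 0.5 * P0 * X0**2 + b0 * X0 + 0.5 * c0

# Forward pass: simulate the closed loop and accumulate the cost
P1, b1, x1, running = rk4(forward_rhs, T0, [P0, b0, X0, 0.0], (TF - T0) / N, N)
achieved = running + 0.5 * H_TERM * x1**2

print(f"predicted cost J*(x0, t0) = {predicted:.6f}")
print(f"achieved cost in simulation = {achieved:.6f}")
```

Propagating $P$ and $b$ forward from their initial values, instead of storing the backward trajectory, keeps the example short; this is reasonable only over a short horizon such as the one assumed here.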