
AA203

Optimal and Learning-based Control


Proof of NOC, Pontryagin’s minimum principle
Roadmap
[Roadmap figure: optimal control branches into open-loop methods (indirect: CoV, NOC, PMP; direct methods) and closed-loop methods (DP, HJB/HJI, MPC, LQR, iLQR, DDP, LQG), connecting to adaptive optimal control, model-based RL (linear and non-linear methods), and reachability analysis.]
Necessary conditions for optimal control
(with unbounded controls)
We want to prove that, with unbounded controls, the necessary
optimality conditions are (𝐻 is the Hamiltonian)
$$\dot{\mathbf{x}}^*(t) = \frac{\partial H}{\partial \mathbf{p}}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), \mathbf{p}^*(t), t\big)$$
$$\dot{\mathbf{p}}^*(t) = -\frac{\partial H}{\partial \mathbf{x}}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), \mathbf{p}^*(t), t\big)$$
$$\mathbf{0} = \frac{\partial H}{\partial \mathbf{u}}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), \mathbf{p}^*(t), t\big)$$
for all $t \in [t_0, t_f]$,

along with the boundary conditions:


$$\left[\frac{\partial h}{\partial \mathbf{x}}\big(\mathbf{x}^*(t_f), t_f\big) - \mathbf{p}^*(t_f)\right]^{T}\!\delta\mathbf{x}_f + \left[H\big(\mathbf{x}^*(t_f), \mathbf{u}^*(t_f), \mathbf{p}^*(t_f), t_f\big) + \frac{\partial h}{\partial t}\big(\mathbf{x}^*(t_f), t_f\big)\right]\delta t_f = 0$$
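Here $h$ denotes the terminal penalty and, throughout, the Hamiltonian is the standard one used in the proof below:
$$H\big(\mathbf{x}(t), \mathbf{u}(t), \mathbf{p}(t), t\big) := g\big(\mathbf{x}(t), \mathbf{u}(t), t\big) + \mathbf{p}(t)^T \mathbf{f}\big(\mathbf{x}(t), \mathbf{u}(t), t\big)$$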



Proof of NOC
• For simplicity, assume that the terminal penalty is equal to zero,
and that $t_f$ and $\mathbf{x}(t_f)$ are fixed and given
• Consider the augmented cost function
$$g_a\big(\mathbf{x}(t), \dot{\mathbf{x}}(t), \mathbf{u}(t), \mathbf{p}(t), t\big) := g\big(\mathbf{x}(t), \mathbf{u}(t), t\big) + \mathbf{p}(t)^T\big[\mathbf{f}\big(\mathbf{x}(t), \mathbf{u}(t), t\big) - \dot{\mathbf{x}}(t)\big]$$
where the $\{p_i(t)\}$'s are Lagrange multipliers
• Note that we have simply added zero to the cost function!
• The augmented cost functional is then
$$J_a(\mathbf{u}) = \int_{t_0}^{t_f} g_a\big(\mathbf{x}(t), \dot{\mathbf{x}}(t), \mathbf{u}(t), \mathbf{p}(t), t\big)\, dt$$



Proof of NOC
On an extremal, applying the fundamental theorem of the CoV gives
$$0 = \delta J_a(\mathbf{u}^*) = \int_{t_0}^{t_f} \Bigg\{ \left[\frac{\partial g_a}{\partial \mathbf{x}} - \frac{d}{dt}\frac{\partial g_a}{\partial \dot{\mathbf{x}}}\right]^{T}\!\delta\mathbf{x}(t) + \left[\frac{\partial g_a}{\partial \mathbf{u}}\right]^{T}\!\delta\mathbf{u}(t) + \left[\frac{\partial g_a}{\partial \mathbf{p}}\right]^{T}\!\delta\mathbf{p}(t) \Bigg\}\, dt,$$
where every partial derivative of $g_a$ is evaluated at $\big(\mathbf{x}^*(t), \dot{\mathbf{x}}^*(t), \mathbf{u}^*(t), \mathbf{p}^*(t), t\big)$, and where
$$\frac{\partial g_a}{\partial \mathbf{x}} = \frac{\partial g}{\partial \mathbf{x}}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), t\big) + \left[\frac{\partial \mathbf{f}}{\partial \mathbf{x}}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), t\big)\right]^{T}\!\mathbf{p}^*(t), \qquad \frac{d}{dt}\frac{\partial g_a}{\partial \dot{\mathbf{x}}} = \frac{d}{dt}\big(-\mathbf{p}^*(t)\big),$$
$$\frac{\partial g_a}{\partial \mathbf{p}} = \mathbf{f}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), t\big) - \dot{\mathbf{x}}^*(t)$$



Proof of NOC
Considering each term in sequence,
• $\mathbf{f}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), t\big) - \dot{\mathbf{x}}^*(t) = \mathbf{0}$ on an extremal, so the coefficient of $\delta\mathbf{p}(t)$ vanishes
• The Lagrange multipliers are arbitrary, so we can select them to
make the coefficient of 𝛿𝐱(𝑡) equal to zero, that is

$$\dot{\mathbf{p}}^*(t) = -\frac{\partial g}{\partial \mathbf{x}}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), t\big) - \left[\frac{\partial \mathbf{f}}{\partial \mathbf{x}}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), t\big)\right]^{T}\!\mathbf{p}^*(t)$$
• The remaining variation $\delta\mathbf{u}(t)$ is independent, so its coefficient
must be zero; thus
$$\frac{\partial g}{\partial \mathbf{u}}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), t\big) + \left[\frac{\partial \mathbf{f}}{\partial \mathbf{u}}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), t\big)\right]^{T}\!\mathbf{p}^*(t) = \mathbf{0}$$
Rewriting these conditions in terms of the Hamiltonian $H = g + \mathbf{p}^T\mathbf{f}$ yields the claim.
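As a sanity check, this bookkeeping can be reproduced symbolically. Below is a minimal sketch (assuming SymPy is available); the scalar dynamics $f = -x + u$ and running cost $g = \tfrac{1}{2}(x^2 + u^2)$ are illustrative choices, not an example from these slides:

```python
import sympy as sp

x, u, p = sp.symbols('x u p')
f = -x + u                       # illustrative scalar dynamics, x_dot = f(x, u)
g = (x**2 + u**2) / 2            # illustrative running cost
H = g + p * f                    # Hamiltonian H = g + p^T f

x_dot = sp.diff(H, p)            # state equation:    x_dot =  dH/dp   ->  -x + u
p_dot = -sp.diff(H, x)           # costate equation:  p_dot = -dH/dx   ->  p - x
u_star = sp.solve(sp.Eq(sp.diff(H, u), 0), u)[0]   # dH/du = 0  ->  u* = -p

print(x_dot, p_dot, u_star)
```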
Necessary conditions for optimal control
(with bounded controls)
• So far, we have assumed that the admissible controls and states are
not constrained by any boundaries
• However, in realistic systems, such constraints do commonly occur
• control constraints often occur due to actuation limits
• state constraints often occur due to safety considerations
• We will now consider the case with control constraints, which will
lead to the statement of Pontryagin's minimum principle



Why do control constraints complicate the analysis?
• By definition, the control 𝐮∗ causes the functional 𝐽 to have a
relative minimum if
$$J(\mathbf{u}) - J(\mathbf{u}^*) = \Delta J \ge 0$$
for all admissible controls “close” to 𝐮∗
• If we let 𝐮 = 𝐮∗ + 𝛿𝐮, the increment in 𝐽 can be expressed as
$$\Delta J(\mathbf{u}^*, \delta\mathbf{u}) = \delta J(\mathbf{u}^*, \delta\mathbf{u}) + \text{higher order terms}$$
• The variation 𝛿𝐮 is arbitrary only if the extremal control is strictly
within the boundary for all time in the interval $[t_0, t_f]$
• In general, however, an extremal control lies on a boundary during
at least one subinterval of the interval $[t_0, t_f]$
Why do control constraints complicate the analysis?
• As a consequence, admissible control variations 𝛿𝐮 exist whose
negatives (−𝛿𝐮) are not admissible
• This implies that a necessary condition for 𝐮∗ to minimize 𝐽 is
$$\delta J(\mathbf{u}^*, \delta\mathbf{u}) \ge 0$$
for all admissible variations with $\|\delta\mathbf{u}\|$ small enough
• The reason why the equality (= 0) in the fundamental theorem of
CoV (where we assumed no constraints) is replaced with an
inequality (≥ 0) is the presence of the control constraints
• This result has an analog in calculus, where the necessary condition
for a scalar function 𝑓 to have a relative minimum at an end point is
that the differential 𝑑𝑓 is ≥ 0
Pontryagin’s minimum principle
• Assuming bounded controls 𝐮 ∈ 𝑈, the necessary optimality conditions are
(𝐻 is the Hamiltonian)
$$\dot{\mathbf{x}}^*(t) = \frac{\partial H}{\partial \mathbf{p}}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), \mathbf{p}^*(t), t\big)$$
$$\dot{\mathbf{p}}^*(t) = -\frac{\partial H}{\partial \mathbf{x}}\big(\mathbf{x}^*(t), \mathbf{u}^*(t), \mathbf{p}^*(t), t\big) \qquad \text{for all } t \in [t_0, t_f]$$
$$H\big(\mathbf{x}^*(t), \mathbf{u}^*(t), \mathbf{p}^*(t), t\big) \le H\big(\mathbf{x}^*(t), \mathbf{u}(t), \mathbf{p}^*(t), t\big) \quad \text{for all } \mathbf{u}(t) \in U$$

along with the boundary conditions:


$$\left[\frac{\partial h}{\partial \mathbf{x}}\big(\mathbf{x}^*(t_f), t_f\big) - \mathbf{p}^*(t_f)\right]^{T}\!\delta\mathbf{x}_f + \left[H\big(\mathbf{x}^*(t_f), \mathbf{u}^*(t_f), \mathbf{p}^*(t_f), t_f\big) + \frac{\partial h}{\partial t}\big(\mathbf{x}^*(t_f), t_f\big)\right]\delta t_f = 0$$



Pontryagin’s minimum principle
• $\mathbf{u}^*(t)$ is a control that causes $H\big(\mathbf{x}^*(t), \mathbf{u}(t), \mathbf{p}^*(t), t\big)$ to assume its global minimum
• This is, in general, a harder condition to analyze
• Example: consider the system having state equations:
$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = -x_2(t) + u(t);$$
it is desired to minimize the functional
$$J = \int_{t_0}^{t_f} \frac{1}{2}\big[x_1^2(t) + u^2(t)\big]\, dt$$
with $t_f$ fixed and the final state free



Pontryagin’s minimum principle
Solution:
• If the control is unconstrained,
$$u^*(t) = -p_2^*(t)$$
• If the control is constrained as $|u(t)| \le 1$, then
$$u^*(t) = \begin{cases} -1 & \text{for } 1 < p_2^*(t) \\ -p_2^*(t) & \text{for } -1 \le p_2^*(t) \le 1 \\ +1 & \text{for } p_2^*(t) < -1 \end{cases}$$
• To determine 𝑢∗ 𝑡 explicitly, the state and co-state equations must
be solved (more on this in the next lecture)
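As a preview, here is a hedged numerical sketch of that step (assuming NumPy and SciPy's `solve_bvp`). With zero terminal penalty and the final state free, the transversality condition gives $\mathbf{p}^*(t_f) = \mathbf{0}$; the initial state and horizon below are arbitrary illustrative values, not from the slides:

```python
import numpy as np
from scipy.integrate import solve_bvp

x0, tf = np.array([3.0, 0.0]), 5.0     # illustrative initial state and (fixed) final time

def u_star(p2):
    # PMP control law for this example: u* = -p2, saturated at +/- 1
    return np.clip(-p2, -1.0, 1.0)

def odes(t, y):
    x1, x2, p1, p2 = y
    u = u_star(p2)
    return np.vstack([x2,              # x1_dot =  dH/dp1
                      -x2 + u,         # x2_dot =  dH/dp2
                      -x1,             # p1_dot = -dH/dx1
                      p2 - p1])        # p2_dot = -dH/dx2

def bc(ya, yb):
    # x(t0) fixed; final state free with h = 0  =>  p(t_f) = 0
    return np.array([ya[0] - x0[0], ya[1] - x0[1], yb[2], yb[3]])

t = np.linspace(0.0, tf, 100)
sol = solve_bvp(odes, bc, t, np.zeros((4, t.size)))
print(sol.status)                      # 0 indicates the collocation solver converged
```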



Additional necessary conditions
1. If the final time is fixed and the Hamiltonian does not depend
explicitly on time, then
$$H\big(\mathbf{x}^*(t), \mathbf{u}^*(t), \mathbf{p}^*(t)\big) = c \qquad \text{for all } t \in [t_0, t_f]$$

2. If the final time is free and the Hamiltonian does not depend
explicitly on time, then
$$H\big(\mathbf{x}^*(t), \mathbf{u}^*(t), \mathbf{p}^*(t)\big) = 0 \qquad \text{for all } t \in [t_0, t_f]$$
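Sketch of why (hedged: this is the standard argument in the unconstrained setting, assuming $\mathbf{u}^*$ is differentiable; the bounded-control case needs more care): along an extremal,
$$\frac{dH}{dt} = \frac{\partial H}{\partial \mathbf{x}}^{T}\dot{\mathbf{x}}^{*} + \frac{\partial H}{\partial \mathbf{u}}^{T}\dot{\mathbf{u}}^{*} + \frac{\partial H}{\partial \mathbf{p}}^{T}\dot{\mathbf{p}}^{*} + \frac{\partial H}{\partial t} = -\dot{\mathbf{p}}^{*T}\dot{\mathbf{x}}^{*} + 0 + \dot{\mathbf{x}}^{*T}\dot{\mathbf{p}}^{*} + \frac{\partial H}{\partial t} = \frac{\partial H}{\partial t},$$
so $H$ is constant whenever it has no explicit time dependence; in the free-final-time case, the boundary condition (with a time-independent terminal penalty) forces $H(t_f) = 0$, so the constant is zero.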



Minimum time problems
• Find the control input sequence
$$M_i^- \le u_i(t) \le M_i^+ \quad \text{for } i = 1, \dots, m$$
that drives the control-affine system
$$\dot{\mathbf{x}} = A(\mathbf{x}, t) + B(\mathbf{x}, t)\,\mathbf{u}(t)$$
from an arbitrary state $\mathbf{x}_0$ to the origin, and minimizes time
$$J = \int_{t_0}^{t_f} dt$$



Minimum time problems
• Form the Hamiltonian
$$H = 1 + \mathbf{p}(t)^T\big\{A(\mathbf{x}, t) + B(\mathbf{x}, t)\,\mathbf{u}(t)\big\}$$
$$\;\;\,= 1 + \mathbf{p}(t)^T\big\{A(\mathbf{x}, t) + [\mathbf{b}_1(\mathbf{x}, t)\;\; \mathbf{b}_2(\mathbf{x}, t)\;\cdots\;\mathbf{b}_m(\mathbf{x}, t)]\,\mathbf{u}(t)\big\}$$
$$\;\;\,= 1 + \mathbf{p}(t)^T A(\mathbf{x}, t) + \sum_{i=1}^{m} \mathbf{p}(t)^T \mathbf{b}_i(\mathbf{x}, t)\, u_i(t)$$

• By the PMP, select $u_i(t)$ to minimize $H$, which gives
$$u_i^*(t) = \begin{cases} M_i^+ & \text{if } \mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t) < 0 \\ M_i^- & \text{if } \mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t) > 0 \end{cases} \qquad \text{(bang-bang control)}$$
• Then solve for the co-state equations (more on this in the next lecture)
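A minimal sketch of this selection rule (assuming NumPy; `p`, `B`, `M_lo`, `M_hi` are illustrative placeholders for the costate, input matrix, and bounds on this slide):

```python
import numpy as np

def bang_bang(p, B, M_lo, M_hi):
    """Bang-bang rule: u_i* = M_i^+ if p^T b_i < 0, M_i^- if p^T b_i > 0."""
    s = B.T @ p                                      # s_i = p(t)^T b_i(x, t)
    u = np.where(s < 0.0, M_hi, M_lo)                # minimize s_i * u_i over [M_i^-, M_i^+]
    return np.where(np.isclose(s, 0.0), np.nan, u)   # singular case: PMP gives no information

# Example with two inputs bounded in [-1, 1]
print(bang_bang(np.array([1.0, -2.0]), np.eye(2),
                -np.ones(2), np.ones(2)))            # -> [-1.  1.]
```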
Minimum time problems
• Note: we showed what to do when $\mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t) \ne 0$
• Not obvious what to do if $\mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t) = 0$
• If $\mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t) = 0$ for some finite time interval, then the coefficient
of $u_i(t)$ in the Hamiltonian is zero, so the PMP provides no
information on how to select $u_i(t)$
• The treatment of such a singular condition requires a more
sophisticated analysis
• The analysis in the linear case is significantly easier, see Kirk Sec. 5.4



Minimum fuel problems
• Find the control input sequence
$$M_i^- \le u_i(t) \le M_i^+ \quad \text{for } i = 1, \dots, m$$
that drives the control-affine system
$$\dot{\mathbf{x}} = A(\mathbf{x}, t) + B(\mathbf{x}, t)\,\mathbf{u}(t)$$
from an arbitrary state $\mathbf{x}_0$ to the origin in a fixed time, and minimizes
$$J = \int_{t_0}^{t_f} \sum_{i=1}^{m} c_i\,|u_i(t)|\, dt$$



Minimum fuel problems
• Form the Hamiltonian
$$H = \sum_{i=1}^{m} c_i\,|u_i(t)| + \mathbf{p}(t)^T\big\{A(\mathbf{x}, t) + B(\mathbf{x}, t)\,\mathbf{u}(t)\big\}$$
$$\;\;\,= \sum_{i=1}^{m} c_i\,|u_i(t)| + \mathbf{p}(t)^T A(\mathbf{x}, t) + \sum_{i=1}^{m} \mathbf{p}(t)^T \mathbf{b}_i(\mathbf{x}, t)\, u_i(t)$$
$$\;\;\,= \sum_{i=1}^{m} \big[c_i\,|u_i(t)| + \mathbf{p}(t)^T \mathbf{b}_i(\mathbf{x}, t)\, u_i(t)\big] + \mathbf{p}(t)^T A(\mathbf{x}, t)$$
• By the PMP, select $u_i(t)$ to minimize $H$, that is,
$$\sum_{i=1}^{m} \big[c_i\,|u_i^*(t)| + \mathbf{p}(t)^T \mathbf{b}_i(\mathbf{x}, t)\, u_i^*(t)\big] \le \sum_{i=1}^{m} \big[c_i\,|u_i(t)| + \mathbf{p}(t)^T \mathbf{b}_i(\mathbf{x}, t)\, u_i(t)\big]$$



Minimum fuel problems
• Since the components of $\mathbf{u}(t)$ are independent, one can just look at
$$c_i\,|u_i^*(t)| + \mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t)\, u_i^*(t) \le c_i\,|u_i(t)| + \mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t)\, u_i(t)$$
• The resulting control law is
$$u_i^*(t) = \begin{cases} M_i^- & \text{if } c_i < \mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t) \\ 0 & \text{if } -c_i < \mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t) < c_i \\ M_i^+ & \text{if } \mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t) < -c_i \end{cases}$$
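A minimal sketch of this bang-off-bang ("dead-zone") law (assuming NumPy; the argument names are illustrative placeholders for the quantities on this slide):

```python
import numpy as np

def min_fuel_control(p, B, c, M_lo, M_hi):
    s = B.T @ p                          # s_i = p(t)^T b_i(x, t)
    u = np.zeros_like(s)                 # off:        -c_i < s_i < c_i
    u = np.where(s > c, M_lo, u)         # s_i >  c_i  ->  u_i* = M_i^-
    u = np.where(s < -c, M_hi, u)        # s_i < -c_i  ->  u_i* = M_i^+
    return u

print(min_fuel_control(np.array([2.0, 0.1, -3.0]), np.eye(3),
                       np.ones(3), -np.ones(3), np.ones(3)))   # -> [-1.  0.  1.]
```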



Minimum energy problem
• Find the control input sequence
$$M_i^- \le u_i(t) \le M_i^+ \quad \text{for } i = 1, \dots, m$$
that drives the control-affine system
$$\dot{\mathbf{x}} = A(\mathbf{x}, t) + B(\mathbf{x}, t)\,\mathbf{u}(t)$$
from an arbitrary state $\mathbf{x}_0$ to the origin in a fixed time, and minimizes
$$J = \frac{1}{2}\int_{t_0}^{t_f} \mathbf{u}(t)^T R\,\mathbf{u}(t)\, dt, \qquad \text{where } R > 0 \text{ and diagonal}$$



Minimum energy problem
• Form the Hamiltonian
$$H = \frac{1}{2}\mathbf{u}(t)^T R\,\mathbf{u}(t) + \mathbf{p}(t)^T\big\{A(\mathbf{x}, t) + B(\mathbf{x}, t)\,\mathbf{u}(t)\big\}$$
$$\;\;\,= \frac{1}{2}\mathbf{u}(t)^T R\,\mathbf{u}(t) + \mathbf{p}(t)^T B(\mathbf{x}, t)\,\mathbf{u}(t) + \mathbf{p}(t)^T A(\mathbf{x}, t)$$

• By the PMP, we need to solve

$$\mathbf{u}^*(t) = \arg\min_{\mathbf{u}(t) \in U}\; \sum_{i=1}^{m} \left[\tfrac{1}{2} R_{ii}\, u_i(t)^2 + \mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t)\, u_i(t)\right]$$



Minimum energy problem
• In the unconstrained case, the optimal solution for each component
of 𝐮(𝑡) would be
$$\tilde{u}_i(t) = -R_{ii}^{-1}\,\mathbf{p}(t)^T\mathbf{b}_i(\mathbf{x}, t)$$
• Considering the input constraints, the resulting control law is
$$u_i^*(t) = \begin{cases} M_i^- & \text{if } \tilde{u}_i(t) < M_i^- \\ \tilde{u}_i(t) & \text{if } M_i^- < \tilde{u}_i(t) < M_i^+ \\ M_i^+ & \text{if } M_i^+ < \tilde{u}_i(t) \end{cases}$$
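A minimal sketch of this saturated law (assuming NumPy; since $R$ is diagonal, its inverse acts componentwise; the argument names are illustrative placeholders):

```python
import numpy as np

def min_energy_control(p, B, R_diag, M_lo, M_hi):
    u_tilde = -(B.T @ p) / R_diag        # unconstrained minimizer: -R_ii^{-1} p^T b_i
    return np.clip(u_tilde, M_lo, M_hi)  # saturate at the input bounds

print(min_energy_control(np.array([0.5, -4.0]), np.eye(2),
                         np.array([1.0, 2.0]), -np.ones(2), np.ones(2)))   # -> [-0.5  1. ]
```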



Uniqueness and existence
• Note: uniqueness and existence are not in general guaranteed!

• Example 1 (non-uniqueness): find a control sequence $u(t)$ to transfer the
system $\dot{x}(t) = u(t)$ from an arbitrary initial state $x_0$ to the origin, and
such that the functional $J = \int_{0}^{t_f} |u(t)|\, dt$ is minimized. The final time is
free, and the admissible controls are constrained as $|u(t)| \le 1$

• Example 2 (non-existence): find a control sequence $u(t)$ to transfer the
system $\dot{x}(t) = -x(t) + u(t)$ from an arbitrary initial state $x_0$ to the
origin, and such that the functional $J = \int_{t_0}^{t_f} |u(t)|\, dt$ is minimized. The
final time is free, and the admissible controls are constrained as $|u(t)| \le 1$



Next time
• Numerical methods for indirect optimal control

