
Optimal Control and Applications


Session 1
2nd AstroNet-II Summer School

Victor M. Becerra
School of Systems Engineering
University of Reading

5-6 June 2013


Contents of Session 1

Introduction to optimal control


A brief on constrained nonlinear optimisation
Introduction to the calculus of variations
The optimal control problem
Optimality conditions
Pontryagin’s minimum principle
Equality path constraints
Singular arcs
Inequality path constraints


Contents of Session 2

Direct collocation methods for optimal control


Local discretisations
Global (pseudospectral) discretisations
Practical aspects
Multi-phase problems
Efficient sparse differentiation
Scaling
Estimating the discretisation accuracy
Mesh refinement


Introduction to optimal control

For the purposes of these lectures, a dynamical system can be


described by a system of ordinary differential equations, as follows:

ẋ = f[x(t), u(t), t]
where x : [t0, tf] → R^nx is the state, u : [t0, tf] → R^nu is the
control, t is time, and f : R^nx × R^nu × R → R^nx. Suppose that we
are interested in an interval of time such that t ∈ [t0, tf].


Introduction to optimal control

We are also interested in a measure of the performance of the


system along a trajectory.
We are going to quantify the performance of the system by
means of a functional J, this is a rule of correspondence that
assigns a value to a function or set of functions.
The functional of interest depends on the state history x(t)
and the control history u(t) over the period of interest
t ∈ [t0 , tf ], the initial time t0 and the final time tf .

J = J[x(·), u(·), t0 , tf ]


Introduction to optimal control

The performance index J might include, for example, measures of:


control effort
fuel consumption
energy expenditure
time taken to reach a target


Introduction to optimal control


What is optimal control?
Optimal control is the process of finding control and state histories
for a dynamic system over a period of time to minimise or
maximise a performance index.


Introduction to optimal control

Optimal control theory has been developed over a number of
centuries.

It is closely related to the theory of the calculus of variations,
which started in the late 1600s.

It is also related to developments in dynamic programming and
nonlinear programming after the Second World War.

The digital computer has enabled many complex optimal control
problems to be solved.

Optimal control is a fundamental tool in space mission design, as
well as in many other fields.


Introduction to optimal control

As a form of motivation, consider the design of a zero propellant
manoeuvre for the International Space Station by means of control
moment gyroscopes (CMGs).

The example work was reported by Bhatt (2007) and Bedrossian and
co-workers (2009).

The original 90 and 180 degree manoeuvres were computed using a
pseudospectral method.

Image courtesy of nasaimages.org.


Introduction to optimal control

Implemented on the International Space Station on 5 November 2006
and 2 January 2007.

Savings for NASA of around US$1.5m in propellant costs.

The case described below corresponds to a 90° manoeuvre lasting
7200 seconds and using 3 CMGs.


Introduction to optimal control


The problem is formulated as follows. Find
qc(t) = [qc,1(t) qc,2(t) qc,3(t) qc,4(t)]^T, t ∈ [t0, tf], and the
scalar parameter γ, to minimise

J = 0.1 γ + ∫_{t0}^{tf} ||u(t)||² dt

subject to the dynamical equations:

q̇(t) = (1/2) T(q) (ω(t) − ωo(q))
ω̇(t) = J⁻¹ (τd(q) − ω(t) × (Jω(t)) − u(t))
ḣ(t) = u(t) − ω(t) × h(t)

where J is a 3 × 3 inertia matrix, q = [q1, q2, q3, q4]^T is the
quaternion vector, ω is the spacecraft angular rate relative to an
inertial reference frame and expressed in the body frame, h is the
momentum, t0 = 0 s, tf = 7200 s.

Introduction to optimal control


The path constraints:

||q(t)||₂² = 1
||qc(t)||₂² = 1
||h(t)||₂² ≤ γ
||ḣ(t)||₂² ≤ ḣ²max

The parameter bounds:

0 ≤ γ ≤ h²max

And the boundary conditions:

q(t0) = q̄0,  ω(t0) = ωo(q̄0),  h(t0) = h̄0
q(tf) = q̄f,  ω(tf) = ωo(q̄f),  h(tf) = h̄f


Introduction to optimal control

T(q) is given by:

T(q) = [ −q2  −q3  −q4 ]
       [  q1  −q4   q3 ]
       [  q4   q1  −q2 ]
       [ −q3   q2   q1 ]


Introduction to optimal control

u is the control torque, which is given by:

u(t) = J (KP ε̃(q, qc) + KD ω̃(ω, ωc))

where

ε̃(q, qc) = 2 T(qc)^T q
ω̃(ω, ωc) = ω − ωc


Introduction to optimal control

ωo is given by:

ωo(q) = n C2(q)

where n is the orbital rotation rate and Cj is the j-th column of the
rotation matrix:

C(q) = [ 1 − 2(q3² + q4²)   2(q2q3 + q1q4)     2(q2q4 − q1q3)   ]
       [ 2(q2q3 − q1q4)     1 − 2(q2² + q4²)   2(q3q4 + q1q2)   ]
       [ 2(q2q4 + q1q3)     2(q3q4 − q1q2)     1 − 2(q2² + q3²) ]

τd is the disturbance torque, which incorporates the gravity gradient
torque τgg and the aerodynamic torque τa:

τd = τgg + τa = 3n² C3(q) × (J C3(q)) + τa


Introduction to optimal control

Image courtesy of nasaimages.org

Video animation of the zero propellant manoeuvre using real
telemetry:

http://www.youtube.com/watch?v=MIp27Ea9_2I


Introduction to optimal control

Numerical results obtained by the author with PSOPT using a


pseudospectral discretisation method with 60 grid points,
neglecting the aerodynamic torque.
[Figure: Zero Propellant Manoeuvre of the ISS — left: control torques
u1, u2, u3 (ft-lbf) versus time (s); right: momentum norm h
(ft-lbf-sec) against the limit hmax, over 0–8000 s.]


Introduction to optimal control

Ingredients for solving realistic optimal control problems


Introduction to optimal control

Indirect methods
Optimality conditions of continuous problems are derived
Attempt to numerically satisfy the optimality conditions
Requires numerical root finding and ODE methods.
Direct methods
Continuous optimal control problem is discretised
Need to solve a nonlinear programming problem
Later in these lectures, we will concentrate on some direct
methods.


A brief on constrained nonlinear optimisation

Consider a twice continuously differentiable scalar function
f : R^ny → R. Then y* is called a strong minimiser of f if there
exists δ > 0 such that

f(y*) < f(y* + Δy)

for all 0 < ||Δy|| ≤ δ.

Similarly, y* is called a weak minimiser if it is not strong and
there is δ > 0 such that f(y*) ≤ f(y* + Δy) for all 0 < ||Δy|| ≤ δ.
The minimum is said to be local if δ < ∞ and global if δ = ∞.


A brief on constrained nonlinear optimisation

The gradient of f (y) with respect to y is a column vector


constructed as follows:
 
∇y f(y) = (∂f/∂y)^T = fy^T = [ ∂f/∂y1  ...  ∂f/∂y_ny ]^T


A brief on constrained nonlinear optimisation

The second derivative of f (y) with respect to y is an ny × ny


symmetric matrix known as the Hessian and is expressed as
follows:
∇²y f(y) = fyy =
    [ ∂²f/∂y1²       ...  ∂²f/∂y1∂y_ny ]
    [     ...        ...      ...      ]
    [ ∂²f/∂y_ny∂y1   ...  ∂²f/∂y_ny²   ]


A brief on constrained nonlinear optimisation

Consider the second order Taylor expansion of f about y∗ :


f(y* + Δy) ≈ f(y*) + fy(y*)Δy + (1/2) Δy^T fyy(y*) Δy
A basic result in unconstrained nonlinear programming is the
following set of conditions for a local minimum.
Necessary conditions:
fy (y∗ ) = 0, i.e. y∗ must be a stationary point of f .
fyy (y∗ ) ≥ 0, i.e. the Hessian matrix is positive semi-definite.
Sufficient conditions:
fy (y∗ ) = 0, i.e. y∗ must be a stationary point of f .
fyy (y∗ ) > 0, i.e. the Hessian matrix is positive definite.


A brief on constrained nonlinear optimisation


For example, the paraboloid f(x) = x1² + x2² has a global minimum at
x* = [0, 0]^T, where the gradient is fx(x*) = [0, 0] and the Hessian
is positive definite:

fxx(x*) = [ 2  0 ]
          [ 0  2 ]
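As a quick numerical check of these conditions, the sketch below (assuming Python with NumPy is available) approximates the gradient and Hessian of f at x* = [0, 0]^T by central differences and confirms that the gradient vanishes and the Hessian is positive definite:

```python
import numpy as np

def f(x):
    # Paraboloid f(x) = x1^2 + x2^2
    return x[0]**2 + x[1]**2

def grad_fd(f, x, h=1e-6):
    # Central-difference approximation of the gradient
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def hess_fd(f, x, h=1e-4):
    # Central-difference approximation of the Hessian
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h**2)
    return H

x_star = np.array([0.0, 0.0])
g = grad_fd(f, x_star)        # stationarity: gradient should vanish
H = hess_fd(f, x_star)        # should approximate [[2, 0], [0, 2]]
eigs = np.linalg.eigvalsh(H)  # all positive => positive definite
```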


A brief on constrained nonlinear optimisation

Constrained nonlinear optimisation involves finding a decision


vector y ∈ Rny to minimise:

f (y)

subject to
hl ≤ h(y) ≤ hu ,
yl ≤ y ≤ yu ,
where f : Rny 7→ R is a twice continuously differentiable scalar
function, and h : Rny 7→ Rnh is a vector function with twice
continuously differentiable elements hi . Problems such as the
above are often called nonlinear programming (NLP) problems.


A brief on constrained nonlinear optimisation

A particular case that is simple to analyse occurs when the


nonlinear programming problem involves only equality constraints,
such that the problem can be expressed as follows. Choose y to
minimise
f (y)
subject to
c(y) = 0,
where c : Rny 7→ Rnc is a vector function with twice continuously
differentiable elements ci , and nc < ny .


A brief on constrained nonlinear optimisation

Introduce the Lagrangian function:

L[y, λ] = f(y) + λ^T c(y)            (1)

where L : R^ny × R^nc → R is a scalar function and λ ∈ R^nc is a
vector of Lagrange multipliers.


A brief on constrained nonlinear optimisation

The first derivative of c(y) with respect to y is an nc × ny matrix
known as the Jacobian and is expressed as follows:

∂c(y)/∂y = cy(y) =
    [ ∂c1/∂y1     ...  ∂c1/∂y_ny   ]
    [    ...      ...     ...      ]
    [ ∂c_nc/∂y1   ...  ∂c_nc/∂y_ny ]


A brief on constrained nonlinear optimisation

The first order necessary conditions for a point (y*, λ) to be a
local minimiser are as follows:

Ly[y*, λ]^T = fy(y*)^T + [cy(y*)]^T λ = 0
Lλ[y*, λ]^T = c(y*) = 0                                  (2)

Note that equation (2) is a system of ny + nc equations with
ny + nc unknowns.


A brief on constrained nonlinear optimisation

For a scalar function f(x) and a single equality constraint
c(x) = 0, the figure illustrates the collinearity between the
gradient of f and the gradient of c at a stationary point, i.e.
∇x f(x*) = −λ∇x c(x*).


A brief on constrained nonlinear optimisation

For example, consider the minimisation of the function
f(x) = x1² + x2² subject to the linear equality constraint
c(x) = 2x1 + x2 + 4 = 0.

The first order necessary conditions lead to a system of three
linear equations whose solution is the minimiser x*.
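The three equations are linear, so the minimiser can be found with one linear solve. A minimal sketch (assuming NumPy is available): the stationarity conditions 2x1 + 2λ = 0 and 2x2 + λ = 0, together with the constraint, form the system below.

```python
import numpy as np

# First-order conditions for min x1^2 + x2^2  s.t.  2*x1 + x2 + 4 = 0:
#   2*x1        + 2*lam = 0
#          2*x2 +   lam = 0
#   2*x1 +   x2         = -4
A = np.array([[2.0, 0.0, 2.0],
              [0.0, 2.0, 1.0],
              [2.0, 1.0, 0.0]])
b = np.array([0.0, 0.0, -4.0])
x1, x2, lam = np.linalg.solve(A, b)   # minimiser and multiplier
```

The solution is x* = (−1.6, −0.8) with λ = 1.6, which satisfies the constraint exactly.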


A brief on constrained nonlinear optimisation

In fairly simple cases, the necessary conditions may help find the
solution. In general, however,
There is no analytic solution
It is necessary to solve the problem numerically
Numerical methods approach the solution iteratively.


A brief on constrained nonlinear optimisation

If the nonlinear programming problem also includes a set of


inequality constraints, then the resulting problem is as follows.
Choose y to minimise
f (y)
subject to
c(y) = 0,
q(y) ≤ 0
where q : Rny 7→ Rnq is a twice continuously differentiable vector
function.


A brief on constrained nonlinear optimisation

The Lagrangian function is redefined to include the inequality
constraint function:

L[y, λ, µ] = f(y) + λ^T c(y) + µ^T q(y)            (3)

where µ ∈ R^nq is a vector of Karush-Kuhn-Tucker multipliers.


Definition
Active and inactive constraints: An inequality constraint qi(y) is
said to be active at y if qi(y) = 0. It is inactive at y if
qi(y) < 0.

Definition
Regular point: Let y* satisfy c(y*) = 0 and q(y*) ≤ 0, and let
Ic (y∗ ) be the index set of active inequality constraints. Then we
say that y∗ is a regular point if the gradient vectors:

∇y ci (y∗ ), ∇y qj (y∗ ), 1 ≤ i ≤ nc , j ∈ Ic (y∗ )

are linearly independent.


A brief on constrained nonlinear optimisation

In this case, the first order necessary conditions for a local


minimiser are the well known Karush-Kuhn-Tucker (KKT)
conditions, which are stated as follows.
If y* is a regular point and a local minimiser, then:

µ ≥ 0
Ly[y*, λ, µ]^T = fy(y*)^T + [cy(y*)]^T λ + [qy(y*)]^T µ = 0
Lλ[y*, λ, µ]^T = c(y*) = 0                                  (4)
µ^T q(y*) = 0
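As an illustration of the KKT conditions, consider the hypothetical problem min x1² + x2² subject to q(x) = 1 − x1 ≤ 0 (an example chosen here, not from the slides). The sketch below checks stationarity, dual feasibility and complementarity at the candidate x* = (1, 0):

```python
import numpy as np

# The unconstrained minimiser (0, 0) violates q(x) = 1 - x1 <= 0,
# so the constraint is active and the candidate is x* = (1, 0).
x_star = np.array([1.0, 0.0])
grad_f = 2 * x_star              # gradient of the objective
grad_q = np.array([-1.0, 0.0])   # gradient of the constraint q
mu = 2.0                         # KKT multiplier making L_y vanish

stationarity = grad_f + mu * grad_q   # should be the zero vector
q_val = 1 - x_star[0]                 # active: q(x*) = 0
complementarity = mu * q_val          # mu^T q(x*) should be 0
```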


A brief on constrained nonlinear optimisation

Methods for the solution of nonlinear programming problems


are well established.
Well known methods include Sequential Quadratic
Programming (SQP) and Interior Point methods.
A quadratic programming problem is a special type of
nonlinear programming problem where the constraints are
linear and the objective is a quadratic function.


A brief on constrained nonlinear optimisation

SQP methods involve the solution of a sequence of quadratic
programming sub-problems; at each iteration, the sub-problem gives
a search direction, which is used to find the next iterate through
a line search based on a merit (penalty) function.

Interior point methods iteratively enforce the KKT conditions,
performing a line search at each major iteration based on a barrier
function to find the next iterate.


A brief on constrained nonlinear optimisation

Current NLP implementations incorporate developments in numerical
linear algebra that exploit sparsity in matrices.

It is common to numerically solve NLP problems involving variables
and constraints in the order of hundreds of thousands.

Well known large scale NLP solvers include SNOPT (SQP) and IPOPT
(interior point).


A quick introduction to calculus of variations

Definition (Functional)
A functional J is a rule of correspondence that assigns to each
function x a unique real number.

For example, suppose that x is a continuous function of t defined
on the interval [t0, tf] and

J(x) = ∫_{t0}^{tf} x(t) dt

is the real number assigned by the functional J (in this case, J is
the area under the x(t) curve).


A quick introduction to calculus of variations


The increment of a functional J is defined as follows:

ΔJ = J(x + δx) − J(x)            (5)

where δx is called the variation of x.


A quick introduction to calculus of variations

The variation of a functional J is the part of the increment ΔJ
that is linear in the variation δx, in the limit as ||δx|| → 0.

Example: Find the variation δJ of the following functional:

J(x) = ∫_0^1 x²(t) dt

Solution:

ΔJ = J(x + δx) − J(x) = ∫_0^1 [2x(t)δx(t) + δx(t)²] dt

so, retaining the first order term,

δJ = ∫_0^1 2x(t)δx(t) dt
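The first-order nature of δJ can be checked numerically. In the sketch below (plain Python, with x(t) = t and δx(t) = ε·t chosen for illustration), the increment is ΔJ = (2ε + ε²)/3 while the variation is δJ = 2ε/3, so their difference should be the second-order term ε²/3:

```python
def J(scale, n=100000):
    # Midpoint-rule approximation of ∫_0^1 (scale*t)^2 dt
    h = 1.0 / n
    return sum((scale * (i + 0.5) * h) ** 2 for i in range(n)) * h

eps = 1e-3
delta_J_num = J(1 + eps) - J(1)          # increment ΔJ, with x + δx = (1+eps)*t
delta_J_lin = 2 * eps / 3                # variation δJ = ∫ 2x δx dt = 2*eps/3
second_order = delta_J_num - delta_J_lin # should be ~ eps^2 / 3
```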


A quick introduction to calculus of variations


In general, the variation of a differentiable functional of the
form:

J = ∫_{t0}^{tf} L(x) dt

can be calculated as follows for fixed t0 and tf:

δJ = ∫_{t0}^{tf} (∂L(x)/∂x) δx dt

If t0 and tf are also subject to small variations δt0 and δtf, then
the variation of J can be calculated as follows:

δJ = L(x(tf))δtf − L(x(t0))δt0 + ∫_{t0}^{tf} (∂L(x)/∂x) δx dt


A quick introduction to calculus of variations

If x is a vector function of dimension n, then we have, for fixed
t0 and tf:

δJ = ∫_{t0}^{tf} (∂L(x)/∂x) δx dt
   = ∫_{t0}^{tf} Σ_{i=1}^{n} (∂L(x)/∂xi) δxi dt


A quick introduction to calculus of variations

A functional J has a local minimum at x* if for all functions x in
the vicinity of x* we have:

ΔJ = J(x) − J(x*) ≥ 0

For a local maximum, the condition is:

ΔJ = J(x) − J(x*) ≤ 0


A quick introduction to calculus of variations

Important result
If x ∗ is a local minimiser or maximiser, then the variation of J is
zero:
δJ[x ∗ , δx] = 0


A quick introduction to calculus of variations

A basic problem in calculus of variations is the minimisation or
maximisation of the following integral with fixed t0, tf, x(t0),
x(tf):

J = ∫_{t0}^{tf} L[x(t), ẋ(t), t] dt

for a given function L. To solve this problem, we need to compute
the variation of J for a variation in x. This results in:

δJ = ∫_{t0}^{tf} [∂L/∂x − d/dt (∂L/∂ẋ)] δx dt

A quick introduction to calculus of variations

Setting δJ = 0 results in the Euler-Lagrange equation:

∂L/∂x − d/dt (∂L/∂ẋ) = 0

This equation needs to be satisfied by the function x* which
minimises or maximises J. This type of solution is called an
extremal solution.


A quick introduction to calculus of variations

Example
Find an extremal for the following functional with x(0) = 0,
x(π/2) = 1:

J(x) = ∫_0^{π/2} [ẋ²(t) − x²(t)] dt

Solution: the Euler-Lagrange equation gives −2x − 2ẍ = 0, such that
ẍ + x = 0.

The solution of this second order ODE is x(t) = c1 cos t + c2 sin t.
Using the boundary conditions gives c1 = 0 and c2 = 1, such that
the extremal is:

x*(t) = sin t
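The extremal can be verified numerically. A sketch (plain Python) checking that x*(t) = sin t satisfies ẍ + x = 0 on [0, π/2], using a central finite difference for the second derivative:

```python
import math

def residual(t, h=1e-4):
    # Central-difference approximation of xddot, for x(t) = sin(t),
    # substituted into the Euler-Lagrange equation xddot + x = 0.
    xddot = (math.sin(t + h) - 2 * math.sin(t) + math.sin(t - h)) / h**2
    return xddot + math.sin(t)

# Largest residual over a grid covering [0, pi/2]
max_res = max(abs(residual(0.01 * k)) for k in range(158))
```

The residual is at the level of the finite-difference and rounding error, confirming the extremal.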


A general optimal control problem

Optimal control problem
Determine the state x(t), the control u(t), as well as t0 and tf,
to minimise

J = Φ[x(t0), t0, x(tf), tf] + ∫_{t0}^{tf} L[x(t), u(t), t] dt

subject to:

Differential constraints:
ẋ = f[x, u, t]

Algebraic path constraints:
hL ≤ h[x(t), u(t), t] ≤ hU

Boundary conditions:
ψ[x(t0), t0, x(tf), tf] = 0


A general optimal control problem

Important features include:


It is a functional minimisation problem, as J is a functional.
Hence we need to use calculus of variations to derive its
optimality conditions.
All optimal control problems have constraints.


A general optimal control problem

There are various classes of optimal control problems:


Free and fixed time problems
Free and fixed terminal/initial states
Path constrained and unconstrained problems
...


A general optimal control problem

For the sake of simplicity, the first order optimality conditions
for the general optimal control problem will be given ignoring the
path constraints.

The derivation involves constructing an augmented performance
index:

Ja = Φ + ν^T ψ + ∫_{t0}^{tf} [L + λ^T (f − ẋ)] dt

where ν and λ(t) are Lagrange multipliers for the boundary
conditions and the differential constraints, respectively. λ(t) is
also known as the co-state vector. The first variation of Ja, δJa,
needs to be calculated. We assume that u(t), x(t), x(t0), x(tf),
t0, tf, λ(t), λ(t0), λ(tf), and ν are free to vary.


A general optimal control problem


Define the Hamiltonian function as follows:

H[x(t), u(t), λ(t), t] = L[x(t), u(t), t] + λ(t)^T f[x(t), u(t), t]

After some work, the variation δJa can be expressed as follows:

δJa = δΦ + δν^T ψ + ν^T δψ + δ ∫_{t0}^{tf} [H − λ^T ẋ] dt

where:

δΦ = (∂Φ/∂x(t0)) δx0 + (∂Φ/∂t0) δt0 + (∂Φ/∂x(tf)) δxf + (∂Φ/∂tf) δtf

δψ = (∂ψ/∂x(t0)) δx0 + (∂ψ/∂t0) δt0 + (∂ψ/∂x(tf)) δxf + (∂ψ/∂tf) δtf

δx0 = δx(t0) + ẋ(t0)δt0,    δxf = δx(tf) + ẋ(tf)δtf

A general optimal control problem

and

δ ∫_{t0}^{tf} [H − λ^T ẋ] dt = λ^T(t0)δx0 − λ^T(tf)δxf
    + H(tf)δtf − H(t0)δt0
    + ∫_{t0}^{tf} [(∂H/∂x + λ̇^T)δx + (∂H/∂λ − ẋ^T)δλ + (∂H/∂u)δu] dt


A general optimal control problem

Assuming that initial and terminal times are free, and that the
control is unconstrained, setting δJa = 0 results in the following
first order necessary optimality conditions:

ẋ = (∂H/∂λ)^T = f[x(t), u(t), t]    (state dynamics)

λ̇ = −(∂H/∂x)^T                      (costate dynamics)

∂H/∂u = 0                            (stationarity condition)


A general optimal control problem


In addition we have the following conditions at the boundaries:

ψ[x(t0), t0, x(tf), tf] = 0

[λ(t0) + (∂Φ/∂x(t0))^T + (∂ψ/∂x(t0))^T ν]^T δx0
    + [−H(t0) + ∂Φ/∂t0 + ν^T (∂ψ/∂t0)] δt0 = 0

[−λ(tf) + (∂Φ/∂x(tf))^T + (∂ψ/∂x(tf))^T ν]^T δxf
    + [H(tf) + ∂Φ/∂tf + ν^T (∂ψ/∂tf)] δtf = 0

Note that δx0 depends on δt0, so in the second equation above the
terms that multiply the variations δx0 and δt0 cannot be made zero
independently. Something similar happens in the third equation.

A general optimal control problem

The conditions above are necessary conditions, so solutions
(x*(t), λ*(t), u*(t), ν*) that satisfy them are extremal solutions.

They are only candidate solutions.

Each extremal solution can be a minimum, a maximum or a saddle.


A general optimal control problem

Note the dynamics of the state - co-state system:

ẋ = ∇λ H
λ̇ = −∇x H

This is a Hamiltonian system.


Boundary conditions are at the beginning and at the end of
the interval, so this is a two point boundary value problem.
If H is not an explicit function of time, H = constant.
If the Hamiltonian is not an explicit function of time and tf is
free, then H(t) = 0.
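A minimal sketch illustrating the constancy of H: for the simple problem min ∫₀¹ (1/2)u² dt with ẋ = u, x(0) = 0, x(1) = 1 (an example chosen here for illustration, not from the slides), the optimality conditions give a constant costate λ = −1 and control u = −λ = 1, and H = (1/2)u² + λu is conserved along the extremal since it is not an explicit function of time:

```python
lam = -1.0          # costate: lamdot = -dH/dx = 0, so lam is constant
u = -lam            # stationarity: dH/du = u + lam = 0

H_values = []
x, dt = 0.0, 0.01
for _ in range(100):                  # forward-Euler integration of xdot = u
    H_values.append(0.5 * u**2 + lam * u)
    x += u * dt

spread = max(H_values) - min(H_values)  # H should not vary along the path
```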


Example: The brachistochrone problem

This problem was solved by Bernoulli in 1696. A mass m moves in a
constant force field of magnitude g, starting at rest at the point
(x, y) = (x0, y0) at time 0. It is desired to find the path of
minimum time to a specified final point (x1, y1).


Example: The brachistochrone problem

The equations of motion are:

ẋ = V cos θ
ẏ = V sin θ

where V(t) = √(2g y(t)). The performance index (we are after
minimum time) can be expressed as follows:

J = ∫_0^{tf} 1 dt = tf

and the boundary conditions on the states are
(x(0), y(0)) = (x0, y0), (x(tf), y(tf)) = (x1, y1).


Example: The brachistochrone problem

In this case, the Hamiltonian function is:

H = 1 + λx(t) V(t) cos θ(t) + λy(t) V(t) sin θ(t)

The costate equations are:

λ̇x = 0
λ̇y = −(g/V)(λx cos θ + λy sin θ)

and the stationarity condition is:

0 = −λx V sin θ + λy V cos θ  =⇒  tan θ = λy/λx


Example: The brachistochrone problem


We need to satify the boundary conditions:
   
x(0) − x0 0
 y (0) − y0  0
ψ= x(tf ) − x1  = 0
  

y (tf ) − y1 0

Also since the Hamiltonian is not an explicit function of time and


tf is free, we have that H(t) = 0. Using this, together with the
stationarity condition we can write:

cos θ
λx = −
V
sin θ
λy = −
V

Example: The brachistochrone problem

After some work with these equations, we can write:

y = y1 cos²θ / cos²θ(tf)

x = x1 + [y1 / (2 cos²θ(tf))] [2(θ(tf) − θ) + sin 2θ(tf) − sin 2θ]

tf − t = √(2y1/g) (θ − θ(tf)) / cos θ(tf)

Given the current position (x, y), these three equations can be
used to solve for θ(t) and θ(tf) and the time to go tf − t.


Example: The brachistochrone problem

Using the above results, it is possible to find the states:

y(t) = a(1 − cos φ(t))
x(t) = a(φ(t) − sin φ(t))

where

φ = π − 2θ
a = y1 / (1 − cos φ(tf))

The solution describes a cycloid in the x–y plane that passes
through (x1, y1).
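A numerical sketch (plain Python, with the target (x1, y1) = (2, 2) used below for illustration): the final parameter φ(tf) solves (φ − sin φ)/(1 − cos φ) = x1/y1, which can be found by bisection; the resulting cycloid then passes through the target point:

```python
import math

def g(phi):
    # Root of g gives phi(tf) for x1/y1 = 1:
    # (phi - sin phi) - (x1/y1)*(1 - cos phi) with x1/y1 = 1
    return (phi - math.sin(phi)) - (1 - math.cos(phi))

lo, hi = 0.1, math.pi            # bracket: g(lo) < 0 < g(hi)
for _ in range(80):              # bisection
    mid = 0.5 * (lo + hi)
    if g(mid) < 0:
        lo = mid
    else:
        hi = mid
phi_f = 0.5 * (lo + hi)
a = 2.0 / (1 - math.cos(phi_f))  # a = y1 / (1 - cos phi(tf))

# Evaluate the cycloid at phi = phi_f: it should hit (2, 2)
x_end = a * (phi_f - math.sin(phi_f))
y_end = a * (1 - math.cos(phi_f))
```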


Example: The brachistochrone problem

The figure illustrates numerical results with (x(0), y(0)) = (0, 0),
(x(tf), y(tf)) = (2, 2) and g = 9.8.

[Figure: Brachistochrone problem — left: x–y plot of the optimal
path from (0, 0) to (2, 2); right: control versus time (s) over
approximately 0–0.9 s.]


Pontryagin’s minimum principle

Assume that the optimal control is u*, and that the states and
co-states are fixed at their optimal trajectories x*(t), λ*(t). We
have that for all u close to u*:

J(u*) ≤ J(u)  =⇒  J(u) − J(u*) ≥ 0

Then we have that J(u) = J(u*) + δJ[u*, δu].


Pontryagin’s minimum principle

From the variation in the augmented functional δJa, we have in this
case:

δJ[u*, δu] = ∫_{t0}^{tf} (∂H/∂u) δu dt


Pontryagin’s minimum principle

But, using a Taylor series, we know that:

H[x*, u* + δu, λ*] = H[x*, u*, λ*] + (∂H/∂u)|_{u*} δu + …

So that, to first order:

δJ[u*, δu] = ∫_{t0}^{tf} (H[x*, u* + δu, λ*] − H[x*, u*, λ*]) dt ≥ 0


Pontryagin’s minimum principle

This leads to:

H[x*, u* + δu, λ*] − H[x*, u*, λ*] ≥ 0

for all admissible variations in u. This in turn leads to the
minimum principle:

u* = arg min_{u∈U} H[x*, u, λ*]

where U is the admissible control set.


Pontryagin’s minimum principle

Assuming that the admissible control set is closed, then if the
optimal control lies in the interior of the set, the minimum
principle implies that:

∂H/∂u = 0

Otherwise, if the optimal control is at the boundary of the
admissible control set, the minimum principle requires that the
optimal control must be chosen to minimise the Hamiltonian:

u* = arg min_{u∈U} H[x*, u, λ*]


Example: Minimum time control with constrained input

Consider the following optimal control problem. Choose tf and
u(t) ∈ U = [−1, 1], t ∈ [t0, tf], to minimise:

J = tf = ∫_0^{tf} 1 dt

subject to

ẋ1 = x2
ẋ2 = u

with the boundary conditions:

x1(0) = x10,  x1(tf) = 0
x2(0) = x20,  x2(tf) = 0


Example: Minimum time control with constrained input

In this case, the Hamiltonian is:

H = 1 + λ1 x2 + λ2 u

Calculating ∂H/∂u gives no information on the control, so we need
to apply the minimum principle:

u* = arg min_{u∈[−1,1]} H[x*, u, λ*] =  +1 if λ2 < 0,
                                        −1 if λ2 > 0,
                                        undefined if λ2 = 0

This form of control is called bang-bang. The control switches
value when the co-state λ2 changes sign. The actual form of the
solution depends on the values of the initial states.


Example: Minimum time control with constrained input

The costate equations are:

λ̇1 = 0    =⇒  λ1(t) = λ10 = constant
λ̇2 = −λ1  =⇒  λ2(t) = λ20 − λ10 t

There are four cases:

1. λ2 is always positive.
2. λ2 is always negative.
3. λ2 switches from positive to negative.
4. λ2 switches from negative to positive.

So there will be at most one switch in the solution. The minimum
principle helps us to understand the nature of the solution.
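A sketch of this solution for the particular initial state (x10, x20) = (1, 0), values chosen here for illustration: taking λ2(t) = 1 − t (a costate consistent with λ̇2 = −λ1, λ1 = 1, which changes sign once at t = 1), the control u = −sign(λ2) is −1 on [0, 1) and +1 on (1, 2], and forward-Euler integration drives the state to the origin at tf = 2:

```python
dt = 1e-4
x1, x2 = 1.0, 0.0                    # initial state (x10, x20) = (1, 0)
for k in range(20000):               # integrate over t in [0, 2]
    t = k * dt
    lam2 = 1.0 - t                   # costate: switches sign at t = 1
    u = -1.0 if lam2 > 0 else 1.0    # minimum principle: u = -sign(lam2)
    x1 += x2 * dt                    # forward-Euler step of xdot1 = x2
    x2 += u * dt                     # forward-Euler step of xdot2 = u
# After one switch, (x1, x2) should be near the origin
```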


Equality path constraints

Let’s assume that we impose equality path constraints of the


form:
c[x(t), u(t), t] = 0
This type of constraint may be part of the model of the
system.
The treatment of path constraints depends on the Jacobian
matrix of the constraint function.


Equality path constraints

If the matrix ∂c/∂u is full rank, then the system of differential


and algebraic equations formed by the state dynamics and the
path constraints is a DAE of index 1. In this case, the
Hamiltonian can be replaced by:

H = L + λT f + µT c

which results in new optimality conditions.


If the matrix ∂c/∂u is rank deficient, the index of the DAE is


higher than one.
This may require repeated differentiations of c with respect to
time until the Jacobian with respect to u of one of the
derivatives of c is of full rank, a procedure called index reduction.
This may be difficult to perform and may be prone to
numerical errors.


Singular arcs

In some optimal control problems, extremal arcs satisfying the


stationarity condition ∂H/∂u = 0 occur where the Hessian
matrix Huu is singular.
These are called singular arcs.
Additional tests are required to verify if a singular arc is
optimizing.


A particular case of practical relevance occurs when the


Hamiltonian function is linear in at least one of the control
variables.
In such cases, the control is not determined in terms of the
state and co-state by the stationarity condition ∂H/∂u = 0
Instead, the control is determined by repeatedly differentiating
∂H/∂u with respect to time until it is possible to obtain the
control by setting the k-th derivative to zero.


In the case of a single control u, once the control is obtained


by setting the k-th time derivative of ∂H/∂u to zero, then
additional conditions known as the generalized
Legendre-Clebsch conditions must be checked:
(−1)^k (∂/∂u) [ d^(2k)/dt^(2k) (∂H/∂u) ] ≥ 0,   k = 0, 1, 2, . . .

The above conditions are necessary (though not sufficient) for
the singular arc to be optimal.
The presence of singular arcs may cause difficulties to
computational optimal control methods to find accurate
solutions if the appropriate conditions are not enforced a priori.


Inequality path constraints

These are constraints of the form:

hL ≤ h[x(t), u(t), t] ≤ hU

with hL,i ≠ hU,i .


Inequality constraints may be active (h at one of the limits) or
inactive (h within the limits) at any one time.
As a result the time domain t ∈ [t0 , tf ] may be partitioned
into constrained and unconstrained arcs.
Different optimality conditions apply during constrained and
unconstrained arcs.


Three major complications:


The number of constrained arcs is not necessarily known a
priori.
The location of the junction points between constrained and
unconstrained arcs is not known a priori.
At the junction points it is possible that both the controls u
and the costates λ are discontinuous.
Additional boundary conditions need to be satisfied at the
junction points ( =⇒ multipoint boundary value problem).


Summary

Optimal control theory for continuous time systems is based


on calculus of variations.
Using this approach, we formulate the first order optimality
conditions of the problem.
This leads to a Hamiltonian boundary value problem.


Even if we use numerical methods, a sound understanding of


control theory is important.
It provides a systematic way of formulating the problem.
It gives us ways of analysing the numerical solution.
It gives us conditions that we can incorporate into the
formulation of the numerical problem.


Further reading

D.E. Kirk (1970). Optimal Control Theory. Dover.


E.K. Chong and S.H. Zak (2008). An Introduction to Optimization.
Wiley.
F.L. Lewis and V.L. Syrmos (1995). Optimal Control.
Wiley-Interscience.
J.T. Betts (2010). Practical Methods for Optimal Control and
Estimation Using Nonlinear Programming. SIAM.
A.E. Bryson and Y.C. Ho (1975). Applied Optimal Control. Halsted
Press.


Optimal Control and Applications


Session 2
2nd AstroNet-II Summer School

Victor M. Becerra
School of Systems Engineering
University of Reading

5-6 June 2013


Contents of Session 2

Direct collocation methods for optimal control


Local discretisations
Global (pseudospectral) discretisations
Practical aspects
Multi-phase problems
Efficient sparse differentiation
Scaling
Estimating the discretisation accuracy
Mesh refinement


Direct Collocation methods for optimal control

This session introduces a family of methods for the numerical


solution of optimal control problems known as direct
collocation methods.
These methods involve the discretisation of the ODEs of the
problem.
The discretised ODEs become a set of algebraic equality
constraints.
The problem time interval is divided into segments and the
decision variables include the values of controls and state
variables at the grid points.


ODE discretisation can be done, for example, using


trapezoidal, Hermite-Simpson, or pseudospectral
approximations.
This involves defining a grid of points covering the time
interval [t0 , tf ], t0 = t1 < t2 < · · · < tN = tf .


The nonlinear optimization problems that arise from direct


collocation methods may be very large.
For instance, Betts (2003) describes a spacecraft low thrust
trajectory design with 211,031 variables and 146,285
constraints.


It is interesting that such large NLP problems may be easier


to solve than boundary-value problems.
The reason for the relative ease of computation of direct
collocation methods is that the associated NLP problem is
often quite sparse, and efficient software exist for its solution.


With direct collocation methods there is no need to explicitly


derive the necessary conditions of the continuous problem.
For a given problem, the region of attraction of the solution
is usually larger with direct methods than with indirect
methods.
Moreover, direct collocation methods do not require an a
priori specification of the sequence of constrained arcs in
problems with inequality path constraints.


Then the range of problems that can be solved with direct


methods is larger than the range of problems that can be
solved via indirect methods.
Direct methods can be used to find a starting guess for an
indirect method to improve the accuracy of the solution.


Consider the following optimal control problem:

Problem 1. Find all the free functions and variables (the controls
u(t), t ∈ [t0 , tf ], the states x(t), t ∈ [t0 , tf ], the parameters vector
p, and the times t0 , and tf ), to minimise the following
performance index,
J = ϕ[x(t0 ), t0 , x(tf ), tf ] + ∫_{t0}^{tf} L[x(t), u(t), t] dt

subject to:
ẋ(t) = f[x(t), u(t), p, t], t ∈ [t0 , tf ],


Boundary conditions (often called event constraints):

ψL ≤ ψ[x(t0 ), u(t0 ), x(tf ), u(tf ), p, t0 , tf ] ≤ ψU .

Inequality path constraints:

hL ≤ h[x(t), u(t), p, t] ≤ hU , t ∈ [t0 , tf ].



Bound constraints on controls, states and static parameters:

uL ≤ u(t) ≤ uU , t ∈ [t0 , tf ],
xL ≤ x(t) ≤ xU , t ∈ [t0 , tf ],
pL ≤ p ≤ pU .

The following constraints allow for cases where the initial and/or
final times are not fixed but are to be chosen as part of the
solution to the problem:

t̲0 ≤ t0 ≤ t̄0 ,
t̲f ≤ tf ≤ t̄f ,
tf − t0 ≥ 0.




Consider the continuous domain [t0 , tf ] ⊂ R, and break this
domain using intermediate nodes as follows:
t0 < t1 < · · · < tN = tf
such that the domain has been divided into the N segments
[tk , tk+1 ], k = 0, . . . , N − 1.
[Figure: discretisation grid t0 < t1 < · · · < tN = tf , with the sampled
state values x(t0 ), . . . , x(tN ) and control values u(t0 ), . . . , u(tN )
marked at the grid points.]

The figure illustrates the idea that the control u and state variables
x are discretized over the resulting grid.

By using direct collocation methods, it is possible to discretise


Problem 1 to obtain a nonlinear programming problem.

Problem 2. Choose y ∈ Rny to minimise:

F (y)

subject to:

 0  ≤  Z(y)  ≤  0
HL  ≤  H(y)  ≤  HU
ΨL  ≤  Ψ(y)  ≤  ΨU

yL ≤ y ≤ yU



where
F is simply a mapping resulting from the evaluation of the
performance index,
Z is the mapping that arises from the discretisation of the
differential constraints over the gridpoints,
H arises from the evaluation of the path constraints at each of
the grid points, with associated bounds HL , HU ,
Ψ is simply a mapping resulting from the evaluation of the
event constraints.


The form of Problem 2 is the same regardless of the method


used to discretize the dynamics.
What may change depending on the discretisation method
used is the decision vector y and the differential defect
constraints.


Local discretisations

Suppose that the solution of the ODE is approximated by a


polynomial x̃(t) of degree M over each interval
t ∈ [tk , tk+1 ], k = 0, . . . , N − 1:
x̃(t) = a0^(k) + a1^(k) (t − tk ) + · · · + aM^(k) (t − tk )^M

where the polynomial coefficients {a0^(k), a1^(k), . . . , aM^(k)} are chosen
such that the approximation matches the function at the beginning
and at the end of the interval:
x̃(tk ) = x(tk )
x̃(tk+1 ) = x(tk+1 )


and the time derivative of the approximation matches the time


derivative of the function at tk and tk+1 :

dx̃(tk )/dt = f[x(tk ), u(tk ), p, tk ]

dx̃(tk+1 )/dt = f[x(tk+1 ), u(tk+1 ), p, tk+1 ]
The last two equations are called collocation conditions.


[Figure: one interval [tk , tk+1 ], showing the true solution x(t) and the
polynomial approximation that matches it at both endpoints, with the
controls u(tk ) and u(tk+1 ) at the interval ends.]

The figure illustrates the collocation concept. The solution of the


ODE is approximated by a polynomial of degree M over each
interval, such that the approximation matches at the beginning
and the end of the interval.

Many of the local discretisations used in practice for solving


optimal control problems belong to the class of so-called classical
Runge-Kutta methods. A general K -stage Runge-Kutta method
discretises the differential equation as follows:
xk+1 = xk + hk ∑_{i=1}^{K} bi fki

where

xki = xk + hk ∑_{l=1}^{K} ail fkl

fki = f[xki , uki , p, tki ]

with uki = u(tki ), tki = tk + hk ρi , and ρi ∈ [0, 1].

Trapezoidal method

The trapezoidal method is based on a quadratic interpolation


polynomial:
x̃(t) = a0^(k) + a1^(k) (t − tk ) + a2^(k) (t − tk )²

Let fk ≡ f[x(tk ), u(tk ), p, tk ] and fk+1 ≡ f[x(tk+1 ), u(tk+1 ), p, tk+1 ].

The compressed form of the trapezoidal method is as follows:

ζ(tk ) = x(tk+1 ) − x(tk ) − (hk /2) (fk + fk+1 ) = 0

where hk = tk+1 − tk , and ζ(tk ) = 0 is the differential defect
constraint at node tk associated with the trapezoidal method.
Allowing k = 0, . . . , N − 1, this discretisation generates Nnx
differential defect constraints.
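A minimal numerical sketch of the defect computation (illustrative only; the function and variable names are my own): evaluating ζ(tk ) along the exact solution of ẋ = −x shows the per-step defect shrinking like O(h³), as expected for the trapezoidal rule:

```python
import numpy as np

def trapezoidal_defect(x_k, x_k1, f_k, f_k1, h):
    # zeta(t_k) = x(t_{k+1}) - x(t_k) - (h_k/2) (f_k + f_{k+1})
    return x_k1 - x_k - 0.5 * h * (f_k + f_k1)

# Check on xdot = -x with the exact solution x(t) = exp(-t): at the exact
# solution the per-step defect behaves like h^3/12, so halving h should
# reduce it by roughly a factor of 8.
defects = []
for h in (0.1, 0.05):
    x0, x1 = np.exp(0.0), np.exp(-h)
    defects.append(abs(trapezoidal_defect(x0, x1, -x0, -x1, h)))
ratio = defects[0] / defects[1]   # close to 2**3 = 8
```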

Note that the trapezoidal method is a 2-stage Runge-Kutta


method with a11 = a12 = 0, a21 = a22 = 1/2, b1 = b2 = 1/2,
ρ1 = 0, ρ2 = 1.


Using direct collocation with the compressed trapezoidal


discretisation, the decision vector for single phase problems is given
by
 
y = [vec(U); vec(X); p; t0 ; tf ]
where vec(U) represents the vectorisation of matrix U, that is an
operator that stacks all columns of the matrix, one below the
other, in a single column vector,


The expressions for matrices U and X are as follows:

U = [u(t0 ) u(t1 ) . . . u(tN−1 )] ∈ R^(nu×N)

X = [x(t0 ) x(t1 ) . . . x(tN )] ∈ R^(nx×(N+1))


Hermite-Simpson method

The Hermite-Simpson method is based on a cubic interpolating


polynomial:
x̃(t) = a0^(k) + a1^(k) (t − tk ) + a2^(k) (t − tk )² + a3^(k) (t − tk )³

Let t̄k = (tk + tk+1 )/2, and ū(t̄k ) = (u(tk ) + u(tk+1 ))/2.


The compressed Hermite-Simpson discretisation is as follows:

x̃(t̄k ) = (1/2) [x(tk ) + x(tk+1 )] + (hk /8) [fk − fk+1 ]

f̄k = f[x̃(t̄k ), ū(t̄k ), p, t̄k ]

ζ(tk ) = x(tk+1 ) − x(tk ) − (hk /6) (fk + 4f̄k + fk+1 ) = 0
where ζ(tk ) = 0 is the differential defect constraint at node tk
associated with the Hermite-Simpson method.
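The three equations above can be sketched directly in code (names are my own). Since Simpson quadrature is exact for the quadratic integrand f = 3t², the defect vanishes to round-off along the cubic solution x = t³:

```python
def hs_defect(x_k, x_k1, f, t_k, t_k1):
    # Compressed Hermite-Simpson defect for xdot = f(t, x) on [t_k, t_k1].
    h = t_k1 - t_k
    f_k, f_k1 = f(t_k, x_k), f(t_k1, x_k1)
    # Hermite interpolant evaluated at the interval midpoint:
    x_mid = 0.5 * (x_k + x_k1) + (h / 8.0) * (f_k - f_k1)
    f_mid = f(0.5 * (t_k + t_k1), x_mid)
    return x_k1 - x_k - (h / 6.0) * (f_k + 4.0 * f_mid + f_k1)

# xdot = 3 t^2 has the cubic solution x = t^3; the defect at the exact
# solution is zero up to floating-point round-off.
zeta = hs_defect(0.0, 0.5**3, lambda t, x: 3.0 * t * t, 0.0, 0.5)
```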


Allowing k = 0, . . . , N − 1, the compressed Hermite-Simpson


discretisation generates Nnx differential defect constraints.
Using direct collocation with the compressed Hermite-Simpson
method, the decision vector for single phase problems is given by

y = [vec(U); vec(Ū); vec(X); p; t0 ; tf ]

where
Ū = [u(t̄0 ) u(t̄1 ) . . . u(t̄N−1 )] ∈ R^(nu×N)


Note that the Hermite-Simpson method is a 3-stage Runge-Kutta


method with
a11 = a12 = a13 = 0, a21 = 5/24, a22 = 1/3, a23 = −1/24,
a31 = 1/6, a32 = 2/3, a33 = 1/6, b1 = 1/6, b2 = 2/3, b3 = 1/6,
ρ1 = 0, ρ2 = 1/2, ρ3 = 1.


Local convergence

The error in variable zk ≡ z(tk ) with respect to a solution


z ∗ (t) is of order n if there exist h̄, c > 0 such that

max_k ||z̃k − z*(tk )|| ≤ c h^n ,   ∀h < h̄

It is well known that the convergence of the trapezoidal


method for integrating differential equations is of order 2,
while the convergence of the Hermite-Simpson method for
integrating differential equations is of order 4.
Under certain conditions, these properties more or less
translate to the optimal control case, but detailed analysis is
beyond the scope of these lectures.


Global discretisations: Pseudospectral methods

Pseudospectral methods (PS) were originally developed for


the solution of partial differential equations that arise in fluid
mechanics.
However, PS techniques have also been used as computational
methods for solving optimal control problems.
While Runge-Kutta methods approximate the derivatives of a
function using local information, PS methods are, in contrast,
global.
PS methods are the combination of using global polynomials
with orthogonally collocated points.


A function is approximated as a weighted sum of smooth basis


functions, which are often chosen to be Legendre or
Chebyshev polynomials.
One of the main appeals of PS methods is their exponential
(or spectral) rate of convergence, which is faster than any
polynomial rate.


Another advantage is that with relatively coarse grids it is


possible to achieve good accuracy.
Approximation theory shows that PS methods are well suited
for approximating smooth functions, integrations, and
differentiations, all of which are relevant to optimal control
problems.


PS methods have some disadvantages:


The resulting NLP is not as sparse as NLP problems derived
from Runge-Kutta discretizations.
This results in less efficient (slower) NLP solutions and
possibly difficulties in finding a solution.
They may not work very well if the solution of the optimal
control problem is not sufficiently smooth, or if its
representation requires many grid points.


Approximating a continuous function using Legendre polynomials
Let LN (τ ) denote the Legendre polynomial of order N, which may
be generated from:

LN (τ ) = (1/(2^N N!)) (d^N/dτ^N) (τ² − 1)^N
Examples of Legendre polynomials are:

L0 (τ ) = 1
L1 (τ ) = τ
L2 (τ ) = (1/2)(3τ² − 1)
L3 (τ ) = (1/2)(5τ³ − 3τ )
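These polynomials can be evaluated with NumPy's numpy.polynomial.legendre module; a quick check (the helper name L is mine) against the closed forms above:

```python
import numpy as np
from numpy.polynomial import legendre

def L(N, tau):
    # Evaluate the Legendre polynomial of order N at tau: the coefficient
    # vector selects the single basis polynomial L_N in the Legendre series.
    c = np.zeros(N + 1)
    c[N] = 1.0
    return legendre.legval(tau, c)

# Compare with L2(0.5) = (1/2)(3*0.25 - 1) and L3(0.5) = (1/2)(5*0.125 - 1.5).
vals = (L(2, 0.5), L(3, 0.5))
```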
The figure illustrates the Legendre polynomials LN (τ ) for
N = 0, . . . , 3.
1

0.8

0.6

0.4

0.2
LN(τ)

−0.2

−0.4

−0.6

−0.8

−1

−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1


τ


Let τk , k = 0, . . . , N be the Legendre-Gauss-Lobatto (LGL) nodes,
which are defined as τ0 = −1, τN = 1, and τk being the roots of
L̇N (τ ) in the interval (−1, 1) for k = 1, 2, . . . , N − 1.

There are no explicit formulas to compute the roots of L̇N (τ ), but


they can be computed using known numerical algorithms.
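One such numerical route, sketched with NumPy (the helper name lgl_nodes is mine): differentiate the Legendre series for LN and take the roots of the derivative, then append the endpoints:

```python
import numpy as np
from numpy.polynomial import legendre

def lgl_nodes(N):
    # tau_0 = -1, tau_N = 1; the interior nodes are the roots of dL_N/dtau.
    c = np.zeros(N + 1)
    c[N] = 1.0
    interior = legendre.legroots(legendre.legder(c))
    return np.concatenate(([-1.0], np.sort(interior), [1.0]))

# For N = 4 the interior nodes are -sqrt(3/7), 0, sqrt(3/7).
nodes = lgl_nodes(4)
```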

For example, for N = 20, the LGL nodes τk , k = 0, . . . , N are shown
in the figure.

[Figure: the LGL nodes for N = 20 on the interval [−1, 1], clustered
towards the endpoints.]
Given any real-valued function f (τ ) : [−1, 1] → R, it can be
approximated by the Legendre pseudospectral method:

f (τ ) ≈ f̃(τ ) = ∑_{k=0}^{N} f (τk ) φk (τ )                    (1)

where the Lagrange interpolating polynomial φk (τ ) is defined by:

φk (τ ) = (1/(N(N + 1) LN (τk ))) · ((τ² − 1) L̇N (τ )) / (τ − τk )   (2)

It should be noted that φk (τj ) = 1 if k = j and φk (τj ) = 0 if k ≠ j,
so that:

f̃(τk ) = f (τk )                                                (3)

Regarding the accuracy and error estimates of the Legendre


pseudospectral approximation, it is well known that for
smooth functions f (τ ), the rate of convergence of f˜(τ ) to
f (τ ) is faster than any power of 1/N.
The convergence of the pseudospectral approximations has
been analysed by Canuto et al (2006).

The figure shows the degree N interpolation of the function
f (τ ) = 1/(1 + τ + 15τ 2 ) in (N + 1) equispaced and LGL points for
N = 20. With increasing N, the errors increase exponentially in the
equispaced case (this is known as the Runge phenomenon) whereas
in the LGL case they decrease exponentially.
[Figure: two panels comparing the degree-20 interpolants; the maximum
error is 18.8264 with equispaced points vs. 0.0089648 with LGL points.]

Differentiation with Legendre polynomials

The derivatives of f̃(τ ) in terms of f (τ ) at the LGL points τk can
be obtained by differentiating Eqn. (1). The result can be
expressed as follows:

ḟ (τk ) ≈ f̃˙(τk ) = ∑_{i=0}^{N} Dki f (τi )

where

Dki = (LN (τk )/LN (τi )) · 1/(τk − τi )   if k ≠ i
    = −N(N + 1)/4                          if k = i = 0
    = N(N + 1)/4                           if k = i = N        (4)
    = 0                                    otherwise

which is known as the differentiation matrix.
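The matrix of Eqn. (4) can be assembled directly from the node values of LN. A sketch (function names are mine; nodes ordered from −1 to 1), checked by differentiating τ³, which is exact since its degree does not exceed N:

```python
import numpy as np
from numpy.polynomial import legendre

def lgl_nodes_and_D(N):
    # LGL nodes: endpoints plus the roots of L_N'(tau).
    c = np.zeros(N + 1); c[N] = 1.0
    tau = np.concatenate(([-1.0],
                          np.sort(legendre.legroots(legendre.legder(c))),
                          [1.0]))
    LN = legendre.legval(tau, c)
    # Off-diagonal entries of Eqn. (4), then the two special corner values.
    D = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        for i in range(N + 1):
            if k != i:
                D[k, i] = LN[k] / (LN[i] * (tau[k] - tau[i]))
    D[0, 0] = -N * (N + 1) / 4.0
    D[N, N] = N * (N + 1) / 4.0
    return tau, D

tau, D = lgl_nodes_and_D(4)
deriv = D @ tau**3      # exact for polynomials of degree <= N
```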


For example, this is the Legendre differentiation matrix for N = 5.


 
    ⎡  7.5000   −10.1414    4.0362   −2.2447    1.3499   −0.5000 ⎤
    ⎢  1.7864     0        −2.5234    1.1528   −0.6535    0.2378 ⎥
D = ⎢ −0.4850     1.7213    0        −1.7530    0.7864   −0.2697 ⎥
    ⎢  0.2697    −0.7864    1.7530    0        −1.7213    0.4850 ⎥
    ⎢ −0.2378     0.6535   −1.1528    2.5234    0        −1.7864 ⎥
    ⎣  0.5000    −1.3499    2.2447   −4.0362   10.1414   −7.5000 ⎦



Legendre differentiation of f (t) = e t sin(5t) for N = 10 and
N = 20. Note the vertical scales.
[Figure: four panels showing f(t) and the error in the computed f′(t);
the maximum error is of order 10⁻² for N = 10 and of order 10⁻⁹ for
N = 20.]


Numerical integration with Legendre polynomials


Note that if h(τ ) is a polynomial of degree ≤ 2N − 1, its integral
over τ ∈ [−1, 1] can be exactly computed as follows:
∫_{−1}^{1} h(τ ) dτ = ∑_{k=0}^{N} h(τk ) wk                     (5)

where τk , k = 0, . . . , N are the LGL nodes and the weights wk are
given by:

wk = (2/(N(N + 1))) · 1/[LN (τk )]² ,   k = 0, . . . , N.       (6)

If L(τ ) is a general smooth function, then for a suitable N, its
integral over τ ∈ [−1, 1] can be approximated as follows:

∫_{−1}^{1} L(τ ) dτ ≈ ∑_{k=0}^{N} L(τk ) wk                     (7)
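A sketch of the weights (6) and the rule (5) in NumPy (helper names are mine); with N = 3 the rule is exact for polynomials up to degree 2N − 1 = 5, so integrating τ⁴ recovers 2/5 exactly:

```python
import numpy as np
from numpy.polynomial import legendre

def lgl_weights(N):
    # LGL nodes and the quadrature weights w_k = 2/(N(N+1) L_N(tau_k)^2).
    c = np.zeros(N + 1); c[N] = 1.0
    tau = np.concatenate(([-1.0],
                          np.sort(legendre.legroots(legendre.legder(c))),
                          [1.0]))
    w = 2.0 / (N * (N + 1) * legendre.legval(tau, c) ** 2)
    return tau, w

tau, w = lgl_weights(3)
integral = np.sum(tau**4 * w)    # degree 4 <= 2N - 1 = 5, so exact: 2/5
```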

Discretising an optimal control problem

Note that PS methods require introducing the transformation:


τ ← (2/(tf − t0 )) t − (tf + t0 )/(tf − t0 ),
so that it is possible to write Problem 1 using a new independent
variable τ in the interval [−1, 1]
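The affine map and its inverse are straightforward to implement; a small sketch (function names are mine), checking that the endpoints land on ∓1 and that the two maps invert each other:

```python
def t_to_tau(t, t0, tf):
    # Affine map [t0, tf] -> [-1, 1] used by pseudospectral methods.
    return 2.0 * t / (tf - t0) - (tf + t0) / (tf - t0)

def tau_to_t(tau, t0, tf):
    # Inverse map [-1, 1] -> [t0, tf].
    return 0.5 * ((tf - t0) * tau + (tf + t0))

a = t_to_tau(3.0, 3.0, 7.0)                      # left endpoint -> -1
b = t_to_tau(7.0, 3.0, 7.0)                      # right endpoint -> +1
c = tau_to_t(t_to_tau(5.5, 3.0, 7.0), 3.0, 7.0)  # round trip
```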



After applying the above transformation, Problem 1 can be
expressed as follows.

Problem 3. Find all the free functions and variables (the controls
u(τ ), τ ∈ [−1, 1], the states x(τ ), τ ∈ [−1, 1], the parameters
vector p, and the times t0 , and tf ), to minimise the following
performance index,

J = ϕ[x(−1), t0 , x(1), tf ] + ((tf − t0 )/2) ∫_{−1}^{1} L[x(τ ), u(τ ), t(τ )] dτ

subject to:

ẋ(τ ) = ((tf − t0 )/2) f[x(τ ), u(τ ), p, t(τ )],   τ ∈ [−1, 1],


Boundary conditions:

ψL ≤ ψ[x(−1), u(−1), x(1), u(1), p, t0 , tf ] ≤ ψU .

Inequality path constraints:

hL ≤ h[x(τ ), u(τ ), p, t(τ )] ≤ hU , τ ∈ [−1, 1].


Bound constraints on controls, states and static parameters:

uL ≤ u(τ ) ≤ uU , τ ∈ [−1, 1],


xL ≤ x(τ ) ≤ xU , τ ∈ [−1, 1],
pL ≤ p ≤ pU .

And the following constraints:

t̲0 ≤ t0 ≤ t̄0 ,
t̲f ≤ tf ≤ t̄f ,
tf − t0 ≥ 0.



In the Legendre pseudospectral approximation of Problem 3, the


state function x(τ ), τ ∈ [−1, 1] is approximated by the N-order
Lagrange polynomial x̃(τ ) based on interpolation at the
Legendre-Gauss-Lobatto (LGL) quadrature nodes, so that:
x(τ ) ≈ x̃(τ ) = ∑_{k=0}^{N} x(τk ) φk (τ )                      (8)


Moreover, the control u(τ ), τ ∈ [−1, 1] is similarly approximated


using an interpolating polynomial:
u(τ ) ≈ ũ(τ ) = ∑_{k=0}^{N} u(τk ) φk (τ )                      (9)

Note that, from (3), x̃(τk ) = x(τk ) and ũ(τk ) = u(τk ).


The derivative of the state vector is approximated as follows:


ẋ(τk ) ≈ x̃˙(τk ) = ∑_{i=0}^{N} Dki x̃(τi ),   k = 0, . . . , N   (10)

where Dki is an element of D, which is the (N + 1) × (N + 1)


differentiation matrix given by (4).


Define the following nu × (N + 1) matrix to store the trajectories


of the controls at the LGL nodes:
 
U = [u(τ0 ) u(τ1 ) . . . u(τN )]                                (11)


Define the following nx × (N + 1) matrices to store, respectively,


the trajectories of the states and their approximate derivatives at
the LGL nodes:
 
X = [x(τ0 ) x(τ1 ) . . . x(τN )]                                (12)

and

Ẋ = [x̃˙(τ0 ) x̃˙(τ1 ) . . . x̃˙(τN )]                             (13)


Now, form the following nx × (N + 1) matrix with the right hand


side of the differential constraints evaluated at the LGL nodes:

F = ((tf − t0 )/2) [ f[x̃(τ0 ), ũ(τ0 ), p, τ0 ] . . . f[x̃(τN ), ũ(τN ), p, τN ] ]   (14)

Define the differential defects at the collocation points as the
nx (N + 1) vector:

ζ = vec(Ẋ − F) = vec(XDᵀ − F)
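Putting the pieces together, the defect vector can be assembled as below (a sketch under the sign convention of Eqn. (4); all names are mine). At an exact polynomial solution the defects vanish to round-off:

```python
import numpy as np
from numpy.polynomial import legendre

def lgl_nodes_and_D(N):
    # LGL nodes and the (N+1) x (N+1) Legendre differentiation matrix.
    c = np.zeros(N + 1); c[N] = 1.0
    tau = np.concatenate(([-1.0],
                          np.sort(legendre.legroots(legendre.legder(c))),
                          [1.0]))
    LN = legendre.legval(tau, c)
    D = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        for i in range(N + 1):
            if k != i:
                D[k, i] = LN[k] / (LN[i] * (tau[k] - tau[i]))
    D[0, 0], D[N, N] = -N * (N + 1) / 4.0, N * (N + 1) / 4.0
    return tau, D

# Dynamics xdot = 2t on [t0, tf] = [-1, 1] (so the (tf - t0)/2 factor is 1),
# with the exact polynomial solution x(t) = t^2 and nx = 1.
N = 4
tau, D = lgl_nodes_and_D(N)
X = (tau**2)[np.newaxis, :]        # states at the LGL nodes
F = (2.0 * tau)[np.newaxis, :]     # scaled dynamics at the nodes
zeta = (X @ D.T - F).ravel()       # defect vector, zero at the exact solution
```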


The decision vector y , which has dimension


ny = (nu (N + 1) + nx (N + 1) + np + 2), is constructed as follows:
 
y = [vec(U); vec(X); p; t0 ; tf ]                               (15)


Multi-phase problems

Some optimal control problems can be conveniently


formulated as having a number of phases.
Phases may be inherent to the problem (for example, a
spacecraft drops a section and enters a new phase).
Phases may also be introduced by the analyst to allow for
peculiarities in the solution of the problem.
Many real world problems need a multi-phase approach for
their solution.

The figure illustrates the notion of multi-phase problems.
[Figure: a timeline divided into Phases 1–4, where the final time
tf^(i) of each phase coincides with the initial time t0^(i+1) of the
following phase.]

Typically, the formulation for multi-phase problems involve the


following elements:
1 a performance index which is the sum of individual
performance indices for each phase;
2 differential equations and path constraints for each phase;
3 phase linkage constraints which relate the variables at the
boundaries between the phases;
4 event constraints which the initial and final states of each
phase need to satisfy.


Scaling

The numerical methods employed by NLP solvers are sensitive


to the scaling of variables and constraints.
Scaling may change the convergence rate of an algorithm, the
termination tests, and the numerical conditioning of systems
of equations that need to be solved.
Modern optimal control codes often perform automatic
scaling of variables and constraints, while still allowing the
user to manually specify some or all of the scaling factors.


A scaled variable or function is equal to the original value


times a scaling factor.
Scaling factors for controls, states, static parameters, and
time, are typically computed on the basis of known bounds for
these variables, e.g. for state xj (t)
sx,j = 1 / max(|xL,j |, |xU,j |)
For finite bounds, the variables are scaled so that the scaled
values are within a given range, for example [−1, 1].
The scaling factor of each differential defect constraint is
often taken to be equal to the scaling factor of the
corresponding state:
sζ,j,k = sx,j

Scaling factors for all other constraints are often computed as


follows. The scaling factor for the i-th constraint of the NLP
problem is the reciprocal of the 2-norm of the i-th row of the
Jacobian of the constraints evaluated at the initial guess.

sH,i = 1/||row(Hy , i)||2


This is done so that the rows of the Jacobian of the
constraints have a 2-norm equal to one.
The scaling factor for the objective function can be computed
as the reciprocal of the 2-norm of the gradient of the objective
function evaluated at the initial guess.

sF = 1/||Fy ||2


Efficient sparse differentiation

Modern optimal control software uses sparse finite


differences or sparse automatic differentiation libraries to
compute the derivatives needed by the NLP solver.
Sparse finite differences exploit the structure of the Jacobian and Hessian matrices associated with an optimal control problem.
The individual elements of these matrices can be numerically computed very efficiently by perturbing groups of variables, as opposed to perturbing single variables.
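The idea can be illustrated with a small sketch (hypothetical helper names, and a simple greedy grouping rather than the graph-colouring schemes used in practice): columns of the Jacobian whose sparsity patterns share no rows are perturbed together, so one extra function evaluation can recover many columns at once.

```python
import numpy as np

def group_columns(pattern):
    """Greedily group columns of a boolean (m x n) sparsity pattern so
    that columns in a group touch disjoint sets of rows."""
    n = pattern.shape[1]
    groups, used = [], [False] * n
    for j in range(n):
        if used[j]:
            continue
        group, rows = [j], set(np.nonzero(pattern[:, j])[0])
        used[j] = True
        for k in range(j + 1, n):
            k_rows = set(np.nonzero(pattern[:, k])[0])
            if not used[k] and not (rows & k_rows):
                group.append(k)
                rows |= k_rows
                used[k] = True
        groups.append(group)
    return groups

def sparse_jacobian(f, x, pattern, h=1e-7):
    fx, J = f(x), np.zeros(pattern.shape)
    for group in group_columns(pattern):
        xp = x.copy()
        xp[group] += h                     # perturb the whole group at once
        df = (f(xp) - fx) / h
        for j in group:                    # attribute each row to its column
            rows = np.nonzero(pattern[:, j])[0]
            J[rows, j] = df[rows]
    return J

# Diagonal example: f_i depends only on x_i, so a single perturbation
# recovers the whole Jacobian (2 function evaluations instead of 4).
f = lambda x: x ** 2
x0 = np.array([1.0, 2.0, 3.0])
J = sparse_jacobian(f, x0, np.eye(3, dtype=bool))
```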


Automatic differentiation is based on the application of the chain rule to obtain derivatives of a function given as a numerical computer program.
Automatic differentiation exploits the fact that computer
programs execute a sequence of elementary arithmetic
operations or elementary functions.
By applying the chain rule of differentiation repeatedly to
these operations, derivatives of arbitrary order can be
computed automatically and very accurately.
Sparsity can also be exploited with automatic differentiation
for increased efficiency.
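A toy forward-mode implementation using dual numbers shows the mechanics (a sketch, not a production AD tool): every value carries its derivative, and each elementary operation applies the chain rule.

```python
import math

class Dual:
    """Value/derivative pair; arithmetic propagates the chain rule."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val,
                    self.dot * o.val + self.val * o.dot)  # product rule
    __rmul__ = __mul__

def sin(x):
    # Elementary function with its chain-rule derivative
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

# d/dx [x sin(x) + 3x] at x = 2, exact to machine precision:
x = Dual(2.0, 1.0)            # seed derivative dx/dx = 1
y = x * sin(x) + 3 * x        # y.dot carries sin(2) + 2 cos(2) + 3
```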


It is advisable to use sparse automatic differentiation whenever possible to compute the required derivatives, as automatic derivatives are more accurate and faster to compute than numerical derivatives.
When computing numerical derivatives, the structure may be
exploited further by separating constant and variable parts of
the derivative matrices.
The achievable sparsity is dependent on the discretisation
method employed and on the problem itself.


Measures of accuracy of the discretisation

It is important for optimal control codes implementing direct collocation methods to estimate the discretisation error.
This information can be reported back to the user and
employed as the basis of a mesh refinement scheme.


Define the error in the differential equation as a function of time:

ε(t) = x̃˙(t) − f[x̃(t), ũ(t), p, t]

where x̃ is an interpolated value of the state vector given the grid point values of the state vector, x̃˙ is an estimate of the derivative of the state vector given the state vector interpolant, and ũ is an interpolated value of the control vector given the grid point values of the control vector.


The type of interpolation used depends on the collocation method employed. The absolute local error corresponding to state i on a particular interval t ∈ [tk, tk+1] is defined as follows:

ηi,k = ∫_{tk}^{tk+1} |εi(t)| dt

where the integral is computed using an accurate quadrature method.


The local relative ODE error is defined as:

εk = max_i [ ηi,k / (wi + 1) ]

where

wi = max_{k=1,...,N} ( |x̃i,k|, |x̃˙i,k| )

The error sequence εk, or a global measure of the size of the error (such as the maximum of the sequence εk), can be analysed by the user to assess the quality of the discretisation. This information may also be useful to aid the mesh refinement process.
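The following sketch estimates these quantities for a single interval of the scalar problem ẋ = −x, using a linear interpolant and trapezoidal quadrature (real codes use the collocation interpolant and a higher-order quadrature; all names here are illustrative):

```python
import numpy as np

f = lambda x: -x                        # scalar dynamics xdot = f(x)
tk, tk1 = 0.0, 0.5                      # one mesh interval
xk, xk1 = 1.0, float(np.exp(-0.5))      # grid-point values of the state

def x_tilde(t):
    """Linear interpolant of the state on [tk, tk1]."""
    lam = (t - tk) / (tk1 - tk)
    return (1.0 - lam) * xk + lam * xk1

xdot_tilde = (xk1 - xk) / (tk1 - tk)    # derivative of the interpolant

def eps(t):
    """Error in the differential equation at time t."""
    return abs(xdot_tilde - f(x_tilde(t)))

# Absolute local error eta: integral of |eps| via the trapezoidal rule
ts = np.linspace(tk, tk1, 101)
vals = np.array([eps(t) for t in ts])
eta = float(np.sum(0.5 * (vals[:-1] + vals[1:]) * np.diff(ts)))

# Relative local error, with w = max(|x|, |xdot|) over the grid points
w = max(abs(xk), abs(xk1), abs(xdot_tilde))
eps_rel = eta / (w + 1.0)
```

A small eps_rel indicates the linear interpolant nearly satisfies the ODE on this interval; a large value flags the interval for refinement.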


Mesh refinement

The solution of optimal control problems may involve periods of time where the state and control variables are changing rapidly.
It makes sense to use shorter discretisation intervals in such
periods, while longer intervals are often sufficient when the
variables are not changing much.
This is the main idea behind automatic mesh refinement
methods.


Mesh refinement methods detect when a solution needs shorter intervals in regions of the domain by monitoring the accuracy of the discretisation.
The solution grid is then refined aiming to improve the
discretisation accuracy in regions of the domain that require it.
The NLP problem is then solved again over the new mesh,
using the previous solution as an initial guess (after
interpolation), and the result is re-evaluated.
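The loop can be illustrated on a toy problem where "solving" is just interpolating a known trajectory x(t) = e^(−5t) piecewise-linearly, and intervals whose local interpolation error exceeds a tolerance are bisected (a sketch of the idea only; a real implementation re-solves the NLP at each iteration):

```python
import numpy as np

x = lambda t: np.exp(-5.0 * t)          # "trajectory" to be represented

def local_error(ta, tb):
    """Max deviation between x and its chord on [ta, tb] (sampled)."""
    ts = np.linspace(ta, tb, 21)
    chord = x(ta) + (x(tb) - x(ta)) * (ts - ta) / (tb - ta)
    return float(np.max(np.abs(x(ts) - chord)))

mesh = list(np.linspace(0.0, 1.0, 5))   # coarse uniform grid
for _ in range(20):                     # refinement iterations
    errs = [local_error(a, b) for a, b in zip(mesh[:-1], mesh[1:])]
    if max(errs) <= 1e-4:
        break                           # accuracy requirement satisfied
    new_mesh = [mesh[0]]
    for (a, b), e in zip(zip(mesh[:-1], mesh[1:]), errs):
        if e > 1e-4:
            new_mesh.append(0.5 * (a + b))  # bisect offending interval
        new_mesh.append(b)
    mesh = new_mesh

# The refined mesh is non-uniform: short intervals where x changes fast
widths = np.diff(mesh)
```

The intervals near t = 0, where the trajectory varies rapidly, end up much shorter than those near t = 1, mirroring the behaviour described above.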


Typically several mesh refinement iterations are performed until a given overall accuracy requirement is satisfied.
With local discretisations, the mesh is typically refined by
adding a limited number of nodes to intervals that require
more accuracy, so effectively sub-dividing existing intervals. A
well known local method was proposed by Betts.
For pseudospectral methods, techniques have been proposed
for segmenting the problem (adding phases) and automatically
calculating the number of nodes in each segment.


The figure illustrates the mesh refinement process.


Example: vehicle launch problem


This problem consists of the
launch of a space vehicle
(Benson, 2004).
The flight of the vehicle can
be divided into four phases,
with dry masses ejected
from the vehicle at the end
of phases 1, 2 and 3.
The final times of phases 1,
2 and 3 are fixed, while the
final time of phase 4 is free.
It is desired to find the
control u(t) and the final
time to maximise the mass
at the end of phase 4.


The dynamics are given by:

ṙ = v
v̇ = −(µ/||r||³) r + (T/m) u + D/m
ṁ = −T/(g0 Isp)

where r(t) = [x(t) y(t) z(t)]ᵀ is the position, v = [vx(t) vy(t) vz(t)]ᵀ is the Cartesian ECI velocity, T is the vacuum thrust, m is the mass, g0 is the acceleration of gravity at sea level, Isp is the specific impulse of the engine, u = [ux uy uz]ᵀ is the thrust direction, and D = [Dx Dy Dz]ᵀ is the drag force.
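A direct transcription of these equations in code might look as follows (a sketch: the thrust and Isp values are illustrative placeholders, and drag is taken as zero for brevity):

```python
import numpy as np

MU = 3.986012e14   # Earth gravitational parameter, m^3/s^2
G0 = 9.80665       # acceleration of gravity at sea level, m/s^2

def launch_dynamics(r, v, m, u, T, Isp, D=np.zeros(3)):
    """Right-hand side of the launch vehicle equations of motion."""
    rdot = v
    vdot = -MU / np.linalg.norm(r) ** 3 * r + (T / m) * u + D / m
    mdot = -T / (G0 * Isp)
    return rdot, vdot, mdot

# Initial state of the problem in SI units, thrusting radially outward
# (the thrust level and Isp below are assumptions, not problem data):
r0 = np.array([5605.2e3, 0.0, 3043.4e3])
v0 = np.array([0.0, 407.6, 0.0])
u0 = r0 / np.linalg.norm(r0)
rdot, vdot, mdot = launch_dynamics(r0, v0, 301454.0, u0, T=6.0e6, Isp=300.0)
```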

The vehicle starts on the ground at rest relative to the Earth at time t0, so that the initial conditions are:

r(t0) = r0 = [5605.2 0 3043.4]ᵀ km
v(t0) = v0 = [0 0.4076 0]ᵀ km/s
m(t0) = m0 = 301454 kg


The terminal constraints define the target transfer orbit, which is specified in orbital elements as:

af = 24361.14 km
ef = 0.7308
if = 28.5 deg
Ωf = 269.8 deg
ωf = 130.5 deg


The following linkage constraints force the position and velocity to be continuous, and also model the discontinuity in the mass due to the ejections at the end of phases 1, 2 and 3:

r(p)(tf) − r(p+1)(t0) = 0
v(p)(tf) − v(p+1)(t0) = 0     (p = 1, 2, 3)
m(p)(tf) − mdry(p) − m(p+1)(t0) = 0

where the superscript (p) represents the phase number.

There is also a path constraint associated with this problem:

||u||² = ux² + uy² + uz² = 1
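These conditions are straightforward to express as residual functions that the NLP solver drives to zero (a sketch; all numerical values below are illustrative placeholders, not problem data):

```python
import numpy as np

def linkage_residuals(rf, vf, mf, m_dry, r0_next, v0_next, m0_next):
    """Residuals of the linkage constraints between consecutive phases."""
    return np.concatenate([
        rf - r0_next,              # position continuity
        vf - v0_next,              # velocity continuity
        [mf - m_dry - m0_next],    # mass drop at stage ejection
    ])

def path_residual(u):
    """||u||^2 - 1, enforced at every grid point."""
    return float(np.dot(u, u) - 1.0)

# A consistent linkage: phase p+1 starts where phase p ends, with the
# dry mass removed:
rf = np.array([7000e3, 0.0, 0.0])
vf = np.array([0.0, 7.5e3, 0.0])
res = linkage_residuals(rf, vf, 85000.0, 21000.0, rf, vf, 64000.0)
```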


This problem has been solved using the author's open source optimal control software PSOPT (see http://www.psopt.org).
Parameter values can be found in the code.
The problem was solved in eight mesh refinement iterations until the maximum relative ODE error εmax was below 1 × 10⁻⁵, starting from a coarse uniform grid and resulting in a finer non-uniform grid.


Figure: Altitude (km) versus time (s) for the vehicle launch problem


Figure: Speed (m/s) versus time (s) for the vehicle launch problem


Figure: Controls u1, u2, u3 (dimensionless) versus time (s) for the vehicle launch problem


Example: Two burn orbit transfer

The goal of this problem is to compute a trajectory for a spacecraft to go from a standard space shuttle park orbit to a geosynchronous final orbit (Betts, 2010).
The engines operate over two short periods during the mission.
It is desired to compute the timing and duration of the burn periods, as well as the instantaneous direction of the thrust during these two periods.
The objective is to maximise the final weight of the spacecraft.
The mission then involves four phases: coast, burn, coast and burn.



The problem is formulated as follows. Find u(t) = [θ(t), φ(t)]ᵀ for t ∈ [tf(1), tf(2)] and t ∈ [tf(3), tf(4)], and the instants tf(1), tf(2), tf(3), tf(4), such that the following objective function is minimised:

J = −w(tf)

subject to the dynamic constraints for phases 1 and 3:

ẏ = A(y)∆g + b

the following dynamic constraints for phases 2 and 4:

ẏ = A(y)∆ + b
ẇ = −T/Isp



and the following linkages between phases:

y(tf(1)) = y(t0(2))
y(tf(2)) = y(t0(3))
y(tf(3)) = y(t0(4))
tf(1) = t0(2)
tf(2) = t0(3)
tf(3) = t0(4)
w(tf(2)) = w(t0(4))

where y = [p, f, g, h, k, L]ᵀ is the vector of modified equinoctial elements, w is the spacecraft weight, Isp is the specific impulse of the engine, and T is the maximum thrust.

Expressions for A(y) and b are given in (Betts, 2010). The disturbing acceleration is ∆ = ∆g + ∆T, where ∆g is the gravitational disturbing acceleration due to the oblateness of the Earth, and ∆T is the thrust acceleration:

∆T = Qr Qv [Ta cos θ cos φ,  Ta cos θ sin φ,  Ta sin θ]ᵀ

where Ta(t) = g0 T/w(t), g0 is the acceleration of gravity at sea level, θ is the pitch angle and φ is the yaw angle of the thrust.


Matrix Qv is given by:

Qv = [ v/||v||,  (v×r)/||v×r||,  (v/||v||) × ((v×r)/||v×r||) ]

Matrix Qr is given by:

Qr = [ir iθ ih] = [ r/||r||,  ((r×v)×r)/(||r×v|| ||r||),  (r×v)/||r×v|| ]
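Both frames are easy to assemble column-by-column; the sketch below (illustrative function names) builds them and uses them to evaluate the thrust acceleration ∆T from the previous slide:

```python
import numpy as np

def unit(w):
    return w / np.linalg.norm(w)

def Q_v(r, v):
    iv = unit(v)
    ic = unit(np.cross(v, r))
    return np.column_stack([iv, ic, np.cross(iv, ic)])

def Q_r(r, v):
    ir = unit(r)
    ih = unit(np.cross(r, v))
    return np.column_stack([ir, np.cross(ih, ir), ih])  # [ir, itheta, ih]

def delta_T(r, v, Ta, theta, phi):
    """Thrust acceleration rotated into the inertial frame."""
    body = Ta * np.array([np.cos(theta) * np.cos(phi),
                          np.cos(theta) * np.sin(phi),
                          np.sin(theta)])
    return Q_r(r, v) @ Q_v(r, v) @ body

r = np.array([7000e3, 0.0, 0.0])
v = np.array([0.0, 7.5e3, 0.0])
Qv, Qr = Q_v(r, v), Q_r(r, v)   # both are rotation matrices
dT = delta_T(r, v, Ta=0.03, theta=0.1, phi=0.2)
```

Since Qr and Qv are rotations, ||∆T|| = Ta regardless of the pitch and yaw angles.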

The results shown were obtained with PSOPT using local discretisations and 5 mesh refinement iterations, until the maximum relative ODE error εmax was below 1 × 10⁻⁵.

Figure: Two burn transfer trajectory (x, y, z in km)


Mesh refinement statistics


Further reading

V.M. Becerra (2010). PSOPT Optimal Control Solver User Manual. http://www.psopt.org.
D.A. Benson (2004). A Gauss Pseudospectral Transcription for Optimal Control. Ph.D. thesis, MIT, Department of Aeronautics and Astronautics, Cambridge, Mass.
J.T. Betts (2010). Practical Methods for Optimal Control and Estimation Using Nonlinear Programming. SIAM.
C. Canuto, M.Y. Hussaini, A. Quarteroni, and T.A. Zang (2006). Spectral Methods: Fundamentals in Single Domains. Springer-Verlag, Berlin.
G. Elnagar, M.A. Kazemi, and M. Razzaghi (1995). "The Pseudospectral Legendre Method for Discretizing Optimal Control Problems". IEEE Transactions on Automatic Control 40, 1793-1796.
