
TTK4115

Lecture 4
State feedback (continued), optimal control

Morten O. Alver (based on slides by Morten D. Pedersen)



This lecture

1. Reference feed-forward

2. Integral effect

3. Optimal Control

4. Lunar lander



Topic

1. Reference feed-forward

2. Integral effect

3. Optimal Control

4. Lunar lander



Control with a reference

State feedback
Consider the system:

ẋ = Ax + Bu
y = Cx

With the control law:


u = −Kx
the state equation can be rewritten as:

ẋ = (A − BK)x

If the system is controllable, we can choose K to make the closed-loop system stable. Then x will
converge towards x∞ = 0, and y towards y∞ = 0.

Reference
Sometimes we want the system output y to converge to a non-zero reference r. But we still want
the benefits of the state feedback. How?



Reference feed-forward
Aim: asymptotic convergence to reference
We want¹ y(t) → r as t → ∞.

Implementation
u = Kr r − Kx

where Kr r is the reference feedforward term and −Kx is the state feedback term.

Equilibrium conditions
Assuming that feedback results in a stable equilibrium yields the steady-state condition:

ẋ = Ax + B(Kr r − Kx) = 0 ⇒ (A − BK)x∞ = −BKr r0 ⇒ y∞ = [C(BK − A)−1 B]Kr r0

Finding Kr
Inversion gives the correct feedforward gain:

Kr = [C(BK − A)−1 B]−1

Note that the number of references must be the same as the number of outputs.

¹ The elements in y are the variables we wish to control, not the measurements per se. In state feedback all states are assumed known.
Reference feed-forward

A common mistake
If C = I, so y = x, one might be tempted to use pure error feedback:

u = K(r − x)

This yields the system:


ẋ = (A − BK)x + BKr

⇒ x̂(s)/r̂(s) = (sI − [A − BK])−1 BK

Assume that the reference is constant, r̂(s) = r0/s. The final-value theorem yields (in general):

x(∞) = lim_{s→0} s · (x̂(s)/r̂(s)) · (r0/s) = −(A − BK)−1 BK r0 ≠ r0



Reference feed-forward

Correct approach
Suppose now that state feedback + reference feedforward is used instead (now for any C):

u = [C(BK − A)−1 B]−1 r − Kx

This yields the system:


ẋ = (A − BK)x + B[C(BK − A)−1 B]−1 r

⇒ ŷ(s)/r̂(s) = C x̂(s)/r̂(s) = C(sI − [A − BK])−1 B [C(BK − A)−1 B]−1

Assume that the reference is constant, r̂(s) = r0/s. The final-value theorem yields:

y(∞) = lim_{s→0} s · (ŷ(s)/r̂(s)) · (r0/s) = C(A − BK)−1 B [C(A − BK)−1 B]−1 r0 = r0



Reference feed-forward

The effect of the feed-forward


For controlling toward a reference, we have found the feedback law u = Kr r − Kx with the
feedforward defined by:
Kr = [C(BK − A)−1 B]−1

The reference feed-forward moves the equilibrium point of the closed-loop system so that y = r at
the equilibrium point.
This lets us separate the issue of following the reference (feed-forward) from the issue of the
system’s convergence properties (state feedback).
Note that the value of Kr depends on the value of K.
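
A minimal numerical sketch of this computation, assuming an illustrative double-integrator plant and pole locations that are not from the lecture (place is from the Control System Toolbox):

% Sketch: compute Kr for an assumed example system (double integrator).
A = [0 1; 0 0];
B = [0; 1];
C = [1 0];

K  = place(A, B, [-2, -3]);        % state feedback gain (illustrative poles)
Kr = inv(C * ((B*K - A) \ B));     % Kr = [C(BK - A)^-1 B]^-1

% Steady-state check: x_inf = (BK - A)^-1 B Kr r0, so y_inf should equal r0.
r0    = 1;
x_inf = (B*K - A) \ (B * Kr * r0);
y_inf = C * x_inf                  % = r0

With pure error feedback u = K(r − x), the same check gives y_inf ≠ r0 in general, which is the mistake discussed above.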



Topic

1. Reference feed-forward

2. Integral effect

3. Optimal Control

4. Lunar lander



Integral effect

Problem
What if we don’t know our system perfectly? (We never do)
What about disturbances: w?

Solution
Use integral effect!



Integral effect

Plant²

ẋ = Ax + Bu + Bw        (the term Bw is the disturbance)
y = Cx

Integrator state augmentation

xa = ∫₀^t (r(τ) − Cx(τ)) dτ

Augmented system

[ẋ; ẋa] = [A 0; −C 0][x; xa] + [B; 0]u + [B; 0]w + [0; I]r

² The disturbance w is assumed to act in a way that can be cancelled by the input.
Integral effect

Augmented system
          
[ẋ; ẋa] = [A 0; −C 0][x; xa] + [B; 0]u + [B; 0]w + [0; I]r

State feedback
As before state feedback + reference feedforward is used:
 
u = −[K  Ka] [x; xa] + Kr r = −Kx − Ka xa + Kr r

Closed loop system


        
[ẋ; ẋa] = [A − BK  −BKa; −C  0][x; xa] + [BKr; I]r + [B; 0]w
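
As a sketch of how the augmented design can be carried out numerically (the plant, weights and the use of lqr below are assumptions for illustration, not part of the lecture):

% Sketch: integrator augmentation for an assumed SISO plant, with the
% augmented gain [K Ka] found by LQR (any stabilizing design would do).
A = [0 1; -1 -1];   B = [0; 1];   C = [1 0];

Aa = [A, zeros(2,1); -C, 0];   % augmented dynamics for [x; xa]
Ba = [B; 0];                   % input acts on the plant states only

Q = diag([10 1 5]);            % weights on [x; xa] (illustrative)
R = 1;
Kaug = lqr(Aa, Ba, Q, R);      % Kaug = [K, Ka]
K  = Kaug(1:2);
Ka = Kaug(3);

eig(Aa - Ba*Kaug)              % closed-loop augmented poles (should be stable)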



Integral effect
Closed loop system
        
[ẋ; ẋa] = [A − BK  −BKa; −C  0][x; xa] + [BKr; I]r + [B; 0]w

Steady state behavior - with feedforward


Provided that the feedback results in an asymptotically stable augmented system the output will
tend towards a constant reference r0 :

ẋa = 0 ⇒ Cx∞ = r0

The state vector exhibits the following limit:

ẋ = 0 ⇒ Cx∞ = C(A − BK)−1 B[Ka xa,∞ − Kr r0 − w0 ]

If the feedforward is implemented as shown earlier (Kr = [C(BK − A)−1 B]−1 ) we have:

Cx∞ − r0 = C(A − BK)−1 B[Ka xa,∞ − w0] = 0

The integral action thus acts as a disturbance estimator/compensator:

Ka xa,∞ = w0



Integral effect
Closed loop system
        
[ẋ; ẋa] = [A − BK  −BKa; −C  0][x; xa] + [BKr; I]r + [B; 0]w

Steady state behavior - without feedforward


Provided that the feedback results in an asymptotically stable augmented system the output will
tend towards a constant reference r0 :

ẋa = 0 ⇒ Cx∞ = r0

The state vector exhibits the following limit:

ẋ = 0 ⇒ Cx∞ = C(A − BK)−1 B[Ka xa,∞ − w0 ]

If the feedforward is not implemented we have:

Cx∞ − r0 = C(A − BK)−1 B[Ka xa,∞ − w0 ] − r0 = 0

The integral action now performs double duty and compensates for the disturbance as well as
setting the correct bias³:

Ka xa,∞ = −Kr r0 + w0

³ But only asymptotically.
Example

Cruise control
Car model:
mv̇ = −dv + u + w,   y = v
Matrices:
A = −d/m,   B = 1/m,   C = 1

State feedback
Control:
u = −kv + kr r
Feedforward gain:
kr = [C(BK − A)−1 B]−1 = d + k
Closed loop dynamics:
mv̇ = (d + k)(r − v) + w,   y = v
Stable equilibrium at y = r if w = 0.



Example
Cruise control
Car model:
mv̇ = −dv + u + w,   y = v
Matrices:
A = −d/m,   B = 1/m,   C = 1

Augmented state space


[v̇; ẋa] = [−d/m  0; −1  0][v; xa] + [1/m; 0]u + [0; 1]r + [1/m; 0]w

State feedback
Control:
u = −kv − ka xa + kr r
Closed loop dynamics:

[v̇; ẋa] = [−(d + k)/m  −ka/m; −1  0][v; xa] + [kr/m; 1]r + [1/m; 0]w



Example
Cruise control
Car model:
mv̇ = −dv + u + w,   y = v
Matrices:
A = −d/m,   B = 1/m,   C = 1

State feedback
Control:
u = −kv − ka xa + kr r
Closed loop dynamics:

[v̇; ẋa] = [−(d + k)/m  −ka/m; −1  0][v; xa] + [kr/m; 1]r + [1/m; 0]w

Transfer functions

ŷ(s) = [(d + k)s − ka] / [ms² + (d + k)s − ka] · r̂(s) + s / [ms² + (d + k)s − ka] · ŵ(s)

The (d + k)s term in the reference numerator is due to the reference feedforward.
Retaining reference feedforward may yield faster reference tracking.
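
A small simulation sketch of the closed loop above; all numerical values (m, d, gains, disturbance) are assumed for illustration:

% Sketch: cruise control with integral action and a constant disturbance.
m = 1000; d = 50;             % mass [kg], linear drag [N s/m] (assumed)
k = 500; ka = 100;            % feedback gains (illustrative)
kr = d + k;                   % reference feedforward gain

Acl = [-(d + k)/m, -ka/m; -1, 0];
Br  = [kr/m; 1];              % reference input matrix
Bw  = [1/m; 0];               % disturbance input matrix

r0 = 20; w0 = -200;           % reference speed [m/s], constant disturbance [N]
f  = @(t, z) Acl*z + Br*r0 + Bw*w0;
[t, z] = ode45(f, [0 60], [0; 0]);

plot(t, z(:,1))               % the speed settles at r0 despite the disturbance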



Topic

1. Reference feed-forward

2. Integral effect

3. Optimal Control

4. Lunar lander



Optimal Control

Optimal control
Optimal control is concerned with finding an input u(t) that minimizes a cost function over some
interval of time, subject to system dynamics.

LQR
The Linear Quadratic Regulator is a special case where the plant is Linear and the cost function is
Quadratic. The time interval is infinite.



Optimal Control
The LQR problem
Consider the system:

ẋ = Ax + Bu
y = Cx

The system is to be controlled using the control law: u = −Kx

K is chosen to minimize the following objective function:

J = ∫₀^∞ [yT(t)Qy(t) + uT(t)Ru(t)] dt

This form weights the outputs y(t). We can choose to weight the state values x(t) instead.

Note:
It is an ordinary state feedback - just with a new way of choosing K.
The J function is separate from the state equation - but its value depends on it.
For J to have a finite value, the system must converge towards its equilibrium at the origin.
The system must therefore be controllable...
... and the value of K that optimizes J must stabilize the system.



Optimal Control

Scalar case
In a scalar system, the objective function can be written as follows:

J = ∫₀^∞ [q x(t)² + r u(t)²] dt,   q > 0 and r > 0

Here, the value of J depends on:


The integral of x(t)2 as it converges towards 0 (from some initial value).
The integral of u(t)2 as the input works to bring x(t) to 0.
The weighting parameters q and r .

The feedback gain k that optimizes J depends on the relative sizes of q and r , and these serve to
prioritize:
either fast convergence of x(t), using a large q/r
or low use of the input u(t), using a small q/r .



Optimal Control

Multivariable case
J = ∫₀^∞ [yT(t)Qy(t) + uT(t)Ru(t)] dt
In the multivariable case:
The Q matrix lets us weight fast convergence in each individual output (in the diagonal
elements) and in products of pairs of outputs (in the off-diagonal elements). We often use
nonzero values on the diagonal only.
The R matrix lets us weight the inputs (in the diagonal elements). We typically use nonzero
values on the diagonal only.



Optimal Control

Trajectories of a feedback controlled system

Finding a minimum
Finding an analytical solution that minimizes the objective function is in general not trivial.
Discretizing the dynamics and finding the optimum numerically over a finite horizon is a viable
method (MPC, NMPC).
A linear model with a quadratic cost function is a "lucky" case: we can find an analytical
solution that covers all initial values.



LQR problem definition

Cost functional to be minimized w.r.t. u(t)


JLQR = ∫₀^∞ [yT(t)Qy(t) + uT(t)Ru(t)] dt
with:
Q > 0, Q = QT , R > 0, R = RT

Output energy
yT (t)Qy(t)
Making this function smaller requires more input energy.

Control energy
uT (t)Ru(t)
Making this term smaller requires less input energy, which leads to higher output energy.



Optimal Control

Finding a minimum
We want to find the minimum of:
J = ∫₀^∞ [yT(t)Qy(t) + uT(t)Ru(t)] dt

The "trick" is to rewrite the cost functional on the special form:

J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt

where:
H(x(·), u(·)) is not affected by the input, directly or indirectly.
Λ(x(t), u(t)) has an obvious minimum in terms of u(t).



Optimal control
Desired form of cost function
J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt

The functionalᵃ H(x(·), u(·)) above can be described as a feedback invariant. The functional takes
the system inputs and states as arguments, but its value depends only on the initial condition x(0).
It is not affected by the input, directly or indirectly.

ᵃ A functional takes functions as arguments and returns a scalar.

Feedback invariance
The functional:

H(x(·), u(·)) ≜ −∫₀^∞ (d/dt)[xT(t)Sx(t)] dt

happens to be feedback invariant as long as x(t) → 0 as t → ∞:

−∫₀^∞ (d/dt)[xT(t)Sx(t)] dt = −[xT(t)Sx(t)]₀^∞ = xT(0)Sx(0) − xT(∞)Sx(∞) = xT(0)Sx(0)

The integrand can be rewritten as follows:

(d/dt)[xT(t)Sx(t)] = ẋT Sx + xT Sẋ = xT S[Ax + Bu] + [Ax + Bu]T Sx



Optimal control

Desired form of cost function


J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt = ∫₀^∞ [yT(t)Qy(t) + uT(t)Ru(t)] dt

H(x(·), u(·)) = −∫₀^∞ ( [Ax(t) + Bu(t)]T Sx(t) + xT(t)S[Ax(t) + Bu(t)] ) dt

Develop by adding and subtracting

"Z #

T T
J = H(x, u) + y (t)Qy(t) + u (t)Ru(t)dt − H(x(·), u(·))
0
Z ∞ 
= H(x, u) + xT CT QCx + uT Ru + (Ax + Bu)T Sx + xT S(Ax + Bu)dt
0
 
Z ∞
xT (AT S + SA + CT QC)x + uT Ru + 2uT BT Sx dt 
 
= H(x, u) + 

0 | {z } 
Λ(x,u)



Optimal control

Desired form of cost function

J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt

H(x(·), u(·)) = −∫₀^∞ ( [Ax(t) + Bu(t)]T Sx(t) + xT(t)S[Ax(t) + Bu(t)] ) dt

Λ(x, u) = xT (AT S + SA + CT QC)x + uT Ru + 2uT BT Sx

Refine Λ(x, u) by "completing the squares" (matrix version):

zT Mz − 2bT z = (z − M−1 b)T M(z − M−1 b) − bT M−1 b

Thus we can rewrite uT Ru + 2uT BT Sx by choosing u as z and −BT Sx as b (remember that
uT BT Sx = xT SBu since the term is scalar and S is symmetric):

uT Ru + 2uT BT Sx = (u + R−1 BT Sx)T R(u + R−1 BT Sx) − xT SBR−1 BT Sx



Desired form of cost function

J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt

H(x(·), u(·)) = −∫₀^∞ ( [Ax(t) + Bu(t)]T Sx(t) + xT(t)S[Ax(t) + Bu(t)] ) dt

Λ(x, u) = xT (AT S + SA + CT QC − SBR−1 BT S)x + (u + R−1 BT Sx)T R(u + R−1 BT Sx)

How to minimize J
We already established that H(x(·), u(·)) is feedback invariant for any choice of S, so we can
ignore it in the optimization.

Instead we focus on Λ(x, u), which contains two quadratic expressions:

Λ(x, u) = xT (AT S + SA + CT QC − SBR−1 BT S)x + (u + R−1 BT Sx)T R(u + R−1 BT Sx)

The second quadratic term is always ≥ 0 since R > 0.

We can choose S so the first quadratic expression becomes 0. The equation has the form of
the continuous-time algebraic Riccati equation (CARE).
After choosing S, we just need to minimize the second term (to 0) by the right choice of u.



Desired form of cost function

J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt

H(x(·), u(·)) = −∫₀^∞ ( [Ax(t) + Bu(t)]T Sx(t) + xT(t)S[Ax(t) + Bu(t)] ) dt

Λ(x, u) = xT (AT S + SA + CT QC − SBR−1 BT S)x + (u + R−1 BT Sx)T R(u + R−1 BT Sx)

Minimization step 1:
Solve CARE:
AT S + SA + CT QC − SBR−1 BT S = 0

Minimization step 2:
Choose:
u = −R−1 BT Sx

The minimum is thus obtained:


Λ(x(t), u(t)) ≡ 0 ⇒ J = xT (0)Sx(0)



LQR summarized
Minimal solution
The solution that minimizes the cost function for the system:

ẋ(t) = Ax(t) + Bu(t)

is a linear feedback:
u(t) = −Kx(t)
where:
K = R−1 BT S
leading to the closed loop system:

ẋ(t) = [A − BR−1 BT S]x(t)

Algebraic Riccati Equation


The matrix S is found by solving:

[AT S + SA] + CT QC − SBR−1 BT S = 0

where the solution S must be positive definite (S > 0).
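
As a sketch, the Riccati equation and the resulting gain can also be computed numerically with the Control System Toolbox; the example system and weights below are assumptions, not from the lecture:

% Sketch: LQR gain via the Riccati equation and via lqry (illustrative system).
A = [0 1; -2 -1];   B = [0; 1];   C = [1 0];
Q = 10;                         % output weight (y is scalar here)
R = 1;                          % input weight

S = care(A, B, C'*Q*C, R);      % solves A'S + SA + C'QC - S B R^-1 B' S = 0
K = R \ (B' * S)                % K = R^-1 B' S

[K2, S2, E] = lqry(ss(A, B, C, 0), Q, R);   % should give the same gain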



LQR example

Example: ẋ(t) = −λx(t) + u(t),   y(t) = x(t)

Riccati equation:
[AT S + SA] + CT QC − SBR−1 BT S = 0

In scalar form, with S replaced by p, A = −λ and B = C = 1, the equation becomes:

−2λp + q − p²/r = 0

The solutions are:

p = −λr ± √(qr + λ²r²)   (pick the positive solution)

The feedback gain becomes:

u(t) = −kx(t) = −(p/r) x(t) = (λ − √(q/r + λ²)) x(t)



LQR example

Closed loop system: ẋ(t) = −λx(t) − kx(t)

State dynamics:

ẋ(t) = −(λ + p/r) x(t)
     = −(λ − [λ − √(q/r + λ²)]) x(t)
     = −√(q/r + λ²) x(t)

Input:

u(t) = −kx(t) = −(p/r) x(t) = (λ − √(q/r + λ²)) x(t)

Note:
There is a direct tradeoff between q and r .
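
A quick numerical check of the scalar result, with assumed values of λ, q and r:

% Sketch: verify the scalar LQR formulas against lqr (values assumed).
lambda = 0.5; q = 4; r = 1;

p = -lambda*r + sqrt(q*r + lambda^2*r^2);   % positive Riccati solution
k = p/r;                                    % analytical feedback gain

k_lqr = lqr(-lambda, 1, q, r);              % numerical gain, should equal k
pole  = -lambda - k                         % equals -sqrt(q/r + lambda^2)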



Tuning
Tuning is done by selecting the weights Q and R. We typically choose these as diagonal matrices.

Bryson’s Rule (Rule of thumb)

Qii = 1 / (maximum acceptable value of xi²)

Rjj = 1 / (maximum acceptable value of uj²)

Matlab commands
SYS = ss(A,B,C,D)
[K,S,E] = lqr(SYS,Q,R,N)
[K,S,E] = lqry(SYS,Q,R,N)

Remarks
{A, B} must be controllable.
If weighting the outputs, all unstable modes must show up in the output.
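
A minimal sketch of Bryson's rule for an assumed two-state, one-input system (the "maximum acceptable" values are invented for illustration):

% Sketch: Bryson's rule as a starting point for the weights.
x_max = [0.5; 2.0];          % maximum acceptable state deviations (assumed)
u_max = 10;                  % maximum acceptable input (assumed)

Q = diag(1 ./ x_max.^2);     % Qii = 1 / x_i,max^2
R = 1 / u_max^2;             % Rjj = 1 / u_j,max^2

A = [0 1; -1 -0.5];  B = [0; 1];
K = lqr(A, B, Q, R);         % starting point; refine Q and R by simulation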



Topic

1. Reference feed-forward

2. Integral effect

3. Optimal Control

4. Lunar lander



Lunar lander
[Figure: lunar lander, showing the pitch angle θ, the main thrust f, the side thrust fs, and the vertical position y]



Lunar lander


Nonlinear model
      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [sin(θ) cos(θ); cos(θ) −sin(θ); 0 r][f; fs] − [0; mg; 0]



Lunar lander

Nonlinear model
      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [sin(θ) cos(θ); cos(θ) −sin(θ); 0 r][f; fs] − [0; mg; 0]

Inputs
u1 = f − mg,   u2 = fs

Perturbed nonlinear model


      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [sin(θ) cos(θ); cos(θ) −sin(θ); 0 r][u1; u2] + mg[sin(θ); cos(θ) − 1; 0]



Lunar lander

Perturbed nonlinear model


      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [sin(θ) cos(θ); cos(θ) −sin(θ); 0 r][u1; u2] + mg[sin(θ); cos(θ) − 1; 0]

Linearized model
      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [0 1; 1 0; 0 r][u1; u2] + mg[θ; 0; 0]



Lunar lander

Linearized model
      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [0 1; 1 0; 0 r][u1; u2] + mg[θ; 0; 0]

State equation
d/dt [x; y; θ; ẋ; ẏ; θ̇] = A [x; y; θ; ẋ; ẏ; θ̇] + B [u1; u2]

with

A = [0 0 0 1 0 0;  0 0 0 0 1 0;  0 0 0 0 0 1;  0 0 g 0 0 0;  0 0 0 0 0 0;  0 0 0 0 0 0]
B = [0 0;  0 0;  0 0;  0 1/m;  1/m 0;  0 r/j]



Lunar lander

State equation
d/dt [x; y; θ; ẋ; ẏ; θ̇] = A [x; y; θ; ẋ; ẏ; θ̇] + B [u1; u2]

with

A = [0 0 0 1 0 0;  0 0 0 0 1 0;  0 0 0 0 0 1;  0 0 g 0 0 0;  0 0 0 0 0 0;  0 0 0 0 0 0]
B = [0 0;  0 0;  0 0;  0 1/m;  1/m 0;  0 r/j]

Controllability Matrix
C = [B  AB  A²B  A³B  A⁴B  A⁵B]

  = [  0    0    0   1/m   0    0    0  gr/j  0 0 0 0;
       0    0   1/m   0    0    0    0    0   0 0 0 0;
       0    0    0   r/j   0    0    0    0   0 0 0 0;
       0   1/m   0    0    0  gr/j   0    0   0 0 0 0;
      1/m   0    0    0    0    0    0    0   0 0 0 0;
       0   r/j   0    0    0    0    0    0   0 0 0 0 ]

C has full rank (6), so the linearized system is controllable.



Lunar lander

State equation
d/dt [x; y; θ; ẋ; ẏ; θ̇] = A [x; y; θ; ẋ; ẏ; θ̇] + B [u1; u2]

with

A = [0 0 0 1 0 0;  0 0 0 0 1 0;  0 0 0 0 0 1;  0 0 g 0 0 0;  0 0 0 0 0 0;  0 0 0 0 0 0]
B = [0 0;  0 0;  0 0;  0 1/m;  1/m 0;  0 r/j]

Output
 
y = Cx = [1 0 0 0 0 0;  0 1 0 0 0 0;  0 0 1 0 0 0] [x; y; θ; ẋ; ẏ; θ̇]
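
A sketch of how the lander model can be set up and checked in Matlab; the parameter values (m, j, r, g) and the LQR weights are assumed for illustration only:

% Sketch: lunar-lander state-space model, controllability check and LQR design.
m = 1000; j = 500; r = 2; g = 1.62;      % assumed parameter values

A = [zeros(3,3), eye(3);
     0 0 g,      zeros(1,3);
     zeros(2,6)];
B = [zeros(3,2);
     0,    1/m;
     1/m,  0;
     0,    r/j];
C = [eye(3), zeros(3,3)];                % outputs: x, y, theta

rank(ctrb(A, B))                         % = 6, so {A, B} is controllable

Q = diag([1 1 10]);                      % output weights (illustrative)
R = diag([1e-4 1e-2]);                   % input weights (illustrative)
[K, S, E] = lqry(ss(A, B, C, zeros(3,2)), Q, R);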



Lunar lander

Simulink model
A Simulink model for the lunar lander with accompanying Matlab script is available on Blackboard.

Nonlinearities
The true system is not linear!

Actuator saturation
Thrusters are limited. In this example to:

0 ≤ f ≤ 2mg,   −0.1mg ≤ fs ≤ 0.1mg
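
A small sketch of how these limits can be applied to the commanded inputs (the commanded values and parameters are assumed; recall u1 = f − mg, u2 = fs):

% Sketch: clamp the thrusts to the stated limits before applying them.
m = 1000; g = 1.62;                         % assumed parameters
u1_cmd = 200; u2_cmd = -300;                % example controller outputs

f  = min(max(u1_cmd + m*g, 0), 2*m*g);      % 0 <= f <= 2mg
fs = min(max(u2_cmd, -0.1*m*g), 0.1*m*g);   % -0.1mg <= fs <= 0.1mg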

