
TTK4115

Lecture 4
State feedback (continued), optimal control

Morten O. Alver (based on slides by Morten D. Pedersen)



This lecture

1. Reference feed-forward

2. Integral effect

3. Optimal Control

4. Lunar lander



Topic

1. Reference feed-forward

2. Integral effect

3. Optimal Control

4. Lunar lander



Control with a reference

State feedback
Consider the system:

ẋ = Ax + Bu
y = Cx

With the control law:


u = −Kx
the state equation can be rewritten as:

ẋ = (A − BK)x

If the system is controllable, we can choose K to make the closed-loop system stable. Then x will
converge towards x∞ = 0, and y towards y∞ = 0.

Reference
Sometimes we want the system output y to converge to a non-zero reference r. But we still want
the benefits of the state feedback. How?



Reference feed-forward
Aim: asymptotic convergence to reference
We want¹ y(t) → r as t → ∞.

Implementation
u = Kr r − Kx

where Kr r is the reference feedforward term and −Kx is the state feedback term.

Equilibrium conditions
Assuming that feedback results in a stable equilibrium yields the steady-state condition:

ẋ = Ax + B(Kr r − Kx) = 0 ⇒ (A − BK)x∞ = −BKr r0 ⇒ y∞ = [C(BK − A)−1 B]Kr r0

Finding Kr
Inversion gives the correct feedforward gain:

Kr = [C(BK − A)−1 B]−1

Note that the number of references must be the same as the number of outputs.

¹ The elements in y are the variables we wish to control, not the measurements per se. In state feedback all states are assumed known.
Reference feed-forward

A common mistake
If C = I, so y = x, one might be tempted to use pure error feedback:

u = K(r − x)

This yields the system:


ẋ = (A − BK)x + BKr

⇒ x̂(s)/r̂(s) = (sI − [A − BK])−1 BK

Assume that the reference is constant, r̂(s) = r0/s. The final-value theorem yields (in general):

x(∞) = lim_{s→0} s · (x̂(s)/r̂(s)) · (r0/s) = −(A − BK)−1 BK r0 ≠ r0



Reference feed-forward

Correct approach
Suppose now that state feedback + reference feedforward is used instead (now for any C):

u = [C(BK − A)−1 B]−1 r − Kx

This yields the system:


ẋ = (A − BK)x + B[C(BK − A)−1 B]−1 r

⇒ ŷ(s)/r̂(s) = C x̂(s)/r̂(s) = C(sI − [A − BK])−1 B [C(BK − A)−1 B]−1

Assume that the reference is constant, r̂(s) = r0/s. The final-value theorem yields:

y(∞) = lim_{s→0} s · (ŷ(s)/r̂(s)) · (r0/s) = C(A − BK)−1 B [C(A − BK)−1 B]−1 r0 = r0



Reference feed-forward

The effect of the feed-forward


For controlling toward a reference, we have found the feedback law u = Kr r − Kx with the
feedforward defined by:
Kr = [C(BK − A)−1 B]−1

The reference feed-forward moves the equilibrium point of the closed-loop system so that y = r at
the equilibrium point.
This lets us separate the issue of following the reference (feed-forward) from the issue of the
system’s convergence properties (state feedback).
Note that the value of Kr depends on the value of K.
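
A minimal numerical sketch of this computation, assuming an illustrative double-integrator plant and pole locations that are not from the lecture (place is from the Control System Toolbox):

% Sketch: compute Kr for an assumed example system (double integrator).
A = [0 1; 0 0];
B = [0; 1];
C = [1 0];

K  = place(A, B, [-2, -3]);        % state feedback gain (illustrative poles)
Kr = inv(C * ((B*K - A) \ B));     % Kr = [C(BK - A)^-1 B]^-1

% Steady-state check: x_inf = (BK - A)^-1 B Kr r0, so y_inf should equal r0.
r0    = 1;
x_inf = (B*K - A) \ (B * Kr * r0);
y_inf = C * x_inf                  % = r0

With pure error feedback u = K(r − x), the same check gives y_inf ≠ r0 in general, which is the mistake discussed above.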



Topic

1. Reference feed-forward

2. Integral effect

3. Optimal Control

4. Lunar lander



Integral effect

Problem
What if we don’t know our system perfectly? (We never do)
What about disturbances: w?

Solution
Use integral effect!



Integral effect

Plant²

ẋ = Ax + Bu + Bw        (the term Bw is the disturbance)
y = Cx

Integrator state augmentation

xa = ∫₀^t (r(τ) − Cx(τ)) dτ

Augmented system

[ẋ; ẋa] = [A 0; −C 0][x; xa] + [B; 0]u + [B; 0]w + [0; I]r

² The disturbance w is assumed to act in a way that can be cancelled by the input.
Integral effect

Augmented system
          
[ẋ; ẋa] = [A 0; −C 0][x; xa] + [B; 0]u + [B; 0]w + [0; I]r

State feedback
As before state feedback + reference feedforward is used:
 
u = −[K  Ka] [x; xa] + Kr r = −Kx − Ka xa + Kr r

Closed loop system


        
[ẋ; ẋa] = [A − BK  −BKa; −C  0][x; xa] + [BKr; I]r + [B; 0]w
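
As a sketch of how the augmented design can be carried out numerically (the plant, weights and the use of lqr below are assumptions for illustration, not part of the lecture):

% Sketch: integrator augmentation for an assumed SISO plant, with the
% augmented gain [K Ka] found by LQR (any stabilizing design would do).
A = [0 1; -1 -1];   B = [0; 1];   C = [1 0];

Aa = [A, zeros(2,1); -C, 0];   % augmented dynamics for [x; xa]
Ba = [B; 0];                   % input acts on the plant states only

Q = diag([10 1 5]);            % weights on [x; xa] (illustrative)
R = 1;
Kaug = lqr(Aa, Ba, Q, R);      % Kaug = [K, Ka]
K  = Kaug(1:2);
Ka = Kaug(3);

eig(Aa - Ba*Kaug)              % closed-loop augmented poles (should be stable)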



Integral effect
Closed loop system
        
[ẋ; ẋa] = [A − BK  −BKa; −C  0][x; xa] + [BKr; I]r + [B; 0]w

Steady state behavior - with feedforward


Provided that the feedback results in an asymptotically stable augmented system the output will
tend towards a constant reference r0 :

ẋa = 0 ⇒ Cx∞ = r0

The state vector exhibits the following limit:

ẋ = 0 ⇒ Cx∞ = C(A − BK)−1 B[Ka xa,∞ − Kr r0 − w0 ]

If the feedforward is implemented as shown earlier (Kr = [C(BK − A)−1 B]−1 ) we have:

Cx∞ − r0 = C(A − BK)−1 B[Ka xa,∞ − w0] = 0

The integral action thus acts as a disturbance estimator/compensator:

Ka xa,∞ = w0



Integral effect
Closed loop system
        
[ẋ; ẋa] = [A − BK  −BKa; −C  0][x; xa] + [BKr; I]r + [B; 0]w

Steady state behavior - without feedforward


Provided that the feedback results in an asymptotically stable augmented system the output will
tend towards a constant reference r0 :

ẋa = 0 ⇒ Cx∞ = r0

The state vector exhibits the following limit:

ẋ = 0 ⇒ Cx∞ = C(A − BK)−1 B[Ka xa,∞ − w0 ]

If the feedforward is not implemented we have:

Cx∞ − r0 = C(A − BK)−1 B[Ka xa,∞ − w0 ] − r0 = 0

The integral action now performs double duty and compensates for the disturbance as well as
setting the correct bias³:

Ka xa,∞ = −Kr r0 + w0

³ But only asymptotically.
Example

Cruise control
Car model:
mv̇ = −dv + u + w,   y = v
Matrices:
A = −d/m,   B = 1/m,   C = 1

State feedback
Control:
u = −kv + kr r
Feedforward gain:
kr = [C(BK − A)−1 B]−1 = d + k
Closed loop dynamics:
mv̇ = (d + k)(r − v) + w,   y = v
Stable equilibrium at y = r if w = 0.



Example
Cruise control
Car model:
mv̇ = −dv + u + w,   y = v
Matrices:
A = −d/m,   B = 1/m,   C = 1

Augmented state space


[v̇; ẋa] = [−d/m  0; −1  0][v; xa] + [1/m; 0]u + [0; 1]r + [1/m; 0]w

State feedback
Control:
u = −kv − ka xa + kr r
Closed loop dynamics:

[v̇; ẋa] = [−(d + k)/m  −ka/m; −1  0][v; xa] + [kr/m; 1]r + [1/m; 0]w



Example
Cruise control
Car model:
mv̇ = −dv + u + w,   y = v
Matrices:
A = −d/m,   B = 1/m,   C = 1

State feedback
Control:
u = −kv − ka xa + kr r
Closed loop dynamics:

[v̇; ẋa] = [−(d + k)/m  −ka/m; −1  0][v; xa] + [kr/m; 1]r + [1/m; 0]w

Transfer functions

ŷ(s) = [(d + k)s − ka] / [ms² + (d + k)s − ka] · r̂(s) + s / [ms² + (d + k)s − ka] · ŵ(s)

The (d + k)s term in the reference numerator is due to the reference feedforward.
Retaining reference feedforward may yield faster reference tracking.
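
A small simulation sketch of the closed loop above; all numerical values (m, d, gains, disturbance) are assumed for illustration:

% Sketch: cruise control with integral action and a constant disturbance.
m = 1000; d = 50;             % mass [kg], linear drag [N s/m] (assumed)
k = 500; ka = 100;            % feedback gains (illustrative)
kr = d + k;                   % reference feedforward gain

Acl = [-(d + k)/m, -ka/m; -1, 0];
Br  = [kr/m; 1];              % reference input matrix
Bw  = [1/m; 0];               % disturbance input matrix

r0 = 20; w0 = -200;           % reference speed [m/s], constant disturbance [N]
f  = @(t, z) Acl*z + Br*r0 + Bw*w0;
[t, z] = ode45(f, [0 60], [0; 0]);

plot(t, z(:,1))               % the speed settles at r0 despite the disturbance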



Topic

1. Reference feed-forward

2. Integral effect

3. Optimal Control

4. Lunar lander



Optimal Control

Optimal control
Optimal control is concerned with finding an input u(t) that minimizes a cost function over some
interval of time, subject to system dynamics.

LQR
The Linear Quadratic Regulator is a special case where the plant is Linear and the cost function is
Quadratic. The time interval is infinite.



Optimal Control
The LQR problem
Consider the system:

ẋ = Ax + Bu
y = Cx

The system is to be controlled using the control law: u = −Kx

K is chosen to minimize the following objective function:

J = ∫₀^∞ [yT(t)Qy(t) + uT(t)Ru(t)] dt

This form weights the outputs y(t). We can choose to weight the state values x(t) instead.

Note:
It is an ordinary state feedback - just with a new way of choosing K.
The J function is separate from the state equation - but its value depends on it.
For J to have a finite value, the system must converge towards its equilibrium at the origin.
The system must therefore be controllable...
... and the value of K that optimizes J must stabilize the system.



Optimal Control

Scalar case
In a scalar system, the objective function can be written as follows:

J = ∫₀^∞ [q x(t)² + r u(t)²] dt,   q > 0 and r > 0

Here, the value of J depends on:


The integral of x(t)2 as it converges towards 0 (from some initial value).
The integral of u(t)2 as the input works to bring x(t) to 0.
The weighting parameters q and r .

The feedback gain k that optimizes J depends on the relative sizes of q and r , and these serve to
prioritize:
either fast convergence of x(t), using a large q/r
or low use of the input u(t), using a small q/r .



Optimal Control

Multivariable case
J = ∫₀^∞ [yT(t)Qy(t) + uT(t)Ru(t)] dt
In the multivariable case:
The Q matrix lets us weight fast convergence in each individual output (in the diagonal
elements) and in products of pairs of outputs (in the off-diagonal elements). We often use
nonzero values on the diagonal only.
The R matrix lets us weight the inputs (in the diagonal elements). We typically use nonzero
values on the diagonal only.



Optimal Control

Trajectories of a feedback controlled system

Finding a minimum
Finding an analytical solution that minimizes the objective function is in general not trivial.
Discretizing the dynamics and finding the optimum numerically over a finite horizon is a viable
method (MPC, NMPC).
A linear model with a quadratic cost function is a "lucky" case: we can find an analytical
solution that covers all initial values.



LQR problem definition

Cost functional to be minimized w.r.t. u(t)


JLQR = ∫₀^∞ [yT(t)Qy(t) + uT(t)Ru(t)] dt
with:
Q > 0, Q = QT , R > 0, R = RT

Output energy
yT (t)Qy(t)
Making this function smaller requires more input energy.

Control energy
uT (t)Ru(t)
Making this term smaller requires less input energy, which leads to higher output energy.



Optimal Control

Finding a minimum
We want to find the minimum of:
J = ∫₀^∞ [yT(t)Qy(t) + uT(t)Ru(t)] dt

The "trick" is to rewrite the cost functional on the special form:

J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt

where:
H(x(·), u(·)) is not affected by the input, directly or indirectly.
Λ(x(t), u(t)) has an obvious minimum in terms of u(t).



Optimal control
Desired form of cost function
J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt

The functionalᵃ H(x(·), u(·)) above can be described as a feedback invariant. The functional takes
the system inputs and states as arguments, but its value depends only on the initial condition x(0).
It is not affected by the input, directly or indirectly.

ᵃ A functional takes functions as arguments and returns a scalar.

Feedback invariance
The functional:

H(x(·), u(·)) ≜ −∫₀^∞ (d/dt)[xT(t)Sx(t)] dt

happens to be feedback invariant as long as x(t) → 0 as t → ∞:

−∫₀^∞ (d/dt)[xT(t)Sx(t)] dt = −[xT(t)Sx(t)]₀^∞ = xT(0)Sx(0) − xT(∞)Sx(∞) = xT(0)Sx(0)

The integrand can be rewritten as follows:

(d/dt)[xT(t)Sx(t)] = ẋT Sx + xT Sẋ = xT S[Ax + Bu] + [Ax + Bu]T Sx



Optimal control

Desired form of cost function


J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt = ∫₀^∞ [yT(t)Qy(t) + uT(t)Ru(t)] dt

H(x(·), u(·)) = −∫₀^∞ ( [Ax(t) + Bu(t)]T Sx(t) + xT(t)S[Ax(t) + Bu(t)] ) dt

Develop by adding and subtracting

"Z #

T T
J = H(x, u) + y (t)Qy(t) + u (t)Ru(t)dt − H(x(·), u(·))
0
Z ∞ 
= H(x, u) + xT CT QCx + uT Ru + (Ax + Bu)T Sx + xT S(Ax + Bu)dt
0
 
Z ∞
xT (AT S + SA + CT QC)x + uT Ru + 2uT BT Sx dt 
 
= H(x, u) + 

0 | {z } 
Λ(x,u)



Optimal control

Desired form of cost function

J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt

H(x(·), u(·)) = −∫₀^∞ ( [Ax(t) + Bu(t)]T Sx(t) + xT(t)S[Ax(t) + Bu(t)] ) dt

Λ(x, u) = xT (AT S + SA + CT QC)x + uT Ru + 2uT BT Sx

Refine Λ(x, u) by "completing the squares" (matrix version):

zT Mz − 2bT z = (z − M−1 b)T M(z − M−1 b) − bT M−1 b

Thus we can rewrite uT Ru + 2uT BT Sx by choosing u as z and −BT Sx as b (remember that
uT BT Sx = xT SBu since the term is scalar and S is symmetric):

uT Ru + 2uT BT Sx = (u + R−1 BT Sx)T R(u + R−1 BT Sx) − xT SBR−1 BT Sx



Desired form of cost function

J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt

H(x(·), u(·)) = −∫₀^∞ ( [Ax(t) + Bu(t)]T Sx(t) + xT(t)S[Ax(t) + Bu(t)] ) dt

Λ(x, u) = xT (AT S + SA + CT QC − SBR−1 BT S)x + (u + R−1 BT Sx)T R(u + R−1 BT Sx)

How to minimize J
We already established that H(x(·), u(·)) is feedback invariant for any choice of S, so we can
ignore it in the optimization.

Instead we focus on Λ(x, u), which contains two quadratic expressions:

Λ(x, u) = xT (AT S + SA + CT QC − SBR−1 BT S)x + (u + R−1 BT Sx)T R(u + R−1 BT Sx)

The second quadratic term is always ≥ 0 since R > 0.

We can choose S so the first quadratic expression becomes 0. The equation has the form of
the continuous-time algebraic Riccati equation (CARE).
After choosing S, we just need to minimize the second term (to 0) by the right choice of u.



Desired form of cost function

J = H(x(·), u(·)) + ∫₀^∞ Λ(x(t), u(t)) dt

H(x(·), u(·)) = −∫₀^∞ ( [Ax(t) + Bu(t)]T Sx(t) + xT(t)S[Ax(t) + Bu(t)] ) dt

Λ(x, u) = xT (AT S + SA + CT QC − SBR−1 BT S)x + (u + R−1 BT Sx)T R(u + R−1 BT Sx)

Minimization step 1:
Solve CARE:
AT S + SA + CT QC − SBR−1 BT S = 0

Minimization step 2:
Choose:
u = −R−1 BT Sx

The minimum is thus obtained:


Λ(x(t), u(t)) ≡ 0 ⇒ J = xT (0)Sx(0)



LQR summarized
Minimal solution
The solution that minimizes the cost function for the system:

ẋ(t) = Ax(t) + Bu(t)

is a linear feedback:
u(t) = −Kx(t)
where:
K = R−1 BT S
leading to the closed loop system:

ẋ(t) = [A − BR−1 BT S]x(t)

Algebraic Riccati Equation


The matrix S is found by solving:

[AT S + SA] + CT QC − SBR−1 BT S = 0

where the solution S must be positive definite (S > 0).
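
As a sketch, the Riccati equation and the resulting gain can also be computed numerically with the Control System Toolbox; the example system and weights below are assumptions, not from the lecture:

% Sketch: LQR gain via the Riccati equation and via lqry (illustrative system).
A = [0 1; -2 -1];   B = [0; 1];   C = [1 0];
Q = 10;                         % output weight (y is scalar here)
R = 1;                          % input weight

S = care(A, B, C'*Q*C, R);      % solves A'S + SA + C'QC - S B R^-1 B' S = 0
K = R \ (B' * S)                % K = R^-1 B' S

[K2, S2, E] = lqry(ss(A, B, C, 0), Q, R);   % should give the same gain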



LQR example

Example: ẋ(t) = −λx(t) + u(t),   y(t) = x(t)

Riccati equation:
[AT S + SA] + CT QC − SBR−1 BT S = 0

In scalar form, with S replaced by p, A = −λ and B = C = 1, the equation becomes:

−2λp + q − p²/r = 0

The solutions are:

p = −λr ± √(qr + λ²r²)   (pick the positive solution)

The feedback gain becomes:

u(t) = −kx(t) = −(p/r) x(t) = (λ − √(q/r + λ²)) x(t)



LQR example

Closed loop system: ẋ(t) = −λx(t) − kx(t)

State dynamics:

ẋ(t) = −(λ + p/r) x(t)
     = −(λ − [λ − √(q/r + λ²)]) x(t)
     = −√(q/r + λ²) x(t)

Input:

u(t) = −kx(t) = −(p/r) x(t) = (λ − √(q/r + λ²)) x(t)

Note:
There is a direct tradeoff between q and r .
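
A quick numerical check of the scalar result, with assumed values of λ, q and r:

% Sketch: verify the scalar LQR formulas against lqr (values assumed).
lambda = 0.5; q = 4; r = 1;

p = -lambda*r + sqrt(q*r + lambda^2*r^2);   % positive Riccati solution
k = p/r;                                    % analytical feedback gain

k_lqr = lqr(-lambda, 1, q, r);              % numerical gain, should equal k
pole  = -lambda - k                         % equals -sqrt(q/r + lambda^2)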



Tuning
Tuning is done by selecting the weights Q and R. We typically choose these as diagonal matrices.

Bryson’s Rule (Rule of thumb)

Qii = 1 / (maximum acceptable value of xi²)

Rjj = 1 / (maximum acceptable value of uj²)

Matlab commands
SYS = ss(A,B,C,D)
[K,S,E] = lqr(SYS,Q,R,N)
[K,S,E] = lqry(SYS,Q,R,N)

Remarks
{A, B} must be controllable.
If weighting the outputs, all unstable modes must show up in the output.
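
A minimal sketch of Bryson's rule for an assumed two-state, one-input system (the "maximum acceptable" values are invented for illustration):

% Sketch: Bryson's rule as a starting point for the weights.
x_max = [0.5; 2.0];          % maximum acceptable state deviations (assumed)
u_max = 10;                  % maximum acceptable input (assumed)

Q = diag(1 ./ x_max.^2);     % Qii = 1 / x_i,max^2
R = 1 / u_max^2;             % Rjj = 1 / u_j,max^2

A = [0 1; -1 -0.5];  B = [0; 1];
K = lqr(A, B, Q, R);         % starting point; refine Q and R by simulation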



Topic

1. Reference feed-forward

2. Integral effect

3. Optimal Control

4. Lunar lander



Lunar lander
[Figure: lunar lander, showing the pitch angle θ, the main thrust f, the side thrust fs, and the vertical position y]



Lunar lander


Nonlinear model
      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [sin(θ) cos(θ); cos(θ) −sin(θ); 0 r][f; fs] − [0; mg; 0]



Lunar lander

Nonlinear model
      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [sin(θ) cos(θ); cos(θ) −sin(θ); 0 r][f; fs] − [0; mg; 0]

Inputs
u1 = f − mg,   u2 = fs

Perturbed nonlinear model


      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [sin(θ) cos(θ); cos(θ) −sin(θ); 0 r][u1; u2] + mg[sin(θ); cos(θ) − 1; 0]



Lunar lander

Perturbed nonlinear model


      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [sin(θ) cos(θ); cos(θ) −sin(θ); 0 r][u1; u2] + mg[sin(θ); cos(θ) − 1; 0]

Linearized model
      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [0 1; 1 0; 0 r][u1; u2] + mg[θ; 0; 0]



Lunar lander

Linearized model
      
[m 0 0; 0 m 0; 0 0 j][ẍ; ÿ; θ̈] = [0 1; 1 0; 0 r][u1; u2] + mg[θ; 0; 0]

State equation
d/dt [x; y; θ; ẋ; ẏ; θ̇] = A [x; y; θ; ẋ; ẏ; θ̇] + B [u1; u2]

with

A = [0 0 0 1 0 0;  0 0 0 0 1 0;  0 0 0 0 0 1;  0 0 g 0 0 0;  0 0 0 0 0 0;  0 0 0 0 0 0]
B = [0 0;  0 0;  0 0;  0 1/m;  1/m 0;  0 r/j]



Lunar lander

State equation
d/dt [x; y; θ; ẋ; ẏ; θ̇] = A [x; y; θ; ẋ; ẏ; θ̇] + B [u1; u2]

with

A = [0 0 0 1 0 0;  0 0 0 0 1 0;  0 0 0 0 0 1;  0 0 g 0 0 0;  0 0 0 0 0 0;  0 0 0 0 0 0]
B = [0 0;  0 0;  0 0;  0 1/m;  1/m 0;  0 r/j]

Controllability Matrix
C = [B  AB  A²B  A³B  A⁴B  A⁵B]

  = [  0    0    0   1/m   0    0    0  gr/j  0 0 0 0;
       0    0   1/m   0    0    0    0    0   0 0 0 0;
       0    0    0   r/j   0    0    0    0   0 0 0 0;
       0   1/m   0    0    0  gr/j   0    0   0 0 0 0;
      1/m   0    0    0    0    0    0    0   0 0 0 0;
       0   r/j   0    0    0    0    0    0   0 0 0 0 ]

C has full rank (6), so the linearized system is controllable.



Lunar lander

State equation
d/dt [x; y; θ; ẋ; ẏ; θ̇] = A [x; y; θ; ẋ; ẏ; θ̇] + B [u1; u2]

with

A = [0 0 0 1 0 0;  0 0 0 0 1 0;  0 0 0 0 0 1;  0 0 g 0 0 0;  0 0 0 0 0 0;  0 0 0 0 0 0]
B = [0 0;  0 0;  0 0;  0 1/m;  1/m 0;  0 r/j]

Output
 
y = Cx = [1 0 0 0 0 0;  0 1 0 0 0 0;  0 0 1 0 0 0] [x; y; θ; ẋ; ẏ; θ̇]
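
A sketch of how the lander model can be set up and checked in Matlab; the parameter values (m, j, r, g) and the LQR weights are assumed for illustration only:

% Sketch: lunar-lander state-space model, controllability check and LQR design.
m = 1000; j = 500; r = 2; g = 1.62;      % assumed parameter values

A = [zeros(3,3), eye(3);
     0 0 g,      zeros(1,3);
     zeros(2,6)];
B = [zeros(3,2);
     0,    1/m;
     1/m,  0;
     0,    r/j];
C = [eye(3), zeros(3,3)];                % outputs: x, y, theta

rank(ctrb(A, B))                         % = 6, so {A, B} is controllable

Q = diag([1 1 10]);                      % output weights (illustrative)
R = diag([1e-4 1e-2]);                   % input weights (illustrative)
[K, S, E] = lqry(ss(A, B, C, zeros(3,2)), Q, R);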



Lunar lander

Simulink model
A Simulink model for the lunar lander with accompanying Matlab script is available on Blackboard.

Nonlinearities
The true system is not linear!

Actuator saturation
Thrusters are limited. In this example to:

0 ≤ f ≤ 2mg,   −0.1mg ≤ fs ≤ 0.1mg
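
A small sketch of how these limits can be applied to the commanded inputs (the commanded values and parameters are assumed; recall u1 = f − mg, u2 = fs):

% Sketch: clamp the thrusts to the stated limits before applying them.
m = 1000; g = 1.62;                         % assumed parameters
u1_cmd = 200; u2_cmd = -300;                % example controller outputs

f  = min(max(u1_cmd + m*g, 0), 2*m*g);      % 0 <= f <= 2mg
fs = min(max(u2_cmd, -0.1*m*g), 0.1*m*g);   % -0.1mg <= fs <= 0.1mg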

