
Lecture 9: Discrete-Time Linear Quadratic Regulator – Finite-Horizon Case

Dr. Burak Demirel

Faculty of Electrical Engineering and Information Technology, University of Paderborn

December 15, 2015

Previous lectures

In the previous lectures, we considered

- the synthesis problem, solved by pole-placement techniques (i.e., the main design parameter is the location of the poles)

These techniques are limited to SISO systems.

LQR problem: Background

Given a discrete-time LTI system

$$x_{k+1} = A x_k + B u_k, \qquad y_k = C x_k \tag{1}$$

LQR problem: Find the control inputs $\pi = \{u_0, \cdots, u_{N-1}\}$ that make the following criterion as small as possible:

$$J(\pi) = x_N^\top P x_N + \sum_{k=0}^{N-1} \left( x_k^\top Q x_k + u_k^\top R u_k \right) \tag{2}$$

where

$$Q \in \mathbb{S}_{\geq 0}, \qquad P \in \mathbb{S}_{\geq 0}, \qquad R \in \mathbb{S}_{> 0}$$

are the given state cost, terminal cost, and input cost matrices, respectively.
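As a concrete illustration of (1) and (2), the cost of a given input sequence can be evaluated by simulating the dynamics forward and accumulating the stage costs. A minimal NumPy sketch (the function name lqr_cost is illustrative, not from the slides):

```python
import numpy as np

def lqr_cost(A, B, Q, R, P, x0, U):
    """Evaluate J(pi) in (2) for an input sequence U = [u_0, ..., u_{N-1}]."""
    x, J = x0, 0.0
    for u in U:
        J += x @ Q @ x + u @ R @ u   # stage cost x_k' Q x_k + u_k' R u_k
        x = A @ x + B @ u            # dynamics (1): x_{k+1} = A x_k + B u_k
    return J + x @ P @ x             # terminal cost x_N' P x_N
```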

Multi-objective interpretation

Common form for $Q$ and $R$:

$$Q = P = C^\top C, \qquad R = \rho I$$

where $\rho > 0$. Then, the cost is

$$J(\pi) = \sum_{k=0}^{N} \|y_k\|^2 + \rho \sum_{k=0}^{N-1} \|u_k\|^2$$

where $y_k = C x_k$.

$\rho$ gives the relative weighting of the output and input norms.

Multi-objective interpretation

The LQR quadratic cost can be written as $J = J_{\mathrm{out}} + \rho J_{\mathrm{in}}$. The term

$$J_{\mathrm{out}} = \sum_{k=0}^{N} \|y_k\|^2$$

provides a measure of the output energy, and the term

$$J_{\mathrm{in}} = \sum_{k=0}^{N-1} \|u_k\|^2$$

provides a measure of the control energy.

These are competing objectives; we want both to be small.

Trade-off between conflicting goals

[Figure: lines of constant $J = J_{\mathrm{out}} + \rho J_{\mathrm{in}}$ in the $(J_{\mathrm{in}}, J_{\mathrm{out}})$ plane, with three policies $\pi_1$, $\pi_2$, $\pi_3$ marked relative to the optimal trade-off curve.]

- $\pi_3$ is worse than $\pi_2$ on both $J_{\mathrm{out}}$ and $J_{\mathrm{in}}$
- $\pi_1$ is better than $\pi_2$ in $J_{\mathrm{in}}$ but worse in $J_{\mathrm{out}}$

When $\rho$ is much larger than 1, the most effective way to decrease $J$ is to employ a small control input, at the expense of a large output. When $\rho$ is much smaller than 1, the most effective way to decrease $J$ is to obtain a very small output, even if this is achieved at the expense of employing a large control input.

LQR via least-squares

LQR can be formulated (and solved) as a least-squares problem.

$X = [x_0, \cdots, x_N]$ is a linear function of $x_0$ and $U = [u_0, \cdots, u_{N-1}]$:

$$\begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix} = \begin{bmatrix} 0 & & & 0 \\ B & 0 & & \\ AB & B & \ddots & \\ \vdots & & \ddots & 0 \\ A^{N-1}B & A^{N-2}B & \cdots & B \end{bmatrix} \begin{bmatrix} u_0 \\ \vdots \\ u_{N-1} \end{bmatrix} + \begin{bmatrix} I \\ A \\ A^2 \\ \vdots \\ A^N \end{bmatrix} x_0$$

Express this as $X = GU + Hx_0$, where $G \in \mathbb{R}^{(N+1)n \times Nm}$ and $H \in \mathbb{R}^{(N+1)n \times n}$.

LQR via least-squares

Express the linear-quadratic cost as

$$J(U) = \left\| \operatorname{diag}(Q^{1/2}, \cdots, Q^{1/2}, P^{1/2})(GU + Hx_0) \right\|^2 + \left\| \operatorname{diag}(R^{1/2}, \cdots, R^{1/2})\, U \right\|^2$$

This is just a least-squares problem.

- This solution method requires solving a least-squares problem of size $N(n+m) \times Nm$ (see Boyd)
- Using a naive method (e.g., QR factorization), the cost is $O(N^3 n m^2)$ (see Boyd)
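To make the formulation concrete, here is a minimal NumPy sketch that builds $G$ and $H$ as above, stacks the weighted blocks, and minimizes $J(U)$ with an off-the-shelf least-squares solver. The helper names (build_GH, sqrtm_psd, solve_lqr_ls) are illustrative, not from the slides:

```python
import numpy as np

def build_GH(A, B, N):
    """Build G, H such that X = [x_0; ...; x_N] = G U + H x_0."""
    n, m = B.shape
    G = np.zeros(((N + 1) * n, N * m))
    H = np.zeros(((N + 1) * n, n))
    H[:n] = np.eye(n)
    for i in range(1, N + 1):
        H[i*n:(i+1)*n] = A @ H[(i-1)*n:i*n]    # block row i of H is A^i
        for j in range(i):                     # block (i, j) of G is A^(i-1-j) B
            G[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - 1 - j) @ B
    return G, H

def sqrtm_psd(M):
    """Symmetric square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

def solve_lqr_ls(A, B, Q, R, P, x0, N):
    """Minimize ||W1 (G U + H x0)||^2 + ||W2 U||^2 over U via least squares."""
    n, m = B.shape
    G, H = build_GH(A, B, N)
    W1 = np.kron(np.eye(N + 1), sqrtm_psd(Q))  # diag(Q^1/2, ..., Q^1/2, P^1/2)
    W1[-n:, -n:] = sqrtm_psd(P)
    W2 = np.kron(np.eye(N), sqrtm_psd(R))      # diag(R^1/2, ..., R^1/2)
    # J(U) = || [W1 G; W2] U - [-W1 H x0; 0] ||^2
    M = np.vstack([W1 @ G, W2])
    b = np.concatenate([-W1 @ (H @ x0), np.zeros(N * m)])
    U, *_ = np.linalg.lstsq(M, b, rcond=None)
    return U.reshape(N, m)                     # row k is u_k
```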

Finite-horizon LQR control

Theorem
Consider the finite-horizon LQR problem. The optimal control

$$u_k = -(R + B^\top S_{k+1} B)^{-1} B^\top S_{k+1} A x_k$$

is a linear function of the state $x_k$. The matrix $S_k$ evolves according to the backward Riccati recursion

$$S_k = A^\top S_{k+1} A + Q - A^\top S_{k+1} B (B^\top S_{k+1} B + R)^{-1} B^\top S_{k+1} A$$

with the terminal condition $S_N = P$. Furthermore, the optimal loss is given by

$$J_N = x_0^\top S_0 x_0.$$
Dynamic programming solution

- gives an efficient, recursive method to solve the LQR least-squares problem; the cost is $O(Nn^3)$
- but a less naive approach to solving the LQR least-squares problem will have the same complexity
- the same idea can be applied to many other problems

Completing the squares
Consider the loss function of the form

$$J(x, u) = \begin{bmatrix} x \\ u \end{bmatrix}^\top \begin{bmatrix} Q_x & Q_{xu} \\ Q_{xu}^\top & Q_u \end{bmatrix} \begin{bmatrix} x \\ u \end{bmatrix}, \tag{3}$$

and minimize it with respect to $u$. Then, there exists a matrix $L$ satisfying

$$Q_u L = Q_{xu}^\top$$

such that the loss function (3) becomes

$$J(x, u) = x^\top (Q_x - L^\top Q_u L) x + (u + Lx)^\top Q_u (u + Lx). \tag{4}$$

Since (4) is quadratic in $u$ and both terms are greater than or equal to zero, (3) is minimized for $u = -Lx$. The minimum is

$$J_{\min} = x^\top (Q_x - L^\top Q_u L) x.$$
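The identity (4) is easy to verify numerically. A quick sanity check in NumPy (a sketch; the random test matrices are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
M = rng.standard_normal((n + m, n + m))
W = M @ M.T + 0.1 * np.eye(n + m)      # random positive definite block matrix
Qx, Qxu, Qu = W[:n, :n], W[:n, n:], W[n:, n:]

L = np.linalg.solve(Qu, Qxu.T)         # solve Qu L = Qxu^T
x = rng.standard_normal(n)
u = rng.standard_normal(m)

J3 = x @ Qx @ x + 2 * x @ Qxu @ u + u @ Qu @ u                     # loss (3)
J4 = x @ (Qx - L.T @ Qu @ L) @ x + (u + L @ x) @ Qu @ (u + L @ x)  # form (4)
assert np.isclose(J3, J4)              # the two forms agree
```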

Proof: Via dynamic programming

Define the value function $V_k : \mathbb{R}^n \to \mathbb{R}$ by

$$V_k = \min_{u_k, \cdots, u_{N-1}} \left\{ \sum_{i=k}^{N-1} \left( x_i^\top Q x_i + u_i^\top R u_i \right) + x_N^\top P x_N \right\}$$

subject to $x_{i+1} = A x_i + B u_i$ for $i \in \{k, \cdots, N-1\}$.

$V_k$ can be interpreted as the loss-to-go from $k$ to $N$ and is a function of $x_k$ at time $k$.

We will find that

- $V_k$ is quadratic, i.e., $V_k(x) = x^\top S_k x$, where $S_k \in \mathbb{S}_{\geq 0}$
- $S_k$ can be found recursively, working from $k = N$
- the LQR optimal $u$ is easily expressed in terms of $S_k$

Proof: Via dynamic programming

The loss-to-go with no time left is just the final state cost:

$$V_N = x_N^\top S_N x_N$$

where $S_N = P$. Now, we will show that $V_k$ is quadratic in $x_k$ for all $k$.

For $k = N - 1$, we have

$$V_{N-1} = \min_{u_{N-1}} \left\{ x_{N-1}^\top Q x_{N-1} + u_{N-1}^\top R u_{N-1} + V_N \right\}$$

Using $x_N = A x_{N-1} + B u_{N-1}$ gives

$$V_{N-1} = \min_{u_{N-1}} \left\{ x_{N-1}^\top Q x_{N-1} + u_{N-1}^\top R u_{N-1} + (A x_{N-1} + B u_{N-1})^\top S_N (A x_{N-1} + B u_{N-1}) \right\}$$

Proof: Via dynamic programming
Then, we get

$$V_{N-1} = \min_{u_{N-1}} \begin{bmatrix} x_{N-1} \\ u_{N-1} \end{bmatrix}^\top \begin{bmatrix} Q + A^\top S_N A & A^\top S_N B \\ B^\top S_N A & R + B^\top S_N B \end{bmatrix} \begin{bmatrix} x_{N-1} \\ u_{N-1} \end{bmatrix}$$

This function is quadratic in $u_{N-1}$. Completing the squares, we can write

$$V_{N-1} = \min_{u_{N-1}} \Big\{ x_{N-1}^\top \big( A^\top S_N A + Q - L_{N-1}^\top (R + B^\top S_N B) L_{N-1} \big) x_{N-1} + (u_{N-1} + L_{N-1} x_{N-1})^\top (R + B^\top S_N B)(u_{N-1} + L_{N-1} x_{N-1}) \Big\} \tag{5}$$

where $L_{N-1} = (R + B^\top S_N B)^{-1} B^\top S_N A$. To minimize (5), we need to select

$$u_{N-1} = -L_{N-1} x_{N-1}$$

which gives the minimum loss

$$V_{N-1} = x_{N-1}^\top S_{N-1} x_{N-1}, \qquad \text{where } S_{N-1} = A^\top S_N A + Q - L_{N-1}^\top (R + B^\top S_N B) L_{N-1}.$$
DP algorithm for LQR

1. Set $S_N := P$
2. For $k = N, \cdots, 1$, compute
$$S_{k-1} := A^\top S_k A + Q - A^\top S_k B (R + B^\top S_k B)^{-1} B^\top S_k A$$
3. For $k = 0, \cdots, N-1$, compute
$$L_k := (R + B^\top S_{k+1} B)^{-1} B^\top S_{k+1} A$$
4. For $k = 0, \cdots, N-1$, compute
$$u_k := -L_k x_k$$
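A minimal NumPy sketch of this algorithm (the function name lqr_dp is illustrative, not from the slides):

```python
import numpy as np

def lqr_dp(A, B, Q, R, P, N):
    """Backward Riccati recursion; returns the gains L_0, ..., L_{N-1}."""
    S = P                                  # step 1: S_N := P
    L = [None] * N
    for k in range(N - 1, -1, -1):         # steps 2-3, backward in time
        L[k] = np.linalg.solve(R + B.T @ S @ B, B.T @ S @ A)  # gain L_k
        S = A.T @ S @ A + Q - A.T @ S @ B @ L[k]              # S_k from S_{k+1}
    return L
```

Step 4 is then the state feedback $u_k = -L_k x_k$ applied online; the recursion costs $O(Nn^3)$, matching the complexity quoted for the dynamic programming solution above.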

LQ-control of the double integrator

Consider a double integrator system (with unity sampling interval):

$$x_{k+1} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} x_k + \begin{bmatrix} \tfrac{1}{2} \\ 1 \end{bmatrix} u_k, \qquad y_k = \begin{bmatrix} 1 & 0 \end{bmatrix} x_k$$

with initial state $x_0 = [1 \;\; 0]^\top$, horizon $N = 20$, and weighting matrices

$$Q = P = C^\top C, \qquad R = \rho$$

where $\rho > 0$.
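The simulations on the next two slides can be reproduced along these lines (a sketch, reusing the lqr_dp helper from the DP sketch above; the loop structure is illustrative):

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
C = np.array([[1.0, 0.0]])
Q = P = C.T @ C
N, x0 = 20, np.array([1.0, 0.0])

for rho in (0.3, 10.0):
    L = lqr_dp(A, B, Q, rho * np.eye(1), P, N)  # gains via backward recursion
    x, ys, us = x0, [], []
    for k in range(N):
        u = -L[k] @ x                # u_k = -L_k x_k
        ys.append((C @ x).item())
        us.append(u.item())
        x = A @ x + B @ u            # x_{k+1} = A x_k + B u_k
    J_out = sum(y**2 for y in ys) + (C @ x).item()**2  # includes the k = N term
    J_in = sum(u**2 for u in us)
    print(f"rho = {rho}: J_out = {J_out:.3f}, J_in = {J_in:.3f}")
```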

LQ-control of the double integrator

Optimal trade-off curve of $J_{\mathrm{out}}$ and $J_{\mathrm{in}}$:

[Figure: optimal trade-off curve of $J_{\mathrm{out}}$ versus $J_{\mathrm{in}}$. The blue circle denotes $\rho = 0.3$, while the red circle denotes $\rho = 10$.]

LQ-control of the double integrator

[Figure: output $y_k$ (top) and control input $u_k$ (bottom) versus the number of samples $k$, for $\rho = 0.3$ and $\rho = 10$, over $k = 0, \ldots, 20$.]

References

1. S. Boyd, "Linear quadratic regulator: Discrete-time finite horizon," Lecture Notes.
2. K. J. Åström and B. Wittenmark, "Computer-Controlled Systems," Prentice Hall, 2002.
3. D. P. Bertsekas, "Dynamic Programming and Optimal Control," Athena Scientific, 2000.
