
Lecture 9: Discrete-Time Linear Quadratic Regulator – Finite-Horizon Case

Dr. Burak Demirel

Faculty of Electrical Engineering and Information Technology, University of Paderborn

December 15, 2015

Previous lectures

In the previous lectures, we considered

- the synthesis problem, solved by pole-placement techniques (i.e., the main design parameter is the location of the poles)

These techniques are limited to SISO systems.

LQR problem: Background

Given a discrete-time LTI system

$$x_{k+1} = A x_k + B u_k, \qquad y_k = C x_k \tag{1}$$

LQR problem: Find the control inputs $\pi = \{u_0, \cdots, u_{N-1}\}$ that make the following criterion as small as possible:

$$J(\pi) = x_N^\top P x_N + \sum_{k=0}^{N-1} \left( x_k^\top Q x_k + u_k^\top R u_k \right) \tag{2}$$

where

$$Q \in \mathbb{S}_{\geq 0}, \qquad P \in \mathbb{S}_{\geq 0}, \qquad R \in \mathbb{S}_{> 0}$$

are the given state cost, terminal cost, and input cost matrices, respectively.
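As a concrete illustration of (1) and (2), the cost of a given input sequence can be evaluated by simulating the dynamics forward and accumulating the stage costs. A minimal NumPy sketch (the function name lqr_cost is illustrative, not from the slides):

```python
import numpy as np

def lqr_cost(A, B, Q, R, P, x0, U):
    """Evaluate J(pi) in (2) for an input sequence U = [u_0, ..., u_{N-1}]."""
    x, J = x0, 0.0
    for u in U:
        J += x @ Q @ x + u @ R @ u   # stage cost x_k' Q x_k + u_k' R u_k
        x = A @ x + B @ u            # dynamics (1): x_{k+1} = A x_k + B u_k
    return J + x @ P @ x             # terminal cost x_N' P x_N
```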

Multi-objective interpretation

Common form for $Q$ and $R$:

$$Q = P = C^\top C, \qquad R = \rho I$$

where $\rho > 0$. Then, the cost is

$$J(\pi) = \sum_{k=0}^{N} \|y_k\|^2 + \rho \sum_{k=0}^{N-1} \|u_k\|^2$$

where $y_k = C x_k$.

$\rho$ gives the relative weighting of the output and input norms.

Multi-objective interpretation

The LQR quadratic cost can be written as $J = J_{\mathrm{out}} + \rho J_{\mathrm{in}}$. The term

$$J_{\mathrm{out}} = \sum_{k=0}^{N} \|y_k\|^2$$

provides a measure of the output energy, and the term

$$J_{\mathrm{in}} = \sum_{k=0}^{N-1} \|u_k\|^2$$

provides a measure of the control energy.

These are competing objectives; we want both to be small.

Trade-off between conflicting goals

[Figure: lines of constant $J = J_{\mathrm{out}} + \rho J_{\mathrm{in}}$ in the $(J_{\mathrm{in}}, J_{\mathrm{out}})$ plane, with three policies $\pi_1$, $\pi_2$, $\pi_3$ marked relative to the optimal trade-off curve.]

- $\pi_3$ is worse than $\pi_2$ on both $J_{\mathrm{out}}$ and $J_{\mathrm{in}}$
- $\pi_1$ is better than $\pi_2$ in $J_{\mathrm{in}}$ but worse in $J_{\mathrm{out}}$

When $\rho$ is much larger than 1, the most effective way to decrease $J$ is to employ a small control input, at the expense of a large output. When $\rho$ is much smaller than 1, the most effective way to decrease $J$ is to obtain a very small output, even if this is achieved at the expense of employing a large control input.

LQR via least-squares

LQR can be formulated (and solved) as a least-squares problem.

$X = [x_0, \cdots, x_N]$ is a linear function of $x_0$ and $U = [u_0, \cdots, u_{N-1}]$:

$$\begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix} = \begin{bmatrix} 0 & & & 0 \\ B & 0 & & \\ AB & B & \ddots & \\ \vdots & & \ddots & 0 \\ A^{N-1}B & A^{N-2}B & \cdots & B \end{bmatrix} \begin{bmatrix} u_0 \\ \vdots \\ u_{N-1} \end{bmatrix} + \begin{bmatrix} I \\ A \\ A^2 \\ \vdots \\ A^N \end{bmatrix} x_0$$

Express this as $X = GU + Hx_0$, where $G \in \mathbb{R}^{(N+1)n \times Nm}$ and $H \in \mathbb{R}^{(N+1)n \times n}$.

LQR via least-squares

Express the linear-quadratic cost as

$$J(U) = \left\| \operatorname{diag}(Q^{1/2}, \cdots, Q^{1/2}, P^{1/2})(GU + Hx_0) \right\|^2 + \left\| \operatorname{diag}(R^{1/2}, \cdots, R^{1/2})\, U \right\|^2$$

This is just a least-squares problem.

- This solution method requires solving a least-squares problem of size $N(n+m) \times Nm$ (see Boyd)
- Using a naive method (e.g., QR factorization), the cost is $O(N^3 n m^2)$ (see Boyd)
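To make the formulation concrete, here is a minimal NumPy sketch that builds $G$ and $H$ as above, stacks the weighted blocks, and minimizes $J(U)$ with an off-the-shelf least-squares solver. The helper names (build_GH, sqrtm_psd, solve_lqr_ls) are illustrative, not from the slides:

```python
import numpy as np

def build_GH(A, B, N):
    """Build G, H such that X = [x_0; ...; x_N] = G U + H x_0."""
    n, m = B.shape
    G = np.zeros(((N + 1) * n, N * m))
    H = np.zeros(((N + 1) * n, n))
    H[:n] = np.eye(n)
    for i in range(1, N + 1):
        H[i*n:(i+1)*n] = A @ H[(i-1)*n:i*n]    # block row i of H is A^i
        for j in range(i):                     # block (i, j) of G is A^(i-1-j) B
            G[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - 1 - j) @ B
    return G, H

def sqrtm_psd(M):
    """Symmetric square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

def solve_lqr_ls(A, B, Q, R, P, x0, N):
    """Minimize ||W1 (G U + H x0)||^2 + ||W2 U||^2 over U via least squares."""
    n, m = B.shape
    G, H = build_GH(A, B, N)
    W1 = np.kron(np.eye(N + 1), sqrtm_psd(Q))  # diag(Q^1/2, ..., Q^1/2, P^1/2)
    W1[-n:, -n:] = sqrtm_psd(P)
    W2 = np.kron(np.eye(N), sqrtm_psd(R))      # diag(R^1/2, ..., R^1/2)
    # J(U) = || [W1 G; W2] U - [-W1 H x0; 0] ||^2
    M = np.vstack([W1 @ G, W2])
    b = np.concatenate([-W1 @ (H @ x0), np.zeros(N * m)])
    U, *_ = np.linalg.lstsq(M, b, rcond=None)
    return U.reshape(N, m)                     # row k is u_k
```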

Finite-horizon LQR control

Theorem
Consider the finite-horizon LQR problem. The optimal control

$$u_k = -(R + B^\top S_{k+1} B)^{-1} B^\top S_{k+1} A x_k$$

is a linear function of the state $x_k$. The matrix $S_k$ evolves according to the backward Riccati recursion

$$S_k = A^\top S_{k+1} A + Q - A^\top S_{k+1} B (B^\top S_{k+1} B + R)^{-1} B^\top S_{k+1} A$$

with the terminal condition $S_N = P$. Furthermore, the optimal loss is given by

$$J_N = x_0^\top S_0 x_0.$$
Dynamic programming solution

- gives an efficient, recursive method to solve the LQR least-squares problem; the cost is $O(Nn^3)$
- but a less naive approach to solving the LQR least-squares problem will have the same complexity
- the same idea can be applied to many other problems

Completing the squares
Consider the loss function of the form

$$J(x, u) = \begin{bmatrix} x \\ u \end{bmatrix}^\top \begin{bmatrix} Q_x & Q_{xu} \\ Q_{xu}^\top & Q_u \end{bmatrix} \begin{bmatrix} x \\ u \end{bmatrix}, \tag{3}$$

and minimize it with respect to $u$. Then, there exists a matrix $L$ satisfying

$$Q_u L = Q_{xu}^\top$$

such that the loss function (3) becomes

$$J(x, u) = x^\top (Q_x - L^\top Q_u L) x + (u + Lx)^\top Q_u (u + Lx). \tag{4}$$

Since (4) is quadratic in $u$ and both terms are greater than or equal to zero, (3) is minimized for $u = -Lx$. The minimum is

$$J_{\min} = x^\top (Q_x - L^\top Q_u L) x.$$
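The identity (4) is easy to verify numerically. A quick sanity check in NumPy (a sketch; the random test matrices are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
M = rng.standard_normal((n + m, n + m))
W = M @ M.T + 0.1 * np.eye(n + m)      # random positive definite block matrix
Qx, Qxu, Qu = W[:n, :n], W[:n, n:], W[n:, n:]

L = np.linalg.solve(Qu, Qxu.T)         # solve Qu L = Qxu^T
x = rng.standard_normal(n)
u = rng.standard_normal(m)

J3 = x @ Qx @ x + 2 * x @ Qxu @ u + u @ Qu @ u                     # loss (3)
J4 = x @ (Qx - L.T @ Qu @ L) @ x + (u + L @ x) @ Qu @ (u + L @ x)  # form (4)
assert np.isclose(J3, J4)              # the two forms agree
```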

Proof: Via dynamic programming

Define the value function $V_k : \mathbb{R}^n \to \mathbb{R}$ by

$$V_k = \min_{u_k, \cdots, u_{N-1}} \left\{ \sum_{i=k}^{N-1} \left( x_i^\top Q x_i + u_i^\top R u_i \right) + x_N^\top P x_N \right\}$$

subject to $x_{i+1} = A x_i + B u_i$ for $i \in \{k, \cdots, N-1\}$.

$V_k$ can be interpreted as the loss-to-go from $k$ to $N$ and is a function of $x_k$ at time $k$.

We will find that

- $V_k$ is quadratic, i.e., $V_k(x) = x^\top S_k x$, where $S_k \in \mathbb{S}_{\geq 0}$
- $S_k$ can be found recursively, working from $k = N$
- the LQR optimal $u$ is easily expressed in terms of $S_k$

Proof: Via dynamic programming

The loss-to-go with no time left is just the final state cost:

$$V_N = x_N^\top S_N x_N$$

where $S_N = P$. Now, we will show that $V_k$ is quadratic in $x_k$ for all $k$.

For $k = N - 1$, we have

$$V_{N-1} = \min_{u_{N-1}} \left\{ x_{N-1}^\top Q x_{N-1} + u_{N-1}^\top R u_{N-1} + V_N \right\}$$

Using $x_N = A x_{N-1} + B u_{N-1}$ gives

$$V_{N-1} = \min_{u_{N-1}} \left\{ x_{N-1}^\top Q x_{N-1} + u_{N-1}^\top R u_{N-1} + (A x_{N-1} + B u_{N-1})^\top S_N (A x_{N-1} + B u_{N-1}) \right\}$$

Proof: Via dynamic programming
Then, we get

$$V_{N-1} = \min_{u_{N-1}} \begin{bmatrix} x_{N-1} \\ u_{N-1} \end{bmatrix}^\top \begin{bmatrix} Q + A^\top S_N A & A^\top S_N B \\ B^\top S_N A & R + B^\top S_N B \end{bmatrix} \begin{bmatrix} x_{N-1} \\ u_{N-1} \end{bmatrix}$$

This function is quadratic in $u_{N-1}$. Completing the squares, we can write

$$V_{N-1} = \min_{u_{N-1}} \Big\{ x_{N-1}^\top \big( A^\top S_N A + Q - L_{N-1}^\top (R + B^\top S_N B) L_{N-1} \big) x_{N-1} + (u_{N-1} + L_{N-1} x_{N-1})^\top (R + B^\top S_N B)(u_{N-1} + L_{N-1} x_{N-1}) \Big\} \tag{5}$$

where $L_{N-1} = (R + B^\top S_N B)^{-1} B^\top S_N A$. To minimize (5), we need to select

$$u_{N-1} = -L_{N-1} x_{N-1}$$

which gives the minimum loss

$$V_{N-1} = x_{N-1}^\top S_{N-1} x_{N-1}, \qquad \text{where } S_{N-1} = A^\top S_N A + Q - L_{N-1}^\top (R + B^\top S_N B) L_{N-1}.$$
DP algorithm for LQR

1. Set $S_N := P$
2. For $k = N, \cdots, 1$, compute
$$S_{k-1} := A^\top S_k A + Q - A^\top S_k B (R + B^\top S_k B)^{-1} B^\top S_k A$$
3. For $k = 0, \cdots, N-1$, compute
$$L_k := (R + B^\top S_{k+1} B)^{-1} B^\top S_{k+1} A$$
4. For $k = 0, \cdots, N-1$, compute
$$u_k := -L_k x_k$$
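A minimal NumPy sketch of this algorithm (the function name lqr_dp is illustrative, not from the slides):

```python
import numpy as np

def lqr_dp(A, B, Q, R, P, N):
    """Backward Riccati recursion; returns the gains L_0, ..., L_{N-1}."""
    S = P                                  # step 1: S_N := P
    L = [None] * N
    for k in range(N - 1, -1, -1):         # steps 2-3, backward in time
        L[k] = np.linalg.solve(R + B.T @ S @ B, B.T @ S @ A)  # gain L_k
        S = A.T @ S @ A + Q - A.T @ S @ B @ L[k]              # S_k from S_{k+1}
    return L
```

Step 4 is then the state feedback $u_k = -L_k x_k$ applied online; the recursion costs $O(Nn^3)$, matching the complexity quoted for the dynamic programming solution above.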

LQ-control of the double integrator

Consider a double integrator system (with unity sampling interval):

$$x_{k+1} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} x_k + \begin{bmatrix} \tfrac{1}{2} \\ 1 \end{bmatrix} u_k, \qquad y_k = \begin{bmatrix} 1 & 0 \end{bmatrix} x_k$$

with initial state $x_0 = [1 \;\; 0]^\top$, horizon $N = 20$, and weighting matrices

$$Q = P = C^\top C, \qquad R = \rho$$

where $\rho > 0$.
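The simulations on the next two slides can be reproduced along these lines (a sketch, reusing the lqr_dp helper from the DP sketch above; the loop structure is illustrative):

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
C = np.array([[1.0, 0.0]])
Q = P = C.T @ C
N, x0 = 20, np.array([1.0, 0.0])

for rho in (0.3, 10.0):
    L = lqr_dp(A, B, Q, rho * np.eye(1), P, N)  # gains via backward recursion
    x, ys, us = x0, [], []
    for k in range(N):
        u = -L[k] @ x                # u_k = -L_k x_k
        ys.append((C @ x).item())
        us.append(u.item())
        x = A @ x + B @ u            # x_{k+1} = A x_k + B u_k
    J_out = sum(y**2 for y in ys) + (C @ x).item()**2  # includes the k = N term
    J_in = sum(u**2 for u in us)
    print(f"rho = {rho}: J_out = {J_out:.3f}, J_in = {J_in:.3f}")
```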

LQ-control of the double integrator

Optimal trade-off curve of $J_{\mathrm{out}}$ and $J_{\mathrm{in}}$:

[Figure: optimal trade-off curve of $J_{\mathrm{out}}$ versus $J_{\mathrm{in}}$. The blue circle denotes $\rho = 0.3$, while the red circle denotes $\rho = 10$.]

LQ-control of the double integrator

[Figure: output $y_k$ (top) and control input $u_k$ (bottom) versus the number of samples $k$, for $\rho = 0.3$ and $\rho = 10$, over $k = 0, \ldots, 20$.]

References

1. S. Boyd, "Linear quadratic regulator: Discrete-time finite horizon," Lecture Notes.
2. K. J. Åström and B. Wittenmark, "Computer-Controlled Systems," Prentice Hall, 2002.
3. D. P. Bertsekas, "Dynamic Programming and Optimal Control," Athena Scientific, 2000.
