Robust and Optimal Control of Linear Systems
Chapter 2
Optimal Control and the Linear Quadratic Regulator
ABSTRACT In this chapter we introduce optimal control theory and the linear quadratic regulator. In the introduction we briefly discuss and compare classical control, modern control, and optimal control, and why optimal control designs have emerged as a popular design method in aerospace control problems. We then begin by introducing optimal control problems and the resulting Hamilton-Jacobi partial differential equation. Then, for linear systems with a quadratic performance index, we develop the linear quadratic regulator. We will cover both finite time and infinite time problems, and will explore some very important stability and robustness properties of these systems.
2.1 Introduction
Control systems must provide stability and performance in the presence of model uncertainty and neglected dynamics. This has proven to be a significant challenge, and as our understanding of dynamics and control has improved, aerospace has been able to develop new aircraft designs that are faster, have greater performance, and perform robustly in very large flight envelopes. These advancements built upon the foundation created by classical methods, but were powered by computer-aided design tools which greatly expanded the engineer's ability to solve larger, more complex problems using advanced techniques.
In general, designing flight control systems using conventional (classical) analytical methods involves iterative single-loop design analyses that are costly in time and manpower. These systems were often designed by discretizing the flight envelope at specific points, designing the control system at these points, and guaranteeing robustness to parameter variations by designing large single-loop stability margins and evaluating the design through simulation. These methods worked well on aircraft that were open-loop stable, but as new designs emerged that were open-loop unstable in multiple axes, multi-input multi-output (MIMO) design methods were needed.
In the 1970s and 80s, the question of robust stability and performance was raised, and new control system design and analysis methods emerged, called modern control. These advancements provided the theoretical mathematics required for optimizing controller designs for MIMO systems and evaluating stability and robustness to parameter uncertainties. Using methods for characterizing model uncertainties, controller robustness properties were evaluated, and iterative design tools emerged to achieve robust stability and performance. These modern methods allowed the control system designer to understand and directly address stability and robustness concerns for open-loop unstable MIMO systems. With computer-aided design tools, engineers could readily pose and solve “optimal control” problems for complex systems, and implement the control across a large flight envelope using gain scheduling.
Optimal control problems arise in designing a control to minimize a performance index. There are many classes of problems for nonlinear or linear systems, dealing with time-variant or time-invariant dynamics, over a fixed time interval or infinite time, and with different types of performance indices. Optimal control problems are in general very difficult to solve, except for linear systems with a quadratic performance index. These problems are well understood and produce control laws that have very interesting properties.
One of the key challenges in using optimal control theory is transforming frequency-domain performance and stability requirements from classical control into time-domain requirements. A multivariable optimal controller design using a quadratic performance index optimizes the design in the time domain. Satisfying frequency-domain requirements such as bandwidth, noise sensitivity, etc., using the optimization performance index, is a challenge. Similarly, quantifying the degree of robustness required to overcome parameter uncertainties is not well posed in the problem setup.
The key to using optimal control theory is to develop a method to tune the design parameters to achieve the desired performance, stability, and robustness in the control system. This is the goal for this chapter and the next. This chapter introduces optimal control theory, the linear quadratic regulator, and the all-important matrix Riccati equation. We will discuss in detail some of the excellent properties that optimal controllers produce, which make them a favorite in many aerospace control problems. Chapter 3 takes the optimal control principles and the regulator framework and extends them to command following design problems. It is this command following challenge that is most common in aerospace flight control systems.
2.2 Optimal Control and the Hamilton-Jacobi Equation
The derivation of the Hamilton-Jacobi partial differential equation for optimal control problems will allow us to understand how optimal control regulator problems are posed, and how we can form an optimal control from a performance index minimization problem. Optimal control problems are in general very difficult to solve. There are many books available on the subject. Athans and Falb [1], Kwakernaak and Sivan [2], and Anderson and Moore [3] are three excellent textbooks that deal with necessary and sufficiency conditions, differentiability and continuity assumptions, problem setup, derivations, and solutions for most problems that can be solved analytically. We will begin by deriving the Hamilton-Jacobi partial differential equation in a general setting, and will then focus on linear systems with quadratic performance indices.
Hamilton-Jacobi Approach
Consider the following first-order system model and performance index

\dot{x} = f(x, u, t), \quad x(t_0) = x_0

J = \int_{t_0}^{T} L(x, u, \tau)\, d\tau + S(x(T))    (2.1)
The challenge is to find a control u^* that minimizes J over the time interval [t_0, T]. We call u^* the optimal control. When used in (2.1) it produces the optimal state x^*. Assume the minimum when using u^* is J^*.
J^* = \int_{t_0}^{T} L(x^*, u^*, \tau)\, d\tau + S(x^*(T)) = \min_{u_{[t_0,T]}} J    (2.2)
We see that the performance index J^* is a function of the control u_{[t_0,T]}, the initial state, and the initial time:

J = J\left( u_{[t_0,T]}, x_0, t_0 \right)    (2.3)
J^*(x_0, t_0) = \min_{u_{[t_0,T]}} \left[ \int_{t_0}^{T} L(x, u, \tau)\, d\tau + S(x(T)) \right]    (2.4)
Next, consider an arbitrary initial condition at time t:

J^*(x, t) = \min_{u_{[t,T]}} \left[ \int_{t}^{T} L(x, u, \tau)\, d\tau + S(x(T)) \right]    (2.5)
Now, break (2.5) into two integrals, over [t, t_1] and [t_1, T]:

J^*(x, t) = \min_{u_{[t,T]}} \left[ \int_{t}^{t_1} L(x, u, \tau)\, d\tau + \int_{t_1}^{T} L(x, u, \tau)\, d\tau + S(x(T)) \right]    (2.6)
We can explicitly write the minimization over the two intervals as

J^*(x, t) = \min_{u_{[t,t_1]}} \min_{u_{[t_1,T]}} \left[ \int_{t}^{t_1} L(x, u, \tau)\, d\tau + \int_{t_1}^{T} L(x, u, \tau)\, d\tau + S(x(T)) \right]    (2.7)
The idea is to break the integral into time slices, and at each slice choose the optimal control that minimizes J. This idea leads us to what is referred to as the Principle of Optimality. The Principle of Optimality tells us that if we use the optimal control in each time slice, then the system will be optimal over the whole interval. We can move the min operation inside the bracket as follows:

J^*(x, t) = \min_{u_{[t,t_1]}} \left[ \int_{t}^{t_1} L(x, u, \tau)\, d\tau + \min_{u_{[t_1,T]}} \left[ \int_{t_1}^{T} L(x, u, \tau)\, d\tau + S(x(T)) \right] \right]    (2.8)
We see that inside the bracket, the second term is itself a J^* that starts at time t_1. We can denote this as J^*(x_1, t_1):

J^*(x, t) = \min_{u_{[t,t_1]}} \left[ \int_{t}^{t_1} L(x, u, \tau)\, d\tau + \min_{u_{[t_1,T]}} \left[ \int_{t_1}^{T} L(x, u, \tau)\, d\tau + S(x(T)) \right] \right],
where the inner bracketed term is J^*(x_1, t_1)    (2.9)
J^*(x, t) = \min_{u_{[t,t_1]}} \left[ \int_{t}^{t_1} L(x, u, \tau)\, d\tau + J^*(x_1, t_1) \right]    (2.10)
Now, let t_1 = t + \Delta t, and substitute into (2.10):

J^*(x, t) = \min_{u_{[t,t+\Delta t]}} \left[ \int_{t}^{t+\Delta t} L(x, u, \tau)\, d\tau + J^*\left( x(t+\Delta t), t+\Delta t \right) \right]    (2.11)
Next, expand (2.11) in a Taylor series to obtain

J^*(x, t) = \min_{u_{[t,t+\Delta t]}} \left[ L(x, u, t)\,\Delta t + J^*(x, t) + \frac{\partial J^*}{\partial x}\,\Delta x + \frac{\partial J^*}{\partial t}\,\Delta t + \mathrm{HOT} \right]    (2.12)
We can cancel J^*(x, t) on each side, since it does not depend on u_{[t,t+\Delta t]}, and divide by \Delta t to obtain

0 = \min_{u_{[t,t+\Delta t]}} \left[ L(x, u, t) + \frac{\partial J^*}{\partial x}\,\frac{\Delta x}{\Delta t} + \frac{\partial J^*}{\partial t} + \mathrm{HOT} \right]    (2.13)
Now let \Delta t \to 0:

-\frac{\partial J^*}{\partial t} = \min_{u} \left[ L(x, u, t) + \frac{\partial J^*}{\partial x}\,\dot{x} \right]    (2.14)
subject to

\dot{x} = f(x, u, t), \quad x(t_0) = x_0
Define the Hamiltonian

H\left( x, u, \frac{\partial J^*}{\partial x}, t \right) = L(x, u, t) + \frac{\partial J^*}{\partial x}\, f(x, u, t)    (2.15)
To minimize H with respect to the control u, we take the derivative and equate it to zero:

\nabla_u H = 0    (2.16)
This formulation allows the functional minimization problem to be transformed into a function minimization, which can be solved using ordinary calculus. If we solve for the optimal control, u = u^*, and substitute back into (2.14), we get a PDE for J^*. Let

H^*\left( x, \frac{\partial J^*}{\partial x}, t \right) \triangleq \min_{u} H\left( x, u, \frac{\partial J^*}{\partial x}, t \right)    (2.17)
then

-\frac{\partial J^*}{\partial t} = H^*\left( x, \frac{\partial J^*}{\partial x}, t \right)    (2.18)

which is the Hamilton-Jacobi differential equation.
In most optimal control problems one does not really care about J^*, but is interested in the optimal control u^* applied to the system dynamics. Solving (2.18) is still quite difficult, even for low-order problems, in that we still must solve a PDE for J^*. As derived here, u^* is an open-loop optimal control. We really want a feedback control for robustness and sensitivity minimization. We will see that if the dynamics are linear and the performance index penalty function L(x, u, t) is quadratic, then the problem is easily solved, and the resulting feedback control and closed-loop system have very interesting properties. We exploit these properties in our use of optimal control in aerospace applications to maximize performance and robustness while minimizing the control effort.
Summary

Dynamics: \dot{x} = f(x, u, t), \quad x(t_0) = x_0
Performance index: J = \int_{t_0}^{T} L(x, u, \tau)\, d\tau + S(x(T))
Optimal cost: J^*(x, t) = \min_{u_{[t,T]}} J, \quad H = L(x, u, t) + \frac{\partial J^*}{\partial x}\, f(x, u, t)
Optimal control: \nabla_u H = 0 \Rightarrow u^*; plug u^* into H \Rightarrow H^*
HJ Eqn: -\frac{\partial J^*}{\partial t} = H^*\left( x, \frac{\partial J^*}{\partial x}, t \right), \quad BC: J^*(x(T), T) = S(x(T))
Example 2.1

In this example we will set up, but not solve, the Hamilton-Jacobi Equation. Consider the system

\dot{x}_1 = x_2
\dot{x}_2 = -2x_1 - 3x_2 + u    (2.19)

where x(0) = [1 \;\; 2]^T, with performance index

J = \int_0^1 \left( x_1^4 + u^2 \right) dt + x_1^2(1) + x_2^2(1)    (2.20)

For this problem L(x, u, t) = x_1^4 + u^2 and S(x(T)) = x_1^2(1) + x_2^2(1) = x^T(1)\, x(1).
The Hamiltonian is

H\left( x, u, \nabla_x J^*, t \right) = L(x, u, t) + \frac{\partial J^*}{\partial x}\, f(x, u)
 = x_1^4 + u^2 + \frac{\partial J^*}{\partial x_1}\,\dot{x}_1 + \frac{\partial J^*}{\partial x_2}\,\dot{x}_2
 = x_1^4 + u^2 + \frac{\partial J^*}{\partial x_1}\, x_2 + \frac{\partial J^*}{\partial x_2} \left( -2x_1 - 3x_2 + u \right)    (2.21)
Now, minimize (2.21) with respect to the control by differentiating and equating to zero. Thus,

\nabla_u H = 0 = 2u^* + \frac{\partial J^*}{\partial x_2}
u^* = -\frac{1}{2} \frac{\partial J^*}{\partial x_2}    (2.22)
Substituting this back into (2.21) yields

H^*\left( x, \nabla_x J^*, t \right) = x_1^4 - \frac{1}{4} \left( \frac{\partial J^*}{\partial x_2} \right)^2 + \frac{\partial J^*}{\partial x_1}\, x_2 - \frac{\partial J^*}{\partial x_2} \left( 2x_1 + 3x_2 \right)    (2.23)
The Hamilton-Jacobi Equation is then

-\frac{\partial J^*}{\partial t} = H^*\left( x, \nabla_x J^*, t \right) = x_1^4 - \frac{1}{4} \left( \frac{\partial J^*}{\partial x_2} \right)^2 + \frac{\partial J^*}{\partial x_1}\, x_2 - \frac{\partial J^*}{\partial x_2} \left( 2x_1 + 3x_2 \right)    (2.24)

with boundary condition J^*(x, T) = x_1^2(T) + x_2^2(T).
2.3 Linear Quadratic Regulator

The Linear Quadratic Regulator (LQR) is one of the most widely used control design methods in aerospace. Trade studies have been performed comparing properties of controllers (performance, robustness, control usage) in many different applications. We have found that flight control systems designed using LQRs have excellent performance and robustness, and minimize the control usage. This method is easily extended (in the next chapter) to produce command tracking controllers typical of flight control systems.
Consider the linear system

\dot{x} = A(t)x + B(t)u, \quad x(t_0) = x_0, \quad x \in R^{n_x},\; u \in R^{n_u}    (2.25)

with performance index

J = \int_{t_0}^{T} \left( x^T Q x + u^T R u \right) d\tau + x^T(T)\, Q_T\, x(T)    (2.26)

where

Q = Q^T \ge 0, \quad R = R^T > 0, \quad Q_T = Q_T^T \ge 0    (2.27)

The matrices Q and R can be time varying if needed. To have a well-posed problem we want the pair (A, B) to be controllable and the pair (A, Q^{1/2}) observable. The weaker conditions of (A, B) stabilizable and (A, Q^{1/2}) detectable are also acceptable. The need for controllability of the system's dynamics should be obvious: the control cannot stabilize the system and perform as desired if the dynamics are not controllable. Detectability of modes through the performance index guarantees that the modes are penalized, producing a control that will minimize their contribution to the index. We will see that the numerical choice of the matrices Q and R is very important in achieving performance and robustness in the closed-loop system.
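These well-posedness conditions can be checked numerically before any Riccati equation is solved. A minimal sketch (the matrices here are illustrative placeholders, not from the text), using plain controllability/observability rank tests:

```python
# Rank tests for LQR well-posedness: (A, B) controllable, (A, Q^{1/2}) observable.
# The matrices below are illustrative examples, not from the text.
import numpy as np

A = np.array([[0.0, 1.0],
              [2.0, -1.0]])          # open-loop unstable (eigenvalues 1, -2)
B = np.array([[0.0],
              [1.0]])
Q = np.diag([1.0, 0.0])

n = A.shape[0]
# Controllability matrix [B, AB, ..., A^{n-1}B]
ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
# Observability of (A, Q^{1/2}); for diagonal Q a square-root factor is simple
C = np.sqrt(np.diag(np.diag(Q)))
obsv = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

assert np.linalg.matrix_rank(ctrb) == n     # all modes controllable
assert np.linalg.matrix_rank(obsv) == n     # all modes seen by the penalty
```

For stabilizability/detectability (the weaker conditions), only the unstable modes need to pass these tests, which can be checked mode by mode via an eigenvector (PBH-style) test.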
Following (2.15), the LQR Hamiltonian is

H = x^T Q x + u^T R u + \frac{\partial J^*}{\partial x} \left( A(t)x + B(t)u \right)    (2.28)

Taking the derivative with respect to u and equating to zero produces

\frac{\partial H}{\partial u} = 2Ru + B^T \left( \frac{\partial J^*}{\partial x} \right)^T = 0    (2.29)
so the optimal control is

u^* = -\frac{1}{2} R^{-1} B^T \left( \frac{\partial J^*}{\partial x} \right)^T    (2.30)
Substituting u^* back into (2.18) yields the Hamilton-Jacobi equation

-\frac{\partial J^*}{\partial t} = x^T Q x + \frac{1}{4} \frac{\partial J^*}{\partial x} B R^{-1} B^T \left( \frac{\partial J^*}{\partial x} \right)^T + \frac{\partial J^*}{\partial x} A x - \frac{1}{2} \frac{\partial J^*}{\partial x} B R^{-1} B^T \left( \frac{\partial J^*}{\partial x} \right)^T    (2.31)
which, in this form, is still quite difficult to solve. Now, if we try a quadratic function for J^*, i.e.

J^* = x^T P x    (2.32)

where P = P^T > 0, we get

\frac{\partial J^*}{\partial t} = x^T \dot{P} x, \quad \frac{\partial J^*}{\partial x} = 2x^T P    (2.33)
Substituting (2.33) back into (2.31), and factoring out x on both sides, we have

x^T \left[ -\dot{P} - PA - A^T P - Q + P B R^{-1} B^T P \right] x = 0    (2.34)

with boundary condition P(T) = Q_T. Since this must be satisfied for any x, we have

-\dot{P} = PA + A^T P + Q - P B R^{-1} B^T P    (2.35)
which is called the Riccati equation. Substituting (2.33) into (2.30) yields

u^* = -\frac{1}{2} R^{-1} B^T \left( \frac{\partial J^*}{\partial x} \right)^T = -R^{-1} B^T P\, x = -K x, \quad K = R^{-1} B^T P    (2.36)

which is a state feedback optimal control. In this formulation, the Riccati equation is integrated backward in time, the gains are formed using (2.36) and stored, and the feedback control law is implemented by looking up the gains. This is called gain scheduling. We substitute the optimal control to obtain the closed-loop system, written as

\dot{x} = A(t)x + B(t)u, \quad x(t_0) = x_0
u(t) = -K(t)\, x
\dot{x} = \left( A(t) - B(t)K(t) \right) x    (2.37)
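The backward integration of (2.35) can be sketched for a scalar plant; the numbers below are hypothetical, chosen only for illustration:

```python
# Integrate the Riccati equation (2.35) backward from P(T) = Q_T, then form
# the time-varying gain K(t) = R^{-1} B' P(t). Scalar data is illustrative.
import numpy as np
from scipy.integrate import solve_ivp

a, b, q, r, qT, T = -1.0, 1.0, 1.0, 1.0, 0.0, 5.0

def riccati_rhs(t, p):
    # (2.35): -Pdot = PA + A'P + Q - P B R^{-1} B' P  (scalar form)
    return [-(2.0 * a * p[0] + q - (b * p[0]) ** 2 / r)]

# solve_ivp accepts a decreasing t_span, so integrate from t = T down to t = 0
sol = solve_ivp(riccati_rhs, (T, 0.0), [qT], rtol=1e-10, atol=1e-12)
p0 = sol.y[0, -1]
K0 = b * p0 / r        # gain at t = 0; in practice K(t) is stored vs. time

# over a long horizon P(0) approaches the positive root of 2ap + q - p^2/r = 0,
# here sqrt(2) - 1
assert abs(p0 - (np.sqrt(2.0) - 1.0)) < 1e-4
```

The convergence of P(t) toward a constant as the horizon grows is exactly the transition to the infinite-time problem summarized next.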
Summary

Dynamics: \dot{x} = A(t)x + B(t)u, \quad x(0) = x_0
Performance index: J = \int_0^{T} \left( x^T Q x + u^T R u \right) d\tau + x^T(T)\, Q_T\, x(T)
Riccati Equation: -\dot{P} = PA + A^T P + Q - P B R^{-1} B^T P, \quad P(T) = Q_T
Optimal Control: u^* = -R^{-1} B^T P(t)\, x = -K(t)\, x
Closed Loop System: \dot{x} = \left( A(t) - B(t)K(t) \right) x, \quad x(0) = x_0
Infinite Time LQR

Now consider the performance index

J = \int_0^{\infty} \left( x^T Q x + u^T R u \right) d\tau, \quad Q = Q^T \ge 0,\; R = R^T > 0    (2.38)

where the final time T = \infty. The state dynamics for this problem are linear time invariant, written as

\dot{x} = Ax + Bu, \quad A, B \text{ constant}, \quad x \in R^{n_x},\; u \in R^{n_u}    (2.39)

with (A, B) controllable and (A, Q^{1/2}) observable. The Riccati equation becomes algebraic, written as

PA + A^T P + Q - P B R^{-1} B^T P = 0    (2.40)

called the Algebraic Riccati Equation (ARE), with the optimal control

u^* = -R^{-1} B^T P x = -Kx    (2.41)

where K is a constant gain matrix. In applications where the state dynamics are linearized at operating conditions, and the control is designed at each design point, the constant gain matrix K can be stored in a table and looked up for implementation. Substituting the control into the open-loop dynamics yields

\dot{x} = Ax + Bu
u = -Kx
\dot{x} = (A - BK)x = A_{cl}\, x    (2.42)
This formulation guarantees that the closed-loop system, whose dynamics are described by A_{cl}, is stable. This means the eigenvalues of A_{cl} lie in the left-half plane, \mathrm{Re}\left( \lambda(A_{cl}) \right) < 0. The state is regulated to zero, x \to 0 as t \to \infty, which yields u \to 0 as t \to \infty.
It is often desirable when simulating the dynamics to compute the control and its rate, \dot{u}, and examine the peak values. If we differentiate the control, we have

u = -Kx
\dot{u} = -K\dot{x} = -K(A - BK)x = -K A_{cl}\, x    (2.43)
We can form a closed-loop simulation model, with outputs x, u, and \dot{u}, for this closed-loop LQR system as

\dot{x} = Ax + Bu, \quad u = -Kx
\dot{x} = (A - BK)x = A_{cl}\, x
y = \begin{bmatrix} x \\ u \\ \dot{u} \end{bmatrix} = \begin{bmatrix} I \\ -K \\ -K A_{cl} \end{bmatrix} x    (2.44)
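A minimal simulation sketch of this model, assuming an illustrative double-integrator plant with identity weights (these matrices are assumptions for the example, not from the text):

```python
# Closed-loop LQR simulation with outputs x, u, and udot, following (2.44)
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # illustrative double integrator
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)           # K = R^{-1} B' P
Acl = A - B @ K

sol = solve_ivp(lambda t, x: Acl @ x, (0.0, 10.0), [1.0, 0.0], max_step=0.01)
x = sol.y
u = -K @ x                                # control (2.41)
udot = -K @ Acl @ x                       # control rate (2.43)

peak_u, peak_udot = np.abs(u).max(), np.abs(udot).max()
assert np.abs(x[:, -1]).max() < 1e-2      # state regulated toward zero
```

Scanning peak_u and peak_udot against actuator position and rate limits is exactly the saturation check discussed next.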
In flight control systems it is critical not to saturate the control surfaces in position or rate. When this happens, nonlinear effects dominate the response, stability is not guaranteed, and the aircraft could depart. We can see from (2.44) that large gains K will cause large control u and control rate \dot{u}. Thus, high gains are undesirable in flight control systems. From (2.41) we see that K gets large as P gets large. From (2.40) we see that it is the choice of Q and R in the ARE that determines how large the gains will be.
Summary

Dynamics: \dot{x} = Ax + Bu, \quad x(0) = x_0
Performance index: J = \int_0^{\infty} \left( x^T Q x + u^T R u \right) d\tau
Algebraic Riccati Equation: PA + A^T P + Q - P B R^{-1} B^T P = 0
Optimal Control: u^* = -R^{-1} B^T P x = -Kx
Closed Loop System: \dot{x} = (A - BK)x, \quad x(0) = x_0
Simulation output: y = \begin{bmatrix} x \\ u \\ \dot{u} \end{bmatrix} = \begin{bmatrix} I \\ -K \\ -K A_{cl} \end{bmatrix} x
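The summary above maps directly onto standard numerical tools. A sketch using SciPy's ARE solver on an illustrative open-loop unstable second-order plant (the matrices are assumptions for the example, not from the text):

```python
# Solve the Algebraic Riccati Equation (2.40) and form K = R^{-1} B' P
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [2.0, -1.0]])   # eigenvalues 1 and -2: unstable
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

# verify the ARE residual (2.40) and closed-loop stability (2.42)
residual = P @ A + A.T @ P + Q - P @ B @ np.linalg.solve(R, B.T @ P)
assert np.allclose(residual, 0.0, atol=1e-8)
assert np.all(np.linalg.eigvals(A - B @ K).real < 0)
```

Note that the solver succeeds even though the plant is unstable; stabilizability, not open-loop stability, is what the problem requires.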
Example 2.2

In this example we wish to solve for the optimal control and examine the properties of the closed-loop system. Consider the following linear time invariant model

\dot{x} = Ax + Bu, \quad A = \begin{bmatrix} 0 & 1 \\ 0 & -1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}    (2.45)

where the performance index is

J = \int_0^{\infty} \left( x_1^2 + r u^2 \right) d\tau, \quad Q = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad R = r    (2.46)

The eigenvalues of the open-loop system are \lambda = 0 and \lambda = -1. In the performance index the state penalty matrix Q penalizes the first state of the system. The control penalty r is left as a parameter so we can see how small and large values of r change the closed-loop dynamics.
It is always important to check whether the design problem is well posed. Conditions on the plant and the performance index for a well-posed problem require checking that the unstable modes of the system are controllable, and that the unstable modes are observable through the state penalty matrix. This is equivalent to checking (A, B) stabilizable and (A, Q^{1/2}) detectable. First, compute the controllability matrix P_c:

P_c = \begin{bmatrix} B & AB \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & -1 \end{bmatrix}, \quad \mathrm{rank}(P_c) = 2    (2.47)
Since this matrix has full rank the system is controllable, so any unstable modes are controllable. Next, factor the state penalty matrix into square roots:

Q = Q^{1/2} \left( Q^{1/2} \right)^T = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}    (2.48)
Now check the observability using the square root of Q:

\begin{bmatrix} \left( Q^{1/2} \right)^T \\ \left( Q^{1/2} \right)^T A \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \mathrm{rank} = 2    (2.49)
Since this matrix also has full rank, all modes of the system are observable through the penalty matrix. Now solve the ARE parametrically for P using A, B, Q, and r. The ARE is

PA + A^T P + Q - P B R^{-1} B^T P = 0    (2.50)

Let P = \begin{bmatrix} l & m \\ m & n \end{bmatrix}. Then the ARE is

\begin{bmatrix} l & m \\ m & n \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & -1 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} l & m \\ m & n \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} - \frac{1}{r} \begin{bmatrix} l & m \\ m & n \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} l & m \\ m & n \end{bmatrix} = 0    (2.51)
Since the Riccati matrix is positive definite, symmetric, and real, (2.51) gives three equations for l, m, and n. These are:

1 - \frac{m^2}{r} = 0
l - m - \frac{mn}{r} = 0
2m - 2n - \frac{n^2}{r} = 0    (2.52)

The first equation gives m = \sqrt{r} (both positive and negative values of m must be checked to see which is the solution). Using m = \sqrt{r}, n and l are

n = r \left( \sqrt{1 + \tfrac{2}{\sqrt{r}}} - 1 \right)
l = \sqrt{r}\, \sqrt{1 + \tfrac{2}{\sqrt{r}}}    (2.53)
The constant state feedback gain matrix is given as

K = R^{-1} B^T P = \begin{bmatrix} \frac{1}{\sqrt{r}} & \sqrt{1 + \frac{2}{\sqrt{r}}} - 1 \end{bmatrix}    (2.54)

The closed-loop state dynamics are always stable, with characteristic equation

\Phi_{cl}(s) = s^2 + \sqrt{1 + \tfrac{2}{\sqrt{r}}}\, s + \tfrac{1}{\sqrt{r}}    (2.55)
By varying the control penalty r, (2.55) gives a root locus which indicates how the numerical choice of Q and R impacts the closed-loop system dynamics. Figure 2.1 illustrates this root locus, varying r from 0.001 to 100. For large values of r (small gains), the closed-loop poles are close to the open-loop poles (r = 100, K = [0.1 \;\; 0.0954]), producing a slow system response. For small values of r (large gains), the roots follow asymptotes into the left half plane and the response gets fast (r = 0.001, K = [31.6228 \;\; 7.0153]). In general, the size of the optimal feedback gains is proportional to the relative magnitude of Q and R. For fixed R, large values of Q heavily penalize the state (relative to the control), so the gains get large and the response gets fast. Small values of Q penalize the control more than the state, so less control is used. This keeps the gains small, producing a slower response.
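The analytic results of this example can be cross-checked numerically; the sketch below confirms the rank tests (2.47) and (2.49) and compares the closed-form gain (2.54) against a numerical ARE solution at the two values of r quoted above:

```python
# Numerical cross-check of Example 2.2
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, -1.0]])
B = np.array([[0.0], [1.0]])
Q = np.diag([1.0, 0.0])

# rank tests (2.47) and (2.49)
assert np.linalg.matrix_rank(np.hstack([B, A @ B])) == 2
Qhalf_T = np.array([[1.0, 0.0]])               # (Q^{1/2})^T from (2.48)
assert np.linalg.matrix_rank(np.vstack([Qhalf_T, Qhalf_T @ A])) == 2

# closed-form gain (2.54) vs. numerical ARE solve, for r = 100 and r = 0.001
for r in (100.0, 0.001):
    P = solve_continuous_are(A, B, Q, np.array([[r]]))
    K_num = (B.T @ P).ravel() / r              # K = R^{-1} B' P
    K_analytic = np.array([1.0 / np.sqrt(r),
                           np.sqrt(1.0 + 2.0 / np.sqrt(r)) - 1.0])
    assert np.allclose(K_num, K_analytic, rtol=1e-5)
```

For r = 100 this reproduces K ≈ [0.1, 0.0954], and for r = 0.001, K ≈ [31.62, 7.02], matching the values quoted with Figure 2.1.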
[Root locus plot in the complex plane: r varies from 0.001 to 100, with the open-loop eigenvalues \lambda = 0 and \lambda = -1 marked.]

Fig. 2.1. Example 2.2 root locus varying the LQR control penalty parameter.
Guaranteed Stability Margins From State Feedback LQRs

The LQR has excellent stability properties at the input to the plant. This can be shown by examining the return difference matrix in the frequency domain. For those not familiar with frequency domain analysis of multi-input multi-output linear time invariant systems, please see Chapter 5, and then return to this section. Consider the following LTI system
\dot{x} = Ax + Bu, \quad x \in R^{n_x},\; u \in R^{n_u}    (2.56)

along with the infinite-time LQR problem

J = \int_0^{\infty} \left( x^T Q x + u^T R u \right) dt    (2.57)

Q = Q^T \ge 0, \quad R = R^T > 0    (2.58)

with (A, B) stabilizable and (A, Q^{1/2}) detectable. The Algebraic Riccati Equation (ARE) is

PA + A^T P - P B R^{-1} B^T P + Q = 0    (2.59)

with optimal state feedback control given by

u = -R^{-1} B^T P x = -Kx    (2.60)
Substituting (2.60) into (2.56) yields the closed-loop system

\dot{x} = (A - BK)x = A_{cl}\, x    (2.61)

Of interest is what frequency domain properties we can derive for this system controlled by optimal state feedback.

First, introduce the loop transfer function L(s). For this state feedback system with the loop break point at the plant input, shown in Figure 2.2, L(s) is

L(s) = K (sI - A)^{-1} B    (2.62)

Also, denote \Phi(s) = (sI - A)^{-1} and \Phi^*(s) = \Phi^T(-s).

Begin with the Algebraic Riccati Equation (ARE). Add and subtract sP, and rearrange:

P(sI - A) + (-sI - A^T)P + P B R^{-1} B^T P = Q    (2.63)

Substitute \Phi = (sI - A)^{-1}, and multiply by B^T \Phi^* on the left and by \Phi B on the right.
[Block diagram: the loop is broken at the plant input; the plant (sI - A)^{-1} produces the state x, which is fed back through the gain K to form the control u.]

Fig. 2.2 Block diagram of the state feedback architecture.
B^T \Phi^* P B + B^T P \Phi B + B^T \Phi^* P B R^{-1} B^T P \Phi B = B^T \Phi^* Q \Phi B    (2.64)

Add R > 0 from the performance index (2.57) to both sides. Note that K = R^{-1} B^T P and L(s) = K(sI - A)^{-1}B = K \Phi B = R^{-1} B^T P \Phi B, so that B^T P \Phi B = R L(s) and B^T \Phi^* P B = L^T(-s) R. Substituting L(s) into (2.64) yields

L^T(-s) R + R L(s) + L^T(-s) R L(s) + R = B^T \Phi^* Q \Phi B + R    (2.65)
Rearranging yields

R + R L(s) + L^T(-s) R + L^T(-s) R L(s) = B^T \Phi^T(-s)\, Q\, \Phi(s) B + R    (2.66)

which can be further simplified to

\left[ I + L(-s) \right]^T R \left[ I + L(s) \right] = R + B^T \Phi^T(-s)\, Q\, \Phi(s) B    (2.67)

where I + L(s) is the return difference matrix. On s = j\omega, the term B^T \Phi^T(-s) Q \Phi(s) B is a Hermitian positive semidefinite matrix. By removing this term from the right side, we form the inequality

\left[ I + L(-s) \right]^T R \left[ I + L(s) \right] \ge R    (2.68)

If we assume an equal penalty on each control, i.e. R = \rho I, then

\left[ I + L(-s) \right]^T \left[ I + L(s) \right] \ge I    (2.69)
[Sketch of L(j\omega) in the complex plane: for single-input systems the Nyquist locus never enters the unit disk centered at (-1, j0); for multi-input systems the minimum singular value of the return difference, \underline{\sigma}(I + L), stays above one (0 dB) at all frequencies \omega.]

Fig. 2.3 Frequency domain analysis of optimal state feedback loop transfer functions.
which says that the return difference has magnitude greater than one. For single-input systems this is equivalent to the Nyquist locus not entering a unit circle centered about (-1, j0). For multi-input systems, this says that the minimum singular value of the return difference matrix versus frequency always has magnitude greater than one. These properties are shown in Figure 2.3. This implies a minimum gain margin of [-6 dB, +\infty) and a phase margin of at least 60°. This property is what makes the LQR so attractive in aerospace applications. Often, on open-loop unstable design problems it is very difficult to achieve the desired gain and phase margins. This property is guaranteed (under the assumptions shown) for any choice of Q matrix. However, experience has shown that large feedback gains seldom work in practice, and due to modeling errors, unmodeled dynamics, noise, actuator rate saturation, and other disturbances, these excellent margins are not always realized in the physical system. Care must be taken not to use large gains (a high bandwidth design) in most physical systems, and especially in aerospace applications.

There are many rules of thumb for selecting the LQR matrices. In the next chapter a design method for tuning the LQR to achieve the desired performance and robustness without large gains will be given.
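The guaranteed margin property can be observed numerically. A sketch using the Example 2.2 plant with an equal control penalty R = \rho I (\rho = 1 assumed here), evaluating the minimum singular value of the return difference I + L(j\omega) over a frequency grid:

```python
# Check (2.69): sigma_min(I + L(jw)) >= 1 for an LQR state feedback loop
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, -1.0]])     # plant from Example 2.2
B = np.array([[0.0], [1.0]])
Q = np.diag([1.0, 0.0])
rho = 1.0                                   # equal control penalty, R = rho*I

P = solve_continuous_are(A, B, Q, rho * np.eye(1))
K = (B.T @ P) / rho

n, m = A.shape[0], B.shape[1]
for w in np.logspace(-2, 3, 60):
    Ljw = K @ np.linalg.solve(1j * w * np.eye(n) - A, B)  # L(jw) = K(jwI - A)^{-1}B
    smin = np.linalg.svd(np.eye(m) + Ljw, compute_uv=False).min()
    assert smin >= 1.0 - 1e-6    # return difference never enters the unit disk
```

For this single-input example the check is exactly the Nyquist-locus statement of Figure 2.3; for a multi-input plant the same loop over singular values applies unchanged.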
aerospace has been able to develop new aircraft designs that are faster. for linear systems with a quadratic performance index. These advancements built upon the foundation created by classical methods. In the 1970s and 80s. multiinput multioutput (MIMO) design methods were needed. and guaranteeing robustness to parameter variations by designing large singleloop stability margins and evaluating the design through simulation. and why optimal control designs have emerged as a popular design method of control in aerospace problems. designing flight control systems using conventional (classical) analytical methods involves iterative singleloop design analyses that are costly in time and manpower.1 Introduction Control systems must provide stability and performance in the presence of model uncertainty and neglected dynamics. In the introduction we briefly discuss and compare classical control. We will cover both finite time and infinite time problems. and optimal control.2 Chapter 2 Optimal Control and the Linear Quadratic Regulator ABSTRACT In this chapter we introduce optimal control theory and the linear quadratic regulator. called modern control. designing the control system at these points. we develop the linear quadratic regulator. We then begin by introducing optimal control problems and the resulting HamiltonJacobi partial differential equation. but as new designs emerged that were openloop unstable in multiple axes. more complex problems. and will explore some very important stability and robustness properties of these systems. These systems were often designed by discretizing the flight envelope at specific points. These methods worked well on aircraft that were openloop stable. and perform robustly in very large flight envelopes. and as our understanding of dynamics and control has improved. using advanced techniques. Then. 
These advancements provided the theoretical mathematics required for optimizing the controller design MIMO systems and evaluating stability and . the question of robust stability and performance was raised and new control system design and analysis methods emerged. 2. This has proven to be a significant challenge. have greater performance. modern control. but were powered by computer aided design tools which greatly expanded the engineers ability to solve larger. In general.
and implement the control across a large flight envelope using gain scheduling. controller robustness properties were evaluated. except for linear systems with a quadratic performance index. Optimal control problems are in general very difficult to solve. These problems are well understood and produce control laws that have very interesting properties. which makes them a favorite in many aerospace control problems. This chapter introduces optimal control theory. We will discuss in detail some of the excellent properties that optimal controllers produce. quantifying the degree of robustness required to overcome parameter uncertainties is not well posed in the problem setup. using the optimization performance index. over a fixed time interval or infinite time. 2. noise sensitivity.3 robustness to parameter uncertainties. With computer aided design tools. is a challenge. Optimal control problems are in general very difficult to solve. It is this command following challenge that is most common in aerospace flight control systems. etc. A multivariable optimal controller design using a quadratic performance index optimizes the design in the time domain. problem setup. the linear quadratic regulator. and the all important matrix Riccati equation. and solutions for most prob . Using methods for characterizing model uncertainties. Satisfying frequencydomain requirements such as bandwidth. and how we can form an optimal control from a performance index minimization problem. There are many books available on the subject.2 Optimal Control and the HamiltonJacobi Equation The derivation of the HamiltonJacobi partial differential equation for optimal control problems will allow us to understand how optimal control regulator problems are posed. Kwakernak and Sivan [2]. One of the key challenges in using optimal control theory is transforming frequency domain performance and stability requirements from classical control into timedomain requirements. 
Chapter 3 takes the optimal control principles and the regulator framework and extends them to command following design problems. There are many classes of problems for nonlinear or linear systems. and Anderson and Moore [3] are three excellent textbooks that deal with necessary and sufficiency conditions. and iterative design tools emerged to achieve robust stability and performance. engineers could readily pose and solve “optimal control” problems for complex systems. derivations. differentiability and continuity assumptions. The key to using optimal control theory is to develop a method to tune the design parameters to achieve the desired performance and stability in the control system (and robustness). dealing with time variant or time invariant dynamics. Optimal control problems arise in designing a control in order to minimize a performance index. This is the goal for this chapter and the next. These modern methods allowed the control system designer to understand and directly address stability and robustness concerns for openloop unstable MIMO systems. and with different types of performance indexes. Similarly. Athans and Falb [1]..
T . t0 min L x. d S x T u t . t T x t0 x0 (2.T . t min L x.3) T J * x0 . u*. u. t1 to t1 . break (2.4) Next . x0 .5) into two integrals. T . u. Assume the minimum when using u * is J * . When used in (2. u. u . d S x T t0 t0 . T t1 J * x.1) J L x. t0 (2. u.T 0 T (2. The challenge is to find a control u * to minimize J over the time interval We call u * the optimal control.T t (2. d S x T ut .2) We see that the performance index J * is a function of the control ut0 .T 0 t0 (2.4 lems that can be solved analytically. the initial state and time.1) it produces the op timal state x * . Hamilton Jacobi Approach Consider the following first order system model and performance index x f x. u .T t t1 (2.T . J * L x*.6) . J J ut0 . from t . We will begin by deriving the HamiltonJacobi partial differential equation in a general setting.consider an arbitrary initial condition at time t : T J * x. d S x * T min J over ut0 . d S x T ut .5) Now. d L x. t min L x. and will then focus on linear systems with quadratic performance indices.
t 1 t (2. d S x T ut . t min L x. t t ut . d min L x. d L x. d S x T ut . d J * x1 . d S x T ut .t t t (2. t t J * x. t min min L x. t min L x. u. We can denote this as J * x1 . d J * x t t . u . and at each slice choose the optimal control that minimizes J . u.5 We can explicitly write the minimization over the two intervals as T t1 J * x. u. u. u.t1 (2.9) t1 J * x. t x t HOT ut . and substitute into (2. We can move min operation inside the bracket as follows: T t1 J * x. the second integral is itself a J * that starts at time t1 .t ut .T t t1 J * x1 . expand (2. u.t1 ut1 .11) Next. t1 t1 T J * x. The Principle of Optimality tells us that if we use the optimal control at each slice in time interval. that the system will be optimal over the interval.t t x t (2.7) The idea is to break the integral into time slices. d min L x. This idea leads us to what is referred to as the Principle of Optimality.12) . u.8) We see that inside the bracket. t1 ut . t min L x. t J * x.11) in a Taylor series expansion to obtain J * J * J * x. t min L x.T 1 1 t t1 (2.10).10) Now. let t1 t t . t min L x.t1 ut1 . u .T t t1 (2.
We can cancel J^*(x, t) on each side, since it does not depend on u, and divide by \Delta t to obtain

    0 = \min_{u} \left[ L(x, u, t) + \frac{\partial J^*}{\partial x} \frac{\Delta x}{\Delta t} + \frac{\partial J^*}{\partial t} + \frac{\mathrm{HOT}}{\Delta t} \right]    (2.13)

Now let \Delta t \to 0:

    -\frac{\partial J^*}{\partial t} = \min_{u} \left[ L(x, u, t) + \frac{\partial J^*}{\partial x} f(x, u, t) \right]    (2.14)

subject to \dot{x} = f(x, u, t), x(t_0) = x_0. Define the Hamiltonian

    H\left(x, u, \frac{\partial J^*}{\partial x}, t\right) = L(x, u, t) + \frac{\partial J^*}{\partial x} f(x, u, t)    (2.15)

Then

    -\frac{\partial J^*}{\partial t} = \min_{u} H\left(x, u, \frac{\partial J^*}{\partial x}, t\right)    (2.16)

This formulation allows the functional minimization problem to be transformed into a function minimization, which can be solved using ordinary calculus. To minimize H with respect to the control u we take the derivative and equate it to zero:

    \frac{\partial H}{\partial u} = 0    (2.17)

If we solve (2.17) for the optimal control, u = u^*, and substitute back into (2.14), we get a PDE for J^*:

    -\frac{\partial J^*}{\partial t} = H^*\left(x, u^*, \frac{\partial J^*}{\partial x}, t\right)    (2.18)

which is the Hamilton Jacobi differential equation, with boundary condition J^*(x(T), T) = S(x(T), T).
In most optimal control problems one does not really care about J^* itself, but is interested in the optimal control u^* applied to the system dynamics. Solving (2.18) is still quite difficult even for low order problems, in that we must solve a PDE for J^*. Moreover, as derived here, u^* is an open loop optimal control; we really want a feedback control, for robustness and sensitivity minimization. We will see that if the dynamics are linear, and the performance index penalty function L(x, u, t) is quadratic, then the problem is easily solved, and the resulting feedback control and closed loop system have very interesting properties. We exploit these properties in our use of optimal control in aerospace applications to maximize performance and robustness while minimizing the control effort.

Summary
Dynamics: \dot{x} = f(x, u, t), \quad x(t_0) = x_0
Performance index: J = \int_{t_0}^{T} L(x, u, \tau)\, d\tau + S(x(T), T)
HJ Eqn: -\frac{\partial J^*}{\partial t} = H^*\left(x, u^*, \frac{\partial J^*}{\partial x}, t\right), \quad BC: J^*(x(T), T) = S(x(T), T)
Optimal control: \frac{\partial H}{\partial u} = 0 \Rightarrow u^*; plug u^* into H to form H^*

Example 2.1

In this example we will set up, but not solve, the Hamilton Jacobi equation. Consider the system

    \dot{x}_1 = x_2
    \dot{x}_2 = -2x_1 - 3x_2 + u    (2.19)

where x(0) = [1 \;\; 2]^T, with performance index

    J = \int_{0}^{T} \left( x_1^4 + u^2 \right) dt + x_1^2(T) + x_2^2(T)    (2.20)

For this problem L(x, u, t) = x_1^4 + u^2 and S(x(T)) = x_1^2(T) + x_2^2(T). The Hamiltonian is
    H = x_1^4 + u^2 + \frac{\partial J^*}{\partial x_1} x_2 + \frac{\partial J^*}{\partial x_2} \left( -2x_1 - 3x_2 + u \right)    (2.21)

Now, minimize (2.21) with respect to the control by differentiating and equating to zero:

    \frac{\partial H}{\partial u} = 2u + \frac{\partial J^*}{\partial x_2} = 0 \;\Rightarrow\; u^* = -\frac{1}{2} \frac{\partial J^*}{\partial x_2}    (2.22)

Substituting this back into (2.21) yields

    H^* = x_1^4 - \frac{1}{4}\left( \frac{\partial J^*}{\partial x_2} \right)^2 + \frac{\partial J^*}{\partial x_1} x_2 + \frac{\partial J^*}{\partial x_2} \left( -2x_1 - 3x_2 \right)    (2.23)

The Hamilton Jacobi equation is then

    -\frac{\partial J^*}{\partial t} = x_1^4 - \frac{1}{4}\left( \frac{\partial J^*}{\partial x_2} \right)^2 + \frac{\partial J^*}{\partial x_1} x_2 + \frac{\partial J^*}{\partial x_2} \left( -2x_1 - 3x_2 \right)    (2.24)

with boundary condition J^*(x, T) = x_1^2(T) + x_2^2(T).

2.3 Linear Quadratic Regulator

The Linear Quadratic Regulator (LQR) is one of the most widely used control design methods in aerospace. We have found that flight control systems designed using LQRs have excellent performance and robustness, and minimize the control usage. Trade studies have been performed comparing properties of controllers (performance, robustness, control usage) in many different applications. This method is easily extended (in the next chapter) to produce command tracking controllers typical of flight control systems. Consider the linear system
    \dot{x} = A(t)x + B(t)u, \quad x(t_0) = x_0, \quad x \in \mathbb{R}^{n_x}, \; u \in \mathbb{R}^{n_u}    (2.25)

with performance index

    J = \int_{t_0}^{T} \left( x^T Q x + u^T R u \right) d\tau + x^T(T)\, Q_T\, x(T)    (2.26)

where

    Q = Q^T \ge 0, \quad R = R^T > 0, \quad Q_T = Q_T^T \ge 0    (2.27)

The matrices Q and R can be time varying if needed. We will see that the numerical choice of the matrices Q and R is very important in achieving performance and robustness in the closed loop system. To have a well posed problem we want the pair (A, B) to be controllable and the pair (A, Q^{1/2}) observable. The weaker conditions of (A, B) stabilizable and (A, Q^{1/2}) detectable are also acceptable. The need for controllability of the system's dynamics should be obvious: clearly the control cannot stabilize the system and perform as desired if the dynamics are not controllable. Detectability of the modes through the performance index guarantees that the modes are penalized, producing a control that will minimize their contribution to the index.

Following (2.15), the LQR Hamiltonian is

    H = x^T Q x + u^T R u + \frac{\partial J^*}{\partial x} \left( A(t)x + B(t)u \right)    (2.28)

Taking the derivative with respect to u and equating to zero produces

    \frac{\partial H}{\partial u} = 2Ru + B^T \left( \frac{\partial J^*}{\partial x} \right)^T = 0    (2.29)

where the optimal control is

    u^* = -\frac{1}{2} R^{-1} B^T \left( \frac{\partial J^*}{\partial x} \right)^T    (2.30)

Substituting u^* back into (2.18) yields the Hamilton Jacobi equation
    -\frac{\partial J^*}{\partial t} = x^T Q x + \frac{\partial J^*}{\partial x} A x - \frac{1}{4} \frac{\partial J^*}{\partial x} B R^{-1} B^T \left( \frac{\partial J^*}{\partial x} \right)^T    (2.31)

which, in this form, is still quite difficult to solve. Now, if we try a quadratic function for J^*, i.e.

    J^* = x^T P x    (2.32)

where P = P^T \ge 0, we get

    \frac{\partial J^*}{\partial t} = x^T \dot{P} x, \quad \frac{\partial J^*}{\partial x} = 2x^T P    (2.33)

Substituting (2.33) back into (2.31), and factoring out x on both sides, we have

    x^T \left[ \dot{P} + PA + A^T P + Q - PBR^{-1}B^T P \right] x = 0    (2.34)

Since this must be satisfied for any x, we have

    -\dot{P} = PA + A^T P + Q - PBR^{-1}B^T P    (2.35)

which is called the Riccati equation, with boundary condition P(T) = Q_T. Substituting (2.33) into (2.30) yields

    u^* = -\frac{1}{2} R^{-1} B^T (2Px) = -R^{-1} B^T P x = -Kx    (2.36)

which is a state feedback optimal control. We substitute the optimal control to obtain the closed loop system, written as

    \dot{x} = A(t)x + B(t)u, \quad u = -K(t)x \;\Rightarrow\; \dot{x} = \left[ A(t) - B(t)K(t) \right] x, \quad x(t_0) = x_0    (2.37)

In this formulation, the Riccati equation is integrated backward in time, the gains are formed using (2.36) and stored, and the feedback control law is implemented by looking up the gains. This is called gain scheduling.
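The backward sweep of the Riccati equation (2.35) can be carried out with an off-the-shelf ODE integrator. The following is a minimal sketch, assuming Python with NumPy and SciPy; the plant, weights, and horizon are illustrative values chosen for demonstration, not taken from the text:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative double-integrator plant and weights (assumed values)
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
QT = np.eye(2)      # terminal penalty: P(T) = Q_T
T = 5.0             # final time

def riccati_rhs(t, p_flat):
    # dP/dt = -(PA + A'P + Q - P B R^{-1} B' P), from (2.35)
    P = p_flat.reshape(2, 2)
    dP = -(P @ A + A.T @ P + Q - P @ B @ np.linalg.solve(R, B.T @ P))
    return dP.ravel()

# Integrate backward in time from t = T (where P = Q_T) down to t = 0
sol = solve_ivp(riccati_rhs, [T, 0.0], QT.ravel(), rtol=1e-8, atol=1e-10)
P0 = sol.y[:, -1].reshape(2, 2)

# Gain at t = 0: K(0) = R^{-1} B' P(0), from (2.36)
K0 = np.linalg.solve(R, B.T @ P0)
```

Because the horizon here is long relative to the closed-loop time constants, P(0) has essentially converged to the steady-state solution of the Riccati equation.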
Summary
Dynamics: \dot{x} = A(t)x + B(t)u, \quad x(0) = x_0
Performance index: J = \int_{0}^{T} \left( x^T Q x + u^T R u \right) d\tau + x^T(T)\, Q_T\, x(T)
Riccati equation: -\dot{P} = PA + A^T P + Q - PBR^{-1}B^T P, \quad P(T) = Q_T
Optimal control: u^* = -R^{-1}B^T P(t)\, x = -K(t)x
Closed loop system: \dot{x} = \left[ A(t) - B(t)K(t) \right] x, \quad x(0) = x_0

Infinite Time LQR

Now consider the performance index

    J = \int_{0}^{\infty} \left( x^T Q x + u^T R u \right) d\tau, \quad Q = Q^T \ge 0, \; R = R^T > 0    (2.38)

where the final time T \to \infty. The state dynamics for this problem are linear time invariant, written as

    \dot{x} = Ax + Bu, \quad A, B \text{ constant}, \quad x \in \mathbb{R}^{n_x}, \; u \in \mathbb{R}^{n_u}    (2.39)

with (A, B) controllable and (A, Q^{1/2}) observable. The Riccati equation becomes algebraic, written as

    PA + A^T P + Q - PBR^{-1}B^T P = 0    (2.40)

and is called the Algebraic Riccati Equation (ARE), with the optimal control

    u^* = -R^{-1} B^T P x = -Kx    (2.41)

where K is a constant gain matrix. Substituting the control into the open loop dynamics yields

    \dot{x} = Ax + Bu, \quad u = -Kx \;\Rightarrow\; \dot{x} = (A - BK)x = A_{cl}\, x    (2.42)

In applications where the state dynamics are linearized at operating conditions, and the control is designed at each design point, the constant gain matrix K can be stored in a table and looked up for implementation.
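In practice the ARE (2.40) and gain (2.41) are computed numerically rather than by hand. A minimal sketch assuming SciPy's `solve_continuous_are`, with an illustrative open-loop-unstable plant (the matrix values are assumptions, not from the text):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative open-loop-unstable plant (assumed values): poles at +1 and -2
A = np.array([[0.0, 1.0],
              [2.0, -1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Stabilizing solution of PA + A'P + Q - P B R^{-1} B' P = 0, i.e. (2.40)
P = solve_continuous_are(A, B, Q, R)

# Optimal gain K = R^{-1} B' P, from (2.41), and closed-loop matrix A - BK
K = np.linalg.solve(R, B.T @ P)
Acl = A - B @ K

# Residual of the ARE should be (numerically) zero
residual = P @ A + A.T @ P + Q - P @ B @ np.linalg.solve(R, B.T @ P)
```

Even though the open loop is unstable, the LQR feedback places all closed-loop eigenvalues in the left half plane.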
This formulation guarantees that the closed loop system whose dynamics are described by A_{cl} is stable, Re\,\lambda(A_{cl}) < 0. This means the eigenvalues of A_{cl} lie in the left half plane. The state is regulated to zero, x \to 0 as t \to \infty, which yields u \to 0 as t \to \infty. It is often desirable when simulating the dynamics to compute the control and its rate, and examine the peak values. If we differentiate the control, we have

    u = -Kx \;\Rightarrow\; \dot{u} = -K\dot{x} = -K(A - BK)x = -KA_{cl}x    (2.43)

We can form a closed loop simulation model, with outputs x, u, and \dot{u}, for this closed loop LQR system as

    \dot{x} = (A - BK)x = A_{cl}x, \quad x(0) = x_0
    y = \begin{bmatrix} x \\ u \\ \dot{u} \end{bmatrix} = \begin{bmatrix} I \\ -K \\ -KA_{cl} \end{bmatrix} x    (2.44)

In flight control systems it is critical not to saturate the control surfaces in position or rate. We can see from (2.44) that large gains K will cause large control u and control rate \dot{u}. When this happens, nonlinear effects dominate the response and the aircraft could depart. Thus, high gains are undesirable in flight control systems. From (2.41) we see that K gets large as P gets large, and from (2.40) we see that it is the choice of Q and R in the ARE that determines how large the gains will be.

Summary
Dynamics: \dot{x} = Ax + Bu, \quad x(0) = x_0
Performance index: J = \int_{0}^{\infty} \left( x^T Q x + u^T R u \right) d\tau
Algebraic Riccati Equation: PA + A^T P + Q - PBR^{-1}B^T P = 0, \quad P = P^T > 0
Optimal control: u^* = -R^{-1} B^T P x = -Kx
Closed loop system: \dot{x} = (A - BK)x, \quad x(0) = x_0
Simulation output: y = \begin{bmatrix} x \\ u \\ \dot{u} \end{bmatrix} = \begin{bmatrix} I \\ -K \\ -KA_{cl} \end{bmatrix} x
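The simulation model (2.44), with its peak control and control-rate check, can be sketched as follows. This is a minimal illustration assuming NumPy/SciPy and an assumed open-loop-unstable plant; the matrices are demonstration values, not from the text:

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_are

# Illustrative plant (assumed values) and LQR gain with Q = I, R = I
A = np.array([[0.0, 1.0], [2.0, -1.0]])
B = np.array([[0.0], [1.0]])
P = solve_continuous_are(A, B, np.eye(2), np.eye(1))
K = B.T @ P                      # with R = I, K = B'P
Acl = A - B @ K

# Propagate x(t) = e^{Acl t} x0 on a grid; form u = -Kx and udot = -K Acl x
x0 = np.array([1.0, 0.0])
ts = np.linspace(0.0, 10.0, 1001)
xs = np.array([expm(Acl * t) @ x0 for t in ts])
us = -(K @ xs.T).ravel()
udots = -(K @ Acl @ xs.T).ravel()

# Peak values of the control and control rate, as in the saturation check
peak_u, peak_udot = np.abs(us).max(), np.abs(udots).max()
```

Examining `peak_u` and `peak_udot` for candidate Q and R choices is one simple way to screen a design against position and rate limits before nonlinear simulation.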
Conditions on the plant and the performance index for a well posed problem require us to check that the unstable modes of the system are controllable, and that the unstable modes are observable through the state penalty matrix. This is equivalent to checking that (A, B) is stabilizable and (A, Q^{1/2}) is detectable. It is always important to check and see if the design problem is well posed or not.

Example 2.2

In this example we wish to solve for the optimal control and examine the properties of the closed loop system. Consider the following linear time invariant model

    \dot{x} = Ax + Bu, \quad A = \begin{bmatrix} 0 & 1 \\ 0 & -1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}    (2.45)

where the performance index is

    J = \int_{0}^{\infty} \left( x_1^2 + r u^2 \right) d\tau, \quad Q = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad R = r    (2.46)

The eigenvalues of the open loop system are 0 and -1. In the performance index the state penalty matrix Q penalizes the first state of the system. The control penalty r is left as a parameter so we can see how small and large values of r change the closed loop dynamics. First, check controllability using the controllability matrix P_c:

    P_c = \begin{bmatrix} B & AB \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & -1 \end{bmatrix}, \quad \mathrm{rank}(P_c) = 2    (2.47)

Since this matrix has full rank the system is controllable, so any unstable modes are controllable. Next, factor the state penalty matrix into square roots:

    Q = \left( Q^{1/2} \right)^T Q^{1/2}, \quad Q^{1/2} = \begin{bmatrix} 1 & 0 \end{bmatrix}    (2.48)

Now check observability using the square root of Q:

    \begin{bmatrix} Q^{1/2} \\ Q^{1/2} A \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \mathrm{rank} = 2    (2.49)
Since this matrix also has full rank, all modes of the system are observable through the penalty matrix. Now solve the ARE parametrically for P using A, B, Q, and R, and compute the control. Let

    P = \begin{bmatrix} l & m \\ m & n \end{bmatrix}    (2.50)

Then the ARE PA + A^T P + Q - PBR^{-1}B^T P = 0 is

    \begin{bmatrix} l & m \\ m & n \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 0 & -1 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} l & m \\ m & n \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} - \frac{1}{r}\begin{bmatrix} l & m \\ m & n \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} l & m \\ m & n \end{bmatrix} = 0    (2.51)

Expanding (2.51) gives three equations for l, m, and n. These are:

    1 - \frac{m^2}{r} = 0, \quad l - m - \frac{mn}{r} = 0, \quad 2m - 2n - \frac{n^2}{r} = 0    (2.52)

The first equation gives m = \sqrt{r} (both positive and negative values of m must be checked to see which yields the positive definite solution). Using m = \sqrt{r}, l and n are

    n = r\left( \sqrt{1 + \frac{2}{\sqrt{r}}} - 1 \right), \quad l = \sqrt{r}\left( 1 + \frac{n}{r} \right)    (2.53)

Since the Riccati matrix is the positive definite, symmetric, real solution, the constant state feedback gain matrix is given as

    K = R^{-1} B^T P = \frac{1}{r}\begin{bmatrix} m & n \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{r}} & \sqrt{1 + \frac{2}{\sqrt{r}}} - 1 \end{bmatrix}    (2.54)

The closed loop state dynamics are always stable, with characteristic equation

    \Delta_{cl}(s) = s^2 + \sqrt{1 + \frac{2}{\sqrt{r}}}\, s + \frac{1}{\sqrt{r}}    (2.55)

By varying the control penalty r, (2.55) gives a root locus which indicates how the numerical choice of Q and R impacts the closed loop system dynamics. Figure 2.1 illustrates this root locus, varying r from 0.001 to 100.
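The closed-form solution (2.52)-(2.55) can be verified numerically. A sketch assuming SciPy, checking the parametric P, the gain K, and the closed-loop characteristic polynomial for several values of r:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, -1.0]])
B = np.array([[0.0], [1.0]])
Q = np.diag([1.0, 0.0])

for r in (0.001, 1.0, 100.0):
    # Closed-form entries of P = [[l, m], [m, n]] from (2.52)-(2.53)
    m = np.sqrt(r)
    n = r * (np.sqrt(1.0 + 2.0 / np.sqrt(r)) - 1.0)
    l = m + m * n / r
    P_closed = np.array([[l, m], [m, n]])

    # Compare with the numerical stabilizing ARE solution
    P_num = solve_continuous_are(A, B, Q, np.array([[r]]))
    assert np.allclose(P_closed, P_num, rtol=1e-6)

    # Closed-loop characteristic polynomial: s^2 + sqrt(1 + 2/sqrt(r)) s + 1/sqrt(r)
    K = np.array([[m / r, n / r]])
    coeffs = np.poly(A - B @ K)          # [1, a1, a0]
    assert np.allclose(coeffs[1], np.sqrt(1.0 + 2.0 / np.sqrt(r)))
    assert np.allclose(coeffs[2], 1.0 / np.sqrt(r))
```

The same loop also reproduces the gains quoted below for r = 100 and r = 0.001.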
For large values of r (small gains), the closed loop poles are close to the open loop poles (r = 100, K = [0.1 \;\; 0.0954]), producing a slow system response. For small values of r (large gains), the roots follow asymptotes into the left half plane and the response gets fast (r = 0.001, K = [31.6228 \;\; 7.0153]).

[Root locus plot omitted: real vs. imaginary axes, with the open loop eigenvalues marked at 0 and -1.]
Fig. 2.1 Example 2.2 root locus varying the LQR control penalty parameter r.

In general, for fixed R, the size of the optimal feedback gains is proportional to the relative magnitude of Q and R. Large values of Q heavily penalize the state (relative to the control), the gains get large, and the response gets fast. Small values of Q penalize the control more than the state, so less control is used. This keeps the gains small, producing a slower response.

Guaranteed Stability Margins From State Feedback LQRs

The LQR has excellent stability properties at the input to the plant. This can be shown by examining the return difference matrix in the frequency domain. For those not familiar with frequency domain analysis of multi-input multi-output linear time invariant systems, please see Chapter 5, and then return to this section. Consider the following LTI system
    \dot{x} = Ax + Bu, \quad x \in \mathbb{R}^{n_x}, \; u \in \mathbb{R}^{n_u}    (2.56)

along with the infinite time LQR problem

    J = \int_{0}^{\infty} \left( x^T Q x + u^T R u \right) d\tau    (2.57)

where

    Q = Q^T \ge 0, \quad R = R^T > 0, \quad (A, B) \text{ stabilizable}, \quad (A, Q^{1/2}) \text{ detectable}    (2.58)

The Algebraic Riccati Equation (ARE) is

    PA + A^T P + Q - PBR^{-1}B^T P = 0    (2.59)

with the optimal state feedback control given by

    u = -R^{-1} B^T P x = -Kx    (2.60)

Substituting (2.60) into (2.56) yields the closed loop system

    \dot{x} = (A - BK)x = A_{cl}\, x    (2.61)

Of interest is what frequency domain properties we can derive for this system controlled by optimal state feedback. First, introduce the loop transfer function L(s). For this state feedback system with the loop break point at the plant input, shown in Figure 2.2, L(s) is

    L(s) = K(sI - A)^{-1}B    (2.62)

Begin with the Algebraic Riccati Equation. Add and subtract sP, and rearrange:

    P(sI - A) + (-sI - A^T)P = Q - PBR^{-1}B^T P    (2.63)

Denote \Phi(s) = (sI - A)^{-1} and \Phi^*(s) = \Phi^T(-s), and multiply (2.63) by B^T \Phi^*(s) on the left and by \Phi(s)B on the right:

    B^T \Phi^*(s) P B + B^T P \Phi(s) B = B^T \Phi^*(s) Q \Phi(s) B - B^T \Phi^*(s) P B R^{-1} B^T P \Phi(s) B    (2.64)

Note that K = R^{-1}B^T P, so that B^T P = RK and PB = K^T R. Substituting L(s) = K\Phi(s)B into (2.64) yields

    L^T(-s)R + RL(s) + L^T(-s)RL(s) = B^T \Phi^T(-s) Q \Phi(s) B    (2.65)

Add R > 0 from the performance index (2.57) to both sides:

    R + L^T(-s)R + RL(s) + L^T(-s)RL(s) = B^T \Phi^T(-s) Q \Phi(s) B + R    (2.66)

which can be further simplified to

    \left[ I + L(-s) \right]^T R \left[ I + L(s) \right] = R + B^T \Phi^T(-s) Q \Phi(s) B    (2.67)

where I + L(s) is the return difference matrix.
[Block diagram omitted: loop break point at the plant input u; the plant is B followed by (sI - A)^{-1}, with the state x fed back through -K.]
Fig. 2.2 Block diagram of the state feedback architecture.

The term B^T \Phi^T(-s) Q \Phi(s) B is a Hermitian positive semidefinite matrix. By removing this term on the right side of (2.67), we form the inequality

    \left[ I + L(-s) \right]^T R \left[ I + L(s) \right] \ge R    (2.68)

If we assume an equal penalty on each control, i.e. R = \rho I, then (2.68) becomes

    \left[ I + L(-s) \right]^T \left[ I + L(s) \right] \ge I    (2.69)
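Inequality (2.69) can be checked numerically for a concrete loop. A sketch assuming SciPy, using the Example 2.2 plant with r = 1 (single input, so the return difference is scalar):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Example 2.2 plant with r = 1
A = np.array([[0.0, 1.0], [0.0, -1.0]])
B = np.array([[0.0], [1.0]])
Q = np.diag([1.0, 0.0])
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

# Evaluate the loop gain L(jw) = K (jwI - A)^{-1} B on a log-spaced grid
ws = np.logspace(-3, 3, 400)
rd_min = min(
    np.linalg.svd(np.eye(1) + K @ np.linalg.inv(1j * w * np.eye(2) - A) @ B,
                  compute_uv=False).min()
    for w in ws
)

# Per (2.69), the return difference never drops below one in magnitude
assert rd_min >= 1.0 - 1e-9
```

Sweeping the minimum singular value of I + L(jω) like this is also a useful sanity check on multi-input designs, where the scalar Nyquist picture no longer applies.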
For single input systems, evaluated at s = j\omega, (2.69) says that the Nyquist locus of L(j\omega) never enters the unit disk centered at (-1, j0). For multi-input systems, it says that the minimum singular value of the return difference matrix versus frequency always has magnitude greater than one. This implies a minimum gain margin of 6 dB, and a phase margin of at least 60°. These properties are shown in Figure 2.3.

[Plot omitted. Single input systems: the Nyquist locus never enters the unit disk centered at (-1, j0). Multi-input systems: the minimum singular value of the return difference I + L stays above 0 dB.]
Fig. 2.3 Frequency domain analysis of optimal state feedback loop transfer functions.

This property is guaranteed (under the assumptions shown) for any choice of the Q matrix, and it is what makes the LQR so attractive in aerospace applications. However, due to modeling errors, unmodeled dynamics, actuator rate saturation, noise, and other disturbances, these excellent margins are not always realized in the physical system. Care must be taken not to get large gains (a high bandwidth design) in most physical systems, and especially in aerospace applications: experience has shown that large feedback gains seldom work in practice. There are many rules of thumb for selecting the LQR matrices. Often, on open loop unstable design problems it is very difficult to achieve the desired gain and phase margins. In the next chapter a design method for tuning the LQR to achieve the desired performance and robustness without large gains will be given.
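The guaranteed 60° phase margin can likewise be checked for the Example 2.2 loop at r = 1. The following grid-based estimate (assuming SciPy) is a sketch, not a precise margin computation:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Example 2.2 plant with r = 1
A = np.array([[0.0, 1.0], [0.0, -1.0]])
B = np.array([[0.0], [1.0]])
P = solve_continuous_are(A, B, np.diag([1.0, 0.0]), np.eye(1))
K = B.T @ P                       # with R = 1, K = B'P

def loop(w):
    # Scalar loop gain L(jw) = K (jwI - A)^{-1} B
    return (K @ np.linalg.inv(1j * w * np.eye(2) - A) @ B)[0, 0]

# Locate the gain-crossover frequency |L(jw)| = 1 on a fine grid
ws = np.logspace(-2, 2, 20000)
mags = np.abs([loop(w) for w in ws])
wc = ws[np.argmin(np.abs(mags - 1.0))]

# Phase margin = 180 deg + phase of L at crossover
pm_deg = 180.0 + np.degrees(np.angle(loop(wc)))
assert pm_deg >= 60.0
```

For this loop the estimate comes out well above the guaranteed 60° floor, consistent with the return-difference bound.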