
ECE4520/ECE5520

MULTIVARIABLE CONTROL SYSTEMS I


Fall 2001
Fundamental aspects of modern control theory are covered, including solutions to systems modeled in state-variable format, controllability, observability, pole placement, and linear transformation. Computer-based tools for control system design are used. Prer., ECE4510 and Math 313 (or equiv).
Instructor: Dr. Gregory Plett
Office: EN290
Phone: 262-3468
Course web-page: http://mocha-java.uccs.edu/ECE4520/

email: glp@eas.uccs.edu

Office Hours: TBD


Text: C.-T. Chen, Linear System Theory and Design, third edition, Oxford University Press, 1999.
Reference: J.S. Bay, Fundamentals of Linear State Space Systems, WCB/McGraw-Hill, Boston: 1999.
Optional Software: The Matlab Student Version, with Control-Systems toolbox (the full Windows version is running in the ECE Multimedia lab). The book that comes with the student software applies to all versions of Matlab.
Evaluation:
1) Graded homework assignments, 30% total.
2) Two hour exams, worth 20% each for a total of 40%.
3) Final lab-based project, worth 30%.

Grading:
90-100 = A, 80-89 = B, 70-79 = C, 60-69 = D, 0-59 = F.

Topics                                        Text      Est. Weeks
1. Fundamentals of feedback control.          Ch. 2     0.5
2. Linear algebra (matrix) review.            Ch. 3     1.5
3. Continuous-time state-space systems.       Ch. 4     2.5
4. Discrete-time state-space systems.         Ch. 4     1.0 . . . (Exam I)
5. Observability and controllability.         Ch. 6     2.0
6. Controllers, observers and compensators.   Chs. 8-9  4.0 . . . (Exam II)
7. Linear quadratic regulation.                         2.0
8. Review of multivariable control.                     0.5

Work Load: This is an aggressive course requiring weekly homework assignments. Expect to spend six to nine hours
per week outside of class reading the textbook and completing homework assignments. This is in accord with UCCS
policy relating credit hours for a lecture course to student workload. Some students will find that more time is required,
while others will find that less time is required.

Homework Policy #1: Homework will be collected at the beginning of class on the assigned date. Homework turned in after the class period will be penalized 10%. Homework turned in after the due date will be penalized an additional 25% per day unless previous arrangements have been made with the instructor. Examinations will be based on the homework problems and the material covered in class. It is to your advantage to understand the fundamental concepts that are demonstrated in the homework problems. It will be difficult to earn higher than a C without performing well on the homework assignments.

Homework Policy #2: Your homework is expected to be a bona-fide individual effort. Copying homework from another student is CHEATING and will not be tolerated. You may (and are encouraged to) discuss homework problems with other students, but only to the extent that you would discuss them with the instructor. Don't ask another student a question that you would not expect the instructor to answer. Most of us know when we are compromising our integrity. If you are in doubt, ask first.

Homework Policy #3: Part of your engineering education involves learning how to communicate technical information to others. Basic standards of neatness and clarity are essential to this process of communication. Your process of solving a problem must be presented in a logical sequence. Consider your assignments to represent your performance as an engineer. Do not submit scrap paper, and do not submit paper containing scratched-out notes. Graphs are to be titled and axes are to be labeled (with correct units). The above standards of clarity and neatness also apply to your work on exams.

Attendance: Attendance is your responsibility. Class lectures will cover a significant amount of material. Some will not be in the text or may be explained differently. It is to your advantage to take notes, ask questions, and to fully participate in the classroom experience.

Missed Exams: Missed exams will count as ZERO without a physician's documentation of an illness, or other appropriate documentation of an emergency beyond your control and requiring your absence.

Homework Format Rules:

Points will be deducted for failure to comply with the following rules:

1. Use 8 1/2 by 11 paper (engineering paper is good).


2. Write on one side of the paper only.
3. Enclose your final answer to each problem in a box so that it may be clearly identified.
4. Write name and date and homework set number in the right corner.
5. Staple in the upper left corner. Use only one staple!
6. Be sure to write in pencil. Do not use ink to complete your homework assignments.

The Course Reader: These notes have been entered using LyX, and typeset with LaTeX2e on a Pentium-II class computer running the Linux operating system. All diagrams have been created using either xfig or Matlab. Some sections of these notes have been adapted from lectures given by Drs. Teresa Meng, Jonathan How and Stephen Boyd at Stanford University.


ECE4520/5520: Multivariable Control Systems I.

FUNDAMENTALS OF FEEDBACK CONTROL


Goals of Feedback Control
• Change dynamic response of a system to have desired properties.
• Output of system tracks reference input.
• Reject disturbances.
• Classical feedback control techniques (cf., ECE4510) use frequency-domain (Laplace) tools to analyze and design control systems.

Multivariable, State-Space Control
• Use primarily time-domain matrix representations of systems.
• Very powerful. Can often place poles of closed-loop system anywhere we want!
• Same methods work for single-input, single-output (SISO) or multi-input, multi-output (MIMO or multivariable) systems.
• Advanced techniques (cf., ECE5530) allow design of optimal linear controllers with a single Matlab command!
• This course is a bridge between classical control and topics in advanced linear systems and control. We now review some of the concepts of classical linear systems and control which we will use. . .
Dynamic Response
• We wish to control linear time-invariant (LTI) systems. Their dynamics may be specified via linear, constant-coefficient ordinary differential equations (LCCODE).
• Examples include:
  – Mechanical systems: use Newton's laws.
  – Electrical systems: use Kirchhoff's laws.
  – Electro-mechanical systems (generator/motor).
  – Thermodynamic systems.
  – Fluid-dynamic systems.
EXAMPLE: Second-order system in standard form:
$$\ddot y(t) + 2\zeta\omega_n \dot y(t) + \omega_n^2 y(t) = \omega_n^2 u(t)$$
• u(t) is the input; y(t) is the output.
• $\dot y(t) \triangleq \dfrac{dy(t)}{dt}$, $\ddot y(t) \triangleq \dfrac{d^2 y(t)}{dt^2}$.
• The Laplace transform is a tool to help analyze dynamic systems: $Y(s) = H(s)U(s)$, where
  – Y(s) is the Laplace transform of the output, y(t);
  – U(s) is the Laplace transform of the input, u(t);
  – H(s) is the transfer function: the Laplace transform of the impulse response, h(t).
• $\mathcal{L}\{\dot y(t)\} = sY(s)$ for zero initial conditions.
EXAMPLE: Second-order system:
$$s^2 Y(s) + 2\zeta\omega_n s Y(s) + \omega_n^2 Y(s) = \omega_n^2 U(s)$$
$$Y(s) = \frac{\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}\,U(s).$$
• Transforms for systems with LCCODE representations can be written as $Y(s) = H(s)U(s)$, where
$$H(s) = \frac{b_0 s^m + b_1 s^{m-1} + \cdots + b_{m-1}s + b_m}{a_0 s^n + a_1 s^{n-1} + \cdots + a_{n-1}s + a_n},$$
where $n \ge m$ for physical systems.

• These can be represented in Matlab using vectors of numerator and denominator polynomials:
num=[b0 b1 ... bm];
den=[a0 a1 ... an];
sys=tf(num,den);
• Can also represent these systems by factoring the polynomials into zero-pole-gain form:
$$H(s) = K\,\frac{\prod_{i=1}^{m}(s - z_i)}{\prod_{i=1}^{n}(s - p_i)}.$$
sys=zpk(z,p,k);    % in Matlab
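For concreteness, a minimal sketch that builds the standard second-order system both ways (the ζ and ωₙ values are assumed purely for illustration):

zeta = 0.5; wn = 2;                 % assumed damping ratio, natural frequency
den = [1 2*zeta*wn wn^2];
sys1 = tf(wn^2,den);                % polynomial (tf) form
sys2 = zpk([],roots(den),wn^2);     % zero-pole-gain form; no finite zeros
% zpk(sys1) and tf(sys2) convert between the two representations.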

• Input signals of interest include the following:

$u(t) = k\,\delta(t)$           (impulse)      . . . $U(s) = k$
$u(t) = k\,1(t)$                (step)         . . . $U(s) = k/s$
$u(t) = kt\,1(t)$               (ramp)         . . . $U(s) = k/s^2$
$u(t) = k e^{-\sigma t}\,1(t)$  (exponential)  . . . $U(s) = \dfrac{k}{s + \sigma}$
$u(t) = k\sin(\omega t)\,1(t)$  (sinusoid)     . . . $U(s) = \dfrac{k\omega}{s^2 + \omega^2}$

• Matlab's impulse, step, and lsim commands can be used to find output time histories.
• The Final Value Theorem states that if a system is stable and has a final, constant value,
$$\lim_{t\to\infty} x(t) = \lim_{s\to 0} sX(s).$$
• This is useful when investigating steady-state errors in a control system.
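A quick numerical check of the Final Value Theorem (example system assumed for illustration):

sys = tf(4,[1 2 4]);     % stable second-order system
[y,t] = step(sys);       % y(end) approaches lim_{s->0} s*(1/s)*H(s)
dcgain(sys)              % = 1; matches the final value of the step response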
Block Diagrams
• Useful when analyzing systems comprised of a number of sub-units.

[Figure: elementary block-diagram rules.]
$$Y(s) = H(s)U(s) \qquad\text{(single block)}$$
$$Y(s) = [H_1(s)H_2(s)]\,U(s) \qquad\text{(series)}$$
$$Y(s) = [H_1(s) + H_2(s)]\,U(s) \qquad\text{(parallel)}$$
$$Y(s) = \frac{H_1(s)}{1 + H_2(s)H_1(s)}\,R(s) \qquad\text{(feedback)}$$

• Block-diagram algebra (or Mason's rule) may be used to reduce block diagrams to a single transfer function.
[Figure: block-diagram equivalences for moving summing junctions and pick-off points across a block H(s), using blocks H(s) and 1/H(s), and reduction of a loop with H1(s) and H2(s) to unity feedback.]

Dynamic Response versus Pole Locations


• The poles of H(s) determine (qualitatively) the dynamic response of the system. The zeros of H(s) quantify the relationship.
• If the system has only real poles, each one is of the form:
$$H(s) = \frac{1}{s + \sigma}.$$
• If σ > 0, the system is stable, and $h(t) = e^{-\sigma t}\,1(t)$. The time constant is τ = 1/σ, and the response of the system to an impulse or step decays to steady-state in about 4 or 5 time constants.

[Figure: first-order responses. Left: impulse([0 1],[1 1]); impulse response $h(t) = e^{-\sigma t}$, which falls to 1/e at t = τ. Right: step([0 1],[1 1]); step response $y(t) = K(1 - e^{-t/\tau})$, where K = dc gain.]

• If a system has complex-conjugate poles, each pair may be written as:
$$H(s) = \frac{\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}.$$
• We can extract two more parameters from this equation:
$$\sigma = \zeta\omega_n \qquad\text{and}\qquad \omega_d = \omega_n\sqrt{1 - \zeta^2}.$$
• σ plays the same role as above: it specifies the decay rate of the response.
• $\omega_d$ is the oscillation frequency of the output. Note: $\omega_d \ne \omega_n$ unless ζ = 0.
• ζ is the damping ratio and it also plays a role in decay rate and overshoot.

[Figure: pole locations at $-\sigma \pm j\omega_d$ in the s-plane, at angle $\theta = \sin^{-1}(\zeta)$ from the imaginary axis.]

• Impulse response: $h(t) = \dfrac{\omega_n}{\sqrt{1 - \zeta^2}}\,e^{-\sigma t}\sin(\omega_d t)\,1(t)$.
• Step response: $y(t) = \left[1 - e^{-\sigma t}\left(\cos(\omega_d t) + \dfrac{\sigma}{\omega_d}\sin(\omega_d t)\right)\right]1(t)$.


[Figure: impulse responses of 2nd-order systems (left) and step responses of 2nd-order systems (right), plotted versus $\omega_n t$ for damping ratios ζ = 0, 0.1, 0.2, . . . , 0.9, 1.0.]

• A summary chart of impulse responses and step responses versus pole locations is:

[Figure: impulse responses vs. pole locations and step responses vs. pole locations in the s-plane.]

• Time-domain specifications determine where poles SHOULD be placed in the s-plane (step response):

[Figure: step response showing rise time $t_r$ (10% to 90%), peak time $t_p$, overshoot $M_p$, and settling time $t_s$.]

• Rise time $t_r$ = time to go from 10% to 90% of final value.
• Settling time $t_s$ = time until permanently within 1% of final value.
• Overshoot $M_p$ = maximum PERCENT overshoot.

[Figure: $M_p$ (%) versus damping ratio ζ.]

$$t_r \approx 1.8/\omega_n \quad\Longrightarrow\quad \omega_n \ge 1.8/t_r$$
$$t_s \approx 4.6/\sigma \quad\Longrightarrow\quad \sigma \ge 4.6/t_s$$
$$M_p \approx e^{-\pi\zeta/\sqrt{1-\zeta^2}} \quad\Longrightarrow\quad \zeta \ge \mathrm{fn}(M_p)$$

[Figure: corresponding allowable pole regions in the s-plane: minimum radius $\omega_n$, maximum real part $-\sigma$, and angle $\theta = \sin^{-1}(\zeta)$ (e.g., ζ = 0.707).]
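A short sketch turning specs into pole-location bounds (spec values assumed for illustration):

tr = 0.5; ts = 3; Mp = 0.10;                  % assumed specs
wn_min = 1.8/tr                               % natural-frequency bound
sigma_min = 4.6/ts                            % decay-rate bound
zeta_min = -log(Mp)/sqrt(pi^2 + log(Mp)^2)    % inverts Mp = exp(-pi*zeta/sqrt(1-zeta^2))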

Basic Feedback Properties

[Figure: feedback loop: r(t) → (+/−) → D(s) → G(s) → y(t).]

$$\frac{Y(s)}{R(s)} = \frac{D(s)G(s)}{1 + D(s)G(s)} = T(s).$$
• Stability depends on roots of denominator of T(s): 1 + D(s)G(s) = 0.
• Routh test used to determine stability.

• Steady-state error found from (for unity-feedback case)
$$\frac{E(s)}{R(s)} = \frac{1}{1 + D(s)G(s)}.$$
• $e_{ss} = \lim_{t\to\infty} e(t) = \lim_{s\to 0} sE(s)$ if the limit exists.
• System type = 0 iff $e_{ss}$ is finite for unit-step reference input 1(t).
• System type = 1 iff $e_{ss}$ is finite for unit-ramp reference input r(t).
• System type = 2 iff $e_{ss}$ is finite for unit-parabola ref.-input p(t). . .
• For unity-feedback systems,
$$K_p = \lim_{s\to 0} D(s)G(s), \qquad\text{position error constant}$$
$$K_v = \lim_{s\to 0} s\,D(s)G(s), \qquad\text{velocity error constant}$$
$$K_a = \lim_{s\to 0} s^2 D(s)G(s), \qquad\text{acceleration error constant}$$

• Steady-state errors versus system type for unity feedback:

          Step input      Ramp input    Parabola input
Type 0    1/(1 + K_p)     ∞             ∞
Type 1    0               1/K_v         ∞
Type 2    0               0             1/K_a
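A sketch of the error constants for an assumed type-1 loop:

s = tf('s');
DG = 10/(s*(s+2));            % assumed D(s)G(s), type 1
Kp = dcgain(DG)               % Inf, so zero steady-state step error
Kv = dcgain(minreal(s*DG))    % = 5, so e_ss = 1/Kv = 0.2 for a unit ramp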
Some Types of Controllers
• Proportional ctrlr: $u(t) = K e(t)$; $D(s) = K$.
• Integral ctrlr: $u(t) = \dfrac{K}{T_I}\displaystyle\int_0^t e(\tau)\,d\tau$; $D(s) = \dfrac{K}{T_I s}$.
• Derivative ctrlr: $u(t) = K T_D \dot e(t)$; $D(s) = K T_D s$.
• Combinations:
  PI: $D(s) = K\left(1 + \dfrac{1}{T_I s}\right)$;
  PD: $D(s) = K(1 + T_D s)$;
  PID: $D(s) = K\left(1 + \dfrac{1}{T_I s} + T_D s\right)$.
• Lead: $D(s) = K\dfrac{Ts + 1}{\alpha Ts + 1}$, α < 1 (approx PD).
• Lag: $D(s) = K\dfrac{Ts + 1}{\alpha Ts + 1}$, α > 1 (approx PI; often, K = α).
• Lead/Lag: $D(s) = K\dfrac{(T_1 s + 1)(T_2 s + 1)}{(\alpha_1 T_1 s + 1)(\alpha_2 T_2 s + 1)}$, $\alpha_1 < 1$, $\alpha_2 > 1$.

Root Locus
• A root locus plot shows (parametrically) the possible locations of the roots of the equation
$$1 + K\,\frac{b(s)}{a(s)} = 0.$$
• For a unity-gain feedback system,
$$T(s) = \frac{D(s)G(s)}{1 + D(s)G(s)}.$$
• The poles of the closed-loop system T(s) depend on the open-loop transfer functions D(s)G(s). Suppose $D(s) = K D_0(s)$. Then the closed-loop poles are at $1 + K(D_0(s)G(s)) = 0$, which is the root-locus form.
• Drawing the root locus allows us to select K for good pole locations.
• Intuition into the root locus helps us design $D_0(s)$ with lead/lag/PI/PID. . . controllers.
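A minimal root-locus sketch in Matlab (open-loop system assumed for illustration):

s = tf('s');
L = (s+2)/(s*(s+1)*(s+10));    % assumed D0(s)G(s)
rlocus(L);                     % closed-loop pole locations as K varies
% [K,poles] = rlocfind(L) picks K by clicking a desired pole location.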

ECE4520/5520: Multivariable Control Systems I.

Linear Algebra (Matrix) Review


• We will begin looking at state-space methods to analyze and design control systems. These methods rely HEAVILY on matrix/vector operations.
• This set of notes reviews the mechanics of matrix manipulation. An attempt is also made to aid intuition.

A Matrix Primer: Terminology and Notation
• A matrix is a rectangular array of numbers (a.k.a., scalars) written between brackets.
EXAMPLE:
$$A = \begin{bmatrix} 0.1 & 1.2 & 2.4 & 0.4 \\ 0.5 & 0.2 & 1.3 & 2.5 \\ 0.2 & 1.1 & 9.5 & 1.8 \end{bmatrix}.$$
• An important attribute of a matrix is its size or dimension, always measured in number of rows × number of columns. Above: 3 × 4.
• The entries or coefficients are the values in the array. The i, j entry is the value in the ith row and the jth column. The i, jth entry in matrix A is $A_{ij}$, which is a number.
• The positive integers i and j are called the (row and column, respectively) indices.

EXAMPLE: $A_{13} = 2.4$, $A_{31} = 0.2$. The row index of the bottom row is 3; the column index of the first column is 1.
• A matrix with only one column (i.e., size n × 1) is called a column vector, or just a vector. Sometimes, size is specified by calling it an n-vector.
• Entries are denoted with just one subscript (the other is 1), as in $v_3$. The entries are sometimes called the components of the vector.
EXAMPLE:
$$v = \begin{bmatrix} \,\cdot\, \\ 0.5 \\ 1 \end{bmatrix}$$
is a 3-vector (or 3 × 1 matrix); its third component is $v_3 = 1$.

• Similarly, a matrix with a single row (size 1 × n) is a row vector.
EXAMPLE:
$$w = \begin{bmatrix} 8 & 1 & 0.1 \end{bmatrix}$$
is a row vector (or 1 × 3 matrix); its second component is $w_2 = 1$.

• Sometimes a 1 × 1 matrix is considered to be the same as a scalar, i.e., a number.
• Two matrices are equal if they are the same size and all the corresponding entries (which are numbers) are equal.
Notational Conventions
• Some authors try to use notation that helps the reader distinguish between matrices, vectors and scalars.

• For example, Greek letters (α, β, . . .) might be used for numbers; lower-case letters (a, x, y, . . .) might be used for vectors; upper-case letters (A, B, . . .) for matrices.
• Other notational conventions include matrices given in bold font (H), or vectors written with arrows above them ($\vec a$).
• But, there are about as many notational conventions as authors! Be prepared to figure out what things are (i.e., scalars, vectors, matrices) despite the author's notational scheme (if any exists!).
Zero Matrices
• The zero matrix (of size m × n) has all entries equal to zero. Sometimes written as $0_{m\times n}$, where the subscript denotes size. Often just written as 0, the same symbol used to denote the number 0. You need to figure out the size of the zero matrix from the context.
• Zero matrices of different sizes are different matrices, even though we use the same symbol (i.e., 0). In programming, this is called overloading; we say that the symbol 0 is overloaded because it can mean different things depending on its context (i.e., the equation it appears in).
• When a zero matrix is a (row or column) vector, we call it a zero (row or column) vector.


Identity Matrices
• An identity matrix is another common matrix. It is always square; i.e., it has the same number of rows as columns.
• Its diagonal entries, i.e., those with equal row and column indices, are all equal to 1. Its off-diagonal entries, i.e., those with unequal row and column indices, are equal to 0.
• Identity matrices are denoted by the letter I. Sometimes a subscript denotes the size, as in $I_3$ or maybe $I_{2\times 2}$. More often, size must be determined from context (just like for zero matrices).
• Formally, the identity matrix is defined by
$$I_{ij} = \begin{cases} 1, & i = j; \\ 0, & i \ne j. \end{cases}$$
EXAMPLES:
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},$$
which are 2 × 2 and 4 × 4 identity matrices. Remember that both are denoted by the same symbol: I.
The importance of the identity matrix will become clear later.


Matrix Operations
• Matrices may be combined in various operations to form other matrices.

Matrix Transpose
• If A is an m × n matrix, its transpose, denoted $A^T$ (or sometimes $A'$), is the n × m matrix given by $(A^T)_{ij} = A_{ji}$. In other words, the rows and columns of A are transposed in $A^T$.
EXAMPLE:
$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 1 & 2 & 3 \end{bmatrix}^T = \begin{bmatrix} 1 & 4 & 7 & 1 \\ 2 & 5 & 8 & 2 \\ 3 & 6 & 9 & 3 \end{bmatrix}.$$
• Note that transposition converts row vectors into column vectors, and vice versa.
• If we transpose a matrix twice, we get back the original matrix: $(A^T)^T = A$.

Matrix Addition and Subtraction
• Two matrices of the same size can be added together to form another matrix (of the same size) by adding the corresponding entries.
• Matrix addition is denoted by the symbol +. (Thus the symbol + is overloaded to mean scalar addition when scalars appear on its left- and right-hand sides, and matrix addition when matrices appear on its left- and right-hand sides.)

EXAMPLE:
$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix}.$$
• Note that (row or column) vectors of the same size can be added, but you cannot add together a row vector and a column vector (except if they are both scalars!).
• Matrix subtraction is similar:
EXAMPLE:
$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - I = \begin{bmatrix} 0 & 2 \\ 3 & 3 \end{bmatrix}.$$
Note that this gives an example where we have to figure out what size the identity matrix is. Since you can only add (or subtract) matrices of the same size, we conclude that I must refer to a 2 × 2 identity matrix.
• Matrix addition is commutative; i.e., if A and B are matrices of the same size, then A + B = B + A.
• It is also associative; i.e., (A + B) + C = A + (B + C), so we write both as A + B + C.
• We always have A + 0 = 0 + A = A; i.e., adding the zero matrix has no effect.

Scalar Multiplication
• If we multiply a matrix by a scalar, the resulting matrix has every entry multiplied by the scalar.

• Usually denoted by juxtaposition, with the scalar on the left, as in
$$(-2)\begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} = \begin{bmatrix} -2 & -8 \\ -4 & -10 \\ -6 & -12 \end{bmatrix}.$$
• Sometimes you see scalar multiplication with the scalar on the right, or even scalar division with the scalar shown in the denominator (which just means scalar multiplication by one over the scalar), as in
$$\begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}\Big/\,2 = \begin{bmatrix} 0.5 & 1.5 & 2.5 \\ 1 & 2 & 3 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} 2 = \begin{bmatrix} 2 & 8 \\ 4 & 10 \\ 6 & 12 \end{bmatrix},$$
but these look ugly.

• Scalar multiplication obeys several laws you can determine for yourself. E.g., if A is any matrix and α, β are any scalars, then
$$(\alpha + \beta)A = \alpha A + \beta A.$$
• It is useful to identify the symbols above. The + sign on the left is the addition of scalars. The + sign on the right denotes matrix addition.
• Another simple property is $(\alpha\beta)A = (\alpha)(\beta A)$, where α and β are scalars and A is a matrix. On the left side we have scalar-scalar multiplication (αβ) and scalar-matrix multiplication; on the right side we see two cases of scalar-matrix multiplication.
• Note that $0 \cdot A = 0$ (where the left-hand zero is the scalar zero, and the right-hand zero is a matrix zero of the same size as A).
Matrix Multiplication
• It is also possible to multiply two matrices using matrix multiplication.

• You can multiply two matrices A and B provided that their dimensions are compatible, which means that the number of columns of A equals the number of rows of B.
EXAMPLE: $A_{m\times p} B_{p\times n} = C_{m\times n}$.
• The product is defined by
$$C_{ij} = \sum_{k=1}^{p} A_{ik}B_{kj} = A_{i1}B_{1j} + \cdots + A_{ip}B_{pj}, \qquad i = 1,\ldots,m, \quad j = 1,\ldots,n.$$
• This looks complicated, but is not too difficult.

[Schematic: the i, jth entry of C = AB is formed from the ith row of A and the jth column of B.]

• To find the i, jth entry of the product C = AB, you need to know the ith row of A and the jth column of B. The summation above can be interpreted as moving left-to-right along row i of A while moving top-to-bottom down column j of B. As you go, keep a running sum of the product of entries: one from A and one from B.
• Now we can explain why I is called the identity matrix: if A is any m × n matrix, then AI = A and IA = A; i.e., when you multiply a matrix by an identity matrix, it has no effect. (The identity matrices in AI = A and IA = A have different sizes. What are they?)

• One VERY important fact is that matrix multiplication is not (in general) commutative. We DON'T have AB = BA. In fact, BA may not even make sense (due to dimensions); and even if it does make sense, it may have different dimensions than AB, so that equality in AB = BA is meaningless.
EXAMPLE: If A is 2 × 3 and B is 3 × 4 then AB makes sense, and is 2 × 4. BA does not make sense.
EXAMPLE: Even if both make sense (as in when both are square, for example), AB ≠ BA in general:
$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}, \qquad \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 23 & 34 \\ 31 & 46 \end{bmatrix}.$$

• Matrix multiplication is associative; i.e., (AB)C = A(BC). Therefore, we write a product as ABC.
• Matrix multiplication is also associative with scalar multiplication; i.e., $\alpha(AB) = (\alpha A)B$.
• Matrix multiplication distributes across matrix addition: A(B + C) = AB + AC, and (A + B)C = AC + BC.
Matrix-Vector Product
• A very important type of matrix multiplication: the matrix-vector product
$$y = Ax,$$
where A is an m × n matrix, x is an n-vector and y is an m-vector.
• We can think of matrix-vector multiplication (with an m × n matrix) as a function that transforms n-vectors into m-vectors. The formula is:
$$y_i = A_{i1}x_1 + \cdots + A_{in}x_n, \qquad i = 1,\ldots,m.$$


Inner Product
• Another special case is the product of a row vector v with a column vector w of the same size. Then vw makes sense, and has size 1 × 1 (i.e., a scalar):
$$vw = v_1 w_1 + \cdots + v_n w_n.$$
• This often occurs in the form $x^T y$ where x and y are both column n-vectors. In this case the product is called the inner product or dot product of the vectors x and y. Other notation is $\langle x, y\rangle$ or $x \cdot y$.

Matrix Powers
• When a matrix A is square, it makes sense to multiply A by itself; i.e., to form AA. We call this $A^2$. Similarly, k copies multiplied together are $A^k$.
• Non-integer powers, such as $A^{1/2}$ (the matrix square-root), are pretty tricky: they might not make sense, or be ambiguous, unless certain conditions on A hold. This is an advanced topic in linear algebra.
• By convention, we set $A^0 = I$ (usually only when A is invertible; see below).

Matrix Inverse
• If A is square, and there is a matrix F such that FA = I, then we say that A is invertible or nonsingular. We call F the inverse of A, and denote it $A^{-1}$. Then, $A^{-k} = (A^{-1})^k$.
• It is important to note that not all square matrices are invertible. For example, the zero matrix never has an inverse. A less-obvious example is to show that
$$\begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix}$$
does not have an inverse.
• As an example of a matrix inverse, we have
$$\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^{-1} = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}$$
(you should check this!).
• It is very useful to know the general formula for a 2 × 2 matrix inverse:
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix},$$
provided $ad - bc \ne 0$. (If $ad - bc = 0$, the matrix is not invertible.)
• When a matrix is invertible, $(A^{-1})^{-1} = A$.

The importance of the matrix inverse will become VERY clear.


Useful Identities
• Here are a few useful identities. This list is not complete!
1. Transpose of product: $(AB)^T = B^T A^T$.
2. Transpose of sum: $(A + B)^T = A^T + B^T$.
3. Inverse of product: $(AB)^{-1} = B^{-1}A^{-1}$, provided A and B are square and invertible.
4. Products of powers: $A^k A^l = A^{k+l}$ (for k, l ≥ 1 in general, and for all k, l if A is invertible).

Block Matrices and Submatrices
• Sometimes it is convenient to form matrices whose entries are themselves matrices, as in
$$\begin{bmatrix} A & B & C \end{bmatrix}, \qquad \begin{bmatrix} F & I \\ 0 & G \end{bmatrix},$$
where A, B, C, F and G are matrices (as are 0 and I). Such matrices are called block matrices.
• Block matrices need to have the right dimensions to fit together.
EXAMPLE:
$$A = \begin{bmatrix} 1 & 2 \\ 0 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad D = \begin{bmatrix} 0 \end{bmatrix},$$
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 1 \\ 1 & 0 & 0 \end{bmatrix}.$$

• Block matrices may be added and multiplied as if the entries were numbers, provided the corresponding entries have the right size and you are careful about the order of multiplication:
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} AX + BY \\ CX + DY \end{bmatrix},$$
provided the products AX, BY, CX and DY make sense.

Linear Equations and Matrices

Linear Functions
• Suppose that f is a function that takes as input n-vectors and returns m-vectors. We say that f is linear iff
  – scaling: for any n-vector x and scalar α, $f(\alpha x) = \alpha f(x)$;
  – superposition: for any n-vectors x and y, $f(x + y) = f(x) + f(y)$.
• Such a function may always be represented as a matrix-vector multiplication f(x) = Ax. Conversely, all functions represented by f(x) = Ax are linear.
• We can also write the function in explicit form, where f(x) = y, as
$$y_i = \sum_{j=1}^{n} A_{ij}x_j = A_{i1}x_1 + \cdots + A_{in}x_n, \qquad i = 1,\ldots,m.$$
• This gives a simple interpretation of $A_{ij}$: it gives the coefficient by which $y_i$ depends on $x_j$.

Linear Equations
• Any set of m linear equations in (scalar) variables $x_1, \ldots, x_n$ can be represented by the compact matrix equation Ax = b, where x is a vector made from the variables, A is an m × n matrix and b is an m-vector.
EXAMPLE:
$$1 + x_2 - x_3 = 2x_1, \qquad x_3 = x_2 - 2.$$
• Rewrite the equations with the variables lined up in columns, and the constants on the right-hand side:
$$\begin{aligned} -2x_1 + x_2 - x_3 &= -1 \\ 0x_1 - x_2 + x_3 &= -2. \end{aligned}$$
• Now it is easy to rewrite the equations as a single matrix equation
$$\begin{bmatrix} -2 & 1 & -1 \\ 0 & -1 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -1 \\ -2 \end{bmatrix},$$
so we have two equations in three variables as Ax = b, where
$$A = \begin{bmatrix} -2 & 1 & -1 \\ 0 & -1 & 1 \end{bmatrix}, \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \qquad b = \begin{bmatrix} -1 \\ -2 \end{bmatrix}.$$
Solving Linear Equations
• Suppose we have n linear equations in n variables $x_1, \ldots, x_n$, written in the compact matrix notation Ax = b. A is an n × n matrix; b is an n-vector. Suppose that $A^{-1}$ exists. Multiply both sides of Ax = b by $A^{-1}$:
$$A^{-1}(Ax) = A^{-1}b \quad\Longrightarrow\quad Ix = A^{-1}b \quad\Longrightarrow\quad x = A^{-1}b.$$
• We have solved the simultaneous equations. We see the importance of the matrix inverse in solving simultaneous equations.
• We can't always solve n simultaneous equations for n variables. One or more of the equations may be redundant (i.e., may be obtained from the others), or the equations may be inconsistent (i.e., $x_1 = 1$ and $x_1 = 2$).
• When these pathologies occur, A is singular (non-invertible). Conversely, when A is non-invertible, the equations are either redundant or inconsistent.
• From a practical point of view, either you don't have enough equations or you have the wrong ones. Otherwise, $A^{-1}$ exists, and you can solve $x = A^{-1}b$.
Solving Linear Equations in Practice
• When we solve linear equations by computer, we don't form $A^{-1}$ and then multiply, although that would work. Practical methods compute $x = A^{-1}b$ directly, without ever forming the inverse.
• A may be large, sparse, or poorly conditioned. There exist efficient methods to handle each case.
• In Matlab, x=A\b;
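A tiny sketch comparing backslash with an explicit inverse:

A = [2 1; 1 3]; b = [5; 10];
x1 = A\b;         % preferred: solves Ax = b by Gaussian elimination
x2 = inv(A)*b;    % also works, but slower and less accurate in general
norm(x1 - x2)     % agreement to machine precision for this small example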
The Determinant Function
• Consider the set of equations
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 &= b_1 \\ a_{21}x_1 + a_{22}x_2 &= b_2 \end{aligned} \qquad\text{or}\qquad \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}.$$
• Multiply the first equation by $a_{22}$ and the second equation by $-a_{12}$. Add the resulting two equations:

$$\begin{aligned} a_{11}a_{22}x_1 + a_{12}a_{22}x_2 &= a_{22}b_1 \\ -a_{12}a_{21}x_1 - a_{12}a_{22}x_2 &= -a_{12}b_2 \\ \underbrace{(a_{11}a_{22} - a_{12}a_{21})}_{\det(A)}x_1 &= a_{22}b_1 - a_{12}b_2, \end{aligned}$$
so we can solve for $x_1$.
• Multiply the first equation by $-a_{21}$ and the second by $a_{11}$. Add the resulting two equations:
$$\begin{aligned} -a_{11}a_{21}x_1 - a_{12}a_{21}x_2 &= -a_{21}b_1 \\ a_{11}a_{21}x_1 + a_{11}a_{22}x_2 &= a_{11}b_2 \\ \underbrace{(a_{11}a_{22} - a_{12}a_{21})}_{\det(A)}x_2 &= a_{11}b_2 - a_{21}b_1, \end{aligned}$$
so we can solve for $x_2$.

• Determinants come up naturally when solving systems of equations.
• 2 × 2:
$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$
• 3 × 3:
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33},$$
and so forth.
• $A^{-1} = \dfrac{\operatorname{adj}(A)}{\det(A)}$; $\det(AB) = \det(A)\det(B)$ when both are square; and $\det(A^T) = \det(A)$.


The Adjoint of a Square Matrix A
• Define $\Delta_{i,j} = (-1)^{i+j}\det(M_{i,j})$, where $M_{i,j}$ is the matrix A with the ith row and jth column removed. Then,
$$\operatorname{adj}(A) = [\Delta_{i,j}]^T.$$
• The adjoint comes up naturally when solving matrix inverses. See above.
Range-Space, Null-Space and Rank

Null-Space
• The nullspace of $A \in \mathbb{R}^{m\times n}$ is defined as
$$\mathcal{N}(A) = \{\, x \in \mathbb{R}^n \mid Ax = 0 \,\}.$$
• $\mathcal{N}(A)$ is the set of vectors mapped to zero by y = Ax.
• $\mathcal{N}(A)$ is the set of vectors orthogonal to all rows of A.
• $\mathcal{N}(A)$ gives the ambiguity in x given y = Ax: if y = Ax and $z \in \mathcal{N}(A)$, then y = A(x + z).

Range-Space
• The rangespace of $A \in \mathbb{R}^{m\times n}$ is defined as
$$\mathcal{R}(A) = \{\, Ax \mid x \in \mathbb{R}^n \,\}.$$
• $\mathcal{R}(A)$ is the set of vectors that can be generated by y = Ax.
• $\mathcal{R}(A)$ is the span of the columns of A.

Rank
• We define the rank of $A \in \mathbb{R}^{m\times n}$ as
$$\operatorname{rank}(A) = \dim\mathcal{R}(A).$$
• (Nontrivial) facts:
  – $\operatorname{rank}(A) = \operatorname{rank}(A^T)$.
  – rank(A) is the maximum number of independent columns (or rows) of A. Hence, $\operatorname{rank}(A) \le \min(m, n)$.
  – $\operatorname{rank}(A) + \dim\mathcal{N}(A) = n$.

Interpreting y = Ax
• Consider the system of linear equations
$$\begin{aligned} y_1 &= A_{11}x_1 + A_{12}x_2 + \cdots + A_{1n}x_n \\ y_2 &= A_{21}x_1 + A_{22}x_2 + \cdots + A_{2n}x_n \\ &\;\;\vdots \\ y_m &= A_{m1}x_1 + A_{m2}x_2 + \cdots + A_{mn}x_n, \end{aligned}$$
which can be written as y = Ax, where
$$y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}, \qquad A = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & & \ddots & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{bmatrix}, \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}.$$
• Some interpretations of y = Ax:
  – y is a measurement or observation; x is an unknown to be determined.
  – x is an input or action; y is the output or result.
  – y = Ax defines a function that maps $x \in \mathbb{R}^n$ into $y \in \mathbb{R}^m$.

Interpreting y = Ax via the Coefficients of A: $A_{ij}$
$$y_i = \sum_{j=1}^{n} A_{ij}x_j.$$
• $A_{ij}$ is the gain factor from the jth input ($x_j$) to the ith output ($y_i$). Thus,
  – ith row of A concerns ith output;
  – jth column of A concerns jth input;
  – $A_{27} = 0$ means the 2nd output ($y_2$) doesn't depend on the 7th input ($x_7$);
  – $|A_{31}| \gg |A_{3j}|$ for j ≠ 1 means $y_3$ mostly depends on $x_1$;
  – $|A_{52}| \gg |A_{i2}|$ for i ≠ 5 means $x_2$ mostly affects $y_5$;
  – A lower triangular, i.e., $A_{ij} = 0$ for i < j, means $y_i$ only depends on $x_1, \ldots, x_i$;
  – A diagonal, i.e., $A_{ij} = 0$ for i ≠ j, means the ith output depends only on the ith input.
• More generally, the sparsity pattern of A, i.e., the list of zero/nonzero entries of A, shows which $x_j$ affect which $y_i$.
Interpreting y = Ax via the Columns of A: $a_i$
• Write A in terms of its columns
$$A = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix},$$
where $a_j \in \mathbb{R}^m$. Then, y = Ax can be written as
$$y = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n$$
(note: the $x_j$'s are scalars, the $a_j$'s are m-vectors).

• y is a linear combination or mixture of the columns of A. Coefficients of x give coefficients of mixture.
EXAMPLE:
$$A = \begin{bmatrix} 1 & -1 \\ 2 & 1 \end{bmatrix}, \qquad x = \begin{bmatrix} 1.0 \\ -0.5 \end{bmatrix}, \qquad y = \begin{bmatrix} 1.5 \\ 1.5 \end{bmatrix}.$$
[Figure: $Ax = (1)a_1 + (-0.5)a_2 = (1.5, 1.5)$ drawn as a mixture of the columns $a_1$ and $a_2$, for $x = (1, -0.5)$.]

Interpreting y = Ax via the Rows of A: $\tilde a_i^T$
• Write A in terms of its rows
$$A = \begin{bmatrix} \tilde a_1^T \\ \tilde a_2^T \\ \vdots \\ \tilde a_m^T \end{bmatrix},$$
where $\tilde a_i \in \mathbb{R}^n$. Then, y = Ax can be written as
$$y = \begin{bmatrix} \tilde a_1^T x \\ \tilde a_2^T x \\ \vdots \\ \tilde a_m^T x \end{bmatrix},$$
and $y_i = \langle \tilde a_i, x\rangle$; i.e., $y_i$ is the inner product of the ith row of A with x.

• Geometric interpretation: $\tilde a_i^T x = \text{const}$ is a (hyper-)plane in $\mathbb{R}^n$ (normal to $\tilde a_i$).

[Figure: parallel hyperplanes $\langle a, x\rangle = 0$, $\langle a, x\rangle = 1$, $\langle a, x\rangle = 2$, with normal vector a.]

• Thus, x is on the intersection of the hyperplanes $\tilde a_i^T x = y_i$.
EXAMPLE:
$$A = \begin{bmatrix} 2 & 1 \\ 1 & -1 \end{bmatrix}, \qquad x = \begin{bmatrix} 4 \\ 2 \end{bmatrix}, \qquad y = \begin{bmatrix} 10 \\ 2 \end{bmatrix}.$$
[Figure: x at the intersection of the lines $\tilde a_1^T x = 10$ and $\tilde a_2^T x = 2$.]
Interpreting y = Ax via the Eigenvectors/Eigenvalues of A
• Eigen is a German word meaning (roughly) characteristic. The eigenvectors and eigenvalues of a matrix A characterize its behavior.
• An eigenvector is a vector satisfying
$$Av = \lambda v,$$

where λ is a (possibly complex) constant, and v ≠ 0.
• That is, multiplying by A does nothing to the vector except change its length! This is a very unusual vector. There are usually only n of them if A has size n × n.
• Note that if v is an eigenvector, kv is also an eigenvector, so eigenvectors are often normalized to have unit length: $\|v\|_2 = 1$.
• The constant λ is an eigenvalue. Specifically, it is the eigenvalue associated with eigenvector v.
• Since there are (usually) n eigenvectors with n corresponding eigenvalues, we label the eigenvectors and eigenvalues $v_i$ and $\lambda_i$, where 1 ≤ i ≤ n.
• Why is this important? Suppose we have a vector $x = v_1 + 2v_2$. Then, $Ax = \lambda_1 v_1 + 2\lambda_2 v_2$. If we decompose the input into eigenvector coordinates, then multiplication by A is simply adding together scaled eigenvectors.
• We can write y = Ax as
$$y = V\Lambda V^{-1}x$$
(we will show this later on). V is a collection of all the eigenvectors put into a matrix. $V^{-1}$ decomposes x into the eigenvector coordinates, Λ is a diagonal matrix multiplying each component of the resulting vector by the eigenvalue associated with that component, and V puts everything back together.

• Thus, eigenvectors are the directions of matrix A, and the eigenvalues are the magnifications along those directions.
• To find eigenvalues, consider that $(\lambda I - A)v = 0$. Since v ≠ 0, $\lambda I - A$ must drop rank for some value of λ associated with v. A matrix which is not full rank has zero determinant. So, we can solve for the eigenvalues by solving
$$\det(\lambda I - A) = 0.$$
This is a VERY IMPORTANT equation when studying state-space systems.
• Note that there are very efficient and numerically robust methods of finding eigenvectors and eigenvalues. These methods do not use the determinant rule, above. The determinant rule is useful for mathematical analysis.
• In Matlab,
[V,Lambda]=eig(A);
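A small numerical sketch of the decomposition $y = V\Lambda V^{-1}x$ (test matrix assumed for illustration):

A = [1 1; 0 2];
[V,Lambda] = eig(A);          % columns of V are eigenvectors
x = [1; 1];
norm(A*x - V*Lambda*(V\x))    % zero to within roundoff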
The Diagonal Form
• λ is an eigenvalue of A if $\det(\lambda I - A) = 0$, which is true iff there exists a nonzero vector v so that
$$(\lambda I - A)v = 0 \quad\Longleftrightarrow\quad Av = \lambda v; \qquad v = \text{eigenvector}.$$
• Repeat to find all eigenvectors. Assume that $v_1, v_2, \ldots, v_n$ are linearly independent:
$$Av_i = \lambda_i v_i, \qquad i = 1, 2, \ldots, n.$$

$$A\underbrace{\begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}}_{T} = \underbrace{\begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}}_{T}\underbrace{\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}}_{\Lambda}$$
$$AT = T\Lambda \quad\Longrightarrow\quad T^{-1}AT = \Lambda.$$
• Not all matrices are diagonalizable:
$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad \det(\lambda I - A) = \lambda^2.$$
• One eigenvalue, λ = 0. Solve for the eigenvectors:
$$\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} v_a \\ v_b \end{bmatrix} = 0 \quad\Longrightarrow\quad \text{all vectors of the form } \begin{bmatrix} v_a \\ 0 \end{bmatrix}, \; v_a \ne 0.$$
The Jordan Form
• What if A cannot be diagonalized?
• Any matrix $A \in \mathbb{R}^{n\times n}$ can be put in Jordan form by a similarity transformation; i.e.,
$$T^{-1}AT = J = \begin{bmatrix} J_1 & & 0 \\ & \ddots & \\ 0 & & J_q \end{bmatrix}, \qquad\text{where}\qquad J_i = \begin{bmatrix} \lambda_i & 1 & & 0 \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_i \end{bmatrix} \in \mathbb{R}^{n_i\times n_i}$$
is called a Jordan block of size $n_i$ with eigenvalue $\lambda_i$ (so $n = \sum_{i=1}^{q} n_i$).
• J is block-diagonal and upper bidiagonal. A diagonal J is the special case of n Jordan blocks of size $n_i = 1$.
• The Jordan form is unique (up to permutations of the blocks). A matrix can have multiple blocks with the same eigenvalue.
NOTE: The Jordan form is a conceptual tool, never used in numerical computations!
• $\chi(s) = \det(sI - A) = (s - \lambda_1)^{n_1}\cdots(s - \lambda_q)^{n_q}$; hence distinct eigenvalues implies $n_i = 1$ and A diagonalizable.
• $\dim\mathcal{N}(\lambda_i I - A) = \dim\mathcal{N}(\lambda_i I - J)$ is the number of Jordan blocks with eigenvalue $\lambda_i$.
• The sizes of each Jordan block may also be computed, but this is complicated; i.e., leave it to Matlab!
EXAMPLE: Consider a 6 × 6 matrix A. From Matlab, we find that A has eigenvalue 2 with multiplicity 5 and eigenvalue 0 with multiplicity 1:
$$\det(\lambda I - A) = \lambda(\lambda - 2)^5.$$
• $\dim\mathcal{N}(2I - A) = 2$, so there are two Jordan blocks with eigenvalue 2. We can check this in Matlab: jordan(A)

$$J = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 & 0 \\ 0 & 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 & 0 & 2 \end{bmatrix}$$
• Note that without further information (computation) the following form might also be the Jordan form for A (but it isn't):
$$J = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 & 0 \\ 0 & 0 & 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 & 0 & 2 \end{bmatrix}$$

Cayley-Hamilton Theorem
• The square matrix A satisfies its own characteristic equation. That is, if
$$\chi(\lambda) = \det(\lambda I - A) = 0,$$
then $\chi(A) = 0$.
• We can easily show this if A is diagonalizable. Let
$$A = V^{-1}\Lambda V.$$

Then
$$A^2 = V^{-1}\Lambda V V^{-1}\Lambda V = V^{-1}\Lambda^2 V, \qquad\ldots\qquad A^k = V^{-1}\Lambda^k V.$$
• The characteristic polynomial is
$$\chi(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0,$$
so if we replace λ with A we get
$$\chi(A) = A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I = V^{-1}\left[\Lambda^n + a_{n-1}\Lambda^{n-1} + \cdots + a_1\Lambda + a_0 I\right]V.$$
• To prove the Cayley-Hamilton theorem, we just need to show that the quantity inside the brackets is zero. It is a diagonal matrix, and each element on the diagonal is of the form
$$\lambda_i^n + a_{n-1}\lambda_i^{n-1} + \cdots + a_1\lambda_i + a_0 = 0,$$
because $\lambda_i$ is an eigenvalue of A. So each element on the diagonal is zero, and we have shown the proof.
• If A is not diagonalizable, the same proof may be repeated using the Jordan form and Jordan blocks: $A = T^{-1}JT$.
• Consider a sketch of the proof for a Jordan block of size 2 and
$$\chi(\lambda_i) = \lambda_i^3 + a_2\lambda_i^2 + a_1\lambda_i + a_0 = 0.$$

Then
$$J_i = \begin{bmatrix} \lambda_i & 1 \\ 0 & \lambda_i \end{bmatrix},$$
$$\chi(J_i) = \begin{bmatrix} \lambda_i^3 & 3\lambda_i^2 \\ 0 & \lambda_i^3 \end{bmatrix} + a_2\begin{bmatrix} \lambda_i^2 & 2\lambda_i \\ 0 & \lambda_i^2 \end{bmatrix} + a_1\begin{bmatrix} \lambda_i & 1 \\ 0 & \lambda_i \end{bmatrix} + a_0\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
• We can easily see that the diagonal and lower-diagonal components are zero, so
$$\chi(J_i) = \begin{bmatrix} 0 & \star \\ 0 & 0 \end{bmatrix}, \qquad\text{where}\qquad \star = 3\lambda_i^2 + 2a_2\lambda_i + a_1,$$
but $\star = \dfrac{d}{d\lambda}\chi(\lambda)\Big|_{\lambda_i} = 0$ (because $\lambda_i$ is a repeated root of χ), which completes the sketch.
SIGNIFICANCE: The Cayley-Hamilton theorem shows us that $A^n$ is a function of matrix powers $A^{n-1}$ down to $A^0$. Therefore, to compute any polynomial of A it suffices to compute only powers of A up to $A^{n-1}$ and appropriately weight their sum. A lot of proofs use the Cayley-Hamilton theorem.
"
#
1 2
EXAMPLE : With A =
we have (s) = s 2 5s 2 so
3 4
but =

(A) = A2 5A 2I
"
#
"
#
"
#
7 10
1 2
1 0
=
5
2
15 22
3 4
0 1
#
"
0 0
=
.
0 0
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

229

ECE4520/5520, Linear Algebra (Matrix) Review
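A quick numerical confirmation in Matlab:

A = [1 2; 3 4];
p = poly(A)              % characteristic polynomial coefficients: [1 -5 -2]
A^2 - 5*A - 2*eye(2)     % the zero matrix, as Cayley-Hamilton predicts
polyvalm(p,A)            % same check via matrix polynomial evaluation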

Solving Lyapunov Equations
• A number of times in this course we will encounter an equation of the form
$$XA + BX = C,$$
which has the name Lyapunov equation.
• An example we have already seen is the diagonalization problem (if the eigenvalues are known),
$$-V\Lambda + AV = 0,$$
or the Jordan-form problem
$$-TJ + AT = 0.$$
• The equations can be solved by writing them out in terms of the $X_{ij}$. Another way to do it uses the Kronecker product and vectorized matrices.
KRONECKER PRODUCT:
$$A \otimes B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ \vdots & & \ddots & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{bmatrix}.$$
That is, the Kronecker product is a large matrix containing all possible permutations of A and B.
VECTORIZED MATRICES: We can convert a matrix into a column vector which stacks up each column of the matrix:
$$\operatorname{vec}(A) = \begin{bmatrix} A(:,1)^T & A(:,2)^T & \cdots & A(:,n)^T \end{bmatrix}^T.$$
• A general Kronecker-product rule is:
$$\operatorname{vec}(AXB) = [B^T \otimes A]\operatorname{vec}(X).$$
• Three specific special-case rules result:
$$\operatorname{vec}(PMP^T) = [P \otimes P]\operatorname{vec}(M) \qquad\text{(to be used later in course)}$$
$$\operatorname{vec}(XA) = [A^T \otimes I]\operatorname{vec}(X)$$
$$\operatorname{vec}(BX) = [I \otimes B]\operatorname{vec}(X),$$
so
$$\left([A^T \otimes I] + [I \otimes B]\right)\operatorname{vec}(X) = \operatorname{vec}(C),$$
which can be inverted to find vec(X). X is then found by un-vectorizing vec(X).
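A small sketch using Matlab's kron (test matrices assumed random; a unique solution exists provided −A and B share no eigenvalues):

n = 3;
A = randn(n); B = randn(n); C = randn(n);
M = kron(A',eye(n)) + kron(eye(n),B);    % [A' (x) I] + [I (x) B]
X = reshape(M\C(:), n, n);               % C(:) is vec(C); reshape un-vectorizes
norm(X*A + B*X - C)                      % zero to within roundoff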

ECE4520/5520: Multivariable Control Systems I.

STATE-SPACE DYNAMIC SYSTEMS (CONTINUOUS-TIME)
1. What are they?
2. Why use them?
3. How are they related to the transfer functions we have used already?
4. How do we formulate them?

[Cartoon: "Nice artwork kiddo. I have a feeling that a great many people will make a living off that third line someday!" (Out of Control, IEEE Control Systems Magazine)]

What are They?

• Representation of the dynamics of an nth-order system as a first-order differential equation in an n-vector called the STATE: n first-order equations.
• Classic example: 2nd-order E.O.M.

[Figure: mass m driven by force f(t), restrained by spring k and damper b, with displacement y(t).]

$$m\ddot y(t) = f(t) - b\dot y(t) - ky(t)$$
$$\ddot y(t) = \frac{f(t) - b\dot y(t) - ky(t)}{m}.$$
• Define a state vector
$$\vec x(t) = \begin{bmatrix} y(t) \\ \dot y(t) \end{bmatrix}; \qquad\text{then}\qquad \dot{\vec x}(t) = \begin{bmatrix} \dot y(t) \\ \ddot y(t) \end{bmatrix} = \begin{bmatrix} \dot y(t) \\ -\dfrac{k}{m}y(t) - \dfrac{b}{m}\dot y(t) + \dfrac{1}{m}f(t) \end{bmatrix}.$$
• We can write this in the form $\dot{\vec x}(t) = A\vec x(t) + Bf(t)$, where A and B are constant matrices:
$$A = \begin{bmatrix} 0 & 1 \\ -\dfrac{k}{m} & -\dfrac{b}{m} \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ \dfrac{1}{m} \end{bmatrix}.$$
• Complete the picture by setting y(t) as a function of $\vec x(t)$. The general form is $y(t) = C\vec x(t) + Df(t)$, where C and D are constant matrices:
$$C = \begin{bmatrix} 1 & 0 \end{bmatrix}, \qquad D = [0].$$
• Fundamental form for a linear state-space model:
$$\dot{\vec x}(t) = A\vec x(t) + Bu(t)$$
$$y(t) = C\vec x(t) + Du(t),$$
where u(t) is the input, $\vec x(t)$ is the state, and A, B, C, D are constant matrices. We usually assume that $\vec x(t)$ is a vector, so we simplify notation by simply using x(t).
DEFINITION: The state of a system at time $t_0$ is the minimum amount of information at $t_0$ that, together with the input u(t), t ≥ t₀, uniquely determines the behavior of the system for all t ≥ t₀.

• Contrast this with the impulse-response (convolution) representation, which requires the entire past history of u(t):
$$y(t) = \int_0^t h(\tau)u(t - \tau)\,d\tau.$$

Why Use Them?
• Transfer functions provide an input-output mapping only: $u \to G(s) \to y$. State variables provide access to what is going on inside the system.
• Convenient way to express E.O.M. Matrix format great for computers.
• Allows new analysis and synthesis tools.
• GREAT for multi-input, multi-output systems. These are very hard to work with using transfer functions.

Converting State-Space to Transfer Function
• Start with the state equations
$$\dot x(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t) + Du(t).$$
• Laplace transform:
$$sX(s) - x(0) = AX(s) + BU(s)$$
$$Y(s) = CX(s) + DU(s),$$
or
$$(sI - A)X(s) = BU(s) + x(0)$$
$$X(s) = (sI - A)^{-1}BU(s) + (sI - A)^{-1}x(0),$$
and
$$Y(s) = \underbrace{[C(sI - A)^{-1}B + D]}_{\text{transfer function of system}}U(s) + \underbrace{C(sI - A)^{-1}x(0)}_{\text{response to initial conditions}}.$$

• So,
$$\frac{Y(s)}{U(s)} = C(sI - A)^{-1}B + D,$$
but
$$(sI - A)^{-1} = \frac{\operatorname{adj}(sI - A)}{\det(sI - A)}.$$
• Slightly easier to compute (for SISO systems):
$$\frac{Y(s)}{U(s)} = C(sI - A)^{-1}B + D = \frac{\det\begin{bmatrix} sI - A & B \\ -C & D \end{bmatrix}}{\det(sI - A)}.$$
• We will develop this result at the end of this section of notes.


EXAMPLE: Our mass-spring-damper example:
$$A = \begin{bmatrix} 0 & 1 \\ -\dfrac{k}{m} & -\dfrac{b}{m} \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ \dfrac{1}{m} \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 0 \end{bmatrix}.$$
$$\begin{aligned} G(s) &= C(sI - A)^{-1}B + 0 \\ &= \begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} s & -1 \\ \dfrac{k}{m} & s + \dfrac{b}{m} \end{bmatrix}^{-1}\begin{bmatrix} 0 \\ \dfrac{1}{m} \end{bmatrix} \\ &= \frac{\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} s + \dfrac{b}{m} & 1 \\ -\dfrac{k}{m} & s \end{bmatrix}\begin{bmatrix} 0 \\ \dfrac{1}{m} \end{bmatrix}}{s^2 + (b/m)s + (k/m)} \\ &= \frac{1/m}{s^2 + (b/m)s + (k/m)} = \frac{1}{ms^2 + bs + k}. \end{aligned}$$
• This is exactly what we expect from the first example in this section.
EXAMPLE: Using the special SISO formula,
$$G(s) = \frac{\det\begin{bmatrix} s & -1 & 0 \\ \dfrac{k}{m} & s + \dfrac{b}{m} & \dfrac{1}{m} \\ -1 & 0 & 0 \end{bmatrix}}{\det\begin{bmatrix} s & -1 \\ \dfrac{k}{m} & s + \dfrac{b}{m} \end{bmatrix}} = \frac{1/m}{s^2 + (b/m)s + (k/m)} = \frac{1}{ms^2 + bs + k}.$$
Same result.

• This example shows that the characteristic equation for the system is $\chi(s) = \det(sI - A) = 0$. Poles of the system are roots of $\det(sI - A) = 0$ (eigenvalues of A).
• In transfer-function matrix form, $G(s) = C(sI - A)^{-1}B + D$, a pole of any entry in G(s) is a pole of the system.
SIMULATING SYSTEMS IN SIMULINK: To investigate how state-space systems work, we can simulate them in Simulink. We could use the State Space block from the Continuous library, or we can make our own. The following method has advantages because it gives us explicit access to the state and other internal signals. It is a direct implementation of the transfer function above, and the initial state may be set by setting the initial integrator values.

[Figure: Simulink diagram: input u enters matrix gain B; the sum of Bu and Ax is xdot, which is integrated by a 1/s block to give x; matrix gains A and C tap x; the sum of Cx and Du is the output y. All (square) gain blocks are MATRIX GAIN blocks from the Math Library.]
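Equivalently, a command-line sketch with an ss object (mass-spring-damper values assumed for illustration):

m = 1; b = 0.5; k = 2;
sys = ss([0 1; -k/m -b/m], [0; 1/m], [1 0], 0);
t = 0:0.01:20; u = ones(size(t));      % unit-step force
[y,t,x] = lsim(sys, u, t, [0.1; 0]);   % third output is the full state history
plot(t,y);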

Transfer Function to State-Space: Canonical Forms
• We can make state-space forms from E.O.M., for example. Also from transfer functions.

Controller Canonical Form
• Three cases:

1] Transfer function is only made up of poles:
$$G(s) = \frac{1}{s^3 + a_1 s^2 + a_2 s + a_3} = \frac{Y(s)}{U(s)}$$
$$\dddot y(t) + a_1\ddot y(t) + a_2\dot y(t) + a_3 y(t) = u(t).$$
• Choose the output and its derivatives as the state: $x(t) = \begin{bmatrix} \ddot y(t) & \dot y(t) & y(t) \end{bmatrix}^T$. Then
$$\dot x(t) = \begin{bmatrix} \dddot y(t) \\ \ddot y(t) \\ \dot y(t) \end{bmatrix} = \begin{bmatrix} -a_1 & -a_2 & -a_3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} \ddot y(t) \\ \dot y(t) \\ y(t) \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}u(t)$$
$$y(t) = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}x(t) + [0]u(t).$$
• Note the special form of A (top companion matrix).


2] Transfer function has poles and zeros, but is strictly proper:
$$G(s) = \frac{b_1 s^2 + b_2 s + b_3}{s^3 + a_1 s^2 + a_2 s + a_3} = \frac{Y(s)}{U(s)}.$$
• Break up the transfer function into two parts. $\dfrac{V(s)}{U(s)}$ contains all of the poles of $\dfrac{Y(s)}{U(s)}$. Then,
$$Y(s) = [b_1 s^2 + b_2 s + b_3]V(s),$$
or,
$$y(t) = b_1\ddot v(t) + b_2\dot v(t) + b_3 v(t).$$
But,
$$V(s)[s^3 + a_1 s^2 + a_2 s + a_3] = U(s),$$
or,
$$\dddot v(t) + a_1\ddot v(t) + a_2\dot v(t) + a_3 v(t) = u(t).$$
• The representation for this is the same as in Case [1]. Let $x(t) = \begin{bmatrix} \ddot v(t) & \dot v(t) & v(t) \end{bmatrix}^T$. Then
$$\dot x(t) = \begin{bmatrix} \dddot v(t) \\ \ddot v(t) \\ \dot v(t) \end{bmatrix} = \begin{bmatrix} -a_1 & -a_2 & -a_3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} \ddot v(t) \\ \dot v(t) \\ v(t) \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}u(t)$$
represents the dynamics of v(t). All that remains is to couple in the zeros of the system:
$$Y(s) = [b_1 s^2 + b_2 s + b_3]V(s)$$
$$y(t) = \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix}x(t) + [0]u(t).$$

3] Non-proper transfer function:
$$G(s) = \frac{b_0 s^3 + b_1 s^2 + b_2 s + b_3}{s^3 + a_1 s^2 + a_2 s + a_3} = \frac{\beta_1 s^2 + \beta_2 s + \beta_3}{s^3 + a_1 s^2 + a_2 s + a_3} + D,$$
where the $\beta_i$ terms are computed via long division. The remainder D is the feedthrough term.
• This particular method of implementing a system in state-space form is called controller canonical form.

• The Matlab command tf2ss(num,den) converts a transfer-function form to state-space form.
• Analog computer implementation:

[Figure: controller-canonical-form analog computer: u(t) drives a chain of integrators with states $x_{1c}$, $x_{2c}$, $x_{3c}$; feedback gains $-a_1$, $-a_2$, $-a_3$; output y(t) formed from tap gains $b_1$, $b_2$, $b_3$.]
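A quick check that tf2ss returns exactly this controller canonical form (example coefficients assumed; this transfer function reappears in an example below):

num = [2 3]; den = conv([1 1],[1 2]);   % (2s+3)/((s+1)(s+2))
[A,B,C,D] = tf2ss(num,den)              % A = [-3 -2; 1 0], B = [1;0], C = [2 3]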

Observer Canonical Form
• Now, using the same transfer function,
$$(s^3 + a_1 s^2 + a_2 s + a_3)Y(s) = (b_1 s^2 + b_2 s + b_3)U(s),$$
divide both sides by $s^3$:
$$Y(s) = \left(-\frac{a_1}{s} - \frac{a_2}{s^2} - \frac{a_3}{s^3}\right)Y(s) + \left(\frac{b_1}{s} + \frac{b_2}{s^2} + \frac{b_3}{s^3}\right)U(s)$$
$$= \frac{1}{s}\left[b_1 U(s) - a_1 Y(s) + \frac{1}{s}\left[b_2 U(s) - a_2 Y(s) + \frac{1}{s}\left[b_3 U(s) - a_3 Y(s)\right]\right]\right].$$
• This has block-diagram:

[Figure: chain of three integrators with states $x_{3o}$, $x_{2o}$, $x_{1o}$; each integrator input sums $b_i u(t)$ and $-a_i y(t)$; output $y(t) = x_{1o}$.]

Or,
$$\dot x(t) = \begin{bmatrix} -a_1 & 1 & 0 \\ -a_2 & 0 & 1 \\ -a_3 & 0 & 0 \end{bmatrix}x(t) + \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}u(t)$$
$$y(t) = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}x(t).$$
• This is called observer canonical form. A is a left companion matrix.
Controllability Canonical Form
• Thirdly, consider the block diagram:

[Figure: chain of three integrators with states $x_{1co}$, $x_{2co}$, $x_{3co}$, feedback gains $-a_3$, $-a_2$, $-a_1$ from $x_{3co}$, and output formed from tap gains $\gamma_1$, $\gamma_2$, $\gamma_3$.]

$$x_3 = \frac{1}{s}(x_2 - a_1 x_3), \qquad x_2 = \frac{1}{s}(x_1 - a_2 x_3), \qquad x_1 = \frac{1}{s}(u - a_3 x_3).$$
• Thus,
$$X_3(s) = \frac{1}{s^3 + a_1 s^2 + a_2 s + a_3}U(s)$$
$$X_2(s) = \frac{s + a_1}{s^3 + a_1 s^2 + a_2 s + a_3}U(s)$$
$$X_1(s) = \frac{s^2 + a_1 s + a_2}{s^3 + a_1 s^2 + a_2 s + a_3}U(s).$$
$$Y(s) = \frac{\gamma_3 + \gamma_2(s + a_1) + \gamma_1(s^2 + a_1 s + a_2)}{s^3 + a_1 s^2 + a_2 s + a_3}U(s).$$

• In order to get the correct transfer function, we must compute the $\{\gamma_i\}$ values to get the desired numerator:
$$\begin{bmatrix} 1 & 0 & 0 \\ a_1 & 1 & 0 \\ a_2 & a_1 & 1 \end{bmatrix}\begin{bmatrix} \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$
$$\begin{bmatrix} \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ a_1 & 1 & 0 \\ a_2 & a_1 & 1 \end{bmatrix}^{-1}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 - a_1 b_1 \\ b_3 - a_1 b_2 - a_2 b_1 + a_1^2 b_1 \end{bmatrix}.$$
• Or,
$$\dot x(t) = \begin{bmatrix} 0 & 0 & -a_3 \\ 1 & 0 & -a_2 \\ 0 & 1 & -a_1 \end{bmatrix}x(t) + \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}u(t)$$
$$y(t) = \begin{bmatrix} \gamma_1 & \gamma_2 & \gamma_3 \end{bmatrix}x(t).$$
• A is a right companion matrix.

Observability Canonical Form
• Note that H(s) is a scalar, so $H(s)^T = H(s)$:
$$H(s) = C(sI - A)^{-1}B + D = B^T\left[(sI - A)^T\right]^{-1}C^T + D^T = B^T(sI - A^T)^{-1}C^T + D^T.$$
• So $C \leftrightarrow B^T$, $A \leftrightarrow A^T$, $B \leftrightarrow C^T$ and $D \leftrightarrow D^T$ are dual forms.
• We have already seen this (!). Controller and observer are dual forms. Likewise, we can come up with
$$\dot x(t) = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -a_3 & -a_2 & -a_1 \end{bmatrix}x(t) + \begin{bmatrix} \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{bmatrix}u(t)$$
$$y(t) = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}x(t)$$
as a dual form with the controllability form.

[Figure: observability-canonical-form block diagram: u(t) enters through gains $\gamma_3$, $\gamma_2$, $\gamma_1$ into a chain of integrators with states $x_{3ob}$, $x_{2ob}$, $x_{1ob}$; feedback gains $-a_1$, $-a_2$, $-a_3$; output $y(t) = x_{1ob}$.]

• A is a bottom companion matrix.
• We will see that we have a lot of freedom when making our state-space models (i.e., in choosing the components of x(t)).
Modal (Diagonal) Form
• Yet another canonical form. Very useful. . .
• Assume $G(s) = \dfrac{N(s)}{D(s)}$, where D(s) has distinct roots $p_i$ (real):
$$G(s) = \frac{N(s)}{(s - p_1)(s - p_2)\cdots(s - p_n)} = \frac{r_1}{s - p_1} + \frac{r_2}{s - p_2} + \cdots + \frac{r_n}{s - p_n}.$$
• Now, let
$$\frac{X_1(s)}{U(s)} = \frac{r_1}{s - p_1}, \quad\ldots\quad \frac{X_n(s)}{U(s)} = \frac{r_n}{s - p_n},$$
or,
$$\dot x_1(t) = p_1 x_1(t) + r_1 u(t), \quad\ldots\quad \dot x_n(t) = p_n x_n(t) + r_n u(t).$$
$$\dot x(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t) + Du(t)$$
$$A = \begin{bmatrix} p_1 & & & 0 \\ & p_2 & & \\ & & \ddots & \\ 0 & & & p_n \end{bmatrix}, \qquad B = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}, \qquad D = [0].$$
• Easily extends to handle complex poles $\lambda_i = \sigma_i + j\omega_i$.
  – If A and B can have complex elements, then no change is necessary.
  – Otherwise, use the real modal form, which is made via a partial-fraction expansion where complex pole-pairs are represented as
$$G_i(s) = \frac{\alpha_i s + \beta_i}{(s - \sigma_i)^2 + \omega_i^2}.$$
  – The real modal form has an A matrix which is block diagonal, of the form
$$A = \operatorname{diag}\left(\Lambda_r,\; \begin{bmatrix} \sigma_{r+1} & \omega_{r+1} \\ -\omega_{r+1} & \sigma_{r+1} \end{bmatrix},\; \ldots,\; \begin{bmatrix} \sigma_n & \omega_n \\ -\omega_n & \sigma_n \end{bmatrix}\right),$$
where $\Lambda_r$ is a diagonal matrix containing the real poles, and $\lambda_i = \sigma_i + j\omega_i$, $i = r + 1, \ldots, n$, are the complex poles.
  – The B matrix has corresponding entries $\begin{bmatrix} b_{i,1} & b_{i,2} \end{bmatrix}^T$ for each complex pair, found by matching the partial-fraction coefficients $\alpha_i$ and $\beta_i$.
• Modal form is convenient for keeping track of system poles. . . they are right on the diagonal!
• Good representation to use. . . numerical robustness.

• All canonical forms are related by linear algebra: a change of basis. Diagonal is very useful, but we cannot always put a system in diagonal form. (What was our assumption above? Distinct poles.) We will see one more canonical form in a little while (Jordan form) which is very similar to diagonal. All systems can be put in Jordan form.

Transformations
• We have seen that state-space representations are not unique. Selection of the state x is quite arbitrary.
• Can we convert from one representation to another and get equivalent systems?
• Analyze the transformation of
$$\dot x(t) = Ax(t) + Bu(t)$$
$$y(t) = Cx(t) + Du(t).$$
• Let x(t) = Tz(t), where T is an invertible (similarity) transformation matrix:
$$\begin{aligned} \dot z(t) &= T^{-1}\dot x(t) \\ &= T^{-1}[Ax(t) + Bu(t)] \\ &= T^{-1}[ATz(t) + Bu(t)] \\ &= \underbrace{T^{-1}AT}_{\bar A}\,z(t) + \underbrace{T^{-1}B}_{\bar B}\,u(t) \end{aligned}$$
$$y(t) = \underbrace{CT}_{\bar C}\,z(t) + \underbrace{D}_{\bar D}\,u(t),$$
so
$$\dot z(t) = \bar Az(t) + \bar Bu(t)$$
$$y(t) = \bar Cz(t) + \bar Du(t).$$

• Argue that we should be able to use either model. Are they going to give the same transfer function?
$$H_1(s) = C(sI - A)^{-1}B + D, \qquad H_2(s) = \bar C(sI - \bar A)^{-1}\bar B + \bar D.$$
• Need $H_1(s) = H_2(s)$:
$$\begin{aligned} H_1(s) &= C(sI - A)^{-1}B + D \\ &= CTT^{-1}(sI - A)^{-1}TT^{-1}B + D \\ &= (CT)\left[T^{-1}(sI - A)T\right]^{-1}(T^{-1}B) + D \\ &= \bar C(sI - \bar A)^{-1}\bar B + \bar D = H_2(s). \end{aligned}$$
OBSERVATION: The transfer function is not changed by a similarity transform.
• Consider
$$H(s) = \frac{b_1 s^2 + b_2 s + b_3}{s^3 + a_1 s^2 + a_2 s + a_3}.$$
Only six parameters in the transfer function. But A has 3 × 3, B has 3 × 1, C has 1 × 3: a total of 15 parameters. It appears that we have 9 extra degrees of freedom in the state-space model. Contradiction? No: the 9 extra parameters correspond to the 3 × 3 similarity transformation T, which changes the realization but not the transfer function.
• We will see (Chapter 5) how to design T to put a system into the various canonical forms.
EXAMPLE :

Controller canonical form for


2s + 3
(s + 1)(s + 2)

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

315

316

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

is

3 2
xc (t) +
1 0
h
i
y(t) = 2 3 xc (t).

xc (t) =

Let

"

T =

"

2 1
1 1

"

1
0

u(t)

Notice that det(T ) = 1 so T is invertible. Let x c = T x where x is a new


state.
Then,

= (T 1 AT )x(t)
x(t)
+ (T 1 B)u(t)
y(t) = (C T )x(t).

Plugging in A, B, C and T :
#
" #
"
1
= 2 0 x(t)
u(t)
+
x(t)
1
0 1
h
i
y(t) = 1 1 x(t),

which gives the diagonal realization of the transfer function!


Well often change coordinates in a system, for example to solve a
particular problem more easily.
Consider the system in the above example, implemented in
the four main canonical forms. Let the initial state for each form be
x(0) = [1 1]T . Simulate response of each system.

EXAMPLE :

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

317

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

Initial responses for same initial state


5

Amplitude

PSfrag replacements

3
2
1
0
1
0

Time

The systems have the same transfer function, but different responses
to initial states since the states have different interpretations.
Time (Dynamic) Response
Develop more insight into the system response by looking at
time-domain solution for x(t).
Scalar case first, then many states and MIMO.
Homogeneous Part (scalar)
x(t)
= ax(t),

x(0).

Take Laplace. X (s) = (s a)1 x(0).

Inverse Laplace. x(t) = e at x(0).


Homogeneous Part (full solution)
x(t)
= Ax(t),

x(0).

Take Laplace. X (s) = (s I A)1 x(0).


x(t) =

[(s I A)1]x(0).

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

318

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

But,
(s I A)

A
I
A2
= + 2 + 3 +
s s
s

so,
1

A2 t 2 A3t 3
[(s I A) ] = I + At +
+
+
2!
3!
1

= e At

matrix exponential

x(t) = e At x(0).
e At : Transition matrix or state-transition matrix.
Matrix exponential

e( A+B)t = e At e Bt

expm.m

iff

AB = B A. (i.e., not in general).

Will say more about e At when we discuss the structure of A.

Computation of e At =

[(s I A)1] straightforward for 2 2.

EXAMPLE :

x = Ax,
(s I A)1 =

"

"

s
2
s+3
2

A=
1
s +3
1
s

"

0 1
2 3

#1
#

1
(s + 2)(s + 1)

2
1

s+1 s+2
=
2
2
[7mm]
+
s+1 s+2

1
1

s+1 s+2
1
2
+
s+1 s+2

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

319

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

At

"

2et e2t
2et + 2e2t

et e2t
et + 2e2t

1(t)

This is the best way to find e At if A 2 2.


Forced Solution (scalar)
x(t)
= ax(t) + bu(t), x(0)
Z t
x(t) = eat x(0) +
ea(t ) bu( ) d .
{z
}
|0
convolution

Where did this come from?


1. x(t)
ax(t) = bu(t)

d
2. eat [x(t)
ax(t)] = [eat x(t)] = eat bu(t).
dt
Z t
Z t
d a
ea bu( ) d.
[e x( )] d = eat x(t) x(0) =
3.
0
0 dt
Forced Solution (full solution)
Now, let x(t)
= Ax(t) + Bu(t),

Follow three steps above to get


At

x(t) = e x(0) +

x
Z

t
0

n1

m1

e A(t ) Bu( ) d

Clearly, if y(t) = C x(t) + Du(t),


Z t
At
A(t )
y(t) = Ce
x(0)
+
Ce
Bu( ) d + |Du(t)
.
{z
}
| {z }
|0
{z
} feedthrough
initial resp.
convolution

More on the Matrix Exponential

Have seen the key role of e At in the solution for x(t). Impacts the
system response, but need more insight.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

320

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

Consider what happens if the matrix A is diagonalizable, that is, there


exists a matrix T such that T 1 AT = 3 =diagonal.

Then, e At = T e3t T 1, and


e3t

1 t

0
e 2 t
...
e n t

Much simpler form for the exponential, but how to find T, 3?


Eigenvalues/eigenvectors.
Dynamic Interpretation
1
Write T 1 AT = 3 as T
A =3T 1 with

T 1

w1T
T
w2
=
...

wnT

i.e., rows of T 1.

wiT A = i wiT , so wi is a left eigenvector of A and note that wiT v j = i, j .


How does this help?
e At = T e3t T 1

h
i

= v1 v2 . . . v n

n
X

1 t

0
e

2 t

...
0

e n t

ei t vi wiT .

i =1

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

w1T
w2T

...

wn T

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

321

Very simple form.

Can be used to develop intuition about dynamic response e i t .


Recall,

x(t)
= Ax(t)

x(t) = e At x(0)

= T e3t T 1 x(0)
n
X
ei t vi (wiT x(0)).
=
i =1

Solution (trajectory) can be expressed as a linear combination of


system modes: vi ei t .
Left eigenvectors decompose initial state x(0) into modal coordinates
wiT x(0).
ei t propagates mode forward in time. Stability?
vi corresponds to relative phasing of state contribution to the modal
response.
EXAMPLE :

with x(t)

Lets consider a specific system


x(t)
= Ax(t)
y(t) = C x(t)
161

, y(t) . (16-state, single output).

A lightly damped system.


Typical output to initial conditions.

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

322

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

Impulse Response
2
1.5

Amplitude

PSfrag replacements

0.5
0
0.5
1
1.5
2

50

100

150

Time (sec.)

200

250

300

Output waveform is very complicated. Looks almost random.


However, such a solution can be decomposed into much simpler
modal components.

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

323

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME


1
0
1

50

100

150

200

250

50

100

150

200

250

50

100

150

200

250

50

100

150

200

250

50

100

150

200

250

50

100

150

200

250

50

100

150

200

250

50

100

150

200

250

0.5
0
0.5
1
0
1
0.5
0
0.5

0.5
0
0.5

1
0
1
1
0
1

acements

Response
Amplitude
me (sec.)

0.5
0
0.5

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

324

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

How? Diagonalize A to form an equivalent system.


Assume A is diagonalizable by T .
Define new coordinates by x(t) = T x(t)
so
= T 1 Ax(t) = T 1 AT x(t)
x(t)
= 3x(t).

In new coordinate system, system is diagonal (decoupled).

1/s

x1

Trajectories consist of n independent


modes; that is,
xi (t) = ei t xi (0)
hence the name, modal form.

Can write

ements
1/s

xn

x(t) = e At x(0)

= T e3t T 1 x(0)
n
X
=
ei t vi (wiT x(0)).
i =1

Thus, trajectory can be expressed as linear


combination of modes.
Interpretation.
Left eigenvectors decompose initial state x(0) into modal components
wiT x(0).
ei t term propagates ith mode forward t seconds.
Reconstruct state as linear combination of right eigenvectors.
The Jordan Form
What if A cannot be diagonalized?
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

325

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

Any matrix A nn can be put in Jordan canonical form by a


similarity transformation; i.e.,

J1
0

...
T 1 AT = J =

where

Ji =

Jq

i 1
0

.
.

.
i

... 1

0
i

n i n i

is called a Jordan block of size n i with eigenvalue i (so n =

q
X

n i ).

i =1

System is decomposed into independent Jordan block systems


g replacements
xi (t) = Ji xi (t)
u(t)

1
s

1
s

1
s

cn

y(t)

c2
c1

Jordan blocks are sometimes called Jordan chains (diagram shows


why).
What does this mean in the time domain?

1
s 1
0

.
.

.
s
1

(s I J ) =
. . . 1

0
s

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

326

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

(s )1 (s )2 (s )k

1
k+1

(s

(s

)
=
...

...

0
(s )1

= (s )1 I + (s )2 F1 + + (s )k Fk

where Fk is the matrix with ones on the kth upper diagonal.


Hence, the matrix exponential is

k1
1 t t /(k 1)!

k2

t
/(k

2)!
J t
t

e =e
...

...

= et (I + t F1 + + t k1/(k 1)!Fk ).

Thus, Jordan blocks yield repeated poles and terms of the form t p et
in e At .
Canonical Forms for MIMO Systems
Consider

Now,
and

G(s) = C(s I A)1 B + D


C[adj(s I A)]B
=
+ D.
det(s I A)
det(s I A) = s n + 1s n1 + 2s n2 + + n

C adj(s I A)B = N (s) = [N1s n1 + N2s n2 + + Nn ].

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

327

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

We can then write

1 I 2 I n1 I n I

I
0
0
0

x(t)
= 0
I
0
0 x(t) +

..

.
.
.
..
..
..

.
0
0
I
0
i
h
y(t) = N1 N2 Nn1 Nn x(t) + G()u(t),

I
0
0
...
0

u(t)

which is multivariable controller canonical form.

We can also write

1 I I

2 I 0

...
...
x(t)
=

n1 I 0
n I 0
h
y(t) = I 0 0

N1
0

N2
0

... x(t) +
...

I
Nn1
Nn
0
i
0 x(t) + G()u(t),
0
I
...
0
0

u(t)

which is multivariable observer canonical form.

We generally find that SISO canonical forms are more useful than
MIMO canonical forms.
Zeros of a State-Space System
Seen eigenvalues of A, or poles of the entries of G(s) are the poles.
Zeros of transfer function?
What is a zero? Two types of zero in a MIMO system: Blocking zeros
and transmission zeros.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

328

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

K
, K 6= 0 a constant vector.
s
If is a blocking zero, e t will not appear at the output for any K , x(0).
(Not considered a very useful definition for MIMO zero).

Consider a system with input U (s) =

If is a transmission zero, e t will not appear at the output for some


specific K , x(0).
{blocking zeros} {transmission zeros}.

PSfrag replacements
IDEA : (but not the entire story) Consider a two-input two-output system
u 1 (t)

y1(t)

G 11 (s)
G 21 (s)
G 12 (s)

u 2 (t)

y2(t)

G 22 (s)

A blocking zero will show up in G 11(s), G 12(s), G 21(s) and G 22(s). No


matter what K is, if U (s) = K /(s ), and is a blocking zero, the
output does not have an e t term.
A transmission zero may not show up as a zero in any of the
individual transfer functions, but will in combinations thereof (with
specific initial states).
To find transmission zeros, put in u(t) = u 0e zi t and you get a zero
output at frequency e zi t .
State space: Have input and state contributions (consider first the
SISO case)
u(t) = u 0e zi t ,

x(t) = x 0e zi t

...

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

y(t) = 0.

329

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

x(t)
= Ax(t) + Bu(t) z i e zi t x0 = Ax0e zi t + Bu 0e zi t

i x0
h
=0
zi I A B
u 0

y(t) = C x(t) + Du(t) C x 0e zi t + Du 0e zi t = 0

i x0
h
= 0.
C D
u 0

Put the two together

zi I A
C

B
D

x0
u 0

= 0.

Zero at frequency z i if there exists a nontrivial solution of

zi I A B
= 0.

det
C
D

Recall

G(s) =

det

sI A
C

B
D

det(s I A)

Ahah! (The u 0 before gave us the correct sign in G(s)).


In the MIMO case, with n state variables, p inputs and q outputs, a
transmission zero is any value z i for which

zi I A B
< n + min{ p, q}
rank
C
D

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

330

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSCONTINUOUS-TIME

EXAMPLE :

s(s + 1)
s+1

2
+2
G(s) = s + 1 (s +s2)(s
+ 1) .
0
s 2 + 2s + 2
We can find that G(s) has a blocking zero at s = 1 and has
transmision zeros at s = 0, s = 1 and s = 2.
s +1
s(s + 1)
U1(s) +
U2(s)
Y1(s) = 2
s +1
s +2
(s + 2)(s + 1)
Y2(s) = 2
U2(s).
s + 2s + 2
Let U = K /(s )


s + 1 s(s + 2)k1 + (s 2 + 1)k2
Y1(s) =
s
(s 2 + 1)(s + 2)


s + 1 (s + 2)k2
.
Y2(s) =
s s 2 + 2s + 2

For all k1, k2, s = 1 is a zero. Therefore, both blocking and


transmission.

For k2 = 0, k , s = 0 is a zero. Therefore, transmision.


Not so obvious, but s = 2 is also a zero. Therefore, transmission. [In
a MIMO system, we can have a zero and pole at the same frequency!]
Recall from before,

zi I A
C

B
D

x0
u 0

= 0.

gives the initial state x 0 and K = u 0 if zi is a transmission zero.

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

(mostly blank)

(mostly blank)

41

ECE4520/5520: Multivariable Control Systems I.

placements

STATE-SPACE DYNAMIC SYSTEMS


(DISCRETE-TIME)

Digital Control Systems


Computer control requires analog-to-digital (A2D) and
digital-to-analog (D2A) conversion.
r (t)

e(t)

A2D

e[k]

D(z)

u[k]

D2A
zoh

w(t)
u(t)

G(s)

y(t)
v(t)

The z-Transform
Just as Laplace transforms are used for continuous-time systems, the
z-transform is used with discrete-time systems.
DEFINITION :

X
X (z) =
x[k]z k .
k=0

There is an extensive literature on the z-transform and we wont


discuss any of it here.
Some simple correspondences:
z = esT

maps continuous-time pole locations s to discrete-time pole locations


z, where T is the sampling period.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

42

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSDISCRETE-TIME

Pole locations in the z-plane correspond to discrete-time impulse


responses as:

 (z)

rag replacements

Discrete Impulse Responses versus Pole Locations

Conversion between s-plane and z-plane:


j

Sfrag replacements

PSfrag replacements

j
T

j
j
T
T
s-plane

z-plane

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

(z)

43

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSDISCRETE-TIME

Desirable locations for poles in the z-plane:

Good

cements

PSfrag replacements
Damping

Good

Good

PSfrag replacements
Frequency n

Settling Time

Discrete-Time State-Space Form


Discrete-time systems can also be represented in state-space form.
x[k + 1] = Ad x[k] + Bd u[k]
y[k] = C d x[k] + Dd u[k]
The subscript d is used here to emphasize that, in general, the A,
B, C and D matrices are DIFFERENT for discrete-time and
continuous-time systems, even if the underlying plant is the same.
I will usually drop the d and expect you to interpret the system from
its context.
Formulating from Transfer Functions
Dynamics in discrete-time are represented as difference equations.
e.g.,
y[k +3]+a1 y[k +2]+a2 y[k +1]+a3 y[k] = b1u[k +2]+b2u[k +1]+b3u[k].
This particular example has transfer function
Y (z)
b1 z 2 + b 2 z + b 3
=
G(z) = 3
.
z + a1 z 2 + a2 z + a3 U (z)
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

44

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSDISCRETE-TIME

This transfer function may be converted to state-space in a very


similar way to continuous-time systems.
First, consider the poles:
1
V (z)
=
z 3 + a1 z 2 + a2 z + a3
U (z)
v[k + 3] + a1v[k + 2] + a2v[k + 1] + a3v[k] = u[k].
G p (z) =

Choose current and advanced versions of v[k] as state.


h
iT
x[k] = v[k + 2] v[k + 1] v[k] .
Then

v[k + 3]

x[k + 1] = v[k + 2]
v[k + 1]


a1 a2 a3
v[k + 2]
1


= 1
0
0 v[k + 1] + 0 u[k].
0
1
0
v[k]
0

We now add zeros.

b1 z 2 + b 2 z + b 3
Y (z)
G(z) = 3
=
.
z + a1 z 2 + a2 z + a3 U (z)

Break up transfer function into two parts.


poles of

V (z)
contains all of the
U (z)

Y (z)
. Then,
U (z)


Y (z) = b1 z 2 + b2 z + b3 V (z).

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

45

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSDISCRETE-TIME

Or,

y[k] = b1v[k + 2] + b2v[k] + b3v[k].

Then

a1 a2 a3
1
v[k + 2]


x[k + 1] = 1
0
0 v[k + 1] + 0 u[k]
0
0
1
0
v[k]
h
i
h i
y[k] = b1 b2 b3 x[k] + 0 u[k].

Many discrete-time transfer functions are not strictly proper. Solve by


polynomial long division, and setting D equal to the quotient.
Matlab command tf2ss(num,den) converts a transfer function
form to state-space form.
As with continuous-time systems, we have a lot of freedom when
making our state-space models (i.e., in choosing the components of
x[k]).
Canonical Forms
In discrete-time we have the same canonical forms: Controller,
observer, controllability, observability, modal and Jordan.
They are derived in the same way, as demonstrated above for the
controller form.
A block diagram for controller form is:

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

46

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSDISCRETE-TIME

b1
y[k]

b2
u[k]

z 1

x 1c

z 1

x 2c

z 1

x 3c

b3

a1
a2
a3

State-Space to Transfer Function


Start with the state equations
x[k + 1] = Ax[k] + Bu[k]
y[k] = C x[k] + Du[k].
z-transform

or

z X (z) zx[0] = AX (z) + BU (z)


Y (z) = C X (z) + DU (z)

(z I A)X (z) = BU (z) + zx[0]

X (z) = (z I A)1 BU (z) + (z I A)1 zx[0]

and
1
Y (z) = [C(z
I A)
A)1 zx[0]} .
|
{z B + D]} U (z) + C(z
| I {z
transfer function of system

So,

response to initial conditions

Y (z)
= C(z I A)1 B + D
U (z)

Same form as for continuous-time systems.

Poles of system are roots of det[z I A] = 0.

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

47

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSDISCRETE-TIME

Transformation
State-space representations are not unique. Selection of state x are
quite arbitrary.
Analyze the transformation of
x[k + 1] = Ax[k] + Bu[k]
y[k] = C x[k] + Du[k]
Let x[k] = T w[k], where T is an invertible (similarity) transformation
matrix.
1
w[k + 1] = |T 1
{zAT} w[k] + |T {z B} u[k]
A

D u[k]
y[k] = |{z}
C T w[k] + |{z}
C

so, w[k + 1] = Aw[k]


+ Bu[k]

y[k] = Cw[k]
+ Du[k].
Same as for continuous-time.
Time (Dynamic) Response
Homogeneous Part
First, consider the scalar case
x[k + 1] = ax[k],

x[0].

Take z-transform. X (z) = (z a)1 zx[0].

Inverse z-transform. x[k] = a k x[0].

Similarly, the full solution (vector case) is


x[k] = Ak x[0].
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

48

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSDISCRETE-TIME

Aside: Nilpotent Systems


A is nilpotent if some power of n exists such that
An = 0.
A does not just decay to zero, it is exactly zero!

This might be a desirable control design! (Why?) You might imagine


that all the eigenvalues of A must be zero for this to work.
Forced Solution
The full solution is:
x[k] = Ak x[0] +

k1
X
j =0

Ak1 j Bu[ j] .
{z

convolution

This can be proved by induction from the equation


x[k + 1] = Ax[k] + Bu[k], x[0]
Clearly, if y[k] = C x[k] + Du[k],
k1
X
k
C Ak1 j Bu[k] + |Du[k]
y[k] = C
| A{zx[0]} +
{z } .
j =0
initial resp.
|
{z
} feedthrough
convolution

PSfrag replacements

Converting Plant Dynamics to Discrete Time.

Combine the dynamics of the zero-order hold and the plant.


u[k]

ZOH

u(t)

A, B, C, D

The continuous-time dynamics of the plant are:


x(t)
= Ax(t) + Bu(t)
y(t) = C x(t) + Du(t).
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

y(t)

49

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSDISCRETE-TIME

Evaluate x(t) at discrete times. Recall


Z t
e A(t ) Bu( ) d
x(t) =
0

x((k + 1)T ) =

(k+1)T

e A((k+1)T ) Bu( ) d

With malice aforethought, break up the integral into two pieces. The
first piece will become A d times x(kT ). The second part will become
Bd times u(kT ).
Z kT
Z (k+1)T
=
e A((k+1)T ) Bu( ) d +
e A((k+1)T ) Bu( ) d
0

=e

kT

kT

0
AT

AT A(kT )

x(kT ) +

Bu( ) d +

(k+1)T
kT

(k+1)T

kT

e A((k+1)T ) Bu( ) d

e A((k+1)T ) Bu( ) d.

In the remaining integral, note that u( ) is constant from kT to


(k + 1)T, and equal to u(kT ); let = (k + 1)T ; = (k + 1)T ;
d = d .

Z T
e A B d u(kT )
x((k + 1)T ) = e AT x(kT ) +
0

or, x[k + 1] = e AT x[k] +

Z

e A B d u[k].

So, we have a discrete-time state-space representation from the


continuous-time representation.
x[k + 1] = Ad x[k] + Bd u[k]
Z T
e A B d .
where Ad = e AT , Bd =
0

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSDISCRETE-TIME

Similarly,

410

y[k] = C x[k] + Du[k].

That is, C d = C; Dd = D.
Calculating Ad , Bd , Cd and Dd
Cd and Dd require no calculation since C d = C and Dd = D.

Ad is calculated via the MATRIX exponential A d = e AT . This is


different from taking the exponential of each element in AT .

If Matlab is handy, you can type in


Ad=expm(A*T)
If Matlab is not handy, then you need to work a little harder. Recall
from the previous set of notes that e At = 1[(s I A)1]. So,

AT
1
1
e =
[(s I A) ] t=T ,

which is probably the easiest way to work it out by hand. Or,


A2 T 2 A3 T 3
AT
e = I + AT +
+
+
2!
3!
which is a convergent series so may be approximated with only a few
terms.

Now we focus on computing Bd . Recall that


Z T
Bd =
e A B d
0


2

I + A + A2 + . . . B d
=
2
0


3
T2
T
= I T + A + A2 + . . . B
2!
3!
Z

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, STATE-SPACE DYNAMIC SYSTEMSDISCRETE-TIME

= A1(e AT I )B

= A1(Ad I )B.

So, calculating Bd is easy once we have already calculated A d .


An alternative method, with better numerical properties:
AT
A2 T 2
9=I+
+
+ ...
2!
3!




AT
AT
AT
AT
I+
I+
...
I+
2
3
N 1
N
Ad = I + AT 9
Bd = T 9 B,
for fairly large N .
Also, in Matlab,
[Ad,Bd]=c2d(A,B,T)

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

411

(mostly blank)

51

ECE4520/5520: Multivariable Control Systems I.

OBSERVABILITY AND CONTROLLABILITY


Overview
We describe dual ideas called observability and controllability.

Both have precise (binary) mathematical descriptions, but it doesnt


always pay to listen to the math!
We develop some other techniques to help quantify the concepts.
Continuous-Time Observability: Where am I?
If a system is observable, we can determine the initial condition of the
state vector x(0) via processing the output of the system y(t).
Since we can simulate the system if we know x(0) and u(t) this also
implies that we can determine x(t) for t 0.
Consider the LCCODE
...
...
y (t) + a1 y (t) + a2 y (t) + a3 y(t) = b0u(t) + b1u(t)
+ b2u(t)
+ b3u(t).

If we have a realization of this system in state-space form


x(t)
= Ax(t) + Bu(t)
y(t) = C x(t) + Du(t),
and we have initial conditions y(0), y (0), y (0), how do we find x(0)?
y(0) = C x(0) + Du(0)

y (0) = C(Ax(0)
+ Bu(0)}) + D u(0)
{z
|
x(0)

= C Ax(0) + C Bu(0) + D u(0)

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

52

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

y (0) = C A2 x(0) + C ABu(0) + C B u(0)

+ D u(0).

In general,

y (k)(0) = C Ak x(0) + C Ak1 Bu(0) + + C Bu (k1)(0) + Du (k)(0),

or,

where

y(0)
D
0 0
u(0)
C

y (0) = C A x(0) + C B D 0 u(0)


,
y (0)
C AB C B D
u(0)

C A2
|  {z }
|
{z
}


Thus, if

(C, A)

is a (block) Toeplitz matrix.

(C, A) is invertible, then

u(0)
y(0)

1
x(0) =

y (0) u(0)
.

y (0)
u(0)

We say that {C, A} is an observable pair if

is nonsingular.

If
is nonsingular, then we can determine/estimate the
initial state of the system x(0) using only u(t) and y(t) (and therefore,
we can estimate x(t) for all t 0).

CONCLUSION :

EXAMPLE :

Observability canonical form:

1
0
1
0

x(t)
= 0
0
1 x(t) + 2 u(t)
3
a3 a2 a1
h
i
y(t) = 1 0 0 x(t).

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

53

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

Then

C
1 0 0

= C A = 0 . . . 0 = In .
C A2
0 0 1

ments

This is why it is called observability form!


EXAMPLE :

Two unobservable networks

1
u
1

1H x 1

1
y

2

1

1

1H x 1
x2
1F

2

1

x2
1F

(Redrawn)

In the first, if u(t) = 0 then y(t) = 0 t. Cannot determine x(0).


In the second, if u(t) = 0, x 1(0) 6= 0 and x 2(t) = 0, then y(t) = 0 and
we cannot determine x 1(0). [circuit redrawn for u(t) = 0].
Observers
An observer is a device which has as inputs u(t) and y(t)the input
and output of a linear system. The output of the observer is the
(estimated) state of the linear system.
PSfrag replacements

The observer observes


the internal state x
(estimated as x)
from
external signals u and y.

u(t)

A, B, C, D

y(t)

x
Observer

Note that our equations yield an observer:


c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

54

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

A, B, C, D

u(t)

y(t)

x
u
1
u
s
u
s2

1
s
s2
x

Later, well design more practical observers which dont use


differentiators.
Continuous-Time Controllability: Can I get there from here?
Can we generate an input u(t) to quickly set an initial condition?
x(t)
= Ax(t) + Bu(t)
y(t) = C x(t) + Du(t).
If u(t) = (t) and x(0) = 0, then
X (s) = (s I A)1 BU (s) = (s I A)1 B.
So, via the Laplace initial-value theorem,
x(0+) = lim s X (s)
s

= lim s(s I A)1 B


s

= lim

= B.

A
I
s

1

Thus, an impulse input brings the state to B from 0.

What if u(t) = (k)(t)?

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

55

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

Then


1
1
A
X (s) = (s I A)1 Bs k =
Bs k
I
s
s


A A2
1
I + + 2 + . . . Bs k
=
s
s
s
{z
}
|
holds for large s

Ak+1 B
Ak B
+
+
= Bs
+ ABs
+ A Bs
+ +
s
s2
The first terms are impulsive: they have zero value for t > 0.
k1

k2

Thus,
x(0+) = lim s
s

= Ak B.

k3

Ak+1 B
Ak B
+
+
s
s2

So, if u(t) = (k) (t) then x(0+) = Ak B.


Now, consider the input
+ gn (n)(t).
u(t) = g1(t) + g2(t)
Since x(0) = 0, x(0+) = g1 B + g2 AB + + gn An1 B, or

g1



x(0+) = B AB An1 B g2
|
{z
}

g3
where  is called the controllability matrix.

If  is nonsingular, then there is an impulsive input u such


that x(0+) is any desired vector if x(0) = 0.

CONCLUSION :

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

56

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

In fact, we may use


u(t) =
where

g1

g2 = B
g3

n1
X

gi (i )(t)

i =0

AB

n1

1

xd

where x d is the desired x(0+) vector.


If  is nonsingular, we say {A, B} is a controllable pair and the system
is controllable.
EXAMPLE :

Controllability canonical form:

0 0 a3
1

x(t)
= 1 0 a2 x(t) + 0 u(t)
y(t) =

Then

0 1 a1

1 2 3 x(t).

= [B

AB

An1 B]

1 0 0

= 0 . . . 0 = In .
0 0 1

This is why it is called controllability form!

If a system is controllable, we can instantaneously move the state


from any known state to any other state, using impulse-like inputs.

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

57

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

Later, well see that smooth inputs can effect the state transfer (not
instantaneously, though!).

Sfrag replacements
T
T
T
T
DUALITY: If {A, B, C, D} controllable {A , C , B , D } is observable.
EXAMPLE :

Two uncontrollable networks.


1
u
1

1

1F

x 1 1F

x2

u
1

1

1

In the first one, if x(0) = 0 then x(t) = 0 t. Cannot influence state!


In the second one, if x 1(0) = x 2(0) then x 1(t) = x 2(t) t. Cannot
independently alter state.
Diagonal Systems, Controllability and Observability
Recall the diagonal form

1
1
0

2
x(t) + 2 u(t)
x(t)
=

...

...

n
0
n
h
i
h i
y(t) = 1 2 n x(t) + 0 u(t).

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

58

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

x 1(t)

1
..
.

u(t)

y(t)

x n (t)

When controllable?
When
observable?

C
1
2
n

C A 1 1

2 2
n n

= ... =

...

n1
n1
n1
n1
CA
1 1 2 2 n n

1
0
1
1 1

2
2
n
.

...
...

n1
n1
n1
n
1
2
{z
}
|
Vandermonde matrix

Singular?

cements

det{
} = (1 n ) det{ } = (1 n )

CONCLUSION :

u(t)

Y
i< j

( j i ).

Observable i 6= j , i 6= j and i 6= 0 i = 1, , n.
1
s+1
1
s+1

x 1(t)

y(t)
x 2(t)

u(t)

1
s+2

1
s+1

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

x 1(t)

y(t)
x 2(t)

59

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

If 1 = 2 then not observable. Can only observe the sum x 1 + x2.


If k = 0 then cannot observe mode k.

cements
What about controllability? Use duality and switch s and s.
CONCLUSION :

u(t)

Controllable i 6= j , i 6= j and i 6= 0 i = 1, , n.
1
s+1
1
s+1

x 1(t)

y(t)

u(t)

x 2(t)

1
s+2

1
s+1

x 1(t)

y(t)
x 2(t)

If 1 = 2 then not controllable. Can only control the sum x 1 + x2.


If k = 0 then cannot control mode k.
Discrete-Time Controllability
Similar concept for discrete-time.
Consider the problem of driving a system to some arbitrary state x[n]
x[k + 1] = Ax[k] + Bu[k]
x[1] = Ax[0] + Bu[0]
x[2] = A [Ax[0] + Bu[0]] + Bu[1]


x[3] = A A2 x[0] + ABu[0] + Bu[1] + Bu[2]
...

h
i u[n 1]

...
x[n] = An x[0] + B AB A2 B An1 B

{z
}
|

u[0]
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

510

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

Which leads to

u[n 1]

...

=
u[0]


x[n] An x[0] .

If  has no inverse (det( ) = 0,  is not full-rank) then these control


signals dont exist. In that case, the input is only partially effective in
influencing the state.
If  is full-rank, then the input can move the system to any arbitrary
state for any x[0].
NOTE :

State transition is not instantaneous. Takes N time steps.

In continuous-time, we used the input


+ a signal we could only approximate in
u(t) = g0(t) + g1(t)
practice. Here, the input is a perfectly good input signal.

REMARK :

Discrete-Time Reachability
In the literature, there are three different controllability definitions:
1. Transfer any state to any other state.
2. Transfer any state to the zero state, called controllability to the
origin.
3. Transfer the zero state to any state, called controllability from the
origin, or reachability.
In continuous time, because e At is nonsingular, the three definitions
are equivalent.
In discrete time, if A is nonsingular, the three definitions are also
equivalent.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

511

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

However, if A is singular, (1) and (3) are equivalent but not (2) and (3).
EXAMPLE :

0
0 1 0

x[k + 1] = 0 0 1 x[k] + 0 u[k].


0
0 0 0

Its controllability matrix has rank 0 and the equation is not controllable
in (1) or (3).
However, Ak = 0 for k 3 so x[3] = A 3 x[0] = 0 for any initial state
x[0] and any input u[k].
Thus, the system is controllable to the origin but not controllable from
the origin or reachable.
Definition (1) encompasses the other two definitions, so is used as
our definition of controllable.
Discrete-Time Observability
Can we reconstruct the state x[0] from the output y[k]?
y[k] = C x[k] + Du[k]
y[0] = C x[0] + Du[0]
y[1] = C [Ax[0] + Bu[0]] + Du[1]


y[2] = C A2 x[0] + ABu[0] + Bu[1] + Du[2]
...


y[n 1] = C An1 x[0] + An2 Bu[0] + + Bu[n 1] + Du[n 2]

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

512

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

In vector form, we can write

C
D
0
y[0]

y[1] C A

x[0] + C B D
=

...
...

C AB C B

...
...
n1
CA
y[n 1]
|
|
{z
{z

 }

So,

x[0] =

...

0
0
0
D

u[0]

u[1]

...

u[n 1]

y[0]
u[0]

...
...
1

.
y[n 1]
u[n 1]

If
is full-rank or nonsingular, x[0] may be reconstructed with any
y[k], u[k]. We say that {C, A} form an observable pair.
Do more measurements of y[n], y[n + 1], . . . help in reconstructing
x[0]? No! (Caley-Hamilton theorem). So, if the original state is not
observable with n measurements, then it will not be observable with
more than n measurements either.
Since we know u[k] and the dynamics of the system, if the system is
observable we can determine the entire state sequence x[k], k 0
once we determine x[0]
n1
X
x[n] = An x[0] +
An1i Bu[k]
i =0

= An

y[0]
u[0]
u[n 1]

...
...
...
1

+ 
.

y[n 1]
u[n 1]
u[0]

A perfectly good observer (no differentiators...)

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

513

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

Disclaimer: Is any of this of practical importance?


The singularity of  has only one bit of information: Is the realization
mathematically controllable or not. This may not tell the whole story.
"
#
"
#
" #
1
1
1
0
1

=
A=
,
B=
,
1
1+
0
1+
1

{A, B} are a controllable pair, but barely.


EXAMPLE :

Controlling an airplane. (Ideas only, no details). System state


h
iT
4
,
= Pitch, = Roll.
x =

Control with elevator?

x =

"

F 
0 F

where e is the elevator angle.


elevators.

x =

0

1

x +
0 e

0

is singular cant influence roll with

Control with ailerons?


"

F 
0 F

0

0

x +
0 a

1

where a is the aileron angle.  is nonsingular! So, we can control


both pitch AND roll with ailerons.
THIS IS NONSENSE! Physically think of the system! Do you want to
roll plane over every time you need to pitch down?
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

514

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

Physical intuition can be better than finding  . Other tools can help. . .

Continuous-Time Controllability Gramian


If a continuous-time system is controllable, then
Z t
T
e A B B T e A d
Wc (t) =
0

is nonsingular for t > 0.

SIGNIFICANCE :

Consider
x(t1) = e

At1

x(0) +

t1
0

e A(t1 ) Bu( ) d.

We claim that for any x(0) = x 0 and any x(t1) = x1 the input


T
u(t) = B T e A (t1t) Wc1(t1) e At1 x0 x1

will transfer x 0 to x1 at time t1. (Proof by direct substitution).

Therefore, we can compute the input u(t) required to transfer the


state of the system from one state to another over an arbitrary interval
of time. The solution is also the minimum-energy solution.
EXAMPLE :

Consider the system in diagonal form


#
"
#
"
0.5
0.5 0
u(t).
x(t)
=
x(t) +
0 1
1

The controllability matrix is:

"

0.5 0.25
1
1

which has rank 2, so the system is controllable. Consider the input


required to move the system state from x(0) = [10 1] T to zero in
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

515

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

two seconds.
Z
Wc (2) =

"

"

0.2162 0.3167
0.3167 0.4908

and
h

0.5

u(t) = 0.5 1

"

#"
#

0.5
1

0.5 1

"

0.5

#!

0.5(2t)

0
e(2t)

Wc (2)1

"

= 58.82e0.5t + 27.96et .

e
0

0
e2

#"

10
1

If a continuous-time system is controllable, and if it is also stable, then


Z
T
e A B B T e A d
Wc =
0

can be found by solving for the unique (positive-definite) solution to


the (Lyapunov) equation
AWc + Wc A T = B B T .
Wc is called the controllability Gramian.

Wc measures the minimum energy required to reach a desired point


x1 starting at x(0) = 0 (with no limit on t)


Z t

ku( )k2 d x(0) = 0, x(t) = x 1 = x1T Wc1 x1.
min
0

If A is stable, Wc1 > 0 which implies we cant get anywhere for free.

If A is unstable, then Wc1 can have a nonzero nullspace Wc1 z = 0 for


some z 6= 0 which means that we can get to z using us with energy
as small as you like! (u just gives a little kick to the state; the
instability carries it out to z efficiently).

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

Wc may be a better indicator of controllability than  .

516

Continuous-Time Observability Gramian


If a system is observable,
Wo(t) =
is nonsingular for t > 0.
SIGNIFICANCE :

t
0

e A C T Ce A d

We can prove that


x(0) =

Wo1(t1)

where
y (t) = y(t) C

t1

e A t C T y (t) dt

e A(t ) Bu( ) d Du(t).

Therefore, we can determine the initial state x(0) given a finite


observation period (and not use differentiators!).
If a continuous-time system is observable, and if it is also stable, then
Z
T
e A C T Ce A d
Wo =
0

can be found as the unique (positive-definite) solution to the


(Lyapunov) equation
A T Wo + Wo A = C T C.
Wo is called the observability Gramian.

If measurement (sensor) noise is IID (0, I ) then Wo is a measure


of error covariance in measuring x(0) from u and y over longer and
longer periods

2

lim E x(0)

x(0) = x(0)T W 1 x(0).


t

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

517

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

If A is stable, then Wo1 > 0 and we cant estimate the initial state
perfectly even with an infinite number of measurements u(t) and y(t)
for t 0 (since memory of x(0) fades).
If A is not stable then Wo1 can have a nonzero nullspace Wo1 x(0) = 0
which means that the covariance goes to zero as t .
Wo may be a better indicator of observability than

Discrete-Time Controllability Gramian


In discrete-time, if a system is controllable, then
Wdc [n 1] =
is nonsingular. In particular,
Wdc =

n1
X

Am B B T (A T )m

m=0

Am B B T (A T )m

m=0

is called the discrete-time controllability Gramian and is the unique


positive-definite solution to the Lyapunov equation
Wdc AWdc A T = B B T .
As with continuous-time, Wdc measures the minimum energy required
to reach a desired point x 1 starting at x[0] = 0 (with no limit on m)
( m
)

X

2
x[0] = 0, x[m] = x 1 = x T W 1 x1.
ku[k]k
min
1
dc

k=0

To show that the above is indeed a Lyapunov equation, (or may


be converted to a standard Lyapunov equation for solving it), let

ASIDE :

Ac = (A + I )1(A I )
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

518

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

and

Cc = 2(A + I )1 B B T (A T + I )1.

Expand the continuous-time Lyapunov equation using these


definitions
Ac Wdc + Wdc AcT = Cc
(A + I )1(A I )Wdc + Wdc (A T I )(A T + I )1 =

2(A + I )1 B B T (A T + I )1.

Replace (A I ) with (A + I 2I ) and replace (A T I ) with


(A T + I 2I ). Multiply through and simplify to get

Wdc (A + I )1 Wdc Wdc (A T + I )1 = (A + I )1 B B T (A T + I )1.

Left multiply by (A + I ) and right multiply by (A T + I ) and simplify


Wdc AWdc A T = B B T .
So, with the above definitions of A c and Cc based on A and B, we can
create a continuous-time Lyapunov equation and solve it for W dc .
Discrete-Time Observability Gramian
In discrete-time, if a system is observable, then
n1
X
Wdo [n 1] =
(A T )m CC T Am
m=0

is nonsingular. In particular,

Wdo =

(A T )m CC T Am

m=0

is called the discrete-time observability Gramian and is the unique


positive-definite solution to the Lyapunov equation
Wdo A T Wdo A = C T C.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

519

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

As with continuous-time, if measurement (sensor) noise is IID


 (0, I ) then W is a measure of error covariance in measuring x[0]
do
from u and y over longer and longer periods

2

lim E x[0]
x[0] = x[0]T W 1 x[0].
do

Transformation to Controllability Form


Given a system

x(t)
= Ax(t) + Bu(t)
y(t) = C x(t) + Du(t),

can we find a transformation (similarity) matrix T to transform this


system into controllability form

1
0 0 an

... a

1
0
n1

x
(t)
+
xco (t) =
... co
... u(t)
... . . .

0 1 a1
0
i
h
y(t) = 1 2 n xco (t) + Du(t)

Note that  co = I so it is controllable. Thus, our original system must


be controllable.
The transformation is accomplished via
x = T xco .
Thus

T 1 AT = Aco ,
C T = Cco ,

Lets find T explicitly; let

T 1 B = Bco
D = Dco .

T = [t1, . . . , tn ].

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

520

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

Note that B = T Bco or

1

0

B=T
... = t1.

0

So, t1 = B. From AT = T Aco we have

So

0 0 an

... a
1
n1
A[t1, . . . , tn ] = [t1, . . . , tn ]
...
... . . .

0 1 a1
"

[At1, . . . Atn ] = t2, . . . , tn ,


By induction,

n
X

ak tnk+1 .

k=1

At1 = t2 = AB

At2 = t3 = A2 B, and so forth . . .


so

T = [B AB An1 B] =  .

A system {A, B, C, D} can be transformed to controllability


canonical form if and only if it is controllable, in which case the
change of coordinates is
x =  xco.

CONCLUSION :

1
If x old = T xnew then T =  old  new
. That is, to convert
between any two realizations, T is a combination of the controllability

EXTENSION I :

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

521

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

matrices of the
 two different realizations.

n1
 new = Bnew Anew Bnew Anew Bnew
 1

1
1
1 n1
1
= T Bold T Aold T T Bold T Aold T T Bold
= T 1 

old ,

or

T =

If x old = T xnew then T =


similar way.

EXTENSION II :

1
old new .

1
old

new .

This can be shown in a

Canonical Decompositions
What happens if {A, B} not controllable or if {A, C} not observable?
Is part of the system controllable?
Is part of the system observable?
Given a system with {A, B} not controllable, x(t)
= Ax(t) + Bu(t),
y(t) = C x(t), lets try to transform to controllability canonical form.
Let t1 = B, t2 = AB, . . . tr = Ar 1 B, and suppose t1, t2, . . . , tr are
independent but
tr +1 = Ar B = r t1 1tr
for some constants 1, . . . , r .
Then rank( ) = r since the vectors Ar B, Ar +1 B, . . . , An1 B can all be
expressed as a linear combination of t1, t2, . . . , tr .
Let sr +1, . . . , sn be your favorite vectors for which
h
i
T = t1 tr sr +1 sn

is invertible. Change coordinates via x old = T xnew.

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

522

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

We get (in the new coordinate system)


"
# "
#
#"
# "
xc (t)
Ac A12
xc (t)
Bc
=
u(t)
+
0 Ac
xc (t)
0
xc (t)
"
#
i x (t)
h
c
+ Du(t).
y(t) = Cc Cc
xc (t)

Ac is a right-companion matrix and Bc is of the


controllability-canonical form.

We see that the uncontrollable modes x c are completely decoupled


from u(t).
EXAMPLE :

Consider

s1
s 1
1
=
= 2
.
s + 1 (s + 1)(s 1) s 1

In observer-canonical form,
"
#
"
#
0 1
1
x(t)
=
x(t) +
u(t)
1
1 0
h
i
y(t) = 1 0 x(t).

So,

t1 =

"

1
1

and

t2 = At1 =

"

1
1

= t1.

So, rank( ) = 1. Let s1 = [1 0]T . The converted state-space form is


"
#
" #
1
= 1 1 x(t)
x(t)
+
u(t)
0 1
0
h
i
y(t) = 1 1 x(t).

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

523

The dual form separates out the unobservable states


#
#"
# "
"
# "
xo(t)
Bo
xo (t)
Ao 0
u(t)
+
=
Bo
xo (t)
xo (t)
A21 Ao
"
#
h
i x (t)
o
y(t) = Co 0
+ Du(t).
xo (t)

Note: No path from x o to y!

Popov-Belevitch-Hautus (PBH) Tests for Controllability/Observability


PBH EIGENVECTOR TEST: {C, A} is an unobservable pair iff a non-zero
eigenvector v of A satisfies Cv = 0. (i.e., C and v are perpendicular).
Suppose Av = v and Cv = 0 for v 6= 0. Then
C Av = Cv = 0 and so forth up to C A n1v = v = 0. So,
v = 0 and
since v 6= 0 this means that
is not full rank and {C, A} is not
observable.

PROOF :

Now suppose that


is not full rank ({C, A} unobservable). Lets
extract the unobservable part. That is, find T such that
"
#
Ao 0
T 1 AT =
A21 Ao
i
h
C T = Co 0 ,

where Ao is size r where r = rank(


) and therefore A o is size n r.
Let v2 6= 0 be an eigenvector of A o . Then
" #
#" #
" # "
0
0
Ao 0
0
=
=
T 1 AT
v2
v2
A21 Ao
v2

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

524

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

so we have
Az = z

where

z=T

"

0
v2

Now, we just need to show that C z = 0 (note: z 6= 0).


" #
" #
h
i
0
0
C z = |{z}
CT
= Co 0
=0
v
v
and we are done.

{A, B} is uncontrollable iff there is a left eigenvector w T of A such


that w T B = 0.

DUAL :

In modal coordinates, homogeneous response is


n
X
and
y(t) = C x(t).
ei t vi (wiT x(0))
x(t) =

INTERPRETATION :

Or,

i =1

y(t) =

n
X

ei t Cvi (wiT x(0)).

i =1

If {C, A} is unobservable, then it has an unobservable mode, where


Avi = i vi and Cvi = 0.
If {A, B} is uncontrollable, then it has an uncontrollable mode, namely
the coefficients of the state along that mode is independent of the
input u(t).
The coefficients of x in the mode associated with are w T x.
d T
(w x) = w T (Ax + Bu) = (w T x)
dt
or
w T x(t) = et (w T x(0))
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

525

regardless of the input u(t)!


PBH RANK TESTS : The following two tests are often easier to perform
h
i
1. {A, B} controllable iff rank s I A B = n for all s  .
h
i
If rank s I A B = n for all s  then there can be no nonzero
i
h
i h
vector v such that v s I A B = v(s I A) v B = 0.
Consequently, there is no nonzero vector v such that vs = v A and
v B = 0. By the PBH eigenvector test, the system will therefore be
controllable
If the system is controllable, then there is no nonzero vector v such
that hvs = v A andiv B = 0 by the PBH eigenvector test. Therefore,
rank s I A B = n for all s  .
"
#
C
= n for all s  . Proof similar.
2. {C, A} observable iff rank
sI A
h
i
COMMENTS : rank s I A B = n for all s not eigenvalues of A, so the
i
h
test is really {A, B} controllable iff rank i I A B = n for i ,
i = 1, . . . , n the eigenvalues of A. (Dual argument for observability).
h
i
If s I A B drops rank at s = then there is an uncontrollable
mode
" with #exponent (frequency) .
C
If
drops rank at s = then there is an unobservable
sI A
mode with exponent (frequency) .
Summary
Therefore, we can label individual modes of a system as either
controllable or not, or observable or not.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

526

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

The overall picture is:

PSfrag replacements

u(t)

Controllable
and
observable

Controllable
but not
observable

y(t)

Observable
but not
controllable

Neither
observable nor
controllable

Some other definitions:

STABILIZABLE :
DETECTABLE :

A system whose unstable modes are controllable.

A system whose unstable modes are observable.

Minimal RealizationsWhy is the system not Control/Observ-able?


1
we could use
To realize H (s) =
s+1
1. x(t)
= x(t) + u(t), y(t) = x(t). This gives
A = [1], B = [1], C = [1], D = [0].
C adj(s I A)B
1
=
.
det(s I A)
s+1
This realization is both controllable and observable.
1 s1
s1
2. Observer realization of
= 2
. This gives
s
+
1
s

1
s

1
"
#
"
#
h
i
0 1
1
A=
,B=
, C = 1 0 , D = [0].
1 0
1
s1
1
C adj(s I A)B
= 2
=
.
det(s I A)
s 1 s +1

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

"

"

527

1 0
. Observable.
0 1
"
#
h
i
1 1
= B AB =
. Not controllable.
1 1

C
CA

s + 10
1 s + 10
= 2
.
3. Controller realization of
s
+
1
s
+
10
s
+
11s
+
10
"
#
" #
h
i
11 10
1
A=
,B=
, C = 1 10 , D = [0].
1
0
0

C adj(s I A)B
s + 10
1
= 2
=
.
det(s I A)
s + 11s + 10 s + 1
#
"
h
i
1 11
= B AB =
. Controllable.
0 1
# "
#
"
C
1 10
=
=
. Not observable.
CA
1 10

Non-minimal realizations of transfer functions will either be


non-controllable, non-observable, or both.

TREND :

Four equivalent statements:


I:
II :

There exist common roots of C adj(s I A)B and det(s I A).

There exist eigenvalues of A which are not poles of G(s), counting


multiplicities.

III :

The system is either non-observable or non-controllable.

IV :

There exist extra (unnecessary) statesnon minimal.

We say a system is minimal if no system with the same


transfer function has fewer states.

DEFINITION:

PROOF : I II
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

I H II :

528

The transfer function


G(s) = C(s I A)1 B + D
C adj(s I A)B + D det(s I A)
=
.
det(s I A)

If there are common roots in C adj(s I A)B and det(s I A) they will
cancel out of the transfer function. But, eigenvalues of A are
det(s I A) = 0, so poles of G(s) will not contain all eigenvalues of A.
Eigenvalues of A are det(s I A) = 0. Poles of G(s), from above,
are det(s I A) = 0 unless canceled. The only way to cancel a pole is
to have common root in C adj(s I A)B.

II H I :

{A, B, C, D} is minimal iff C adj(s I A)B and det(s I A)


are coprime (have no common roots).

PROOF : I IV
I H IV :

Suppose C adj(s I A)B and det(s I A) have common roots.

Cancel to get

C adj(s I A)B
br (s)
+D=
det(s I A)
ar (s)
where br (s) and ar (s) are coprime (r means reduced).
G(s) =

Because of cancellation, K = deg(ar ) < deg(det(s I A)) = n.

Consider controller canonical form realization of br (s)/ar (s), for


example.
It has k states, but same transfer function as {A, B, C, D},
contradicting that {A, B, C, D} minimal.
If there are extra states (non-minimal) then k > n (k =# states,
n =# poles in G(s)).
G(s) = C(s I A)1 B + D

IV H I :

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

529

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

C adj(s I A)B + D det(s I A)


det(s I A)

G(s) has k poles unless some cancel with C adj(s I A)B. Therefore,
if k > n, C adj(s I A)B and det(s I A) are not coprime.
PROOF : III IV Controllable and observable iff minimal.
PSfrag replacements
III H IV : Uncontrollable or unobservable H not minimal.

Perform Kalman decomposition to split system into co, co,


co
and co
parts.
u(t)

Bc

x 1(t)

Cc

Ac
y(t)

A12
Uncontrollable
part of
realization

x 2(t)
Cc

Ac

"

"

h
i
A
A
B
c
12
c
A =
, B =
, C = Cc Cc , D = [0].
0 Ac
0
B,
C.
Therefore
Same transfer function using A c , Bc, Cc as A,
uncontrollable and/or unobservable means not minimal.
IV H III :

Non-minimal means uncontrollable or unobservable.

Suppose {A, B, C, D} is non-minimal.


1
I A)
1 B + D
=
C(s
C(s
I

A)
B
+
D
{z
} |
{z
}
|
n states

r <n states

C B C A B C A 2 B
C B C AB C A2 B
+ 2 +
+ 2 +
+ =
+

s
s
s3
s
s
s3

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

530

ECE4520/5520, OBSERVABILITY AND CONTROLLABILITY

Consider



CA h
i

2
n1
2

=
C A B AB A B A B

...

C An1

C B
C A n1 B
CB
C An1 B

...
...
...
...
=

=
C A n1 B C A 2n2 B
C An1 B C A2n2 B
x

n1

C A
B A B A B r

y
.

.
n
=
.
0

C A n2

nr
0

y C A n1
y

nr

Therefore det(
) det( ) = 0, so the system is either unobservable,
or uncontrollable, or both.
The four equivalences have now been proven.
All minimal realization of G(s) are related by a unique change of
coordinates T . Can you prove this?

FACT :

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

(mostly blank)

(mostly blank)

61

ECE4520/5520: Multivariable Control Systems I.

CONTROLLER/ ESTIMATOR DESIGN


State Feedback Control
System dynamics

x(t)
= Ax(t) + Bu(t)
y(t) = C x(t) + Du(t).

System poles given by eigenvalues of A.


Want to use input u(t) to change the dynamics.
Will assume the form of LINEAR STATE FEEDBACK.
 1n .
u(t)
=
r(t)

K
x(t),
K

PSfrag replacements
r (t)

u(t)

A, B, C, D

y(t)

Full state feedback with gain vector K .


Substitute:

x(t)
= Ax(t) + B(r(t) K x(t))
= (A B K )x(t) + Br(t)
y(t) = C x(t) + Du(t).

For now we talk about regulation (r(t) = 0) and generalize later.


For now we consider SISO systems, and generalize later.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

62

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Design K so that A CL = A B K has some nice properties.


For example,

OBJECTIVE :

A unstable, ACL stable.

Put two poles at 2 j. (Pole placement).


There are n parameters in the gain vector K and n eigenvalues of A.
So, what can we achieve?
EXAMPLE :

x(t)
=

"

1 1
1 2

x(t) +

"

1
0

u(t).

det(s I A) = (s 1)(s 2) 1 = s 2 3s + 1.
(Note. The original system is unstable).
Let

u(t) = k1 k2 x(t) = K x(t)


# " #
"
i
1 h
1 1

ACL = A B K =
k1 k2
0
1 2
#
"
1 k1 1 k2
.
=
1
2

So, det(s I A) = s 2 + (k1 3)s + (1 2k1 + k2).

By choosing k1 and k2, we can put eig(A CL ) ANYWHERE in the


complex plane (in complex-conjugate pairs, that is!)
Poles at 5, 6?

Compare desired closed-loop characteristic equation


(s + 5)(s + 6) = s 2 + 11s + 30

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

63

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

with

det(s I A) = s 2 + (k1 3)s + (1 2k1 + k2).

So,

k1 3 = 11,
1 2k1 + k2 = 30,

or, k1 = 14
or, k2 = 57.

K = [14 57].
So, with the n parameters in K , can we always relocate all n eig(A CL )?
Most physical systems, qualified yes.
Mathematically, EMPHATIC NO!
Boils down to whether or not the system is controllable. That is, if
every internal system mode can be excited by inputs, either directly or
indirectly.
EXAMPLE :

x(t)
=

ACL

"

1 1
0 2

x(t) +

"

1
0

u(t)

u(t) = K x(t).
"
#
1 k1 1 k2
= A BK =
0
2

det(s I ACL ) = (s 1 + k1)(s 2).


Feedback of the state cannot move the pole at s = 2. System cannot
be stabilized via state feedback. (Note that the system is already in
Kalman form, and the uncontrollable mode has eigenvalue 2).
If {A, B} is controllable, then we can arbitrarily assign the
eigenvalues of AC L using state feedback. More precisely, given any

FACT:

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

64

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

polynomial s n + 1s n1 + + n there exists a (unique for SISO)



K mn (m = number of system inputs = 1 for SISO) such that
det(s I A + B K ) = s n + 1s n1 + + n .
If {A, B} is controllable, we can find a state feedback for
which the closed-loop realization is stable.

COROLLARY:

Suppose {A, B} is controllable. Put {A, B, C, D} in


controller canonical form.

PROOF OF FACT:

That is, find T such that

a1 a2 an

1
0
1

T AT = Ac =
...
...

0 1
0

and

where det(s I A) = s n + a1s n1 + + an .

1

0
1

T B = Bc =
...

0

Lets apply state-feedback K c to the controller realization.


i
h
Note, K c = k1 kn , so

k1 k2 kn

0
0

Bc K c =
... .
...

Useful because characteristic equation obvious.

(a1 + k1) (a2 + k2) (an + kn )

1
0

,
ACL = Ac Bc K c =
...

...

1
0

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

65

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

frag replacements

still in controller form!


b1
y(t)

b2
u(t)

a1

x 1c

x 2c

x 3c

b3

a2
a3
k1
k2
k3

Thus, after state feedback with K c the characteristic equation is


det(s I Ac + Bc K c ) = s n + (a1 + k1)s n1 + + (an + kn ).
If we set k1 = 1 a1, . . . , kn = n an then we get the desired
characteristic polynomial.
Now, we transform back to the original realization

det(s I Ac + Bc K c ) = det(s I T Ac T 1 + T Bc K c T 1)
= det(s I A + B K c T 1).

So, if we use state feedback


K = K c T 1
h
i
= (1 a1) (n an ) T 1

we will have the desired characteristic polynomial.


c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

66

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

One remaining question: What is T ? We know T =  1


c and

2
1 a1 an1
1 a1 a1 a2

.
.

.
0 1
0 1
a1
0
1

= .
 c = ...
.

.
.
.
.
. a1
.
1

0
1
0
1
|
{z
}
upper 1 Toeplitz

So,

1 a1 an1

h
i 0 1
an2
K = (1 a1) (n an )
...
. . . ...

0
1

This is called the Bass-Gura formula for K .

Ackermanns Formula
We need to know ai to use the Bass-Gura formula. Ackermanns
method may require less work.
Consider (a system already in controller canonical form, for now),
(Ac ) = Anc + a1 An1
+ + an I = 0
c
by Cayley-Hamilton.
Also,

d (Ac ) = Anc + 1 An1


+ + n I
c

= (1 a1)An1
+ + (n an )I.
c

For the controller form,


h
i
h
i
k
0 1 Ac = 0 |{z}
1 0
nk

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

67

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

so that
h

0 1 d (Ac ) =

Therefore,

(1 a1) (2 a2) (n an )

K = K c T 1
h
= 0
h
= 0
h
= 0
h
= 0

= K c.

1 d (T 1 AT )T 1
i
1 T 1d (A)
i
1  c  1d (A)
i
1  1d (A).

Revisit previous example. d (s) = s 2 + 11s + 30.


"
#
 = 10 11
"
# ("
#"
#
"
#
"
#)
h
i 1 1
1 1
1 1
1 1
1 0
K = 0 1
+ 11
+ 30
0 1
1 2
1 2
1 2
0 1
("
# "
#)
i
h
2 3
41 11
= 0 1
+
3 5
11 52
"
#
h
i 43 14
= 0 1
14 57
h
i
= 14 57 .
same as before.
acker.m

Very easy in Matlab, but numerical issues.

place.m

Use this instead, unless you have repeated roots.

polyvalm.m

To compute d (A).

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

68

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Simulating State Feedback in Simulink


The following block diagram may be used to simulate a
state-feedback control system in Simulink.
K
r
1

xdot

D
1
s

y
1

C
K
A
K

Note: All (square) gain blocks


are MATRIX GAIN blocks
from the Math Library.

Some Comments
The eigenvalues associated with uncontrollable modes are fixed
(dont change) under state feedback, but those associated with
controllable modes can be arbitrarily assigned.

FACT:

FACT:

State feedback does not change zeros of a realization.

Drastic changes in characteristic polynomial requires large gains


K (high control effort).

FACT:

State feedback can result in unobservable modes (pole-zero


cancellations).

FACT:

Reference Input
So far, we have looked at how to pick K to get homogeneous
dynamics that we want.
eig(ACL) fast/slow/real poles ...
How does this improve our ability to track a reference?
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

69

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Started with u(t) = r(t) K x(t).


Want y(t) r(t) for good tracking.
Y (s)
Frequency domain, want
1. Usually only get this performance
R(s)
at low frequencies.
Problem is that u(t) = r(t) K x(t) is simple, but it gives steady-state
errors.
EXAMPLE :

"

"

h
i
1 1
1
A=
,
B=
,
K = 14 57 .
1 2
0
h
i
Y (s)
= C(s I A + B K )1 B.
Let C = 1 0 . Then
R(s)
"
#1 " #
h
i
Y (s)
s + 13 56
1
= 1 0
R(s)
1 s 2
0
"
#
s 2 56
" #
h
i
1 s + 13
s2
1
= 1 0 2
= 2
.
s + 11s 26 + 56 0
s + 11s + 30
2
6= 1 !
30

Final value theorem for step input, y(t)


Step Response

0.06

Amplitude

0.04

PSfrag replacements

0.02
0
0.02
0.04
0.06
0.08

10

Time (sec.)
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

610

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

A constant output yss requires constant state x ss and


constant input u ss . We can change the tracking problem to a
regulation problem around u(t) = u ss and x(t) = x ss .
(u(t) u ss ) = K (x(t) x ss ).

OBSERVATION :

u ss and xss related to rss . Let

u ss = |{z}
Nu rss
11

xss = |{z}
N x rss .
n1

How to find Nu and N x ? Use equations of motion.


x(t)
= Ax(t) + Bu(t)
y(t) = C x(t) + Du(t).
At steady state,

x(t)
= 0 = Ax ss + Bu ss
y(t) = rss = C xss + Du ss .

Two equations and two unknowns.


"
#" #
Nx
A B
C D

Nu

"

0
I

In steady-state we had
(u(t) u ss ) = K (x(t) x ss )
which is achieved by the control signal
u(t) = Nu r(t) K (x(t) N x r(t))
= K x(t) + (Nu + K N x )r(t)

= K x(t) + Nr(t).
N computed without knowing r(t). It works for any r(t).
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

611

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

In our example we can find that


" #
Nx
Nu

Nu + K N x = 15.

= 1/2 .
1/2

New equations:
x(t)
= Ax(t) + B(Nu + K N x )r(t) B K x(t)

= (A B K )x(t) + B Nr(t)
y(t) = C x(t).

Therefore,

Y (s)
R(s)

new

Y (s)
R(s)

15s + 30
N = 2
s + 11s + 30
old

which has zero steady-state error to a unit-step.


r (t)

u(t)

Nu

A, B, C, D

y(t)

x
K

Sfrag replacements
Nx

r (t)

u(t)

A, B, C, D
x

Simulate (either method)


c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

y(t)

612

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Step Response

1
0.8

Amplitude

0.6

PSfrag replacements

0.4
0.2
0
0.2
0.4
0.6
0.8

10

Time (sec.)

Pole Placement
Classical question: Where do we place the closed-loop poles?
Dominant second-order behavior, just as before.
Assume dominant behavior given by roots of
n
o
p
2
2
2
s + 2 n s + n s = n jn 1

Put other poles so that the time response is much faster than this
dominant behavior.
Place them so that they are sufficiently damped.
Real part < 4 n .

Keep frequency same as open loop.


Be very careful about moving poles too far. Takes a lot of control
effort.
Can also choose closed-loop poles to mimic a system that has
performance that you like. Set closed-loop poles equal to this
prototype system.
Scaled to give settling time of 1 sec. or bandwidth of = 1 rad/sec.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

613

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Bessel Prototype Systems


Step Response: Constant ts

ments

0.8

0.8

0.6
0.4

PSfrag replacements

0.2
0
0

10

12

14

Amplitude

Amplitude

Step Response: Constant Bandwidth

0.6
0.4
0.2
0
0

16

0.5

1.5

2.5

Time (sec.)

Time (sec.)

ITAE Prototype Systems

ments

Step Response: Constant ts

0.8

0.8

Amplitude

Amplitude

Step Response: Constant Bandwidth

0.6
0.4

PSfrag replacements

0.2
0
0

10

12

14

16

0.6
0.4
0.2
0
0

Time (sec.)
PROCEDURE :

0.5

1.5

2.5

Time (sec.)

For nth-order systemdesired bandwidth.

1. Determine desired bandwidth o .


2. Find the nth-order poles from the table of constant bandwidth, and
multiply pole locations by o .
3. Use Acker/place to locate poles. Simulate and check control effort.
PROCEDURE :

For nth-order systemdesired settling time.

1. Determine desired settling time ts .


2. Find the nth-order poles from the table of constant settling time,
and divide pole locations by ts .
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

1:

2:
3:
4:
5:
6:

1.000

0.707 0.707 j

0.708

0.376 1.292 j

0.576 0.534 j

0.310 0.962 j

0.626 0.414 j
0.581 0.783 j

3:
4:
0.896

0.735 0.287 j

Bessel pole locations for o = 1 rad/sec.


1:

2:
3:
4:
5:
6:

1.000

0.866 0.500 j

0.942

0.591 0.907 j

0.852 0.443 j

0.539 0.962 j

5:
6:

0.905 0.271 j

1:

0.800 0.562 j

3:
4:
0.926

0.909 0.186 j

4.620

4.660 4.660 j
4.350 8.918 j

5.913

4.236 12.617 j

6.254 4.139 j

2.990 12.192 j

5.602 7.554 j

3.948 13.553 j

6.040 5.601 j

9.394

7.089 2.772 j

Bessel pole locations for ts = 1 sec.


2:

0.746 0.711 j
0.657 0.830 j

1:

2:

0.521 1.068 j
0.424 1.263 j

ITAE pole locations for ts = 1 sec.

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ITAE pole locations for o = 1 rad/sec.

5:
6:

4.620

4.053 2.340 j
3.967 3.785 j

5.009

4.110 6.314 j

5.927 3.081 j

4.016 5.072 j
4.217 7.530 j

5.528 1.655 j
6.261 4.402 j

6.448

7.121 1.454 j

614

615

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

3. Use Acker/place to locate poles. Simulate and check control effort.


Bessel model has no overshoot, but is slow compared with ITAE.
NOT a good idea for flexible systems. Why?
ITAE: , Bessel,

Magnitude

20

PSfrag replacements

40
60
80
100
120
140
1
10

10

10

, (rads/sec.)
EXAMPLE :

G(s) =

1
s(s + 1)(s + 4)

want ts = 2, and 3rd-order Bessel.

A= 1
0
h
C= 0 0

4 0

0 0 ,
1 0
i
1 .

1

B=0
0

Open- and Closed-Loop Step Responses


5.009
s1 =
= 2.505
2
3.967 3.784 j
s2,3 =
2
PSfrag
replacements
= 1.984 1.892
j
0.5

0.45

d (s) = s 3 + 6.473s 2 +
17.456s + 18.827.
h
i
K = 1.473 13.456 18.827 .

Amplitude

0.4

0.35

0.3

0.25

0.2

0.15

0.1
0.05
0

0.5

1.5

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

2.5

Time

3.5

4.5

616

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Poles are now in the right place, but poor steady-state error to step
input. Use reference-tracking method from before

5 4 0 1
0

1 0 0 0 Nx 0

0 1 0 0 = 0 .

Nu
0 0 1 0
1
or

0

Nx 0
=
1

Nu
0

or

N = 18.827.

This fixes our step response, but we cannot use a similar strategy to
improve ramp responses etc. Recall from ECE4510 that we need
integrators in the open-loop system to increase system type.
Integral Control for Continuous-Time Systems
In many practical designs, integral control is needed to counteract
disturbances, plant variations, or other noises in the system.
Up until now, we have not seen a design that has integral action. In
fact state-space designs will NOT produce integral action unless we
make special steps to include it!
How do we introduce integral control? We augment our system with
one or more integrators:

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

placements
617

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

u(t)

KI
s

r (t)

A, B, C, D

y(t)

x
K

Nx

In other words, include an integral state equation of


x I (t) = r(t) y(t)
= r(t) C x(t).
and THEN design K I and K such that the system had good
closed-loop pole locations.
Note that we can include the integral state into our normal
state-space form by augmenting the system dynamics
"

x I (t)
x(t)

"

0 C
0

#"

x I (t)
x(t)

"

0
B

u(t) +

"

I
0

r(t)

y(t) = C x(t) + Du(t).

Note that the new A matrix has an open-loop eigenvalue at the


origin. This corresponds to increasing the system type, and integrates
out steady-state error.
The control law is,

u(t) = K I K

"
#
i x (t)
I
x(t)

+ K N x r(t).

So, we now have the task of choosing n + n I closed-loop poles.


c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

618

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Our previous example becomes:


x I (t)
x I (t)
0 0 0 1

0 5 4 0

+
x(t)


0 1 0 0 x(t)
0 0 1 0

x I (t)

h
i

y(t) = 0 0 0 1
x(t) .



1
u(t) + 0 r(t)


0
0
0
0

We choose the fourth-order Bessel poles to give ts = 2.0 seconds


s1,2 = (4.016 5.072 j)/2
and
s3,4 = (5.528 1.655 j)/2.
h
i
We find that K = 87.102 4.544 36.988 91.273 .
Closed-Loop Step Response

Amplitude

PSfrag replacements

0.8
0.6
0.4
0.2
0
0

Time

State Feedback for Discrete-Time Systems


The result is identical.

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

619

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Characteristic frequencies of controllable modes are freely


assignable by state feedback; characteristic frequencies of
uncontrollable modes do not change with state feedback.
There is a special characteristic polynomial for discrete-time systems
(z) = z n ;
that is, all eigenvalues are zero.
What does this mean? By Cayley-Hamilton,
(A B K )n = 0.
Hence, with no input, the state reaches 0 in at most n steps since
x[n] = (A B K )n x[0] = 0
no matter what x[0] is.
This is called dead-beat control and A B K is called a Nilpotent
matrix.
EXAMPLE :

Consider
x[k + 1] =

"

1 0
2 2

x[k] +

"

1
0

u[k].

This system is controllable, so we can find a K = [k 1


#
"
z 1 + k1 k2
= z2
det
2
z2
(z 1 + k1)(z 2) + 2k2 =

and therefore k1 = 3 and k2 = 2.


"
#
2 2
A BK =
2 2
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

k2] such that

620

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

(A B K )2 =

"

0 0
0 0

as claimed.

The open-loop system is unstable, but the closed-loop system is not


only stable but effects of initial conditions completely disappear after
two time stepsthey do not merely decay.
This is a common design procedure, but beware of high control effort.
Reference Input
Tracking a reference input with a discrete-time system requires the
same method as for continuous-time systems.
Integral Control

placements

Again, we augment our system with a (discrete-time) integrator:


r [k]

KIz
z1

u[k]

A, B, C, D

y[k]

x
K

Nx

In discrete time, we include an integral state equation of


x I [k + 1] = x I [k] + r[k] y[k]
= x I [k] + r[k] C x[k].
We can include the integral state into our normal state-space form by
augmenting the system dynamics

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

621

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

"

x I [k + 1]
x[k + 1]

"

1 C
0

#"

x I [k]
x[k]

"

0
B

u[k] +

"

I
0

r[k]

y[k] = C x[k] + Du[k].

Notice the new open-loop eigenvalue of A at z = 1.


The control law is,

u[k] = K I K

#
"
i x I [k]
x[k]

+ K N x r[k].

So, we now have the task of choosing n + n I closed-loop poles.


Discrete-Time Prototype Pole Placement
Where do we place the closed-loop poles?

Can choose closed-loop poles to mimic a system that has


performance that you like. Set closed-loop poles equal to this
prototype system.
Can be done using the ITAE and Bessel (continuous-time) tables.

PROCEDURE :

For nth-order systemdesired bandwidth.

1. Determine desired bandwidth o .


2. Find the nth-order poles from the table of constant bandwidth, and
multiply pole locations by o .
3. Convert s-plane locations to z-plane locations using z = e sT .

4. Use Acker/place to locate poles. Simulate and check control effort.


PROCEDURE :

For nth-order systemdesired settling time.

1. Determine desired settling time ts .


c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

622

2. Find the nth-order poles from the table of constant settling time,
and divide pole locations by ts .
3. Convert s-plane locations to z-plane locations using z = e sT .

4. Use Acker/place to locate poles. Simulate and check control effort.


Estimator Design
In the design of state-feedback control, we assumed that all states of
our plant were measured.
This is often IMPOSSIBLE to do or TOO EXPENSIVE.
So, we now investigate methods of reconstructing the plant state
vector given only limited measurements.
Open-Loop Estimator
Since we know the system dynamics, simulate the system in
real-time.

IDEA :

If x(t) is the true state, x(t)


is called the state estimate.
We want x(t)
= x(t), or at least x(t)
x(t). How do we build this?
To start, use our knowledge of the plant
x(t)
= Ax(t) + Bu(t).
Let our state estimate be
= A x(t)
x(t)
+ Bu(t).

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

PSfrag replacements
623

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

w(t)
u(t)

A, B

A, B

x(t)

x(t)

y(t)

y (t)

This is called an open-loop estimator.


Some troubling issues:
We need our model to be very accurate!
What do we use for x(0)?

What does disturbance do?

Lets analyze our open-loop estimator by examining the


state-estimate error
x(t)
= x(t) x(t).

We want x(t)
= 0.
For our estimator,

= x(t)
x(t)
x(t)
= Ax(t) + Bu(t) A x(t)
Bu(t)
= A x(t).

So,

x(t)
= e At x(0).

Hence, x(t)
x(t) if A is stable!
(This is not too impressive though since x(t)
x(t) because both
x(t)
and x(t) go to zero).
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

624

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

We need to improve our estimator:


Speed up convergence.
Reduce sensitivity to model uncertainties.
Counteract disturbances.
Have convergence even when A is unstable.
Key Point: Use feedback of measured output.

Sfrag replacements

Closed-Loop Estimator
w(t)
u(t)

A, B

A, B

x(t)

x(t)

y(t)

y (t)

y (t) = y(t) y (t)

This is called a closed-loop estimator.


Note: If L = 0 we have an open-loop estimator.

= A x(t)
x(t)
+ Bu(t) + L y(t) C x(t)

.
Lets look at the error.

= x(t)
x(t)
x(t)

= Ax(t) + Bu(t) A x(t)


Bu(t) L y(t) C x(t)


= A x(t)
L C x(t) C x(t)

= (A LC) x(t),

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

625

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

or, x(t)
x(t) if A LC is stable, for any value of x(0)

and any u(t),


whether or not A is stable.
In fact, we can look at the dynamics of the state estimate error to
quantitatively evaluate how x(t)
x(t).
= (A LC) x(t)
x(t)

has dynamics related to the roots of the characteristic equation


ob (s) = det (s I A + LC) = 0.
So, for our estimator, we specify the convergence rate of x(t)
x(t)
by choosing desired pole locations: Choose L such that
ob,des (s) = det (s I A + LC) .
This is called the observer gain problem.
In Simulink, the following diagram implements a closed-loop
estimator. The output is xhat.
K
y
2

L
u
1

xhat

yhat

C
K
A

Note: All (square) gain blocks


are MATRIX GAIN blocks
from the Math Library.

K
D

The Observer Gain Design Problem


We would like a method for computing the observer gain vector L
given a set of desired closed-loop observer gains ob,des (s).
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

626

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Bass-Gura Inspired Method


We want a specific characteristic equation for A LC.
Suppose {C, A} is observable. Put {A, B, C, D} in observer canonical
form.
That is, find T such that

a1 1
0

. . . ...

a
2
1

T AT = Ao = .
.
1
.

an 0 0

and C T = C o =

1 0 0

where det(s I A) = s n + a1s n1 + + an .

Apply feedback L o to the observer realization to end up with


Ao L o C o .
h
iT
Note, L o = l1 ln , so

l1 0 0

l2

.
L oCo = .

..

ln 0 0
Useful because characteristic equation obvious.

(a1 + l1) 1

...
(a2 + l2)
Ao L o C o =
...

still in observer form!

0
...

,
1

(an + ln ) 0 0

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

627

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

After feedback with L o the characteristic equation is


det(s I Ao + L oCo ) = s n + (a1 + l1)s n1 + + (an + ln ).
If we set l1 = 1 a1, . . . , ln = n an then we get the desired
characteristic polynomial.
Now, we transform back to the original realization

det(s I Ao + L oCo ) = det(s I T Ao T 1 + T L oCo T 1)


= det(s I A + T L oC).

So, if we use feedback


L = T Lo
h
iT
= T (1 a1) (n an )

we will have the desired characteristic polynomial.

One remaining question: What is T ? We know T =

1
1
0 0

...
a1 1

o = ..
.
.
. 0
.

an1 a1 1

So,

L=

1
0 0

.
.

1
.
1 a1

...
... 0

an1 a1 1

This is the Bass-Gura formula for L.

(1 a1)

(2 a2)

.
...

(n an )

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

and

628

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

The Ackermann-Inspired Method


The observer-gain problem is dual to the controller-gain problem.
Replace a controller-design procedures inputs A A T , B C T ,
K L T . Design the controller. Then, L will be the observer gains.
h
i
Recall, K = 0 1  (A, B)1d (A).

By duality,

L=

hh

0 1

dT (A T )

i
T

(A, B) d (A)

A C

AA T
BC T

n1T

0
.
..

0

1

0
.
.
1 .
= d (A)
(C, A) .
0
1

CA
= d (A)
...

C An1

In Matlab,

iT

L=acker(A,C,poles);
L=place(A,C,poles);

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

iT

0
.
..

0

1

629

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Discrete-Time Prediction Estimator

Sfrag replacements

In discrete-time, we can do the same thing. The picture looks like


w[k]
u[k]

A, B

A, B

x[k]
x p [k]

Lp

y[k]

y [k]

y [k] = y[k] y [k]

We write the update equation for the closed-loop (prediction)


estimator as

x p [k + 1] = A x p [k] + Bu[k] + L p y[k] C x p [k] .

The prediction-estimation error can likewise be written as



x[k
+ 1] = A L p C x[k],

which has dynamics related to the roots of the characteristic equation



ob (z) = det z I A + L p C = 0.

For our prediction estimator, we specify the convergence rate of


x p [k] x[k] by choosing desired pole locations: Choose L p such that

ob,des (z) = det z I A + L p C .

EXAMPLE :

Let G(s) =

1
and measure y[k]. Let
s2
"
#
y[k]
x[k] =
,
y [k]

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

630

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

then
A=

"

1 T
0 1

C=

1 0 .

We desire x[k]

to decay with poles z p = 0.8 0.2 j, or

ob,des (z) = z 2 1.6z + 0.68.


"
#
z 1 + l1
T
det(z I A + L p C) = det
l2
z1

= z 2 + z(l1 2) + l2 T l1 + 1.

So,

l1 2 = 1.6
l2 T l1 + 1 = 0.68

or
Lp =

"

0.4
0.08/T

The estimator is

x p [k + 1] = A x p [k] + Bu[k] + L p y[k] C x p [k]

= A L p C x p [k] + Bu[k] + L p y[k],
or
"

2
T
0.6 T
0.4
y p [k + 1]
y p [k]

= 0.08
+
u[k] + 0.08 y[k].
2
y p [k + 1]
y p [k]
1
T
T
T
In general, we can arbitrarily select the prediction estimator poles iff
{C, A} is observable.
#

"

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

631

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

The observer-gain problem is dual to the controller-gain problem.


Replace a controller-design procedures inputs A A T , B C T ,
K L Tp . Design the controller. Then, L p will be the observer gains.
Compensator Design: Separation Principle
Now that we have a structure to estimate the state x(t), lets feed
back x(t)
to control the plant. That is,
u(t) = r(t) K x(t),

where K was designed assuming that u(t) = r(t) K x(t). Is this


going to work? How risky is it to interconnect two well-behaved,
stable systems? (Assume r(t) = 0 for now).

g replacements

w(t)
x(t)

= Ax(t) + Bu(t)

u(t)

y(t)

y(t) = C x(t) + Du(t)

r (t)

x(t)

x(t)
= A x(t)
+ Bu(t) +

L y(t) y (t)

D(s)

What is inside the dotted line is equivalent to our classical design


compensator:
PSfrag
replacements
w(t)
u(t)

G(s)

y(t)

r (t)
D(s)
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

632

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Where G(s) is described by


x(t)
= Ax(t) + Bu(t)
y(t) = C x(t) + Du(t),
and D(s) is described by

= A x(t)
x(t)
+ Bu(t) + L y(t) y (t)
u(t) = K x(t),

so that we have
x(t)
= Ax(t) B K x(t)


x(t)
= (A B K ) x(t)
+ L C x(t) + Du(t) C x(t)
Du(t)
= (A B K LC) x(t)
+ LC x(t).

Our combined closed-loop-system state equations are


"
#"
# "
#
x(t)

x(t)
A
B K
=
,
x(t)
x(t)

LC A B K LC

or, in terms of x(t),

= x(t)
x(t)
x(t)

= Ax(t) B K x(t)
LC x(t) (A B K LC) x(t)

= (A LC) x(t)

x(t)
= Ax(t) B K (x(t) x(t))

or,

"

x(t)

x(t)

"

A BK

BK

A LC

#"

x(t)
x(t)

Note, we have simply changed coordinates:


"
# "
#"
#
x(t)
x(t)
I
0
=
.
x(t)

x(t)

I I
|
{z
}
T

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

633

With this change of coordinates, we may easily find the closed-loop


poles of the combined state-feedback/estimator system: The
eigenvalues of a block-upper-triangular 2 2 matrix are the collection
of the eigenvalues of the two diagonal blocks.
Therefore, the 2n poles are the n eigenvalues of A B K combined
with the n eigenvalues of A LC.
BUT, we DESIGNED the n eigenvalues of A B K to give good
(stable) state-feedback performance, and we DESIGNED the n
eigenvalues of A LC to give good (stable) estimator performance.
Therefore, the closed-loop system is also stable.
This is such an astounding conclusion that it has been given a special
name: The Separation Principle.
It implies that we can design a compensator in two steps:
1. Design state-feedback assuming the state x(t) is available.
2. Design the estimator to estimate the state as x(t).

and, that u(t) = K x(t)


works!
The CompensatorContinuous-Time
What is D(s)? We know the closed-loop poles, which are the roots of
1 + D(s)G(s) = 0, and we know the plants open-loop poles. What are
the dynamics of D(s) itself?
Start with a state-space representation of D(s)
= (A B K LC)x(t)
x(t)
+ Ly(t) L Du(t)
u(t) = K x(t)

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

634

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

so that
D(s) =

U (s)
= K (s I A + B K + LC L D K )1 L .
Y (s)

The poles of D(s) are the roots of det(s I A + B K + LC L D K ) = 0.


These are neither the controller poles nor the estimator poles.
D(s) may be unstable even if the plant is stable and the closed-loop
system is stable.
The CompensatorDiscrete-Time
The results in discrete-time are essentially the same.
1
U (z)
L p.
= K z I A + B K + L p C L p D K
D(z) =
Y (z)

The poles of D(z) are the roots of



det z I A + B K + L p C L p D K = 0.

replacements

The compensator has the block-diagram:


y[k]

Lp

x p [k]

u[k]

A B K L pC
+L p D K

Note that the control signal u[k] only contains information about
y[k 1], y[k 2], . . ., not about y[k].
So, our compensator is not taking advantage of the most current
measurement. (More on this later).
for loop:
u=-K*xhatp;
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

635

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

% now, wait until sample time...


A2D(y);
D2A(u); % u depends on past y, not current.
xhatp=(A-B*K-Lp*C+Lp*D*K)*xhatp+Lp*y;
end loop.
1
; T = 1.4 seconds.
EXAMPLE : G(s) =
s2
Design a compensator such that response is dominated by the poles
z p = 0.8 0.25 j.
System description
"
#
1 1.4
,
A=
0 1

B=

"

1
1.4

C=

D = [0],

1 0 ,

2
Control design: Find
h K such that
i det (z I A + B K ) = z 1.6z + 0.7.
This leads to K = 0.05 0.25 .

Estimator design: Choose poles to be faster than control roots. Lets


radially project z p = 0.8 0.25 j toward the origin, or z pe = 0.8z p .

2
Find L p "
such that
det
z
I

A
+
L
C
=
z
1.28z + 0.45. This leads
p
#
0.72
to L p =
.
0.12
So, our compensator is
"
x p [k + 1] =

u[k] =
or,

0.23 1.16
0.19 0.65

x p [k] +

"

0.72
0.12

0.05 0.25 x p [k],

D(z) = 0.68

z 0.87
z 0.44 0.423 j

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

y[k]

636

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

= 0.68

z 0.87
.
z 2 0.88z + 0.374

Lets see the root locus of D(z)G(z). (Note, since D(z) has a negative
sign, must use 0 locus).
1
0.8

Imag Axis

0.6
0.4
0.2
0
0.2
0.4
0.6

PSfrag replacements

0.8
1
1

0.5

0.5

Real Axis

Using the state representation of the plant and our compensator, we


can simulate the closed-loop system to find x[k], x p [k], x[k],

and u[k].
EXAMPLE :

In Matlab,

Q=[A B*K; L*C AB*KL*C];


% get resp. of u, x, xhat to init. condition
[u,X]=dinitial(Q,zeros([4,1]),[0 0 -K],0,x0);
% get the estimate error.
xtilde=X(:,1:2)-X(:,3:4);

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

637

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

System and Estimator State Dynamics


y
1
y
y
0.5
y
0
PSfrag replacements

ements

0.8

0.6
0.4

0.2
0

0.5
1
0

Error Dynamics

x and x p

1.5

0.2
5

10

15

20

25

30

0.4
0

10

Time

15

20

25

30

Time

Current Estimator/ Compensator


Using the prediction estimator to build our compensator, we found
that the control effort u[k] did not utilize the most current
measurement y[k], only past values of y: y[k 1], y[k 1] . . .
za
EXAMPLE: Consider trying to design D(z) =
to control a system.
zb
(Either lead or lag).
It cannot be done using a prediction estimator structure since
U (z)
ba
G(z) =
=1+
.
Y (z)
zb
That is, u[k] = f (y[k], . . .).

To develop the current estimate x c [k], consider tuning-up your


prediction estimate x p [k] at time k with y[k].

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

638

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

xc [k], x p [x]
x p [k 1]

xc [k 2]
k2

xc [k 1]

k1

xc [k]
x p [k]
k

x p [k + 1]
xc [k + 1]

Tune up estimate
from y[k + 1]
Predict from value of
xc [k] and u[k]

k+1

x p [k] : Estimate just before measurement at k


xc [k] : Estimate just after measurement at k.

Implementation:
Time update: Predict new state from old state estimate and system
dynamics
x p [k] = A xc [k 1] + Bu[k 1].
Measurement update: Measure the output and use that to
update/correct the estimate

xc [k] = x p [k] + L c y[k] C x p [k] .

L c is called the current estimator gain.

Note: This works well for multi-rate systems and for systems where
samples are sometimes missed.
Questions: How does x c [k] relate to x p [k]? How are L c and L p
related?
What is the xc [k] recursion relation?

xc [k + 1] = x p [k + 1] + L c y[k + 1] C x p [k + 1]
{z
}
|
error

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

639

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

= (I L c C) x p [k + 1] + L c y[k + 1]

= (I L c C) A xc [k] + Bu[k] + L c y[k + 1]

= (A L c C A) xc [k] + (B L c C B) u[k] + L c y[k + 1].

So,

xc [k] = f (xc [k 1], u[k 1], y[k]).

What about the estimate error?


x[k
+ 1] = x[k + 1] xc [k + 1]

= x[k + 1] x p [k + 1] + L c C x[k + 1] L c C x p [k + 1]

= (I L c C) x[k + 1] x p [k + 1]

= (I L c C) Ax[k] + Bu[k] A xc [k] Bu[k]

= (A L c C |{z}
A )x[k].

new!

Therefore, the current estimator error has dynamics related to the


roots of
det (z I A + L c C A) = 0.
What about the x p recursion relation?
x p [k + 1] = A xc [k] + Bu[k]

= A x p [k] + L c y[k] C x p [k]

= A x p [k] + Bu[k] + AL c



+ Bu[k]

y[k] C x p [k] .

Compare this equation with the prediction estimate recursive


equation. You will notice that is the same except that
L p = AL c .
So, x p in the current-estimator equations is the same quantity x p in
the prediction-estimator equations.

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

640

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

This implies that if we define x = x x p (not x xc ), then



x[k
+ 1] = A L p C x[k]

= (A AL c C) x[k].

So, in summary
x = x x p

x = x xc

x[k
+ 1] = (A L c C A) x[k]

x[k
+ 1] = (A AL c C) x[k].

These estimate errors have the same poles. They represent the
replacements
dynamics of the block diagrams:
u[k]

x p [k]

y[k]

Prediction Estimator
u[k]

x p [k]

xc [k]

Lp

C
Lc

Current Estimator
Design of L c
1. Relate coefficients of
det (z I A + L c C A) = ob,des (z).

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

y[k]

641

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

2. Ackermanns formula for L c

0

0
1

L p = ob,des (z)

{C, A}
...

1

replace C C A to find L c

0
CA

0
C A2
1

{C A, A}
L c = ob,des (A)

... = ob,des (A) ...


1
C An
|  {z
[

Lc=acker(A,(C*A),poles);
Lc=place(A,(C*A),poles);
3. Find L p and then L c = A1 L p .
Compensator Design using the Current Estimator.
x[k + 1] = Ax[k] + Bu[k]
y[k] = C x[k].

Estimator equations
x p [k + 1] = A xc [k] + Bu[k]
Control


xc [k] = x p [k] + L c y[k] C x p [k] .
u[k] = K xc [k].

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

A]1

In Matlab,

Plant equations

0

0
.
...

1

642

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Therefore, we have
x[k + 1] = Ax[k] B K x c [k]
xc [k + 1] = (I L c C) A xc [k] + (I L c C) Bu[k] + L c y[k + 1]
= (I L c C) (A B K ) xc [k] + L c y[k + 1].

With y[k + 1] = C Ax[k] + C Bu[k], then


xc [k + 1] = (A B K L c C A) xc [k] + L c C Ax[k].
Our 2n-order, closed-loop system is
# "
#
"
#"
x[k + 1]
x[k]
A
B K
=
.
L cC A A B K L cC A
xc [k + 1]
xc [k]
Compare this to the prediction estimator feedback case:
Lc L p
CA

In terms of x[k
+ 1] = x[k + 1] x c [k + 1]
"
# "
#"
#
x[k]
x[k + 1]
A BK
BK
=
.
0
A L cC A
x[k
+ 1]
x[k]

As in the prediction estimator case, the closed-loop poles of our


compensated system are the eigenvalues of
"
#
A BK
BK
poles = eig
0
A L cC A
or

cl (z) = des (z) ob,des (z)

= det(z I A + B K ) det(z I A + L c C A).

Therefore

u[k] = K xc [k]

also works!
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

643

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

What is D(z) for the current estimator?

Consider a recurrence relation given earlier


xc [k + 1] = (I L c C)(A B K )xc[k] + L c y[k + 1]
u[k] = K x c [k].

Taking the z-transform (x c [0] = 0)


z X c (z) = (I L c C)(A B K ) X c (z) + zL c Y (z)
U (z) = K X c (z)

so
D(z) =
or

U (z)
= K (z I (I L c C)(A B K ))1 L c z
Y (z)

D(z) = K (z I A + B K + L c C A L c C B K )1 L c z.

Extra term in ()1 and there is always a zero at z = 0!

So we always end up with a compensator zero at the origin. The


current compensator poles satisfy
det (z I A + B K + L c C A L c C B K ) = 0.

cements

Block Diagram:
y[k]

Lc

zI

z 1

u[k]

(I L c C)( A B K )

We cannot implement the z I block. To see a more useful block


diagram, lets write the compensator recurrence relations in standard
form, i.e., lets find
xcc [k + 1] = Ax cc [k] + By[k]
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

644

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

u[k] = C x cc [k] + Dy[k].


Start with the control equation
u[k] = K xc [k]

= K x p [k] + L c y[k] C x p [k]



= K (I L c C) x p [k] K L c y[k].

Recursion for x p [k]?

x p [k + 1] = A x p [k] + Bu[k] + AL c y[k] C x p [k]

= (A AL c C) x p [k] B K x c [k] + AL c y[k]

= (A AL c C B K + B K L cC) x p [k] + (AL c B K L c ) y[k].


Therefore

x p [k + 1] = A x p [k] + B y[k]

u[k] = C x p [k] + D y[k]

where
A = (A B K )(I L c C)
B = (A B K )L c

C = K (I L c C)
D = K L c

Our modified transfer function for D(z) is


I A)
1 B
D(z) = D + C(z

or
D(z) = K L c K (I L c C) (z I (A B K )(I L c C))1 (A B K )L c.
which you can verify is equivalent to the previous expression.
New block diagram

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

645

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

K L c

ments
y[k]

( A B K )L c

z 1

K (I L c C)

u[k]

( A B K )(I L c C)

Because of delay, everything inside the dotted box can be computed


before we sample y[k].
Note the feedthrough term K L c . So, the compensator responds
quickly to plant variations. That is,
u[k] = f (y[k], y[k 1], . . .).
IMPLEMENTATION METHOD

1: (not good)

xhatp=xhatpnew
A2D(y)
xhatc=xhatp+Lc*(y-C*xhatp)
u=-K*xhatc
D2A(u)
xhatpnew=A*xhatc+B*u
IMPLEMENTATION METHOD

2: (good)

xhatp=xhatpnew
upartial=-K*(I-Lc*C)*xhatp
A2D(y)
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

646

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

u=upartial-K*Lc*y
D2A(u)
xhatpnew=A*xhatp+B*u+A*Lc*(y-C*xhatp)
1
; T = 1.4 seconds.
EXAMPLE : G(s) =
s2
System description
"
#
"
#
h
i
1 1.4
1
A=
,
B=
,
C= 1 0 ,
0 1
1.4
Pick closed-loop poles as we did for prediction estimator
des (z) = z 2 1.6z + 0.7
ob,des (z) = z 2 1.28z + 0.45.
h
i
Control design K = 0.05 0.25 .

Estimator design

L c = A1 L p = A1

"

0.72
0.12

"

0.55
0.12

Our compensator is described by


#
#
"
"
0.66
0.29 1.16
x p [k] +
y[k]
x p [k + 1] =
0.11 0.65
0.04
h
i
u[k] = 0.0067 0.25 x p [k] 0.06y[k]
or

z(z 0.85)
z 0.47 0.31 j
z(z 0.85)
.
= 0.06 2
z 0.94z + 0.316

D(z) = 0.06

The root locus

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

647

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN


1
0.8

Imag Axis

0.6
0.4
0.2
0
0.2
0.4
0.6

PSfrag replacements

0.8
1
1

0.5

0.5

Real Axis

Compare to prediction estimator.


EXAMPLE :

In Matlab,

Q=[A B*K; Lc*C*A AB*KLc*C*A];


% get resp. of u, x, xhat to init. condition
[u,X]=dinitial(Q,zeros([4,1]),[0 0 -K],0,x0);
% get the estimate error.
xtilde=X(:,1:2)-X(:,3:4);
System and Estimator State Dynamics
y
1
y
y
0.5
y
0
PSfrag replacements

ements

0.5
1
0

Error Dynamics

1
0.8

0.6
0.4

x and x p

1.5

0.2
0

10

15

20

25

30

0.2
0

10

Time

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

15

Time

20

25

30

648

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Reduced-Order Estimator
Why construct the entire state vector when you are directly measuring
a state? If there is little noise in your sensor, you get a great estimate
by just letting
x1 = y
(C = [1 0 . . . 0]).
If there is noise in the measurement
y[k] = C x[k] + v,

v = noise,

then the estimate y p or yc can be a smoothed version of y!

Consider partitioning the plant state into


xa : measured state
xb : to be estimated.
So

y = C x = xa .

(Note: This may require a transformation).


Our partitioned system
"

"

y[k] =

xa [k + 1]
xb [k + 1]

Aaa Aab
Aba Abb
I

#"

xa [k]

xb [k]
"
#
i x [k]
a

"

Ba [k]
Bb [k]

xb [k]

Dynamics of measured state:


xa [k + 1] = Aaa xa [k] + Aab xb [k] + Ba u[k]
where x b [k] is the only unknown. Let
z[k] = x a [k + 1] Aaa xa [k] Ba u[k].
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

u[k]

649

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Then

z[k] = Aab xb [k],

where z[k] is known.


This is our reduced-order estimator output relation.
Dynamics of estimated state:
xb [k + 1] = Aba xa [k] + Abb xb [k] + Bb u[k].
Let

Br u r [k] = Aba xa [k] + Bbu[k]

so that the reduced-order recurrence relation is


xb [k + 1] = Abb xb [k] + Br u r [k].
So, we can design a prediction estimator for
xb [k + 1] = Abb xb [k] + Br u r [k]
z[k] = Aab xb [k],
or

A Abb ;

Bu[k] Br u r [k];

C Aab .

Reduced-order estimator:

xb [k + 1] = Abb xb [k] + Br u r [k] + L r z[k] Aab xb [k] .

In terms of our known quantities,


xb [k + 1] = Abb xb [k] + Aba xa [k] + Bb u[k] +


L r xa [k + 1] Aaa xa [k] Ba u[k] Aab xb [k] .

The error dynamics satisfy


xb [k + 1] = (Abb L r Aab ) xb [k].

So we pick estimate error dynamics related to roots of


r,des (z) = det (z I Abb + L r Aab ) = 0.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

650

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

You might guess that arbitrary reduced-order estimator poles can be


selected if {Aab , Abb } forms an observable pair.
Design of L r
Relate coefficients of
det (z I Abb + L r Aab ) = r,des (z).
Ackermanns formula with
A Abb ,

In Matlab,

L r = r,des (Abb )

C Aab ,
1
Aab
0
.
.
Aab Abb
. .
...
0

n1
Aab Abb
1

Lr=acker(Abb,Aab,poles);
Lr=place(Abb,Aab,poles);
Design a current estimator with a compensator at the origin with
"
#
h
i
I
Lc =
,
C= I 0
Lr
(homework problem?)

Reduced-Order Compensator
Control law:

u[k] = K a xa [k] K b xb [k]


h
i
= K a K b x[k].

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

651

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Plant:

x[k + 1] = Ax[k] + Bu[k]


y[k] = C x[k] = x a [k].

Estimator:
xb [k + 1] = Abb xb [k] + Aba xa [k] + Bbu[k] +

L r xa [k + 1] Aaa xa [k] Ba u[k] Aab xb [k]

using

u[k] = K a C x[k] K b xb [k]


L r xa [k + 1] = L r C (Ax[k] + Bu[k])

= L r C Ax[k] L r Ba K a C x[k] L r Ba K b x[k].

"

x[k + 1]

xb [k + 1]
"

=
A B K aC

L r C A + Aba C Bb K a C L r Aaa C

B K b

Abb Bb K b L r Aab

#"

What is D(z)?
xb [k + 1] = (Abb Bb K b + L r Ba K b L r Aab ) xb [k]
+ (Aba + L r Ba K a L r Aaa Bb K a ) y[k]
+L r y[k + 1]
xb [k + 1] = A xb [k] + B y[k] + L r y[k + 1],
and

u[k] = K b xb [k] K a y[k],

so taking the z-transform


1

U (z)

= K a K b z I A
B + zL r .
D(z) =
Y (z)
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

x[k]
xb [k]

652

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

1
; T = 1.4 seconds.
s2
System description
"
#
"
#
1 1.4
1
A=
,
B=
,
0 1
1.4

EXAMPLE :

G(s) =

C=

Measure y[k] = x 1[k] directly.

1 0 ,

Estimate v[k] = x 2[k].


Pick control poles the same as before
des (z) = z 2 1.6z + 0.7.
Choose dead-beat estimation of v[k].
r,des (z) = z.
h
i
Control design: K = 0.05 0.25 .

Reduced-order estimator
r (z) = det(z I Abb + L r Aab )
where
A=

"

Aaa Aab
Aba Abb

so
Lr =
implies

"

1 1.4
0 1

1
= 0.714
T

r (z) = det(z 1 + 0.714 1.4) = z.

Reduced-order estimator:

T2
A as above, Ba =
= 0.98;
2

Bb = T = 1.4.

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

653

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

implies


T2
1
y[k + 1] y[k] u[k] T v[k]

v[k
+ 1] = v[k]
+ T u[k] +
T
2
y[k + 1] y[k]
=
+ 0.7u[k].
1.4
Control:
1
1
u[k] = y[k] v[k],

20
4
so


T
1
1
T
+ y[k + 1]
+
y[k]
v[k
+ 1] = v[k]
8
T
T
40
1
1
y[k].
u[k] = v[k]
4
20
Taking the transfer function:
T
U (z)
1
1 T1 z T1 40
D(z) =
=
,
T
Y (z)
20 4
z+ 8
or

T


1
5 z 1+ T5
D(z) =
1+
.
20
T
z + T8

In our case, T = 1.4,

D(z) = 0.229
Root locus and transient response

z 0.781
z + 0.175

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

654

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN


1
0.8

cements

PSfrag replacements

0.4
0.2

System and Estimator State Dynamics


1
y and y
y
0.5
y

x and x p

Imag Axis

0.6

0
0.2
0.4
0.6

0.5

0.8
1
1

0.5

0.5

1
0

10

15

20

25

30

35

Time

Real Axis

Estimator Pole Placement


As was the case for finding the control law, the design of an estimator
(for single-output plants) simply consists of
1. Selecting desired estimator error dynamics.
2. Solving for the corresponding estimator gain.
In other words, we find L p , L c , or L r by first selecting the roots of
ob,des (z) or r,des (z).
So, what estimator poles do we choose?
If possible, pick estimator poles that do not influence transient
response of control-law poles.
We know

cl = des ob,des.

Since des was chosen to meet transient specifications, try to pick


ob,des such that estimator dynamics die out before control-law
dynamics.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

655

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Dominated by controllaw dynamics

PSfrag replacements

Transient errors die out here.

With no disturbance, the only job requirement for the estimator is to


correct for the uncertainty of x[0]!
This will be (near) immediate if we pick poles well inside the unit
circle.
Pick estimator poles as fast as possible?

In control law design we were concerned with


Fast response
Large intermediate states
Large control effort

versus

Slower response
Smaller intermediate states
Smaller control effort

In estimator design, large intermediate states and large feedback


signals (control effort) do not carry the same penalty since they are
just computer signals!
That is, L p y[k] is not limited by actuator hardware!
Question: Why not pick very fast estimator poles?
Answer: Sensor noise and uncertainty.

Control law design: Transient response versus actuator effort.

Estimator design: Sensor-noise rejection versus process-noise


rejection.

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

656

Consider the design of L p : The plant is


x[k + 1] = Ax[k] + Bu[k] + Bw w[k]
y[k] = C x[k] + v[k].
w[k] : Process noise, plant disturbances, plant uncertainties
v[k] : Sensor noise, biases.

Estimator: x p [k + 1] = A x p [k] + Bu[k] + L p y[k] C x p [k] .

Only knowledge of v, w through y[k].

Estimator error dynamics:


x[k
+ 1] = Ax[k] + Bu[k] + Bw w[k] A x p [k] Bu[k]

L p C x[k] + v[k] C x p [k]

= A L p C x[k]
+ Bw w[k] L p v[k].

If v[k] = 0 then

Large L p produces fast poles of A L p C. Fast correction of w[k]


effects.
That is, we believe our sensor more than our model.
We rely heavily on feedback to keep x[k]

small.

High bandwidth to quickly compensate for disturbances.


If w[k] = 0 then
Small L p produces slow poles of A L p C. Less amplification of
sensor noise.
We believe our model more than our sensor.
Rely on model to keep x[k]

small. . . open-loop estimation!

Low bandwidth for good noise rejection (smoothing).


c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

657

Therefore, we pick the estimator poles, or design the estimator gain


by examining the tradeoff:
If A L p C fast:

Small transient response effects.

Fast correction of model, disturbances.

Low noise rejection.

If A L p C slow:

Significant transient response effect.

Slow correction of modeling errors, disturbances.

High noise rejection.


In general

1. Place poles two to six times faster than controller poles and in
well-damped locations. This will limit the estimator influence on
output response.
2. If the sensor noise is too big, the estimator poles can be placed as
much as two times slower than controller poles.
Notes about (2):
Controller may need to be redesigned since the estimator will
strongly influence transient behavior.
These slow estimator dynamics will not be excited by our
reference command if our signal is properly included!
MIMO Control Design
So far, we have only discussed control design for SISO systems.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

658

Several different MIMO approaches exist, and all require finding K


such that u(t) = K x(t).
K has as many rows as u(t), and as many columns as there are
states.
If a MIMO system is controllable, it is possible to choose a K
matrix to place the poles of the system anywhere in the s-plane (or
z-plane) in complex-conjugate pairs.

FACT:

If a MIMO system is controllable, the matrix K is not unique! This


brings up the question of optimal values of K . . .

FACT:

A number of design approaches exist. Some are very methodical, but


difficult. Others employ randomness but are easier (if they work).

We will investigate two of the random methods.


Cyclic Design [Chen]
In cyclic design, we change the multi-input problem into a single-input
problem.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

659

A matrix is cyclic if it has one and only one Jordan block associated
with each distinct eigenvalue. Note, this does not imply that all
eigenvalues are distinct. (Although, if all eigenvalues are distinct, the
matrix is cyclic).
If the n-dimensional p-input pair {A, B} is controllable and if A is
cyclic, then for almost any p 1 vector v, the single-input pair {A, Bv}
is controllable.
Controllability is invariant under any equivalence transformation;
thus, we may assume A to be in Jordan form. For example,

x
2 1 0 0 0
0 1

" #
x
0 2 1 0 0
0 0

v1

=
A = 0 0 2 0 0 , B = 1 2 ; Bv = B

v
2

x
0 0 0 1 1
4 3

0 0 0 0 1
1 0

It can be shown that the condition for controllability is that 6= 0


and 6= 0. Because = v1 + 2v2 and = v1 the system is
uncontrollable only if v1 = 0 or v1/v2 = 1.
Any other v vector in two-dimensional space results in a
controllable pair {A, Bv}. If we choose v randomly, then with
probability 1, we will choose a good one.

Design method:
1. Randomly choose a vector v. Test for controllability of {A, Bv}.
Repeat until controllable.
2. The multi-input system {A, B, C, D} has been reduced to a
single-input system by stating that u(t) = vu 0(t). The new system
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

660

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

is {A, Bv, C, D} with input u 0(t) and output y(t). Use single-input
design methods such as Bass-Gura or Ackermann to find k to
place the poles of the single-input system. Then, the overall state
feedback is: u(t) = kvu 0(t), or, K = kv.
What if A is not cyclic? It can be shown that if {A, B} is controllable
then for almost any p n real constant matrix K the matrix (A B K )
has only distinct eigenvalues, and is therefore cyclic. Randomly
choose K matrices until (A B K ) is cyclic. Then, design for this
system.
Use small random numbers for low control effort.
DESIGN METHOD SUMMARY:

1. First, randomly choose p n constant matrix K 1 such that


A = A B K 1 is cyclic.
Bv} is controllable.
2. Randomly choose p 1 vector v such that { A,

3. Design state feedback vector k using Bass-Gura/ Ackermann/ etc on


PSfrag replacements
Bv} to put the poles in the desired place. Then K 2 = vk.
system { A,
Assemble together
K2
K1
u(t)
r (t)

x(t)

4. Design may be summed up as u(t) = r(t) (K 1 + K 2)x(t).


c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

y(t)

661

ECE4520/5520, CONTROLLER/ ESTIMATOR DESIGN

Lyapunov-Equation Design [Chen]


DESIGN METHOD :

1. Select an n n matrix F with a set of desired eigenvalues that


contain no eigenvalues of A.
2. Randomly select p n matrix K such that {F, K } is observable.

3. Solve for the unique T in the Lyapunov equation AT T F = B K .


4. If T is singular, select a different K and repeat the process. If T is
nonsingular, we compute K = K T 1 and (A B K ) has the desired
eigenvalues.

If T is nonsingular, the Lyapunov equation and K T = K imply


(A B K )T = T F

or

A B K = T F T 1.

Thus, (A B K ) and F are similar and have the same set of


eigenvalues.
MIMO Observer Design
The MIMO observer design problem is the dual of the MIMO
controller design problem.
Therefore, if {A T , C T } is controllable, the controller design
procedures return L = K T .

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

(mostly blank)

(mostly blank)

(mostly blank)

71

ECE4520/5520: Multivariable Control Systems I.

LINEAR QUADRATIC REGULATOR


Introduction to Optimal Control
The engineering tradeoff in control-system design is
Fast response
Large intermediate states
Large control effort
EXAMPLE :

Consider

versus

Slower response
Smaller intermediate states
Smaller control effort

x(t)
= x(t) u(t)

with state feedback u(t) = kx(t), k .


x(t)
= (1 + k)x(t).
Eigenvalue at 1 + k. Can make as negative (fast) as we want, with
large-k and large input u(t).
Suppose x(0) = 1, so u(t) = ke (1+k)t .

Different control signals u(t)

PSfrag replacements
Amplitude
Time
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

72

ECE4520/5520, LINEAR QUADRATIC REGULATOR

As k (i.e., large) u(t) looks more and more like (t), the input
we found earlier which (instantaneously) moves x(t) to 0!
To see this,

for k large,
k large.

u(t) dt =

(k)
,
(k) 1

u(t) dt = 1. Clearly u(t) is bunching up near t = 0 for

In general, as we relocate our eigenvalues farther and farther to the


left, so that the closed-loop system is faster and faster, our plant input
begins to look like the impulsive inputs we considered earlier.
Once again, the tradeoff is speed versus gain/ size of input.
Cost Functions
To avoid large inputs, we consider the cost function:

X

T
2
J =
x[k] x[k] + u[k] .
k=0

We will find the K such that u[k] = K x[k] minimizes this cost.
We make large if we dont want large inputs (high cost of
control);
We make small if we want fast response and dont mind large
inputs (cheap control).

EXAMPLE :

with

Consider (where x[k] is a scalar)


x[k + 1] = x[k] + u[k]
u[k] = K x[k].

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

73

ECE4520/5520, LINEAR QUADRATIC REGULATOR

Thus x[k] = (1 K )k x[0] so

X
(x[k]2 + u[k]2)
J =
k=0

2
x[0]2 1 + K
, 0 < K < 2, x[0] 6= 0;
=
1 (1 K )2

,
otherwise.

Thus, J = px[0]2 where

1 + K 2
.
p=
K (2 K )

We can solve for the optimal K for any given by


K (2 K )(2K) (1 + K 2)(2 2K )
dp
=
=0
dK
[K (2 K )]2
K 2(2 K ) = (1 + K 2)(1 K )

2 K 2 K 3 = 1 + K 2 K K 3

K 2 + K 1 = 0.
So,

K opt =

1 +

1 + 4
.
2

(The other solution is a maximum, not a minimum).


The optimal cost is
J=

( 1 + 4 1)
.

2 1 + 4 + 1

For low cost (cheap) control, let 0. Then K opt 1 since

1 + 1 + 4
2(1 + 4)1/2
lim
= lim
= 1,
0
0
2
2
which is deadbeat control; closed-loop eigenvalues at 0.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

74

ECE4520/5520, LINEAR QUADRATIC REGULATOR

1
For high cost (expensive) control, let then K opt , which

is a small (as expected) feedback which just barely stabilizes the


1
system, but plant input is small. Closed loop eigenvalue at 1
PSfrag replacements

which is < 1.
Dynamic Programming: Bellmans Principle of Optimality
We will want to minimize a more general cost function
J minimization is a topic in optimization theory.
We will use a tool called dynamic programming.
Consider the task of finding the lowest-cost route from point x o to x f ,
where there are many possible ways to get there.
J23

J12

J24

J36

J46
J47

J15

Then

J38

J68

J78

J58

= min {J15 + J58, J12 + J24 + J46 + J68, . . .} .


J18

We need to make only one simple observation:


In general, if xi is an intermediate point between x o and x f and
xi is on the optimal path, then
Jof = Joi + Jif .
This is called Bellmans Principle of Optimality.
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

75

ECE4520/5520, LINEAR QUADRATIC REGULATOR

Quadratic Forms
In the cost function
J=

(x[m]T x[m] + u[m]2),

m=0

all components of x[m] are weighted evenly. Often, some


components or some linear combination of components are more
critical than others.
It is critical that x 1[k] be brought to zero quickly; x 2[k] doesnt
matter so much. We might take
"
#
!

X
10 0
J =
x[m]T
x[m] + u[m]2 .
0 0.1
m=0

EXAMPLE :

More generally, we use the quadratic form


x T [k]Qx[k]
where Q is an n n weighting matrix.
PROPERTY I :

We may assume that Q = Q T . Why?


T
T
T
=
x
Qx.
(x
Qx)
| {z }
a scalar

Therefore x T Q T x = x T Qx and x T Qx = x T Q sym x where


1
Q sym = (Q + Q T ) is the symmetric part of Q.
2
PROPERTY II : J should always be 0. That is, we require

x T Qx 0
x n
then Q is positive semi-definite (we write Q 0).
If x T Qx > 0 for all x 6= 0 then Q is positive definite (we write Q > 0).
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

76

ECE4520/5520, LINEAR QUADRATIC REGULATOR

Vector Derivatives
In the following discussion we will often need to take derivatives of
vector/ matrix quantities.
This small dictionary should help: a(x) and b(x) are m 1 vector
functions with respect to the vector x, y is some other vector and A is
some matrix.


T
T


b(x)

a(x)
b(x) +
a(x),
1.
a T (x)b(x) =
x
x
x
T
2.
(x y) = y,
x
T
(x x) = 2x,
3.
x
T
4.
(x Ay) = Ay,
x
T
(y Ax) = A T y,
5.
x
T
(x Ax) = (A + A T )x,
6.
x



 T
a(x) T
7.
a (x)Qa(x) = 2
Qa(x), where Q is a symmetric
x
x
matrix.
This brings us back to our problem. . .
The Discrete-Time Linear Quadratic Regulator Problem
Most generally, the discrete-time LQR problem is posed as minimizing
Ji,N = x T [N ]P x[N ] +

N 1
X

k=i


x T [k]Qx[k] + u T [k]Ru[k] ,

which may be interpreted as the total cost associated with the


transition from state x[i] to the goal state 0 at time N .
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

77

ECE4520/5520, LINEAR QUADRATIC REGULATOR

x T [N ]P x[N ] is the penalty for missing the desired final state.

x T [k]Qx[k] is the penalty on excessive state size.

u T [k]Ru[k] is the penalty on excessive control effort. (R = if SISO).


We require P 0, Q 0 and R > 0.

To find the optimum u[k], we start at the last step and work
backwards.
JN 1,N = x T [N ]P x[N ] + x T [N 1]Qx[N 1] + u T [N 1]Ru[N 1].
We express x[N ] as a function of x[N 1] and u[N 1] via the
system dynamics
JN 1,N = (Ax[N 1] + Bu[N 1])T P (Ax[N 1] + Bu[N 1])
+x T [N 1]Qx[N 1] + u T [N 1]Ru[N 1]

= x T [N 1]A T P Ax[N 1] + u T [N 1]B T P Bu[N 1]

+x T [N 1]A T P Bu[N 1] + u T [N 1]B T P Ax[N 1]


+x T [N 1]Qx[N 1] + u T [N 1]Ru[N 1].

We minimize over all possible inputs u[N 1] by differentiation


T JN 1,N
= 2B T P Bu[N 1] + 2B T P Ax[N 1] + 2Ru[N 1]
0=
u[N 1]

= 2 R + B T P B u[N 1] + 2B T P Ax[N 1].
Therefore,

u [N 1] = R + B T P B

1

B T P Ax[N 1].

The exciting point is that the optimal u[N 1], with no constraints on
its functional form, turns out to be a linear state feedback! To ease
notation, define
1 T
T
K N 1 = R + B P B
B PA

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

78

ECE4520/5520, LINEAR QUADRATIC REGULATOR

such that

u [N 1] = K N 1 x[N 1].

Now, we can express the value of J N 1,N as



T 
JN 1,N =
Ax[N 1] B K N 1 x[N 1] P Ax[N 1]
B K N 1 x[N 1]

+x T [N 1]Qx[N 1] + x T [N 1]K NT 1 R K N 1 x[N 1]




= x T [N 1] (A B K N 1)T P(A B K N 1) + Q


+K NT 1 R K N 1 x[N 1].
Simplify notation once again by defining
PN 1 = (A B K N 1)T P(A B K N 1) + Q + K NT 1 R K N 1,
so that

JN 1,N = x T [N 1]PN 1 x[N 1].

To see that this notation makes sense, notice that


4

JN ,N = JN ,N = x T [N ]P x[N ] = x T [N ]PN x[N ].

Now, we take another step backwards and compute the cost J N 2,N
JN 2,N = JN 2,N 1 + JN 1,N .
Therefore, the optimal policy (via dynamic programming) is
JN 2,N = JN 2,N 1 + JN 1,N .

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

79

ECE4520/5520, LINEAR QUADRATIC REGULATOR

To minimize this, we realize that N 1 is now the goal state and


JN 2,N 1 = (Ax[N 2] + Bu[N 2])T P (Ax[N 2] + Bu[N 2])
+x T [N 2]Qx[N 2] + u T [N 2]Ru[N 2].

We can find the best result just as before


u [N 2] = K N 2 x[N 2]
where
T

K N 2 = R + B PN 1 B
In general,

B T PN 1 A.

u [k] = K k x[k]

where
T

K k = R + B Pk+1 B
and

1

1

B T Pk+1 A

Pk = (A B K k )T Pk+1(A B K k ) + Q + K kT R K k ,

This difference equation for Pk has a starting condition that occurs at


the final time, and is solved recursively backwards in time.
Simulate a feedback controller for the system
"
#
#
" #
"
2
0
2 1
x[k + 1] =
x[k] +
u[k],
x[0] =
1
3
1 1

EXAMPLE :

such that the cost criterion


"
#
"
#
!
9
X
5 0
2 0
J = x T [10]
x[10] +
x T [k]
x[k] + 2u 2[k]
0 5
0 .1
k=1
is minimized.

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

710

ECE4520/5520, LINEAR QUADRATIC REGULATOR

From the problem, we gather that


"
#
"
#
5 0
2 0
P10 =
,
Q=
,
0 5
0 .1

R = [2].

Iteratively, solve for K 9, P9, K 8, P8 and so forth down to K 1 and P1.


Then, u[k] = K k x[k].
A=[2 1; -1 1]; B=[0; 1]; x0=[2; -3];
P=zeros(2,2,10); K=zeros(1,2,9);
x=zeros(2,1,11); x(:,:,1)=x0;
P(:,:,10)=[5 0; 0 5]; R=2; Q=[2 0; 0 0.1];
for i=9:-1:1,
K(:,:,i)=inv(R+B*P(:,:,i+1)*B)*B*P(:,:,i+1)*A;
P(:,:,i)=(A-B*K(:,:,i))*P(:,:,i+1)*(A-B*K(:,:,i))+ ...
Q+K(:,:,i)*R*K(:,:,i);
end;
for i=1:9,
x(:,:,i+1)=A*x(:,:,i)-B*K(:,:,i)*x(:,:,i);
end;
State vector x[k]

Feedback Gains K[k]

2.5
2

k2

Value

Value

1.5
0
1

k1

1
0.5
0

0.5

3
0

4
6
Time sample, k

10

1
1

4
5
6
Time sample, k

Elements of the P matrix

60

P11

50

Value

40
P12=P21

30

P22

20
10
0
1

5
6
Time sample, k

10

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

711

ECE4520/5520, LINEAR QUADRATIC REGULATOR

Infinite-Horizon Discrete-Time LQR


If we let N -> infinity, then P_k tends to a steady-state solution as
k -> 0. Therefore, K_k -> K. This is clearly a much easier control
design, and usually does just about as well.

To find the steady-state P and K, we let P_k = P_{k+1} = P_ss in the
above equation:

P_ss = (A - BK)^T P_ss (A - BK) + Q + K^T R K

and

K = (R + B^T P_ss B)^{-1} B^T P_ss A,

which may be combined to get

P_ss = A^T P_ss A - A^T P_ss B (R + B^T P_ss B)^{-1} B^T P_ss A + Q,

which is called a Discrete-Time Algebraic Riccati Equation, and may
be solved in Matlab using dare.m.

EXAMPLE: For the previous example (with a finite end time), the solution
reached for P_1 was

P_1 = [49.5336 28.5208; 28.5208 20.8434].

In Matlab, dare(A,B,Q,R) for the same system gives


P_ss = [49.5352 28.5215; 28.5215 20.8438].
So, we see that the system settles very quickly to steady-state
behavior.
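In Matlab, this check is a one-liner (a sketch; dare is from the
Control System Toolbox, and the variables repeat the example above):

A=[2 1; -1 1]; B=[0; 1]; Q=[2 0; 0 0.1]; R=2;
Pss=dare(A,B,Q,R)                 % stabilizing solution of the D.A.R.E.
Kss=inv(R+B'*Pss*B)*B'*Pss*A      % steady-state feedback gain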
There are many ways to solve the D.A.R.E., but when Q has the
form C^T C, and the system is SISO, there is a simple method which
yields the optimal closed-loop eigenvalues directly. (Note, when
Q = C^T C we are minimizing the output energy sum |y[k]|^2.)
Chang-Letov Method
The optimal eigenvalues are the roots of the equation

1 + ρ^{-1} G^T(z^{-1}) G(z) = 0

which are inside the unit circle, where

G(z) = C(zI - A)^{-1} B + D.
(Proved later for the continuous-time version).
EXAMPLE: Consider G(z) = 1/(z - 1), so

1 + 1/( ρ(z - 1)(z^{-1} - 1) ) = 0

2 + 1/ρ - z - z^{-1} = 0

z = 1 + 1/(2ρ) ± sqrt( 1/(4ρ^2) + 1/ρ ).

The locus of optimal pole locations for all ρ forms a Reciprocal Root
Locus.
Reciprocal Root Locus in Matlab (SISO)
We want to plot the root locus
1 + ρ^{-1} G^T(z^{-1}) G(z) = 0,

where

G(z) = C(zI - A)^{-1} B + D.


We know how to plot a root locus of the form


1 + K G 0(z) = 0
ements
so we need to find a way to convert G T (z 1)G(z) into G 0(z).
We know that
1 T
C + DT
G T (z 1) = B T z 1 I A T

T
T 1
(AT C T ) + D T .
= B z zI A

Combining G(z) and G T (z 1) in block-diagram form:


DT

D
u[k]

x[k + 1]

x[k]

[k + 1]
C

C T

AT

BT
z 1

[k]

The overall system has state


"
# "
#"
# "
#
x[k]
x[k + 1]
A
0
B
=
+
u[k]
T T
T
T T
[k + 1]
[k]
A C C A
A C D
"
#
h
i x[k]
y[k] = B T AT C T C + D T C B T AT
+
[k]
 T

T T T
D D B A C D u[k].
function rrl(sys)
% Plot the reciprocal root locus for a SISO discrete-time system.
[A,B,C,D]=ssdata(sys);
bigA=[A zeros(size(A)); -inv(A)'*C'*C inv(A)'];
bigB=[B; -inv(A)'*C'*D];
bigC=[-B'*inv(A)'*C'*C+D'*C B'*inv(A)'];
bigD=-B'*inv(A)'*C'*D+D'*D;
rrlsys=ss(bigA,bigB,bigC,bigD,-1);
rlocus(rrlsys);

Reciprocal Root Locus

EXAMPLE: Let

G(z) = (z + 0.25)(z^2 + z + 0.5) / ( (z - 0.2)(z^2 - 2z + 2) ).

Note that G(z) is unstable.

[Figure: Reciprocal root locus for this G(z) in the z-plane (imaginary
versus real axis, roughly -2..2 on each axis).]
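A hypothetical call to the rrl function above for this example (assumes
the Control System Toolbox; the -1 marks an unspecified sample time):

sys=ss(tf(conv([1 0.25],[1 1 0.5]),conv([1 -0.2],[1 -2 2]),-1));
rrl(sys);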

OBSERVATIONS: For the "expensive cost of control" case, stable poles
remain where they are and unstable poles are mirrored into the unit
disc. (They are not moved to be just barely stable, as we might
expect!)

For the "cheap cost of control" case, poles migrate to the finite zeros
of the transfer function, and to the origin (deadbeat control).

The Continuous-Time Linear Quadratic Regulator Problem


The continuous-time LQR problem is stated in a similar way, and
there are corresponding results. We wish to minimize
Z tf
 T

T
T
x (t)Qx(t) + u (t)Ru(t) dt.
J (xo , u, to ) = x (t f )Pt f x(t f ) +
t0

Pt f , Q and R have the same restrictions and interpretations as before.

RESULTS :

The following are key results

The optimal control is a linear (time-varying) state feedback

u(t) = -R^{-1} B^T P(t) x(t).


The symmetric p.s.d. matrix P(t) satisfies the (matrix) differential
equation

\dot{P}(t) = P(t) B R^{-1} B^T P(t) - Q - P(t) A - A^T P(t),

with the boundary condition that P(t_f) = P_{t_f}. The differential
equation runs backwards in time to find P(t).

If t_f -> infinity, P(t) -> P_ss as t -> 0. Then,

0 = P_ss B R^{-1} B^T P_ss - Q - P_ss A - A^T P_ss.

This is the Continuous-Time Algebraic Riccati Equation, and may
be solved in Matlab using care.m; then,

u(t) = -R^{-1} B^T P_ss x(t),

which is a linear state feedback.

There are many ways to solve the C.A.R.E., but when Q has the
form C^T C, and the system is SISO, a variant of the Chang-Letov
method may be used:

The optimal eigenvalues are the roots of the equation

1 + ρ^{-1} G^T(-s) G(s) = 0

which are in the left-half plane, where

G(s) = C(sI - A)^{-1} B + D.

The locus of all possible values of closed-loop optimal roots forms
the symmetric root locus.
Solving the Continuous-Time LQR Problem
1. Define the cost function.
2. Use Bellman's Principle of Optimality (dynamic programming).

3. Determine the Hamilton-Jacobi-Bellman equation.


4. Solve this equation (steps outlined later on).
Define the Cost Function
We define the cost function we wish to minimize
J(x_0, u, t_0) = x^T(t_f) P_{t_f} x(t_f)
                 + ∫_{t_0}^{t_f} [ x^T(t)Qx(t) + u^T(t)Ru(t) ] dt

where Q ≥ 0, P_{t_f} ≥ 0 and R > 0.

We define the optimal cost

V(x_0, t_0) = min_{u(t)} J(x_0, u, t_0)

subject to \dot{x}(t) = Ax(t) + Bu(t).

Invoke Bellman's Principle of Optimality

We break the cost function into two pieces (where Δt is small)

J(x_0, u, t_0) = x^T(t_f) P_{t_f} x(t_f)
                 + ∫_{t_0}^{t_0+Δt} [ x^T(t)Qx(t) + u^T(t)Ru(t) ] dt
                 + ∫_{t_0+Δt}^{t_f} [ x^T(t)Qx(t) + u^T(t)Ru(t) ] dt.

From the Bellman equation we know that the optimal cost

V(x_0, t_0) = min_{u(t)} { ∫_{t_0}^{t_0+Δt} [ x^T(t)Qx(t) + u^T(t)Ru(t) ] dt
              + V(x(t_0+Δt), t_0+Δt) }.

The minimum cost is the cost to go from x(t_0) to x(t_0+Δt) plus the
optimal cost to go from x(t_0+Δt) to x(t_f). The latter part includes the
terminal cost.


Determine the Hamilton-Jacobi-Bellman Equation


We evaluate V (x(to + t, to + t) by computing its Taylor-series
expansion around the point (x o, to ).

V (x, t)
[to + t to ]
V (x(to + t), to + t) = V (x o, to ) +
t xo ,to

V (x, t)
+
[x(to + t) x(to)] + h.o.t.
x xo ,to
So, if t is small

V (xo , to ) = min
u(t)

Z

to +t

to


x T (t)Qx(t) + u T (t)Ru(t) dt


V (x, t)
t
+V (xo, to ) +

t
x o ,to
|
{z
}
Not functions of u(t)


V (x, t)
+
x

x o ,to

[x(to + t) x(to)]


V (x, t)
= V (xo, to ) +
t
t xo ,to
(Z
to +t 

T
T
+ min
x (t)Qx(t) + u (t)Ru(t) dt
u(t)
{z
}
| to
[x T (to )Qx(to )+u T (to )Ru(to )]t


V (x, t)
+
x

|[x(to + t)
{z x(to)]} .
x o ,to
[ Ax(to )+Bu(to )]t

Subtracting like terms from both sides and dividing by Δt,

0 = (∂V(x,t)/∂t)|_{x_0,t_0}
    + min_{u(t)} { [ x^T(t_0)Qx(t_0) + u^T(t_0)Ru(t_0) ]
    + (∂V(x,t)/∂x)|_{x_0,t_0} [ Ax(t_0) + Bu(t_0) ] }.

The term being minimized is called the Hamiltonian.

This is called the Hamilton-Jacobi-Bellman Equation.

To minimize the Hamiltonian (with respect to u(t_0)), take derivatives
with respect to u(t_0) and set to zero.

0 = ∂/∂u(t_0) [ x^T(t_0)Qx(t_0) + u^T(t_0)Ru(t_0)
      + (∂V(x,t)/∂x)|_{x_0,t_0} ( Ax(t_0) + Bu(t_0) ) ]

  = 2Ru(t_0) + B^T ( (∂V(x,t)/∂x)|_{x_0,t_0} )^T.

So,

u^*(t_0) = -(1/2) R^{-1} B^T ( (∂V(x,t)/∂x)|_{x_0,t_0} )^T,

hence the need for R to be positive definite.

We still need to determine (∂V(x,t)/∂x)|_{x_0,t_0}.

1. Show V(z, t_0) = z^T P(t_0) z, where P(t_0) is symmetric, p.s.d.

2. Use this result to compute the final desired term.

Show that V(z, t_0) = z^T P(t_0) z

The minimum cost-to-go starting in state z is a quadratic form in z.
This can be shown in a number of steps. The main steps are:

1. Show that the gradient operator on V (that is, ∇V) is linear.

2. Integrate the (linear) gradient to get a quadratic form.

We will develop a number of properties in order to prove these results.



PROPERTY I: For all scalars α, J(αz, αu, t_0) = α^2 J(z, u, t_0) and therefore
V(αz, t_0) = α^2 V(z, t_0).

Let x(t) be the state that corresponds to an input u(t) and an initial
condition z. Then,

x(t) = e^{At} z + ∫_0^t e^{A(t-τ)} B u(τ) dτ.

Now, denote by x̃(t) the state that corresponds to an input αu(t) and
an initial condition αz. Then,

x̃(t) = e^{At} αz + ∫_0^t e^{A(t-τ)} B αu(τ) dτ = αx(t).

Thus,

J(αz, αu, t_0) = α^2 x^T(t_f) P_{t_f} x(t_f)
                 + α^2 ∫_{t_0}^{t_f} [ x^T(t)Qx(t) + u^T(t)Ru(t) ] dt
               = α^2 J(z, u, t_0)

and

V(αz, t_0) = α^2 V(z, t_0).

PROPERTY II: Let u and u' be two input sequences, and let z and z' be two
initial states. We will show that

J(z+z', u+u', t_0) + J(z-z', u-u', t_0) = 2J(z, u, t_0) + 2J(z', u', t_0)

by plugging in and collecting terms.

Suppose

\dot{x}(t) = Ax(t) + Bu(t),      x_0 = z;
\dot{x}'(t) = Ax'(t) + Bu'(t),   x'_0 = z'.

Adding (or subtracting) the above equations we obtain

\dot{x}(t) ± \dot{x}'(t) = A( x(t) ± x'(t) ) + B( u(t) ± u'(t) ),   x_0 ± x'_0 = z ± z'.

Therefore, x(t) ± x'(t) is the state that corresponds to an input
u(t) ± u'(t) and initial condition z ± z' (respectively).
Now, we plug in

J(z-z', u-u', t_0)
  = (x-x')^T(t_f) P_{t_f} (x-x')(t_f)
    + ∫_{t_0}^{t_f} [ (x-x')^T(t) Q (x-x')(t) + (u-u')^T(t) R (u-u')(t) ] dt

  = x^T(t_f)P_{t_f}x(t_f) - x^T(t_f)P_{t_f}x'(t_f)
    - x'^T(t_f)P_{t_f}x(t_f) + x'^T(t_f)P_{t_f}x'(t_f)
    + ∫_{t_0}^{t_f} [ x^T(t)Qx(t) - x^T(t)Qx'(t) - x'^T(t)Qx(t) + x'^T(t)Qx'(t)
    + u^T(t)Ru(t) - u^T(t)Ru'(t) - u'^T(t)Ru(t) + u'^T(t)Ru'(t) ] dt.

An identical expansion holds for J(z+z', u+u', t_0), with plus signs on
the cross terms, so on adding the two the cross terms cancel. Therefore,

J(z+z', u+u', t_0) + J(z-z', u-u', t_0) = 2J(z, u, t_0) + 2J(z', u', t_0).
PROPERTY III: Next, minimize the RHS with respect to u(t) and u'(t), and
conclude

V(z+z', t_0) + V(z-z', t_0) ≤ 2V(z, t_0) + 2V(z', t_0).

Minimizing,

min_{u,u'} { J(z+z', u+u', t_0) + J(z-z', u-u', t_0) }
  = min_u { 2J(z, u, t_0) } + min_{u'} { 2J(z', u', t_0) }.

Now,

RHS = 2V(z, t_0) + 2V(z', t_0),

but

LHS ≥ V(z+z', t_0) + V(z-z', t_0)
by the triangle inequality. Therefore,

V(z+z', t_0) + V(z-z', t_0) ≤ 2V(z, t_0) + 2V(z', t_0).

PROPERTY IV: Apply the above inequality with (z+z')/2 substituted for z
and (z-z')/2 substituted for z' to get:

V(z+z', t_0) + V(z-z', t_0) = 2V(z, t_0) + 2V(z', t_0).

Substitute as directed:

V(z, t_0) + V(z', t_0) ≤ 2V( (z+z')/2, t_0 ) + 2V( (z-z')/2, t_0 ).

By the scalar multiplication principle (Property I),

V(z, t_0) + V(z', t_0) ≤ (2/4) V(z+z', t_0) + (2/4) V(z-z', t_0).

Multiply both sides by 2:

2V(z, t_0) + 2V(z', t_0) ≤ V(z+z', t_0) + V(z-z', t_0),

and, combined with the results of Property III,

V(z+z', t_0) + V(z-z', t_0) = 2V(z, t_0) + 2V(z', t_0).

PROPERTY V: Now we are getting somewhere. Recall that linearity
requires that superposition and scaling properties be met. First, we
prove superposition of the gradient operator. Take partial derivatives
of the above equation with respect to z and z'. Show that

∇V(z+z') = ∇V(z) + ∇V(z').

The gradient operator is defined as

∇f(x) = ( ∂f(x)/∂x )^T.    Also, ∇f(ax) = ( ∂f(ax)/∂(ax) )^T.

Take partial derivatives of the equation with respect to z:

∂V(z+z', t_0)/∂z + ∂V(z-z', t_0)/∂z = 2 ∂V(z, t_0)/∂z + 2 ∂V(z', t_0)/∂z

∂V(z+z', t_0)/∂(z+z') · ∂(z+z')/∂z + ∂V(z-z', t_0)/∂(z-z') · ∂(z-z')/∂z
    = 2∇V(z, t_0)^T

∇V(z+z', t_0) + ∇V(z-z', t_0) = 2∇V(z, t_0).

Take partial derivatives of the equation with respect to z':

∂V(z+z', t_0)/∂z' + ∂V(z-z', t_0)/∂z' = 2 ∂V(z, t_0)/∂z' + 2 ∂V(z', t_0)/∂z'

∂V(z+z', t_0)/∂(z+z') · ∂(z+z')/∂z' + ∂V(z-z', t_0)/∂(z-z') · ∂(z-z')/∂z'
    = 2∇V(z', t_0)^T

∇V(z+z', t_0) - ∇V(z-z', t_0) = 2∇V(z', t_0).

Add the two results and divide by two to get

∇V(z+z', t_0) = ∇V(z, t_0) + ∇V(z', t_0).
PROPERTY VI: To show linearity of the gradient, the last step we must
perform is to show

∇V(αz, t_0) = α∇V(z, t_0).

From the definition of the gradient,

∇V(αz, t_0) = ( ∂V(αz, t_0)/∂(αz) )^T
            = ( α^2 ∂V(z, t_0)/(α ∂z) )^T
            = α∇V(z, t_0).

So, the gradient is linear. This means that ∇V(z, t_0), which is a vector,
is linear in z and hence has a matrix representation

∇V(z, t_0) = M(t_0) z,

where M(t_0) ∈ R^{n×n}.


PROPERTY VII: We are nearly ready to integrate the gradient to show our
desired result. First, we must show that

V(z, t_0) = V(0, t_0) + ∫_0^1 ∇V(ξz, t_0)^T z dξ.

First, we note that V(z, t_0) is a scalar. Consider a scalar function f(ξ).
Then,

∫_0^1 ( ∂f(ξ)/∂ξ ) dξ = f(1) - f(0).

Let f(ξ) = V(ξz, t_0). Then,

V(z, t_0) - V(0, t_0) = ∫_0^1 ( ∂V(ξz, t_0)/∂ξ ) dξ

V(z, t_0) = V(0, t_0) + ∫_0^1 ( ∂V(ξz, t_0)/∂(ξz) ) z dξ
          = V(0, t_0) + ∫_0^1 ∇V(ξz, t_0)^T z dξ.

PROPERTY VIII: Now, integrate away to show the desired result. Note
that V(0, t_0) = 0.

V(z, t_0) = ∫_0^1 ( M(t_0)(ξz) )^T z dξ = ∫_0^1 ξ z^T M^T(t_0) z dξ
          = z^T M^T(t_0) z · (ξ^2/2) |_0^1
          = z^T M^T(t_0) z / 2.

Since V(z, t_0) is a scalar, V(z, t_0)^T = V(z, t_0) = z^T M(t_0) z / 2.
Averaging our two (identical) results,

V(z, t_0) = z^T [ ( M(t_0) + M^T(t_0) ) / 4 ] z.

Therefore, P(t_0) = ( M(t_0) + M^T(t_0) ) / 4. Also, P(t_0) ≥ 0 since
J(z, u, t_0) ≥ 0 for all u, z. Thus

V(z, t_0) = min_u J(z, u, t_0) = z^T P(t_0) z ≥ 0  for all z,

and we have (finally) proven the desired result.


The Optimal u*(t) and Differential Riccati Equation

Because V(x, t) = x^T P(t) x,  (∂V(x,t)/∂x)|_{x_0,t_0} = 2x^T(t_0) P^T(t_0).
We can now state

u^*(t) = -R^{-1} B^T P(t) x(t) = -K(t) x(t),

so we see that the optimum control, with no a priori constraints on the
structure of the u(t) signal, is a (time-varying) linear state feedback.

We need to determine P(t). Note that in the Hamilton-Jacobi-Bellman
equation we have yet to determine

(∂V(x,t)/∂t)|_{x_0,t_0} = x^T \dot{P}(t) x |_{x_0,t_0} = x_0^T \dot{P}(t_0) x_0.

Substitute all results, including the optimum u^*(t):

0 = x^T(t_0) \dot{P}(t_0) x(t_0) + x^T(t_0) Q x(t_0)
    + x^T(t_0) P(t_0) B R^{-1} B^T P(t_0) x(t_0)
    + 2x^T(t_0) P(t_0) A x(t_0) - 2x^T(t_0) P(t_0) B R^{-1} B^T P(t_0) x(t_0).

This expression is valid for all t_0. Also note that we can write

2x^T(t_0) P(t_0) A x(t_0) = x^T(t_0) P(t_0) A x(t_0) + x^T(t_0) A^T P(t_0) x(t_0),

so

0 = x^T(t) [ \dot{P}(t) + Q - P(t) B R^{-1} B^T P(t) + P(t) A + A^T P(t) ] x(t),

which is true for any x(t). Therefore,

\dot{P}(t) = P(t) B R^{-1} B^T P(t) - P(t) A - A^T P(t) - Q,

which is called the Differential (matrix) Riccati Equation.


This is a nonlinear differential equation with boundary condition
P(t_f) = P_{t_f}, solved backward in time.

Steady-State Solution

As the differential equation for P(t) is simulated backward in time
from the terminal point, it tends toward steady-state values as t -> 0.
It is much simpler to approximate the optimal control gains as a
constant set of gains calculated using P_ss:

0 = P_ss B R^{-1} B^T P_ss - P_ss A - A^T P_ss - Q.

This is called the Algebraic Riccati Equation. In Matlab, care.m.
Solving the Differential Riccati Equation via Simulation
The differential Riccati equation may be solved numerically by
integrating the matrix differential equation

\dot{P}(t) = P(t) B R^{-1} B^T P(t) - P(t) A - A^T P(t) - Q

backward in time.

The problem we discover is that Matlab's integration routines (e.g.,
ode45.m) will only work on vector differential equations, not matrix
differential equations such as this.

The Kronecker product comes to the rescue once again, along with
the matrix stacking operator. We can write the above matrix
differential equation as a vector differential equation:

\dot{P}_{st} = ( A^T ⊗ I + I ⊗ A^T ) P_{st} + Q_{st} - ( P ⊗ P )( B R^{-1} B^T )_{st}.

A sign change has been introduced in order for the forward-time
ode45.m (for example) to work on the backward-time equation.

In Matlab,

pdot=(kron(A',eye(size(A)))+kron(eye(size(A)),A'))*st(P) ...
     +st(Q)-kron(P,P)*st(B*inv(R)*B');

function col=st(m) % stack subfunction


col=reshape(m,prod(size(m)),1);
function mat=unst(v) % unstack subfunction
mat=reshape(v,sqrt(length(v)),sqrt(length(v)));
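As a sketch, the stacked equation can be handed to ode45 directly;
the names pdotfun and tfinal below are illustrative assumptions, and
st/unst are the subfunctions above (so this belongs in the same file):

A=[1 0; 2 0]; B=[1; 0]; C=[0 1]; R=1; Q=C'*C;   % the example below
Ptf=[2 0; 0 2]; tfinal=5;
pdotfun=@(tau,p) (kron(A',eye(size(A)))+kron(eye(size(A)),A'))*p ...
                 +st(Q)-kron(unst(p),unst(p))*st(B*inv(R)*B');
[tau,pvec]=ode45(pdotfun,[0 tfinal],st(Ptf));   % tau runs from t_f back to 0
P0=unst(pvec(end,:)')                           % P(t) near t = 0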

EXAMPLE :

Consider the continuous-time system

\dot{x}(t) = [1 0; 2 0] x(t) + [1; 0] u(t)
y(t) = [0 1] x(t).

Solve the differential matrix Riccati equation that results in the control
signal that minimizes the cost function

J = x^T(5) [2 0; 0 2] x(5) + ∫_0^5 [ y^T(t)y(t) + u^T(t)u(t) ] dt.

First, note that the open-loop system is unstable, with poles at 0 and
1. It is controllable and observable.

The cost function is written in terms of y(t) but not x(t). However,
since there is no feedthrough term, we can also write it as

J = x^T(5) [2 0; 0 2] x(5) + ∫_0^5 [ x^T(t) C^T C x(t) + u^T(t)u(t) ] dt.

This is a common trick.


Therefore, the penalty matrices are Q = C^T C and R = ρ = 1.

We can simulate the finite-horizon case to find P(t).

[Figure: Simulink diagram implementing the stacked Riccati equation:
Matlab Function blocks computing (kron(A',eye(size(A)))+
kron(eye(size(A)),A'))*u and kron(unst(u),unst(u))*st(B*inv(R)*B'),
summed with st(Q) and fed to an integrator whose "initial condition"
is st(Ptf); a clock records tvec up to the final time 5. To plot:
plot(tvec.signals.values,pvec.signals.values). The resulting plot,
"Solving for P," shows P11, P22, and P12 = P21 converging to constant
values as time runs backward from t = 5.]

We can also solve the infinite-horizon case (analytically, for this
example). Consider the A.R.E.

0 = A^T P + P A + C^T C - P B R^{-1} B^T P

[0 0; 0 0] = [1 2; 0 0][p11 p12; p12 p22] + [p11 p12; p12 p22][1 0; 2 0]
             + [0 0; 0 1] - [p11 p12; p12 p22][1 0; 0 0][p11 p12; p12 p22]

           = [p11+2p12  p12+2p22; 0  0] + [p11+2p12  0; p12+2p22  0]
             + [0 0; 0 1] - [p11^2  p11 p12; p11 p12  p12^2].

This matrix equality represents a set of three simultaneous equations
(because P is symmetric). They are:

2 p11 - p11^2 + 4 p12 = 0
p12 + 2 p22 - p11 p12 = 0
1 - p12^2 = 0.


The final equation gives us p12 = ±1. If we select p12 = -1 then the
first equation will have complex roots (bad). So, p12 = 1.

Then, p11 = 1 ± sqrt(5). If p11 = 1 - sqrt(5) then P cannot be positive
definite. Therefore, p11 = 1 + sqrt(5) = 3.236.

Finally, we get p22 = sqrt(5)/2 = 1.118.

These are the same values as the steady-state solution found by
integrating the differential Riccati equation.

The static feedback control signal is

u(t) = -R^{-1} B^T P_ss x(t) = -[3.236  1] x(t).

For this feedback, the closed-loop poles are at
-sqrt(5)/2 ± (sqrt(3)/2) j (stable).
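As a cross-check in Matlab (a sketch; care is from the Control System
Toolbox):

A=[1 0; 2 0]; B=[1; 0]; C=[0 1]; Q=C'*C; R=1;
Pss=care(A,B,Q,R)        % expect [3.236 1; 1 1.118]
K=inv(R)*B'*Pss          % expect [3.236 1]
eig(A-B*K)               % expect -1.118 +/- 0.866j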
2

Continuous-Time Systems and Chang-Letov (SISO only)


For a SISO system, we can easily plot the locus of closed-loop poles.
Tradeoff between control effort and output error is evident.

Consider the infinite-horizon LQR problem with Q = C^T C, C ∈ R^{1×n},
and R = ρ, ρ ∈ R.

The cost function is then

J = ∫_0^∞ [ y^2(t) + ρ u^2(t) ] dt.

The Algebraic Riccati Equation becomes

A^T P + P A - P B ρ^{-1} B^T P + C^T C = 0

or

C^T C = P(sI - A) + (-sI - A^T)P + P B ρ^{-1} B^T P.


Multiply both sides on the left by B^T(-sI - A^T)^{-1} and on the right by
(sI - A)^{-1} B ρ^{-1}:

ρ^{-1} B^T(-sI - A^T)^{-1} C^T C (sI - A)^{-1} B
  = B^T(-sI - A^T)^{-1} (P B ρ^{-1}) + (ρ^{-1} B^T P)(sI - A)^{-1} B
    + B^T(-sI - A^T)^{-1} P B ρ^{-2} B^T P (sI - A)^{-1} B,

where P B ρ^{-1} = K^T and ρ^{-1} B^T P = K.

The left-hand side is ρ^{-1} G^T(-s) G(s). Add 1 to both sides and collect:

(1 + K(-sI - A)^{-1} B)(1 + K(sI - A)^{-1} B) = 1 + ρ^{-1} G^T(-s) G(s).

Note that all terms are scalars.


FACT: Consider the determinant of a block matrix:

det [A B; C D] = det(A) det(D - C A^{-1} B).

We will not prove this fact here, but the result may be found in many
linear algebra books.

FACT: 1 + K(sI - A)^{-1} B = det(sI - A + BK) / det(sI - A) = φ_cl(s) / φ_ol(s).

PROOF: Consider the block matrix

M1 = [sI - A   B; -K   1]

det(M1) = det(sI - A) det(1 + K(sI - A)^{-1} B)
        = det(sI - A)(1 + K(sI - A)^{-1} B).

Now, consider the product of matrices (where r ≠ 0)



M2 = [sI - A   B; -K   1] [I   p; K   r]
   = [sI - A + BK   (sI - A)p + Br; 0   -Kp + r]

so

det(M2) = det(sI - A + BK) det(-Kp + r)
        = det(sI - A)(1 + K(sI - A)^{-1} B) det(-Kp + r),

and therefore

1 + K(sI - A)^{-1} B = det(sI - A + BK) / det(sI - A) = φ_cl(s) / φ_ol(s).

So, from before, we have

φ_cl(-s) φ_cl(s) / ( φ_ol(-s) φ_ol(s) ) = 1 + ρ^{-1} G^T(-s) G(s) ≜ Δ(s).

Therefore Δ(s) = 0 requires φ_cl(s) = 0 or φ_cl(-s) = 0. LQR requires
that φ_cl(s) be Hurwitz (stable), so we have the conclusion:

Closed-loop poles are the LHP zeros of 1 + ρ^{-1} G^T(-s) G(s).

Symmetric Root Locus in Matlab


We want to plot the root locus
1
1 + G T (s)G(s) = 0.

We need to find a way to represent G T (s)G(s) as a state-space


system in Matlab.
G(s) = C(s I A)1 B + D
c 2001, 2000, Gregory L. Plett
Lecture notes prepared by Dr. Gregory L. Plett. Copyright

731

ECE4520/5520, LINEAR QUADRATIC REGULATOR

and
G T (s) = B T s I A T

1

C T + DT .

This can be represented in block-diagram form as:


DT

D
u(t)

x(t)

x(t)

(t)
R

(t)

y(t)

BT

AT

The overall system has state


"
# "
# "
#"
#
x(t)

x(t)
A
0
B
=
+
u(t)
T
T
T

(t)
(t)
C C A
C D
#
"
h
i x(t)
y(t) = D T C B T
+ D T Du(t).
(t)
function srl(sys)
% Plot the symmetric root locus for a SISO continuous-time system.
[A,B,C,D]=ssdata(sys);
bigA=[A zeros(size(A)); -C'*C -A'];
bigB=[B; -C'*D];
bigC=[D'*C B'];
bigD=D'*D;
srlsys=ss(bigA,bigB,bigC,bigD);
rlocus(srlsys);

Symmetric Root Locus

EXAMPLE: Let

G(s) = 1 / ( (s - 1.5)(s^2 + 2s + 2) ).

Note that G(s) is unstable.

[Figure: Symmetric root locus for this G(s) in the s-plane (imaginary
versus real axis, roughly -3..3 on each axis).]
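A hypothetical call to the srl function above for this example:

sys=ss(tf(1,conv([1 -1.5],[1 2 2])));
srl(sys);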

EXAMPLE: Multivariable control via LQR. Place poles of the MIMO
MagLev using LQR for

Q = diag(1, 3, 1, 3),    R = diag(100, 100).

The poles end up at -179.3358, -7.0858, -101.2163, -3.6269. Place
poles at these locations using the Lyapunov method (from section 6)
as well, and compare u(t) and x(t).
[Figure: Two plots versus time (sec): state values x(t) and control
effort u(t), comparing the LQR design with the Lyapunov-method pole
placement.]

ECE4520/5520: Multivariable Control Systems I.

REVIEW OF MULTIVARIABLE CONTROL


GOALS OF FEEDBACK CONTROL :

Change dynamic response of a system

to have desired properties.


System stablized, has good transient and steady-state response.
Output of system tracks reference input.

Disturbances are rejected.

MULTIVARIABLE , STATE - SPACE CONTROL :

Use primarily time-domain matrix representations of systems.


Very powerful. Can often place poles of closed-loop system
anywhere we want!

Same methods work for single-input, single-output (SISO) or


multi-input, multi-output (MIMO or multivariable) systems.
Advanced techniques (cf., ECE5530) allow design of optimal linear
controllers with a single Matlab command!
Dynamic Response versus Pole Locations

Consider a system with transfer function H(s).

The poles of H(s) determine (qualitatively) the dynamic response of
the system. The zeros of H(s) quantify the relationship.

If the system has only real poles, each one is of the form:

H(s) = 1 / (s + σ).

Stable for σ > 0. Normalized impulse- and step-responses:

[Figure: Left, the impulse response h(t) = e^{-σt} (plotted via
impulse([0 1],[1 1])), which decays to 1/e at t = τ = 1/σ. Right, the
step response y(t) = K(1 - e^{-t/τ}) (plotted via step([0 1],[1 1])),
where K = dc gain.]

If a system has complex-conjugate poles, each pair may be written as:

H(s) = ω_n^2 / ( s^2 + 2ζω_n s + ω_n^2 ).

We can extract two more parameters from this equation:

σ = ζω_n    and    ω_d = ω_n sqrt(1 - ζ^2).

σ plays the same role as above: it specifies the decay rate of the
response.

ω_d is the oscillation frequency of the output. Note: ω_d ≠ ω_n
unless ζ = 0.

ζ is the damping ratio and it also plays a role in decay rate and
overshoot. In the s-plane, the pole pair lies at an angle
θ = sin^{-1}(ζ) from the jω-axis.
Second-order system responses:

[Figure: Impulse responses and step responses of the second-order
system versus ω_n t, for damping ratios ζ = 0, 0.1, 0.2, ..., 1.0.]

s-plane impulse response and step response summary:

[Figure: Impulse responses vs. pole locations and step responses vs.
pole locations, sketched at representative points in the s-plane.]

Time-domain specifications determine where poles SHOULD be placed
in the s-plane (step-response):

[Figure: Annotated step response showing rise time t_r (10% to 90%),
peak time t_p, settling time t_s, and overshoot M_p; and a graph of
percent overshoot M_p versus damping ratio ζ.]

Rise time t_r = time to go from 10% to 90% of final value.

Settling time t_s = time until permanently within 1% of final value.

Overshoot M_p = maximum PERCENT overshoot.

Design rules: ω_n ≥ 1.8/t_r;  σ ≥ 4.6/t_s;  ζ ≥ fn(M_p) (use the graph).

Linear Algebra (Matrix) Review

Block Matrices and Submatrices

Sometimes it is convenient to form matrices whose entries are
themselves matrices. Examples:

[A B C]    or    [F I; 0 G],

where A, B, C, F and G are matrices (as are 0 and I). Such
matrices are called block matrices.

Block matrices need to have the right dimensions to fit together.

Block matrices may be added and multiplied as if the entries were
numbers, provided the corresponding entries have the right size and
you are careful about the order of multiplication:

[A B; C D] [X; Y] = [AX + BY; CX + DY],

provided the products AX, BY, CX and DY make sense.

Linear Functions

Suppose that f is a function that takes as input n-vectors and returns
m-vectors.

We say that f is linear iff

Scaling: for any n-vector x and scalar α, f(αx) = αf(x).

Superposition: for any n-vectors x and y, f(x + y) = f(x) + f(y).

Such a function may always be represented as a matrix-vector
multiplication f(x) = Ax.

Conversely, all functions represented by f(x) = Ax are linear.


Linear Equations

Any set of m linear equations in (scalar) variables x_1, ..., x_n can be
represented by the compact matrix equation Ax = b, where x is a
vector made from the variables, A is an m × n matrix and b is an
m-vector.

Solving Linear Equations

Suppose we have n linear equations in n variables x_1, ..., x_n, written
in the compact matrix notation Ax = b.

A is an n × n matrix; b is an n-vector. Suppose that A^{-1} exists.
Multiply both sides of Ax = b by A^{-1}:

A^{-1}(Ax) = A^{-1}b
Ix = A^{-1}b
x = A^{-1}b.

We can't always solve n simultaneous equations for n variables. One
or more of the equations may be redundant (i.e., may be obtained
from the others), or the equations may be inconsistent (i.e., x_1 = 1,
and x_1 = 2).

When these pathologies occur, A is singular (non-invertible).
Conversely, when A is non-invertible, the equations are either
redundant or inconsistent.


Null-Space

The nullspace of A ∈ R^{m×n} is defined as

N(A) = { x ∈ R^n | Ax = 0 }.

N(A) is the set of vectors mapped to zero by y = Ax.

N(A) gives ambiguity in x given y = Ax: if y = Ax and z ∈ N(A) then
y = A(x + z).

Range-Space

The rangespace of A ∈ R^{m×n} is defined as

R(A) = { Ax | x ∈ R^n }.

R(A) is the set of vectors that can be generated by y = Ax.

R(A) is the span of the columns of A.

Rank

We define the rank of A ∈ R^{m×n} as

rank(A) = dim R(A).

rank(A) = rank(A^T).

rank(A) is the maximum number of independent columns (or rows) of A.
Hence, rank(A) ≤ min(m, n).

rank(A) + dim N(A) = n.

Interpreting y = Ax

Consider the system of linear equations

y_1 = A_{11} x_1 + A_{12} x_2 + ... + A_{1n} x_n
y_2 = A_{21} x_1 + A_{22} x_2 + ... + A_{2n} x_n
...
y_m = A_{m1} x_1 + A_{m2} x_2 + ... + A_{mn} x_n

which can be written as y = Ax, where

y = [y_1; y_2; ...; y_m],
A = [A_{11} A_{12} ... A_{1n}; A_{21} A_{22} ... A_{2n}; ...; A_{m1} A_{m2} ... A_{mn}],
x = [x_1; x_2; ...; x_n].

Some interpretations of y = Ax:

y is measurement or observation; x is the unknown to be determined.

x is input or action; y is output or result.

y = Ax defines a function that maps x ∈ R^n into y ∈ R^m.

Interpreting y = Ax via the Columns of A

Write A in terms of its columns

A = [a_1 a_2 ... a_n],

where a_j ∈ R^m. Then, y = Ax can be written as

y = x_1 a_1 + x_2 a_2 + ... + x_n a_n

(note: the x_j's are scalars, the a_j's are m-vectors).

y is a linear combination or mixture of the columns of A.
Coefficients of x give coefficients of the mixture.


Interpreting y = Ax via the Eigenvectors/Eigenvalues of A

An eigenvector is a vector satisfying

Av = λv,

where the eigenvalue λ is a (possibly complex) constant, and v ≠ 0.
Also,

det(λI - A) = 0.

Note that if v is an eigenvector, kv is also an eigenvector, so
eigenvectors are often normalized to have unit length: ||v||_2 = 1.

If A can be diagonalized (has n linearly independent eigenvectors),
we can write y = Ax as

y = V Λ V^{-1} x.

V is a collection of all the eigenvectors put into a matrix.

V^{-1} decomposes x into the eigenvector coordinates. Λ is a
diagonal matrix multiplying each component of the resulting vector by
the eigenvalue associated with that component, and V puts
everything back together.

Thus, eigenvectors are the "directions" of matrix A, and the
eigenvalues are the "magnifications" along those directions.
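In Matlab, this decomposition is one command (a quick sketch):

[V,D]=eig(A);        % columns of V are eigenvectors; D is diagonal
y=V*D*inv(V)*x;      % same as y = A*x when A is diagonalizable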
Interpreting y = Ax via the Jordan Form of A

What if A cannot be diagonalized?

Any matrix A ∈ R^{n×n} can be put in Jordan form by a similarity
transformation; i.e.,

T^{-1} A T = J = diag(J_1, ..., J_q)

where

J_i = [λ_i  1       0;
            λ_i ...  ;
                ... 1;
       0        λ_i]   (n_i × n_i)

is called a Jordan block of size n_i with eigenvalue λ_i
(so n = Σ_{i=1}^q n_i).

J is block-diagonal and upper bidiagonal.

J diagonal is the special case of n Jordan blocks of size n_i = 1.

Jordan form is unique (up to permutations of the blocks).

Can have multiple blocks with the same eigenvalue.

χ(s) = det(sI - A) = det(sI - J) = (s - λ_1)^{n_1} ... (s - λ_q)^{n_q}.

If A has distinct eigenvalues, each n_i = 1 and A is diagonalizable.
(The converse is not necessarily true).

dim N(λ_i I - A) = dim N(λ_i I - J) is the number of Jordan blocks with
eigenvalue λ_i.

The sizes of each Jordan block may also be computed, but this is
complicated; i.e., leave it to Matlab!


Cayley-Hamilton Theorem

The square matrix A satisfies its own characteristic equation. That is, if

χ(λ) = det(λI - A) = 0

then

χ(A) = 0.

SIGNIFICANCE: The Cayley-Hamilton theorem shows us that A^n is a
function of matrix powers A^{n-1} down to A^0. Therefore, to compute any
polynomial of A it suffices to compute only powers of A up to A^{n-1}
and appropriately weight their sum.
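A quick numerical check in Matlab (a sketch):

p=poly(A);           % coefficients of det(lambda*I - A)
polyvalm(p,A)        % evaluates chi(A); result is (numerically) zero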

STATE-SPACE DYNAMIC SYSTEMS (CONTINUOUS-TIME)

Representation of the dynamics of an nth-order system as a
first-order differential equation in an n-vector called the STATE: n
first-order equations.

Fundamental form for a linear state-space model:

\dot{x}(t) = Ax(t) + Bu(t)
y(t) = Cx(t) + Du(t),

where u(t) is the input, y(t) is the output, x(t) is the state, and A, B,
C, D are constant matrices.

DEFINITION: The state of a system at time t_0 is the minimum amount of
information at t_0 that, together with the input u(t), t ≥ t_0, uniquely
determines the behavior of the system for all t ≥ t_0.


Contrast with the impulse-response (convolution) representation, which
requires all past history of u(t):

y(t) = ∫_0^t h(τ) u(t - τ) dτ.

Converting State-Space to Transfer Function

A state-space system may be converted to a transfer function
(input-output relationship) via

Y(s) = [C(sI - A)^{-1} B + D] U(s) + C(sI - A)^{-1} x(0),

where the first term is the transfer function of the system and the
second is the response to initial conditions. So,

Y(s)/U(s) = C(sI - A)^{-1} B + D,

but

(sI - A)^{-1} = adj(sI - A) / det(sI - A).

The characteristic equation for the system is χ(s) = det(sI - A) = 0.

Poles of the system are roots of det(sI - A) = 0, the eigenvalues.
In transfer function matrix form, G(s) = C(sI - A)^{-1} B + D, a pole of
any entry in G(s) is a pole of the system.

SIMULATING SYSTEMS IN SIMULINK: The following method is a direct
implementation of the transfer function above, and the initial state
may be set by setting the initial integrator values.



[Figure: Simulink diagram of the state-space equations: matrix gain B
feeds an integrator (xdot -> x) with feedback through matrix gain A;
matrix gains C and D form y. Note: all (square) gain blocks are
MATRIX GAIN blocks from the Math Library.]

SISO Canonical Forms

Convert a transfer function to state-space:

G(s) = ( b_1 s^2 + b_2 s + b_3 ) / ( s^3 + a_1 s^2 + a_2 s + a_3 ).

Controller Canonical Form

\dot{x}(t) = [-a_1 -a_2 -a_3; 1 0 0; 0 1 0] x(t) + [1; 0; 0] u(t)
y(t) = [b_1 b_2 b_3] x(t).

[Figure: Controller-canonical-form block diagram with states x_1c,
x_2c, x_3c, feedback gains -a_1, -a_2, -a_3 and output taps b_1, b_2,
b_3.]


Observer Canonical Form

\dot{x}(t) = [-a_1 1 0; -a_2 0 1; -a_3 0 0] x(t) + [b_1; b_2; b_3] u(t)
y(t) = [1 0 0] x(t).

[Figure: Observer-canonical-form block diagram with states x_1o, x_2o,
x_3o, input taps b_1, b_2, b_3 and feedback gains -a_1, -a_2, -a_3.]

Controllability Canonical Form

First, compute

[1 0 0; a_1 1 0; a_2 a_1 1] [β_1; β_2; β_3] = [b_1; b_2; b_3].

Then,

\dot{x}(t) = [0 0 -a_3; 1 0 -a_2; 0 1 -a_1] x(t) + [1; 0; 0] u(t)
y(t) = [β_1 β_2 β_3] x(t).

[Figure: Controllability-canonical-form block diagram with states
x_1co, x_2co, x_3co and gains -a_1, -a_2, -a_3, β_1, β_2, β_3.]

Observability Canonical Form

Compute the β values, as for controllability canonical form.

\dot{x}(t) = [0 1 0; 0 0 1; -a_3 -a_2 -a_1] x(t) + [β_1; β_2; β_3] u(t)
y(t) = [1 0 0] x(t).

[Figure: Observability-canonical-form block diagram with states x_1ob,
x_2ob, x_3ob and gains -a_1, -a_2, -a_3, β_1, β_2, β_3.]

Modal (Diagonal) Form

Factor

G(s) = r_1/(s - p_1) + r_2/(s - p_2) + ... + r_n/(s - p_n).

Then,

\dot{x}(t) = diag(p_1, p_2, ..., p_n) x(t) + [r_1; r_2; ...; r_n] u(t)
y(t) = [1 1 ... 1] x(t).

For complex poles λ_i = σ_i + jω_i, use the real modal form.

Compute the partial-fraction expansion where complex pole-pairs are
represented as

G_i(s) = ( α_i s + β_i ) / ( (s - σ_i)^2 + ω_i^2 ).

The real-modal form has an A matrix which is block diagonal, of the
form

A = diag( Λ_r, [σ_{r+1} ω_{r+1}; -ω_{r+1} σ_{r+1}], ..., [σ_n ω_n; -ω_n σ_n] ),

where Λ_r is a diagonal matrix containing the real poles, and

λ_i = σ_i + jω_i,   i = r+1, ..., n,

are the complex poles.

The B matrix has corresponding entries:

[b_{i,1}; b_{i,2}] = [1  1; -σ_i - ω_i   -σ_i + ω_i]^{-1} [α_i; β_i].


Transformations

Let x(t) = T z(t), where T is an invertible (similarity) transformation
matrix.

\dot{z}(t) = (T^{-1} A T) z(t) + (T^{-1} B) u(t)
y(t) = (C T) z(t) + D u(t).

Time (Dynamic) State Response

Homogeneous response: x(t) = e^{At} x(0).

Forced response:

x(t) = e^{At} x(0) + ∫_0^t e^{A(t-τ)} B u(τ) dτ.

Clearly, if y(t) = Cx(t) + Du(t),

y(t) = C e^{At} x(0) + ∫_0^t C e^{A(t-τ)} B u(τ) dτ + Du(t),

where the terms are the initial response, the convolution, and the
feedthrough, respectively.

Easiest to solve if A is diagonalized; then, e^{At} = V e^{Λt} V^{-1}, and

e^{Λt} = diag( e^{λ_1 t}, e^{λ_2 t}, ..., e^{λ_n t} ).

V is the matrix of eigenvectors; the λ_i are the eigenvalues.

If A cannot be diagonalized, it can be put into Jordan form; i.e.,

T^{-1} A T = J = diag(J_1, ..., J_q)

where

J_i = [λ_i  1       0;
            λ_i ...  ;
                ... 1;
       0        λ_i]   (n_i × n_i)

is called a Jordan block of size n_i with eigenvalue λ_i
(so n = Σ_{i=1}^q n_i).

The system is decomposed into independent Jordan chains
\dot{x}_i(t) = J_i x_i(t).

[Figure: Block diagram of one Jordan chain: a cascade of 1/s
integrators driven by u(t), with taps c_1, c_2, ..., c_n summed to form
y(t).]

In the time domain,

(sI - J)^{-1} =
  [ (s-λ)^{-1}  (s-λ)^{-2}  ...  (s-λ)^{-k}   ]
  [     0       (s-λ)^{-1}  ...  (s-λ)^{-k+1} ]
  [                   ...                     ]
  [     0           0       ...  (s-λ)^{-1}   ]

  = (s-λ)^{-1} I + (s-λ)^{-2} F_1 + ... + (s-λ)^{-k} F_{k-1},

where F_j is the matrix with ones on the jth upper diagonal.

Hence, the matrix exponential is

e^{Jt} = e^{λt}
  [ 1  t  ...  t^{k-1}/(k-1)! ]
  [ 0  1  ...  t^{k-2}/(k-2)! ]
  [          ...              ]
  [ 0  0  ...        1        ]

       = e^{λt} ( I + t F_1 + ... + t^{k-1}/(k-1)! F_{k-1} ).

Thus, Jordan blocks yield repeated poles and terms of the form
t^p e^{λt} in e^{At}.
Zeros of a State-Space System

Blocking Zero

Consider the transfer-function matrix G(s).

A blocking zero is a value s_0 for which G(s_0) is identically zero.

Put in u(t) = u_0 e^{z_i t} and you get zero output (except for output
due to initial conditions).

Not considered a very useful definition of a MIMO zero.

Transmission Zero

Put in u(t) = u_0 e^{z_i t} and you get a zero output at frequency
e^{z_i t}.

State space: Have input and state contributions (consider first the
SISO case)

u(t) = u_0 e^{z_i t},   x(t) = x_0 e^{z_i t}   ...   y(t) = 0.

\dot{x}(t) = Ax(t) + Bu(t)  =>  z_i e^{z_i t} x_0 = A x_0 e^{z_i t} + B u_0 e^{z_i t}

[z_i I - A   -B] [x_0; u_0] = 0

y(t) = Cx(t) + Du(t)  =>  C x_0 e^{z_i t} + D u_0 e^{z_i t} = 0

[C   D] [x_0; u_0] = 0.

Put the two together:

[z_i I - A   -B; C   D] [x_0; u_0] = 0.

Zero at frequency z_i if

rank [z_i I - A   -B; C   D] < n + min{p, q}.

STATE-SPACE DYNAMIC SYSTEMS (DISCRETE-TIME)

Digital Control Systems

Computer control requires analog-to-digital (A2D) and
digital-to-analog (D2A) conversion.

[Figure: Digital control loop: the error e(t) between r(t) and the
measured output is sampled by the A2D to give e[k]; the controller
D(z) computes u[k]; the D2A (zoh) produces u(t), which, with
disturbance w(t), drives the plant G(s) to give y(t); sensor noise v(t)
corrupts the measurement.]

Use the z-transform instead of the s-transform. Pole locations in the
z-plane correspond to discrete-time impulse responses as:

[Figure: Discrete impulse responses versus pole locations in the
z-plane.]

Conversion between s-plane and z-plane:

[Figure: The s-plane strip between -jπ/T and +jπ/T maps onto the
z-plane via z = e^{sT}.]

Desirable locations for poles in the z-plane:

[Figure: Regions of the z-plane giving good damping ζ, good frequency
ω_n, and good settling time.]

Discrete-Time State-Space Form


Discrete-time systems can also be represented in state-space form.
x[k + 1] = Ad x[k] + Bd u[k]
y[k] = C d x[k] + Dd u[k]
The subscript d is used here to emphasize that, in general, the A,
B, C and D matrices are DIFFERENT for discrete-time and
continuous-time systems, even if the underlying plant is the same.
I will usually drop the d and expect you to interpret the system from
its context.
State-Space to Transfer Function
Start with the state equations
x[k + 1] = Ax[k] + Bu[k]
y[k] = C x[k] + Du[k].
Then

Y(z) = [C(zI - A)^{-1} B + D] U(z) + C(zI - A)^{-1} z x[0],

where the first term is the transfer function of the system and the
second is the response to initial conditions. So,

Y(z)/U(z) = C(zI - A)^{-1} B + D.

Same form as for continuous-time systems.

Poles of the system are roots of det(zI - A) = 0.
Time (Dynamic) Response

Homogeneous Part

x[k] = A^k x[0].

Aside: Nilpotent Systems

A is nilpotent if some power n exists such that A^n = 0.

A does not just decay to zero, it is exactly zero!

This might be a desirable control design! (Why?) You might imagine
that all the eigenvalues of A must be zero for this to work.

Forced Solution

The full solution is:

x[k] = A^k x[0] + Σ_{j=0}^{k-1} A^{k-1-j} B u[j].

Clearly, if y[k] = Cx[k] + Du[k],

y[k] = C A^k x[0] + Σ_{j=0}^{k-1} C A^{k-1-j} B u[j] + Du[k],

where the terms are the initial response, the convolution, and the
feedthrough, respectively.


Converting Plant Dynamics to Discrete Time

Combine the dynamics of the zero-order hold and the plant.

[Figure: u[k] -> ZOH -> u(t) -> plant {A, B, C, D} -> y(t).]

The continuous-time dynamics of the plant are:

\dot{x}(t) = Ax(t) + Bu(t)
y(t) = Cx(t) + Du(t).

Then,

x[k+1] = A_d x[k] + B_d u[k]

where A_d = e^{AT},  B_d = ∫_0^T e^{Aη} B dη = A^{-1}(A_d - I)B.

Similarly,

y[k] = Cx[k] + Du[k].

That is, C_d = C; D_d = D.
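In Matlab, this conversion is built in (a sketch; T is the sample
period):

sysd=c2d(ss(A,B,C,D),T,'zoh');   % Ad = expm(A*T); Bd as above
[Ad,Bd,Cd,Dd]=ssdata(sysd);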
OBSERVABILITY AND CONTROLLABILITY

Continuous-Time Observability: Where am I?

Define

O(C, A) = [C; CA; ...; CA^{n-1}].

If O(C, A) is full rank, then the system is observable.

Continuous-Time Controllability: Can I get there from here?

Define

C(A, B) = [B  AB  ...  A^{n-1}B].

If C(A, B) is full rank, the system is controllable.
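In Matlab (a sketch):

rank(obsv(A,C))==size(A,1)   % true iff observable
rank(ctrb(A,B))==size(A,1)   % true iff controllable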

Continuous-Time Controllability Gramian

If a continuous-time system is controllable, then

W_c(t) = ∫_0^t e^{Aτ} B B^T e^{A^T τ} dτ

is nonsingular for t > 0.

Furthermore, the input

u(t) = -B^T e^{A^T(t_1 - t)} W_c^{-1}(t_1) [ e^{At_1} x_0 - x_1 ]

will transfer the state x_0 at time 0 to x_1 at time t_1.

If a continuous-time system is controllable, and if it is also stable,
then

W_c = ∫_0^∞ e^{Aτ} B B^T e^{A^T τ} dτ

can be found by solving for the unique (positive-definite) solution to
the (Lyapunov) equation

A W_c + W_c A^T = -B B^T.

W_c is called the controllability Gramian.

W_c measures the minimum energy required to reach a desired point
x_1 starting at x(0) = 0 (with no limit on t):

min { ∫_0^t ||u(τ)||^2 dτ | x(0) = 0, x(t) = x_1 } = x_1^T W_c^{-1} x_1.
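In Matlab (a sketch; lyap(A,Q) solves A*X + X*A' + Q = 0):

Wc=lyap(A,B*B');               % controllability Gramian, stable A
% or: Wc=gram(ss(A,B,C,D),'c')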


Continuous-Time Observability Gramian

If a system is observable,

W_o(t) = ∫_0^t e^{A^T τ} C^T C e^{Aτ} dτ

is nonsingular for t > 0.

Furthermore, we can find the initial state

x(0) = W_o^{-1}(t_1) ∫_0^{t_1} e^{A^T t} C^T ȳ(t) dt

where

ȳ(t) = y(t) - C ∫_0^t e^{A(t-τ)} B u(τ) dτ - Du(t).

If a continuous-time system is observable, and if it is also stable, then

W_o = ∫_0^∞ e^{A^T τ} C^T C e^{Aτ} dτ

can be found as the unique (positive-definite) solution to the
(Lyapunov) equation

A^T W_o + W_o A = -C^T C.

W_o is called the observability Gramian.

If measurement (sensor) noise is IID N(0, I) then W_o is a measure
of error covariance in measuring x(0) from u and y over longer and
longer periods:

lim_{t->∞} E( || x̂(0) - x(0) ||^2 ) = x(0)^T W_o^{-1} x(0).


Discrete-Time Controllability Gramian

In discrete-time, if a system is controllable, then

W_dc[n-1] = Σ_{m=0}^{n-1} A^m B B^T (A^T)^m

is nonsingular.

In particular,

W_dc = Σ_{m=0}^{∞} A^m B B^T (A^T)^m

is called the discrete-time controllability Gramian and is the unique
positive-definite solution to the Lyapunov equation

W_dc - A W_dc A^T = B B^T.

W_dc measures the minimum energy required to reach a desired point
x_1 starting at x[0] = 0 (with no limit on m):

min { Σ_{k=0}^{m} ||u[k]||^2 | x[0] = 0, x[m] = x_1 } = x_1^T W_dc^{-1} x_1.

Discrete-Time Observability Gramian

In discrete-time, if a system is observable, then

W_do[n-1] = Σ_{m=0}^{n-1} (A^T)^m C^T C A^m

is nonsingular.

In particular,

W_do = Σ_{m=0}^{∞} (A^T)^m C^T C A^m

is called the discrete-time observability Gramian and is the unique
positive-definite solution to the Lyapunov equation

W_do - A^T W_do A = C^T C.

As with continuous-time, if measurement (sensor) noise is IID
N(0, I) then W_do is a measure of error covariance in measuring x[0]
from u and y over longer and longer periods:

lim_{k->∞} E( || x̂[0] - x[0] ||^2 ) = x[0]^T W_do^{-1} x[0].
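In Matlab (a sketch; dlyap(A,Q) solves A*X*A' - X + Q = 0):

Wdc=dlyap(A,B*B');     % discrete controllability Gramian
Wdo=dlyap(A',C'*C);    % discrete observability Gramian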

Transformation to Controllability Form

Given a system

\dot{x}(t) = Ax(t) + Bu(t)
y(t) = Cx(t) + Du(t),

the matrix

T = [B  AB  ...  A^{n-1}B] = C(A, B)

transforms the system into controllability form iff the original system
is controllable.

EXTENSION I: To convert between any two realizations,

T = C_old C_new^{-1}.

EXTENSION II: If x_old = T x_new then T = O_old^{-1} O_new.

Popov-Belevitch-Hautus (PBH) Tests for Controllability/Observability

PBH EIGENVECTOR TESTS:

1. {C, A} is unobservable iff a non-zero eigenvector v of A satisfies
   Cv = 0.
2. {A, B} is uncontrollable iff a left eigenvector w^T of A satisfies
   w^T B = 0.

PBH RANK TESTS:

1. If [C; sI - A] drops rank at s = λ then there is an unobservable mode
   with exponent (frequency) λ.
2. If [sI - A   B] drops rank at s = λ then there is an uncontrollable
   mode with exponent (frequency) λ.
Minimality

The system is controllable and observable iff it is minimal. If it is not
minimal, it is either uncontrollable, unobservable, or both.

CONTROLLER/ESTIMATOR DESIGN

Control is accomplished using LINEAR STATE FEEDBACK:

u(t) = r(t) - Kx(t),   K ∈ R^{1×n}.

Closed-loop poles are eigenvalues of A - BK.

Bass-Gura design:

K = [(α_1 - a_1)  ...  (α_n - a_n)]
    ( [1 a_1 ... a_{n-1}; 0 1 ... a_{n-2}; ... ; 0 0 ... 1] C(A, B) )^{-1},

where the a_i are the open-loop and the α_i the desired closed-loop
characteristic-polynomial coefficients.

Ackermann design:

K = [0  ...  0  1] C^{-1}(A, B) φ_d(A).
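In Matlab (a sketch; despoles is a hypothetical vector of desired
closed-loop pole locations):

K=acker(A,B,despoles);   % Ackermann's formula (SISO)
K=place(A,B,despoles);   % numerically better-behaved; MIMO too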


Simulating state feedback in Simulink:

[Figure: Simulink diagram of the state-space plant with matrix gain K
feeding the state x back to the input summing junction (u = r - Kx).
Note: all (square) gain blocks are MATRIX GAIN blocks from the Math
Library.]

Reference Input

OBSERVATION: A constant output y_ss requires constant state x_ss and
constant input u_ss. We can change the tracking problem to a
regulation problem around u(t) = u_ss and x(t) = x_ss:

( u(t) - u_ss ) = -K( x(t) - x_ss ).

u_ss and x_ss are related to r_ss. Let

u_ss = N_u r_ss    (N_u is 1 × 1),
x_ss = N_x r_ss    (N_x is n × 1).

Can find N_x and N_u by solving

[A B; C D] [N_x; N_u] = [0; I].

We can also use N̄ = N_u + K N_x.
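In Matlab (a sketch for a SISO plant of order n):

N=[A B; C D]\[zeros(n,1); 1];
Nx=N(1:n); Nu=N(n+1);
Nbar=Nu+K*Nx;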


Pole Placement via Prototype

Bessel Prototype Systems

[Figure: Bessel prototype step responses, orders 1-6: one family
normalized to constant settling time, one to constant bandwidth.]

ITAE Prototype Systems

[Figure: ITAE prototype step responses, orders 1-6: one family
normalized to constant settling time, one to constant bandwidth.]

PROCEDURE: For an nth-order system with desired bandwidth.

1. Determine desired bandwidth ω_0.
2. Find the nth-order poles from the table of constant bandwidth, and
   multiply pole locations by ω_0.
3. Use Acker/place to locate poles. Simulate and check control effort.
PROCEDURE: For an nth-order system with desired settling time.

1. Determine desired settling time t_s.


Bessel pole locations for ω_0 = 1 rad/sec:

1: -1.000
2: -0.866 ± 0.500j
3: -0.942;  -0.746 ± 0.711j
4: -0.657 ± 0.830j;  -0.905 ± 0.271j
5: -0.926;  -0.591 ± 0.907j;  -0.852 ± 0.443j
6: -0.539 ± 0.962j;  -0.800 ± 0.562j;  -0.909 ± 0.186j

Bessel pole locations for t_s = 1 sec:

1: -4.620
2: -4.053 ± 2.340j
3: -5.009;  -3.967 ± 3.785j
4: -4.016 ± 5.072j;  -5.528 ± 1.655j
5: -6.448;  -4.110 ± 6.314j;  -5.927 ± 3.081j
6: -4.217 ± 7.530j;  -6.261 ± 4.402j;  -7.121 ± 1.454j

ITAE pole locations for ω_0 = 1 rad/sec:

1: -1.000
2: -0.707 ± 0.707j
3: -0.708;  -0.521 ± 1.068j
4: -0.424 ± 1.263j;  -0.626 ± 0.414j
5: -0.896;  -0.376 ± 1.292j;  -0.576 ± 0.534j
6: -0.310 ± 0.962j;  -0.581 ± 0.783j;  -0.735 ± 0.287j

ITAE pole locations for t_s = 1 sec:

1: -4.620
2: -4.660 ± 4.660j
3: -5.913;  -4.350 ± 8.918j
4: -4.236 ± 12.617j;  -6.254 ± 4.139j
5: -9.394;  -2.990 ± 12.192j;  -5.602 ± 7.554j
6: -3.948 ± 13.553j;  -6.040 ± 5.601j;  -7.089 ± 2.772j

2. Find the nth-order poles from the table of constant settling time,
and divide pole locations by ts .
3. Use Acker/place to locate poles. Simulate and check control effort.
Integral Control for Continuous-Time Systems

In many practical designs, integral control is needed to counteract
disturbances, plant variations, or other noises in the system.

[Figure: Feedback loop with integral control: the error r(t) - y(t)
drives an integrator gain K_I/s; the plant {A, B, C, D} is also wrapped
by state feedback K, with feedforward N_x.]

In other words, include an integral state equation of

\dot{x}_I(t) = r(t) - y(t) = r(t) - Cx(t),

and THEN design K_I and K such that the system has good
closed-loop pole locations.

Note that we can include the integral state into our normal
state-space form by augmenting the system dynamics:
"

x I (t)
x(t)

"

0 C
0

#"

x I (t)
x(t)

"

0
B

u(t) +

y(t) = C x(t) + Du(t).

c 2001, 2000, Gregory L. Plett


Lecture notes prepared by Dr. Gregory L. Plett. Copyright

"

I
0

r(t)

833

ECE4520/5520, REVIEW OF MULTIVARIABLE CONTROL

Note that the new A matrix has an open-loop eigenvalue at the


origin. This corresponds to increasing the system type, and integrates
out steady-state error.
The control law is,

u(t) = K I K

#
"
i x I (t)
x(t)

+ K N x r(t).

So, we now have the task of choosing n + n I closed-loop poles.


State Feedback for Discrete-Time Systems

The result is identical.

Characteristic frequencies of controllable modes are freely
assignable by state feedback; characteristic frequencies of
uncontrollable modes do not change with state feedback.

There is a special characteristic polynomial for discrete-time systems

χ(z) = z^n;

that is, all eigenvalues are zero.

What does this mean? By Cayley-Hamilton,

(A - BK)^n = 0.

Hence, with no input, the state reaches 0 in at most n steps since

x[n] = (A - BK)^n x[0] = 0

no matter what x[0] is.

This is called dead-beat control, and A - BK is called a nilpotent
matrix.
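In Matlab (a sketch): placing all n poles at z = 0 gives the dead-beat
gain; acker handles the repeated roots (place does not):

K=acker(A,B,zeros(size(A,1),1));   % all eigenvalues of A-BK at zero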

Integral Control

Again, we augment our system with a (discrete-time) integrator:

[Figure: Discrete-time loop with integrator K_I z/(z-1) on the error
r[k] - y[k], plant {A, B, C, D}, state feedback K, and feedforward
N_x.]

In discrete time, we include an integral state equation of

x_I[k+1] = x_I[k] + r[k] - y[k] = x_I[k] + r[k] - Cx[k].

We can include the integral state into our normal state-space form by
augmenting the system dynamics:

[x_I[k+1]; x[k+1]] = [1 -C; 0 A] [x_I[k]; x[k]] + [0; B] u[k] + [I; 0] r[k]

y[k] = Cx[k] + Du[k].

Notice the new open-loop eigenvalue of A at z = 1.

The control law is

u[k] = -[K_I  K] [x_I[k]; x[k]] + K N_x r[k].

So, we now have the task of choosing n + n_I closed-loop poles.


Discrete-Time Prototype Pole Placement
Where do we place the closed-loop poles?

Can choose closed-loop poles to mimic a system that has
performance that you like. Set closed-loop poles equal to this
prototype system.

Can be done using the ITAE and Bessel (continuous-time) tables.

PROCEDURE: For an nth-order system with desired bandwidth.

1. Determine desired bandwidth ω_0.
2. Find the nth-order poles from the table of constant bandwidth, and
   multiply pole locations by ω_0.
3. Convert s-plane locations to z-plane locations using z = e^{sT}.
4. Use Acker/place to locate poles. Simulate and check control effort.

PROCEDURE: For an nth-order system with desired settling time.

1. Determine desired settling time t_s.
2. Find the nth-order poles from the table of constant settling time,
   and divide pole locations by t_s.
3. Convert s-plane locations to z-plane locations using z = e^{sT}.
4. Use Acker/place to locate poles. Simulate and check control effort.
Closed-Loop Estimator Design
In the design of state-feedback control, we assumed that all states of
our plant were measured.
This is often IMPOSSIBLE to do or TOO EXPENSIVE.
So, we now investigate methods of reconstructing the plant state
vector given only limited measurements.

[Figure: Estimator structure: the plant {A, B} (with disturbance w(t))
produces x(t) and y(t); a model copy {A, B} produces x̂(t) and ŷ(t);
the output error ỹ(t) = y(t) - ŷ(t) is fed back through gain L.]

Note: If L = 0 we have an open-loop estimator.

\dot{x̂}(t) = A x̂(t) + Bu(t) + L ( y(t) - C x̂(t) ).

Let's look at the error.

x̃(t) = x(t) - x̂(t)

\dot{x̃}(t) = Ax(t) + Bu(t) - A x̂(t) - Bu(t) - L ( y(t) - C x̂(t) )
           = A x̃(t) - L ( Cx(t) - C x̂(t) )
           = (A - LC) x̃(t),

or, x̂(t) -> x(t) if A - LC is stable, for any value of x̂(0) and any
u(t), whether or not A is stable.

In fact, we can look at the dynamics of the state estimate error to
quantitatively evaluate how x̂(t) -> x(t).

\dot{x̃}(t) = (A - LC) x̃(t)

has dynamics related to the roots of the characteristic equation

χ_ob(s) = det(sI - A + LC) = 0.

So, for our estimator, we specify the convergence rate of x̂(t) -> x(t)
by choosing desired pole locations: Choose L such that

χ_ob,des(s) = det(sI - A + LC).

This is called the observer gain problem.

In Simulink, the following diagram implements a closed-loop
estimator. The output is xhat.

[Figure: Simulink diagram of the estimator: inputs u and y; matrix
gains B, A, C, D and L; the integrator state is xhat, and
yhat = C*xhat is compared against y to drive L. Note: all (square)
gain blocks are MATRIX GAIN blocks from the Math Library.]
The Observer Gain Design Problem

We would like a method for computing the observer gain vector L
given a desired closed-loop observer characteristic polynomial
χ_ob,des(s).

Bass-Gura method:

L = O^{-1}(C, A)
    [1 0 ... 0; a_1 1 ... 0; ... ; a_{n-1} ... a_1 1]^{-1}
    [(α_1 - a_1); (α_2 - a_2); ... ; (α_n - a_n)].

Ackermann method:

L = φ_d(A) O^{-1}(C, A) [0; ... ; 0; 1].
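In Matlab (a sketch, using duality with the controller problem;
obspoles is a hypothetical vector of desired observer poles):

L=acker(A',C',obspoles)';    % or: L=place(A',C',obspoles)'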


Discrete-Time Prediction Estimator

In discrete-time, we can do the same thing. The picture looks like

[Figure: Discrete-time prediction estimator: plant {A, B} (with
disturbance w[k]) produces x[k] and y[k]; the model copy produces
x̂_p[k] and ŷ[k]; ỹ[k] = y[k] - ŷ[k] is fed back through gain L_p.]

We write the update equation for the closed-loop (prediction)
estimator as

x̂_p[k+1] = A x̂_p[k] + Bu[k] + L_p ( y[k] - C x̂_p[k] ).

The prediction-estimation error can likewise be written as

x̃[k+1] = (A - L_p C) x̃[k],

which has dynamics related to the roots of the characteristic equation

χ_ob(z) = det(zI - A + L_p C) = 0.

For our prediction estimator, we specify the convergence rate of
x̂_p[k] -> x[k] by choosing desired pole locations: Choose L_p such that

χ_ob,des(z) = det(zI - A + L_p C).

Regulator Design: Separation Principle

Now that we have a structure to estimate the state x(t), let's feed
back x̂(t) to control the plant. That is,

u(t) = r(t) - K x̂(t),

where K was designed assuming that u(t) = r(t) - Kx(t). Is this
going to work? How risky is it to interconnect two well-behaved,
stable systems? (Assume r(t) = 0 for now).

Our combined closed-loop-system state equations are

[\dot{x}(t); \dot{x̃}(t)] = [A - BK   BK; 0   A - LC] [x(t); x̃(t)].

The 2n closed-loop poles of the combined regulator/estimator system
are the n eigenvalues of A - BK combined with the n eigenvalues of
A - LC.

The Compensator: Continuous-Time

D(s) = U(s)/Y(s) = -K ( sI - A + BK + LC - LDK )^{-1} L.

The Compensator: Discrete-Time, Prediction-Estimator Based

D(z) = U(z)/Y(z) = -K ( zI - A + BK + L_p C - L_p D K )^{-1} L_p.

Current Estimator/ Compensator

Time update: predict the new state from the old state estimate and the system dynamics:

    x̂_p[k] = A x̂_c[k − 1] + B u[k − 1].

Measurement update: measure the output and use it to update/correct the estimate:

    x̂_c[k] = x̂_p[k] + L_c ( y[k] − C x̂_p[k] ).

L_c is called the current estimator gain.
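A minimal simulation sketch of these two steps, assuming a SISO system with A, B, C, Lc defined, plus logged input/output sequences u and y of length N:

    % Run the current estimator over recorded data.
    n  = size(A,1);
    xc = zeros(n, N);                       % current state estimates
    for k = 2:N
        xp      = A*xc(:,k-1) + B*u(k-1);   % time update (predict)
        xc(:,k) = xp + Lc*(y(k) - C*xp);    % measurement update (correct)
    end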


The prediction and current estimate errors have dynamics

    x̃_p = x − x̂_p:   x̃_p[k + 1] = ( A − A L_c C ) x̃_p[k]
    x̃_c = x − x̂_c:   x̃_c[k + 1] = ( A − L_c C A ) x̃_c[k].

The Compensator: Discrete-Time, Current-Estimator Based

    D(z) = −K L_c − K ( I − L_c C ) ( zI − (A − BK)(I − L_c C) )^−1 (A − BK) L_c.
Reduced-Order Estimator/Compensator
Why construct the entire state vector when you are directly measuring a state? If there is little noise in your sensor, you get a great estimate by just letting

    x̂1 = y    (C = [ 1 0 ⋯ 0 ]).
Consider partitioning the plant state into

    x_a: measured state;
    x_b: to be estimated.

Then

    [ x_a[k + 1] ; x_b[k + 1] ] = [ A_aa   A_ab ; A_ba   A_bb ] [ x_a[k] ; x_b[k] ] + [ B_a ; B_b ] u[k]

    y[k] = [ 1 0 ⋯ 0 ] [ x_a[k] ; x_b[k] ].

Reduced-order estimator:

    x̂_b[k + 1] = A_bb x̂_b[k] + A_ba x_a[k] + B_b u[k] + L_r ( x_a[k + 1] − A_aa x_a[k] − B_a u[k] − A_ab x̂_b[k] ).

The error dynamics satisfy

    x̃_b[k + 1] = ( A_bb − L_r A_ab ) x̃_b[k].

So we pick estimate error dynamics related to the roots of

    φ_r,des(z) = det( zI − A_bb + L_r A_ab ) = 0.
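By duality, L_r can be computed with the same pole-placement tools as before. A sketch, assuming the partitioned blocks Abb and Aab are defined (the desired error poles pr are an assumption):

    % Reduced-order error dynamics are governed by Abb - Lr*Aab.
    Lr = place(Abb', Aab', pr)';    % pr = assumed desired error poles
    eig(Abb - Lr*Aab)               % verify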
MIMO Control Design
Cyclic Design [Chen]
DESIGN METHOD:

1. First, randomly choose a p × n constant matrix K_1 such that Ā = A − B K_1 is cyclic.

2. Randomly choose a p × 1 vector v such that {Ā, Bv} is controllable.

3. Design a state-feedback vector k using Bass-Gura/Ackermann/etc. on the system {Ā, Bv} to put the poles in the desired place. Then K_2 = v k.

Assembling everything together:

[Block diagram: reference r(t) and feedback gains K_1 and K_2 around the plant, with input u(t), state x(t), and output y(t).]

4. The design may be summed up as u(t) = r(t) − (K_1 + K_2) x(t) (see the sketch below).
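A Matlab sketch of the procedure, with made-up dimensions and place standing in for Bass-Gura/Ackermann; the cyclicity check in step 1 is left informal:

    % Cyclic design for a p-input, n-state plant (A, B, pdes assumed).
    K1   = 0.1*randn(p, n);                % step 1: random K1
    Abar = A - B*K1;                       %   check that Abar is cyclic
    v    = randn(p, 1);                    % step 2: random v
    assert(rank(ctrb(Abar, B*v)) == n)     %   {Abar, B*v} controllable?
    k    = place(Abar, B*v, pdes);         % step 3: single-input design
    K2   = v*k;
    u    = @(r, x) r - (K1 + K2)*x;        % step 4: overall control law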


Lyapunov-Equation Design [Chen]
DESIGN METHOD:

1. Select an n × n matrix F with a set of desired eigenvalues that contains no eigenvalues of A.

2. Randomly select a p × n matrix K̄ such that {F, K̄} is observable.

3. Solve for the unique T in the Lyapunov equation A T − T F = B K̄.

4. If T is singular, select a different K̄ and repeat the process. If T is nonsingular, compute K = K̄ T^−1, and (A − BK) has the desired eigenvalues.

If T is nonsingular, the Lyapunov equation and K T = K̄ imply

    (A − BK) T = T F    or    A − BK = T F T^−1.

Thus, (A − BK) and F are similar and have the same set of eigenvalues.
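A sketch of the procedure in Matlab; the dimensions and the choice of F are made-up examples, and the observability check in step 2 is left informal:

    % Lyapunov-equation design. lyap(A,-F,-B*Kbar) solves
    % A*T + T*(-F) + (-B*Kbar) = 0, i.e., A*T - T*F = B*Kbar.
    F    = diag([-2, -3, -4]);         % assumed desired eigenvalues (n = 3)
    Kbar = randn(p, n);                % check {F, Kbar} observable
    T    = lyap(A, -F, -B*Kbar);
    if rcond(T) > 1e-12                % T nonsingular?
        K = Kbar/T;                    % K = Kbar*inv(T)
        eig(A - B*K)                   % should match eig(F)
    end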
MIMO Observer Design

The MIMO observer design problem is the dual of the MIMO controller design problem. Therefore, if {A^T, C^T} is controllable, the controller design procedures return a gain K, and the observer gain is L = K^T.
LINEAR QUADRATIC REGULATOR
Introduction to Optimal Control
The engineering trade-off in control-system design is

    Fast response                         Slower response
    Large intermediate states   versus    Smaller intermediate states
    Large control effort                  Smaller control effort


The Discrete-Time Linear Quadratic Regulator Problem

The discrete-time LQR problem is posed as minimizing

    J_{i,N} = x^T[N] P x[N] + Σ_{k=i}^{N−1} ( x^T[k] Q x[k] + u^T[k] R u[k] ),

which may be interpreted as the total cost associated with the transition from state x[i] to the goal state 0 at time N.

x^T[N] P x[N] is the penalty for missing the desired final state.

x^T[k] Q x[k] is the penalty on excessive state size.

u^T[k] R u[k] is the penalty on excessive control effort. (R = ρ, a scalar, if SISO.)

We require P ≥ 0, Q ≥ 0 and R > 0.

RESULTS:

The optimum control is a linear (time-varying) feedback

    u*[k] = −K_k x[k],

where

    K_k = ( R + B^T P_{k+1} B )^−1 B^T P_{k+1} A

and

    P_k = (A − B K_k)^T P_{k+1} (A − B K_k) + Q + K_k^T R K_k.

This difference equation for P_k has a starting condition that occurs at the final time (P_N = P), and is solved recursively backwards in time.

If we let N → ∞, then P_k tends to a steady-state solution P_ss as k → 0:

    P_ss = A^T P_ss A − A^T P_ss B ( R + B^T P_ss B )^−1 B^T P_ss A + Q

and

    K = ( R + B^T P_ss B )^−1 B^T P_ss A.
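In Matlab, the steady-state solution is returned directly by dlqr. A sketch, assuming the model and weights A, B, Q, R are defined:

    % Steady-state discrete-time LQR: K is the constant gain, Pss solves
    % the discrete algebraic Riccati equation, e = eig(A - B*K).
    [K, Pss, e] = dlqr(A, B, Q, R);
    u = @(x) -K*x;                     % optimal steady-state feedback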


The optimal eigenvalues are the roots of the equation

    1 + (1/ρ) G^T(z^−1) G(z) = 0

which are inside the unit circle, where

    G(z) = C( zI − A )^−1 B + D.
The Continuous-Time Linear Quadratic Regulator Problem
We wish to minimize

    J(x_0, u, t_0) = x^T(t_f) P_{t_f} x(t_f) + ∫_{t_0}^{t_f} ( x^T(t) Q x(t) + u^T(t) R u(t) ) dt.

P_{t_f}, Q and R have the same restrictions and interpretations as before.


RESULTS:

The following are key results.

The optimal control is a linear (time-varying) state feedback

    u(t) = −R^−1 B^T P(t) x(t).

The symmetric p.s.d. matrix P(t) satisfies the (matrix) differential equation

    Ṗ(t) = P(t) B R^−1 B^T P(t) − Q − P(t) A − A^T P(t),

with the boundary condition that P(t_f) = P_{t_f}. The differential equation runs backwards in time to find P(t).

If t_f → ∞, P(t) → P_ss as t → 0. Then,

    0 = P_ss B R^−1 B^T P_ss − Q − P_ss A − A^T P_ss.

This is the Continuous-Time Algebraic Riccati Equation; then,

    u(t) = −R^−1 B^T P_ss x(t),

which is a linear state feedback.
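The steady-state gain is computed by lqr in Matlab. A sketch, assuming the model and weights A, B, Q, R are defined:

    % Steady-state continuous-time LQR: lqr solves the C.A.R.E. internally.
    [K, Pss, e] = lqr(A, B, Q, R);     % e = closed-loop eigenvalues
    u = @(x) -K*x;                     % u(t) = -R^(-1)*B'*Pss*x(t) = -K*x(t)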

There are many ways to solve the C.A.R.E., but when Q has the form C^T C and the system is SISO, a variant of the Chang-Letov method may be used. The optimal eigenvalues are the roots of the equation

    1 + (1/ρ) G^T(−s) G(s) = 0

which are in the left-half plane, where

    G(s) = C( sI − A )^−1 B + D.

The locus of all possible values of closed-loop optimal roots forms the symmetric root locus.
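The symmetric root locus can be drawn with the ordinary root-locus tool. A sketch for a SISO plant with D = 0, assuming A, B, C are defined:

    % rlocus plots the roots of 1 + k*G(-s)*G(s) = 0 for k >= 0;
    % here k plays the role of 1/rho.
    G  = ss(A, B, C, 0);
    Gm = ss(-A, -B, C, 0);             % realization of G(-s)
    rlocus(Gm * G)                     % optimal poles: left-half-plane branch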

