
EMT 4203: CONTROL ENGINEERING

II

BSc. Mechatronic Engineering

DeKUT

Lecture Notes

By

Dr. Inno Oduor Odira

March 2024
Table of contents

I Classical Control

1 Control Problem and Control Actions
1.1 Control Problem
1.2 Control actions
1.2.1 State feedback control configuration
1.2.2 Output feedback control configuration

2 PID Controllers
2.1 Key features of the PID controller
2.2 Structures and main properties of PID controllers
2.2.1 PD Controller
2.2.2 PI Controller
2.2.3 PID Controller
2.3 PID Controller Tuning
2.3.1 PID controller actions selection general guide
2.3.2 Ziegler-Nichols Tuning: Reaction curve method (Open loop method)
2.3.3 Ziegler-Nichols Tuning: Continuous cycling (Closed loop method)
2.3.4 Analytical Ziegler-Nichols continuous cycling method
2.4 PD Controller Design
2.5 PI Controller Design
2.5.1 PID and Phase-Lag-Lead Controller Designs
2.6 PID controller design using Control System Designer in MATLAB
2.7 PID Algorithm (issues and implementation)
2.7.1 Implementation of PID controllers
2.7.2 Limitations of PID Control

II Modern Control

5 Optimal Control (DP, LQR and Kalman Filters)
5.1 Dynamic Programming
5.1.1 Algorithm
5.2 Optimal Control: LQR (Linear Quadratic Regulator)
5.2.1 LQR: Introduction
5.2.2 Optimal Control: LQR
5.2.3 Quadratic performance index
5.2.4 The Linear quadratic problem (LQR)
5.2.5 Dynamic Programming and Full-State Feedback
5.2.6 Examples

References
COURSE OUTLINE

EMT 4203: CONTROL ENGINEERING II


Prerequisites: EMT 4102 Control Engineering I

Course Purpose

The aim of this course is to enable the student to understand state space and polynomial representations of linear dynamical systems, develop skills for the design of linear multivariable control systems by pole placement and linear quadratic optimization, and grasp basic nonlinear system analysis and design methods.

Expected Learning Outcomes

By the end of this course, the learner should be able to:

i) Analyse linear dynamical systems by state space and polynomial methods

ii) Design controllers for linear and nonlinear systems

iii) Derive state space representations for nonlinear systems

Course description

Control problem and basic control actions: Proportional (P) control action, Derivative (D) control action, Integral (I) control action, Proportional plus Derivative (PD) control action, Proportional plus Integral (PI) control action. Controllers based on PID actions. Observability and state estimation. Pole placement by state feedback. Observer design by pole placement. Polynomial approach for pole placement. State variable feedback controller design: controllability, observability, eigenvalue placement, observer design for linear systems. Linear Quadratic Regulator (LQR), Algebraic Riccati Equation (ARE), disturbance attenuation problem, tracking problem, Kalman filter as an observer: digital implementation of state-space model, dynamic programming.

Mode of delivery

Two (2) hour lectures and two (2) hour tutorial per week, and at least five 3-hour laboratory
sessions per semester organized on a rotational basis.

Instructional Materials/Equipment

White Board, LCD Projector

Course Assessment

1. Practicals: 15%
2. Assignments: 5%
3. CATs: 10%
4. Final Examination: 70%
5. Total: 100%

Reference books

1. Franklin, Gene F. Feedback Control of Dynamic Systems. 5th ed. India: Prentice Hall, 2006.
2. Golnaraghi, Farid. Automatic Control Systems. 9th ed. New Jersey: Wiley, 2010.
3. Geering, Hans P. Optimal Control with Engineering Applications. Berlin Heidelberg: Springer-Verlag, 2007.
4. Ogata, Katsuhiko. Modern Control Engineering. 5th ed. Boston: Pearson, 2010.
5. Bolton, W. Control Systems. Oxford: Newnes, 2006.

Course Journals

1. Journal of Dynamic Systems, Measurement, and Control, ISSN: 0022-0434
2. IRE Transactions on Automatic Control, ISSN: 0096-199X
3. SIAM Journal on Control and Optimization, ISSN: 0363-0129
4. Transactions of the Institute of Measurement and Control, ISSN: 0142-3312
Part I

(Classical Control)
Chapter 1

Control Problem and Control Actions

1.1 Control Problem


In any control system, where the dynamic variable has to be maintained at the desired set point
value, it is the controller which enables the requirement of the control objective to be met.
The control design problem is the problem of determining the characteristics of the controller
so that the controlled output can be:
1. Set to prescribed values called the reference.
2. Maintained at the reference values despite unknown disturbances.
3. Conditions (1) and (2) are met despite inherent uncertainties and changes in the plant dynamic characteristics.
4. Maintained within some constraints.
The first requirement above is called tracking or stabilization (regulation), depending on whether the set-point changes continuously or not. The second condition is called disturbance rejection. The third condition is called robust tracking/stabilization and disturbance rejection. The fourth condition is called optimal tracking/stabilization and disturbance rejection.

1.2 Control actions


The liquid level control system in a buffer tank shown in Fig. 1.1 will be used for illustration. This can be presented as the general plant shown in Fig. 1.2. The manner in which the automatic controller produces the control signal is called the control action.
The control signal is produced by the controller, thus a controller has to be connected to the plant. The configuration may be either closed loop or open loop, as shown in Figs. 1.3 and 1.4 respectively. These may also be configured as either an output feedback control configuration or a state feedback control configuration.

Fig. 1.1 Liquid level control system in a buffer tank

Fig. 1.2 General plant

Fig. 1.3 Close loop Controlled system.

Fig. 1.4 Open loop Controlled system.



1.2.1 State feedback control configuration.


The general mathematical model of state feedback takes the form

ẋ = Ax + Bu   (state equation)
y = Cx + Du   (output equation)

The associated block diagram is the following Fig. 1.5. Two typical control problems of interest:

Fig. 1.5 Regulation and Tracking configuration.

• The regulator problem, in which r = 0 and we aim to keep lim_{t→∞} y(t) = 0 (i.e., a pure stabilisation problem)
• The tracking problem, in which y(t) is specified to track r(t) ≠ 0.
When r(t) = R ≠ 0 (constant), the regulator and tracking problems are essentially the same. Tracking a nonconstant reference r(t) is a more difficult problem, called the servomechanism problem.
The control law for state feedback then takes the form

u(t) = K₂r − K₁x

1.2.2 Output feedback control configuration.


The controller compares the actual value of the system output with the reference input (desired value), determines the deviation, and produces a control signal that will reduce the deviation to zero or to a small value, as illustrated in Fig. 1.6.
The algorithm that relates the error to the control signal is called the control action (also called the control law or strategy).
A controller is required to shape the error signal such that certain control criteria or specifications are satisfied. These criteria may involve:
• Transient response characteristics,
• Steady-state error,
• Disturbance rejection,
• Sensitivity to parameter changes.

Fig. 1.6 Expanded Close loop Output feedback configuration.

Fig. 1.7 Control action.

The most commonly used Control Actions are :

1. Two position (on-off, bang-bang)

2. Proportional (P-control)

3. Derivative (D-control)

4. Integral (I-control)

Two position (on-off, bang-bang)

In a two-position control action system, the actuating element has only two fixed positions, which are in many cases simply on and off. These are generally electrical devices, and they are widely used because they are simple and inexpensive. The output of the controller is given by Eqn. 1.1:

u(t) = U₁ for e(t) ≥ 0,  u(t) = U₂ for e(t) < 0    (1.1)

where U₁ and U₂ are constants.
The block diagram of an on-off controller is shown in Fig. 1.8.
The value of U₂ is usually either:
• zero, in which case the controller is called the on-off controller (Fig. 1.9), or
• equal to −U₁, in which case the controller is called the bang-bang controller (Fig. 1.10).
Two-position controllers suffer from cyclic oscillations, which are mitigated by introducing a differential gap or neutral zone such that the output switches to U₁ only after the actuating error

Fig. 1.8 Block diagram of a two-position controller.

Fig. 1.9 Block diagram of an on-off controller (U₂ = 0).

Fig. 1.10 Block diagram of a bang-bang controller (U₂ = −U₁).

becomes positive by an amount d. Similarly it switches back to U2 only after the actuating error
becomes equal to −d.

Fig. 1.11 Two-position controller with a differential gap.

The existence of a differential gap reduces the accuracy of the control system, but it also reduces the frequency of switching, which results in a longer operational life.
With reference to Fig. 1.12, assume at first that the tank is empty. In this case, the solenoid will be energized, opening the valve fully.

Fig. 1.12 Water level control system.

If, at some time t₀, the solenoid is de-energized, closing the valve completely (qᵢ = 0), then the water in the tank will drain off. The variation of the water level in the tank is now shown by the emptying curve.

Fig. 1.13 Water level control system.

If the switch is adjusted for a desired water level, the input qᵢ will be on or off (either a positive constant or zero) depending on the difference between the desired and the actual water levels, creating a differential gap.
Therefore, during actual operation, the input will be on until the water level exceeds the desired level by half the differential gap. Then the solenoid valve will be shut off until the water level drops below the desired level by half the differential gap. The water level will continuously oscillate about the desired level.
It should be noted that the smaller the differential gap, the smaller the deviation from the desired level; on the other hand, the number of switchings increases.

Fig. 1.14 Water level control system.

Fig. 1.15 Water level control system.

Proportional Control Action

The proportional controller is essentially an amplifier with an adjustable gain. For a controller with
proportional control action the relationship between output of the controller u(t) and the actuating
error signal e(t) is

u(t) = K p e(t)

where Kp is the proportional gain.

The block diagram of a proportional controller is shown in Fig. 1.16, and the proportional action in Fig. 1.17. In general:
• For small values of Kp, the corrective action is slow, particularly for small errors.
• For large values of Kp, the performance of the control system is improved, but this may lead to instability.
Proportional control is said to look at the present error signal.
Usually, a compromise is necessary in selecting a proper gain. If this is not possible, then proportional control action is used together with some other control action(s).
The value of K p should be selected to satisfy the requirements of

Fig. 1.16 Block diagram of a proportional controller.

Fig. 1.17 Proportional action.

• stability,
• accuracy, and
• satisfactory transient response, as well as
• satisfactory disturbance rejection characteristics.

Integral Control Action

The value of the controller output u(t) is changed at a rate proportional to the actuating error signal e(t), as given by Eqn. 1.2:

du(t)/dt = Ki e(t)    (1.2)

or

u(t) = Ki ∫₀ᵗ e(τ)dτ    (1.3)

where Ki is an adjustable constant.
With this type of control action, the control signal is proportional to the integral of the error signal.
It is obvious that even a small error can be detected, since integral control produces a control
signal proportional to the area under the error signal.

Hence, integral control increases the accuracy of the system.


Integral control is said to look at the past of the error signal.
If the value of e(t) is doubled, then u(t) varies twice as fast. For zero actuating error, the value of u(t) remains stationary. The integral control action is also called reset control. Fig. 1.18 shows the block diagram of the integral controller.
Remember that each s term in the denominator of the open loop transfer function increases the
type of the system by one, and thus reduces the steady state error.
The use of integral controller will increase the type of the open loop transfer function by one.

Fig. 1.18 Block diagram of an integral controller.

Fig. 1.19 Integral control action.

Derivative Control Action

In this case the control signal of the controller is proportional to the derivative (slope) of the error
signal.
Derivative control action is never used alone, since it does not respond to a constant error,
however large it may be.
Derivative control action responds to the rate of change of error signal and can produce a
control signal before the error becomes too large.

Fig. 1.20 Block diagram of a Derivative controller.

Fig. 1.21 Derivative control action.

As such, derivative control action anticipates the error, takes early corrective action, and tends
to increase the stability of the system.
Derivative control is said to look at the future of the error signal and to apply brakes to the system.
Derivative control action has no direct effect on steady state error.
But it increases the damping in the system and allows a higher value for the open loop gain K
which reduces the steady state error.
Derivative control, however, has disadvantages as well.
It amplifies noise signals coming in with the error signal and may saturate the actuator.
It cannot be used if the error signal is not differentiable.
Thus derivative control is used only together with some other control action!
Chapter 2

PID Controllers

2.1 Key features of the PID controller.


1. The basic PID controller has the form

u = kp e + ki ∫₀ᵗ e(τ)dτ + kd (de/dt) = kp ( e + (1/Ti) ∫₀ᵗ e(τ)dτ + Td (de/dt) )

where u is the control signal and e = r − y is the control error; the reference r is often called the set point.

Fig. 2.1 PID control structure

The control signal is thus a sum of three terms: the P-term (which is proportional to the error), the I-term (which is proportional to the integral of the error), and the D-term (which is proportional to the derivative of the error). The controller parameters are the proportional gain kp, the integral time Ti, and the derivative time Td.
The integral, proportional and derivative parts can be interpreted as control actions based on the past, the present and the future, as illustrated in Figure 2.2. The derivative part can also be interpreted as prediction by linear extrapolation over the time Td. Using this interpretation it is easy to understand that derivative action does not help if the prediction time Td is too large.

Fig. 2.2 PID terms interpretation illustration

Integral action guarantees that the process output agrees with the reference in steady state and
provides an alternative to including a feedforward term for tracking a constant reference input.

The integral term can be written as

u_I = (kp/(sTi)) e,  where Ti = kp/ki is the integral time constant.

Derivative action provides a method for predictive action, where Td = kd/kp is the derivative time constant. The action of a controller with proportional and derivative action can be interpreted as if the control is made proportional to the predicted process output, where the prediction is made by extrapolating the error Td time units into the future using the tangent to the error curve.
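A first-order Taylor expansion makes this prediction interpretation explicit:

kp e(t) + kd ė(t) = kp ( e(t) + Td ė(t) ) ≈ kp e(t + Td)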

2.2 Structures and main properties of PID controllers


Several common dynamic controllers appear very often in practice. They are known as P, PD,
PI, PID, phase-lag, phase-lead, and phase-lag-lead controllers. In this section we introduce their
structures and indicate their main properties.
In most cases these controllers are placed in the forward path in front of the plant (system), as presented in Fig. 2.3.

Fig. 2.3 A common controller-plant configuration

P Controller

In some cases it is possible to achieve the desired system performance by changing only the static gain Kp. In general, as Kp increases, the steady state errors decrease, but the maximum percent

overshoot increases. However, very often a static controller is not sufficient and one is faced with
the problem of designing dynamic controllers.

2.2.1 PD Controller
PD stands for a proportional and derivative controller. The output signal of this controller is equal
to the sum of two signals: the signal obtained by multiplying the input signal by a constant gain
K p and the signal obtained by differentiating and multiplying the input signal by Kd , i.e.
 
u = kp e + kd (de/dt) = kp ( e + Td (de/dt) )

Its transfer function is given by

Gc(s) = Kp + Kd s = Kd (s + Kp/Kd) = Kd (s + zc)

This controller is equivalent to adding a zero to the system with a result of positive phase
contribution. It is used to improve the system transient response.
A PD controller is equivalent to a phase-lead compensator, which adds not only a zero but also a pole in a way that the phase contribution is positive.

Phase-Lead compensator

The phase-lead compensator is designed such that its phase contribution to the feedback loop is
positive. It is represented by

Gc(s) = (s + z₂)/(s + p₂),  p₂ > z₂ > 0

arg Gc(s) = arg(s + z₂) − arg(s + p₂) = θ_z₂ − θ_p₂ > 0

where θ_z₂ and θ_p₂ are given in Figure 2.4(b). This controller introduces a positive phase shift in the loop (phase lead). It is used to improve the system transient response.

Fig. 2.4 Poles and zeros of phase-lag (a) and phase-lead (b) controllers

2.2.2 PI Controller
Similarly to the PD controller, the PI controller produces as its output a weighted sum of the input
signal and its integral:

u(t) = kp e(t) + ki ∫₀ᵗ e(τ)dτ

Its transfer function is

Gc(s) = Kp + Ki/s = (Kp s + Ki)/s = Kp (s + zc)/s
It is equivalent of adding a pole at the origin and a zero to the system.
In practical applications the PI controller zero is placed very close to its pole located at the origin, so that the angular contribution of this "dipole" to the root locus is almost zero. A PI controller is used to improve the system steady state errors since it increases the control system type by one.
Equivalent to PI controller is the Phase-Lag compensator.

Phase-Lag Compensator

The phase-lag controller belongs to the same class as the PI controller. The phase-lag controller
can be regarded as a generalization of the PI controller. It introduces a negative phase into the
feedback loop, which justifies its name. It has a zero and pole with the pole being closer to the
imaginary axis, that is
 
Gc(s) = (p₁/z₁) (s + z₁)/(s + p₁),  z₁ > p₁ > 0

arg Gc(s) = arg(s + z₁) − arg(s + p₁) = θ_z₁ − θ_p₁ < 0

where p₁/z₁ is known as the lag ratio. The corresponding angles θ_z₁ and θ_p₁ are given in
Figure 2.4(a). The phase-lag controller is used to improve steady state errors.

2.2.3 PID Controller


The PID controller is a combination of PD and PI controllers; hence its transfer function is given
by
u(t) = kp e(t) + ki ∫₀ᵗ e(τ)dτ + kd (de/dt)

Gc(s) = Kp + Kd s + Ki/s = (Ki + Kp s + Kd s²)/s = Kd (s + z₁)(s + z₂)/s
The PID controller can be used to improve both the system transient response and steady state
errors. This controller is very popular for industrial applications.

Phase-Lag-Lead Compensator

The phase-lag-lead controller is obtained as a combination of phase-lead and phase-lag controllers


and is equivalent to PID controller. Its transfer function is given by

Gc(s) = ( (s + z₁)(s + z₂) ) / ( (s + p₁)(s + p₂) ),  p₂ > z₂ > z₁ > p₁ > 0,  z₁z₂ = p₁p₂
It has features of both phase-lag and phase-lead controllers, i.e. it can be used to improve
simultaneously both the system transient response and steady state errors. However, it is harder to
design phase-lag-lead controllers than either phase-lag or phase-lead controllers.
Note that all controllers presented in this section can be realized by using active networks
composed of operational amplifiers (see, for example, Dorf, 1992; Nise, 1992; Kuo, 1995).

2.3 PID Controller Tuning

2.3.1 PID controller actions selection general guide


• Determine what characteristics of the system needs to be improved (Transient and Steady
states requirements)

• Use Kp to decrease the rise time Tr.

• Use Kd to reduce the overshoot (Mp, %OS) and the settling time Ts.

• Use Ki to eliminate the steady state error.

Several tuning methods exist, including the Ziegler-Nichols methods, the Cohen-Coon method, the Lambda method, etc.

2.3.2 Ziegler-Nichols Tuning: Reaction curve method (Open loop method)


Also referred to as the step response method. The step response method characterizes the open-loop response by the parameters a and τ illustrated in Fig. 2.5. The time domain method is based on

Fig. 2.5 Reaction curve (Step response method)



a measurement of part of the open loop unit step response of the process, as shown in Figure
2.5. The step response is measured by applying a unit step input to the process and recording the
response. The response is characterized by parameters a and τ, which are the intercepts of the
steepest tangent of the step response with the coordinate axes. The parameter τ is an approximation
of the time delay of the system and a/τ is the steepest slope of the step response. Notice that it is
not necessary to wait until steady state is reached to find the parameters; it suffices to wait until the response has had an inflection point. The controller parameters are given in the table below:

Type | kp    | Ti | Td
P    | 1/a   | –  | –
PI   | 0.9/a | 3τ | –
PID  | 1.2/a | 2τ | 0.5τ
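The table translates directly into code. A minimal MATLAB sketch, assuming a and τ have already been read off a recorded step response (the numeric values here are hypothetical):

a = 0.2; tau = 1.5;                                  % hypothetical values read off the reaction curve
Kp_P   = 1/a;                                        % P controller
Kp_PI  = 0.9/a;  Ti_PI  = 3*tau;                     % PI controller
Kp_PID = 1.2/a;  Ti_PID = 2*tau;  Td_PID = 0.5*tau;  % PID controller
fprintf('PID: Kp = %.2f, Ti = %.2f s, Td = %.2f s\n', Kp_PID, Ti_PID, Td_PID)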

2.3.3 Ziegler-Nichols Tuning: Continuous cycling (Closed loop method)


Also referred to as Frequency response method.
The frequency response method characterizes the process dynamics by the point where the
Nyquist curve of the process transfer function first intersects the negative real axis and the frequency
ωc where this occurs (Fig. 2.6).

Fig. 2.6 Nyquist curve (Frequency response method)

Frequency response method follows the following steps

Start with the closed-loop system under a proportional controller:

1. Begin with a low value of the gain Kp.

2. Reduce the integral and derivative gains to 0.

3. Increase Kp from this low value to some critical value Kp = Kcr at which sustained oscillations occur. If sustained oscillations do not occur, another method has to be applied.

4. Note the value Kcr and the corresponding period of sustained oscillation, Pcr.

Fig. 2.7 Cyclic oscillation

The Ziegler-Nichols frequency response method gives the controller parameters in terms of the critical gain Kcr and the critical period Pcr. The corresponding PID gains are given in the following table:

Type of Controller | Kp       | Ti      | Td
P                  | 0.5 Kcr  | ∞       | 0
PI                 | 0.45 Kcr | Pcr/1.2 | 0
PID                | 0.6 Kcr  | 0.5 Pcr | 0.125 Pcr

2.3.4 Analytical Ziegler-Nichols continuous cycling method


Steps to design PID controller:

1. Consider the system under pure proportional control.

2. Consider the closed loop characteristic equation of the system under pure proportional
control.

3. Form the Routh array and establish the critical gain Kc that produces an all-zero row.

4. Note the value of Kc and use the auxiliary polynomial to calculate the period of oscillation T.

5. Obtain the controller settings from the table given above.

Example 1

Consider a process with transfer function

G(s) = 1 / ( (s + 1)(s + 3)(s + 5) )

The closed-loop transfer function is KG(s)/(1 + KG(s)), so the characteristic equation is

1 + KG(s) = 0
(s + 1)(s + 3)(s + 5) + K = 0
p(s) = s³ + 9s² + 23s + 15 + K = 0

Fig. 2.8 PID

The corresponding Routh array is

s³   1             23         0
s²   9             15 + K     0
s¹   (192 − K)/9   0
s⁰   15 + K
From this we see that the range of K for stability is
15 + K > 0 ⇒ K > −15, and
192 − K > 0 ⇒ K < 192. So Kcr = 192.
When K = 192 we have imaginary roots, since the s¹ row is identically 0.
The corresponding auxiliary equation is

9s² + (15 + 192) = 0, i.e. 9s² + 207 = 0

with roots at s = ±j4.8.


Since this is a quadratic factor of the characteristic polynomial, the sustained oscillation at the limiting value of K (Kcr) is at 4.8 rad/s.

Type of Controller | Kp       | Ti = Kp/Ki | Td = Kd/Kp
P                  | 0.5 Kcr  | ∞          | 0
PI                 | 0.45 Kcr | Pcr/1.2    | 0
PID                | 0.6 Kcr  | 0.5 Pcr    | 0.125 Pcr

Thus, Pcr = 2π/4.8 = 1.31 s and Kcr = 192.

The table then gives, for full PID control:
Kp = 0.6 Kcr = 115.2
Ki = Kp/Ti = Kp/(0.5 Pcr) = 175.87
Kd = Kp Td = Kp × 0.125 Pcr = 18.86
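These settings are easy to check in simulation. A minimal sketch, assuming the Control System Toolbox is available:

G = tf(1, conv(conv([1 1],[1 3]),[1 5]));     % G(s) = 1/((s+1)(s+3)(s+5))
Kcr = 192;  Pcr = 2*pi/4.8;                   % critical gain and period found above
Kp = 0.6*Kcr;  Ti = 0.5*Pcr;  Td = 0.125*Pcr; % Ziegler-Nichols PID settings
C = Kp*tf([Ti*Td Ti 1], [Ti 0]);              % C(s) = Kp(1 + 1/(Ti*s) + Td*s)
step(feedback(C*G, 1)), grid on               % closed-loop step response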

Example 2
G(s) = 6 / ( (s + 1)(s + 2)(s + 3) )
Consider the characteristic equation of the system

1 + Kp G(s) = 0
s³ + 6s² + 11s + 6(1 + Kp) = 0
The Routh array is formed:

s³   1               11
s²   6               6(Kp + 1)   ← row giving the auxiliary polynomial
s¹   11 − (Kp + 1)
s⁰   6(Kp + 1)

For stability: 11 − (Kp + 1) > 0 ⇒ Kp < 10, and 6(Kp + 1) > 0 ⇒ Kp > −1.
Hence 0 < Kp < 10 for positive gains, and sustained oscillation occurs at the critical gain Kc = 10.

The auxiliary polynomial is formed from the s² row with Kp = Kc = 10: 6s² + 6(10 + 1) = 0, i.e. s² + 11 = 0.


At the critical gain the system is oscillatory, or marginally stable, i.e. only imaginary roots are present. Hence, in the above equation,

s = jω ⇒ ω² = 11 ⇒ ω = √11 = 3.317 rad/s

T = 2π/ω ⇒ T = 1.895 s

Hence for the given system: Kc = 10 and T = 1.895 s.
Controller settings are obtained from the table given below:

Type of Controller | Kp       | Ti = Kp/Ki | Td = Kd/Kp
P                  | 0.5 Kcr  | ∞          | 0
PI                 | 0.45 Kcr | Pcr/1.2    | 0
PID                | 0.6 Kcr  | 0.5 Pcr    | 0.125 Pcr

On solving:

Type of controller | Kp  | Ti    | Td
P                  | 5   | ∞     | 0
PI                 | 4.5 | 1.579 | 0
PID                | 6   | 0.947 | 0.237

Improvement of Transient Response


The transient response can be improved by using either the PD or phase-lead controllers. In the
following, we consider these two controllers independently. However, both of them have the
common feature of introducing a positive phase shift, and both of them can be implemented in a
similar manner.

2.4 PD Controller Design


The PD controller is represented by

Gc (s) = s + zc , zc > 0

which indicates that the compensated system open-loop transfer function will have one additional zero. The effect of this zero is to introduce a positive phase shift. The phase shift and position of the compensator's zero can be determined by using simple geometry. That is, for the chosen dominant complex conjugate poles that produce the desired transient response we apply the root locus angle rule. This rule basically says that for a chosen point sd on the root locus, the difference between the sum of the angles from the point sd to the open-loop zeros and the sum of the angles from the point sd to the open-loop poles must be 180°. Applying the root locus angle rule to the compensated system, we get

∡Gc(sd)G(sd) = ∡(sd + zc) + Σ_{i=1..m} ∡(sd + zi) − Σ_{i=1..n} ∡(sd + pi) = 180°

which implies

∡(sd + zc) = 180° − Σ_{i=1..m} ∡(sd + zi) + Σ_{i=1..n} ∡(sd + pi) = αc

From the obtained angle ∡(sd + zc) the location of the compensator's zero is found by simple geometry, as demonstrated in Figure 2.9. Using this figure it can easily be shown that the value of zc is given by

zc = (ωn / tan αc) ( ζ tan αc + √(1 − ζ²) )
An algorithm for the PD controller design can be formulated as follows.

Design Algorithm

1. Choose a pair of complex conjugate dominant poles in the complex plane that produces
the desired transient response (damping ratio and natural frequency). These are obtained
through Transient performance specifications, ie. Percentage overshoot and Settling time.

2. Find the required phase contribution αc of the PD controller by using the angle formula above.



Fig. 2.9 Determination of a PD controller’s zero location

3. Find the absolute value of the PD controller's zero by using the formula for zc above.

4. Check that the compensated system has a pair of dominant complex conjugate closed-loop
poles.

Example 3

Let the design specifications be set such that the desired maximum percent overshoot is less than 20% and the 5%-settling time is 1.5 s. Then the formula for the maximum percent overshoot implies

−ζπ/√(1 − ζ²) = ln{OS} ⇒ ζ = √( ln²{OS} / (π² + ln²{OS}) ) = 0.456

We take ζ = 0.46 so that the expected maximum percent overshoot is less than 20%. In order to have the 5%-settling time of 1.5 s, the natural frequency should satisfy

ts ≈ 3/(ζωn) ⇒ ωn ≈ 3/(ζts) = 4.348 rad/s

The desired dominant poles are given by

sd = λd = −ζωn ± jωn√(1 − ζ²) = −2.00 ± j3.86

Consider now the open-loop control system

G(s) = K(s + 10) / ( (s + 1)(s + 2)(s + 12) )
The root locus of this system is represented in Figure 2.10.
It is obvious from the above figure that the desired dominant poles do not belong to the original
root locus since the breakaway point is almost in the middle of the open-loop poles located at -1
and -2 .
In order to move the original root locus to the left such that it passes through sd , we design a
PD controller by following Design Algorithm.

Fig. 2.10 Root loci of the original (a) and compensated (b) systems

Step 1 has been already completed in the previous paragraph. Since we have determined the
desired operating point, sd , we now use angle formula to determine the phase contribution of a PD
controller.
By MATLAB function angle (or just using a calculator), we can find the following angles

∡ (sd + z1 ) = 0.4495rad, ∡ (sd + p1 ) = 1.8243rad


∡ (sd + p2 ) = 1.5708rad, ∡ (sd + p3 ) = 0.3684rad
Note that the MATLAB function angle produces results in radians. Using the angle formula, we get

∡(sd + zc) = π − 0.4495 + 1.8243 + 1.5708 + 0.3684 = 6.4556 rad ≡ 0.1723 rad (mod 2π) = 9.8734° = αc
Having obtained the angle αc, the formula for zc above produces the location of the controller's zero, i.e. zc = 24.1815, so that the required PD controller is given by

Gc (s) = s + 24.1815
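These computations can be reproduced in a few MATLAB lines (a sketch using the values of this example):

zeta = 0.46;  wn = 4.348;
sd = -zeta*wn + 1j*wn*sqrt(1 - zeta^2);                          % desired dominant pole, -2 + j3.86
ac = mod(pi - angle(sd + 10) + sum(angle(sd + [1 2 12])), 2*pi)  % required phase, 0.1723 rad
zc = (wn/tan(ac))*(zeta*tan(ac) + sqrt(1 - zeta^2))              % controller zero, 24.1815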

The root locus of the compensated system is presented in Figure 2.10(b). It can be seen from Figure 2.11 that the point sd = −2 ± j3.86 lies on the root locus of the compensated
system.
At the desired point, sd , the static gain K, obtained by applying the root locus rule Magnitude
formula, is given by K = 0.825. This value can be obtained either by using a calculator or the
MATLAB function abs as follows:

Fig. 2.11 Enlarged portion of the root loci in the neighborhood of the desired operating point of
the original (a) and compensated (b) systems

% Root-locus magnitude rule evaluated at the desired pole sd
sd = -2 + 3.86j;             % desired dominant pole
p1 = 1; p2 = 2; p3 = 12;     % open-loop poles at -1, -2, -12
z1 = 10; zc = 24.1815;       % open-loop zero and PD controller zero
d1 = abs(sd + p1);
d2 = abs(sd + p2);
d3 = abs(sd + p3);
d4 = abs(sd + z1);
d5 = abs(sd + zc);
K = (d1*d2*d3)/(d4*d5)       % gives K = 0.825
For this value of the static gain K, the steady state errors for the original and compensated
systems are given by ess = 0.7442, essc = 0.1074. Note that in the case when zc > 1, this controller
can also improve the steady state errors. In addition, since the controller’s zero will attract one of
the system poles for large values of K, it is not advisable to choose small values for zc since it may
damage the transient response dominance by the pair of complex conjugate poles closest to the
imaginary axis.
The closed-loop step response for this value of the static gain is presented in Figure 2.12. It
can be observed that both the maximum percent overshoot and the settling time are within the
specified limits.
The values for the overshoot, peak time, and settling time are obtained by the MATLAB routine shown in Fig. 2.13. Using this routine, we have found that ts = 1.125 s and MPOS = 20.68%. Our starting
assumptions have been based on a model of the second-order system. Since the second-order
systems are only approximations for higher-order systems that have dominant poles, the obtained
results are satisfactory.
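The same numbers can also be obtained with stepinfo; a minimal sketch of such a routine, assuming the Control System Toolbox:

K = 0.825;  zc = 24.1815;
G = tf(K*[1 10], conv(conv([1 1],[1 2]),[1 12]));  % K(s+10)/((s+1)(s+2)(s+12))
T = feedback(tf([1 zc], 1)*G, 1);                  % closed loop with Gc(s) = s + zc
S = stepinfo(T, 'SettlingTimeThreshold', 0.05)     % 5% settling time, overshoot, peak time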
Finally, we have to check that the system response is dominated by a pair of complex conjugate
poles. Finding the closed-loop eigenvalues we get λ1 = −11.8251, λ2,3 = −2.000 ± j3.8600,

Fig. 2.12 Step response of the compensated system for Example 3

Fig. 2.13 Matlab subroutine

which indicates that the presented controller design results are correct since the transient response
is dominated by the eigenvalues λ2,3 .

2.5 PI Controller Design


As we have already indicated, the PI controller represents a stable dipole with a pole located at
the origin and a stable zero placed near the pole. Its impact on the transient response is negligible
since it introduces neither significant phase shift nor gain change (see root locus rules 9 and 10 in
Table 7.1). Thus, the transient response parameters with the PI controller are almost the same as
those for the original system, but the steady state errors are drastically improved due to the fact
that the feedback control system type is increased by one.

The PI controller is represented, in general, by

Gc(s) = Kp (s + Ki/Kp)/s,  Ki ≪ Kp

where Kp represents its static gain and Ki/Kp is a stable zero near the origin. Very often it is implemented as

Gc(s) = (s + zc)/s
This implementation is sufficient to justify its main purpose. The design algorithm for this
controller is extremely simple.

1. Set the PI controller’s pole at the origin and locate its zero arbitrarily close to the pole, say
zc = 0.1 or zc = 0.01.

2. If necessary, adjust the static loop gain to compensate for the case when Kp is different from one. Hint: use Kp = 1 and avoid the gain adjustment problem.

Comment: Note that while drawing the root locus of a system with a PI controller (compen-
sator), the stable open-loop zero of the compensator will attract the compensator’s pole located at
the origin as the static gain increases from 0 to +∞ so that there is no danger that the closed-loop
system may become unstable due to addition of a PI compensator (controller).
The following example demonstrates the use of a PI controller in order to reduce the steady
state errors.

Example 4

Consider the following open-loop transfer function

G(s) = K(s + 6) / ( (s + 10)(s² + 2s + 2) )
Let the choice of the static gain K = 10 produce a pair of dominant poles on the root locus,
which guarantees the desired transient specifications. The corresponding position constant and the
steady state unit step error are given by

Kp = (10 × 6)/(10 × 2) = 3 ⇒ ess = 1/(1 + Kp) = 0.25
Using a PI controller with the zero at −0.1 (zc = 0.1), we obtain the improved values Kp = ∞ and ess = 0. The step responses of the original system and the compensated system, now given by

Gc(s)G(s) = 10(s + 0.1)(s + 6) / ( s(s + 10)(s² + 2s + 2) )

are presented in Figure 2.14.
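A sketch reproducing this comparison (assumes the Control System Toolbox):

G  = tf(10*[1 6], conv([1 10],[1 2 2]));                            % original open loop, K = 10
GC = tf(10*conv([1 0.1],[1 6]), conv([1 0],conv([1 10],[1 2 2])));  % with the PI zero at -0.1
step(feedback(G,1), feedback(GC,1)), grid on                        % compare closed-loop steps
legend('original', 'PI compensated')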
The closed-loop poles of the original system are given by

Fig. 2.14 Step responses of the original (a) and compensated (b) systems for Example 4

λ1 = −9.5216, λ2,3 = −1.2392 ± j2.6204

For the compensated system they are

λ1c = −9.5265, λ2c,3c = −1.1986 ± j2.6109

Having obtained the closed-loop system poles, it is easy to check that the dominant system
poles are preserved for the compensated system and that the damping ratio and natural frequency
are only slightly changed. Using information about the dominant system poles we get

ζωn = 1.2392, ωn = √( (1.2392)² + (2.6204)² ) = 2.9019 ⇒ ζ = 0.4270

and

ζc ωnc = 1.1986, ωnc = √( (1.1986)² + (2.6109)² ) = 2.8901 ⇒ ζc = 0.4147
In Figure 2.15 we draw the step response of the compensated system over a long period of
time in order to show that the steady state error of this system is theoretically and practically equal
to zero.
Figures 2.15 and 2.14 are obtained by using the same MATLAB functions as those used in Example 3.
The root loci of the original and compensated systems are presented in Figures 2.16 and 2.17. It
can be seen from these figures that the root loci are almost identical, with the exception of a tiny
dipole branch near the origin.

Fig. 2.15 Step response of the compensated system for Example 4

Fig. 2.16 Root locus of the original system for Example 4

2.5.1 PID and Phase-Lag-Lead Controller Designs


It can be observed from the previous design algorithms that implementation of a PI (phase-lag) controller does not interfere with implementation of a PD (phase-lead) controller. Since these two groups of controllers are used for different purposes (one to improve the transient response, the other to improve the steady state errors), implementing them jointly and independently will take care of both controller design requirements.
Consider first a PID controller. It is represented as

GPID(s) = Kp + Kd s + Ki/s = Kd ( s² + (Kp/Kd)s + Ki/Kd )/s = Kd (s + zc1)(s + zc2)/s = GPD(s) GPI(s)
s

Fig. 2.17 Root locus of the compensated system for Example 4

which indicates that the transfer function of a PID controller is the product of transfer functions
of PD and PI controllers. Since in Design Algorithms for PD and PI there are no conflicting steps,
the design algorithm for a PID controller is obtained by combining the design algorithms for PD
and PI controllers.

Design Algorithm: PID Controller

1. Check the transient response and steady state characteristics of the original system.

2. Design a PD controller to meet the transient response requirements.

3. Design a PI controller to satisfy the steady state error requirements.

4. Check that the compensated system has the desired specifications.

Example

Consider the problem of designing a PID controller for the open-loop control system studied in
Example 3, that is

G(s) = K(s + 10) / ( (s + 1)(s + 2)(s + 12) )
In fact, in that example, we have designed a PD controller of the form

GPD (s) = s + 24.1815

such that the transient response has the desired specifications. Now we add a PI controller in
order to reduce the steady state error. The corresponding steady state error of the PD compensated

system in Example 3 is essc = 0.1074. Since a PI controller is a dipole that has its pole at the
origin, we propose the following PI controller

GPI(s) = (s + 0.1)/s

In comparison with the general PID form above, we are in fact using a PID controller with Kd = 1, zc1 = 24.1815, zc2 = 0.1.
The corresponding root locus of this system compensated by a PID controller is represented in
Figure 2.18.

Fig. 2.18 Root locus for the system from Example 3 compensated by the PID controller

It can be seen that the PI controller does not affect the root locus, and hence Figures 2.10 and
2.18 are almost identical except for a dipole branch.
On the other hand, the step responses of the system compensated by the PD controller and by the PID controller (see Figures 2.12 and 2.19) differ in the steady state parts. In Figure 2.12 the steady state step response tends to yss = 0.8926, while the response in Figure 2.19 tends to 1: due to the presence of an open-loop pole at the origin, the steady state error is reduced to zero. Thus, we can conclude that the transient response is the same as that obtained by the PD controller in Example 3, but the steady state error is improved due to the presence of the PI controller.

2.6 PID controller design using Control System Designer in MATLAB


https://www.mathworks.com/videos/control-system-design-with-control-system-tuning-app-68749.html
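The same workflow can also be started from the command line; a minimal sketch, assuming the Control System Toolbox (the plant is only an illustration):

G = tf(1, conv(conv([1 1],[1 3]),[1 5]));   % illustrative plant
C = pidtune(G, 'PID')                       % automated PID tuning as a starting point
controlSystemDesigner(G)                    % opens the interactive design app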

Fig. 2.19 Step response of the system from Example 3 compensated by the PID controller

2.7 PID Algorithm (issues and implementation)


A PID controller is much more than

u(t) = kp e(t) + ki ∫_{t₀}^{t} e(τ)dτ + kd (de(t)/dt)
We have to consider actuator limitations (saturation), sensor noise and control mode switches, which are addressed through:

• Integrator Windup

• Set point weighting

• Rate limitations (Derivative kick)

• Filtering

• Bumpless parameter changes

• Computer implementation

Dealing with these issues is a good introduction to practical aspects of any control algorithm.
A number of variations of PID controllers are useful in implementation. These include filtering
the derivative, setpoint weighting and other variations in how the derivative and integral actions
are formulated. PID controllers can be implemented using analog hardware, such as operational
amplifiers, or via digital implementations on a computer

Integrator windup, derivative kick and setpoint weighting

Integral windup can occur in a controller with integral action when actuator saturation is present.
Integral action can be implemented using automatic reset.

The first method uses a switch to break the integral action whenever the actuator goes into saturation. This can be illustrated by Fig. 2.20. Consider the schematic arrangement of the controller shown in the figure. The arrangement can be confirmed to be PI control by solving for u:

Fig. 2.20 PID with a switch to break the integral action

u = Kp e + (1/(1 + sτ)) u.  Solving for u:  u = Kp ((1 + sτ)/(sτ)) e = Kp e + (Kp/(sτ)) e    (2.1)
So when the switch is closed, the controller acts as a PI controller. On the other hand, if the switch is open, it is a simple P-controller. The switch is activated by the position of the actuator. If the actuator is operating in the linear range, the switch is closed and the controller is in PI mode. But whenever the actuator is in saturation, the switch is automatically opened and the controller becomes a P-controller. As a result, any windup due to the presence of the integral mode is avoided.
Anti-windup compensation can be used to minimize the effects of integral windup by feeding
back the difference between the commanded input and the actual input, as illustrated below:

Fig. 2.21 PID Integral windup configuration

A local feedback loop keeps the integrator output close to the actuator limits. The gain kt, or the time constant Tt = 1/kt, determines how quickly the integrator is reset.
A sudden jump in the error causes its derivative to be instantaneously large and causes the controller output to saturate for one cycle at either an upper or lower bound. While this momentary jump isn't typically a problem for most systems, a sudden saturation of the controller output can put undue stress on the final control element or potentially disturb the process.
To overcome derivative kick, it is assumed that the set point is constant with

dr/dt = 0

de(t)/dt = d(r − y)/dt = dr/dt − dy/dt = −dy/dt    (2.2)

This modification avoids derivative kick but keeps a derivative term in the PID equation.

Set Point Weighting

It is common for the closed-loop system to track a constant reference input. In this case, the input
is called a setpoint. The setpoint weighted PID is thus a generalization of the PID, and has
u(t) = Kp ep(t) + Ki ∫₀ᵗ ei(τ)dτ + Kd (ded(t)/dt)

where ep(t) = ap r(t) − y(t), ei(t) = r(t) − y(t) and ed(t) = ad r(t) − y(t).
Each term has a different "error" associated with it. Note that when ap = ad = 1 we recover the original PID design. Note also that when r(t) is a piecewise constant signal (only step changes), then for all time except at the actual step locations ṙ(t) = 0, and thus

ded(t)/dt = d/dt ( ad r(t) − y(t) ) = −ẏ(t)    (2.3)

which is independent of r(t) and ad.
In general, since y is the output of the plant, it will be a smooth function and thus ẏ will be bounded. It is thus not uncommon to let ad = 0. This eliminates spikes in the term Kd (ded(t)/dt) without substantially affecting the overall control performance.
The block diagram for set point weighting is as shown below.

Fig. 2.22 PID with The Setpoint Weighting Configuration



As we have seen, changing ad does not change the overall design. Changing ap, however, may change the design. The rationale behind ap is that if, for example, ap = 0.5, then a large step change in r(t) does not create such a large control magnitude. However, in general, ep does not go to zero when y = r. Thus, there is a persistent control applied even when it is not necessary. This persistent control is effected by the integral term as compensation for weighting the proportional term; there is therefore always a bias or reset control input associated with the proportional term. The use of this, then, is of questionable value. Setting ap = 1, however, brings us back to the original error.

Derivative kick

We noted, for instance, that it is not particularly good to differentiate step changes in the error
signal. A derivative block of a PID controller calculates the momentary derivative of the signal.
When the set value changes suddenly, we get a very large derivative output. As seen in Fig.2.23,
the derivative is not only unnecessarily large, it’s also pointed in the wrong direction. Such a
derivative term will actually impede its intended purpose.

Fig. 2.23 Open loop response of a standard derivative term

To mitigate this, we might think we need to get rid of the steepness of the e(t) curve by filtering the setpoint signal. However, filtering or slope limitation of the set point will ultimately slow down the response of the controller. The best option is to apply zero setpoint weighting for the D-term, as discussed earlier, which gives better mitigation as shown in Fig. 2.24.

Fig. 2.24 An improved derivative term (shown: open-loop response)



Filtering

In general, a true differentiator is not available. This is because true differentiation is a wide-band
process, i.e., the gain of this term increases linearly with frequency. It is a non-causal process.
It tends to amplify high-frequency noises. In addition, if the error undergoes a sharp transition,
for example, when a step input is applied, the derivative skyrockets, requiring an unreasonably
large control effort. Generally, the control signal will saturate all amplifiers etc. To deal with the
impractical nature of the D term, it is not uncommon to use the modified derivative.
Technically, filter only the derivative part:

Cfb(s) = k( 1 + 1/(sTi) + sTd/(1 + sTf) ) = kp + ki/s + kd s/(1 + sTf)

Alternatively, filter the measured signal (several advantages):

• Better noise attenuation and robustness due to high frequency roll-off

The process dynamics can then be augmented by the filter and the design can be made for an ideal PID:

Cfb(s) = (kd s² + kp s + ki) / ( s(1 + sTf) ) = ki (1 + sTi + s²TiTd) / ( s(1 + sTf) )

High frequency roll-off improves robustness and noise sensitivity.

The Proportional Controller - Proportional Band

u = Ke + ub,  where K is the gain and ub the bias or reset.

The proportional band PB is the range where the output does not saturate, often given as a percentage of the error or measured signal.

Fig. 2.25 Proportional band



Manual and Automatic Control

Most controllers have several modes (manual/automatic):

• In manual control the controller's output is adjusted manually by an operator, often by increase/decrease buttons

• Mode switching is an important issue

• Switching transients should be avoided

• Easy to do if the same integrator is used for manual and automatic control

In such cases, in order to avoid any jerk in the process, the controller output immediately after the changeover should be identical to the output set in the manual mode. This can be achieved by forcing the integral output at the instant of transfer to balance the proportional and derivative outputs against the previous manual output, i.e. integral output = (previous manual output) − (proportional + derivative) output. Similarly, for automatic to manual transfer, the manual output is initially set equal to the controller output and the difference is gradually reduced by incrementing or decrementing the manual output to the final value of the manual signal, thus effecting a changeover.
Another way to transfer from auto to manual mode in a bumpless manner is to make the set point equal to the present value of the process variable and then slowly change the set point to its desired value. The above features can easily be implemented if a digital computer is used as the controller. This provision eliminates the chance of the process receiving a sudden jolt during transfer.

2.7.1 Implementation of PID controllers


Implementation of PID controllers has evolved from pneumatic types, which were slow in nature. After the development of electronic devices and operational amplifiers, electronic controllers started replacing the conventional pneumatic controllers. With the advent of microprocessors and microcontrollers, the focus of development is now towards digital PID controllers. The major advantage of digital PID controllers is that the controller parameters can be programmed easily; as a result, they can be changed without changing any hardware. Moreover, the same digital computer can be used for a number of other applications besides generating the control action.

Computer Implementation

Practically all control systems are today implemented using computers. We will briefly discuss
some aspects of this.

AD and DA converters are needed to connect sensors and actuators to the computer. A clock is also needed to synchronize the operations. We will discuss:

• Sampling and aliasing

• A basic algorithm

• Converting differential equations to difference equations

• Wordlength issues

• Bumpless parameter changes


The following operations are executed by the computer.

1. Wait for clock interrupt

2. Convert setpoint ysp and process output y to numbers

3. Compute control signal u

4. Convert control signal to analog value

5. Update variables in control algorithm

6. Go to step 1

It is desirable to make the time between steps 1 and 4 as short as possible; defer as much as possible of the computations to step 5.
Alias and Anti-aliasing Filters

Fig. 2.26 Alias and Anti-aliasing Filters

• Nyquist frequency = (sampling frequency)/2

• High frequencies may appear as low frequencies after sampling

• To represent a continuous signal uniquely from its samples, the continuous signal cannot have frequencies above the Nyquist frequency, which is half the sampling frequency

• Anti-aliasing filters that reduce the frequency content above the Nyquist frequency are essential.
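Aliasing is easy to demonstrate; a sketch (the frequencies are arbitrary): a 9 Hz sine sampled at 10 Hz appears as a 1 Hz oscillation.

t  = 0:0.001:1;                                    % quasi-continuous time
ts = 0:0.1:1;                                      % samples taken at 10 Hz
plot(t, sin(2*pi*9*t), ts, sin(2*pi*9*ts), 'o-')   % the samples trace a 1 Hz wave
xlabel('time (s)')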

The PID Algorithm

The PID controller is described by:

U(s) = P(s) + I(s) + D(s)

P(s) = k ( bYsp(s) − Y(s) )
I(s) = (k/(sTi)) ( Ysp(s) − Y(s) )
D(s) = −k ( sTd/(1 + sTd/N) ) Y(s)
Computers can only add and multiply; they cannot integrate or take derivatives. To obtain a programmable algorithm we must approximate. There are many ways to do this.
Introduce the times tk when the clock ticks, and assume that tk − tk−1 = h, where h is the sampling period.
Proportional and Integral Action

p(tk) = k ( b ysp(tk) − y(tk) )

Integral part:

i(t) = (k/Ti) ∫ e(s)ds

Differentiate:

di/dt = (k/Ti) e(t)

Approximate the derivative by a difference:

( i(tk+1) − i(tk) )/h = k e(tk)/Ti

This equation can be written as

i(tk+1) = i(tk) + (kh/Ti) e(tk)
Derivative Part

D(s) = −k ( sTd/(1 + sTd/N) ) Y(s)

Hence

(1 + sTd/N) D(s) = −k sTd Y(s)

In the time domain

d(t) + (Td/N)(dd/dt) = −kTd (dy/dt)

Approximate the derivative by a backward difference:

d(tk) + (Td/N) ( d(tk) − d(tk−1) )/h = −kTd ( y(tk) − y(tk−1) )/h

Hence

( 1 + Td/(Nh) ) d(tk) = ( Td/(Nh) ) d(tk−1) − (kTd/h) ( y(tk) − y(tk−1) )

or

d(tk) = ( Td/(Td + Nh) ) d(tk−1) − ( kTdN/(Td + Nh) ) ( y(tk) − y(tk−1) )
Notice that the algorithm works well even if Td is small; this is not the case if forward approximations are used.


Organize Computations

p(tk) = k ( b ysp(tk) − y(tk) )
e(tk) = ysp(tk) − y(tk)
d(tk) = ( Td/(Td + Nh) ) ( d(tk−1) − kN ( y(tk) − y(tk−1) ) )
v = p(tk) + i(tk) + d(tk)
u(tk) = sat(v)
i(tk+1) = i(tk) + (kh/Ti) e(tk) + (kh/Tr) (u − v)

• Useful to precompute parameters

• Make sure updating is done safely

• Organize the code right
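These computations can be collected into a small function. A minimal MATLAB sketch (the function name, parameter struct and saturation limits are illustrative, not from the source):

function [u, state] = pid_step(ysp, y, state, par)
% One sample of the PID algorithm above.
% par fields: k, b, Ti, Td, Tr, N, h, umin, umax; state fields: i, d, yold
bi = par.k*par.h/par.Ti;                        % integral coefficient kh/Ti
ad = par.Td/(par.Td + par.N*par.h);             % derivative filter coefficient
bd = par.k*par.N*ad;                            % derivative coefficient kTdN/(Td+Nh)
br = par.k*par.h/par.Tr;                        % anti-windup (tracking) coefficient
p  = par.k*(par.b*ysp - y);                     % proportional part with set point weight b
state.d = ad*state.d - bd*(y - state.yold);     % filtered derivative part
v  = p + state.i + state.d;                     % unsaturated controller output
u  = min(max(v, par.umin), par.umax);           % actuator saturation sat(v)
state.i = state.i + bi*(ysp - y) + br*(u - v);  % integral update with anti-windup
state.yold = y;                                 % store y for the next sample
end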



Fixed-Point Implementation: Word-length Issues

Overflow and underflow.
Consider updating of the integral part:

i(tk+1) = i(tk) + (kh/Ti) e(tk)

Example

• h = 0.05 s

• Ti = 5000 s

• k=1

• kh/Ti = 10⁻⁵

If the error has 3 digits, the integral needs to be updated with 8 digits (28 bits) to avoid rounding off the errors!
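The effect is easy to demonstrate in single precision (the numbers are illustrative):

i  = single(1000);               % current value of the integral state
di = single(1e-5)*single(0.5);   % increment (kh/Ti)*e(tk) = 5e-6
(i + di) == i                    % returns true: the update is rounded away entirely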
A PID controller is often switched between three modes: off, manual and automatic control. It
is important that there are no switching transients.
Notice the difference between

I = ki(t) ∫₀ᵗ e(τ)dτ   and   I = ∫₀ᵗ ki(τ)e(τ)dτ
Integration and multiplication with a time varying function do not commute!
Some controllers require that you switch to manual mode to change parameters

• Problem is avoided by proper coding




Electronic PID Controllers:

Most analog electronic PID controllers utilized operational amplifiers in their designs. It is
relatively easy to construct circuits performing amplification (gain), integration, differentiation,
summation, and other useful control functions with just a few op-amps, resistors, and capacitors.
The following schematic diagram shows a full PID controller implemented using eight operational
amplifiers, designed to input and output voltage signals representing PV, SP, and Output:

Fig. 2.28 Analogue Electronic PID

Pneumatic Controller:

It has already been mentioned that the early PID controllers were all of the pneumatic type. The advantage of pneumatic controllers is their ruggedness, while their major limitation is their slow response. Besides, they require a clean and constant-pressure air supply. The major components of a pneumatic controller are bellows, a flapper-nozzle amplifier, an air relay and restrictors (valves). The integral and derivative actions are generated by controlling the passage of air flow through restrictors to the bellows.
A simple scheme for implementation of a pneumatic PI controller is shown in the figure below.

Fig. 2.29 Pneumatic Controller

2.7.2 Limitations of PID Control


PID control is simple and useful but there are limitations when applied in the following systems.

• Multivariable and strongly coupled systems

• Complicated dynamics

• Large parameter variations

• Robust design

• Gain scheduling and adaptation

• Difficult compromises between load disturbance attenuation and measurement noise injection
Part II

(Modern Control)
Chapter 3a

State-space analysis and state-space design methods

3.1 State-space analysis


3.1.1 Introduction
We defined the state of a system as any set of quantities which must be specified at a given time in order to completely determine the behavior of the system. The quantities constituting the state are called the state variables, and the hypothetical space spanned by the state variables is called the state-space.

Modern control theory is based on the description of system equations in terms of n first-order differential equations, which may be combined into a first-order vector-matrix differential equation. The use of vector-matrix notation greatly simplifies the mathematical representation of systems of equations.

Consider a system described by the following pair of coupled differential equations

y¨1 + k1 y˙1 + k2 y1 = u1 + k3 u2 (1.1)


ẏ2 + k4 y2 + k5 ẏ1 = k6 u1 (1.2)

where u1 and u2 are defined as the control inputs and y1 and y2 are defined as the measurements or outputs.

We now define the outputs and if necessary the derivatives of the outputs as states.
Hence, define the states

x1 = y1 , x2 = y˙1 , x3 = y2

This gives the following set of 1st order differential equations for the states

ẋ1 = x2 (1.3)
ẋ2 = −k2 x1 − k1 x2 + u1 + k3 u2 (1.4)
ẋ3 = −k5 x2 − k4 x3 + k6 u1 (1.5)

and the following measurement (output) variables

y1 = x1 (1.6)
y2 = x3 (1.7)

The model is put on matrix (State Space) form as follows


[ẋ1; ẋ2; ẋ3] = [0 1 0; −k2 −k1 0; 0 −k5 −k4][x1; x2; x3] + [0 0; 1 k3; k6 0][u1; u2] (1.8)

[y1; y2] = [1 0 0; 0 0 1][x1; x2; x3] (1.9)

and finally in matrix form as follows

ẋ = Ax + Bu (1.10)
y = Cx + Du (1.11)

where u is the control vector, x is the state vector, y is the measurements vector and
x0 = x(t0 ) is the initial value of the state vector, which usually is assumed to be known.

Figure 1.1: Block diagram in state space
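As a sketch, the model of Eqs. (1.8)-(1.9) can be entered in MATLAB as follows; the numerical values of k1,...,k6 are assumed purely for illustration.

% State-space model of Eqs. (1.8)-(1.9); the constants k1..k6 are assumed values.
k1 = 1; k2 = 2; k3 = 0.5; k4 = 3; k5 = 0.4; k6 = 2;
A = [0 1 0; -k2 -k1 0; 0 -k5 -k4];
B = [0 0; 1 k3; k6 0];
C = [1 0 0; 0 0 1];
D = zeros(2,2);
sys = ss(A, B, C, D);   % two-input, two-output model (Control System Toolbox)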


3.1.2 Stability: System Characteristics
We defined the characteristics of a system by its characteristic equation, whose roots are
the poles of the system. We also saw how the locations of the poles indicate a system’s
performance - such as natural frequency, damping factor, system type - as well as whether
the system is stable.

The characteristic equation was defined to be the denominator polynomial of the system's transfer function (or transfer matrix) equated to zero. Hence, we should first obtain an expression for the transfer matrix in terms of the state-space coefficient matrices A, B, C, D.

Recall that the transfer matrix is obtained by taking the Laplace transform of the governing differential equations, for zero initial conditions. Taking the Laplace transform of both sides of the matrix state-equation, assuming zero initial conditions (i.e. x(0) = 0), yields the following result:

sX(s) = AX(s) + BU (s) (1.12)


We can write the same as
(sI − A)X(s) = BU (s) (1.13)
or
X(s) = (sI − A)−1 BU (s) (1.14)
Similarly, taking the Laplace transform of the output equation
Y (s) = CX(s) + DU (s) (1.15)
Substituting equation 1.14 in equation 1.15 we get
Y (s) = C(sI − A)−1 BU (s) + DU (s) = [C(sI − A)−1 B + D]U (s) (1.16)
From Eq.1.16, it is clear that the transfer matrix, G(s), defined by Y (s) = G(s)U (s)
is given as:
G(s) = C(sI − A)−1 B + D (1.17)
Equation 1.17 tells us that the transfer matrix is a sum of the rational matrix (i.e. a
matrix whose elements are ratios of polynomials in s), C(sI − A)−1 B, and the matrix D.
Thus, D represents a direct connection between the input, U(s), and the output, Y(s), and
is called the direct transmission matrix. Systems having D = 0 are called strictly proper, because the numerator polynomials of the elements of G(s) are smaller in degree than the corresponding denominator polynomials. Hence, the characteristic polynomial of the system must be related to the denominator polynomial resulting from the matrix C(sI − A)−1 B.
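As a sketch, Eq. (1.17) can be evaluated numerically for the model sys constructed earlier (Control System Toolbox assumed):

G = tf(sys)     % transfer matrix G(s) = C(sI - A)^(-1)B + D
pole(sys)       % its poles, which equal the eigenvalues of A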
Example: 1

Substituting Eqs. (3.107) and (3.108) into Eq. (3.106) we get

Note that for the single-input, single-output system of the example above, the transfer function has denominator polynomial s² − 2s + 5, which is also the characteristic polynomial of the system. The denominator polynomial is equal to |sI − A| (Eq. (3.107)). Thus, the poles of the transfer function are the roots of the characteristic equation, |sI − A| = 0. This is also true for the multivariable system.
Example: 2

Obtain the transfer function for the system

ẋ = Ax + [0; 0; 1]u
y = [1 0 1]x

Solution

Form (sI − A) and invert it using the adjugate,

(sI − A)−1 = adj(sI − A)/∆(s)

where ∆(s) = |sI − A| = s³ + 5s² + 10s + 6. The transfer function is then given by

T(s) = C(sI − A)−1 B = [1 0 1] adj(sI − A) [0; 0; 1] / ∆(s)

so that the denominator of T(s) is the characteristic polynomial ∆(s) = s³ + 5s² + 10s + 6.

Using linear algebra, the characteristic equation of a general, linear time-invariant system
is obtained from the following eigenvalue problem for the system:

Avk = λk vk (1.18)

where λk is the k-th eigenvalue of the matrix A, and vk is the eigenvector associated with the eigenvalue λk. The same equation can be written as

(λI − A)v = 0 (1.19)

For the nontrivial solution of Eq. 1.19 (i.e. v ≠ 0), the following must be true:

|(λI − A)| = 0 (1.20)

Equation 1.20 is another way of writing the characteristic equation, whose roots are the eigenvalues of A. Hence, the poles of the transfer matrix are the same as the eigenvalues of the matrix A. Since A contains information about the characteristic equation of a system, it influences all the properties such as stability, performance and robustness of the system. For this reason, A is called the system's state-dynamics matrix.

Transition matrix:
The free (homogeneous) response of a system can be used to find the state transition matrix. The homogeneous state equation is given as

ẋ(t) = Ax(t) (1.21)

Taking laplace transform of both sides

sX(s) − X(0) = AX(s) (1.22)

Thus
(sI − A)X(s) = X(0) (1.23)
or
X(s) = (sI − A)−1 X(0) (1.24)
Taking Inverse Laplace transform

x(t) = ℒ−1[(sI − A)−1] X(0) (1.25)

but
(sI − A)−1 = I/s + A/s² + A²/s³ + · · · (1.26)
Hence
ℒ−1[(sI − A)−1] = I + At + A²t²/2! + A³t³/3! + · · · = eAt (1.27)
Thus
x(t) = eAt x(0) (1.28)

where eAt is the transition matrix.
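A short numerical sketch: for an assumed A, the transition matrix at a given time can be evaluated in MATLAB with the matrix exponential expm.

A   = [0 1; -2 -3];     % assumed state-dynamics matrix
t   = 0.5;
Phi = expm(A*t);        % transition matrix e^(At) at time t
x0  = [1; 0];
x_t = Phi*x0;           % free response x(t) = e^(At) x(0)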


The forced (particular) response of the system can be found using convolution.
For the state equation
ẋ = Ax + Bu (1.29)
Taking laplace transform yields

sX(s) − X(0) = AX(s) + BU (s) (1.30)

or
(sI − A)X(s) = X(0)+BU (s) (1.31)
or
X(s) = (sI − A)−1 X(0) + (sI − A)−1 BU (s) (1.32)
but
(sI − A)−1 = ℒ[eAt] (1.33)
Thus
X(s) = ℒ[eAt]X(0) + ℒ[eAt]BU(s) (1.34)
Using convolution for the forced component
x(t) = eAt x(0) + ∫₀ᵗ eA(t−τ) Bu(τ)dτ (1.35)

This assumed the initial time to be zero. If the initial time is given as t0 instead of 0, the solution is modified to

x(t) = eA(t−t0) x(t0) + ∫ₜ₀ᵗ eA(t−τ) Bu(τ)dτ (1.36)

The solution consists of two parts. The first part represents the autonomous response (homogeneous solution), driven only by initial values different from zero. The second term represents the inhomogeneous solution, driven by the control variable u(t).

The output equation can also be solved in time domain by substituting the state solution
into the output equation
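As a sketch, Eq. (1.35) can be evaluated by discretizing the convolution integral with a zero-order hold on u; the system and initial state below are assumptions for illustration.

A = [0 1; -2 -3]; B = [0; 1]; x = [1; 0];  % assumed system and x(0)
h = 0.001; T = 0:h:5; u = ones(size(T));   % unit-step input
X = zeros(2, numel(T)); X(:,1) = x;
Ah = expm(A*h);                            % e^(Ah)
Bh = A\(Ah - eye(2))*B;                    % integral of e^(As)B ds over one step
for k = 1:numel(T)-1
    x = Ah*x + Bh*u(k);                    % x(t+h) = e^(Ah)x(t) + convolution term
    X(:,k+1) = x;
end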


Understanding the Stability Criteria through the State-Transition Matrix

The elements of eAt are linear combinations of eλkt, where λk for k = 1, 2, . . . , n are the distinct eigenvalues of the system. Such elements can be expressed as eakt multiplied by oscillatory terms, sin(bkt) and cos(bkt), where ak and bk are the real and imaginary parts of the k-th eigenvalue, λk = ak + ibk.

If the real parts ak of all the eigenvalues are negative (−), the initial responses of all the state variables will decay to zero as time t becomes large, due to the presence of eakt as a factor in all the elements of eAt. Hence, a system with all eigenvalues having negative real parts is asymptotically stable. This is the first stability criterion (asymptotic stability).

By the same token, if any eigenvalue λk has a positive real part ak, then the corresponding factor eakt will diverge to infinity as time t becomes large, signifying an unstable system. This is the second stability criterion.

If a zero eigenvalue is repeated twice, it leads to the presence of terms such as ct, where c is a constant, in the elements of eAt. More generally, if an eigenvalue λk which is repeated twice has zero real part (i.e. λk = ibk), then eAt will have terms such as t sin(bkt) and t cos(bkt) - and their combinations - in its elements. If an eigenvalue with zero real part is repeated thrice, then eAt will have combinations of t²sin(bkt), t²cos(bkt), t sin(bkt), and t cos(bkt) in its elements. Similarly, for eigenvalues with zero real parts repeated a larger number of times, there will be higher powers of t present as coefficients of the oscillatory terms in the elements of eAt. Hence, if any eigenvalue λk having zero real part is repeated two or more times, the presence of powers of t as coefficients of the oscillatory terms sin(bkt) and cos(bkt) causes elements of eAt to blow up as time t increases, thereby indicating an unstable system. This is the third stability criterion.

Note that individual initial responses to a specific initial condition may not be sufficient to tell us whether a system is stable; hence the need for BIBO stability.

3.1.3 Controllability
The question of how we can (and if it is possible to) find a suitable control input u(t) that
will take the system from an initial state x(t0 ) to any desired final state x(t1 ) in a finite
(often very small) time, is answered by the theory of state controllability.
Definition 1.2 (State Controllability)
A system is said to be controllable if a control vector u(t) exists that will transfer the system from any initial state x(t0) to some final state x(t1) in a finite time interval.

A system described by a state space model ẋ = Ax + Bu with initial state x(t0) given is controllable if for an arbitrary finite time t1 > t0 there exists a control function u(t), defined over the time interval t0 ≤ t ≤ t1, such that the final state x(t1) can be arbitrarily specified.
There exist a few algebraic criteria which can be used for the analysis of state controllability. One such theorem is formulated via the so-called controllability matrix.
Applying the definition of complete state controllability we have

x(t1) = 0 = eAt1 x(0) + ∫₀ᵗ¹ eA(t1−τ) Bu(τ)dτ (1.37)

or
x(0) = − ∫₀ᵗ¹ e−Aτ Bu(τ)dτ (1.38)

From Sylvester's method for the transition matrix,

e−Aτ = Σₖ₌₀ⁿ⁻¹ αk(τ) Aᵏ (1.39)

Substituting this in the above equation,

x(0) = − Σₖ₌₀ⁿ⁻¹ Aᵏ B ∫₀ᵗ¹ αk(τ)u(τ)dτ (1.40)

Let
∫₀ᵗ¹ αk(τ)u(τ)dτ = βk (1.41)
Thus
x(0) = − Σₖ₌₀ⁿ⁻¹ Aᵏ B βk (1.42)
10
or

x(0) = − [B AB . . . An−1B] [β0; β1; . . . ; βn−1] (1.43)

If the system is completely state controllable, then the above equation must be satisfied. This requires that the columns of [B AB A²B . . . An−1B] be linearly independent, i.e. that the rank of the matrix [B AB A²B . . . An−1B] be n (full rank).
Theorem 1.11.1 (Controllability matrix)
A system described by a state space model ẋ = Ax + Bu is state controllable if the controllability matrix

Cn = [B AB A²B . . . An−1B] (1.44)

has full rank, i.e.,

rank(Cn) = n (1.45)

Note that for single-input systems, i.e., r = 1, Cn is a square n × n matrix, which implies that Cn should be invertible, i.e. det(Cn) ≠ 0, in order for the system to be state controllable.
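The rank test of Theorem 1.11.1 is one line in MATLAB; a sketch, assuming A and B are already in the workspace (ctrb requires the Control System Toolbox):

Cn = ctrb(A, B);                          % [B AB A^2B ... A^(n-1)B]
is_controllable = (rank(Cn) == size(A,1))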

Controllability of a system can be easily determined if we can decouple its state equations. Each decoupled scalar state equation corresponds to a sub-system. If any of the decoupled state equations of the system is unaffected by the input vector, then it is not possible to change the corresponding state variable using the input, and hence, the sub-system is uncontrollable. If any sub-system is uncontrollable, i.e. if any of the state variables is unaffected by the input vector, then it follows that the entire system is uncontrollable.
Controllability by Inspection
We can explore controllability from another viewpoint: that of the state equation itself.
When the system matrix is diagonal, as it is for the parallel form, it is apparent whether or
not the system is controllable. For example,
ẋ = [−a1 0 0; 0 −a2 0; 0 0 −a3] x + [1; 1; 1] u (1.46)

or

ẋ1 = −a1 x1 + u
ẋ2 = −a2 x2 + u (1.47)
ẋ3 = −a3 x3 + u
11
Since each state equation in Eqs. 1.46 and 1.47 is independent and decoupled from the rest, the control u affects each of the state variables. This is controllability from another perspective.

Now let us look at the state equations for the system


ẋ = [−a4 0 0; 0 −a5 0; 0 0 −a6] x + [0; 1; 1] u (1.48)

or

ẋ4 = −a4 x4
ẋ5 = −a5 x5 + u (1.49)
ẋ6 = −a6 x6 + u
From the state equations in Eqs. 1.48 or Eqs. 1.49, we see that state variable x4 is not
controlled by the control u. Thus, the system is said to be uncontrollable. In summary, a
system with distinct eigenvalues and a diagonal system matrix is controllable if the input
coupling matrix B does not have any rows that are zero.

3.1.4 Observability
Definition 1.3 (State Observability)
A system is said to be observable if at time t0 the system state x(t0) can be exactly determined from observation of the output y(t) and inputs u(t) over a finite time interval.

A system described by a state space model ẋ = Ax + Bu and y = Cx with initial state x(t0) is observable if, from knowledge of the known inputs u(t) and outputs y(t) over a time interval t0 ≤ t ≤ t1, it is possible to compute the (initial) state vector x(t0).

Theorem 1.12.1 (Observability matrix)

Define the observability matrix

Oi = [C; CA; CA²; . . . ; CAi−1] (1.50)

The pair (C, A) is observable if and only if the observability matrix Oi for i = n has rank n, i.e. rank(On) = n.
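Correspondingly, a sketch of the observability test (A and C assumed to be in the workspace):

On = obsv(A, C);                        % [C; CA; CA^2; ...; CA^(n-1)]
is_observable = (rank(On) == size(A,1))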

Simply stated, observability is the ability to deduce the state variables from a knowledge of the input, u(t), and the output, y(t).
Observability by Inspection
We can also explore observability from the output equation of a diagonalized system. The output equation for the diagonalized system of Figure 12.15(a) is

y = Cx = [1 1 1] x (1.51)

On the other hand, the output equation for the unobservable system is

y = Cx = [0 1 1] x (1.52)

Notice that the first column of Eq. (1.52) is zero. For systems represented in parallel form
with distinct eigenvalues, if any column of the output coupling matrix is zero, the diagonal
system is not observable.

State-space system analysis tutorials

Stability Analysis
1. Transfer matrix (poles)
2. State dynamics matrix A (eigenvalues)
3. Transition matrix (transient and BIBO stability)

Example 3.1 Consider the two-dimensional state equation


[ẋ1(t); ẋ2(t)] = [0 1; −2 −3][x1(t); x2(t)] + [0; 1]u(t),   x(0) = x0 = [−1; 1]

y(t) = [1 1][x1(t); x2(t)]

From
sI − A = [s 0; 0 s] − [0 1; −2 −3] = [s −1; 2 s+3]

we find, by computing the matrix inverse and performing partial fraction expansion on each element,

(sI − A)−1 = adj(sI − A)/|sI − A| = [s+3 1; −2 s]/(s² + 3s + 2)

= [ (s+3)/((s+1)(s+2))   1/((s+1)(s+2));
    −2/((s+1)(s+2))      s/((s+1)(s+2)) ]

= [ 2/(s+1) − 1/(s+2)    1/(s+1) − 1/(s+2);
    −2/(s+1) + 2/(s+2)   −1/(s+1) + 2/(s+2) ]
It follows directly that

eAt = ℒ−1[(sI − A)−1] = [ 2e−t − e−2t     e−t − e−2t;
                          −2e−t + 2e−2t   −e−t + 2e−2t ],   t ≥ 0

For the specified initial state, the zero-input response components of the state and output are

xzi(t) = eAt x0 = [−e−t; e−t],   yzi(t) = CeAt x0 = Cxzi(t) = 0,   t ≥ 0

For a unit-step input signal, the Laplace domain representations of the zero-state components of the state and output response are

Xzs(s) = (sI − A)−1 BU(s)
       = [s+3 1; −2 s]/(s² + 3s + 2) · [0; 1] · (1/s)
       = [1/((s+1)(s+2)); s/((s+1)(s+2))] · (1/s)
       = [1/(s(s+1)(s+2)); 1/((s+1)(s+2))]
       = [ (1/2)/s − 1/(s+1) + (1/2)/(s+2);
           1/(s+1) − 1/(s+2) ]

Yzs(s) = CXzs(s) + DU(s)
       = [1 1] Xzs(s) + [0](1/s)
       = (1/2)/s − (1/2)/(s+2)

from which

xzs(t) = [1/2 − e−t + (1/2)e−2t; e−t − e−2t],   yzs(t) = (1/2)(1 − e−2t),   t ≥ 0

and the complete state and output responses then are

x(t) = [1/2 − 2e−t + (1/2)e−2t; 2e−t − e−2t],   y(t) = (1/2)(1 − e−2t),   t ≥ 0

Finally, the transfer function is given as

H(s) = C(sI − A)−1 B + D
     = [1 1] [s+3 1; −2 s]/(s² + 3s + 2) [0; 1] + 0
     = (s + 1)/(s² + 3s + 2) = (s + 1)/((s + 1)(s + 2)) = 1/(s + 2)

with associated impulse response

h(t) = e−2t,   t ≥ 0 
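As a sketch, Example 3.1 can be verified numerically in MATLAB (Control System Toolbox assumed):

A = [0 1; -2 -3]; B = [0; 1]; C = [1 1]; D = 0; x0 = [-1; 1];
sys = ss(A, B, C, D);
minreal(tf(sys))              % reduces to 1/(s+2) after cancelling (s+1)
t = 0:0.01:5;
[y, t, x] = lsim(sys, ones(size(t)), t, x0);  % complete response to a unit step
plot(t, y)                    % matches y(t) = (1/2)(1 - e^(-2t))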

Controllability analysis

1. Through controllability matrix


2. Through inspection after decoupling the state equations

3.4 CONTROLLABILITY EXAMPLES


Example 3.4 Given the following single-input two-dimensional linear state equation, we now assess its controllability.

[ẋ1(t); ẋ2(t)] = [1 5; 8 4][x1(t); x2(t)] + [−2; 2]u(t)

The controllability matrix P is found as follows:

B = [−2; 2],   AB = [8; −8],   P = [−2 8; 2 −8]
Clearly, |P| = 0, so the state equation is not controllable. To see why this is true, consider a different state definition

[z1(t); z2(t)] = [x1(t); x1(t) + x2(t)]

The associated coordinate transformation is

x(t) = Tz(t)
[x1(t); x2(t)] = [1 0; −1 1][z1(t); z2(t)]

Applying this coordinate transformation yields the transformed state equation

[ż1(t); ż2(t)] = [−4 5; 0 9][z1(t); z2(t)] + [−2; 0]u(t)

We see that ż2(t) does not depend on the input u(t), so this state variable is not controllable. 

Example 3.5 Given the following three-dimensional single-input state equation, that is,

[ẋ1(t); ẋ2(t); ẋ3(t)] = [0 1 0; 0 0 1; −6 −11 −6][x1(t); x2(t); x3(t)] + [0; 1; −3]u(t)

we construct the controllability matrix P using

B = [0; 1; −3]

AB = [0 1 0; 0 0 1; −6 −11 −6][0; 1; −3] = [1; −3; 7]

A²B = A(AB) = [0 1 0; 0 0 1; −6 −11 −6][1; −3; 7] = [−3; 7; −15]

This yields

P = [B AB A²B] = [0 1 −3; 1 −3 7; −3 7 −15]

To check controllability, we calculate

|P| = |[B AB A²B]|
    = | 0 1 −3; 1 −3 7; −3 7 −15 |
    = [0 + (−21) + (−21)] − [(−27) + 0 + (−15)]
    = −42 − (−42)
    = 0

and thus rank P < 3. This indicates that the state equation is not controllable. The upper left 2 × 2 submatrix

[0 1; 1 −3]

has nonzero determinant, indicating that rank P = 2. 

Example 3.6 We investigate the controllability of the three-dimensional state equation

[ẋ1(t); ẋ2(t); ẋ3(t)] = [0 1 0; 0 0 1; −a0 −a1 −a2][x1(t); x2(t); x3(t)] + [0; 0; 1]u(t)

y(t) = [b0 b1 b2][x1(t); x2(t); x3(t)]

which the reader will recall is a state-space realization of the transfer function

H(s) = (b2s² + b1s + b0)/(s³ + a2s² + a1s + a0)

The controllability matrix P is found as follows:

B = [0; 0; 1],   AB = [0; 1; −a2],   A²B = [1; −a2; a2² − a1]

P = [B AB A²B] = [0 0 1; 0 1 −a2; 1 −a2 a2² − a1]
Example 3.7 Given the five-dimensional, two-input state equation

[ẋ1(t); ẋ2(t); ẋ3(t); ẋ4(t); ẋ5(t)] =
[0 1 0 0 0; 0 0 0 0 0; 0 0 0 1 0; 0 0 0 0 1; 0 0 0 0 0][x1(t); x2(t); x3(t); x4(t); x5(t)]
+ [0 0; 1 0; 0 0; 0 0; 0 1][u1(t); u2(t)]

the controllability matrix is

P = [b1 b2 | Ab1 Ab2 | A²b1 A²b2 | A³b1 A³b2 | A⁴b1 A⁴b2]

  = [0 0 | 1 0 | 0 0 | 0 0 | 0 0;
     1 0 | 0 0 | 0 0 | 0 0 | 0 0;
     0 0 | 0 0 | 0 1 | 0 0 | 0 0;
     0 0 | 0 1 | 0 0 | 0 0 | 0 0;
     0 1 | 0 0 | 0 0 | 0 0 | 0 0]

This state equation is controllable because P has full-row rank, as can


be seen from the pattern of ones and zeros. Also, columns 1, 2, 3, 4,
and 6 form a linearly independent set. Since the remaining columns are
each zero vectors, this turns out to be the only way to select five linearly
independent columns from P . 

Example 3.8 Now consider the following five-dimensional, two-input state equation

[ẋ1(t); ẋ2(t); ẋ3(t); ẋ4(t); ẋ5(t)] =
[0 1 0 0 0; 1 0 −1 0 1; 0 0 0 1 0; 0 0 0 0 1; 0 −1 0 −1 0][x1(t); x2(t); x3(t); x4(t); x5(t)]
+ [0 0; 1 −1; 0 0; 0 0; 1 1][u1(t); u2(t)]

This equation differs from the preceding example only in the second and fifth rows of both A and B. These adjustments yield the more interesting controllability matrix

P = [b1 b2 | Ab1 Ab2 | A²b1 A²b2 | A³b1 A³b2 | A⁴b1 A⁴b2]

  = [0  0 |  1 −1 |  1  1 |  0  0 | −2 −2;
     1 −1 |  1  1 |  0  0 | −2 −2 |  2 −2;
     0  0 |  0  0 |  1  1 | −1  1 | −2 −2;
     0  0 |  1  1 | −1  1 | −2 −2 |  1 −1;
     1  1 | −1  1 | −2 −2 |  1 −1 |  4  4]
which also has rank equal to 5. If we search from left to right for five linearly independent columns, we find that the first five qualify. However, there are many more ways to select five linearly independent columns from this controllability matrix, as the reader may verify with the aid of MATLAB.
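One such MATLAB check is sketched below (Control System Toolbox assumed):

A = [0 1 0 0 0; 1 0 -1 0 1; 0 0 0 1 0; 0 0 0 0 1; 0 -1 0 -1 0];
B = [0 0; 1 -1; 0 0; 0 0; 1 1];
P = ctrb(A, B);
rank(P)          % returns 5, confirming controllability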
Observability analysis

4.2 OBSERVABILITY EXAMPLES

Example 4.1 Consider the two–dimensional single-output state equation

      
[ẋ1(t); ẋ2(t)] = [1 5; 8 4][x1(t); x2(t)] + [−2; 2]u(t)

y(t) = [2 2][x1(t); x2(t)] + [0]u(t)

for which the associated (A, B) pair is the same as in Example 3.4. The
observability matrix Q is found as follows:

C = [2 2]
CA = [2 2][1 5; 8 4] = [18 18]
so
Q = [2 2; 18 18]
Clearly |Q| = 0 so the state equation is not observable. Because rank
Q < 2 but Q is not the 2 × 2 zero matrix, we have rankQ = 1 and
nullityQ = 1.
To see why this state equation is not observable, we again use the state
coordinate transformation given by:
[z1(t); z2(t)] = [x1(t); x1(t) + x2(t)] = [1 0; 1 1][x1(t); x2(t)]

which yields the transformed state equation


[ż1(t); ż2(t)] = [−4 5; 0 9][z1(t); z2(t)] + [−2; 0]u(t)

y(t) = [0 2][z1(t); z2(t)] + [0]u(t)

Here both the state variable z2 (t) and the output y(t) are decoupled from
z1 (t). Thus, z1 (0) cannot be determined from measurements of the zero-
input response yzi (t) = 2e9t z2 (0). This is why the given state equation is
not observable.
Also, note that x0 = [1, −1]T satisfies Qx0 = [0, 0]T and we conclude
from the proof of Theorem 4.2 that x0 is a nonzero unobservable state. 

Example 4.2 Given the following three-dimensional single-output homogeneous state equation, that is,

[ẋ1(t); ẋ2(t); ẋ3(t)] = [0 0 −6; 1 0 −11; 0 1 −6][x1(t); x2(t); x3(t)]

y(t) = [0 1 −3][x1(t); x2(t); x3(t)]

we construct the observability matrix as follows:

C = [0 1 −3]

CA = [0 1 −3][0 0 −6; 1 0 −11; 0 1 −6] = [1 −3 7]

CA² = (CA)A = [1 −3 7][0 0 −6; 1 0 −11; 0 1 −6] = [−3 7 −15]

yielding

Q = [C; CA; CA²] = [0 1 −3; 1 −3 7; −3 7 −15]

To check observability, we calculate

|Q| = | C; CA; CA² |
    = | 0 1 −3; 1 −3 7; −3 7 −15 |
    = [0 + (−21) + (−21)] − [(−27) + 0 + (−15)]
    = −42 − (−42)
    = 0

and thus rank Q < 3. This indicates that the state equation is not observable, so there exist nonzero unobservable states for this state equation. The upper left 2 × 2 submatrix

[0 1; 1 −3]

has nonzero determinant, indicating that rank Q = 2 and nullity Q = 3 − 2 = 1 (by Sylvester's law of nullity). Consequently, any nonzero solution to the homogeneous equation

Qx0 = 0

will yield a nonzero unobservable state. Applying elementary row operations to the observability matrix Q yields the row-reduced echelon form

QR = [1 0 −2; 0 1 −3; 0 0 0]

from which an easily identified solution to

QR x0 = 0

is

x0 = [2; 3; 1]

Moreover, any nonzero scalar multiple of this solution also yields a nonzero unobservable state. 

Example 4.3 We investigate the observability of the three-dimensional state equation

[ẋ1(t); ẋ2(t); ẋ3(t)] = [0 0 −a0; 1 0 −a1; 0 1 −a2][x1(t); x2(t); x3(t)] + [b0; b1; b2]u(t)

y(t) = [0 0 1][x1(t); x2(t); x3(t)]

which the reader will recall is a state-space realization of the transfer function

H(s) = (b2s² + b1s + b0)/(s³ + a2s² + a1s + a0)

The observability matrix Q is found as follows:

Q = [C; CA; CA²] = [0 0 1; 0 1 −a2; 1 −a2 a2² − a1]

This observability matrix is identical to the controllability matrix P from Example 3.6. The observability matrix Q is independent of the transfer function numerator coefficients b0, b1 and b2. The determinant of the observability matrix is |Q| = −1 ≠ 0, so the state equation is observable. Note that this outcome is independent of the characteristic polynomial coefficients a0, a1 and a2, so a state-space realization in this form is always observable. This is also true for any system order n, as we will demonstrate shortly. 

Example 4.4 Consider the five-dimensional, two-output homogeneous state equation

[ẋ1(t); ẋ2(t); ẋ3(t); ẋ4(t); ẋ5(t)] =
[0 0 0 0 0; 1 0 0 0 0; 0 0 0 0 0; 0 0 1 0 0; 0 0 0 1 0][x1(t); x2(t); x3(t); x4(t); x5(t)]

[y1(t); y2(t)] = [0 1 0 0 0; 0 0 0 0 1][x1(t); x2(t); x3(t); x4(t); x5(t)]

The observability matrix is constructed as follows

Q = [C; CA; CA²; CA³; CA⁴]

  = [0 1 0 0 0;
     0 0 0 0 1;
     1 0 0 0 0;
     0 0 0 1 0;
     0 0 0 0 0;
     0 0 1 0 0;
     0 0 0 0 0;
     0 0 0 0 0;
     0 0 0 0 0;
     0 0 0 0 0]

Q has full-column rank 5 because of the pattern of ones and zeros. There-
fore, the state equation is observable. Furthermore, rows 1, 2, 3, 4, and 6
form a linearly independent set of 1 × 5 row vectors. Since the remaining
rows are each 1 × 5 zero vectors, there is only one way to select five
linearly independent rows from Q. 

Example 4.5 Consider now the five-dimensional, two-output homogeneous state equation

[ẋ1(t); ẋ2(t); ẋ3(t); ẋ4(t); ẋ5(t)] =
[0 1 0 0 0; 1 0 0 0 −1; 0 −1 0 0 0; 0 0 1 0 −1; 0 1 0 1 0][x1(t); x2(t); x3(t); x4(t); x5(t)]

[y1(t); y2(t)] = [0 1 0 0 1; 0 −1 0 0 1][x1(t); x2(t); x3(t); x4(t); x5(t)]
20

differing from the preceding example only in the second and fifth columns of both A and C. These modifications lead to the observability matrix

Q = [C; CA; CA²; CA³; CA⁴]

  = [ 0  1  0  0  1;
      0 −1  0  0  1;
      1  1  0  1 −1;
     −1  1  0  1  1;
      1  0  1 −1 −2;
      1  0  1  1 −2;
      0 −2 −1 −2  1;
      0 −2  1 −2 −1;
     −2  2 −2  1  4;
     −2 −2 −2 −1  4]
This observability matrix also has rank equal to 5, indicating that the state
equation is observable. In contrast to the preceding example, however,
there are many ways to select five linearly independent rows from this
observability matrix, as the reader may verify with the aid of MATLAB.
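The corresponding MATLAB check is sketched below:

A = [0 1 0 0 0; 1 0 0 0 -1; 0 -1 0 0 0; 0 0 1 0 -1; 0 1 0 1 0];
C = [0 1 0 0 1; 0 -1 0 0 1];
Q = obsv(A, C);
rank(Q)          % returns 5, confirming observability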
Lecture 8

State Feedback
Designing controllers in state space.
8.1 State Feedback
We assume that the process to be controlled is given in state space form, i.e.
ẋ = Ax + Bu
y = Cx (8.1)
For simplicity we assume that the process lacks a direct term, i.e. that the matrix D = 0. This is a realistic assumption since processes with a direct term are uncommon.
The transfer function of the process is given by

Y (s) = C (s I − A)−1 BU (s)


where the denominator polynomial
det(s I − A)
is the characteristic polynomial of the process.
We further assume that we can measure all process states. This is obviously an unrealistic assumption in most cases. In the next chapter we will, however, see that this assumption can be relaxed and that the states can be computed from the only signals we normally possess, i.e. the control and output signals. The controller structure is shown in Figure 8.1.

Figure 8.1 State feedback.

The controller equation becomes

u = kr r − k1 x1 − k2 x2 − · · · − kn xn = kr r − K x (8.2)
where the vectors K and x are given by
K = [k1 k2 · · · kn],   x = [x1; x2; . . . ; xn]


In state feedback the control signal is hence a linear combination of the state
variables and the setpoint.

The Closed-Loop System If we combine the control law (8.2) with the process
model (8.1) the state-space description of the closed-loop system is obtained.

ẋ = ( A − B K ) x + Bkr r
(8.3)
y = Cx

Here, the setpoint r is our new input. The corresponding transfer function is given
by
Y (s) = C (s I − ( A − B K ))−1 Bkr R(s)
where the characteristic polynomial has become

det(s I − ( A − B K ))

The state feedback has changed the matrix A of the open-loop system (8.1) into
A − B K, which is the corresponding matrix for the closed-loop system (8.3). Since K
can be chosen arbitrarily, we have a certain freedom in determining the eigenvalues
of this matrix.
The controller parameter kr is apparently unaffected by the pole placement of the closed-loop system. To begin with we shall choose kr such that the static gain of the closed-loop system becomes unity, in order to achieve y = r in stationarity. Alternatively, we may introduce integral action to achieve this goal.
We will now illustrate the synthesis method by an example.

Example 8.1—State feedback control of an electric motor


The transfer function of an electric motor is given by

GP(s) = 100/(s(s + 10))

where the current is the input and the shaft angle is the output. The transfer
function can be divided into two parts according to Figure 8.2, where we have also
marked the angular speed. If we introduce the angle and the angular speed as states,
the process can be described in state-space form as

ẋ1 = −10x1 + 100u


ẋ2 = x1
y = x2

which can be written in the matrix form

ẋ = [−10 0; 1 0] x + [100; 0] u

y = [0 1] x

Figure 8.2 Block diagram of the motor in Example 8.1. The state x1 corresponds to the angular speed, while the state x2 represents the angle.


Figure 8.3 The state feedback motor control in Example 8.1.

Now we establish feedback connections from the angle and the angular speed
according to Figure 8.3. The control law

u = kr r − k1 x1 − k2 x2 = kr r − K x

yields the closed-loop system


ẋ = [−10 − 100k1   −100k2; 1   0] x + [100; 0] kr r

y = [0 1] x

The characteristic polynomial becomes

det(sI − (A − BK)) = | s + 10 + 100k1   100k2; −1   s |
                   = s² + (10 + 100k1)s + 100k2

Since we can achieve an arbitrary second-order characteristic polynomial by choosing the parameters k1 and k2, the poles of the closed-loop system can be placed freely. Assume that the characteristic polynomial of the closed-loop system is given by

s² + 2ζωs + ω²

This yields the following controller parameters

k1 = (2ζω − 10)/100,   k2 = ω²/100

Now it remains to determine the parameter kr such that y = r in stationarity. One could always determine kr by calculating the closed-loop transfer function and thereafter assure that G(0) = 1, i.e. that the static gain becomes unity. It is, however, often more efficient to investigate the stationary relation ẋ = 0 directly in the state-space description. For our example it holds that

ẋ = 0 = [−10 − 100k1   −100k2; 1   0] x + [100; 0] kr r

y = [0 1] x

Figure 8.4 State feedback control of the motor in Example 8.1. The figures to the left show the controller corresponding to ζ = 0.7 and ω = 10, 20, 30, where the fastest control action is achieved by the highest frequency. The figures to the right show the control performance for ω = 20 and ζ = 0.5, 0.7, 0.9, where the most damped control action corresponds to the largest relative damping.

The second state-space equation implies that x1 = 0 in stationarity. That this must
be the case is obvious, since x1 represents the angular speed. By inserting x1 = 0
into the first equation and exploiting that y = x2 we observe that y = r if

kr = k2

We have now completed our synthesis. Figure 8.4 shows step responses corre-
sponding to different choices of the design parameters ζ and ω . The figure shows
that the design parameter ω is an efficient way to specify the speed of a system and
that the relative damping ζ is a good measure of the damping of the system.
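A sketch of this design in MATLAB (Control System Toolbox assumed; ζ = 0.7 and ω = 20 are one of the parameter choices from Figure 8.4):

A = [-10 0; 1 0]; B = [100; 0]; C = [0 1];
zeta = 0.7; w = 20;
k1 = (2*zeta*w - 10)/100;  k2 = w^2/100;   % formulas derived above
K  = [k1 k2];
% equivalently: K = place(A, B, roots([1 2*zeta*w w^2]))
kr = k2;                                   % unit static gain, y = r in stationarity
cl = ss(A - B*K, B*kr, C, 0);
step(cl)                                   % closed-loop setpoint step response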

8.2 Controllability

In the previous example we could place the poles of the closed-loop system arbitrarily
by means of the state feedback. An interesting question is whether this is always
possible. We begin the investigation by an example.

Example 8.2—Controllability
Assume that the process we are about to control is described by the following
equations
ẋ = Ax + Bu = [−1 0; 0 −2] x + [1; 0] u

y = Cx + Du


Since the A matrix is diagonal, one can directly observe its eigenvalues and thereby the poles of the process, which lie in −1 and −2. State feedback yields the matrix

A − BK = [−1 − k1   −k2; 0   −2]

This, in turn, gives us the characteristic polynomial of the closed-loop system

det(sI − (A − BK)) = (s + 1 + k1)(s + 2)

Here we see that we cannot place the poles arbitrarily. By means of k1 we can move the pole which initially lies in −1. However, there is no way for us to move the pole in −2. We also see that the parameter k2 does not appear in the characteristic polynomial. Hence we do not benefit from measuring x2 and involving the measurement in the feedback.
The cause of our problem is clearly visible in the state-space description. Its second equation is given by

ẋ2 = −2x2

and is unaffected by the control signal u. Consequently, this state is not controllable.

This reasoning leads us to the concept of controllability, which is defined in the


following way:
A state vector x0 is said to be controllable if there exists a control sig-
nal which brings x from the origin to x0 in a finite time. A system is
controllable if all its states are controllable.
If a system is controllable it is possible to place its poles arbitrarily using state
feedback. As seen from the definition, controllability is unrelated to the output y.
The definition concerns the state vector and the control signal. In the state-space
description (8.1) we thus see that the controllability is determined by the A and B
matrices.
Whether a system is controllable can be determined by studying the controlla-
bility matrix, which is defined as

Ws = [B AB A²B · · · An−1B]

where n is the degree of the system. One can show that a system is controllable if and only if the controllability matrix Ws consists of n linearly independent columns.
In the non-controllable case the columns of Ws tell which, if any, of the states are
controllable. This is illustrated in the examples below.
We investigate the controllability of the two systems studied in examples previ-
ously in this lecture.

Example 8.3—Controllability
In Example 8.1 an electric motor described by the equations
   
 −10 0   100 
ẋ = 
 x+
  u

1 0 0
 
y = 0 1 x

was studied. The controllability matrix of this process is given by

Ws = [B AB] = [100 −1000; 0 100]


The columns of this matrix are linearly independent. One way to determine whether the columns of a square matrix are linearly independent is to investigate whether the determinant is non-zero. Since the columns of Ws are linearly independent, the process is controllable. We observed this already in Example 8.1, since the poles could be arbitrarily placed.
Let us now investigate the controllability of the system in Example 8.2. There the A and B matrices were given by

A = [−1 0; 0 −2],   B = [1; 0]

This yields the controllability matrix

Ws = [B AB] = [1 −1; 0 0]

The columns of this matrix are linearly dependent and det Ws = 0. The system in
Example 8.2 is thus not controllable. The columns of Ws also show that x1 is a
controllable state, while x2 is not controllable.

A Controllability Example
Controllability is an abstract concept. Processes are often constructed so that con-
trollability is achieved for the states which one wishes to control. Despite this, the
concept of controllability is still important and it is good to develop an intuitive
understanding of it. We will return to this later in the course.
In order to increase the understanding of the controllability concept, we will now
investigate a few physical processes regarding their controllability. The processes to
be investigated are shown in Figure 8.5 and are made up by interconnected water
tanks and the corresponding flows.
By writing down the mass balance equations for the individual tanks, we obtain
a dynamical model. If the level in a tank is denoted x it holds that

ẋ = qin − qout

where qin is the inflow and qout the outflow of the tank. All tanks are equipped with a hole in their bottom, which makes the outflow approximately proportional to the level in the corresponding tank. The inflow of a tank is made up by either the outflow from a higher tank or an auxiliary flow u, which is our control signal.

Process A First consider process A in Figure 8.5. In this case the controlled flow
enters the upper tank. The following mass balances are obtained

ẋ1 = u − ax1
ẋ2 = ax1 − ax2

where x1 and x2 denote the levels in the upper and lower tank, respectively. The
first equation tells us that the inflow of the upper tank is given by u whereas the
outflow is proportional to the level in the tank. From the second equation we obtain
that the inflow of the lower tank equals the outflow of the upper tank and that the
outflow of the lower tank is proportional to the level in the tank. The dynamics of
process A are thus described by

ẋ = Ax + Bu = [−a 0; a −a] x + [1; 0] u

As a consequence, the controllability matrix is given by

Ws = [B AB] = [1 −a; 0 a]


Figure 8.5 Several connections of water tanks leading to different cases regarding controllability.

The columns of this matrix are linearly independent, and the determinant is det Ws = a ≠ 0. Hence, the system is controllable. The columns of the controllability matrix are shown graphically in Figure 8.6, where it is obvious that they are linearly independent. This means that we can control both levels to take on arbitrary values, in accordance with the definition of controllability. Observe that the definition does not require that we should be able to maintain arbitrary constant levels. From studying the equations one realizes that this is only possible for levels which satisfy

u = ax1 = ax2

In stationarity we thus have the same level in both tanks. Finally we must consider
the validity of the equations. In reality it holds that the levels cannot take on
negative values, neither can the control signal be negative given that there is no
pump which could suck water from the tank. As a consequence we have x1 ≥ 0,
x2 ≥ 0 and u ≥ 0.

Process B Now consider process B, shown in Figure 8.5. In this case the controlled flow enters the lower tank. This implies that we can no longer control the level in the upper tank. Let us now confirm this intuitively drawn conclusion by applying the analysis method introduced above.
The following balance equations constitute a model of process B

ẋ1 = −ax1
ẋ2 = u + ax1 − ax2

Figure 8.6 The columns of the controllability matrix in the three examples.


The process dynamics can therefore be written

ẋ = Ax + Bu = [−a 0; a −a] x + [0; 1] u

The controllability matrix is thus

Ws = [B AB] = [0 0; 1 −a]

The columns of the matrix are linearly dependent, the determinant is given by det Ws = 0 and the system is hence not controllable. The columns of Ws further show that the level in the lower tank is controllable, as opposed to the upper tank level. This is also obvious from Figure 8.6.

Process C In process C, shown in Figure 8.5, the flow enters both tanks, such that half of the flow enters tank 1 whereas the other half enters tank 2.
The following balance equations hold for process C

ẋ1 = 0.5u − ax1
ẋ2 = 0.5u − ax2

The dynamics of process C can therefore be written

ẋ = Ax + Bu = [−a 0; 0 −a] x + [0.5; 0.5] u

The controllability matrix becomes

Ws = [B AB] = [0.5 −0.5a; 0.5 −0.5a]

The columns of the matrix are linearly dependent, the determinant is det Ws = 0 and the system is hence not controllable. The construction is such that if the levels are initially equal, they will remain equal. This can also be observed from the columns of Ws, and the illustration in Figure 8.6.
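A sketch of the three controllability tests in MATLAB, for an assumed value a = 1:

a  = 1;                                  % assumed tank constant
Aa = [-a 0; a -a];  Ba = [1; 0];         % process A
Ab = [-a 0; a -a];  Bb = [0; 1];         % process B
Ac = [-a 0; 0 -a];  Bc = [0.5; 0.5];     % process C
ranks = [rank(ctrb(Aa,Ba)) rank(ctrb(Ab,Bb)) rank(ctrb(Ac,Bc))]  % gives 2 1 1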

The roots of the equation 1.57 are the open-loop poles or eigenvalues. For the closed-loop system described above, the characteristic equation is

|sI − A + BK| = 0 (1.58)

The roots of equation 1.58 are the closed-loop poles or eigenvalues.

3.2.1 Regulator design by pole placement

The pole placement control problem is to determine a value of K that will produce a desired set of closed-loop poles. With a regulator, r(t) = 0 and therefore equation 1.54 becomes

u = −Kx (1.59)

Thus the control u(t) will drive the system from a set of initial conditions x(0) to a set of zero states at time t1, i.e. x(t1) = 0.

There are several methods that can be used for pole placement.
(a) Direct comparison method: If the desired locations of the closed-loop poles (eigenvalues) are

s = µ1, s = µ2, . . . , s = µn (1.60)

then from equation 1.58

(b) Using Arckerman’s method:


Example: 15
Given the system described by the state and output equation below, Evaluate the coefficints
of the state feedback gain matrix.

Solution:

Using direct comparison method


Using Arckerman’s method
16
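Since only the outline of this example is given here, the sketch below uses an assumed illustrative system and assumed desired poles rather than the original data (Control System Toolbox assumed):

A  = [0 1; -2 -3];  B = [0; 1];    % assumed illustrative system
mu = [-4+4j, -4-4j];               % assumed desired closed-loop poles
K_place = place(A, B, mu)          % numerically robust pole placement
K_acker = acker(A, B, mu)          % Ackermann's formula (SISO only)
% Direct comparison: det(sI - (A - B*K)) must equal (s - mu(1))(s - mu(2)).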
9.2 Observability

So far we have assumed that all process states have been directly measurable.
Normally this is not the case. However, if one can estimate the state vector by
studying the control and measurement signals, one could close the loop over the
estimates rather than the real states.
In Example 8.1 we utilized state feedback from two states in order to control an
electric motor. The measurement signal was the angle of the motor shaft and the
other state was given by the angular speed of the shaft. If the angular speed cannot be directly measured, it can be estimated, e.g. by differentiating the angle measurement.
Before describing how to estimate the state vector, we will ask ourselves the
principal question whether it is generally possible to estimate the state vector merely
by studying u and y. The answer is that it is possible if the system is observable.
Observability is defined in the following way:


A state vector x0 ≠ 0 is not observable if the output is y(t) = 0 when the initial state vector is x(0) = x0 and the input is given by u(t) = 0. A system is observable if it lacks non-observable states.
As seen from the definition, observability has nothing to do with the control
signal u. Rather, it concerns the state vector and the measurement signal. In the
state-space description (9.1) we see that this implies that observability is determined
solely by A and C.
Whether a system is observable can be determined by studying the observability matrix, which is defined as

Wo = [C; CA; CA²; . . . ; CAn−1]

Here n is the degree of the system. One can show that the system is observable if and only if the observability matrix Wo has n linearly independent rows. If x0 is a non-observable state, it fulfills the equation

Wo x0 = 0

As seen, observability and the observability matrix bear a strong resemblance to controllability and the controllability matrix, which we studied in the previous lecture. We will now study an observability example.

Example 9.1—Observability
A process is described by the following equations:
ẋ = [−1 0; 0 −1] x

y = [1 −1] x

We can e.g. imagine that the equations describe the process in Figure 9.2, where the
states are given by the tank levels. The measurement signal is the difference between
the tank levels. The system lacks input and we study only the state evolution after
various initial states.
We shall now investigate the observability of the process. The observability matrix is

Wo = [C; CA] = [1 −1; −1 1]
The rows of this matrix are not linearly independent, the determinant is det Wo = 0
and the system is not observable. From the equation

Wo x0 = 0

Figure 9.2 A physical interpretation of the process dynamics in Example 9.1.


Figure 9.3 Different initial states in Example 9.1.

Figure 9.4 The measurement y corresponding to the initial states in Figure 9.3.

we see that the non-observable states can be written

x0 = [a; a]

If we return to the definition of observability we can justify that this is correct.


If the initial levels in the two tanks are equal they will empty in exactly the same
way and the measurement remains y = 0.
Figure 9.3 shows four initial states for the levels in the two tanks. The corresponding outputs y are shown in Figure 9.4.
The initial state a is a non-observable state. The measurement is consequently
y = 0. The initial state b yields a response in the measurement. However, this is
the same response obtained for the initial state c. All initial states which have the
same difference between the levels x1 and x2 will yield the same response in the
measurement signal y and it is hence not possible to tell them apart. The initial
state d yields a response which differs from the others since the difference between
x1 and x2 is not the same as for the other three cases.


Fig. 4.30 Figure 1.3: A simple full-order state observer

In Figure 8.8, since the observer dynamics will never exactly equal the system dynamics, this open-loop arrangement means that x and x̂ will gradually diverge. If, however, an output vector ŷ is estimated and subtracted from the actual output vector y, the difference can be used, in a closed-loop sense, to modify the dynamics of the observer so that the output error (y − ŷ) is minimized. This arrangement, sometimes called a Luenberger observer (1964), is shown in Figure 8.9.

State observer
The state observer estimates the state vector in the following way:

x̂˙ = Ax̂ + Bu + Ke (y − ŷ)


ŷ = Cx̂ (9.3)

Compared to the simple simulation method, we have now introduced a correction term. Here
the difference between the real measurement signal y and the estimated measurement signal ŷ
affects the estimation.
91

Fig. 4.31 Figure 1.4: The Luenberger full-order state obsever

By merging the two equations in Equation (9.3) the state observer can be written in the form

x̂˙ = (A − Ke C)x̂ + Bu + Ke y (9.4)

From this form it is obvious how the state observer is driven by the two signals u and y.
The estimation error x̃ = x − x̂ decreases according to

x̃˙ = ẋ − x̂˙ = Ax + Bu − Ax̂ − Bu − Ke C(x − x̂) = (A − Ke C)x̃
The properties of the state observer are no longer determined merely by the A matrix, but rather
by the A − KeC matrix, in which the vector Ke is a free parameter. We can consequently choose a
desired decrease rate of the estimation error x̃. The choice is a compromise between speed and
sensitivity towards disturbances and modelling errors. A method used to determine Ke is illustrated
by the following example.

EXAMPLE 9.2-STATE ESTIMATION OF PENDULUM DYNAMICS

In this example we shall balance an inverted pendulum. We leave the control part to the next
lecture and put our focus on process state estimation. The pendulum to
be controlled is shown in Figure 9.5.
We assume that we can measure the angle ϕ which consequently becomes our measurement
signal, i.e. y = ϕ. The control objective is to balance the pendulum so that y = ϕ = 0. The setpoint
will thus be r = 0. The control signal u is proportional to the force affecting the cart. For simplicity
we assume that u = z̈.
There are two forces affecting the pendulum. The gravitational force strives to increase the
angle ϕ and thus causing the pendulum to fall down. The control signal can give the pendulum an
acceleration in the opposite direction and thus re-erect the pendulum.
A momentum equation with suitable choice of states, linearization and parameters yields the
following process model of the pendulum:

ϕ̈ = ϕ − u

A second-order system requires at least two states for its state-space representation. We choose
the angle ϕ and its derivative ϕ̇ to represent these states.

x1 = ϕ
x2 = ϕ̇
This yields the following state-space description of the process:
! !
0 1 0
ẋ = x+ u = Ax + Bu
1 0 −1
 
y= 1 0 x = Cx

A state observer estimating the state vector is given by Equation (9.4) and our task is to determine the vector Ke so that the matrix A − Ke C obtains the desired eigenvalues. We have

A − Ke C = [0 1; 1 0] − [ke1; ke2][1 0] = [−ke1   1; 1 − ke2   0]

The eigenvalues are given by the characteristic polynomial

det(sI − (A − Ke C)) = | s + ke1   −1; −1 + ke2   s | = s² + ke1 s − 1 + ke2
We can now place the two poles arbitrarily by choosing ke1 and ke2 adequately. E.g. assume that we desire to place the poles in s = −4 ± 4i. This yields the desired characteristic polynomial

(s + 4 − 4i)(s + 4 + 4i) = s² + 8s + 32

Fig. 4.32 Figure 9.5 The pendulum in Example 9.2.

By comparing the two polynomials, we arrive at the following state observer parameters:

ke1 = 8 ke2 = 33

Figure 9.6 shows the result of a simulation experiment with the state observer. The initial state is given by ϕ(0) = −0.6 and ϕ̇(0) = 0.4. Since the process is unstable, it has to be controlled. This control is treated in the next lecture.
The solid curves in Figure 9.6 show the actual states. The angle is initially ϕ(0) = −0.6. After an initial overshoot to ϕ ≈ 0.4 it stabilizes at ϕ = 0 after approximately 3 s.
The dashed curves show the estimates and are thus the ones of highest interest for us. They both start out in zero. After approximately 1.5 s they have converged well to the actual states.
The example shows that the state observer can be used to estimate the state vector. One thus
realizes that it should be possible to use the estimated state for the

Fig. 4.33 Figure 9.6 Actual and estimated angle and angular speed of the pendulum in Example
9.2.
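A simulation sketch of this observer (forward-Euler integration; u = 0 is assumed here, so the uncontrolled pendulum itself diverges, but the estimation error still decays with the observer poles):

A = [0 1; 1 0]; B = [0; -1]; C = [1 0];
Ke = [8; 33];
h = 0.001; T = 0:h:3;
x = [-0.6; 0.4]; xh = [0; 0];            % true and estimated initial states
X = zeros(2, numel(T)); Xh = X;
for k = 1:numel(T)
    X(:,k) = x; Xh(:,k) = xh;
    y  = C*x;
    x  = x  + h*(A*x);                   % process step (u = 0 assumed)
    xh = xh + h*(A*xh + Ke*(y - C*xh));  % observer step, Eq. (9.3)
end
plot(T, X(1,:), T, Xh(1,:), '--')        % actual vs estimated angle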

feedback, rather than the actual ones. It is, however, important that the Kalman filter is sufficiently fast, since the transient behavior shown in Figure 9.6 will be repeated every time the process is disturbed, which would happen e.g. if someone pokes at the pendulum. The relationship between the Kalman filter and the state feedback will be further investigated in the next lecture.
Finally it is worthwhile noting that the Kalman filter is not merely interesting in the context of state feedback control. The method of estimating variables which are not directly measurable, by exploiting the process dynamics and the availability of certain measurement signals, is used in a large number of other applications. Examples include technical systems as well as biological, medical and economic ones.

10.1 Output Feedback


We shall now merge the state feedback and the Kalman filter and close the loop from the estimated
states, rather than the actual ones. We call this output feedback in order to emphasize that we do
no longer close the loop from the states of the process. Rather, we use the setpoint r, measurement
y and control signal u. The controller structure is shown in Figure 10.1, where the controller is confined within the dashed rectangle.
We choose to describe the process in state space and assume that it lacks a direct term.

Fig. 4.34 Figure 10.1 Output feedback. The controller is confined within the dashed rectangle.

ẋ = Ax + Bu
y = Cx
The controller now becomes a combination of the Kalman filter and the state feedback

x̂˙ = Ax̂ + Bu + Ke (y − ŷ)


ŷ = Cx̂
u = kr r − K x̂
95

We shall investigate the closed-loop system and begin by doing so in state space. A natural
way to choose the state vector would have been to append the Kalman filter estimates x̂ to the
process states x. For reasons, which will become evident later in the lecture, we will rather append
the process states x with the estimation error x̃ = x − x̂ in order to form the state vector.
xe = [x; x̃]
The state-space equations can now be written

ẋ = Ax + Bu = Ax + Bkr r − BK x̂ = Ax + Bkr r − BK(x − x̃)
  = (A − BK)x + BK x̃ + Bkr r

x̃˙ = ẋ − x̂˙ = Ax + Bu − Ax̂ − Bu − Ke C(x − x̂) = (A − Ke C)x̃
In block matrix form this becomes

[ẋ; x̃˙] = [A − BK   BK; 0   A − Ke C][x; x̃] + [Bkr; 0] r = Ae [x; x̃] + Be r

y = [C 0][x; x̃] = Ce [x; x̃] (10.1)

Thanks to the introduction of x and x̃ as system states, the Ae , Be and Ce matrices now contain
a number of zeros. We will benefit from this substantially now that we will investigate the
closed-loop system.
Due to the block triangularity of Ae , its characteristic polynomial is given by

det (sI − Ae ) = det(sI − (A − BK)) · det(sI − (A − KeC))

This is an appealing result. It shows that the characteristic polynomial is a product of the
characteristic polynomial from the nominal state feedback and that of the Kalman filter. Conse-
quently it is possible to separate the control problem into two parts, as we have already attempted
in the previous lectures. The state feedback can first be determined and its poles placed as if all
states were actually measurable. When later a Kalman filter is introduced in order to obtain state
estimates and realize output feedback, this will not affect the pole placement of the feedback.
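This separation can be checked numerically; as a sketch, for matrices A, B, C, K and Ke already designed:

Ae = [A - B*K, B*K; zeros(size(A)), A - Ke*C];
sort(eig(Ae))    % the union of eig(A - B*K) and eig(A - Ke*C)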
Let us compute the transfer function from r to y. It is given by

Ge(s) = Ce(sI − Ae)−1 Be = [C 0][E; F]

where

[E; F] = [sI − (A − BK)   −BK; 0   sI − (A − Ke C)]−1 [Bkr; 0]

By multiplying both sides with (sI − Ae) we get

[Bkr; 0] = [sI − (A − BK)   −BK; 0   sI − (A − Ke C)][E; F]
         = [(sI − (A − BK))E − BKF; (sI − (A − Ke C))F]

The latter part of the equation yields F = 0. Consequently, we arrive at

E = (sI − (A − BK))−1 Bkr

which gives us the transfer function

Ge (s) = C(sI − (A − BK))−1 Bkr

This is a remarkable result. The transfer function is identical to the one which we obtained
when we closed the loop from the real states. The dynamics of the Kalman filter are hidden in
the transfer function. The characteristic polynomial of the Ae matrix is of order 2n, whereas the
transfer function is merely of order n.
In the pendulum example from the previous lecture we saw how the estimated state converged
to the actual state after a few seconds long transient. Following this transient there is inherently no
difference between feedback from the estimated and actual states, as long as no disturbances forces
the state estimate away from the actual state. This explains why the dynamics of the Kalman filter
are not visible in the transfer function.
A transfer function of order n can always be described in state space using n states, which
are all both controllable and observable. One could also introduce more than n states, but the
additional states cannot be both observable and controllable. It is e.g. possible to introduce a state
which has no significant relation to the process. This state will obviously be neither controllable
nor observable.
Correspondingly, a transfer function of degree n obtained from a state-space model of degree > n shows us that there are states which lack controllability or observability. In our case the Kalman filter states are not controllable. This can be seen by constructing the controllability matrix of the system in Equation (10.1).

Ws = [Be  Ae Be  Ae² Be  · · ·] = [Bkr  (A − BK)Bkr  · · ·; 0  0  · · ·]

Since the last n elements in the columns are zero, there are at most n linearly independent columns and the non-controllable states correspond to those of the Kalman filter. The fact that the states of the Kalman filter are not controllable can be seen directly from the state-space equation (10.1).

x̃˙ = (A − KeC)x̃

We can apparently not affect the estimates by means of the closed-loop system input, i.e. by the setpoint r.

Summary
We have now introduced a method to determine a controller: state feedback combined with Kalman filtering. The analysis has shown that the design process can be separated into two parts.
The first part was to estimate the state vector by means of the Kalman filter. The properties of the Kalman filter are determined by the gain vector Ke. The choice of this vector, i.e. the pole placement of the Kalman filter, is as always a balance between performance and robustness against model errors, process variations and disturbances.
Let the system in Figure 1.4 be defined by

ẋ = Ax + Bu (8.125)
y = Cx (8.126)

Assume that the estimate x̂ of the state vector is

x̂˙ = Ax̂ + Bu + Ke (y − Cx̂) (8.127)

where Ke is the observer gain matrix.


If equation (8.127) is subtracted from (8.125), and ( x − x̂) is the error vector e, then

ė = (A − Ke C) e (8.128)

and, from equation (8.127), the equation for the full-order state observer is

x̂˙ = (A − Ke C) x̂ + Bu + Ke y (8.129)

Thus from equation (8.128) the dynamic behaviour of the error vector depends upon the eigenvalues of (A − Ke C). As with any measurement system, these eigenvalues should allow the observer transient response to be more rapid than that of the system itself (typically by a factor of 5), unless a filtering effect is required.
The problem of observer design is essentially the same as the regulator pole placement problem, and similar techniques may be used.
(a) Direct comparison method: If the desired locations of the closed-loop poles (eigenvalues)
of the observer are

s = µ1 , s = µ2 , . . . , s = µn

then

|sI − A + Ke C| = (s − µ1 ) (s − µ2 ) . . . (s − µn )
= sn + αn−1 sn−1 + · · · + α1 s + α0 (8.130)

Ackermann’s formula: As with regulator design, this is only applicable to systems where u(t)
and y(t) are scalar quantities. It may be used to calculate the observer gain matrix as follows
Ke = φ(A) N⁻¹ [0 0 … 0 1]ᵀ

or alternatively
 −1  
C 0
   
 CA   0 
Ke = φ (A) 
 .. 


 .. 
 (8.134)
 .   . 
CAn−1 1
where φ (A) is defined in equation (8.104).

Example:
A system is described by
" # " #" # " #
ẋ1 0 1 x1 0
= + u
ẋ2 −2 −3 x2 1
" #
h i x
1
y= 1 0
x2
Design a full-order observer that has an undamped natural frequency of 10 rad/s and a damping ratio of 0.5.

Solution:
The observability matrix is

N = [Cᵀ : AᵀCᵀ] = [1 0; 0 1]    (8.135)

N is of rank 2 and therefore non-singular; hence the system is completely observable and the calculation of an appropriate observer gain matrix Ke is realizable.
Open-loop eigenvalues:

|sI − A| = s2 + 3s + 2 = s2 + a1 s + a0 (8.136)

Hence

a0 = 2, a1 = 3

And the open-loop eigenvalues are

s2 + 3s + 2 = 0
(s + 1)(s + 2) = 0
s = −1, s = −2 (8.137)

Desired closed-loop eigenvalues:

s2 + 2ζ ωn s + ωn2 = 0
s2 + 10s + 100 = s2 + α1 s + α0 = 0 (8.138)

Hence

α0 = 100, α1 = 10

and the desired closed-loop eigenvalues are the roots of equation (8.138)

µ1 = −5 + j8.66, µ2 = −5 − j8.66 (8.139)

Using Direct comparison method:

|sI − A + KeC| = s² + α1s + α0    (8.140)

|[s 0; 0 s] − [0 1; −2 −3] + [ke1; ke2][1 0]| = s² + 10s + 100

|[s + ke1, −1; 2 + ke2, s + 3]| = s² + 10s + 100

s² + (3 + ke1)s + (3ke1 + 2 + ke2) = s² + 10s + 100

From equation (8.140)



(3 + ke1) = 10,  ke1 = 7    (8.141)

(3ke1 + 2 + ke2) = 100
ke2 = 100 − 2 − 21 = 77    (8.142)

Using Ackermann's formula:


" #−1 " #
C 0
Ke = φ (A)
CA 1
Using the definition of φ (A) in equation (8.104)

" #−1 " #


 1 0 0
Kc = A2 + α1 A + α0 I (8.149)
0 1 1
"" # " # " ## " #" #
−2 −3 0 10 100 0 1 0 0
Ke = + +
6 7 −20 −30 0 100 0 1 1
" #" #" #
98 7 1 0 0
=
14 77 0 1 1
" #" # " #
98 7 0 7
= = (8.150)
14 77 1 77
Chapter 5

Optimal Control (DP, LQR and Kalman Filters)

5.1 Dynamic Programming


• From IEEE History Center: Richard Bellman:

• "His invention of dynamic programming in 1953 was a major breakthrough in the theory of
multistage decision processes..."

• "A breakthrough which set the stage for the application of functional equation techniques in
a wide spectrum of fields..."

• “...extending far beyond the problem-areas which provided the initial motivation for his ideas."

Fig. 5.1 Aircraft Fuelling Problem



- Example: travel from A to B with least cost (robot navigation or aircraft path)

- 20 possible options, trying all would be so tedious

- Strategy: start from B and go backwards, invoking the Principle of Optimality (PoO)

2. Discrete-Time Systems
The plant is described by the general nonlinear discrete-time dynamical equation

xk+1 = f (k, xk , uk ) , k = i, . . . , N − 1

with initial condition xi given. The vector xk has n components and the vector uk has m components.
Suppose we associate with this plant the performance index

Ji(xi) = φ(N, xN) + Σ_{k=i}^{N−1} L(k, xk, uk)

where [i, N] is the time interval of interest. We want to use Bellman’s principle of optimality to
find the sequence uk that minimizes the performance index.
Suppose we have computed the optimal cost


J*_{k+1}(xk+1)

for time k + 1 to the terminal time N for all possible states xk+1, and that we have also found the optimal control sequences from time k + 1 to N for all possible states xk+1. The optimal cost results when the optimal control sequence u*_{k+1}, u*_{k+2}, …, u*_{N−1} is applied to the plant with a state of xk+1. Note that the optimal control sequence depends on xk+1. If we apply any arbitrary control uk at time k and then use the known optimal control sequence from k + 1 on, the resulting cost will be

Jk(xk) = L(k, xk, uk) + J*_{k+1}(xk+1)

where xk is the state at time k, and xk+1 is given by the state equation. According to Bellman, the
optimal cost from time k is equal to

J*_k(xk) = min_{uk} [L(k, xk, uk) + J*_{k+1}(xk+1)]

and the optimal control u∗k at time k is the uk that achieves this minimum.
3. Discrete LQR via Dynamic Programming
The plant is described by the linear discrete-time dynamical equation

xk+1 = Ak xk + Bk uk ,

with initial condition xi given and final state xN free. We want to find the sequence uk on the
interval [i, N] that minimizes the performance index:

Ji = ½ xNᵀSNxN + ½ Σ_{k=i}^{N−1} (xkᵀQkxk + ukᵀRkuk),   SN ≥ 0, Q ≥ 0, R > 0

Let k = N and write


JN = ½ xNᵀSNxN = J*_N
Now let k = N − 1 and write

JN−1 = ½ xN−1ᵀ QN−1 xN−1 + ½ uN−1ᵀ RN−1 uN−1 + ½ xNᵀSNxN

According to Bellman’s principle of optimality,



J*_k(xk) = min_{uk} [L(k, xk, uk) + J*_{k+1}(xk+1)]

we find uN−1 by minimizing JN−1 , which can be rewritten as

JN−1 = ½ xN−1ᵀ QN−1 xN−1 + ½ uN−1ᵀ RN−1 uN−1 + ½ (AN−1xN−1 + BN−1uN−1)ᵀ SN (AN−1xN−1 + BN−1uN−1)

Since there is no input constraint, the minimum is found by setting

0 = ∂JN−1/∂uN−1 = RN−1uN−1 + BN−1ᵀ SN (AN−1xN−1 + BN−1uN−1)

which gives

uN−1 = −(BN−1ᵀ SN BN−1 + RN−1)⁻¹ BN−1ᵀ SN AN−1 xN−1

Defining

KN−1 = (BN−1ᵀ SN BN−1 + RN−1)⁻¹ BN−1ᵀ SN AN−1

we can rewrite
u∗N−1 = −KN−1 xN−1

The optimal cost to go from k = N − 1 is found by substituting the optimal control into the expression for JN−1:

J*_{N−1} = ½ xN−1ᵀ [(AN−1 − BN−1KN−1)ᵀ SN (AN−1 − BN−1KN−1) + KN−1ᵀ RN−1 KN−1 + QN−1] xN−1

If we define

SN−1 = (AN−1 − BN−1KN−1)ᵀ SN (AN−1 − BN−1KN−1) + KN−1ᵀ RN−1 KN−1 + QN−1

this can be written as

J*_{N−1} = ½ xN−1ᵀ SN−1 xN−1

For k = N

JN = ½ xNᵀSNxN

For k = N − 1

JN−1 = ½ xN−1ᵀ QN−1 xN−1 + ½ uN−1ᵀ RN−1 uN−1 + ½ xNᵀSNxN

For k = N − 2

JN−2 = ½ xN−2ᵀ QN−2 xN−2 + ½ uN−2ᵀ RN−2 uN−2 + ½ xN−1ᵀ SN−1 xN−1

The structure of the problem is the same: to obtain u*_{N−2} we just need to replace N − 1 by N − 2. If we continue to decrement k and apply the optimality principle, the result for each k = N − 1, …, 1, 0 is

uk = −Kkxk,   Kk = (BkᵀSk+1Bk + Rk)⁻¹ BkᵀSk+1Ak
Sk = (Ak − BkKk)ᵀ Sk+1 (Ak − BkKk) + KkᵀRkKk + Qk
J*_k = ½ xkᵀSkxk

These are the Kalman gain sequence, the Joseph-stabilized Riccati difference equation (RDE), and the optimal cost, respectively.


5.1.1 Algorithm:
Plant

xk+1 = Axk + Buk

with the performance index

Ji = ½ xNᵀSNxN + ½ Σ_{k=i}^{N−1} (xkᵀQxk + ukᵀRuk)
with SN ≥ 0, Q ≥ 0, R > 0.
Use principle of optimality (start at the end and go backwards)

J*_N = ½ xNᵀSNxN

JN−1 = ½ xN−1ᵀQxN−1 + ½ uN−1ᵀRuN−1 + ½ (AxN−1 + BuN−1)ᵀ SN (AxN−1 + BuN−1)
No constraint on u, hence find minimum of JN−1 by

0 = ∂JN−1/∂uN−1 = RuN−1 + BᵀSN(AxN−1 + BuN−1)

u*_{N−1} = −(BᵀSNB + R)⁻¹ BᵀSNA xN−1 = −KN−1 xN−1

J*_{N−1} = ½ xN−1ᵀ [(A − BKN−1)ᵀSN(A − BKN−1) + KN−1ᵀRKN−1 + Q] xN−1 = ½ xN−1ᵀ SN−1 xN−1

Decrement to k = N − 2 . . .
... the rest of the story is known to you...
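The backward sweep is easy to mechanise. The sketch below (function and variable names are mine) computes the gain sequence Kk and cost matrices Sk for given A, B, Q, R and SN:

import numpy as np

def lqr_backward(A, B, Q, R, S_N, N):
    """Finite-horizon discrete LQR via the backward Riccati difference equation."""
    S = S_N
    gains = [None] * N
    for k in range(N - 1, -1, -1):
        K = np.linalg.solve(B.T @ S @ B + R, B.T @ S @ A)       # gain sequence
        S = (A - B @ K).T @ S @ (A - B @ K) + K.T @ R @ K + Q   # Joseph-form RDE
        gains[k] = K
    return gains, S        # S is S_0; the optimal cost from x_0 is 0.5 x_0^T S_0 x_0

A forward pass then simulates x_{k+1} = A x_k + B u_k with u_k = -gains[k] @ x_k.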

Example:
Given a discrete-time dynamic system

x(k + 1) = 4x(k) − 6u(k),   x(0) = 8

and a performance index (with the recursion Jk = F(x(k), u(k)) + J_{k+1}(x(k + 1)))

J0 = (x(2) − 20)² + ½ Σ_{k=0}^{N−1=1} (2x²(k) + 4u²(k))

Using the principle of optimality, find the control sequence u(0) and u(1). Assume no constraints on u(k).
Solution:
Backward pass

J*_2(x(2)) = (x(2) − 20)²    (1)  (terminal cost)

Decrement k to 1:
J*_1(x(1)) = min_{u(1)} {F(1, x(1), u(1)) + J*_2(x(2))}    (2)
  = min_{u(1)} { ½(2x²(1) + 4u²(1)) + (x(2) − 20)² }
  = min_{u(1)} { ½(2x²(1) + 4u²(1)) + (4x(1) − 6u(1) − 20)² }

Find the optimal u*(1):



∂/∂u(1) [x²(1) + 2u²(1) + (4x(1) − 6u(1) − 20)²] = 0
4u(1) + 2(4x(1) − 6u(1) − 20)(−6) = 0
u*(1) = (12x(1) − 60)/19    (3)

Substituting u*(1) into J*_1(x(1)):

J*_1(x(1)) = ½[2x²(1) + 4u*²(1)] + J*_2(x(2))
  = ½{2x²(1) + 4((12x(1) − 60)/19)²} + (4x(1) − 6u*(1) − 20)²
  = x²(1) + 2((12x(1) − 60)/19)² + ((4x(1) − 20)/19)²

but x(1) depends on x(0) and u(0); x(0) is known, u(0) is unknown.

Decrement k to 0, i.e. J1 → J0:

J*_0(x(0)) = min_{u(0)} { ½(2x²(0) + 4u²(0)) + J*_1(x(1)) }
  = min_{u(0)} { x²(0) + 2u²(0) + (4x(0) − 6u(0))² + 2((12(4x(0) − 6u(0)) − 60)/19)² + ((4(4x(0) − 6u(0)) − 20)/19)² }

Finding optimal u(0).


∂/∂u(0) { x²(0) + 2u²(0) + (4x(0) − 6u(0))² + 2((12(4x(0) − 6u(0)) − 60)/19)² + ((4(4x(0) − 6u(0)) − 20)/19)² } = 0

With x(0) = 8, this gives u(0) = 4.81.

Forward pass

x(1) = 4x(0) − 6u(0) = 4 × 8 − 6 × 4.81 = 3.14



From Eq. (3)

u(1) = (12x(1) − 60)/19 = (12 × 3.14 − 60)/19 = −1.175

x(2) = 4x(1) − 6u(1) = 4 × 3.14 − 6 × (−1.175) = 19.61
Which completes the forward pass.
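Because the dynamic-programming solution coincides with jointly minimising J0 over (u(0), u(1)), the hand computation can be checked numerically. The following sketch (mine) uses scipy's general-purpose minimiser; the small differences from the figures above come from rounding x(1) to 3.14 in the hand calculation:

import numpy as np
from scipy.optimize import minimize

def J0(u):
    u0, u1 = u
    x0 = 8.0
    x1 = 4 * x0 - 6 * u0            # state equation
    x2 = 4 * x1 - 6 * u1
    return (x2 - 20) ** 2 + 0.5 * (2 * x0**2 + 4 * u0**2
                                   + 2 * x1**2 + 4 * u1**2)

res = minimize(J0, [0.0, 0.0])
print(res.x)                        # approx. [4.807, -1.165] for (u(0), u(1))
u0, u1 = res.x
x1 = 4 * 8.0 - 6 * u0
print(x1, 4 * x1 - 6 * u1)          # approx. 3.16 and 19.61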

5.2 Optimal Control: LQR (Linear Quadratic Regulator)

5.2.1 LQR: Introduction


The intuition behind the eigenvalue (pole) placement method in state feedback control may sometimes be lost, particularly for systems with many variables or many poles. LQR proves to be a more intuitive method, implicitly placing the poles in a way that corresponds to the desired system requirements and performance.
The so-called linear-quadratic regulator, or LQR, formulation of the controller problem for linear systems uses an integral-square (i.e. quadratic) cost criterion to pose a compromise between the desire to bring the state to zero and the desire to limit control effort. The optimal control turns out to be precisely a state feedback, which enables computation of the optimal feedback gain matrix K.

5.2.2 Optimal Control: LQR


An optimal control system seeks to maximize the return from a system for the minimum cost. In
general terms, the optimal control problem is to find a control u(t) which causes the system

ẋ = g(x(t), u(t),t) (9.1)

to follow an optimal trajectory x(t) that minimizes the performance criterion, or cost function
J = ∫_{t0}^{t1} h(x(t), u(t), t) dt    (9.2)

The problem is one of constrained functional minimization, and has several approaches namely:

1. Variational calculus - Euler Lagrange equations

2. The maximum principle of Pontryagin- Hamiltonian function

3. Dynamic programming method of Bellman - principle of optimality (Hamilton-Jacobi-Bellman partial differential equation)

4. The Hamilton-Jacobi equation solved for special case of the linear time- invariant plant with
quadratic performance criterion (called the performance index), which takes the form of the
matrix Riccati (1724) equation.

5.2.3 Quadratic performance index


If, in the racing yacht example, the following state and control variables are defined

x1 = ye (t), x2 = ψe (t), x3 = ue (t), u = δa (t)

then the performance index could be expressed



J = ∫_{t0}^{t1} {(q11x1 + q22x2 + q33x3) + (r1u)} dt    (9.4)
or
J = ∫_{t0}^{t1} (Qx + Ru) dt    (9.5)

If the state and control variables in equations (9.4) and (9.5) are squared, then the performance index becomes quadratic. The advantage of a quadratic performance index is that for a linear system it has a mathematical solution that yields a linear control law of the form

u(t) = −Kx(t) (9.6)

A quadratic performance index for this example is therefore

J = ∫_{t0}^{t1} (q11x1² + q22x2² + q33x3² + r1u²) dt    (9.7)

J = ∫_{t0}^{t1} ( [x1 x2 x3] [q11 0 0; 0 q22 0; 0 0 q33] [x1; x2; x3] + [u][r1][u] ) dt

or, in general
J = ∫_{t0}^{t1} (xᵀQx + uᵀRu) dt    (9.8)

Q and R are the state and control weighting matrices and are always square and symmetric. J
is always a scalar quantity.

5.2.4 The Linear quadratic problem (LQR)


The Linear Quadratic Regulator (LQR) provides an optimal control law for a linear system with a
quadratic performance index.

5.2.5 Dynamic Programming and Full-State Feedback


We consider here the regulation problem, that is, of keeping x_desired = 0. The closed-loop system
thus is intended to reject disturbances and recover from initial conditions, but not necessarily
follow y-trajectories. There are several necessary definitions. First we define an instantaneous
penalty function l(x(t), u(t)), which is to be greater than zero for all nonzero x and u. The cost
associated with this penalty, along an optimal trajectory, is
J = ∫₀^∞ l(x(t), u(t)) dt

i.e., the integral over time of the instantaneous penalty. Finally, the optimal return is the cost of the
optimal trajectory remaining after time t :
V(x(t), u(t)) = ∫_t^∞ l(x(τ), u(τ)) dτ.

We have directly from the dynamic programming principle

V(x(t), u(t)) = min_u {l(x(t), u(t))δt + V(x(t + δt), u(t + δt))}.    (5.1)

The minimization of V (x(t), u(t)) is made by considering all the possible control inputs u
in the time interval (t,t + δt). As suggested by dynamic programming, the return at time t is
constructed from the return at t + δt, and the differential component due to l(x, u).
Through multivariate Taylor series expansion,

p1(x, y) = f(a, b) + Df(a, b)((x, y) − (a, b))
        = f(a, b) + (∂f/∂x)(a, b)(x − a) + (∂f/∂y)(a, b)(y − b)    (5.2)

Since the input u(t) is an independent variable, the V(x(t), u(t)) function may be treated as a univariate function in x(t):

f(x + Δx) = f(x) + Δx f′(x) + (Δx²/2!) f″(x) + (Δx³/3!) f‴(x) + (Δx⁴/4!) f⁽⁴⁾(ξ1) + ⋯    (5.3)

If V is smooth and has no explicit dependence on t, then

V(x(t + δt), u(t + δt)) = V(x(t), u(t)) + (∂V/∂x)(dx/dt)δt + h.o.t.
                       = V(x(t), u(t)) + (∂V/∂x)(Ax(t) + Bu(t))δt.

Substituting V(x(t + δt), u(t + δt)) into the Bellman equation gives

V(x(t), u(t)) = min_u {l(x(t), u(t))δt + V(x(t), u(t)) + (∂V/∂x)(Ax(t) + Bu(t))δt}.    (5.4)

Now the control input u in the interval (t, t + δt) cannot affect V(x(t), u(t)), so cancelling it from both sides and dividing by δt gives

0 = min_u { l(x(t), u(t)) + (∂V/∂x)(Ax(t) + Bu(t)) }.    (5.5)
We next make the assumption that V(x, u) has the following form:

V(x, u) = ½ xᵀPx + ½ uᵀZu

where P is a symmetric matrix, and positive definite. It follows that

∂V/∂x = xᵀP

and hence

0 = min_u { l(x, u) + xᵀP(Ax + Bu) }.

We finally specify the instantaneous penalty function. The LQR employs the special quadratic
form
l(x, u) = ½ xᵀQx + ½ uᵀRu,
where Q and R are both symmetric and positive definite. The matrices Q and R are to be set by the
user, and represent the main "tuning knobs" for the LQR. Substitution of this form into the above
equation gives
 
0 = min_u { ½ xᵀQx + ½ uᵀRu + xᵀP(Ax + Bu) }.    (5.6)
and setting the derivative with respect to u to zero gives

0 = uT R + xT PB
uT = −xT PBR−1
u = −R−1 BT Px.

The gain matrix for the feedback control is thus K = R−1 BT P. Inserting this solution back into
equation 5.6, and eliminating u in favor of x, we have

0 = ½ xᵀQx − ½ xᵀPBR⁻¹BᵀPx + xᵀPAx.
2 2

All the matrices here are symmetric except for PA. Since the quadratic form is a scalar, xᵀPAx = xᵀAᵀPx, so that

2xᵀPAx = xᵀ(AᵀP + PA)x

and we can replace xᵀPAx by its symmetric part:

xᵀPAx = ½ xᵀPAx + ½ xᵀAᵀPx,
leading to the final matrix-only result

0 = Q + PA + AT P − PBR−1 BT P.

This equation (9.25) is referred to as the Algebraic Riccati Equation (ARE).



Example:
The regulator shown in Figure 9.1 contains a plant that is described by
" # " #" # " #
ẋ1 0 1 x1 0
= + u
ẋ2 −1 −2 x2 1
h i
y= 1 0 x

and has a performance index

Z ∞
" " # #
2 0
J= xT x + u2 dt
0 0 1
Determine
(a) the Riccati matrix P
(b) the state feedback matrix K
(c) the closed-loop eigenvalues

Fig. 5.2 LQR controller

(a)

A = [0 1; −1 −2],  B = [0; 1]
Q = [2 0; 0 1],  R = scalar = 1
From equation (9.25) the reduced Riccati equation is

Q + PA + AᵀP − PBR⁻¹BᵀP = 0

PA = [p11 p12; p21 p22][0 1; −1 −2] = [−p12, p11 − 2p12; −p22, p21 − 2p22]

AᵀP = [0 −1; 1 −2][p11 p12; p21 p22] = [−p21, −p22; p11 − 2p21, p12 − 2p22]

PBR⁻¹BᵀP = [p11 p12; p21 p22][0; 1] · 1 · [0 1][p11 p12; p21 p22]
         = [p12; p22][p21 p22]
         = [p12p21, p12p22; p22p21, p22²]
Combining equations (9.34), (9.35) and (9.36) gives

" # " # " #


2 0 −p12 p11 − 2p12 −p21 −p22
+ + (9.37)
0 1 −p22 p21 − 2p22 p11 − 2p21 p12 − 2p22
" #
p12 p21 p12 p22
− =0
p22 p21 p222

Since P is symmetric, p21 = p12 . Equation (9.37) can be expressed as four simultaneous
equations

2 − p12 − p12 − p12² = 0    (9.38)
p11 − 2p12 − p22 − p12p22 = 0    (9.39)
−p22 + p11 − 2p12 − p12p22 = 0    (9.40)
1 + p12 − 2p22 + p12 − 2p22 − p22² = 0    (9.41)

Note that equations (9.39) and (9.40) are the same. From equation (9.38)

p12² + 2p12 − 2 = 0

solving

p12 = p21 = 0.732 and − 2.732

Using positive value

p12 = p21 = 0.732 (9.42)



From equation (9.41)

1 + 2p12 − 4p22 − p22² = 0
p22² + 4p22 − 2.464 = 0
solving

p22 = 0.542 and − 4.542

Using positive value

p22 = 0.542 (9.43)

From equation (9.39)

p11 − (2 × 0.732) − 0.542 − (0.732 × 0.542) = 0


p11 = 2.403 (9.44)

From equations (9.42), (9.43) and (9.44) the Riccati matrix is


" #
2.403 0.732
P= (9.45)
0.732 0.542
(b) Equation (9.21) gives the state feedback matrix

K = R⁻¹BᵀP = 1 · [0 1] [2.403 0.732; 0.732 0.542]    (9.46)

Hence

K = [0.732 0.542]

(c) From equation (8.96), the closed-loop eigenvalues are

|sI − A + BK| = 0

[s 0; 0 s] − [0 1; −1 −2] + [0; 1][0.732 0.542] = 0
|[s, −1; 1.732, s + 2.542]| = 0
s² + 2.542s + 1.732 = 0
s1, s2 = −1.271 ± j0.341
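The whole example can be verified with scipy's continuous-time ARE solver (a sketch; the solver call is standard scipy, not part of the original notes):

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.array([[2.0, 0.0], [0.0, 1.0]])
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)     # Riccati matrix
K = np.linalg.solve(R, B.T @ P)          # K = R^-1 B^T P
print(P)                                 # approx. [[2.403, 0.732], [0.732, 0.542]]
print(K)                                 # approx. [[0.732, 0.542]]
print(np.linalg.eigvals(A - B @ K))      # approx. -1.271 +/- 0.341j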

1.4.1 LQ-Observer
For a system of the form

ẋ(t) = Ax(t) + Bu(t) (1.12)


y(t) = Cx(t)

The state estimator design problem is to choose the observer gain L in the observer equation

x̂˙ = Ax̂ + Bu + L(y −Cx̂)

With the observer error dynamics equation

ė = (A − LC)e

So that the observer error dynamics is stable.


The related state feedback problem (the dual) is to choose K in

ẋ = Aᵀx + Cᵀu with u = −Kx

which implies ẋ = (Aᵀ − CᵀK)x

so that Aᵀ − CᵀK is stable.
By choosing L = Kᵀ for the observer, the observer is ensured to be stable.
Since the K obtained by LQ optimal control design is stabilizing as long as some stabilizability and detectability conditions are satisfied, L = Kᵀ can be used as a stabilizing observer gain as well.
Solving the LQ control problem for the dual problem

ẋ = Aᵀx + Cᵀu with u = −Kx

transform the algebraic Riccati equation with the replacements

B → Cᵀ,  A → Aᵀ

Thus Q + PAᵀ + AP − PCᵀR⁻¹CP = 0

The stabilizing feedback gain K for the dual system is given by

K = Lᵀ = R⁻¹CP ⇒ L = PCᵀR⁻¹

where L is the observer gain.
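In code, this duality amounts to calling an LQR Riccati solver on the transposed data. A minimal sketch (names mine), assuming scipy is available:

import numpy as np
from scipy.linalg import solve_continuous_are

def lqe_gain(A, C, Qe, Re):
    """Observer gain via the dual LQR problem."""
    # solve_continuous_are(A.T, C.T, Qe, Re) solves A P + P A^T - P C^T Re^-1 C P + Qe = 0
    P = solve_continuous_are(A.T, C.T, Qe, Re)
    return P @ C.T @ np.linalg.inv(Re)     # L = P C^T Re^-1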



11. Optimal Regulator With Noisy Measurements (Stochastic state feedback regulator problem)

In the design of state observers in the previous sections (LQR), it was assumed that the measurements y = Cx were noise free. In practice this is not usually the case, and therefore the observed state vector may also be contaminated with noise.
For a linear stochastic system, process disturbance and measurement noise account for model uncertainty, and the plant/process model is modified to:

ẋ(t) = Ax(t) + Bu(t) + w(t), x(0) = x0 (1.13)


y(t) = Cx(t) + v(t)

where w(t) and v(t) are white noise processes: the n-dimensional vector w(t) represents process noise, the l-dimensional vector v(t) measurement noise, and x0 is a stochastic variable with E{x0x0ᵀ} = P0. As before, both noise processes are assumed to be wide-sense stationary, zero-mean, Gaussian distributed and uncorrelated, and to satisfy

E{w(t)wᵀ(t + τ)} = Qeδ(τ),  E{v(t)vᵀ(t + τ)} = Reδ(τ),  E{w(t)vᵀ(t + τ)} = 0

Consider the following cost function:

J(u) = E{ ∫_{t0}^{t1} (xᵀQx + uᵀRu) dt + xᵀ(t1)Sx(t1) }

where R > 0, Q ≥ 0 and S ≥ 0. The problem of determining for each t the input u(t) as a
function of the past such that the cost function is minimized is called the stochastic state feedback
regulator problem. Note that since all the variables are stochastic, we consider the average of the
usual cost function.
It is possible to prove that the solution of the stochastic state feedback regulator problem is the
same as in the deterministic case. The presence of white noise does not alter the solution, except
to increase the minimal value of the cost function. That is, the optimal control input, u(t), is given
by:

u(t) = −R−1 BT P(t)x(t) (1.1)

where P(t) is the solution of the Continuous-Time Differential Riccati Equation (CTDRE) for a finite-horizon problem:

Ṗ = −AᵀP − PA + PBR⁻¹BᵀP − Q    (1.2)
P(t1) = S

For the infinite-horizon problem, P(t) converges to a constant P which is the solution of the Algebraic Riccati Equation (ARE):

Ṗ = 0
0 = Q + AᵀP + PA − PBR⁻¹BᵀP

11.0 Kalman filter (Optimal observer)

11.1 The Kalman Filter


The Kalman filter [1] has long been regarded as the optimal solution to many tracking and data
prediction tasks. The standard Kalman filter is constructed as a mean squared error minimiser.
Derivation of the Kalman filter furnishes the reader with further insight into the statistical constructs
within the filter.
The purpose of filtering is to extract the required information from a signal, ignoring everything
else. How well a filter performs this task can be measured using a cost or loss function. Indeed we
may define the goal of the filter to be the minimisation of this loss function.

11.2 Mean squared error

Many signals can be described in the following way;

yk = ck xk + vk (11.1)

where; yk is the time dependent observed signal, ck is a gain term, xk is the information bearing
signal and vk is the additive noise.
The overall objective is to estimate xk . The difference between the estimate of x̂k and xk itself
is termed the error;

f (ek ) = f (xk − x̂k ) (11.2)

The particular shape of f (ek ) is dependent upon the application, however it is clear that the
function should be both positive and increase monotonically [3]. An error function which exhibits
these characteristics is the squared error function;

f (ek ) = (xk − x̂k )2 (11.3)

Since it is necessary to consider the ability of the filter to predict many data over a period of
time a more meaningful metric is the expected value of the error function;

Loss function = E ( f (ek )) (11.4)

This results in the mean squared error (MSE) function;


ε(t) = E(ek²) = E((xk − x̂k)(xk − x̂k))    (11.5)

11.6 Covariance Matrices


Let's remember that covariance is the measurement of the joint variability of two random variables [6]. In other words it describes how correlated two random distributions are. Note that, for a finite set of random numbers xi ∈ R, the variance σ² with expected value vx is:

σ² = (1/(n − 1)) Σ_{i=1}^{n} (xi − vx)²

We can actually do the same thing for two variables, say x and y. This is what we call the covariance, and for a finite set of random numbers xi ∈ R and yi ∈ R we get the equation:

σ(x, y) = (1/(n − 1)) Σ_{i=1}^{n} (xi − vx)(yi − vy)

Notice that we can also find the variance of x if we just find its covariance with itself: σ(x, x) is simply the variance of x.
The covariance matrix K is a matrix representation of the calculated covariances of all variables in the system. It is a square, symmetric matrix which contains the possible covariances. If the matrix is diagonal, then the variables are independent (uncorrelated); if it is not, the variables linked by the off-diagonal entries are not independent.

K = [σ(x, x), σ(x, y); σ(y, x), σ(y, y)]

11.7 State space derivation of Kalman filter


Assume that we want to know the value of a variable within a process of the form;

xk+1 = Axk + wk (11.10)

where: xk is the state vector of the process at time k, (n×1); A is the state transition matrix of the process from the state at k to the state at k + 1, assumed stationary over time, (n×n); wk is the associated white noise process with known covariance, (n×1), such that there is zero correlation between the present input at k and the past input at l, i.e.:

E{wkwlᵀ} = Qkδkl = { Qk, k = l; 0, k ≠ l }

where Qk > 0 is a positive definite matrix.


h An initial randomivector x0 and initial random estimate x̂0 with initial error covariance
E (x0 − x̂0 ) (x0 − x̂0 )T = P0 .
Observations on this variable can be modelled in the form;

yk = Cxk + vk (11.11)

where: yk is the actual measurement of x at time k, (m×1); C is the noiseless connection between the state vector and the measurement vector, assumed stationary over time, (m×n); vk is the associated measurement error, again assumed to be a white noise process with known covariance and zero cross-correlation with the process noise, (m×1), with

E{vkvlᵀ} = Rkδkl = { Rk, k = l; 0, k ≠ l }

where Rk > 0 is a positive definite matrix. It is assumed that x0, vj and wk are all uncorrelated for j ≥ 0, k ≥ 0.

For the minimisation of the MSE to yield the optimal filter, it must be possible to correctly model the system errors using Gaussian distributions. The covariances of the two noise models are assumed stationary over time and can simply be described by;

 
Q = E{wkwkᵀ}    (11.12)
R = E{vkvkᵀ}    (11.13)

The mean squared error is given by 11.5. This is equivalent to;

 
E{ekekᵀ} = Pk    (11.14)

where; Pk is the error covariance matrix at time k, (nxn).


Equation 11.14 may be expanded to give;

Pk = E{ekekᵀ} = E{(xk − x̂k)(xk − x̂k)ᵀ}    (11.15)

11.6 State Estimation


Plant:
x(k + 1) = Ax(k) + Bu(k) + w(k)
y(k) = Cx(k) + ν(k)
Deterministic Observer:

x̂(k + 1) = Ax̂(k) + Bu(k) + L[y(k) −Cx̂(k)]

Stochastic Observer:

(i) Prediction:

x̂′(k + 1) = x̂(k + 1 | k) = Ax̂(k) + Bu(k)

(ii) Filter: Assume the prior estimate of xk is called x̂k′, gained by knowledge of the system. It is possible to write an update equation for the new estimate, combining the old estimate with the measurement data, thus;

x̂k = x̂k′ + Kk(yk − Cx̂k′)    (11.16)

where Kk is the Kalman gain, which will be derived shortly. The term yk − Cx̂k′ in eqn. 11.16 is known as the innovation or measurement residual;

ik = yk − Cx̂k′    (11.17)

Substitution of 11.11 into 11.16 gives;

x̂k = x̂k′ + Kk(Cxk + vk − Cx̂k′)    (11.18)

Substituting 11.18 into 11.15 gives;

Pk = E{ [(I − KkC)(xk − x̂k′) − Kkvk] [(I − KkC)(xk − x̂k′) − Kkvk]ᵀ }    (11.19)

At this point it is noted that xk − x̂k′ is the error of the prior estimate. It is clear that this is uncorrelated with the measurement noise and therefore the expectation may be re-written as;

Pk = (I − KkC) E{(xk − x̂k′)(xk − x̂k′)ᵀ} (I − KkC)ᵀ + Kk E{vkvkᵀ} Kkᵀ    (11.20)

Substituting equations 11.13 and 11.15 into 11.20 gives;



Pk = (I − KkC) Pk′ (I − KkC)T + Kk RKkT (11.21)

where Pk′ is the prior estimate of Pk .


Expansion of 11.21 gives;

Pk = Pk′ − KkCPk′ − Pk′CᵀKkᵀ + Kk(CPk′Cᵀ + R)Kkᵀ    (11.23)

The optimal gain minimises the sum of the error variances, i.e. the trace of Pk. Note that (KkCPk′)ᵀ = Pk′CᵀKkᵀ since Pk′ is symmetric, so the two middle terms contribute equally to the trace and, in the trace sense,

−KkCPk′ − Pk′CᵀKkᵀ = −2KkCPk′

Therefore 11.23 may be written as;



Pk = Pk′ − 2KkCPk′ + Kk(CPk′Cᵀ + R)Kkᵀ    (11.23b)

Differentiating the trace with respect to Kk gives;

d[trace(Pk)]/dKk = −2(CPk′)ᵀ + 2Kk(CPk′Cᵀ + R)    (11.25)

Setting to zero and re-arranging gives;

Pk′Cᵀ = Kk(CPk′Cᵀ + R)    (11.26)

Now solving for Kk gives;

Kk = Pk′Cᵀ(CPk′Cᵀ + R)⁻¹    (11.27)

Equation 11.27 is the Kalman gain equation. The innovation, ik defined in eqn. 11.17 has an
associated measurement prediction covariance. This is defined as;

Sk = CPk′CT + R (11.28)

Finally, substitution of equation 11.27 into 11.23 gives;

Pk = Pk′ − Pk′Cᵀ(CPk′Cᵀ + R)⁻¹CPk′
   = Pk′ − KkCPk′
Pk = (I − KkC)Pk′    (11.29)

Equation 11.29 is the update equation for the error covariance matrix with optimal gain. The
three equations 11.16, 11.27, and 11.29 develop an estimate of the variable xk . State projection is
achieved using;


x̂′k+1 = Ax̂k + Buk    (11.30)

To complete the recursion it is necessary to find an equation which projects the error covariance
matrix into the next time interval, k + 1. This is achieved by first forming an expressions for the
prior error;

e′k+1 = xk+1 − x̂′k+1
     = (Axk + wk) − Ax̂k
     = Aek + wk    (11.31)

Extending equation 11.15 to time k + 1;

P′k+1 = E{e′k+1 e′k+1ᵀ} = E{(Aek + wk)(Aek + wk)ᵀ}    (11.32)

Note that ek and wk have zero cross-correlation because the noise wk actually accumulates
between k and k + 1 whereas the error ek is the error up until time k. Therefore;


 
P′k+1 = E{e′k+1 e′k+1ᵀ}
      = E{Aek(Aek)ᵀ} + E{wkwkᵀ}

P′k+1 = APkAᵀ + Q    (11.33)


P′k+1 = A(I − KkC)Pk′Aᵀ + Q
      = APk′Aᵀ − APk′Cᵀ(CPk′Cᵀ + R)⁻¹CPk′Aᵀ + Q
The projected covariance is related to the LQR continuous time algebraic Riccati equation (CARE):

AᵀP + PA − PBR⁻¹BᵀP + Q = 0

but quite similar to the discrete-time algebraic Riccati equation (DARE):

P = AᵀPA − AᵀPB(R + BᵀPB)⁻¹BᵀPA + Q.

This completes the recursive filter. The algorithmic loop is summarised in the diagram of
figure 11.5.
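A minimal sketch of that loop in Python (variable names are mine; the comments point back to the equations above):

import numpy as np

def kalman_step(A, C, Q, R, x_prior, P_prior, y):
    """One update/predict cycle of the discrete Kalman filter."""
    # Update (filter) with the measurement y_k
    S = C @ P_prior @ C.T + R                          # innovation covariance (11.28)
    K = P_prior @ C.T @ np.linalg.inv(S)               # Kalman gain (11.27)
    x = x_prior + K @ (y - C @ x_prior)                # state update (11.16)
    P = (np.eye(P_prior.shape[0]) - K @ C) @ P_prior   # covariance update (11.29)
    # Predict (project to k+1); the B u_k term of (11.30) is omitted for brevity
    x_prior_next = A @ x                               # (11.30)
    P_prior_next = A @ P @ A.T + Q                     # (11.33)
    return x_prior_next, P_prior_next

# In steady state P_prior converges to the solution of the corresponding DARE,
# obtainable directly with scipy.linalg.solve_discrete_are(A.T, C.T, Q, R).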

Fig. 5.3 The Kalman Gain Intuition: For 1D Case

Fig. 5.4 The Kalman Gain Intuition: For 1D Case

1.4.3 Stochastic LQR and Linear Quadratic Estimator (LQE)


Linear-Optimal Estimation for Continuous-Time Systems (Kalman-Bucy Filter)

In order to compute the optimal control input u(t), we use output feedback, where the measured variable y(t) is used to make an estimate x̂(t) of the state.
Now, we want to formulate the optimal linear regulator problem when the observations of the system are noisy. That is, consider the system:

ẋ(t) = Ax(t) + Bu(t) + w(t), x(0) = x0

where x0 is a stochastic vector with zero mean and covariance P0 . The observed variable is
given by:

y(t) = Cx(t) + Dv(t)



The observer equation is given by

x̂˙ = Ax̂ + Bu + L(y −Cx̂)

with the observer error dynamics

ė = (A − LC)e + w(t) − LDv(t)

where E{w} = 0, E{w(t)wᵀ(s)} = Qeδ(t − s) and E{v} = 0, E{v(t)vᵀ(s)} = Reδ(t − s).
Then, the stochastic optimal output feedback regulator problem is the problem of finding the
functional u(t) = f [y(τ),t0 ≤ τ ≤ t] such that the cost function:
J(u) = E{ ∫_{t0}^{t1} (xᵀQx + uᵀRu) dt + xᵀ(t1)Sx(t1) }

and the estimation-error power

min_L lim_{t→∞} E{x̃ᵀ(t)x̃(t)},   Jt = E{x̃(t)x̃ᵀ(t)}

where x̃(t) = x(t) − x̂(t) is the estimation error, are minimized, i.e. the estimator minimizes the noise power in the state estimation error.
One can show that the observer gain L that solves this problem is obtained by solving the dual
version of the optimal pole placement problem (LQR), with the replacements

A, B → AT ,CT

and

Q, R → Qe, Re (also denoted Qf, Rf)

In practice, the noise covariance matrices are often not known explicitly, but are used as tuning parameters. Common choices are

Qe = BBᵀ

and

Re = diag(r1, …, rl) = DDᵀ

where the values of ri can be used to tune the speed of estimation for each output channel.
It is possible to prove that the solution of the stochastic optimal output feedback regulator
problem is the same as the solution of the corresponding optimal state feedback regulator problem,
except that in the control law, the state x(t) is replaced with the Kalman filter estimator x̂(t), that is
the optimal control input is chosen as:

u(t) = −R−1 BT P(t)x̂(t) (1.4)



where P(t) is the solution of the (CT DRE) for finite horizon time problem:

Ṗ = −AT P − PA + PBR−1 BT P − Q (1.5)


P (t1 ) = S (1.6)

or, for the infinite-horizon problem:

Ṗ = 0
0 = Q + AᵀP + PA − PBR⁻¹BᵀP

The estimate x̂(t) is obtained as the solution of

x̂˙(t) = Ax̂(t) + Bu(t) + L(t)[y(t) − Cx̂(t)]    (1.7)
x̂(t0) = 0    (1.8)

where

L(t) = P(t)CT R−1 (1.9)

and P(t) is the solution of the (RE) :

Ṗ = AP + PAT − PCT R−1CP + Q (1.10)


P (t0 ) = P0 (1.11)

or, for the infinite-horizon problem:

Ṗ = 0
0 = Q + AP + PAᵀ − PCᵀR⁻¹CP

5.2.6 Examples

Example 1

Consider the following differential equation describing a simple ideal (zero-resistance) electrical circuit:

d²i/dt² + i = √3 v
where v is a normal distributed white noise with

E {v(t)} = 0, E {v(t)v(s)} = δ (t − s)

The measurement, s(t), are taken with an ampere meter where

s(t) = i(t) + e(t)

where e(t) is a normal distributed white noise, uncorrelated from the system noise v(t), and
such that:

E {e(t)} = 0, E {e(t)e(s)} = δ (t − s)

Your assignment is to determine a steady-state LQE to estimate the current, i(t), from the noisy
observation, {s(τ), 0 ≤ τ ≤ t}, by using the associated linear system.
Solution: First, let us rewrite the problem in standard form. Let x1 = i and x2 = di/dt. Then we have that:

ẋ1 = x2
ẋ2 = −x1 + √3 v

that is:

ẋ = [0 1; −1 0] x + [0; √3] v = Ax + Bv
and

y(t) = [1 0] x(t) + e(t) = Cx(t) + De(t)

It is proven that the covariance matrix of the estimation error, P(t), tends to a limit P as t → ∞.
If (A, B) is completely controllable, and (A,C) is completely observable, then P is the unique
positive definite symmetric solution of the (ARE):

AP + PAᵀ − PCᵀ(DDᵀ)⁻¹CP + BBᵀ = 0

and consequently the Kalman filter gain tends to

K = PCᵀ(DDᵀ)⁻¹

In our case the system is a minimal realization. In fact, the controllability and observability matrices are:

Γ = [B AB] = [0 √3; √3 0],  Ω = [C; CA] = [1 0; 0 1]

and they are both full rank.
In order to solve the (ARE) we define the matrix P as

P = [p1 p2; p2 p3]

and we plug it into the (ARE):

[0 1; −1 0][p1 p2; p2 p3] + [p1 p2; p2 p3][0 −1; 1 0] − [p1 p2; p2 p3][1 0; 0 0][p1 p2; p2 p3] + [0 0; 0 3] = 0
So, we get the following system:

2p2 − p1² = 0
p3 − p1 − p1p2 = 0
−2p2 − p2² + 3 = 0
By solving the system, and taking the solution that gives a positive definite P, we get:

P = [√2 1; 1 2√2]
Then, the steady-state Kalman gain is:

K = PCᵀR⁻¹ = [√2; 1]
Note that the estimator system matrix is:

A − KC = [−√2 1; −2 0]
which has the following eigenvalues:

|λI − (A − KC)| = λ² + √2λ + 2 ⇒ λ1,2 = −1/√2 ± i√3/√2

that is, A − KC is a stability matrix.
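The hand solution can be confirmed numerically (a sketch; the scipy call is mine, not part of the notes):

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [np.sqrt(3.0)]])
C = np.array([[1.0, 0.0]])
R = np.array([[1.0]])                            # D D^T with D = 1

P = solve_continuous_are(A.T, C.T, B @ B.T, R)   # filter ARE via duality
K = P @ C.T @ np.linalg.inv(R)
print(P)   # approx. [[1.414, 1.0], [1.0, 2.828]], i.e. [[sqrt(2), 1], [1, 2 sqrt(2)]]
print(K)   # approx. [[1.414], [1.0]]
print(np.linalg.eigvals(A - K @ C))              # approx. -0.707 +/- 1.225j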

Example 2

For a stochastic system with the deterministic part modelled as shown in Equation (5.7), the state equation is affected by a Gaussian disturbance while the measurement is corrupted by Gaussian noise.

1. Analyse the following:

(a) open-loop stability of the system


(b) Controllability of the system
(c) Observability of the system

2. Using the separation principle, determine the following

(a) LQR controller gain K


(b) Kalman filter gain L

3. Assess the closed-loop stability of the system

" # " #
1 1 2
ẋ = x+ u
0 −1 0 (5.7)
h √ i
y= 2 0 x

and we want to minimize the cost function:


J(u) = ∫₀^∞ (2y² + u²) dt

Solution:
First, we consider the system without measurement noise, e(t), and solve the corresponding LQ problem. To solve this LQ problem we have to check whether the realization is minimal. The reachability and observability matrices are:
" # " # " √ #
h i 2 2 C 2 0
Γ= B AB = , Ω= = √ √
0 0 CA 2 2
Unfortunately, the system is not completely reachable with the control input u. However, the unreachable mode (at λ = −1) is asymptotically stable, i.e. the system is stabilizable, so the (ARE) still has a positive definite solution. So, the optimal solution, û, is given by:

û = −Kz with K = BT P

and where the matrix P is the unique positive solution of the (ARE):

AT P + PA − PBBT P + Q = 0

where

Q = [2 0; 0 0]
By inserting the numerical values, we get:

[1 0; 1 −1][p1 p2; p2 p3] + [p1 p2; p2 p3][1 1; 0 −1] − [p1 p2; p2 p3][4 0; 0 0][p1 p2; p2 p3] + [2 0; 0 0] = 0

So, we get the following system:

2p1 + 2 − 4p1² = 0
p2 + p1 − p2 − 4p1p2 = 0
2(p2 − p3) − 4p2² = 0
By solving the system, and taking the solution that gives a positive definite P, we get:

P = [1 1/4; 1/4 1/8] > 0
and so the optimal gain, K, is:

K = BᵀP = [2 0][1 1/4; 1/4 1/8] = [2 1/2]

Next, we have to determine the optimal observer, that is, the Kalman-Bucy filter. We have the following system:

ż = [1 1; 0 −1] z + [0; √2] v
y = [1 0] z + √2 η

Therefore, we have a standard problem with the following numerical data:

A = [1 1; 0 −1],  B = [0; √2],  C = [1 0],  D = √2
R = DDᵀ = 2,  Q = BBᵀ = [0 0; 0 2]
Note that in this case the realization is minimal. In fact, the controllability and observability matrices are:

Γ = [B AB] = [0 √2; √2 −√2],  Ω = [C; CA] = [1 0; 1 1]
So, we know we have a unique positive solution to the (ARE). The steady-state Kalman filter gain L is given by:

L = PCᵀR⁻¹

and where the matrix P is the unique positive solution of the (ARE):

AP + PAT − PCT R−1CP + Q = 0

By inserting the numerical values, we get:

[1 1; 0 −1][p1 p2; p2 p3] + [p1 p2; p2 p3][1 0; 1 −1] − [p1 p2; p2 p3][1/2 0; 0 0][p1 p2; p2 p3] + [0 0; 0 2] = 0
with solution (obtained numerically)

P = [4.3947 0.4337; 0.4337 0.9530]

and the Kalman gain becomes

L = PCᵀR⁻¹ = [2.1974; 0.2168]

(c) By the separation principle the overall system is stable. However, we can check this by computing the eigenvalues of the matrix

[A − BK, BK; 0, A − LC]

that is, the eigenvalues of A − BK and of A − LC:

|λI − (A − BK)| = (λ + 3)(λ + 1) ⇒ λ1,2 = −3, −1

|λI − (A − LC)| = λ² + 2.1974λ + 1.4142 ⇒ λ1,2 ≈ −1.0987 ± j0.4551

and so the overall system is asymptotically stable.
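The separation-principle check can also be scripted (a sketch; K and L are the values computed above):

import numpy as np

A = np.array([[1.0, 1.0], [0.0, -1.0]])
B = np.array([[2.0], [0.0]])
C = np.array([[1.0, 0.0]])          # output used by the observer
K = np.array([[2.0, 0.5]])
L = np.array([[2.1974], [0.2168]])

# Closed loop in (x, e) coordinates: [A - BK, BK; 0, A - LC]
Acl = np.block([[A - B @ K, B @ K],
                [np.zeros((2, 2)), A - L @ C]])
print(np.linalg.eigvals(Acl))       # -3, -1 and approx. -1.0987 +/- 0.4551j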
