
– Control Theory –

C4. Time domain system identification


Jan Swevers
September 2020
Objectives:
• The student has basic knowledge of system identification in general and
of least-squares time-domain identification (LS-TDI) in particular.
• The student is able to apply the LS-TDI procedure to linear time-
invariant systems with one input and one output. This includes selection
of the model structure, evaluation of model accuracy, and application of
simple measures to improve model accuracy.

Introduction to least-squares time domain identification 1/92

Outline of this chapter


• What is system identification
• The system identification procedure
• Models and prediction
• Time domain parameter estimation
• Practical issues of least-squares ARX model parameter estimation

Reference material: course notes.

[H00S3A-H00S4A-H04X3A-H04X3B]

What is system identification


• Definition of system identification: the selection of a model for a process
(i.e. the studied system or device under test (DUT)), using a limited number of
measurements of the inputs and outputs, which may be disturbed by noise,
and a priori system knowledge.
• Definition of parameter estimation: the experimental determination
of the values of the parameters that govern the dynamic behavior, assuming that
the structure of the process model is known.
• We limit ourselves to linear time-invariant (LTI) systems and models.


The system identification procedure


4 basic steps:
1. Collect input-output measurement data
2. Select a model structure to represent the system
3. Match the selected model structure to the measurements (parameter
estimation)
4. Validate the model


Collect input-output measurement data


• Measure the response (input/output) while the system is in normal
operation. No freedom to select excitation!
• Perform a dedicated experiment that actively excites the system:
– Satisfy the condition of persistency of excitation: the excitation should
be sufficiently rich such that all modes of the system are excited and
observable in the output sequence.
– More desirable: design/select the excitation that results in maximally
informative data, subject to constraints that may be at hand, in order
to minimize the model uncertainty. A priori information on the system
is very important to support this design/selection.
– Sampling frequency: in principle twice the highest frequency of interest;
in practical cases at least 10 times higher than the highest frequency.
Take the same sampling frequency as the one used to implement the
controller.


Choosing a convenient model set


• Make a choice within all possible mathematical models that can be used to
represent the system: choice is very important but often difficult.
• Continuous- or discrete-time models: depends on the further application of
the model/measurement configuration.
• For digital control applications: discrete-time models (see later).
• A priori available system information can help:
– certain physical laws are known to hold true for the system,
– preliminary data analysis: step or frequency response.
• If no information is available: apply trial-and-error procedure.


Match the selected model structure to the measurements (parameter estimation)

• Determine, within the set of models, the model that is the “best”
approximation or provides the “best” explanation of the observed data.
• We need a criterion to measure the model quality: the estimation of the
model parameters corresponds to the minimization of the chosen criterion.
• The choice of the criterion is extremely important because it determines
the stochastic properties of the estimator.
• This criterion or cost function defines a distance between the experimental
data and model: can be chosen on an ad-hoc basis using intuitive insight,
or more systematically based on stochastic arguments.


Validating the model


• How do you know if the model is satisfactory? Use it and check whether it serves
its purpose.
• This is often too risky in practice: instead, apply model validation criteria to get
some feeling for the accuracy of the model and confidence in its value.
– Simulate and compare with different sets of measurements: in practice,
the best model (yielding the smallest errors) is not always preferred:
often simpler models that describe the system within user-specified
error bounds are preferred.
– Compare parameter estimates with expectations or values found using
other measuring techniques (if available).
• During validation, keep the application in mind (determines what
properties are critical), test the model under the same conditions, avoid
extrapolation as much as possible.


The identification loop


The measurement configuration


Two most common measurement configurations: continuous-time (A) and
zero-order-hold (zoh) (B) measurement configuration


Continuous-time measurement configuration


• External excitation source generates continuous-time band-limited signal.
• Input and output signals are measured with measurement equipment that
samples these continuous-time signals synchronously.
• Both measured input and output signals can be perturbed by measurement
noise.
• Dynamics of sampling/DAC are not present in measurements.
• Continuous-time models are appropriate.


zoh-measurement configuration
• The excitation signal is generated by the measurement device that also
measures the system output. This measurement device is typically a digital
control computer also used to control the system.
• The excitation signal is a discrete-time signal u(kT ), free of noise.
• Through a zero-order-hold interpolation performed by the digital-to-analog
converter (DAC) of the control computer a continuous-time signal u(t) is
generated, a sequence of steps, that is applied to the (continuous-time)
system.
• At the system output, the continuous-time signal y(t) is sampled yielding
the discrete-time signal y(kT ) that is stored in the control computer
memory.
• The sampled output signal may be perturbed by measurement noise.
• Discrete-time models including dynamics of zoh-interpolation are
appropriate.

zoh-measurement configuration (2)


zoh-configuration corresponds to digital control computer implementation.


zoh-measurement configuration (3)


Relation discrete-time and continuous-time signals in zoh measurement
configuration.


zoh-measurement configuration (4)


Relation between discrete-time signal u(kT) (grey dots) and continuous-time
signal u(t) (black line).

Relation between continuous-time signal y(t) (black line) and discrete-time
signal y(kT) (grey dots).


Models and prediction


Overview
• Discrete-time representation of continuous-time systems
• Prediction
• Discrete-time input-output models for linear time-invariant systems


Discrete-time representation of continuous-time systems


Discrete-time models for the zoh measurement configuration
• Output of a linear time-invariant system for a given input u(t) and impulse
response g(t):

    y(t) = ∫_{τ=0}^{∞} g(τ) u(t − τ) dτ.

• The Laplace transform of the impulse response {g(τ)}_{τ=0}^{∞} is called the
transfer function G(s):

    G(s) = (d0 s^na + d1 s^(nd−1) + · · · + d_nd) / (c0 s^nc + c1 s^(nc−1) + · · · + c_nc),

with nd ≤ nc (system is proper).
• We consider the output only at discrete times tk = kT, for k = 1, 2, ...:

    y(kT) = ∫_{τ=0}^{∞} g(τ) u(kT − τ) dτ.


Discrete-time models for the zoh measurement configuration (2)


• Due to zero-order-hold (zoh) conditions, the input u(t) is kept constant
between the sampling instants:

    u(t) = u_k,  kT ≤ t < (k+1)T.

• This yields:

    y(kT) = ∫_{τ=0}^{∞} g(τ) u(kT − τ) dτ = Σ_{l=1}^{∞} ∫_{τ=(l−1)T}^{lT} g(τ) u(kT − τ) dτ
          = Σ_{l=1}^{∞} ( ∫_{τ=(l−1)T}^{lT} g(τ) dτ ) u_{k−l} = Σ_{l=1}^{∞} g_T(l) u_{k−l},

where we define the (discrete) impulse response of that system:

    g_T(l) = ∫_{τ=(l−1)T}^{lT} g(τ) dτ.
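As a check on this definition, g_T(l) can be computed numerically. The sketch below is illustrative only (the example system G(s) = 1/(s + 1) and the helper names g and gT are assumptions, not part of the course material); for this G(s) the integral has the closed form e^(−(l−1)T) − e^(−lT):

```python
import math

T = 0.1  # sampling period (illustrative value)

def g(tau):
    # continuous-time impulse response of the example system G(s) = 1/(s + 1)
    return math.exp(-tau)

def gT(l, n=1000):
    # discrete impulse response gT(l) = integral of g(tau) over [(l-1)T, lT],
    # approximated with a midpoint Riemann sum
    lo = (l - 1) * T
    h = T / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

# closed form for this particular g(tau): exp(-(l-1)T) - exp(-lT)
for l in range(1, 6):
    exact = math.exp(-(l - 1) * T) - math.exp(-l * T)
    assert abs(gT(l) - exact) < 1e-8
```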


Discrete-time models for the zoh measurement configuration (3)


• We omit T:

    y(t) = Σ_{k=1}^{∞} g(k) u(t − k),  for t = 1, 2, ...

• The z-transform of the discrete impulse response {g(k)}_{k=1}^{∞} is called the
discrete-time transfer function G(z):

    G(z) = (b0 z^nb + b1 z^(nb−1) + · · · + b_nb) / (a0 z^na + a1 z^(na−1) + · · · + a_na),

with nb < na (due to the strict causality condition, i.e. g(k) = 0 for
k = 0).
• The relationship between the parameters of the transfer function of
continuous time system and the parameters of its zoh discrete-time
equivalent can be calculated using published tables or CACSD (MATLAB)
software.


Discrete-time models for the zoh measurement configuration (4)


• The previous result (g(k) = 0 for k = 0 and nb < na) is correct if the
continuous-time system G(s) is strictly proper, that is, nd < nc.
• If the continuous-time system is bi-proper, that is, nd = nc, both the continuous-
and discrete-time systems have a direct feedthrough between input and
output, and hence for the discrete-time system the orders of numerator and
denominator are (also) equal, nb = na, and g(k) ≠ 0 for k = 0.
• g(k) = 0 for k = 0 does not mean that the input is delayed by one
sampling time T. The effective delay is approximately T/2.
• To illustrate this, consider the following continuous-time system:

    G(s) = ωn² / (s² + 2ζωn s + ωn²),

with ωn = 2π rad/s and ζ = 0.7.


Discrete-time models for the zoh measurement configuration (5)


In following figure:
• Blue: the (continuous-time) impulse response of G(s)
• Red: the (continuous-time) response of G(s) to the ZOH realization of a
Dirac impulse, which is a block pulse of width T and height 1/T. T = 0.1 s.
• Red dots: Samples of red curve with sampling time T = 0.1 s.
• An effective delay of approximately T /2 = 0.05 s can be observed.


Timing diagram
• The derived ZOH relation assumes following timing of the input and
output:
[Timing diagram: at each sampling instant k the output y_k is measured, u is
computed, and the input u_k is applied within the same sampling interval;
time axis t/Ts.]


Timing diagram (2)


• In a feedback control configuration, a delay ∆k is introduced between
measurement of the output and sending out the control command (D/A
conversion). This time delay is due to the processing of the output
measurement to obtain the control command.
• This delay is not known in advance (depends on required processing time)
and can vary.
• In most applications this delay is small compared to sampling period and
hence can be neglected. For the project of this course, this delay can be
significant when implementing an (E)KF.
[Timing diagram: the computation of u introduces a delay ∆k between
measuring y_k and applying u_k within each sampling interval; time axis t/Ts.]

Timing diagram (3)


• The real-time software MicroOS used on the MECOtrons in your project
stores the control command calculated during discrete-time interval k in a
memory buffer until discrete-time instance k + 1.
[Timing diagram: u_k is computed during interval k but stored and only sent
out at instant k + 1; time axis t/Ts.]

• Advantage: the effective delay between measurement and control action is
constant and known.
• Disadvantage: this effective delay is increased to about 3T/2.
• Remark: this buffering operation also takes place during the identification
experiment.

Introducing disturbances
• We assume that the disturbances and noise can be lumped into an additive
term v(t) at the output:

    y(t) = Σ_{k=1}^{∞} g(k) u(t − k) + v(t).

• Sources of disturbances:
– Measurement noise: the sensors that measure the signals are subject to
noise and drift.
– Uncontrollable inputs: the system is subject to signals that have the
character of inputs, but are not controllable by the user.
• This model does not consider input disturbances, for example noise on the
measurements of the input data sequence.


Introducing disturbances (2)


• Time-domain identification approach to model disturbances:

    v(t) = Σ_{k=0}^{∞} h(k) e(t − k),

where e(t) is a sequence of independent (identically distributed) random
variables with a certain probability density function, and h(0) = 1.
• The mean value of e is equal to zero, yielding:

    E{v(t)} = Σ_{k=0}^{∞} h(k) E{e(t − k)} = 0.

• The covariance of v(t) equals:

    E{v(t) v(t − τ)} = σ² Σ_{k=0}^{∞} h(k) h(k − τ),

where σ² is the variance of e(t).



Introducing disturbances (3)


Introducing the forward shift operator q and the delay (backward shift) operator q⁻¹, e.g.

    q u(t) = u(t + 1),  q⁻¹ u(t) = u(t − 1),

yields:

    y(t) = Σ_{k=1}^{∞} g(k) u(t − k) + Σ_{k=0}^{∞} h(k) e(t − k)
         = Σ_{k=1}^{∞} g(k) q^{−k} u(t) + Σ_{k=0}^{∞} h(k) q^{−k} e(t)
         = [ Σ_{k=1}^{∞} g(k) q^{−k} ] u(t) + [ Σ_{k=0}^{∞} h(k) q^{−k} ] e(t)
         = G(q) u(t) + H(q) e(t),

with:

    G(q) = Σ_{k=1}^{∞} g(k) q^{−k},   H(q) = Σ_{k=0}^{∞} h(k) q^{−k}.

Prediction
The prediction of future outputs of a system is essential for the
development of the time-domain identification methods discussed here.
One-step-ahead prediction of v(t)

    v(t) = H(q) e(t) = Σ_{k=0}^{∞} h(k) e(t − k).

We assume that H(q) is stable:

    Σ_{k=0}^{∞} |h(k)| < ∞,

and define the z-transform of its impulse response:

    H(z) = Σ_{k=0}^{∞} h(k) z^{−k}.


One-step-ahead prediction of v(t) (2)


We assume that H(q) is invertible: that is, if v(s), s ≤ t, are known, then we
shall be able to compute e(t) as:

    e(t) = H̃(q) v(t) = Σ_{k=0}^{∞} h̃(k) v(t − k),

with H̃(q) stable, i.e. Σ_{k=0}^{∞} |h̃(k)| < ∞.
Then the function 1/H(z) is analytic in |z| ≥ 1 (which means that 1/H(z) is
stable, or that H(z) does not have zeros on or outside the unit circle):

    1/H(z) = Σ_{k=0}^{∞} h̃(k) z^{−k}.

Hence we can define:

    H̃(q) = H^{−1}(q) = Σ_{k=0}^{∞} h̃(k) q^{−k}.


One-step-ahead prediction of v(t) (3)


Assume now that we have observed v(s) for s ≤ t − 1 and that we want to
predict the value of v(t) (one step ahead).

    v(t) = e(t) + Σ_{k=1}^{∞} h(k) e(t − k).

The knowledge of v(s) for s ≤ t − 1 implies the knowledge of e(s) for s ≤ t − 1.

So replace e(t) by the value at which its p.d.f. has its maximum, yielding the
most probable value of v(t) (called the maximum a posteriori (MAP)
prediction):

    v̂(t|t − 1) = 0 + Σ_{k=1}^{∞} h(k) e(t − k)
               = [H(q) − 1] e(t)
               = ([H(q) − 1]/H(q)) v(t) = [1 − H^{−1}(q)] v(t).
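For a concrete check, assume a first-order moving-average noise model H(q) = 1 + c q⁻¹ (the value c = 0.5 and all variable names are arbitrary illustrations, not from the course notes). The inverse filter recovers the innovations, and the one-step prediction error indeed equals e(t):

```python
import random

random.seed(0)
c = 0.5                      # h(1); assumed noise model H(q) = 1 + c q^-1
N = 200
e = [random.gauss(0.0, 1.0) for _ in range(N)]   # innovations

# v(t) = e(t) + c*e(t-1)
v = [e[0]] + [e[t] + c * e[t - 1] for t in range(1, N)]

# recover the innovations with the inverse filter e(t) = v(t) - c*e(t-1),
# then predict v(t) from data up to t-1: vhat(t|t-1) = c*e(t-1)
e_rec = [v[0]]
for t in range(1, N):
    e_rec.append(v[t] - c * e_rec[t - 1])

for t in range(1, N):
    vhat = c * e_rec[t - 1]
    # the one-step prediction error equals the innovation e(t)
    assert abs((v[t] - vhat) - e[t]) < 1e-9
```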


One-step-ahead prediction of y(t)


Assume that y(s) and u(s) are known for s ≤ t − 1. Hence v(s) is known for
s ≤ t − 1:

    v(s) = y(s) − G(q)u(s).

Since G(q) is strictly causal (g(0) = 0), G(q)u(t) does not depend on u(t) itself,
and the one-step-ahead prediction of y(t) equals:

    ŷ(t|t − 1) = G(q)u(t) + v̂(t|t − 1)
               = G(q)u(t) + [1 − H^{−1}(q)] v(t)
               = G(q)u(t) + [1 − H^{−1}(q)] [y(t) − G(q)u(t)]
               = H^{−1}(q)G(q)u(t) + [1 − H^{−1}(q)] y(t).

The prediction error equals:

    ε(t|t − 1) = y(t) − ŷ(t|t − 1) = −H^{−1}(q)G(q)u(t) + H^{−1}(q)y(t) = e(t).

The variable e(t) represents that part of y(t) that cannot be predicted from
past data: the innovation at time t.

Discrete-time input-output models for linear time-invariant systems


• Rather than using g(k) and h(k), which contain an infinite number of
parameters, use parameterizations of G(q) and H(q) that are finite.
• They correspond to those used in the MATLAB System Identification
Toolbox.
• Extend the model with a parameter vector θ:

    y(t) = G(q, θ)u(t) + H(q, θ)e(t).

• Prediction error:

    ε(t|t − 1, θ) = y(t) − ŷ(t|t − 1, θ) = −H^{−1}(q, θ)G(q, θ)u(t) + H^{−1}(q, θ)y(t).

• ε(t|t − 1, θ) = e(t) (a sequence of independent random variables) if θ = θ0
(the exact parameter vector).


Equation error model structure


• ARX model structure
• ARMAX model structure
• Other equation-error-type and output-error-type model structures
We discuss only the ARX model structure!


ARX model structure


The simplest input-output model is a linear difference equation:

    y(t) + a1 y(t−1) + . . . + ana y(t−na) = b1 u(t−nk) + . . . + bnb u(t−nk−nb+1) + e(t).

The white-noise term e(t) enters as a direct error in the difference equation:
equation error model.
The model parameter vector:

    θ = [a1 a2 . . . ana b1 . . . bnb]^T.

If we introduce:

    A(q, θ) = 1 + a1 q^{−1} + . . . + ana q^{−na},
    B(q, θ) = b1 + b2 q^{−1} + . . . + bnb q^{−nb+1},

we get:

    A(q, θ)y(t) = B(q, θ)u(t − nk) + e(t).
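A minimal simulation of this difference equation with zero initial conditions can make the structure concrete. The helper name simulate_arx and all parameter values below are illustrative assumptions, not from the course notes:

```python
def simulate_arx(u, a, b, nk=1, e=None):
    """Simulate A(q)y(t) = B(q)u(t-nk) + e(t) with zero initial conditions.
    a = [a1, ..., a_na], b = [b1, ..., b_nb]."""
    N = len(u)
    e = e if e is not None else [0.0] * N
    y = [0.0] * N
    for t in range(N):
        acc = e[t]
        for i, ai in enumerate(a, start=1):      # autoregressive part
            if t - i >= 0:
                acc -= ai * y[t - i]
        for j, bj in enumerate(b, start=1):      # exogenous input part
            k = t - nk - (j - 1)
            if k >= 0:
                acc += bj * u[k]
        y[t] = acc
    return y

# unit-impulse input: the response is the discrete impulse response of G(q)
u = [1.0] + [0.0] * 9
y = simulate_arx(u, a=[-0.5], b=[1.0])   # y(t) = 0.5*y(t-1) + u(t-1)
assert y[0] == 0.0 and abs(y[1] - 1.0) < 1e-12 and abs(y[2] - 0.5) < 1e-12
```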


ARX model structure (2)


This corresponds to:

    G(q, θ) = q^{−nk} B(q, θ)/A(q, θ),   H(q, θ) = 1/A(q, θ).

ARX model: AR refers to the autoregressive part A(q, θ)y(t) and X to the
extra input B(q, θ)u(t − nk) (called the eXogenous variable in econometrics).


ARX model structure: linear regressor


One-step-ahead prediction:

    ŷ(t|θ) = B(q, θ)u(t − nk) + [1 − A(q, θ)]y(t).

Now introduce the vector:

    ϕ(t) = [−y(t−1) . . . −y(t−na) u(t−nk) . . . u(t−nk−nb+1)]^T.

Then:

    ŷ(t|θ) = θ^T ϕ(t) = ϕ^T(t)θ.

The model is linear in the parameters: linear regression model.
The vector ϕ(t) is known as the regression vector.
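A small numerical check, with illustrative orders na = nb = 2, nk = 1 and arbitrary data, that the regression form reproduces the written-out difference-equation prediction:

```python
# parameters for na = 2, nb = 2, nk = 1 (illustrative values)
a = [0.3, -0.1]              # a1, a2
b = [1.0, 0.5]               # b1, b2
theta = a + b                # theta = [a1 a2 b1 b2]^T

y = [0.2, -0.4, 0.9, 0.1]    # some past outputs
u = [1.0, 0.0, -1.0, 2.0]    # some past inputs
t = 3                        # predict y(t) from data up to t-1

# phi(t) = [-y(t-1), -y(t-2), u(t-1), u(t-2)]^T  (nk = 1)
phi = [-y[t - 1], -y[t - 2], u[t - 1], u[t - 2]]
yhat_regression = sum(p * th for p, th in zip(phi, theta))

# same prediction written out from the difference equation
yhat_direct = (-a[0] * y[t - 1] - a[1] * y[t - 2]
               + b[0] * u[t - 1] + b[1] * u[t - 2])
assert abs(yhat_regression - yhat_direct) < 1e-12
```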


Other model structures

Figure 1: ARMAX model structure


Figure 2: ARARMAX model structure


Other model structures (2)

Figure 3: Output error model structure


Figure 4: Box-Jenkins model structure


Time domain parameter estimation


Overview
• Parameter estimation
• Prediction error identification method (PEM)
• Least-squares parameter estimation for ARX models


Parameter estimation
• We have selected a certain set of candidate models:

    M* = {M(θ) | θ ∈ D_M}.

• Each model M(θ) represents a way of predicting future outputs:

    M(θ): ŷ(t|t − 1) = [1 − H^{−1}(q, θ)]y(t) + H^{−1}(q, θ)G(q, θ)u(t).

• We are also in the situation that we have collected a batch of data from the
system:

    Z^N = [y(1), u(1), y(2), u(2), . . . , y(N), u(N)]^T.

• Parameter estimation: mapping the data Z^N to the set D_M:

    Z^N → θ̂_N ∈ D_M.


Prediction error identification method (PEM)


General PEM formulation
• We consider that the essence of a model is its prediction aspect:

    ε(t, θ*) = y(t) − ŷ(t|θ*).

• The prediction error sequence for Z^N can be seen as a vector in R^N.
• The “size” (norm) of this vector can be taken as a measure for the
“quality” of a model.
• Let the prediction error sequence be filtered through a stable linear filter
L(q):

    ε_f(t, θ) = L(q)ε(t, θ),  1 ≤ t ≤ N.

Then use the following form:

    V_N(θ, Z^N) = (1/N) Σ_{t=1}^{N} l(ε_f(t, θ)),

where l(·) is a scalar-valued function.



General PEM formulation (2)


The estimate θ̂_N corresponds to:

    θ̂_N = arg min_{θ ∈ D_M} V_N(θ, Z^N).

This approach is called a prediction error identification method (PEM).

Quadratic norm
Take:

    l(ε) = (1/2)ε²,

and omitting the filter yields:

    V_N(θ, Z^N) = (1/N) Σ_{t=1}^{N} (1/2)ε²(t, θ),

with

    ε(t, θ) = H^{−1}(q, θ) [y(t) − G(q, θ)u(t)].


Least-squares parameter estimation for ARX models


Formulation of LSE
• Assume that the system that has to be identified behaves according to an
ARX model structure.
• Assume e(t) ∈ N(0, σ²): a sequence of independent zero-mean Gaussian
random variables:

    E{e(t)} = 0,
    E{e(t)e(t1)} = δ(t, t1)σ²,

where E{·} denotes the expected value and δ denotes the Kronecker delta.
• Prediction error:

ε(t|t − 1, θ) = y(t) − ŷ(t|t − 1, θ).


Formulation of LSE (2)


• For the ARX model, the prediction equals:

    ŷ(t|t − 1, θ) = ϕ^T(t)θ,

with

    ϕ(t) = [−y(t−1) . . . −y(t−na) u(t−nk) . . . u(t−nk−nb+1)]^T.

• Combining the quadratic criterion (omitting 1/N) with the prediction error
expression yields the following least-squares (LS) criterion:

    V_N(θ, Z^N) = Σ_{t=1}^{N} (1/2)(y(t) − ϕ^T(t)θ)².

• The so-called least-squares estimate (LSE) equals:

    θ̂_N^LS = arg min_θ V_N(θ, Z^N) = [ (1/N) Σ_{t=1}^{N} ϕ(t)ϕ^T(t) ]^{−1} (1/N) Σ_{t=1}^{N} ϕ(t)y(t).


Matrix formulation of the LSE


• Introduce the following (N − na)-dimensional column vectors:

    ŷ(θ) = [ŷ(na+1|θ) ŷ(na+2|θ) . . . ŷ(N|θ)]^T,
    y = [y(na+1) y(na+2) . . . y(N)]^T,

• and the following (N − na) × d matrix (d is the number of model parameters):

    Φ = [ϕ^T(na+1); ϕ^T(na+2); . . . ; ϕ^T(N)].

• The vector containing the output predictions then equals:

    ŷ(θ) = Φθ.

Matrix formulation of the LSE (2)


• The vector containing the prediction errors equals:

    ε(θ) = y − ŷ(θ).

• The LS criterion equals:

    V_N(θ, Z^N) = (1/2) ε(θ)^T ε(θ),

and the LSE equals:

    θ̂_N^LS = [Φ^T Φ]^{−1} Φ^T y = Φ⁺ y,

where Φ⁺ denotes the pseudo-inverse of Φ.
• The output predictions for the LSE equal:

    ŷ(θ̂_N^LS) = Φ [Φ^T Φ]^{−1} Φ^T y.

• Calculate the pseudo-inverse using the singular value decomposition.


Consistency of the LSE


• A parameter estimate is consistent if the estimate θ(Z^N) converges in
probability to the true parameter vector, indicated as θ0:

    plim_{N→∞} θ(Z^N) = plim θ(Z^N) = θ0,

which means that the probability P(|θ(Z^N) − θ0| > ε) → 0 for N → ∞, for
every ε > 0.


Consistency of the LSE (2)


• What if the innovations do not satisfy these conditions? Consider:

    y(t) = ϕ^T(t)θ0 + ν0(t).

{ν0(t)} is called the regression-equation error sequence.
• Introduce the matrices:

    R(N) = (1/N) Σ_{t=1}^{N} ϕ(t)ϕ^T(t)

and

    f(N) = (1/N) Σ_{t=1}^{N} ϕ(t)ν0(t).

• The LSE equals:

    θ̂_N^LS = R(N)^{−1} (1/N) Σ_{t=1}^{N} ϕ(t)[ϕ^T(t)θ0 + ν0(t)]
            = θ0 + R(N)^{−1} f(N).

Consistency of the LSE (3)


• The probability limits of these matrices (we assume they exist) are:

    R* = plim R(N),
    h* = plim f(N).

• The LSE has a probability limit:

    plim θ̂_N^LS = plim (θ0 + R(N)^{−1} f(N))
                = θ0 + plim R(N)^{−1} plim f(N)
                = θ0 + (R*)^{−1} h*.


Consistency of the LSE (4)


• For the LSE to be consistent, we thus have to require that:
(i) R* is nonsingular.
(ii) h* = 0. This will be the case if either:
(iia) {ν0(t)} is a sequence of independent random variables with zero
mean values: E{ϕ(t)ν0(t)} = 0.
(iib) The input sequence {u(t)} is independent of the zero-mean noise
sequence {ν0(t)} and na = 0.
• In cases (i) and (iia) it can be shown that the random variable

    √N (θ̂_N^LS − θ0)

converges in distribution to the normal distribution with zero mean and
covariance matrix λ0 (R*)^{−1}, where λ0 is the variance of ν0(t). Experiment
design issues therefore deal with the problem of making R* “large” subject
to given constraints.

Example
• Consider the following system (not an ARX model structure!):

    y(t) = (B(q)/A(q)) u(t − 1) + e(t),

with

    A(q) = 1 + a1 q^{−1},
    B(q) = b1 + b2 q^{−1},

which can be rewritten as:

    y(t) + a1 y(t − 1) = b1 u(t − 1) + b2 u(t − 2) + e(t) + a1 e(t − 1).

• This relation can be written as y(t) = ϕ^T(t)θ0 + ν0(t), with:

    ϕ(t) = [−y(t − 1) u(t − 1) u(t − 2)]^T,
    ν0(t) = e(t) + a1 e(t − 1).

Example (2)
• The sequence ν0(t) is not a sequence of independent random variables. For
example, ν0(t) and ν0(t − 1) are related since they both depend on e(t − 1).
As a result, ν0(t) is also related to y(t − 1), i.e. to some of the
elements of ϕ(t).
• In matrix form:

    [ y(3) ]   [ −y(2)    u(2)    u(1)   ]          [ ν0(3) ]
    [ y(4) ] = [ −y(3)    u(3)    u(2)   ] [ a1 ]   [ ν0(4) ]
    [  ...  ]   [   ...      ...       ...    ] [ b1 ] + [  ...   ]
    [ y(N) ]   [ −y(N−1)  u(N−1)  u(N−2) ] [ b2 ]   [ ν0(N) ]


Practical issues of least-squares ARX model parameter estimation

Overview
• Frequency domain interpretation of the quadratic prediction error criterion
• Low-pass data filtering to improve parameter estimation
• Iterative weighted least squares approach
• Identifying a partially known system


Frequency domain interpretation of the quadratic prediction error criterion

Frequency domain interpretation: take the DFT of the prediction error and
apply Parseval’s theorem:

    V_N(θ, Z^N) = (1/N²) Σ_{k=0}^{N−1} (1/2)|E_N(2πk/N, θ)|²
                = (1/N²) Σ_{k=0}^{N−1} (1/2)|Ĝ_N(e^{j2πk/N}) − G(e^{j2πk/N}, θ)|² Q_N(e^{j2πk/N}, θ) + R_N,

with

    Q_N(2πk/N, θ) = |U_N(e^{j2πk/N})|² / |H(e^{j2πk/N}, θ)|²,   R_N ≤ C/√N.

Ĝ_N(e^{j2πk/N}) is called the empirical transfer function estimate (ETFE) and
equals:

    Ĝ_N(e^{j2πk/N}) = Y_N(e^{j2πk/N}) / U_N(e^{j2πk/N}).

Frequency domain interpretation for ARX model structure


For the ARX model structure,

    H(q, θ) = 1/A(q, θ),

yielding:

    V_N(θ, Z^N) ≈ (1/N²) Σ_{k=0}^{N−1} (1/2)|Ĝ_N(e^{j2πk/N}) − G(e^{j2πk/N}, θ)|² |U_N(e^{j2πk/N})|² |A(e^{j2πk/N}, θ)|².

The difference between the empirical transfer function estimate and the
frequency response function of the model is weighted by two terms: the
periodogram of the input and the squared magnitude of the frequency
response of the denominator of the model.


Low-pass data filtering to improve parameter estimation


Example
Consider:

    B(q)/A(q) = (b1 q^{−1} + b2 q^{−2}) / (1 + a1 q^{−1} + a2 q^{−2})
              = (0.0484 q^{−1} + 0.0479 q^{−2}) / (1 − 1.8727 q^{−1} + 0.9691 q^{−2}),

which is the zero-order-hold discrete-time equivalent, for a sampling rate
fs = 100 Hz, of a second-order continuous-time system with undamped natural
frequency fn = 5 Hz, damping ratio ζ = 0.05 and a DC gain equal to one.


Example (2)
Excited with a step-up and step-down excitation:

[Figure 5: Input excitation for the second-order system: step-up and step-down
signal (input vs. time [samples]).]


Example (3)
Gaussian noise is added to the simulated output:

    y(t) = (B(q)/A(q)) u(t) + e(t),

with e(t) ∈ N(0, σ²), σ = 0.05.


[Figure 6: Noise-free (red) and noisy (blue) response to the step-up and
step-down input excitation (output vs. time [samples]).]


Example (4)
LSE:

    y = [y(2) y(3) . . . y(999)]^T,

    Φ = [ −y(1)    −y(0)    u(1)    u(0)  ;
          −y(2)    −y(1)    u(2)    u(1)  ;
            ...       ...      ...     ...  ;
          −y(998)  −y(997)  u(998)  u(997) ],

and

    θ̂_LS = [â1 â2 b̂1 b̂2]^T = Φ⁺ y.


Example (5)
Inaccurate results are obtained, due to the high-frequency weighting introduced by the
denominator term |A(e^{j2πk/N}, θ)|².


[Figure 7: Magnitude [dB] and phase [degrees] versus frequency [Hz] of the
exact model (red) and the identified model (blue).]


Example (6)
• Next, the identification is repeated after filtering the input and output data
with the same low-pass filter.
• A second-order Butterworth filter with a cut-off frequency of 7 Hz is used. Both
input and output signals are filtered with the same filter, such that the
relation between the filtered signals is not altered.

    V_N(θ, Z^N) ≈ (1/N²) Σ_{k=0}^{N−1} (1/2)|Ĝ_N − G(θ)|² |U_N(e^{j2πk/N})|² |A(e^{j2πk/N}, θ)|² |H_f(e^{j2πk/N})|²,

with H_f(e^{j2πk/N}) the frequency response of the applied filter at the
considered frequencies.
• In this criterion, the high-pass characteristic of A(e^{j2πk/N}, θ) is
compensated by the low-pass characteristic of H_f(e^{j2πk/N}).
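That filtering input and output with the same filter leaves their relation unchanged can be verified directly for finite impulse responses, since convolution is associative and commutative. All sequences below are arbitrary illustrations:

```python
def conv(x, h):
    """Full discrete convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

g = [0.0, 1.0, 0.5, 0.25]        # a (truncated) system impulse response
f = [0.25, 0.5, 0.25]            # an illustrative low-pass FIR filter
u = [1.0, -1.0, 2.0, 0.0, 1.0]   # arbitrary input

y = conv(u, g)                   # system output
yf = conv(y, f)                  # filtered output
uf = conv(u, f)                  # filtered input
y_from_uf = conv(uf, g)          # system driven by the filtered input

# filtering u and y with the same filter leaves the u -> y relation unchanged
for p, q in zip(yf, y_from_uf):
    assert abs(p - q) < 1e-9
```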


Example (7)
Improved result:


[Figure 8: Magnitude [dB] and phase [degrees] versus frequency [Hz] of the
exact model (red) and the identified models obtained without data filtering
(blue) and with data filtering (green) respectively.]

Example (8)


Figure 9: Time-domain evaluation: exact output (red) and simulated output with the identified models obtained without data filtering (blue) and with data filtering (green), respectively. Axes: output versus time [samples].

Iterative weighted least-squares approach


Sanathanan-Koerner procedure
Step 1: Identify the model from the measured data, using either no filter or a
simple low-pass filter. The obtained model is denoted as B(q, θ̂_1)/A(q, θ̂_1).
Step 2: Repeat the identification using the measured data filtered with the
following low-pass filter obtained from the denominator of the model
obtained in step 1: Hf1 (q) = 1/A(q, θ̂ 1 ). The obtained model is denoted as:
B(q, θ̂ 2 )/A(q, θ̂ 2 ).
Step 3: . . .
Step i: Repeat the identification using the measured data filtered with the
following low-pass filter obtained from the denominator of the model
obtained in step i − 1: Hf(i−1) (q) = 1/A(q, θ̂ (i−1) ). The obtained model is
denoted as: B(q, θ̂ i )/A(q, θ̂ i ).
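The steps above can be sketched as a short iteration loop. This is an illustrative sketch: the helper names (`arx_ls`, `sk_iterations`), model orders and test data are assumptions, and the ARX fit assumes a unit input delay.

```python
import numpy as np
from scipy import signal

def arx_ls(u, y, na, nb):
    """One least-squares ARX fit (unit delay); returns polynomials (a, b)."""
    n = max(na, nb)
    N = len(y)
    cols = [-y[n - i:N - i] for i in range(1, na + 1)] \
         + [u[n - i:N - i] for i in range(1, nb + 1)]
    theta, *_ = np.linalg.lstsq(np.column_stack(cols), y[n:], rcond=None)
    a = np.concatenate(([1.0], theta[:na]))   # A(q) = 1 + a1 q^-1 + ...
    b = np.concatenate(([0.0], theta[na:]))   # B(q) = b1 q^-1 + ...
    return a, b

def sk_iterations(u, y, na, nb, n_iter=3):
    """Sanathanan-Koerner: re-estimate after filtering both signals
    with 1/A(q, theta) from the previous step."""
    a_hat = np.array([1.0])                   # step 1: no data filter
    for _ in range(n_iter):
        u_f = signal.lfilter([1.0], a_hat, u)   # H_f = 1/A(q, theta_prev)
        y_f = signal.lfilter([1.0], a_hat, y)
        a_hat, b_hat = arx_ls(u_f, y_f, na, nb)
    return a_hat, b_hat

# Noise-free sanity check: data generated by a known second-order model
rng = np.random.default_rng(0)
u = rng.standard_normal(2000)
a_true = np.array([1.0, -1.5, 0.7])
b_true = np.array([0.0, 1.0, 0.5])
y = signal.lfilter(b_true, a_true, u)
a_hat, b_hat = sk_iterations(u, y, na=2, nb=2)
```

Note that the filter 1/A(q, θ̂) is only usable when the current denominator estimate is stable; in practice the estimated poles may need to be reflected inside the unit circle before filtering.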


Sanathanan-Koerner procedure (2)


• Iteration step i:

V_N(θ, θ̂_(i−1), Z^N) ≈ (1/N²) Σ_{k=0}^{N−1} ½ |Ĝ_N − G(θ)|² |U_N(e^{j2πk/N})|² · |A(e^{j2πk/N}, θ)|² / |A(e^{j2πk/N}, θ̂_(i−1))|².

• If the iterative procedure converges, so that at a certain point no further model improvement is obtained, then A(e^{j2πk/N}, θ̂_(i−1)) comes arbitrarily close to A(e^{j2πk/N}, θ̂_i), and the high-frequency emphasis introduced by A(e^{j2πk/N}, θ̂_i) is completely compensated.


Example revisited


Figure 10: Magnitude and phase of the exact model (red) and the identified models obtained in the first (blue), second (green) and third (yellow) step of the iterative procedure, respectively. Axes: magnitude [dB] and phase [degrees] versus frequency [Hz].

Example revisited (2)


Figure 11: Time-domain evaluation: difference between the exact model output and the simulated output with the model identified with low-pass data filtering (green) and the model from the third step of the iterative procedure (yellow), respectively. Axes: simulated output error versus time [samples].

Identifying a partially known system


Approach: scheme 1
• Assume that some poles and/or zeros of the system are known:
G(q, θ) = G_u(q, θ) G_k(q) = q^{-nk} (B_u(q, θ)/A_u(q, θ)) (B_k(q)/A_k(q)),

with B_k(q)/A_k(q) the known part of the system model, and q^{-nk} B_u(q, θ)/A_u(q, θ) the unknown part of the system model, dependent on the parameter vector θ.
• Introducing the ARX model structure:

Ak (q)Au (q, θ)y(t) = Bk (q)Bu (q, θ)u(t − nk ) + e(t).


Approach: scheme 1 (2)


• Now filter y(t) with A_k(q) and u(t) with B_k(q):

y*(t) = A_k(q) y(t),
u*(t) = B_k(q) u(t).

This yields a similar ARX model structure, now containing only the unknown part of the system model:

A_u(q, θ) y*(t) = B_u(q, θ) u*(t − nk) + e(t),

which is equivalent to:

y*(t) = (B_u(q, θ)/A_u(q, θ)) u*(t − nk) + (1/A_u(q, θ)) e(t),

and is referred to as scheme 1.
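The two filtering operations of scheme 1 are plain FIR filters and can be sketched in Python. The known part used here (a pure integrator, A_k(q) = 1 − q^{-1} with B_k(q) = q^{-1}) and the stand-in signals are assumptions for illustration.

```python
import numpy as np
from scipy import signal

A_k = [1.0, -1.0]   # A_k(q) = 1 - q^-1 (known pole at z = 1)
B_k = [0.0, 1.0]    # B_k(q) = q^-1

rng = np.random.default_rng(2)
u = rng.standard_normal(1000)
y = np.cumsum(u)    # stand-in output: a pure integration of the input

# Scheme 1: y*(t) = A_k(q) y(t) and u*(t) = B_k(q) u(t)
y_star = signal.lfilter(A_k, [1.0], y)
u_star = signal.lfilter(B_k, [1.0], u)

# In this toy case the known part IS the whole system, so y*(t) = u(t);
# the unknown part B_u/A_u left to identify from (u*, y*) is trivial.
```

In a real application (u*, y*) would next be passed to the standard least-squares ARX estimator to identify B_u(q, θ)/A_u(q, θ).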


Approach: scheme 2
• Filter only the input with the known part of the model:

u*(t) = (B_k(q)/A_k(q)) u(t),

yielding:

y(t) = (B_u(q, θ)/A_u(q, θ)) u*(t − nk) + (1/(A_k(q) A_u(q, θ))) e(t).

The corresponding "ARX" model structure equals:

A_u(q, θ) y(t) = B_u(q, θ) u*(t − nk) + (1/A_k(q)) e(t).


Approach: scheme 3
• Filter the output with the inverse of the known part of the model:

y*(t) = (A_k(q)/B_k(q)) y(t),

yielding:

y*(t) = (B_u(q, θ)/A_u(q, θ)) u(t − nk) + (1/(B_k(q) A_u(q, θ))) e(t).

The corresponding "ARX" model structure equals:

A_u(q, θ) y*(t) = B_u(q, θ) u(t − nk) + (1/B_k(q)) e(t).


Frequency domain interpretation

V_N(θ, Z^N) ≈ (1/N²) Σ_{k=0}^{N−1} ½ |Y*_N/U*_N − G_u(θ)|² |U*_N(e^{j2πk/N})|² |A_u(e^{j2πk/N}, θ)|²,

with
• for scheme 1: u* = B_k u and y* = A_k y,
• for scheme 2: u* = (B_k/A_k) u and y* = y,
• and for scheme 3: u* = u and y* = (A_k/B_k) y.


Remarks
• If data filtering is applied to improve the accuracy of the parameter estimate, it should compensate for the high-frequency weighting |A_u(e^{j2πk/N}, θ)|.
• Depending on the applied scheme, |U_N(e^{j2πk/N})| can also emphasize certain frequencies and should be checked.
• In some situations, low-frequency distortions (DC offset and/or drift) are present on the measurements, also on the inputs. These can be amplified by the pre-filtering with the known part of the model, e.g. if the system contains a pure integration or differentiation. In that case, apply band-pass filters to remove these low-frequency distortions.


Example
Consider the system:

B(q)/A(q) = (b1 q^{-1} + b2 q^{-2} + b3 q^{-3}) / ((1 − q^{-1})(1 + a1 q^{-1} + a2 q^{-2}))
          = (0.1656·10⁻³ q^{-1} + 0.6580·10⁻³ q^{-2} + 0.1651·10⁻³ q^{-3}) / ((1 − q^{-1})(1.0000 − 1.8962 q^{-1} + 0.9937 q^{-2})),

which is the zero-order-hold discrete-time equivalent, for a sampling rate fs = 100 Hz, of a second-order continuous-time system with undamped natural frequency fn = 5 Hz and damping ratio ζ = 0.01, augmented with an integrator.
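These discrete-time coefficients can be reproduced with SciPy's `cont2discrete`. The continuous-time transfer function G(s) = ωn²/(s(s² + 2ζωn s + ωn²)), a unit-DC-gain second-order resonance times an integrator, is an assumption of this sketch, chosen to be consistent with the stated fn, ζ, fs and the printed coefficients.

```python
import numpy as np
from scipy import signal

fs = 100.0                   # sampling rate [Hz]
fn, zeta = 5.0, 0.01         # undamped natural frequency [Hz], damping ratio
wn = 2 * np.pi * fn

# Assumed continuous-time model: G(s) = wn^2 / (s (s^2 + 2 zeta wn s + wn^2))
num = [wn ** 2]
den = np.polymul([1.0, 0.0], [1.0, 2 * zeta * wn, wn ** 2])

# Zero-order-hold equivalent at sampling period T = 1/fs
num_d, den_d, _ = signal.cont2discrete((num, den), 1 / fs, method='zoh')
num_d = np.squeeze(num_d)    # cont2discrete returns a 2-D numerator array

# den_d matches the expansion of (1 - q^-1)(1 - 1.8962 q^-1 + 0.9937 q^-2)
```

The integrator pole s = 0 maps to the discrete pole z = 1, which is why the factor (1 − q^{-1}) appears in the discrete denominator.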


Example (2)
The discrete-time system is excited with a random excitation:

Figure 12: Input excitation for the second-order system. Axes: input versus time [samples].


Example (3)
Gaussian noise is added to the simulated output:

y(t) = (B(q)/A(q)) u(t) + e(t),

with e(t) ~ N(0, σ²) and σ = 0.01.
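Generating such a noisy data set can be sketched as follows. A stable stand-in model is used here instead of the marginally stable example system (the integrator pole at z = 1 would make a long open-loop simulation drift); only the noise level σ = 0.01 is taken from the example.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(3)
u = rng.standard_normal(1000)          # random excitation

# Illustrative stable stand-in for B(q)/A(q)
b = [0.0, 1.0, 0.5]                    # B(q) = q^-1 + 0.5 q^-2
a = [1.0, -1.5, 0.7]                   # A(q) = 1 - 1.5 q^-1 + 0.7 q^-2

sigma = 0.01                           # noise standard deviation, as in the example
y = signal.lfilter(b, a, u) + sigma * rng.standard_normal(1000)
```

Note that the noise is added to the output after the simulation (an output-error setting), so a plain ARX fit on (u, y) will be slightly biased.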


Figure 13: Noise-free (red) and noisy (blue) response to the step-up and step-down input excitation. Axes: output versus time [samples].


Example (4)
The known and unknown parts of the model equal:

B_k(q)/A_k(q) = q^{-1}/(1 − q^{-1}),
B_u(q, θ)/A_u(q, θ) = (b1 + b2 q^{-1} + b3 q^{-2})/(1 + a1 q^{-1} + a2 q^{-2}).

We apply scheme 1:

u*(t) = q^{-1} u(t) = u(t − 1),
y*(t) = (1 − q^{-1}) y(t) = y(t) − y(t − 1).

We apply either low-pass data filtering or the Sanathanan-Koerner approach.


Example (5)


Figure 14: Magnitude and phase of the exact model (red), the identified model using low-pass filtered data (blue), and the model obtained after one Sanathanan-Koerner step (green). Axes: magnitude [dB] and phase [degrees] versus frequency [Hz].

Example (6)


Figure 15: Magnitude and phase of the exact model (red) and the identified models using low-pass filtered data, with the a priori knowledge taken into account (blue) and without accounting for the a priori knowledge (green). Axes: magnitude [dB] and phase [degrees] versus frequency [Hz].

Example (7)


Figure 16: Time-domain evaluation: comparison of the simulated outputs of three models for the step-up and step-down input signal: (1) the exact model (red), (2) the model with a pole at z = 1 (a priori system knowledge taken into account) (blue), and (3) the model with a pole at z = 0.9987 (full model identified) (green). Axes: simulated output versus time [samples].

Conclusions: revisit the objectives


• The student has basic knowledge on system identification in general and on
least-squares time-domain identification (LS-TDI) in particular.
• The student is able to apply the LS-TDI procedure to linear time-invariant
systems with one input and one output. This includes selection of model
structure, evaluation of model accuracy, application of simple measures to
improve model accuracy.

