Material: Part I: Introduction To System Identification

sc4110

X. Bombois
P.M.J. Van den Hof
Lecture hours (see schedule for details):
• Monday (15:45-17:30) in Room D (3Me)
• Thursday (15:45-17:30) in Room D (3Me)
• Friday (13:45-15:30) in Room A (API)
Uses of a model:
• simulation
• diagnosis / fault detection

Data-based modeling: determine a model of the dynamical relation existing between the voltage u(t) driving a motor and the angular speed y(t) of the rotor.
System Identification: Part I

How to proceed?

• Excite the system by applying the following sequence for the voltage u(t) during 20 seconds:

(figure: applied voltage u(t) [V] versus time (sec), 0-20 s, ranging between -6 and 6 V)

applied u(t) → motor dynamics → measured y(t)
• Measure the resulting angular speed y(t)

(figure: rotor speed y(t) [rad/s] versus time (sec), 0-20 s)

• Identify a model from these data; the residual (filtered ε(t); see later) quantifies the modeling error

(figure: frequency response, magnitude (abs) versus frequency (rad/sec), of the identified model, from u1 to y1)

Note: ε(t) contains not only the model inaccuracy, but also the noise acting on the system.
However, data-based modeling is often as important as first-principle modeling.

Example: an arm driven by the current i(t) of a motor:

θref(t) → controller → i(t) → positioning mechanism → θ(t)

⇒ a model is needed. The controller designed with this physical model could not achieve the desired bandwidth without inducing vibrations.
Data-based modeling: wind turbine blades

The blades of a wind turbine are subject to high vibration loads due to wind gusts, periodic rotations, ...

For control design, we need a model of the dynamics between the pitch and flap actuators and the strain in the blade (including the actuator dynamics).

First-principle modeling: a model based on aerodynamic and mechanical laws — a linear model of order 28 — has many physical parameters to determine ⇒ high uncertainty.

(figure: magnitude and phase [deg] versus frequency [Hz] of the model, from the pitch actuator [deg] and from smart actuators 1&2 [V]; the 1st and 2nd flapping modes are visible)

Note: the parameters of the first-principle model have in fact been tuned with the identified model.
Data-based modeling: mobile telephony

An antenna on the ground emits a signal u(t); a mobile phone receives a signal y(t). The received signal is made up of several delayed versions of the emitted signal u(t), plus noise:

y(t) = g1 u(t - n1) + g2 u(t - n2) + ... + noise

⇒ distorted signal. A model of the so-called channel is required to reconstruct u(t) from the distorted y(t).

This model cannot be determined in advance, since the position of the mobile phone is mobile (by definition): the model is identified at each received call.

(figure: a known sequence and the signal of interest, both distorted by the channel, versus time)

Denote by yknown(t) and yinterest(t) the received signals corresponding to uknown(t) and uinterest(t), respectively.
This model can then be used to determine an appropriate filter to reconstruct uinterest(t) from yinterest(t).

Drawbacks of first-principle modeling:
• the model is generally more complicated than with system identification
• missing actuator/sensor dynamics and phenomena can be forgotten
• sometimes impossible to determine (as in example 3, but also in the process industry)
• no disturbance model

System identification delivers, from the measured u(t) and y(t), a model of the transfer G0 (the contribution of u(t) to y(t), i.e. G0 u(t)) and of the disturbance v(t).

The identification loop:

Experiment design → Data → Identification (with a Model Set and a Criterion) → Model → Validate against the intended model application; if NOT OK, redo the loop; if OK, stop.

Criterion: measures the "distance" between a data set (u, y), t = 1,...,N, and a particular model. In this course, we will consider two criteria:
• the Empirical Transfer Function Estimate (ETFE), delivering an estimate of the frequency response of G0
• prediction error methods, delivering a discrete-time transfer function as model of G0
Experiment Design

• Choice of the type of excitation:
  - sum of sinusoids (multisine)
  - realization of (filtered) white noise or alike
• Which frequency content?
• Which duration?

Model validation

• Comparing the actual output of the system with the output predicted by the model
• Determining the uncertainty of the system, e.g. in the frequency domain

(figure: amplitude and phase of a frequency response with uncertainty bands, versus frequency)
• Basic principle (least squares, LS) from Gauss (1809): data → model
• Development based on theories of
  - stochastic processes
  - statistics
• Strong growth in the sixties and seventies: Åström and Bohlin (1965), Åström and Eykhoff (1971)
• Brought to technological tools in the nineties (Matlab toolboxes for either time-domain or frequency-domain identification), as well as to professional industrial control packages (Aspen, SMOC-PRO, IPCOS, Tai-Ji Control, AdaptX, ...)

Model-based control: data → model → controller, with the controller connected in feedback with the process (reference, input, output and disturbance signals).

(figure: block diagrams of the identification and feedback control configurations)
Variance of an estimate θ̂N: cov(θ̂N) = E(θ̂N - E θ̂N)(θ̂N - E θ̂N)^T

(figure: three estimate distributions — middle: a biased estimate with small variance; right: an unbiased estimate with large variance)
2. Discrete-time systems

u → ZOH (Ts) → ucont → Continuous system → ycont → Sampling (Ts) → y

The system is excited via the discrete sequence u(t), t = 0, 1, 2, ..., generated by a PC. This discrete signal is made continuous by the Zero Order Hold (ZOH):

ucont(tc) = u(t)  for  t Ts ≤ tc < (t+1) Ts

The continuous output ycont(tc) (tc ∈ R) of the system is sampled with sampling time Ts. This sampling delivers the discrete measurements y(t), t = 0, 1, 2, ..., i.e. y(t) = ycont(t Ts).

Example: sampling time Ts = 0.04 s and input sequence u(t) = 0 for 0 ≤ t ≤ 2, ..., u(t) = 0.5 for 18 ≤ t ≤ 40.

(figure: the staircase signal ucont(tc) versus time (s), 0-1.6 s)
The continuous output signal is then sampled with a sample period Ts = 0.04 s (upper plot, blue circles). This delivers the discrete sequence y(t) of 41 samples (t = 0...40) (bottom plot).

(figure: ycont(tc) with the sampling instants versus time (s); the samples y(t) versus sample index, 0-40)

Discrete-time transfer function

Does there exist a transfer function relation between y(t) and u(t)?

u → ZOH (Ts) → Continuous system → Sampling (Ts) → y   ⟺   u → G0(z) → y

G0(z) ≜ Y(z)/U(z) = ( Σ_{t=-∞}^{+∞} y(t) z^{-t} ) / ( Σ_{t=-∞}^{+∞} u(t) z^{-t} )

When G0(s) = a/(s+a) and Ts = 0.04 s, the discrete-time transfer function between y(t) and u(t) is

G0(z) = (1-b) z^{-1} / (1 - b z^{-1})   with b = e^{-a Ts}

Thus:

G0(s) = 10/(s+10)  ⟷  G0(z) = 0.33 z^{-1} / (1 - 0.67 z^{-1})

G0(z) can be computed from G0(s) with the function c2d.m of Matlab (ZOH methodology)
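The G0(s) ↔ G0(z) pair above can be cross-checked numerically. A minimal sketch in Python, where scipy's cont2discrete plays the role of Matlab's c2d.m (the numerical values are those of the example):

```python
import numpy as np
from scipy.signal import cont2discrete

# G0(s) = 10/(s+10), sampled with Ts = 0.04 s using a zero-order hold
num_d, den_d, dt = cont2discrete(([10.0], [1.0, 10.0]), dt=0.04, method='zoh')

# theory: G0(z) = (1-b) z^{-1} / (1 - b z^{-1}) with b = e^{-a Ts}
b = np.exp(-10.0 * 0.04)          # ≈ 0.6703, so 1-b ≈ 0.3297
```

The denominator returned is [1, -b] and the numerator [0, 1-b], matching the 0.33/0.67 values quoted above.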
Proof: suppose u(t) is a discrete step; then ucont(tc) is a continuous step. The step response of G0(s) = a/(s+a) is, for tc > 0, 1 - e^{-a tc}, so that

y(t) = ycont(t Ts) = 1 - e^{-a t Ts} = 1 - b^t

and therefore

G0(z) = Y(z)/U(z) = ( 1/(1-z^{-1}) - 1/(1-b z^{-1}) ) / ( 1/(1-z^{-1}) ) = (1-b) z^{-1} / (1 - b z^{-1})

Time delays carry over as well: for G0(s) = e^{-αs} 10/(s+10) with α = β Ts (β an integer number of sampling intervals), the corresponding rational discrete transfer function is z^{-β} 0.33 z^{-1} / (1 - 0.67 z^{-1}).

Properties of the discrete-time transfer function

With some abuse of notation, we will write y(t) = G0(z) u(t), where z^{-1} is the delay operator:

z^{-1} u(t) ≜ u(t-1)

Example:

y(t) = ( b z^{-1} / (1 - a z^{-1}) ) u(t)  ⟺  y(t) - a y(t-1) = b u(t-1)
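The equivalence between the transfer-function form and the difference equation can be verified numerically; a small sketch (the values a = 0.67, b = 0.33 are taken from the example above; scipy's lfilter implements the transfer-function form):

```python
import numpy as np
from scipy.signal import lfilter

a_, b_ = 0.67, 0.33               # G0(z) = b z^{-1} / (1 - a z^{-1})
rng = np.random.default_rng(0)
u = rng.standard_normal(50)

# transfer-function form: y(t) = G0(z) u(t)
y_filt = lfilter([0.0, b_], [1.0, -a_], u)

# explicit recursion: y(t) = a*y(t-1) + b*u(t-1), with u(t<0) = y(t<0) = 0
y_rec = np.zeros_like(u)
for t in range(1, len(u)):
    y_rec[t] = a_*y_rec[t-1] + b_*u[t-1]
```

Both computations produce the same output sequence.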
G0(z) can be written as an (infinite) impulse response by dividing the numerator of G0(z) by its denominator:

G0(z) = Σ_{k=0}^{∞} g0(k) z^{-k}

A transfer function is stable ⟺ the poles of G0(z) are all located within the unit circle.

Example: b z^{-1} / (1 - a z^{-1}) is stable ⟺ |a| < 1. Indeed, z = a is the unique pole of b z^{-1} / (1 - a z^{-1}).

The frequency response of G0(z) is given by the transfer function evaluated at z = e^{jω}, i.e. on the unit circle: G0(z = e^{jω}). Only the frequency response between [0 π] is relevant. A discrete frequency ω ∈ [0 π] corresponds to the actual frequency ωactual = ω/Ts (ωactual lies within the interval between 0 and the Nyquist pulsation).
General interpretation: for u(t) = sin(ω0 t) (t = -∞...+∞), the output is

y(t) = |G0(e^{jω0})| sin( ω0 t + ∠G0(e^{jω0}) )

(figure: Bode plot — magnitude (dB) and phase (deg) versus frequency (rad/sec))

See the end of the course for methodologies to choose Ts. For systems with several operating points, a model can be identified for each of them (and coupled with a scheduling function).
Input u(t): e.g. a multisine or a realization of (filtered) white noise; output y(t): a stochastic signal.

(figures: a stochastic signal, a multisine and a realization of (filtered) white noise, each versus time, 0-100)
Observations

The signal y(t) can be made up of a combination of stochastic and deterministic signals (e.g. when u(t) is a multisine). A new theory is necessary to deal with such signals, called quasi-stationary signals (see later).

(figure: an ensemble of three realizations of a stochastic process versus time; the realizations differ, BUT each realization has the same power content over ω, i.e. the same Φ(ω))

Stationarity also implies that the mean of the signal and the auto-correlation function are time-invariant.

A quasi-stationary signal is a finite-power signal which can be a combination of stochastic and deterministic components. In identification, the deterministic signals are the multisines. The analysis is very close to the one of stationary stochastic signals (see WB2310 S&R3).
⇒ New operator Ē:

Ē u(t) ≜ lim_{N→∞} (1/N) Σ_{t=1}^{N} E u(t)

Power spectrum:

Φu(ω) = Σ_{τ=-∞}^{+∞} Ru(τ) e^{-jωτ}   with   Ru(τ) ≜ Ē( u(t) u(t-τ) )

Example 1: Φu(ω) and Pu when u(t) is a white noise of variance σu²?

Ru(τ) = lim_{N→∞} (1/N) Σ_{t=1}^{N} E( u(t) u(t-τ) ) = E( u(t) u(t-τ) )   (by stationarity)
      = σu² when τ = 0,  0 when τ ≠ 0

⇒ Φu(ω) = σu² for all ω ∈ [-π π], and Pu = Ru(0) = σu²

(figure: the flat spectrum Φu(ω) = σu² over ω ∈ [-π π])
Example 2: Φu(ω) and Pu when u(t) = A sin(ω0 t + φ)?

Ru(τ) = Ē( u(t) u(t-τ) )
      = Ē( A² sin(ω0 t + φ) sin(ω0 t - ω0 τ + φ) )
      = Ē( (A²/2) cos(ω0 τ) - (A²/2) cos(2ω0 t - ω0 τ + 2φ) )
      = lim_{N→∞} (1/N) Σ_{t=1}^{N} [ (A²/2) cos(ω0 τ) - (A²/2) cos(2ω0 t - ω0 τ + 2φ) ]

⇒ Ru(τ) = (A²/2) cos(ω0 τ)

and thus, in the fundamental frequency range [-π π],

Φu(ω) = (A² π / 2) ( δ(ω - ω0) + δ(ω + ω0) )

and Pu = Ru(0) = A²/2.

The power spectrum of the sinusoid is independent of its phase shift φ and is equal to 0 except in ±ω0, where it is infinite.

(figure: Φu(ω) — two impulses at ±ω0)

Properties of the power spectrum:

• y(t) = G(z) u(t)  ⇒  Φy(ω) = |G(e^{jω})|² Φu(ω)
• y(t) = s1(t) + s2(t) with s1(t) independent of s2(t)  ⇒  Φy(ω) = Φs1(ω) + Φs2(ω)
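The result Pu = A²/2 is easy to verify numerically; a quick sketch (the amplitude, frequency and phase values are arbitrary illustrative choices):

```python
import numpy as np

A, w0, phi = 2.0, 0.63, 0.4       # arbitrary illustrative values
t = np.arange(200000)
u = A*np.sin(w0*t + phi)

# time-average power approximates Ru(0)
Pu = np.mean(u*u)
# theory: Pu = A^2/2 = 2.0, independent of phi
```

Repeating the computation with a different φ gives the same power, as predicted.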
Estimates from a finite data record

Given N samples of u(t), the auto-correlation function can be estimated as

R̂u^N(τ) = (1/N) Σ_{t=1}^{N-τ} u(t+τ) u(t),   and R̂u^N(τ) = 0 for |τ| > N-1

NB: Ru(τ) = Ruu(τ), and when the signals y(t) and u(t) are independent, Ryu(τ) = 0 ∀τ. The estimate degrades for large |τ| (fewer products u(t) u(t-τ) in the sum).

With the discrete Fourier transform

UN(ω) = (1/√N) Σ_{t=0}^{N-1} u(t) e^{-jωt}

the periodogram Φ̂u^N(ω) = |UN(ω)|² estimates the power spectrum of deterministic as well as stochastic signals:

lim_{N→∞} E Φ̂u^N(ω) = Φu(ω)

(figure: a realization of u(t) versus time, 0-1000, and its periodogram (logarithmic scale) versus ω ∈ [0 3])

Φ̂u^N(ω) is an erratic function fluctuating around Φu(ω).
Example: periodogram of a multisine (fundamental period = 10 time-samples):

(figure: u(t) versus time, 0-100, and its periodogram versus ω ∈ [0 3])

It can be proven that the values at ω1 = 0.63, ω2 = 1.26 and ω3 = 1.89 are given by N Ai²/4, where Ai (i = 1, 2, 3) is the amplitude of the sinusoid of frequency ωi.
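The value N Ai²/4 can be reproduced with an FFT; a small sketch (the amplitudes are illustrative; the three sinusoids sit exactly on DFT bins, so there is no leakage):

```python
import numpy as np

N = 100
t = np.arange(N)
amps = [2.0, 1.0, 0.5]            # illustrative amplitudes A1, A2, A3
w1 = 2*np.pi/10                   # fundamental period = 10 samples, w1 ≈ 0.63
u = sum(A*np.sin((k+1)*w1*t) for k, A in enumerate(amps))

U = np.fft.fft(u) / np.sqrt(N)    # U_N(omega) on the DFT frequency grid
per = np.abs(U)**2                # periodogram

# bins 10, 20, 30 correspond to w1, 2*w1, 3*w1; value there = N*A_i^2/4
```

With A1 = 2, A2 = 1, A3 = 0.5 and N = 100, the periodogram values at the three bins are 100, 25 and 6.25.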
The true system S:

y(t) = G0(z) u(t) + H0(z) e(t),   with v(t) ≜ H0(z) e(t)

(block diagram: u(t) → G0(z) → + → y(t), with the disturbance v(t) = H0(z) e(t) entering at the summation)
Assumptions on the true system S:
• e(t) is a white noise (no assumption is made on the probability density function)
• the disturbance v(t) = H0(z) e(t) is independent of the input signal u(t), with E v(t) = 0 and Φv(ω) = |H0(e^{jω})|² σe²

Model set, parametrized by θ = (θ1, θ2)^T:

M = { G(z, θ), H(z, θ)  ∀θ ∈ R² }

Note: H(z, θ) is always chosen as a monic transfer function (like H0).

Full-order model structure: ∃θ0 such that G(z, θ0) = G0(z) and H(z, θ0) = H0(z), i.e. S ∈ M. Consider the following true system:

y(t) = G0(z) u(t) + H0(z) e(t) = G(z, θ0) u(t) + H(z, θ0) e(t)

from which N input and output data have been measured:

Z^N = { u(t), y(t) | t = 1...N }

The objective can therefore be restated as follows: given the parametrization G(z, θ) and H(z, θ), find (an estimate of) the unknown parameter vector θ0 using the data set Z^N.

In other words: θ = θ0 minimizes the power of y(t) - ŷ(t, θ).
Prediction error:

ǫ(t, θ) = H(z, θ)^{-1} ( y(t) - G(z, θ) u(t) )

Properties of the prediction error ǫ(t, θ):

Property 1. Given θ and Z^N, ǫ(t, θ) is computable ∀t = 1...N.

Property 2. ǫ(t, θ0) = e(t) (something really unpredictable at time t). Indeed,

ǫ(t, θ) = H(z, θ)^{-1} ( G0(z) u(t) + H0(z) e(t) - G(z, θ) u(t) )
        = ( (G0(z) - G(z, θ)) / H(z, θ) ) u(t) + ( H0 / H(z, θ) ) e(t)

⇒ ǫ(t, θ0) = e(t)

Property 3. ǫ(t, θ) ≠ white noise for all θ ≠ θ0 (provided an appropriately chosen signal u(t)).

Example:

G(z, θ) = θ1 z^{-1} / (1 + θ2 z^{-1}),   H(z, θ) = 1 / (1 + θ2 z^{-1}),   θ = (θ1, θ2)^T

ǫ(t, θ) = (1 + θ2 z^{-1}) y(t) - θ1 z^{-1} u(t) = y(t) + θ2 y(t-1) - θ1 u(t-1)

Notes:
• it is typically assumed that u(t < 0) = y(t < 0) = 0
• H^{-1}(z, θ) is always causal since H(z, θ) is monic!

Since ǫ(t, θ0) = e(t), we have Ē ǫ²(t, θ0) = σe². Writing ǫ(t, θ) = e(t) + s1(t, θ) + s2(t, θ), with u(t) and e(t) uncorrelated and e(t) white noise:

Ē ǫ²(t, θ) = σe² + Ē s1²(t, θ) + Ē s2²(t, θ)   ⇒   Ē ǫ²(t, θ) > σe²  ∀θ ≠ θ0

θ = θ0 minimizes both Ē s1²(t, θ) and Ē s2²(t, θ) by making them equal to 0 (the latter provided u(t) has been chosen appropriately)

⇒ θ = θ0 minimizes Ē ǫ²(t, θ)
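The property ǫ(t, θ0) = e(t) can be illustrated numerically for the first-order ARX-type example above; a sketch with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000
th1, th2 = 0.5, -0.7              # true parameters theta0 (illustrative values)
u = rng.standard_normal(N)
e = rng.standard_normal(N)

# true system: (1 + th2 z^-1) y(t) = th1 z^-1 u(t) + e(t)
y = np.zeros(N)
for t in range(1, N):
    y[t] = -th2*y[t-1] + th1*u[t-1] + e[t]

# prediction error at theta0: eps = y(t) + th2*y(t-1) - th1*u(t-1)
eps = y.copy()
eps[1:] += th2*y[:-1] - th1*u[:-1]
```

For t ≥ 1 the prediction error reproduces the white noise e(t) exactly.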
Example (model structure defined later): θ = (b0, b1, f1, f2, f3, f4)^T with true value

θ0 = ( 0.103, 0.181, -1.991, 2.203, -1.841, 0.894 )^T

(figure: ǫ(t, θ0) and ǫ(t, θ1) for some θ1 ≠ θ0, versus t, 0-2000; the auto-correlations of ǫ(t, θ0) (red) and ǫ(t, θ1) (blue) versus tau, 0-40: ǫ(t, θ0) is white noise as opposed to ǫ(t, θ1))

Estimated power of ǫ(t, θ1): 1.4678. Note: the estimated power is R̂ǫ^N(0).

Summary:
• ǫ(t, θ) is a computable quantity comparing the output y(t) of the true system and the predicted output of a model
• θ = θ0 minimizes the power of ǫ(t, θ)
3.1. An ideal criterion

Denote by θ* the solution of the minimization of the power of the prediction error:

θ* = arg min_θ V̄(θ),   V̄(θ) = Ē ǫ²(t, θ)

V̄(θ) has a unique minimum θ*, and θ* = θ0.

Remark: there is no difference between θ* and θ0 at this stage of the course, since we suppose S ∈ M and u(t) appropriate. (In general V̄(θ) may have several minima and θ* represents the set of these minima, while θ0 is one single parameter vector.)

This ideal criterion cannot be used in practice: the power of the prediction error cannot be exactly computed with only one experiment and only N measured data. Parameter estimation is therefore done by minimizing VN:

θ̂N = arg min_θ VN(θ, Z^N)
VN(θ, Z^N) = (1/N) Σ_{t=1}^{N} ǫ²(t, θ)

Properties:
• θ̂N ∼ AsN(θ*, Pθ)
• θ̂N → θ* with probability 1 when N → ∞ (i.e. Pθ → 0 when N → ∞)

Example: we have applied 20 times the same sequence u(t) of length N = 200 and we have measured the corresponding y(t). For these 20 experiments, we have computed the estimate θ̂N of θ0 = (0.3, 0.7)^T.

(figure: the 20 estimates, plotted as b versus a, scattered around θ0)

How is θ̂N computed in practice? In order to answer this question, the parametrization of G(z, θ) and H(z, θ) must be defined more precisely.
Model structure: M = { (G(z, θ), H(z, θ)), θ ∈ R^{nθ} }

General parametrization used in the Matlab Toolbox:

G(z, θ) = z^{-nk} B(z, θ) / ( F(z, θ) A(z, θ) ),   H(z, θ) = C(z, θ) / ( D(z, θ) A(z, θ) )

θ = ( a1 ... a_{na} b0 ... f_{nf} )^T

B(z, θ) = b0 + b1 z^{-1} + ... + b_{nb-1} z^{-nb+1}
A(z, θ) = 1 + a1 z^{-1} + ... + a_{na} z^{-na}
C(z, θ) = 1 + c1 z^{-1} + ... + c_{nc} z^{-nc}
D(z, θ) = 1 + d1 z^{-1} + ... + d_{nd} z^{-nd}
F(z, θ) = 1 + f1 z^{-1} + ... + f_{nf} z^{-nf}

na, nb are the numbers of parameters in the A and B polynomials; nk is the number of time delays.

Model structure     G(z, θ)                     H(z, θ)
ARX                 z^{-nk} B(z, θ) / A(z, θ)   1 / A(z, θ)
ARMAX               z^{-nk} B(z, θ) / A(z, θ)   C(z, θ) / A(z, θ)
OE - Output Error   z^{-nk} B(z, θ) / F(z, θ)   1
FIR                 z^{-nk} B(z, θ)             1
BJ - Box-Jenkins    z^{-nk} B(z, θ) / F(z, θ)   C(z, θ) / D(z, θ)

For OE, FIR and BJ there are no common parameters in G and H ⇒ advantages for independent identification of G and H.
5.1 Case of a predictor linear in θ (ARX, FIR)

In this case the predictor is linear in θ: ŷ(t, θ) = φ(t)^T θ, so that

θ̂N = arg min_θ VN(θ, Z^N),   VN(θ, Z^N) = (1/N) Σ_{t=1}^{N} ( y(t) - φ(t)^T θ )²

can be determined analytically using

∂VN(θ, Z^N)/∂θ |_{θ = θ̂N} = 0

Indeed:

∂VN(θ, Z^N)/∂θ = -(2/N) Σ_{t=1}^{N} [ φ(t) y(t) - φ(t) φ^T(t) θ ]

Putting the derivative to 0 in θ = θ̂N delivers:

[ (1/N) Σ_{t=1}^{N} φ(t) φ^T(t) ] θ̂N = (1/N) Σ_{t=1}^{N} φ(t) y(t)

As a consequence:

θ̂N = R(N)^{-1} f(N),   with R(N) = (1/N) Σ_{t=1}^{N} φ(t) φ^T(t) and f(N) = (1/N) Σ_{t=1}^{N} φ(t) y(t)

• Analytical solution through simple matrix operations.
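The analytical least-squares solution above can be sketched in a few lines of Python; the first-order ARX system and its parameter values are illustrative choices, not from the course:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5000
a1, b0 = -0.8, 0.5                # true ARX parameters (illustrative)
u = rng.standard_normal(N)
e = 0.1*rng.standard_normal(N)

# true ARX system: y(t) + a1 y(t-1) = b0 u(t-1) + e(t)
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a1*y[t-1] + b0*u[t-1] + e[t]

# regressor phi(t) = [-y(t-1), u(t-1)]^T so that yhat(t, theta) = phi(t)^T theta
Phi = np.column_stack([-y[:-1], u[:-1]])   # rows phi(t)^T, t = 1..N-1
Y = y[1:]
R = Phi.T @ Phi / len(Y)                   # R(N)
f = Phi.T @ Y / len(Y)                     # f(N)
theta_hat = np.linalg.solve(R, f)          # theta_hat = R(N)^{-1} f(N)
```

With this much data, theta_hat reproduces (a1, b0) closely.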
5.2 Case of a predictor nonlinear in θ (OE, BJ, ARMAX)

Example of the OE model structure:

G(θ) = z^{-nk} B(θ) / F(θ);   H(θ) = 1;   θ = ( b0 ... b_{nb-1} f1 f2 ... f_{nf} )^T

Predictor ŷ(t, θ):

ŷ(t, θ) = H(θ)^{-1} G(θ) u(t) + [1 - H(θ)^{-1}] y(t) = z^{-nk} ( B(θ)/F(θ) ) u(t) = φ(t, θ)^T θ

with

φ(t, θ) = ( u(t-nk), ..., u(t-nk-nb+1), -ŷ(t-1, θ), ..., -ŷ(t-nf, θ) )^T

NONLINEAR in θ, since the regressor φ(t, θ) itself depends on θ. Consequently,

θ̂N = arg min_θ (1/N) Σ_{t=1}^{N} ( y(t) - φ(t, θ)^T θ )²

can NOT be determined analytically using ∂VN(θ, Z^N)/∂θ |_{θ = θ̂N} = 0, since this derivative is a very complicated expression which is nonlinear in θ, and since this derivative is (generally) equal to 0 for several θ (local minima).

The solution θ̂N will therefore be computed iteratively. Risk of finding a local minimum!
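The iterative minimization can be sketched with a generic nonlinear least-squares solver (a minimal illustration, not the specific algorithm used by the Matlab toolbox; the first-order OE system and parameter values are assumed for the example, and the bounds simply keep F(z, θ) stable during the search):

```python
import numpy as np
from scipy.signal import lfilter
from scipy.optimize import least_squares

rng = np.random.default_rng(2)
N = 3000
b0_true, f1_true = 0.5, -0.8      # true OE parameters (illustrative)
u = rng.standard_normal(N)
# OE data: y = (b0 z^-1 / (1 + f1 z^-1)) u + white measurement noise
y = lfilter([0.0, b0_true], [1.0, f1_true], u) + 0.1*rng.standard_normal(N)

def residuals(theta):
    # prediction error eps(t, theta) = y(t) - (b0 z^-1 / (1 + f1 z^-1)) u(t)
    b0, f1 = theta
    return y - lfilter([0.0, b0], [1.0, f1], u)

# iterative search; the result may depend on the initial guess (local minima)
sol = least_squares(residuals, x0=[0.1, 0.0],
                    bounds=([-2.0, -0.99], [2.0, 0.99]))
```

Here the solver converges to the true parameters; with a richer structure or a poor initial guess, it may end in a local minimum, which is exactly the risk stated above.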
The ideal identification criterion has a unique solution θ* (i.e. θ* = θ0 when S ∈ M) if the input signal u(t) that is chosen to generate the experimental data is sufficiently rich.

Counterexample. Consider u(t) = 0 ∀t as input signal and a full-order model structure M for S:

ǫ(t, θ) = ( (G0(z) - G(z, θ)) / H(z, θ) ) u(t) + ( H0 / H(z, θ) ) e(t),   with the first term = 0

The power Ē ǫ²(t, θ) is then minimized for each θ making H(z, θ) = H0. With, e.g., H(z, θ) = 1/(1 + d z^{-1}) and H0 = 1/(1 + d0 z^{-1}):

ǫ(t, θ) = ( (1 + d z^{-1}) / (1 + d0 z^{-1}) ) e(t)

We know that Ē ǫ²(t, θ) is minimum for θ making ǫ(t, θ) = e(t), i.e. d = d0, while the parameters b and f of G(z, θ) remain completely free:

θ* = ( b, d0, f )^T   ∀b ∈ R and ∀f ∈ R

⇒ no unique solution: u(t) = 0 is not sufficiently rich.

Notion of signal richness: persistently exciting input signals

A quasi-stationary signal u is persistently exciting of order n if the (Toeplitz) matrix R̄n is non-singular:

R̄n := [ Ru(0)    Ru(1)   ···  Ru(n-1) ]
      [ Ru(1)    Ru(0)   ···  Ru(n-2) ]
      [   ⋮        ⋮      ⋱      ⋮    ]
      [ Ru(n-1)   ···   Ru(1)  Ru(0)  ]
Examples:

• A white noise process (Ru(τ) = σu² δ(τ)) is persistently exciting of infinite order. Indeed, R̄n = σu² In.

• A block signal with

Ru(0) = 1,  Ru(1) = 1/3,  Ru(2) = -1/3,  Ru(3) = -1,  Ru(4) = -1/3,  Ru(5) = 1/3,  Ru(6) = 1,  etcetera

(figure: the periodic block signal versus time, 0-16)

gives

R̄4 = [  1    1/3  -1/3   -1  ]
     [ 1/3    1    1/3  -1/3 ]
     [ -1/3  1/3    1    1/3 ]
     [  -1  -1/3   1/3    1  ]

R̄3 is regular, R̄4 is singular. Consequently, u is p.e. of order 3.
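Checking the order of persistent excitation amounts to testing the singularity of growing Toeplitz matrices; a minimal sketch using the block-signal autocorrelation values above (the helper pe_order is my own illustrative function, not course notation):

```python
import numpy as np
from scipy.linalg import toeplitz

# autocorrelation values Ru(0), Ru(1), ... of the block signal above
r = [1, 1/3, -1/3, -1, -1/3, 1/3, 1]

def pe_order(r):
    """Largest n such that the Toeplitz matrix Rbar_n is non-singular."""
    n = 0
    while n < len(r):
        R = toeplitz(r[:n+1])                 # symmetric Rbar_{n+1}
        if np.linalg.matrix_rank(R) < n + 1:  # singular -> stop
            break
        n += 1
    return n
```

For these values, Rbar_3 is regular and Rbar_4 is singular (its first and last rows sum to zero), so the function returns 3, matching the slide.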
Sketch of the proof (case of a FIR model structure):

ǫ(t, θ) = y(t) - Σ_{k=1}^{nb} bk u(t-k)   (nk = 1)

θ* is characterized by:

[ Ru(0)     ···  Ru(nb-1) ] [ b1*  ]   [ Ryu(1)  ]
[ Ru(1)     ···  Ru(nb-2) ] [ b2*  ] = [ Ryu(2)  ]
[   ⋮        ⋱      ⋮     ] [  ⋮   ]   [   ⋮     ]
[ Ru(nb-1)  ···  Ru(0)    ] [ bnb* ]   [ Ryu(nb) ]

Consequence: θ* can uniquely be identified if and only if u is persistently exciting of order ≥ nb.

What can we say about the identification of θ̂N? θ̂N will be the (consistent) estimate of θ* = θ0 (the unique solution of the ideal criterion) if the input signal is sufficiently exciting of order ≥ nG.

Remark. In the sequel, we will always assume that the signal u(t) has been chosen such that it is persistently exciting of sufficient order.
Example: first experiment on S

y(t) = ( z^{-3} (0.103 + 0.181 z^{-1}) / (1 - 1.991 z^{-1} + 2.203 z^{-2} - 1.841 z^{-3} + 0.894 z^{-4}) ) u(t) + e(t)

M = { G(z, θ) = z^{-3} (b0 + b1 z^{-1}) / (1 + f1 z^{-1} + f2 z^{-2} + f3 z^{-3} + f4 z^{-4});  H(z, θ) = 1 }

θ = ( b0, b1, f1, f2, f3, f4 )^T  ⇒  nG = 6

(figure: input u(t) and output y(t) versus t, 0-2000)

Using the 2000 recorded data, we have identified θ̂N.
G(z, θ̂N) (blue) is compared with G(z, θ0) (red):

(figure: Bode plots from u1 to y1 — amplitude and phase (degrees) versus frequency (rad/s))

Second experiment on S: we have applied a white noise u(t) (u p.e. of order ∞) to S and collected N = 2000 IO data.

(figure: the collected input and output data versus t, 0-2000, and the Bode plot of the identified model from u1 to y1)

Properties of the estimate when S ∈ M:

θ̂N ∼ N(θ0, Pθ)   (asymptotically in N; the first property mentioned earlier is in fact θ̂N ∼ AsN(θ0, Pθ))

• θ̂N is an unbiased estimate of θ0: E θ̂N = θ0, which means that

lim_{p→∞} (1/p) Σ_{i=1}^{p} θ̂N^{(i)} = θ0

• its covariance is

Pθ ≜ E (θ̂N - θ0)(θ̂N - θ0)^T = (σe²/N) ( Ē ψ(t, θ0) ψ^T(t, θ0) )^{-1}

with ψ(t, θ0) = ∂ŷ(t, θ)/∂θ |_{θ=θ0} = -∂ǫ(t, θ)/∂θ |_{θ=θ0}
Property 1. Pθ is a function of the chosen input signal u(t) and of the number N of data used for the identification.

Proof:

ǫ(t, θ) = ( (G0(z) - G(z, θ)) / H(z, θ) ) u(t) + ( H0 / H(z, θ) ) e(t)   ⇒

ψ(t, θ0) = -∂ǫ(t, θ)/∂θ |_{θ=θ0} = ( ΛG(z, θ0) / H(z, θ0) ) u(t) + ( ΛH(z, θ0) / H(z, θ0) ) e(t)

with ΛG(z, θ) = ∂G(z, θ)/∂θ and ΛH(z, θ) = ∂H(z, θ)/∂θ.

Now defining ΓG = ΛG ΛG* / (H H*) and ΓH = ΛH ΛH* / (H H*) and using Parseval's theorem:

Pθ = (σe²/N) ( Ē ψ(t, θ0) ψ^T(t, θ0) )^{-1}
   = (σe²/N) [ (1/2π) ∫_{-π}^{π} ( ΓG(e^{jω}, θ0) Φu(ω) + ΓH(e^{jω}, θ0) σe² ) dω ]^{-1}

⇒ Pθ is a function of u(t) and N. We can therefore influence the value of Pθ by appropriately choosing u(t) and N.

An estimate of Pθ:

P̂θ = (σ̂e²/N) [ (1/N) Σ_{t=1}^{N} ψ(t, θ̂N) ψ^T(t, θ̂N) ]^{-1},   with σ̂e² = (1/N) Σ_{t=1}^{N} ǫ(t, θ̂N)²

If we could collect N = ∞ data from S, then the identified parameter vector θ̂N→∞ would have the following distribution: θ̂N→∞ ∼ N(θ0, Pθ) with Pθ = 0, i.e. θ̂N→∞ is always equal to θ0.
Example: the linear-in-θ case.

Predictor: ŷ(t, θ) = φ(t)^T θ with φ^T(t) = ( u(t), u(t-1) ), and y(t) = φ(t)^T θ0 + e(t). Then

θ̂N = R^{-1} ( (1/N) Σ_{t=1}^{N} φ(t) ( φ(t)^T θ0 + e(t) ) ),   R = (1/N) Σ_{t=1}^{N} φ(t) φ^T(t)

   = θ0 + R^{-1} (1/N) Σ_{t=1}^{N} φ(t) e(t)      (estimation error)

⇒ θ̂N is a random variable and is (asymptotically) normally distributed. Indeed:
• e(t) is a random process, and
• the central limit theorem applies.

Mean:

E θ̂N = θ0 + E [ R^{-1} (1/N) Σ_{t=1}^{N} φ(t) e(t) ]

Since φ(t) and R are deterministic (not stochastic):

E θ̂N = θ0 + R^{-1} (1/N) Σ_{t=1}^{N} φ(t) E e(t) = θ0   (since E e(t) = 0)

Covariance:

Pθ = E [ R^{-1} (1/N) Σ_{t=1}^{N} φ(t) e(t) ] [ (1/N) Σ_{s=1}^{N} e(s) φ^T(s) ] R^{-1}
   = R^{-1} (1/N²) Σ_{t=1}^{N} Σ_{s=1}^{N} φ(t) E( e(t) e(s) ) φ^T(s) R^{-1}
   = R^{-1} (σe²/N) [ (1/N) Σ_{t=1}^{N} φ(t) φ^T(t) ] R^{-1}      (since E(e(t)e(s)) = σe² δ(t-s))
   = (σe²/N) R^{-1} R R^{-1} = (σe²/N) R^{-1}

Note that Pθ = (σe²/N) R^{-1} converges when N → ∞ to the asymptotic expression (σe²/N) ( Ē ψ(t, θ0) ψ^T(t, θ0) )^{-1}, since

ŷ(t, θ) = φ^T(t) θ  ⇒  ψ(t, θ) = φ(t) ∀θ
Confidence regions. Since θ̂N ∼ N(θ0, Pθ) with θ ∈ R^k, we have

(θ0 - θ̂N)^T Pθ^{-1} (θ0 - θ̂N) ∼ χ²(k)

and θ0 thus lies, with a probability determined by α, in the ellipsoid

U = { θ ∈ R^k | (θ - θ̂N)^T Pθ^{-1} (θ - θ̂N) ≤ α }

The larger Pθ, the larger the ellipsoid and thus the larger the uncertainty.

Example: we have applied a sequence u(t) of length N = 1000 to S and we have measured the corresponding y(t). With k = 2 and a 95% confidence level, α = 5.99:

U = { θ ∈ R² | (θ - θ̂N)^T Pθ^{-1} (θ - θ̂N) ≤ 5.99 }

(figure: the ellipsoid U in the (a, b) parameter plane; the in-practice-unknown θ0, represented by the red circle, lies in U as expected)

Variance of the identified frequency responses

The identified models G(z, θ̂N) (and H(z, θ̂N)) are also random variables:

• G(z, θ̂N) is an (asymptotically) unbiased estimate of G(z, θ0)
• the variance of G(z, θ̂N) is defined in the frequency domain as

cov(G(e^{jω}, θ̂N)) ≜ E |G(e^{jω}, θ̂N) - G(e^{jω}, θ0)|²

Remark: G(z, θ0) lies with the same probability in a corresponding uncertainty region around the frequency response of G(z, θ̂N).
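The bound α = 5.99 used above is simply the 95% quantile of the χ²(2) distribution; a one-line check in Python:

```python
from scipy.stats import chi2

# 95% confidence level in k = 2 dimensions gives the ellipsoid bound alpha
alpha = chi2.ppf(0.95, df=2)
# alpha ≈ 5.99
```

Other confidence levels or parameter dimensions k simply change the quantile used in the ellipsoid definition.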
Property 1. cov(G(e^{jω}, θ̂N)) is a function of Pθ and of ΛG(z, θ) = ∂G(z, θ)/∂θ (obtained using a first-order approximation and the assumption that N is large enough); it is therefore a function of u(t) and N, a direct consequence of the fact that Pθ is a function of these quantities.

Property 2. A simple asymptotic approximation of cov(G(e^{jω}, θ̂N)) (see the final note of this part) is obtained by assuming that the McMillan degree n of the model G(z, θ) in M tends to ∞.

Property 3. An estimate of cov(G(e^{jω}, θ̂N)) can nevertheless be computed using the data and θ̂N.
Validation using cov(G(e^{jω}, θ̂N))

Since cov(G(e^{jω}, θ̂N)) ≜ E |G(e^{jω}, θ̂N) - G(e^{jω}, θ0)|², it is a measure of the modeling error and allows one to deduce uncertainty bands around the frequency response of the identified model G(z, θ̂N).

More precisely, since G(z, θ̂N) is normally distributed, we have at each frequency ω that G(z, θ̂N) is a close estimate of G0 if the standard deviation √cov(G(e^{jω}, θ̂N)) of G(e^{jω}, θ̂N) is small w.r.t. |G(e^{jω}, θ̂N)|.
What is a small standard deviation √cov(G(e^{jω}, θ̂N)) (or a small modeling error) w.r.t. |G(e^{jω}, θ̂N)|? Highly dependent on the expected use for the model! For example, if we want to use the model for control, the modeling error (measured by cov(G(e^{jω}, θ̂N))) has to be much smaller around the cross-over frequency than at the other frequencies. See the literature on "identification for robust control" to know how large cov(G(e^{jω}, θ̂N)) may be.

What to do if the variance appears too large?

If the variance cov(G(e^{jω}, θ̂N)) appears too large, then we cannot guarantee that G(z, θ̂N) is a close estimate of G0(z). A new identification experiment has then to be achieved in order to obtain a better model; for this purpose, we have to take care that the variance in this new identification is smaller.

How to reduce the variance in a new identification? cov(G(e^{jω}, θ̂N)) can be reduced by
• increasing the number of data N;
• or increasing the power spectrum Φu(ω) of the input signal at the frequencies where cov(G(e^{jω}, θ̂N)) was too large.

Example. Let us consider the same flexible transmission system S (in the ARX form). In this example, we need √cov(G(e^{jω}, θ̂N)) / |G(e^{jω}, θ̂N)| < 0.1 ∀ω ∈ [0 1].

First identification experiment: we apply a white noise input signal u(t) of variance σu² = 0.005 to S, collect N = 2000 IO data and identify a model G(z, θ̂N) in M.

(figure: identified G (red) and standard deviation (blue) versus omega) — cov(G(e^{jω}, θ̂N)) is too large!

We want to reduce the variance of the identified model: we apply a white noise input signal u(t) of variance σu² = 1 to S, collect N = 2000 IO data and identify a model G(z, θ̂N) in M.

(figure: identified G (red) and standard deviation (blue) versus omega) — the variance is still too large around the first peak for our control purpose!

We want to reduce the variance of the identified model further around the 1st peak. Let us for this purpose increase the power of u(t) around this first peak.

(figure: identified G (red) and standard deviation (blue) versus omega) — cov(G(e^{jω}, θ̂N)) is now OK for our control purpose!

Final note: cov(H(e^{jω}, θ̂N)) can be deduced using a similar reasoning as for cov(G(e^{jω}, θ̂N)).
What if S ∉ M?

Consider a model structure M = { G(z, θ); H(z, θ) } such that S ∉ M and an input signal u(t) sufficiently exciting of order ≥ nG. Then:
• θ̂N → θ* w.p. 1 when N → ∞
• θ̂N ∼ AsN(θ*, Pθ)   (Pθ having a more complicated expression than when S ∈ M)

In general, G(z, θ*) ≠ G0(z) and H(z, θ*) ≠ H0(z).

One exception though: the model structure M used for identification purposes is such that ∃θ0 with G(z, θ0) = G0(z) but H(z, θ0) ≠ H0(z), i.e. S ∉ M with G0 ∈ G.

To see when G(z, θ*) = G(z, θ0) = G0 in that case, we distinguish two classes of model structures M:
• M with common parameters in G(θ) and H(θ) (i.e. ARX, ARMAX)
• M without common parameters in G(θ) and H(θ) (i.e. OE, BJ, FIR)

For a chosen model structure M = { G(z, θ), H(z, θ) } such that ∃θ0 with G(z, θ0) = G0(z) but H(z, θ0) ≠ H0(z):
• if M is OE, BJ or FIR, then G(z, θ*) = G(z, θ0) = G0
• if M is ARX or ARMAX, then G(z, θ*) ≠ G(z, θ0) and H(z, θ*) ≠ H0
Example:

y(t) = ( z^{-3} (0.103 + 0.181 z^{-1}) / (1 - 1.991 z^{-1} + 2.203 z^{-2} - 1.841 z^{-3} + 0.894 z^{-4}) ) u(t) + v(t)

with v(t) = H0(z) e(t); H0(z) very complicated, i.e. S is not ARX, not OE!

We have applied a powerful white noise input signal (σu² = 5) to S and collected a large number of IO data (N = 5000) ⇒ small variance ⇒ θ̂N ≈ θ*.

Using these IO data, we have identified a model in two model structures such that S ∉ M with G0 ∈ G:

Marx = ARX(na = 4, nb = 2, nk = 3)
Moe = OE(nb = 2, nf = 4, nk = 3)

Let us denote G(z, θ̂N^{arx}) and G(z, θ̂N^{oe}) the models identified in Marx and Moe, respectively.

(figure: Bode plots — amplitude and phase (degrees) versus frequency (rad/s) — of G0 and the two identified models)

As expected, we obtain G(z, θ̂N^{oe}) ≈ G0(z) and G(z, θ̂N^{arx}) very different from G0.
Three possible situations:
• S ∈ M
• S ∉ M with G0 ∈ G
• S ∉ M with G0 ∉ G

How can we verify these assumptions? A solution: model structure validation. Based on θ̂N and Z^N, determine which of the three situations the chosen model structure M corresponds to.
The identified parameter vector is then θ*. Model structure validation is performed by considering Rǫ(τ) and Rǫu(τ) of

ǫ(t, θ*) = H(θ*)^{-1} ( y(t) - G(θ*) u(t) )

Due to the fact that

ǫ(t, θ*) = ( (G0 - G(θ*)) / H(θ*) ) u(t) + ( H0 / H(θ*) ) e(t),

three situations can occur for the quantities Rǫ(τ) and Rǫu(τ).

Situation A. We observe:

Rǫ(τ) = σe² δ(τ) = { σe² for τ = 0; 0 elsewhere },   and Rǫu(τ) = 0 ∀τ

This situation occurs when

ǫ(t, θ*) = 0 × u(t) + ( H0 / H(θ*) ) e(t) = e(t)   ⟺   G(θ*) = G0 and H(θ*) = H0   ⟺   S ∈ M

Situation B. We observe Rǫu(τ) = 0 ∀τ, but Rǫ(τ) ≠ σe² δ(τ). This occurs when

ǫ(t, θ*) = 0 × u(t) + ( H0 / H(θ*) ) e(t) ≠ e(t)   ⟺   G(θ*) = G0 and H(θ*) ≠ H0
⟺   S ∉ M with G0 ∈ G, for M OE, BJ or FIR

Situation C. We observe Rǫu(τ) ≠ 0 for some τ. This occurs when G(θ*) ≠ G0, i.e.
• either S ∉ M with G0 ∈ G, for M ARX or ARMAX (situation B only applies to OE, BJ, FIR),
• or S ∉ M with G0 ∉ G.
No distinction can be made between G0 ∈ G and G0 ∉ G.
In practice, Rǫ(τ) and Rǫu(τ) are unknown and are replaced by their estimates

R̂ǫu^N(τ) = (1/N) Σ_{t=1}^{N-τ} ǫ(t+τ, θ̂N) u(t)
R̂ǫ^N(τ)  = (1/N) Σ_{t=1}^{N-τ} ǫ(t+τ, θ̂N) ǫ(t, θ̂N)

and by considering 99%-confidence regions for these estimates. To construct these confidence regions, we use the following result:

• if Rǫ(τ) = σe² δ(τ), then √N R̂ǫ^N(τ)/R̂ǫ^N(0) ∼ AsN(0, 1)
• if Rǫu(τ) = 0 ∀τ, then √N R̂ǫu^N(τ) ∼ AsN(0, P), with an estimable P

In particular, R̂ǫu^N(τ) lies in its confidence region ∀τ ⇒ (w.p.) Rǫu(τ) = 0 ∀τ.

Conclusions when M is OE, FIR or BJ (w.p. = with high probability):
• both R̂ǫ^N(τ) and R̂ǫu^N(τ) are in their confidence regions ∀τ ⇒ (w.p.) S ∈ M
• R̂ǫu^N(τ) is in its confidence region ∀τ while R̂ǫ^N(τ) is not completely in its confidence region ⇒ (w.p.) S ∉ M with G0 ∈ G
• both R̂ǫ^N(τ) and R̂ǫu^N(τ) are not completely in their confidence regions ⇒ (w.p.) S ∉ M with G0 ∉ G

Note: for M ARX or ARMAX, no distinction can be made between G0 ∈ G and G0 ∉ G.
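These residual tests are easy to implement; a minimal sketch of the auto-correlation (whiteness) test, where the 99% bound ±2.58 R̂ǫ(0)/√N follows from the asymptotic normality result above. For the illustration, the residual is simulated as true white noise, i.e. the S ∈ M case:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 2000
eps = rng.standard_normal(N)       # residuals of a validated model: white noise

def autocorr(x, taumax):
    """Biased auto-correlation estimate Rhat(tau), tau = 0..taumax."""
    n = len(x)
    return np.array([x[tau:] @ x[:n-tau] / n for tau in range(taumax + 1)])

R = autocorr(eps, 25)
# 99% bound for sqrt(N)*R(tau)/R(0) ~ N(0,1): |R(tau)| <= 2.58*R(0)/sqrt(N)
bound = 2.58 * R[0] / np.sqrt(N)
white = np.all(np.abs(R[1:]) <= bound)
```

For a white residual, roughly 99% of the lags fall inside the band; systematic excursions indicate that H(θ̂N) does not describe the noise dynamics. The cross-correlation test R̂ǫu^N(τ) is implemented analogously.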
'
System Identification
$
121
60
20
10
Based on the first analysis of S, first choice for M:
0
M = BJ (nb = 2, nc = 2, nd = 2, nf = 2, nk = 3)
& % & %
−10
0 20 40 60 80 100 120 140 160 180 200
t
From this behaviour, we can conclude that G0 has a limited order and, from a detailed observation, we see that the delay is nk = 3.

We can identify a parameter vector θ̂N in this M using Z^N.

[Figure: auto-correlation function of the residuals (lags 0…25) and cross-correlation function between input u1 and residuals (lags −25…25), with their 99%-confidence regions; neither estimate lies completely in its confidence region.]

w.p. =⇒ S ∉ M with G0 ∉ G

Let us increase the order for G(z, θ) and H(z, θ):

M = BJ (nb = 3, nc = 3, nd = 3, nf = 3, nk = 3)

and identify θ̂N in this new model structure using the data Z^N.

[Figure: the same correlation functions for the new model; R̂εu^N(τ) now lies in its confidence region ∀τ, while R̂ε^N(τ) does not.]

w.p. =⇒ S ∉ M with G0 ∈ G

By a simple iteration, we can then find a model set M that has the property S ∈ M.

[Figure: auto-correlation function of the residuals and cross-correlation function between input u1 and residuals from output y1 for the final model; both lie within their confidence regions.]

w.p. =⇒ S ∈ M
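The slides iterate over BJ structures identified with a prediction-error method; as a self-contained sketch of the same iterate-until-the-residuals-are-white loop, here is a least-squares ARX variant with a made-up true system (all orders and coefficients below are assumptions, not the slides' example):

```python
import numpy as np

# Made-up ARX "true system": y(t) = 1.5 y(t-1) - 0.7 y(t-2) + u(t-1) + 0.5 u(t-2) + e(t)
rng = np.random.default_rng(1)
N = 4000
u = rng.standard_normal(N)
e = 0.1 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(2, N):
    y[t] = 1.5 * y[t - 1] - 0.7 * y[t - 2] + u[t - 1] + 0.5 * u[t - 2] + e[t]

def fit_arx(y, u, n):
    """Least-squares ARX(na=n, nb=n, nk=1) fit; returns estimate and residuals."""
    Phi = np.array([np.concatenate([y[t - n:t][::-1], u[t - n:t][::-1]])
                    for t in range(n, len(y))])
    Y = y[n:]
    theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return theta, Y - Phi @ theta

def is_white(eps, max_lag=20, z=2.58):
    """99% whiteness test on the normalized residual autocorrelation."""
    eps = eps - eps.mean()
    Nn = len(eps)
    r0 = np.dot(eps, eps) / Nn
    rho = np.array([np.dot(eps[k:], eps[:Nn - k]) / Nn
                    for k in range(1, max_lag + 1)]) / r0
    return bool(np.all(np.abs(rho) < z / np.sqrt(Nn)))

# Iterate on the order until the residual test passes
for n in (1, 2, 3):
    theta, eps = fit_arx(y, u, n)
    if is_white(eps):
        break
```

At order n = 1 the residuals are clearly coloured; at the true order the residual variance drops to (about) the noise variance.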
However, a successful model structure validation does not necessarily imply that G(z, θ̂N) and H(z, θ̂N) are close estimates of G0(z) = G(z, θ0) and H0(z) = H(z, θ0) (the variance can still be large !!!)

Computing √cov(G(e^{jω}, θ̂N)) allows one to verify whether G(z, θ̂N) is close to G0 (and eventually √cov(H(e^{jω}, θ̂N)) for H(z, θ̂N)).

The full identification procedure:

1. choose the input signal and collect Z^N
2. choose a model structure M
3. identification of the models G(z, θ̂N) and H(z, θ̂N)
4. verify if S ∈ M; if it is the case, go to item 5; if not, go to item 2 and choose another model structure M
5. verify if √cov(G(e^{jω}, θ̂N)) (and eventually √cov(H(e^{jω}, θ̂N))) are small; if not, go back to item 1; if yes, stop

Possible additional tests for item 5:

• simulation of the identified model
• observation of the poles and zeros of the identified models
• comparison of the frequency response of the identified models with the ETFE (see later) and/or with the physical equations

Recall that, with n large and N, Φu(ω) limited,

cov(G(e^{jω}, θ̂N)) ≈ (n/N) · (Φv(ω)/Φu(ω))

=⇒ perform the identification experiment in such a way that the identified model is a close estimate of S in the important frequency range
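A small numerical reading of this asymptotic formula (all values illustrative):

```python
import numpy as np

# Asymptotic variance of the frequency-response estimate:
#   cov(G(e^{jw})) ~ (n/N) * Phi_v(w) / Phi_u(w)
n, N = 10, 10000        # model order and number of samples (example values)
Phi_v = 0.1             # noise spectrum at the frequency of interest (assumed)
Phi_u = 0.5             # input spectrum at that frequency (assumed)
var_G = (n / N) * Phi_v / Phi_u
std_G = np.sqrt(var_G)
# Doubling the input power at this frequency halves the variance:
var_G_boosted = (n / N) * Phi_v / (2 * Phi_u)
```

This makes the design trade-off explicit: the variance shrinks with more data (larger N) and with more input power where accuracy matters (larger Φu(ω)).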
We consider the true system S: y(t) = G0(z)u(t) + v(t), and M = {G(z, θ); H(z, θ) = 1} is an OE model structure such that ∄ θ0 with G(z, θ0) = G0(z).

θ̂N is a random variable due to the stochastic disturbance v(t) corrupting the data. Its covariance Pθ can not be determined analytically, but Pθ → 0 when N → ∞ =⇒ θ̂N → θ* w.p. 1 when N → ∞.

The modeling error G0(z) − G(z, θ̂N) is decomposed into two contributions:

G0(z) − G(z, θ̂N) = (G0(z) − G(z, θ*)) + (G(z, θ*) − G(z, θ̂N))

• G0 − G(θ*) is called the bias error and is due to the fact that S ∉ M with G0 ∉ G. Note: when S ∈ M, G0(z) − G(z, θ*) = 0; here, ∄ θ0 with G(z, θ0) = G0(z) =⇒ G(z, θ*) ≠ G0(z).
• G(θ*) − G(θ̂N) is called the variance error and is due to the fact that N < ∞.

In the sequel, we focus successively:

• on the bias error
• on the variance error
=⇒ θ* = arg min_θ V̄(θ)

with

V̄(θ) = Ē ε(t, θ)² = (1/2π) ∫_{−π}^{π} Φε(ω, θ) dω

(Parseval; both expressions are equal to Rε(0))

Consequently,

θ* = arg min_θ (1/2π) ∫_{−π}^{π} Φε(ω, θ) dω
   = arg min_θ (1/2π) ∫_{−π}^{π} ( |G0(e^{jω}) − G(e^{jω}, θ)|² Φu(ω) + Φv(ω) ) dω

Notes:

• Φv(ω) does not depend on θ; it therefore does not influence the bias error, but it influences the variance error.
• Φu(ω) acts as a frequency weighting on the bias error: the bias will be the smallest at those ω's where Φu(ω) is relatively the largest.
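A minimal numerical illustration of this frequency weighting, with a made-up second-order G0 fitted by a 2-tap FIR model on a frequency grid (a sketch, not the slides' example):

```python
import numpy as np

# The weight below plays the role of Phi_u(w) in the bias criterion.
w = np.linspace(1e-3, np.pi, 400)
z = np.exp(1j * w)
G0 = 0.1 * z**-1 / (1 - 1.7 * z**-1 + 0.72 * z**-2)   # made-up "true" system

def fit_fir2(weight):
    """Real-coefficient weighted LS fit of G(theta) = th0 + th1*z^-1 to G0."""
    X = np.column_stack([np.ones_like(z), z**-1])
    sw = np.sqrt(weight)
    A = np.vstack([(sw[:, None] * X).real, (sw[:, None] * X).imag])
    b = np.concatenate([(sw * G0).real, (sw * G0).imag])
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return X @ theta

G_flat = fit_fir2(np.ones_like(w))        # flat input spectrum
G_low = fit_fir2(1.0 / (0.05 + w**2))     # Phi_u concentrated at low frequencies
band = w < 0.3
err_flat = np.mean(np.abs(G0[band] - G_flat[band])**2)
err_low = np.mean(np.abs(G0[band] - G_low[band])**2)
# err_low < err_flat: the bias is smallest where Phi_u is relatively largest
```

The same reduced-order model fits the low band much better when the weighting (i.e. the input power) is concentrated there.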
Prefiltering the data:

uF(t) = L(z) u(t) and yF(t) = L(z) y(t)

The prediction error based on the filtered data is then εF(t, θ) = L(z) ε(t, θ), where ε(t, θ) is the prediction error if we would have used u(t) and y(t). Consequently,

ΦεF(ω, θ) = |L(e^{iω})|² · Φε(ω, θ)

Result: if you use the data uF(t) and yF(t) for the identification, the weighting function shaping the bias error is therefore

W(ω) = Φu(ω) |L(e^{iω})|²
13.5 Example

We want a model G(z, θ̂N) that is a good estimate of G0(z) in the frequency range [0 0.7], in the reduced order model structure:

M = OE (nb = 2, nf = 2, nk = 3)

[Figure: G0(z) (red) and G(z, θ̂N) (blue) identified with the data in Z^N; amplitude and phase vs. frequency (rad/s).]

=⇒ G(z, θ̂N) is KO

• large N =⇒ small variance error
• we want a small bias error in the frequency range [0 0.7] =⇒ we choose a prefilter L(z): a low-pass filter with cut-off frequency 0.7 rad/s

[Figure: amplitude and phase of the filter L.]

We filter u and y collected from S by this L, and we obtain a new estimate G(z, θ̂N) from the filtered data.

[Figure: G0(z) (red) and the new G(z, θ̂N) (blue).]

=⇒ G(z, θ̂N) is OK

13.6 What about a Box-Jenkins model structure?

For a BJ model structure, the weighting function W(ω) for the bias error G0(e^{jω}) − G(e^{jω}, θ*) is

W(ω) = Φu(ω) |L(e^{iω})|² / |H(e^{jω}, θ*)|²

=⇒ the identified noise model also influences the bias error of the G-model !!
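The prefiltering step itself is straightforward; a sketch with a hypothetical 5th-order Butterworth low-pass as L(z) (the slides do not specify the filter type or the data, so all values here are assumptions):

```python
import numpy as np
from scipy import signal

# Bias shaping by prefiltering: both u and y are passed through the SAME
# low-pass L(z), which multiplies the bias weighting by |L(e^{jw})|^2.
rng = np.random.default_rng(2)
N = 2000
u = rng.standard_normal(N)
b0, a0 = [0.0, 0.1, 0.05], [1.0, -1.2, 0.5]          # made-up "true" system
y = signal.lfilter(b0, a0, u) + 0.05 * rng.standard_normal(N)

wc = 0.7                                  # desired cut-off in rad/sample
b_L, a_L = signal.butter(5, wc / np.pi)   # butter() cut-off is normalized to Nyquist = 1
uF = signal.lfilter(b_L, a_L, u)
yF = signal.lfilter(b_L, a_L, y)
# (uF, yF) are then used instead of (u, y) in the identification criterion.
```

The filter leaves the band of interest untouched and strongly attenuates the rest.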
Sc4110: part IV

Based on these N time-domain data, we want to estimate the frequency response G0(e^{jω}) (amplitude and phase) of the true plant transfer function.

The data are transformed to the frequency domain:

{ y(t) | t = 0…(N−1) } ⟷ Y_N(ω) = (1/√N) Σ_{t=0}^{N−1} y(t) e^{−jωt}

The Empirical Transfer Function Estimate (ETFE) is

Ĝ(e^{jω}) = Y_N(ω)/U_N(ω) = |Ĝ(e^{jω})| e^{j∠Ĝ(e^{jω})}

Ĝ(e^{jω}) is therefore only computed at those frequencies ωk for which U_N(ωk) ≠ 0.
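The ETFE is a one-liner once the DFTs are available; a minimal sketch:

```python
import numpy as np

def etfe(u, y):
    """ETFE G(e^{jw_k}) = Y_N(w_k)/U_N(w_k) on the DFT grid w_k = 2*pi*k/N."""
    N = len(u)
    U = np.fft.rfft(u) / np.sqrt(N)   # the 1/sqrt(N) scaling cancels in the ratio
    Y = np.fft.rfft(y) / np.sqrt(N)
    w = 2 * np.pi * np.arange(len(U)) / N
    # only divide where the input has (numerically) nonzero energy
    mask = np.abs(U) > 1e-12
    G = np.full(len(U), np.nan + 0j)
    G[mask] = Y[mask] / U[mask]
    return w, G
```

For a pure (circular) one-sample delay, the ETFE recovers e^{−jω} exactly on the DFT grid.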
Illustration

The true system S is

y(t) = G0(z) u(t) + H0(z) e(t)

with

G0(z) = z^{−3} (0.103 + 0.181 z^{−1}) / (1 − 1.991 z^{−1} + 2.203 z^{−2} − 1.841 z^{−3} + 0.894 z^{−4}),

H0 = 1/den(G0), and e(t) a white noise disturbance of variance σe² = 0.1.

We collect N = 10000 data on this true system, subsequently with two different input signals having the same power Pu = 0.5 = 5 σe².

Input signal 1: a multisine of fundamental frequency ω0 = 2π/100 ≈ 0.06 (power = 0.5):

u(t) = (1/√30) Σ_{k=1}^{30} sin(k ω0 t)

The Fourier transform U_N(ω) of such a signal is indeed only significant at the (active) harmonics of ω0; Ĝ(e^{jω}) will therefore only be computed at those harmonics. The ETFE is thus computed at the 30 harmonics of ω0 present in u(t).
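The multisine of input signal 1 can be generated directly from its definition:

```python
import numpy as np

# Multisine: fundamental w0 = 2*pi/100, 30 active harmonics, amplitudes
# 1/sqrt(30) so that the total power is 0.5 (as on the slide).
N = 10000
w0 = 2 * np.pi / 100
t = np.arange(N)
u = np.sum([np.sin(k * w0 * t) for k in range(1, 31)], axis=0) / np.sqrt(30)
Pu = np.mean(u**2)   # each sine contributes (1/30)/2, so Pu = 30/60 = 0.5
```

Since N = 10000 is an integer number of periods (period 100), the sample power equals the theoretical power.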
[Figure: above, the ETFE at the 30 harmonics of ω0; below, the same together with the frequency response of G0(e^{jω}) (blue).]

We see that the ETFE is a good estimate of G0(e^{jω}) at the harmonics of ω0.

Input signal 2 excites all frequencies; the ETFE is then computed at all the N/2 = 5000 frequencies ωk.
[Figure: above, the ETFE (modulus) at all ωk; below, the same together with the frequency response of G0(e^{jω}) (blue).]

We see that this ETFE is an erratic and poor estimate of G0(e^{jω}).

How can we explain this? For this purpose, we need to understand the statistical properties of the ETFE.
=⇒ the ETFE is a random variable: it is different at each experiment. Writing

Ĝ(e^{jω}) = G0(e^{jω}) + V_N(ω)/U_N(ω)

with V_N(ω) defined as Y_N(ω) and U_N(ω), the variance of the ETFE at a frequency ωk is Φv(ωk)/|U_N(ωk)|².

For the multisine (input signal 1),

|U_N(e^{jωk})|² = N A_k²/4 = 10000/120

since the amplitude A_k of each sine is 1/√30 and N = 10000. Since |U_N|² is proportional to N and to A_k², the variance is proportional to 1/N and to 1/A_k² =⇒ the variances of Ĝ(e^{jωk}) are small for all ωk.

For input signal 2, cov(Ĝ(e^{jω})) tends, for increasing values of N, to Φv(ω)/Φu(ω). Unlike for a multisine u(t), the variance is thus not proportional to 1/N: it is only proportional to 1/σu² and does not decrease with N.

For equal power, the variance is therefore much smaller for the multisine u(t).

How can we then get a relatively good estimate with such an input? How can we reduce the variance?
Smoothing is motivated by the fact that the ETFE estimates are independent for different ωk's. The smoothed ETFE is

Ĝsm(e^{jω}) = ( ∫_{−π}^{π} Wγ(ξ − ω) Ĝ(e^{iξ}) dξ ) / ( ∫_{−π}^{π} Wγ(ξ − ω) dξ )

with Ĝ(e^{jω}) the unsmoothed ETFE and Wγ(ω) a positive real-valued frequency-function (window).

Ĝsm(e^{jωk}) at a particular frequency ωk is thus obtained by averaging Ĝ(e^{jω}) in an interval [ωk − Δω, ωk + Δω].

[Figure: Hamming frequency window Wγ(ω) for different resolutions: γ = 10 (solid), γ = 20 (dash-dotted) and γ = 40 (dashed).]

γ is a measure for the width of the window.

• the choice of the window depends on the expected smoothness of G0(e^{iω})
• a window that is too wide leads to possible smoothing of the dynamics

[Figure: the smoothed ETFE, together with the frequency response of G0(e^{jω}) (blue).]
Besides trial-and-error coupled with physical insights on G0(z), is there another way to select γ? Yes....

The ETFE can be rewritten as

Ĝ(e^{jω}) = Y_N(ω)/U_N(ω) = ( Y_N(ω) U_N*(ω) ) / ( U_N(ω) U_N*(ω) ) = ( Σ_{τ=−∞}^{+∞} R̂yu^N(τ) e^{−jωτ} ) / ( Σ_{τ=−∞}^{+∞} R̂u^N(τ) e^{−jωτ} )

where the last step follows from expressions (3.13) and (3.19) in the lecture notes, with

R̂yu^N(τ) = (1/N) Σ_{t=0}^{N−1} y(t) u(t−τ) for 0 < τ < N−1, and 0 elsewhere

(and similarly for R̂u^N(τ)). The numerator and denominator are thus approximations of the spectra, obtained by taking the Fourier transforms of the estimates R̂yu^N(τ) and R̂u^N(τ) of the exact correlation functions.
Moreover, it can be shown that

Ĝsm(e^{jω}) = ( Σ_{τ=−∞}^{+∞} wγ(τ) R̂yu^N(τ) e^{−jωτ} ) / ( Σ_{τ=−∞}^{+∞} wγ(τ) R̂u^N(τ) e^{−jωτ} )

with wγ(τ) obtained as the inverse Fourier transform of the frequency window Wγ(ω).

[Figure: a typical R̂yu^N(τ) (solid) together with the Hamming lag-windows w10(τ) (dotted), w30(τ) (dashed) and w70(τ) (dash-dotted).]
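The lag-window form can be sketched directly (the numpy Hamming window below stands in for wγ(τ); this is an illustration, not the lecture's exact implementation):

```python
import numpy as np

def smoothed_etfe(u, y, gamma, n_freq=200):
    """Smoothed ETFE as a ratio of lag-windowed spectral estimates."""
    N = len(u)
    taus = np.arange(-gamma, gamma + 1)
    def corr(a, b, tau):                     # (1/N) * sum_t a(t+tau) b(t)
        if tau >= 0:
            return np.dot(a[tau:], b[:N - tau]) / N
        return np.dot(a[:N + tau], b[-tau:]) / N
    r_yu = np.array([corr(y, u, t) for t in taus])
    r_u = np.array([corr(u, u, t) for t in taus])
    w_lag = np.hamming(2 * gamma + 1)        # lag window, maximal at tau = 0
    w = np.linspace(0.05, np.pi, n_freq)
    E = np.exp(-1j * np.outer(w, taus))      # matrix of e^{-j*w*tau}
    return w, (E @ (w_lag * r_yu)) / (E @ (w_lag * r_u))
```

For a simple delayed-gain system with white input, the smoothed estimate hovers around the true gain.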
The lag window wγ(τ) removes, from the estimate Φ̂yu^N(ω) of Φyu(ω), the elements of R̂yu^N(τ) for τ > γ.

This is relevant since Ryu(τ) → 0 for τ → ∞ (G0(z) stable), and since the accuracy of R̂yu^N(τ) becomes smaller and smaller for increasing values of τ (R̂yu^N(τ) is computed with less data points): for large τ, the values R̂yu^N(τ) are small w.r.t. |R̂yu^N(0)| and "less reliable".

[Figure: estimated Ryu(τ) for τ up to 9000; we see the inaccuracy of the estimate: R̂yu^N(τ) does not tend to 0 for τ → ∞.]
Let us focus on the first 500 τ's.

[Figure: estimated Ryu(τ) for τ = 0…500.]

We see that, after τ = 100, R̂yu^N(τ) increases again, which is very unlikely for the true Ryu(τ) =⇒ we select γ = 100.

[Figure: the smoothed ETFE obtained with γ = 100; above, at the frequencies ωk; below, together with the frequency response of G0(e^{jω}) (blue).]

Limitations of the (smoothed) ETFE:

• the ETFE gives a discrete estimate of the frequency response G0(e^{jω}), and not the rational transfer function G0(z)
• it gives no information about the noise spectrum Φv(ω), while this information is important for e.g. disturbance rejection

=⇒ parametric identification (prediction error identification)
[Figure: comparison of the smoothed ETFE with the frequency response of a model obtained by prediction error identification (PEI).]

This comparison shows that PEI delivers much better results, even with ten times less data points.

The ETFE nevertheless remains useful before a parametric identification, e.g. to determine the delay of the system.
SC4110: Part V

The ETFE obtained with these data can be represented up to ωs/2, which can be compared with the system bandwidth ωb as observed in the ETFE. Data with a smaller ωs can be obtained:

• either by re-collecting data with a smaller ωs
• or by decimating the data obtained with the high ωs (+ anti-aliasing filter)

Considering now the normalized frequency ω = ωactual · Ts, we note that the interval [0 ωs/2] (actual frequencies) corresponds to the main interval [0 π] when considering normalized frequencies. Indeed,

ωs/2 = π/Ts (actual frequency) =⇒ normalized ω = (π/Ts) · Ts = π
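Decimation with an anti-aliasing filter is available off-the-shelf; a sketch with illustrative values (scipy's decimate applies the low-pass before down-sampling):

```python
import numpy as np
from scipy import signal

# Decimating by a factor q replaces (Ts, ws) by (q*Ts, ws/q). The signal
# below (made up for illustration) has a 5 Hz component of interest and a
# 270 Hz component that would alias without the anti-aliasing filter.
fs = 1000.0                       # original sampling frequency [Hz]
t = np.arange(0, 2.0, 1 / fs)
x = np.sin(2 * np.pi * 5 * t) + 0.1 * np.sin(2 * np.pi * 270 * t)
q = 10                            # decimation factor -> new fs = 100 Hz
x_dec = signal.decimate(x, q)     # 270 Hz is filtered out, not aliased to 30 Hz
```

Without the filter, the 270 Hz line would fold down to 30 Hz in the decimated data.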
Desired properties of the input signal:

• its spectrum Φu(ω) should be shaped so as to increase the accuracy of the identified model
• the amplitude of the time-domain signal should be bounded/limited in order not to damage the actuators and in order not to excite the nonlinearities

Multisine: shaping Φu(ω) is very easy, but there is no a-priori bound on the amplitude of u(t) !! However, the phase shifts φk can be optimized in order to reduce the maximal amplitude of u(t) without any effect on the power spectrum Φu(ω).
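One classical choice for such a phase optimization is the Schroeder phase φk = −π k(k−1)/K (an assumption here; the slides do not prescribe a particular method). Compared with all-zero phases, it strongly reduces the peak amplitude for exactly the same Φu(ω):

```python
import numpy as np

K, N = 30, 1000
w0 = 2 * np.pi / N
t = np.arange(N)
A = 1 / np.sqrt(K)                 # equal amplitudes, total power 0.5

def multisine(phases):
    return sum(A * np.cos(k * w0 * t + phases[k - 1]) for k in range(1, K + 1))

u_aligned = multisine(np.zeros(K))                       # all phases zero
u_schroeder = multisine(np.array([-np.pi * k * (k - 1) / K
                                  for k in range(1, K + 1)]))
peak_aligned = np.max(np.abs(u_aligned))      # = K*A at t = 0
peak_schroeder = np.max(np.abs(u_schroeder))  # much smaller, same spectrum
```

Both signals have identical amplitude spectra (hence identical Φu(ω)); only the time-domain crest differs.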
The RBS (random binary signal) has the maximal power Pu = Ēu²(t) = c² that can be attained by a signal with |u(t)| ≤ c ∀t.

[Figure: (a) typical RBS with clock period equal to the sampling interval (ν = 1); (b) RBS with increased clock period ν = 2.]

Influence of ν on Φu(ω):

[Figure: spectrum (1/2π) Φu(ω) of a (P)RBS with basic clock period ν = 1 (black), ν = 3 (green), ν = 5 (red), and ν = 10 (blue).]

Compared to the multisine: less flexibility, but bounded amplitude !!
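An RBS with clock period ν can be generated by holding random ±c values for ν samples; a minimal sketch:

```python
import numpy as np

def rbs(N, c=1.0, nu=1, seed=0):
    """Random binary signal of amplitude +/- c, held constant over nu
    consecutive samples (clock period nu)."""
    rng = np.random.default_rng(seed)
    n_clock = -(-N // nu)                        # ceil(N / nu) clock intervals
    levels = c * rng.choice([-1.0, 1.0], size=n_clock)
    return np.repeat(levels, nu)[:N]

u = rbs(1000, c=2.0, nu=5)
Pu = np.mean(u**2)   # = c^2: the maximal power for |u(t)| <= c
```

Increasing ν concentrates the power at low frequencies while keeping the amplitude bound intact.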
4 Data (pre)processing

• anti-aliasing filter
• outliers/spikes
• non-zero mean and drift in disturbances: detrending

Unstable systems can not be identified in open loop. Experiments have to be done with a stabilizing controller C in closed loop:

y(t) = (G0 C)/(1 + G0 C) r(t) + H0/(1 + G0 C) e(t)

From the data {r(t), y(t)}, one can identify the closed-loop transfer function T̂(z) between r and y; an estimate of the open-loop plant is then recovered as

Ĝ(z) = T̂(z) / ( C(z) (1 − T̂(z)) )
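The indirect recovery Ĝ = T̂/(C(1 − T̂)) can be checked on a frequency grid; a noise-free sketch with a made-up plant and controller:

```python
import numpy as np

# Indirect closed-loop identification sketch: given the (estimated)
# closed-loop response T_hat and the known controller C, recover the plant.
w = np.linspace(0.01, np.pi, 200)
z = np.exp(1j * w)

G0 = 0.5 / (1 - 0.9 * z**-1)          # made-up plant (only to check the algebra)
C = 2.0 * np.ones_like(z)             # a simple proportional controller
T_hat = G0 * C / (1 + G0 * C)         # "identified" closed-loop response (noise-free)

G_hat = T_hat / (C * (1 - T_hat))     # indirect recovery of the open-loop plant
```

In the noise-free case the recovery is exact; with an identified T̂(z), the errors on T̂ propagate to Ĝ through the same formula.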