
Nonlinear model validation using novel correlation tests

L.F. Zhang, Q.M. Zhu and A. Longden


Faculty of Computing, Engineering and Mathematical Sciences
University of the West of England, Frenchay Campus
Coldharbour Lane, Bristol, BS16 1QY, UK
lifeng.zhang@uwe.ac.uk, quan.zhu@uwe.ac.uk and ashely.longden@uwe.ac.uk

Abstract - Model validation is the final step of system identification, in which the goodness of the estimated models is checked. In the present study, novel first order correlation tests, named omni-directional auto-correlation functions and omni-directional cross-correlation functions, are proposed to check the validity of a wide class of nonlinear dynamic models. Compared with other approaches based on first order and higher order correlation tests, the new methodology enhances the power of nonlinear correlation detection and provides a more comprehensive and effective solution. The efficiency and effectiveness of the new methodology are demonstrated through simulation studies and comparison with other correlation test based approaches.

Keywords: Model validation, nonlinear dynamic model, nonlinear correlation tests, higher order correlation function, NARMAX model

1 Introduction

Nowadays, nonlinear dynamic modelling is applied in many fields to approximate a wide range of systems. Model validation is the final step of the modelling procedure: it checks the validity of the identified model and determines whether the model is representative of the underlying system.

If the system under analysis is linear, a number of well-established methodologies have been developed for validating the identified model. Some of the most powerful methods are based on the concept that if the model structure is correct and the parameter estimation is unbiased, the residuals should form a random sequence with zero mean and finite variance. The auto-correlation function (ACF) and cross-correlation function (CCF) are therefore widely applied in linear model validation. The studies of [1], [2], [3], [4] show that the ACF of the residuals and the CCF between residuals and inputs should lie within the preset confidence intervals when the identified model is valid and the residual sequence is completely random.

Unfortunately, nonlinear model validation is not as straightforward as linear model validation, and Bohlin’s ACF and CCF tests are inadequate for validating nonlinear models [5]. To validate nonlinear models, several correlation test based approaches have been developed, such as higher order CCF tests [5], [6], [7], [8], [9], a combination of five first and second order ACF and CCF tests [10], higher order ACF and CCF between outputs, inputs and residuals [11], [12], and multi-directional correlation tests [13]. Nevertheless, these approaches do not perform adequately when some special nonlinear effects occur in the residuals, since both first and higher order correlation functions fail to detect some special nonlinear associations. This means that the correlation functions may, under certain conditions, still fall inside the 95% confidence intervals even if predictable components remain in the residuals. In addition, higher order correlation functions can sometimes exhibit less power when the variances of noise and input are small, because the fourth and higher moments become small [10].

To overcome these problems, novel correlation tests, namely omni-directional auto-correlation functions (ODACFs), omni-directional cross-correlation functions (ODCCFs), combined ODCCF and combined ODACF, are proposed in the present study to validate nonlinear models effectively and comprehensively. Several simulation studies are employed to illustrate the performance of the new methods.

2 Correlation functions and problem formulation

Consider a generalised single input single output (SISO) parametric model

$$y(n) = \hat{f}(y^{n-1}, u^{n-1}, \hat{\varepsilon}^{n-1}) + \hat{\varepsilon}(n) \tag{2.1}$$

where n (n = 1, 2, …, N) is a time index and

$$\left.\begin{aligned}
y^{n-1} &= [y(n-1), \ldots, y(n-r)] \\
u^{n-1} &= [u(n-1), \ldots, u(n-r)] \\
\hat{\varepsilon}^{n-1} &= [\hat{\varepsilon}(n-1), \ldots, \hat{\varepsilon}(n-r)]
\end{aligned}\right\} \tag{2.2}$$

are the output, input and estimated residual vectors respectively, with delayed elements from 1 to r. $\hat{f}(\cdot)$ is the estimated linear or nonlinear function.

If the identified model is valid, the residuals $\hat{\varepsilon}(n)$ should be reduced to an uncorrelated sequence denoted by
e(n) with zero mean and finite variance. If the identified model is inadequate, the residuals should correlate with the delayed outputs, inputs and noises to some extent [1]. Therefore, the residuals of an invalid model can be characterised by the general form given in equation (2.3),

$$\hat{\varepsilon}(n) = g(y^{n-1}, u^{n-1}, e^{n-1}) + e(n) \tag{2.3}$$

where $g(\cdot)$ is a linear or nonlinear function.

Linear case: If $\hat{f}(\cdot)$ is a linear function, $g(\cdot)$ should also be a linear function of delayed noise, inputs and outputs when the identified model is inadequate. Correlation tests, ACF and CCF, can be used to check whether $\hat{\varepsilon}(n) = e(n)$, since they are measures of linear association between data sequences. Bohlin’s studies show that all the correlation functions are within the preset confidence intervals when $\hat{\varepsilon}(n) = e(n)$ [1]. The correlation tests, the estimated normalised residual ACF ($r_{\hat{\varepsilon}\hat{\varepsilon}}(\tau)$) and the CCF ($r_{u\hat{\varepsilon}}(\tau)$) between the inputs and residuals, are formulated as

$$r_{\hat{\varepsilon}\hat{\varepsilon}}(\tau) = \frac{\sum_{n=\tau+1}^{N} (\hat{\varepsilon}(n) - \bar{\hat{\varepsilon}})(\hat{\varepsilon}(n-\tau) - \bar{\hat{\varepsilon}})}{\sum_{n=1}^{N} (\hat{\varepsilon}(n) - \bar{\hat{\varepsilon}})^2} \tag{2.4}$$

$$r_{\hat{\varepsilon}\hat{\varepsilon}}(\tau) = \begin{cases} 1, & \tau = 0 \\ 0, & \text{otherwise} \end{cases}$$

$$r_{u\hat{\varepsilon}}(\tau) = \frac{\sum_{n=\tau+1}^{N} (\hat{\varepsilon}(n) - \bar{\hat{\varepsilon}})(u(n-\tau) - \bar{u})}{\left[\left(\sum_{n=1}^{N} (\hat{\varepsilon}(n) - \bar{\hat{\varepsilon}})^2\right)\left(\sum_{n=1}^{N} (u(n) - \bar{u})^2\right)\right]^{1/2}} \tag{2.5}$$

$$r_{u\hat{\varepsilon}}(\tau) = 0, \quad \forall \tau$$

where the overbar denotes the time average operation. For large N, the correlation function estimates given in (2.4) and (2.5) are asymptotically normal with zero mean and finite variance by the central limit theorem [14].

Nonlinear case: When $\hat{f}(\cdot)$ is a nonlinear function, a typical parametric expression is the polynomial NARMAX (Nonlinear AutoRegressive Moving Average with eXogenous input) model [15]. It is formulated as follows.

$$y(n) = \hat{f}(y^{n-1}, u^{n-1}, \hat{\varepsilon}^{n-1}) + \hat{\varepsilon}(n) = \sum_{i=1}^{r} \alpha_i p_i(n) + \hat{\varepsilon}(n) \tag{2.6}$$

where $p_i(n)$ denotes nonlinear terms such as $p_1(n) = y(n-1)u(n-1)$, $p_2(n) = u^2(n)\hat{\varepsilon}(n-1)$, etc.

For nonlinear models, the validity tests are not as straightforward as in the linear case, since $g(\cdot)$ may be a nonlinear function and nonlinear terms may exist in the residuals. The simple ACF and CCF are no longer sufficient [5]. Several approaches based on higher order correlation functions have been developed to validate nonlinear models. Full details of these tests can be found in the studies of Billings and Voon [5], [10], Billings and Zhu [8], [11], [12], and Mao and Billings [13]; only a brief description is given here.

The higher order ACF ($r_{(\hat{\varepsilon}^2)'(\hat{\varepsilon}^2)'}(\tau)$) and higher order CCF ($r_{(u^2)'(\hat{\varepsilon}^2)'}(\tau)$) are expressed as follows

$$r_{(\hat{\varepsilon}^2)'(\hat{\varepsilon}^2)'}(\tau) = \frac{\sum_{n=\tau+1}^{N} (\hat{\varepsilon}^2(n))'(\hat{\varepsilon}^2(n-\tau))'}{\sum_{n=1}^{N} \left((\hat{\varepsilon}^2(n))'\right)^2} \tag{2.7}$$

$$r_{(u^2)'(\hat{\varepsilon}^2)'}(\tau) = \frac{\sum_{n=\tau+1}^{N} (\hat{\varepsilon}^2(n))'(u^2(n-\tau))'}{\left[\left(\sum_{n=1}^{N} \left((\hat{\varepsilon}^2(n))'\right)^2\right)\left(\sum_{n=1}^{N} \left((u^2(n))'\right)^2\right)\right]^{1/2}} \tag{2.8}$$

where the prime ’ in equations (2.7) and (2.8) denotes that the mean level has been removed from the corresponding data sequence:

$$\left.\begin{aligned}
(\hat{\varepsilon}^2(n))' &= \hat{\varepsilon}^2(n) - \frac{1}{N}\sum_{n=1}^{N} \hat{\varepsilon}^2(n) \\
(u^2(n))' &= u^2(n) - \frac{1}{N}\sum_{n=1}^{N} u^2(n)
\end{aligned}\right\} \tag{2.9}$$

Since higher order correlation functions can be regarded as measures of the correlations between squared data sequences, they can detect nonlinear dependences between the amplitudes of the variables. In nonlinear dynamic models, however, variables are always correlated with others in very complicated relationships, which possibly relate not only to the amplitudes but also to the signs (positive or negative) of each variable. The model validity tests based on higher order correlation functions therefore cannot cover all complex nonlinear models and fail to detect some special nonlinear terms in residuals.

Two examples were selected to illustrate the capabilities and the inadequacies of first and higher order ACFs and CCFs. Consider two residual sequences expressed as follows.
$$\hat{\varepsilon}_1(n) = -5e(n-2)e(n-4) + 0.3u^2(n-1) + e(n) \tag{2.10}$$

$$\hat{\varepsilon}_2(n) = -2(e(n-2) + 0.8)e(n-4) + e(n) \tag{2.11}$$

where {u(n)} was a uniformly distributed random input sequence with zero mean and amplitude from -1 to 1, and {e(n)} was a normally distributed random noise sequence with zero mean and variance of 0.2. All these data sequences had a length of 1000.

Figures 1, 2, 3 and 4 show the results obtained from using the first and higher order correlation functions to test the two residuals respectively.

Figure 1: CCF and ACF tests of $\hat{\varepsilon}_1(n)$

Figure 2: Higher order CCF and higher order ACF tests of residual $\hat{\varepsilon}_1(n)$

Figure 3: CCF and ACF tests of $\hat{\varepsilon}_2(n)$

Figure 4: Higher order CCF and higher order ACF tests of $\hat{\varepsilon}_2(n)$

ACF and CCF tests, as illustrated in figure 1, fail to detect all the nonlinear terms in $\hat{\varepsilon}_1(n)$. Figure 2 shows that $r_{(\hat{\varepsilon}^2)'(\hat{\varepsilon}^2)'}(\tau)$ is outside the confidence interval at τ=2 and τ=4, which indicates that higher order correlation functions can be used to detect nonlinear terms like $\hat{e}(n-2)\hat{e}(n-4)$. Both first and higher order correlation functions, however, fail to detect the nonlinear term $u^2(n-1)$ in $\hat{\varepsilon}_1(n)$. For residual $\hat{\varepsilon}_2(n)$, figure 4 clearly suggests that higher order ACF and CCF fail to detect the term $(\hat{e}(n-2) + 0.8)\hat{e}(n-4)$ at all. Figure 3 suggests that first order correlation tests can detect the nonlinear association between $\hat{\varepsilon}_2(n)$ and $\hat{e}(n-4)$ but fail to detect the nonlinear association between $\hat{\varepsilon}_2(n)$ and $\hat{e}(n-2)$.

In conclusion, ACF, CCF, higher order ACF and higher order CCF are inadequate to detect all the complex nonlinear terms in residuals. Another disadvantage of the higher order correlation functions is that they can sometimes exhibit less power when the variances of noise and input are small [10]. To overcome these problems, new correlation functions and the corresponding model validity tests are proposed in the following sections.

3 New correlation tests based model validation

Nonlinear associations are not as simple as linear associations, which can be described simply as one variable increasing or decreasing as another variable increases or decreases. To investigate the nonlinear terms in residuals comprehensively and systematically, the nonlinear associations are classified into four types in this study. The four types, which cover all the possible nonlinear effects concerned with both the amplitude and the sign of each variable, are defined as follows.

Type 1: Amplitude of dependent variable varies with the amplitude variation of independent variable.

Type 2: Both sign and amplitude of dependent variable vary with the amplitude variation of independent variable.

Type 3: Both sign and amplitude of dependent variable vary with the variations of both sign and amplitude of independent variable.

Type 4: Amplitude of dependent variable varies with the variations of both sign and amplitude of independent variable.

In addition, linear association is a special case of Type 3 nonlinear association.

In order to detect the linear and nonlinear terms in residuals effectively, new first order correlation functions named omni-directional auto-correlation functions (ODACFs) and omni-directional cross-correlation functions (ODCCFs) are proposed in the present study. ODACFs and ODCCFs avoid the drawbacks of higher order correlation functions and detect the four types of nonlinear associations separately. They are formulated as
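ODACFs:

$$r_{\alpha\alpha}(\tau) = \frac{\sum_{n=\tau+1}^{N} (\alpha(n) - \bar{\alpha})(\alpha(n-\tau) - \bar{\alpha})}{\sum_{n=1}^{N} (\alpha(n) - \bar{\alpha})^2} \tag{3.1}$$

$$r_{\alpha\hat{\varepsilon}'}(\tau) = \frac{\sum_{n=\tau+1}^{N} \hat{\varepsilon}'(n)(\alpha(n-\tau) - \bar{\alpha})}{\left[\left(\sum_{n=1}^{N} (\hat{\varepsilon}'(n))^2\right)\left(\sum_{n=1}^{N} (\alpha(n) - \bar{\alpha})^2\right)\right]^{1/2}} \tag{3.2}$$

$$r_{\hat{\varepsilon}'\hat{\varepsilon}'}(\tau) = \frac{\sum_{n=\tau+1}^{N} \hat{\varepsilon}'(n)\hat{\varepsilon}'(n-\tau)}{\sum_{n=1}^{N} (\hat{\varepsilon}'(n))^2} \tag{3.3}$$

$$r_{\hat{\varepsilon}'\alpha}(\tau) = \frac{\sum_{n=\tau+1}^{N} (\alpha(n) - \bar{\alpha})\hat{\varepsilon}'(n-\tau)}{\left[\left(\sum_{n=1}^{N} (\alpha(n) - \bar{\alpha})^2\right)\left(\sum_{n=1}^{N} (\hat{\varepsilon}'(n))^2\right)\right]^{1/2}} \tag{3.4}$$

ODCCFs:

$$r_{\alpha\beta}(\tau) = \frac{\sum_{n=\tau+1}^{N} (\beta(n) - \bar{\beta})(\alpha(n-\tau) - \bar{\alpha})}{\left[\left(\sum_{n=1}^{N} (\beta(n) - \bar{\beta})^2\right)\left(\sum_{n=1}^{N} (\alpha(n) - \bar{\alpha})^2\right)\right]^{1/2}} \tag{3.5}$$

$$r_{\alpha\hat{\varepsilon}'}(\tau) = \frac{\sum_{n=\tau+1}^{N} \hat{\varepsilon}'(n)(\alpha(n-\tau) - \bar{\alpha})}{\left[\left(\sum_{n=1}^{N} (\hat{\varepsilon}'(n))^2\right)\left(\sum_{n=1}^{N} (\alpha(n) - \bar{\alpha})^2\right)\right]^{1/2}} \tag{3.6}$$

$$r_{\hat{\varepsilon}'u'}(\tau) = \frac{\sum_{n=\tau+1}^{N} \hat{\varepsilon}'(n)u'(n-\tau)}{\left[\left(\sum_{n=1}^{N} (\hat{\varepsilon}'(n))^2\right)\left(\sum_{n=1}^{N} (u'(n))^2\right)\right]^{1/2}} \tag{3.7}$$

$$r_{u'\beta}(\tau) = \frac{\sum_{n=\tau+1}^{N} (\beta(n) - \bar{\beta})u'(n-\tau)}{\left[\left(\sum_{n=1}^{N} (\beta(n) - \bar{\beta})^2\right)\left(\sum_{n=1}^{N} (u'(n))^2\right)\right]^{1/2}} \tag{3.8}$$

where

$$\left.\begin{aligned}
\alpha(n) &= |\hat{\varepsilon}'(n)| \\
\beta(n) &= |u'(n)|
\end{aligned}\right\} \tag{3.9}$$

Each of the ODACF and ODCCF sets includes four different first order correlation functions, which correspond to the four types of nonlinear associations. Hence, whatever nonlinear term exists in the residuals, there should be one or more functions that can properly detect it.

To illustrate the detected correlations better and to reduce the number of correlation plots, the results obtained from ODACFs and ODCCFs are combined into two new correlation quantities named combined ODACF ($\rho_{\hat{\varepsilon}\hat{\varepsilon}}(\tau)$) and combined ODCCF ($\rho_{u\hat{\varepsilon}}(\tau)$). They are derived as follows.

Combined ODACF:

If $|\max(r_{\alpha\alpha}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\alpha}(\tau))| > |\min(r_{\alpha\alpha}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\alpha}(\tau))|$, then

$$\rho_{\hat{\varepsilon}\hat{\varepsilon}}(\tau) = \max(r_{\alpha\alpha}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\alpha}(\tau)) \tag{3.10}$$

If $|\max(r_{\alpha\alpha}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\alpha}(\tau))| \le |\min(r_{\alpha\alpha}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\alpha}(\tau))|$, then

$$\rho_{\hat{\varepsilon}\hat{\varepsilon}}(\tau) = \min(r_{\alpha\alpha}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\hat{\varepsilon}'}(\tau), r_{\hat{\varepsilon}'\alpha}(\tau)) \tag{3.11}$$

$$\rho_{\hat{\varepsilon}\hat{\varepsilon}}(\tau) = \begin{cases} 1, & \tau = 0 \\ 0, & \text{otherwise} \end{cases}$$

Combined ODCCF:

If $|\max(r_{\alpha\beta}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{u'\hat{\varepsilon}'}(\tau), r_{u'\beta}(\tau))| > |\min(r_{\alpha\beta}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{u'\hat{\varepsilon}'}(\tau), r_{u'\beta}(\tau))|$, then

$$\rho_{u\hat{\varepsilon}}(\tau) = \max(r_{\alpha\beta}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{u'\hat{\varepsilon}'}(\tau), r_{u'\beta}(\tau)) \tag{3.12}$$

If $|\max(r_{\alpha\beta}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{u'\hat{\varepsilon}'}(\tau), r_{u'\beta}(\tau))| \le |\min(r_{\alpha\beta}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{u'\hat{\varepsilon}'}(\tau), r_{u'\beta}(\tau))|$, then

$$\rho_{u\hat{\varepsilon}}(\tau) = \min(r_{\alpha\beta}(\tau), r_{\alpha\hat{\varepsilon}'}(\tau), r_{u'\hat{\varepsilon}'}(\tau), r_{u'\beta}(\tau)) \tag{3.13}$$

$$\rho_{u\hat{\varepsilon}}(\tau) = 0, \quad \forall \tau$$

Consider the residuals given in equations (2.10) and (2.11) once again. Figure 5 and figure 6 show the results obtained from using the combined ODCCF and combined ODACF tests for residuals $\hat{\varepsilon}_1(n)$ and $\hat{\varepsilon}_2(n)$ respectively: all the nonlinearly correlated delayed inputs and residuals are clearly identified.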
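As a rough illustration of how the combined tests of equations (3.10) to (3.13) might be computed, the following Python sketch uses NumPy. It is not the authors' implementation. It assumes that α(n) and β(n) are the absolute values of the mean-removed residual and input, that each ODCCF pairs a delayed input sequence (u' or |u'|) with a current residual sequence (ε̂' or |ε̂'|), and that the combined value at each lag is the candidate with the largest magnitude. The function names `corr`, `combine`, `combined_odacf` and `combined_odccf` are hypothetical, and the test signal is a deterministic toy residual ε̂(n) = u²(n-1), not one of the residuals (2.10) or (2.11).

```python
import numpy as np


def corr(x, y, max_lag):
    """Normalised first order correlation between x(n - tau) and y(n).

    Both sequences are mean-removed first; tau runs from 0 to max_lag.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    denom = np.sqrt(np.sum(x * x) * np.sum(y * y))
    n = len(x)
    return np.array([np.sum(x[:n - tau] * y[tau:]) / denom
                     for tau in range(max_lag + 1)])


def combine(functions):
    """At each lag keep the max if |max| > |min|, otherwise the min."""
    stacked = np.vstack(functions)
    hi = stacked.max(axis=0)
    lo = stacked.min(axis=0)
    return np.where(np.abs(hi) > np.abs(lo), hi, lo)


def combined_odacf(res, max_lag):
    """Combined ODACF of a residual sequence (assumed reading of (3.1)-(3.4))."""
    e = np.asarray(res, dtype=float)
    e = e - e.mean()          # mean-removed residual
    a = np.abs(e)             # assumed alpha(n) = |residual'|
    return combine([corr(a, a, max_lag), corr(a, e, max_lag),
                    corr(e, e, max_lag), corr(e, a, max_lag)])


def combined_odccf(u, res, max_lag):
    """Combined ODCCF between input and residual (assumed pairing, see lead-in)."""
    e = np.asarray(res, dtype=float)
    e = e - e.mean()
    v = np.asarray(u, dtype=float)
    v = v - v.mean()          # mean-removed input
    a = np.abs(e)             # assumed alpha(n)
    b = np.abs(v)             # assumed beta(n)
    return combine([corr(b, a, max_lag), corr(b, e, max_lag),
                    corr(v, e, max_lag), corr(v, a, max_lag)])


# Toy check: the residual depends only on the amplitude of the delayed input.
u_toy = np.tile([1.0, -1.0, 2.0, -2.0, 3.0, -3.0], 50)
res_toy = np.concatenate(([0.0], u_toy[:-1] ** 2))
band = 1.96 / np.sqrt(len(u_toy))   # usual 95% band for a white sequence

print("plain CCF at lag 1:     ", corr(u_toy, res_toy, 3)[1])
print("combined ODCCF at lag 1:", combined_odccf(u_toy, res_toy, 3)[1])
print("95% band:               ", band)
```

For this toy signal the ordinary CCF at τ = 1 stays well inside the 95% band, whereas the combined ODCCF does not, because |u'| tracks the amplitude dependence that u' averages out.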
Figure 5. Combined ODCCF and combined ODACF tests of $\hat{\varepsilon}_1(n)$

Figure 6. Combined ODCCF and combined ODACF tests of $\hat{\varepsilon}_2(n)$

4 Simulation studies

The capability of the new correlation tests for validating an identified model with an incorrect structure and unbiased parameter estimation was demonstrated through a simulated example. Consider the discrete nonlinear model (model 1) given as (4.1)

$$\left.\begin{aligned}
z(n) &= \sin(u(n-1)\pi) + \cos(u(n-2)\pi) \\
y(n) &= z(n) + e(n)
\end{aligned}\right\} \tag{4.1}$$

where {z(n)} and {y(n)} denote the noise free output and measured output sequences respectively. The input sequence {u(n)} was a uniformly distributed random data sequence with zero mean and amplitude from -1 to 1. {e(n)} was a normally distributed random noise sequence with zero mean and variance of 0.02. All these data sequences had a length of 1000. Figure 7 shows the map of z(n) versus u(n-1) and u(n-2).

Figure 7. Map of z(n) versus u(n-1) and u(n-2)

In the present study, nonlinear polynomial models with different degrees of nonlinearity were employed to fit the data. Initially, the degree of nonlinearity was set to 4, and the maximum lag of the input was set to 2. Estimation produced model 2, given as

$$\left.\begin{aligned}
\hat{z}_1(n) ={}& 0.9593 + 2.6881u(n-1) + 0.0092u(n-2) \\
& + 0.0398u^2(n-1) - 4.4329u^2(n-2) \\
& - 2.8771u^3(n-1) - 0.0116u^3(n-2) \\
& - 0.0527u^4(n-1) + 2.5049u^4(n-2) \\
\hat{\varepsilon}_1(n) ={}& y(n) - \hat{z}_1(n)
\end{aligned}\right\} \tag{4.2}$$

where $\hat{z}_1(n)$ and $\hat{\varepsilon}_1(n)$ denote the predicted output sequence and the residual sequence respectively. Figure 8 shows the map of $\hat{z}_1(n)$ versus u(n-1) and u(n-2), which indicates that the estimated model (4.2) does not capture the function underlying the data at all.

Figure 8. Map of $\hat{z}_1(n)$ versus u(n-1) and u(n-2)

Figure 9 shows the results obtained from using combined ODCCF and combined ODACF to test the residual $\hat{\varepsilon}_1$: $\rho_{u\hat{\varepsilon}}(\tau)$ is significantly outside the confidence interval at τ=2. Therefore, the residuals are correlated with the delayed inputs and the estimated model (model 2) is invalid.

Figure 9. Combined ODCCF and combined ODACF tests for model 2

To improve the validity, the degree of nonlinearity of the polynomial model was increased to 6. Estimation produced model 3, given as

$$\left.\begin{aligned}
\hat{z}_2(n) ={}& 0.9882 + 3.1012u(n-1) + 0.0082u(n-2) \\
& + 0.0217u^2(n-1) - 4.893u^2(n-2) \\
& - 4.8035u^3(n-1) - 0.0316u^3(n-2) \\
& - 0.0524u^4(n-1) + 3.8209u^4(n-2) \\
& + 1.7155u^5(n-1) + 0.0273u^5(n-2) \\
& + 0.0329u^6(n-1) - 0.9277u^6(n-2) \\
\hat{\varepsilon}_2(n) ={}& y(n) - \hat{z}_2(n)
\end{aligned}\right\} \tag{4.3}$$
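The fitting step of this simulation can be approximated with a short least squares experiment. The sketch below is illustrative only: the paper does not state which estimator or random seed was used, so ordinary least squares via `numpy.linalg.lstsq`, the seed, and the helper name `fit_polynomial` are assumptions. The regressors are the pure powers of u(n-1) and u(n-2) that appear in (4.2) and (4.3). Coefficients will differ somewhat from the published ones because the noise realisation differs.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed (assumption)
N = 1000
u = rng.uniform(-1.0, 1.0, N)             # input, amplitude from -1 to 1
e = rng.normal(0.0, np.sqrt(0.02), N)     # noise, variance 0.02

# Model 1, equation (4.1), evaluated for the samples that have both lags.
u1 = u[1:-1]                              # u(n-1)
u2 = u[:-2]                               # u(n-2)
z = np.sin(np.pi * u1) + np.cos(np.pi * u2)
y = z + e[2:]


def fit_polynomial(degree):
    """Least squares fit of y on 1, u(n-1)^k, u(n-2)^k for k = 1..degree."""
    columns = [np.ones_like(u1)]
    for k in range(1, degree + 1):
        columns.append(u1 ** k)
        columns.append(u2 ** k)
    X = np.column_stack(columns)
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ theta
    return theta, residual


theta4, res4 = fit_polynomial(4)          # structure of model 2
theta6, res6 = fit_polynomial(6)          # structure of model 3
mse4 = np.mean(res4 ** 2)
mse6 = np.mean(res6 ** 2)

print("degree 4 residual mean square:", mse4)
print("degree 6 residual mean square:", mse6)
print("degree 6 coefficients:", np.round(theta6, 4))
```

Because the degree 4 regressors are a subset of the degree 6 regressors, the degree 6 fit cannot have a larger residual sum of squares. Whether the remaining residual is acceptable is what the correlation tests are meant to decide.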
Figure 10 shows the map of $\hat{z}_2(n)$ versus u(n-1) and u(n-2), which indicates that model 3 approximates model 1 adequately.

Figure 10. Map of $\hat{z}_2(n)$ versus u(n-1) and u(n-2)

Figure 11 shows the correlation test results for $\hat{\varepsilon}_2$: all the correlation values lie within the 95% confidence intervals. The estimated model is valid.

Figure 11. Combined ODCCF and combined ODACF tests for model 3

5 Conclusions

New first order correlation tests and the corresponding nonlinear model validity tests are developed in the present study. The new correlation tests enhance the power of nonlinear association detection compared with the other approaches, which are all based on higher order correlation functions. Simulation studies illustrate that the new model validity tests appear to provide improved discriminatory performance compared with the other correlation test based model validity tests.

References

[1] T. Bohlin, “On the problem of ambiguities in maximum likelihood identification”, Automatica, Vol 7, pp. 199-200, 1971.

[2] T. Bohlin, “Maximum power validation of models without higher order fitting”, Automatica, Vol 14, pp. 173-146, 1978.

[3] G. E. P. Box and G. M. Jenkins, Time Series Analysis Forecasting and Control, San Francisco: Holden-day, 1976.

[4] T. Soderstrom and P. Stoica, “On covariance function tests used in system identification”, Automatica, Vol 26, pp. 125-133, 1990.

[5] S. A. Billings and W. S. F. Voon, “Structure detection and model validity tests in the identification of nonlinear system”, Proceeding of the institution of electronic engineers, Pt D, Vol 130, pp. 193-199, 1983.

[6] L. A. Aguirre, “A nonlinear correlation function for selecting the delay time in dynamical reconstructions”, Physics Letters A, Vol 203, pp. 88-94, 1995.

[7] L. A. Aguirre, “On the structure of nonlinear polynomial models: higher order correlation functions, spectra, and term clusters”, Circuits and Systems I: Fundamental Theory and Applications, Vol 44, pp. 450-453, 1997.

[8] Q. M. Zhu and S. A. Billings, “Properties of higher order correlation function tests for nonlinear model validation”, Proceedings of the 23rd International Conference on Industrial Electronics, Control and Instrumentation, Vol 1, pp. 306-310, 1997.

[9] H. Hjalmarsson, “Asymptotic correction tests in model validation”, Proceedings of the 32nd Conference on Decision and Control, San Antonio, Texas, USA, pp. 2058-2059, 1993.

[10] S. A. Billings and W. S. F. Voon, “Correlation based model validity tests for nonlinear models”, International Journal of Control, Vol 44, pp. 235-244, 1986.

[11] S. A. Billings and Q. M. Zhu, “Nonlinear model validation using correlation tests”, International Journal of Control, Vol 60, pp. 1107-1120, 1994.

[12] S. A. Billings and Q. M. Zhu, “Model validation tests for multivariable nonlinear models including neural networks”, International Journal of Control, Vol 62, pp. 749-766, 1995.

[13] K. Z. Mao and S. A. Billings, “Multi-directional model validity tests for non-linear system identification”, International Journal of Control, Vol 73, pp. 132-143, 2000.

[14] A. H. Bowker and G. J. Lieberman, Engineering Statistics, Prentice Hall, 1972.

[15] I. J. Leontaritis and S. A. Billings, “Input-output parametric models for nonlinear systems, Part I and Part II”, International Journal of Control, Vol 41, pp. 303-328, pp. 329-344, 1985.
