
2019-04

Modeling and Identification of Noisy Dynamic Systems

Hugo Hernandez
ForsChem Research, 050030 Medellin, Colombia
hugo.hernandez@forschem.org

doi: 10.13140/RG.2.2.12571.72489

Abstract

Noise is usually present in dynamic data, either as a result of measurement errors or as the
combined effect of many different unknown factors. It is possible to model dynamic systems
using regression, where the residual error is minimized, assuming that all the noise is
additive. Other approaches involve filtering the noise (i.e. smoothing the data) in order to
improve the fit of the model. However, sometimes the dynamic behavior is the result of
purely random processes, as is the case for Markovian processes (e.g. random walks, Brownian
motion, etc.). These systems present an apparently deterministic behavior which is actually
caused by randomness. The first challenge addressed in this work is the correct discrimination
of true from apparent determinism, using only the dynamic data available. A second challenge is
improving the model of the system (i.e. increasing its goodness-of-fit) by combining both
deterministic and random effects into a single randomistic model. A general method is
proposed for analyzing any set of dynamic data, identifying the critical derivative
corresponding to the state of the dynamic system where determinism and randomness are
both significant. Higher-order derivatives will present a predominantly random behavior,
whereas lower-order derivatives (or integrals) will be predominantly deterministic as long as
the integral of the critical derivative is not a pure Markovian random variable. A test for
Markovian behavior is also proposed for identifying true determinism. Six case studies are
presented in order to exemplify the proposed method. In the first three cases, the data is
obtained from dynamic Monte Carlo simulations. In the last three cases, data is taken from
real-life examples: global temperature change, USD/EUR exchange rate, and body-weight loss.

Keywords

Critical Derivatives, Dynamic Monte Carlo simulation, Dynamic Systems, Markovian behavior,
Minimum Variance, Modeling, Noise, Parameter Identification, Randomistics, Regression.

01/04/2019 ForsChem Research Reports 2019-04 (1 / 36)


www.forschem.org

1. Introduction

Noise is the result of simultaneous random variations in different factors influencing a certain
measured variable. In principle, noise cannot be avoided although it can be significantly
reduced either by carefully controlling as many external factors as possible (reducing their
variation), by averaging replicated measurements, or by filtering the data using suitable
mathematical models. However, certain factors might be difficult to control or they are just
unknown significant factors; replicating measurements might be difficult, impractical or
expensive; noise models used for filtering might be inadequate. Furthermore, most
mathematical filtering methods result in loss of useful information about the system. Thus, the
mathematical identification of the behavior of a system is usually affected by noise.

Particularly in this report, the modeling and identification of noisy dynamic systems will be
discussed, although the results can be generalized to other types of systems (e.g. by replacing
time with any other independent variable such as position). For this purpose, the most general
case of randomistic dynamic variables will be considered.[1] The term randomistic basically
describes any variable in general, whether it is random, deterministic or both.[2] Thus, any noisy
variable $X(t)$ can be represented by the sum of one deterministic and one random component:

$$X(t) = \mu_X(t) + \sigma_X(t)\,\tilde{\varepsilon}(t) \tag{1.1}$$

where $\mu_X(t)$ represents a deterministic dynamic average value of $X$, $\sigma_X(t)$ represents a
deterministic dynamic standard deviation of $X$, and $\tilde{\varepsilon}$ is a type I standard dynamic random
variable (zero mean and variance one) with a dynamic probability density function.[3]

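The derivative of the randomistic dynamic variable with respect to time will also be
randomistic:

$$\frac{dX(t)}{dt} = \frac{d\left(\mu_X(t) + \sigma_X(t)\,\tilde{\varepsilon}(t)\right)}{dt} = \frac{d\mu_X(t)}{dt} + \frac{d\sigma_X(t)}{dt}\,\tilde{\varepsilon}(t) + \sigma_X(t)\,\frac{d\tilde{\varepsilon}(t)}{dt} \tag{1.2}$$

In this expression, the terms $d\mu_X(t)/dt$ and $d\sigma_X(t)/dt$ are deterministic, whereas the derivative of the
standard random variable can be expressed as:

$$\frac{d\tilde{\varepsilon}(t)}{dt} = \frac{\sqrt{2}}{dt}\,\tilde{\varepsilon}'(t) \tag{1.3}$$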



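where $\sqrt{2}/dt$ is the standard deviation of the derivative,[4] and $\tilde{\varepsilon}'$ is another standard random
variable with probability density function given by:[5]

$$\rho_{\tilde{\varepsilon}'}(\tilde{\varepsilon}') = \sqrt{2}\int_{\tilde{\varepsilon}(t)=-\infty}^{\tilde{\varepsilon}(t)=\infty} \rho_{\tilde{\varepsilon}}\left(\tilde{\varepsilon}(t)\right)\,\rho_{\tilde{\varepsilon}}\left(\tilde{\varepsilon}(t) + \sqrt{2}\,\tilde{\varepsilon}'\right)\,d\tilde{\varepsilon}(t) \tag{1.4}$$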

Interestingly, the derivatives of normal random variables are again normal. For any other
distribution, the derivative follows a different distribution. The distribution of the
derivative of any standard random variable with respect to time will always be symmetrical.
Therefore, since negative and positive values of a symmetrical distribution are equally
probable, the $n$th derivative of the standard random variable will eventually become a large sum
of random variables from the same distribution (for large values of $n$), and thus it will tend to
be a normal random variable according to the central limit theorem.[6] Afterwards, all
following derivatives will remain normal.

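On the other hand, the integral over time (starting at time $t_0$) of the dynamic variable is:[7]

$${}^{\prime}\!X(t) = \int_{t_0}^{t}\left(\mu_X(t) + \sigma_X(t)\,\tilde{\varepsilon}(t)\right)dt = \int_{t_0}^{t}\mu_X(t)\,dt + \int_{t_0}^{t}\sigma_X(t)\,\tilde{\varepsilon}(t)\,dt = \int_{t_0}^{t}\mu_X(t)\,dt + \langle\tilde{\varepsilon}\rangle\int_{t_0}^{t}\sigma_X(t)\,dt \tag{1.5}$$

where ${}^{\prime}\!X(t)$ represents the first integral over time of $X$, and $\langle\tilde{\varepsilon}\rangle$ is the average value of the
standard random variable $\tilde{\varepsilon}$ in the time range from a certain initial time $t_0$ to $t$. Again, the
integral is also randomistic as it combines deterministic and random terms. As long as the
number of realizations during that time interval is large, the average value is expected to be
exactly zero. In practice, the sample of values of $\tilde{\varepsilon}$ in the interval may result in $\langle\tilde{\varepsilon}\rangle \neq 0$. For
this case, if $\int \mu_X(t)\,dt \ll \int \sigma_X(t)\,dt$, then the dynamic behavior of the integral will be
randomly determined. This is the case, for example, of Markovian random variables.[1,4]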

Let us consider a typical random walk behavior described by the following expression:

$$\frac{dx(t)}{dt} = \tilde{\varepsilon}_N \tag{1.6}$$

where $x(t)$ represents the position of an object at time $t$, and $\tilde{\varepsilon}_N$ is a fixed standard
normal random variable. The term fixed indicates that the shape of the distribution will not
change over time.[1]

The position of the object is obtained by integrating Eq. (1.6). One possible result of the
integration, performed by dynamic Monte Carlo simulation, is presented in Figure 1. Fifty
different possible integration results are presented in Figure 2. Clearly, the behavior of a single
integration is not representative of the behavior of the system. However, if only one
integration result is available, how can we discriminate between the true dynamic behavior of the
system and a mere mathematical artifact caused by randomness? Figure 3 shows a regression
model describing the behavior of the integration path obtained in Figure 1.

Figure 1. Random walk behavior of the displacement of an object over time with respect to its
initial position. Only one integration result is shown. Numerical integration step: 0.01 a.u.

While not perfect, the polynomial regression model obtained fits very well the data ( ,
standard deviation of model error: ). The observed behavior seems deterministic
although contaminated with some small noise. However, the truth is that it was obtained from
a pure random process.
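The spread between independent integrations can be illustrated with a minimal Python sketch (not the author's R code). It assumes an Euler update x(t + dt) = x(t) + eps*dt with eps drawn from a standard normal distribution, a step of 0.01 and 250 steps, as suggested by the captions of Figures 1 and 2. The function name `random_walk` and the seed are hypothetical choices made only for this illustration.

```python
import math
import random
import statistics

def random_walk(n_steps, dt, rng):
    """Euler integration of dx/dt = eps, with eps a standard normal number."""
    x = [0.0]  # displacement relative to the initial position
    for _ in range(n_steps):
        x.append(x[-1] + rng.gauss(0.0, 1.0) * dt)
    return x

rng = random.Random(2019)            # arbitrary seed, for reproducibility only
dt, n_steps, n_paths = 0.01, 250, 50
paths = [random_walk(n_steps, dt, rng) for _ in range(n_paths)]

# Spread of the final displacement across the 50 independent integrations.
final = [p[-1] for p in paths]
spread = statistics.stdev(final)
# A sum of n_steps independent increments has standard deviation dt*sqrt(n_steps).
expected = dt * math.sqrt(n_steps)
print(f"spread of final displacement: {spread:.3f} (theory: {expected:.3f})")
```

Each path looks smooth enough to invite a polynomial fit, yet the paths disagree with each other by an amount comparable to the displacement itself, which is the situation described above.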

In this report, a method is proposed for identifying the true nature of a dynamic process
(random or deterministic), and for determining the most adequate model for describing both
random and deterministic effects.


Figure 2. Random walk behavior of the displacement of an object over time with respect to its
initial position. 50 different dynamic Monte Carlo simulation results are shown. Numerical
integration step: 0.01 a.u.

Figure 3. Polynomial regression (red dashed line) of the displacement of an object over time
with respect to its initial position presented in Figure 1.

2. Rationale of the Method

By looking at Eq. (1.3), it is possible to observe that the standard deviation of the derivative
(and therefore the variance) of a random variable always increases (considering that $\sqrt{2}/dt > 1$).
Similarly, the standard deviation (or variance) decreases by integration.[7]


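On the other hand, let us consider the behavior of a deterministic variable $x(t)$. For a time
interval between $t_0$ and $t_f$, the average value of the deterministic variable is:

$$\langle x(t)\rangle = \frac{\int_{t_0}^{t_f} x(t)\,dt}{t_f - t_0} \tag{2.1}$$

and its variance will be given by:

$$\sigma_x^2 = \frac{\int_{t_0}^{t_f}\left(x(t) - \langle x(t)\rangle\right)^2 dt}{t_f - t_0} = \frac{\int_{t_0}^{t_f} x^2(t)\,dt}{t_f - t_0} - \langle x(t)\rangle^2 \tag{2.2}$$

Furthermore, the average and variance of its derivative with respect to time will be (using the
ergodic-stochastic transformation [8]):

$$\left\langle\frac{dx}{dt}\right\rangle = \frac{\int_{t_0}^{t_f}\frac{dx(t)}{dt}\,dt}{t_f - t_0} = \frac{x(t_f) - x(t_0)}{t_f - t_0} \tag{2.3}$$

$$\sigma_{dx/dt}^2 = \frac{\int_{t_0}^{t_f}\left(\frac{dx}{dt} - \left\langle\frac{dx}{dt}\right\rangle\right)^2 dt}{t_f - t_0} = \frac{\int_{t_0}^{t_f}\left(\frac{dx}{dt}\right)^2 dt}{t_f - t_0} - \left(\frac{x(t_f) - x(t_0)}{t_f - t_0}\right)^2 \tag{2.4}$$

Assuming that for the time interval considered the deterministic variable can be approximated
by a truncated polynomial series expansion (as long as the function is continuous in the
interval):

$$x(t) = \sum_{i=0}^{n} a_i\,(t - t_0)^i = \sum_{i=0}^{n} a_i\,\tau^i \tag{2.5}$$

where $\tau = t - t_0$. Then, Eq. (2.2) and (2.4) become:

$$\sigma_x^2 = \sum_{i=0}^{n}\sum_{j=0}^{n} a_i a_j\,(t_f - t_0)^{i+j}\left[\frac{ij}{(i+1)(j+1)(i+j+1)}\right] \tag{2.6}$$

$$\sigma_{dx/dt}^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j\,(t_f - t_0)^{i+j-2}\left[\frac{(i-1)(j-1)}{i+j-1}\right] \tag{2.7}$$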


Thus, for a sufficiently long time interval $t_f - t_0$, the variance of the deterministic derivative
is expected to decrease compared to the variance of the original deterministic variable, which is
the opposite of the result obtained for random variables. Similarly, the integration of a
deterministic variable will result in an increase in the variance.
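As a numerical check of this statement, the following Python sketch compares closed-form time variances of a polynomial and of its derivative with a direct numerical average. The closed forms were derived here by integrating the polynomial term by term over an interval of length T starting at zero; their notation may differ from Eq. (2.6) and (2.7). The coefficients and the interval length are arbitrary illustrative values.

```python
def poly_variance(a, T):
    """Time variance of x = sum(a[i]*tau**i) over 0 <= tau <= T (closed form)."""
    n = len(a)
    return sum(a[i] * a[j] * T ** (i + j) * i * j / ((i + 1) * (j + 1) * (i + j + 1))
               for i in range(n) for j in range(n))

def poly_derivative_variance(a, T):
    """Time variance of dx/dtau over the same interval (closed form)."""
    n = len(a)
    return sum(a[i] * a[j] * T ** (i + j - 2) * (i - 1) * (j - 1) / (i + j - 1)
               for i in range(1, n) for j in range(1, n))

def numeric_variance(f, T, m=20000):
    """Midpoint-rule estimate of the time variance of f over 0 <= tau <= T."""
    values = [f((k + 0.5) * T / m) for k in range(m)]
    mean = sum(values) / m
    return sum((v - mean) ** 2 for v in values) / m

a = [1.0, -0.5, 0.2]   # x = 1 - 0.5*tau + 0.2*tau**2 (illustrative coefficients)
T = 10.0               # interval length in normalized time units (illustrative)

var_x = poly_variance(a, T)
var_dx = poly_derivative_variance(a, T)
num_x = numeric_variance(lambda s: a[0] + a[1] * s + a[2] * s ** 2, T)
num_dx = numeric_variance(lambda s: a[1] + 2.0 * a[2] * s, T)
print(f"Var(x): {var_x:.4f} (numeric {num_x:.4f})")
print(f"Var(dx/dtau): {var_dx:.4f} (numeric {num_dx:.4f})")
```

For this example the derivative has the smaller variance, in line with the behavior expected for deterministic variables over a long enough interval.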

As can be inferred, these numerical results depend on the unit of time considered. Therefore,
by changing (normalizing) the unit of time in such a way that $dt \le 1$ unit of time, both
conditions will be fulfilled (a long time interval $t_f - t_0$ and $\sqrt{2}/dt > 1$).

Then, by comparing the standard deviation of a certain dynamic variable with the standard
deviation of its derivative with respect to the normalized time, it is possible to determine if the
dominant behavior of such variable is deterministic (decrease in variance) or random (increase
in variance).
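The comparison can be illustrated with a short Python sketch using two synthetic series sampled once per unit of normalized time: pure white noise and a slow sinusoidal trend. Both series, the seed and the helper `forward_difference` are hypothetical choices for illustration; they are not taken from the report's data.

```python
import math
import random
import statistics

def forward_difference(values, dtau=1.0):
    """First forward finite difference with a constant normalized time step."""
    return [(b - a) / dtau for a, b in zip(values, values[1:])]

rng = random.Random(7)   # arbitrary seed, for reproducibility only
n = 500                  # number of observations, one per unit of normalized time

white_noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
smooth_trend = [math.sin(2.0 * math.pi * k / n) for k in range(n)]

results = {}
for name, series in (("white noise", white_noise), ("smooth trend", smooth_trend)):
    var_x = statistics.variance(series)
    var_dx = statistics.variance(forward_difference(series))
    results[name] = (var_x, var_dx)
    print(f"{name}: Var(X) = {var_x:.3g}, Var(dX/dtau) = {var_dx:.3g}")
```

The variance of the white-noise series roughly doubles after differentiation, whereas the variance of the smooth trend drops by several orders of magnitude.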

For randomistic variables where both deterministic and random effects are present, there is a
critical derivative (or integral) of the variable where the change from dominant deterministic to
dominant random behavior (or vice versa) will be observed. Such critical derivative (or integral)
can be easily identified because it will present a global minimum variance. The first integral of
such critical derivative might be a Markovian random variable if the random effect is larger than
the deterministic effect. In order to test the significance of the deterministic effect, let us
assume that the critical derivative $X^{(c)}$ has no deterministic contribution. Then, the variance
of the integrated random variable is:[7]

( ) (∫ ) ( ) (〈 〉 ) ( ) (〈 〉)

(2.8)

Given that (〈 〉 ) (since the time step is unit of time and assuming ), and

( ) ( ) ( ) (for a uniform distribution of time), then the expected


variance for a pure random variable is:

(2.9)

Thus, for a pure random process the observed behavior of at normalized time should be
within the confidence interval (for a confidence level of ) described by the following
expression:


( ( ) [ ( ) ( )√ ( ) ( )√ ])

(2.10)

where is the sample variance obtained from the values of the critical derivative, and ( )

is a critical value obtained from the probability distribution of . Assuming that the critical
derivative can be approximately described by a normal random variable, then:

( )
(2.11)

where represents the two-tailed critical Student’s T value obtained for a significance
level and degrees of freedom.

Thus, defining the observed Student’s T value at normalized time as:

( ) ( )
( )

(2.12)

it would be expected that most of the observed T values will have an absolute value less or
equal than for a purely random process. Otherwise the deterministic effect can be
considered significant.

Now, if the $(c-1)$th derivative is found to behave randomly, it will be a Markovian random
variable. All other lower-order derivatives (or integrals) will be the result of a random process,
even though they show an apparently deterministic behavior.

On the other hand, if the deterministic effect on the $(c-1)$th derivative is found significant, it
can be identified using conventional methods of identification of dynamic systems.
Furthermore, since the noise is reduced as the derivative order decreases (or the integration
order increases), it would be desirable to model the system at the lowest derivative (or highest
integration) order possible.

Another alternative for testing the significance of the deterministic effect emerging at the critical
derivative consists of modeling the behavior of the critical derivative (e.g. using conventional
regression models or any other modeling approach), and then testing the significance of the
model after subtracting the effect of noise coming from the $(c+1)$th derivative. This implies
performing an ANOVA test for the model,[9] and correcting the F-value obtained as follows:

( )
(2.13)

where $SSE$ is the sum of square model errors (or residuals), and $SSI$ is the sum of square
integral error coming from the derivative of the modeled variable, assuming that it behaves as
white noise. If $SSI \ge SSE$, the behavior of the critical derivative is completely random.

It also implies correcting the determination coefficient ($R^2$) of the model as follows:

(2.14)

where $SST$ is the total sum of square differences of the modeled variable with respect to its
average value. For the critical derivative, all these terms can be calculated as:

̂
∑( ( ) ( ))

(2.15)
〈 〉
( )( )

(2.16)

∑( ( ) 〈 〉)

(2.17)
where $\hat{X}^{(c)}(\tau_i)$ represents the estimation of the model at normalized time $\tau_i$, and $\langle\Delta\tau\rangle$ is the
mean normalized time step in the $(c+1)$th derivative data.

If the corrected F-value is larger than the corresponding critical F-value ( ), then the
deterministic model can be considered significant. Then, if the goodness-of-fit of the model
( ) is satisfactory for the modeler (e.g. ), such model can be used to describe
the deterministic component of the dynamic system.


3. Description of the Method

The method proposed in this report for the identification of models in noisy dynamic systems is
summarized in Figure 4.

Figure 4. Proposed procedure for modeling and identification of dynamic systems with noise.

A detailed description of the method is presented next:

1. Obtain the discrete set of dynamic data: Information about the response variable(s)
($X_i$) and the time of observation ($t_i$) for each of the $N$ observations.

2. Calculate the elapsed time ($\Delta t_i$) until the next observation:

$$\Delta t_i = t_{i+1} - t_i \tag{3.1}$$
There will be no value of the elapsed time for the last (final) observation.


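3. Identify the maximum elapsed time between observations in the data:

$$\Delta t_{max} = \max_i\left(\Delta t_i\right) \tag{3.2}$$

4. Normalize the time scale such that the initial transformed time is zero and one unit of
transformed time corresponds to $\Delta t_{max}$:

$$\tau_i = \frac{t_i - t_1}{\Delta t_{max}} \tag{3.3}$$

5. Calculate the sample average and sample variance of the response variable(s), as
estimates of the expected value and variance of the randomistic variable:

$$E(X) \approx \langle X\rangle = \frac{\sum_{i=1}^{N} X_i}{N} \tag{3.4}$$

$$Var(X) \approx s_X^2 = \frac{\sum_{i=1}^{N}\left(X_i - \langle X\rangle\right)^2}{N-1} \tag{3.5}$$

6. Calculate the forward finite differences for the response variable(s) as an estimate of
its derivative:

$$\left(\frac{\Delta X}{\Delta\tau}\right)_i = \frac{X_{i+1} - X_i}{\tau_{i+1} - \tau_i} \tag{3.6}$$

7. Estimate the expected value and variance of the derivative using the sample average
and sample variance of the finite differences:

$$E\left(\frac{dX}{d\tau}\right) \approx \left\langle\Delta X/\Delta\tau\right\rangle = \frac{\sum_{i=1}^{N-1}\left(\frac{\Delta X}{\Delta\tau}\right)_i}{N-1} \tag{3.7}$$

$$Var\left(\frac{dX}{d\tau}\right) \approx s_{\Delta X/\Delta\tau}^2 = \frac{\sum_{i=1}^{N-1}\left(\left(\frac{\Delta X}{\Delta\tau}\right)_i - \left\langle\Delta X/\Delta\tau\right\rangle\right)^2}{N-2} \tag{3.8}$$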


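8. If $s_{\Delta X/\Delta\tau}^2 < s_X^2$, repeat steps 6 and 7 for the next derivative until a global minimum
variance is found, that is, until the variance increases. The derivative with the minimum
variance will be the critical derivative. If no global minimum is found after a certain
predefined maximum number of derivatives ($n_{max}$), then the response
variable can be considered to be dominantly deterministic. In general, the expressions
for the $n$th derivative will be:

$$\left(\frac{\Delta^n X}{\Delta\tau^n}\right)_i = \frac{\left(\frac{\Delta^{n-1} X}{\Delta\tau^{n-1}}\right)_{i+1} - \left(\frac{\Delta^{n-1} X}{\Delta\tau^{n-1}}\right)_i}{\tau_{i+1} - \tau_i} \tag{3.9}$$

$$E\left(\frac{d^n X}{d\tau^n}\right) \approx \left\langle\Delta^n X/\Delta\tau^n\right\rangle = \frac{\sum_{i=1}^{N-n}\left(\frac{\Delta^n X}{\Delta\tau^n}\right)_i}{N-n} \tag{3.10}$$

$$Var\left(\frac{d^n X}{d\tau^n}\right) \approx s_{\Delta^n X/\Delta\tau^n}^2 = \frac{\sum_{i=1}^{N-n}\left(\left(\frac{\Delta^n X}{\Delta\tau^n}\right)_i - \left\langle\Delta^n X/\Delta\tau^n\right\rangle\right)^2}{N-n-1} \tag{3.11}$$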

9. If $s_{\Delta X/\Delta\tau}^2 > s_X^2$, then noise is already dominant. Optionally, it would be possible to find
the critical integral in order to identify any potential emergence of determinism in the
data. In general, the $n$th integral will be approximated using Euler's integration method:

() ( ) () () ( )
( ) ∫ ( )

(3.12)
()
where and it is non-existent for previous observation times.
The corresponding expected value and variance are estimated using the following
expressions:
()
() () ∑
( ) 〈 〉
(3.13)
()
∑ ( 〈 ( ) 〉)
()
( ) ()

(3.14)


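Repeat the integration procedure until the variance increases, that is, until the variance of the
$n$th integral is larger than the variance of the $(n-1)$th integral. The minimum variance integral
will be the critical integral. If no critical integral is found after a predefined maximum
number of integrals ($n_{max}$), then the response variable can be considered as pure
noise.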

10. Each derivative higher than the critical derivative ( ) can be modeled as a white-noise
random variable as follows:

(3.15)

The identification of the standard deviation and probability density function of this random
variable can be performed in two ways: i) Assuming a predefined probability density
function (e.g. normal) and testing the suitability of the best model obtained,[10] or ii)
Approximating the probability density function using a polynomial function, and
identifying the coefficients of the polynomial using the data available.[11]

11. Model the deterministic component of the critical derivative using any conventional
modeling method (e.g. linear or non-linear regression, or any other method). Perform
an analysis of variance for the model obtained. Additionally calculate the integral noise
term as:

∑ (( ) 〈 ⁄ 〉)
〈 〉
( )

(3.16)

Correct the F-value and $R^2$ coefficient of the model using Eq. (2.13) and (2.14),
respectively, and analyze significance and goodness-of-fit for the model obtained.

12. If no significant deterministic model for the critical derivative is found, then model all
other lower-order derivatives and/or integrals as random variables, as described in step
10. Otherwise, use conventional modeling approaches (e.g. statistical regression) for
determining both the deterministic and random components of the corresponding
model.


13. Particularly for the $(c-1)$th derivative, it is important to test whether the supposed
deterministic behavior is truly deterministic or is caused by random walks. This requires
calculating the observed Student's T values:

( ) ( )
( )
√( )

(3.17)

and computing the proportion ( ) of data points whose absolute T value is larger than
the critical :

∑ (| ( )| )

(3.18)
where represents Heaviside’s step function.
If , then the observed deterministic behavior is most probably independent of
randomness. Otherwise it is most probably caused by the random walk of a Markovian
random variable. In case of doubt, different confidence levels can be used.
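Steps 2 to 8 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the dynoise implementation: the function names are hypothetical, only derivatives (not integrals) are explored, and the search stops at the first order whose variance is followed by an increase. The test data is a synthetic random walk with 251 points and a step of 0.01, similar in size to the data set of Figure 1.

```python
import random
import statistics

def normalize_time(t):
    """Steps 2 to 4: rescale time so that the largest elapsed time equals one unit."""
    steps = [b - a for a, b in zip(t, t[1:])]
    dt_max = max(steps)
    return [(ti - t[0]) / dt_max for ti in t], dt_max

def forward_difference(values, tau):
    """Steps 6 and 8: forward finite differences in normalized time."""
    return [(values[i + 1] - values[i]) / (tau[i + 1] - tau[i])
            for i in range(len(values) - 1)]

def variance_by_order(x, t, max_order=4):
    """Steps 5, 7 and 8: sample variance of the data and of its first derivatives."""
    tau, dt_max = normalize_time(t)
    variances = [statistics.variance(x)]
    current = list(x)
    for _ in range(max_order):
        current = forward_difference(current, tau)
        variances.append(statistics.variance(current))
    return variances, dt_max

def critical_order(variances):
    """First derivative order whose variance is followed by an increase.

    Returns 0 when noise already dominates the raw data, and None when the
    variance keeps decreasing up to the highest order explored.
    """
    for n in range(len(variances) - 1):
        if variances[n + 1] > variances[n]:
            return n
    return None

# Synthetic random walk used only to exercise the functions above.
rng = random.Random(2019)   # arbitrary seed, for reproducibility only
dt = 0.01
t = [k * dt for k in range(251)]
x = [0.0]
for _ in range(250):
    x.append(x[-1] + rng.gauss(0.0, 1.0) * dt)

variances, dt_max = variance_by_order(x, t)
order = critical_order(variances)
print("variances by derivative order:", [f"{v:.3g}" for v in variances])
print("critical derivative order:", order)
```

For a random walk the minimum variance is expected at the first derivative, since the increments are white noise and each further differentiation roughly doubles their variance.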

A numerical implementation of this method in R-language (https://cran.r-project.org/) is
included in the Appendix, and is also available as Supporting Information. The algorithm,
denoted as dynoise, calculates different integrals and derivatives of a given data set,
determining the critical derivative, the probability that the emerging deterministic behavior is
significant, and the value of SSI. The user can perform the corresponding modeling
(deterministic or random) of the different derivatives and integrals obtained. Modeling the
random behavior of a variable can be performed either by testing pre-defined probability
density functions (optimodel, optinormal available in HypoTest), by using a polynomial
approximation for the probability density function (invPDF), or by any other suitable method.
Similarly, the deterministic model can be obtained by statistical regression or any other suitable
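approach. For the critical derivative, it is important to correct the significance and
goodness-of-fit of the model, by subtracting the integral noise effect. When using the numerical
derivatives and integrals of the data for modeling, it is necessary to use the following
transformation of the data in order to return to the original time scale:

$$X_t^{(n)} = \frac{X_\tau^{(n)}}{\Delta t_{max}^{\,n}} \tag{3.19}$$

where $n$ is the order of the derivative (negative for integrals), and the subscripts $t$ and $\tau$
denote the original and the normalized time scale, respectively.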


Also, please notice that the $SSI$ in the original time scale is given by:

$$SSI_t = \frac{SSI_\tau}{\Delta t_{max}^{\,2(c+1)}} \tag{3.20}$$

where $c$ is the order of the critical derivative.

4. Examples

4.1. Pure Random Walk

As a first example, let us consider the data presented in Figure 1. This data set, containing 251
data points, was obtained by dynamic Monte Carlo simulation using the following discrete
dynamic model:

$$x(t + \Delta t) = x(t) + \tilde{\varepsilon}_N\,\Delta t \tag{4.1}$$

considering that $x(0) = 0$, $\Delta t = 0.01$, and $\tilde{\varepsilon}_N$ is a random number obtained from a
standard normal distribution ($\mu = 0$, $\sigma = 1$).[3] The full data set for all examples presented is
available as Supporting Information.

The procedure described in Section 3 was performed to obtain the critical derivative. Four
derivatives and four integrals of the original data for the normalized time were considered. The
variances of these variables are summarized in Table 1 and Figure 5. The critical derivative
(minimum variance variable) was found at the first derivative of the original data set. This result
is consistent with the equation used to generate the data (Eq. 4.1), since it can be expressed as:

$$\frac{x(t + \Delta t) - x(t)}{\Delta t} = \tilde{\varepsilon}_N \tag{4.2}$$

The test for Markovian behavior at the original data set (which corresponds to the integral of
the critical derivative) indicates that with a 99% confidence, there is a 73.3% probability that the
behavior is the result of randomness. That is, this method successfully identifies that the
original data set is a sample of a Markovian variable. Figure 6 shows the test of Markovian
behavior for all data points using a 99% confidence interval. It can be observed that most of the
data points lie within the confidence interval for Markovian behavior, indicating that the
deterministic behavior was not significant compared to the effect of randomness.


Table 1. Value of variance obtained for different integrals and derivatives of the data set
considered in Example 4.1, using the normalized time.

Variable   Description          Variance
(4)X       Fourth integral      1.73E+13
(3)X       Third integral       4.69E+09
''X        Second integral      6.30E+05
'X         First integral       2.80E+01
X          Original data set    2.60E-03
X'         First derivative     9.48E-05
X''        Second derivative    1.95E-04
X(3)       Third derivative     6.04E-04
X(4)       Fourth derivative    2.06E-03

Figure 5. Behavior of the decimal logarithm of the variance as a function of the order of the
derivative (positive) or integral (negative) for the data set of Example 4.1. [Plot "Critical
derivative identification": log10(Variance) vs. variable derivative.]

Figure 6. Test for Markovian behavior at 99% confidence for the original data in Example 4.1.
[Plot "Test for Markovian behavior: 99 % confidence": X(c-1) vs. normalized time.]


Figure 7. Dynamic behavior of the critical derivative (first derivative) of the data in Example 4.1.
[Plot: critical derivative vs. normalized time.]

Now, let us model the behavior of the critical derivative. Figure 7 shows the dynamic behavior
of the first forward finite difference as an approximation to the first derivative of the original
data. A simple linear regression model in time results in the following equation:

(4.3)

with , and ( ) . Although clearly the model is not


significant, it is necessary first to check the integral noise effect coming from the second
derivative before reaching any conclusion. The sum of squares obtained for the integral noise
from the second derivative is , which is larger than , thus
confirming that it is a pure random variable.

Then, the data presented in Figure 7 is used to identify the parameters of a normal distribution.
The optimal model obtained for is a white-noise normal distribution with

( )§. Now, since , is found to be a normal distribution with


, very close to the original standard normal random distribution used in Eq. (4.2).

§
The determination coefficient of the random model is obtained by determining the goodness-of-fit of
the model to the cumulative probability obtained from the data. The residuals of the model are
calculated as the closest distance between the value of the cumulative probability function obtained
from the model for each data point and the cumulative probability interval corresponding to the point,
similar to the Kolmogorov-Smirnov test. If the prediction lies within the interval, the residual is set to
zero. On the other hand, the total sum of squares is determined considering the central value of each
cumulative probability interval.


4.2. Random Forces acting on a Particle

The second example consists of data obtained from the dynamic Monte Carlo simulation of the
following model describing the motion in one dimension of a certain particle subject to random
environmental forces:

$$m\,\frac{d^2 x}{dt^2} = \sigma_F\,\tilde{\varepsilon}_N \tag{4.4}$$

where $x$ is the position of the particle in one direction, $m$ is the mass of the particle, $\sigma_F$ is the
standard deviation of the random force acting on the particle in that direction, and $\tilde{\varepsilon}_N$ is a
standard normal random number.

The data set was obtained by numerically solving the model presented in Eq. (4.4) assuming
$m = 1$ mass unit, $\sigma_F = 1$ force unit, and using Euler's integration method with $\Delta t = 0.01$ time
units, but with a sampling time of 0.1 time units; that is, the data is recorded every 10
integration steps. The particle starts at rest at a position of 1 distance unit. The dynamic
behavior of the position of the particle for the first 2.5 time units is presented in Figure 8 along
with a polynomial regression model.
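The simulation set-up can be sketched in Python as follows. This is an illustration under explicit assumptions, not the author's code: unit mass, a random force with unit standard deviation, a new random force drawn at every integration step, and an explicit Euler update in which the position is advanced with the current velocity before the velocity is updated. A much longer run than the 2.5 time units of Figure 8 is used so that the sample statistics are stable. The theoretical value `s_theory` comes from a derivation made here for this particular Euler scheme (triangular weighting of the 19 accelerations contributing to one second difference), not from the report.

```python
import math
import random
import statistics

def simulate_positions(n_samples, dt=0.01, every=10, m=1.0, sigma_f=1.0, seed=42):
    """Euler integration of m*x'' = sigma_f*eps, recording x every `every` steps."""
    rng = random.Random(seed)   # arbitrary seed, for reproducibility only
    x, v = 1.0, 0.0             # particle starts at rest at position 1
    samples = [x]
    for _ in range(n_samples - 1):
        for _ in range(every):
            a = sigma_f * rng.gauss(0.0, 1.0) / m   # random acceleration
            x += v * dt                             # position uses current velocity
            v += a * dt
        samples.append(x)
    return samples

dt, every = 0.01, 10
dt_s = dt * every                       # sampling time: 0.1 time units
samples = simulate_positions(20000, dt=dt, every=every)

# Second finite difference of the sampled position, in original time units.
second = [(samples[i + 2] - 2.0 * samples[i + 1] + samples[i]) / dt_s ** 2
          for i in range(len(samples) - 2)]
s_est = statistics.stdev(second)

# Each second difference is dt**2 times a triangular-weighted sum of 19 accelerations.
weights = [min(j, 18 - j) + 1 for j in range(19)]
s_theory = dt ** 2 * math.sqrt(sum(w * w for w in weights)) / dt_s ** 2
print(f"estimated std of acceleration: {s_est:.3f} (theory for this scheme: {s_theory:.3f})")
```

Under these assumptions the acceleration estimated from the sampled positions has a standard deviation of roughly one quarter of the value used in the integration, because each sampled second difference averages several integration steps.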

Figure 8. Dynamic behavior of the position of a particle subject to random forces. Blue dots:
Data obtained from numerical integration of Eq. (4.4). Red line: Polynomial regression model.

The regression model identified from the data set describes the motion of a particle with an
initial position of distance unit, an initial velocity of distance units per time unit, and a
constant acceleration of distance units per square time unit, corresponding to a


constant force of force units. Knowing the source of the data, this is clearly incorrect.
However, if the source of the data is not known, this could have been the conclusion of this
analysis, supported by a high value of almost .

Now, using the dynoise algorithm implementing the method proposed in Section 3, the
following results are obtained:

Normalized time = 0.1 original time unit(s)

Variance vector:
Var
X(-4) 1.464066e+07
X(-3) 5.234835e+05
X(-2) 9.649167e+03
X(-1) 6.698097e+01
X(0) 4.927131e-03
X(1) 3.026376e-05
X(2) 8.088210e-06
X(3) 1.046126e-05
X(4) 2.313044e-05

The critical derivative is at the second derivative of the data set.


There is a 100 % probability that the emerging deterministic behavior at
the critical derivative is the result of random processes with a 99 %
confidence.
The sum of squares of the integral noise (SSI) is: 0.00011507380604406 in
normalized time units, and 115.073806044059 in original time units.

Figure 9. Graphical results obtained for the data of Example 4.2. Left plot ("Critical derivative
identification"): Logarithm of variance vs. derivative order. Right plot ("Test for Markovian
behavior: 99 % confidence"): Velocity, X(c-1), vs. normalized time.

The method correctly identifies the main source of noise at the second derivative of the
position (acceleration). Furthermore, it is found that the velocity presents without any doubt a
Markovian behavior, as can be seen in Figure 9. Therefore, it can be concluded that the
behavior of the position is random and not deterministic, indicating that the regression model
presented in Figure 8 does not satisfactorily describe the true nature of the system. A white-
noise normal probability model of the second derivative yields ⁄ distance
units per square original time units ( ). This model underestimates the original


standard deviation ( ) by a factor of almost 4, as a result of having a sample time larger


than the integration time. In fact, both standard deviations are related by the following
expression: ⁄ . Therefore, the estimation of the

standard deviation is actually less than 20% below its true value.
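The conversion of the estimated standard deviation to the original time scale can be reproduced from the variance reported by dynoise for X(2). The short Python sketch below assumes that a second derivative in normalized time is divided by the square of the normalization factor (0.1 original time units), which follows from the chain rule, and that the standard deviation used in the simulation was 1; the latter value is an assumption, since it is not legible in this copy of the text.

```python
import math

var_norm = 8.088210e-06   # variance of X(2) in normalized time, from the output above
dt_max = 0.1              # one normalized time unit, in original time units
order = 2                 # order of the critical derivative

# Standard deviation of the second derivative in original time units.
s_orig = math.sqrt(var_norm) / dt_max ** order
# Ratio to the simulated value, ASSUMED here to be 1 distance unit per square time unit.
ratio = 1.0 / s_orig
print(f"std in original units: {s_orig:.4f}; underestimation factor: {ratio:.2f}")
```

The resulting factor of about 3.5 is consistent with the statement that the model underestimates the original standard deviation by a factor of almost 4.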

4.3. Random and Deterministic Forces acting on a Particle

In this example, the dynamic model previously presented in Eq. (4.4) is modified by
incorporating a constant force acting on the particle. The new model is:

(4.5)

where the same parameters as in the previous example are used, and considering
force units. The data obtained from the numerical integration, and the corresponding
polynomial regression model, are summarized in Figure 10.

Figure 10. Dynamic behavior of the position of a particle subject to random and deterministic
forces. Blue dots: Data obtained from numerical integration of Eq. (4.5). Red line: Polynomial
regression model.

The deterministic behavior predicted by the regression model indicates that the particle has an
initial velocity of distance units per unit of time, and that it is subject to a constant
force of distance units per square time unit. Although the initial velocity can be
neglected, the deterministic force is about larger than the real force applied to the
particle. Note that this model has a coefficient of , so there would seem to be no
reason to doubt its goodness of fit. Let us, however, perform the dynamic analysis
of noise for this case. The graphical results are summarized in Figure 11.

The method output displayed in R is:

Normalized time = 0.1 original time unit(s)

Variance vector:
Var
X(-4) 1.478924e+07
X(-3) 5.337957e+05
X(-2) 1.004452e+04
X(-1) 7.400518e+01
X(0) 2.164395e-02
X(1) 1.452142e-04
X(2) 4.401511e-06
X(3) 7.367433e-06
X(4) 1.965444e-05

The critical derivative is at the second derivative of the data set.


There is a 68 % probability that the emerging deterministic behavior at
the critical derivative is significant with a 99 % confidence.
The sum of squares of the integral noise (SSI) is: 8.10417608863006e-05
in normalized time units, and 81.0417608863002 in original time units.

[Plots of Figure 11. Left panel: "Critical derivative identification" (log10(Variance) vs. variable derivative). Right panel: "Test for Markovian behavior: 99 % confidence" (X(c-1) vs. normalized time).]

Figure 11. Graphical results obtained for the data of Example 4.3. Left plot: Logarithm of
variance vs. derivative order. Right plot: Velocity vs. normalized time.

The critical derivative was again found to be the second derivative of the position. The velocity
in this case did not show a purely Markovian behavior. The probability of being a true
deterministic effect was found to be 68%. This is a remarkable result as the constant
deterministic force was 10 times smaller than the standard deviation of the random force.
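
The proportion reported by the test for Markovian behavior can be sketched as follows. The statistic mirrors the expression used in Step 5 of the Appendix (deviation from the first point divided by the square root of the elapsed normalized time multiplied by one third of the variance of the critical derivative). The helper name markov_proportion, its arguments, and the division by the number of tested points are assumptions made only for this illustration.

```r
# Sketch of the Markovian test statistic, assuming a unit normalized time step.
# y:     integral of the critical derivative (e.g. the velocity in this example)
# var_c: variance of the critical derivative
# df:    degrees of freedom of the critical t value (supplied by the caller)
markov_proportion <- function(y, var_c, df, alpha = 0.01) {
  n <- length(y)
  tcr <- qt(1 - alpha / 2, df)                 # two-sided critical t value
  steps <- seq_len(n - 1)                      # elapsed normalized time
  st <- (y[-1] - y[1]) / sqrt(steps * var_c / 3)
  mean(abs(st) > tcr)                          # proportion outside the random-walk limits
}

# A steady drift leaves the random-walk limits at every step ...
p_drift <- markov_proportion(y = 0:10, var_c = 0.01, df = 9)
# ... whereas a variable that never moves stays inside them.
p_flat <- markov_proportion(y = rep(1, 11), var_c = 0.01, df = 9)
print(c(p_drift, p_flat))
```

A proportion above 0.5 would be read as significant emerging determinism, and a proportion below 0.5 as behavior compatible with a random walk.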

Now, for the critical derivative data, a linear regression model with respect to time results in
the following corrected values: and ( ) . This
means that the second derivative of position is a fixed random variable with a non-zero mean.
Thus, modelling the second derivative with respect to the normalized time as a normal
distribution, the following parameters are found: ⁄ , ⁄
( ). Therefore, the value of the constant deterministic force acting on the particle
would be:

(4.6)

This model obtained at the critical derivative provides an improved estimate of the true
deterministic force acting on the particle ( ), compared to the value obtained from the
regression model of the observed variable. The estimated standard deviation of the force in
this case would be .
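
Since the time scale is normalized by the largest time step, a derivative of order n estimated in normalized time must be divided by that time step raised to the power n in order to return to original time units (variances are divided by the power 2n). A minimal sketch of this conversion follows. The function name and the numerical values are hypothetical and are not the values obtained in this example.

```r
# Convert a derivative of order n from normalized to original time units.
# dt_max is the largest time step of the data set (the normalization factor).
to_original_units <- function(value, dt_max, n) {
  value / dt_max^n
}

# Hypothetical numbers: mean and standard deviation of a second derivative
# estimated in normalized time, with dt_max = 0.1 original time units.
mean_norm <- 2e-4
sd_norm <- 2e-3
mean_orig <- to_original_units(mean_norm, dt_max = 0.1, n = 2)
sd_orig <- to_original_units(sd_norm, dt_max = 0.1, n = 2)
print(c(mean_orig, sd_orig))
```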

4.4. Global Land-Ocean Temperature Index

The next example is taken from data published by NASA's Goddard Institute for Space Studies
(GISS) (https://climate.nasa.gov/vital-signs/global-temperature/), indicating the Global
Temperature Anomaly (°C) observed from 1880 to 2018. This temperature anomaly is
determined as the change in global surface temperature relative to 1951-1980 average
temperatures. The non-smoothed data is graphically presented in Figure 12, along with a
polynomial regression model. The data shows a clear increase in global temperature,
particularly since the 1960s. Given that the data is noisy, the purpose of this example is to find
a more accurate model of global temperature change.

The results obtained using the analysis of minimum variance (dynoise) are the following:

Normalized time = 1 original time unit(s)

Variance vector:
Var
X(-4) 7.954541e+11
X(-3) 6.575329e+08
X(-2) 2.256168e+05
X(-1) 2.764175e+01
X(0) 1.134398e-01
X(1) 1.273689e-02
X(2) 3.102713e-02
X(3) 9.351267e-02
X(4) 3.100849e-01

The critical derivative is at the first derivative of the data set.


There is a 100 % probability that the emerging deterministic behavior at
the critical derivative is the result of random processes with a 99 %
confidence.
The sum of squares of the integral noise (SSI) is: 2.10984452554744 in
normalized time units, and 2.10984452554744 in original time units.

Figure 12. Dynamic behavior of the global temperature anomaly between 1880 and 2018. Blue
dots: Data obtained from NASA. Red line: Polynomial regression model.

[Plots of Figure 13. Left panel: "Critical derivative identification" (log10(Variance) vs. variable derivative). Right panel: "Test for Markovian behavior: 99 % confidence" (X(c-1) vs. normalized time).]

Figure 13. Graphical results obtained for the data of Example 4.4 from 1880 to 2018. Left plot:
Logarithm of variance vs. derivative order. Right plot: Temperature anomaly vs. normalized
time.

The results obtained indicate that random processes are predominant in this dynamic system,
and that any deterministic effect is not larger than the effect of randomness. The right plot in
Figure 13 illustrates this effect. It can be seen that the data lies within the limits of the potential
random walk behavior. Given that the highest temperatures have been registered since 2001, it
is possible that a deterministic effect cannot be clearly observed in such a long period of time.
Thus, the same analysis is done considering only the data from 2001 to 2018:

Normalized time = 1 original time unit(s)

Variance vector:
Var
X(-4) 1.951128e+05
X(-3) 1.684098e+04
X(-2) 7.214294e+02
X(-1) 1.202650e+01
X(0) 1.804477e-02
X(1) 9.034559e-03
X(2) 2.121958e-02
X(3) 6.736381e-02
X(4) 2.390725e-01

The critical derivative is at the first derivative of the data set.


There is a 100 % probability that the emerging deterministic behavior at
the critical derivative is the result of random processes with a 99 %
confidence.
The sum of squares of the integral noise (SSI) is: 0.159146875 in
normalized time units, and 0.159146875 in original time units.

[Plots of Figure 14. Left panel: "Critical derivative identification" (log10(Variance) vs. variable derivative). Right panel: "Test for Markovian behavior: 99 % confidence" (X(c-1) vs. normalized time).]

Figure 14. Graphical results for the data of Example 4.4 from 2001 to 2018. Left plot: Logarithm
of variance vs. derivative order. Right plot: Temperature anomaly vs. normalized time.

Even during the present millennium, a possible deterministic effect on global temperature
cannot be differentiated from the behavior of a pure random walk.

Continuing with the modeling of the system, the behavior of the critical derivative over the
whole time range is presented in Figure 15. A Shapiro-Wilk normality test indicates that the first
derivative data is normal ( , ). Thus, it can be modeled as a white-
noise normal random variable. The optimal value identified for the standard deviation is
⁄ ( ). Thus, the dynamic model can be expressed as:

( ) ( )
(4.7)

where represents the year and is a random number from a standard normal random
distribution.
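
A model of this kind can be simulated by accumulating independent normal increments. The sketch below assumes that the yearly increment equals a standard deviation multiplied by a standard normal random number. The value of sigma, the initial anomaly, the seed, and the function name are placeholders and not the fitted values of Eq. (4.7). A Shapiro-Wilk test is applied to the simulated first differences in the same way as described for the data.

```r
# Sketch: random-walk model x[k+1] = x[k] + sigma * eps[k], with eps ~ N(0, 1).
simulate_random_walk <- function(x0, sigma, eps) {
  c(x0, x0 + cumsum(sigma * eps))
}

# Deterministic check with hand-picked increments
path_check <- simulate_random_walk(x0 = 0, sigma = 0.1, eps = c(1, -1, 2))

# Hypothetical 139-year realization (sigma and x0 are placeholder values)
set.seed(2019)
path <- simulate_random_walk(x0 = -0.2, sigma = 0.1, eps = rnorm(138))
increments <- diff(path)               # first derivative with a 1-year step
normality <- shapiro.test(increments)  # Shapiro-Wilk normality test
print(normality$p.value)
```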

[Plot of Figure 15: critical derivative vs. normalized time.]

Figure 15. Dynamic behavior of the first derivative of global temperature from the data in
Example 4.4.

Figure 16. Probability density function of the random models obtained for the yearly change in
average global temperature (in °C/yr). Red dashed line: Model obtained using the 1880-2018
data (Eq. 4.7). Blue solid line: Model obtained using the 2001-2018 data (Eq. 4.8).

A similar model obtained from the data between 2001 and 2018 is ( ):

( ) ( )
(4.8)

Both models are compared in Figure 16. The similitude [10] between the two random models is
. Both random models are very similar, although the standard deviation for the data
between 2001 and 2018 is slightly lower than the standard deviation observed for the whole
range (1880-2018).

4.5. Exchange Rate Dynamics

Historical data for the USD/EUR daily exchange rate during 2017, obtained from investing.com,
is presented in Figure 17. A simple regression analysis indicates that in 2017, the USD/EUR
exchange rate decreased at an average rate of , or equivalently, .

Figure 17. Dynamic behavior of the USD/EUR daily exchange rate for 2017. Day 1 corresponds to
January 1st 2017. Blue dots: Data obtained from investing.com. Red line: Linear regression
model.

The analysis of the dynamic system in the presence of noise yields the following results (see
also Figure 18):

Normalized time = 3 original time unit(s)

Variance vector:
Var
X(-4) 4.575447e+12
X(-3) 5.762850e+09
X(-2) 3.913140e+06
X(-1) 9.627974e+02
X(0) 1.648129e-03
X(1) 1.276320e-04
X(2) 2.321122e-03
X(3) 5.470130e-02
X(4) 1.380370e+00

The critical derivative is at the first derivative of the data set.


There is a 100 % probability that the emerging deterministic behavior at
the critical derivative is the result of random processes with a 99 %
confidence.
The sum of squares of the integral noise (SSI) is: 0.138575737197854 in
normalized time units, and 0.00171081157034388 in original time units.

[Plots of Figure 18. Left panel: "Critical derivative identification" (log10(Variance) vs. variable derivative). Right panel: "Test for Markovian behavior: 99 % confidence" (X(c-1) vs. normalized time).]

Figure 18. Graphical results for the USD/EUR daily exchange rate data of Example 4.5 for 2017.
Left plot: Logarithm of variance vs. derivative order. Right plot: Exchange rate vs. normalized
time.

According to these results, the decrease in exchange rate during 2017 could not be considered
significant compared to the dynamic effect of randomness. Furthermore, the optimal model
obtained for describing the behavior of the exchange rate at the first derivative is (
):

(4.9)
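
The standard deviation of a zero-mean normal white-noise model can be estimated by least squares on the cumulative probability distribution. The sketch below only approximates that idea: the plotting positions (i - 0.5)/n, the search interval, the tolerance, and the function name are assumptions, and the procedure of Ref. [10] may differ in its details. The synthetic increments are not the exchange rate data.

```r
# Sketch: fit the standard deviation of a zero-mean normal model by minimizing
# the squared residuals between empirical and model cumulative probabilities.
fit_sigma_cdf <- function(d) {
  d <- sort(d)
  n <- length(d)
  p_emp <- (seq_len(n) - 0.5) / n                      # assumed plotting positions
  sse <- function(s) sum((p_emp - pnorm(d, mean = 0, sd = s))^2)
  optimize(sse, interval = c(1e-6, 10 * sd(d)), tol = 1e-10)$minimum
}

# Synthetic increments placed exactly on the normal quantiles for sd = 0.005
d <- 0.005 * qnorm((seq_len(200) - 0.5) / 200)
sigma_hat <- fit_sigma_cdf(d)
print(sigma_hat)
```

Because the synthetic sample lies exactly on the model quantiles, the fitted value should recover the standard deviation used to generate it, which makes the sketch easy to check before applying it to real increments.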

In order to validate both the linear regression model and the model obtained after analyzing
the behavior of noise, the daily exchange rate for 2018 will be included in the data. The two-
year data is presented in Figure 19, along with the prediction of the linear regression model
obtained for 2017. Clearly, the linear regression model previously obtained did not have a good
prediction capability. Let us now analyze again the noise of the two-year data set.

Figure 19. Dynamic behavior of the USD/EUR daily exchange rate for 2017 and 2018. Day 1
corresponds to January 1st 2017. Blue dots: Data obtained from investing.com. Red line: Linear
regression model obtained using only 2017 data.

Normalized time = 3 original time unit(s)

Variance vector:
Var
X(-4) 1.275540e+15
X(-3) 3.841462e+11
X(-2) 6.248026e+07
X(-1) 3.693980e+03
X(0) 1.464498e-03
X(1) 1.177795e-04
X(2) 2.066066e-03
X(3) 4.915564e-02
X(4) 1.269285e+00

The critical derivative is at the first derivative of the data set.


There is a 100 % probability that the emerging deterministic behavior at
the critical derivative is the result of random processes with a 99 %
confidence.
The sum of squares of the integral noise (SSI) is: 0.249774461288951 in
normalized time units, and 0.00308363532455495 in original time units.

At first glance, the right plot in Figure 20 suggests that the random behavior of the exchange
rate considering both years is similar to the behavior observed only for 2017 (right plot of
Figure 18). Estimating again the standard deviation for a white-noise normal random
distribution, the new model obtained is ( ):

(4.10)

[Plots of Figure 20. Left panel: "Critical derivative identification" (log10(Variance) vs. variable derivative). Right panel: "Test for Markovian behavior: 99 % confidence" (X(c-1) vs. normalized time).]

Figure 20. Graphical results for the USD/EUR daily exchange rate data of Example 4.5 for 2017
and 2018. Left plot: Logarithm of variance vs. derivative order. Right plot: Exchange rate vs.
normalized time.

Figure 21. Probability density function of the random models obtained for the daily change in
USD/EUR exchange rate. Red dashed line: Model obtained from 2017 data (Eq. 4.9). Blue solid
line: Model obtained from 2017 and 2018 data (Eq. 4.10).

Figure 21 presents a comparison of the probability density distribution obtained in the models
presented in Eq. (4.9) and (4.10). The similitude [10] obtained between those two models is
. These results indicate that there is only a small decrease in the width of the
distribution by including the data from 2018, without a significant change in the behavior of the
random variable. Therefore, it can be concluded that both models (Eq. 4.9 and 4.10) are in
principle identical.

4.6. Body-weight control

As a last example, let us consider the data previously reported [12] on the practical
implementation of a body-weight control method. The data, along with a linear regression
model, is presented in Figure 22.

Figure 22. Dynamic behavior of a body weight under a weight control method. Blue dots: Data
reported in [12]. Red line: Linear regression model.

The noise analysis (dynoise) yields the following results:

Normalized time = 8 original time unit(s)

Variance vector:
Var
X(-4) 1.723218e+08
X(-3) 2.708154e+07
X(-2) 2.259027e+06
X(-1) 7.029331e+04
X(0) 1.545614e+00
X(1) 5.319603e+00
X(2) 8.829995e+02
X(3) 1.683780e+05
X(4) 3.492943e+07

The critical derivative is at the original data set.


There is a 100 % probability that the emerging deterministic behavior at
the critical derivative is significant with a 99 % confidence.
The sum of squares of the integral noise (SSI) is: 29.5184075967862 in
normalized time units, and 0.461225118699784 in original time units.

Since the critical derivative is at the original dataset, the linear regression model is reliable.
Furthermore, it is confirmed that the deterministic behavior is significant. However, the $R^2$
of the regression model must be corrected using the SSI, and its significance validated with the
corrected F-value.

Thus, for this example , and the corrected values are:


and ( ) , confirming that the linear regression model has a
good fit and it is statistically significant. For the particular example it can be concluded that the
average decrease in weight in the period of time considered was about 43 g/day.

On the other hand, the residuals of this regression model can be modelled as a white-noise
normal random distribution with ( ). Please notice that this model
already includes the integral noise. The obtained randomistic model is thus:

( ) ( )
(4.11)

where is the weight in kg, is the time in days from the beginning of the weight-control
method implementation, is a standard normal random variable, and the total value
was calculated as:

(4.12)

where is the determination coefficient of the deterministic model before the correction
( ), and is the determination coefficient of the random model based on the fit of
the cumulative probability distribution.

Since the random variable is the result of two effects, additive noise (measurement noise) in
the weight and integral noise from the weight change rate, a more detailed randomistic model
of the system would be:

(4.13)
( ) ( ) ∫
(4.14)

where the standard deviation of the derivative of the weight was obtained from the SSI in
original units as: √( )〈 〉, and the standard deviation of the additive noise is: √ ( ) .
( ) and 〈 〉 are obtained from the dataset. Alternatively, the standard error estimated
from the regression model ( ) can be used instead of , resulting in . , ∫ and
represent different independent standard normal random variables.
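
The structure of this randomistic model, a linear deterministic trend combined with additive (measurement) noise and integral noise from the rate of change, can be sketched as below. The functional form is an assumption based on the description above, with the integral noise accumulated by Euler integration. The initial weight and the standard deviations are placeholders rather than the fitted parameters; only the slope of about -0.043 kg/day is taken from the text.

```r
# Sketch (assumed form): w(t) = w0 + b*t + integral noise + additive noise
# eps_rate: standard normal numbers driving the rate of change (integral noise)
# eps_add:  standard normal numbers for the additive measurement noise
simulate_weight <- function(t, w0, b, sd_rate, sd_add, eps_rate, eps_add) {
  dt <- diff(t)
  # Euler integration of the random part of the rate of change
  integral_noise <- c(0, cumsum(sd_rate * eps_rate * dt))
  w0 + b * t + integral_noise + sd_add * eps_add
}

t <- c(0, 1, 2, 3)                       # time in days
# With both noise sources switched off only the deterministic trend remains
w_det <- simulate_weight(t, w0 = 80, b = -0.043, sd_rate = 0, sd_add = 0,
                         eps_rate = rep(0, 3), eps_add = rep(0, 4))
# Hypothetical noisy realization (placeholder standard deviations)
set.seed(12)
w_sim <- simulate_weight(t, w0 = 80, b = -0.043, sd_rate = 0.1, sd_add = 0.3,
                         eps_rate = rnorm(3), eps_add = rnorm(4))
print(w_det)
```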

5. Conclusion

Randomness taking place at the time derivatives of a certain observed dynamic variable might
have a significant effect on the outcome of the dynamic variable, and in some cases, it might be
even more important than deterministic effects. When randomness is more relevant than
determinism at the derivative of the observed variable, the latter will behave as a Markovian
random variable. This means that the observed behavior is the result of chance, and therefore,
it is not a repeatable, predictable result. Modeling of noisy dynamic systems should be done
carefully in order to avoid reaching false conclusions about the system behavior. In this report,
a numerical procedure is proposed for identifying the best model (random, deterministic or
randomistic) that should be considered for a noisy dynamic system, using only the measured
data as an input. The procedure involves the identification of the critical derivative, which
defines the limit between randomness and determinism. This limit is found where a global
minimum variance is observed after a normalization of the time scale. Derivatives higher than
the critical derivative can be modelled as white noise (usually normal according to an extension
of a central limit theorem). The standard deviation of such white noise can be determined by
minimizing the residuals of the cumulative probability distribution [10]. Derivatives (or integrals)
lower than the critical derivative can be modelled using conventional methods, as long as the
emerging determinism is not Markovian. A Markovian test is proposed for assessing the
significance of the emerging determinism, compared to the randomness present in the system.
Furthermore, when the random component of a derivative is modeled, the integral noise
coming from the next derivative should be taken into account in order to obtain a better model
of the system. Different examples were presented in order to test the method. The data of
some of these examples was obtained from dynamic Monte Carlo simulation results,
confirming that the proposed approach satisfactorily identified the original model of the
system. Additional real-life examples illustrate how random processes can easily be interpreted
as deterministic when the noise in the data is neglected (e.g. global temperature change,
exchange rates, etc.). The method can also be used to assess whether a certain change
in the system produces significant results or whether they are just the result of chance
(e.g. the efficacy of weight-loss methods).

Acknowledgments

This research did not receive any specific grant from funding agencies in the public,
commercial, or not-for-profit sectors.

References

[1] Hernandez, H. (2018). On the Behavior of Dynamic Random Variables. ForsChem Research
Reports 2018-09. doi: 10.13140/RG.2.2.20135.19366.

[2] Hernandez, H. (2018). The Realm of Randomistic Variables. ForsChem Research Reports
2018-10. doi: 10.13140/RG.2.2.29034.16326.

[3] Hernandez, H. (2018). Multidimensional Randomness, Standard Random Variables and
Variance Algebra. ForsChem Research Reports 2018-02. doi: 10.13140/RG.2.2.11902.48966.

[4] Hernandez, H. (2016). Variance algebra applied to dynamical systems. ForsChem Research
Reports 2016-2. doi: 10.13140/RG.2.2.36507.26403.

[5] Hernandez, H. (2018). Probability Density Functions of Derivatives of Random Variables.
ForsChem Research Reports 2018-06. doi: 10.13140/RG.2.2.23850.11204.

[6] Hernandez, H. (2019). Sums and Averages of Large Samples Using Standard
Transformations: The Central Limit Theorem and the Law of Large Numbers. ForsChem
Research Reports 2019-01. doi: 10.13140/RG.2.2.32429.33767.

[7] Hernandez, H. (2018). Integrating Functions of Random Variables. ForsChem Research
Reports 2018-07. doi: 10.13140/RG.2.2.23660.87680.

[8] Hernandez, H. (2017). Ergodic-Stochastic Transformations. ForsChem Research Reports
2017-12. doi: 10.13140/RG.2.2.20325.70881.

[9] Kutner, M. H., Nachtsheim, C. J., Neter, J., & Li, W. (2005). Applied linear statistical models.
5th Ed. Boston: McGraw-Hill Irwin.

[10] Hernandez, H. (2018). Parameter Identification using Standard Transformations: An
Alternative Hypothesis Testing Method. ForsChem Research Reports 2018-04. doi:
10.13140/RG.2.2.14895.02728.

[11] Hernandez, H. (2018). Comparison of Methods for the Reconstruction of Probability Density
Functions from Data Samples. ForsChem Research Reports 2018-12. doi:
10.13140/RG.2.2.30177.35686.

[12] Hernandez, H. (2018). Body Weight Control: An Engineering Perspective. ForsChem
Research Reports 2018-01. doi: 10.13140/RG.2.2.11071.20644.

Appendix: Numerical Implementation of the Proposed Algorithm in R


dynoise<-function(dataset=NULL,M=4,L=4,alpha=0.01,disp=TRUE){
  #This function determines the critical derivative from a discrete data set, indicating
  #where randomness becomes predominant over determinism. It also indicates if the behavior
  #of the integral of the critical derivative is the result of randomness or if the
  #deterministic behavior is significant. The output of the function includes the
  #calculated finite differences and numerical integrals, which can be used for modeling
  #their corresponding noise and/or deterministic behavior.
  #M is the maximum number of derivatives to be evaluated.
  #L is the maximum number of integrals to be evaluated.
  #alpha is the significance level.
  #When the disp option is TRUE, the results are displayed in the console.

  #Step 1. Input the data set

  if (is.null(dataset) || !is.data.frame(dataset)){
    print("Please input the dataset as a dataframe with two variables: Time and Data")
    return(NULL)
  }
  N=nrow(dataset) #Total number of data points
  colnames(dataset)=c("Time","Data") #Update column names
  ot=dataset$Time #Read original time
  x=dataset$Data #Read variable data

  #Step 2. Normalize the time scale

  dt=as.numeric(diff(ot)) #Time steps in original scale
  t=(ot-ot[1])/max(dt) #Time normalization by the largest time step
  if (disp==TRUE){
    print(paste("Normalized time =",max(dt),"original time unit(s)"))
  }

  #Step 3. Numerical determination of derivatives and integrals

  X=matrix(NA,N,L+M+1) #Initialization of matrix of derivatives and integrals (empty)
  colnames(X)=c(1:(L+M+1)) #Initialization of column names
  X[,L+1]=x #Original data set
  colnames(X)[L+1]="X( 0 )"
  for (m in 1:M){
    for (i in 1:(N-m)){
      X[i,L+m+1]=(X[i+1,L+m]-X[i,L+m])/(t[i+1]-t[i]) #Finite differences
    }
    colnames(X)[L+m+1]=paste("X(",m,")")
  }
  for (l in 1:L){
    X[l,L-l+1]=0 #Initialize integral
    for (i in (l+1):N){
      X[i,L-l+1]=X[i-1,L-l+1]+X[i-1,L-l+2]*(t[i]-t[i-1]) #Euler integrals
    }
    colnames(X)[L-l+1]=paste("X(",-l,")")
  }

  #Step 4. Identify the critical derivative

  Var=0 #Initialize variance vector
  for (j in 1:(L+M+1)){
    Var[j]=var(X[,j],na.rm=TRUE)
  }
  jc=which.min(Var) #Critical derivative position (first minimum if there are ties)
  cr=jc-L-1 #Critical derivative
  order=c("first","second","third","fourth")
  if (cr==0){
    omsg="The critical derivative is at the original data set."
  }
  if (cr>0){
    if (cr>4){
      omsg=paste("The critical derivative is at the",cr,"-th derivative of the data set.")
    } else {
      omsg=paste("The critical derivative is at the",order[cr],"derivative of the data set.")
    }
    if (cr==M){
      omsg=paste(omsg,"Please notice that this was the maximum derivative tested.",
        "Try increasing the maximum number of derivatives for finding a global minimum.")
    }
  }
  if (cr<0){
    if (cr<(-4)){
      omsg=paste("The critical derivative is at the",-cr,"-th integral of the data set.")
    } else {
      omsg=paste("The critical derivative is at the",order[-cr],"integral of the data set.")
    }
    if (cr==(-L)){
      omsg=paste(omsg,"Please notice that this was the maximum integral tested.",
        "Try increasing the maximum number of integrals for finding a global minimum.")
    }
  }

  if (disp==TRUE){
    Vardf=data.frame(Var)
    rownames(Vardf)=colnames(X)
    print("Variance vector:")
    print(Vardf)
    print(omsg)
    plot((-L:M),log10(Var),xlab="Variable derivative",ylab="log10(Variance)",
      main="Critical derivative identification",type="l")
  }

  #Step 5. Analyze Markovian behavior

  if (cr==-L){
    print("The critical derivative is at the lowest limit. Please increase L and try again.")
  } else {
    tcr=qt(1-alpha/2,N-abs(cr)-1) #Critical t value
    St=tcr #Initialize Student's T value
    p=0 #Initialize proportion counter
    if (cr>0){
      for (i in 2:(N-cr+1)){
        #Calculation of Student's t value for derivatives
        St[i]=(X[i,L+cr]-X[1,L+cr])/sqrt((i-1)*Var[jc]/3)
        if (abs(St[i])>tcr){
          p=p+1 #Update proportion counter
        }
      }
      p=p/(N-cr+1) #Calculate proportion outside the confidence interval
    } else {
      i0=1-cr #First valid row of the integral column (rows above it are NA when cr<0)
      for (i in (i0+1):N){
        #Calculation of Student's t value for integrals
        St[i]=(X[i,L+cr]-X[i0,L+cr])/sqrt((i-i0)*Var[jc]/3)
        if (abs(St[i])>tcr){
          p=p+1 #Update proportion counter
        }
      }
      p=p/(N+cr-1) #Calculate proportion outside the confidence interval
    }
    if (p>0.5){
      omsg=paste("There is a",p*100,
        "% probability that the emerging deterministic behavior at the critical derivative is significant with a",
        (1-alpha)*100,"% confidence.")
    } else {
      omsg=paste("There is a",(1-p)*100,
        "% probability that the emerging deterministic behavior at the critical derivative is the result of random processes with a",
        (1-alpha)*100,"% confidence.")
    }
    if (disp==TRUE){
      print(omsg)
      x0=X[max(1,1-cr),L+cr] #Reference value: first valid (non-NA) point of the column
      tt=pmax(t-t[max(1,1-cr)],0) #Normalized time elapsed since the reference point
      plot(t,X[,L+cr],xlab="Normalized time",ylab="X(c-1)",
        main=paste("Test for Markovian behavior:",100*(1-alpha),"% confidence"),
        ylim=c(min(min(X[,L+cr],na.rm=TRUE),x0-tcr*sqrt(max(tt)*Var[jc]/3)),
          max(max(X[,L+cr],na.rm=TRUE),x0+tcr*sqrt(max(tt)*Var[jc]/3))))
      lines(t,x0+tcr*sqrt(tt*Var[jc]/3)) #Upper random-walk limit
      lines(t,x0-tcr*sqrt(tt*Var[jc]/3)) #Lower random-walk limit
    }
    #Calculation of SSI for the critical derivative
    SSI=((N-abs(cr)-2)*mean(dt)/(2*max(dt)))*Var[jc+1]
    omsg=paste("The sum of squares of the integral noise (SSI) is:",SSI,
      "in normalized time units, and",SSI/((max(dt))^(2*(cr+1))),"in original time units.")
    if (disp==TRUE){
      print(omsg)
    }
  }
  #Results
  output=data.frame(t,X)
  return(output)
}
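
A usage sketch for the function above is given next. It builds a synthetic two-column data frame (time and data) of the kind dynoise expects, and calls the function only if it has already been defined in the R session. The simulated signal, the seed, and the variable names are illustrative assumptions and do not reproduce any of the case studies.

```r
# Synthetic data: position of a particle driven by white-noise acceleration
set.seed(42)
time <- seq(0, 25, by = 0.1)             # 251 points, time step of 0.1
acc <- rnorm(length(time), sd = 0.01)    # hypothetical random acceleration
vel <- cumsum(acc) * 0.1                 # Euler integration of the acceleration
pos <- cumsum(vel) * 0.1                 # Euler integration of the velocity
example_data <- data.frame(Time = time, Data = pos)

# Run the analysis only when dynoise() is available in the session
if (exists("dynoise")) {
  result <- dynoise(example_data, M = 4, L = 4, alpha = 0.01, disp = FALSE)
  print(head(result))                    # normalized time, integrals and derivatives
}
```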
