Efficient implementation of the Gaussian kernel algorithm in estimating invariants and noise level from noisy time series data

Dejin Yu,^{1,*} Michael Small,^1 Robert G. Harrison,^1 and Cees Diks^2
^1 Department of Physics, Heriot-Watt University, Riccarton, Edinburgh EH14 4AS, United Kingdom
^2 CeNDEF, Department of Economics, University of Amsterdam, Roetersstraat 11, NL-1018 WB Amsterdam, The Netherlands
(Received 2 June 1999)
We describe an efficient algorithm which computes the Gaussian kernel correlation integral from noisy time series; this is subsequently used to estimate the underlying correlation dimension and noise level in the noisy data. The algorithm first decomposes the integral core into two separate calculations, reducing the computing time from O(N^2 N_b) to O(N^2 + N_b^2). With further improvements, this algorithm can speed up the calculation of the Gaussian kernel correlation integral by a factor of (2-10)N_b. We use typical examples to demonstrate the use of the improved Gaussian kernel algorithm.
follows [12]. (1) The Gaussian kernel correlation integral T_m(h) for the noise-free case scales as

T_m(h) \equiv \int dx\, \mu_m(x) \int dy\, \mu_m(y)\, e^{-\|x-y\|^2/4h^2} \propto e^{-mK\tau} h^D,   (2)

where D and K are the correlation dimension and entropy, h is referred to as the bandwidth, and \mu_m(x) is the distribution function. The scaling behavior T_m(h) \propto e^{-mK\tau} h^D in Eq. (2) was first justified by Ghez et al. [19] and later by Diks [12] with inclusion of the factor (1/m)^{D/2}, in which the m dependence was originally introduced by Frank et al. [20] to improve convergence of K. (2) In the presence of Gaussian noise, the distribution function \rho_m(y) can be expressed in terms of a convolution between the underlying noise-free distribution function \mu_m(x) and a normalized Gaussian distribution function with standard deviation \sigma [12,21], i.e.,

\rho_m(y) = \int dx\, \mu_m(x)\, g_m(y-x) = \left( \frac{1}{\sqrt{2\pi}\,\sigma} \right)^{m} \int dx\, \mu_m(x)\, e^{-\|y-x\|^2/2\sigma^2},   (3)

where

g_m(y-x) = \left( \frac{1}{\sqrt{2\pi}\,\sigma} \right)^{m} e^{-\|y-x\|^2/2\sigma^2}   (4)

accounts for noise effects in the m-dimensional space; \|\cdot\| is the Euclidean (L_2) norm.

Under conditions (2) and (3), the Gaussian kernel correlation integral T_m(h) in the presence of Gaussian noise and the corresponding scaling law become [12]

T_m(h) = \int dx\, \rho_m(x) \int dy\, \rho_m(y)\, e^{-\|x-y\|^2/4h^2}   (5)

       = \left( \frac{h^2}{h^2+\sigma^2} \right)^{m/2} \int dx\, \mu_m(x) \int dy\, \mu_m(y)\, e^{-\|x-y\|^2/4(h^2+\sigma^2)}

       \approx \Phi \left( \frac{h^2}{h^2+\sigma^2} \right)^{m/2} e^{-mK\tau} (h^2+\sigma^2)^{D/2}\, m^{-D/2}   (6)

for h^2 + \sigma^2 \to 0 and m \to \infty, where \Phi is a normalization constant.

In Eq. (6), D and K are the two invariants to be estimated, and the parameter \sigma is referred to as the noise level, defined as

\sigma = \frac{\sigma_n}{\sigma_s} = \frac{\sigma_n}{\sqrt{\sigma_c^2+\sigma_n^2}},   (7)

where \sigma_s, \sigma_c, and \sigma_n are the standard deviations of the input noisy signal s_i, the underlying clean component c_i, and the Gaussian noise part n_i, respectively. In the total signal s_i = c_i + n_i, we have assumed c_i and n_i to be statistically independent, i.e., \langle c_i n_i \rangle = 0 and \sigma_s^2 = \sigma_c^2 + \sigma_n^2.
The numerical implementation of the Gaussian kernel algorithm requires the transformation of the input time series data {s_i : i = 1, 2, ..., N_s} into a new time series {v_i : i = 1, 2, ..., N_s} according to

v_i = \frac{s_i - \langle s \rangle}{\sigma_s}.   (8)

Under the transformation (8), the noise effect is described by the distribution function (4) and the standard deviation of the noise part is \sigma in Eq. (7). Accordingly, the delay state vectors are reconstructed by replacing s_i with v_i in Eq. (1).
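For concreteness, a minimal numerical sketch of the transformation (8) and of the delay reconstruction it feeds is given below; it assumes the usual delay vectors x_i = (v_i, v_{i+\tau_d}, ..., v_{i+(m-1)\tau_d}) for Eq. (1), and the function names are ours rather than the paper's:

import numpy as np

def normalize_series(s):
    """Transformation (8): subtract the mean and rescale by the standard
    deviation, so that the noise part of the result has standard deviation
    equal to the noise level sigma of Eq. (7)."""
    s = np.asarray(s, dtype=float)
    return (s - s.mean()) / s.std()

def delay_embed(v, m, tau=1):
    """Delay vectors x_i = (v_i, v_{i+tau}, ..., v_{i+(m-1)tau}) built from
    the normalized series, as assumed for Eq. (1)."""
    n_vec = len(v) - (m - 1) * tau
    return np.column_stack([v[k * tau : k * tau + n_vec] for k in range(m)])

A typical call would be X = delay_embed(normalize_series(s), m), after which X feeds the correlation sums below.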
In the case of discrete sampling, we assume vector points on the attractor to be dynamically independent and distributed according to \rho_m(x), and use an average over delay vectors to replace the integrals over the vector distributions in Eq. (5). Consequently, T_m(h) can be computed by [12]

T_m(h) = \frac{2}{N(N-1)} \sum_{i=1}^{N} \sum_{j>i}^{N} e^{-\|x_i - x_j\|^2/4h^2}.   (9)

In practice, T_m(h) is calculated at a series of discrete bandwidth values h_k (k = 0, 1, 2, ..., N_b). The parameters D, K, and \sigma are then extracted by fitting the scaling relation (6) to the Gaussian kernel correlation sum T_m(h) computed from Eq. (9). One can see that such a direct implementation of the GKA has a computational complexity of O(N^2 N_b).
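A direct rendering of Eq. (9), evaluated at every bandwidth, makes the O(N^2 N_b) cost explicit. The sketch below is our own NumPy/SciPy version and assumes the delay vectors X from the reconstruction above:

import numpy as np
from scipy.spatial.distance import pdist

def gaussian_kernel_sum_direct(X, bandwidths):
    """Eq. (9): for every bandwidth h the sum runs over all N(N-1)/2
    interpoint distances, i.e., O(N^2 N_b) work in total."""
    N = len(X)
    d2 = pdist(X, metric="sqeuclidean")      # ||x_i - x_j||^2 for i < j
    norm = 2.0 / (N * (N - 1))
    return np.array([norm * np.exp(-d2 / (4.0 * h * h)).sum()
                     for h in bandwidths])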
C. Nonlinear fitting

After computing the Gaussian kernel correlation sum, the parameters D, K, and \sigma in Eq. (6) can be extracted using nonlinear least-squares fitting [22]. Here we exemplify the GKA described above; a robust fitting procedure will be presented in Sec. IV.

Figure 1 illustrates the fitted D, K, and \sigma as functions of the embedding dimension. The clean data are generated by the standard Hénon map [23]. The noisy data are produced by adding 5% Gaussian noise to the output of the first variable. In calculating T_m(h) from Eq. (9), we use the following parameters: N = 5000, N_ref = 500, N_b = 100, log_2(\varepsilon_l) = -10, and m = 1, 2, ..., 10; we choose \varepsilon_u to be equal to the attractor diameter. A definition and discussion of these parameters will be detailed below. As seen in Fig. 1, the fitted correlation exponents D and K and the noise level \sigma converge to their true values when the embedding dimension is increased beyond m = 3. We note that convergence in D and \sigma is readily achieved while K exhibits fluctuations, since saturation should in principle only appear as m \to \infty. In this example, the average values of the correlation exponents and noise level, taken over embedding dimensions m = 3-10, are ob-
T_m(h) = \frac{2}{N(N-1)} \sum_{k=0}^{N_b} \sum_{i=1}^{N} \sum_{j>i}^{N} \Theta(\varepsilon_{ij}; \varepsilon_k, \varepsilon_{k+1})\, e^{-\varepsilon_k^2/4h^2},   (10)

where \varepsilon_{ij} = \|x_i - x_j\|, and k = 0 and N_b correspond to the minimum (\varepsilon_l) and maximum (\varepsilon_u) distances, respectively. Here we assume that \varepsilon_l \approx 0 and that \varepsilon_u equals the diameter of the attractor. The binning function is

\Theta(\varepsilon; \varepsilon_1, \varepsilon_2) = \begin{cases} 1 & \text{if } \varepsilon_1 \le \varepsilon < \varepsilon_2, \\ 0 & \text{otherwise.} \end{cases}   (11)

Noticing that the term \sum_{i=1}^{N} \sum_{j>i}^{N} \Theta(\varepsilon_{ij}; \varepsilon_k, \varepsilon_{k+1}) in Eq. (10) depends on the index k only, this allows us to decompose Eq. (10) into two separate steps: (1) calculation of the binned interpoint distance distribution \Delta C_m(\varepsilon_k), given by

\Delta C_m(\varepsilon_k) = \frac{2}{N(N-1)} \sum_{i=1}^{N} \sum_{j>i}^{N} \Theta(\|x_i - x_j\|; \varepsilon_k, \varepsilon_{k+1});   (12)

and (2) calculation of T_m(h) from the binned distribution,

T_m(h) = \sum_{k=0}^{N_b} \Delta C_m(\varepsilon_k)\, e^{-\varepsilon_k^2/4h^2}.   (13)
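A sketch of this two-step scheme: the pair distances are binned once [Eq. (12)], after which every bandwidth needs only a sum over the N_b bins [Eq. (13)], so the O(N^2) pair loop is paid a single time. The NumPy rendering below is our own, with uniformly spaced bin edges \varepsilon_k as one possible choice:

import numpy as np
from scipy.spatial.distance import pdist

def binned_distance_distribution(X, eps_edges):
    """Eq. (12): normalized histogram of interpoint distances over the bins
    [eps_k, eps_{k+1}); distances above eps_edges[-1] are simply not binned."""
    N = len(X)
    counts, _ = np.histogram(pdist(X), bins=eps_edges)
    return 2.0 * counts / (N * (N - 1))

def gaussian_kernel_sum_binned(delta_C, eps_edges, bandwidths):
    """Eq. (13): T_m(h) = sum_k Delta C_m(eps_k) exp(-eps_k^2/4h^2),
    evaluated at the lower bin edges eps_k."""
    eps_k = eps_edges[:-1]
    return np.array([(delta_C * np.exp(-eps_k**2 / (4.0 * h * h))).sum()
                     for h in bandwidths])

With, e.g., eps_edges = np.linspace(0.0, eps_u, N_b + 1), the total cost is O(N^2 + N_b^2) rather than O(N^2 N_b), as stated in the abstract.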
The correlation sum (13) has a corresponding integral representation,

T_m(h) = \int_0^{\infty} \lambda_m(\varepsilon)\, w(\varepsilon/h)\, d\varepsilon,   (15)

where \lambda_m(\varepsilon) is given by

\lambda_m(\varepsilon) = \frac{dC_m(\varepsilon)}{d\varepsilon},   (16)

and C_m(\varepsilon) is the GP correlation integral with the hard kernel function w(r/\varepsilon) = \Theta(\varepsilon - r). Substituting Eq. (16) into Eq. (15), the Gaussian kernel correlation integral T_m(h) is expressed as

T_m(h) = \int_0^{\infty} e^{-\varepsilon^2/4h^2}\, dC_m(\varepsilon).   (17)

This is the integral form of Eq. (13). If one performs a partial integration in Eq. (17), T_m(h) can be expressed in the form

T_m(h) = \frac{1}{2h^2} \int_0^{\infty} \varepsilon\, e^{-\varepsilon^2/4h^2}\, C_m(\varepsilon)\, d\varepsilon.   (18)

In previous work, this formula was used to estimate T_m(h) [24,25].
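For completeness, the partial integration behind Eq. (18) can be spelled out; a sketch, assuming C_m(0) = 0 and a vanishing boundary term at infinity:

T_m(h) = \int_0^{\infty} e^{-\varepsilon^2/4h^2}\, dC_m(\varepsilon)
       = \Big[ e^{-\varepsilon^2/4h^2}\, C_m(\varepsilon) \Big]_0^{\infty}
         + \int_0^{\infty} \frac{\varepsilon}{2h^2}\, e^{-\varepsilon^2/4h^2}\, C_m(\varepsilon)\, d\varepsilon
       = \frac{1}{2h^2} \int_0^{\infty} \varepsilon\, e^{-\varepsilon^2/4h^2}\, C_m(\varepsilon)\, d\varepsilon.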
IV. COMPUTATIONAL CONSIDERATIONS

A. Efficient binning

The use of the simplified algorithm given by Eqs. (12) and (13) can speed up the calculation of T_m(h). For instance, for N = 10 000 and N_b = 200, this gives rise to a gain factor of about 200. We note that most of the time is consumed in computing the binned interpoint distance distribution \Delta C_m(\varepsilon_k) in Eq. (12), which takes N(N-1) operations. Apart from the remarkable advance obtained by using Eqs. (12) and (13), a further reduction of computing time is achievable. In this algorithm, we adopt three improvements; a sketch combining all three follows the list below.

(1) A set of N_ref representative points randomly chosen on the attractor is used for the sum over i to replace the original N points in Eq. (12). This reduces O(N(N-1)) to O((N-1)N_ref). Usually, we use N_ref = N/10.

(2) We observe that large distances make a negligible contribution to the Gaussian kernel correlation sum T_m(h), as in the GP algorithm [26]. Thus we set an upper limit of the correlation distance, \varepsilon_u, smaller than the attractor diameter, above which the binning is not done. This can save computing time by a factor of 10^0-10^1, depending on the value of \varepsilon_u.

(3) A recursive version is used. Since we compute \Delta C_m(\varepsilon_k) for a series of consecutive embedding dimensions m = 1, 2, ..., M, and in the L_2 norm \varepsilon_{ij}^2(m+1) = \varepsilon_{ij}^2(m) + (v_{i+m} - v_{j+m})^2, the distances computed in dimension m are successively reused in dimension m+1.

These three binning methods are found to be very efficient and speed up the calculation by a factor of 10^1-10^2 at least.
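The sketch below combines the three improvements for a unit delay; the random choice of reference points, the default N_ref = N/10, and the helper name are our own choices within the scheme described above:

import numpy as np

def binned_distances_recursive(v, M, eps_edges, n_ref=None, seed=0):
    """Delta C_m(eps_k) for m = 1, ..., M using (1) a random subset of N_ref
    reference points for the sum over i, (2) an upper cutoff eps_u =
    eps_edges[-1] above which pairs are not binned, and (3) the recursive
    update eps_ij^2(m+1) = eps_ij^2(m) + (v_{i+m} - v_{j+m})^2."""
    v = np.asarray(v, dtype=float)
    n_vec = len(v) - (M - 1)                   # delay vectors valid for every m <= M
    rng = np.random.default_rng(seed)
    n_ref = max(1, n_vec // 10) if n_ref is None else n_ref
    ref = rng.choice(n_vec, size=n_ref, replace=False)
    not_self = np.ones((n_ref, n_vec), dtype=bool)
    not_self[np.arange(n_ref), ref] = False    # exclude the i = j pairs
    norm = 1.0 / (n_ref * (n_vec - 1))

    d2 = np.zeros((n_ref, n_vec))              # squared distances before the first coordinate
    delta_C = []
    for m in range(M):
        d2 += (v[ref + m][:, None] - v[m : m + n_vec][None, :]) ** 2
        counts, _ = np.histogram(np.sqrt(d2[not_self]), bins=eps_edges)
        delta_C.append(norm * counts)          # entry for embedding dimension m + 1
    return np.array(delta_C)

T_m(h) then follows by applying gaussian_kernel_sum_binned above to each row of the result.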
scheme can be adopted. In the first run, a trial h c , for ex-
ample, h c 0.5, is used so as to obtain a fitted . The value
B. Robust fitting procedure
of 3 is in turn adopted as the cutoff bandwidth h c .
Direct fitting based on Eq. 6 does not make sense. We Caution should be used when the noise level is either high
find that though D and can be extracted with a high preci- or low. For example, a 30% noise level will give rise to h c
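A condensed sketch of the three steps using scipy.optimize.curve_fit. The initial guesses, the nonnegativity bounds, and the function name are ours; the measured standard deviations of T_m(h) would be passed through curve_fit's sigma argument to realize the weighting just described:

import numpy as np
from scipy.optimize import curve_fit

def fit_three_step(h, T_m, T_m1, m, tau=1.0):
    """Fit D, K, sigma from correlation sums T_m, T_m1 (dimensions m, m+1)
    sampled at the bandwidths h inside the scaling region."""
    # Step 1, Eq. (19): fit D, psi = Phi*exp(-m*K*tau), and sigma.
    def model19(h, D, psi, sigma):
        return psi * h**m * m**(-D / 2.0) * (h**2 + sigma**2)**((D - m) / 2.0)
    (D, psi, sigma), _ = curve_fit(model19, h, T_m, p0=(2.0, 1.0, 0.1),
                                   bounds=(0.0, np.inf))

    # Step 2, Eq. (20): fit K from the ratio T_{m+1}/T_m with D and sigma fixed.
    def model20(h, K):
        return (m / (m + 1.0))**(D / 2.0) * h * np.exp(-K * tau) / np.sqrt(h**2 + sigma**2)
    (K,), _ = curve_fit(model20, h, T_m1 / T_m, p0=(0.5,))
    phi = psi * np.exp(m * K * tau)

    # Step 3, Eq. (6): refit D, K, sigma with Phi fixed, starting from the trial values.
    def model6(h, D, K, sigma):
        return (phi * (h**2 / (h**2 + sigma**2))**(m / 2.0) * np.exp(-m * K * tau)
                * (h**2 + sigma**2)**(D / 2.0) * m**(-D / 2.0))
    (D, K, sigma), _ = curve_fit(model6, h, T_m, p0=(D, K, sigma),
                                 bounds=(0.0, np.inf))
    return D, K, sigma, phi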
C. Choice of cutoff bandwidth h_c

There exists an unsolved technical problem in the nonlinear fitting procedure described above, that is, the determination of the largest bandwidth h_c to be used within the scaling region. This is a difficult task in the presence of noise. The reason is simple: noise masks the scaling region at small scales. Intuitively, h_c should be small enough to keep the fit inside the scaling region presented by the underlying noise-free dynamics, but h_c cannot be so small as to prevent extracting D. On the other hand, h_c should not be so large as to exceed the scaling region. In the general case, we suggest the choice h_c = 3\sigma.

Numerical simulations show that the fitted \sigma is insensitive to the choice of the cutoff bandwidth h_c. Thus, an iteration scheme can be adopted: in the first run, a trial h_c, for example h_c = 0.5, is used so as to obtain a fitted \sigma; the value 3\sigma is in turn adopted as the cutoff bandwidth h_c. Caution should be used when the noise level is either high or low. For example, a 30% noise level gives rise to h_c = 0.9, which is close to the upper limit of the linear scaling region for most chaotic attractors tested. On the other hand, when the noise level is low, say below 3%, the value 3\sigma is too small to be used for h_c. In practice, we recommend fitting D, K, and \sigma for a series of cutoff bandwidth values starting from a small value and from this iden-
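The iteration scheme above (trial h_c, then h_c = 3\sigma) can be written as a small wrapper around the fit; fit_three_step is the sketch from Sec. IV B, and the fixed two-pass stopping rule is our simplification:

def fit_with_cutoff(h, T_m, T_m1, m, h_c_trial=0.5, n_pass=2):
    """Fit D, K, sigma with an iteratively chosen cutoff bandwidth:
    fit on h <= h_c, then refit with h_c = 3*sigma (NumPy arrays expected)."""
    h_c = h_c_trial
    for _ in range(n_pass):
        sel = h <= h_c
        D, K, sigma, phi = fit_three_step(h[sel], T_m[sel], T_m1[sel], m)
        h_c = 3.0 * sigma                      # adopt 3*sigma as the new cutoff
    return D, K, sigma, h_c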
[1] Measures of Complexity and Chaos, edited by N. B. Abraham, A. M. Albano, A. Passamante, and P. E. Rapp (Plenum, New York, 1989).
[2] P. Grassberger and I. Procaccia, Phys. Rev. Lett. 50, 346 (1983); Physica D 9, 189 (1983).
[3] Dimensions and Entropies in Chaotic Systems, edited by G. Mayer-Kress (Springer-Verlag, Berlin, 1986).
[4] J. Theiler, J. Opt. Soc. Am. A 7, 1055 (1990).
[5] T. Schreiber and H. Kantz, in Predictability of Complex Dynamical Systems, edited by Y. A. Kravtsov and J. B. Kadtke (Springer, New York, 1996).
[6] E. Ott, E. D. Yorke, and J. A. Yorke, Physica D 16, 62 (1985).
[7] M. Möller, W. Lange, F. Mitschke, N. B. Abraham, and U. Hübner, Phys. Lett. A 138, 176 (1989).
[8] R. L. Smith, J. R. Stat. Soc., Ser. B (Methodol.) 54, 329 (1992).
[9] G. G. Szpiro, Physica D 65, 289 (1993).
[10] T. Schreiber, Phys. Rev. E 48, R13 (1993).
[11] J. C. Schouten, F. Takens, and C. M. van den Bleek, Phys. Rev. E 50, 1851 (1994).
[12] C. Diks, Phys. Rev. E 53, R4263 (1996).
[13] H. Oltmans and P. J. T. Verheijen, Phys. Rev. E 56, 1160 (1997).
[14] D. Kugiumtzis, Int. J. Bifurcation Chaos Appl. Sci. Eng. 7, 1283 (1997).
[15] J. Argyris, I. Anderadis, G. Pavlos, and M. Athanasiou, Chaos, Solitons and Fractals 9, 343 (1998).
[16] T. Schreiber, Phys. Rev. E 56, 274 (1997).
[17] J. G. Caputo and P. Atten, Phys. Rev. A 35, 1311 (1987).
[18] F. Takens, in Dynamical Systems and Turbulence, Warwick, 1980, edited by D. A. Rand and L. S. Young (Springer-Verlag, New York, 1981), Vol. 898, pp. 366-381.
[19] J. M. Ghez and S. Vaienti, Nonlinearity 5, 777 (1992); J. M. Ghez, E. Orlandini, M. C. Tesi, and S. Vaienti, Physica D 63, 282 (1993).
[20] M. Frank, H.-R. Blank, J. Heindl, M. Kaltenhauser, H. Kochner, W. Kreische, N. Müller, S. Poscher, R. Sporer, and T. Wagner, Physica D 65, 359 (1993).
[21] M. Casdagli, S. Eubank, J. D. Farmer, and J. Gibson, Physica D 51, 52 (1991).
[22] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in FORTRAN, 2nd ed. (Cambridge University Press, Cambridge, 1992).
[23] A. Wolf, J. B. Swift, H. L. Swinney, and J. A. Vastano, Physica D 16, 285 (1985).
[24] T. Schreiber, Phys. Rep. 308, 1 (1999).
[25] A large error in estimating D, K, and \sigma may result. This is because Eq. (17) is derived under the condition that \varepsilon_l \to 0 and \varepsilon_u \to \infty; this is not the case when T_m(h) is numerically computed.
[26] J. Theiler, Phys. Rev. A 36, 4456 (1987).
[27] Dejin Yu, M. Small, R. G. Harrison, C. Robertson, G. Clegg, M. Holzer, and F. Sterz, Phys. Lett. A 265, 68 (2000).
[28] J. Theiler, Phys. Rev. A 34, 2427 (1986).