School of Computing
University of Glamorgan
Pontypridd CF37 1DL
Wales
UK
Email: (sekemp/idwilson/jaware)@glam.ac.uk
Abstract: The strength of Artificial Neural Networks (ANNs) derives from their perceived capability to infer complex, non-linear, underlying relationships without any a priori knowledge of the model. However, in reality, the fuller the a priori knowledge the better the end result is likely to be.
In this paper we show how to implement the Gamma test. This is a non-linear modelling and analysis tool, which allows us to examine the input/output relationship in a numerical data-set. Since its conception in 1997 there has been a wealth of publications on Gamma test theory and its applications.
The principal aim of this paper is to show the reader how to turn Gamma test theory into a practical implementation, through worked examples and an explicit discussion of all the required algorithms. Furthermore, we show how to implement additional analytical tools and articulate how to use them in conjunction with the non-linear modelling technique employed.
that if two points x′ and x are close together in input space then their corresponding outputs y′ and y should be close in output space. If the outputs are not close together then we consider that this difference is because of noise.

Suppose we are given a data-set of the form {(xi, yi), 1 ≤ i ≤ M}, where the inputs xi are m-dimensional vectors and the outputs yi are scalars. Writing N[i, k] for the index of the kth nearest neighbour of xi, we compute

δM(k) = (1/M) Σ_{i=1}^{M} |xN[i,k] − xi|²   (1 ≤ k ≤ p)

and

γM(k) = (1/2M) Σ_{i=1}^{M} |yN[i,k] − yi|²   (1 ≤ k ≤ p)   (4)

In order to compute Γ a least squares fit regression line is constructed for the p points (δM(k), γM(k)). The intercept on the vertical (δ = 0) axis is the Γ value, as it can be shown that

γM(k) → Var(r) in probability as δM(k) → 0   (5)

Calculating the gradient of the regression line can also provide helpful information on the complexity of the system under investigation (a more detailed discussion is given in Section 1.4.1). A formal mathematical justification of the method can be found in [Evans and Jones, 2002a, Evans and Jones, 2002b].

1.1.1 The Vratio

The Vratio is Γ divided by the variance Var(y) of the outputs. This gives a scale independent measure of the proportion of the output variance that appears to be due to noise.

Under the same premise, for outputs whose inputs lie within distance δ of each other,

E(½|y′ − y|²  given  |x′ − x| < δ) → Var(r) as δ → 0   (8)

The Delta test is based upon the premise that for any particular value given for δ, the expectation in (8) can be estimated by the sample mean

E(δ) = (1/|I(δ)|) Σ_{(i,j)∈I(δ)} ½|yj − yi|²   (9)

where

I(δ) = {(i, j) : |xj − xi| < δ, 1 ≤ i ≠ j ≤ M}   (10)

is the set of index pairs (i, j) for which the associated points (xi, xj) are located within distance δ of each other.

However, when given finite data points, the number of pairs xi and xj satisfying |xi − xj| < δ decreases as δ decreases. Choosing a small δ means that the sample mean E(δ) has significant sampling error. Therefore, δ must be set by the user to be sufficiently large that the sample mean E(δ) provides a good estimate for the expectation in (8). This restriction reduces the effectiveness of E(δ) as an estimate for Var(r).

If there were a parametric form of the relationship between δ and the corresponding sample mean E(δ) we could compute E(δ) for a range of δ and then estimate the limit as δ → 0 using a regression technique. However, a parametric form for the relationship between δ and E(δ) is not apparent.

The Gamma test addresses this issue by defining quantities analogous to δ and E(δ), denoted by δM(k) and γM(k) respectively. These are based on the nearest neighbour structure of the input points xi (1 ≤ i ≤ M) in such a way that γM(k) → Var(r) in probability as δM(k) → 0, as stated in (5).

In Figure 2 variables are defined as follows.

lowestBound: An array that keeps a record of the smallest distance found for a data point xi. This is so that when we try to find the kth nearest neighbour we do not reconsider the (k − 1)th nearest neighbour.

euclideanDistanceTable: A two-dimensional array that stores the Euclidean distance between two data points. The brute force procedure runs faster if this array has been calculated beforehand.

nearNeighbourIndex: Records the index of the kth nearest neighbour. The index is used to calculate the distance between the yN[i,k] associated with xN[i,k] and yi.

yDistanceTable: Records the absolute value of yN[i,k] − yi for each data point xi.

nearNeighbourTable: Records the kth nearest Euclidean distances up to p for each data point xi.
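To make the recipe concrete, the whole procedure — near neighbour lists, the p points (δM(k), γM(k)), and the least squares intercept — can be sketched in a few lines of NumPy. This is a brute force illustration of our own (the function name gamma_test is ours), not an official implementation:

```python
import numpy as np

def gamma_test(x, y, p=10):
    """Return (Gamma, A): the intercept and gradient of the least squares
    line through the p points (deltaM(k), gammaM(k)).

    x : (M, m) array of input vectors;  y : (M,) array of outputs.
    A brute force sketch: the distance table costs O(M^2) time and memory.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    M = len(y)
    # Pairwise Euclidean distances between input vectors.
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # a point is not its own neighbour
    order = np.argsort(d, axis=1)            # near neighbour lists N[i, k]
    delta = np.empty(p)
    gamma = np.empty(p)
    rows = np.arange(M)
    for k in range(1, p + 1):
        nk = order[:, k - 1]                 # index of each kth nearest neighbour
        delta[k - 1] = np.mean(d[rows, nk] ** 2)          # deltaM(k)
        gamma[k - 1] = np.mean((y[nk] - y) ** 2) / 2.0    # gammaM(k)
    A, Gamma = np.polyfit(delta, gamma, 1)   # fit gamma = A * delta + Gamma
    return Gamma, A
```

The intercept estimates Var(r); the gradient A is returned alongside it because, as noted above, it carries information about the complexity of the underlying function.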
If the Euclidean distance returned for xi and xj is zero then they are both (given that Euclidean distance is a symmetric function) zero(th) near neighbours.

Table 1: Input vectors and the output of y = x1 − x2² + x3³

The Euclidean distance table would take the form given in Table 2.

Table 2: The Euclidean distance table (distances shown to two decimal places).

      1      2      3      4
1     -    3.32   5.40   3.74
2     -     -     2.45   1.0
3     -     -      -     2.24
4     -     -      -      -

δ(2) = (1/4)(3.7416573867739413² + 2.449489742783178² + 2.449489742783178² + 2.236067977499792²) = 7.75

δ(3) = (1/4)(5.385164807134504² + 3.3166247903554² + 5.385164807134504² + 3.7416573867739413²) = 20.75

Plotting δ(k) against γ(k) and drawing a regression line will find Γ. In Figure 3 the regression line intercepts the δ(k) = 0 axis at 5.582417582417582 and Vratio = 0.017140. An explicit description of the least squares method can be found in [Stroud, 2001]. Alternatively, most spreadsheet packages can perform this task.

Figure 3: γ(k) plotted against δ(k) with the least squares regression line.

The gradient of the regression line is related to the quantity

A = A(M, k) = ½ Eφ(|∇f(xi)|² cos² θi)   (12)

where θi denotes the angle between the vectors (xN[i,k] − xi) and ∇f(xi).
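The δ(k) values in the worked example can be checked mechanically. The sketch below (our own Python illustration) rebuilds δ(k) from the exact distances underlying Table 2, namely √11, √29, √14, √6, 1 and √5:

```python
import math

# Exact distances behind Table 2 (the table shows them to 2 d.p.).
d = {
    (1, 2): math.sqrt(11), (1, 3): math.sqrt(29), (1, 4): math.sqrt(14),
    (2, 3): math.sqrt(6),  (2, 4): 1.0,           (3, 4): math.sqrt(5),
}

def dist(i, j):
    """Symmetric lookup into the upper-triangular distance table."""
    return d[(min(i, j), max(i, j))]

def delta(k, M=4):
    """delta_M(k): mean squared distance to each point's kth nearest neighbour."""
    total = 0.0
    for i in range(1, M + 1):
        nearest = sorted(dist(i, j) for j in range(1, M + 1) if j != i)
        total += nearest[k - 1] ** 2
    return total / M

print(delta(2))   # approximately 7.75, as in the worked example
print(delta(3))   # approximately 20.75
```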
Under the conditions of a smooth positive sampling density φ, the limiting value of A(M, k) is actually independent of k (although for ergodic sampling over a chaotic attractor this is not always the case; see Figure 11 of [Evans and Jones, 2002b]). Also, if |∇f(xi)|² is independent of cos² θi then

A = ½ Eφ(|∇f(xi)|²) Eφ(cos² θi)   (13)

If m = 1 then θi takes only the values 0 and π. If we assume it takes these values with equal probability then Eφ(cos² θi) = 1. If m > 1 then one might assume instead that the θi are uniformly distributed over [−π, π] and then Eφ(cos² θi) = 1/2. Asymptotically we then have

A = ½ Eφ(|∇f(xi)|²) if m = 1, and A = ¼ Eφ(|∇f(xi)|²) if m ≥ 2.   (14)

The M-test gives an estimate of the number of data points required to train an ANN, which allows the cross-validation set to be used for training and/or testing. An example can be found in [Wilson et al., 2002] where the Gamma test was applied to property price forecasting. Furthermore, if the M-test line stabilises then there is enough data present to construct a smooth model. However, if the line does not stabilise, more data capture may be necessary, or further investigation may be needed to find the possible causes of the noise. An example of the M-test is shown in Figure 4 using the 'noisy sine data' again. Here, the line stabilises at approximately the 160th data point, therefore we require at least 160 data points for the training phase of the ANN.
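The M-test procedure itself is simple to sketch: recompute Γ on growing prefixes of the data and watch for the point at which the estimate settles. The illustration below is our own Python sketch; it uses a crude first nearest neighbour estimate of Γ and synthetic "noisy sine" style data standing in for the paper's data-set:

```python
import numpy as np

def gamma_nn1(x, y):
    """A crude Gamma estimate using first nearest neighbours only:
    gammaM(1) approaches Var(r) as the neighbour distances shrink."""
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # exclude each point itself
    nn = np.argmin(d, axis=1)            # index of the nearest neighbour
    return float(np.mean((y[nn] - y) ** 2) / 2.0)

def m_test(x, y, start=20, step=20):
    """Recompute Gamma on growing prefixes of the data (the M-test line)."""
    sizes = list(range(start, len(y) + 1, step))
    return sizes, [gamma_nn1(x[:M], y[:M]) for M in sizes]

# Synthetic 'noisy sine' style data (our own stand-in, noise variance 0.01).
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 2.0 * np.pi, 400)
y = np.sin(x) + rng.normal(0.0, 0.1, 400)
sizes, gammas = m_test(x, y)
# Once the M-test line stabilises, the values settle near the noise variance.
```

Plotting sizes against gammas reproduces the kind of M-test line shown in Figure 4; the stabilisation point indicates how much data a smooth model needs.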
The near neighbour lists can be computed very easily using a brute force approach, although this is computationally expensive, running at O(M²). Alternatively, they can be computed using a kd-tree [Friedman and Bentley, 1977], which has a run-time of O(M log M), although this method takes more programming effort.

The Gamma test has already been used to improve non-linear models in a variety of problem domains including: crime prediction [Corcoran, 2003]; river level prediction [Durrant, 2002]; and house price forecasting [James and Connellan, 2000, Wilson et al., 2002].

In summary, the Gamma test is a mathematically proven smooth non-linear modelling tool with a wide variety of applications. The Gamma test is still in its infancy and therefore many more problem domains remain where it may prove useful. We entertain the hope that the readers of this paper will implement the Gamma test and apply it to these problem domains.

feature in a property index. RICS Cutting Edge Conference, London.

[Končar, 1997] Končar, N. (1997). Optimisation methodologies for direct inverse neurocontrol. PhD thesis, Department of Computing, Imperial College of Science, Technology and Medicine, University of London.

[Monte, 1999] Monte, R. A. (1999). A random walk for dummies. MIT Undergraduate Journal of Mathematics, 1:143–148.

[Pi and Peterson, 1994] Pi, H. and Peterson, C. (1994). Finding the embedding dimension and variable dependencies in time series. Neural Computation, 5:509–520.

[Stefánsson et al., 1997] Stefánsson, A., Končar, N., and Jones, A. J. (1997). A note on the gamma test. Neural Computing Applications, 5:131–133.
Procedure sortNeighbours(euclideanDistanceTable)
for k = 1 to p do
    for i = 1 to M do
        currentLowest ← ∞
        for j = 1 to M do
            if i == j then
                do nothing because they are the same data point
            else
                distance ← euclideanDistanceTable[i][j]
                if distance > lowestBound[i] and distance < currentLowest then
                    currentLowest ← distance
                    nearNeighbourIndex ← j
                end if
            end if
        end for
        yDistanceTable[i][k] ← | getOutput(nearNeighbourIndex) − getOutput(i) |
        nearNeighbourTable[i][k] ← currentLowest
        lowestBound[i] ← currentLowest
    end for
end for
Return(yDistanceTable, nearNeighbourTable)
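For readers who prefer running code to pseudocode, the procedure in Figure 2 translates almost line for line into Python. This is a sketch of our own: an outputs array plays the role of getOutput, and lowestBound is initialised to zero before the search begins.

```python
import math

def sort_neighbours(euclidean_distance_table, outputs, p):
    """Brute force kth nearest neighbour search, following Figure 2.

    euclidean_distance_table[i][j] holds the precomputed distance |x_i - x_j|;
    outputs[i] plays the role of getOutput(i).
    Returns (y_distance_table, near_neighbour_table): for each point i,
    column k holds |y_N[i,k] - y_i| and the kth nearest distance.
    """
    M = len(outputs)
    lowest_bound = [0.0] * M                 # distances already accounted for
    y_distance_table = [[0.0] * p for _ in range(M)]
    near_neighbour_table = [[0.0] * p for _ in range(M)]
    for k in range(p):
        for i in range(M):
            current_lowest = math.inf
            near_neighbour_index = i
            for j in range(M):
                if i == j:
                    continue                 # same data point
                distance = euclidean_distance_table[i][j]
                # Skip neighbours already used for smaller k.
                if lowest_bound[i] < distance < current_lowest:
                    current_lowest = distance
                    near_neighbour_index = j
            y_distance_table[i][k] = abs(outputs[near_neighbour_index] - outputs[i])
            near_neighbour_table[i][k] = current_lowest
            lowest_bound[i] = current_lowest
    return y_distance_table, near_neighbour_table
```

The O(M log M) kd-tree alternative mentioned in the text is available off the shelf: for example, scipy.spatial.cKDTree(x).query(x, k=p+1) can produce the same near neighbour distances without building the O(M²) table.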