2014 - Stewart Et Al - Grade Estimation From Radial Basis Functions

Grade Estimation from Radial Basis Functions – How Does it Compare with Conventional Geostatistical Estimation?
M Stewart1, J de Lacey2, P F Hodkiewicz3,4 and R Lane5
ABSTRACT
Implicit modelling is an approach to spatial modelling in which the distribution of a target variable
is described by a unique mathematical function that is derived directly from the underlying data
and high-level parametric controls specified by the user. This modelling approach may be applied
to discrete variables such as lithology (after converting the discrete codes to numeric values)
or to continuous variables such as geochemical grades. This paper discusses the estimation of
continuous (grade) variables using implicit modelling.
One of the underlying engines of implicit modelling for producing this mathematical function
description is the radial basis function (RBF). In essence, the RBF is a weighted sum of functions
positioned on each data point. A system of linear equations is solved to derive the weights and the
coefficients of any underlying drift model. Once derived, the RBF may be evaluated at
any unsampled point or averaged over any volume to provide an estimate of grade. It is possible,
for example, to query the RBF on a regular grid to derive an estimate of block grades. Given the
ease of creation of an RBF, and its ability to predict grade, the question arises as to how the grades
derived from the solution of an RBF compare with grade estimates derived from conventional
geostatistical interpolation methods (eg ordinary kriging (OK)).
The purpose of this paper is to describe in lay terms:
• the basic structure of an RBF
• the role of parametric choice in the solution of RBFs and how this influences the character of the solution
• the fundamental similarities and differences between RBFs and conventional geostatistical estimators.
Using a high-resolution conditional simulation, we show that in many situations RBF and OK
estimates are very similar.
INTRODUCTION
In recent years, implicit wireframe models have increasingly been used to develop coherent 3D shapes for subsequent use in estimation via traditional methods (eg ordinary kriging (OK)). The term ‘implicit modelling’ was introduced to the task of modelling geological surface geometries by Cowan et al (2003). Implicit modelling describes an approach to spatial modelling in which a combination of the data and parametric controls specified by the user define a unique mathematical volume function. This approach may be applied to the modelling of surfaces from categorical variables, such as lithology, or to the modelling of continuous variables, such as geochemical grades throughout space. The most common function currently in use for implicit modelling is the radial basis function (RBF). The term implicit is used because the surface being modelled exists implicitly within the volume function as an iso-potential surface defined by the data rather than by an explicit drawing process. This volume function may then be gridded, or ‘rendered’, into a wireframe for visualisation or subsequent modelling use.
The method of implicit modelling is now widely used for the modelling of surface geometry from categorical logging data and for the modelling of ‘grade iso-surfaces’ based on continuous grade variables. What many people are unaware of is that the implicit models used to generate ‘grade iso-surfaces’ can also provide point or block estimates of grade. In many situations, these are very similar to estimates obtained from more familiar estimation methods such as OK. There is a reason for this – it can be shown (Carr et al, 2001; Chiles
1. MAusIMM, Senior Principal Consultant, QG Pty Ltd, PO Box 1304, Fremantle WA 6959. Email: ms@qgconsulting.com.au
2. Technical Services Team Leader, ARANZ Geo Ltd, 41 Leslie Hills Drive, Riccarton, Christchurch 8011, New Zealand. Email: jacob.delacey@leapfrog3d.com
3. FAusIMM, Senior Manager Resource Evaluation, BHP Billiton, 125 St Georges Terrace, Perth WA 6000. Email: paul.hodkiewicz@bhpbilliton.com
4. Adjunct Research Fellow, Centre for Exploration Targeting, The University of Western Australia, 35 Stirling Highway, Crawley WA 6009.
5. Research Director, ARANZ Geo Ltd, 41 Leslie Hills Drive, Riccarton, Christchurch 8011, New Zealand. Email: richard.lane@leapfrog3d.com
NINTH INTERNATIONAL MINING GEOLOGY CONFERENCE / ADELAIDE, SA, 18–20 AUGUST 2014 129
M STEWART et al
and Delfiner, 1999; Costa, Pronzato and Thierry, 1999) that the RBF is mathematically equivalent to a particular formulation of kriging (dual kriging (DK)). In practice, estimates derived from RBFs are also often very similar to those produced by OK.
The purpose of this paper is to describe (in simple terms) the basic structure of an RBF and to illustrate the similarities this has to kriging. We will also briefly discuss the role of parameter choice in the solution of RBFs and show how this influences the character of the solution.
GENERAL METHODOLOGY
The proportion of a deposit physically sampled is invariably a tiny proportion of the in situ reality. Even the highest density of sampling typically conducted (eg blast hole sampling for final grade control) will only physically test around 0.1 per cent (one part in 1000) of the total volume. For resource estimation drill patterns, this proportion may be less than 0.001 per cent (one part in 100 000). A model is required to estimate what is in the unsampled volume.
The purpose of models is to predict ‘reality’ and reduce the need for further sampling. In a situation like mining grade control, we must make predictions of metal grade in order to decide what material to mine and process as ore and what to send to waste. What is mined and processed is of course reality, not the model.
Ultimately, the only means for assessing the quality of a model is reconciliation between the predicted tonnage and grade from the model and the tonnage and back-calculated head grade of the milled material. However, there are many other factors between model and mill that influence this comparison (eg ore-blocking, blast movement, ore-loss and dilution, tails sampling, conveyor mass calculations etc). Reconciliation can provide, at best, a loose indication of the quality of a resource model, not an objective measure.
The ultimate method of objectively assessing the quality of any estimation method would be to obtain exhaustive information, extract a subset of this information and use this to perform an estimate, and then compare the estimate with the measured observations. This is clearly impractical (except perhaps at laboratory scale). The better (and in fact only practical) alternative is to use a synthetic version of reality – a conditional simulation.
This study compares the results of grade estimation based on evaluation of an RBF against grade estimates from the generally accepted method of OK. Relative comparisons are interesting, but these provide no indication of which method is ‘best’ as this must be derived from a comparison with an independent ground truth. Consequently, in this study, we have adopted an experimental approach of using a simulation model to represent the underlying ‘reality’, drawn data sets from this reality and used this data to make grade estimates by both the RBF and OK.
Results from the RBF and OK are then compared, both against each other and against the underlying reality. This allows the study to concentrate on the real matter at hand – the relative performance of one method of grade interpolation (OK) with another (the RBF) in predicting ‘reality’.
This study has deliberately not used a case study based on real data. The only outcome possible from a case study comparison of estimation methods is to produce two separate grade estimates, with no idea whether either is ‘fit for purpose’ or which is the better predictor. Assuming both estimates are fit for purpose, the matter of interest for the user in this scenario is whether one method provides advantages compared to the other in terms of cost or speed. We prefer to leave it up to the practitioners and software manufacturers to make that case.
Study workflow
The generalised workflow for this paper is as follows:
• create a simulation on a 300 × 300 × 10 m grid (1 × 1 × 1 m nodes) using a Gaussian random function (sequential Gaussian simulation)
• back-transform the Gaussian values to a negatively-skewed distribution with a mean of 54 per cent (approximating an Fe ore)
• sample ‘drill holes’ from the simulation on a 10 × 10 m pattern
• define a variogram based on sampled ‘drill holes’
• estimate grades on to 1 × 1 × 1 m nodes using OK
• create the RBF and evaluate it onto 1 × 1 × 1 m nodes
• compare the RBF and OK estimates against underlying reality.
The purpose of estimating on the underlying simulation nodes rather than on to blocks is to enable a comparison of the estimates to the underlying ‘reality’, and to show the ‘field’ lines of the OK interpolant when performed at fine scale.
THE IDEA OF INTERPOLATION
Interpolation is the process of predicting (estimating) the value of an attribute at an unsampled location from measurements of the attribute made at surrounding sites (Figure 1). In linear interpolation, the grade at the target location is calculated as a weighted linear average of the sample data. Different interpolators use different methods for determining the value of the weights.
FIG 1 – An illustration of interpolation with a local neighbourhood. x0 is the target point, z(xi) are the measured values at location xi, λi is the weight given to sample i.
When the point to be estimated is within the field of available data, the process is called interpolation; when the point is outside the data field, the process is termed extrapolation. This process may be undertaken in one, two, three or four dimensions. Typically in mineral resource estimation, we are concerned with practical three-dimensional problems – predicting the grade of an attribute (eg a metal grade) at unsampled locations from values measured in scattered drill samples. It is an underlying assumption that the attribute we are attempting to predict is spatially continuous – that it takes a real value at all possible locations.
There are many different forms of interpolator possible. The most basic is the piece-wise constant method, more familiarly known as nearest neighbour estimation, in which any unsampled location simply takes the value of the nearest data point. The resultant continuous estimator takes the form of a mosaic pattern, with patches of constant grade separated by sudden steps. This is not a very realistic representation of the way in which real attributes, like metal grades, are observed to vary in practice, and it gives significantly different weights to the samples at the spatial extremities of the data set. For the sake of simplicity in discussion, this paper will consider that the attribute being predicted is the grade of a metal; however, the idea can be simply extended to any continuous variable.
When considering the merits of different methods of interpolation, it is helpful to start by looking at the observed characteristics of real attributes in nature. The following general observations can be made about most metal distributions:
• ‘Average’ grades vary within a deposit. While this is a scale-dependent phenomenon, typically it is possible to draw contours of grades and delineate higher-grade areas from lower. Consequently, observations are usually not independent – grades at close distances are more similar than grades further apart. This is modelled using auto-correlation, which decays as a function of distance (correlated behaviour).
• The strength of correlation may differ in different directions (anisotropy).
• At some separation, correlation no longer exists (range).
• There is usually a non-spatial component to variability. Repeat measurements (eg duplicate samples) are not identical due to errors in sampling and analysis and/or geological variation at scales smaller than a drill sample (nugget effect).
The kriging interpolator (local)
Interpolators fall into two general types – global or local. A global interpolator takes into consideration all known points to estimate a value, while local interpolation uses a subset of the data, usually defined by a search neighbourhood centred on the point being estimated.
The interpolator that is probably used most commonly in mining is kriging, or more particularly OK. The general idea is straightforward – the estimation of a point is based on a weighted linear combination of local data values, with the weights being calculated in such a way as to minimise the error variance based on an assumed model for spatial covariance. Kriging is underpinned by a number of key assumptions:
• The underlying assumption is that the sample observations are interpreted to be the results of a random process. The variable under study (for example Fe grades) can then be described by a mathematical random function. This conceptualisation of the data is simply a neat trick that allows us to describe reality as the outcome of a probabilistic model.
• The key step in geostatistical modelling is the adoption of a spatial model (the variogram) that describes the underlying random function. The choice of spatial model is normally based on fitting a function to the available experimental data, although there is no explicit link and this function is often chosen more for mathematical convenience than being derived from an analysis of the mineralisation process.
• Adoption of a model that summarises the random process allows the error variance (the variance of the ‘on average’ difference between estimated and true grade) to be expressed in terms of spatial covariances and weighting factors applied to the samples (kriging weights). Spatial covariances are specified by the choice of variogram model made and the locations of the data. Conventional algebra provides the means to find the set of kriging weights that minimises the error variance.
These attributes are characteristic of all kriging systems. The most common variants of kriging are simple kriging (SK), OK and universal kriging (UK). What distinguishes these is the way in which variation in mean grade (drift) is incorporated into the kriging systems.
Dealing with drift
Any variable described by a random function can be decomposed into a deterministic drift component and a residual stochastic component, as follows:
$$Z(x) = m(x) + Y(x)$$
where:
Z(x) is the random function at location x (eg Fe grades)
m(x) is the mean grade at location x
Y(x) is the residual random function
Usually, it is assumed that the drift can be expressed locally as a linear combination of known basis functions – usually polynomial terms of low degree ($f^i$) – with unknown coefficients ($a_i$):
$$m(x) = \sum_{i=1}^{n} a_i f^i(x)$$
Common drift models used are:
• constant – polynomial of degree 0
• linear – polynomial of degree 1
• quadratic – polynomial of degree 2
In theory, it is possible to use higher-degree polynomials to model the drift, but in practice it is inadvisable to do so because of the high degree of variation in the extrapolation at the spatial extremes of the data. It must be remembered that decomposition of the random variable into drift and stochastic components is simply a mechanism to allow us to describe a real phenomenon in mathematical terms. But the real phenomenon is not actually the result of a mathematical process of known character. The more complex the model (and the fewer the data) used to describe drift, the less likely it is to have any resemblance to reality.
Determination of the coefficients of the drift polynomials is often challenging – they may be either specified by user choice or calculated from the data by minimisation of some criterion. A one-dimensional example of a drift model fitted to a random function is shown in Figure 2. Note that the second-degree polynomial drift model is calculated from the known sample data (10 m spaced), not from the underlying (unknown) random function. In practice, there is inevitably uncertainty attached to the choice of drift model.
FIG 2 – Illustration of quadratic function fitted to data extracted from a random function.
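The kind of drift fit illustrated in Figure 2 can be sketched in a few lines. The example below is a minimal illustration only and is not the authors' procedure: it assumes a hypothetical one-dimensional ‘reality’ (an arbitrary trend plus smoothed noise), samples it every 10 m and fits a second-degree polynomial to the samples by least squares using NumPy. All variable names and numeric values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 1D "reality" on 1 m nodes: an arbitrary trend plus smoothed noise.
x_fine = np.arange(0.0, 100.0, 1.0)
noise = np.convolve(rng.normal(size=x_fine.size + 9), np.ones(10) / 10.0, mode="valid")
truth = 50.0 + 0.2 * x_fine - 0.002 * x_fine**2 + 3.0 * noise

# Sample every 10 m; only these values are treated as known.
x_samp = x_fine[::10]
z_samp = truth[::10]

# Least-squares fit of a quadratic drift m(x) = a0 + a1*x + a2*x^2.
coeffs = np.polyfit(x_samp, z_samp, deg=2)  # coefficients, highest power first

# Evaluate the fitted drift everywhere and the residuals at the samples.
drift_fine = np.polyval(coeffs, x_fine)
residuals = z_samp - np.polyval(coeffs, x_samp)
```

Because the fit includes a constant term, the residuals at the sample locations sum to approximately zero. Re-running the sketch with a different sampling offset gives different coefficients, which is one way to see the uncertainty attached to the drift model.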
Different kriging systems
As explained previously, what distinguishes the different kriging systems is the way in which variation in mean grade (drift) is incorporated into the kriging systems.
Simple kriging
SK assumes that the expectation of the mean (m) is both constant throughout the domain and of known value (intrinsic stationarity). This is equivalent to saying that the drift component is constant and known. It is usually estimated using the declustered mean of the available sample data. The SK estimator at any point reduces to a combination of two components: a weighted average of local data and the domain mean. Estimates close to data will give more weight to the local estimate, while estimates further from data will be dominated by the domain mean.
SK is seldom used in practice as the underlying assumption (constant, known mean) is too severe for most applications. Moreover, the incorporation of the mean grade as a weighted term in the SK estimate means that, away from the influence of sample data, the estimator reverts to the mean grade. In most mineral deposits, grade diminishes towards the margin and this is often where drilling data is sparsest. Having an estimator that reverts towards the mean in this region is generally not realistic.
Ordinary kriging
The assumption that distinguishes OK from other kriging systems is that the expectation of the mean is unknown but is constant at the scale of the search neighbourhood (an assumption known as quasi-stationarity within the intrinsic hypothesis). What this means in practice is that there should be no trends present in the local mean grade at less than the scale of the search. This is a somewhat slippery concept since it is unusual to have a clear scale at which this separation applies. In practice, this means that the variations in sampled grades present within a local neighbourhood should be plausible random fluctuations around a constant local mean grade, without a strong trend being present. This then allows the OK system to accommodate variation in the local mean such that the estimate is always centred on the weighted average of the samples present in the local neighbourhood. This means that the specification of the local search neighbourhood has a critical influence on the quality of the kriging estimator – in particular, the neighbourhood must be sufficiently large that the data contained adequately represents the local mean grade.
When extrapolating beyond the limits of data, the OK estimator does not revert towards the global mean, but maintains the local mean specified by the closest samples.
Universal kriging
The theory of UK was proposed by Matheron in 1969 (Armstrong, 1984) to provide a general solution to linear estimation in the presence of drift. This theory assumes that the local mean is unknown but varies in a systematic fashion and can be written as a finite expansion of known basis functions (f) and fixed (but unknown) coefficients (a). The drift information may then be incorporated into the expression of the estimation variance.
It was very quickly realised that there are severe practical difficulties in the implementation of UK. The development of the UK system assumes that the underlying variogram (incorporating drift) is known; in this situation, the kriging system correctly yields both the drift coefficients and the weights. In practice, however, the underlying variogram is invariably unknown. This leaves us with a circular problem; in order to calculate the residuals we need to know the drift, but in order to know the drift we need to know the UK system.
This circularity does not preclude the use of UK, but does mean that great caution must be exercised in its application. Assuming a drift model and working only with residuals will result in a biased estimate of the true underlying variogram (Armstrong, 1984).
In the authors’ experience, UK is seldom used in practice.
Dual kriging
The previously discussed kriging estimators are all based on linear combinations of sample data values. It is also possible to rewrite the UK estimator in terms of the covariances $\sigma(x_\alpha, x)$ and drift functions $f^l(x)$ only, omitting any direct reference to the data. This is known as DK. The development is not shown here, but a clear exposition is given in Chiles and Delfiner (1999) and Galli, Murillo and Thomann (1984). The term ‘dual’ originates from an alternative derivation of these equations by minimisation in a functional space, similar to splines (Chiles and Delfiner, 1999).
$$Z^*_{DK}(x) = \sum_l c_l f^l(x) + \sum_\alpha b_\alpha \sigma(x_\alpha, x)$$
Expressed in English, the estimate at any point (x) is the sum of two components:
1. A deterministic component carried by the sum of the drift function terms at the target location: $\sum_l c_l f^l(x)$.
2. A probabilistic component calculated as the weighted sum of the covariances between the target location and all sample locations: $\sum_\alpha b_\alpha \sigma(x_\alpha, x)$. Because the value of the covariance (or variogram) function is largest at short distances, it is easy to see that this term will be largest when the target point is close to data. The values of the b coefficient are influenced by clustering, and the distance of the sample value from the local mean grade estimated by the drift model.
The nature of the drift function is a choice imposed by the user, and the covariance function (or variogram in the intrinsic case) is likewise a choice specified by the user. The values of the coefficients b and c are then calculated in the same way as for other kriging systems – by imposing constraints on the system that allow a unique solution to be obtained (see Chiles and Delfiner (1999) for details).
This system is not easy to visualise. Figure 3 shows how the DK estimator is made up of drift and probabilistic components, and that neither directly references sampled values. Note that while the drift estimate (and coefficients) and probabilistic estimate (and coefficients) are shown separately, these are actually derived simultaneously. The drift affects the estimate throughout space, while the probabilistic coefficients describe the local influence around each data point.
One of the major advantages of the DK system is that the drift and covariance coefficients only need be solved once, and can then be used to make an estimate at any location.
FIG 3 – One-dimensional illustration of estimation by dual kriging, illustrating the contribution of drift and covariance
terms. The model used to calculate covariances is a spherical model with a sill of 7.8 m and range of 3 m.
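The ‘solve once, evaluate anywhere’ character of DK can be shown with a small one-dimensional sketch. This is an illustration under stated assumptions, not the system used to produce Figure 3: the sample locations and grades are hypothetical, a single constant drift function f(x) = 1 is assumed, and the covariance is a spherical model using the sill (7.8) and range (3 m) quoted in the Figure 3 caption. The function and variable names are likewise illustrative.

```python
import numpy as np

def spherical_cov(h, sill=7.8, a=3.0):
    """Spherical covariance: sill at h = 0, falling to zero at the range a."""
    h = np.abs(h)
    c = sill * (1.0 - 1.5 * h / a + 0.5 * (h / a) ** 3)
    return np.where(h < a, c, 0.0)

# Hypothetical sample locations (m) and grades (per cent).
x_data = np.array([0.0, 1.0, 2.5, 4.0, 5.0, 7.0])
z_data = np.array([52.0, 55.0, 57.0, 54.0, 51.0, 56.0])
n = x_data.size

# Build the DK system [[Sigma, F], [F', 0]] [b; c] = [z; 0] with a constant drift.
Sigma = spherical_cov(x_data[:, None] - x_data[None, :])
F = np.ones((n, 1))
A = np.block([[Sigma, F], [F.T, np.zeros((1, 1))]])
rhs = np.concatenate([z_data, [0.0]])

# Solve once for the covariance coefficients b and the drift coefficient c.
sol = np.linalg.solve(A, rhs)
b, c = sol[:n], sol[n:]

def dk_estimate(x):
    """Evaluate drift term plus covariance term at any location(s) x."""
    x = np.atleast_1d(x).astype(float)
    cov_to_data = spherical_cov(x[:, None] - x_data[None, :])
    return cov_to_data @ b + c[0]

estimates = dk_estimate(np.linspace(-2.0, 9.0, 23))
```

In this sketch the estimator returns the sample values at the sample locations, and at any location further than the range from every sample the covariance term vanishes, so the estimate equals the constant drift coefficient.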
The main drawback of the method is that the use of a global considered to be composed of a drift term and a term that
neighbourhood results in a very large system of simultaneous is a weighted average of function values dependent on data
equations, with one equation per sample and one for each locations.
drift function.
Expressed in matrix form, the DK system looks like this: K
s (x) = / ~i z^ x - xi h + / ck qk (x)
i k
; Fl 0 E $ ; c E = 8 0 B
R F b z
where:
where: xi are the data locations over which the interpolation is to
∑ is the matrix (n × n) of sample-sample covariances be constructed
F is the matrix (1 × k) of k basic drift functions 𝜔i are RBF coefficients (weights)
z is the matrix (1 × n) of sample data values z k (x) is a spatial distance function (the RBF – from which the
b is the matrix (1 × n) of covariance coefficients method takes its name)
c is the matrix (1 × k) of drift coefficients The term on the right refers to the set of K drift functions
An advantage of DK over OK stems from the fact that it (qk(x)) each having a coefficient (ck) applied globally across all
is computationally more efficient. Because the weights and data.
coefficients are calculated directly from the data, they only
In the same way as for DK, conditions are imposed that make
need to be calculated once. The calculation of the estimate at
any point can then be done very rapidly. the system solvable. In this case, the conditions are that: the
product of RBF coefficients and drift function coefficients at
Radial basis functions each data point should sum to zero across all data points, and
the function should return the value of the data at a data point
The RBF is a family of mathematical techniques that has
(see Cowan et al (2001) for details and Chiles and Delfiner
been applied to many spatial interpolation problems, and it
underlies most of the ‘implicit modelling’ algorithms in use (1999) for the parallel example of DK). These conditions allow
today. It is based on a somewhat different starting premise to the system to be expressed as a set of linear equations. This is
the theory of regionalised variables on which kriging rests – shown in matrix form as follows:
instead of considering the target variable to be a realisation of
; Fl 0 E $ 8 c B = 8 0 B
a random function with a defined structure, the RBF is based U F ~ s
on interpolation of a predefined function from mathematical
criteria such as minimisation of curvature. In practice, this
difference is semantic only, because traditional kriging where:
also uses determined functions, except these are fitted to Φ is the matrix (n × n) of basis function values between
experimental data and have been chosen to suit the modelling all sample locations
of the data. Mathematically, there is an equivalence between
F is the matrix (1 × k) of k basic drift functions
DK and modelling with RBFs, and it is also possible to
choose the RBF function by fitting to the experimental data. s is the matrix (1 × n) of sample data values
In fact, because variograms such as the spherical are positive 𝜔 is the matrix (1 n) of covariance coefficients
definite (Chiles and Delfiner, 1999, p 59), they are suitable to c is the matrix (1 × k) of drift coefficients
use as RBFs. Apart from the use of different notation, the only difference
The interpolant for an RBF has a very similar form to between these two formulations is that DK uses a covariance
the general expression of kriging – the target variable is (or variogram) function to describe the spatial correlations
FIG 4 – Shape of common radial basis functions compared to common variogram functions.
between sample values, while the RBF uses an alternative into the original grid. The extracted data patterns used are
spatial function. illustrated in Figure 5.
In practice, the spatial function of an RBF performs an All estimation was carried out in three dimensions, but
identical role to the variogram in kriging, and the decision for display purposes the input data and output estimates
about the appropriate spatial function to apply to RBF are shown in two dimensions on the mid plane. Figure 6
modelling has a similar impact on the modelled output as the compares the OK and RBF estimates based on regular
choice of variogram model does to kriging. Some knowledge 10 × 10 m data, while Figure 7 is the equivalent comparison
about the nature of the underlying spatial basis functions based on irregular data. The underlying distribution that
available, and how these influence the character of the the pseudo-drill hole data is extracted from is a negatively
estimate, is thus useful. It is also possible to use traditional skewed distribution, which approximates a typical low-grade
variograms, such as the spherical, in the RBF formulation. iron ore deposit. The user-specified parameters of the two
interpolants are compared in Table 2.
Available radial basis functions
There are a wide family of RBFs that may be drawn on DISCUSSION
(Moroney, 2006). However, as with kriging, it is important that Examination of the OK and RBF estimates shows that,
the RBF chosen matches the observed spatial characteristics of within the field of data (ie inside the black boxes shown
the variable being interpolated. The variogram employed most in Figures 6 and 7), the estimates by the two methods are
often in kriging of grade variables is the spherical variogram. essentially indistinguishable. The correlation between the
This is characterised by near-linear behaviour near the origin, two interpolants is very high (0.988 and 0.980) and both
which then rapidly flattens off to a constant sill value at a are similarly different when compared to the ‘truth’. It is
defined range, and takes the value of the sill for all distances impossible to say which interpolation method produces a
greater than the range. It is worth noting that this function is superior result as both are similarly wrong. In this situation, if
derived from auto-correlation of a sphere, and is the simplest the OK estimate is considered to be ‘fit for purpose’, then the
mathematical function that gives a finite range, linear RBF estimate is likewise.
behaviour at the origin and positiveness. The exponential Kriging is normally only ever visualised as block estimates
variogram has a similar shape, although it is steeper near the or as grade shells constructed from block estimates. But of
origin and only approaches the sill asymptotically. course, any point in space may be determined by kriging
There are a number of basis functions that asymptotically so it can be used as an interpolant. In the aforementioned
approach a sill value, but the majority of these also have a plots, kriging was carried out at all the point locations
smooth (ie flat) form near the origin (for example inverse of the underlying simulation mesh. This reveals that the
multiquadric and Gaussian). The continuous behaviour implied near origin is not typical of grade variables, and the standard spherical was observed to produce unrealistic grade shells due to artefacts when the samples being used to create an estimate changed abruptly when at a distance near the range. Consequently, a new basis function (the spheroidal function) was created by developers at ARANZ Geo (the makers of Leapfrog software) that combines linear behaviour near the origin and asymptotic behaviour away from the data (Figure 4). The formula for the spheroidal function is given in Appendix 1.

Linear variograms are suitable for modelling contact surfaces but should be avoided for modelling grades.

COMPARISON OF RADIAL BASIS FUNCTIONS AND KRIGING

As shown previously, the RBF used in implicit modelling is mathematically equivalent to DK. However, DK is rarely used in practice because the traditional solution of the full DK system is intractable for data sets larger than a few hundred points. In practice, the most commonly applied kriging approach is OK, which has gained wide acceptance for resource estimation in the resource industry. Understanding the difference between estimation by OK and estimation by RBF is equivalent to understanding the difference between OK and DK. The similarities and differences between OK, DK and RBFs are summarised in Table 1.

The similarities and differences are best illustrated visually. A conditional simulation on a 300 × 300 × 10 m grid was created, and vertical columns of nodes were extracted to emulate artificial drill hole patterns. Two different drill patterns were chosen: a regular grid on a 10 × 10 m pattern (emulating a grade control pattern) and a grid on regular 20 m northings with random selection on the easting axis. These pseudo-drill patterns were then used to estimate grades back

(nearly) continuous OK interpolant is very similar to the RBF interpolant when within a dense data field such as a grade control pattern. It also shows the discontinuities that are present in OK interpolation caused by use of a local neighbourhood. This discontinuity is particularly marked in the area of extrapolation, where sudden jumps in interpolated grade are caused by samples passing in/out of the search. These discontinuities are reflected as unrealistic artefacts when grade shells are constructed using traditional kriging.

Selection of drift model

The choice of drift model imposed in implicit modelling has a large impact on the interpolant (Figure 8). Four common drift models are employed (in increasing order of complexity): none, constant, linear and quadratic. None of these are directly equivalent to OK with a local neighbourhood. Using no drift model is roughly equivalent to SK with a zero mean – this model should never be used for interpolation of grade variables.

A constant drift model is roughly equivalent to SK with a constant mean equal to the declustered average of the data. Linear and quadratic drift models should be used with caution as they may be dangerous in extrapolation. Note that, within the limits of data, the estimates are very similar with constant, linear and quadratic drifts. Constant drift is the safest option to use for interpolation of grade variables.

Point versus block estimation

In mining, estimates are usually made of blocks rather than points. The usual reason given for this is that block estimates take account of the change in support between the available information (grades measured on drill sample-sized volumes) and what we are trying to predict (the grade of truck or larger-sized volumes). The volume-variance effect dictates that the variability of block estimates should be lower than the variability of data.
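The volume-variance effect mentioned above can be illustrated with a small numerical sketch. The data here are purely hypothetical (a synthetic, spatially correlated series on a 1 m line, not the simulation used in this paper), and the function name `block_average` is an illustrative choice. The sketch simply averages point values into progressively larger blocks and reports the variance of the block values relative to the variance of the points.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical point-support grades: spatially correlated values on a 1 m line,
# built by smoothing white noise with a short moving window (an assumption
# made only for illustration).
noise = rng.normal(size=12_000)
window = np.ones(5) / 5.0
points = np.convolve(noise, window, mode="valid")[:10_000]


def block_average(values: np.ndarray, block_size: int) -> np.ndarray:
    """Average consecutive point values into non-overlapping blocks."""
    n_blocks = values.size // block_size
    trimmed = values[: n_blocks * block_size]
    return trimmed.reshape(n_blocks, block_size).mean(axis=1)


point_variance = points.var()
for size in (2, 10, 50):
    ratio = block_average(points, size).var() / point_variance
    print(f"block size {size:>2} m: block variance / point variance = {ratio:.3f}")
```

The ratio falls as the block size grows, which is the behaviour a block estimator is expected to reproduce when the support changes from sample-sized to truck-sized volumes.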
TABLE 1
Comparison of ordinary kriging, dual kriging and radial basis function interpolators.

General description
Ordinary kriging: Grade at target point is calculated as the linear weighted average of samples in a local neighbourhood. Adoption of a variogram model for the domain allows the ‘estimation error’ to be expressed in terms of the weights (unknown) and covariances (specified by the model). By a process of minimisation under constraints, a unique set of weights is derived that minimises the estimation error. This process is repeated for every target point.
Dual kriging: Based on a chosen drift function and variogram model, a set of global drift coefficients and sample weighting coefficients are calculated across all samples. A unique solution is found that minimises the estimation error. Once the drift and weighting coefficients have been solved, grade can be interpolated at any location.
Radial basis function: The process is fundamentally the same as for dual kriging (DK), except that spatial correlations are provided by a radial basis function (RBF) rather than a variogram. Additionally, mathematical simplifications have been developed, which allow rapid solution of the RBF for very large datasets.

Assumption of stationarity
Ordinary kriging: Quasi-stationarity should be satisfied. This is a weakening of the intrinsic hypothesis that allows some variation in local mean (drift) to be accommodated. In practice, drift should be insignificant at a scale less than the search neighbourhood, and the variogram should apply across the entire domain.
Dual kriging: The method accounts for non-stationarity via simple drift functions. The simpler the observed trend, the more robustly the drift model will describe variation in the local mean. The residual then left after subtraction of a simple drift function should satisfy the intrinsic hypothesis (ie as outlined for ordinary kriging).
Radial basis function: No assumption of stationarity is required. But as for kriging, the higher the degree of stationarity present (and the simpler the drift) the more reliable interpolation will be.

Spatial model
Ordinary kriging: Variogram (intrinsic case).
Dual kriging: Variogram (intrinsic case) that incorporates drift.
Radial basis function: Symmetric distance function. Must be real valued on [0,∞].

User decisions
Ordinary kriging: Definition of domain. Choice of variogram model. Definition of search neighbourhood (which is intimately tied to choice of variogram).
Dual kriging: Definition of domain. Choice of variogram model. Form of drift function(s).
Radial basis function: Definition of domain. Form of drift function(s). Choice of radial basis function (and anisotropy).

Output
Ordinary kriging: Grade estimate of a point (or volume). Kriging variance (and other derivative measures of estimation confidence).
Dual kriging: Grade estimate of a point (or volume).
Radial basis function: Grade estimate of a point (or volume) (no metric of estimation confidence).

Behaviour in extrapolation
Ordinary kriging: The assumption is that data in the search neighbourhood is centred on the local mean. In practice, this means that samples on the margins are given the highest weight when estimating grades outside the data field and the marginal grades are projected outwards.
Dual kriging: Behaviour in extrapolation depends on the nature of the drift models adopted and the variogram range (range of influence of samples). Grade estimation is a combination of a drift term and a stochastic term. Far from data, the estimate will be dominated by the drift term – this may become negative.
Radial basis function: Same as for DK. Common drift functions are:
• none: away from data interpolant reverts to zero
• constant: away from data interpolant reverts to mean (equivalent to simple kriging)
• linear: interpolant reverts to drift function that is simple linear function of coordinates
• quadratic: interpolant reverts to quadratic drift function (use with caution).
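The DK and RBF entries of Table 1 (one global solve for sample weights and drift coefficients, then evaluation at any location) and the extrapolation behaviour listed for each drift model can be sketched in a few lines. This is a minimal one-dimensional illustration under stated assumptions, not the Leapfrog implementation: the basis function is an exponential decay `exp(-r/a)` used as a stand-in for the spheroidal function, the sample coordinates and grades are invented, and the names `fit_rbf`, `evaluate` and `drift_matrix` are hypothetical.

```python
import numpy as np


def exp_kernel(r: np.ndarray, a: float) -> np.ndarray:
    """Decaying basis function exp(-r/a); a stand-in, not the spheroidal function."""
    return np.exp(-r / a)


def drift_matrix(x: np.ndarray, drift: str) -> np.ndarray:
    """Columns of the drift basis evaluated at 1-D coordinates x."""
    if drift == "none":
        return np.empty((x.size, 0))
    if drift == "constant":
        return np.ones((x.size, 1))
    if drift == "linear":
        return np.column_stack([np.ones_like(x), x])
    raise ValueError(f"unknown drift model: {drift}")


def fit_rbf(x, z, a=20.0, drift="constant"):
    """Solve the global system once for sample weights w and drift coefficients c."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    n = x.size
    K = exp_kernel(np.abs(x[:, None] - x[None, :]), a)
    P = drift_matrix(x, drift)
    k = P.shape[1]
    # Augmented system [[K, P], [P^T, 0]] [w; c] = [z; 0].
    A = np.zeros((n + k, n + k))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    solution = np.linalg.solve(A, np.concatenate([z, np.zeros(k)]))
    return solution[:n], solution[n:]


def evaluate(x_new, x, w, c, a=20.0, drift="constant"):
    """Evaluate the interpolant: weighted sum of basis functions plus drift."""
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    x = np.asarray(x, dtype=float)
    K = exp_kernel(np.abs(x_new[:, None] - x[None, :]), a)
    return K @ w + drift_matrix(x_new, drift) @ c


# Invented sample positions (m) and grades, for illustration only.
x_data = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0])
z_data = np.array([1.2, 0.8, 1.5, 2.1, 1.7, 1.1])

for model in ("none", "constant", "linear"):
    w, c = fit_rbf(x_data, z_data, drift=model)
    far = evaluate(10_000.0, x_data, w, c, drift=model)[0]
    print(f"{model:>8}: drift coefficients {c}, estimate far from data {far:.3f}")
```

With no drift the far-field estimate decays to zero, with constant drift it settles on the fitted constant, and with linear drift it follows the fitted line, which may run to implausible (even negative) values. Unlike OK, no search neighbourhood appears anywhere in the sketch: every sample contributes to every estimate.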
FIG 5 – Regular and semi-regular drill patterns.
FIG 6 – Comparison of ordinary kriging and radial basis function estimates based on regular 10 × 10 m data extracted from a dense simulated truth. The scatter plots on the right only compare points within the area informed by drilling (black square). Note the difference of behaviour of the interpolant in extrapolation.

FIG 7 – Comparison of ordinary kriging and radial basis function estimates based on irregular data (20 m N × random E) extracted from a dense simulated truth.

TABLE 2
Comparison of ordinary kriging and radial basis function interpolation parameters (note that although ranges are quite different, the resultant functions are very similar in shape).

Neighbourhood
Ordinary kriging estimation parameters: Search 250 × 250 × 10 m (anisotropy 10 × 10 × 1 m). Maximum samples 12. No clipping applied.
Radial basis function interpolant: No clipping applied.

Variogram
Ordinary kriging estimation parameters: C0: 0.9. C1: spherical sill 1.0; ranges 65, 45, 6.5, no rotation. C2: spherical sill 1.9; ranges 120, 120, 25, no rotation.
Radial basis function interpolant: Nugget: 0.72. Spheroidal sill: 3.0, no rotation. Range 20 m. Anisotropy: 1, 0.69, 0.1.

Drift model
Radial basis function interpolant: Constant drift.

Because the estimates are linear functions of the data, a block estimate can be created from the average of a point estimator within a volume. Strictly speaking, it is the integral of the function, but is calculated in practice by evaluating the average over a discrete number of points within the block (discretisation).

In OK, block estimation is usually achieved in one of two ways: by the averaging of point estimations on a fine mesh into a larger mesh, or, more commonly, by averaging the spatial function between data points and the block and using these averaged values in the kriging equations. Mathematically, these processes are equivalent, but the latter is more common due to its greater computational speed.

In the case of an RBF, it is usually performed by calculating the integral of the function over the volume as the computational efficiency of the RBF renders this feasible, and it is more convenient to incorporate partial blocks that occur when domains subdivide blocks. In practice, like in OK, the integration is done empirically on a ‘discretisation’ mesh; that is, the function is evaluated at n regularly spaced nodes and the n values averaged.

An important observation can be made here. If the gradient of the interpolant function is close to constant across a block (that is, the function is nearly linear within the block), then evaluation of the function at the centre point is equivalent to integrating over the volume. This is most easily illustrated in one dimension (Figure 9), but the idea can easily be extended to three dimensions.

In many instances, evaluation of an interpolant at the centre point of a block is practically equivalent to averaging across the full block, particularly with dense data like in a grade control pattern. Put another way, point kriging of a block centroid may produce a very similar result to block kriging, and evaluation of an RBF at a block centroid may likewise be very similar to evaluating on a finer mesh and averaging. Simple testing can confirm whether this is the case.
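The 'simple testing' suggested above can be sketched as follows. The two functions below are hypothetical stand-ins for an interpolant (one exactly linear, one strongly curved within the block), and `block_average` is an illustrative name for averaging on an n × n × n discretisation mesh; none of this is taken from the software used in the paper.

```python
import numpy as np


def block_average(f, centre, size, n=5):
    """Average f over a cubic block using an n x n x n discretisation mesh."""
    # Cell-centred nodes, regularly spaced and symmetric about the block centre.
    offsets = (np.arange(n) + 0.5) / n - 0.5
    axes = [c + size * offsets for c in centre]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    return float(np.mean(f(gx, gy, gz)))


def linear(x, y, z):
    """Stand-in interpolant with a constant gradient across the block."""
    return 1.0 + 0.02 * x - 0.01 * y + 0.05 * z


def curved(x, y, z):
    """Stand-in interpolant whose gradient changes sharply inside the block."""
    return 1.0 + np.exp(-((x - 3.0) ** 2 + y**2 + z**2) / 20.0)


centre = (0.0, 0.0, 0.0)
for name, f in (("linear", linear), ("curved", curved)):
    at_centre = float(f(*centre))
    averaged = block_average(f, centre, size=10.0)
    print(f"{name}: centre point {at_centre:.4f}, block average {averaged:.4f}")
```

For the linear case the centre-point value and the block average agree to rounding error. For the curved case they differ, which is the situation in which discretised block averaging is worth its extra cost.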
FIG 8 – Comparison of drift models applied to the irregular sampling case.
FIG 9 – Illustration of the effect that the gradient of interpolant has on averaging versus centrepoint.
One of the criticisms sometimes levelled at RBF grade interpolants is that if the interpolant is rendered into grade shells, the wireframe meshes created from the underlying interpolant produce ‘non-geological’ looking shapes with unrealistically smoothed curves (like the shapes from early digital animations). When the interpolants are compared at point scale, it should be obvious that the continuous smooth curves of the RBF are no more or less ‘realistic’ than the more discontinuous shapes produced by neighbourhood effects in OK or by the flat triangles and angular vertices present in manually digitised wireframes.

CONCLUSIONS

Implicit models are now widely used for the modelling of surface geometry from categorical logging data, and for the modelling of ‘grade iso-surfaces’ based on continuous grade variables. One of the underlying engines of implicit modelling is the RBF. The mathematics of the RBF is equivalent to DK, in which a unique solution for both drift coefficients and covariance weightings are found directly from the data. Once derived, the RBF may be solved for any unsampled point or averaged over any volume to provide an estimate of grade.

Comparison interpolations developed in this paper show that in a situation such as grade control where the data spacing is less than the range of the variogram, the results of estimation using RBF interpolation are virtually indistinguishable from OK of grades.

ACKNOWLEDGEMENTS

This work originated in a project carried out for BHP Billiton. The authors thank Colin Carey at BHP Billiton and Scott Jackson of QG Pty Ltd for thoughtful reviews of the manuscript. Valuable comments were received from two anonymous reviewers. Isatis software was used to generate the underlying simulation and OK estimates. RBF interpolations were created using Leapfrog Mining software.
APPENDIX 1

The expression for the spheroidal basis function is:

$$
H(c, m, r) =
\begin{cases}
1 - \dfrac{r}{c}\,\dfrac{m}{2\sqrt{1+m}}, & r < \dfrac{c}{\sqrt{m+1}} \\[2ex]
\dfrac{1}{2}\left(\dfrac{m+1}{m+2}\right)^{-1-m/2}\left(1+\left(\dfrac{r}{c}\right)^{2}\right)^{-m/2}, & r \ge \dfrac{c}{\sqrt{m+1}}
\end{cases}
$$

where:
c is the range
m is the order of the spheroidal function (odd valued numbers from three to nine)
r is the distance at which the value of the function is being calculated

This is the formula used in the RBF calculations. It is a decreasing function asymptotically approaching zero.

In Leapfrog software, the expression is instead presented in the form of a variogram, rising to a sill. The interface shows a variogram function controlled by the nugget (N), sill (S) and range (R) parameters. This variogram is given by:

$$
\gamma(r) = (S - N)\left(1 - H(\rho_m R, m, r)\right) + N
$$

where:
ρm is a scaling constant that depends on the order of the spheroidal function (m)

The order is also called the alpha parameter in the products and alters the shape of the spheroidal function.

The expression for the spheroidal function has not been formally published. Interested readers may contact author Richard Lane for further details.
