
SME Annual Meeting

Feb. 28-Mar. 03, 2010, Phoenix, AZ

Preprint 10-045

DEALING WITH A TREND IN INVERSE DISTANCE ESTIMATION

G. Jalkanen, Mining & Geostatistics Consultant, Houghton, MI


S. Vitton, Michigan Technological Univ., Houghton, MI

ABSTRACT

Dealing with a trend in inverse distance weighted (IDW) estimation is presented. A measurement is proposed to consist of a portion due to a mean and a portion due to a residual. The procedure consists of de-trending the measurements via a linear or nonlinear regression. Residuals are computed, and IDW parameter estimation is conducted on the residuals. Estimation at un-sampled points is then computed by estimating the mean and adding on the IDW estimate of the residual. A possible BOXCOX transform is also proposed.

INTRODUCTION

The object of this paper is to present a method of analysis when a trend is present as far as inverse distance estimation is concerned. In this case the model to be considered is one that consists of a trend and residuals. Up to now, users of inverse distance estimation have not used a model that incorporates the option of a trend. The proposal is to de-trend the data via linear or nonlinear regression. Residuals are then computed, and the parameter estimation of the inverse distance estimator is carried out on them. Estimation at un-sampled points is computed by estimating the mean (from the regression) and adding on the estimate coming from the residuals. Note that this argument is an intuitive one, since some data sets might be thought of as following a model with a trend; if physics is involved, such as in the flow equations of hydrology, the objective function might be a nonlinear one, with nonlinear regression being a solution. The following development of the regression equations is from Neter, Kutner, Nachtsheim, and Wasserman (1996). The algorithm begins with the normal error regression model

    Y_i = \sum_{j=1}^{m} \beta_j X_{ij} + \epsilon_i ,    i = 1, ..., n        (1)

We can write this equation in matrix form as

    Y = X\beta + \epsilon                                                      (2)

where Y is an n x 1 matrix, X is an n x m matrix, \beta is an m x 1 matrix, and \epsilon is an n x 1 matrix. To derive the normal equations by the method of least squares, minimize the quantity

    Q = \sum_{i=1}^{n} \{ Y_i - \sum_{j=1}^{m} \beta_j X_{ij} \}^2             (3)

In matrix notation this is

    Q = (Y - X\beta)^T (Y - X\beta)                                            (4)

Minimizing Q with respect to the vector \beta and substituting b for \beta yields the following system of equations

    X^T X b = X^T Y                                                            (5)

Taking the inverse of X^T X and pre-multiplying both sides by this inverse results in the following matrix equation

    b = (X^T X)^{-1} X^T Y                                                     (6)

Define the vector \hat{Y} as the vector of fitted values; then, in matrix form, the value of this vector is

    \hat{Y} = X b                                                              (7)

Define the vector e to be the vector of residuals; then this vector in matrix form is

    e = Y - \hat{Y}                                                            (8)

In this case the sum of squared errors, SSE, is then

    SSE = e^T e                                                                (9)

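As an illustration of the de-trending step, the following sketch (not the authors' code) shows how the least squares fit of Eqs. (5)-(8) might be carried out with NumPy to obtain the trend coefficients and the residuals. The quadratic trend-surface basis mirrors the one used later for the Wolfcamp example; the function and variable names are assumptions made here for illustration.

    import numpy as np

    def detrend(coords, values):
        """Fit a quadratic trend surface by least squares (Eqs. 5-8) and
        return the coefficients b, the fitted values, and the residuals.
        coords: (n, 2) array of easting/northing; values: (n,) array of Y."""
        x, y = coords[:, 0], coords[:, 1]
        # Basis functions: constant, easting, northing, cross term, squares.
        X = np.column_stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2])
        # Solve X^T X b = X^T Y (Eqs. 5-6); lstsq avoids forming the inverse.
        b, *_ = np.linalg.lstsq(X, values, rcond=None)
        fitted = X @ b                 # Y-hat = X b    (Eq. 7)
        residuals = values - fitted    # e = Y - Y-hat  (Eq. 8)
        return b, fitted, residuals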

BOXCOX TRANSFORMATION

In some cases it might be of interest to transform the raw data before carrying out the regression analysis. One argument for this is that with skewed data sets it might be of interest to use a power transformation to limit the effects of outliers on the regression analysis. Once the regression analysis is conducted on the transformed data, a simple power transformation can be used to transform the results back to the original space. One transformation of interest is the BOXCOX transformation. Note that the following review of the BOXCOX transformation is from Neter, Kutner, Nachtsheim, and Wasserman (1996). The BOXCOX procedure automatically identifies a transformation from the family of power transformations on Y. The family of power transformations is of the form

    Y' = Y^\lambda                                                             (10)

Note that when \lambda = 0, Y' = \log_e Y. The normal error regression model with the response variable a member of the family of power transformations becomes

    Y_i^\lambda = \sum_{j=1}^{m} \beta_j X_{ij} + \epsilon_i ,    i = 1, ..., n    (11)

For each \lambda the Y_i^\lambda observations are first standardized so that the magnitude of the error sum of squares does not depend on the value of \lambda:

    W_i = K_1 (Y_i^\lambda - 1)        \lambda \neq 0
    W_i = K_2 (\log_e Y_i)             \lambda = 0                             (12)

where

    K_2 = ( \prod_{i=1}^{n} Y_i )^{1/n}                                        (13)

and

    K_1 = 1 / (\lambda K_2^{\lambda - 1})                                      (14)

Note that K_2 is the geometric mean of the Y_i observations. Once the standardized observations W_i have been obtained for a given \lambda value, they are regressed on the predictor variables and the error sum of squares SSE is obtained. The problem then becomes a one-dimensional optimization with the objective function being the SSE of the standardized observations and the unknown being the power \lambda. The one-dimensional optimization algorithm used is the Golden Section algorithm covered in Sen (1986).
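To make the one-dimensional search concrete, the sketch below (an illustration under assumed variable names, not the implementation used in this study) evaluates the standardized Box-Cox error sum of squares of Eqs. (11)-(14) and minimizes it over the power with a golden section search; the bracketing interval in the example call is an assumption.

    import numpy as np

    def boxcox_sse(lam, X, y):
        """SSE of the regression of the standardized observations W on the
        design matrix X (Eqs. 12-14); y must be positive."""
        k2 = np.exp(np.mean(np.log(y)))               # geometric mean, Eq. 13
        if abs(lam) < 1e-12:
            w = k2 * np.log(y)                        # lambda = 0 branch of Eq. 12
        else:
            k1 = 1.0 / (lam * k2 ** (lam - 1.0))      # Eq. 14
            w = k1 * (y ** lam - 1.0)                 # Eq. 12
        b, *_ = np.linalg.lstsq(X, w, rcond=None)
        resid = w - X @ b
        return float(resid @ resid)

    def golden_section(f, a, b, tol=1e-5):
        """Minimize a one-dimensional unimodal function f on [a, b]."""
        gr = (np.sqrt(5.0) - 1.0) / 2.0
        c, d = b - gr * (b - a), a + gr * (b - a)
        while abs(b - a) > tol:
            if f(c) < f(d):
                b, d = d, c
                c = b - gr * (b - a)
            else:
                a, c = c, d
                d = a + gr * (b - a)
        return 0.5 * (a + b)

    # Example call (bracketing interval assumed):
    # lam_opt = golden_section(lambda t: boxcox_sse(t, X, y), -2.0, 3.0)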
DEVELOPMENT OF THE INVERSE DISTANCE ESTIMATOR

The following development of the inverse distance estimation algorithm is from Hartman (1992). The inverse distance estimator assumes that a domain exists with data locations x_1, ..., x_n contained in the domain, along with residual values e_1, ..., e_n. The predictor at a particular location x is estimated by

    e(x) = \sum_{i=1}^{n} \lambda_i e(x_i)                                     (15)

The weights are defined by

    \lambda_i = (1 / d_i^m) / \sum_{j=1}^{n} (1 / d_j^m)                       (16)

where m is the exponent of the inverse distance estimator and d_i is the distance from the estimation location x to the data location x_i, i = 1, ..., n.
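A minimal sketch of the estimator of Eqs. (15)-(16) is given below (illustrative only; the names are assumptions). The distance function defaults to the isotropic Euclidean distance; the anisotropic distance of the next section can be passed in instead.

    import numpy as np

    def idw_estimate(x0, locations, residuals, m, dist=None):
        """Inverse distance estimate of the residual at x0 (Eqs. 15-16).
        x0: (2,) target location; locations: (n, 2) data locations;
        residuals: (n,) residual values; m: inverse distance exponent."""
        if dist is None:
            dist = lambda a, b: np.linalg.norm(a - b)
        d = np.array([dist(x0, xi) for xi in locations])
        if np.any(d == 0.0):                  # exact hit: return the data value
            return float(residuals[np.argmin(d)])
        w = 1.0 / d ** m
        w /= w.sum()                          # normalized weights, Eq. 16
        return float(w @ residuals)           # Eq. 15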
TWO DIMENSIONAL GEOMETRIC ANISOTROPY DISTANCE CALCULATION

The procedure used in the IDW algorithm for the reduction of a three-dimensional anisotropy to two dimensions was developed by Journel and Froidevaux (1982). Note that Journel and Froidevaux considered distances in terms of the distance calculations associated with a variogram. By considering the geometric anisotropy in two dimensions, the directional graph of distance differences is elliptical. Let (dx, dy) be the initial rectangular coordinates of differences, with \theta the angle made by the major axis of the ellipse with the coordinate axis Odx, and \lambda the anisotropy ratio of the ellipse. The following is the equation to determine the distance h:

    h = (z_1^2 + z_2^2)^{1/2}                                                  (17)

where

    z_1 = dx \cos\theta + dy \sin\theta
    z_2 = \lambda (dy \cos\theta - dx \sin\theta)                              (18)
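A sketch of the anisotropic distance of Eqs. (17)-(18), written so it could be passed as the dist argument of the idw_estimate sketch above (illustrative; the degree convention for the angle is an assumption):

    import numpy as np

    def anisotropic_distance(a, b, theta_deg, ratio):
        """Distance h between locations a and b under 2-D geometric anisotropy
        (Eqs. 17-18). theta_deg is the major-axis angle in degrees and ratio
        is the anisotropy ratio of the ellipse."""
        dx, dy = b[0] - a[0], b[1] - a[1]
        t = np.radians(theta_deg)
        z1 = dx * np.cos(t) + dy * np.sin(t)             # Eq. 18
        z2 = ratio * (dy * np.cos(t) - dx * np.sin(t))   # Eq. 18
        return float(np.hypot(z1, z2))                   # Eq. 17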
PROBLEM DEFINITION

The following are the steps in the cross validation of the residuals. At each data location a search is made for the nearest neighbors within an inputted search ellipse, but the data location itself is omitted. An estimate is then made with the inverse distance estimator at each omitted data location. The cross validation error is computed for each data location. The above steps are repeated for each data location. The sum of the squares of the cross validation errors is to be minimized. The decision variables to be determined by the optimization problem are the inverse distance exponent and the anisotropy angle and ratio parameters used to estimate the distance values. Note that Jalkanen and Vitton (2009) looked at the parameter estimation of the IDW algorithm but did not look at the inclusion of a trend in their work. Related references on the IDW algorithm are also included in Jalkanen and Vitton (2009). In equation form, the sum of squared cross validation errors (SSCVE) is

    SSCVE = \sum_{i=1}^{n} ( y^*(x_i) - y(x_i) )^2                             (19)

where y^*(x_i) is the cross validated estimate at data location x_i, y(x_i) is the actual value at data location x_i, and n is the number of cross validated points.
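In code, the SSCVE objective of Eq. (19) might be written as the leave-one-out loop sketched below. This is an illustration that reuses the idw_estimate and anisotropic_distance sketches above and, for brevity, assumes every other data location falls inside the search ellipse.

    import numpy as np

    def sscve(params, locations, residuals):
        """Sum of squared cross validation errors (Eq. 19) for the decision
        variables: exponent m, anisotropy angle theta (degrees), ratio lam.
        locations and residuals are NumPy arrays."""
        m, theta, lam = params
        dist = lambda a, b: anisotropic_distance(a, b, theta, lam)
        total = 0.0
        for i in range(len(residuals)):
            mask = np.arange(len(residuals)) != i     # omit the i-th location
            est = idw_estimate(locations[i], locations[mask],
                               residuals[mask], m, dist=dist)
            total += (est - residuals[i]) ** 2
        return total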
OPTIMIZATION ALGORITHMS

The optimization algorithms used in this study consisted of the feasible direction method as the main algorithm (Moon, 2002). The revised simplex algorithm (Solow, 1984) and the golden section algorithm (Sen, 1986) were used as sub-problems. The gradient of the objective function was approximated via a forward difference algorithm (Gill, Murray, and Wright, 1981). Note that the search in these optimization algorithms is a multidimensional search, not a sequential search.
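A sketch of a forward difference gradient approximation of the kind referred to above (illustrative only; the step size h is an assumption), which could supply the gradient of the SSCVE objective to a feasible-direction or any other gradient-based search over the exponent, angle, and ratio:

    import numpy as np

    def forward_difference_gradient(f, x, h=1e-5):
        """Approximate the gradient of f at x with forward differences."""
        x = np.asarray(x, dtype=float)
        f0 = f(x)
        grad = np.zeros_like(x)
        for k in range(x.size):
            step = np.zeros_like(x)
            step[k] = h
            grad[k] = (f(x + step) - f0) / h
        return grad

    # Example (names assumed): g = forward_difference_gradient(
    #     lambda p: sscve(p, locations, residuals), [2.0, 45.0, 1.0])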
RESULTS

The first data set used was taken from Cressie (1993). The data set consists of 85 wells of piezometric-head levels in the Wolfcamp aquifer in Texas. A first result was for the inverse distance parameter estimation on the raw data set. A search radius of 600 in both the easting and the northing directions was used. The optimal value of the sum of squared cross validation errors was .30865E+07, with an inverse distance exponent of 4.36550, a λ value of 5.91948, and a θ value of 143.846. Next a regression was conducted on the Wolfcamp data set with basis functions being a constant, an easting, a northing, an easting times a northing, an easting squared, and a northing squared. The regression was conducted with the residuals written to a separate file. Again a search radius of 600 in both the easting and the northing directions was used. The optimal value of the sum of squared cross validation errors was .25547E+07, with an inverse distance exponent of 1.46323, a λ value of 1.11468, and a θ value of 49.4567. Next a calculation of a power transform for the BOXCOX transform was conducted. Again the basis functions were the same as for the above regression. Again a search radius of 600 in both the easting and the northing directions was used. The optimal value of the sum of squared cross validation errors was .21308E+04, with an inverse distance exponent of 1.47935, a λ value of 6.10001, and a θ value of 17.84038. Note that the reduction in the sum of squared cross validation errors for the BOXCOX transform results is to be expected.

A second data set used in this case study is from Neter, Kutner, Nachtsheim, and Wasserman (1996). This data set, called the Dwaine Studios data set, consists of sales in a community Y, with the regressors being the number of persons aged 16 or younger, x_1, and the per capita disposable personal income in the community, x_2. There were 21 data locations in this data set. A first result was for the inverse distance parameter estimation on the raw data set. A search radius was chosen so all of the data points would be used in the cross validation. The optimal value of the sum of squared cross validation errors was .36448E+04, with an inverse distance exponent of 2.40887, a λ value of 12.59113, and a θ value of 0.00.
Next a regression was conducted on the Dwaine Studios data set with basis functions being a constant, an x_1 coordinate, and an x_2 coordinate. The regression was conducted with the residuals written to a separate file. Again the search radius was chosen so all of the data points would be used in the cross validation. The optimal value of the sum of squared cross validation errors was .21093E+04, with an inverse distance exponent of 1.08056, a λ value of 12.0, and a θ value of 17.45167. Next a calculation of a power transform for the BOXCOX transform was conducted. Again the basis functions were the same as for the above regression. Again the search radius was chosen so all of the data points would be used in the cross validation. The optimal value of the sum of squared cross validation errors was .39737E-03, with an inverse distance exponent of 1.0, a λ value of .36776, and a θ value of 0.0. Note that the reduction in the sum of squared cross validation errors for the BOXCOX transform results is to be expected.
SUMMARY AND RECOMMENDATIONS

Parameter estimation in the inverse distance algorithm with the possibility of a trend being present has been presented. The procedure consists of de-trending the measurements via a linear or nonlinear regression. Residuals are computed, and IDW parameter estimation is conducted on the residuals. Estimates at un-sampled points are then computed by estimating the mean and adding on the IDW estimate of the residual. A possible BOXCOX transform has also been proposed. One possible extension to the above is to use a moving search radius to compute the value of the residuals at data points. A problem of interest might be to compute the parameters of the IDW algorithm with a maximum likelihood algorithm. Another problem of interest is to estimate the average estimation variance over a support area of interest. This leads to a possible sampling problem. An initial sampling campaign is assumed to exist, and it is desired to add an additional number of samples. The objective is to determine the spatial locations of the additional samples that minimize the average estimation variance over an area of interest.
REFERENCES

1. Cressie, N.A.C., Statistics for Spatial Data, 1993, pp. 212-224.
2. Gill, P.E., Murray, W., and Wright, M.H., Practical Optimization, 1981, p. 186.
3. Hartman, H., SME Mining Engineering Handbook, 2nd Edition, Vol. 1, 1992, p. 355.
4. Jalkanen, G.J. and Vitton, S., "Parameter Estimation in the Inverse Distance Estimation Algorithm," paper given at the AIME meeting, Feb. 2009.
5. Journel, A.G. and Froidevaux, R., "Anisotropic Hole-Effect Modeling," Mathematical Geology, Vol. 14, No. 3, 1982, pp. 217-239.
6. Moon, K.S., Class notes, ME5680 - Optimization I, Michigan Technological University, 2002.
7. Neter, J., Kutner, M.H., Nachtsheim, C.J., and Wasserman, W., Applied Linear Regression Models, 1996, p. 241.
8. Sen, S., Class notes, SIE642 - Nonlinear Programming, University of Arizona, 1986.
9. Solow, D., Linear Programming - An Introduction to Finite Improvement Algorithms, 1984, pp. 165-194.

Copyright © 2010 by SME
