MIN E 615 - APPLICATION OF GEOSTATISTICS

Assignment 1

Thiago Alduini Mizuno


January 20, 2022

This report presents the results of the analysis of the provided data set, dallas.dat (Figure 1.1).
The objective was to perform basic geostatistics in Jupyter notebooks using the pygeostat
package and GSLIB programs [3]. The study consisted of calculating the variogram map and
experimental variograms, fitting a variogram model, and kriging. The Jupyter notebook is
mainly based on the example provided by Oktay Erten and is attached on eClass as "Assignment
1 MIN E 615.ipynb".

1 DATA

Figure 1.1: Map with the data samples

The data set has 180 samples spread over approximately 13 km by 13 km, with a minimum value
of 24 and a maximum of 4230. The mean is 335.24, and the standard deviation is 602.47.
The distribution is strongly positively skewed, with 50% of the values under 162. The
histogram and the box plot show low-frequency extreme values above 1000.
These outliers were excluded from the spatial analysis (Figure 1.2).
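The summary statistics and the outlier exclusion described above can be sketched in a few lines of standard-library Python. This is only an illustration: the sample values are hypothetical (they are not the dallas.dat data), the function names are made up for this sketch, and the use of the population standard deviation is an assumption, since the report does not say which estimator was used.

```python
import statistics

def summarize(values):
    """Return basic summary statistics for a list of sample values."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        # Assumption: population standard deviation (divide by n).
        "std": statistics.pstdev(values),
        "median": statistics.median(values),
    }

def drop_above(values, limit):
    """Keep only the samples at or below `limit` (outlier exclusion)."""
    return [v for v in values if v <= limit]

# Hypothetical values for illustration only; not the dallas.dat samples.
samples = [24, 60, 110, 162, 200, 350, 900, 4230]
print(summarize(samples))
print(summarize(drop_above(samples, 1000)))
```

Comparing the two printed summaries shows how strongly a single extreme value pulls the mean and standard deviation away from the median in a positively skewed distribution.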

Figure 1.2: Box plot and histogram from the original data (top) and clipped data (bottom).

2 SPATIAL ANALYSIS
The variogram map indicated that the direction of greatest continuity is 110°. The
parameters were five lags of 1000 m along each axis, resulting in Figure 2.1. In this step, the Python
code was based on the Variogram Workflow from the GeostatsGuy GitHub repository (Michael Pyrcz) [1].
The GSLIB program "VARCALC" generated the experimental variograms. We tested several
pair-search parameters for the experimental variogram, but none of them
produced a stable variogram. The major direction was 110° and the minor direction 20°.
The search distance limit was 6 km. The best result is presented in Figure 2.1 and corresponds
to 15 lags of 400 m. For distances smaller than 400 m, the number of pairs is small, fewer than 40.
Thus, points at greater distances are more reliable. The 110° direction shows a
gradual increase in the variogram, and the first point reaches the sill at 3600 m. From
that point on, however, the variogram oscillates between 0.74 and 1.27. The perpendicular direction
(20°) shows significant variability even at shorter distances and is above the sill from
2800 m onward.
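To make the experimental variogram calculation concrete, the following is a minimal sketch of a directional semivariogram, gamma(h) = sum of squared differences / (2 N(h)), in plain Python. It is not the VARCALC implementation: the function name and arguments are hypothetical, azimuths are assumed to be measured clockwise from north, and the lag bins are simple half-open intervals [k*lag, (k+1)*lag) rather than bins centred on each lag with a separate tolerance.

```python
import math

def directional_variogram(points, azimuth_deg, az_tol_deg, lag, n_lags):
    """Experimental semivariogram along one azimuth (degrees clockwise from north).

    points: list of (x, y, value). Returns (mean_distance, gamma, n_pairs)
    for each lag bin that holds at least one pair.
    """
    az = math.radians(azimuth_deg)
    ux, uy = math.sin(az), math.cos(az)  # unit vector of the search direction
    cos_tol = math.cos(math.radians(az_tol_deg))
    # Per bin: [sum of distances, sum of squared differences, number of pairs]
    sums = [[0.0, 0.0, 0] for _ in range(n_lags)]
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            dx, dy = xj - xi, yj - yi
            h = math.hypot(dx, dy)
            if h == 0.0:
                continue  # skip collocated samples
            # The variogram is symmetric, so a pair counts whichever way it points.
            if abs(dx * ux + dy * uy) / h < cos_tol:
                continue  # pair lies outside the angular tolerance
            k = int(h // lag)
            if k < n_lags:
                sums[k][0] += h
                sums[k][1] += (zi - zj) ** 2
                sums[k][2] += 1
    return [(sh / n, sq / (2 * n), n) for sh, sq, n in sums if n > 0]
```

Returning the pair count alongside each value makes it easy to flag poorly supported lags, such as the short-distance lags with fewer than 40 pairs mentioned above.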
The variogram modeling used the VARMODEL software, version 1.1.1, for semi-automatic
variogram fitting. After several attempts, the software still returned the error "Unsuccessful
sill optimization," so we chose the variogram model manually: a single
exponential structure in both directions. The nugget effect is 0 with a sill of 1, and the major and minor
ranges are 2800 m and 1300 m, respectively. The fit favored the lags at shorter distances.
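The fitted model can be written down explicitly. The sketch below evaluates a single exponential structure with geometric anisotropy using the parameters reported above (nugget 0, sill 1, ranges 2800 m and 1300 m, major direction 110°). It assumes the GSLIB practical-range convention, gamma(h) = c0 + c (1 - exp(-3h/a)), and an azimuth measured clockwise from north; the function name is hypothetical and this is not the VARMODEL code.

```python
import math

def exponential_variogram(dx, dy, azimuth_deg=110.0, a_major=2800.0,
                          a_minor=1300.0, nugget=0.0, sill=1.0):
    """Anisotropic exponential variogram for the separation (dx, dy) in metres.

    Assumes the practical-range convention: gamma = c0 + c * (1 - exp(-3h/a)).
    """
    az = math.radians(azimuth_deg)
    # Components of the separation along the major and minor directions.
    h_major = dx * math.sin(az) + dy * math.cos(az)
    h_minor = dx * math.cos(az) - dy * math.sin(az)
    # Reduced distance: equals 1 at the practical range in either direction.
    h = math.hypot(h_major / a_major, h_minor / a_minor)
    if h == 0.0:
        return 0.0  # the variogram is zero at zero separation
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h))
```

Under this convention the model reaches about 95% of the sill at 2800 m along 110° and at 1300 m along 20°.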

Figure 2.1: Variogram map and variogram model for both directions

3 KRIGING
The data set was interpolated by simple kriging with the GSLIB program kt3dn on a regular grid
of 130 cells in each direction, with a cell size of 100 m x 100 m. The kriging parameters follow the
spatial analysis presented above.
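As a rough illustration of what simple kriging computes at each grid node, the sketch below solves the simple kriging system for one location in plain Python. It is not kt3dn: for brevity it assumes an isotropic exponential covariance C(h) = sill * exp(-3h/a) instead of the anisotropic model, uses all data rather than a search neighborhood, and takes the stationary mean as a user-supplied input. All function names are hypothetical.

```python
import math

def covariance(h, sill=1.0, a=2800.0):
    """Isotropic exponential covariance, C(h) = sill * exp(-3h/a)."""
    return sill * math.exp(-3.0 * h / a)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def simple_kriging(data, x0, y0, mean):
    """Simple kriging estimate at (x0, y0) from data = [(x, y, value), ...].

    Assumes no duplicate sample locations (the system would be singular).
    """
    lhs = [[covariance(math.hypot(xi - xj, yi - yj)) for xj, yj, _ in data]
           for xi, yi, _ in data]
    rhs = [covariance(math.hypot(xi - x0, yi - y0)) for xi, yi, _ in data]
    weights = solve(lhs, rhs)
    # The weights act on residuals from the known (assumed) stationary mean.
    return mean + sum(w * (z - mean) for w, (_, _, z) in zip(weights, data))
```

Two properties are worth noting: with a zero nugget the estimate reproduces the data value at a sample location, and far from all samples the weights vanish and the estimate falls back to the mean.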
To analyze the impact of outlier capping, we selected three thresholds (500, 1000, and 2000)
and performed four kriging estimations, one of them without capping. Negative values in
the estimates were reset to 0. The grid was clipped using a polygon.
Figure 3.1 displays the maps with the same color scale. The northeast portion near the center
concentrates the extreme values. This region shows a significant difference between the
estimates, especially for the thresholds of 500 and 1000, where the values
are considerably lower.
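The two data treatments in this step, capping the samples before kriging and resetting negative estimates afterwards, are simple enough to sketch directly. The helper names below are hypothetical and the snippet only illustrates the operations; it does not reproduce the notebook.

```python
def cap_samples(values, threshold=None):
    """Cap values above `threshold`; `None` leaves the data untouched."""
    if threshold is None:
        return list(values)
    return [min(v, threshold) for v in values]

def clamp_negative(estimates):
    """Reset negative kriging estimates to zero."""
    return [max(e, 0.0) for e in estimates]

# The four scenarios described above: no capping, then 2000, 1000 and 500.
scenarios = [None, 2000, 1000, 500]
```

Each entry in `scenarios` would feed one kriging run, with `clamp_negative` applied to the resulting grid.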

Figure 3.1: Kriging maps for each threshold

4 DISCUSSION
There is limited information about the data, so we can only speculate about its use.
For this study, we considered the example of reserve estimation. In that case, extreme values
directly affect the estimated volume and may cause overestimation. Disregarding the
cutoff and fixing the rock volume, the amount of Pb depends on the concentration estimates.
Figure 4.1 shows the variation of the sum of concentrations for each threshold scenario.
Compared with the original data, the thresholds of 1000 and 500 significantly reduce the
sum of Pb, by 25% and 37%, respectively. In the first case, approximately 6% of the samples are
affected by outlier capping; in the second, 14% of the samples are affected.
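The two quantities compared here, the share of samples touched by a threshold and the relative drop in the summed estimates, reduce to short arithmetic. The sketch below uses hypothetical function names and made-up numbers; it does not reproduce the 25% and 37% figures, which come from the kriged grids.

```python
def fraction_affected(samples, threshold):
    """Share of samples that exceed `threshold` and would therefore be capped."""
    return sum(1 for v in samples if v > threshold) / len(samples)

def relative_reduction(reference, scenario):
    """Relative drop of the summed `scenario` estimates versus `reference`."""
    return 1.0 - sum(scenario) / sum(reference)

# Hypothetical numbers for illustration only.
samples = [100, 200, 600, 1100]
uncapped_estimates = [100.0, 200.0, 600.0, 1100.0]
capped_estimates = [100.0, 200.0, 500.0, 500.0]
print(fraction_affected(samples, 500))
print(relative_reduction(uncapped_estimates, capped_estimates))
```

Because the rock volume is held fixed, the relative reduction in the sum of concentrations is also the relative reduction in the estimated amount of metal.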

Figure 4.1: Sum of Pb concentration for each threshold

Limiting the maximum values is a widely used way to handle outliers, but it is
a simplified approach. Alternatives include assigning different weights to these
samples [2] and interpolation methods that use different variograms for different value intervals,
such as indicator kriging [4].
Different thresholds cause significant variations in the estimated Pb values, resulting in
substantial differences in reserves. Therefore, the uncertainty of this parameter must be
taken into account. Additional studies varying variograms and cutoffs would improve the
assessment of uncertainty in resources.

REFERENCES
[1] Variogram calculation in python for engineers and geoscientists.
https://github.com/GeostatsGuy/PythonNumericalDemos/blob/master/Variogram.ipynb
Accessed: 2022-01-13.

[2] Joao Felipe Costa. Reducing the impact of outliers in ore reserves estimation. Mathematical
Geology, 35(3):323–345, 2003.

[3] Clayton V Deutsch, Andre G Journel, et al. Geostatistical software library and user’s guide.
New York, 119(147), 1992.

[4] André G Journel. Nonparametric estimation of spatial distributions. Journal of the
International Association for Mathematical Geology, 15(3):445–468, 1983.
