

Journal of Archaeological Science 31 (2004) 151–165 http://www.elsevier.com/locate/jas


Archaeology and geostatistics
C.D. Lloyd a,*, P.M. Atkinson b

a School of Geography, Queen’s University, Belfast BT7 1NN, UK
b School of Geography, University of Southampton, Highfield, Southampton SO17 1BJ, UK

Received 16 December 2002; received in revised form 24 June 2003; accepted 8 July 2003

Abstract

Geostatistics is used in many different disciplines to characterise spatial variation and for spatial prediction, spatial simulation and sampling design. Archaeology is an inherently spatial discipline and the models and tools provided by geostatistics should be as valuable in archaeology as they are in other disciplines that are concerned with spatially varying properties. However, there have, so far, been few applications of geostatistics in archaeology. This paper seeks to highlight some of the key tools provided by geostatistics and to show, through two case studies, how they may be employed in archaeological applications. Some relevant literature is summarised and two case studies are presented based on the analysis of (i) Roman pottery and (ii) soil phosphate data.

© 2003 Elsevier Ltd. All rights reserved.
Keywords: Spatial analysis; Mapping; Sampling design

* Corresponding author. Tel.: +44-28-8027-3478; fax: +44-28-9032-1280. E-mail address: c.lloyd@qub.ac.uk (C.D. Lloyd).
0305-4403/04/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/j.jas.2003.07.004

1. Introduction

Geostatistics is a set of tools used for characterising spatial variation, spatial prediction, spatial simulation and spatial optimisation (e.g., sampling design). Applications of geostatistics are found in a wide range of fields including biology, environmental science, geography, geology, meteorology and mining. Geostatistics is based on the principle of spatial dependence (or spatial autocorrelation): observations close in space tend to be more similar than those further apart. Therefore, if the spatial distribution of some variable is structured (as opposed to being random) geostatistics may be useful in some capacity.

The characterisation of spatial autocorrelation in archaeological variables has been the concern of several researchers. Hodder and Orton [16], in their classic text on spatial analysis in archaeology, provide a section on the subject of spatial autocorrelation. This work included the definition of Moran’s I and Geary’s c, two coefficients which characterise the degree of spatial autocorrelation in a variable. An application was demonstrated based on the distribution of the length/breadth ratio index of Bronze Age spearheads. Specifically, I was estimated for several spatial lags (that is, for pairs of locations separated by several distance and direction vectors), enabling assessment of structure in the spatial distribution of the index. Other studies have applied similar statistical measures of spatial autocorrelation to the terminal distribution of dated monuments at lowland Maya sites [20,40].

There are few published case studies where geostatistics is applied in archaeological contexts. There have, however, been reviews of geostatistics in archaeology: Ebert [13] and Wheatley and Gillings [39] both provide summaries of the basic tools of geostatistics in archaeological contexts. The present paper is intended to take a broader overview and to outline some existing applications of geostatistics in archaeology as well as to present two case studies that are concerned with the analysis of (i) Roman pottery and (ii) soil phosphate data. First, some published applications of geostatistics in archaeology are outlined. Then, geostatistical theory is introduced.
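As a rough illustration of the lag-by-lag use of Moran’s I mentioned above, the following Python sketch computes I for values on a regular transect using binary weights that link positions a fixed number of steps apart. The function names (`morans_i`, `lag_weights`) and the toy data are assumptions made for this example; they are not taken from Hodder and Orton [16].

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I for n values and an (n, n) matrix of spatial weights."""
    z = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    dev = z - z.mean()                      # deviations from the mean
    num = (w * np.outer(dev, dev)).sum()    # weighted cross-products
    den = (dev ** 2).sum()
    return (len(z) / w.sum()) * num / den

def lag_weights(n, lag):
    """Binary weights linking transect positions exactly `lag` steps apart."""
    idx = np.arange(n)
    return (np.abs(idx[:, None] - idx[None, :]) == lag).astype(float)

# Hypothetical transects: a smooth trend and a strictly alternating series.
trend = np.arange(10.0)
alternating = np.array([1.0, -1.0] * 5)

i_trend = morans_i(trend, lag_weights(10, 1))              # positive
i_alt_lag1 = morans_i(alternating, lag_weights(10, 1))     # negative
i_alt_lag2 = morans_i(alternating, lag_weights(10, 2))     # positive
```

Repeating the calculation for a series of lags, as in the last two lines, gives a simple picture of how similarity changes with separation.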


C.D. Lloyd, P.M. Atkinson / Journal of Archaeological Science 31 (2004) 151–165

2. Published applications of geostatistics in archaeology

In this section, a small number of published studies are discussed. These illustrate the wide range of archaeological problems which geostatistics may help to solve. Zubrow and Harbaugh [42] is one of only a few publications that apply the geostatistical spatial prediction method of kriging in an archaeological context. Kriging was utilised to reduce the effort expended in locating archaeological sites. The sites were located in the archaeological zone of Cañada del Alfaro in Guanajuato, Mexico and the Hay Hollow valley in east-central Arizona, USA. The specific aim of the paper was to predict, from a sample of the sites identified through fieldwork, the expected number of sites in each cell of a regular grid. The paper examined the use of a sample of the total surveyed area from which kriged predictions were made. The subsequent surveying required to locate all sites in the surveyed area was then assessed. It was observed that increasing the initial sample from 12.5% of the surveyed area to 50% made relatively little difference in the number of sites found in cells predicted by kriging. In other words, kriging enabled the location of almost as many of the total sites from 12.5% of the total sample as it did from 50% of the total sample. Thus, in this study, spatial dependence in the density of sites was demonstrated, as was the applicability of methods that utilise this property.

Webster and Burgess [36] examined the application of kriging to mapping electrical resistivity for a Saxon or Norman to 17th century site at Bekesbourne in Kent, England. The data set was used to illustrate how large scale trends (that is, a spatially varying mean) may affect the predictions made using kriging, so the objective was only indirectly archaeological in nature.
In a specifically archaeological application, Neiman [27] used variograms to explore spatial variation in the terminal dates of Maya settlements (Whitley [40] and Kvamme [20] had a similar focus).

Geostatistics has been applied in disciplines allied to archaeology. Oliver et al. [29] estimated variograms of leading principal components and canonical variates of pollen counts in a vertical core made through peat in Fife, Scotland. Their objective was to use a range of tools, including variograms (a means of characterising spatial structure; defined below), to explore the structure of the core. Bocquet-Appel and Demars [4] estimated variograms of 14C dates of remains from or associated with European Neanderthals and early modern humans. Models fitted to the variograms were used to generate maps representing the spatial distribution of remains of different dates.

Robinson and Zubrow [34] discuss interpolation in archaeology and they include discussion about kriging, although they caution that the technique should be used with care and that simpler approaches may be suitable in many contexts. Hageman and Bennett [15] provide a short summary of widely used variants of kriging for generating Digital Elevation Models (DEMs) in archaeological applications. Ebert [13], in a review of geostatistics for the analysis of archaeological fieldwalking data, presents an analysis of the spatial distribution of bulk struck flint. In that application, cross validation (this entails removing a data point, predicting its value, comparing the predicted and observed data points and carrying out the same process for all data) was used to assess the accuracy of kriging predictions. A map was also generated using the variogram model specified in the paper. Wheatley and Gillings [39], in their review of GIS in archaeology, provide a chapter on interpolation which includes a section on geostatistical methods. The examples given are based on elevation data (as is the focus of Hageman and Bennett [15]) and not explicitly archaeological data.

3. Geostatistics

The basic principles of geostatistics are outlined below. There are many introductions to the subject and several more detailed texts that could be consulted for more information (for example, [2,14,38]). Burrough and McDonnell [7] provide a short introduction to geostatistics in the context of GIS. There are also introductions for specific audiences including users of GISystems [28], physical geographers [30,31] and the remote sensing community [9].

3.1. The theory of regionalised variables

In the Earth sciences knowledge about how properties vary in space is usually sparse. Therefore, it is not feasible, in general, to use a deterministic model to describe spatial variation. If, for example, the objective is to make predictions at locations for which there are no data it is necessary to allow for uncertainty in our description as a result of our lack of knowledge.
The uncertainty inherent in predictions of any property means that what cannot be described deterministically can be accounted for through the use of probabilistic models. With this approach, the data are considered as the outcome of a random process. Isaaks and Srivastava [17] caution that use of a probabilistic model is an admission of ignorance; it does not mean that any spatially referenced property varies randomly in reality. In geostatistics, spatial variation (at a location, x) is modelled as comprising two distinct parts, a deterministic component ($\mu(x)$) and a stochastic (or ‘random’) component ($R(x)$):

$$Z(x) = \mu(x) + R(x) \tag{1}$$


This is termed a random function (RF) model. The upper case Z refers to the RF whereas lower case z refers to the observed data. In geostatistics, a spatially referenced variable, z(x), is treated as an outcome of a RF, Z(x), defined as a spatial set of random variables (RVs). A realisation of a RF is called a regionalised variable (ReV). The Theory of Regionalised Variables [22] is the fundamental framework on which geostatistics is based.

Where the properties of the variable of interest are the same, or at least similar in some sense, across the region of interest we can employ what is termed a stationary model. In other words, we can use the same model parameters at all locations. Stationarity may be divided (for geostatistical purposes) into three classes for which different parameters of the RF may exist. In turn these are: (i) strict stationarity, (ii) second-order stationarity and (iii) intrinsic stationarity [19,26]. Only the latter two concern us here. For second-order stationarity, the mean and (spatial) covariance are required to be constant. Therefore, the expected value should be the same at all locations, x:

$$E\{Z(x)\} = \mu \quad \text{for all } x \tag{2}$$

In addition, the covariance, $C(h)$, between the locations $x$ and $x+h$ should depend only on the lag, $h$ (the distance and direction by which paired observations are separated), and not on the location, $x$:

$$C(h) = E[\{Z(x) - \mu\}\{Z(x+h) - \mu\}] = E[Z(x)Z(x+h)] - \mu^2 \quad \text{for all } x \tag{3}$$

In some cases, the requirements for second-order stationarity are not met. For example, the variance (or dispersion) may be unlimited as lag increases. For this reason, Matheron [22] defined the intrinsic hypothesis. For a RF to fulfil the intrinsic hypothesis it is required only that the expected value of the variable should not depend on $x$:

$$E\{Z(x)\} = \mu \quad \text{for all } x \tag{4}$$

and the variance of the increments should be finite [19]. Thus, the variogram, $\gamma(h)$, defined as half the expected squared difference between paired RFs, exists and depends only on $h$:

$$\gamma(h) = \frac{1}{2} E[\{Z(x) - Z(x+h)\}^2] \tag{5}$$

That is, the expected semivariance is the same for all observations separated by a particular lag irrespective of where the paired observations are located. Second-order stationarity implies the intrinsic hypothesis, but the intrinsic hypothesis does not imply second-order stationarity. Thus, the covariance function and the correlogram (or autocorrelation function, the standardised covariance) exist only if the RF is second-order stationary, and the variogram must be used when only intrinsic stationarity can be assumed [19].

3.2. The variogram

The core tool in geostatistical analysis is the variogram (defined above). The variogram characterises spatial dependence in the property of interest. The experimental variogram, $\hat{\gamma}(h)$, can be estimated from $p(h)$ paired observations, $z(x_\alpha)$, $z(x_\alpha + h)$, $\alpha = 1, 2, \ldots, p(h)$ using:

$$\hat{\gamma}(h) = \frac{1}{2p(h)} \sum_{\alpha=1}^{p(h)} \{z(x_\alpha) - z(x_\alpha + h)\}^2 \tag{6}$$

Fig. 1. Observations (+) made along a transect, with lag (h) of 1 and 2 indicated.
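The estimator in equation (6) can be sketched in a few lines of Python. This is a minimal omnidirectional version, assuming a simple lag tolerance `tol`; the function name `experimental_variogram` and the six transect values are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def experimental_variogram(coords, values, lags, tol):
    """Semivariance and number of pairs, p(h), for each lag in `lags`."""
    coords = np.asarray(coords, dtype=float)
    if coords.ndim == 1:                 # a transect: make it (n, 1)
        coords = coords[:, None]
    values = np.asarray(values, dtype=float)
    # Indices of every distinct pair of observations (each pair counted once).
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    gamma, counts = [], []
    for h in lags:
        mask = np.abs(dist - h) <= tol   # pairs falling within the lag tolerance
        p = int(mask.sum())
        counts.append(p)
        # Half the average squared difference, as in equation (6).
        gamma.append(sqdiff[mask].sum() / (2 * p) if p else np.nan)
    return np.array(gamma), np.array(counts)

# Hypothetical observations at unit spacing along a transect.
x = np.arange(6.0)
z = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
gamma, counts = experimental_variogram(x, z, lags=[1.0, 2.0], tol=0.01)
```

For these values there are five pairs at lag 1 and four at lag 2, giving semivariances of 19/10 and 10/8 respectively. A directional variogram would additionally filter pairs by the angle of the separation vector.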

In simple terms, the variogram is estimated by calculating the squared differences between all the available paired observations and obtaining half the average for all observations separated by a given lag (or within a lag tolerance where the observations are not on a regular grid). Fig. 1 gives a simple example of a transect along which observations have been made at regular intervals. Lags (h) of 1 and 2 are indicated. Thus, half the average squared difference between observations separated by a lag of 1 is calculated and the process is repeated for a lag of 2 and so on. The variogram can be estimated for different directions to enable the identification of directional variation (termed anisotropy).

A mathematical model may be fitted to the experimental variogram and the coefficients of this model can be used for a range of geostatistical operations such as spatial prediction (kriging) and conditional simulation (defined below). A model is usually selected from one of a set of authorised models. McBratney and Webster [24] provide a review of some of the most widely used authorised models. Further models can be found in a range of texts (for example, [8]).

There are two principal classes of variogram model. Transitive (bounded) models have a sill (finite variance), and indicate a second order stationary process (as defined above). Unbounded models do not reach an upper bound; they are intrinsic only [24]. Fig. 2 shows the parameters of a bounded variogram model. The nugget effect, c0, represents unresolved variation (a mixture of spatial variation at a finer scale than the sample spacing and measurement error). The structured component, c1, represents the spatially correlated variation. The sill, c0+c1, is the a priori variance. The range, a, represents the scale (or frequency) of spatial variation. For example, if soil phosphate measured at an archaeological site varies markedly over quite small distances then the soil phosphate can be said to have a high frequency of spatial variation (a short range) while if the soil phosphate is quite similar over much of the site and varies markedly only at the extremes of the site (that is, at large separation distances) then the soil phosphate can be said to have a low frequency of spatial variation (a long range).

Fig. 2. The parameters of a bounded variogram model with a nugget effect. (Plot against lag (h), labelled with the nugget (c0), the structured component (c1), the sill (c0 + c1) and the range (a).)

Variograms used in the case studies presented following this section were fitted with a nugget effect and a spherical component. The nugget variance is given as:

$$\gamma(h) = \begin{cases} 0 & \text{if } h = 0 \\ 1 & \text{otherwise} \end{cases} \tag{7}$$

The spherical model, a bounded model, is defined as:

$$\gamma(h) = \begin{cases} c \cdot \left[ 1.5 \dfrac{h}{a} - 0.5 \left( \dfrac{h}{a} \right)^3 \right] & \text{if } h \le a \\ c & \text{if } h > a \end{cases} \tag{8}$$

where c is the structured component. Authorised models may be used in positive linear combination where a single model is insufficient to represent well the form of the variogram.

3.3. Kriging

There are many varieties of kriging. Its simplest form is called simple kriging (SK). To use SK it is necessary to know the mean of the property of interest and this must be modelled as constant across the region of interest. In practice this is rarely the case. The most widely used variant of kriging, ordinary kriging (OK), allows the mean to vary spatially: the mean is estimated for each prediction neighbourhood. OK predictions are weighted averages of the n available data. The OK weights define the Best Linear Unbiased Predictor (BLUP). The OK prediction, $\hat{z}_{OK}(x_0)$, is defined as:

$$\hat{z}_{OK}(x_0) = \sum_{\alpha=1}^{n} \lambda_\alpha^{OK} z(x_\alpha) \tag{9}$$

with the constraint that the weights, $\lambda_\alpha^{OK}$, sum to 1 to ensure an unbiased prediction:

$$\sum_{\alpha=1}^{n} \lambda_\alpha^{OK} = 1 \tag{10}$$

So, the objective of the kriging system is to find appropriate weights by which the available observations will be multiplied before summing them to obtain the predicted value. These weights are determined using the coefficients of a model fitted to the variogram (or another function such as the covariance function). The kriging prediction error must have an expected value of 0:

$$E\{\hat{Z}_{OK}(x_0) - Z(x_0)\} = 0 \tag{11}$$

The kriging (or prediction) variance, $\sigma^2_{OK}$, is expressed as:

$$\hat{\sigma}^2_{OK}(x_0) = E[\{\hat{Z}_{OK}(x_0) - Z(x_0)\}^2] = -\gamma(0) - \sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} \lambda_\alpha^{OK} \lambda_\beta^{OK} \gamma(x_\alpha - x_\beta) + 2 \sum_{\alpha=1}^{n} \lambda_\alpha^{OK} \gamma(x_\alpha - x_0) \tag{12}$$

That is, we seek the values of $\lambda_1, \ldots, \lambda_n$ (the weights) that minimise this expression with the constraint that the weights sum to one (equation 10). This minimisation is achieved through Lagrange multipliers. The conditions for the minimisation are given by the OK system comprising n+1 equations and n+1 unknowns:

$$\begin{cases} \sum_{\beta=1}^{n} \lambda_\beta^{OK} \gamma(x_\alpha - x_\beta) + \psi_{OK} = \gamma(x_\alpha - x_0) & \alpha = 1, \ldots, n \\ \sum_{\beta=1}^{n} \lambda_\beta^{OK} = 1 \end{cases} \tag{13}$$

where $\psi_{OK}$ is a Lagrange multiplier. Knowing $\psi_{OK}$, the prediction variance of OK can be given as:

$$\hat{\sigma}^2_{OK} = \psi_{OK} - \gamma(0) + \sum_{\alpha=1}^{n} \lambda_\alpha^{OK} \gamma(x_\alpha - x_0) \tag{14}$$
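The OK system of equations (13) and (14) is small enough to solve directly. The Python sketch below is a minimal punctual OK implementation under stated assumptions: a nugget plus spherical variogram in the form of equations (7) and (8), all n data used (no search neighbourhood), and hypothetical parameter values and data. The function names `spherical` and `ordinary_kriging` are illustrative and do not correspond to GSLIB or Gstat routines.

```python
import numpy as np

def spherical(h, nugget, c, a):
    """Nugget plus spherical variogram; gamma(0) = 0 by definition."""
    h = np.asarray(h, dtype=float)
    s = np.minimum(h / a, 1.0)                    # beyond the range, gamma = sill
    g = nugget + c * (1.5 * s - 0.5 * s ** 3)
    return np.where(h > 0, g, 0.0)

def ordinary_kriging(coords, values, x0, model):
    """Punctual OK prediction, kriging variance and weights at location x0."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d0 = np.linalg.norm(coords - np.asarray(x0, dtype=float), axis=1)
    # Left-hand side of equation (13): variogram matrix bordered by ones.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = model(d)
    A[n, n] = 0.0
    b = np.append(model(d0), 1.0)                 # right-hand side
    sol = np.linalg.solve(A, b)
    lam, psi = sol[:n], sol[n]                    # weights, Lagrange multiplier
    pred = lam @ values                           # equation (9)
    var = psi - float(model(0.0)) + lam @ b[:n]   # equation (14)
    return pred, var, lam

# Hypothetical model coefficients and data at the corners of a 10 m square.
model = lambda h: spherical(h, nugget=1.0, c=4.0, a=30.0)
coords = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
values = np.array([12.0, 15.0, 11.0, 18.0])
pred, var, lam = ordinary_kriging(coords, values, [5.0, 5.0], model)
```

At the centre of the square the four weights are equal by symmetry, so the prediction is the mean of the data. Predicting at one of the data locations returns that datum with a variance of zero, which is the sense in which punctual OK honours the data.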


The kriging variance is a measure of confidence in predictions and is a function of the form of the variogram, the sample configuration and the sample support (the area over which an observation is made, which may be approximated as a point or may be an area) [19]. The kriging variance is not conditional on the data values locally and this has led some researchers to use alternative approaches such as conditional simulation (discussed in the next section) to build models of spatial uncertainty [14].

There are two varieties of OK: punctual OK and block OK. With punctual OK the predictions cover the same area (the support, V) as the observations. In block OK, the predictions are made to a larger support than the observations. With punctual OK the data are honoured. That is, they are retained in the output map. Block OK predictions are averages over areas (i.e., the support has increased). Thus, the prediction is not the same as an observation (at x0) and does not need to honour it. A worked example of the OK system is provided by Burrough and McDonnell ([7], box 6.2).

3.4. Conditional simulation

Kriging predictions are weighted moving averages of the available sample data. Kriging is, therefore, a smoothing interpolator. Conditional simulation (also called stochastic imaging) is not subject to the smoothing associated with kriging (conceptually, the variation lost by kriging due to smoothing is added back) as predictions are drawn from equally probable joint realisations of the RVs which make up a RF model [11]. That is, simulated values are not the expected values (i.e., the mean) but are values drawn randomly from the conditional cumulative distribution function (ccdf): a function of the available observations and the modelled spatial variation [12]. The simulation is considered “conditional” if the simulated values honour the observations at their locations [11]. Simulated realisations represent a possible reality whereas kriging does not.
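To make the contrast with kriging concrete, the following Python sketch draws conditional realisations along a transect using sequential Gaussian simulation, the variant described in the remainder of this section. It is a simplified illustration under several assumptions: the data are taken to be standard normal scores already (so no forward or back transform is applied), the covariance is a unit-sill spherical model without a nugget, and every previously simulated value is used for conditioning rather than a search neighbourhood. All names and values are hypothetical.

```python
import numpy as np

def sph_cov(h, a):
    """Unit-sill spherical covariance, C(h) = 1 - gamma(h), no nugget."""
    s = np.minimum(np.abs(h) / a, 1.0)
    return 1.0 - (1.5 * s - 0.5 * s ** 3)

def sgs_transect(x_data, z_data, x_grid, a, seed):
    """One conditional realisation at the nodes of x_grid."""
    rng = np.random.default_rng(seed)
    xs = [float(x) for x in x_data]       # conditioning locations so far
    zs = [float(z) for z in z_data]       # conditioning values so far
    sim = np.empty(len(x_grid))
    for k in rng.permutation(len(x_grid)):        # random visiting order
        x0 = float(x_grid[k])
        if x0 in xs:                              # node holds an observation
            sim[k] = zs[xs.index(x0)]
            continue
        X, Z = np.array(xs), np.array(zs)
        C = sph_cov(X[:, None] - X[None, :], a)   # data-to-data covariances
        c0 = sph_cov(X - x0, a)                   # data-to-node covariances
        lam = np.linalg.solve(C, c0)              # simple kriging weights
        mean = lam @ Z                            # SK prediction (known mean 0)
        var = max(1.0 - lam @ c0, 0.0)            # SK variance
        # Drawing from N(mean, var) equals adding a random residual to the mean.
        sim[k] = rng.normal(mean, np.sqrt(var))
        xs.append(x0)                             # simulated value joins the data
        zs.append(sim[k])
    return sim

x_data, z_data = [2.0, 10.0, 17.0], [-1.0, 0.5, 1.2]
x_grid = np.arange(21.0)
sim_a = sgs_transect(x_data, z_data, x_grid, a=6.0, seed=1)
sim_b = sgs_transect(x_data, z_data, x_grid, a=6.0, seed=2)
```

Both realisations reproduce the three observations exactly but differ elsewhere, and the spread across many such realisations is one way of expressing uncertainty in the mapped values.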
Simulation allows the generation of many different possible realisations that may be used as a guide to potential errors in the construction of a map [18] and multiple realisations encapsulate the uncertainty in spatial prediction. Probably the most widely used form of conditional simulation is sequential Gaussian simulation (SGS). With sequential simulation, simulated values are conditional on the original data and previously simulated values [11]. In SGS the ccdfs are all assumed to be Gaussian. The SGS algorithm follows several steps [10,14] as detailed below:

1. Apply a standard normal transform to the data.
2. Go to the location $x_1$.
3. Use SK (note OK is often used instead; see Deutsch and Journel [11] about this issue), conditional on the original data, $z(x_\alpha)$, to make a prediction. The SK prediction and the kriging variance are parameters (the mean and variance) of a Gaussian ccdf:

$$F(x_1; z \mid (n)) = \text{Prob}\{Z(x_1) \le z \mid (n)\} \tag{15}$$

4. Using Monte Carlo simulation, draw a random residual, $z^{l}(x_1)$, from the ccdf.
5. Add the SK prediction and the residual which gives the simulated value; the simulated value is added to the data set.
6. Visit all locations in random order and predict using SK conditional on the n original data and the $i-1$ values, $z^{l}(x_j)$, simulated at the previously visited locations $x_j$, $j = 1, \ldots, i-1$ to model the ccdf:

$$F(x_i; z \mid (n+i-1)) = \text{Prob}\{Z(x_i) \le z \mid (n+i-1)\} \tag{16}$$

7. Follow the procedure in steps 4 and 5 until all locations have been visited.
8. Back transform the data values and simulated values.

By using different random number seeds the order of visiting locations is varied and, therefore, multiple realisations can be obtained. In other words, since the simulated values are added to the data set, the values available for use in simulation are partly dependent on the locations at which simulations have already been made and, because of this, the values simulated at any one location vary as the available data vary. SGS is discussed in detail in several texts (for example, [8,10,11,14]). The use, and benefits, of SGS are explored in this paper.

3.5. Sampling design

Kriging predicts with minimum prediction or kriging error, $\sigma_{OK}$ (from here on generalised to $\sigma_K$), and also predicts this kriging error for every predicted value. The kriging error depends only on the geometry of the domain or support V to be predicted, the distances between V and the $n(x_0)$ data points $x_\alpha$, the geometry of the $n(x_0)$ data, and finally the variogram [19]. The values of the sample observations locally have no influence. Thus, if the variogram is known, the kriging error can be predicted for any proposed sampling strategy prior to the actual survey. Kriging is, therefore, an ideal tool for designing optimal sampling strategies.

Burgess et al. [6] chose as their criterion of a good sampling strategy the minimisation of the maximum kriging error, $\sigma_{K\max}$. The quantity $\sigma_{K\max}$ is not constant over the region of interest, but rather tends to increase the further the point (or block) to be predicted is from the observations (at least for monotonic increasing variograms). The $\sigma_{K\max}$ is reached when the point to be predicted is furthest from the sample observations, at a distance $d_{\max}$. Burgess et al. [6] showed that where spatial variation is isotropic (invariant with orientation) an equilateral triangular grid minimises $d_{\max}$ and hence $\sigma_{K\max}$. Any alternative sampling schemes, and in particular the random scheme, will have some larger values of $d_{\max}$ and hence larger values of $\sigma_{K\max}$, although a hexagonal grid may be optimum in restricted circumstances [41]. In practice, a square grid is likely to be preferred for reasons of convenience in indexing, site location and computer handling, and of shorter travelling distances in the field.

The optimal sampling density for a given sampling scheme can be designed by solving the kriging equations for several sampling densities and plotting $\hat{\sigma}_{K\max}$ against sample spacing [23,25]. If the budget for the survey is limited then so too is the maximum precision attainable. If the survey is not limited by funding and the investigators can define a maximum tolerable prediction error, then the optimal sampling strategy is the one that just achieves the desired precision. Greater precision would be wasteful. The optimal strategy is found by reading the required sample spacing from the plot of $\hat{\sigma}_{K\max}$ against sample spacing.

The above approach for optimal sampling design provides a model-based framework for selecting a sample spacing to achieve a desired precision of prediction. However, this approach has been criticised because it does not provide an adequate measure of local uncertainty (e.g., [14]). True, the quantity $\hat{\sigma}_{K\max}$ generally increases with $d_{\max}$ such that densely sampled areas have smaller values of $\hat{\sigma}_{K\max}$ than sparsely sampled areas. However, the quantity $\hat{\sigma}_{K\max}$ is not affected by the character of spatial variation locally.
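The procedure of solving the kriging equations for several sample spacings can be sketched as follows. This Python example is illustrative only: it assumes a square grid, punctual OK from a 4 by 4 block of nodes, a hypothetical nugget plus spherical variogram, and it works with the kriging variance (the square of the kriging error) at the centre of a grid cell, the point furthest from the observations. The function names and the tolerance are assumptions made for the example.

```python
import numpy as np

def spherical(h, nugget, c, a):
    """Nugget plus spherical variogram; gamma(0) = 0 by definition."""
    h = np.asarray(h, dtype=float)
    s = np.minimum(h / a, 1.0)
    return np.where(h > 0, nugget + c * (1.5 * s - 0.5 * s ** 3), 0.0)

def ok_variance(coords, x0, model):
    """Punctual OK variance at x0; no data values are required."""
    n = len(coords)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d0 = np.linalg.norm(coords - x0, axis=1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = model(d)
    A[n, n] = 0.0
    b = np.append(model(d0), 1.0)
    sol = np.linalg.solve(A, b)
    # psi - gamma(0) + sum(lambda * gamma), with gamma(0) = 0.
    return sol[n] + sol[:n] @ b[:n]

def max_variance_square_grid(spacing, model, k=4):
    """Kriging variance at the centre of the central cell of a k x k grid."""
    g = (np.arange(k) - (k - 1) / 2) * spacing
    coords = np.array([(x, y) for x in g for y in g])
    return ok_variance(coords, np.zeros(2), model)

# Hypothetical variogram coefficients and candidate spacings (metres).
model = lambda h: spherical(h, nugget=1.0, c=4.0, a=80.0)
spacings = [5.0, 10.0, 20.0, 40.0]
variances = [max_variance_square_grid(s, model) for s in spacings]

# Largest spacing whose maximum kriging variance stays within a tolerance.
tolerance = 2.5
acceptable = [s for s, v in zip(spacings, variances) if v <= tolerance]
best_spacing = max(acceptable) if acceptable else None
```

Only the grid geometry and the variogram enter the calculation, which is why the design can be assessed before any survey is carried out, and also why the result cannot reflect local differences in the character of spatial variation.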
Thus, in terms of elevation, mountainous regions and floodplain areas would result in the same $\hat{\sigma}_{K\max}$, for a given sampling framework. This inadequacy is most evident in maps of $\hat{\sigma}_{K\max}$ for gridded data: the same local pattern in $\hat{\sigma}_{K\max}$ is repeated globally. Despite these limitations $\hat{\sigma}_{K\max}$ can be useful as a guide to uncertainty in predictions where spatial variation is similar across the region of interest. In cases where the form of spatial variation changes across the region of interest a non-stationary approach (for example, splitting the data into sub-sets which can be regarded as ‘homogeneous’) can be applied [21].

3.6. Software

The wide range of public domain and low cost software now available (see [35] for a review of some public domain software) means that the tools of geostatistics are readily available to the archaeologist. Widely used public domain software packages include GSLIB (Geostatistical Software Library [11]) and Gstat [32], both used for the case studies presented in this paper. In addition, several commercial GISystems include geostatistical functions and there is a range of commercial geostatistical packages.

4. Case studies

In this section, two case studies are presented. The first case study is an analysis of the distribution of Roman pottery in southern Britain and use of the variogram is illustrated. The second case study shows how the variogram, kriging (punctual and block OK) and conditional simulation (SGS) can be applied to the analysis of the distribution of soil phosphates at an archaeological site in Greece.

4.1. Case study 1: Roman pottery in southern Britain

The first case study utilises the variogram to characterise spatial dependence in assemblages of Roman pottery from the south of Britain from details collected by Allen and Fulford [1]. Allen and Fulford acquired data on five types of pottery, but of these, only two occur with enough regularity at the sites surveyed to provide a large enough sample for geostatistical analysis. The two types considered here are South-East Dorset Black Burnished Category I (SEDBB I) and Severn Valley Ware (SVW). SVW was not recorded at many of the sites and variograms estimated from few data are often ‘noisy’ and visually unstructured. It should also be noted that the percentages of SEDBB I and SVW at each site were obtained in various different ways including sherd counts, sherd weights, estimated vessel equivalent (EVE) and number of vessels represented (VR). Allen and Fulford [1] discuss this issue in some detail.

The omnidirectional variogram for SEDBB I is presented in Fig. 3. The increase in semivariance with lag for the variogram of SEDBB I percentages is indicative of spatial dependence and a model was fitted to this variogram. There is a clear tendency for semivariance to increase up to a lag of about 90 km after which semivariance remains constant.
The range (a) of the fitted variogram model was 119.91 km. This may be interpreted as the separation distance above which assemblages of SEDBB I are spatially independent. In archaeological terms, this may represent the redistribution of pottery of this type from production centres to markets. In other words, pottery types that exhibit clearly structured spatial variation may be considered examples of larger scale production, vessels that perhaps dominate in the region of concern and are found consistently in archaeological assemblages. In such a scenario, industries that were only local in scale would, in a regional context, be marked by unstructured spatial variation. A map of SEDBB I%, derived using OK, is

Fig. 3. Omnidirectional variogram for SEDBB I (semivariance, SEDBB I %², against lag, km). Fitted model: 256.785 Nug(0) + 527.289 Sph(119.91). Nug. is nugget, Sph. is spherical.

given in Fig. 4 (the validity of mapping properties such as artefact proportions is discussed below). Predictions are shown only within 250 km of the observations. The largest SEDBB I concentrations are in Dorset, as would be expected. The linear features visible in the map are characteristic of areas located far away from sample data. In this case, the map should not be viewed as a map of predicted pottery amounts (that is, percentages) since pottery amount is not a continuously varying property (unlike, for example, elevation or rainfall). As Wheatley and Gillings [39] note, surfaces derived from observations such as counts of artefacts may be useful where the survey from which the data derive was not exhaustive (to give an indication of counts at areas where no data are available) but such maps should be viewed with caution. The kriged surface provides a way of gaining a clearer sense of regional variation in SEDBB I amount than is possible using shaded point maps and it may be considered to represent the idealised catchment of SEDBB I, but it does not represent pottery amount (that is, percentage) per se.

The omnidirectional variogram for SVW in Fig. 5 appears to demonstrate no general increase in semivariance with increase in lag h; in such a case a nugget effect only may describe adequately the form of the variogram and this would be referred to as pure nugget. In other words, semivariance does not increase markedly with respect to the nugget variance as lag increases. Directional variograms were also computed and, in most cases, gave little indication of spatial dependence. However, the variogram for 0° (north–south alignment) given in Fig. 6, to which a model was fitted, demonstrates a fairly clear increase in semivariance as lag h increases.

This indicates that the distribution of SVW is more continuous in the north–south direction than in other directions. Allen and Fulford’s contour map of SVW depicts major contours aligned north–south with more visually erratic changes in the contours in an east–west alignment. This corroborates the form of the variograms. Additionally, there are more data within the north and south extents of the data set than there are within the east–west limits, which means that the variogram would be most stable in form for the north–south direction. However, the variogram is unbounded and this may be indicative of differences between the north and south rather than within-region differences. The variogram provides a means to quantify spatial variation and compare different properties in a manner that is more objective than comparing maps visually.

4.2. Case study 2: Mapping soil phosphates

Kriging has been applied widely in soil survey to map soil types (for example, [37]) and is here used to map soil phosphates from an archaeological site. The data examined were published by Buck et al. [5]. The data represent soil phosphate measured at a site (reference LS 165), probably dating to the Roman period, that was studied as part of the Laconia Survey in Greece. The measures are mg P/100 g of soil and were obtained at 10 m intervals on a 16 by 16 point grid (although no data were obtained at nine locations on the grid due to obstacles at those nodes of the grid). Variograms were estimated and models fitted to them using Gstat. The omnidirectional variogram of soil phosphate (Fig. 7) illustrates that the soil phosphate is spatially correlated. The omnidirectional

Fig. 4. Map of SEDBB I%, derived using OK. 1000 m cells. (Legend: SEDBB I % from low, 2.78, to high, 68.90; scale bar 0 to 240,000 metres.)

variogram was fitted with a nugget effect and a spherical model. The large nugget effect is indicative of uncertainty in the measurement of soil phosphates and local (small-scale) variation in soil phosphates. The model fitted to the omnidirectional variogram has a range of 80.325 m. This can be interpreted as the maximum scale of spatial variation in soil phosphate in this region. The variogram was also estimated for different directions within a tolerance of 45 degrees. The variogram with the largest range was for 22°30′ (modelled range of 80.976 m) while the variogram at 90° from this had a modelled range of 77.079 m. The variograms, and structured components, for those two directions are given in Fig. 8. The models fitted for the two directions have similar ranges but different nugget variances. Where the sill (recall the total sill is the nugget variance plus the structured components) differs for different directions this is termed zonal anisotropy. The differences are not marked and the most straightforward approach, to use the coefficients from the omnidirectional variogram model as input for kriging, was accepted in this case.

The OK functionality of GSLIB was used to krige soil phosphate to a grid with a 2 m spacing. Both punctual OK and block OK were applied: the choice of one of the two approaches is an important issue. The map of punctual OK predictions is given in Fig. 9 and the corresponding kriging variance is given in Fig. 10. The locations of the observations are obvious in both Fig. 9 and Fig. 10. In Fig. 9, the observations appear as ‘spikes’ in the map. This is a common feature of maps derived using punctual kriging. The OK variance at the observation locations is zero in Fig. 10. This implies that there is no measurement error in the data, but in fact measurement of soil phosphate entails much uncertainty. The analysis was repeated using block OK. The spikes evident in Fig. 9 are not apparent in the map of block OK predictions (Fig. 11). Also, the block OK variances (Fig. 12), unlike in Fig. 10, are not zero at any locations. Note also that the range of values in the block

Fig. 5. Omnidirectional variogram for SVW. (Axes: lag in km, 0 to 140; semivariance in SVW %², 0 to 500.)

Fig. 6. Directional variogram (0°) for SVW. Nug. is nugget, Pow. is power. (Axes: lag in km, 0 to 140; semivariance in SVW %², 0 to 600. Fitted model: 290 Nug(0) + 2 Pow(1).)

OK map is smaller than the range of values for the punctual OK map because of the process of averaging over a 2 m by 2 m block. It is, as noted previously, important to consider issues such as the support over which predictions will be made. Buck et al. [5] aimed to delineate areas with high and low concentrations of soil phosphate. Although the aim here has been simply to demonstrate the application of OK for interpolation, other kriging algorithms, in particular, disjunctive kriging [33], may be used to

assess the probability that a predicted value exceeds a particular threshold. Conditional simulation was also applied to the data. Four maps derived using SGS (the algorithm in GSLIB was utilised) are given in Fig. 13. Differences between the four realisations are apparent. Conditional simulation provides a powerful means to explore variation in spatial data and there are extensive potential applications for interpreting and mapping distributions of archaeological variables. Each one of the maps in

Fig. 7. Omnidirectional variogram of soil phosphate. Nug. is nugget, Sph. is spherical, DD is decimal degrees. (Axes: lag in m, 0 to 80; semivariance in (mg P/100 g)², 0 to 800. Fitted model: 409.959 Nug(0) + 411.119 Sph(80.325).)
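An experimental variogram such as the one in Fig. 7 is commonly estimated as half the mean squared difference between paired values in each lag class. The Python sketch below illustrates that estimator only; it is not the Gstat implementation used in the paper. The function name `experimental_variogram`, its parameters and the lag-class rule (classes centred on multiples of the lag width) are assumptions made for illustration, and the values are random stand-ins laid out on a 16 by 16 grid at 10 m with nine nodes dropped, not the data of Buck et al. [5].

```python
import numpy as np

def experimental_variogram(coords, values, lag_width, n_lags):
    """Omnidirectional experimental semivariance per lag class (hypothetical helper)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # All unordered pairs of observations.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    # Lag class k is centred on k * lag_width (an assumed binning rule).
    lag_class = np.rint(dist / lag_width).astype(int)
    lags, gammas, counts = [], [], []
    for k in range(1, n_lags + 1):
        mask = lag_class == k
        if mask.any():
            lags.append(dist[mask].mean())            # mean separation in the class
            gammas.append(0.5 * sqdiff[mask].mean())  # semivariance
            counts.append(int(mask.sum()))            # number of pairs
    return np.array(lags), np.array(gammas), np.array(counts)

# Stand-in data only: 16 x 16 grid at 10 m spacing, nine nodes removed at random.
rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.arange(16) * 10.0, np.arange(16) * 10.0)
coords = np.column_stack([gx.ravel(), gy.ravel()])
values = rng.normal(60.0, 25.0, size=len(coords))  # synthetic, spatially uncorrelated
keep = np.ones(len(coords), dtype=bool)
keep[rng.choice(len(coords), size=9, replace=False)] = False

lags, gammas, counts = experimental_variogram(coords[keep], values[keep], 10.0, 8)
for lag, gamma, n in zip(lags, gammas, counts):
    print(f"lag {lag:6.2f} m  semivariance {gamma:8.2f}  pairs {n}")
```

Because the stand-in values are uncorrelated, the printed semivariances should show no structure; real data with spatial dependence would be expected to rise with lag, as in Fig. 7.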

Fig. 8. Directional variogram of soil phosphate for 22°30′ and 112°30′. Nug. is nugget, Sph. is spherical, DD is decimal degrees. (Axes: lag in m, 0 to 80; semivariance in (mg P/100 g)², 0 to 900. Fitted models: 22.5 DD: 443.898 Nug(0) + 410.956 Sph(80.976); 112.5 DD: 366.037 Nug(0) + 417.243 Sph(77.079).)

Fig. 13 represents a possible reality, whereas neither Fig. 9 nor Fig. 11 is a possible reality because they are smoothed representations. Kriging provides the best prediction on a point-by-point basis, whereas simulation is the best on a global basis; that is, it reproduces the original spatial structure. Statistics estimated from multiple simulated realisations may be a useful guide to spatial uncertainty. It was noted above that the coefficients of the model fitted to the variogram have been used to ascertain the

maximum punctual kriging variance for different sample spacings (for a prediction neighbourhood of 16 observations). This enables the researcher to ascertain the maximum sample spacing possible to achieve a particular precision [3]. To do this it is necessary to obtain a sample data set for which a representative variogram may be estimated. Measurements are often made along a transect for this purpose. The coefficients of the model fitted to the omnidirectional variogram of soil phosphate were input into the Fortran program OSSFIM

Fig. 9. Map of soil phosphate produced using punctual OK, 2 m cells. Scale is in mg P/100 g of soil. (Map extent 100 m by 100 m; colour scale 27 to 137.)

Fig. 10. Map of punctual OK variances. Scale is in (mg P/100 g)² of soil. (Map extent 100 m by 100 m; colour scale 0.0 to 710.680.)

(Optimal Sampling Schemes for Isarithmic Mapping [23,25]) and the maximum kriging variance, Kmax, for several different sample spacings was obtained (Fig. 14). In Fig. 15, it is shown that if a required Kmax of 625 (mg P/100 g)² (that is, a kriging standard error of 25 mg P/100 g) were stated then a sample spacing of about 25 m would be necessary. The kriging variance is directly dependent on the form of the variogram so it is necessary that the variogram is

Fig. 11. Map of soil phosphate produced using block OK, 2 m cells. Scale is in mg P/100 g of soil. (Map extent 100 m by 100 m; colour scale 44 to 104.)

Fig. 12. Map of block OK variances. Scale is in (mg P/100 g)² of soil. (Map extent 100 m by 100 m; colour scale 81.100 to 294.280.)

representative of the region for which it is estimated. If this is the case, this approach could be a useful tool for the archaeologist as an aid to designing sampling strategies.
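The reasoning behind this use of the kriging variance can be sketched in Python. The block below is not OSSFIM and its output is not claimed to reproduce Fig. 14. It assumes a square sampling grid, punctual OK with the 16 nearest observations (a 4 by 4 neighbourhood), a target at the centre of a grid cell (taken here, as an assumption, to be where the kriging variance is largest), and the omnidirectional model coefficients printed in the Fig. 7 legend. The function names `spherical` and `max_ok_variance` are hypothetical.

```python
import numpy as np

# Coefficients from the Fig. 7 legend: 409.959 Nug(0) + 411.119 Sph(80.325).
NUGGET, PSILL, RANGE_M = 409.959, 411.119, 80.325

def spherical(h, nugget=NUGGET, psill=PSILL, a=RANGE_M):
    """Nugget plus spherical variogram; zero at lag 0, nugget + psill beyond the range."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    gamma = nugget + psill * (1.5 * r - 0.5 * r**3)
    return np.where(h > 0, gamma, 0.0)

def max_ok_variance(spacing, n_side=4):
    """Punctual OK variance at the centre of a cell of a square grid (assumed worst case)."""
    # Observation coordinates, with the target at the origin.
    idx = (np.arange(n_side) - (n_side - 1) / 2.0) * spacing
    xs, ys = np.meshgrid(idx, idx)
    pts = np.column_stack([xs.ravel(), ys.ravel()])
    n = len(pts)
    # OK system in variogram form: [Gamma 1; 1' 0] [lambda; mu] = [gamma0; 1].
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = spherical(np.linalg.norm(pts, axis=1))
    sol = np.linalg.solve(A, b)
    # Kriging variance = sum(lambda_i * gamma_i0) + mu.
    return float(sol @ b)

for spacing in (5, 10, 15, 20, 25, 30, 35, 40):
    print(f"spacing {spacing:2d} m  kriging variance {max_ok_variance(spacing):7.2f}")
```

Note that the kriging variance depends only on the variogram model and the sampling configuration, not on the observed values, which is why sample spacings can be compared before any further data are collected.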

5. Summary and conclusions

Geostatistics offers many potential benefits to archaeologists who are concerned with the analysis of spatial

Fig. 13. Four maps of soil phosphate produced using SGS, 2 m cells. Scale is in mg P/100 g of soil. (Each map covers 100 m by 100 m; colour scale 17 to 297.)
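One way in which statistics from multiple realisations such as those in Fig. 13 might be summarised is sketched below in Python: a per-cell mean, a per-cell standard deviation and the proportion of realisations exceeding a threshold. This is an illustration only and not a procedure reported in the paper. The function name `summarise_realisations` and the threshold of 100 mg P/100 g are hypothetical, and the stack of realisations is random stand-in data rather than SGS output from GSLIB.

```python
import numpy as np

def summarise_realisations(stack, threshold):
    """Per-cell mean, standard deviation and exceedance proportion.

    stack has shape (n_realisations, ny, nx); threshold is in the data units.
    """
    stack = np.asarray(stack, dtype=float)
    mean = stack.mean(axis=0)                    # average over realisations
    std = stack.std(axis=0, ddof=1)              # spread between realisations
    p_exceed = (stack > threshold).mean(axis=0)  # share of realisations above threshold
    return mean, std, p_exceed

# Stand-in data only: 100 synthetic "realisations" on a 50 x 50 grid of 2 m cells.
rng = np.random.default_rng(42)
stack = rng.normal(loc=80.0, scale=30.0, size=(100, 50, 50))

mean, std, p_exceed = summarise_realisations(stack, threshold=100.0)
print(mean.shape, float(p_exceed.min()), float(p_exceed.max()))
```

A map of `p_exceed` gives, for each cell, an empirical estimate of the probability that the threshold is exceeded, given the set of realisations supplied.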

Fig. 14. Plot of maximum kriging variance against sample spacing. (Axes: sample spacing in m, 0 to 40; maximum kriging variance in (mg P/100 g)², 550 to 700.)

data. However, like any tool, geostatistics must be used appropriately. In many cases, simpler tools may be appropriate. So, it is necessary to consider carefully the pros and cons of geostatistics in any given situation.

Where the spatial variation in an archaeological variable is of interest the tools of geostatistics have much potential value. Tools such as the variogram may be utilised to quantify and interpret observed spatial

Fig. 15. Plot of maximum kriging variance against sample spacing, showing the sample spacing required to achieve a maximum kriging variance of 625 (mg P/100 g)². (Axes: sample spacing in m, 0 to 40; maximum kriging variance in (mg P/100 g)², 550 to 700.)

distributions. Hodder and Orton [16] illustrated how measures of spatial autocorrelation could be used to characterise spatial variation in archaeological variables. In addition to characterisation of spatial dependence, this paper has demonstrated how geostatistics may be used to analyse and map archaeological variables. There are many archaeological variables that could be analysed geostatistically. Some obvious ones are artefact densities and dates of objects [39]. The following applications were outlined (using the tools specified in parentheses):
• characterisation of spatial variation (variogram)
• spatial prediction (ordinary kriging)
• assessment of uncertainty in mapped predictions (kriging variance)
• conditional simulation (sequential Gaussian simulation)
• design of optimal sampling strategies (kriging variance).

The tools of geostatistics represent a powerful addition to the archaeologist’s tool kit but, so far, few of the potential benefits have been realised. This is due, in part, to the perceived complexity of the techniques and the models that underlie them. It is hoped that this paper will serve in some way to expand the understanding of geostatistics and to encourage its use in archaeology.

References
[1] J.R.L. Allen, M.G. Fulford, The distribution of South-East Dorset Black Burnished Category 1 pottery in South-West Britain, Britannia 27 (1996) 223–281.
[2] M. Armstrong, Basic Linear Geostatistics, Springer, Berlin, 1998.
[3] P.M. Atkinson, Optimal sampling strategies for raster-based geographical information systems, Global Ecology and Biogeography Letters 5 (1996) 217–280.
[4] J.P. Bocquet-Appel, P.Y. Demars, Neanderthal contraction and modern human colonization of Europe, Antiquity 74 (2000) 544–552.
[5] C.E. Buck, W.G. Cavanagh, C.D. Litton, The spatial analysis of site phosphate data, in: S.P.Q. Rahtz (Ed.), Computer and Quantitative Methods in Archaeology 1988, BAR International Series 446(i), BAR, Oxford, 1988, pp. 151–160.
[6] T.M. Burgess, R. Webster, A.B. McBratney, Optimal interpolation and isarithmic mapping of soil properties. IV. Sampling strategy, Journal of Soil Science 32 (1981) 643–659.
[7] P.A. Burrough, R.A. McDonnell, Principles of Geographical Information Systems, Oxford University Press, Oxford, 1998.
[8] J.P. Chilès, P. Delfiner, Geostatistics: Modeling Uncertainty, John Wiley and Sons, New York, 1999.
[9] P.J. Curran, P.M. Atkinson, Geostatistics in remote sensing, Progress in Physical Geography 22 (1998) 61–78.
[10] C.V. Deutsch, Geostatistical Reservoir Modelling, Oxford University Press, New York, 2002.
[11] C.V. Deutsch, A.G. Journel, GSLIB: Geostatistical Software Library and User’s Guide, second ed., Oxford University Press, New York, 1998.
[12] J.L. Dungan, Conditional simulation, in: A. Stein, F. van der Meer, B. Gorte (Eds.), Spatial Statistics for Remote Sensing, Kluwer Academic Publishers, Dordrecht, 1999, pp. 135–152.
[13] D. Ebert, The potential of geostatistics in the analysis of fieldwalking data, in: D. Wheatley, G. Earl, S. Poppy (Eds.), Contemporary Themes in Archaeological Computing, University of Southampton Department of Archaeology Monograph No. 3, Oxbow Books, Oxford, 2002, pp. 82–89.
[14] P. Goovaerts, Geostatistics for Natural Resources Evaluation, Oxford University Press, New York, 1997.
[15] J.B. Hageman, D.A. Bennett, Construction of digital elevation models for archaeological applications, in: K.L. Westcott, R.J. Brandon (Eds.), Practical Applications of GIS for Archaeologists: A Predictive Modeling Kit, Taylor and Francis, London, 2000, pp. 113–127.
[16] I. Hodder, C. Orton, Spatial Analysis in Archaeology, New Studies in Archaeology 1, Cambridge University Press, Cambridge, 1976.
[17] E.H. Isaaks, R.M. Srivastava, An Introduction to Applied Geostatistics, Oxford University Press, New York, 1989.
[18] A.G. Journel, Modelling uncertainty and spatial dependence: stochastic imaging, International Journal of Geographical Information Systems 10 (1996) 517–522.
[19] A.G. Journel, C.J. Huijbregts, Mining Geostatistics, Academic Press, London, 1978.
[20] K.L. Kvamme, Spatial autocorrelation and the Classic Maya collapse revisited: refined techniques and new conclusions, Journal of Archaeological Science 17 (1990) 197–207.
[21] C.D. Lloyd, P.M. Atkinson, The effect of scale-related issues on the geostatistical analysis of Ordnance Survey digital elevation data at the national scale, in: J. Gómez-Hernández, A. Soares, R. Froidevaux (Eds.), GeoENV II: Geostatistics for Environmental Applications, Kluwer Academic Publishers, Dordrecht, 1999, pp. 537–548.
[22] G. Matheron, The Theory of Regionalized Variables and its Applications, Les Cahiers du Centre de Morphologie Mathématique de Fontainebleau No. 5, École Nationale Supérieure des Mines, Fontainebleau, 1971.
[23] A.B. McBratney, R. Webster, The design of optimal sampling schemes for local estimation and mapping of regionalised variables. II. Program and examples, Computers and Geosciences 7 (1981) 335–365.
[24] A.B. McBratney, R. Webster, Choosing functions for semivariograms of soil properties and fitting them to sampling estimates, Journal of Soil Science 37 (1986) 617–639.
[25] A.B. McBratney, R. Webster, T.M. Burgess, The design of optimal sampling schemes for local estimation and mapping of regionalised variables. I. Theory and method, Computers and Geosciences 7 (1981) 331–334.
[26] D.E. Myers, To be or not to be ... stationary? That is the question, Mathematical Geology 21 (1989) 347–362.
[27] F.D. Neiman, Conspicuous consumption as wasteful advertising: a Darwinian perspective on spatial patterns in Classic Maya terminal monuments dates, in: M.C. Barton, G.A. Clark (Eds.), Rediscovering Darwin: Evolutionary Theory and Archaeological Explanation, Archaeological Papers of the American Anthropological Association, 1997, pp. 267–290.
[28] M.A. Oliver, R. Webster, Kriging: a method of interpolation for geographical information systems, International Journal of Geographical Information Systems 4 (1990) 313–332.
[29] M.A. Oliver, R. Webster, K.J. Edwards, G. Whittington, Multivariate, autocorrelation and spectral analyses of a pollen profile from Scotland and evidence of periodicity, Review of Palaeobotany and Palynology 96 (1997) 121–141.
[30] M.A. Oliver, R. Webster, J. Gerrard, Geostatistics in physical geography. Part I: theory, Transactions of the Institute of British Geographers 14 (1989a) 259–269.
[31] M.A. Oliver, R. Webster, J. Gerrard, Geostatistics in physical geography. Part II: applications, Transactions of the Institute of British Geographers 14 (1989b) 270–286.
[32] E.J. Pebesma, C.G. Wesseling, Gstat, a program for geostatistical modelling, prediction and simulation, Computers and Geosciences 24 (1998) 17–31.
[33] J. Rivoirard, Introduction to Disjunctive Kriging and Non-linear Geostatistics, Clarendon Press, Oxford, 1994.
[34] J.M. Robinson, E. Zubrow, Between spaces: interpolation in archaeology, in: M. Gillings, D. Mattingly, J. van Dalen (Eds.), The Archaeology of Mediterranean Landscapes, Oxbow Books, Oxford, 1999, pp. 65–83.
[35] C. Varekamp, A.K. Skidmore, P.A.B. Burrough, Using public domain geostatistical and GIS software for spatial interpolation, Photogrammetric Engineering and Remote Sensing 62 (1996) 845–854.
[36] R. Webster, T.M. Burgess, Optimal interpolation and isarithmic mapping of soil properties III. Changing drift and universal kriging, Journal of Soil Science 31 (1980) 505–524.
[37] R. Webster, M.A. Oliver, Statistical Methods in Soil and Land Resource Survey, Oxford University Press, Oxford, 1990.
[38] R. Webster, M.A. Oliver, Geostatistics for Environmental Scientists, John Wiley and Sons, Chichester, 2000.
[39] D. Wheatley, M. Gillings, Spatial Technology and Archaeology: The Archaeological Applications of GIS, Taylor & Francis, London, 2002.
[40] D.S. Whitley, Spatial autocorrelation tests and the Classic Maya collapse: methods and inferences, Journal of Archaeological Science 12 (1985) 377–395.
[41] E.A. Yfantis, G.T. Flatman, J.V. Behar, Efficiency of kriging estimation for square, triangular, and hexagonal grids, Mathematical Geology 19 (1987) 183–205.
[42] E.B.W. Zubrow, J.W. Harbaugh, Archaeological prospecting: kriging and simulation, in: I. Hodder (Ed.), Simulation Studies in Archaeology, Cambridge University Press, Cambridge, 1978, pp. 109–122.
