Journal of Hydrology 540 (2016) 527–537
Research papers
Detecting spatial structures in throughfall data: The effect of extent, sample size, sampling design, and variogram estimation method
Sebastian Voss a, Beate Zimmermann b, Alexander Zimmermann a,⇑
a University of Potsdam, Institute of Earth and Environmental Science, Potsdam, Germany
b Research Institute for Post-Mining Landscapes, Finsterwalde, Germany
Article info

Article history:
Received 5 October 2015
Received in revised form 10 May 2016
Accepted 19 June 2016
Available online 21 June 2016
This manuscript was handled by Andras Bardossy, Editor-in-Chief, with the assistance of Uwe Haberlandt, Associate Editor

Keywords: Throughfall; Geostatistics; Sampling; Variogram; Residual maximum likelihood

Abstract

In the last decades, an increasing number of studies analyzed spatial patterns in throughfall by means of variograms. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and a layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation method on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with large outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling) and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments (non-robust and robust estimators) and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the number recommended by studies dealing with Gaussian data by up to 100%. Given that most previous throughfall studies relied on method-of-moments variogram estimation and sample sizes below 200, currently available data are prone to large uncertainties.
© 2016 Elsevier B.V. All rights reserved.
⇑ Corresponding author.
E-mail address: alexander.zimmermann.ii@uni-potsdam.de (A. Zimmermann).
http://dx.doi.org/10.1016/j.jhydrol.2016.06.042
0022-1694/© 2016 Elsevier B.V. All rights reserved.

1. Introduction

In the last three decades, an increasing number of studies analyzed spatial patterns in throughfall to investigate the consequences of rainfall redistribution for biogeochemical (Allen et al., 2015; Hsueh et al., 2016; Möttönen et al., 1999; Whelan et al., 1998) and hydrological processes in forests (Fathizadeh et al., 2014; Gerrits et al., 2010; Hsueh et al., 2016; Keim et al., 2005; Klos et al., 2014; Loescher et al., 2002; Shachnovich et al., 2008; Staelens et al., 2006; Zimmermann et al., 2009). Other studies analyzed throughfall spatial patterns to optimize sampling schemes for estimating mean throughfall (Ziegler et al., 2009; Zimmermann and Zimmermann, 2014; Zimmermann et al., 2010). In the majority of cases, variograms were used to characterize the spatial properties of the throughfall data.

The variogram is central to geostatistics (Webster and Oliver, 1992, 2007) because it describes spatial variation and provides the parameters (nugget, sill and range) that are essential for spatial prediction and the simulation of random fields. It is widely accepted that estimates of the variogram are sensitive to the size and spatial arrangement of the sample (e.g. Lark, 2002a; Russo and Jury, 1987; Webster and Oliver, 1992). Furthermore, there is ample evidence that the variogram range and sill depend on the spatial scale of sampling (e.g. Blöschl, 1999; Western and Blöschl, 1999). Several studies investigated the influence of various aspects of the sampling design on variogram estimation (Bogaert and Russo, 1999; Blöschl, 1999; Corsten and Stein, 1994; Kerry et al.,
2008; Lark, 2002a; Morris, 1991; Müller and Zimmerman, 1999; Pardo-Igúzquiza and Dowd, 2013; Pettitt and McBratney, 1993; Russo and Jury, 1987; Skøien and Blöschl, 2006; Warrick and Myers, 1987; Webster and Oliver, 1992; Western and Blöschl, 1999). This work, however, has received little attention among forest hydrologists partly because of a missing common language between environmental statisticians and field hydrologists.

A closer look at the studies that investigated the role of sampling designs on variogram estimation reveals that they can be divided into three groups. The first group (Bogaert and Russo, 1999; Lark, 2002a; Morris, 1991; Müller and Zimmerman, 1999; Pettitt and McBratney, 1993; Warrick and Myers, 1987) optimized sampling designs based on various criteria linked to the variogram. Early studies (Morris, 1991; Warrick and Myers, 1987) focused on the distribution of sampling points among the lags. For instance, Warrick and Myers (1987) presented a criterion that aims at an equally distributed number of paired comparisons in each lag class. Morris (1991) criticized this approach because it neglects the correlation of the spatial data and leads to a comparatively low efficiency of the sampling design (van Groenigen, 1999). Subsequent studies chose other, more complex criteria. For instance, Müller and Zimmerman (1999) maximized the determinant of Fisher's information matrix and Lark (2002a) minimized the kriging variance to find an optimum configuration of sampling points. In his comprehensive study, Lark (2002a) demonstrated that (i) a random process with a small spatial dependence is sampled best with scattered clusters of sampling points, (ii) for long range processes a regular array is optimal, and (iii) if there is no prior information about the spatial correlation, sampling in transects is the most robust approach.

The second group of studies (Corsten and Stein, 1994; Kerry et al., 2008; Pardo-Igúzquiza and Dowd, 2013; Webster and Oliver, 1992) sampled simulated random fields to assess the effect of different sampling designs and sample sizes on variogram estimation. This simple approach has the advantage that the experimental variogram can be compared directly against the variogram which is based on all data of the simulated field. Webster and Oliver (1992) sampled different random fields and concluded that a sample size of 150 would be satisfactory for a precise estimate of the variogram. Kerry et al. (2008) followed the approach of Webster and Oliver (1992) and compared residual maximum likelihood (REML) with method-of-moments (MoM) based variogram estimation. They found that REML outperforms MoM and that a sample size of 100 would be sufficient for variogram parameter estimation.

The third group of studies (Blöschl, 1999; Skøien and Blöschl, 2006; Western and Blöschl, 1999) investigated effects of scale on variogram estimation. Although these studies did not primarily focus on the influence of the sampling design on variogram estimation, their work has important implications for sampling. For instance, Western and Blöschl (1999) showed that estimates of the correlation length depend on the extent (i.e. on the size of the sampling plot).

Most of the studies that assessed the impact of the sampling design on variogram estimation worked with normally distributed data. Furthermore, the data of previous studies often showed a comparatively strong autocorrelation and a long range. Throughfall data, however, usually do not follow a normal distribution; instead they often show skewed underlying distributions (Zimmermann and Zimmermann, 2014) and heavily outlying values (Lloyd and Marques, 1988; Zimmermann et al., 2009). Moreover, variograms of throughfall data usually exhibit relatively small ranges compared to the extent of the research area (Möttönen et al., 1999; Zimmermann and Zimmermann, 2014; Zimmermann et al., 2009). Therefore, it is not clear if the results of previous studies apply to the spatial analysis of data with these properties. To fill this knowledge gap, we sampled a set of unconditional simulations, which we obtained using real-world throughfall data, with several extents, common spatial sampling designs, and a variety of sample sizes. We then evaluated these sampling schemes in terms of their ability to provide satisfactory estimates of the variogram parameters. For our analysis we used both REML and MoM variogram estimation.

2. Methods

2.1. Data

From a large throughfall data set (Zimmermann and Zimmermann, 2014), we selected six events that showed distinct univariate distributions and autocorrelation structures, respectively. While all events included outlying values (i.e. data points which cannot be forced to the center of the distribution even after transformation), events 1 and 5 furthermore showed an underlying asymmetry (cf. Kerry and Oliver, 2007). This type of asymmetry is not caused by outliers but by a skew of the underlying (or primary) distribution of the data, which can be statistically defined as the region between the first and seventh octile (Zimmermann et al., 2010). It is important to distinguish between underlying asymmetry and skewness due to outliers because these deviations from the normal distribution require different treatments of the data. Robust variogram estimators can deal with normally distributed data that are contaminated with outliers. Robust estimators, however, cannot deal with data that show an underlying skew because the estimators have a specific consistency correction for contaminated normal data (Lark, 2000a). We therefore had to transform data of events 1 and 5 before further processing.

For each event we constructed a Gaussian random field by unconditional simulation. The simulated values of fields 1 and 5 were back transformed to ensure that the fields reflected the structure of the original data. In a final step we contaminated the fields with outlying values of the respective event (Fig. 1). For an in-depth description of the construction of the fields we refer to Zimmermann et al. (2010) and Zimmermann and Zimmermann (2014). The fields have an edge length of 100 m, a grid unit of 0.1 m and hence consist of 10^6 data points.

A closer look at the simulated fields (Fig. 1, Table 1) reveals that they comprise a large span of spatial structures which is reflected in the variation of the nugget-to-sill ratio and the effective range, respectively. Relatively strong spatial structures and short autocorrelation distances characterize fields 3, 4 and 5. Accordingly, these fields have nugget-to-sill ratios <25% and effective ranges of around 3 m. In contrast, fields 1 and 2 feature somewhat weaker spatial structures but a comparatively long autocorrelation distance. Finally, field 6 features a pure nugget structure and hence displays no spatial correlation.

2.2. Sampling methods

For our study we tested the influence of the extent, sampling design, sample size, and methodology on the estimation of the variogram. In this section, we describe the selection of the extent, sampling design and sample size. In Sections 2.3 and 2.4 we deal with the variogram estimation methods.

To assess the influence of the extent, we employed three plot sizes with edge lengths of 25 m, 50 m, and 100 m, respectively. The largest plots comprised the entire simulation fields. For the smaller plots we arbitrarily chose the lower left corner of the simulated fields.

For the analysis of the sampling design we tested random sampling (R) in addition to three regular designs: a regular grid (G), a
regular grid with additional observations at short distances (Gs), and transects (T). Regular designs have been used frequently for geostatistical analyses. Previous studies on throughfall spatial patterns, however, also used random sampling (e.g. Bellot and Escarre, 1998; Hsueh et al., 2016; Keim et al., 2005), which is why we tested this design too.

We combined the four sampling designs with five different sample sizes (50, 100, 150, 200 and 400) to evaluate the influence of the sample size on variogram estimation. Because of the regular sampling pattern of grids and transects it was not always possible to obtain the envisaged sample size. In these cases we used the closest possible sample size (e.g. grid-based sampling: grid of 7 by 7 = 49 points instead of n = 50, Supplementary material A).

The sampling points were allocated as follows. For the regular designs, we divided the simulated fields into square subplots. Their size depended on the extent, sample size, and sampling design (Fig. 2). The sampling points of designs G and Gs were randomly arranged in the subplots, but the grid shape was maintained in the whole plot area. For design Gs, the number of sampling points in the grid was reduced, leading to a larger spacing and larger subplots. Every second sampling point in the design Gs got an additional sampling point at 1 m distance (or 0.5 m distance for an extent of 25 m and a sample size of 400). For design T, we randomly chose the position and orientation (horizontal or vertical) of the transects. One transect consisted of six sampling points. The distances of the first five sampling points in a transect were fixed (Supplementary material A) and rose exponentially as recommended by Pettitt and McBratney (1993). The distance to the last sampling point depended on the size of the subplots and on the sample size. Each of the six random fields (Fig. 1) was sampled with all combinations of extents, sampling designs, and sample sizes 1000 times. The obtained samples were used for variogram estimation.

Fig. 1. The six random fields used for re-sampling. [Six map panels, field 1 to field 6; axes X [m] and Y [m] from 0 to 100; color scales in mm.]

2.3. Variogram estimation by the method-of-moments (MoM)

The estimation of the variogram by the method-of-moments (MoM) is widely used in geostatistics (de Gruijter et al., 2006, p. 173). With MoM the variogram parameters are estimated in two steps. These consist of calculating an experimental variogram and subsequently fitting a variogram model to it. Because of the outliers in the simulated fields, we calculated experimental variograms using non-robust and robust estimators: the estimators of Matheron (1962), Cressie and Hawkins (1980) and Dowd (1984). Given normal data, Matheron's estimator is the most efficient, while the others can deal with outliers. Because the samples of fields 1 and 5 showed underlying asymmetry (Section 2.1), we transformed the data to square roots before variogram calculation. Following Journel and Huijbregts (1978, p. 194), we calculated the experimental variograms to a maximum lag distance of half the diagonal of the plots. For all experimental variograms we chose 1 m lag distances.

In a second step, we fitted three models to the experimental variogram that are most appropriate for throughfall data (exponential, spherical, pure nugget; Zimmermann and Zimmermann, 2014). For variogram fitting we applied the R-function optim (R Core Team, 2013) with the method "L-BFGS-B". Subsequently, we chose the model with the smallest sum of the weighted least squares (Cressie, 1985). Lags with less than 5 pairs of observations were not used for model fitting.

We performed this procedure for the three MoM variogram estimators, which resulted in three different sets of variogram parameters (variogram model, nugget, sill and range). To decide which set of variogram parameters describes the sampled data best, we predicted the throughfall volume at each location $x_i$ by ordinary kriging using all data except the value at location $x_i$ (cross validation). We then calculated the statistic $\theta(x)$ for every point $x_i$ (Lark, 2000a):
Table 1
The spatial properties of the simulation parameters used for the construction of the 6 non-Gaussian random fields and the spatial properties based on the exhaustive variograms
of the constructed fields (plot with an edge length of 100 m) and the subfields (plots with 25 m and 50 m edge lengths).
Field Method Edge length Nugget Sill Range [m] Effective range [m] Nugget:sill ratio [%] Variogram model
1 Simulation NA 0.02 0.06 2.84 8.53 24.80 Exponential
MoM 25 0.02 0.06 7.10 21.30 33.10 Exponential
50 0.02 0.06 3.05 9.14 35.06 Exponential
100 0.02 0.05 2.60 7.79 42.65 Exponential
REML 25 0.01 0.05 3.07 9.21 26.67 Exponential
50 0.01 0.05 3.13 9.40 27.03 Exponential
100 0.01 0.04 2.44 7.32 31.06 Exponential
MoM 25 437.97 551.07 1.88 5.65 79.48 Exponential
50 449.00 545.09 2.02 6.07 82.37 Exponential
100 444.77 545.42 2.52 7.57 81.55 Exponential
50 380.64 485.21 2.18 6.53 78.45 Exponential
100 382.29 480.69 2.52 7.55 79.53 Exponential
3 Simulation NA 2.69 11.30 3.39 3.39 23.77 Spherical
MoM 25 3.11 12.87 3.20 3.20 24.18 Spherical
50 2.73 13.09 3.08 3.08 20.82 Spherical
100 2.49 13.43 3.34 3.34 18.52 Spherical
REML 25 2.63 11.59 3.31 3.31 22.72 Spherical
50 2.48 11.30 3.14 3.14 21.93 Spherical
100 2.24 11.76 3.33 3.33 19.00 Spherical
MoM 25 30.90 133.32 0.94 2.82 23.18 Exponential
50 20.05 134.07 1.01 3.02 14.95 Exponential
100 31.51 136.71 1.00 3.00 23.05 Exponential
50 13.18 119.40 0.95 2.86 11.04 Exponential
100 14.01 123.04 0.99 2.97 11.38 Exponential
MoM 25 0.04 0.23 1.46 4.39 18.26 Exponential
50 0.05 0.21 1.14 3.42 22.29 Exponential
100 0.06 0.21 1.05 3.16 29.64 Exponential
50 0.02 0.19 1.07 3.20 10.49 Exponential
100 0.02 0.19 1.00 3.00 11.92 Exponential
6 Simulation NA 5.14 5.14 0.00 0.00 100.00 Nugget
MoM 25 5.87 5.87 0.00 0.00 100.00 Nugget
50 5.83 5.83 0.00 0.00 100.00 Nugget
100 5.72 5.72 0.00 0.00 100.00 Nugget
REML 25 5.13 5.13 0.00 0.00 100.00 Nugget
50 4.98 4.98 0.00 0.00 100.00 Nugget
100 5.00 5.00 0.00 0.00 100.00 Nugget
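The relation between the columns of Table 1 can be sketched in a few lines. This is a minimal illustration, assuming the usual convention that the effective range of an exponential model is three times its range parameter, while a spherical model reaches its sill at the range itself; this convention matches the printed values but is not spelled out in the text. The function names are hypothetical.

```python
def effective_range(model: str, range_m: float) -> float:
    """Distance at which the model (practically) reaches its sill.

    Assumed convention: 3 * range for the exponential model,
    the range itself for the spherical model, 0 for pure nugget.
    """
    if model == "exponential":
        return 3.0 * range_m
    if model == "spherical":
        return range_m
    if model == "nugget":
        return 0.0
    raise ValueError(f"unknown model: {model}")


def nugget_to_sill_percent(nugget: float, sill: float) -> float:
    """Nugget as a percentage of the (total) sill."""
    return 100.0 * nugget / sill


# Simulation rows of Table 1; inputs are the printed, already rounded values,
# so small deviations from the printed results are expected.
field1_eff = effective_range("exponential", 2.84)   # Table 1 lists 8.53 m
field3_eff = effective_range("spherical", 3.39)     # Table 1 lists 3.39 m
field3_ratio = nugget_to_sill_percent(2.69, 11.30)  # Table 1 lists 23.77 %
```

Because the table entries are rounded, recomputed values agree with the printed ones only to within rounding error.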
$$\theta(x_i) = \frac{\{\hat{Z}(x_i) - z(x_i)\}^2}{\sigma_{K,i}^2} \qquad (1)$$

with the observed value $z(x_i)$, the kriged value $\hat{Z}(x_i)$ and the kriging variance $\sigma_{K,i}^2$. A correct set of variogram parameters has a median $\theta(x)$ of 0.455 (Lark, 2000a). Accordingly, we chose the MoM variogram estimator with $\theta(x)$ closest to 0.455.

2.4. Variogram estimation by residual maximum likelihood (REML)

Maximum likelihood based estimation methods are state of the art in variogram estimation (Kerry et al., 2008; Lark et al., 2006). In this study, we used the residual maximum likelihood (REML) estimation method (Patterson and Thompson, 1971) as described in Diggle and Ribeiro (2007, p. 116). In contrast to MoM, REML estimates the variogram parameters directly from the sample data and not as a fit to the experimental variogram. Since REML is sensitive to outliers, we removed all data points identified as spatial outliers before variogram estimation (cf. Zimmermann et al., 2010). This approach is sensible for throughfall data because in most situations we wish to model the spatial variation of the primary process (throughfall data without outliers) not the variation of the secondary process (outliers originating from drip points). In other cases, outlier removal will not be possible and hence, MoM estimation of the variogram using robust estimators is the only choice.

To identify the spatial outliers at locations $x_i$ in our throughfall datasets, we used the standardized error of cross-validation, $\varepsilon_S(x)$ (Bárdossy and Kundzewicz, 1990; Lark, 2002b) and the cross-validation result of the most accurate MoM estimator:

$$\varepsilon_S(x_i) = \frac{\hat{Z}(x_i) - z(x_i)}{\sigma_{K,i}} \qquad (2)$$

with the kriged value $\hat{Z}(x_i)$, the observed value $z(x_i)$ and the kriging standard error $\sigma_{K,i}$. According to Bárdossy and Kundzewicz (1990), it is plausible to assume that $\varepsilon_S(x)$ follows a standard normal distribution. Consequently, we can use the z statistic to classify a value as an outlier. In the present study, points in the sampled dataset with a standardized error of cross-validation smaller than $-2.576$ ($\alpha/2 = 0.005$) were removed as spatial outliers before further calculations. Hence, with this statistical procedure we removed unusually large values from our datasets.

After outlier removal, we applied REML to fit the same theoretical models (exponential, spherical, pure nugget) as we did when
using MoM estimation, and chose the one with the highest maximized log-likelihood.

Fig. 2. The four sampling designs: grid (G), regular grid with additional observations at short distances (Gs), transects (T) and random (R). Plots show a sample size of 100. The gray lines divide the plots into subplots, additional observations of design Gs are marked in gray. [Four panels G, Gs, T, R; axes X [m] and Y [m] from 0 to 100.]

2.5. Automatic methods for the estimation of input parameters

The two variogram estimation methods described above require initial estimates of the variogram parameters nugget, sill, and range, which are based on the experimental variogram. The calculation of an experimental variogram is indispensable for MoM but not required for REML. However, to find initial estimates for REML as well, we calculated Matheron's experimental variogram after outlier removal.

The initial estimates are usually obtained by visual inspection of the experimental variogram (Diggle and Ribeiro, 2007, p. 107). For this study, however, visual inspection was not feasible because of the vast number of experimental variograms that had to be analyzed. Consequently, we had to use automatic methods for the derivation of input parameters.

Two automatic methods for the estimation of input parameters based on the experimental variogram were found in the literature and used in this study. The first method (Jian et al., 1996) calculates the initial sill as the mean of the semivariance values of the last three lags. The initial nugget is defined as the intercept of a linear model with the semivariance values of the first two lags, and the initial range is defined as half of the mean distance of the research area. The second method (Hiemstra et al., 2009) calculates the initial sill as the mean of the maximum and median semivariance, the initial nugget as the minimum semivariance and the initial range as 0.10 times the diagonal of the bounding box of the data (this method is implemented in their R-package automap). We varied the initial parameters of the two methods by ±10%, which produced six parameter sets for MoM and REML-based variogram estimation, respectively. With these sets of initial parameters we calculated three different variogram models (spherical, exponential, nugget) using both MoM and REML. For each variogram estimation method we finally chose the model with the best fit, as described in Sections 2.3 (MoM) and 2.4 (REML).

Fig. 3. Summary of steps involved in estimating the variogram parameters based on sampling of the simulated throughfall fields. [Flowchart: unconditional simulation of a Gaussian random variable (parameters: mean, sill, nugget, range, variogram model); 6 simulated throughfall fields; installation of outliers; 6 contaminated throughfall fields (up to here from Zimmermann & Zimmermann (2014)); sampling with 3 extents, 4 sampling designs, 5 sample sizes, 1000 iterations. MoM branch: calculation of the experimental variogram with 3 estimators (Cressie and Hawkins, 1980; Dowd, 1984; Matheron, 1962); automatic estimation of initial parameters; estimation of the variogram model by MoM (Cressie, 1985); calculation of the statistic $\theta$ and choice of the estimator with median $\theta$ closest to 0.455. REML branch: removal of spatial outliers with $\varepsilon_S < -2.576$; calculation of the experimental variogram (Matheron, 1962); automatic estimation of initial parameters; estimation of the variogram model by REML (Patterson and Thompson, 1971). Both branches end in a comparison of the variogram parameters with the parameters of the simulated fields.]

2.6. The exhaustive variogram as comparison

In order to assess the accuracy of the variogram estimation, we calculated exhaustive variograms using all data of the respective extent and field (Table 1). It was technically not feasible to calculate a variogram model with all data points. Hence, both for MoM and REML, we estimated variogram models as described above using 1000 randomly chosen data points and made sure that no data point was used more than once. This procedure allowed the estimation of 1000, 250 and 62 variogram models for the extents with a 100 m, 50 m and 25 m edge length, respectively. By averaging the parameters of the respective variograms, we received one exhaustive variogram for the three extents of every field. The exhaustive variogram represents an approximation of the real variogram for the respective extent and simulated field.

To assess the accuracy of the variogram estimation with varying sample sizes and sampling designs, we compared the results of the estimated variogram parameters with the parameter values of the exhaustive variograms. The individual steps of this procedure are summarized in Fig. 3. For comparability of variogram parameters and estimation methods (MoM and REML), we defined a span for
each field and parameter, within which we deemed parameter values to be accurately estimated. The nugget and sill of an exhaustive variogram ±25% of the sill value defined the tolerable span for the semivariance-axis. The range of the exhaustive variogram ±25% defined the tolerable span for the distance-axis.

All calculations were processed with the free software environment R (R Core Team, 2013). Particularly, we used the packages automap (Hiemstra et al., 2009), geoR (Diggle and Ribeiro, 2007), and georob (Papritz, 2014).

3. Results and discussion

3.1. The role of the extent

Our results indicate that the choice of the extent has a substantial influence on the estimation of the experimental variogram (Fig. 4, Supplementary material B). Particularly for sample sizes of 150 and less, semivariance estimates from the 25 m by 25 m plot deviate much less from the exhaustive variogram than the estimates from the larger plots. As expected, the choice of the extent also influences the estimation of the variogram parameters, albeit to a varying degree (Fig. 5, Supplementary material C). Most estimates of the sill display only minor differences between the three extents. Estimates of nugget and range, in contrast, clearly improve with a decreasing extent which is reflected in an increasing number of estimates within the tolerable span (Fig. 5; for a definition of the tolerable span see Section 2.6). Consequently, our results seem to suggest that a comparatively small ratio of the plot extent to the range of the data, and hence a relatively small spacing between sampling points, facilitates the estimation of variograms with a short range (cf. Blöschl, 1999; Skøien and Blöschl, 2006; Western and Blöschl, 1999).

Fig. 4. Comparison between the 5th and 95th percentiles of the experimental variograms from field 3 sampled with three extents (plot edge lengths of 25 m, 50 m, and 100 m), four sampling designs (G, Gs, T, R), and various sample sizes (50, 100, 150, 200, 400). The exhaustive variograms are plotted as solid lines. For results of all fields we refer to Supplementary material B. [Panels: columns for the 25 m, 50 m and 100 m extents, rows for n = 50, 100, 150, 200 and 400; x-axis distance [m], y-axis semivariance.]
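The kind of percentile envelope shown in Fig. 4 can be sketched as follows. This is an illustration only, not the authors' R workflow: it uses Matheron's method-of-moments estimator with 1 m lag classes up to half the plot diagonal and drops lag classes with fewer than 5 pairs, as described in Section 2.3, but it samples a hypothetical pure-nugget toy field (independent standard normal values) instead of the simulated throughfall fields, and it uses 200 instead of 1000 iterations. All function and variable names are assumptions.

```python
import math
import random
from statistics import quantiles


def matheron_variogram(points, values, lag_width=1.0, max_lag=10.0, min_pairs=5):
    """Matheron's estimator per lag class.

    Returns {lag_index: semivariance}; lag classes with fewer than
    min_pairs point pairs are dropped.
    """
    n_lags = int(max_lag / lag_width)
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            k = int(d / lag_width)
            if d > 0 and k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    return {k: sums[k] / (2 * counts[k]) for k in range(n_lags) if counts[k] >= min_pairs}


random.seed(1)
extent = 25.0                                  # plot edge length [m]
max_lag = 0.5 * math.hypot(extent, extent)     # half the plot diagonal
per_lag = {}
for _ in range(200):                           # the study used 1000 iterations
    pts = [(random.uniform(0, extent), random.uniform(0, extent)) for _ in range(50)]
    vals = [random.gauss(0.0, 1.0) for _ in pts]   # toy pure-nugget field, variance 1
    for k, gamma in matheron_variogram(pts, vals, 1.0, max_lag).items():
        per_lag.setdefault(k, []).append(gamma)

# 5th and 95th percentile of the semivariance estimates in each lag class
envelope = {}
for k, gammas in sorted(per_lag.items()):
    if len(gammas) >= 20:
        cuts = quantiles(gammas, n=20)         # 19 cut points at 5 %, 10 %, ..., 95 %
        envelope[k] = (cuts[0], cuts[-1])
```

For a field with spatial structure, the envelope would be compared against the exhaustive variogram; in this toy case the true semivariance is 1 at every lag.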
Fig. 5. The percentage of variogram parameters within the tolerable span as a function of extent (plot edge lengths of 25 m, 50 m, and 100 m), sample size (50, 100, 150, 200, 400), sampling design (G, Gs, T, R), and estimation method (MoM, REML). The plots show averages of the 6 random fields. For results of the individual fields we refer to Supplementary material C. [Panels: columns for the 25 m, 50 m and 100 m extents, rows for sill, nugget and range; x-axis sample size, y-axis percentage of variogram parameters within the tolerable span [%].]
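The percentages plotted in Fig. 5 follow from the tolerable span of Section 2.6, which can be sketched as below. This reflects one reading of that definition: nugget and sill are each accepted within ±25% of the exhaustive sill, and the range within ±25% of the exhaustive range. The exhaustive values are the field 3, REML, 100 m row of Table 1; the four estimates and all function names are hypothetical.

```python
def within_tolerable_span(estimate, exhaustive, tolerance=0.25):
    """Check each parameter estimate against the tolerable span."""
    sill_tol = tolerance * exhaustive["sill"]     # semivariance axis
    range_tol = tolerance * exhaustive["range"]   # distance axis
    return {
        "nugget": abs(estimate["nugget"] - exhaustive["nugget"]) <= sill_tol,
        "sill": abs(estimate["sill"] - exhaustive["sill"]) <= sill_tol,
        "range": abs(estimate["range"] - exhaustive["range"]) <= range_tol,
    }


def percent_within(estimates, exhaustive):
    """Percentage of estimates inside the tolerable span, per parameter."""
    hits = {"nugget": 0, "sill": 0, "range": 0}
    for est in estimates:
        for name, ok in within_tolerable_span(est, exhaustive).items():
            hits[name] += ok
    return {name: 100.0 * count / len(estimates) for name, count in hits.items()}


# Exhaustive variogram: field 3, REML, 100 m edge length (Table 1)
exhaustive = {"nugget": 2.24, "sill": 11.76, "range": 3.33}

# Hypothetical parameter estimates from four repeated samples
estimates = [
    {"nugget": 4.0, "sill": 15.5, "range": 3.9},
    {"nugget": 2.0, "sill": 12.0, "range": 3.0},
    {"nugget": 0.0, "sill": 10.0, "range": 6.0},
    {"nugget": 6.0, "sill": 11.0, "range": 3.5},
]
summary = percent_within(estimates, exhaustive)
```

In the study this percentage was computed over the 1000 repeated samples of each combination of field, extent, design and sample size.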
In practice, however, it is not straightforward to choose an optimum extent because the covariance structure of throughfall at a particular site is usually unknown. Furthermore, throughfall autocorrelation structures vary with rainfall magnitude, i.e. small events often display a larger range than large events (Zimmermann and Zimmermann, 2014). Lastly, and probably most important, the extent itself influences estimates of the range (Blöschl, 1999; Skøien and Blöschl, 2006; Western and Blöschl, 1999). Previous studies on soil moisture and snow cover patterns (Blöschl, 1999; Western and Blöschl, 1999) indicate that the observed autocorrelation distance decreases with a decreasing extent. Our simulations do not always show this pattern (Table 1). However, the ranges of the exhaustive variograms deviated most from the range of the variogram that was used to build the simulation if the calculations were based on the smallest extent (25 m by 25 m plot, Table 1). We therefore suspect that for our data with typical correlation lengths between three and eight meters (Table 1), a sampling area of 25 m by 25 m is already too small.

Fig. 6. The distribution of the number of point pairs per lag class for a sample size of 150. The plot shows results for four sampling designs (G, Gs, T, R) and three extents (plot edge lengths of 25 m, 50 m, and 100 m). [Panels: columns for the 25 m, 50 m and 100 m extents, rows for designs G, Gs, T and R; x-axis distance [m], y-axis number of pairs.]

3.2. The role of the sample size

The sample size has a considerable influence on the estimation of both the experimental variogram (Fig. 4, Supplementary material B) and the variogram parameters (Fig. 5, Supplementary material C). Generally, with rising sample size the spread of the semivariance estimates narrows towards the values of the exhaustive variogram. Furthermore, the semivariance estimates become less erratic (Fig. 4, Supplementary material B). This effect was already documented in previous studies (Webster and Oliver, 1992). However, in contrast to previous studies (Gascuel-Odoux and Boivin, 1994; Kerry et al., 2008; Webster and Oliver, 1992), our results indicate that a sample size of 200 is not
necessarily the upper end. In fact, if the ratio of the extent to the range is relatively large, 200 or more sampling points seem to be necessary to avoid erratic semivariance estimates of the experimental variogram (Fig. 4; Supplementary material B).
A rising sample size also goes along with a higher accuracy of the variogram parameter estimates (Fig. 5, Supplementary material C). Particularly for small samples (e.g. n = 50), an increase in sample size is reflected in a large gain in accuracy, whereas increasing the sample size of an already large sample (n = 200) often results in a comparatively small increase in accuracy. This result is in accordance with previous studies (Gascuel-Odoux and Boivin, 1994; Kerry et al., 2008), although the point where the benefit of an increase in sample size levels off seems to be case-specific. For our throughfall data this point is reached only at a large sample size of ≥200 (Fig. 5).
3.3. The role of the sampling design
The performance of the tested sampling designs strongly depends on the extent, sample size, and variogram estimation method (Fig. 4, Supplementary material B; Fig. 5, Supplementary material C). Therefore, an assessment of a particular design always requires considering the entire sampling setup. For instance, transect-based sampling (T) and random sampling (R) produce particularly erratic semivariance estimates for plots with an edge length ≥50 m and sample sizes ≤100 (Fig. 4, Supplementary material B). This pattern emerges because certain lag classes contain too few data. For the same setup, grid sampling (G) and optimized grid sampling (Gs) cannot capture the spatial structure of the simulated fields either, due to missing semivariance estimates at short lags. Considering all combinations of extents and sample sizes (Fig. 4, Supplementary material B) provides further evidence that simple grid sampling (G) is inferior to all other designs. Again, this is because of missing data at short lags (Fig. 6).
A closer look at the variogram parameter estimates (Fig. 5, Supplementary material B) allows a further assessment of the sampling designs. Considering a sample size ≥150, designs G, Gs and R estimated the sill slightly better than design T. The nugget and the range, however, were estimated best with design T. The superior performance of design T is related to the sampling of several short lag distances, which allowed capturing the small-scale variation typical for throughfall data (cf. Keim et al., 2005; Möttönen et al., 1999; Zimmermann and Zimmermann, 2014). Somewhat surprisingly, random sampling (R) often estimated the nugget and sill as well as or even better than design Gs. This is because few data pairs in multiple short lag distances (design R, Fig. 6) seem to be better than comparatively many pairs in just one short distance (design Gs, Fig. 6). In practice, however, simple random sampling has the disadvantage that the distribution of points among lag classes cannot be controlled.
Our results (Fig. 5, Supplementary material C) and data of previous studies (Corsten and Stein, 1994) indicate that the
Table 2
Comparison of sampling designs and throughfall autocorrelation ranges from studies conducted in a variety of forest ecosystems.
References | Forest type, location | Extent (a) [m²] | N (b) | Support (c) [cm²] | Min. distance [m] | Sampling design | Effective range (d) [m]
Bellot and Escarre (1998) | Holm-oak forest, NE Spain | 950 | 50 | ? | ? | Random | Pure nugget
Fathizadeh et al. (2014) | Persian oak trees, W Iran | 60 | 16 | 64 | 0.5 | Radial | 5–6
Gerrits et al. (2010) | 120-year-old beech forest, Luxembourg | 596 | 81 | 400 | 2 | Grid | 6–7
Gómez et al. (2002) | Olive trees, Spain | 36 | 36 | 113 | 0.5 | Grid | 1.5–4.5
Hsueh et al. (2016) | 42-year-old mixed-hardwood forest | 484 | 100 | 145 | ? | Random | 2–6
Keim et al. (2005) | 60-year-old conifer forest, W USA | 225 | 94 | 9 | ? | Random | 5 (e)
Keim et al. (2005) | Old conifer forest, W USA | 900 | 94 | 9 | ? | Random | 5 (e)
Keim et al. (2005) | 60-year-old deciduous forest, W USA | 304 | 94 | 9 | ? | Random | 10 (e)
Loescher et al. (2002) | Tropical rain forest, Costa Rica | 13,273 | 56 | 95 | 5 | Radial and short distances | 43
Loustau et al. (1992) | 18-year-old pine stand, S France | 2500 | 52 | 707 | ? | Stratified random | Pure nugget
Möttönen et al. (1999) | 150 to 200-year-old Scots pine forest, E Finland | 10,000 | 181 | 50 | 1.25 | Grids with different distances | 9
Ritter and Regalado (2010) | Evergreen tree heath-laurel forest, Canary islands | ? | 22 | 214 | 1 | Transect | Pure nugget
Staelens et al. (2006) | 85-year-old beech tree, Belgium | 180 | 48 | 460 | 0.75 | Grid and transect | 3–4
Staelens et al. (2006) | 85-year-old beech tree, Belgium | 180 | 50 | 158 | 0.25 | Transect | No stable sill
Ziegler et al. (2009) | Montane forest, Thailand | 500 | 20 | 540 | 2 | Stratified random | Sill not reached
Zimmermann et al. (2009) | Tropical rain forest, Panama | 10,000 | 220 | 113 | 1 | Stratified random and short distances | Mostly pure nugget (f)
Zimmermann and Zimmermann (2014) | Teak plantation | 10,000 | 350 | 113 | 0.1 | Stratified random and short distances | 2.6
Zimmermann and Zimmermann (2014) | Young secondary forest | 10,000 | 350 | 113 | 0.1 | Stratified random and short distances | 5.3
Zimmermann and Zimmermann (2014) | Old secondary forest | 10,000 | 350 | 113 | 0.1 | Stratified random and short distances | 3.9
(a) Size of sampling area.
(b) Sample size.
(c) Receiving area of collector.
(d) The effective range Reff is calculated as follows. Exponential model: Reff = range × 3. Spherical model: Reff = range.
(e) Keim et al. (2005) estimated the range visually from the experimental variogram; no variogram model was fitted.
(f) Zimmermann et al. (2009) detected a pure nugget structure in 82% of the analyzed events; the remaining events showed weak structures (i.e. high nugget/sill ratios) with highly variable effective ranges.
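The conversion in footnote d of Table 2 can be sketched as a small helper. The function names and the parameterization of the exponential model, γ(h) = nugget + partial sill × (1 − exp(−h / range)), are assumptions made for this illustration; under that parameterization the model reaches about 95% of its partial sill at three times the range parameter, which is the usual motivation for the factor of 3.

```python
import math

# Multipliers from range parameter to effective range, as in footnote d.
EFFECTIVE_RANGE_FACTOR = {"exponential": 3.0, "spherical": 1.0}

def effective_range(range_param, model):
    """Return the effective range for a fitted range parameter.

    Only the two models named in footnote d are supported; any other
    model name raises a ValueError.
    """
    try:
        return EFFECTIVE_RANGE_FACTOR[model] * range_param
    except KeyError:
        raise ValueError(f"unsupported variogram model: {model!r}")

def exponential_semivariance(h, nugget, partial_sill, range_param):
    """Exponential variogram model (assumed parameterization, see lead-in)."""
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))
```

For example, a fitted exponential range parameter of 2 m corresponds to an effective range of 6 m, and evaluating `exponential_semivariance` at that distance with a zero nugget and unit partial sill returns 1 − exp(−3).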
performance of sampling designs strongly depends on the ratio of the range to the extent of the study area. For data with a comparatively large range, random-, grid- and transect-based sampling designs differ little in their performance (Corsten and Stein, 1994). In contrast, the analysis of short-ranging autocorrelation structures may greatly benefit from sampling approaches that include multiple small distances (cf. Fig. 5, Supplementary material C). Because sampling with transects performs best for a range of extents and sample sizes (Fig. 5, Supplementary material C), this design is particularly advantageous when there is no prior knowledge on autocorrelation structures (cf. Lark, 2002a).
3.4. The role of the estimation method
Of the estimation methods, REML proved superior to MoM (Fig. 5, Supplementary material C). For instance, MoM-based estimates of the range often required 50–100 sampling points more to reach the same level of accuracy as obtained with REML (cf. Fig. 5, transect-based sampling). The superior performance of maximum likelihood-based variogram estimation has been found in previous studies too (Kerry et al., 2008; Pardo-Igúzquiza, 1998; Pardo-Igúzquiza and Dowd, 2013). Our data further indicate that REML seems to be particularly advantageous for spatial fields with a short autocorrelation distance (cf. Supplementary material C, transect-based sampling, fields 3, 4, 5). This observation corroborates findings of an earlier simulation study by Lark (2000b), who noted that variograms with weak spatial structures (short ranges and comparatively large nugget-to-sill ratios) were significantly better estimated by REML than with the MoM-based approach. Consequently, these results hint at the usefulness of REML for the spatial analysis of throughfall and other weakly structured data. The drawback of REML, however, is that it cannot deal with outlying values. If the removal of outliers is not justifiable or not effective, MoM is still an appropriate approach to analyze autocorrelation structures in throughfall data.
3.5. Implications for throughfall studies
A closer look at previous studies that analyzed throughfall spatial patterns reveals that sample sizes, sampling designs and extents of research areas differed widely in the past (Table 2). Most studies applied a sample size of less than 100; some studies even worked with fewer than 50 data points. Furthermore, all but one study applied MoM estimation. According to our data (Figs. 4 and 5), the results of most of the studies on throughfall spatial patterns are associated with large uncertainties. In fact, it is likely that some of the detected autocorrelation patterns rather reflect shortcomings of the sampling than true structures in throughfall. Consequently, the data of these studies have to be interpreted very carefully. Therefore, claims that the range can be linked to crown structure (Keim et al., 2005; Loescher et al., 2002) or to phenological dynamics in the canopy (Gerrits et al., 2010; Keim et al., 2005) need to be verified by other case studies and not taken as common knowledge.
From our point of view, understanding spatio-temporal patterns of hydrological processes in forests requires a much higher sampling effort than previously implemented. A combination of a sample size of 150–200, a sampling design that ensures the sampling of small distances and variogram estimation by REML seems to be a good compromise between accuracy and efficiency. Studies that rely on the MoM-based approach have to use at least 200 samples. Furthermore, the choice of the extent offers yet another possibility to improve variogram estimation. Very large ratios of the extent to the autocorrelation length should be avoided. At the same time, plots should extend over several times the "true" range to avoid biased estimates of the range (cf. Western and Blöschl, 1999).
4. Conclusions
In this study we employed three extents (plots with an edge length of 25 m, 50 m, and 100 m), four common sampling designs (grid, grid with additional sampling points, transect, and random sampling), a wide range of sample sizes (50, 100, 150, 200, 400), and two estimation methods (residual maximum likelihood (REML) vs. method-of-moments (MoM)) to investigate which sampling setup and methodology allows an optimum variogram estimation of non-Gaussian and weakly structured throughfall data. Our calculations are based on six simulated throughfall fields that we repeatedly sampled. Subsequently, we compared the obtained variogram parameter estimates against the true field parameters.
Our results provide evidence that the choice of the extent strongly influences variogram estimation. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Yet, the extent should be larger than several times the "true" range to avoid biased estimates of the range parameter. Our results further indicate that transect-based sampling is superior to both grid-based designs and to random sampling. Random sampling, in turn, outperformed the grid-based designs due to the inclusion of multiple small distances. Lastly, REML-based variogram estimation is superior to the widely used MoM approach.
For an optimum spatial analysis of throughfall data we recommend the following setup and methodology: an extent exceeding several times the "true" range, a minimum of 150 samples, transects covering several small (0.1–1.5 m) distances, and REML-based variogram estimation. In situations that do not allow the application of REML, MoM can be used but a minimum of 200 data points is required. The suggested data requirements double previous recommendations, which were based on normally distributed data with comparatively strong and long-ranging autocorrelation structures. Our calculations further indicate that most previous studies on throughfall autocorrelation structures are likely associated with large uncertainties. Therefore, we suggest that new case studies may verify previous results regarding the autocorrelation in throughfall data and its link to forest characteristics.
Acknowledgements
This research was supported by the German Research Foundation (ZI 1300/1-1). Field data collection was further supported by the Agua Salud Project, a research initiative sponsored by the HSBC Climate Partnership. We thank two anonymous reviewers and the associate editor for their constructive comments.
Appendix A. Supplementary material
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.jhydrol.2016.06.042.
References
Allen, S.T., Keim, R.F., McDonnell, J.J., 2015. Spatial patterns of throughfall isotopic composition at the event and seasonal timescales. J. Hydrol. 522, 58–66. http://dx.doi.org/10.1016/j.jhydrol.2014.12.029.
Bárdossy, A., Kundzewicz, Z.W., 1990. Geostatistical methods for detection of outliers in groundwater quality spatial fields. J. Hydrol. 115, 343–359. http://dx.doi.org/10.1016/0022-1694(90)90213-H.
Bellot, J., Escarre, A., 1998. Stemflow and throughfall determination in a resprouted Mediterranean holm-oak forest. Ann. Sci. For. 55, 847–865. http://dx.doi.org/10.1051/forest:19980708.
Blöschl, G., 1999. Scaling issues in snow hydrology. Hydrol. Process. 13, 2149–2175. http://dx.doi.org/10.1002/(SICI)1099-1085(199910)13:14/15<2149::AID-HYP847>3.0.CO;2-8.
Bogaert, P., Russo, D., 1999. Optimal spatial sampling design for the estimation of the variogram based on a least squares approach. Water Resour. Res. 35, 1275–1289. http://dx.doi.org/10.1029/1998WR900078.
Corsten, L.C.A., Stein, A., 1994. Nested sampling for estimating spatial semivariograms compared to other designs. Appl. Stoch. Models Data Anal. 10, 103–122. http://dx.doi.org/10.1002/asm.3150100205.
Cressie, N., 1985. Fitting variogram models by weighted least squares. J. Int. Assoc. Math. Geol. 17, 563–586. http://dx.doi.org/10.1007/BF01032109.
Cressie, N., Hawkins, D.M., 1980. Robust estimation of the variogram: I. Math. Geol. 12, 115–125. http://dx.doi.org/10.1007/BF01035243.
De Gruijter, J.J., Brus, D.J., Bierkens, M., Knotters, M., 2006. Sampling for Natural Resource Monitoring, first ed. Springer-Verlag, Berlin Heidelberg New York.
Diggle, P.J., Ribeiro, P.J., 2007. Model-based Geostatistics. Springer, New York.
Dowd, P.A., 1984. The variogram and kriging: robust and resistant estimators. In: Verly, G., David, M., Journel, A.G., Marechal, A. (Eds.), Geostatistics for Natural Resources Characterization. Springer, Netherlands, pp. 91–106.
Fathizadeh, O., Attarod, P., Keim, R.F., Stein, A., Amiri, G.Z., Darvishsefat, A.A., 2014. Spatial heterogeneity and temporal stability of throughfall under individual Quercus brantii trees. Hydrol. Process. 28, 1124–1136. http://dx.doi.org/10.1002/hyp.9638.
Gascuel-Odoux, C., Boivin, P., 1994. Variability of variograms and spatial estimates due to soil sampling: a case study. Geoderma 62, 165–182. http://dx.doi.org/10.1016/0016-7061(94)90034-5.
Gerrits, A.M.J., Pfister, L., Savenije, H.H.G., 2010. Spatial and temporal variability of canopy and forest floor interception in a beech forest. Hydrol. Process. 24, 3011–3025. http://dx.doi.org/10.1002/hyp.7712.
Gómez, J., Vanderlinden, K., Giráldez, J., Fereres, E., 2002. Rainfall concentration under olive trees. Agric. Water Manag. 55, 53–70. http://dx.doi.org/10.1016/S0378-3774(01)00181-0.
Hiemstra, P.H., Pebesma, E.J., Twenhöfel, C.J.W., Heuvelink, G.B.M., 2009. Real-time automatic interpolation of ambient gamma dose rates from the Dutch radioactivity monitoring network. Comput. Geosci. 35, 1711–1721. http://dx.doi.org/10.1016/j.cageo.2008.10.011.
Hsueh, Y., Allen, S.T., Keim, R.F., 2016. Fine-scale spatial variability of throughfall amount and isotopic composition under a hardwood forest canopy. Hydrol. Process. 30, 1796–1803. http://dx.doi.org/10.1002/hyp.10772.
Jian, X., Olea, R.A., Yu, Y.-S., 1996. Semivariogram modeling by weighted least squares. Comput. Geosci. 22, 387–397. http://dx.doi.org/10.1016/0098-3004(95)00095-X.
Journel, A.G., Huijbregts, C.J., 1978. Mining Geostatistics. Academic Press, London, New York, San Francisco.
Keim, R.F., Skaugset, A.E., Weiler, M., 2005. Temporal persistence of spatial patterns in throughfall. J. Hydrol. 314, 263–274. http://dx.doi.org/10.1016/j.jhydrol.2005.03.021.
Kerry, R., Ingram, B.R., Goovaerts, P., Oliver, M.A., 2008. How many samples are required to estimate a reliable REML variogram. In: Geostats 2008. Proceedings of the Eighth International Geostatistics Congress, pp. 1155–1160.
Kerry, R., Oliver, M.A., 2007. Determining the effect of asymmetric data on the variogram. I. Underlying asymmetry. Comput. Geosci. 33, 1212–1232. http://dx.doi.org/10.1016/j.cageo.2007.05.008.
Klos, P.Z., Chain-Guadarrama, A., Link, T.E., Finegan, B., Vierling, L.A., Chazdon, R., 2014. Throughfall heterogeneity in tropical forested landscapes as a focal mechanism for deep percolation. J. Hydrol. 519 (Part B), 2180–2188. http://dx.doi.org/10.1016/j.jhydrol.2014.10.004.
Lark, R.M., 2000a. A comparison of some robust estimators of the variogram for use in soil survey. Eur. J. Soil Sci. 51, 137–157. http://dx.doi.org/10.1046/j.1365-2389.2000.00280.x.
Lark, R.M., 2000b. Estimating variograms of soil properties by the method-of-moments and maximum likelihood. Eur. J. Soil Sci. 51, 717–728.
Lark, R.M., 2002a. Optimized spatial sampling of soil for estimation of the variogram by maximum likelihood. Geoderma 105, 49–80. http://dx.doi.org/10.1016/S0016-7061(01)00092-1.
Lark, R.M., 2002b. Modelling complex soil properties as contaminated regionalized variables. Geoderma 106, 173–190. http://dx.doi.org/10.1016/S0016-7061(01)00123-9.
Lark, R.M., Cullis, B.R., Welham, S.J., 2006. On spatial prediction of soil properties in the presence of a spatial trend: the empirical best linear unbiased predictor (E-BLUP) with REML. Eur. J. Soil Sci. 57, 787–799. http://dx.doi.org/10.1111/j.1365-2389.2005.00768.x.
Lloyd, C., Marques, A.D.O., 1988. Spatial variability of throughfall and stemflow measurements in Amazonian rainforest. Agric. For. Meteorol. 42, 63–73. http://dx.doi.org/10.1016/0168-1923(88)90067-6.
Loescher, H.W., Powers, J.S., Oberbauer, S.F., 2002. Spatial variation of throughfall volume in an old-growth tropical wet forest, Costa Rica. J. Trop. Ecol. 18, 397–407. http://dx.doi.org/10.1017/S0266467402002274.
Loustau, D., Berbigier, P., Granier, A., Moussa, F.E.H., 1992. Interception loss, throughfall and stemflow in a maritime pine stand. I. Variability of throughfall and stemflow beneath the pine canopy. J. Hydrol. 138, 449–467. http://dx.doi.org/10.1016/0022-1694(92)90130-N.
Matheron, G., 1962. Traité de géostatistique appliquée, tome i: Mémoires du bureau de recherches géologiques et minières. Paris Ed. Tech. 14.
Morris, M.D., 1991. On counting the number of data pairs for semivariogram estimation. Math. Geol. 23, 929–943. http://dx.doi.org/10.1007/BF02066733.
Möttönen, M., Järvinen, E., Hokkanen, T.J., Kuuluvainen, T., Ohtonen, R., 1999. Spatial distribution of soil ergosterol in the organic layer of a mature Scots pine (Pinus sylvestris L.) forest. Soil Biol. Biochem. 31, 503–516. http://dx.doi.org/10.1016/S0038-0717(98)00122-9.
Müller, W.G., Zimmerman, D.L., 1999. Optimal designs for variogram estimation. Environmetrics 10, 23–37. http://dx.doi.org/10.1002/(SICI)1099-095X(199901/02)10:1<23::AID-ENV333>3.0.CO;2-P.
Papritz, A., 2014. Georob: Robust Geostatistical Analysis of Spatial Data. Version 0.1-5.
Pardo-Igúzquiza, E., 1998. Maximum likelihood estimation of spatial covariance parameters. Math. Geol. 30, 95–108. http://dx.doi.org/10.1023/A:1021765405952.
Pardo-Igúzquiza, E., Dowd, P.A., 2013. Comparison of inference methods for estimating semivariogram model parameters and their uncertainty: the case of small data sets. Comput. Geosci. 50, 154–164. http://dx.doi.org/10.1016/j.cageo.2012.06.002.
Patterson, H.D., Thompson, R., 1971. Recovery of inter-block information when block sizes are unequal. Biometrika 58, 545–554. http://dx.doi.org/10.1093/biomet/58.3.545.
Pettitt, A.N., McBratney, A.B., 1993. Sampling designs for estimating spatial variance components. Appl. Stat. 42, 185. http://dx.doi.org/10.2307/2347420.
R Core Team, 2013. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Version 3.0.1, Vienna, Austria.
Ritter, A., Regalado, C.M., 2010. Investigating the random relocation of gauges below the canopy by means of numerical experiments. Agric. For. Meteorol. 150, 1102–1114. http://dx.doi.org/10.1016/j.agrformet.2010.04.010.
Russo, D., Jury, W.A., 1987. A theoretical study of the estimation of the correlation scale in spatially variable fields. 1. Stationary fields. Water Resour. Res. 23, 1257–1268. http://dx.doi.org/10.1029/WR023i007p01257.
Shachnovich, Y., Berliner, P.R., Bar, P., 2008. Rainfall interception and spatial distribution of throughfall in a pine forest planted in an arid zone. J. Hydrol. 349, 168–177. http://dx.doi.org/10.1016/j.jhydrol.2007.10.051.
Skøien, J.O., Blöschl, G., 2006. Scale effects in estimating the variogram and implications for soil hydrology. Vadose Zone J. 5, 153–167. http://dx.doi.org/10.2136/vzj2005.0069.
Staelens, J., De Schrijver, A., Verheyen, K., Verhoest, N.E.C., 2006. Spatial variability and temporal stability of throughfall water under a dominant beech (Fagus sylvatica L.) tree in relationship to canopy cover. J. Hydrol. 330, 651–662. http://dx.doi.org/10.1016/j.jhydrol.2006.04.032.
Van Groenigen, J.W., 1999. Constrained Optimisation of Spatial Sampling: A Geostatistical Approach. Wageningen.
Warrick, A.W., Myers, D.E., 1987. Optimization of sampling locations for variogram calculations. Water Resour. Res. 23, 496–500. http://dx.doi.org/10.1029/WR023i003p00496.
Webster, R., Oliver, M.A., 2007. Geostatistics for Environmental Scientists (Statistics in Practice), second ed. Wiley, Chichester.
Webster, R., Oliver, M.A., 1992. Sample adequately to estimate variograms of soil properties. J. Soil Sci. 43, 177–192. http://dx.doi.org/10.1111/j.1365-2389.1992.tb00128.x.
Western, A.W., Blöschl, G., 1999. On the spatial scaling of soil moisture. J. Hydrol. 217, 203–224. http://dx.doi.org/10.1016/S0022-1694(98)00232-7.
Whelan, M.J., Sanger, L.J., Baker, M., Anderson, J.M., 1998. Spatial patterns of throughfall and mineral ion deposition in a lowland Norway spruce (Picea abies) plantation at the plot scale. Atmos. Environ. 32, 3493–3501. http://dx.doi.org/10.1016/S1352-2310(98)00054-5.
Ziegler, A.D., Giambelluca, T.W., Nullet, M.A., Sutherland, R.A., Tantasarin, C., Vogler, J.B., Negishi, J.N., 2009. Throughfall in an evergreen-dominated forest stand in northern Thailand: comparison of mobile and stationary methods. Agric. For. Meteorol. 149, 373–384. http://dx.doi.org/10.1016/j.agrformet.2008.09.002.
Zimmermann, A., Zimmermann, B., 2014. Requirements for throughfall monitoring: the roles of temporal scale and canopy complexity. Agric. For. Meteorol. 189–190, 125–139. http://dx.doi.org/10.1016/j.agrformet.2014.01.014.
Zimmermann, A., Zimmermann, B., Elsenbeer, H., 2009. Rainfall redistribution in a tropical forest: spatial and temporal patterns. Water Resour. Res. 45, W11413. http://dx.doi.org/10.1029/2008WR007470.
Zimmermann, B., Zimmermann, A., Lark, R.M., Elsenbeer, H., 2010. Sampling procedures for throughfall monitoring: a simulation study. Water Resour. Res. 46, W01503. http://dx.doi.org/10.1029/2009WR007776.

