Professional Documents
Culture Documents
ABSTRACT: Although considerable theory exists for the probabilistic treatment of soils, the ability to identify
the nature of spatial stochastic soil variation is almost nonexistent. We all know that we could excavate an entire
site and there would be no doubt about the soil properties. However, there would no longer be anything to rest
our structure on, and so we must live with uncertainty and attempt to quantify it rationally. Twenty years ago
the mean and variance was sufficient. Clients are now demanding full reliability studies, requiring more so-
phisticated models, so that engineers are becoming interested in rational soil correlation structures. Knowing
that soil properties are spatially correlated, what is a reasonable correlation model? Are soils best represented
using fractal models or finite-scale models? What is the difference? How can this question be answered? Once
a model has been decided upon, how can its parameters be estimated? These are questions that this paper
Downloaded from ascelibrary.org by Shiv Nadar University on 01/18/23. Copyright ASCE. For personal use only; all rights reserved.
addresses by looking at a number of tools that aid in selecting appropriate stochastic models. These tools include
the sample covariance, spectral density, variance function, variogram, and wavelet variance functions. Common
models, corresponding to finite scale and fractal models, are investigated, and estimation techniques are dis-
cussed.
part of the soil properties at the target site may be grossly chemical and mechanical details of the soil particles them-
in error. selves), which is common to most soil properties and types.
• The use of only the residual process ε(z) at the target site Although it is undoubtedly true that many exceptions to the
will considerably underestimate the soil variability—the idea of an externally created correlation structure exist, the
reported statistics will be unconservative. In fact, the more idea nevertheless gives a reasonable starting point for this type
variability accounted for by m(z), the less variable is ε(z). of investigation. It allows the correlation structure derived
from a random field of a specific soil property to be used
without change as a reasonable a priori correlation structure
From these considerations, it is easily seen that trends that are for other soil properties of interest and for other similar sites
not likely to reappear, that is, trends that are not physically (or (although changes in the mean and possibly variance may, of
empirically) based and predictable, must not be removed prior course, be necessary).
to performing an inferential statistical analysis. The trend itself Fundamental to the following statistical analysis is the as-
is part of the uncertainty to be characterized and removing it sumption that the soil is spatially statistically homogeneous.
leads to unconservative reported statistics. This means that the mean, covariance, correlation structure,
It should be pointed out at this time that one of the reasons and higher-order moments are independent of position (and
for ‘‘detrending’’ the data is precisely to render the residual thus are the same from any reference origin). In the general
process largely spatially independent. This is desirable because case, isotropy is not usually assumed. That is, the vertical and
virtually all classical statistics are based on the idea that sam- horizontal correlation structures are allowed to be quite dif-
ples are composed of independent and identically distributed ferent. However, this is not an issue in this study because only
observations. Alternatively, when observations are dependent, the 1D case is considered.
the distributions of the estimators become very difficult to es- The assumption of spatial homogeneity does not imply that
tablish. This is compounded by the fact that the actual depen- the process is relatively uniform over any finite domain. It
dence structure is unknown. Only a limited number of asymp- allows apparent trends in the mean, variance, and higher-order
totic results are available to provide insight into the spatially moments as long as those trends are just part of a larger-scale
dependent problem (Beran 1994), and simulation techniques random fluctuation; that is, the mean, variance, etc. need only
are proving very useful in this regard (Cressie 1993). be constant over infinite space, not when viewed over a finite
Another issue to be considered is the level of information distance. (Fig. 1 may be viewed as an example of a homo-
available at the target site. Generally, a design does not pro- geneous random field that appears nonstationary when viewed
ceed in the complete absence of site information. The ideal locally.) Thus, this assumption does not preclude large-scale
case involves gathering enough data to allow the main char- variations, such as those often found in natural soils, although
acteristics, for instance, the mean and variance, of the soil the statistics relating to the large-scale fluctuations are gener-
property to be established with reasonable confidence. Then, ally harder to estimate reliably from a finite-sampling domain.
inferred statistics regarding the spatial correlation (where cor- The assumption of spatial homogeneity does, however,
relation means correlation coefficient throughout this paper) seem to imply that the site over which the measurements are
structure can be used to complete the uncertainty picture and taken is fairly uniform in geological makeup (or soil type).
allow a reasonable reliability analysis or internal linear esti- Again, this assumption relates to the level of uncertainty about
mation. Under this reasoning, this paper will concentrate on the site for which the random model is aimed. Even changing
statistics relating to the spatial correlation structure of a soil. geological units may be viewed as simply part of the overall
Although the mean and variance will be obtained along the randomness or uncertainty, which is to be characterized by the
way as part of the estimation process, these results tend to be random model. The more that is known about a site, the less
specifically related to, and affected by, the soil type; this is random the site model should be. However, the initial model
that is used before significant amounts of data are explicitly
true particularly of the mean. The correlation structure is be-
gathered should be consistent with the level of uncertainty at
lieved to be more related to the formation process of the soil;
the target site at the time the model is applied. Bayesian up-
that is, the correlation between soil properties at two disjoint
dating can be used to improve a prior model under additional
points will be related to where the materials making up the site data.
soil at the two points originated and to the common weathering With these thoughts in place, an appropriate inferential anal-
processes experienced at the two points (i.e., geological dep- ysis proceeds as follows:
osition processes, etc.). Thus, the major factors influencing a
soil’s correlation structure can be thought of as being ‘‘exter- 1. An initial regression analysis may be performed to de-
nal’’ (i.e., related to transport and weathering rather than to termine if a statistically significant spatial trend is pres-
JOURNAL OF GEOTECHNICAL AND GEOENVIRONMENTAL ENGINEERING / JUNE 1999 / 471
() = exp 冉 冊
⫺
2兩兩
(2)
冘
the Second-Order Structural Analysis section takes a qualita- n⫺j ⫹1
tive look at a number of tools that reveal something about 1
Ĉ(j) = (x i ⫺
ˆ X)(x i ⫹ j ⫺1 ⫺
ˆ X), j = 1, 2, . . . , n (7)
the second-moment behavior of a 1D random process. The n i =1
intent of that section is to evaluate these tools with respect to
their ability to discern between finite scale and fractal behav- where the lag j = ( j ⫺ 1)⌬z. Notice that Ĉ(0) is the same as
ior. In the Estimation of First- and Second-Order Statistical the estimated variance ˆ 2X. The sample correlation is obtained
Parameters section, various maximum likelihood approaches by normalizing
to the estimation of the parameters for the finite-scale and frac- Ĉ(j)
tal models are given. Finally, the results are summarized in ˆ (j) = (8)
the Conclusions with a view toward their use in developing a Ĉ(0)
priori soil statistics from a large geotechnical database. One of the major difficulties with the sample correlation func-
tion resides in the fact that it is heavily dependent on the
SECOND-ORDER STRUCTURAL ANALYSIS estimated mean ˆ X. When the soil shows significant long-scale
Attention is now turned to the stochastic characterization of dependence, characterized by long-scale fluctuations (see, e.g.,
the soil data itself. Aside from estimating the mean and vari- Fig. 1), then
ˆ X is almost always a poor estimate of the true
ance, the spatial correlation structure must be deduced. In the mean. In fact, it is not too difficult to show that although the
following it is assumed that the data, xi , i = 1, 2, . . . , n, are mean estimator [(5)] is unbiased, its variance is given by
collected at a sequence of equispaced points along a line and
冋 冘冘 册
n n
1
that the best stochastic model along that line is to be found. Var[
ˆ X] = (i⫺j) 2X = ␥n2X ⯝ ␥(D)2X (9)
Note that the xi may be some suitable transformation of the n2 i =1 j =1
actual data derived from the samples. In the following, xi is where D = (n ⫺ 1)⌬z = sampling domain size (interpreted as
an observation of the random process X i = X(zi), where z is an the region defined by n equisized ‘‘cells,’’ each of width ⌬z
index (commonly depth) and zi = (i ⫺ 1)⌬z, i = 1, 2, . . . , n. centered on an observation); ␥(D) = variance function
A variety of tools will be considered in this section and their (VanMarcke 1984) that gives the variance reduction due to
ability to identify the most appropriate stochastic model for Xi averaging over the length D
冕冕 冕
will be discussed. In particular, interest focuses on whether the D D D
process X(z) is finite scaled or fractal in nature. The perfor- 1 2
mance of the various tools in answering this question will be ␥(D) = ( ⫺ s) d ds = (T ⫺ )() d (10)
D2 0 0 D2 0
evaluated via simulation employing 2,000 realizations of fi-
nite-scale (Markov) [(2)] and fractal [(4)] processes. Each sim- The discrete approximation to the variance function, denoted
ulation is 20.48 m in length with ⌬z = 0.02, so that n = 1,024, ␥n in (9), approaches ␥(D) as n becomes large. For highly
and realizations are produced via covariance matrix decom- correlated soil samples (over the sampling domain), ␥(D) re-
position (Fenton 1994)—a method that follows from a Cho- mains close to 1.0, so that
ˆ X remains highly variable, almost
leski decomposition. Unfortunately, large covariance matrices as variable as X(z) itself. Notice that the variance of ˆ X is
are often nearly singular and so are difficult to decompose unknown because it depends on the unknown correlation struc-
correctly. Because the covariance matrix for a 1D equispaced ture of the process.
random field is symmetric and Toeplitz (i.e., the entire matrix In addition, it can be shown that Ĉ(j) is biased according
is known if only the first column is known—all elements to the following (VanMarcke 1984):
冉 冊
along each diagonal are equal), the decomposition is done us-
ˆ j)] ⯝ 2X n⫺j⫹1
ing the numerically more accurate Levinson-Durbin algorithm E[C( [(j) ⫺ ␥(D)] (11)
[see Marple (1987) and Brockwell and Davis (1987)]. n
where, again, the approximation improves as n increases. From
Sample Correlation Function this it can be seen that
冉 冊冉 冊
The classical sample average of x i is computed as ˆ j)] n⫺j⫹1 (j) ⫺ ␥(D)
冘
E[C(
n E[ˆ (j)] ⯝ ⯝ (12)
1 ˆ
E[C(0)] n 1 ⫺ ␥(D)
ˆX = xi (5)
n i =1
using a first-order approximation. For soil samples that show
and the sample variance as considerable serial correlation, ␥(D) may remain close to 1,
JOURNAL OF GEOTECHNICAL AND GEOENVIRONMENTAL ENGINEERING / JUNE 1999 / 473
and generally the term (j) ⫺ ␥(D) will become negative for The major difference between V̂(j) and ˆ (j) is that the
all j* ⱕ j ⱕ n for some j* less than n. What this means is semivariogram does not depend on ˆ X. This is a clear advan-
that the estimator ˆ () will typically dip below zero even when tage because many of the problems of the correlation function
the field is actually highly positively correlated. relate to this dependence. In fact, it is easily shown that the
Another way of looking at this problem is as follows. Con- semivariogram is an unbiased estimator with E[V( ˆ j)] =
sider the sample shown in Fig. 1 and assume that the apparent (1/2)E[(X i⫹j ⫺ X i)2]. Fig. 5 shows how this estimator behaves
trend is in fact just part of a long-scale fluctuation. Clearly, for finite-scale and fractal processes. Notice that the finite-
Fig. 1 is then a process with a large scale of fluctuation, com- scale semivariogram rapidly increases to its limiting value (the
pared with D. The local average ˆ X is estimated and shown variance) and then flattens out whereas the fractal process
by the dashed line in Fig. 3. Now, for any greater than about leads to a semivariogram that continues to increase gradually
half of the sampling domain, the product of the deviations throughout. This behavior can indicate the underlying process
from ˆ X in (7) will be negative. This means that the sample type and allow identification of a suitable correlation model.
correlation function will decrease rapidly and become negative Noteworthy, however, is the very wide range between the ob-
somewhere before = (1/2)D even though the true correlation served minimum and maximum (the maximum going off the
function may remain much closer to 1.0 throughout the sam- plot but having maximum values in the range from 5 to 10 in
ple. both cases). The high variability in the semivariogram may
It should be noted that if the sample does in fact come from hinder its use in discerning between model types unless suf-
a short-scale process, with << D, the variability of (9) and ficient averaging can be performed.
the bias of (12) largely disappear because ␥(D) ⯝ 0. This The semivariogram finds its primary use in mining geosta-
means that the sample correlation function is a good estimator tistics applications [see, e.g., Journel and Huijbregts (1978)].
of short-scale processes as long as << D. However, if the Cressie (1993) discussed some of its distributional character-
process does in fact have long-scale dependence, then the cor- istics along with robust estimation issues, but little is known
relation function cannot identify this and in fact continues to about the distribution of the semivariogram when X(z) is spa-
illustrate short-scale behavior. In essence, the estimator is anal- tially dependent. Without the estimator distribution, the
ogous to a self-fulfilling prophecy: It always appears to justify semivariogram cannot easily be used to test rigorously be-
its own assumptions. tween competing model types (as in fractal versus finite scale),
Fig. 4 illustrates the situation graphically using simulations nor can it be used to fit model parameters using the maximum
from finite-scale and fractal processes. The dashed lines show likelihood method. The latter is one of the goals of this paper.
the maximum and minimum correlations observed at each lag For these reasons, the semivariogram will not be pursued fur-
over the 2,000 realizations. The finite-scale ( = 3) simulation ther in this paper, although indications are that such a pursuit
shows reasonable agreement between ˆ () and the true corre- may be warranted.
lation because << D ⯝ 20. However, for the fractal process
(H = 0.95) there is a very large discrepancy between the es- Sample Variance Function
timated average and true correlation functions. Clearly, the
sample correlation function fails to provide any useful infor- The variance function measures the decrease in the variance
mation about large-scale or fractal processes. of an average as an increasing number of sequential random
variables are included in the average. If the local average of
Sample Semivariogram a random process XD is defined by
冕
The semivariogram [half of the variogram, as defined by D
are related according to then the variance of XD is just ␥(D )2X. In the discrete case,
1 which will be used here, this becomes
冘 冘
V(j) = E[(X i⫹j ⫺ X i)2] = 2X (1 ⫺ ()) (13) n n
2 1 1
X̄ =
ˆX = X(zi) = Xi (16)
The sample semivariogram is defined by n i =1 n i =1
冘
n⫺j
1 where Var[X̄ ] = ␥nX2 ; and ␥n, defined by (9), is the discrete
V̂(j) = (x i⫹j ⫺ x i)2, j = 0, 1, . . . , n ⫺ 1 (14)
2(n ⫺ j) i =1 approximation of ␥(D) (see Sample Correlation Function sub-
474 / JOURNAL OF GEOTECHNICAL AND GEOENVIRONMENTAL ENGINEERING / JUNE 1999
FIG. 4. Sample Correlation Functions for Models Averaged Over 2,000 Realizations: (a) Finite Scale ( = 3); (b) Fractal (H = 0.95)
section). If the X i values are independent and identically dis- It should be noted that ␥ˆ 1 = 1 as the sum of (18) is the same
tributed then ␥n = 1/n, whereas if X1 = X2 = ⭈ ⭈ ⭈ = X n, then the as that used to find ˆ 2X when i = 1. Also, when i = n, the
values of X are completely correlated and ␥n = 1 so that av- sample variance function ␥ˆ n = 0 because X n, j = X n,1 =
ˆ X. Thus,
eraging does not lead to any variance reduction. In general, the sample variance function always connects the points ␥1 =
for correlation functions that remain nonnegative, 1/n ⱕ ␥n 1 and ␥n = 0.
ⱕ 1. Unfortunately, the sample variance function is biased, and
Conceptually, the rate at which the variance of an average its bias depends on the degree of correlation between obser-
decreases with averaging size states the spatial correlation vations. Specifically, it can be shown that
structure. In fact these are equivalent because in the 1D (con-
tinuous case) ␥i ⫺ ␥n
E[␥ˆ i] ⯝ (20)
冕
D 1 ⫺ ␥n
2 1 ⭸ 2
冘
n⫺i⫹1
1 to become unbiased.
␥ˆ i = 2 (X i, j ⫺
ˆ X)2, i = 1, 2, . . . , n (18) Figs. 6(a and b) show sample variance functions averaged
ˆ X(n ⫺ i ⫹ 1) j =1
over 2,000 simulations of finite-scale and fractal random pro-
where X i, j = local average as follows: cesses, respectively. There is very little difference between the
冘
j ⫹i ⫺1 estimated variance function in the two plots, despite the fact
1 that they come from quite different processes. Clearly, the es-
X i, j = Xk, j = 1, 2, . . . n ⫺ i ⫹ 1 (19)
i k=j timate of the variance function in the fractal case is highly
JOURNAL OF GEOTECHNICAL AND GEOENVIRONMENTAL ENGINEERING / JUNE 1999 / 475
FIG. 5. Variograms for Models Averaged Over 2,000 Realizations: (a) Finite Scale ( = 3); (b) Fractal (H = 0.95)
biased. Thus, the variance function plot does not appear to be they are self-similar in nature [i.e., all wavelets look the same
a good identification tool and is really only a useful estimate when viewed at the appropriate scale (which, in the above
of second-moment behavior for the finite-scale process with definition, is some power of 2)]. The random process X(z) is
<< D. In the plots, T = i⌬z for i = 1, 2, . . . , n. then expressed as a linear combination of various scalings,
冘冘
translations, and dilations of a common ‘‘shape.’’ Specifically
Wavelet Coefficient Variance
X(z) = X mj mj (z) (22)
The wavelet basis has attracted much attention in recent m j
years in areas of signal analysis, image compression, and If the wavelets are suitably selected to be orthonormal, then
among other things, fractal process modeling (Wornell 1996). the coefficients can be found through the inversion
It can basically be viewed as an alternative to Fourier decom-
冕
⬁
position except that sinusoids are placed by ‘‘wavelets’’ that m
act only over a limited domain. 1D wavelets are usually de- X =
j X(z)mj (z) dz (23)
⫺⬁
fined as translations along the real axis and dilations (scalings)
of a ‘‘mother wavelet,’’ as in for which highly efficient numerical solution algorithms exist.
The details of the wavelet decomposition will not be discussed
mj (z) = 2m/2(2mz ⫺ j) (21)
in this paper. [The interested reader should see, for example,
where m and j = dilation and translation indices, respectively. Strang and Nguyen (1996).]
The appeal to using wavelets to model fractal processes is that A theorem by Wornell (1996) states that, under reasonably
476 / JOURNAL OF GEOTECHNICAL AND GEOENVIRONMENTAL ENGINEERING / JUNE 1999
FIG. 6. Sample Variance Functions for Models Averaged Over 2,000 Realizations: (a) Finite Scale ( = 1); (b) Fractal (H = 0.98)
general conditions, if the coefficients X mj are mutually uncor- against the scale index m for the finite-scale and fractal sim-
related, zero-mean random variables with variances ulation cases. The fractal simulations yield a straight line, as
expected, whereas the finite-scale simulations show a slight
2m = Var[X mj ] = 22⫺␥m (24) flattening of the variance at lower values of m (larger scales).
The lowest value of m = 1 is not plotted because the variance
then X(z) obtained through (22) will have a spectrum that is of this estimate is very large, and it appears to suffer from the
very nearly fractal. Furthermore, Wornell made theoretical and same bias as estimates of the spectral density function at =
simulation based arguments showing that the converse is also 0 (as discussed later). On the basis of Fig. 7, it appears that
approximately true, namely, that if X(z) is fractal with spectral the wavelet coefficient variance plot may have some potential
density proportional to ⫺␥, then the coefficients X mj will be in identifying an appropriate stochastic model, although the
approximately uncorrelated with variance given by (24). If this difference in the plots is quite small. Confident conclusions
is the case, then a plot of ln(Var[X mj ]) versus the scale index will require a large data set.
m will be a straight line. If it turns out that X(z) is fractal and Gaussian, then the
Using a fifth-order Daubechies wavelet basis, Fig. 7 shows coefficients X mj are also Gaussian and (largely) uncorrelated as
a plot of the estimated wavelet coefficient variances ˆ 2m, where discussed above. This means that a maximum likelihood es-
冘
2m⫺1 timation can be performed to evaluate the spectral exponent ␥
1 by looking at the likelihood of the computed set of coefficients
ˆ 2m = (x mj )2 (25)
2m⫺1 j =1 x mj .
JOURNAL OF GEOTECHNICAL AND GEOENVIRONMENTAL ENGINEERING / JUNE 1999 / 477
FIG. 7. Sample Wavelet Coefficient Variance Functions for Models Averaged Over 2,000 Realizations: (a) Finite Scale ( = 3); (b) Frac-
tal (H = 0.95)
Sample Spectral Density Function exponentially distributed with means equal to the true one-
sided spectral density G(j) [see Beran (1994)]. VanMarcke
The sample spectral density function, referred to here also
(1984) showed that the periodogram itself has a nonzero scale
as the periodogram despite its slightly nonstandard form, is
of fluctuation when D = n⌬z is finite, equal to 2/D. This
obtained by first computing the Fourier transform of the data
suggests the presence of serial correlation between periodo-
冘
n⫺1
1 gram estimators. However, because the periodogram estimates
(j) = X(tk)e⫺ij k (26) at Fourier frequencies are separated by 2/D, they are, there-
n k =0
fore, approximately independent, according to the physical in-
at each Fourier frequency j = 2j/D, j = 0, 1, . . . , (n ⫺ 1)/2. terpretation of the scale of fluctuation distance. The indepen-
This is efficiently achieved using the fast Fourier transform. dence and distribution have also been shown by Yajima (1989)
The periodogram is then given by the squared magnitude of to hold for both fractal and finite-scale processes. Armed with
the complex Fourier coefficients according to this distribution on the periodogram estimates, one can per-
D form maximum likelihood estimation as well as (conceptually)
Ĝ(j) = 兩(j)兩2 (27) hypothesis tests. If the periodogram is smoothed using some
sort of smoothing window, as discussed by Priestley (1981),
where D = n⌬z. For stationary processes with finite variance, the smoothing may lead to loss of independence between es-
the periodogram estimates as defined here are independent and timates at sequential Fourier frequencies so that likelihood ap-
478 / JOURNAL OF GEOTECHNICAL AND GEOENVIRONMENTAL ENGINEERING / JUNE 1999
FIG. 8. Periodograms for Models Averaged Over 2,000 Realizations: (a) Finite Scale ( = 3); and (b) Fractal (H = 0.95)
proaches become complicated. In this sense, it is best to c, so that a log-log plot of the sample spectral density function
smooth the periodogram (which is notoriously rough) by av- of a fractal process will be a straight line with a slope of ⫺␥.
eraging over an ensemble of periodogram estimates taken from Fig. 8 illustrates how the periodogram behaves when averaged
a sequence of realizations of the random process, where avail- over both fractal and finite-scale simulations. The periodogram
able. is a straight line with a negative slope in the fractal case and
It should be noted that the periodogram estimate at = 0 becomes more flattened at the origin in the finite-scale case,
is not a good estimator of G(0), and so it should not be in- as was observed for the wavelet variance plot. Again, the dif-
cluded in the periodogram plot. In fact, the periodogram es- ference is only slight, so that a fairly large data set is required
timate at = 0 is biased with E[Ĝ(0)] = G(0) ⫹ n2X /(2) to decide on a model with any degree of confidence.
(Brockwell and Davis 1987). Recalling that X is unknown,
and its estimate highly variable when strong correlation exists, ESTIMATION OF FIRST- AND SECOND-ORDER
the estimate Ĝ(0) should not be trusted. In addition, its distri- STATISTICAL PARAMETERS
bution is no longer a simple exponential.
The easiest way to determine whether the data are fractal in Upon deciding on whether a finite-scale or fractal model is
nature is to look directly at a plot of the periodogram. Fractal more appropriate in representing the soil data, the next step is
processes have spectral density functions of the form G() ⬀ to estimate the pertinent parameters. In the case of the finite-
⫺␥ for ␥ > 0. Thus, ln G() = c ⫺ ␥ ln , for some constant scale model, the parameter of interest is the scale of fluctuation
JOURNAL OF GEOTECHNICAL AND GEOENVIRONMENTAL ENGINEERING / JUNE 1999 / 479
area up to when the function first becomes negative (the where C = covariance matrix between the observation; Cij =
Downloaded from ascelibrary.org by Shiv Nadar University on 01/18/23. Copyright ASCE. For personal use only; all rights reserved.
scale of fluctuation is not well defined for oscillatory cor- E[(X i ⫺ i)(X j ⫺ j)]; 兩C兩 = determinant of C; and = vector
relation functions—other parameters may be more appro- of means corresponding to each observation location. In the
priate if this is the case). It should also be noted that following, the data are assumed to be modeled by a stationary
correlation estimates lying within the band ⫾2n⫺1/2 are random field, so that the mean is spatially constant and =
commonly deemed to be not significantly different than X 1, where 1 is a vector with all elements equal to 1. Again,
zero [see Priestley (1981, p. 340) and Brockwell and Da- if this assumption is not deemed warranted, a deterministic
vis (1987, Chapter 7)]. trend in the mean and variance can be removed from the data
• Use regression to fit a correlation function to ˆ () or a prior to the following statistical analysis through the transfor-
semivariogram to V̂(). For certain assumed correlation or mation
semivariogram functions, this regression may be nonlin-
ear in . x(z) ⫺ m(z)
x⬘(z) =
• If the sampling domain D is deemed to be much larger s(z)
than the scale of fluctuation, then the scale can be esti- where m(z) and s(z) = deterministic spatial trends in the mean
mated from the variance function using an iterative tech- and standard deviation, respectively, possibly obtained by re-
nique such as that suggested by VanMarcke (1984, p. gression analysis of the data. Recall, however, that this is gen-
337). erally only warranted if the same trends are expected at the
• Assuming a joint distribution for X(tj) with corresponding target site.
correlation function model, estimate unknown parameters Also due to the stationarity assumption, the covariance ma-
(X , 2X, and correlation function parameters) using max- trix can be written in terms of the correlation matrix as
imum likelihood in the space domain.
• Using the established results regarding the joint distribu- C = 2X (29)
tion for periodogram estimates at the set of Fourier fre-
quencies, an assumed spectral density function can be where = function only of the unknown correlation function
‘‘fit’’ to the periodogram using maximum likelihood. parameter . If the correlation function has more than one
parameter, then is treated as a vector of unknown parameters
Because of the reasonably high bias in the sample correlation and the maximum likelihood will generally be found via a
(or covariance) function estimates, even for finite-scale pro- gradient or grid search in these parameters. With (29), the
cesses, using the sample correlation directly to estimate will likelihood function of (28) can be written as
not be pursued here. The variance function techniques have
not been found by the writer to be particularly advantageous
over, for example, the maximum likelihood approaches and
L(x兩) =
1
(2 ) 兩兩1/2
2 n/2
X
exp 再 ⫺
2X2
冎
(x ⫺ )T⫺1(x ⫺ )
(30)
are also prone to error due to their high bias in long-scale or Because the likelihood function is strictly nonnegative, maxi-
fractal cases. Here, the maximum likelihood estimator in the mizing L(x兩) is equivalent to maximizing its logarithm,
space domain will be discussed briefly. The maximum likeli- which, ignoring constants, is given by
hood estimator in the frequency domain will be considered
later. n 1 (x ⫺ )T⫺1(x ⫺ )
ᏸ(x兩) = ⫺ ln 2X ⫺ ln兩兩 ⫺ (31)
It will be assumed that the data are normally distributed, or 2 2 2X2
have been transformed from their raw state into something that
is at least approximately normally distributed. For example, The maximum of (31) can in principle be found by differen-
many soil properties are commonly modeled using the log- tiating with respect to each unknown parameter, X , 2X , and
normal distribution, often primarily because this distribution , in turn and setting the results to zero. This gives three equa-
is strictly nonnegative. To convert lognormally distributed data tions in three unknowns. The partial derivative of ᏸ with re-
to a normal distribution, it is sufficient merely to take the nat- spect to X , when set equal to zero, leads to the following
ural logarithm of the data prior to further statistical analysis. estimator for the mean
It should be noted that the normal model is commonly used
1T⫺1x
for at least two reasons: (1) It is analytically tractable in many
ˆX = (32)
ways; and (2) it is completely defined through knowledge of 1T⫺11
the first two moments, namely the mean and covariance struc- Because this estimator still involves the unknown correlation
ture. Because other distributions commonly require higher mo- matrix, it should be viewed as the value of X that maximizes
ments and because higher moments are generally quite difficult the likelihood function for a given value of the correlation
to estimate accurately, particularly in the case of geotechnical parameter . If the two vectors r and s are solutions of the
samples that are typically limited in size, the use of other dis- two systems of equations
tributions is often difficult to justify. The normal assumption
can be thought of as a minimum knowledge assumption that r = x; s = 1 (33a,b)
冉 冊
anyhow.
The partial derivative of ᏸ with respect to 2X , when set 1
equal to zero, leads to the following estimator for 2X ⫺1 =
1 ⫺ q2
1
ˆ 2X = (x ⫺
ˆ X 1)Tr (35)
Downloaded from ascelibrary.org by Shiv Nadar University on 01/18/23. Copyright ASCE. For personal use only; all rights reserved.
n 1 ⫺q 0 0 ⭈⭈⭈ 0 0
⫺q 1 ⫹ q2 ⫺q 0 ⭈⭈⭈ 0 0
which is also implicitly dependent on the correlation function 0 ⫺q 1 ⫹ q2 ⫺q ⭈⭈⭈ 0 0
parameter through ˆ X and r. ⭈ 0 0 ⫺q 1 ⫹ q2 ⭈⭈⭈ 0 0
Thus, both the mean and variance estimators can be ex- ⭈⭈ ⭈⭈ ⭈⭈ ⭈⭈ ⭈⭈ ⭈⭈⭈ ⭈⭈⭈
pressed in terms of the unknown parameter . Using these ⭈ ⭈ ⭈ ⭈ ⭈
0 0 0 0 ⭈⭈⭈ 1 ⫹ q2 ⫺q
results, the maximization problem simplifies to finding the 0 0 0 0 ⭈⭈⭈ ⫺q 1
maximum of (38)
冘
n
volves differentiating the determinant of the correlation matrix,
and closed form solutions do not always exist. Solution of the b3 = (n ⫺ 1)R⬘;
0 R0 = (x i ⫺
ˆ X)2 (40d,e)
i =1
maximum likelihood estimators may therefore proceed by it-
eration as follows: R⬘0 = R 0 ⫺ (x1 ⫺
ˆ X)2 ⫺ (x n ⫺
ˆ X)2 (40f )
1. Guess at an initial value for .
冘
n⫺1
is both symmetric and Toeplitz (i.e., elements along each For given q, the corresponding maximum likelihood estimates
diagonal are equal). of X and 2X are
3. Solve (33) for vectors r and s.
4. Solve for the determinant of (because this is often van- Q n ⫺ q(Q n ⫹ Q⬘)
n ⫹ q Q⬘
2
n
ishing, it is usually better to compute the log-determinant ˆX = (41a)
n ⫺ 2q(n ⫺ 1) ⫹ q (n ⫺ 2)
2
Qn = x i; Q⬘n = Q n ⫺ x1 ⫺ x n (42a,b)
Guesses for can be arrived at simply by stepping discretely i =1
through a likely range (and the speed of modern computers
make this a reasonable approach), increasing the resolution in According to Anderson (1971), (39) will have one root be-
the region of located maxima. Alternatively, more sophisti- tween 0 and 1 (for positive R1, which is n times the lag 1
cated techniques may be employed that look also at the mag- covariance and should be positive under this model) and two
nitude of the likelihood in previous guesses. One advantage to roots outside the interval (⫺1, 1). The root of interest is the
the brute force approach of stepping along at predefined in- one lying between 0 and 1 and it can be efficiently found using
crements is that it is more likely to find the global maximum Newton-Raphson iterations with starting point q = R1 /R⬘, 0 as
in the event that multiple local maxima are present. With the long as that starting point lies within (0, 1) (if not, use starting
speed of modern computers, this approach has been found to point q = 0.5).
be acceptably fast for a low number of unknown parameters, Because the coefficients of the cubic depend on ˆ X , which
for instance, less than four, and where bounds on the param- in turn depends on q, the procedure actually involves a global
eters are approximately known. iteration outside the Newton-Raphson root finding iterations.
For large samples, the correlation matrix can become However, ˆ X changes only slightly with changing q, so global
nearly singular, so that numerical calculations become unsta- convergence is rapid if it is bothered with at all. Once the root
ble. In the 1D case, the Durbin-Levinson recursion, taking full q of (39) has been determined, the maximum likelihood esti-
advantage of the Toeplitz character of , yields a faster and mate of the scale of fluctuation is determined from
JOURNAL OF GEOTECHNICAL AND GEOENVIRONMENTAL ENGINEERING / JUNE 1999 / 481
⭸
ᏸ⬘ = ᏸ (44) at both ends of the spectrum and both an upper and lower
⬃ ⭸j cutoff are needed. The appropriate cutoff frequencies should
where ᏸ is the log-likelihood function defined by (31), then be selected on the basis of the following:
the matrix Cˆ is given by the inverse of the Fisher information
matrix • A minimum descriptive scale, in the case 0 ⱕ ␥ˆ < 1,
below which details of the process are of no interest. For
C ⫺1
ˆ
= E[[ᏸ⬘][ᏸ⬘]T ] (45) example, if the random process is intended to be soil per-
⬃ ⬃
meability, then the minimum scale of interest might cor-
where the superscript T denotes the transpose and the expec- respond to the laboratory sample scale d at which per-
tation is overall possible values of X using its joint distribution meability tests are carried out. The upper frequency cutoff
[see (28)] with parameters . ˆ The above expectation is gen- might then be selected such that this laboratory scale cor-
erally computed numerically because it is quite complex an- responds to, for instance, one wavelength, u = 2/d.
alytically. For the Gauss-Markov model the vector ᏸ⬘ is given • For the case ␥ˆ > 1, the lower bound cutoff frequency must
⬃
by be selected on the basis of the dimension of the site under
consideration. Because the local mean will be almost cer-
1 T ⫺1 tainly estimated by collecting some observations at the
1 (X ⫺ )
2X site, one can eliminate frequencies with wavelengths that
1 n are large compared with the site dimension. This issue is
ᏸ⬘ = (X ⫺ )T⫺1(X ⫺ ) ⫺ (46) more delicate than that of an upper frequency cutoff dis-
⬃ 24X 22X
cussed above because there is no natural lower frequency
2⌬z(n ⫺ 1)q 2 1 bound corresponding to a certain finite scale (whereas
⫺ (X ⫺ )TR(X ⫺ )
2(1 ⫺ q 2) 2X2 there is an upper bound corresponding to a certain finite-
sampling resolution). If the frequency bound is made to
where R is the partial derivative of ⫺1 with respect to the be too high, then the resulting process may be missing
scale of fluctuation . the apparent long-scale trends seen in the original data
set. As a tentative recommendation, the writer suggests
Fractal Model using a lower cutoff frequency equal to the least nonzero
The use of a fractal model is considerably more delicate Fourier frequency o = 2/D, where D is the site dimen-
than that of the finite-scale model. This is because the fractal sion in the direction of the model.
model, with G() ⬀ ⫺␥, has infinite variance. When 0 ⱕ ␥
< 1, the infinite variance contribution comes from the high The parameter ␥ is perhaps best estimated directly in the fre-
frequencies so that the process is stationary but physically un- quency domain via maximum likelihood. There are at least
realizable. Alternatively, when ␥ > 1, the infinite variance two possible approaches, but here only the wavelet and per-
comes from the low frequencies that yield a nonstationary iodogram maximum likelihood estimators will be discussed.
(fractional Brownian motion) process. In the latter case, the In the case of the wavelet basis representation, the approach
infinite variance basically arises from the gradual meandering is as follows [from Wornell (1996)]. For an observed Gaussian
of the process over increasingly large distances as one looks process, x(ti), i = 1, 2, . . . , n, the wavelet coefficients x mj , m
over increasingly large scales. Though nonstationarity is an = 1, 2, . . . , M, j = 1, 2, . . . , 2m⫺1, where n = 2M, can be
interesting mathematical concept, it is not particularly useful obtained via (23) (preferably using an efficient wavelet decom-
nor practical in soil characterization. It does, however, em- position algorithm). Because the input is assumed to be Gauss-
phasize the dependence of the overall soil variation on the size ian, the wavelet coefficients will also be Gaussian. Wornell
of the region considered. This explicit emphasis on domain has shown that they are mean zero and largely independent if
size is an important feature of the fractal model. X(z) comes from a fractal process so that the likelihood of
To render the fractal model physically useful for the case obtaining the set of coefficients x mj is given by
when 0 ⱕ ␥ < 1, Mandelbrot (1968) introduced a distance ␦
写写 再 冎
M 2m⫺1
1 (x jm)2
over which the fractal process is averaged to smooth out the L(x; ␥) = exp ⫺ (47)
high frequencies and eliminate the high frequency (infinite) m =1 j =1 兹22m 22m
variance contribution. The resulting correlation function is where 2m = 22⫺␥m = model variance for some unknown in-
given by (4). Unfortunately, the rather arbitrary nature of ␦ tensity 2. The log-likelihood function is thus
renders Mandelbrot’s model of questionable practical value,
冘 冘冉 冊冘
2 m⫺1
particularly from an estimation point of view. If ␦ is treated
M M
1 1 1
as known, then one finds that the parameter H can be estimated ᏸ=⫺ (2m⫺1
)ln( ) ⫺
2
(x jm)2 (48)
2m
m
2 2
to be any value desired simply by manipulating the size of ␦.
m =1 m =1 j =1
写冉 冊 再 冎
k various component frequencies.
ˆ ) = j␥ j␥ ˆ It is recognized that these data transforms have been eval-
L(G; exp ⫺ Gj (50)
j =1 Go Go uated by averaging over an ensemble of realizations. In many
real situations only one or a few data sets will be available.
and its logarithm is
Judging from the minimum/maximum curves shown on the
冘 冘
k k
1 plots of Figs. 4–8, it may be difficult to assess the true nature
ᏸ = ⫺k ln Go ⫹ ␥ ln j ⫺ ␥j G
ˆj (51) of the process if little or no averaging can be performed. This
j =1 Go j =1
is, unfortunately, a fundamental issue in all statistical analy-
which is maximized with respect to Go and ␥. The spectral ses—confidence in the results decreases as the number of in-
intensity parameter Go is not necessarily of primary interest dependent observations decreases. All that can really be said
because it may have to be adjusted anyhow to ensure that the about the tools discussed above (i.e., the periodogram) is that
area under G() is equal to ˆ 2x after the cutoff frequency dis- on average it shows a straight line with negative slope for
cussed above is employed (or, alternatively, the cutoff fre- fractal processes and flattens out at the origin for finite-scale
quency adjusted for fixed Go). processes. Because the periodogram ordinates are exponen-
Differentiating (51) with respect to Go and setting the result tially distributed, and the ability to distinguish between process
to zero yields types depends on just the first few (low frequency) ordinates,
冘
k the use of only a single sample set may not lead to a firm
ˆo = 1
G ˆ j ␥j
G (52) conclusion. For this, special large-scale investigations, yield-
k j =1 ing a large number of soil samples, may be necessary.
The sample correlation or covariance functions are accept-
In turn, differentiating (51) with respect to ␥ and setting the able measures of second moment structure when the scale of
result to zero leads to the following root finding problem in ␥ fluctuation of the process is small relative to the sampling
冘
k domain, implying that many of the observations in the sample
冘
Ĝj j␥ ln j k are effectively independent. However, these sample functions
become severely biased when the sample shows strong depen-
冘
⫺ ln j = 0
j =1
k (53)
j =1 dence, preventing them from being useful to discern between
Ĝj ␥j finite-scale and fractal type processes. Because the level of
j =1
dependence is generally not known a priori, inferences based
For almost all common processes, 0 ⱕ ␥ ⱕ 3, so that (53) on the sample covariance and correlation functions are not
can be solved efficiently via bisection. generally reliable. Likewise, the sample variance function is
The Fisher information matrix, and thus the covariance ma- heavily biased in the presence of strong dependence, rendering
trix between the unknown parameters Go and ␥, is especially its use questionable unless the soil property is known to be
simple to compute for the periodogram maximum likelihood finite scale with << D.
estimator because, asymptotically, the estimator Ĝo is indepen- Once a class of stochastic models has been determined using
dent of ␥ˆ . The estimated variances of each estimated parameter the periodogram, the periodogram can again be used to esti-
can be found from mate the parameters of an assumed spectral density function
via maximum likelihood. This method can be applied to either
Ĝ 2o finite-scale or fractal processes, requiring only an assumption
冉冘 冊 冘
2Ĝo ⯝ k 2 k (54)
on the functional form of the spectral density function. The
1⫺␥
i
ˆ
⫺k ⫺ i2(1⫺␥ˆ ) maximum likelihood approach is preferred over other esti-
i =1 i =1
mation techniques such as regression, because of the many
1 available results dealing with the distribution of maximum
冉冘 冊
2␥ˆ ⯝ k 2 (55) likelihood estimates [see, e.g., Beran (1994) and DeGroot and
ln i ⫺ k ⫹k Baecher (1993)].
i =1 If the resulting class of models is deemed to be fractal with
JOURNAL OF GEOTECHNICAL AND GEOENVIRONMENTAL ENGINEERING / JUNE 1999 / 483
tions in a companion paper (Fenton 1999) are that soil prop- The following symbols are used in this paper:
erties are fractal in nature, exhibiting significant correlations
over very large distances. This proposition is reasonable if one a = constant;
thinks about the formation processes leading to soil deposits— bi = coefficients in Gauss-Markov maximum likelihood
the transport of soil particles by water, ice, or air, often takes problem;
place over hundreds if not thousands of kilometers. There may, C = covariance matrix;
however, still be a place for finite-scale models in soil models. Cij = element of covariance matrix;
The major strength of the fractal model lies in its emphasis on Ĉ(j) = estimated covariance function at discrete lag j ;
the relationship between the soil variability and the size of the D = sampling domain length or depth;
domain being considered. However, once a site has been es- d = representative length;
tablished, there may be little difference between a properly f (q) = cubic function in Gauss-Markov maximum likeli-
selected finite-scale model and the real fractal model over the hood problem;
finite domain. The relationship between such an ‘‘effective’’ Go = spectral intensity parameter;
finite-scale model and the true but finite-domain fractal model Ĝo = estimated spectral intensity parameter;
G() = one-sided spectral density function;
can be readily established via simulation and is the subject of
Ĝ() = sample one-sided spectral density function;
ongoing research.
H = Hurst or self-similarity coefficient;
i = index, also 兹⫺1 when appearing in exponential;
ACKNOWLEDGMENTS L( ⭈ ) = likelihood function;
The writer would like to thank the Norwegian Geotechnical Institute ᏸ( ⭈ ) = log-likelihood function;
(NGI) and, in particular, Suzanne Lacasse, Farrokh Nadim, and Odd Gre- m = scale (dilation) index for wavelet basis;
gerson for their comments on the early research as well as for their hos- m(z) = mean trend varying with spatial position;
pitality during the summer of 1997. The writer extends his gratitude to n = number of observations in sample of X(z);
the Research Council of Norway for their generous support and for mak- Q n, Q⬘n = coefficients in Gauss-Markov maximum likelihood
ing the project possible. Over the 10 months following the NGI visit, problem;
further analysis work was carried out at Princeton University, and thanks
are due to Erik VanMarcke and the Department of Civil Engineering and
q = root in Gauss-Markov maximum likelihood prob-
Operations Research at Princeton for the use of their facilities and for lem;
their support in many ways. Finally, the writer would like to thank the R 0, R⬘,
0 R1 = coefficients in Gauss-Markov maximum likelihood
National Sciences and Engineering Research Council of Canada for their problem;
support under operating Grant OPG0105445. r = vector used in space domain maximum likelihood
estimator;
APPENDIX I. REFERENCES s = vector used in space domain maximum likelihood
estimator;
Anderson, T. W. (1971). ‘‘The statistical analysis of time series.’’ Prob- s(z) = standard deviation trend varying with spatial posi-
ability and mathematical statistics, Wiley, New York.
Beran, J. (1994). ‘‘Statistics for long-memory processes.’’ Monographs
tion;
on statistics and applied probability, Chapman & Hall, New York. T = averaging dimension;
Brockwell, P. J., and Davis, R. A. (1987). Time series: Theory and meth- V(j) = variogram;
ods. Springer, New York. V̂(j) = estimated variogram;
Cressie, N. A. C. (1993). Statistics for spatial data, 2nd Ed., Wiley, New X̄ = average of X i across sample of length n;
York. X i = random value of X(z) at zi;
Dahlhaus, R. (1989). ‘‘Efficient parameter estimation for self-similar pro- Xi, j = average of X j, Xj ⫹1, . . . , X j ⫹i;
cesses.’’ Ann. Statist., 17, 1749–1766. X mj = random wavelet coefficient;
DeGroot, D. J., and Baecher, G. B. (1993). ‘‘Estimating autocovariance XT = random local average of X(z) over length T;
of in-situ soil properties.’’ J. Geotech. Engrg., ASCE, 119(1), 147–
166.
X(z) = random 1D process;
Fenton, G. A. (1994). ‘‘Error evaluation of three random field genera- x = vector of observations in sample;
tors,’’ J. Engrg. Mech., ASCE, 120(12), 2478–2497. x i = observed value of X(z) at z i;
Fenton, G. A. (1999). ‘‘Random field modeling of CPT data.’’ J. Geotech. x mj = computed wavelet coefficient from transform;
and Geoenvir. Engrg., ASCE, 125(6), 486–498. x⬘(z) = mean and variance standardized random process;
Fenton, G. A., and VanMarcke, E. H. (1990). ‘‘Simulation of random Y(z) = random 1D process;
fields via local average subdivision,’’ J. Engrg. Mech., ASCE, 116(8), z = coordinate, commonly with depth;
1733–1749. z i = discrete points along 1D random process;
Journel, A. G., and Huijbregts, Ch. J. (1978). Mining geostatistics. Aca- ␥ = spectral exponent in fractal model;
demic, New York.
Marple, S. L., Jr. (1987). Digital spectral analysis. Prentice-Hall, Engle-
␥ˆ = estimated spectral exponent;
wood Cliffs, N.J. ␥(D) = variance function;
Mandelbrot, B. B., and Ness, J. W. (1968). ‘‘Fractional Brownian mo- ␥ˆ (D) = sample variance function;
tions, fractional noises and applications.’’ SIAM Rev., 10(4), 422–437. ⌬z = incremental distance between observations;
Matheron, G. (1962). ‘‘Traite de Geostatistique Appliquee, Tome I,’’ Me- ␦ = averaging dimension in Mandelbrot’s fractal model;