
Chapter 2
Principles of Statistics

2.1 Introduction
The principles of statistics must be understood before the application of geostatistics can be understood. Principles of statistics form the backbone of geostatistics. In addition, many statistical principles can be directly applied to provide a better understanding of the sample data. This chapter focuses on the principles necessary to understand and analyze spatially distributed data; only those principles are covered. Refs. 1 through 5 provide more general information on statistical principles.

The chapter is divided into two broad sections. The first section deals with a branch of statistics called descriptive statistics. This branch deals with the organization, presentation, and summarization of data. These techniques are particularly useful in developing a better understanding of the type of information currently available and the peculiar characteristics it possesses. Understanding the characteristics of the information allows us to use it more productively.

The second section deals with a branch of statistics called inferential statistics. This branch deals with the procedures for deriving conclusions about a population on the basis of sample data. Here, sample data are the collected information, and the population is all the information we would like to have within the region of interest. Most geostatistical applications use principles of inferential statistics in estimating values at unsampled locations. This chapter, however, concentrates only on the inferential statistical principles used in conventional statistics.

2.2 Descriptive Statistics
Descriptive statistics, which has been in use for more than 200 years, is the oldest branch of statistics. Its origins lie in surveys and census activity, where investigators tried to capture details about the activity in as few numbers as possible. With modern computer technology, the field of descriptive statistics has become even more powerful; general trends, anomalies, and variations in data can be illustrated with contrasting colors or gray-scale maps. In general, descriptive statistics can be applied to either sample data or the population. In practice, however, the population is rarely known. Therefore, we restrict the applications to sample data only.

This section covers several techniques for analyzing and characterizing sample data, starting with the frequency-distribution technique, probably the oldest statistical technique for analyzing samples. Then, we discuss summary-statistics techniques, which try to capture basic features of a sample with a limited number of parameters. A separate section covers techniques useful for understanding spatial data sets. Finally, we briefly address techniques used to understand bivariate data sets and to relate two variables to each other.

2.2.1 Frequency Distribution. The frequency-distribution method is one of the simplest ways to analyze sample data. It summarizes the data in a more compact form than the original sample observations. To construct a frequency distribution, the range of the data is divided into intervals called class intervals. It is common practice to use class sizes of equal width, but this is not necessary. The number of measurements falling within a particular class, i, is called a class frequency, f_i.

The number of classes should be chosen so that the sample characteristics, or "signature," are clearly visible from the display of class frequencies. Too few or too many class intervals do not provide the necessary information. If too few classes are used, some details may be lost; if too many classes are used, too few values may fall within each class, giving very little information about sample data tendencies. Typically, the number of class intervals depends on the number of sample data points. One way to estimate it is to use the square root of the total number of sample points to approximate the number of class intervals. In most cases, 5 to 20 intervals are sufficient.

On the basis of the frequency values within each class, we can calculate the relative-frequency values for each class. If n = the total number of samples, we can calculate the relative frequency, f_Ri, for Class i as

f_Ri = f_i/n. .................................................... (2.1)


Normalizing the class frequency by the total number of samples ensures that all the relative class frequencies add to one. In other words,

Σ_{i=1}^{N} f_Ri = 1, .................................................... (2.2)

where N = total number of classes.

In addition to the relative class frequency, the cumulative relative class frequency can be calculated by

F_j = Σ_{i=1}^{j} f_Ri .................................................... (2.3)

for j = 1, …, N, where F_j = cumulative frequency for Class j that results from addition of all the relative frequencies up to and including Class j.

Numerical Example 2.1. The following porosity samples are measured in a wellbore: 0.141, 0.124, 0.152, 0.156, 0.113, 0.167, 0.194, 0.142, 0.133, 0.149, 0.106, 0.137, 0.147, 0.159, 0.174, 0.129, 0.153, 0.173, 0.189, 0.16, 0.193, 0.156, 0.149, 0.135, 0.145, 0.171, 0.101, 0.151, 0.176, 0.191, 0.121, 0.148, 0.153, 0.171, 0.183, 0.108, 0.123, 0.169, 0.185, 0.153, 0.117, 0.127, 0.145, 0.141, 0.165, 0.14, 0.143, 0.178, 0.179, 0.157. Analyze these porosity, φ, values using a frequency-distribution analysis.
Solution. For these 50 values, we divide the data into five classes. Table 2.1 lists the class-frequency values: column 1 lists the porosity class, column 2 lists the number of values (class frequencies) within each class, and column 3 lists the relative-frequency values for each class calculated with Eq. 2.1. For example, for class values between 0.12 and 0.14,

f_R2 = 9/50 = 0.18.

Column 4 gives the cumulative relative class frequency values calculated with Eq. 2.3. For example, for class values between 0.14 and 0.16,

F_3 = Σ_{i=1}^{3} f_Ri = 0.1 + 0.18 + 0.4 = 0.68.

TABLE 2.1—CLASS FREQUENCY VALUES FOR NUMERICAL EXAMPLE 2.1

Class                i    f_i    f_Ri    F_i
0.10 ≤ φ ≤ 0.12      1     5     0.1     0.1
0.12 < φ ≤ 0.14      2     9     0.18    0.28
0.14 < φ ≤ 0.16      3    20     0.4     0.68
0.16 < φ ≤ 0.18      4    10     0.2     0.88
0.18 < φ ≤ 0.2       5     6     0.12    1.0

We can plot relative frequencies as well as cumulative relative class frequencies as functions of the variable values (see Fig. 2.1). The plot of relative frequency vs. the variable is called a relative-frequency histogram.

These plots are useful in characterizing samples. For example, from this plot, we can conclude that 40% of the porosity values fall between 0.14 and 0.16 (or that 10% of the porosity values fall between 0.1 and 0.12). The cumulative relative histogram plot shows that 68% of the porosity values are less than 0.16. For equal-sized class distributions, the area of the rectangle for a particular class is proportional to the chance that a porosity value will fall within that class (Fig. 2.2). In Fig. 2.2, the chance of porosity falling between 0.14 and 0.16 is proportional to the crosshatched rectangle. Fig. 2.2 also shows that the chance that porosity is less than 0.16 is 68%.

In defining frequency distributions, we assume a discrete distribution for a variable. That is, once the data are divided into several classes, we do not distinguish among the values within a class; all values are treated the same. The advantage of using such an approach is that the sample can be characterized with fewer parameters. Instead of knowing all the porosity values, we may need to know only the individual classes and the distribution of values within all the classes. Such information may be adequate for understanding the sample characteristics.

In reality, porosity is not a discrete variable; it is a continuous one. Just because a porosity value of 0.102 is not observed does not mean that a value of 0.102 does not exist. It may be obtained in the next sample. Porosity can take any value within the extreme values.

Fig. 2.1—(a) Relative frequency and (b) cumulative relative frequency.


Fig. 2.2—Characterizing the frequency function.
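The class bookkeeping in this example is easy to check by machine. The following is a minimal Python sketch that tallies the 50 porosity samples into the five right-closed classes of Table 2.1 and prints the class, relative (Eq. 2.1), and cumulative relative (Eq. 2.3) frequencies; the class boundaries are taken from the table.

from bisect import bisect_left

# The 50 porosity samples of Numerical Example 2.1.
por = [0.141, 0.124, 0.152, 0.156, 0.113, 0.167, 0.194, 0.142, 0.133, 0.149,
       0.106, 0.137, 0.147, 0.159, 0.174, 0.129, 0.153, 0.173, 0.189, 0.16,
       0.193, 0.156, 0.149, 0.135, 0.145, 0.171, 0.101, 0.151, 0.176, 0.191,
       0.121, 0.148, 0.153, 0.171, 0.183, 0.108, 0.123, 0.169, 0.185, 0.153,
       0.117, 0.127, 0.145, 0.141, 0.165, 0.14, 0.143, 0.178, 0.179, 0.157]

edges = [0.12, 0.14, 0.16, 0.18, 0.20]   # upper class boundaries (right-closed classes)
counts = [0] * len(edges)
for phi in por:
    counts[bisect_left(edges, phi)] += 1  # class index for each sample

n = len(por)
F = 0.0
for i, f in enumerate(counts):
    fR = f / n      # relative class frequency, Eq. 2.1
    F += fR         # cumulative relative class frequency, Eq. 2.3
    lo = 0.10 if i == 0 else edges[i - 1]
    print(f"{lo:.2f}-{edges[i]:.2f}: f={f:2d}, fR={fR:.2f}, F={F:.2f}")

Running this reproduces the columns of Table 2.1 (class counts of 5, 9, 20, 10, and 6).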


Collecting 5,000 samples instead of 50 may allow division of the data set into additional classes. Under these circumstances, Fig. 2.1 may look like Fig. 2.3. Within the limits, the discrete distribution may look like a continuous distribution, as shown in Fig. 2.3, for a very large sample set approaching the population. Applying the principles used in Fig. 2.2, we can say that the chance that porosity will fall between 0.14 and 0.16 is proportional to the area under the relative-frequency histogram curve between these limits. The cumulative relative-frequency diagram also shows the chance that the porosity will be less than 0.16 (Fig. 2.4).

Fig. 2.3—Frequency distribution for a large number of classes.

Although the analysis presented in Fig. 2.4 is not strictly correct mathematically, it provides an intuitive understanding of how the frequency distribution can eventually lead to a continuous distribution. This section provides a basis for understanding the conceptual framework of the probability-density function (the analog of f_Ri) and the cumulative-distribution function (the analog of F_j), which are discussed in Sec. 2.3. A very clear relationship exists between the frequency distributions and the statistical functions for a continuous distribution.

Fig. 2.4—Characterizing continuous distribution.

Field Example 2.1. As for all field examples, Appendix A provides the field data. This example uses histogram plots for porosity and permeability data for Well 34-29 in the Burbank field (Fig. 2.5). The well is in the north-central portion of the reservoir. On the basis of initial potential and permeability and porosity data, the well is in a better part of the reservoir.

Fig. 2.5—Histogram for (a) porosity and (b) permeability data for Well 34-29.


Fig. 2.6—Effect of log transform on permeability histogram: (a) permeability data and (b) log k data for Well 34-29.
Fig. 2.5a, the porosity histogram, shows a reasonably symmetric distribution with an exception at the high end of the values. In addition to the peak in the middle, another peak of frequency appears at a porosity value of 0.27. This may indicate a mixing of porosity distributions from two different geological units. Appendix A indicates that the Burbank sand is divided into several geological units. Although some units are hard to distinguish from each other, many units are separated by continuous shales that are extensive in an areal direction. Specifically, the lower unit (Unit 10) is a relatively thick sand with low porosity values. This unit can significantly influence the overall distribution of porosity values.

The permeability distribution at Well 34-29 (Fig. 2.5b) is skewed positively. Although the permeability values range from as low as 0.01 md to 750 md, the majority of the values are at the lower end of the range. This type of histogram is rarely useful for characterizing a sample because the values are clustered at one end.

One way to overcome this problem is to transform the sample data in some way so that the sample characteristics are evident from the histogram plot. The most commonly used approach for permeability values is the log transform. Fig. 2.6 shows the original permeability histogram and the log of permeability, log k, histogram. The log k distribution is much more symmetric than the permeability distribution. In addition, the log k and φ histograms are remarkably similar. Both show similar trends with two peaks in the histogram plot, one of which is at the higher end of the values. Although this needs to be validated, such characteristic similarity may indicate a relationship between log k and φ.
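A quick way to see what the transform accomplishes is to compare the mean and median of a skewed sample before and after taking logarithms. The sketch below is an illustration only; the permeability list is hypothetical, not the Well 34-29 data.

import math

# Hypothetical positively skewed permeability sample, md (illustration only).
k = [0.01, 0.1, 0.5, 1.2, 3.0, 8.0, 15.0, 40.0, 120.0, 750.0]

def mean(v):
    return sum(v) / len(v)

def median(v):
    s = sorted(v)
    n = len(s)
    return s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])

log_k = [math.log10(x) for x in k]

# For the raw data the mean sits far above the median (positive skew);
# after the transform the two are much closer, i.e., more symmetric.
print(mean(k), median(k))          # ~93.8 vs. 5.5
print(mean(log_k), median(log_k))  # ~0.59 vs. 0.69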
Fig. 2.7 shows histogram plots for two additional wells in the field: Well 31-W23 in the northwest portion of the reservoir and Well 36-16 in the northeast portion of the reservoir. Both wells are thought to be in poor regions of the reservoir. On both the east and west sides of the study region, the amount of shale increases, and the permeability values become smaller. Core porosities reflect this transition in the histogram. Both wells show a significant portion of shale, indicated by a high fraction of very low porosity values (almost 40%). The overall range of porosity values is also smaller compared with that in Well 34-29.

These types of histogram maps for wells located in different regions of the reservoir serve several purposes. First, changes in the sample distribution of a physical property can be visually observed. This may confirm some geological boundaries and sand units that are established on the basis of log interpretation. Also, when a particular range of porosity values is missing at certain wells (as in the case of Well 36-16, where no values greater than 18% or between 2 and 10% are observed), it may indicate that a particular geological unit may not extend over a region. Additional analysis can confirm this. Another benefit of generating such histogram maps is to define the region of stationarity. As Chap. 1 discussed, this region represents an area where the proposed model can be applied on the basis of the sample data. If the histogram characteristics vary over a wide range within an area, the region of stationarity may have to be re-examined and redefined.

2.2.2 Summary Statistics for Univariate Distribution. In addition to the frequency distribution, which characterizes the sample visually and numerically, characterizing the sample through summary information is also beneficial. This section addresses the sample of one variable: a univariate distribution. Several types of summary statistics for univariate distributions are available.
Mean. The sample mean represents the arithmetic mean of the sample data points. Mathematically,

x̄ = (1/n) Σ_{i=1}^{n} x_i, .................................................... (2.4)

where n = total number of samples and x_i = the value of Sample i. The arithmetic mean represents the central tendency of the sample.
Fig. 2.7—Porosity histograms for Wells (a) 31-W23 and (b) 36-16.

Median. Another measure of central tendency is the median, which is the sample point that divides the sample into equal halves. If all the samples are arranged in ascending order so that x_1 ≤ x_2 ≤ … ≤ x_n, the sample median, x̃, is calculated by

x̃ = x_(n+1)/2 .................................................... (2.5a)

when n is odd and by

x̃ = (x_(n/2) + x_(n/2+1))/2 .................................................... (2.5b)

when n is even. An advantage of using the median rather than the mean is that the median is not influenced by extreme values. In contrast, because the mean is an arithmetic average, it can be affected by one or two extreme values.
Mode. The mode, which is another measure of central tendency, is an observation that occurs most frequently in the sample. The value of the mode obviously depends on the precision of the data, especially for naturally occurring variables. If the data are very precise, each value is unique and none is repeated. One option is to use the frequency-distribution classes as a range of precision and define the mode as the class that contains the highest number of values.

Mean, median, and mode coincide with each other if the distribution is symmetric. If the distribution is skewed (e.g., the permeability histogram in Fig. 2.5b), these three tendencies exhibit different values. If the distribution is skewed positively (to the right), mode < median < mean. If the distribution is skewed negatively (to the left), mode > median > mean.
Extremes. In addition to the mean, median, and mode, we can also define the minimum and maximum of the sample values, where the minimum represents the smallest value and the maximum represents the highest value.
Percentile. Percentile values represent sample values that are greater than a certain percentage of the sample values. The median is an example of the 50th percentile value because 50% of the values are smaller than the median. If the values are arranged in ascending order, x^p represents a value where p percent of the values are smaller than x^p. For example, x^10 represents a sample value that is greater than 10% of the total sample points.

Certain types of percentiles are commonly used in describing sample data. For example, the first quartile represents x^25, where 25% of the sample values are less than x^25. x^75 represents a value where 25% of the sample values are greater than x^75. In a similar manner, deciles describe the data in terms of 10ths. The first decile represents the value greater than 10% of the sample data. The fifth decile represents the median. Overall, percentile distribution is simply another way of looking at data distribution.
Variance. The sample variance represents the spread of the data. It is a quantitative measure of how widely the data are distributed. Mathematically, variance is calculated as

s² = Σ_{i=1}^{n} (x_i − x̄)²/(n − 1), .................................................... (2.6)

where s² = sample variance, x̄ = sample mean, and n = total number of samples. Variance can also be calculated as

s² = (Σ_{i=1}^{n} x_i² − n x̄²)/(n − 1). .................................................... (2.7)

Although Eqs. 2.6 and 2.7 should give the same result, Eq. 2.7 is preferred because of its numerical precision. The square root of variance, s, is called the standard deviation. It has the same units as the variable being sampled.
Coefficient of Variation. The coefficient of variation, C_v, is defined as

C_v = s/x̄, .................................................... (2.8)

where s = standard deviation and x̄ = sample mean. Because s and x̄ have the same units, C_v is a dimensionless quantity; therefore, it provides a measure of the relative spread of a sample. When samples from two different variables are compared, the C_v value provides an indication of which variable has the relatively greater spread. For very tight samples, the value of C_v is small. For samples that exhibit several-orders-of-magnitude variations, C_v typically is > 1, sometimes in the range of 2 to 5. High C_v values are a warning sign that the sample data contain extreme values that may affect estimation of values at unsampled locations.
Range. Range is another quantitative measure of the spread. A simple definition of range, R, is

R = x_max − x_min, .................................................... (2.9)

where x_max = the maximum value and x_min = the minimum value. Other definitions of range have also been used. For example, the interquartile range represents the difference between two successive quartile values. We can define the first quartile range as

R_1 = x^25 − x_min, .................................................... (2.10)

where x^25 = the 25th percentile value and x_min = the minimum value. Similar definitions can be used for other quartile ranges.

Numerical Example 2.2. The following data for pay-zone thickness (in feet) are collected from all available wells in a reservoir: 6, 10, 20, 12, 20, 10, 15, 32, 27, 10, 18, 29, 8, 17, 23, 36, 19, 13, 33, 10, 26. Calculate the mean, median, mode, quartile values, variance, C_v, and range.
Solution. The total number of samples is 21.
Mean. With Eq. 2.4,

x̄ = (1/21)(6 + 10 + 20 + 12 + ⋯ + 33 + 10 + 26) = 18.76 ft.

Median. By arranging the values in ascending order (6, 8, 10, 10, 10, 10, 12, 13, 15, 17, 18, 19, 20, 20, 23, 26, 27, 29, 32, 33, 36), we can calculate the median. For an odd number of samples, with Eq. 2.5a,

x̃ = x_(21+1)/2 = x_11 = 18 ft.


Mode. The mode is the sample value that occurs most frequently. In our data set, the mode is equal to 10 ft because it occurs most frequently.
Quartiles. The 25th percentile represents a value greater than 25% of the sample values. For 21 samples, this represents the sixth value, or 10 ft. The 75th percentile represents a value greater than 75% of the sample values. This represents the 16th value, or 26 ft.
Variance. Variance is calculated with Eq. 2.7:

s² = [(6² + 10² + ⋯ + 33² + 10² + 26²) − 21 × (18.76)²]/(21 − 1) = 80.19 ft².

Standard Deviation.

s = √s² = √80.19 = 8.95 ft.

Coefficient of Variation. With Eq. 2.8,

C_v = s/x̄ = 8.95/18.76 = 0.477.

This value represents a relatively small variation within the sample. Typically, a value of C_v < 1 indicates a relatively narrow distribution.
Range. With Eq. 2.9,

R = x_max − x_min = 36 − 6 = 30 ft.

We can also calculate the interquartile ranges:

R_1 = x^25 − x_min = 10 − 6 = 4 ft

and R_4 = x_max − x^75 = 36 − 26 = 10 ft.
Field Example 2.2. This example applies the same summary-statistics analysis to field data: the porosity and permeability measurements for Well 34-29. The total number of sample points is 57. Table 2.2 provides the summary statistics.

TABLE 2.2—SUMMARY STATISTICS FOR FIELD EXAMPLE 2.2

Statistics    φ         k           ln k
Mean          0.1762    123.3       2.82
Median        0.16      24.2        3.19
x^25          0.1437    7.4         1.83
x^75          0.2203    97.0        4.31
Variance      0.0038    39,761.2    7.97
s             0.0611    197.6       2.82
C_v           0.347     1.6         1.0
Range         0.238     748.0       11.23

For the porosity data, the mean and median are very close to each other, indicating a symmetric distribution. In contrast, the median for the permeability data is much smaller than the mean, indicating a positively skewed distribution.

The coefficient of variation, C_v, shows that the variation in porosity is small, while the value of C_v > 1 for the permeability data indicates order-of-magnitude variations within the sample data. Such variations should be treated with care when estimating values at unsampled locations.

As discussed in Sec. 2.2.1, a wide variation in sample values can be overcome by use of some type of nonlinear transform, such as a log transform. Table 2.2 also shows the summary statistics for the natural log of permeability (ln k) values. Note that the mean and median values are much closer to each other, indicating symmetry in the distribution. Also, the interquartile ranges are much more symmetric for the log transform than for the raw permeability values. For the raw permeability data, for example, the last quartile range is 651 md, which represents more than 80% of the total range of the sample data. C_v is also smaller for the ln k data, indicating a smaller variation than in the sampled data values.

Dealing with samples that are symmetric and that show a narrow spread is always easier than dealing with highly skewed samples with a large spread. Some type of nonlinear transform is commonly used to convert raw sample data into transformed data that are "better behaved." Although this seems simple, such transformations should be used with caution. Discussions in subsequent chapters indicate that these transformations may result in erroneous estimated values at unsampled locations within the region of interest.

Analyzing the field data further, we examine data from the other two wells investigated in Field Example 2.1: Well 31-W23 in the northwest part of the reservoir and Well 36-16 in the northeast part of the reservoir. These two wells represent a relatively poor part of the reservoir, where the proportion of shale is much higher. Table 2.3 provides the summary statistics for both these wells and also includes data from Well 34-29 for comparison. The table shows that the mean is much smaller for both Wells 31-W23 and 36-16, indicating a relatively large fraction of shale. The high C_v values at both wells indicate a skewed porosity distribution, which was evident from the histograms (Figs. 2.5 through 2.7).

TABLE 2.3—POROSITY DATA FOR FIELD EXAMPLE 2.2

                      Well
Statistics            31-W23    34-29     36-16
Mean                  0.1043    0.1762    0.0816
Median                0.132     0.16      0.113
x^25                  0.000     0.1437    0.000
x^75                  0.175     0.2203    0.153
Variance              0.0092    0.0038    0.0059
Standard deviation    0.0948    0.0611    0.0754
C_v                   0.91      0.347     0.92
Range                 0.268     0.238     0.184

Another visual display that is sometimes useful for comparing two samples is the Q-Q plot. This plot represents quantile comparisons of the two data sets. For example, the 10th quantile value of one set is plotted vs. the 10th quantile of the other set, the 20th quantile value of one set is plotted vs. the 20th quantile value of the other set, and so on. If two samples have essentially the same distribution, the Q-Q plot shows a perfect 45° straight line. Fig. 2.8 shows the comparison between Wells 34-29 and 31-W23 and between Wells 34-29 and 36-16. Both plots clearly show a big deviation from the 45° line, indicating significant differences in the distributions of the two data sets. All points fall below the 45° line, indicating that Well 34-29 is always represented by a higher value than the other two wells for the entire percentile range.
Fig. 2.8—Percentile plot (Q-Q plot) comparisons of porosities of two wells: (a) Well 34-29 vs. Well 31-W23 and (b) Well 34-29 vs. Well 36-16.

2.2.3 Spatial Data Sets. In this section, we apply some techniques discussed in the two previous sections to a spatially distributed data set. The discussion is limited to univariate (single-variable) descriptions. Spatial data sets collected in petroleum reservoirs are unique in several ways.
1. The data sets do not represent random sampling. Random sampling represents a sampling where all samples have an equal likelihood of being selected. In practice, however, all the wells in a reservoir are not drilled at the same time. A few exploratory wells are drilled first. Then, on the basis of the information gathered, additional wells are drilled. As more wells are drilled, the information collected from the previously drilled wells is used to drill the next set of wells. That is, all wells are not drilled on the basis of the same available information. Therefore, the likelihood that a particular well will be drilled differs, depending on the information available at the particular time.
2. The data sets result in biased sampling. The goal in drilling wells in petroleum reservoirs is to drill wells that have the maximum potential of economic success, not to collect a good sample set. When selecting locations of new wells, the primary purpose is always the most effective way to produce additional oil with minimum investment. Obviously, more wells are drilled in an area where greater potential for additional recovery exists. Subjective decisions based on potential as a criterion result in biased sampling. That is, more wells are selectively drilled in an area where either the pay zone is thicker, the porosity is higher, or the reservoir is more permeable. Fewer wells are drilled in regions with low permeability, low pore volume, or both. Treating these sampled wells without accounting for the sampling bias may paint an overly optimistic picture of the reservoir because more samples selectively come from regions with a higher potential.
3. The data sets may show local variability. A typical reservoir has several wells drilled, which may intersect several geological horizons. In addition, even areally, a reservoir may contain several different geological units. In a fluvial reservoir, an area may include a channel sand, a crevasse splay, point bars, and a flood plain. Because of the differences in the environment, there are local variations in the statistical properties (e.g., some areas show higher porosity than others). In considering such a sample, we need to understand how these local variabilities affect the overall estimation procedure.

One may argue that this problem can be avoided by simply defining a small enough region of interest so that these types of local variabilities are not evident. Unfortunately, as a region is made smaller and smaller, fewer sample points become available in a particular region. For statistically valid results, a data set must have a sufficient number of samples. A compromise is to consider local variability as part of the data and account for these variabilities through various procedures. Later chapters discuss some of these procedures.

Here, we discuss two procedures commonly used to analyze and understand spatial data sets. The first procedure can be used to remove bias in the sampling. The second procedure allows us to understand the local variability in a sample set.
Sample Declustering. Sample declustering is one of the simplest ways to remove the undue influence of biased sampling. Originally proposed by Journel,7 the method requires that spatial data be divided into several small subareas (Fig. 2.9). These subareas typically are rectangles that cover the entire region. Once the data are divided into small subregions, the number of data points within each subarea is calculated. For example, Subarea 1 in Fig. 2.9 has three sample points, and Subarea 2 has six sample points. Depending on the number of sample points within each subarea, an appropriate weight is assigned to each sample point within that subarea.

Fig. 2.9—Declustering of data.

Fig. 2.10—Areal distribution of porosity values.
o
ca2
a
o of one-third, and each point in Subarea 2 is assigned a weight ward high values, such as high porosity, high pay-zone thick-
of one-sixth. If only a single point falls within an area, a ness, or high permeability. That is, the sample contains a more
o 8
0 >- weight of one is assigned to that point. of high values than it would contain with random sampling.
c71 - The idea of assigning weight is simple. The greater the Eliminating the undue influence of these high-valued sam-
o .3 1-
oxo 9 0: number of data points within an area, the more clustered the ples should make the mean lower than what the sample repre-
7.) 0
sample is within that area. Assigning a smaller weight to those sents. The mean is estimated by examining the declustered
z 0 points reduces any undue influence of those points. mean for various subarea sizes. The correct size is the one that
O
❑N
0
>- Once weights of individual points are calculated, the arith-
metic mean of the samples is calculated as
represents the minimum because that size represents the max-
imum effect that declustering can achieve. The Field Exam-
uJ o- o
W Ce ple illustrates this further.
0_
<
I
L

uJ Field Example 2.3. Data from Flow Unit 3 are used in this ex-
ui .................................................... (2.12)
D ample, Appendix A indicates that the reservoir is divided into
LLI MC.)
1- 8 u_ ten flow units; Flow Unit 3 is one of the units with relatively
0 i 1 =
high porosity and permeability values. At each well, the log
x
E
--
cfi 0
where .7x = declustered sample mean and w, = weight as- porosity values were arithmetically averaged over the pay-
LL
u signed to individual Sample Point i. The variance can be cal-
1:e C
t zone thickness for that unit to estimate the average porosity
0 V)
culated as for the unit at that location. Fig. 2.10 shows the areal distribu-
< 9
:Q ct 0 tion of porosity values over the study area. The values vary
o 2 over a wide range, from as low as 11 % to as high as 30%. As
>-
N 0-
o d c.) 2 the map shows, the wells are clustered in areas that have high
re
S = ........................................ (2.13)
"o o porosity values.
F
I- Q We can apply the declustering technique and, with Eq. 2.11,
o6 z i= 1
calculate the declustered mean for various cell sizes. Fig. 2.11
O 2
T co ce
E where s = declustered variance of the sample. Once the shows a plot of declustered mean as a function of cell size. As
cr mean and variance are calculated, the standard deviation and expected, the declustered mean starts with a high value for a
o coefficient of variation can be estimated. large cell size. It reaches a minimum value at a cell size of
One unknown in these calculations is what size the subarea approximately 3,000 x 3,000 ft. The sample mean is 21.6 vs.
should be. As a guess, one option is to use the declustered mean of 20.7. Although the difference be-
tween the two may not seem significant, remember that the
(2.14) porosity variation is not that significant in this case. For vari-
Asub n
ables that do exhibit an order-of-magnitude variation, the dif-
where A b subarea= susize = size of the individual subareas, ference between sample and declustered means can be signif-
n = total number of sample points, and A, = total icant. More relevant, however, is the fact the sample bias can
area = area of the study region. The shape of the subarea can be removed with a very simple technique.
be the same as that of the total area.
To be precise, different subarea sizes should be tried, and Fig. 2.12, which compares the histogram of the sample data
the declustered mean plotted as a function of the subarea size. with the histogram of the declustered data, also shows the ef-
A size should be chosen so that the declustered mean reaches fect of declustering. Although the maximum and minimum
a minimum or maximum value. When analyzing most reser- values have not changed, note that the quartile values for the
voir properties, the declustered mean reaches a minimum at declustered data are smaller than those for the sample data.
a desired subarea size because sampling is always biased to- Sample data, being biased, tend to exaggerate the percentage

of high values, which, in turn, increases quartile values. Declustering the sample minimizes this influence. In the histogram plot, the relative-frequency values for high porosities are smaller in the declustered data than in the clustered data. The reverse is true for low porosity values. For the low porosity values, the standard deviation for the declustered data is higher, indicating a more uniform spread over the data range than that indicated by the clustered data. The Q-Q plot in Fig. 2.13, which plots clustered vs. declustered data, further emphasizes this effect. All points fall below the 45° line, indicating that, for clustered (sample) data, the value for a given percentile is higher than for the declustered data.

In general, for data sets that indicate preferential clustering of the samples, it is safe to test them with a declustering analysis. If any bias is present, it can be removed easily. If there is no bias or if the bias is very small, the declustered properties will be very similar to those of the sampled data set.

Fig. 2.11—Effect of declustering on sample mean.

Fig. 2.12—Comparison of (a) clustered and (b) declustered porosity distributions for Flow Unit 3 (clustered: 68 data, mean 21.62, standard deviation 4.81; declustered: 68 data, mean 20.73, standard deviation 5.16).

Fig. 2.13—Percentile plot comparisons between clustered and declustered data.

Moving-Window Statistics. Moving-window statistics is a technique by which local variations in statistical properties within a study region can be investigated. The technique is relatively simple but very powerful. A small window of a desired size is overlaid on the study region, and all sample points falling within that window are used to calculate local summary statistics. The two statistical properties most commonly used are the mean and the variance or standard deviation. The procedure is repeated by moving the window over the entire region. If the local means and variances are relatively uniform, the region is considered homoscedastic. If the local means and variances show significant variation, the study region is considered heteroscedastic.

The size and shape of the window is a subjective decision. A rectangular window is the normal choice because of computational efficiency. The size of the window is determined on the basis of the number of samples available within the study region. The window should be large enough so that each window contains sufficient samples to provide representative local statistics and small enough so that local variabilities are preserved and exhibited through moving-window statistics. One possible compromise is to select large windows to ensure sufficient samples but overlap the windows with a certain lag. This way, a sufficient number of local variations across the region is obtained without too much reduction in the size of the window. In this approach, two adjacent windows partially overlap each other, resulting in some sample points being used more than once in the estimation of local statistics.

In examining local properties, it is important to understand both the local mean and the local variance. Several possibilities may exist with respect to the variation of these properties. Local mean and variance can both vary; the local mean can vary while the local variance remains fairly constant; or the local mean can remain constant while the local variance varies. If both vary, a possibility exists that they may be related to each other. In many earth science data sets, local means are observed to be proportional to local standard deviations; this is called a proportional effect. Observation of such a relationship can be useful in estimating values at unsampled locations.

Field Example 2.4 illustrates application of moving-window statistics.
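Before turning to the field example, a sketch of the moving-window computation may help. The sample points below are hypothetical; the window is square, advances by half its width (so adjacent windows overlap), and reports local statistics only when it contains a minimum number of samples.

import math

# Hypothetical (x, y, value) samples scattered over a 4,000 x 4,000 ft area.
data = [(200, 300, 21.5), (600, 700, 23.1), (1200, 500, 24.8), (1800, 900, 22.0),
        (2400, 1500, 17.3), (3100, 2100, 15.9), (3500, 2800, 14.2), (900, 2900, 19.5),
        (1500, 3300, 18.7), (2800, 3600, 13.8), (3300, 400, 16.4), (400, 1900, 20.2)]

win = 2000.0          # window size, ft
step = win / 2.0      # half-window overlap between adjacent windows
min_pts = 3           # minimum samples for representative local statistics

for iy in range(3):
    for ix in range(3):
        x0, y0 = ix * step, iy * step
        vals = [v for x, y, v in data if x0 <= x < x0 + win and y0 <= y < y0 + win]
        if len(vals) < min_pts:
            continue   # too few samples for a meaningful local estimate
        m = sum(vals) / len(vals)
        sd = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
        print(f"window ({x0:.0f},{y0:.0f}): n={len(vals)} mean={m:.1f} sd={sd:.2f}")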

Field Example 2.4. The data set used in Field Example 2.3 is used in this example. The flow-unit porosity data are divided with windows of approximately 7,000 × 7,000 ft. The windows are defined so that half of each window overlaps the preceding window. Fig. 2.14 displays the window statistics; the mean and standard deviation are shown at the centers of the windows. There is significant variation in the mean porosity, from a low of 15% to a high of up to 25%. The standard deviation varies from a low of 1% to a high of 6%. Understanding this type of areal variability in statistical properties is important for estimating values at unsampled locations. It is also helpful in deciding the type of technique to be used in the estimation procedure. These aspects are discussed further in later chapters.

Fig. 2.14—Areal distribution of mean/standard deviation.

Fig. 2.15 shows a plot of local mean vs. local standard deviation for a moving window that has at least seven sample points. The choice of seven as the minimum number of data points is somewhat arbitrary. The figure shows that the relationship between the standard deviation and the local mean is rather weak. If there is any relationship, it shows that the local mean is negatively related to the standard deviation; that is, as the local mean increases, the standard deviation decreases. Such a relationship generally is observed for a negatively skewed histogram. Note that the porosity histogram for Flow Unit 3 is slightly negatively skewed. For a positively skewed histogram, the local standard deviation normally increases as the local mean increases. Use of this type of relationship may aid the estimation process.

Fig. 2.15—Local mean vs. standard deviation for Flow Unit 3 porosity data.

In analysis of spatial data sets, it is important to remember that spatial data sets collected from a reservoir may exhibit some unique features. Before these data sets are used as samples, it is important to conduct some exploratory analysis. Exploratory analysis may eliminate some of the bias in the data sets and also may indicate local variations in the observed properties.

2.2.4 Bivariate Statistics. This section analyzes statistics of two variables. We first discuss the conditional frequency distribution of one variable with respect to the other variable. We then examine various summary-statistics tools used to analyze the relationship between two variables and discuss application of these tools to describe and analyze spatial data sets. Again, both numerical and field examples illustrate the principles.
Conditional Frequency Distribution. Sec. 2.2.1 discussed frequency-distribution analysis. Under that analysis, we divide data sets into several classes and consider the number of values falling within each class. We can extend this analysis to two variables. The concept of conditional distribution is used to present the frequency distribution of two variables in a suitable format.

A conditional distribution is a distribution of one variable that is conditional on the distribution of the other variable. For example, if we are considering two variables, permeability and porosity, we can generate a conditional distribution of permeability for a given porosity value. Obviously, for a large number of sample values, there may be multiple observations of permeability values for any given porosity value. We can consider the frequency distribution of only these permeability values. This distribution is conditional to the value at which the porosity is fixed.

In practice, there may not be sufficient samples to generate a conditional distribution of one variable for a given value of the other variable. As an alternative, we can consider a range of values for one variable and the conditional distribution of the other variable for that range. As this range becomes smaller, the conditional distribution comes closer to a distribution corresponding to a point value. The following example illustrates this approach.

Field Example 2.5. The data from Well 34-29 are used for this example. This well has 57 porosity and permeability values, with porosity ranging between 0.052 and 0.29. The porosity values are divided arbitrarily into five classes; Table 2.4 shows the class ranges and the mean of each class.

TABLE 2.4—POROSITY CLASSES FOR FIELD EXAMPLE 2.5

Class             Mean
0.052 to 0.132    0.098
0.132 to 0.157    0.150
0.157 to 0.175    0.166
0.175 to 0.261    0.212
0.261 to 0.290    0.273

For convenience, assume that each class is represented by the mean value. The permeability values corresponding to each porosity class also are divided into five classes; Fig. 2.16 shows the histograms of permeability values for each porosity class. These histograms are conditional distributions of


permeability values for a given porosity class. Although the number of permeability values in each histogram is rather limited, a trend is clearly evident. Low porosity values are associated with low permeability values. As the average porosity increases, so do the permeability values. With the assumption that the mean porosity value for a class represents the class, the permeability distribution within each class can be associated with that particular value. For a large number of porosity values, class sizes can be made smaller. Eventually, in the limit, a conditional distribution of permeability values can be associated with each porosity value.

This type of conditional information is useful in associating the uncertainty of one variable with respect to the value of another variable. For example, in the geological description of a channel sand, the width of a channel and the thickness (depth) of a channel are related to each other; however, for a given width, several thickness values are possible. When constructing geological descriptions, once a channel width is chosen, we can use the conditional distribution of thickness to select a particular thickness from that distribution. This allows us to capture the nonunique relationship between the channel width and the thickness. Chap. 7 provides a detailed discussion of this particular method.
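Building these conditional distributions amounts to grouping one variable by classes of the other. The sketch below uses hypothetical porosity/permeability pairs and class edges in the spirit of Table 2.4; each resulting list is the empirical conditional distribution of permeability for that porosity class.

from bisect import bisect_left

# Hypothetical (porosity, permeability) pairs for illustration.
pairs = [(0.09, 0.4), (0.11, 2.1), (0.14, 9.0), (0.15, 14.0), (0.16, 22.0),
         (0.17, 31.0), (0.20, 120.0), (0.22, 210.0), (0.27, 480.0), (0.28, 610.0)]

edges = [0.132, 0.157, 0.175, 0.261, 0.290]   # upper porosity-class boundaries

by_class = {i: [] for i in range(len(edges))}
for phi, k in pairs:
    by_class[bisect_left(edges, phi)].append(k)

# Each list holds the permeability values conditional on that porosity class.
for i, ks in by_class.items():
    if ks:
        print(f"class {i + 1}: n={len(ks)}, mean k={sum(ks) / len(ks):.1f} md")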
Summary Statistics for Bivariate Distribution. Similar to univariate statistics, several tools are used to summarize the statistics between two variables. The covariance is defined as

c(x, y) = (1/n) Σ_{i=1}^{n} x_i y_i − [(1/n) Σ_{i=1}^{n} x_i][(1/n) Σ_{i=1}^{n} y_i], .................................................... (2.15)

where x_i and y_i = samples of the variables x and y, respectively, and n = total number of sample pairs. Note that covariance reduces to variance if x = y.

Covariance is a measure of the relationship between two variables. If x and y are positively related (i.e., as x increases, y increases), the covariance has a positive value. If x and y are negatively related (i.e., as x increases, y decreases), the covariance has a negative value. In the same manner, if x and y are not related, the covariance has a value close to zero.

Similar to variance, covariance is defined in units that depend on the units of x and y, and it can be made dimensionless by defining a correlation coefficient:

r(x, y) = c(x, y)/(s_x s_y), .................................................... (2.16)

where r(x, y) = correlation coefficient, c(x, y) = covariance between x and y, s_x = standard deviation of the x variable, and s_y = standard deviation of the y variable. Sec. 2.2.2 discussed the calculation of standard deviation, which is the square root of variance.

By making the correlation coefficient dimensionless, we can define its limits. The value of the correlation coefficient always falls between the limits of +1 and −1. If x and y are positively related, the correlation coefficient falls between 0 and +1; the stronger the relationship, the closer the value will be to +1. If x and y are negatively related, the correlation coefficient falls between 0 and −1; the stronger the relationship, the closer the value will be to −1. If x and y are not related, the correlation coefficient is zero.

In some instances, the square of the correlation coefficient, r²(x, y), is used instead of the correlation coefficient to describe the relationship between the two variables. One advantage of using this value (sometimes called the r² statistic) is that it always falls between zero and one, whether x and y are positively or negatively related. This is the term most commonly used in describing the "goodness of fit" in a linear-regression analysis between two variables. The following examples illustrate the calculation procedure.

Numerical Example 2.3. Table 2.5 provides core permeability vs. core porosity data from a well. Calculate the covariance and the correlation coefficient between the log k and φ data.
Solution. In practice, we assume that log k is related to φ values. Therefore, we first calculate the log k values; Table 2.6 shows the calculated values. Calculate the covariance with Eq. 2.15, where n = 9 because we have nine pairs of data. Treating x as log k and y as φ, we can calculate

c(log k, φ) = (1/9)[3.063 × 29.49 + 2.725 × 26.79 + ⋯ + 2.559 × 25.54] − (1/9)[3.063 + 2.725 + ⋯ + 2.559] × (1/9)[29.49 + 26.79 + ⋯ + 25.54] = 0.8875.

Fig. 2.16—Conditional distribution of permeability for Well 34-29: (a) φ = 0.052 to 0.132, (b) φ = 0.132 to 0.157, (c) φ = 0.157 to 0.175, (d) φ = 0.175 to 0.261, and (e) φ = 0.261 to 0.290. (Mean permeabilities for the five classes are 1.3, 13.6, 23.3, 129.8, and 505.8 md, respectively.)
TABLE 2.5—PERMEABILITY AND POROSITY DATA FOR NUMERICAL EXAMPLE 2.3

Sample    φ (%)    k (md)
1         29.49    1,156.0
2         26.79    531.0
3         28.74    1,059.0
4         27.65    822.0
5         27.69    1,014.0
6         22.69    109.0
7         23.3     138.0
8         23.81    166.0
9         25.54    362.0

TABLE 2.6—log k VALUES FOR NUMERICAL EXAMPLE 2.3

φ (%)    log k    k (md)
29.49    3.063    1,156.0
26.79    2.725    531.0
28.74    3.025    1,059.0
27.65    2.915    822.0
27.69    3.006    1,014.0
22.69    2.037    109.0
23.3     2.14     138.0
23.81    2.22     166.0
25.54    2.559    362.0

To calculate the correlation coefficient, we must calculate the standard deviation for both log k and φ. Use Eq. 2.7 with n instead of (n − 1) as the denominator. Modification of the denominator is necessary to make the definition consistent with the covariance definition. For log k, the mean is 2.6322; therefore,

s²_log k = (1/9)[(3.063² + 2.725² + ⋯ + 2.559²) − 9 × 2.6322²] = 0.1489,


which gives s_log k = 0.3859. The standard deviation for the φ values is calculated similarly and results in s_φ = 2.329. Using Eq. 2.16 gives

r(log k, φ) = c(log k, φ)/(s_log k s_φ) = 0.8875/(0.3859 × 2.329) = 0.988.

As is evident, a positive correlation-coefficient value indicates a positive relationship between k and φ. Furthermore, a value close to one indicates that a strong relationship exists between k and φ values.
rearrange the porosity and permeability values in ascending ›. o:
value close to one indicates that a strong relationship exists Mo D
order and rank them. Table 2.7 shows original values and ,
(7 0 -
between k and 0 values.
associated rank for each pair. The ranks are identical for both rn ni 0
o
porosity and permeability values. Therefore, the relationship
In addition to correlation coefficient, rank correlation co- between the two variables is perfect.
efficient is another measure that indicates the relationship ?" 8 =1
between two variables. To estimate the rank correlation coef- c(k oo , R o ) = [9 x 9 + 5 x 5 + + 4 x 4]
c ai rn
9
ficient, all data values are first sorted in ascending order.
6,7 m
13
Then, each value is assigned a rank, depending on where it
falls. The smallest value receives the lowest rank, and the
- - [9 + 5 +
9
+4] -0
o
m g
largest value has a rank of n, where n = total number of sam- -a n rn
[]
- [9 + 5 + + 4] = 7.5, mx
ples. Both variables are ordered this way, and ranks are as- 9
xi 0
signed to both. Then, the rank correlation coefficient is calcu- SRI„gk = 2.7386, 0
lated as = m om
fD
and s A,0 = 2.7386. 13 0
6
c(R - Tom.
r(12„ •
= SR I , SR
................................................................... (2.17) Therefore, o
2 0
cr
where r(R .„ R, ) = rank correlation coefficient, c(R „ R,)
. r(12 tog Ro) --- 2.73867.5
x 2.7386
1.0. 73 i=n
,

covariance between the rank values of the two variables, and 0


The value of the rank correlation coefficient is consistent with
1
r0
s R , and s R , = standard deviations for the rank values for the M-1
the perfect relationship. m
two variables. When each variable has the same number of
data values, 5 R = s Now, assume that the seventh sample permeability is 13.8 c u-1 3
T3
n
t (-1.

md instead of 138 md. Using the modified value, we can re- m x gi)
The rank correlation coefficient is a useful statistical tool Pm
E•
for comparing two variables. Unlike the correlation coeffi- write the ranks (Table 2.8). The new value changes the rank- ›. 0
0
cient, which can be influenced by extreme values within the ing slightly. Calculating the correlation coefficient gives 3
data set (extreme values can change the mean and variance), r(log k,0) = 0.868; whereas, r(R iogk , R id = 0.983. While
simply changing one value does not affect the correlation co-
the rank correlation coefficient is not affected significantly.
efficient for rank index significantly, it does significantly af-
Therefore, it is a relatively robust measure and may allow
fect the correlation coefficient for the actual values (from 0.99
detection of any measurement errors, especially if there is a
to 0.87). The discrepancy between the two correlation coeffi-
noticeable difference between the values of the correlation cients should prompt a careful investigation of the sample
coefficient and the rank correlation coefficient. A high value data for any errors in reporting the values.
of r(R x , R,) and a low value of r(x, y) may indicate that some
erratic data pair need to be examined to ensure that there is no
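These calculations are straightforward to reproduce programmatically. The short Python sketch below is an illustration only (the helper names ranks and corr are not from any standard library); it recomputes both coefficients for the nine data pairs of Table 2.7. Note that the choice between n and n − 1 in the variance and covariance estimates cancels in the correlation ratio, so either convention yields the same coefficients.

def ranks(values):
    # Rank 1 = smallest value, rank n = largest; this data set has no ties.
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def corr(x, y):
    # Correlation coefficient of Eq. 2.16 with population (1/n) estimates.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mx * my
    sx = (sum(a * a for a in x) / n - mx * mx) ** 0.5
    sy = (sum(b * b for b in y) / n - my * my) ** 0.5
    return cov / (sx * sy)

phi = [22.69, 23.3, 23.81, 25.54, 26.79, 27.65, 27.69, 28.74, 29.49]
log_k = [2.037, 2.14, 2.22, 2.559, 2.725, 2.915, 3.006, 3.025, 3.063]

print(corr(phi, log_k))                  # ~0.99 (Eq. 2.16)
print(corr(ranks(phi), ranks(log_k)))    # 1.0 (Eq. 2.17)

Replacing the seventh permeability with 13.8 md (log k = 1.14) and rerunning the same two lines reproduces the drop to roughly 0.87 for the actual values while the rank coefficient stays near one.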
Linear Regression. A logical extension for applying the summary-statistics principles is establishing a linear relationship between the two variables. A linear relationship is useful in predicting a value of one variable when the value of the other variable is known. The simplest type of this relationship is

y = mx + b, ................................... (2.18)
Fig. 2.17—Relationship between log k and φ for Numerical Example 2.5.

Fig. 2.18—Local bias in linear regression.

where y = the variable to be estimated; x = the known variable; m = the slope of the straight line; and b = an intercept on the y axis. To estimate the values of m and b, we first use the available sample pairs of x and y and obtain the "best" fit between the two variables. We can show that the best fit can be obtained by defining the values of m and b as

m = c(x, y)/s_x² ................................... (2.19a)

and b = ȳ − m x̄, ................................... (2.19b)

where c(x, y) = covariance between x and y, s_x² = variance of x, and ȳ and x̄ = arithmetic means of the y and x variables, respectively. As stated before, the goodness of fit is indicated by the correlation coefficient between the two variables.

Numerical Example 2.5. Using the same data set used for Numerical Example 2.3, obtain the best-fit line between log k and φ values.

Solution. In this case, assume that the y variable is log k and the x variable is φ. Therefore, the relationship is

log k = mφ + b.

From Numerical Example 2.3, c(log k, φ) = 0.8875, s_φ = 2.329, and s_φ² = 5.424. With Eq. 2.19a,

m = c(log k, φ)/s_φ² = 0.8875/5.424 = 0.1636.

To calculate the intercept, b, we need the arithmetic means of both log k and φ. With Eq. 2.4,

x̄ = (1/n) Σ_{i=1}^{n} x_i,

and we can calculate the arithmetic means of both variables: mean log k = 2.6322 and mean φ = 26.19. With Eq. 2.19b,

b = 2.6322 − 0.1636 × 26.19 = −1.652;

therefore, the overall equation can be written as

log k = 0.1636φ − 1.652.

Fig. 2.17 shows the plot of log k vs. φ and the best-fit line. As expected, the relationship between log k and φ is excellent.

A linear relationship established between any two variables must be locally, as well as globally, unbiased. Fig. 2.18 shows an example of a locally biased relationship. In this data set, the overall linear relationship between y and x is quite good and is globally unbiased; i.e., the data are spread evenly on both sides of the best-fit line. Careful examination shows, however, that the relationship is locally biased. The value of y is consistently underpredicted with the linear relationship at low values of x and consistently overpredicted at high values of x. This type of local bias may not allow correct prediction of y values given the value of x. If such a relationship is observed, the sample data must be divided further into different subsets, and different correlations must be established for different regions.

Field Example 2.6. For this example, we use 57 porosity and permeability core values gathered from Well 34-26 to investigate the relationship between log k and φ for the well. Fig. 2.19 shows a plot of log k vs. φ and the best-fit line for Well 34-26. The relationship for the best-fit line is given by log k = 18.567φ − 2.048, where φ is in decimal fractions and k is in md. The correlation coefficient is 0.934, and the rank correlation coefficient is 0.973. The small difference between the two correlation coefficients indicates that there is not a significant number of outlier values.

The high values of the correlation coefficient indicate that the fit is reasonable; however, local bias does exist in this relationship because several geological units are combined together in the vertical direction to develop a single relationship. We can remove this type of local bias if we can separate the data into several distinct geological units and develop a relationship for each individual unit.
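Eqs. 2.19a and 2.19b are equally easy to evaluate in code. The Python sketch below, again only an illustration, fits the Numerical Example 2.5 line with y = log k and x = φ; it reproduces m = 0.1636 and b = −1.652.

phi = [22.69, 23.3, 23.81, 25.54, 26.79, 27.65, 27.69, 28.74, 29.49]
log_k = [2.037, 2.14, 2.22, 2.559, 2.725, 2.915, 3.006, 3.025, 3.063]

n = len(phi)
x_bar = sum(phi) / n                                               # 26.19
y_bar = sum(log_k) / n                                             # 2.6322
cov = sum(x * y for x, y in zip(phi, log_k)) / n - x_bar * y_bar   # ~0.8875
var_x = sum(x * x for x in phi) / n - x_bar ** 2                   # ~5.424

m = cov / var_x                      # ~0.1636 (Eq. 2.19a)
b = y_bar - m * x_bar                # ~-1.652 (Eq. 2.19b)
print(f"log k = {m:.4f} phi + {b:.4f}")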


Fig. 2.19—Local bias in log k/φ relationship for Well 34-26.

Fig. 2.20—Log k vs. φ for Flow Unit 3.

Field Example 2.7. This example investigates the log k vs. φ relationship for Flow Unit 3. Once the flow units at each well were identified, all cored data for that unit were collected together to develop such a relationship. Fig. 2.20 shows a plot of log k vs. φ and the best-fit line for Flow Unit 3. The correlation coefficient is 0.927, and the rank correlation coefficient is 0.900. The equation for the best-fit line is given by log k = 0.158φ − 1.457, where porosity is in percent and permeability is in millidarcies.

There is very little local bias in this relationship, indicating good local and global fits. A close examination of the plot reveals that one sample data point can be considered an outlier datum. This corresponds to a porosity value of 10% and a log k of 1.924. Removing this data point and recalculating the best-fit line improves the correlation coefficient to 0.938. As expected, the rank correlation coefficient remains the same. The best-fit equation changes slightly. Note that the decision to remove a particular data point from the best-fit correlation is a subjective one. It should be done only after careful examination of the data set to ensure that such a data point is indeed erroneous and can be removed.

Bivariate Relationships for Spatial Data. In the previous sections, we examined the relationship between two variables and showed that covariance can be used as a statistical tool to quantify such a relationship. This section briefly introduces the application of bivariate summary statistics to spatial data sets. Chap. 3 discusses such relationships in detail.

An important distinction when establishing a bivariate relationship for spatial data is that the same variable is examined but at different spatial locations. It also is possible to develop a relationship between two different variables at different locations; Chap. 3 also discusses this.

Numerical Example 2.6. Table 2.9 shows porosity data collected from a vertical well at uniform intervals of 1 ft. Establish a relationship between porosity values at different locations as functions of the distance between those values.

TABLE 2.9—POROSITY DATA FOR NUMERICAL EXAMPLE 2.6

Depth (ft)    Porosity (%)
2,040         8.25
2,041         9.00
2,042         6.25
2,043         5.00
2,044         5.30
2,045         4.75
2,046         5.00

Solution. Although a lot more data typically are collected from a vertical well, we consider only seven data points for this example. The data are collected at uniform intervals. Recall that the covariance relationship (Eq. 2.15) states that

c(x, y) = (1/n) Σ_{i=1}^{n} x_i y_i − x̄ ȳ. ................................... (2.15)

We use the same relationship, except that the x and y variables are the same variable at different locations. For example, if we denote variable x(u) as a value of x at Location u and a variable x(u + L) as a value of x at Location u + L, we can write Eq. 2.15 as

c[x(u), x(u + L)] = (1/n) Σ_{i=1}^{n} x(u_i)x(u_i + L) − [(1/n) Σ_{i=1}^{n} x(u_i)][(1/n) Σ_{i=1}^{n} x(u_i + L)], ................................... (2.20)

where L = distance between the two variables, also called the lag distance. Note that the difference between the two variables in Eq. 2.20 is the lag distance. Eq. 2.20 still contains a term n, which represents the number of pairs. In this example, n = the number of pairs located a distance L apart.

Fig. 2.21—Pairs for Numerical Example 2.6:

Depth (ft)    φ(u)    φ(u + 1 ft)    φ(u + 2 ft)
2,040         8.25    9.00           6.25
2,041         9.00    6.25           5.00
2,042         6.25    5.00           5.30
2,043         5.00    5.30           4.75
2,044         5.30    4.75           5.00
2,045         4.75    5.00
2,046         5.00

To apply Eq. 2.20 to our data set for a lag distance of 1 ft, we first need to find how many pairs are 1 ft apart. Fig. 2.21 shows that, for a lag distance of 1 ft, we can gather six pairs; for a lag distance of 2 ft, we can gather five pairs; and so forth. With Eq. 2.20, we can calculate the covariance for a lag distance of 1 ft with n = 6.

c[φ(u), φ(u + 1)] = (1/6)[8.25 × 9.0 + 9.0 × 6.25 + ... + 4.75 × 5.0] − (1/6)[8.25 + 9.0 + ... + 4.75] × (1/6)[9.0 + 6.25 + ... + 5.0] = 1.73.

For covariance, we can simply write the left side of the equation as c(1) because it reflects the covariance for a lag distance of 1 ft. Covariance for a lag distance of 2 ft is calculated in the same way. There are five pairs at that lag distance.

c(2) = (1/5)[8.25 × 6.25 + 9.0 × 5.0 + ... + 5.3 × 5.0] − (1/5)[8.25 + 9.0 + ... + 5.3] × (1/5)[6.25 + 5.0 + ... + 5.0] = 0.43.

There are four pairs for a lag distance of 3 ft, and the same equation is used to calculate c(3) = 0.195. The values of the correlation coefficient at various lag distances can be calculated similarly. Recall that Eq. 2.16 states that

r(x, y) = c(x, y)/(s_x s_y). ................................... (2.16)

For spatial data sets,

r[x(u), x(u + L)] = c[x(u), x(u + L)]/[s_x(u) s_x(u+L)]. ................................... (2.21)

As in the case of covariance, the correlation coefficient can be written simply as a function of the lag distance. That is, Eq. 2.21 can be written as

r(L) = c(L)/[s_x(u) s_x(u+L)], ................................... (2.22)

where both the correlation coefficient and the covariance are functions of lag distance. For a lag distance of 1 ft, we can calculate s_x(u) by calculating the variance of all data points used as a first data point in a given pair. The mean is

x̄(u) = (1/6)[8.25 + 9.0 + 6.25 + 5.0 + 5.3 + 4.75] = 6.425,

and the variance is

s²_x(u) = (1/6)[8.25² + 9.0² + ... + 4.75² − 6 × 6.425²] = 2.6823.

Therefore, s_x(u) = 1.638. Similarly, for the second data point in each pair, s_x(u+1) = 1.474. Using Eq. 2.22 gives

r(1) = c(1)/[s_x(u) s_x(u+1)] = 1.73/(1.638 × 1.474) = 0.7165.

Similarly, for lag distances of 2 and 3 ft, respectively,

r(2) = c(2)/[s_x(u) s_x(u+2)] = 0.43/(1.595 × 0.525) = 0.5135,

and r(3) = c(3)/[s_x(u) s_x(u+3)] = 0.195/(1.586 × 0.195) = 0.631.

A special case exists when the covariance and correlation-coefficient values are estimated at a lag distance of zero. At L = 0, the equation for covariance reduces to the corresponding equation for variance.

c(0) = (1/n) Σ_{i=1}^{n} x(u_i)x(u_i) − [(1/n) Σ_{i=1}^{n} x(u_i)]² = s²_x(u).

In our example, n = 7 for L = 0, which gives

c(0) = s²_x(u) = 2.548.

We can easily show that r(0) = 1 because a perfect relationship exists between x(u) and x(u); they are identical. Also mathematically, from Eq. 2.22,

r(0) = c(0)/[s_x(u) s_x(u)] = s²_x(u)/s²_x(u) = 1.0.

Although the covariance and, therefore, the correlation-coefficient value can be calculated for lag distances of 4, 5, and 6 ft with our data, we stopped at a 3-ft lag distance primarily because the number of pairs decreases as lag distance increases. This makes estimates of covariance and correlation coefficient less reliable. It is similar to estimating a best-fit line with three data points. We can fit a line but may not trust it as much because of the limited data. Chap. 3 discusses the importance of having sufficient pairs for a given lag distance in more detail. Obviously, however, a sufficient number of pairs is necessary for a reliable estimate of a spatial relationship.
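The lag calculations of this example lend themselves to a short program. The Python sketch below is illustrative only (lag_stats is not a library routine); it applies Eqs. 2.20 and 2.22 to the Table 2.9 porosity profile and reproduces c(0) = 2.548, c(1) = 1.73, r(1) = 0.72, and the 2- and 3-ft lag values.

por = [8.25, 9.00, 6.25, 5.00, 5.30, 4.75, 5.00]   # 2,040 to 2,046 ft

def lag_stats(x, L):
    # Pair every sample with the one L feet deeper, then apply Eq. 2.20.
    head = x[:len(x) - L] if L else x[:]   # first value of each pair, x(u)
    tail = x[L:]                           # second value, x(u + L)
    n = len(head)
    m1, m2 = sum(head) / n, sum(tail) / n
    c = sum(a * b for a, b in zip(head, tail)) / n - m1 * m2
    s1 = (sum(a * a for a in head) / n - m1 * m1) ** 0.5
    s2 = (sum(b * b for b in tail) / n - m2 * m2) ** 0.5
    return c, c / (s1 * s2)                # Eq. 2.20 and Eq. 2.22

for L in range(4):
    c, r = lag_stats(por, L)
    print(f"L = {L} ft: c({L}) = {c:.4f}, r({L}) = {r:.4f}")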

Fig. 2.22—Effect of lag distance on covariance and correlation coefficient.

Fig. 2.23—Covariance and correlation coefficient for porosity data in Well 34-29.
If estimated covariance and correlation-coefficient values are plotted as functions of lag distance (Fig. 2.22), both the covariance and the correlation coefficient decrease as functions of lag distance. This trend occurs because of the way in which most geoscience data are distributed. These types of data become significantly more similar as lag distance decreases. Recall that the correlation coefficient is a measure of how closely two variables are related. For a perfect relationship, the correlation coefficient is equal to one. As the relationship gets weaker, the correlation-coefficient value approaches zero. Starting at L = 0, the correlation coefficient equals one. As lag distance increases, the neighboring values become increasingly dissimilar, which is reflected in progressively smaller values of covariance as well as correlation coefficient. The correlation coefficient for a lag distance of 3 ft is slightly higher than the one for a lag distance of 2 ft. This may be because of the limited number of pairs used for these calculations. The covariance, however, shows the expected behavior; as the lag distance increases, covariance decreases.

This type of spatial relationship can be quantified by another method called a variogram. Chap. 3 discusses the variogram, its computation, and applications in detail.

Field Example 2.8. This example uses data from Well 34-29 collected vertically at uniform intervals of 1 ft. Using the procedure established in Numerical Example 2.6, we can calculate the covariance and correlation coefficient for the vertical data; Fig. 2.23 shows plots of both. As expected, the estimated values show a decreasing trend, eventually reaching a constant value at a lag distance of approximately 15 ft. Chap. 3 explains the significance of a constant value.

Fig. 2.24, which plots the first value of the pair vs. the second value of the pair for lag distances of 0, 1, 4, and 10 ft, reiterates what Fig. 2.23 tries to quantify. Recall that the correlation coefficient indicates how well two variables relate to each other. For a perfect relationship, data pairs should be identical, which is possible only at L = 0. Note that, as lag distance increases, the deviation from the 45° line increases, which is reflected in smaller correlation-coefficient values. At L = 10 ft, hardly any correlation exists, which is reflected by a correlation-coefficient value close to zero.

This particular behavior of the correlation coefficient or covariance is very useful in geostatistical analysis. In this section, we introduced the type of behavior one should expect in analyzing spatial data. Chap. 3 discusses the details regarding the best ways to capture the spatial relationships and the modeling of such relationships.

2.3 Inferential Statistics

Inferential statistics is a logical extension of descriptive statistics. Descriptive statistics most often deals with analyzing sample data sets. However, from the characteristics of the sample, conclusions (inferences) can be drawn about the population from which the sample was taken. Inferential statistics is the class of statistical techniques that deals with these types of problems. Only in use for approximately 80 years, inferential statistics is a relatively new branch of statistics. It is, however, a much more useful branch for petroleum-engineering-related problems than descriptive statistics because most problems that petroleum engineers deal with involve inferences and decision-making about unsampled locations in the reservoir.

We cover only topics related to inferential statistics that are necessary to understand the geostatistical principles. It is beyond the scope of this book to cover many of the details about inferential statistics; these are readily available from standard statistics books. Instead, we concentrate on covering some of the essential details that lay a foundation for understanding geostatistical principles and practice.

Sec. 2.3.1 covers the definition of a random experiment. Sec. 2.3.2 briefly reviews set-theory principles. Sec. 2.3.3 defines probability and the rules related to it. Sec. 2.3.4 discusses probability and cumulative-distribution functions. Sec. 2.3.5 presents the principles related to expected value and its applications. Sec. 2.3.6 covers some of the basic continuous-distribution functions that are widely used. Finally, Sec. 2.3.7 briefly describes some of the desirable characteristics of inference. We attempt to keep the discussion as simple as possible because of the range of the topics covered. Many concepts have to be introduced through mathematical equations, and we try to provide some intuitive feel for the equations used. Appendix B provides details about the equations, and the references can provide further aid in understanding the concepts.

2.3.1 Random Experiment. There is no rigorous mathematical definition for a random experiment. However, conceptually, it can be defined as an experiment whose outcome cannot be predicted with certainty in advance. Obviously, a random experiment has to result in more than one possible outcome. Another characteristic of a random experiment is that it can be repeated under controlled (unchanged) conditions with the outcomes appearing in a random manner.

An often-cited example in statistics books of a random experiment is the tossing of a coin.

Fig. 2.24—Scatter plots for porosity data for Well 34-29: (a) L = 0 ft, (b) L = 1 ft, (c) L = 4 ft, and (d) L = 10 ft.

Tossing a coin results in more than one outcome: heads or tails. If the coin is true, we cannot predict the outcome with certainty, and we may be able to repeat the experiment under the same conditions so that the outcomes appear (heads or tails) in a random manner. Therefore, all characteristics of a random experiment are satisfied, and we can consider tossing a coin a random experiment.

Another simple example of a random experiment is rolling a pair of dice. This results in multiple outcomes (36 possibilities, to be exact), we cannot predict the outcome with certainty, and we can repeat the experiment under controlled conditions so that the outcomes appear in a random order. As in the case of tossing a coin, this satisfies all characteristics of a random experiment.

An example that is more difficult to justify as a random experiment, but closer to what we are interested in, is drilling a new well in a reservoir. It can be argued, with some justification, that drilling a well should not be considered a random experiment because, once a well location is chosen, drilling the well results in only one possible outcome, not multiple outcomes. The porosity and permeability values are distributed vertically in one particular way, the well is going to produce at a particular rate, and the well-test results (with a particular model) may provide a unique set of reservoir properties. If this is the case, why should drilling a well be considered a random experiment? If there is only one outcome, then drilling a well should be treated as a deterministic experiment where only one outcome is possible.

This argument has been used in the past to justify the use of a deterministic model to describe a reservoir. In a deterministic model, everything is known with certainty. The argument in favor of this is that each reservoir is unique and already defined. Although intuitively appealing, some problems are associated with this argument. First, understand that a deterministic model is the most desirable model to describe the reservoir. If we can describe every aspect of the reservoir with certainty, we do not need to know geostatistics or any other statistical principle to describe the reservoir. The question is whether we can do it.

In principle, with sufficient knowledge about the genesis of the reservoir, we can apply a deterministic model to describe the reservoir. This requires knowledge about how the reservoir was formed, under what circumstances the oil was trapped, what type of temperatures and pressures prevailed at the time the hydrocarbons were created, what type of geological processes took place subsequent to hydrocarbon presence, what chemical interactions took place after the hydrocarbons were trapped, and possibly some other factors.

If we understand all these processes in sufficient detail and have the ability to put together a gigantic, complete mathematical model and to solve that model, we arrive at the deterministic reservoir description. The model will be able to predict all the necessary reservoir characteristics, including its size, shape, volume, physical properties, and the chemical properties of the hydrocarbons. This description has no uncertainty. We will be able to drill the wells in the most optimal way and produce the reservoir in the most efficient manner.

If this sounds impossible, it is because it is. What is fundamentally lacking in describing the reservoir in a deterministic manner is our knowledge. It is true that the reservoir is deterministic and that all the properties are uniquely defined. However, our understanding of the processes that resulted in these properties is so inadequate that they appear to be random. This does not imply that the processes are random, but simply that we have inadequate knowledge about them.

Therefore, should drilling a new well be considered a random experiment? Compare the characteristics of drilling a well with the characteristics of a random experiment. The first characteristic of a random experiment is that the outcome cannot be predicted with certainty. Because of our lack of knowledge, we do not know with certainty what the outcome of drilling a new well will be. The second characteristic is that a random experiment can result in several possible outcomes. Our lack of knowledge before drilling results in alternative outcomes; the most obvious is whether the well will be a producer or a dry hole. The third characteristic of a random experiment is that it can be repeated under controlled conditions so that the outcomes appear in a random manner. Although this may be a stretch of the imagination, drilling a new well is a new experiment, and the outcome every time will appear in a random manner. Drilling each new well is based on prior experience; therefore, we have a little more information about the reservoir than we had before. As a result, we may be able to predict the possible outcomes with a little more confidence. However, because the outcome still cannot be predicted with complete certainty, the experiment is still considered random.

The concept of a random experiment is extremely important. However, drilling a well as a random experiment is different from tossing a coin as a random experiment. Tossing a coin results in either of two outcomes; therefore, it is a random experiment. Drilling a well results in only one outcome; however, our lack of knowledge does not allow us to predict that outcome with certainty. Therefore, we treat it as a random experiment with multiple possible outcomes.

This concept can be extended to the use of geostatistical principles to estimate values at unsampled locations. Although the properties at each unsampled location are uniquely defined, lack of knowledge does not allow us to predict them with certainty. Therefore, our prediction is a random experiment with several possible outcomes. Later chapters that cover the application of geostatistics discuss this in more detail.

To summarize, a random experiment can result in multiple outcomes, none of which can be predicted with certainty. When lack of knowledge prevents us from predicting a single outcome with certainty, we also treat that experiment as a random experiment.

2.3.2 Sample Space and Events. This section briefly reviews the principles of set theory. For convenience, rolling a die illustrates some of these principles. We already saw that rolling a die is a random experiment.

A sample space, S, is a set of all possible outcomes. For a rolling-a-die experiment, we can denote the sample space as

S = (1, 2, 3, 4, 5, 6) ................................... (2.23)

because the experiment has six possible outcomes.

An event is defined as a set consisting of some of the possible outcomes. If Event A consists of all the even-numbered outcomes of the rolling-a-die experiment, then Event A is

A = (2, 4, 6). ................................... (2.24)

If Event B consists of all the outcomes less than five for the rolling-a-die experiment, then Event B is

B = (1, 2, 3, 4). ................................... (2.25)

Using only two events, A and B, of a sample space, we can define the union of these two events, A ∪ B, as consisting of all the outcomes present in either A or B. Therefore,

A ∪ B = (1, 2, 3, 4, 6). ................................... (2.26)

Similarly, using Events A and B, we can define the intersection of these two events, A ∩ B, as consisting of all outcomes that are present in both A and B. Therefore,

A ∩ B = (2, 4). ................................... (2.27)

If the intersection of two events results in a null set (containing no outcome), these two events are mutually exclusive. For example, if Event C contains all the odd-numbered outcomes from a rolling-a-die experiment, then

C = (1, 3, 5). ................................... (2.28)

With the definition of intersection of events,

A ∩ C = Φ, ................................... (2.29)

where Φ = a null set and A is as defined by Eq. 2.24. Because A ∩ C contains no outcomes, A and C are considered mutually exclusive.

The best way to illustrate some of these definitions is with Venn diagrams. Fig. 2.25 shows examples of Venn diagrams for union and intersection of events within a sample space. The shaded region indicates the resulting event.

Fig. 2.25—Venn diagram.

2.3.3 Probability. Probability normally is associated with a particular event of a random experiment. A geologist's statement that "there is 30% probability of finding oil at a location where a new well is to be drilled" can have two meanings. Both meanings are correct and are a result of the way we define the random experiment.

1. The first interpretation is that the geologist believes that, in reservoirs with a similar depositional environment, 30% of the wells will produce oil. That is, if several wells are drilled in very similar depositional environments, 30% will produce oil and, by inference, 70% will be dry holes.

2. The second interpretation is that the 30% probability is a measure of the geologist's subjective belief that the well will produce oil.

These two interpretations can be directly related to the description of random experiments. The first interpretation is closely related to the random experiment of rolling a die. That is, if we repeat the experiment a large number of times under controlled conditions, a pattern emerges about the outcomes. For example, for a true die, if we roll the die a large number of times, we observe that each of the six outcomes is equally likely. Under this interpretation, probability can be defined as

p(A) = n_A/n_c, ................................... (2.30)

where p(A) = probability of Event A, n_A = number of times the outcome has occurred, and n_c = number of times the random experiment is conducted under controlled conditions. n_c has to be large to ensure that a correct pattern is captured. For the rolling-a-die experiment, the probability of each of the six outcomes is 1/6, or 0.1667. Going back to the geologist's statement, if we drill a large number of wells in a similar depositional environment, we observe that p(P), where P = producer, is 30%; that is, 30% of the wells should produce oil.

Interpretation 2 is closely related to the random experiment of drilling a well. The geologist is simply using a subjective belief about the chance of success. Uncertainty exists because of a lack of complete knowledge about the reservoir. However, the geologist is using his/her partial knowledge to assign a value to the probability of success.

Both interpretations are correct, and the mathematics of probability does not change with the interpretation applied. Deterministic events can be treated as random events if we lack sufficient knowledge about those events; however, with partial knowledge about the events, probabilities can be assigned to the likely outcomes (events) of that experiment.

Laws of Probability. On the basis of our definition of probability, we can write three basic laws related to probability. For Event A of a random experiment with Sample Space S,

0 ≤ p(A) ≤ 1. ................................... (2.31)

That is, the probability value can never be less than zero or greater than one. Also,

p(S) = 1. ................................... (2.32)

That is, the probability that the outcome will be part of the sample space is equal to one. Finally,

Σ_{i=1}^{n_e} p(A_i) = p(∪_{i=1}^{n_e} A_i), ................................... (2.33)

where A_i = a sequence of mutually exclusive events and n_e = number of mutually exclusive events. For n_e = 2,

p(A_1) + p(A_2) = p(A_1 ∪ A_2), ................................... (2.34)

and for n_e = 3,

p(A_1) + p(A_2) + p(A_3) = p(A_1 ∪ A_2 ∪ A_3). ................................... (2.35)

That is, the probability of the union of mutually exclusive events is equal to the addition of the probabilities of the individual events.

For two events that are not mutually exclusive,

p(A ∪ B) = p(A) + p(B) − p(A ∩ B). ................................... (2.36)

Numerical Example 2.7. The following three events are defined for a rolling-a-die experiment.

1. Event A. All even-numbered outcomes.
2. Event B. All outcomes greater than 3.
3. Event C. All odd-numbered outcomes less than 4.

Calculate p(A), p(B), p(C), p(A ∪ B), p(A ∪ C), and p(A ∩ B).

Solution. From the description of the events, A = (2, 4, 6), B = (4, 5, 6), and C = (1, 3). Knowing that the probability of each individual outcome for rolling a die is 1/6, we can calculate the probability of individual events on the basis of Eq. 2.35 as

p(A) = 1/6 + 1/6 + 1/6 = 1/2.

If we consider Event A to consist of three mutually exclusive events with Outcomes 2, 4, and 6, then the probability of the event is the addition of the probabilities of these three events. Similarly,

p(B) = 1/6 + 1/6 + 1/6 = 1/2,

and p(C) = 1/6 + 1/6 = 1/3.

Because A ∩ C = Φ,

p(A ∪ C) = p(A) + p(C) ................................... (2.37)
= 1/2 + 1/3 = 5/6,

which can be confirmed because

A ∪ C = (1, 2, 3, 4, 6).

Therefore,

p(A ∪ C) = 5/6.

Also, A ∩ B = (4, 6),

which results in

p(A ∩ B) = 2/6 = 1/3.

Using Eq. 2.36 gives

p(A ∪ B) = p(A) + p(B) − p(A ∩ B) = 1/2 + 1/2 − 1/3 = 2/3,

which is confirmed because

A ∪ B = (2, 4, 5, 6);

therefore,

p(A ∪ B) = 4/6 = 2/3.
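Because events are sets, these manipulations map directly onto a few lines of code. The Python sketch below is an illustration only; it rebuilds the events of Numerical Example 2.7 with built-in set and Fraction types and checks the union rule of Eq. 2.36.

from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}                       # sample space, Eq. 2.23
A = {x for x in S if x % 2 == 0}             # even outcomes
B = {x for x in S if x > 3}                  # outcomes greater than 3
C = {x for x in S if x % 2 == 1 and x < 4}   # odd outcomes less than 4

def p(event):
    # Equally likely outcomes: p(E) = |E| / |S|.
    return Fraction(len(event), len(S))

print(p(A), p(B), p(C))                  # 1/2 1/2 1/3
print(p(A | C), p(A) + p(C))             # 5/6 both ways: A and C are mutually exclusive
print(p(A | B), p(A) + p(B) - p(A & B))  # 2/3 both ways, Eq. 2.36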


Conditional Probability. As the name indicates, conditional probability is the probability of an event that is conditional on some information. This allows calculation of the probability of a given event when partial information regarding the result of the random experiment is available.

For example, rolling a pair of dice has 36 possible outcomes: (1,1), (1,2), ..., (1,6), ..., (3,3), ..., (6,6). The first number in the parentheses is the outcome of the first die, and the second number in the parentheses is the outcome of the second die. Because each die can take six possible values, the total number of outcomes is 36. If asked to calculate the probability that the addition of the outcomes is two, we immediately see that the only outcome that results in such an addition is (1,1). Therefore, the probability that the addition will be two is 1 out of 36, or 1/36. If the first die lands on Side 1, what is the probability now that the addition of the two outcomes is two? We can calculate this probability easily because we know that the outcome of one of the dice is one. The second die can result in six possible outcomes. We want to calculate the probability that the outcome will be one because that is the only outcome that will result in an addition of two when rolling both dice. The probability that the outcome of the second die will be one is one out of six, or 1/6. Therefore, the conditional probability of the addition of the outcomes of a pair of dice being two if the first die landed on Side 1 is 1/6. This value is considerably larger than 1/36, which is what we calculated without any additional information.

The most common notation used to describe conditional probability is p(A|B). This indicates the conditional probability of Event A occurring given that Event B has occurred. A general equation for calculating conditional probability is

p(A|B) = p(A ∩ B)/p(B). ................................... (2.38)

Intuitively, Eq. 2.38 can be explained as follows. Once Event B has occurred, we reduce the sample space to Event B only because no outcome outside Event B is possible. For Event A to occur, the outcome has to be common to both Events A and B because Event B has occurred. That is, it has to come from A ∩ B. Therefore, the probability of Event A occurring given that Event B has occurred is equal to the probability of Events A and B both occurring divided by the probability of Event B.

Numerical Example 2.8. The probability of finding oil in an exploration well is estimated to be 0.2. One of the uncertainties in finding oil in this well is the presence of source rock. The probability of the presence of source rock is 0.7. After these preliminary calculations were made, a well drilled in a nearby area confirmed the presence of source rock in the region. What is the probability of finding oil in the first exploration well given that the presence of source rock is confirmed?

Solution. Let A be the event that oil is found in the exploration well, and B be the event that the source rock is present. Using Eq. 2.38 gives

p(A|B) = p(A ∩ B)/p(B).

We know that p(B) = 0.7 and p(A ∩ B) = 0.2 (because A is a subset of B); we have to have source rock to find oil in the well. Substituting gives

p(A|B) = 0.2/0.7 = 0.286.

The probability of finding oil improves to 0.286 because source rock is present.

We can rewrite Eq. 2.38 as

p(A ∩ B) = p(A|B)p(B). ................................... (2.39)

Eq. 2.39 allows us to derive a few more conclusions. First, for mutually exclusive events, (A ∩ B) is a null set; therefore, p(A ∩ B) = 0. Because the left side of Eq. 2.39 is zero, p(A|B) = 0, which is consistent with the idea that, if B has occurred, A cannot occur. Therefore, the probability of A occurring given that B has occurred is zero. The same definition can be used to define independent events, which are events whose occurrences do not affect each other. The occurrence of an independent event is not affected by whether any of the events from which it is independent has occurred. For example, in rolling a pair of dice, the outcome of one die does not affect the outcome of the other die. The outcomes of the two dice are completely independent of each other. In other words,

p(A|B) = p(A) and p(B|A) = p(B), ................................... (2.40)

where the probability that Event A will occur is not affected by the fact that B has occurred. The same thing can be said about Event B: the probability of Event B occurring is not affected by the fact that Event A has occurred.

Substituting Eq. 2.40 into Eq. 2.39 gives

p(A ∩ B) = p(A)p(B) ................................... (2.41)

for independent events. For more than two events, Eq. 2.41 can be extended as

p(A_1 ∩ A_2 ∩ ... ∩ A_n) = p(A_1)p(A_2)...p(A_n), ................................... (2.42)

where A_1, A_2, ..., A_n = independent events.

Numerical Example 2.9. The probability of success is estimated to be 0.2 for an exploration well in Basin 1. For another exploration well in Basin 2, the probability of success is estimated to be 0.3. If both wells are drilled, what is the probability that both will be successful?

Solution. Because these wells are drilled in different basins, they can be considered independent events; the outcome of one well does not affect the outcome of the other. If Event A is a successful well in Basin 1 and Event B is a successful well in Basin 2,

p(A) = 0.2 and p(B) = 0.3.

With Eq. 2.41,

p(A ∩ B) = p(A)p(B) = 0.2 × 0.3 = 0.06.

The probability that both events will occur is 0.06, or 6%.

Another useful extension of Eq. 2.38 can be written. Recall that Eq. 2.38 states that

p(A|B) = p(A ∩ B)/p(B). ................................... (2.38)
Fig. 2.26—Explanation of Bayes' rule.

With a sample space consisting of A_i mutually exclusive events so that

Σ_{i=1}^{n_e} p(A_i) = 1,

we can easily write

p(B) = p(A_1 ∩ B) + p(A_2 ∩ B) + ... + p(A_{n_e} ∩ B). ................................... (2.43)

In Fig. 2.26, which illustrates this, the sample space is divided into four mutually exclusive events, A_1 through A_4, and Event B is located in the sample space. As the figure shows, Event B can be written as

B = A_1 ∩ B + A_2 ∩ B + A_3 ∩ B + A_4 ∩ B. ................................... (2.44)

This can be generalized for n_e mutually exclusive events. Also, the numerator of Eq. 2.38 can be written as

p(A ∩ B) = p(B|A)p(A) ................................... (2.45)

because we know that

p(B|A) = p(A ∩ B)/p(A). ................................... (2.38)

Using Eqs. 2.45 and 2.43 along with Eq. 2.38 gives

p(A_i|B) = p(B|A_i)p(A_i)/Σ_{i=1}^{n_e} p(A_i ∩ B) ................................... (2.46a)

or p(A_i|B) = p(B|A_i)p(A_i)/Σ_{i=1}^{n_e} p(B|A_i)p(A_i). ................................... (2.46b)

Eq. 2.46b is Bayes' theorem, which represents a generalized equation for conditional probability. An entire branch of statistics, Bayesian statistics, is based on the concept of determining conditional probabilities. The discussion of the conditional-simulation technique covers the usefulness of this principle in more detail. Numerical Example 2.10 illustrates Bayes' theorem.

Numerical Example 2.10. A 3D geophysical survey concludes that a new area has three potential regions where oil can be found. It is equally likely that oil could be found in any of the three regions. On the basis of the surrounding regions and the presence of source rock, it is possible for only one of the regions to contain oil. Geophysicists are certain that one of these regions should have commercial reserves. If only one exploration well is drilled in a region, there is a 40% chance that oil will not be discovered in that region although oil may indeed be present (overlook probability). If one well is drilled in Region 1 and is unsuccessful, what is the probability that oil is present in Region 1? What is the probability that the oil is present in the other two regions?

Solution. Let E_i, where i = 1, 2, 3, be an event that oil is in Region i. Let F be an event that the search of Region 1 is unsuccessful. With Bayes' theorem,

p(E_1|F) = p(E_1 ∩ F)/p(F) ................................... (2.47a)

and p(E_1 ∩ F) = p(F|E_1)p(E_1). ................................... (2.47b)

The denominator of Eq. 2.47a can be written as

p(F) = Σ_{i=1}^{3} p(F|E_i)p(E_i) = p(F|E_1)p(E_1) + p(F|E_2)p(E_2) + p(F|E_3)p(E_3). ................................... (2.48)

Each region has an equal likelihood of success; therefore,

p(E_1) = p(E_2) = p(E_3) = 0.333.

Also, p(F|E_1) = 0.4, p(F|E_2) = 1, and p(F|E_3) = 1 because the exploration well in Region 1 results in failure if the oil is in either Region 2 or 3. Substituting all these values gives

p(E_1|F) = (0.4 × 0.33)/(0.4 × 0.33 + 1 × 0.33 + 1 × 0.33) = 0.167.

The probability of finding oil in Region 1 is reduced to 0.167. Similarly,

p(E_2|F) = p(E_3|F) = (1 × 0.33)/(0.4 × 0.33 + 1 × 0.33 + 1 × 0.33) = 0.417.

The probability of finding oil in Regions 2 and 3 improved to 0.417 because the search in Region 1 was unsuccessful.
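Bayes' theorem is often easiest to audit numerically. The short Python sketch below is illustrative only; it encodes the priors and the overlook probability of Numerical Example 2.10 and evaluates Eqs. 2.48 and 2.46b.

prior = [1 / 3, 1 / 3, 1 / 3]    # p(E_1), p(E_2), p(E_3)
likelihood = [0.4, 1.0, 1.0]     # p(F|E_i); F = dry well in Region 1

p_F = sum(l * p for l, p in zip(likelihood, prior))           # Eq. 2.48
posterior = [l * p / p_F for l, p in zip(likelihood, prior)]  # Eq. 2.46b

print([round(x, 3) for x in posterior])   # [0.167, 0.417, 0.417]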


2.3.4 Random Variables. A random variable is a variable whose values are generated by a random experiment on the basis of some probabilistic function. For example, the rolling-a-die experiment produces any one of the six possible outcomes randomly. If we denote the random variable by the letter X for this random experiment, then, for a true die,

p(X = 1) = p(X = 2) = p(X = 3) = p(X = 4) = p(X = 5) = p(X = 6) = 1/6. ................................... (2.49)

That is, the probability that the random variable takes any one of the six values is 1/6.

It is important to maintain the distinction between a random variable and an actual outcome of a random variable. To make this distinction, we use an uppercase letter to denote a random variable (e.g., X) and a lowercase letter to denote the outcome (or realization) of a random variable (e.g., x). Conceptually, the difference between the random variable and its realizations can be explained with the same example of the rolling-a-die experiment. The random variable can take any of the six values. If the experiment generates the outcomes 1, 2, 5, 6, 3, 4, 3, 3, 4, 6, 1, 4, 2, ..., these are the realizations of the random variable, which we can denote as x_1 = 1, x_2 = 2, x_3 = 5, and x_4 = 6, where the subscripts denote the number of a particular realization.

Random variables are defined as two types: discrete and continuous. Discrete random variables can take a finite number of values. An example is the rolling-a-die experiment, where a random variable can take only six possible values. Continuous random variables can take a very large number of values (for example, a collection of porosity data from a reservoir), and a large number of outcomes are possible. The following two sections describe some functions of both discrete and continuous random variables.

Probability Function. The probability function describes the probability that a random variable will take a certain value. The probability function is closely related to the relative-frequency-distribution function, which describes the chance that a value will fall within a certain class. The probability function describes an essentially similar behavior.

For discrete random variables, the probability mass function, P(a), of a random variable X is

P(a) = p(X = a). ................................... (2.50)

For a discrete random variable, X can take a finite number of values x_i, i = 1, ..., n. Therefore,

Σ_{i=1}^{n} P(x_i) = 1. ................................... (2.51)

Numerical Example 2.11. Define the probability mass function for the rolling-a-die experiment. Show that Eq. 2.51 is satisfied for this function.

Solution. Knowing the random experiment, we can write, for example,

P(1) = p(X = 1) = 1/6.

Similarly, we can write

P(2) = P(3) = P(4) = P(5) = P(6) = 1/6.

Applying Eq. 2.51 gives

Σ_{i=1}^{6} P(x_i) = 6 × (1/6) = 1.

Therefore, Eq. 2.51 is satisfied.

For a continuous random variable, the probability density function, f(x), describes the behavior of a random variable. Fig. 2.27 shows a representative probability density function. One requirement of the probability density function is that the area under the curve be equal to one. Mathematically,

∫_{−∞}^{+∞} f(x)dx = 1. ................................... (2.52)

Fig. 2.27—Probability density function.

Recall that the addition of relative-frequency distributions also adds to one. (Fig. 2.3 provides a good understanding of the definition of the probability density function.) Once the probability density function is described, the probability that the value of a random variable will fall within a certain interval can be calculated easily. For example,

p(a < X ≤ b) = ∫_{a}^{b} f(x)dx. ................................... (2.53)

Schematically, the probability that a value will fall within a certain interval is represented by the area under the curve within that interval (Fig. 2.28). Again, it is important to have an intuitive feel for this equation (see Fig. 2.4 and the related discussion to understand the probability density function and its close relationship to the relative-frequency distribution).

Fig. 2.28—Probability of value falling within an interval.

Numerical Example 2.12. Pay-zone thickness in a reservoir is described by the following probability density function.

f(x) = 0 for x ≤ 20 ft,
f(x) = 1/50 for 20 ft < x ≤ 70 ft,
and f(x) = 0 for x > 70 ft.

Show that this density function satisfies Eq. 2.52. Further, calculate the probability that the thickness at a particular location will fall between 30 and 50 ft.

Solution. Applying Eq. 2.52 to the probability density function gives

∫_{−∞}^{20} 0 dx + ∫_{20}^{70} (1/50)dx + ∫_{70}^{+∞} 0 dx = 0 + (70 − 20)/50 + 0 = 1;

therefore, the density function satisfies Eq. 2.52. Use Eq. 2.53 to calculate the probability that a value will fall between 30 and 50 ft.

p(30 ≤ X ≤ 50) = ∫_{30}^{50} (1/50)dx = (50 − 30)/50 = 2/5.

Therefore, there is a 40% probability that the pay zone will fall within the 30- to 50-ft interval.

Cumulative-Distribution Function. The cumulative-distribution function, F(x), is defined as

F(x) = p(X ≤ x). ................................... (2.54)

It is the probability that a random variable X will be less than or equal to a particular value x. Knowing the definition of the cumulative-distribution function, we can use Eq. 2.53 to calculate the probability that a random variable will fall within a certain interval. For example, to compute p(a < X ≤ b), we can write p(X ≤ b) as comprising two mutually exclusive events, or

p(X ≤ b) = p(X ≤ a) + p(a < X ≤ b). ................................... (2.55)

Therefore,

p(a < X ≤ b) = p(X ≤ b) − p(X ≤ a) = F(b) − F(a). ................................... (2.56)

Eq. 2.56 calculates the probability that a random variable will fall within a certain interval.

For a discrete random variable, the cumulative-distribution function can be calculated as

F(a) = Σ_{x_i ≤ a} P(x_i), ................................... (2.57)

where P(x_i) = the probability mass function of a random variable. For a continuous random variable, the cumulative-distribution function can be calculated as

F(a) = ∫_{−∞}^{a} f(x)dx, ................................... (2.58)

where f(x) = the probability density function of a random variable.

It is important to understand that the cumulative-distribution function is closely related to the cumulative-relative-frequency distribution. Like the cumulative-relative-frequency distribution, the cumulative-distribution function has a minimum value of zero and a maximum value of one. It is also a nondecreasing function, which means that it can stay constant over a certain interval but does not decrease. The following numerical examples illustrate the usefulness of the cumulative-distribution function.

Numerical Example 2.13. Calculate the cumulative-distribution function for the rolling-a-die experiment. What is the probability that the outcome will fall between two and five, p(2 < X ≤ 5)?

Solution. Using Eq. 2.57 gives

F(1) = P(x_1) = 1/6

and F(2) = Σ_{i=1}^{2} P(x_i) = 2/6 = 1/3.

We repeat this for the other four outcomes. Fig. 2.29 shows the plot of the cumulative-distribution function. It starts with a value of zero and reaches a value of one at a value of the variable equal to six. Use Eq. 2.56 to calculate the probability that a value will fall between 2 and 5:

p(2 < X ≤ 5) = F(5) − F(2) = 5/6 − 2/6 = 1/2.

Fig. 2.29—Cumulative distribution function for rolling-a-die experiment.

Numerical Example 2.14. In Numerical Example 2.12, we defined the probability density function as

f(x) = 0 for x ≤ 20,
f(x) = 1/50 for 20 < x ≤ 70,
and f(x) = 0 for x > 70.

Define the cumulative-distribution function for this function. Confirm that the probability that a thickness will fall between 30 and 50 ft is 0.4.

Solution. With Eq. 2.58, for a value of x between 20 and 70,

F(a) = ∫_{−∞}^{a} f(x)dx = ∫_{−∞}^{20} 0 dx + ∫_{20}^{a} (1/50)dx = (a − 20)/50.

In general,

F(x) = 0 for x ≤ 20,
F(x) = (x − 20)/50 for 20 < x ≤ 70,
and F(x) = 1 for x > 70.

To calculate p(30 < X ≤ 50), we can write

p(30 < X ≤ 50) = F(50) − F(30) = (50 − 20)/50 − (30 − 20)/50 = 0.4,

which confirms the previous answer.
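Both numerical examples can be verified with a few lines of code. The Python sketch below is an illustration only (F_die and F_thickness are ad hoc names); it evaluates the die's cumulative-distribution function from Eq. 2.57 and the pay-zone F(x) just derived, then applies Eq. 2.56 to both.

from fractions import Fraction

def F_die(a):
    # Eq. 2.57: F(a) = sum of P(x_i) over outcomes x_i <= a, with P(x_i) = 1/6.
    return sum(Fraction(1, 6) for x in range(1, 7) if x <= a)

def F_thickness(x):
    # Piecewise CDF derived in Numerical Example 2.14.
    if x <= 20:
        return 0.0
    if x <= 70:
        return (x - 20) / 50
    return 1.0

# Eq. 2.56: p(a < X <= b) = F(b) - F(a).
print(F_die(5) - F_die(2))                 # 1/2, Numerical Example 2.13
print(F_thickness(50) - F_thickness(30))   # 0.4, Numerical Example 2.14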
Bivariate Functions. Here, we extend our definitions of probability functions to two variables. For discrete random variables, the probability mass function is

P(x_i, y_j) = p(X = x_i, Y = y_j). ................................... (2.59)

For example, if we toss two coins simultaneously, the probability mass function of both coins showing heads is

P(H, H) = p(X = H, Y = H) = 1/4

because, out of the four possible outcomes, both heads is one of the possible outcomes. Similarly,

P(H, T) = 1/4, P(T, H) = 1/4, and P(T, T) = 1/4.

Adding these four values results in one. Similar to the one-variable case, we can write

Σ_{i=1}^{n_x} Σ_{j=1}^{n_y} P(x_i, y_j) = 1, ................................... (2.60)

where n_x = number of possible outcomes of a random variable X and n_y = number of possible outcomes of a random variable Y.

Considering the distribution of only one variable while ignoring the effect of the other variable is called a marginal distribution. For example, the marginal distribution for Random Variable X, P_X(x_i), can be written as

P_X(x_i) = Σ_{j=1}^{n_y} P(x_i, y_j), ................................... (2.61)

where all the possible values of Y are summed while the value of the random variable X is kept the same. Similarly, we can write the marginal distribution of the variable Y, P_Y(y_j), as

P_Y(y_j) = Σ_{i=1}^{n_x} P(x_i, y_j). ................................... (2.62)

Numerical Example 2.15 illustrates these concepts.

Numerical Example 2.15. Consider the rolling of two dice. Define Random Variable X as the addition of the two outcomes from the two dice and Random Variable Y as the outcome of the first die. Obviously, X can take values between 2 and 12, and Y can take values between 1 and 6. Table 2.10 shows the probability mass function of all possible outcomes.

For example, for an outcome of Y = 2, X can take values between 3 and 8. This is because the smallest outcome for the other die is 1 and the largest outcome is 6. Therefore, the addition of the outcomes varies between 3 and 8. Note that, for true dice, all the outcomes are equally likely. The missing values in the table represent zero probabilities. For example, we can write

P(X = 5, Y = 2) = 1/36

or P(X = 12, Y = 6) = 1/36.

To calculate the marginal probabilities for X, we add all the probability mass functions over the entire range of Y. For example, the marginal distribution for X = 7 can be calculated as

P_X(7) = Σ_{j=1}^{6} P(7, y_j) = 1/36 + 1/36 + 1/36 + 1/36 + 1/36 + 1/36 = 6/36.

We can calculate the marginal distribution for other values of X as well.
TABLE 2.10—PROBABILITY MASS FUNCTION OF ALL POSSIBLE OUTCOMES FOR NUMERICAL EXAMPLE 2.15

                              Outcome of Y
Outcome of X    1      2      3      4      5      6      Marginal Distribution of X
2               1/36                                      1/36
3               1/36   1/36                               2/36
4               1/36   1/36   1/36                        3/36
5               1/36   1/36   1/36   1/36                 4/36
6               1/36   1/36   1/36   1/36   1/36          5/36
7               1/36   1/36   1/36   1/36   1/36   1/36   6/36
8                      1/36   1/36   1/36   1/36   1/36   5/36
9                             1/36   1/36   1/36   1/36   4/36
10                                   1/36   1/36   1/36   3/36
11                                          1/36   1/36   2/36
12                                                 1/36   1/36
Marginal
distribution of Y    1/6    1/6    1/6    1/6    1/6    1/6    1
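Table 2.10 can be generated, and the marginal sums checked, by enumerating the 36 equally likely outcomes. The Python sketch below is illustrative only; it builds the joint probability mass function and applies Eqs. 2.61 and 2.62.

from fractions import Fraction
from collections import defaultdict

joint = defaultdict(Fraction)
for d1 in range(1, 7):
    for d2 in range(1, 7):
        joint[(d1 + d2, d1)] += Fraction(1, 36)   # P(X = d1 + d2, Y = d1)

P_X = defaultdict(Fraction)   # Eq. 2.61: sum over all values of Y
P_Y = defaultdict(Fraction)   # Eq. 2.62: sum over all values of X
for (x, y), p in joint.items():
    P_X[x] += p
    P_Y[y] += p

print(P_X[7])         # 6/36 = 1/6
print(P_Y[3])         # 1/6
print(joint[(5, 2)])  # 1/36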

To calculate the marginal distribution for Y, we add all the probability mass functions over the entire range of X for a given value of Y. For example, the marginal distribution for Y = 3 can be calculated as

P_Y(3) = Σ_{i=1}^{11} P(x_i, 3) = 0 + 0 + 1/36 + 1/36 + 1/36 + 1/36 + 1/36 + 1/36 + 0 + 0 + 0 = 6/36 = 1/6.

Other values are calculated similarly.

Note that marginal distributions are simply probability mass functions if we consider the distribution of a single variable. For example, if the outcome of a die is considered a random variable, the probability mass function is 1/6 for each outcome. Similarly, if the addition of a pair of dice is considered a random variable, the probability mass function of Outcome 7 is 1/6.

For continuous random variables, we can describe the probability density function, f(x, y), so that

∫_{−∞}^{+∞} ∫_{−∞}^{+∞} f(x, y)dxdy = 1, ................................... (2.63)

where the volume under the surface equals one. Similar to a single-variable distribution, the probability that the values of the random variables will fall within certain intervals can be calculated with the probability density function. For example,

p[a < X ≤ b, c < Y ≤ d] = ∫_{c}^{d} ∫_{a}^{b} f(x, y)dxdy. ................................... (2.64)

Like discrete distributions, marginal distributions for the two variables can be calculated as

f_X(x) = ∫_{−∞}^{+∞} f(x, y)dy ................................... (2.65a)

and f_Y(y) = ∫_{−∞}^{+∞} f(x, y)dx. ................................... (2.65b)

Numerical Example 2.16. A bivariate probability density function is given by f(x, y) = e^(−x)e^(−y) for x > 0 and y > 0 and f(x, y) = 0 otherwise. Show that Eq. 2.63 is valid. Calculate p[0 < X ≤ 1, 1 < Y < ∞]. Calculate the marginal distributions for Random Variables X and Y.

Solution. To validate Eq. 2.63,

∫_{0}^{∞} ∫_{0}^{∞} e^(−x)e^(−y)dxdy = ∫_{0}^{∞} e^(−x)[−e^(−y)]_{0}^{∞} dx = ∫_{0}^{∞} e^(−x)dx = 1.

To calculate p[0 < X ≤ 1, 1 < Y < ∞],

p[0 < X ≤ 1, 1 < Y < ∞] = ∫_{1}^{∞} ∫_{0}^{1} e^(−x)e^(−y)dxdy = e^(−1) ∫_{0}^{1} e^(−x)dx = e^(−1)[1 − e^(−1)].

Using Eq. 2.65a gives

f_X(x) = ∫_{0}^{∞} e^(−x)e^(−y)dy = e^(−x).

Similarly, with Eq. 2.65b, f_Y(y) = e^(−y).

The cumulative-distribution function for a bivariate distribution can be written like that for a single variable:

F(x, y) = p[X ≤ x, Y ≤ y]. ................................... (2.66)

For discrete variables, we can write

F(a, b) = Σ_{x_i ≤ a} Σ_{y_j ≤ b} P(x_i, y_j). ................................... (2.67)

For continuous variables,

F(a, b) = ∫_{−∞}^{b} ∫_{−∞}^{a} f(x, y)dxdy. ................................... (2.68)

Application of Eqs. 2.66 through 2.68 is straightforward.

Recall independent variables. Remember that, if two events are independent,

p(A ∩ B) = p(A)p(B). ................................... (2.41)

Similarly,

p[X ≤ a, Y ≤ b] = p[X ≤ a]p[Y ≤ b]. ................................... (2.69)

For discrete variables, for independence to be valid,

P(x_i, y_j) = P_X(x_i)P_Y(y_j) ................................... (2.70)

for all x_i and y_j. For continuous variables,

f(x, y) = f_X(x)f_Y(y). ................................... (2.71)

Use of Eq. 2.71 easily shows that, for the probability density function provided in Numerical Example 2.16, the variables x and y are independent.

Conditional distribution, a distribution of one variable that is conditional on the other variable taking a certain value (discussed earlier), can also be defined. For a discrete variable,

P_X|Y(x|y) = p(X = x|Y = y) = p(X = x, Y = y)/p(Y = y) = P(x, y)/P_Y(y), ................................... (2.72)
where P_{X|Y}(x|y) = the conditional distribution of Random Variable X taking a value of x given that Random Variable Y has taken the value of y. For continuous random variables, the conditional distribution is

f_{X|Y}(x|y) = f(x, y)/f_Y(y). .......... (2.73)

Numerical Example 2.17. Numerical Example 2.15 considered two random variables, X and Y, where X = the sum of the outcomes from a pair of dice and Y = the outcome of the first die. The probability mass function for these two variables and the marginal-distribution function were calculated in Numerical Example 2.15. Calculate the probability that Y = 3 given that X = 8.

Solution. Using Eq. 2.72 gives

P_{Y|X}(y|x) = P(x, y)/P_X(x).

The marginal distribution gives P_X(8) = 5/36, and the value of P(8, 3) = 1/36. Substituting gives

P_{Y|X}(3|8) = (1/36)/(5/36) = 1/5.

That is, the conditional probability that Y = 3 is 20% given that X = 8.

The answer can be easily confirmed because a sum of eight can be achieved five different ways. The realizations have to be (2,6), (3,5), (4,4), (5,3), or (6,2). Of these five possibilities, the first die shows a 3 in only one. This equates to 20% of the total possibilities, which is what the answer indicates.
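The same confirmation can be done by brute-force enumeration of the 36 equally likely outcomes. A minimal sketch, added here for illustration (Python standard library only):

```python
from itertools import product

# Enumerate the 36 equally likely outcomes of a pair of fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# X = sum of the two dice, Y = outcome of the first die (Numerical Example 2.15).
p_x8 = sum(1 for d1, d2 in outcomes if d1 + d2 == 8) / 36            # P_X(8) = 5/36
p_x8_y3 = sum(1 for d1, d2 in outcomes
              if d1 + d2 == 8 and d1 == 3) / 36                      # P(8, 3) = 1/36

# Eq. 2.72: conditional probability P(Y = 3 | X = 8).
print(round(p_x8_y3 / p_x8, 3))   # 0.2
```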
Before concluding this section, we briefly review the concept of the random variable and some of its functions. It is not necessary to understand all the mathematical details covered in this section; however, understanding the concepts is critical for understanding future chapters.

A random variable is a numerically valued function whose values are determined by some probabilistic function. The realizations of the random variable are the outcomes of a random experiment. A discrete random variable can take only a finite number of values, whereas a continuous random variable can take infinitely many values. The probability mass or density function allows calculation of the probability that a random variable takes a certain value or a range of values. It is closely related to the relative-frequency-distribution function; similar to relative frequencies, it adds up to one. The cumulative-distribution function allows determination of the probability that a random variable will be less than a certain value. It is closely related to the cumulative-relative-frequency distribution and can take a value between 0 and 1. These functions can be extended to describe the behavior of two variables. Addition of another variable requires the description of two additional distribution functions. The marginal-distribution function describes the distribution of one of the variables by ignoring the variations in the other variable. The conditional distribution calculates the probability of one variable given that the other variable takes a certain value.

2.3.5 Mathematical Expectation. Mathematical expectation, or expected value, is defined as the weighted average outcome of a random experiment if the experiment is conducted a large number of times. For a discrete random variable X, the expected value of X is defined as

E[X] = \sum_{i=1}^{n_o} x_i\,P[X = x_i], .......... (2.74)

where E[X] = expected value; x_i = ith outcome of the random variable; P[X = x_i] = probability mass function for the ith outcome; and n_o = number of possible outcomes. Numerical Examples 2.18 and 2.19 illustrate the concept.

Numerical Example 2.18. A wheel of fortune in a casino contains 80 slots. Each time you bet, you have to place $1. Out of the 80 slots, if you hit one of seven special slots, you win $10. If you hit one of the remaining 73 slots, you lose $1. What is the expected value of this experiment?

Solution. Betting on the wheel can be considered a random experiment with two possible outcomes, success, S, or failure, F, where S = a $10 win and F = a $1 loss. The probability mass functions for the two outcomes are

P[X = S] = 7/80

and P[X = F] = 73/80.

Using Eq. 2.74 gives

E[X] = ($10)P[X = S] + (−$1)P[X = F] = ($10)(7/80) + (−$1)(73/80) = −$0.0375.

For a given bet, we either win $10 or lose $1. The expected value represents the average outcome of this random experiment if it is played a large number of times. That is, over many bets, we lose an average of $0.0375/bet (3.75¢/bet); over 1,000 bets, the loss is approximately $0.0375 × 1,000 = $37.50. Obviously, the expected value achieves meaning only if a game is played a large number of times.

Numerical Example 2.19. What is the expected value of a rolling-a-die experiment?

Solution. We know that the random variable can take any of the six values, each having a probability of 1/6. Using Eq. 2.74 gives

E[X] = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 3.5.

The expected value is 3.5 for the rolling-a-die experiment. As in the previous example, an outcome of 3.5 cannot be realized after a single roll of a die; it represents the average of all the outcomes if the experiment is repeated a large number of times. That is, if we roll a die a large number of times and note the outcome each time, the arithmetic average of all those realizations is close to 3.5.
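Eq. 2.74 is simply a probability-weighted sum, so both examples reduce to a dot product. The sketch below is an added illustration, assuming NumPy:

```python
import numpy as np

def expected_value(values, probs):
    """Eq. 2.74: E[X] = sum of x_i * P[X = x_i]."""
    return np.dot(values, probs)

# Numerical Example 2.18: wheel of fortune with 80 slots.
print(round(expected_value([10.0, -1.0], [7/80, 73/80]), 4))   # -0.0375

# Numerical Example 2.19: a single fair die.
print(round(expected_value(np.arange(1, 7), np.full(6, 1/6)), 2))   # 3.5
```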
The definition of expected value can be generalized for any real-valued function u(x) of the variable X:

E[u(X)] = \sum_{i=1}^{n_o} u(x_i)\,P[X = x_i] .......... (2.75)

for a discrete variable, and

E[u(X)] = \int_{-\infty}^{\infty} u(x)\,f(x)\,dx .......... (2.76)

for a continuous variable.

Eqs. 2.77 through 2.79 give some important characteristics of expected value:

E[K] = K; .......... (2.77)

that is, the expected value of a constant is a constant.

E[K\,u(X)] = K\,E[u(X)]; .......... (2.78)

that is, the expected value of a constant times a function is equal to the constant times the expected value of the function.

E[u_1(X) + u_2(X)] = E[u_1(X)] + E[u_2(X)]; .......... (2.79)

that is, the expected value of a sum of two functions is equal to the sum of the expected values of the two functions. These properties are very useful in applying the expected-value equation.

The two most important expected values for a single variable are the arithmetic mean and the variance. The arithmetic mean, μ, defined as

\mu = E[X], .......... (2.80)

is the expected value of the variable itself. The arithmetic mean is the population mean, not the sample mean; therefore, we differentiate it from the sample mean (denoted by x̄) by using the symbol μ to represent it.

Variance, σ², is defined as

\sigma^2 = E[(X - \mu)^2] = V[X]. .......... (2.81)

The notation σ² is different from s², which represents sample variance. The square root of σ², σ, is the standard deviation. With the three characteristics of expected values, we can show that

\sigma^2 = E[X^2] - \{E[X]\}^2 .......... (2.82a)

or \sigma^2 = E[X^2] - \mu^2. .......... (2.82b)

Some important characteristics of the variance are

V[K] = 0, .......... (2.83a)

where K is a constant;

V[KX] = K^2\,V[X], .......... (2.83b)

where K is a constant; and

V[KX + b] = K^2\,V[X], .......... (2.83c)

where K and b are constants.

Numerical Example 2.20. Calculate the variance for the rolling-a-die experiment.

Solution. We estimated the arithmetic mean in Numerical Example 2.19 and observed that E[X] = 3.5. To calculate the variance, we need E[X²]. With Eq. 2.75,

E[X^2] = \sum_{i=1}^{n_o} x_i^2\,P[X = x_i].

For six outcomes, we can write

E[X^2] = 1^2(1/6) + 2^2(1/6) + 3^2(1/6) + 4^2(1/6) + 5^2(1/6) + 6^2(1/6) = 15.17.

Using Eq. 2.82b gives

\sigma^2 = E[X^2] - \mu^2 = 15.17 - (3.5)^2 = 2.917

and σ = 1.708. The variance of the random experiment is 2.917, and the standard deviation is 1.708.

In Sec. 2.2.2, we discussed calculation of the mean and the variance of a sample. The equations used (Eqs. 2.4 through 2.10) are closely related to the expected-value formulas presented in this section.
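Numerical Example 2.20 can be reproduced the same way; this added sketch (assuming NumPy) evaluates E[X²] and applies Eq. 2.82b:

```python
import numpy as np

x = np.arange(1, 7)    # outcomes of a fair die
p = np.full(6, 1/6)    # probability mass function

mean = np.dot(x, p)        # Eq. 2.80: mu = E[X] = 3.5
ex2 = np.dot(x**2, p)      # E[X^2], approximately 15.17
var = ex2 - mean**2        # Eq. 2.82b: sigma^2 = E[X^2] - mu^2

print(round(ex2, 2), round(var, 3), round(np.sqrt(var), 3))   # 15.17 2.917 1.708
```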
The expected-value relationship is easily extended to bivariate distributions. For a discrete variable,

E[u(X, Y)] = \sum_{i=1}^{n_x} \sum_{j=1}^{n_y} u(x_i, y_j)\,P(x_i, y_j), .......... (2.84)

where E[u(X, Y)] = expected value of the function u(x_i, y_j); P(x_i, y_j) = the probability mass function of the two variables X and Y; n_x = number of possible outcomes of Random Variable X; and n_y = number of possible outcomes of Random Variable Y.

For a continuous variable,

E[u(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} u(x, y)\,f(x, y)\,dx\,dy, .......... (2.85)

where f(x, y) = the probability density function of the two variables X and Y.

Covariance is one of the most important expected values for a bivariate distribution; it is defined as

C[X, Y] = E[(X - \mu_x)(Y - \mu_y)]. .......... (2.86)

Using the properties of expected values, we can show that

C[X, Y] = E[XY] - E[X]\,E[Y]. .......... (2.87)

Covariance is an indicator of the relationship between two variables: the stronger the relationship, the higher the covariance value. The covariance is equal to zero for independent variables; that is,

C[X, Y] = 0 .......... (2.88)

if X and Y are independent variables. Note that, if X = Y, the equation for covariance reduces to the one for variance.


Some useful properties of covariance are

C[X + Z, Y] = C[X, Y] + C[Z, Y], .......... (2.89)

where X, Y, and Z = random variables. In general, for multiple variables,

C\left[\sum_i X_i,\ \sum_j Y_j\right] = \sum_i \sum_j C[X_i, Y_j], .......... (2.90)

where X_i and Y_j = random variables. Eq. 2.90 indicates that the covariance of a summation of random variables equals a summation of the covariances. Eq. 2.90 can be generalized even further to state that, for constant weights λ_i and ν_j,

C\left[\sum_i \lambda_i X_i,\ \sum_j \nu_j Y_j\right] = \sum_i \sum_j \lambda_i\,\nu_j\,C[X_i, Y_j]. .......... (2.91)

Eq. 2.91 is extremely useful in geostatistical analysis. Chap. 4 discusses its usefulness.

Using the definition of covariance, we can also define the correlation coefficient,

\rho[X, Y] = C[X, Y]/(\sigma_x\,\sigma_y), .......... (2.92)

where ρ[X, Y] = the correlation coefficient. Although similar to covariance, ρ[X, Y] is a dimensionless quantity and can only take values between −1 and +1. ρ[X, Y] also quantifies the relationship between the two variables.

Sec. 2.2.4 discussed the procedure for calculating the covariance and the correlation coefficient for sample data. The equations used to calculate the sample values are based on the definitions of the expected value with different notation. The population correlation coefficient is denoted by ρ[X, Y], whereas the sample correlation coefficient is denoted by r(x, y); C[X, Y] indicates the covariance for the population, and c(x, y) indicates the covariance for the sample. In practice, we can calculate only the sample values. Estimating values for the population requires that some assumptions be made. Sec. 2.3.7, which describes inference techniques, discusses the assumptions. Appendix B provides derivations of many of the equations presented.
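As an added illustration of these definitions (not from the original text), the sample analogs of Eqs. 2.87 and 2.92 can be computed with NumPy; the synthetic variables below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(size=1000)   # y depends on x, so C[X, Y] > 0
z = rng.normal(size=1000)             # z is independent of x, so C[X, Z] ~ 0

# Sample form of Eq. 2.87: c(x, y) = mean(xy) - mean(x)mean(y).
cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)

# Eq. 2.92: correlation = covariance scaled by the standard deviations.
rho_xy = cov_xy / (np.std(x) * np.std(y))

print(round(cov_xy, 2), round(rho_xy, 2))                  # ~2.0 and ~0.89
print(round(np.mean(x * z) - np.mean(x) * np.mean(z), 2))  # near 0
```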
2.3.6 Important Distribution Functions. In statistics, there are several important distribution functions with known characteristics. This section discusses three distribution functions for continuous random variables that play important roles in applying geostatistical techniques: uniform distribution, normal or Gaussian distribution, and log-normal distribution. Again, Appendix B provides detailed derivations of many of the equations.

Uniform Distribution. Uniform distribution is one of the simplest distributions of a continuous random variable. To define the uniform distribution properly, minimum and maximum values of the population are required. As the name indicates, uniform distribution assumes that any value between the minimum and the maximum has an equal probability of being selected. The probability density function for the uniform distribution is

f(x) = 1/(b - a) .......... (2.93a)

for a ≤ x ≤ b and

f(x) = 0 .......... (2.93b)

otherwise. The cumulative-distribution function is

F(x) = 0 .......... (2.94a)

for x < a,

F(x) = (x - a)/(b - a) .......... (2.94b)

for a ≤ x ≤ b, and

F(x) = 1 .......... (2.94c)

for x ≥ b. Fig. 2.30 shows both the probability density function and the cumulative-distribution function.

Fig. 2.30—Uniform distribution function.

With the definitions provided in Sec. 2.3.5, the mean of the uniform distribution is calculated as

\mu = (a + b)/2 .......... (2.95)

and the variance is calculated as

\sigma^2 = (b - a)^2/12. .......... (2.96)

The most important application of uniform distribution is creating random-number realizations. A standard random-number generator creates a set of random numbers that are sampled from a uniform distribution. Typically, the random number is generated with a minimum of zero and a maximum of one. That is, any number between zero and one has an equal likelihood of being selected.
Fig. 2.31—Normal distribution function.

If the random-number generator needs to pick a value within a different range, we can easily convert a standard random number to a new value by

N_{R,\mathrm{modified}} = a + N_{R,\mathrm{std}}\,(b - a), .......... (2.97)

where N_{R,std} = a standard random number between 0 and 1; N_{R,modified} = a modified random number that falls in the range a to b; and a and b = minimum and maximum values of the uniform random number, respectively. In many geostatistical techniques, use of a uniform-random-number generator is quite common. Many algorithms have been proposed in the literature that generate "truly" uniform random numbers. Most of these algorithms can also generate an entirely new sequence of random numbers by changing what is called the "seed value." The ability of random-number generators to generate multiple sequences is useful in conditional-simulation techniques to generate multiple reservoir descriptions. Chaps. 5 through 7 provide a detailed discussion of conditional-simulation techniques.
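Eq. 2.97 and the role of the seed value are straightforward to demonstrate. The sketch below is an added illustration; NumPy's default generator stands in for whatever "standard" generator a given implementation uses:

```python
import numpy as np

def scaled_uniform(rng, a, b, n):
    """Eq. 2.97: map standard uniform numbers on [0, 1) to the range [a, b)."""
    return a + rng.random(n) * (b - a)

rng1 = np.random.default_rng(seed=42)   # same seed -> same sequence
rng2 = np.random.default_rng(seed=42)
rng3 = np.random.default_rng(seed=99)   # new seed -> entirely new sequence

print(scaled_uniform(rng1, 5.0, 15.0, 3))
print(scaled_uniform(rng2, 5.0, 15.0, 3))   # identical to the first line
print(scaled_uniform(rng3, 5.0, 15.0, 3))   # a different realization
```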
Normal (Gaussian) Distribution. Normal distribution undoubtedly is the most famous distribution in the field of statistics. Its probability density function has a bell-shaped curve, which is well known to almost everyone, even those unfamiliar with statistical principles. The density function is given by

f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right] \quad \text{for } -\infty < x < \infty. .......... (2.98)

This distribution function has a mean of μ and a variance of σ². The maximum value of the density function is approximately 0.4/σ, which is reached at x = μ. It is a symmetric function.

To use the normal-distribution function, it is much more convenient to define a standardized normal distribution. If we define a new variable,

z = (x - \mu)/\sigma, .......... (2.99)

the probability density function is

f(z) = \frac{1}{\sqrt{2\pi}} \exp(-z^2/2). .......... (2.100)

This distribution has a mean of zero and a variance of one. With Eq. 2.100, the cumulative-distribution function is

F(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt. .......... (2.101)

Eq. 2.101 has no analytical solution and has to be integrated numerically. The right side of Eq. 2.101 is closely related to the error function, a standard mathematical function. Fig. 2.31 shows the probability density function and the cumulative-distribution function for the normal distribution. Fortunately, for all practical purposes, F(−3) ≈ 0 and F(+3) ≈ 1. Standard tables, such as Table 2.11, are available in most statistics books. Table 2.11 provides the values of F(z) for a range of z between −3 and +3 and can be used for a normal distribution with any mean and variance, as Numerical Example 2.21 shows.

Numerical Example 2.21.
1. The porosity in the reservoir is estimated to have a mean of 0.2 and a variance of 0.0004. If the porosity is believed to be normally distributed, what is the probability that the porosity value will be between 0.18 and 0.22?
2. If rock with a porosity of less than 15% is believed to be nonreservoir rock, what is the probability that the rock at a given location will have a porosity of less than 15%?

Solution.
1. For the distribution, μ = 0.2 and σ² = 0.0004, or σ = 0.02. To calculate p[0.18 ≤ φ ≤ 0.22], we first need to standardize the two values 0.18 and 0.22. If φ_1 = 0.18 and φ_2 = 0.22,

z_1 = (\phi_1 - \mu)/\sigma = (0.18 - 0.2)/0.02 = -1

and z_2 = (\phi_2 - \mu)/\sigma = (0.22 - 0.2)/0.02 = +1,

respectively. Once the values are standardized, we can look up the values of F(z_1) and F(z_2) in Table 2.11: F(z_1) = 0.15866 and F(z_2) = 0.84134. Recall that

p[a < X \le b] = F(b) - F(a). .......... (2.56)

Therefore,

p[z_1 < Z \le z_2] = 0.84134 - 0.15866 = 0.6827.

That is, a 68% probability exists that a porosity value will fall between 0.18 and 0.22.

2. We first need to calculate p[φ ≤ 0.15]. We can standardize the value with

z_3 = (\phi_3 - \mu)/\sigma = (0.15 - 0.2)/0.02 = -2.5.

From Table 2.11, F(z_3) = 0.00621. Because F(z) = p[Z ≤ z],

p[\phi \le 0.15] = 0.00621.

That is, there is a 0.62% probability that the porosity will be less than 15%.
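In place of Table 2.11, most numerical libraries provide the standard-normal cumulative-distribution function directly. This added sketch (assuming SciPy) reproduces both parts of Numerical Example 2.21:

```python
from scipy.stats import norm

mu, sigma = 0.2, 0.02   # porosity mean and standard deviation (variance 0.0004)

# Part 1: p[0.18 <= porosity <= 0.22] via standardized z values (Eq. 2.99).
z1 = (0.18 - mu) / sigma   # -1
z2 = (0.22 - mu) / sigma   # +1
print(round(norm.cdf(z2) - norm.cdf(z1), 4))   # 0.6827

# Part 2: p[porosity <= 0.15].
print(round(norm.cdf((0.15 - mu) / sigma), 5))  # 0.00621
```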

TABLE 2.11—z VALUES FOR NORMAL DISTRIBUTION (from Ref. 1)

Standard Deviations  Cumulative     Standard Deviations  Cumulative
From Mean            Probability    From Mean            Probability
-3.0                 0.0014         +0.0                 0.5000
-2.9                 0.0019         +0.1                 0.5398
-2.8                 0.0026         +0.2                 0.5793
-2.7                 0.0035         +0.3                 0.6179
-2.6                 0.0047         +0.4                 0.6554
-2.5                 0.0062         +0.5                 0.6915
-2.4                 0.0082         +0.6                 0.7257
-2.3                 0.0107         +0.7                 0.7580
-2.2                 0.0139         +0.8                 0.7881
-2.1                 0.0170         +0.9                 0.8159
-2.0                 0.0228         +1.0                 0.8413
-1.9                 0.0287         +1.1                 0.8643
-1.8                 0.0359         +1.2                 0.8849
-1.7                 0.0446         +1.3                 0.9032
-1.6                 0.0548         +1.4                 0.9192
-1.5                 0.0668         +1.5                 0.9332
-1.4                 0.0808         +1.6                 0.9452
-1.3                 0.0968         +1.7                 0.9554
-1.2                 0.1151         +1.8                 0.9641
-1.1                 0.1357         +1.9                 0.9713
-1.0                 0.1587         +2.0                 0.9773
-0.9                 0.1841         +2.1                 0.9821
-0.8                 0.2119         +2.2                 0.9861
-0.7                 0.2420         +2.3                 0.9893
-0.6                 0.2743         +2.4                 0.9918
-0.5                 0.3085         +2.5                 0.9938
-0.4                 0.3446         +2.6                 0.9953
-0.3                 0.3821         +2.7                 0.9965
-0.2                 0.4207         +2.8                 0.9974
-0.1                 0.4602         +2.9                 0.9981
-0.0                 0.5000         +3.0                 0.9987

One reason for the popularity of the normal-distribution function is the central-limit theorem, which states that the sum of a large number of independent random variables tends to be normally distributed. A good example of the application of this theorem is the measurement error in conducting an experiment. Typically, an error is the result of several possible independent sources that tend to be additive; therefore, the measurement error tends to be normally distributed for many experimental procedures. Later chapters show that many geostatistical techniques also assume that the estimation error at unsampled locations is normally distributed.

A reservoir property that is commonly believed to be normally distributed is reservoir porosity. A quick way to check whether such an assumption can be justified is to plot the data on a probability plot (a special graph paper available in most engineering supply stores). Plotting the sample values in ascending order on one scale and the cumulative probability on the other scale shows whether the data form a straight line on the graph paper, which indicates a normal distribution. Field Example 2.9 illustrates application of this procedure.

Field Example 2.9. This example uses porosity data from Well 34-29 and Flow Unit 3. Fig. 2.32 shows the probability plots for both. The figure shows that the data for Well 34-29 were collected from different geological units. As a result, several inflection points are evident on the probability plot, indicating the possibility that multiple populations are mixed together. The collected data for Flow Unit 3 show a smooth trend. Although the behavior is approximately linear, it is difficult to conclude that the distribution is normal.
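The straight-line check described in Field Example 2.9 is what a normal probability plot automates. The sketch below is an added illustration; the porosity array is hypothetical stand-in data, not the Well 34-29 or Flow Unit 3 measurements:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
porosity = rng.normal(loc=0.2, scale=0.02, size=60)  # hypothetical porosity samples

# probplot pairs ordered sample values with theoretical normal quantiles;
# a straight line (r close to 1) supports the normality assumption.
_, (slope, intercept, r) = stats.probplot(porosity, dist="norm")
print(round(r, 4))   # close to 1 for normally distributed data
```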
Log-Normal Distribution. Log-normal distribution is closely related to normal distribution. If the logarithm of a variable is normally distributed, then the variable itself is log-normally distributed. Fig. 2.33 shows this transformation schematically. In this figure, the log-normal distribution is skewed with a long tail on the right side. After transforming the data by taking the log of the variable, however, the distribution becomes symmetric and normal.

If we consider X to be a log-normally distributed variable, we can define Y = ln X, where Y = the natural logarithm of the random variable X. If the mean of the variable Y is α and the variance is β², we can write the probability density function for the variable X as

f(x) = \frac{1}{x\beta\sqrt{2\pi}} \exp\left[-\frac{(\ln x - \alpha)^2}{2\beta^2}\right] \quad \text{for } x > 0. .......... (2.102)

We can show that the mean and the variance of Random Variable X are related to the mean and variance of the transformed variable Y by

\mu = \exp(\alpha + \beta^2/2) .......... (2.103)

and \sigma^2 = \mu^2\left[e^{\beta^2} - 1\right], .......... (2.104)

where μ and σ² = the mean and variance of Variable X, and α and β² = the mean and variance of Variable Y. With Eqs. 2.103 and 2.104, we can still use Table 2.11 to determine the probability values for the log-normal distribution.

Numerical Example 2.22. Permeability values in a reservoir are expected to be log-normally distributed with a mean of 20 md and a variance of 2,000 md². What is the probability that the value of permeability in the reservoir at a given location will exceed 200 md?
Fig. 2.32—Probability plots for porosity data: (a) Well 34-29 and (b) Flow Unit 3.

Fig. 2.33—Transformation from log-normal to normal distribution.
Solution. In this example, μ = 20 md and σ² = 2,000 md². Rearranging Eqs. 2.104 and 2.103 gives, respectively,

\beta^2 = \ln\left[1 + \sigma^2/\mu^2\right] .......... (2.105)

and \alpha = \ln\mu - \beta^2/2. .......... (2.106)

Note that Table 2.11 is applicable to the normal distribution. Therefore, we need to consider the mean and the variance of the transformed variable before we can standardize. Substituting into Eqs. 2.105 and 2.106 gives, respectively,

\beta^2 = \ln\left[1 + 2{,}000/20^2\right] = 1.792

and \alpha = \ln(20) - 1.792/2 = 2.1.

Standardizing gives

z = (\ln x - \alpha)/\beta = [\ln(200) - 2.1]/\sqrt{1.792} = 2.39.

From Table 2.11, F(2.39) ≈ 0.992. That is, the probability that the permeability will be less than 200 md is 99.2%. In other words, the probability that the permeability will be greater than 200 md is 1 − 0.992 = 0.008, or 0.8%.

Similar to the normal distribution, the assumption of log-normal distribution can be validated by plotting the data on log probability paper. If one observes a straight line, the distribution is log-normal. Reservoir permeability is the property most commonly assumed to have a log-normal distribution. The famous Dykstra-Parsons coefficient, V_DP, which describes reservoir heterogeneity for well data, assumes a log-normal distribution of permeability and is defined as

V_{DP} = (k_{50} - k_{15.9})/k_{50}, .......... (2.107)

where k_50 = the 50th-percentile permeability value and k_15.9 = the 15.9th-percentile permeability value when the data are ordered in ascending fashion.

Field Example 2.10. With the permeability data from Well 34-29 and Flow Unit 3, Fig. 2.34 shows the log probability plots for both data sets. As Field Example 2.9 showed, the data collected from Well 34-29 are from several geological units. As a result, the data show several inflection points. The flow-unit data show a smoother trend. Although the data deviate at permeability values greater than 700 md, they show a fairly linear trend below 700 md. The Dykstra-Parsons coefficient for the flow-unit data is approximately 0.95, which indicates a highly heterogeneous reservoir.

Several advantages are associated with assuming one of these distributions for a reservoir property. The most important is the ability to capture the characteristics of the distribution with a limited number of parameters. All three distributions described require only two parameters to describe them completely: uniform distribution requires the minimum and the maximum, and normal and log-normal distributions require the mean and the variance. Once those parameters are known, the entire distribution is known. However, as Field Example 2.9 illustrated, if the property cannot be described with a particular distribution function, the frequency distribution is the only way to capture the details of the distribution. Instead of requiring only two parameters, the frequency distribution then requires storage of much more information to characterize the distribution.
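Both the exceedance probability of Numerical Example 2.22 and the Dykstra-Parsons coefficient of Eq. 2.107 are convenient to script. The sketch below is an added illustration; it assumes NumPy and SciPy, and the permeability sample is a hypothetical synthetic field, not the flow-unit data:

```python
import numpy as np
from scipy.stats import norm

# Numerical Example 2.22: probability that permeability exceeds 200 md.
mu, var = 20.0, 2000.0                 # mean (md) and variance (md^2) of X
beta2 = np.log(1.0 + var / mu**2)      # Eq. 2.105: variance of Y = ln X (1.792)
alpha = np.log(mu) - beta2 / 2.0       # Eq. 2.106: mean of Y = ln X (2.1)
z = (np.log(200.0) - alpha) / np.sqrt(beta2)
print(round(z, 2), round(1.0 - norm.cdf(z), 4))   # 2.39 and ~0.008

# Eq. 2.107: Dykstra-Parsons coefficient from sample percentiles.
def dykstra_parsons(perm):
    k50 = np.percentile(perm, 50.0)
    k159 = np.percentile(perm, 15.9)
    return (k50 - k159) / k50

rng = np.random.default_rng(1)   # hypothetical log-normal permeability field, md
perm = rng.lognormal(mean=alpha, sigma=np.sqrt(beta2), size=1000)
print(round(dykstra_parsons(perm), 2))   # ~0.74 for this synthetic case
```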
Fig. 2.34—Log probability plots for permeability data: (a) Well 34-29 and (b) Flow Unit 3.

Another advantage of assuming either a normal or log-normal distribution is the simplicity of the mathematical treatment. The normal distribution is especially amenable to many analytical manipulations that allow much more simplified estimation procedures in geostatistical techniques. As a result, the assumption of a normal distribution makes the computational procedures much more efficient. However, these distribution functions should not be forced on any variable for the sake of computational speed or mathematical convenience because, unfortunately, these assumptions may sometimes produce erroneous results. We should take advantage of the distribution functions only if we are convinced that they represent the reality of variations in the properties.

2.3.7 Inference of Parameters. Inference, as could be expected, is one of the important steps in inferential statistics. Inference allows estimation of population parameters from sample statistics. For example, the sample statistics may include the sample mean, x̄, and sample variance, s². From these statistics, we are interested in inferring the population parameters, such as the population mean, μ, and the population variance, σ². Because we do not know the true population, we can only estimate the parameters. We distinguish the symbols for estimated and true parameters by adding a hat to the symbol for estimated parameters (e.g., the estimated population mean is μ̂ and the true population mean is μ). Because we are estimating the parameter, some uncertainty is associated with it. The estimate can be considered a random variable with some distribution associated with it, and the spread of that distribution may indicate our confidence in the estimate. Because of these characteristics, two desirable properties of an estimate are unbiasedness and minimum variance.

Unbiasedness. One desirable characteristic is that the estimate be close to the true value of the parameter. Fig. 2.35 shows the estimated parameter θ̂ with its associated distribution. The true parameter θ ideally "falls" somewhere close to the middle of the distribution. This desirable property can be defined by stating that the expected value of the estimated parameter is the same as the true parameter:

E[\hat{\theta}] = \theta. .......... (2.108)

In other words, "on average," the value of the estimate is equal to the true parameter.

Minimum Variance. Fig. 2.36 shows three estimates of an unknown parameter. All three estimates are unbiased; however, Estimate 1 is more desirable than the other two because it shows the smallest range of uncertainty. A value estimated with less uncertainty can be predicted with more confidence. Variance indicates the spread of the estimate; therefore, minimizing the variance results in a desirable estimate.

Combining these two desirable properties, unbiasedness and minimum variance, we can use the minimum-variance unbiased-estimate (MVUE) technique. This technique is used to estimate values at unsampled locations in geostatistical methods. Appendix B provides a simple application of the MVUE technique for a linear-regression problem.

MVUE is not the only technique with which to estimate parameters. Other techniques exist in the literature for estimation of parameters (e.g., the method of moments; the method of maximum likelihood; and more sophisticated techniques, such as penalty-function methods). These techniques are based on heuristic principles in terms of what is considered a desirable characteristic of the estimate.
A A
(a) e (b) a

Fig. 2.35—(a) Desirable and (b) undesirable unbiased condition.

PRINCIPLES OF STATISTICS 49

▪ Twitter: @elgoajiroblanco
Fig. 2.36—Minimum variance condition.

For example, a biased estimate with a smaller variance may be preferable to an unbiased estimate with a larger variance. If the true population parameter is not known, the techniques cannot be compared to examine which is the best one. In practice, however, many of these techniques have worked well for estimating unknown parameters. For our purposes, we assume that the MVUE technique provides the best estimate.
Tv 0 technique provides the best estimate.
F(x, y) = cumulative-bivariate-distribution function of
Cere
z 0
• Summary random variables X and Y
k = permeability, L 2 , and
❑N La
uJ o-
This chapter covers a lot of material on understanding statis-
K = constant
o tics principles. Many standard statistics books devote many
W Ce L = lag distance between two samples, L, ft
0_
< chapters to the topics covered in this single chapter. Our goal,
C7 LI however, is not to make the reader expert in these techniques,
= slope of best-fit line
uJ
Q 1— n = total number of samples, n
but to familiarize the reader with the terminology of statistical
S
OD n, = number of times an experiment is conducted
LLI D C.) principles and to expose the reader to the basic concepts of
1- 8 u_ under controlled conditions, n
0 statistics. Appendix B covers the mathematical details of the
E ne number of mutually exclusive events, n
,fi o
ce equations used in the chapter to explain the concepts.
= number of possible outcomes of a random
Q
1:e a F
=, 0 V)
LL
The chapter is divided into two main sections: principles of
variable X, n
descriptive statistics and principles of inferential statistics.
<9 n y = number of possible outcomes of a discrete

•0
.o ct 0 The descriptive-statistics section discusses several useful
o 2 random variable Y, n
>- techniques that can be used to understand the characteristics
CNI N = total number of classes, n
of the sample data sets. Special emphasis is placed on the spa-
(NI Ce NR = random-number realization
00 tial data sets that are collected in analyzing reservoir charac-
F a = number of possible outcomes, n
IX z
6 teristics. Field examples illustrate many techniques.
p(A) = probability of Event A
• 0
The section on inferential statistics introduces the concept
p(AIB) = conditional probability of Event A given that
T co re
E of probability and its properties and applications. The random
cr Event B has occurred
variable and its functions are illustrated, followed by the prin-
o P(x) = probability mass function for a discrete random
ciple of expected value. Three distribution functions that have variable
applications in geostatistical methods are described in addi-
Ps , P y = marginal distributions of discrete random
tional detail, and the desirable characteristics of the estimator variables X and Y, respectively
technique are presented.
P,, y(xly) = conditional distribution of a discrete random
Starting with Chap. 3, how these principles can be applied
variable X given value of a discrete random
to quantify and to analyze spatial characteristics of the reser- variable Y
voir data is illustrated.
Q = quantile
r(x, y) = sample correlation coefficient
Nomenclature
  a, b = minimum and maximum of uniform distribution, respectively
  A_i = subarea size, L²
  A_t = total area, L²
  A, B, C = events of a random experiment
  b = intercept
  c, d = constants used to define a bivariate distribution
  c(x, y) = sample covariance between x and y variables
  C[X, Y] = population covariance between random variables X and Y
  C_v = coefficient of variance
  E = event
  E[X] = expected value of random variable X
  f_i = class frequency of Class i
  f_Ri = relative class frequency of Class i
  f(x) = probability density function for continuous random variable X
  f(x, y) = probability density function for a bivariate distribution of continuous random variables X and Y
  f_X(x) = marginal distribution of continuous random variable X
  f_Y(y) = marginal distribution of continuous random variable Y
  f_X|Y(x|y) = conditional distribution of continuous random variable X given a particular value of continuous random variable Y
  F = failure
  F_j = cumulative relative frequency of Class j, defined in Eq. 2.3
  F(x) = cumulative-distribution function of random variable X
  F(x, y) = cumulative-bivariate-distribution function of random variables X and Y
  k = permeability, L²
  K = constant
  L = lag distance between two samples, L, ft
  m = slope of best-fit line
  n = total number of samples, n
  n_c = number of times an experiment is conducted under controlled conditions, n
  n_e = number of mutually exclusive events, n
  n_o = number of possible outcomes, n
  n_x = number of possible outcomes of a random variable X, n
  n_y = number of possible outcomes of a discrete random variable Y, n
  N = total number of classes, n
  N_R = random-number realization
  p(A) = probability of Event A
  p(A|B) = conditional probability of Event A given that Event B has occurred
  P(x) = probability mass function for a discrete random variable
  P_X, P_Y = marginal distributions of discrete random variables X and Y, respectively
  P_X|Y(x|y) = conditional distribution of a discrete random variable X given a value of a discrete random variable Y
  Q = quantile
  r(x, y) = sample correlation coefficient
  R = sample range
  R_x, R_y = ranks of samples for variables x and y, respectively, when arranged in descending or ascending order
  R_log k = rank of log k
  R_φ = rank of φ
  s = sample standard deviation
  s_log k = standard deviation of log k
  s_φ = standard deviation of φ
  s² = sample variance
  S = sample space
  t = dummy variable
  u = spatial location
  u(x) = function of variable x
  u(X) = function of random variable X
  u(x, y) = bivariate function of variables x and y
  V_DP = Dykstra-Parsons coefficient
  V[X] = variance of random variable X
  w_i = weight assigned to Sample i; Eq. 2.10
  x_i = value of Sample i for Variable x
  x_max = maximum value in the sample
  x_min = minimum value in the sample
  x̄ = sample mean
  x̃ = sample median
  x_p = pth-percentile value of Variable x
  x(u) = variable x at location u
  X, Y = random variables
  y_i = value of Sample i for Variable y
  z = standard normal variable
  α = arithmetic mean of the log of a log-normally distributed variable
  β² = variance of the log of a log-normally distributed variable
  θ = population parameter
  μ = population mean
  ρ[X, Y] = population correlation coefficient
  σ = population standard deviation
  σ² = population variance
  φ = porosity

Superscript
  ^ = estimated

References
1. Tukey, J.W.: Exploratory Data Analysis, Addison-Wesley Publishing Co., Reading, Massachusetts (1977).
2. Mendenhall, W.: Introduction to Probability and Statistics, seventh edition, PWS Publishers, Boston, Massachusetts (1987).
3. Ross, S.M.: Introduction to Probability and Statistics for Engineers and Scientists, John Wiley & Sons, New York City (1987).
4. Hines, W.W. and Montgomery, D.C.: Probability and Statistics in Engineering and Management Science, third edition, John Wiley & Sons, New York City (1990).
5. Isaaks, E.H. and Srivastava, R.M.: Applied Geostatistics, Oxford U. Press, New York City (1989).
6. Journel, A.G.: Fundamentals of Geostatistics in Five Lessons, American Geophysical Union, Washington, DC (1989) 8.
7. Journel, A.G.: "Non-Parametric Estimation of Spatial Distributions," Math Geology (1983) 15, 445.
8. Deutsch, C.V. and Journel, A.G.: GSLIB: Geostatistical Software Library and User's Guide, Oxford U. Press, New York City (1992).
9. Davis, J.C.: Statistics and Data Analysis in Geology, John Wiley & Sons, New York City (1986).

SI Metric Conversion Factors
  ft × 3.048*       E−01 = m
  md × 9.869 233    E−04 = μm²
  *Conversion factor is exact.