
Determination of Lithology From Well Logs by Statistical Analysis


J.M. Busch, SPE, Arco Alaska Inc.
W.G. Fortney, Boeing Computer Services
L.N. Berry, SPE, Neal Berry & Assocs.

Summary. This paper presents a method of predicting lithology by statistical analysis of wireline log measurements with
calibration to a core lithology standard. Although an example of the technique applied to the Shublik formation of the Prudhoe Bay
area, North Slope, AK, is developed and presented, the method can be applied to any field where some core has been taken. The
Shublik, a complex mixture of seven clastic and carbonate rock types, presents a common problem: how can the rock lithology
best be identified from wireline logs? During recent work on the reservoir description of the Shublik, it became necessary to
answer this question.
A lithology standard must be available to calibrate against during development of a log model of lithology. For the Shublik, the
core available provided an excellent sample of the total formation. Using the statistical technique of discriminant analysis, we were
able to evaluate a number of log models and to choose the most appropriate one. The final log model chosen for the Shublik
formation works quite well and correctly predicts lithology 75% of the time.

Introduction

In sedimentary basins with primarily sandstone and shale sequences, mud-log cuttings descriptions are usually adequate for most log interpretation. Log interpretation in complex reservoirs with both clastic and chemical rocks requires better means of lithology identification. Lithology is a necessary first step in any complex log-analysis program. Conventional porosity-log crossplots scaled to mixtures of sandstone, limestone, or dolomite are usually adequate to identify single or dual mineral mixtures. As reservoir lithology becomes more complex and heterogeneous, log crossplots are inadequate in classifying lithology variations. The works of Burke et al.¹ and Clavier and Rust² broadened the use of the well log in lithology identification.

With the advent of computers for computational work at the wellsite and in the office, the use of more complex methods of lithology prediction can be attained. Delfiner et al.³ have recently shown how statistical analysis can be applied to the prediction of lithology from well log data. Their method uses a library of identified lithologies to classify well log response into discrete lithologies. Our approach is made more specific to a given field by using only the core data available from that field in building a calibrated well log model.

Database Preparation
The data base used for modeling the Shublik contains matched core lithology determinations and log data from 32 wells cored completely through the formation. Available logs include sonic, neutron, bulk density, and gamma ray. The Shublik formation has been divided into three zones. Starting from the top, they are Zones A, B, and C. The boundaries have been determined in all wells by characteristic gamma ray log responses; Zone B is marked by high radioactivity. Not all of the seven lithologies occur in every zone; Table 1 shows which lithologies are found in each of the three zones.

To be useful, the core data must be depth-matched to the log data in each well. This matching process consisted of using the core gamma log to shift the core into close approximation of its position to the gamma ray log. The second step includes fine-tuning the match by comparing the core laboratory porosities with the sonic transit time and the core grain density with the density log. The resultant pairing of core depth with its associated geologic lithology description and the various wireline log responses at 1-ft [0.3-m] intervals for all 2,303 ft [702 m] of core are then maintained as a data base.

Statistical Analysis
Discriminant analysis is an established method for classifying each observation in a data set into one of a set of mutually exclusive, exhaustive categories (i.e., each observation is classified into only one category) on the basis of numerical data values. In our application, the categories are the lithologies existing in the Shublik, and the numerical data are the several available well logs and functions of them. Each 1-ft [0.3-m] interval of each well will provide one observation. It is assumed that a calibration data set is available that consists of intervals that have been both cored and logged. It is important that the match of log and core data be as accurate as possible.

Discriminant analysis is explained in a number of references⁴⁻⁶; however, even the most elementary references assume a working knowledge of the application of matrix algebra to the geometry of multidimensional Euclidean space. In this paper, we emphasize (1) an intuitive understanding of how discriminant analysis achieves its goal and (2) practical considerations in using discriminant analysis in log modeling. Additional discussion of statistical considerations is included in the Appendix.

It is useful to compare discriminant analysis with the more familiar technique of multiple regression. In the latter technique, continuous data (e.g., sonic or density logs) might be used to calibrate a single function from which porosity, say, could be calculated from the logs. In discriminant analysis, a separate function is estimated for each of the several categories. Each function is evaluated through application for every observation, and the observation is assigned to the category having the largest function value. In summary, multiple regression uses numerical data to predict values of a numerical variable; discriminant analysis uses numerical data to predict discrete categories.

Simple Example Applications
The method of discriminant analysis can be understood best through an example. For this purpose, a simple example will be developed with data from only one zone of the Shublik. For simplicity, only three lithologies (limestone, shale, and sideritic mudrock) are used. Available logs include bulk density, sonic, neutron, and gamma ray. For this example, the data base has 811 observations for which core lithology and the several log values have been determined.

Using One Variable
The lithoporosity function, M,¹ is a natural candidate to be a lithology discriminator. M is a function of sonic transit time and bulk density, defined to be independent of porosity. In theory, M should

Copyright 1987 Society of Petroleum Engineers

412 SPE Formation Evaluation, December 1987


TABLE 1-LITHOLOGICAL COMPOSITION OF SHUBLIK ZONES

Zone
Lithology A B C
Sandstone X
Shale X X X
Limestone X X X
Siltstone X X X
Phosphatic limestone X X
Phosphatic mudrock X X X
Sideritic mudrock X X

TABLE 2-MEANS AND STANDARD DEVIATIONS OF M

Lithology             Mean    Standard Deviation
Sideritic mudrock     0.6010  0.0317
Shale                 0.6717  0.0382
Limestone             0.7394  0.04~5

Fig. 2-Distribution functions of M.
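
As a worked check on the one-variable rule, the crossing point of two equal-variance normal curves can be written in closed form. The Python sketch below is illustrative only and is not the authors' software. It assumes the Table 2 means, the pooled standard deviation of 0.0412 quoted in the text, and proportions taken from the core totals in Table 3 (116, 424, and 271 of 811 ft). The function names are hypothetical.

```python
import math

# Means of M from Table 2 and the pooled standard deviation quoted in the text.
MEANS = [("sideritic mudrock", 0.6010), ("shale", 0.6717), ("limestone", 0.7394)]
POOLED_SD = 0.0412
# Assumption: proportions are the core totals in Table 3 (811 ft in all).
PROPORTIONS = [116 / 811, 424 / 811, 271 / 811]


def boundary(mu_a, mu_b, sd, prior_a=1.0, prior_b=1.0):
    """Return the M value where prior_a*f_a(M) equals prior_b*f_b(M).

    f_a and f_b are normal densities with means mu_a < mu_b and a common
    standard deviation sd. With equal priors this is the midpoint of the means.
    """
    shift = sd ** 2 / (mu_b - mu_a) * math.log(prior_a / prior_b)
    return 0.5 * (mu_a + mu_b) + shift


def boundaries(priors=None):
    """Boundaries between adjacent lithologies, optionally proportion-weighted."""
    priors = priors or [1.0] * len(MEANS)
    return [
        boundary(MEANS[i][1], MEANS[i + 1][1], POOLED_SD, priors[i], priors[i + 1])
        for i in range(len(MEANS) - 1)
    ]


def classify(m, cuts):
    """Assign M to the lithology whose region of the M axis contains it."""
    for (name, _), cut in zip(MEANS, cuts):
        if m < cut:
            return name
    return MEANS[-1][0]


if __name__ == "__main__":
    print("unweighted:", boundaries())
    print("weighted:  ", boundaries(PROPORTIONS))
```

Under these assumptions the computed boundaries fall close to the values reported in the paper (0.6363 and 0.7055 unweighted; 0.6052 and 0.7167 with lithology proportions). Small differences in the last digit reflect rounding of the published inputs.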

vary with lithology. In practice, we hope that each lithology will be clearly differentiated from the others by its distribution of M values.

Figs. 1a through 1c display histograms of the distributions of M within the sideritic mudrock, shale, and limestone lithologies, respectively. It is immediately apparent that the distributions center around different points. Table 2 shows the means and standard deviations for the three lithologies. Unless the standard deviations differ greatly, it is customary to use a pooled estimate of an assumed common standard deviation; for the example data, the estimate is 0.0412. The pooled estimate is a weighted root mean square of the respective groups' standard deviations. See Ref. 4, Page 464, for details.

Fig. 1-Histograms of M: (a) sideritic mudrock; (b) shale; and (c) limestone.

Normal (Gaussian) distributions are completely determined by their means and standard deviations. It is thus possible to fit three normal distributions, one for each lithology, to the data by assuming that they have a common standard deviation and the respectively estimated means. The distribution curves (probability density functions) are shown in Fig. 2a. Each curve has the same shape, a consequence of assuming that each distribution has the same standard deviation. The three curves divide the M axis into three regions according to which lithology has the greatest distribution curve value. If we use the notation $f_1(M)$, $f_2(M)$, and $f_3(M)$ to represent the respective distribution curve functions for sideritic mudrock, shale, and limestone, then these regions can be determined algebraically; the boundary points are the values of M where $f_1(M) = f_2(M)$ and where $f_2(M) = f_3(M)$. [The boundary between the regions for sideritic mudrock and limestone would occur where $f_1(M) = f_3(M)$, but as Fig. 2a shows, this point is dominated by shale.] These points turn out to be M=0.6363 and 0.7055, so the regions where each curve is dominant are

M < 0.6363 sideritic mudrock
0.6363 < M < 0.7055 shale
0.7055 < M limestone.

It is thus possible to use this division of the M axis as a classification rule to determine lithology from the logs. If we apply the rule to the same data, we can compare the lithology classifications derived from the logs with the core classifications. The most convenient comparison is a simple cross-classification table that shows the frequency with which each interval of rock is classified into every possible combination of core and log lithology. Table 3 shows these sample data.

Verification Techniques
A single measurement of how well the log model matches the data is the percent agreement, the proportion of all the data that are
TABLE 3-CROSS-CLASSIFICATION USING M

                            Log Lithology
                        Sideritic
Core Lithology            Mudrock      Shale  Limestone      Total
Sideritic mudrock             102         14          0        116
Shale                          64        281         79        424
Limestone                       9         60        202        271
Total                         175        355        281        811
Percent agreement = 100 x [(102 + 281 + 202)/811] = 72.13%.

TABLE 4-CROSS-CLASSIFICATION USING M WITH LITHOLOGY PROPORTIONS

                            Log Lithology
                        Sideritic
Core Lithology            Mudrock      Shale  Limestone      Total
Sideritic mudrock              75         41          0        116
Shale                          15        354         55        424
Limestone                       4         78        189        271
Total                          94        473        244        811
Percent agreement = 100 x [(75 + 354 + 189)/811] = 76.20%.
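
The percent-agreement figures quoted under Tables 3 and 4 can be recomputed from the cross-classification counts. The short Python sketch below does this, and also reports the agreement within each core lithology. The function names are hypothetical; the counts are those printed in the two tables (rows are core lithology, columns are log lithology).

```python
def percent_agreement(table):
    """Overall agreement: diagonal (matched) count over the grand total, in percent."""
    total = sum(sum(row) for row in table)
    matched = sum(table[i][i] for i in range(len(table)))
    return 100.0 * matched / total


def per_lithology_agreement(table):
    """Agreement within each core lithology (each row), in percent."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(table)]


# Row and column order: sideritic mudrock, shale, limestone.
TABLE_3 = [[102, 14, 0], [64, 281, 79], [9, 60, 202]]
TABLE_4 = [[75, 41, 0], [15, 354, 55], [4, 78, 189]]

if __name__ == "__main__":
    for label, table in (("Table 3", TABLE_3), ("Table 4", TABLE_4)):
        rows = ", ".join(f"{v:.1f}" for v in per_lithology_agreement(table))
        print(f"{label}: overall {percent_agreement(table):.2f}%, by core lithology {rows}")
```

The by-lithology figures make the trade-off visible: weighting by proportions raises the agreement for shale, the most abundant lithology, at the expense of the two less abundant ones.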

matched correctly. In this single case, using only M as a discriminator, the percent agreement is 72.13%. Note that the percent agreement is lower in the shales than in either the sideritic mudrocks or limestones. Because the shales by themselves constitute more than half of the data, the effect is to depress the overall percent agreement.

By accounting for the proportion each lithology constitutes of the total, we can improve the overall match by weighting each distribution curve by the corresponding proportion. Our classification rule is to assign a given observation to the category for which the weighted distribution function is largest. Fig. 2b shows a graph of the proportion-weighted distribution curves.

By solving the equations $\pi_1 f_1(M) = \pi_2 f_2(M)$ and $\pi_2 f_2(M) = \pi_3 f_3(M)$, where $\pi_1$, $\pi_2$, and $\pi_3$ represent the respective proportions of sideritic mudrock, shale, and limestone (from the cores), we find the points that divide the M axis into three regions. The resultant classification rule is

M < 0.6052 sideritic mudrock
0.6052 < M < 0.7167 shale
0.7167 < M limestone.

Note that the interval of the M axis for which shale is the log lithology is wider than in the case where lithology proportions were disregarded. As a result, the revised log model will designate more observations as shale, and fewer observations as either sideritic mudrock or limestone. Table 4, the cross-classification table of core with log lithologies, shows the results. The percent agreement has risen to 76.20%.

Use of Two Variables
The ability to discriminate can often be improved by the use of more than one discriminating variable, just as the use of several explanatory variables can often improve the predictive ability of a regression model. As the next step, consider the lithoporosity variable, N,¹ a function of neutron and bulk density logs that, again, should vary only with lithology, and not with porosity. Proceeding as before, but using two variables, we begin with Figs. 3a through 3c, which are crossplots or scatter diagrams showing the joint distributions of M and N for each of the three example lithologies. It may be observed that sideritic mudrock generally has low values of both M and N, limestone has high values of both variables, and shale has intermediate values. Specifically, the mean values of M are 0.6010 for sideritic mudrock, 0.6717 for shale, and 0.7394 for

Fig. 3-Crossplots of M and N (panels a, b, and c; N axis from 0.25 to 0.65, M axis from 0.50 to 0.90).

Fig. 4-Distribution functions of M and N as contour plots (panels a and b; same axes, regions labeled by lithology).



TABLE 5-CROSS-CLASSIFICATION USING M AND N

                            Log Lithology
                        Sideritic
Core Lithology            Mudrock      Shale  Limestone      Total
Sideritic mudrock             101         13          2        116
Shale                          17        320         87        424
Limestone                       7         57        207        271
Total                         125        390        296        811
Percent agreement = 100 x [(101 + 320 + 207)/811] = 77.44%.

TABLE 6-CROSS-CLASSIFICATION USING M AND N WITH LITHOLOGY PROPORTIONS

                            Log Lithology
                        Sideritic
Core Lithology            Mudrock      Shale  Limestone      Total
Sideritic mudrock              94         22          0        116
Shale                           7        361         56        424
Limestone                       5         76        190        271
Total                         106        459        246        811
Percent agreement = 100 x [(94 + 361 + 190)/811] = 79.53%.
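
The two-variable rule behind Tables 5 and 6 can be sketched with a linear discriminant score. This is a minimal illustration, not the authors' software. It assumes the class means of M and N, the pooled standard deviations (0.0412 and 0.0409), and the pooled correlation coefficient (0.6880) quoted in the text, plus proportions taken from the core totals of 116, 424, and 271 ft. All function names are hypothetical.

```python
import math

# Class means (M, N), pooled standard deviations, and pooled correlation from the text.
MEANS = {
    "sideritic mudrock": (0.6010, 0.3589),
    "shale": (0.6717, 0.4798),
    "limestone": (0.7394, 0.5195),
}
SD_M, SD_N, RHO = 0.0412, 0.0409, 0.6880
# Assumption: proportions are the core totals of each lithology out of 811 ft.
PROPORTIONS = {"sideritic mudrock": 116 / 811, "shale": 424 / 811, "limestone": 271 / 811}


def inverse_covariance():
    """Inverse of the pooled 2x2 covariance matrix, returned as (a, b, c) for [[a, b], [b, c]]."""
    cov = RHO * SD_M * SD_N
    det = (SD_M * SD_N) ** 2 - cov ** 2
    return SD_N ** 2 / det, -cov / det, SD_M ** 2 / det


def discriminant_scores(m, n, priors=None):
    """Linear score x'S^-1 mu_k - 0.5 mu_k'S^-1 mu_k (+ ln prior_k) for each lithology k."""
    a, b, c = inverse_covariance()
    scores = {}
    for name, (mu_m, mu_n) in MEANS.items():
        w_m = a * mu_m + b * mu_n          # first component of S^-1 mu_k
        w_n = b * mu_m + c * mu_n          # second component of S^-1 mu_k
        score = m * w_m + n * w_n - 0.5 * (mu_m * w_m + mu_n * w_n)
        if priors is not None:
            score += math.log(priors[name])
        scores[name] = score
    return scores


def classify(m, n, priors=None):
    """Assign (M, N) to the lithology with the largest score."""
    scores = discriminant_scores(m, n, priors)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    point = (0.709, 0.502)                 # hypothetical reading between shale and limestone
    print(classify(*point), classify(*point, priors=PROPORTIONS))
```

Because every lithology shares one covariance matrix, the difference between any two scores is linear in M and N, which is why the regions in the M-N plane are bounded by straight lines. Adding the logarithm of the proportions shifts those lines toward the less abundant lithologies and enlarges the shale region.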

limestone. Mean values for N are 0.3589 for sideritic mudrock, 0.4798 for shale, and 0.5195 for limestone. Pooled standard deviations are estimated to be 0.0412 for M and 0.0409 for N. The pooled correlation coefficient is 0.6880.

With these estimates, it is possible to fit distribution surfaces to the data, one surface for each lithology. A contour plot of the surface (displaying only the highest of the three at each point) is provided in Fig. 4a. Fig. 5a is a three-dimensional (3D) representation of the same surface. By determining which of the three distributions has the greatest distribution function value for any pair (M,N), we can determine that the classification rule divides the M-N plane into three regions bounded by straight lines, which are shown in Fig. 4a. If we classify each observation according to the classification rule induced by these lines, we obtain log lithologies determined from the two variables M and N. Table 5 is the cross-classification table of core with log lithology. Percent agreement is 77.44%.

Because shale is the predominant lithology, and because it is underpredicted by this model, we can improve our overall core-to-log agreement by weighting the probability surfaces by lithology proportions. Contours of the weighted surfaces are shown in Fig. 4b; Fig. 5b is a 3D view of the same surfaces.

Fig. 5-Distribution functions of M and N in isometric projection.

A log lithology classification rule based on the weighted probability surfaces assigns observations to lithologies by defining a region for each lithology in the M-N plane. As before, the regions are bounded by straight lines. The region for shale has grown larger, however, reflecting the greater proportion that shale constitutes of the zone. The cross-classifications obtained with this classification rule are given in Table 6. Percent agreement is 79.53%.

This example has shown that a simple-case discriminant analysis can produce a reliable model enabling the determination of lithologies from log data. It has also suggested that taking lithology proportions into account is generally advantageous and that the use of several variables may be expected to improve the discrimination over using only one.

Full Shublik Model
When more lithologies are added, increasing the complexity of the task, we generally expect the percent agreement to decline. To counter this model deterioration, it is often useful to increase the number of discriminating variables in a model. With more than two variables, geometric display of data is no longer possible, but the analytic geometry does not present any added difficulties. Analogous to multiple regression, multiple discriminant analysis can be carried out with any reasonable number of variables.

In addition to the raw logs available (sonic, bulk density, neutron, and gamma ray), certain functions of them were considered as potential discriminators. Such functions included M, N, and a variable called proportional height in zone, defined by

$$h = \frac{h_a}{d}. \qquad (1)$$

Thus defined, h is between 0 and 1. The importance of h lies in the fact that some lithologies are found predominantly at the top of a zone (h near 1), and others lie mostly at the bottom (h near 0).
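
Eq. 1 can be evaluated directly from zone picks. The Python sketch below assumes depths that increase downward and that the zone top and bottom have already been picked; the function name and the example depths are hypothetical.

```python
def proportional_height(depth, zone_top, zone_bottom):
    """Return h = h_a / d for an interval at `depth` (depths increase downward).

    h_a is the height above the zone bottom and d is the zone thickness,
    so h is 1 at the top of the zone and 0 at the bottom.
    """
    thickness = zone_bottom - zone_top                 # d
    if thickness <= 0:
        raise ValueError("zone_bottom must be deeper than zone_top")
    if not zone_top <= depth <= zone_bottom:
        raise ValueError("depth lies outside the zone")
    height_above_bottom = zone_bottom - depth          # h_a
    return height_above_bottom / thickness


if __name__ == "__main__":
    # Hypothetical 10-ft zone sampled at 1-ft intervals.
    for depth in range(10040, 10051):
        print(depth, proportional_height(depth, 10040, 10050))
```

Normalizing by zone thickness lets wells with different zone thicknesses share one discriminating variable.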

TABLE 7-CROSS-CLASSIFICATION BEST LOG MODEL FOR CALIBRATION DATA SET

                                                  Log Lithology
                                                                   Phosphatic Phosphatic  Sideritic
Core Lithology          Sandstone      Shale  Limestone  Siltstone  Limestone    Mudrock    Mudrock      Total
Sandstone                       5          1          0          7          0          0          0         13
Shale                           1        188         38         13          0          7          5        252
Limestone                       1         27        297          9         15          4          2        355
Siltstone                       4         24         16         61          0          0          0        105
Phosphatic limestone            0          0         11          1         84         14          0        110
Phosphatic mudrock              0          0          0          4         10         91          9        114
Sideritic mudrock               0          2          1          0          0          7         80         90
Total                          11        242        363         95        109        123         96      1,039
Percent agreement = 77.57%



TABLE 8-CROSS-CLASSIFICATION BEST LOG MODEL APPLIED TO VALIDATION DATA SET

                                             Predicted Log Lithology
                                                                   Phosphatic Phosphatic  Sideritic
Core Lithology          Sandstone      Shale  Limestone  Siltstone  Limestone    Mudrock    Mudrock      Total
Sandstone                       0          0          0          0          0          0          0          0
Shale                           3        233         45         18          2          4          5        310
Limestone                       3         61        374         30         13          1          4        486
Siltstone                       0         12         45         64          0          0          0        121
Phosphatic limestone            0          0         16          0        115          6          0        137
Phosphatic mudrock              0          0          0          0         25         88          5        118
Sideritic mudrock               0          3          2          0          0         13         74         92
Total                           6        309        482        112        155        112         88      1,264
Percent agreement = 75.00%
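
Table 8 is built from wells that were held out of calibration. A sketch of such a well-level random split is given below in Python. The well identifiers, the fixed seed, and the function name are hypothetical, and the paper does not say how its randomization was carried out.

```python
import random


def split_wells(well_ids, seed=0):
    """Randomly divide wells into two equal groups: (calibration, validation)."""
    wells = sorted(well_ids)               # sort first so the result depends only on the seed
    random.Random(seed).shuffle(wells)
    half = len(wells) // 2
    return wells[:half], wells[half:]


if __name__ == "__main__":
    all_wells = [f"W{i:02d}" for i in range(1, 33)]    # 32 hypothetical well names
    calibration, validation = split_wells(all_wells)
    print(len(calibration), "calibration wells;", len(validation), "validation wells")
```

Splitting by well rather than by individual foot keeps every interval of a well in the same group, so the validation wells play the role of uncored wells.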

Because we wanted to test many models, we randomly divided the data into two groups. Group 1, consisting of 16 wells with 1,039 ft [317 m] of data, was used for selecting and calibrating a model. Because it is easy to err by overfitting a model (choosing an unjustifiably complicated model), we used data in Group 2 to validate the model actually selected. Group 2 comprised 16 wells with 1,264 ft [385 m] of data. The model selected from Group 1 data was applied to the Group 2 log data. The lithologies predicted from the model were compared with the actual lithologies to determine how well the model would match lithologies in uncored wells.

Because it is easier to match log to core data on a calibration data set than it is to make predictions on a validation data set, we expect percent agreement to be lower on Group 2 than on Group 1. Nevertheless, if the model chosen is good, we hope that the percent agreement will be nearly as high for the validation as for the calibration data set.

We tried 15 models on the calibration data set for each of the three zones. The models differed only in the selection of log variables. In each zone we selected the model that best matched the core on the calibration data set (Group 1 wells). In Zone A, the variables used were M, N, bulk density, and gamma ray. In Zones B and C, the variables were M, N, proportional height, and sonic transit time. Table 7 shows the resultant cross-classification data.

By summing the diagonal elements, we can see that the model matched the core description on 806 out of 1,039 ft [246 out of 317 m] (percent agreement=77.57%). When the calibrated model was applied to the validation data set, we obtained the results summarized in Table 8. The agreement is 948 out of 1,264 ft [289 out of 385 m], or 75%. As expected, this is slightly lower than for the calibration data set, but it is remarkably high, proving that the log model does an excellent job of determining lithologies.

One feature of discriminant analysis is that for each 1-ft [0.3-m] interval, a probability is assigned to each lithology. (This probability is calculated by use of Bayes' theorem.) These probabilities have the properties that, for each interval, their sum over all lithologies is equal to one and the lithology that has the highest probability is always the lithology into which the log model will classify the interval.

Fig. 6 shows a convenient way of displaying the results of applying the discriminant analysis. For one well in the validation data set, the left side shows, as functions of depth, the plots of the probabilities assigned to each of the seven defined lithologies. On the right side, two tracks show the log-model-determined lithology and the geologist-described core lithology, respectively. Note that the log lithology is always that lithology for which the probability is the highest of the seven. The good agreement between core and log is readily apparent.

Fig. 6-Comparison plot of lithology probabilities and predicted vs. actual lithology.

Analysis of Misprediction
It is instructive to investigate the reasons why 316 ft [96 m], or 25%, of the core in Group 2 wells did not match.

There are intervals within the Shublik in which lithology interbedding occurs, and there are some lithologies that may grade into others. Thus, it is of interest to study the misclassified feet to learn what lithology probabilities were assigned for them.
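
The lithology probabilities referred to here follow from Bayes' theorem applied to the proportion-weighted distribution functions. The Python sketch below illustrates the calculation for the three-lithology, one-variable example only, not the seven-lithology model. It assumes the Table 2 means, the pooled standard deviation of 0.0412, and proportions from the core totals; the function names are hypothetical.

```python
import math

MEANS = {"sideritic mudrock": 0.6010, "shale": 0.6717, "limestone": 0.7394}
POOLED_SD = 0.0412
# Assumption: proportions are the core totals of each lithology out of 811 ft.
PRIORS = {"sideritic mudrock": 116 / 811, "shale": 424 / 811, "limestone": 271 / 811}


def log_normal_density(x, mean, sd):
    """Natural log of the normal probability density at x."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))


def posterior(m):
    """Bayes' theorem: P(lithology k | M) is proportional to prior_k * f_k(M)."""
    log_weighted = {
        k: math.log(PRIORS[k]) + log_normal_density(m, MEANS[k], POOLED_SD)
        for k in MEANS
    }
    peak = max(log_weighted.values())      # subtract the peak for numerical stability
    unnormalized = {k: math.exp(v - peak) for k, v in log_weighted.items()}
    total = sum(unnormalized.values())
    return {k: v / total for k, v in unnormalized.items()}


if __name__ == "__main__":
    for m in (0.58, 0.65, 0.71, 0.75):
        probs = posterior(m)
        best = max(probs, key=probs.get)
        print(m, best, {k: round(p, 3) for k, p in probs.items()})
```

An interval whose highest probability is only slightly above the next one is ambiguous, which is the situation examined under Reasons 3 and 4 in the mismatch analysis.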
Thin beds exist within the Shublik. Log resolution problems, core nonrecovery intervals, and occasional log/core misalignment problems also create the possibility that the log response for a given point may match the core at a nearby interval.

Finally, certain lithologies are similar in properties and are easily confused. In many cases, these confusions are innocuous. Lithologies that are occasionally confused include shale with siltstone, and limestone with phosphatic limestone. In addition, as a result of thin-bed effects and intergrading, limestone and shale and limestone and siltstone are sometimes confused.

We distinguished eight possible explanations for mismatched intervals: (1) the log-determined lithology agreed with the core lithology either 1 ft [0.3 m] above or below the interval being classified; (2) the log lithology matched the core 2 ft [0.6 m] above or below the interval in question; (3) the core lithology for the interval was assigned a probability greater than 0.2, so the logs evidenced some ambiguity concerning the classification; (4) the core lithology was assigned a probability greater than 0.1; (5) a shale was classified as a siltstone or vice versa; (6) a limestone was classified as a phosphatic limestone or vice versa; (7) confusion between shales and limestones; and (8) confusion between siltstones and limestones.

Table 9 shows the proportion of the mismatched 316 ft [96 m] that fell into the categories defined by the eight reasons. Because more than one reason might serve as an explanation for the same mismatch, the percentages total more than 100. Table 10 presents the information in a hierarchical fashion, showing the proportion of mismatches attributable to Reason 1, then the additional mismatches attributable to Reason 2, and so forth.

TABLE 9-EXPLANATIONS FOR MISMATCHED FEET IN VALIDATION DATA SET

Reason                                                 %
1. Matches 1 ft above or below                      43.0
2. Matches 2 ft above or below                      54.4
3. Core lithology probability greater than 0.2      48.4
4. Core lithology probability greater than 0.1      76.5
5. Shale/siltstone mismatch                          9.5
6. Limestone/phosphatic limestone mismatch           9.2
7. Shale/limestone mismatch                         33.5
8. Siltstone/limestone mismatch                     23.7

TABLE 10-EXPLANATIONS FOR MISMATCHED FEET AS CUMULATIVE PROPORTIONS

Reason                                      Cumulative %
1. Matches 1 ft above or below                      43.0
2. Matches 2 ft above or below                      59.2
3. Core lithology probability greater than 0.2      86.1
4. Core lithology probability greater than 0.1      95.2
5. Shale/siltstone mismatch                         95.6
6. Limestone/phosphatic-limestone mismatch          95.6
7. Shale/limestone mismatch                         97.8
8. Siltstone/limestone mismatch                     97.8

It is readily seen from Table 10 that the first four reasons, reflecting resolution, bed boundary problems, possible core/log misalignment, etc., account for 95% of the mismatches, leaving only 8 ft [2.4 m], or 2.5%, to be explained by shale/siltstone or shale/limestone confusions. In the end, only 7 ft [2.1 m], accounting for 2.2% of the mismatches and 0.6% of the validation data base, remained unexplained. The overall conclusion again is that the log model is working extremely well.

Conclusions
This paper demonstrates that with statistical discriminant analysis, it is possible to find a log model that accurately predicts lithology for a formation as lithologically complex as the Shublik. The method requires only that a competent geologist describe a statistically representative amount of core. In the case of the Shublik, not only was it possible to find a model for which the log lithology agreed with the core very well, but the overwhelming majority of the mismatched data could plausibly be explained as thin-bed effects, log resolution problems, and core/log misalignment problems. In summary, discriminant analysis applied to log lithology determination is a promising technique for reservoir description.

Nomenclature
$f_{1,2,3}$ = distribution curve functions for the three lithologies
d = thickness of zone, ft [m]
h = proportional height-in-zone variable
$h_a$ = height above zone bottom, ft [m]
M = lithoporosity function
N = lithoporosity variable
$\pi_{1,2,3}$ = proportions of sideritic mudrock, shale, and limestone, respectively

Acknowledgments
We thank B.E. Hunter, D.K. Davies, and D.B. Schafer for their invaluable help in the geologic portion of this work. We also thank Arco Alaska Inc., Exxon Co. USA, and Sohio Petroleum Co. for their permission to publish this paper.

References
1. Burke, J.A., Campbell, R.L. Jr., and Schmidt, A.W.: "The Litho-Porosity Crossplot," Trans., SPWLA Annual Logging Symposium (1969) paper Y.
2. Clavier, C. and Rust, D.H.: "MID-PLOT: A New Lithology Technique," The Log Analyst (Nov.-Dec. 1976) 16-24.
3. Delfiner, P.C., Peyret, O., and Serra, O.: "Automatic Determination of Lithology from Well Logs," SPEFE (Sept. 1987) 303-10.
4. Johnson, R.A. and Wichern, D.W.: Applied Multivariate Statistical Analysis, Prentice-Hall Inc., Englewood Cliffs, NJ (1982) 461-531.
5. Anderson, T.W.: An Introduction to Multivariate Statistical Analysis, John Wiley Publishing Co., New York City (1958) 126-53.
6. Gnanadesikan, R.: Methods For Statistical Data Analysis of Multivariate Observations, John Wiley Publishing Co., New York City (1977) 82-103.
7. Boutemy, Y., Simond, R.G., and Clavier, C.: "Field Studies: A Progress Report On The Contribution of Logging," paper SPE 8178 presented at the 1979 SPE Offshore Europe Conference, Aberdeen, Sept. 3-7.
8. Haslett, J.: "Maximum Likelihood Discriminant Analysis On the Plane Using a Markovian Model of Spatial Context," Pattern Recognition (1985) 18, 287-96.

Appendix-Statistical Considerations
In the work reported in this paper, except for the pedagogic examples, actual lithology proportions were used to obtain weighted classification probabilities. This procedure is justifiable in any situation in which the available core may be reasonably considered to be a random sample from the reservoir. In our case, because only wells whose cores completely penetrated the Shublik were used, we considered this to be the best thing to do. Situations are conceivable in which this proportion weighting would not be indicated. For example, if a geologist examined core only until reaching a certain quota of samples for each of several lithologies, rather than until completely describing available core; if a number of the wells had core only from the top or bottom of the formation; or if wells had been cored in some region of the field specifically for the purpose of obtaining data on what was considered a possible geologic anomaly, then the existing data would be subject to sampling bias. In such cases, the appropriate weights are not obvious and would have to be determined on a case-by-case basis.

An additional factor that might be used in a more complex weighting scheme is that of transition probabilities. If deposition is considered to be a first-order Markov process, then the lithology probability at any depth depends on the actual lithology at the next
lower depth. The transition probabilities (the probabilities that a given observation is of Lithology x when the next lower observation was of Lithology y) for all pairs (x,y) of lithologies can be easily estimated. A method for incorporating this additional information into discriminant analysis has recently been given by Haslett.⁸ This method promises to be especially valuable in formations where (1) there is a pronounced tendency for lithologies to persist from one observation to the next or (2) the classic discriminant analysis shows evidence (through probability vs. depth plots) of considerable ambiguity in the determination of lithology.

Another matter that needs careful study in an actual application is the metric in which the data are measured. Well-to-well variation in log response may suggest calibration offsets,⁷ although the usefulness of these in applications to uncored wells is slight. In many cases, the distribution of the variables used may be nonnormal (non-Gaussian). Although the nonnormality frequently does not damage the inference from the analysis, in extreme cases it is worth considering a transformed variable. In this study, the logarithm of the gamma ray, rather than the gamma ray itself, was used for this reason. The gamma ray was also calibrated, or normalized, by standardizing its response in known shale markers. The neutron logs available included several tool types, which were also standardized before use in the analysis.

The discriminant analysis used in this study is simple linear discriminant analysis, which we felt was preferable to more complicated alternatives in the absence of indications from the data that such were called for. Its mathematical derivation rests on the assumption, among others, that the variances of the variables (actually, their variance/covariance matrix) do not differ from one population (lithology) to another. This assumption can be formally tested by use of Bartlett's test, for example, but this test is known to be extremely unreliable in the absence of normality of distribution, and we have chosen to rely on a simple comparison of intrapopulation (within lithology) standard deviations and correlation coefficients. More complicated kinds of discriminant analysis have been proposed, but software to perform them is not readily available, and their properties are not as well known. In any event, a primary purpose of this paper is to suggest that a relatively simple technique is available and deserves wider application.

One further topic should be mentioned, although space limitations preclude an adequate discussion. Discriminant analysis classifies each observation into some lithology, and the assigned classification probabilities are predicated on the assumption that some of the stated lithologies are in fact correct. It may be, of course, that none of the assumed lithologies matches the log response, either because one or more of the logs is anomalous or because an uncored well has penetrated a lithology different from any examined by the geologist. In this event, a statistic known as the Mahalanobis distance may reveal the situation. This statistic will have approximately a χ² distribution, and thus observations that are incompatible with the known lithologies may be detected.

SI Metric Conversion Factor
ft × 3.048* E-01 = m
*Conversion factor is exact.
SPEFE

Original SPE manuscript received for review Sept. 22, 1985. Paper accepted for publication July 11, 1986. Revised manuscript received Dec. 8, 1986. Paper (SPE 14301) first presented at the 1985 SPE Annual Technical Conference and Exhibition held in Las Vegas, Sept. 22-25.
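
The Mahalanobis-distance check mentioned in the Appendix can be sketched for the two-variable (M, N) example. This Python illustration is an assumption-laden sketch rather than the authors' procedure. It reuses the class means, pooled standard deviations, and pooled correlation quoted in the text, and flags a reading when its squared distance to the nearest lithology exceeds a χ² quantile with two degrees of freedom. The 0.99 level and the function names are hypothetical choices.

```python
import math

MEANS = {
    "sideritic mudrock": (0.6010, 0.3589),
    "shale": (0.6717, 0.4798),
    "limestone": (0.7394, 0.5195),
}
SD_M, SD_N, RHO = 0.0412, 0.0409, 0.6880


def inverse_covariance():
    """Inverse of the pooled 2x2 covariance matrix, returned as (a, b, c) for [[a, b], [b, c]]."""
    cov = RHO * SD_M * SD_N
    det = (SD_M * SD_N) ** 2 - cov ** 2
    return SD_N ** 2 / det, -cov / det, SD_M ** 2 / det


def mahalanobis_sq(m, n, mean):
    """Squared Mahalanobis distance from (m, n) to a class mean."""
    a, b, c = inverse_covariance()
    dm, dn = m - mean[0], n - mean[1]
    return a * dm * dm + 2.0 * b * dm * dn + c * dn * dn


def chi2_quantile_2dof(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom.

    For 2 degrees of freedom the distribution function is 1 - exp(-x/2),
    so the quantile has the closed form -2 ln(1 - p).
    """
    return -2.0 * math.log(1.0 - p)


def is_incompatible(m, n, level=0.99):
    """True when (m, n) is far from every known lithology at the chosen level."""
    nearest = min(mahalanobis_sq(m, n, mean) for mean in MEANS.values())
    return nearest > chi2_quantile_2dof(level)


if __name__ == "__main__":
    print(is_incompatible(0.6717, 0.4798))   # reading at the shale mean
    print(is_incompatible(0.90, 0.25))       # hypothetical anomalous reading
```

With more than two discriminating variables the same test applies with the degrees of freedom set to the number of variables, although the quantile then no longer has this simple closed form.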

