ARTICLE INFO

Keywords: Machine learning; Unconventional reservoir; Brittleness; Reservoir property estimation; Marcellus shale

ABSTRACT

Brittleness is an important geomechanical property of reservoirs, which is usually estimated from cores or sonic logs that are expensive to acquire. In this study, we report data-driven, machine learning workflows to predict brittleness from less expensive, readily available conventional logs. We propose three strategies to predict brittleness using gamma ray, neutron porosity, density, caliper, sonic, and photoelectric factor logs by utilizing gradient boosting (GB), support vector regression (SVR), and neural networks (NN). The first strategy involves predicting brittleness directly from the logs while the second strategy predicts shear sonic logs used for the estimation of brittleness. The performance of the models given as R² on deployment on the testing set for the first strategy is: GB (0.87), SVR (0.73), and NN (0.82), while for the second strategy: GB (0.94), SVR (0.87), and NN (0.94). In the third strategy, we convert the prediction into a classification task by grouping the brittleness estimate into ductile, transition, and brittle. The accuracy of model deployment on the testing set for the third strategy is: GB (89.37%), SVM (89.06%), and NN (89.16%). We demonstrate that depending on the strategy adopted, the gradient boosting algorithm outperforms the other peer algorithms in terms of training and validation scores. Furthermore, we combine the three algorithms using a committee machine to improve the performance of the model. The workflow in this study can be adopted to predict other reservoir properties from available logs. The workflow can also be adopted to characterize reservoir heterogeneity from seismic traces trained by well logs.
* Corresponding author.
E-mail address: dengliang.gao@mail.wvu.edu (D. Gao).
https://doi.org/10.1016/j.cageo.2022.105266
Received 2 May 2022; Received in revised form 13 October 2022; Accepted 4 November 2022
Available online 8 November 2022
0098-3004/© 2022 Elsevier Ltd. All rights reserved.
T. Ore and D. Gao Computers and Geosciences 171 (2023) 105266
Fig. 1. A base map of the location where the data were acquired, showing the available wells and the seismic survey. The red wells on the map have full-wave sonic logs for the estimation of brittleness, while the black wells do not. The wells outside the seismic survey are used to enlarge the database for building the model, for better prediction of brittleness in the black wells within the seismic survey. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
exploration for unconventional plays where these data are not available. One possible solution to this is the use of machine learning methods to fill the gap in a data-driven fashion.

The adoption of machine learning to predict reservoir brittleness has been well-documented in recent literature. Wood (2021) utilized a data-matching algorithm to predict the mineralogical brittleness index of the Lower Barnett Shale using gamma ray, density, neutron, resistivity, and sonic logs. Negara et al. (2017) applied a support vector regression to predict the brittleness index from elemental spectroscopy and petrophysical properties and documented that the prediction was a good match with the laboratory-measured brittleness indices. Ahmadov (2021) utilized linear, ridge and lasso regression, K-nearest neighbors, support vector machine, decision tree, random forest, AdaBoost, and gradient boosting to predict the geomechanically derived brittleness index of the Tuscaloosa Shale and found that the tree-based methods (gradient boosting and random forest) outperformed all other methods, which agrees with the results of Ore and Gao (2021) and Ore's thesis (2020). In a similar work, Sun et al. (2020) applied Chi-square automatic interaction detector, random forest, support vector machine, K-nearest neighbors, and artificial neural network to predict the brittleness of rock samples from a water transfer tunneling project in Malaysia using the results of simple rock index tests as predictors. They found that the random forest model performed best for the training and validation sets, while the artificial neural network performed better on the test set. However, they emphasized that the K-nearest neighbors model was the most stable of all models. Ye et al. (2022) proposed a method to predict the brittleness index of the Wufeng-Longmaxi and Baota formations of the Sichuan Basin via principal component analysis and back-propagation neural networks.

In this study, we adopt a workflow that can efficiently estimate and predict the brittleness of reservoirs in wells without the right set of data by utilizing gradient boosting, support vector regression, and artificial neural network algorithms. Advanced pre-processing of the data is incorporated into the workflow to reduce uncertainty and improve the predictive ability of the models. This study represents the first documented attempt to estimate and predict the brittleness of the Marcellus Shale in the Appalachian Basin. We propose a new brittleness index that conforms with the known physical properties of rocks, highlight several best practices in building machine learning models, and suggest three strategies to approach the problem of brittleness prediction. The output of these models will aid the understanding of the brittleness variability in the reservoir, which will be useful for the optimization of well operation and for reservoir modeling.

1.1. Dataset

The dataset is from the Appalachian Basin with the target reservoir being the Marcellus Shale. There are 9 wells with different available geophysical logs, 6 of which lie within the available seismic survey, which will be used for brittleness inversion in the future (Fig. 1). Since the logs
needed for building the machine learning models must be available in all wells, this poses a significant constraint on the process. For this reason, gamma ray (GR), neutron porosity (NPHI), density (RHOZ), caliper (HCAL), compressional sonic (DTCO), and photoelectric factor (PEFZ) are used as the predictors, where all the logs are environmentally corrected before use in the model building. Only 4 of the 9 available wells have the full sonic-wave log suites needed to estimate brittleness; as a result, these 4 wells are used to build the model and evaluate how well it predicts brittleness for other wells.

1.2. Brittleness

Brittleness is the ease with which a rock breaks and creates planes of weakness under a given differential stress. This property is dependent on lithology, mineral content, temperature, fluids, diagenesis, and effective stress (Perez and Marfurt, 2013). The goal of a hydraulic fracturing operation is to increase permeability in the rock through the opening of natural fractures or the creation of new fractures, making rocks with high brittleness suitable candidates. Other factors, such as loading history, engineering design of the stimulation, and the lithology of the top and base of the reservoir, are important in determining whether fractures will eventually be induced by the stimulation.

Researchers often debate the definition of brittleness and the method of estimating this geomechanical property, making it a contentious topic. Consequently, no globally accepted brittleness estimation method has been put forward to date (Altindag and Guney, 2010). However, based on extensive reviews of the concept in the literature, the methods that utilize mineralogy and elastic properties seem to be gaining traction (Mews et al., 2019).

Jarvie et al. (2007) estimated brittleness by using a ratio of the mineral content (equation (1)). They hypothesized that mineralogical characteristics are important for the success of stimulation and fracturing. Wang and Gale (2009) modified this relationship by considering the impact of dolomite and organic content on the brittleness (equation (2)). These estimation methods do not factor in the effect of stress regime and diagenesis on the brittleness, resulting in poor applicability in different plays with unique mineralogical properties.

$$ BI_j = \frac{Q}{Q + Ca + Cl} \qquad (1) $$

$$ BI_w = \frac{Q + D}{Q + D + Ca + Cl + TOC} \qquad (2) $$

the shear sonic (μs/ft), Δt_comp is the compressional sonic (μs/ft) and ρ_b is the bulk density (kg/m³).

The YME and PR are estimated using equations (5) and (6):

$$ E = \frac{9GK}{G + 3K} \qquad (5) $$

$$ v = \frac{3K - 2G}{6K + 2G} \qquad (6) $$

where E and v are Young's modulus and Poisson's ratio, respectively.

Conventionally, the computed YME and PR are normalized by subtracting the minimum from each value and dividing by the range (min-max normalization) using equations (7) and (8), and the brittleness average is then estimated using equation (9):

$$ E_{brittleness} = \frac{E - E_{min}}{E_{max} - E_{min}} \qquad (7) $$

$$ v_{brittleness} = \frac{v - v_{min}}{v_{max} - v_{min}} \qquad (8) $$

$$ BA = \frac{E_{brittleness} + v_{brittleness}}{2} \qquad (9) $$

where E_min is the minimum YME, E_max is the maximum YME, v_min is the minimum PR, v_max is the maximum PR and BA is the brittleness average.

However, the brittleness average of Rickman et al. (2008) in equation (9) depends on how the YME brittleness and PR brittleness are normalized, and can be incorrect if the conventional min-max normalization scheme is used. For example, the conventional normalization for PR brittleness using equation (8) could lead to an incorrect estimate of the brittleness, as tested in section 3.1. Rickman et al. (2008) used a normalization scheme for YME and PR brittleness, but it is complicated, inconsistent and difficult to interpret.

To evaluate the brittleness average correctly with a simple and consistent normalization scheme, we propose a modified normalization for the PR brittleness using equation (10). This approach gives a straightforward and better-constrained brittleness estimate and conforms with scientific knowledge about the geomechanical behavior of materials, as tested by computational experimentation in section 3.1.

$$ v_{brittleness} = \frac{\frac{1}{v} - \min\left(\frac{1}{v}\right)}{\max\left(\frac{1}{v}\right) - \min\left(\frac{1}{v}\right)} \qquad (10) $$
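The effect of the two Poisson's ratio normalizations on the brittleness average can be illustrated with a minimal sketch of equations (5) to (10). The moduli values below are hypothetical and chosen only so that stiffness increases while Poisson's ratio decreases; the function names are illustrative and are not taken from the study.

```python
import numpy as np

def elastic_moduli(G, K):
    """Young's modulus E and Poisson's ratio v from shear (G) and bulk (K) moduli, equations (5) and (6)."""
    G = np.asarray(G, dtype=float)
    K = np.asarray(K, dtype=float)
    E = 9.0 * G * K / (G + 3.0 * K)
    v = (3.0 * K - 2.0 * G) / (6.0 * K + 2.0 * G)
    return E, v

def min_max(x):
    """Min-max normalization to the range [0, 1], as in equations (7) and (8)."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def brittleness_average(E, v, inverse_pr=True):
    """Brittleness average, equation (9).

    inverse_pr=True normalizes 1/v (equation (10));
    inverse_pr=False normalizes v directly (equation (8)).
    """
    e_brit = min_max(E)
    v_brit = min_max(1.0 / v) if inverse_pr else min_max(v)
    return 0.5 * (e_brit + v_brit)

# Hypothetical moduli (GPa), not data from the study.
G = np.array([10.0, 14.0, 18.0, 22.0])
K = np.array([30.0, 28.0, 26.0, 24.0])

E, v = elastic_moduli(G, K)
ba_conventional = brittleness_average(E, v, inverse_pr=False)
ba_proposed = brittleness_average(E, v, inverse_pr=True)
```

In this toy case the stiffest, lowest-PR sample receives the highest brittleness average only under the inverse normalization; with equation (8) the two end-member samples receive the same value.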
of the model.
(Table 1, continued)

Model | Hyperparameter | Description
 | | parameter controls the distance of the influence of a single training point. Typically, models with a very large gamma tend to overfit
Gradient Boosting | Maximum depth | This is the maximum number of nodes allowed from the root to the farthest leaf of a tree. The ability of the model to learn complex relationships depends on the depth; however, deeper trees are susceptible to overfitting
 | Minimum child weight | This is the minimum weight required for a new node to be created in the tree. More branches are created in the tree if it is small, but the problem of overfitting can also arise
 | Number of estimators | This is the number of gradient-boosted trees, which is equivalent to the number of boosting rounds
 | Learning rate | This is related to the weights of the nodes and determines how fast the boosting learning will reach a minimum
Neural Network | Number of hidden layers | The robustness of the data and the nature of the problem to be modeled must be considered when selecting the appropriate number of hidden layers. Also, as described above, when too small they lead to high bias and when too large they lead to the problem of overfitting
 | Number of neurons per layer | This also must be tuned to find the combination that results in a properly trained network whose output lies between high bias and high variance
 | Solver | This is the algorithm the neural network will use to update the weights of every layer after each epoch. Some popular algorithms which will be tested are stochastic gradient descent, Adam and lbfgs (Limited-memory Broyden-Fletcher-Goldfarb-Shanno)
 | Learning rate | The rate at which the weights are updated
 | Number of iterations | The number of times the learning algorithm will train over the entire training dataset. That is, one iteration/epoch means that each sample in the training dataset has had a chance to update the internal model parameters (weights). This plays an important role in how well the model fits the training dataset. This is usually tuned based on computational power and time

2.5. Data preprocessing

Data in raw form are often not consumable by machine learning models because they are not in the right format or contain information that can adversely affect the performance of the model. Therefore, preprocessing is a de facto first step of a machine learning pipeline and comprises several steps, the first being outlier removal (Felix and Lee, 2019).

Extreme data points, which can represent errors in data collection, are called outliers. These outliers are often removed by studying the box-and-whisker plot. However, this process is subjective and can lead to bias in the outlier removal. The isolation forest algorithm is adopted to remove the outliers in the dataset. This is a tree-based algorithm that isolates data points by randomly selecting a feature and a split value between that feature's minimum and maximum (Liu et al., 2008). The path length between the root and the terminating node is equivalent to the number of splits required to isolate a data point, which is averaged over the forest. Shorter path lengths are representative of potential anomalies, which will adversely affect the model's performance.

In machine learning classification tasks, a major problem is class imbalance in the dataset, where the models achieve a high accuracy just by predicting the majority class but fail to capture the minority class. A widely adopted technique to circumvent this problem is resampling, i.e., removing samples (under-sampling) and/or adding more samples (over-sampling). In this work, we implement a combination, developed by Batista et al. (2004), of over-sampling (Synthetic Minority Oversampling Technique) and under-sampling (Edited Nearest Neighbor), together referred to as SMOTEENN. When using the Synthetic Minority Oversampling Technique (SMOTE), a point from the minority class is selected at random, and its k-nearest neighbors are calculated. Between the selected point and its neighbors, the synthetic points are added (Chawla et al., 2002). The Edited Nearest Neighbor (ENN) under-sampling technique eliminates instances of the majority class on the boundary whose predictions by the k-nearest neighbors algorithm are different from the other majority class points (Wilson, 1972).

The dataset is then split into partitions: train, validation, and test. The model learns the relationship between the input features and output from the training set. Typically, the model is trained iteratively via grid search cross-validation and the validation set is used to evaluate the performance of the model to prevent overfitting. The test set is used to assess the generalization of the model on the dataset, as the model has not seen this set and data leakage is avoided. It is important to split the dataset in a manner that the partitions have similar distributions to avoid bias.

2.6. Feature scaling

Some machine learning algorithms are sensitive to the magnitude of the range of features and the variance in the dataset, making feature scaling an important step in the model-building pipeline. Generally, the performance of most machine learning algorithms improves significantly upon scaling the features (Han et al., 2011). This can be done by normalization and standardization of the features, and the choice of scaling depends on the algorithm. Here, we use the min-max normalization and the Standard scaler approach and assess which technique results in a better-performing model. The min-max scaler transforms each feature to range between 0 and 1 while preserving the shape of the original data distribution, by using the relationship in equation (22).

$$ \hat{x} = \frac{x - \min(x)}{\max(x) - \min(x)} \qquad (22) $$

where x is the original value, and x̂ is the normalized value. The Standard scaler centers the data around a mean of 0 with a standard deviation of 1 through equation (23).
$$ \hat{x} = \frac{x - \mu}{\sigma} \qquad (23) $$

where μ is the mean of the values, and σ is the standard deviation of the values.

2.7. Hyperparameter tuning

A machine learning model's performance depends on configurations that can either be learned from the data or predefined prior to training the models. Parameters are internal to the model and are often estimated through an optimization technique, e.g., the support vectors of an SVR or the weights in an NN. On the other hand, hyperparameters are external to the model and are set manually, e.g., the learning rate (Li et al., 2017). Hyperparameters can be thought of as a-priori parameters whose suitable values are found by random search over a range of combinations. Table 1 describes the hyperparameters considered for each model.

An efficient way to train the model with a host of hyperparameters is by Grid Search Cross-Validation. Here, an array of possible hyperparameters is passed, and all possible combinations are tested for the optimal hyperparameters based on a predefined evaluation metric (Krstajic et al., 2014). The training data is split into K folds; for a given hyperparameter combination, the model is trained on K − 1 folds and the left-out fold is used for testing (Fig. 4). This is an iterative process that is repeated K times for a hyperparameter combination, so that each fold serves once as the left-out fold.

2.8. Estimating model performance

The predictive capability of a machine learning model is assessed through its performance on unseen data. This performance is typically evaluated using statistical relationships that compare the difference between the predicted and true values. Depending on the modeling task (classification or regression), there exists a plethora of evaluation metrics. Here, we utilize R-squared (R²), Mean Squared Error (MSE), and Mean Absolute Error (MAE) for the regression, while the accuracy score and confusion matrix are used to define the performance for the classification.

The R² score is a measure of the amount of variation in the dependent variable that is explained by the independent variable. It is calculated using equation (24):

$$ R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \qquad (24) $$

The MSE is the average of the squared differences between the predicted and true values:

$$ MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \qquad (25) $$

The MAE is the average of the absolute error values:

$$ MAE = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i| \qquad (26) $$

where N is the data size, y_i is the true value, ŷ_i is the predicted value, and ȳ is the average of the true values ($\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i$).

The confusion matrix is used to assess the performance of a classification model through a count of misclassifications, which can be displayed in tabular form (Fig. 5). This table gives all the information required to quantify the predictive ability of a classification model. The accuracy score is the ratio of the number of correct predictions to the total number of predictions made.

3. Results and discussion

3.1. Elastic based brittleness estimation

Brittleness is estimated using equations (5)–(10) for comparative analysis. The estimates utilizing the Poisson's ratio brittleness relationships in equations (8) and (10) are compared with Young's modulus and Poisson's ratio via a cross-plot (Fig. 6), and the viability of the methods to estimate the brittleness of the Marcellus Shale is assessed.

Rocks in the subsurface are confined, restricting how much they can extend or shorten, especially in the horizontal direction. For example, a buried sedimentary rock would experience a vertical shortening that could only be compensated by a horizontal extension. By squeezing a sample, one can simulate this in the laboratory by creating horizontal stresses that will balance the axial shortening (Fossen, 2016). This axial strain can be expressed as:

$$ e_z = \frac{1}{E} \left[ \sigma_1 - v(\sigma_2 + \sigma_3) \right] \qquad (27) $$

A similar expression can be found for horizontal stresses (Fossen, 2016). Considering the vertical stress σ1, σ2 = σ3, and the boundary condition e_x = 0, we obtain:

$$ \sigma_2 = \sigma_3 = \frac{v}{1 - v} \sigma_1 \qquad (28) $$

The horizontal stress σ3 therefore increases with Poisson's ratio (v), which has implications for the differential stress required for the failure of the rock. In other words, an increase in Poisson's ratio results in a higher differential stress needed to create fractures in the rock, so Poisson's ratio is inversely related to the brittleness of the rock. This contrasts with the brittleness estimate using the min-max normalization scheme, where the brittleness increases with increasing Poisson's ratio (Fig. 6a), while our proposed method shows the expected relationship (Fig. 6b).

As a proof of concept, we compare the proposed brittleness relationship in this paper with known rock properties and mineralogy. In Fig. 7, the two estimates are compared with closure pressure, a hydraulic fracture design parameter obtained from a diagnostic fracture injection test that identifies the pressure at which the fracture closes without
Fig. 6. A cross-plot of Young's modulus against Poisson's ratio with an overlay of the brittleness estimate (color) using the Poisson's ratio brittleness normalization of equation (8) on the left and the proposed normalization of equation (10) on the right. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
Fig. 7. A cross-plot of closure pressure against brittleness estimate. The closure pressure has an inverse relationship to the brittleness of the rock (Li et al., 2020). The brittleness estimate on the left does not reflect this relationship, whereas the proposed brittleness estimate on the right does.
Fig. 8. Ternary diagram of the mineral composition of the silica-rich Marcellus Shale overlaid by the brittleness estimate. On the left is the incorrectly normalized
brittleness estimate which shows that the high brittleness is clustered in the low silicate mineral region.
Fig. 9. Statistical relationship between the geophysical logs. (a) Mutual information (b) Pearson’s correlation.
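The two association measures summarized in Fig. 9 can be sketched as follows, assuming scikit-learn's `mutual_info_regression` as the mutual information estimator (the study does not state which implementation was used). The logs here are synthetic stand-ins with made-up coefficients, not the study's data.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins for three logs; values and coefficients are hypothetical.
dtco = rng.normal(70.0, 8.0, n)        # compressional sonic
nphi = rng.normal(0.15, 0.04, n)       # neutron porosity
unrelated = rng.normal(0.0, 1.0, n)    # a log with no link to the target

brittleness = (-0.03 * (dtco - 70.0)
               - 2.0 * (nphi - 0.15)
               + rng.normal(0.0, 0.05, n))

X = np.column_stack([dtco, nphi, unrelated])
names = ["DTCO", "NPHI", "UNRELATED"]

# Mutual information: non-negative, also captures non-linear dependence.
mi = mutual_info_regression(X, brittleness, random_state=0)

# Pearson correlation: covariance divided by the product of standard deviations.
pearson = np.array([np.corrcoef(X[:, j], brittleness)[0, 1]
                    for j in range(X.shape[1])])

for name, m, r in zip(names, mi, pearson):
    print(f"{name:10s} MI={m:.3f}  r={r:+.3f}")
```

Ranking features by both measures, as done for Fig. 9, guards against discarding a log whose relationship to brittleness is strong but not linear.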
sensitive to mineralogical change and elastic properties that will have brittleness implications are considered. These logs are caliper (HCAL), bulk density (RHOZ), gamma ray (GR), neutron porosity (NPHI), photoelectric factor (PEFZ), and compressional sonic (DTCO).

To assess the relationship and association between the features, the mutual information and Pearson correlation are utilized. Mutual information measures the dependence between two random variables and is non-negative. It quantifies the amount of information about one random variable that can be learned from observing another random variable. Higher values indicate higher dependence, and it equals zero only when two random variables are independent (Kraskov et al., 2004). The Pearson correlation is a measure of the linear correlation between two random variables. It is the ratio between the covariance of the variables and the product of their standard deviations, resulting in a value between −1 and 1. The result shows that in terms of mutual information, the compressional sonic log has the highest association with brittleness, while in terms of correlation, the neutron porosity has the highest correlation (Fig. 9). The caliper log has the lowest association with brittleness but is included as a predictor since its correlation is > 0.1. However, the density log is not used as a predictor in the model building as the association is low and the correlation is close to 0.

3.3. Data preprocessing

Basic statistics are used to understand the nature of the data and its distribution. Specifically, we observe the range and variance of the features as machine learning models are, most times, sensitive to
Table 3
The statistical summary of the raw data prior to preprocessing. The range and order of magnitude of the features are different, reinforcing the need to normalize the
data before model building.
count mean std min 25% 50% 75% max
Table 4
Dataset summary after scaling using the min-max approach. The range of each feature in the data set is constrained to 1 without affecting the shape of the distribution.
count mean std min 25% 50% 75% max
Table 5
Dataset summary after standardization. The features are constrained to a distribution with mean of 0 and standard deviation of 1.
count mean std min 25% 50% 75% max
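The two scaling schemes summarized in Tables 4 and 5 can be written directly from equations (22) and (23). The sketch below uses plain NumPy and a small hypothetical array of two logs; it is an illustration, not the study's preprocessing code.

```python
import numpy as np

def min_max_scale(X):
    """Equation (22): rescale each column to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    return (X - x_min) / (x_max - x_min)

def standard_scale(X):
    """Equation (23): center each column on 0 with unit standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Hypothetical samples: column 0 is gamma ray (API), column 1 is bulk density (g/cm3).
X = np.array([[45.0, 2.45],
              [120.0, 2.55],
              [210.0, 2.60],
              [330.0, 2.70]])

X_minmax = min_max_scale(X)
X_standard = standard_scale(X)
```

In a train/validation/test workflow, the minimum, maximum, mean and standard deviation would normally be computed on the training partition only and then reused for the other partitions, which avoids leaking information from the test set.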
Fig. 11. Boxplot of the well log data which is informative of the skewness and the outliers. Using the information on the interquartile ranges from the boxplot alone
to remove outliers will result in the loss of vital information.
Fig. 12. Boxplot of the data after outlier removal using the Isolation Forest algorithm. This data is better constrained for building machine learning models,
especially those sensitive to outliers.
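Outlier removal of the kind shown in Fig. 12 can be sketched with scikit-learn's `IsolationForest`. The contamination value of 0.1 is the one the text reports for strategy 3 and is assumed here for illustration; the two-log dataset and the injected spikes are synthetic.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic two-log dataset (hypothetical gamma ray and bulk density values).
inliers = rng.normal(loc=[100.0, 2.5], scale=[15.0, 0.05], size=(300, 2))
spikes = np.array([[400.0, 1.2], [5.0, 3.4], [380.0, 3.3]])  # injected bad readings
X = np.vstack([inliers, spikes])

# contamination is the expected fraction of outliers (assumed 0.1 here).
forest = IsolationForest(contamination=0.1, random_state=0)
labels = forest.fit_predict(X)   # +1 for inliers, -1 for outliers

X_clean = X[labels == 1]
print(f"kept {len(X_clean)} of {len(X)} samples")
```

Because points with short average path lengths are flagged first, the injected spikes are among the samples removed, without the analyst having to pick per-log thresholds from a boxplot.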
Table 6
Model performance of strategy 1. The gradient boosting model has the best training set performance, but the committee machine outperformed other models in the validation and testing set.

Model | Data | R² | MSE | MAE
Support Vector Regression | Training | 0.9588 | 9.2 × 10⁻⁴ | 2.0 × 10⁻²
 | Validation | 0.9297 | 1.7 × 10⁻³ | 2.6 × 10⁻²
 | Testing | 0.7298 | 3.3 × 10⁻³ | 4.4 × 10⁻²
Gradient Boosting | Training | 0.9928 | 1.6 × 10⁻⁴ | 7.6 × 10⁻³
 | Validation | 0.9505 | 1.2 × 10⁻³ | 2.2 × 10⁻²
 | Testing | 0.8685 | 1.6 × 10⁻³ | 2.8 × 10⁻²
Neural Network | Training | 0.9057 | 2.2 × 10⁻³ | 3.5 × 10⁻²
 | Validation | 0.9043 | 2.3 × 10⁻³ | 3.5 × 10⁻²
 | Testing | 0.8193 | 3.5 × 10⁻² | 3.5 × 10⁻²
Committee Machine | Training | 0.9625 | 9.1 × 10⁻⁴ | 2.0 × 10⁻²
 | Validation | 0.9596 | 9.8 × 10⁻⁴ | 2.0 × 10⁻²
 | Testing | 0.8782 | 1.5 × 10⁻³ | 2.8 × 10⁻²
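The R², MSE and MAE columns reported in Table 6 follow equations (24) to (26). A minimal NumPy sketch is given below; the brittleness values are hypothetical and the helper name `regression_scores` is illustrative.

```python
import numpy as np

def regression_scores(y_true, y_pred):
    """R2, MSE and MAE as defined in equations (24), (25) and (26)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    ss_res = np.sum(residual ** 2)                    # numerator of equation (24)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # denominator of equation (24)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mse": np.mean(residual ** 2),
        "mae": np.mean(np.abs(residual)),
    }

# Hypothetical true and predicted brittleness values.
y_true = np.array([0.20, 0.35, 0.50, 0.65])
y_pred = np.array([0.25, 0.30, 0.55, 0.60])

scores = regression_scores(y_true, y_pred)
```

Reporting all three together is useful because R² is scale-free while MSE and MAE carry the units of the target, which is why the strategy 1 errors (brittleness, 0 to 1) and the strategy 2 errors (shear sonic) differ by orders of magnitude.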
actual subsurface data which can give more illumination of the physical behavior of the stratigraphy.

Fig. 13. A histogram showing the significance of each feature in training the decision trees of the gradient boosting model.
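A feature-significance histogram like the one in Fig. 13 is typically drawn from the impurity-based importances of the fitted trees. The sketch below assumes scikit-learn's `GradientBoostingRegressor` (the study's library is not stated in this section) and uses synthetic features in which the DTCO and NPHI stand-ins drive the target by construction.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

feature_names = ["GR", "NPHI", "HCAL", "DTCO", "PEFZ"]

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))   # synthetic, standardized stand-ins for the logs

# Hypothetical target: dominated by column 3 (DTCO), then column 1 (NPHI).
y = 0.6 * X[:, 3] + 0.3 * X[:, 1] + rng.normal(scale=0.05, size=300)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Importances are normalized to sum to 1 across features.
importances = model.feature_importances_
ranking = sorted(zip(feature_names, importances),
                 key=lambda pair: pair[1], reverse=True)

for name, value in ranking:
    print(f"{name:5s} {value:.3f}")
```

Impurity-based importances reflect how often and how effectively a feature is used for splitting, so they describe the fitted model rather than a physical causal ranking.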
Fig. 14. Model results on deployment on the blind well for strategy 1.
Fig. 15. Error histogram showing the distribution of the difference of the actual and predicted brittleness on deployment of the strategy 1 model on the validation
and test set.
Table 7
Model performance of strategy 2. The gradient boosting model outperformed the other models in the training and validation sets but the committee machine is the best for the blind test set.

Model | Data | R² | MSE | MAE
Support Vector Regression | Training | 0.9887 | 4.4042 | 1.5111
 | Validation | 0.9514 | 20.5059 | 2.3744
 | Testing | 0.8684 | 31.6586 | 4.0317
Gradient Boosting | Training | 0.9980 | 0.7844 | 0.64637
 | Validation | 0.9863 | 5.7939 | 1.5502
 | Testing | 0.9353 | 15.55721 | 3.0236
Neural Network | Training | 0.9821 | 6.9836 | 2.0579
 | Validation | 0.9752 | 10.4656 | 2.3759
 | Testing | 0.9432 | 13.6690 | 2.6000
Committee Machine | Training | 0.9925 | 2.9301 | 1.2978
 | Validation | 0.9838 | 6.8545 | 1.8009
 | Testing | 0.9505 | 11.8954 | 2.5860

further investigated (Fig. 13). The sonic and neutron, referred to as porosity logs, are the most influential features in the building of the GB model. The neutron log is sensitive to the presence of gas, which can serve as a proxy for natural fractures in shales, while the sonic log is the basis for the brittleness estimation, which further supports the algorithm's selection of these features as important in the splitting of the decision trees for the ensemble model. This follows physical intuition as the brittleness estimate is related to porosity (Heidari et al., 2014).

The SVR, NN, and GB performed relatively well in modeling the brittleness, as can be seen in the prediction on the blind well (test set). However, it appears that they are sensitive to the gamma-ray values, as regions with high gamma ray display unusual patterns. Regardless, the models correctly predict the trend of brittleness in the blind well (Fig. 14). Analysis of the error histograms gives further information on the distribution of the errors between the estimated and predicted brittleness. This is also indicative of the uncertainty associated with the models. In all models, the errors in the prediction of the validation set are normal and centered at zero. However, the errors of the SVR in the prediction of the test set are skewed to the underestimated region, with about 80% in that region, and the bound for the prediction is ±0.2. The SVR, NN, and GB results are combined to form a committee machine capable of better generalization for the prediction of brittleness from the geophysical logs. This is reflected by the improvement in the performance scores (testing set scores: R² of 0.8782, MSE of 0.0015, and MAE of 0.028) and the reduction in the error variance (Fig. 15).

3.4.2. Strategy 2

Another approach to estimating brittleness for wells without the required geophysical logs is to use machine learning algorithms to predict the shear sonic log before using an empirical relationship to calculate the brittleness estimate. This is a problem that has been extensively addressed in the literature, making this approach feasible (Bukar et al., 2019; Zhang et al., 2022; Rajabi et al., 2010; Yu et al., 2021). However, a major drawback of this strategy is that the wells in which the brittleness is to be predicted must have the compressional sonic log for the brittleness calculation, as this information is also very important for enhancing the predictive capability of the machine learning models for the shear sonic. Typically, sonic logs are not available for all wells, making this strategy not the first choice in predicting brittleness.

For this task, the geophysical logs used as predictors are the gamma ray, neutron porosity, caliper, compressional sonic, and photoelectric factor. The 3 machine learning algorithms (SVR, GB, and NN) are trained with the shear sonic as the target. The hyperparameters for the GB model that resulted in the best performance are a learning rate of 0.1, maximum depth of 6, minimum samples leaf of 3, minimum samples split of 2, and 200 estimators. The R² score on the deployment of the model on the test set is 0.9353, with an MSE of 15.6 and an MAE of 3.0. For the SVR model, the hyperparameter combination is a C of 10, an epsilon of 0.02, gamma set to scale (1/(number of features × Variance(X))), and a radial basis function kernel. The performance metrics for the testing set are: R² of 0.8684, MSE of 31.7, and MAE of 4.0. The best hyperparameters for the NN model were found to be 1 hidden layer containing 17 neurons and an lbfgs solver. The testing set R² is 0.9432, MSE is 13.7 and MAE is 2.6. The fusion of the three trained models into a committee machine resulted in a model with better predictive power, reflected by the performance on the test set where R² is 0.9505, MSE is 11.9 and MAE is 2.6 (see Table 7 for the performance metrics of the models on the training and validation sets).

The errors of the models are normally distributed, except for the SVR, which has a left-skewed distribution toward the overestimated region (Fig. 17). However, more than 80% of the errors are within the ±10 bounds, with the committee machine having a tighter bound of ±5, making the committee machine a more robust solution to the problem, as it performs well on the blind well. The model is deployed to predict the shear sonic log, and the brittleness is then estimated from the relationship described in the methodology (Fig. 16). All models capture the general trend of brittleness variability in the blind well. However, a significant issue arises from the amplification of errors in the models. The brittleness estimates obtained from the predicted shear sonic appear shifted, most significantly for the SVR model, though all the models perform well in predicting the shear sonic. This is because, in the brittleness computation, the error from the shear sonic prediction is squared multiple times, magnifying the errors and consequently making this strategy not the first choice for the prediction of brittleness.
Fig. 16. Model result on deployment on the blind well for strategy 2.
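To illustrate why errors in the predicted shear sonic are amplified, the following minimal sketch uses the standard isotropic relations for dynamic Poisson's ratio and Young's modulus in terms of compressional velocity, shear velocity, and density. These relations, the function name `dynamic_moduli`, and the sample values are assumptions made for illustration; the brittleness relationship actually used in this study is the one given in the methodology. Because the shear velocity enters both moduli as a squared term, a 5% error in shear velocity produces a larger relative error in each modulus.

```python
def dynamic_moduli(vp, vs, rho):
    """Return (Young's modulus in Pa, Poisson's ratio) for an isotropic rock.

    vp, vs: compressional and shear velocities in m/s; rho: density in kg/m^3.
    """
    vp2, vs2 = vp ** 2, vs ** 2
    poisson = (vp2 - 2.0 * vs2) / (2.0 * (vp2 - vs2))
    young = rho * vs2 * (3.0 * vp2 - 4.0 * vs2) / (vp2 - vs2)
    return young, poisson


# Hypothetical values, not taken from the wells in this study.
vp, vs, rho = 4000.0, 2400.0, 2500.0
vs_error = 0.05  # assume the predicted shear velocity is 5% too high

young_true, poisson_true = dynamic_moduli(vp, vs, rho)
young_pred, poisson_pred = dynamic_moduli(vp, vs * (1.0 + vs_error), rho)

rel_err_young = abs(young_pred - young_true) / young_true
rel_err_poisson = abs(poisson_pred - poisson_true) / poisson_true

print(f"relative error in shear velocity:  {vs_error:.3f}")
print(f"relative error in Young's modulus: {rel_err_young:.3f}")
print(f"relative error in Poisson's ratio: {rel_err_poisson:.3f}")
```

In this hypothetical case the relative error in Poisson's ratio is several times the relative error in the shear velocity, which is consistent with the shifted brittleness estimates described for this strategy.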
T. Ore and D. Gao Computers and Geosciences 171 (2023) 105266
Fig. 17. Error histogram showing the distribution of the difference of the actual and predicted brittleness on deployment of the strategy 2 model on the validation
and test set.
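The committee machine used in Strategies 1 and 2 combines the SVR, NN, and GB predictions. As a minimal sketch, assuming the combination is an equal-weight average (the weighting scheme is an assumption here, and the numbers are hypothetical), the snippet below averages three sets of predictions and compares the committee error with the average error of its members. For equal-weight averaging, the committee's mean squared error cannot exceed the mean of the members' mean squared errors.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def committee_average(*model_predictions):
    """Equal-weight average of the predictions of several models."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]


# Hypothetical brittleness targets and model outputs, for illustration only.
y_true = [0.30, 0.45, 0.60, 0.25]
pred_gb = [0.32, 0.44, 0.57, 0.27]
pred_svr = [0.25, 0.50, 0.66, 0.20]
pred_nn = [0.33, 0.41, 0.62, 0.28]

committee = committee_average(pred_gb, pred_svr, pred_nn)

member_mse = [mean_squared_error(y_true, p) for p in (pred_gb, pred_svr, pred_nn)]
mean_member_mse = sum(member_mse) / len(member_mse)
committee_mse = mean_squared_error(y_true, committee)

print(f"mean MSE of the members: {mean_member_mse:.5f}")
print(f"MSE of the committee:    {committee_mse:.5f}")
```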
Fig. 18. Histogram showing the brittleness distribution for the training set. The brittleness estimates are grouped as ductile (0.0–0.27), transition (0.27–0.5) and brittle (0.5–1.0).

3.4.3. Strategy 3
Experts often interpret brittleness qualitatively as low or high. The idea behind this hinges on the fact that the definition of rock brittleness has not been properly constrained to date, and all the relationships used are proxies that serve as estimates. One can therefore lump the brittleness estimate values into groups and solve this prediction problem as classification. One advantage of doing this is that the complexity of the problem is reduced, and we can build a more robust and scalable model to predict the brittleness classes. One drawback of this approach is that a great deal of information is lost. For instance, the end goal of our study is to invert brittleness from seismic trace data. This will be difficult to achieve using the information of brittleness classes; however, we implement this strategy and demonstrate its performance and capabilities.
The first step is to group the brittleness estimates into classes. The cutoffs for the classes come either from prior knowledge of the reservoir, which is often biased, or from unsupervised techniques such as K-nearest neighbors. In this study, 3 groups were found: ductile, transition, and brittle, with 0.27 and 0.5 as cutoffs (Fig. 18). Isolation Forest, with contamination of 0.1, is used to identify and remove outliers in the training data, while the classification variants of the three machine learning algorithms were used to build the models. The metrics used to assess the performance of the models are accuracy and the confusion matrix.
Before building the model, about 60% of the training data was found to be from the low brittleness class, which raises the issue of class imbalance. This imbalance has grave implications for the model performance, as the model will find it harder to predict the minority classes. To work around this issue, we adopted a resampling technique popularly known as SMOTEENN (see section 2.5) to balance the data, resulting in the 3 groups having similar sample sizes.
Using the same training strategy as the regression problem, the model's hyperparameters were tuned. For the GB model, the optimum hyperparameters were a maximum depth of 7, minimum samples leaf of 2, and minimum samples split of 2, with an accuracy of 100% on the training set, 98.58% on the validation set, and 89.37% on the testing set. Using a gamma of 1/(number of features * Variance(X)), the optimum hyperparameters for the SVM are a C of 100 and a radial basis function kernel, which had a performance of 96.01% on the training set, 95.15% on the validation set and 89.06% on the testing set. The NN model was built with one hidden layer containing 19 neurons and an lbfgs solver with an accuracy of 94.46% on the training set, 94.63% on the validation set,
Fig. 19. Confusion matrix of the models on deployment on the test set. The three models struggle to predict the transition class correctly.
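The two metrics used for Strategy 3, accuracy and the confusion matrix, can be sketched in a few lines. The labels and predictions below are hypothetical and only illustrate the bookkeeping; rows of the matrix are true classes and columns are predicted classes, so a weak diagonal entry in the transition row corresponds to the difficulty noted in the caption above.

```python
CLASSES = ["ductile", "transition", "brittle"]


def confusion_matrix(y_true, y_pred, labels):
    """Count samples per (true class, predicted class) pair."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


# Hypothetical labels, for illustration only.
y_true = ["ductile", "ductile", "transition", "transition", "brittle", "brittle"]
y_pred = ["ductile", "ductile", "ductile", "transition", "brittle", "transition"]

matrix = confusion_matrix(y_true, y_pred, CLASSES)
overall_accuracy = accuracy(matrix)
recall = per_class_recall(matrix)

for label, row in zip(CLASSES, matrix):
    print(f"{label:>10}: {row}")
print(f"accuracy: {overall_accuracy:.3f}")
```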
Fig. 20. Model result on deployment on the blind well for strategy 3. The green flag represents high brittleness, while the yellow flag represents low brittleness.
(For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
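The grouping step of Strategy 3 converts each brittleness estimate into one of three classes using the cutoffs of 0.27 and 0.5. A minimal sketch follows; the function name `classify_brittleness`, the sample values, and the choice to assign a value lying exactly on a cutoff to the upper class are assumptions, since the text does not state how boundary values are handled.

```python
from bisect import bisect_right
from collections import Counter

CUTOFFS = [0.27, 0.5]  # class boundaries reported for the training set
CLASSES = ["ductile", "transition", "brittle"]


def classify_brittleness(value):
    """Map a brittleness estimate in [0, 1] to its class label.

    Assumption: a value exactly equal to a cutoff goes to the upper class.
    """
    return CLASSES[bisect_right(CUTOFFS, value)]


# Hypothetical brittleness estimates, for illustration only.
estimates = [0.10, 0.22, 0.27, 0.35, 0.49, 0.50, 0.80]
labels = [classify_brittleness(v) for v in estimates]

# Class counts give a quick check for the kind of imbalance discussed above.
class_counts = Counter(labels)
for name in CLASSES:
    print(f"{name:>10}: {class_counts[name]}")
```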
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgements

We thank Occidental Corporation for the support of our joint industry geophysics consortium. Energy Corporation of America (ECA) provided well data along with a 3D seismic survey. Tim Carr offered two additional sets of well logs from MIP 3H and Boggess wells in the database. Journal peer reviews by Associate Editor and four anonymous reviewers helped improve the quality of the paper.

Code availability

The code used in this study is available for download at the link: https://github.com/tobi-ore/Brittleness-Predicition-using-Machine-Learning