Augmenting and Eliminating the Use of Sonic Logs Using Artificial Intelligence: A Comparative Evaluation

Geophysical Prospecting, 2022 doi: 10.1111/1365-2478.13213
Vishnu Roy1 , Ankur Gupta1 , Romy Agrawal1 , Nitesh Kumar2 and Amit Saxena1∗
1 Rajiv Gandhi Institute of Petroleum Technology, Jais, Uttar Pradesh, India, and 2 MDNK Oil and Gas Consultants, Mumbai, Maharashtra,
India
Received May 2021, revision accepted April 2022
ABSTRACT
In oil and gas exploration, it is vital to acquire information about the bottom hole
conditions. This is done in the field using wireline logging. The sonic log is one of the
most useful logs as it assists in porosity determination, cement evaluation and
identification of lithology and gas-bearing intervals. However, sonic logging tools are not
always a part of the wireline logging arrangement. Even when they are, there are sections
where the logging data are missing, and in some cases the data come from old tools that
are incapable of recording shear wave transit times. This study explores the possibility
of substituting the sonic log using machine learning and artificial intelligence
techniques. These techniques can also predict the sonic log data in sections where
these are missing or unreliable. Artificial neural networks, decision tree regression,
random forest regression, support vector regression and extreme gradient boosting
are the most popular tools available at our disposal for making these estimations.
This study has compared these different techniques for their effectiveness and accuracy
in making sonic transit time predictions based on other available well logs. The
obtained results suggest that despite all the attention on artificial neural networks,
eXtreme gradient boosting and random forest regression outperform them for the
given purpose. In the case of missing shear transit time data, random forest regression
made predictions with a root mean squared error of 1.03 × 10⁻⁴ and a regression
coefficient of 0.97, while eXtreme gradient boosting regression did so with a root
mean squared error of 1.36 × 10⁻⁴ and the same regression coefficient (0.97). When
no sonic data were available, random forest regression estimated shear transit time
with a root mean squared error of 6.41 × 10⁻⁴ and a regression coefficient of 0.95, and
compressional transit time with a root mean squared error of 9.06 × 10⁻⁵ and a
regression coefficient of 0.94. The regression coefficients of the shear transit time and
compressional transit time predictions made using eXtreme gradient boosting were
found to be the same as those of random forest regression. The root mean squared
error, however, was observed to be slightly less for the compressional transit time
predictions and somewhat more for the shear transit time predictions. As data analysis,
in general, is a better method for estimation than the use of empirical correlations,
∗ Email: asaxena@rgipt.ac.in
© 2022 European Association of Geoscientists & Engineers. 1

these machine learning-based predictions can serve as powerful tools in the oil and
gas exploration industry.
Key words: Artificial intelligence, Compressional wave, Neural networks, Shear wave, Sonic log.
INTRODUCTION
The knowledge of geo-mechanical and petrophysical properties of the subsurface is essential for hydrocarbon exploitation. Petrophysical properties enable us to determine the economic viability of exploitation. Simultaneously, geo-mechanical properties help improve the efficacy of the drilling operation and reduce the associated risks. Technical strategies and financial evaluations rely predominantly on the accuracy of the acquired information about the reservoir.
The traditional method of analysing the rock properties is through laboratory analysis of the cores. This helps to estimate the reservoir rock properties at bottom hole conditions. However, there are two significant drawbacks to this technique. The first is the need for core procurement and extensive lab analysis. Second, a core represents only a small section of the reservoir and cannot provide a holistic picture of reservoir properties in heterogeneous formations. To mitigate these disadvantages, the petroleum industry resorts to wireline logging operations.
Sonic logs have proven to be reliable in gathering information about the subsurface formations. They are instrumental in determining the porosity, lithology, fluid saturation and mechanical properties of the formation rock (Khazanehdari & McCann, 2005; Asoodeh, 2013; Shahvar et al., 2014). When used in conjunction with seismic data, sonic logs can be used to estimate the pore pressure. These logs indicate the formation properties because the sonic transit times are affected by reservoir properties like compaction, density, anisotropy, porosity, consolidation, cementation, pore pressure and overburden stress (Krief et al., 1990; Williams, 1990). Initially, the sonic logging tools used in the industry had a single transmitter and receiver pair. From there, they have evolved into varied combinations. Generally, dual receiver systems compensate for the borehole and drilling mud (Doh & Alger, 1958). Hence, borehole-compensated sonic tools are used to deal with the irregularities in the borehole size.
Further, array sonic logging tools containing an array of transmitters and receivers provide improved measurement quality (Hsu et al., 1987). The above tools are usually monopole logging tools, i.e. they do not give any information about the shear wave transit times in 'slow' formations (Harrison et al., 1990). Slow formations are those in which the shear wave travels more slowly than the compressional wave in the borehole fluid, so that no refracted shear arrival is recorded. For measuring both the shear and the compressional wave transit times, dipole and multipole sonic logging tools that use flexural waves are available (Alford et al., 2012).
Despite the utility of the sonic log, it is not always a part of the wireline logging arrangement (Elkatatny et al., 2018). Even when the tool can measure compressional and shear wave transit times, inaccurate measurements might be obtained due to a faulty tool, poor data storage, bad hole conditions etc. In such cases, artificial intelligence (AI) provides a powerful solution to estimate the sonic log data based on the other available logs (Haghighi et al., 2014; Anemangely et al., 2017; Anemangely et al., 2019; Mehrad et al., 2022).
This paper aims to identify different AI techniques and evaluate their performance in predicting the sonic transit times using the other available log data as input parameters. The proposed work intends to observe each method's performance in this task and compare them for an optimized result. Two cases are considered for the study. In the first one, the primary (compressional) wave transit time is available, and the secondary (shear) wave transit time is to be determined. In the second case, no sonic data are available, and both transit times are determined using the other available logs: gamma-ray log, spectral gamma-ray logs, resistivity logs, neutron porosity and density log.

SUPERVISED LEARNING REGRESSION TECHNIQUES
Supervised learning algorithms build a model of the relationships between the input and output parameters based on the input data (Srivastava et al., 2014). The model is trained with the available data and can then predict the output from a new set of inputs (Nair & Hinton, 2010). Supervised learning regression techniques have ubiquitous applications today. The present work has extensive data with an extremely high number of observations compared with the number of features in each observation. Hence, the low bias/high
variance algorithms have been used in the present work (Kingma & Ba, 2014).

Artificial neural network
An artificial neural network (ANN) is inspired by the data processing techniques of the human brain. Just like the human brain, an ANN comprises interconnected nodes of neurons (Bishop, 1995). The brain has billions of neurons that process information flowing to (input) and from (output) the brain to perform the desired task. Similarly, an ANN has artificial neurons, known as processing units, that are interconnected (Gevrey et al., 2003).
It is a useful technique to understand and establish the relationships between complex non-linear parameters when the system structure is unknown. In a way, ANN mimics the behavioural patterns of the system, and it grows and learns about the system's functioning even when no mathematical relationship is known to exist. An ANN, just like the human brain, can learn from past scenarios and experiences, i.e. data. Its functionality can be represented using Figure 1.
ANN structure consists of several connected layers of neurons. There is an input layer, an output layer and one or more hidden layers. Each connection possesses a certain weight which might be positive or negative. A positive weight activates the neuron, whereas a negative one inhibits it. The neural network's performance depends on the structure's weights, biases and the number of neurons in the hidden layers. Too few neurons result in underfitting, whereas an exceedingly high number of neurons results in overfitting. Hence, an optimized number is always desirable (Hinton et al., 2006). A suitable way to deal with overfitting is the use of dropout. Dropout is a regularization method in which some nodes are ignored randomly. In effect, it makes the layer look like it has a different number of nodes and connections to the previous layer than it actually does. The training process becomes noisier, which compels the system to develop a more probabilistic learning pattern. Thus, overfitting is reduced (Singh et al., 2016).
An activation function is used to determine whether a neuron should be activated or not, and it introduces non-linearity into the output. The function used in this study was RELU, which stands for rectified linear unit. It gives an outcome of 0 if x is negative and x if x is positive (Choudhary & Gianey, 2017).

$$A(x) = \max(0, x). \tag{1}$$

RELU is computationally cost-effective compared with other functions like sigmoid and tanh due to its simple mathematical operation.
The adjustments in a neural network's attributes, such as weights and biases, control the learning process. The objective of these adjustments is to minimize the cost function. The use of optimizers achieves this. As shown in Figure 2, optimizers use gradient descent to find the coefficients that minimize some cost function (e.g. root mean square error, sum of squared residuals etc.) (Maleki et al., 2014). A downhill path is followed until a minimum of the cost function (ideally the global minimum) is reached. In simple mathematical terms, this is a point where $\frac{dJ(w)}{dw} = 0$.
For our model, the optimizer ADAM has been used. ADAM is an abbreviation for adaptive moment estimation and uses running averages of both the gradients and their second moments (Pedregosa et al., 2011).

Decision tree model
The decision tree (DT) model resembles the structure of a tree. The input dataset is split into subsets, and an associated DT is constructed. The DT is made of decision nodes and leaf nodes. It is one of the most practical approaches in machine learning used for classification and regression applications (Xu et al., 2005). This model's flexibility makes it a suitable candidate when the relationship between inputs and outputs is complex.
The model classifies the data based on a set of rules using concepts like entropy and Gini index. Both are measures of node impurities. Entropy in this context, like in thermodynamics, is a measure of disorder. Mathematically, these can be represented as in equations (2) to (4) (Hartshorn, 2016):

$$\mathrm{Entropy} = -\sum_{n=1}^{m} p_n \log(p_n) \tag{2}$$

$$\mathrm{Gini\ index} = 1 - \sum_{n=1}^{m} p_n^2 \tag{3}$$

$$\mathrm{Information\ gain} = \mathrm{Entropy}_{\mathrm{dataset}} - \mathrm{Entropy}_{\mathrm{feature}}. \tag{4}$$

Information gain is the criterion used by the DT algorithm to split nodes. The dataset's entropy before and after a split is compared to calculate the information gain (Mitchell, 1997).
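The impurity measures in equations (2) to (4) can be illustrated with a small worked example. The sketch below is only an illustration under stated assumptions: it uses base-2 logarithms, treats the "feature" entropy in equation (4) as the size-weighted entropy of the child nodes, and operates on hypothetical lithology labels rather than on the study's data. The function names are ours, not the paper's.

```python
import math
from collections import Counter


def entropy(labels):
    """Equation (2): -sum(p_n * log2(p_n)) over the classes in a node."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Equation (3): 1 - sum(p_n ** 2) over the classes in a node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Equation (4), assuming the feature entropy is the weighted child entropy."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical node with two equally frequent classes, split perfectly in two.
parent = ["shale"] * 4 + ["sand"] * 4
children = [["shale"] * 4, ["sand"] * 4]

print(entropy(parent))                     # maximum impurity for two classes
print(gini(parent))
print(information_gain(parent, children))  # a perfect split removes all impurity
```

For regression trees such as those tuned later in the paper, a variance-type criterion (for example mean squared error) plays the role that entropy plays here.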

Figure 1 (a) General structure of ANN; (b) effect of dropout on ANN.
Figure 2 Gradient descent used in optimizers.

Support vector machine
Support vector machine (SVM) is a type of supervised learning technique mainly used for pattern recognition and regression. It attempts to separate or classify data using hyperplanes. A hyperplane is a subspace whose dimension is one less than that of the space in which it lies; for example, a three-dimensional space has a two-dimensional hyperplane. Once the hyperplanes have been established, points are classified depending on whether they lie on the hyperplane's positive or negative side. SVM helps capture non-linear functions based on statistical learning theory, a framework for machine learning when prediction is the ultimate objective (Segal, 2004).
In SVM, when the system cannot find a hyperplane in the given dimensions, it uses kernel functions to increase its dimensions. A kernel function makes this transition possible without increasing the computational cost. The kernel function used in this study is the radial basis function (RBF) kernel. It is a common default kernel for SVM and measures how close any two points X1 and X2 are to each other. If we denote the variance by σ², the RBF function can be mathematically represented by equation (5) (Vert et al., 2004)

$$K(X_1, X_2) = \exp\left(-\frac{\lVert X_1 - X_2 \rVert^2}{2\sigma^2}\right). \tag{5}$$

To classify points as positive or negative, SVM uses decision boundaries which act as lines of demarcation. Points on the decision line may be classified as positive or negative. The theory is represented in Figure 3.
If, for instance, the equation of a hyperplane is y = ax + b, the equations of the associated decision boundaries will be −a < y − (ax + b) < a.
Support vector regression penalizes the predictions that are outside the decision boundaries. The points closest to the hyperplane on either side determine the support vectors (Pedregosa et al., 2011). The solver algorithms of support vector regression iterate until at least one of the feasibility gap, gradient difference or largest Karush–Kuhn–Tucker violation is lower than its respective tolerance value (Huang et al., 2006). The above criteria help in optimizing the iteration of variables and the Lagrange multipliers.
Figure 3 (a) Decision boundaries associated with hyperplanes. (b) Kernel function transforming a 2D space into a 3D space.

Random forest method
Random forest (RF) is an ensemble learning bagging technique. Ensemble learning offers the liberty of not relying on only one learning model. It provides a structured solution to combine different learners who may or may not be from the

same learning algorithm. Bagging is a technique in which several learners are built, and then the means of their outputs are taken to obtain the bagged predictions (Dietterich, 2002).
The RF technique uses the aggregation of many DTs, resulting in a reduced variance compared with the DT model. It samples both observations and variables to create the individual DTs. The average over all trees is then taken when solving regression problems (Ben-Hur et al., 2001).

eXtreme gradient boosting
eXtreme gradient boosting (XGBoost), like the RF technique, is an ensemble learning algorithm based on DTs, and it uses a gradient boosting framework. Chen et al. introduced this scalable end-to-end tree boosting system in 2015. It is known for scalability and fast execution. This method provides features like parallelization, distributed computing and cache optimization (Chen et al., 2015). It is among the most versatile techniques in the arena of machine learning. Unlike RF, which is a bagging technique, XGBoost uses smaller trees that are not as deep.

PREDICTION OF SONIC TRANSIT TIME OR VELOCITY
From the estimation of bubble point right down to the calculation of the rate of penetration, machine learning has clearly found various use-cases in research related to the field of oil and gas exploration (Anemangely et al., 2018; Ashrafi et al., 2019; Sabah et al., 2019; Anemangely et al., 2019; Mehrad et al., 2020; Ghorbani et al., 2020; Mohamadian et al., 2021; Rashidi et al., 2021; Sabah et al., 2021; Dhar et al., 2021; Gupta et al., 2021; Abad et al., 2022; Matinkia et al., 2022; Agrawal et al., 2022).
There have been numerous attempts to develop empirical correlations to predict the shear wave velocity from the compressional wave velocity in conjunction with other logs. Castagna et al. (1985) used four different correlations for four formations to predict the S wave velocity using the P wave velocity. Eskandari et al. (2004) formed an artificial intelligence (AI)-based model of regression from geophysical well log data. They used P wave velocity, neutron porosity and bulk density as predictors for the S wave velocity. Augusto et al. presented a correlation to determine the P wave velocity using well log data. They used a non-linear regression technique to develop the correlation. Their model used effective porosity, gamma-ray, shaliness and electrical resistivity as input parameters. In their study, they created 28 empirical correlations (Augusto & Martins, 2009).
For a rock having a Poisson's ratio in the range of 0.15 to 0.35 and a P wave velocity between 2000 and 6000 ft/s, Carroll provided an equation to determine the S wave velocity. The equation held for a porosity range of 30% to 60% and a density range of 1.6 g/cc to 2.7 g/cc (Carroll, 1969).
With the advent of AI, attempts have been made to determine the S wave velocity using the P wave velocity and other logs. A backpropagation neural network and support vector regression were used by Maleki et al. They reduced the log inputs: only gamma-ray, compressional wave velocity and neutron porosity were used (Maleki et al., 2014).
According to Tabari et al., predicting S wave velocity using P wave velocity is more accurate than using porosity logs. They predicted S wave velocity from the P wave velocity using neural network models (Tabari et al., 2011).
A pseudo P wave velocity can also be determined from stacked seismic traces (Lindseth, 1979). Silva et al. (2001) then stated that accurate results could be obtained using an artificial neural network (ANN) to predict the sonic log from stacked seismic traces.
Some studies independently used ANN to estimate sonic logs from other well logs (Onalo et al., 2018; Elkatatny et al., 2018). Elkatatny et al. estimated sonic log data using other well logs/drilling parameters as input parameters (Tariq et al., 2016; Muqtadir et al., 2019; Gowida & Elkatatny, 2020). An attempt to make similar predictions using multi-well data was made using the eXtreme gradient boosting model (Liu et al., 2021). Regression techniques were also used for the task by other authors (Joshi et al., 2021; Gamal et al., 2021).
It is evident from the literature that efforts have been focused on predicting the shear wave velocity using the compressional wave velocity. Previous studies have predicted the sonic logs using other well logs with limited input parameters. The use of limited parameters simplifies the process but might lead to sub-optimal accuracy. This paper compares the most popular AI techniques for their relative applicability in predicting sonic logs by using other well logs. We have used an extensive array of well logs to provide the system with the maximum training data.

DEVELOPMENT METHODOLOGY
The flow chart for the complete model development process is given in Figure 4.

Figure 4 Workflow for the execution of algorithms.
Data collection, pre-processing and quality check
The dataset was obtained from an Australian oil field in a .las file and extracted into an Excel spreadsheet. Because the objective was to predict the sonic transit times at any depth, and not just when specific formations were present, the complete logging data of the well was used. The dataset contained logging data of neutron porosity, bulk density, gamma-ray, compressional wave transit time (DTC), shear wave transit time (DTS), deep resistivity (RESD), medium resistivity (RESM) and shallow resistivity (RESS). The sections of data having null sets were eliminated from the dataset during data cleaning. Zones of washouts and keyseating were identified and eliminated using the caliper log. Finally, Poisson's ratio was calculated from the sonic log data and checked for a reasonable range. This is an industry-standard logging quality check. The data used are illustrated in Figure 5(a–l).

Input parameters
All these available logs were used in our study as it is imperative to provide the system with the maximum possible

Figure 5 (a) P wave transit times, (b) S wave transit times, (c) gamma-ray log, (d) neutron porosity log, (e) photoelectric absorption factor, (f)
potassium spectral gamma-ray log, (g) deep resistivity log, (h) medium resistivity log, (i) shallow resistivity log, (j) thorium spectral gamma-ray
log, (k) uranium spectral gamma-ray log and (l) density log.

information to learn. This builds a better interpretation of the dataset. Along with an extensive array of well logs, more than 10,000 data points were used for the study.
Figure 6 shows how the different well logs correlate to the compressional and shear sonic transit times.
Figure 6 Correlation coefficients of DTC and DTS with the other well logs.
The correlation coefficients of each well log with shear wave transit time and compressional wave transit time were calculated. The neutron porosity showed a strong correlation coefficient (R²) of 0.83 with the S wave transit time (DTS) and 0.75 with the P wave transit time (DTC). This strong correlation reflects the fact that both the neutron porosity log and the sonic log are porosity logs. The potassium spectral gamma-ray log showed a moderate positive R² of 0.67 with DTS and 0.62 with DTC. Other parameters failed to demonstrate good correlations with the sonic transit times but were still relevant.

Development of the machine learning models
As a standard procedure followed in machine learning, the dataset was split into two parts. Seventy-five per cent of the data trained the model, and 25% validated it, i.e. 75% of the data were used to form decision trees, determine weights and biases etc. The remaining 25% were concealed from the model and were later used to evaluate the developed model's skill for predicting the sonic logs. Thus, in our study, 8172 data points were used in training and 2724 for validation. The complete data were scaled using min–max scaling, i.e. the highest value was taken as 1 and the lowest as 0.

$$X_{\mathrm{scaled}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}. \tag{6}$$

The other points were scaled accordingly so that all values are distributed in the range of 0 to 1. Supervised learning regression techniques like decision tree regression, random forest regression, support vector regression and XGBoost were applied to these data. Also, an artificial neural network was constructed and tested for the same purpose. The optimizer used with the artificial neural network (ANN) model was ADAM, which is an adaptive moment estimation technique. A dropout layer minimizes the chances of overfitting. Table 1 shows the ANN model summary. The total number of parameters was 1537, and all of them were trainable.

Table 1 ANN model summary
Layer (type)    Output shape    Number of parameters
Dense_2         (None, 128)     1408
Dropout         (None, 128)     0
Dense_3         (None, 1)       129

Although supervised learning regression techniques are not dependent upon any particular set of equations, the functional equations of ANN for our model are shown by equations (7) and (8).

$$Y = \max\left[0, \left(\sum_{j=0}^{128} \mathrm{hidden}_j \cdot w_{1,j} + b\right)\right], \tag{7}$$

$$\mathrm{hidden}_j = \max\left(0, \sum_{i=0}^{9} x_i \cdot w_{i,j} + b_j\right). \tag{8}$$

The proposed work assesses two distinct scenarios. In the first scenario, the compressional wave transit times are known, but information about the shear wave transit time is lacking. In the second scenario, no sonic log data are available. The existing well logs have to be used to estimate the compressional and shear wave transit times. The training and validation loss curves for DTS prediction by ANN in case 1 are represented in Figure 7(a). Figure 7(b and c) shows the corresponding curves for the ANN models that predict DTS and DTC in case 2.
These training and validation loss curves are indicative of a model's tendency for overfitting or underfitting. Overfitting is the ANN equivalent of memorizing. This means that the model becomes very skilled at predicting the outcome in the training set but fails to make accurate predictions when given an input different from the training data set. On the other hand, underfitting refers to the condition in which the input data are insufficient for making the necessary assessments. The curves shown in Figure 7(a–c) illustrate the developed model's

Figure 7 (a) Training and validation loss curves for DTS (Case 1), (b) training and validation loss curves for DTS (Case 2), (c) training and
validation loss curves for DTC (Case 2).
effectiveness due to a minimal deviation between the training and validation losses.

Hyperparameter tuning
The learning process of any machine learning model is governed by the values of some defined parameters like tree depth in the decision tree model, number of trees in the random forest model, number of neurons in a layer in ANN etc. Such parameters are known as hyperparameters and must be chosen carefully. Assigning values to these hyperparameters is challenging and vital for the model's optimum performance (Claesen & De Moor, 2015).
Hyperparameter tuning or optimization is the process of identifying a set of values for any algorithm that minimizes the loss function. An exhaustive grid search helps in the selection of hyperparameters. The grid search considers all the combinations and returns the best one (Pedregosa et al., 2011). Table 2 summarizes the hyperparameter tuning done in the study for the different algorithms.

Metrics used for comparison
Once the different models were constructed and executed, the coefficient of determination (R²) and root mean square error (RMSE) were used as metrics for comparison.
The coefficient of determination is a popular metric for assessing the performance of a regression model. It depends on the ratio of the sum of squared residuals to the total sum of squares and can be represented by equation (9).

$$R^2 = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{total}}}. \tag{9}$$
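As a small numerical illustration of equation (9), the sketch below computes the coefficient of determination for a handful of hypothetical scaled transit-time values. The function name `r_squared` and the sample numbers are assumptions made for this example; they are not taken from the study's dataset.

```python
def r_squared(observed, predicted):
    """Equation (9): R^2 = 1 - SS_res / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_total


# Hypothetical min-max scaled observations and model predictions.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]

print(round(r_squared(observed, predicted), 4))

# A model that always predicts the mean of the observations scores zero.
mean_only = [sum(observed) / len(observed)] * len(observed)
print(round(r_squared(observed, mean_only), 4))
```

The second call shows why the metric is read as "improvement over predicting the mean": a constant mean prediction gives a value of zero, and a perfect prediction gives one.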

Table 2 List of hyperparameters tested and the obtained best hyperparameters
Algorithm                   Hyperparameter      Values tested                                                  Best value
Artificial neural network   activation          ['linear', 'relu', 'sigmoid']                                  'relu'
                            init_weights        ['uniform', 'normal', 'he_uniform']                            'uniform'
                            optimizer           ['SGD', 'RMSprop', 'Adam']                                     'Adam'
                            dropout_rate        [0.0, 0.2, 0.4]                                                0.2
                            neurons             [32, 64, 128]                                                  128
                            epochs              [300, 500]                                                     500
Decision tree regression    criterion           ['mse', 'mae']                                                 'mse'
                            min_samples_split   [2, 3, 4]                                                      3
                            max_depth           [10, 11, 12]                                                   10
                            min_samples_leaf    [4, 6, 8]                                                      6
                            max_leaf_nodes      [100, 150, 200]                                                200
Random forest regression    bootstrap           [True, False]                                                  False
                            max_depth           [20, 40, 60]                                                   40
                            max_features        ['auto', 'sqrt']                                               'sqrt'
                            min_samples_leaf    [1, 2]                                                         1
                            min_samples_split   [1, 3, 5]                                                      3
                            n_estimators        [200, 300, 400]                                                400
Support vector regression   kernel              ['rbf', 'linear', 'poly']                                      'rbf'
                            gamma               ['scale', 'auto']                                              'scale'
                            epsilon             [0.2, 0.5, 1]                                                  1
                            C                   [1, 10, 100]                                                   100
XGBoost regression          learning_rate       [0.03, 0.1]                                                    0.03
                            max_depth           [6, 12, 18]                                                    18
                            min_child_weight    [1, 3, 5]                                                      3
                            subsample           [0.3, 0.4, 0.7]                                                0.4
                            colsample_bytree    [0.7, 0.8, 0.9]                                                0.9
                            n_estimators        [300, 400, 500]                                                500
                            objective           ['reg:squarederror', 'reg:squaredlogerror', 'reg:logistic']    'reg:squarederror'
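The exhaustive grid search behind Table 2 can be sketched in a few lines. The example below is a library-free illustration, not the implementation used in the study: `grid_search` is a hypothetical helper, the two value lists are copied from the decision tree rows of Table 2, and `toy_loss` is an invented stand-in for a validation loss, constructed purely so that its minimum coincides with the best values reported in the table.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the lowest-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Candidate values taken from the decision tree rows of Table 2.
grid = {"max_depth": [10, 11, 12], "min_samples_leaf": [4, 6, 8]}


def toy_loss(params):
    """Hypothetical validation loss; a real run would train and validate a model here."""
    return (params["max_depth"] - 10) ** 2 + (params["min_samples_leaf"] - 6) ** 2


best_params, best_score = grid_search(grid, toy_loss)
print(best_params, best_score)
```

The cost of this approach grows with the product of the list lengths, which is why the candidate lists in Table 2 are kept short.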
Greater values of R² signify a better relationship between the input parameters and output.

$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{j=1}^{m}\left(X_{\mathrm{observed}} - X_{\mathrm{predicted}}\right)^2}. \tag{10}$$

RMSE is the square root of the average of the squared errors. RMSE incorporates both the variance and the bias of the estimator and is always non-negative. Values as low as possible are desirable.

RESULTS AND DISCUSSION
Case 1. Compressional transit time data available
Shear wave and compressional wave transit times play an essential role in predicting the geo-mechanical properties of formations. However, in many wells, monopole sonic logging tools are run in. These tools are incapable of measuring the shear transit times. The possibility of accurately predicting these data is the driving force for applying machine learning models. Also, the technique may be used to develop sonic data output in the zones where the measurements are not taken or are not reliable.
In this case, we consider that the compressional wave transit time is available and the shear wave transit time is not. We used compressional wave transit time (DTC) along with other logs as inputs to determine shear wave transit time (DTS).
eXtreme gradient boosting (XGBoost) and the random forest technique emerged as the most suitable methods with coefficients of determination (R²) of 0.97 each and root mean square errors (RMSE) of 1.36 × 10⁻⁴ and 1.03 × 10⁻⁴, respectively. They were followed by artificial neural network (ANN), which had an R² of 0.96 and an RMSE of 1.75 × 10⁻⁴. The R² values of support vector regression (SVR) and decision tree regression were 0.94 and 0.95, respectively. Their respective root mean squared errors were 8.58 × 10⁻³ and 3.48 × 10⁻⁴.

Table 3 Correlation coefficients and root mean square errors of all the applied AI techniques for the scaled dataset
Target          Algorithm                    Coefficient of determination R²    Root mean square error
DTS (Case 1)    ANN                          0.96                               1.75 × 10⁻⁴
                Decision tree regression     0.95                               3.48 × 10⁻⁴
                Random forest regression     0.97                               1.03 × 10⁻⁴
                Support vector regression    0.94                               8.58 × 10⁻³
                XGBoost regression           0.97                               1.36 × 10⁻⁴
DTC (Case 2)    ANN                          0.91                               2.48 × 10⁻⁴
DTS (Case 2)    ANN                          0.93                               8.50 × 10⁻⁴

The coefficients of determination, in this case, were observed to be the highest. This is because the available DTC is strongly related to the shear wave transit time.
Although all the methods show a high enough value of R², Figure 8(a–e) shows that the predictions made by the decision tree model and SVM lie close to the measured values but are distributed in a broader range.
Predictions should be distributed near the measured values across a narrower range to reduce the error of any individual prediction. Thus, it can be said that ANN, random forest and XGBoost, for the given dataset, are relatively more effective techniques. The regression line (red line) signifies a close match between measured DTS and the predicted data.

Case 2. Sonic logging data not available
Machine learning models offer the possibility of building sonic log data even when the sonic logging tool has not been run in or has failed to return reliable data for any reason. The input logs have been represented in Figure 5. To predict compressional and shear wave transit times (DTC and DTS), all the logs except these two were taken as input parameters.

Prediction of compressional wave transit time
It can be readily observed that even when no sonic transit time data were provided, the measured DTC data and the predicted data agree closely with each other. Figure 9(a–e) illustrates the predicted DTC values against the measured values. Correlation coefficients and root mean squared errors were determined for each technique and are reported in Table 3.
For this purpose, random forest and XGBoost proved to be the most effective techniques, with an R² of 0.94 and RMSE of 9.06 × 10⁻⁵ and 4.95 × 10⁻⁵, respectively. The difference in applicability between random forest, ANN and XGBoost was not that significant.
However, the regression coefficient confirms that the decision tree model and SVM regression are less effective. In the case of SVM, there is a significant deviation between the predicted and measured results. As indicated by the R² values, the range of predicted values is the broadest for SVM and decision tree. Conversely, it is the narrowest for the random forest technique and XGBoost.

Prediction of shear wave transit time
DTS, in our case, showed a better correlation with the input data than DTC. Once again, XGBoost and the random forest technique emerged as the most effective methods, with an R² of 0.95 each compared with 0.93, 0.92 and 0.89 for ANN, decision tree regression and SVR, respectively. The root mean squared error of each method has been reported in Table 3. Also, Figure 10(a–e) represents the distribution of predictions and confirms the effectiveness of the methods.

SUMMARY AND CONCLUSIONS
Five artificial intelligence (AI) models (artificial neural network, support vector regression, decision tree regression,

Figure 8 (a) Training and validation loss curves for DTS (Case 1), (b) training and validation loss curves for DTS (Case 2), (c) training and
validation loss curves for DTC (Case 2).

Figure 9 (a) Performance plot for ANN, (b) performance plot for decision tree regression, (c) performance plot for random forest regression, (d)
performance plot for SVM regression, (e) performance plot for XGBoost.

Figure 10 (a) Performance plot for ANN, (b) performance plot for decision tree regression, (c) performance plot for random forest regression,
(d) performance plot for SVM regression, (e) performance plot for XGBoost.

random forest regression and eXtreme gradient boosting (XGBoost)) were developed and used to estimate the sonic log transit times from other logs. The models were trained and tested using more than 10,000 data points. The present work shows that AI can efficiently predict the sonic log transit times from other logs and provides a cost-effective way to do so in the absence of measured data. This application fills in intervals where sonic logging data are missing or unreliable. Even when both compressional wave transit time (DTC) and shear wave transit time (DTS) are unavailable, AI techniques can still predict them satisfactorily.
Random forest and XGBoost gave the most accurate predictions, each with an R2 of 0.97 (root mean square errors (RMSE) of 1.03 × 10−4 and 1.36 × 10−4, respectively) when DTC was available. When both DTC and DTS had to be predicted, the R2 of both techniques was 0.94 for DTC (RMSE 9.06 × 10−5 for random forest and 4.95 × 10−5 for XGBoost) and 0.95 for DTS (RMSE 6.41 × 10−4 and 7.99 × 10−4). The correlation of the combined input data was higher with DTS than with DTC. XGBoost and random forest regression outperform the other techniques for sonic log prediction and offer an efficient estimation method compared with empirical correlations.

ACKNOWLEDGEMENTS
We gratefully acknowledge the Rajiv Gandhi Institute of Petroleum Technology for its computational and technical support. Thanks are extended to all the members associated with the work.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

ORCID
Amit Saxena https://orcid.org/0000-0001-7958-3576
