
Journal of Petroleum Science and Engineering 210 (2022) 110033


Computational prediction of the drilling rate of penetration (ROP): A comparison of various machine learning approaches and traditional models
Ehsan Brenjkar, Ebrahim Biniaz Delijani *
Department of Petroleum Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran

Keywords: Rate of penetration; Traditional models; Artificial neural networks; Mud logging unit; Meta-heuristic algorithms; Machine learning methods

Abstract

Rate of penetration (ROP) prediction can assist precise planning of drilling operations and can reduce drilling costs. However, estimating this key factor with traditional or experimental models is difficult, which makes a comparison of the available models necessary to identify the best prediction approach. In this study, four machine learning (ML) methods and two traditional ROP models were used to predict ROP. The ML techniques are the multilayer perceptron neural network (MLPNN), radial basis function neural network (RBFNN), adaptive neuro-fuzzy inference system (ANFIS), and support vector regression (SVR). The MLPNN, ANFIS, and RBFNN methods were trained with four meta-heuristic algorithms: particle swarm optimization (PSO), ant colony optimization (ACO), differential evolution (DE), and the genetic algorithm (GA). The backpropagation (BP) algorithm was also used to train the ANFIS and MLPNN methods as a conventional benchmark. For comparison, the traditional ROP models of Bourgoyne and Young (BYM) and Bingham were implemented in combination with the same four meta-heuristic algorithms. The required data were collected from the mud logging unit (MLU) and the final report of a drilled well located in southwestern Iran. In the MLU, the information from the different sensors is first collected through data communication protocols and sent to a master unit to be processed into the relevant dimensions (i.e., operational variables). To record the information accurately, the sensors are calibrated at regular intervals. After removing outliers, the overall noise in the data was reduced with a Savitzky-Golay (SG) smoothing filter. Then, to simulate a drilling process in a realistic manner and to evaluate the performance of the models in approximating the penetration rate at greater hole depth, the available data were divided into six sections based on depth. Sections 2 to 6 were then used separately as test datasets, with all preceding sections used as the corresponding training dataset. The results indicate that PSO-MLPNN achieved the highest performance among the developed models. The concluding remark is that ML models are more efficient and reliable than traditional models, and that combining ML models with meta-heuristic algorithms achieves better results than conventional algorithms such as BP. The results of this study can be used as a practical guide for the management and planning of future well drilling.

1. Introduction

The rate of penetration (ROP) is an important factor for optimizing the drilling operation; a precise approximation of this factor can lead to better management of consumption costs, help in selecting the appropriate drilling bit, and provide better planning of future wells (Amer et al., 2017). ROP optimization allows the target depth to be reached at a lower cost while meeting operational and environmental requirements (Bodaghi et al., 2015). Excessive increases in ROP, however, can cause problems such as stuck pipe, poor hole cleaning, and bit tooth wear, which counteract the benefits of quick drilling. Therefore, achieving an optimal value of ROP is essential (Abdulmalek et al., 2018).

The variables affecting ROP can be divided into two subgroups: controllable and uncontrollable variables. The controllable variables can be changed during the drilling operation, while the uncontrollable variables cannot simply be altered, due to economic and environmental issues (Abbas et al., 2018). Each of the models presented in the literature considers a certain number of drilling variables as inputs. However, three variables, namely weight on bit, bit rotational speed, and bit size, are commonly found in most ROP models (Ayoub et al., 2017).

In recent decades, several mathematical relationships have been presented by researchers to model the correlation between ROP and different drilling variables. Some important earlier ROP models were presented by Galle and Woods (1963), Bingham (1965), and Bourgoyne and Young (1974). The Bourgoyne and Young model (BYM) is one of the most comprehensive experimental models and is widely accepted as a useful model in the industry. Experimental models have constants that change from one field to another; these constants depend on formation properties such as pressure and rock type and must be determined from field data. Furthermore, some traditional models assume that the effects of parameters such as weight on bit and bit rotational speed on ROP are linear, and poor hole cleaning at very high drilling speeds is not considered. Despite some advantages of traditional ROP models, such limitations increase calculation error and loss of time. Therefore, machine learning (ML) methods have been considered by researchers as a suitable alternative for ROP estimation (Amer et al., 2017; Kor and Altun, 2020).

In this study, mud logging unit (MLU) and final report data of a drilled well located in southwestern Iran were used to develop ROP estimation models. Raw data usually include noise that overlays the useful data (Nettleton et al., 2010; Wang et al., 1995). The smoothing filter proposed by Savitzky and Golay (1964) was employed to reduce the noisy data. The Savitzky-Golay (SG) smoothing filter is a low pass filter that extracts useful data from noisy data while retaining the original shape of the signal (Dombi and Dineva, 2018). After removing outliers and reducing the noise, all available data were divided into 6 sections based on depth; sections 2 to 6 were then used separately as test datasets, with all preceding sections used as the corresponding training dataset.

The aim of this study is to develop four ML-based methods and to compare their performance with two traditional ROP models, in order to provide a basis for selecting the best method for approximating ROP at greater hole depth. The ML-based methods are the multilayer perceptron neural network (MLPNN), radial basis function neural network (RBFNN), adaptive neuro-fuzzy inference system (ANFIS), and support vector regression (SVR). The MLPNN, ANFIS, and RBFNN models were combined with four different evolutionary algorithms: particle swarm optimization (PSO), ant colony optimization (ACO), differential evolution (DE), and the genetic algorithm (GA). The backpropagation (BP) algorithm was also used as a conventional algorithm in combination with the ANFIS and MLPNN models. The traditional ROP models comprise the BYM and the Bingham model. In the literature, the constants of the BYM model have been calculated using optimization algorithms and mathematical methods (Anemangely et al., 2017; Bahari et al., 2008; Eren and Ozbayoglu, 2010; Nascimento et al., 2015). Consequently, four evolutionary algorithms (PSO, GA, ACO, and DE) were employed here to calculate the field constants of these two traditional ROP models. The error of all models was evaluated and compared with different statistical performance indices, namely the average absolute percent relative error (AAPRE), the root mean square error (RMSE), and the regression coefficient (R). The distinguishing goal of this study is to perform a comprehensive comparison between widely used ML methods and traditional models optimized by various meta-heuristic algorithms for ROP prediction. ROP estimation is the first step in optimizing the drilling process: more accurate models help to calculate the optimal values of the drilling parameters more precisely and thus reduce time and cost during drilling operations. The results obtained from this study can be considered in the optimization and management of future drilling wells.

2. Data collection and preprocessing

Required data were collected from the mud logging unit (MLU) and the final report of a drilled well located in southwestern Iran. This well was drilled vertically and has a conventional bottom hole assembly. The MLU provides simultaneous continuous analysis of the drilling fluid, comprehensive monitoring of the drilling operations, specialized calculations, and surface analysis. In this unit, information is continuously collected from various sensors and processed into operational variables. A schematic of the sequence of drilled formations and the different sections of the studied well is shown in Fig. 1.

Fig. 1. Schematic of a sequence of drilled formations and the different sections of the studied well based on depth.

Table 1
Statistical details of the variables collected from the studied well (1912 samples).

Coded factor | Parameter | Unit | Minimum | Maximum | Average | SD | Skewness | Kurtosis
Q1 | Depth | ft | 1583.36 | 8677.82 | 5357.99 | 2015.95 | −0.17 | 1.92
Q2 | Bit Size (BS) | in | 8.50 | 17.50 | 11.48 | 3.37 | 0.76 | 2.23
Q3 | Weight on Bit (WOB) | klbf | 0.65 | 23.50 | 11.02 | 3.86 | 0.39 | 2.96
Q4 | Hook Load (HL) | klbf | 99.02 | 250.47 | 176.93 | 40.90 | −0.10 | 1.83
Q5 | Bit Rotational Speed (BRS) | RPM | 70.58 | 195.84 | 123.15 | 35.02 | 0.9 | 1.37
Q6 | Torque | lbf.ft | 1135.72 | 2347.06 | 1880.75 | 241.31 | −1.20 | 3.57
Q7 | Pump Pressure (PP) | psi | 175.04 | 2619.95 | 1299.95 | 987.69 | −0.04 | 1.09
Q8 | Flow Rate (FR) | gpm | 199.74 | 913.56 | 527.41 | 232.68 | 0.31 | 1.62
Q9 | Lag Time (LT) | min | 19.72 | 84.96 | 42.72 | 11.15 | 0.25 | 2.66
Q10 | Mud Weight (MW) | ppg | 7.61 | 19.30 | 12.95 | 4.32 | 0.13 | 1.21
Q11 | Equivalent Circulating Density (ECD) | ppg | 13.34 | 15.28 | 14.22 | 0.76 | 0.01 | 1.12
Q12 | Mud Temperature (MT) | °C | 29.81 | 61.47 | 44.12 | 9.26 | 0.14 | 1.33
Q13 | Bit Working Hours (BWH) | hr | 39.78 | 535.11 | 280.25 | 143.39 | −0.03 | 1.81
Q14 | Rate of Penetration (ROP) | ft/hr | 0.66 | 57.43 | 19.95 | 8.53 | 1.08 | 4.52


Fig. 2. The pairs plot for all collected variables to evaluate their correlation with each other.

The database consists of 1912 samples with 13 independent variables and one dependent variable (i.e., the rate of penetration). Measuring some parameters, such as the pore pressure gradient and bit wear, in real time is very difficult or impossible (Barbosa et al., 2019; Soares and Gray, 2019); for this reason, these parameters were omitted in the development of the ML methods. The variables used to develop the models are presented in Table 1 along with statistical information such as the mean, standard deviation (SD), skewness, and kurtosis. Skewness is a metric showing the asymmetry of a probability distribution around the distribution mean; right- and left-skewed data result in positive and negative skewness, respectively. Kurtosis indicates the height of a distribution relative to the normal distribution; positive and negative kurtosis mean that the distribution is higher and lower than the normal distribution, respectively (Phirani et al., 2018). Moreover, the pairs plot of all drilling variables based on the regression coefficient is shown in Fig. 2. A pairs plot represents the linear relationship between different variables in an overall view (Feng et al., 2021). The red lines in Fig. 2 indicate the trend lines, through which the type and intensity of the correlation between two variables can be evaluated; the slope of this line (or correlation coefficient) is between −1 and +1. By using the pairs plot, the correlation between each pair of variables can be represented, and the variables with a poor correlation with the target can be easily identified (Ashrafi et al., 2019; Gan et al., 2019).

2.1. Noise reduction

The raw data usually contain noise that overlays the useful data (Nettleton et al., 2010; Wang et al., 1995). This noise is caused by factors such as the inaccuracy of the recording systems and human error, which prolong the training process of the models and reduce their generalizability. Therefore, overcoming the noisy data is essential for enhancing model performance and achieving reliable results (Quinlan, 1986; Zhu et al., 2003).

In this research, the Savitzky-Golay (SG) smoothing filter was employed to reduce the noise. SG is a low pass filter that extracts useful data from noisy data while retaining the original shape of the signal (Dombi and Dineva, 2018). In recent years, several researchers have used the SG filter to reduce the noise in drilling and petrophysical data and thereby improve the performance of ROP estimation models (Anemangely et al., 2018; Ashrafi et al., 2019; Sabah et al., 2019). This filter reduces the noise by smoothing via a polynomial function: to assign smoother values to each data point, a window moves across the data and a low-degree polynomial function is fitted to the points within it.

The smoothing performance of this filter depends on two parameters: the order of the polynomial and the window size. Increasing the polynomial order and decreasing the window size reduce the smoothing, while the opposite leads to over-smoothing. Also, no smoothing occurs if the difference between the polynomial order and the window size is equal to one (Liu et al., 2016).

To find the best values for these two parameters, a sensitivity analysis was performed. The intervals 1 to 4 and 3 to 29 were considered for the polynomial order and the window size, respectively. Then, an SVR model with an RBF kernel was employed to evaluate the effect of different values of these two parameters on the approximation performance. The SVR model was trained with 80% of the filtered data and tested with the remaining 20%. Finally, the correlation coefficient between the real and approximated values was calculated, as shown in Fig. 3. Based on Fig. 3, the optimal values of the polynomial order and the window size were selected as 3 and 11, respectively. For all collected variables, the field data (in orange) and the data denoised with the SG filter (in black) are compared in Fig. 4.

Fig. 3. The obtained correlation coefficients from the SVR model for different values of window size and polynomial order of the SG filter.

Fig. 4. Comparison of field data and denoised data for all collected variables.
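As an illustration of this preprocessing step, the short Python sketch below applies the SG filter and repeats the window-size/polynomial-order scan with an RBF-kernel SVR. The array names, the simple 80/20 split, and the use of SciPy/scikit-learn are assumptions made for illustration, not the authors' original implementation.

```python
# Illustrative sketch of the SG-filter sensitivity analysis described above.
# Assumes `X` (n_samples x 13 drilling variables) and `y` (ROP) are NumPy arrays
# already cleaned of outliers; the library choices are ours, not the authors'.
import numpy as np
from scipy.signal import savgol_filter
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

def denoise(X, y, window=11, polyorder=3):
    """Smooth every variable column-wise with the Savitzky-Golay filter."""
    Xs = savgol_filter(X, window_length=window, polyorder=polyorder, axis=0)
    ys = savgol_filter(y, window_length=window, polyorder=polyorder)
    return Xs, ys

def correlation_after_smoothing(X, y, window, polyorder):
    """Train an RBF-kernel SVR on 80% of the filtered data; return R on the rest."""
    Xs, ys = denoise(X, y, window, polyorder)
    n_train = int(0.8 * len(ys))
    scaler = MinMaxScaler().fit(Xs[:n_train])
    model = SVR(kernel="rbf").fit(scaler.transform(Xs[:n_train]), ys[:n_train])
    pred = model.predict(scaler.transform(Xs[n_train:]))
    return np.corrcoef(ys[n_train:], pred)[0, 1]

# Scan polynomial orders 1-4 and odd window sizes 3-29 (window > order + 1, so
# that some smoothing actually occurs):
# best = max(((w, p) for p in range(1, 5) for w in range(3, 30, 2) if w > p + 1),
#            key=lambda wp: correlation_after_smoothing(X, y, *wp))
```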


Fig. 5. An illustration of separating all data to estimate the ROP.

2.2. Data split

The collected variables include well depth (ft), bit size (in), weight on bit (klbf), hook load (klbf), bit rotational speed (rpm), torque (lbf.ft), pump pressure (psi), flow rate (gpm), lag time (min), mud weight (ppg), mud equivalent circulating density (ppg), mud temperature (°C), and bit working hours (hr). These 13 variables, with 1912 samples, were divided into six sections based on depth. Sections 2 to 6 were then used separately as test datasets, with all preceding sections used as the corresponding training dataset. Splitting the data in this realistic, field-oriented manner makes it possible to determine the true performance of the models (Tunkiel et al., 2021). An illustration of dividing all data into test and training datasets is shown in Fig. 5 (the data have been colored based on each section), and a minimal sketch of the split is given below.
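The following sketch shows the depth-based, expanding-window split described above; the section boundary depths and array names are hypothetical placeholders, since the exact intervals of the six sections are read from Fig. 5 rather than listed numerically.

```python
# Illustrative depth-based split: section k is the test set, sections 1..k-1 the training set.
# `depth` is the measured-depth column; `section_edges` are hypothetical boundary depths (ft)
# separating the 6 sections (the real boundaries follow Fig. 5).
import numpy as np

def expanding_window_cases(depth, section_edges):
    """Yield (train_idx, test_idx) index arrays for test sections 2..6."""
    section = np.searchsorted(section_edges, depth)        # 0-based section id per sample
    for test_sec in range(1, len(section_edges) + 1):      # sections 2..6 (0-based 1..5)
        train_idx = np.where(section < test_sec)[0]
        test_idx = np.where(section == test_sec)[0]
        yield train_idx, test_idx

# Example with made-up boundaries (ft):
# edges = [2500.0, 4000.0, 5500.0, 7000.0, 8200.0]
# for case, (tr, te) in enumerate(expanding_window_cases(depth, edges), start=1):
#     print(f"Case {case}: {len(tr)} training samples, {len(te)} test samples")
```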
3. Methodology

3.1. Machine learning (ML) methods

Machine learning (ML) methods are a subset of artificial intelligence (AI) that mimic the mechanism of the human mind in learning to predict and to make decisions (Carbonell et al., 1983). The ability of ML methods to recognize nonlinear patterns is an obvious advantage over traditional models and allows them to predict ROP with much higher accuracy (Kor et al., 2021). ML methods can handle an arbitrary number of inputs, so they can generate models for any dataset size (Barbosa et al., 2019). Furthermore, ML methods can be retrained during drilling operations, improving model performance in real time with newly recorded operational data. In addition, the performance gain obtained by increasing the number of samples is larger for ML models than for traditional models (Soares and Gray, 2019). Table 2 provides a summary of the literature on ML methods that have been employed to estimate the ROP, along with the input variables of the models.

Considering the 18 reviewed works in Table 2, the frequency of inputs with at least two repetitions is shown in Fig. 6. Based on this figure, it can be concluded that the RPM and WOB variables have been used as inputs of the ML models in all 18 studies.

3.1.1. Multilayer perceptron neural network (MLPNN)

The multilayer perceptron neural network (MLPNN) is one of the most popular artificial neural networks (ANNs), inspired by the structure of communication between human brain neurons. By using this network, the limitations of traditional models can be overcome and nonlinear functions can be approximated with acceptable accuracy (Li and Samuel, 2019). An MLPNN consists of three layers (input, hidden, and output) that are connected to each other by specific communication weights. The inputs are first multiplied by specific weight values and then transferred to the hidden layer. The hidden layer neurons contain activation functions, which are responsible for collecting the weighted values of the inputs and mapping them to a specified interval; the sigmoid activation function is one of the most frequently used functions in the MLPNN (Ashrafi et al., 2019; Brenjkar et al., 2021; Gouda et al., 2021). In the output layer, the weighted values of the hidden layer neurons are summed and, after being transferred to a specific range by an activation function (typically linear), the outputs of the network are generated (Soofastaei et al., 2016; Warsito et al., 2018). It should be noted that the input variables must be normalized to a specified interval to eliminate the impact of the data scale and to evaluate each of the inputs fairly. Eq. (1) can be used to normalize the data to the interval between 0 and 1:

x_i = (X − X_min) / (X_max − X_min)   (1)

where x_i is the normalized value, X is the initial value, and X_min and X_max are the minimum and maximum values of the desired parameter (Abbas et al., 2018).
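Eq. (1) is the standard min-max scaling; a compact sketch is shown below (the array names are illustrative, and the training minimum and maximum should be reused when scaling new data):

```python
# Min-max normalization of each input variable to [0, 1], as in Eq. (1).
import numpy as np

def min_max_normalize(X, X_min=None, X_max=None):
    """Scale the columns of X to [0, 1]; pass the training min/max for new data."""
    X = np.asarray(X, dtype=float)
    X_min = X.min(axis=0) if X_min is None else X_min
    X_max = X.max(axis=0) if X_max is None else X_max
    return (X - X_min) / (X_max - X_min), X_min, X_max
```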
Table 2
A summary of ML-based models used in the research literature for estimating the ROP (Reference | Models | Model inputs | Best model).

Li and Samuel (2019) | BP-ANN | True vertical depth, RPM, Bit hours, WOB, Bit type, Torque, Flow rate, Pump pressure, Total flow area, Azimuth, Inclination, Mud weight, UCS, Pore pressure, Vertical stress, Maximum and minimum horizontal stress | –
Zhao et al. (2020) | LM-ANN, SCG-ANN, OSS-ANN | Well depth, RPM, WOB, Shut-in pipe pressure, Fluid rate, Ratio of yield point to plastic viscosity, Ratio of 10 min gel strength to 10 s gel strength | LM-ANN
Ahmed et al. (2019) | ANN, ELM, SVR, LS-SVR | Inputs 1: WOB, RPM, Torque, SSP, Flow rate, Well depth, Mud weight, Bit diameter; Inputs 2: Well depth, WOB, RPM, Torque, Flow rate | LS-SVR
Elkatatny (2019) | SADE-ANN | RPM, WOB to bit diameter, Product of pressure and flow rate, Torque to UCS, Density to plastic viscosity | –
Ashrafi et al. (2019) | PSO-MLP, GA-MLP, ICA-MLP, BBO-MLP, simple MLP; PSO-RBF, GA-RBF, ICA-RBF, BBO-RBF, simple RBF | WOB, RPM, Pump flow rate, Pump pressure, Pore pressure, Gamma ray, Density log, Shear wave velocity | PSO-MLP
Momeni et al. (2018) | BP-ANN | Well depth, WOB, RPM, Bit diameter, Mud weight, Plastic viscosity, Flow rate | –
Anemangely et al. (2018) | PSO-MLP, COA-MLP | RPM, WOB, Shear wave slowness, Compressional wave slowness, Flow rate | COA-MLP
Abbas et al. (2018) | BP-ANN | TVD, WOB, RPM, Bit type, Bit working hours, Torque, Flow rate, Circulation pressure, Total flow rate, Azimuth, Inclination, Mud weight, UCS, Pore pressure, Vertical stress, Maximum horizontal stress, Minimum horizontal stress | –
Zhou and Chen (2018) | SVR | WOB, RPM, Mud flow in, Mud density, ROP at the current time | –
Al-AbdulJabbar et al. (2018) | BP-ANN | Flow rate, RPM, WOB, Stand pipe pressure, Torque, UCS | –
Abdulmalek et al. (2018) | SVM | WOB, Flow rate, SSP, RPM, Torque, Mud density, Funnel viscosity, Plastic viscosity, Yield point, Fluid solids | –
Ayoub et al. (2017) | ANFIS | Depth, Bit size, Mud weight, WOB, RPM | –
Amer et al. (2017) | BP-ANN | Bit type 1, Bit type 2, IADC code 1, IADC code 2, IADC code 3, Bit diameter, Bit status 1, Bit status 2, Measured depth, True vertical depth, RPM, WOB, Torque, Flow rate, SSP, Mud weight, Percentages of shale, sandstone, clay, siltstone, claystone, anhydrite and limestone | –
Ansari et al. (2017) | CSVR-ICA | Interval layer, Formation type, RPM, Flow rate, Pump pressure, 1/WOB, 1/Mud weight, Inner row bit tooth wear | –
Shi et al. (2016) | ELM, USA, ANN | Bit size, Bit type, UCS, Formation drillability, Formation abrasiveness, Pump pressure, RPM, WOB, Mud type, Mud viscosity | USA
Bodaghi et al. (2015) | HPGSVR, CASVR, CSSVR | Mud viscosity, Mud weight, Pump rate, Pump pressure, Well deviation, RPM, WOB, Interval drilling, Bit size, Bit wear, Formation type | CSSVR
Jacinto et al. (2013) | DENFIS and Bayesian network | WOB, UCS, RPM, Bit diameter | DENFIS
AlArfaj et al. (2012) | RBF and ELM | Mud density, WOB, RPM, Bit hydraulics, Pore pressure, Threshold WOB, Well depth, Bit wear, Bit diameter | ELM

3.1.2. Support vector regression (SVR)

Support vector regression (SVR) can be used in regression problems to approximate nonlinear and complex functions (Cheng et al., 2017; Drucker et al., 1997). This method is based on statistical learning and considers structural risk minimization as the objective function instead of empirical risk minimization, which ensures the achievement of a global optimum and prevents over-fitting (Abdulmalek et al., 2018; Kim, 2003). In the SVR structure, mapping functions act as a mediator between the input variables and the continuous target variable (Ulker and Sorgun, 2016). For the training data {(x_i, y_i), i = 1:N, x_i ∈ R^N, y_i ∈ R}, where x_i represents the ith input vector of the N-dimensional space and y_i is the corresponding real output, the SVR model can be written as Eq. (2):

y_i = f(x_i) = W^T φ(x_i) + b   (2)

where f(x_i) is the output of the model (the approximated value of y_i), and φ(x_i) is the feature function of the input vector x_i. W and b are the weights and biases of the model, which are optimized throughout the training process. The optimization of these parameters is carried out using a loss function, defined by Eq. (3) and Eq. (4) (AL-Musaylh et al., 2018):

R(C) = (1/2) ‖w‖² + C (1/n) Σ_{i=1}^{n} |y_i − f(x_i)|_ε   (3)

|y_i − f(x_i)|_ε = 0 if |y_i − f(x_i)| ≤ ε; otherwise |y_i − f(x_i)| − ε   (4)

where C is the error coefficient and ε is the maximum acceptable error. As long as the output difference is less than ε, the loss function is equal to zero; it increases linearly once the allowed area is left (Cortes and Vapnik, 1995). The problem of minimizing the loss function can be written as Eq. (5) (AL-Musaylh et al., 2018):

min_{w, b, ξ, ξ*} (1/2) ‖w‖² + C (1/n) Σ_{i=1}^{n} (ξ_i + ξ_i*)   (5)

subject to:
w^T φ(x_i) + b − y_i ≤ ε + ξ_i, i = 1, 2, …, m
y_i − w^T φ(x_i) − b ≤ ε + ξ_i*, i = 1, 2, …, m
ξ_i, ξ_i* ≥ 0, i = 1, 2, …, m

In this optimization problem, ξ_i and ξ_i* are the slack variables of the constraints, C establishes a balance between the complexity of the model and the training error, and ε is used to define an appropriate interval of variations.
Fig. 6. Frequency of inputs used to feed ML-based models according to the 18 reviewed publications.

Fig. 7. An illustration of the computational space of the loss function and nonlinear kernel functions in the SVR model: (a) the data in the primary space; (b) transferring the nonlinear data to the feature space with higher dimensions; (c) using the loss function to determine the error values of the data outside the ε-tube area.

By solving a Lagrangian dual problem derived from Eq. (5), the optimization problem of the model is obtained as Eq. (6) (AL-Musaylh et al., 2018; Drucker et al., 1997):

min_{α, α*} (1/2) Σ_{i,j=1}^{m} (α_i − α_i*)(α_j − α_j*) k(x_i, x_j) + (1/2) Σ_{i=1}^{n} [(ε − y_i) α_i + (ε + y_i) α_i*]   (6)

subject to:
Σ_{i=1}^{n} (α_i − α_i*) = 0
0 ≤ α_i, α_i* ≤ C, i = 1, 2, …, m

By solving the problem above, the SVR model is finally defined as follows:

f(x) = Σ_{i=1}^{n} (α_i − α_i*) k(x_i, x) + b   (7)

where α_i and α_i* are the Lagrangian coefficients and k(x_i, x) is the kernel function. Applying a robust optimization algorithm to select the optimal values of the kernel parameters leads to the development of a high-precision, high-performance SVR model (Akande et al., 2017; El-Sebakhy et al., 2007). There are various kernel functions that can be fitted into the SVR structure, each with a different behavior. The most common kernel functions are the Gaussian, polynomial, and linear kernels, whose implementation can be examined in the following references (AL-Musaylh et al., 2018; Cheng et al., 2017; Hong, 2009). The behavior of the kernel and loss functions for a set of nonlinear data is illustrated in Fig. 7. In addition, a schematic of the SVR structure is shown in Fig. 8.

Fig. 8. The network structure of support vector regression (SVR).
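Two building blocks of this formulation, the ε-insensitive loss of Eq. (4) and the Gaussian (RBF) kernel mentioned above, can be written compactly as in the didactic sketch below; this is only an illustration of the quantities involved, not the solver used in the study.

```python
# Didactic sketch of two SVR building blocks: the epsilon-insensitive loss of
# Eq. (4) and the Gaussian (RBF) kernel discussed in the text.
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, eps=0.1):
    """|y - f(x)|_eps: zero inside the eps-tube, linear outside (Eq. (4))."""
    return np.maximum(np.abs(np.asarray(y_true) - np.asarray(y_pred)) - eps, 0.0)

def rbf_kernel(x1, x2, gamma=1.0):
    """Gaussian kernel k(x1, x2) = exp(-gamma * ||x1 - x2||^2)."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return np.exp(-gamma * np.dot(diff, diff))
```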
3.1.3. Adaptive neuro-fuzzy inference system (ANFIS)

Fuzzy logic (FL), first proposed by Zadeh (1975), is based on the uncertain expression of data. Jang (1993) proposed the adaptive neuro-fuzzy inference system (ANFIS) model by integrating the capabilities of FL and ANN. According to Fig. 9, the ANFIS structure consists of five layers, in which specific operations are applied to produce a model consisting of fuzzy rules. Each of these layers is described below.

Fig. 9. ANFIS structure for two inputs and two rules.

• Layer 1:
In the first layer, the numerical values of the input variables are converted into fuzzy values by using membership functions (MFs). The Gaussian function is one of the most commonly used types of MFs and is defined as follows:

O_i^1 = φ(x) = exp(−(1/2) (x − c)² / σ²)   (8)

where O_i^1 is the ith output of the first layer, φ(x) indicates the membership degree of each input in the considered MF, c is the center of the Gaussian MF, and σ is its width (i.e., variance) (Karaboga and Kaya, 2019; Karkevandi-Talkhooncheh et al., 2017).

• Layer 2:
This layer is responsible for determining the firing strength of a rule:

O_i^2 = W_i = φ_Ai(x_1) · φ_Bi(x_2), i = 1, 2   (9)

• Layer 3:
This layer normalizes the firing strength of each rule relative to the total firing strength according to Eq. (10) (Enayatollahi et al., 2020):

O_i^3 = W̄_i = W_i / (W_1 + W_2), i = 1, 2   (10)
• Layer 4:
In this layer, an expression is formed according to Eq. (11), which is the product of the normalized firing strength values and the fuzzy if-then rules:

O_i^4 = W̄_i f_i = W̄_i (a_i x_1 + b_i x_2 + c_i), i = 1, 2   (11)

where O_i^4 represents the outputs of the fourth layer, W̄_i is the normalized firing strength of each rule, and f_i is the fuzzy if-then rule, defined as follows:

IF x_1 is A_1 AND x_2 is B_1, THEN y = a_1 x_1 + b_1 x_2 + c_1
IF x_1 is A_2 AND x_2 is B_2, THEN y = a_2 x_1 + b_2 x_2 + c_2

where IF and THEN are called the antecedent and consequence sections, respectively. The A and B terms denote MFs, and the {a_i, b_i, c_i} values are parameter sets of the model that are optimized during the training process (Basarir et al., 2014; Karkevandi-Talkhooncheh et al., 2017; Naresh et al., 2020).

• Layer 5:
In this layer, all the output signals from the previous layer are aggregated in a single node to obtain the quantitative network output as follows (Karaboga and Kaya, 2019):

O^5 = Σ_i W̄_i f_i = (Σ_i W_i f_i) / (Σ_i W_i), i = 1, 2   (12)

3.1.4. Radial basis function neural network (RBFNN)

The radial basis function neural network (RBFNN) is a particular type of artificial neural network proposed by Moody and Darken (1989). The RBFNN consists of only one hidden layer, which makes it much more efficient, with fewer computations, compared to other types of neural networks with multiple hidden layers. The hidden layer of this network contains nonlinear activation units with radial basis functions (RBFs), which are responsible for mapping the input data into a higher-dimensional space (i.e., the feature space) (Aggarwal, 2018; Kakouei et al., 2014; Rezaei et al., 2022). Fig. 10 shows a simple representation of the RBFNN structure.

Fig. 10. The structure of the radial basis function neural network (RBFNN).

The Gaussian function is one of the most popular and widely used RBFs (Slema et al., 2018). The kth output of the RBFNN is defined as follows:

ŷ_i = Σ_{k=1}^{h} w_ki φ_k(x) + ε_k   (13)

where h is the number of hidden neurons, w_ki represents a connection weight from the hidden layer neurons to the output layer neurons, and ε_k is the bias value (Chandra et al., 2020). After generating the initial centers during an unsupervised step, the RBFs are applied to the Euclidean distances between the centers of the functions and the input vectors (Aggarwal, 2018; Hordri et al., 2017; Soofi and Cao, 2002). The RBFNN parameters include ε, w, c, and σ, which must be set based on the value of the approximation error (De Mulder et al., 2020; Mosavi et al., 2017).

3.2. Optimization algorithms

3.2.1. Ant colony optimization (ACO)

The ant colony optimization (ACO) algorithm was proposed by Dorigo et al. (1996). This algorithm was inspired by the social behavior of ants and their special ability to find the shortest route from the colony to the food source (Dorigo et al., 1996; Li et al., 2017). An illustration of the ants' natural behavior when finding food sources is shown in Fig. 11.

Fig. 11. An illustration of social behavior in ants in finding the shortest route between the colony and the food source: (a) starting to search for food randomly; (b) finding the first route to the food source and secreting the initial pheromones; (c) reinforcing the closest path to the food source through the secretion of more pheromones by other ants.

The first ACO algorithm was applied to combinatorial optimization problems. Socha and Dorigo (2008) were among the researchers who modified the ACO algorithm for application in continuous domains, called ACOR. A probability density function (PDF) containing a set of Gaussian kernel functions is used in ACOR. This function for the ith dimension of the solution is defined as follows:

G^i(x) = Σ_{l=1}^{k} ω_l g_l^i(x) = Σ_{l=1}^{k} ω_l (1 / (σ_l^i √(2π))) exp(−(x − α_l^i)² / (2 (σ_l^i)²))   (14)

where g_l^i(x) is the Gaussian function for the ith dimension, α^i = {α_1, …, α_k} is the mean vector, σ^i = {σ_1, …, σ_k} is the standard deviation vector, and ω = {ω_1, …, ω_k} is the vector of weight values assigned to the solutions (Zhao et al., 2021). After generating and sorting the solutions in the archive, weights are assigned to these solutions according to Eq. (15):

ω_l = (1 / (q k √(2π))) exp(−(l − 1)² / (2 q² k²))   (15)

where q > 0 is a tuning parameter that acts similarly to a selection pressure parameter. The parameter α_l^i for solution l and the ith decision variable in the archive is equal to x_l^i. The probability of an ant selecting a solution is defined based on its weight and can be written as Eq. (16):

P_l = ω_l / Σ_{j=1}^{k} ω_j   (16)

where ω_l is the weight of the Gaussian function of the lth solution.
Fig. 13. Flowchart of the model development process for ROP estimation.

After generating a new solution using the Gaussian function, the mean value μ_ij and the standard deviation σ_ij are calculated according to Eq. (17) and Eq. (18), respectively:

μ_ij = x_ij   (17)

σ_ij = ξ Σ_{e=1}^{k} |x_e^i − x_l^i| / (k − 1)   (18)

where ξ > 0 is a parameter similar to the pheromone evaporation rate in ACO; the convergence speed is increased by selecting low values for it (Azad et al., 2019). In the following step, the generated solutions for all decision variables resulting from the selection of the ants and some old solutions from the previous step are combined to form a new solution archive; the k top solutions are saved and the rest are eliminated. These steps are iterated until the termination criteria are satisfied (Afshar and Madadgar, 2008; Bamdad et al., 2017; Zhao et al., 2021).
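A compact sketch of the ACOR bookkeeping described by Eqs. (15)-(18) is given below; the archive layout and the parameter values (q, ξ) are illustrative placeholders rather than the study's tuned settings.

```python
# Illustrative ACO_R step: weight the k archived solutions (Eq. (15)), pick one by
# roulette selection (Eq. (16)), and sample a new value for decision variable i
# from a Gaussian with mean mu (Eq. (17)) and sigma from Eq. (18).
import numpy as np

def acor_sample(archive, i, q=0.5, xi=1.0, rng=np.random.default_rng()):
    """archive: (k, n_vars) array of solutions sorted from best to worst."""
    k = archive.shape[0]
    ranks = np.arange(1, k + 1)
    w = np.exp(-((ranks - 1) ** 2) / (2 * q ** 2 * k ** 2)) / (q * k * np.sqrt(2 * np.pi))  # Eq. (15)
    p = w / w.sum()                                                                          # Eq. (16)
    l = rng.choice(k, p=p)                                  # roulette-selected solution index
    mu = archive[l, i]                                      # Eq. (17)
    sigma = xi * np.sum(np.abs(archive[:, i] - archive[l, i])) / (k - 1)                     # Eq. (18)
    return rng.normal(mu, sigma)
```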
Table 3
Specifications of the MLPNN structures for ROP estimation.

Network setting | Value or type
Type of network | "Feed forward"
Number of hidden layer(s) for cases 1 to 5 | 2, 2, 2, 1, 1
Number of hidden neurons (Case 1) | [6 7]
Number of hidden neurons (Case 2) | [6 5]
Number of hidden neurons (Case 3) | [6 5]
Number of hidden neurons (Case 4) | [7]
Number of hidden neurons (Case 5) | [7]
Hidden layer transfer function | "Sigmoid"
Output layer transfer function | "Linear"
Number of inputs | 13
Number of outputs | 1
Minimum and maximum number of weight elements | 99–135
Stopping condition | Achieve minimum error (MSE) or max epoch

Table 4
Calculated values of the RBFNN parameters obtained with the four meta-heuristic algorithms.

Case | Parameter | ACO | PSO | DE | GA
Case 1 | Number of active centers | 93 | 110 | 77 | 142
Case 1 | Width of the RBFs | 58.61 | 32.63 | 43.47 | 73.25
Case 2 | Number of active centers | 203 | 249 | 221 | 171
Case 2 | Width of the RBFs | 80.32 | 176.78 | 80.70 | 154.68
Case 3 | Number of active centers | 262 | 312 | 362 | 176.9
Case 3 | Width of the RBFs | 59.53 | 115.36 | 102.06 | 75.54
Case 4 | Number of active centers | 239 | 282 | 290 | 279
Case 4 | Width of the RBFs | 82.33 | 76.56 | 131.80 | 81.66
Case 5 | Number of active centers | 354 | 272 | 355 | 322
Case 5 | Width of the RBFs | 273.45 | 215.36 | 181.23 | 194.58

Table 5
Specifications of the PSO, GA, ACO, DE, and BP algorithms for training the MLPNN, RBFNN, and ANFIS models.

Algorithm | Parameter | MLPNN | RBFNN | ANFIS
PSO | Number of particles | 80 | 50 | 60
PSO | Cognitive term | 2 | 2 | 2
PSO | Social component | 2 | 2 | 2
PSO | Inertia weight | 0.9 | 0.9 | 0.8
PSO | Iterations | 500 | 100 | 1000
ACO | Number of ants | 60 | 30 | 80
ACO | Archive size | 30 | 20 | 40
ACO | Intensification factor | 0.5 | 0.5 | 0.5
ACO | Deviation-distance ratio | 1 | 1 | 1
ACO | Iterations | 500 | 100 | 1000
DE | Population size | 80 | 50 | 80
DE | Upper bound of scaling factor | 0.8 | 0.8 | 0.8
DE | Lower bound of scaling factor | 0.2 | 0.2 | 0.2
DE | Crossover probability | 0.8 | 0.8 | 0.8
DE | Iterations | 500 | 100 | 1000
GA | Population size | 80 | 40 | 80
GA | Crossover rate | 0.9 | 0.9 | 0.9
GA | Mutation rate | 0.1 | 0.15 | 0.15
GA | Selection pressure | 3 | 3 | 3
GA | Iterations | 500 | 100 | 1000
BP | Initial step size | 0.001 | – | 0.01
BP | Step size decrease | 0.1 | – | 0.7
BP | Step size increase | 10 | – | 0.1
BP | Maximum epochs | 200 | – | 200

Table 6
Values of the kernel parameters used in the SVR structure.

Case | Kernel function | C | Kernel scale (γ) | ε
Case 1 | Linear | 8.18 | 5.12 | 0.01
Case 1 | RBF | 24.12 | 22.38 | 0.11
Case 1 | Polynomial | 32.23 | 15.38 | 0.69
Case 2 | Linear | 9.12 | 25.41 | 0.95
Case 2 | RBF | 14.99 | 18.55 | 0.36
Case 2 | Polynomial | 28.68 | 12.95 | 0.91
Case 3 | Linear | 5.90 | 12.31 | 0.31
Case 3 | RBF | 24.91 | 16.05 | 0.02
Case 3 | Polynomial | 47.12 | 12.35 | 0.01
Case 4 | Linear | 37.55 | 5.29 | 0.09
Case 4 | RBF | 39.98 | 8.38 | 0.03
Case 4 | Polynomial | 24.31 | 11.58 | 0.72
Case 5 | Linear | 28.39 | 18.45 | 0.01
Case 5 | RBF | 31.12 | 13.46 | 0.47
Case 5 | Polynomial | 34.25 | 7.01 | 0.55

Table 7
Specifications of the PSO, GA, ACO, and DE algorithms for training the BYM and Bingham models.

Algorithm | Parameter | Value for BYM model | Value for Bingham model
PSO | Number of particles | 80 | 50
PSO | Cognitive term | 2 | 2
PSO | Social component | 2 | 2
PSO | Inertia weight | 0.9 | 0.9
PSO | Iterations | 500 | 500
ACO | Number of ants | 80 | 50
ACO | Archive size | 30 | 20
ACO | Intensification factor | 0.5 | 0.5
ACO | Deviation-distance ratio | 1 | 1
ACO | Iterations | 500 | 500
DE | Population size | 80 | 50
DE | Upper bound of scaling factor | 0.8 | 0.8
DE | Lower bound of scaling factor | 0.2 | 0.2
DE | Crossover probability | 0.8 | 0.8
DE | Iterations | 500 | 500
GA | Population size | 80 | 50
GA | Crossover rate | 0.9 | 0.9
GA | Mutation rate | 0.1 | 0.15
GA | Selection pressure | 3 | 3
GA | Iterations | 500 | 500

3.2.2. Particle swarm optimization (PSO)

The particle swarm optimization (PSO) algorithm is inspired by the flocking behavior of birds and fish in search of food and in escaping from danger (Kennedy and Eberhart, 1995). The constituent elements of the PSO algorithm are called particles. In each iteration of PSO, the particles move to a new position in space, and the objective function is calculated for each particle (Jang-Ho et al., 2008). The next position of each particle is determined by several components, including the best position found by the particle itself (p_ij), the best position already reached by the other particles (p_gi), and the inertia of the particle motion. After the particle movement, one step of the algorithm is completed, and this process is iterated until the stopping condition is reached. In each iteration, the velocity and position of the particles are determined by the following equations:

v_ij(t + 1) = ω v_ij(t) + c_1 r_1j(t) [p_ij − x_ij(t)] + c_2 r_2j(t) [p_gi − x_ij(t)]   (19)

x_ij(t + 1) = x_ij(t) + v_ij(t + 1)   (20)

where x_ij(t) is the current position of the ith particle in the tth iteration, p_ij is the best position discovered by the ith particle, and p_gi is the best position found by the other particles, which is shared with each particle (Liang et al., 2019). The parameters c_1 and c_2 are the cognitive term and the social component, respectively; r_1 and r_2 are random numbers in the range [0, 1]; and ω is the inertia weight of the particle (Karkevandi-Talkhooncheh et al., 2017; Ma and Wang, 2010). Fig. 12 indicates the movement of each particle in the PSO algorithm.

Fig. 12. Movement of one particle toward the next position based on three vectors: the inertia weight, the best previous position of the particle, and the best position found by the other particles.
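A minimal sketch of one PSO update following Eqs. (19) and (20) is shown below; the default parameter values correspond to the MLPNN settings in Table 5, and the vectors are assumed to be NumPy arrays of the same shape.

```python
# One PSO update step for a single particle, following Eqs. (19) and (20).
import numpy as np

def pso_step(x, v, p_best, g_best, w=0.9, c1=2.0, c2=2.0, rng=np.random.default_rng()):
    """Return the new position and velocity of a particle (all inputs are arrays)."""
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v_new = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)   # Eq. (19)
    x_new = x + v_new                                                  # Eq. (20)
    return x_new, v_new
```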
Table 8
The calculated constants of the BYM model.

Model | Case | a1 | a2 | a3 | a4 | a5 | a6 | a7 | a8
PSO-BY | Case 1 | 3.37 | −1.7 × 10^-4 | −2.8 × 10^-6 | 1.6 × 10^-5 | −0.01 | 0.29 | 2.47 | 0.36
PSO-BY | Case 2 | 0.77 | 2.6 × 10^-4 | 6.3 × 10^-5 | −3.6 × 10^-6 | −0.08 | −0.21 | 1.22 | −0.95
PSO-BY | Case 3 | 0.46 | 1.1 × 10^-4 | 8.8 × 10^-5 | 8.6 × 10^-5 | −0.02 | 0.28 | −3.91 | −0.51
PSO-BY | Case 4 | 0.84 | 2.1 × 10^-4 | 9.1 × 10^-5 | 7.4 × 10^-5 | −0.05 | 0.16 | 0.61 | −0.55
PSO-BY | Case 5 | 0.93 | 6.6 × 10^-5 | −1.3 × 10^-5 | 9.9 × 10^-6 | −0.03 | 0.11 | −0.55 | −0.09
ACO-BY | Case 1 | 0.50 | 1.2 × 10^-5 | 1.0 × 10^-6 | 3.9 × 10^-5 | 0.50 | 1.03 | 1.50 | 0.30
ACO-BY | Case 2 | 1.90 | 1.0 × 10^-6 | 1.0 × 10^-6 | 1.0 × 10^-6 | 1.09 | 0.40 | 1.50 | 0.30
ACO-BY | Case 3 | 1.80 | 1.0 × 10^-6 | 1.0 × 10^-6 | 1.0 × 10^-6 | 0.94 | 0.40 | 1.80 | 0.46
ACO-BY | Case 4 | 1.72 | 1.0 × 10^-6 | 1.0 × 10^-6 | 4.3 × 10^-5 | −0.50 | 0.72 | 1.22 | 0.30
ACO-BY | Case 5 | 1.57 | 1.0 × 10^-6 | 1.0 × 10^-6 | 1.0 × 10^-5 | 0.53 | 0.38 | 0.31 | 0.52
DE-BY | Case 1 | 1.22 | −1.1 × 10^-5 | 4.9 × 10^-5 | 8.5 × 10^-5 | −0.06 | 0.82 | 0.12 | 0.51
DE-BY | Case 2 | 0.23 | 1.6 × 10^-4 | 1.7 × 10^-4 | 1.3 × 10^-5 | −0.01 | 0.26 | 0.15 | −0.16
DE-BY | Case 3 | 0.89 | 2.1 × 10^-5 | 5.8 × 10^-5 | 7.4 × 10^-5 | 0.06 | 0.81 | 0.11 | 0.53
DE-BY | Case 4 | 1.21 | 5.6 × 10^-5 | 2.2 × 10^-5 | 5.8 × 10^-5 | 0.05 | −0.28 | −0.78 | 0.04
DE-BY | Case 5 | 0.98 | 6.4 × 10^-5 | 3.5 × 10^-5 | 7.2 × 10^-5 | 0.01 | 0.65 | −0.14 | −0.42
GA-BY | Case 1 | 0.99 | 6.1 × 10^-5 | 4.9 × 10^-6 | 7.9 × 10^-5 | 0.50 | 0.96 | 0.45 | 0.43
GA-BY | Case 2 | 0.88 | 9.9 × 10^-5 | 9.4 × 10^-4 | 8.2 × 10^-5 | 0.32 | 0.68 | 0.09 | 0.33
GA-BY | Case 3 | 1.14 | 5.0 × 10^-5 | 7.8 × 10^-6 | 6.7 × 10^-6 | 0.50 | 0.83 | 1.24 | −0.28
GA-BY | Case 4 | 0.99 | 7.4 × 10^-5 | 7.7 × 10^-5 | 5.2 × 10^-5 | 0.51 | 0.68 | 0.80 | 0.45
GA-BY | Case 5 | 1.07 | 6.2 × 10^-5 | 6.8 × 10^-5 | 4.2 × 10^-5 | 0.61 | 0.55 | 0.79 | 0.39

Table 9
The calculated constants of the Bingham model.

Model | Case | a | k
PSO-Bingham | Case 1 | 0.16 | 0.22
PSO-Bingham | Case 2 | 0.21 | 0.18
PSO-Bingham | Case 3 | 0.41 | 0.18
PSO-Bingham | Case 4 | 0.16 | 0.18
PSO-Bingham | Case 5 | 0.19 | 0.18
ACO-Bingham | Case 1 | 0.00 | 0.18
ACO-Bingham | Case 2 | 0.09 | 0.15
ACO-Bingham | Case 3 | 0.27 | 0.16
ACO-Bingham | Case 4 | −0.05 | 0.16
ACO-Bingham | Case 5 | 0.05 | 0.22
DE-Bingham | Case 1 | 0.01 | 0.18
DE-Bingham | Case 2 | 0.11 | 0.15
DE-Bingham | Case 3 | 0.28 | 0.16
DE-Bingham | Case 4 | −0.05 | 0.16
DE-Bingham | Case 5 | 0.13 | 0.16
GA-Bingham | Case 1 | 0.17 | 0.21
GA-Bingham | Case 2 | 0.19 | 0.15
GA-Bingham | Case 3 | 0.34 | 0.16
GA-Bingham | Case 4 | 0.17 | 0.16
GA-Bingham | Case 5 | 0.22 | 0.14

3.2.3. Differential evolution algorithm (DE)

The differential evolution (DE) algorithm was first proposed by Storn and Price (1997). This algorithm generates solutions within the range of the problem values and then randomly selects four solutions in vector form, one of which is the target vector. In the next step, the mutant vector is obtained according to Eq. (21):

V_{i,G+1} = x_{r1,G} + M (x_{r2,G} − x_{r3,G})   (21)

where M ∈ [0, 2] is the mutation factor, and r1, r2, and r3 are the indexes of randomly selected vectors. After applying the mutation operator, the crossover operator is applied based on binomial crossover, and the trial vector is obtained as follows:

U_{ji,G+1} = V_{ji,G+1} if rand_j ≤ Cr or j = rnbr_i; X_{ji,G} if rand_j > Cr and j ≠ rnbr_i   (22)

where rnbr_i ∈ {1, 2, …, N} ensures that the resulting solution differs from the initial one in at least one component. The other components are selected with a probability Cr ∈ [0, 1], so that the probabilities of selection from V_{ji,G+1} and from the initial solution are Cr and 1 − Cr, respectively. rand_j ∈ [0, 1] is a random number generated from a uniform distribution: if it is less than or equal to Cr, the selected component comes from the V_{ji,G+1} set; otherwise, it comes from the X_{ji,G} set. After applying the crossover operator, the trial vector obtained in the previous step and the target vector selected in the first step are evaluated according to their fitness values. If the trial vector has higher fitness than the target vector, it becomes a member of the next generation; otherwise, the target vector is carried into the next generation. Following this step, the selection operator selects the population of the next generation from the parents and offspring. This process continues until the termination conditions are satisfied (Karkevandi-Talkhooncheh et al., 2017; Liu et al., 2014).
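The mutation and binomial crossover of Eqs. (21) and (22) can be sketched as below; the scaling factor and the crossover probability Cr = 0.8 follow the settings in Tables 5 and 7, while the vector names are illustrative.

```python
# One DE mutation/crossover step for a target vector, following Eqs. (21) and (22).
import numpy as np

def de_trial_vector(x_target, x_r1, x_r2, x_r3, M=0.5, Cr=0.8, rng=np.random.default_rng()):
    """Build the trial vector from the target vector and three distinct random vectors."""
    mutant = x_r1 + M * (x_r2 - x_r3)                       # Eq. (21)
    n = x_target.size
    j_rand = rng.integers(n)                                # guarantees at least one mutant gene
    take_mutant = (rng.random(n) <= Cr) | (np.arange(n) == j_rand)
    return np.where(take_mutant, mutant, x_target)          # Eq. (22)
```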
3.2.4. Genetic algorithm (GA)

The genetic algorithm (GA) is one of the evolutionary algorithms; it was designed and established based on natural selection and Darwin's evolutionary theory (Holland, 1992). In this algorithm, a population of chromosomes (i.e., the solutions or elements of the algorithm) is first generated within a search space. During the algorithm process, the selection operator randomly selects two chromosomes as parent chromosomes, and the probability of their presence in the next generation is determined after the fitness evaluation (Bodaghi et al., 2015). The crossover operator chooses some points of the two chromosomes as transmission points and then recombines them to generate solutions (children) that inherit their parents' characteristics. The mutation operator, based on flipping binary digits, is regarded as an auxiliary operator that is applied to the population at a low rate; it extends the search space and adds a random property to the algorithm (Wang et al., 2010).
Table 10
Summary of simulation results for each method (Train sections: Average R, Average AAPRE, Average RMSE | Test sections: Average R, Average AAPRE, Average RMSE).

ACO-MLPNN | 0.834, 16.945, 4.894 | 0.637, 19.651, 5.078
PSO-MLPNN | 0.916, 12.128, 3.595 | 0.698, 17.993, 4.916
DE-MLPNN | 0.808, 20.933, 5.450 | 0.586, 21.881, 5.330
GA-MLPNN | 0.858, 16.427, 4.637 | 0.648, 19.325, 4.959
BP-MLPNN | 0.845, 19.246, 4.848 | 0.613, 21.116, 5.188
ACO-ANFIS | 0.744, 23.025, 6.060 | 0.504, 23.084, 5.756
PSO-ANFIS | 0.733, 24.029, 6.189 | 0.503, 23.388, 5.647
DE-ANFIS | 0.659, 26.429, 6.885 | 0.439, 24.591, 5.715
GA-ANFIS | 0.751, 23.144, 6.008 | 0.538, 22.980, 5.624
BP-ANFIS | 0.700, 25.740, 6.438 | 0.478, 23.957, 5.634
ACO-RBFNN | 0.797, 20.859, 5.590 | 0.569, 21.892, 5.274
PSO-RBFNN | 0.870, 15.984, 4.545 | 0.648, 19.543, 4.984
DE-RBFNN | 0.842, 18.421, 4.973 | 0.625, 20.908, 5.220
GA-RBFNN | 0.907, 13.016, 3.841 | 0.670, 18.222, 4.913
SMO-SVR (RBF) | 0.893, 14.598, 4.055 | 0.647, 18.528, 5.158
SMO-SVR (Polynomial) | 0.858, 18.653, 4.677 | 0.613, 20.712, 5.247
SMO-SVR (Linear) | 0.644, 28.584, 6.977 | 0.399, 25.695, 5.767
ACO-BYM | 0.667, 26.947, 6.791 | 0.406, 24.968, 5.876
PSO-BYM | 0.593, 28.909, 7.414 | 0.360, 26.386, 6.180
DE-BYM | 0.697, 27.026, 6.459 | 0.434, 24.551, 5.732
GA-BYM | 0.579, 30.677, 7.633 | 0.378, 26.139, 5.833
ACO-Bingham | 0.516, 32.686, 7.780 | 0.299, 27.448, 6.164
PSO-Bingham | 0.527, 32.478, 7.767 | 0.273, 27.689, 6.216
DE-Bingham | 0.570, 31.489, 7.571 | 0.324, 27.177, 6.125
GA-Bingham | 0.454, 33.451, 8.011 | 0.278, 28.067, 6.195

Table 11
Summary of simulation results for each case (Train sections: R, AAPRE, RMSE | Test sections: R, AAPRE, RMSE).

Case 1 | 0.756, 10.799, 4.799 | 0.514, 32.224, 7.949
Case 2 | 0.736, 24.481, 6.863 | 0.652, 20.349, 5.952
Case 3 | 0.658, 27.467, 6.901 | 0.393, 36.857, 6.487
Case 4 | 0.755, 27.775, 5.794 | 0.421, 11.285, 3.691
Case 5 | 0.744, 25.839, 5.458 | 0.531, 14.461, 3.666

3.3. Traditional ROP models

Bingham (1965) proposed an experimental model with three variables, namely weight on bit, bit diameter, and rotary speed (rpm), to estimate the ROP. There is no restriction on the type of bit used with this model. The model equation is defined as follows:

R = K (W / d_b)^a N   (23)
where R is the rate of penetration (ft/hr), W is the weight on bit (klb), d_b is the bit diameter (in), N is the bit rotation speed (rpm), K is a constant indicating the ease of drilling, and a is the weight-on-bit constant (Bingham, 1965; Soares and Gray, 2019).

Bourgoyne and Young (1974) proposed an experimental model for predicting ROP that is widely accepted in the drilling industry (Eren and Ozbayoglu, 2010). The Bourgoyne and Young model (BYM) is defined as the product of eight independent functions, each of which accounts for the effect of a specific parameter on the ROP. The general equation of the BYM is given by Eq. (24); further details about this model are provided in the research literature (Bahari et al., 2008; Bourgoyne and Young, 1974; Darwesh et al., 2020; Eren and Ozbayoglu, 2010).

dF/dt = exp(a_1 + Σ_{j=2}^{8} a_j x_j)   (24)

where a_1 to a_8 are the model constants and x_1 to x_8 are eight independent functions.

Fig. 14. Obtained R values based on model type and case number for the training sections.

Fig. 15. Obtained AAPRE values based on model type and case number for the training sections.

Fig. 16. Obtained RMSE values based on model type and case number for the training sections.
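Both traditional models reduce to single expressions, so they can be written as the short functions below; the constants K, a, and a1..a8 are the field-dependent values reported in Tables 8 and 9, and the argument names are illustrative.

```python
# The two traditional ROP models as plain functions: Bingham (Eq. (23)) and the
# Bourgoyne-Young exponential form (Eq. (24)).
import numpy as np

def bingham_rop(W, d_b, N, K, a):
    """Eq. (23): R = K * (W / d_b)**a * N, with W in klb, d_b in inches, N in rpm."""
    return K * (W / d_b) ** a * N

def bym_rop(a, x):
    """Eq. (24): dF/dt = exp(a1 + sum_{j=2..8} a_j * x_j).
    `a` holds the 8 constants a1..a8 and `x` the values of the functions x2..x8."""
    a = np.asarray(a, dtype=float)
    x = np.asarray(x, dtype=float)
    return np.exp(a[0] + np.dot(a[1:], x))
```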
Fig. 17. Obtained R values based on model type and case number for the test sections.

Fig. 18. Obtained AAPRE values based on model type and case number for the test sections.

4. Model development and ROP estimation

After dividing the data into training and test sets (see Section 2.2), the five data cases were generated to develop the ML models and to adjust the constants of the conventional models. Fig. 13 shows the flowchart of the modeling process used to estimate ROP with the studied methods. In the following, the development of the ROP estimator models is described in more detail.

4.1. ROP estimation using machine learning (ML) methods

The MLPNN model was constructed as a feed-forward network, and the optimal numbers of neurons and layers were selected by trial and error. Different structures with a maximum of two hidden layers and 11 neurons were evaluated, and the structure that provided the minimum MSE with the smallest network dimensions was selected as the appropriate structure. Table 3 lists the specifications of the MLP neural network structure in more detail for all cases. The reason for choosing a separate structure for each case is that the number of training samples increases from case 1 to case 5, which necessitates selecting an optimal structure to reduce overfitting and to produce an efficient network with minimal complexity for a given error. After finding the optimal network structure, four meta-heuristic algorithms (PSO, ACO, DE, and GA) and a conventional algorithm (Levenberg-Marquardt BP) were used to optimize the weights and biases.
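As a point of reference for the BP-trained variant, a minimal scikit-learn sketch is shown below. Note that scikit-learn has no Levenberg-Marquardt solver, so "lbfgs" is used purely as a stand-in, and the two-hidden-layer geometry is only one of the case-specific structures in Table 3.

```python
# Minimal BP-style MLPNN baseline (illustrative only; the study trains the network
# weights with PSO/ACO/DE/GA or Levenberg-Marquardt BP, which scikit-learn lacks).
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

def build_mlpnn(hidden_layers=(6, 7)):
    """Feed-forward network with sigmoid hidden units and a linear output (Table 3)."""
    return make_pipeline(
        MinMaxScaler(),                                   # Eq. (1) normalization
        MLPRegressor(hidden_layer_sizes=hidden_layers,
                     activation="logistic",               # sigmoid hidden layers
                     solver="lbfgs",                      # stand-in for LM backpropagation
                     max_iter=200),
    )

# model = build_mlpnn().fit(X_train, y_train); rop_pred = model.predict(X_test)
```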
Fig. 19. Obtained RMSE values based on model type and case number for the test sections.

Fig. 20. Comparison of the R, AAPRE and RMSE statistical criteria for all cases.

To develop the RBFNN model, the K-means algorithm was first used to cluster the data around specific centers. Then, RBFs were applied to these clusters to map the data from the input space to the feature space; in this study, a Gaussian function was used. When population-based algorithms are applied, the approximation error is taken as the objective function, and the model parameters, including the weight values, the number of active centers, and the width of the RBFs, are optimized to minimize this objective function (Bonanno et al., 2012; Hu et al., 2014). Table 4 lists the calculated values of the RBFNN parameters obtained with the four meta-heuristic algorithms (ACO, PSO, DE, and GA) for all cases.

Choosing a sufficient number of rules is very important for the development of the ANFIS model: a high number of rules complicates the fuzzy system, while a low number of rules does not ensure reliable results. To define the fuzzy rules, input-output pairs are classified into different categories, and MFs are applied to these categories (Wang, 1997). In this study, the Sugeno-type fuzzy inference system, together with the fuzzy c-means clustering algorithm (FCM) and Gaussian-type MFs, was used to estimate ROP. The number of rules was set to nine based on a trial-and-error approach, summarized below:

Rule 1: if <X1 is A1 and X2 is B1 and … X13 is M1> then <ROP1 = a1 X1 + b1 X2 + … + m1 X13 + f1>
Rule 2: if <X1 is A2 and X2 is B2 and … X13 is M2> then <ROP2 = a2 X1 + b2 X2 + … + m2 X13 + f2>
⋮
Rule 9: if <X1 is A9 and X2 is B9 and … X13 is M9> then <ROP9 = a9 X1 + b9 X2 + … + m9 X13 + f9>

To train the ANFIS model, the 234 MF parameters in the antecedent section and the 126 coefficients in the consequence section are initially assigned and then optimized with the four meta-heuristic algorithms (ACO, PSO, DE, and GA) and a conventional algorithm (gradient descent BP). After obtaining the optimized ANFIS model, it was used to estimate ROP on the test datasets.
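To make the Sugeno structure behind these rules concrete, the sketch below evaluates the two-input, two-rule ANFIS of Fig. 9 by following Eqs. (8)-(12); all parameter values and names are placeholders, and the 13-input, nine-rule network used in this study is a direct extension of the same computation.

```python
# Didactic forward pass of the two-input, two-rule Sugeno ANFIS of Fig. 9,
# following Eqs. (8)-(12); all parameter values are placeholders.
import numpy as np

def gaussian_mf(x, c, sigma):
    """Eq. (8): Gaussian membership degree."""
    return np.exp(-0.5 * ((x - c) / sigma) ** 2)

def anfis_forward(x1, x2, mf_params, rule_params):
    """mf_params: ((cA1,sA1),(cA2,sA2),(cB1,sB1),(cB2,sB2)); rule_params: ((a,b,c),(a,b,c))."""
    (cA1, sA1), (cA2, sA2), (cB1, sB1), (cB2, sB2) = mf_params
    w1 = gaussian_mf(x1, cA1, sA1) * gaussian_mf(x2, cB1, sB1)   # Eq. (9), rule 1
    w2 = gaussian_mf(x1, cA2, sA2) * gaussian_mf(x2, cB2, sB2)   # Eq. (9), rule 2
    w1n, w2n = w1 / (w1 + w2), w2 / (w1 + w2)                    # Eq. (10)
    f1 = rule_params[0][0] * x1 + rule_params[0][1] * x2 + rule_params[0][2]
    f2 = rule_params[1][0] * x1 + rule_params[1][1] * x2 + rule_params[1][2]
    return w1n * f1 + w2n * f2                                   # Eqs. (11)-(12)
```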
Fig. 21. Box plot of the normalized errors of the models for case 1.

Fig. 22. Box plot of the normalized errors of the models for case 2.

Fig. 23. Box plot of the normalized errors of the models for case 3.

Fig. 24. Box plot of the normalized errors of the models for case 4.

The adjustment parameters of the algorithms are presented in Table 5. Appropriate values for the parameters of each algorithm are determined by considering the complexity of the problem and can be tuned with a trial-and-error approach in order to make the algorithm search process more efficient (Kramer, 2017).

The SMO algorithm was used to train the SVR model. The SMO algorithm shows a high ability to solve high-dimensional problems and can significantly reduce the training time (Yang et al., 2007; Zhou and Ma, 2013). The type of kernel has a significant impact on the accuracy of the SVR model; in this study, three widely used kernels (RBF, polynomial, and linear) were used in the SVR structure. The optimal values of the kernel hyper-parameters are listed in Table 6 for each case.

4.2. ROP estimation using traditional models

For the eight constants of the BYM model, ranges are reported in the research literature, and using these ranges helps to ensure the convergence of the algorithms and more reliable results from the model (Bourgoyne and Young, 1974). Table 7 lists the specifications of the PSO, GA, ACO, and DE algorithms in more detail for training the BYM and Bingham models. Table 8 and Table 9 show the calculated constants of the BYM and Bingham models for each case.

5. Discussion and results

In this section, the performance of the developed models is evaluated in order to select the most accurate model. The main purpose of estimating ROP with different methods was to find a method that more accurately recognizes the patterns between the operational variables and ROP.
E. Brenjkar and E. Biniaz Delijani Journal of Petroleum Science and Engineering 210 (2022) 110033

evaluating model performance (Karkevandi-Talkhooncheh et al., 2017).


According to Table 10 and Figs. 14–19, it can be concluded that all ML
models are more accurate and reliable than traditional ROP models.
Also, optimized multilayer perceptron neural network by particle swarm
optimization algorithm (PSO-MLPNN) has the highest accuracy in test
sections with average values of 17.993% for AAPRE. The correlation
between the actual and the estimated ROP of PSO-MLPNN is about 30%
higher than the best traditional ROP model (DE-BYM). Traditional
models have experimental coefficients by which the model fits on the
field data. However, since ROP data is nonlinear and traditional models
are simple in nature, the ability of these models to approximate ROP is
less accurate (Hegde et al., 2017).
According to Table 11 and Fig. 20, case 4 and 5 have lower error
rates than the previous three cases, which shows that increasing the
volume of data in training dataset reduces the effective error due to
retraining or better training of patterns between drilling variables. Also,
The results show that about 15%–40% of a formation data is effective for
approximating the ROP in the same formation at later depth. Un­
doubtedly, there is a direct relationship between the accuracy of models
and the percentage of information of each formation. In addition, the use
of data from other drilled wells in the same field, may help to improve
the performance by increasing the volume of training data, which re­
Fig. 25. Box plot of normalized errors of the models for case 5. quires further research and is beyond the scope of this research.
In the following, to visualize and compare the accuracy of the developed models, the values of the statistical criteria based on model type and case number are shown in Figs. 14–19 for the training and test sections. Also, a comparison of the R, AAPRE, and RMSE criteria for all five cases is shown in Fig. 20. The closer the RMSE and R are to 0 and 1, respectively, the higher the accuracy of the model (Basarir et al., 2014; Yılmaz and Yuksek, 2008). AAPRE is one of the most important criteria for evaluating model performance (Karkevandi-Talkhooncheh et al., 2017).

According to Table 10 and Figs. 14–19, it can be concluded that all ML models are more accurate and reliable than the traditional ROP models. Also, the multilayer perceptron neural network optimized by the particle swarm optimization algorithm (PSO-MLPNN) has the highest accuracy in the test sections, with an average AAPRE of 17.993%. The correlation between the actual and estimated ROP of PSO-MLPNN is about 30% higher than that of the best traditional ROP model (DE-BYM). Traditional models have empirical coefficients by which the model is fitted to the field data; however, since ROP data are nonlinear and the traditional models are simple in nature, the ability of these models to approximate ROP is limited (Hegde et al., 2017).

According to Table 11 and Fig. 20, cases 4 and 5 have lower error rates than the previous three cases, which shows that increasing the volume of the training dataset reduces the error due to retraining or better training of the patterns between the drilling variables. Also, the results show that about 15%–40% of a formation's data is effective for approximating the ROP in the same formation at greater depth. Undoubtedly, there is a direct relationship between the accuracy of the models and the percentage of information available from each formation. In addition, using data from other drilled wells in the same field may help to improve performance by increasing the volume of training data, but this requires further research and is beyond the scope of this study.
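To make the case construction explicit, the sketch below reproduces the depth-ordered, expanding train/test arrangement described above: section k + 1 is used as the test set and all shallower sections form the training set. The function name and the equal-length six-section split are illustrative assumptions, not the exact section boundaries used in this study.

```python
# Expanding-window evaluation over depth-ordered drilling data (illustrative).
import numpy as np

def expanding_depth_cases(X, y, n_sections=6):
    """Yield (X_train, y_train, X_test, y_test) for cases 1..n_sections-1.

    X and y are assumed to be sorted from shallow to deep; equal-length
    sections are used here purely for illustration.
    """
    bounds = np.linspace(0, len(y), n_sections + 1, dtype=int)
    for k in range(1, n_sections):                  # test sections 2..6 -> cases 1..5
        train = slice(bounds[0], bounds[k])         # all sections above the test section
        test = slice(bounds[k], bounds[k + 1])      # section k + 1 is the test set
        yield X[train], y[train], X[test], y[test]

# Example usage:
# for case, (X_tr, y_tr, X_te, y_te) in enumerate(expanding_depth_cases(X, y), start=1):
#     model.fit(X_tr, y_tr)
#     print(case, rmse(y_te, model.predict(X_te)))
```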

Fig. 25. Box plot of normalized errors of the models for case 5.

The box plot is one of the most efficient graphs for model error analysis (Hegde et al., 2017). It was proposed by Tukey (1977) and is an efficient way to illustrate five statistical measures, namely the minimum, maximum, median, first quartile, and third quartile, for different groups of data. In addition to the data distribution, this graph can reveal the existence of outliers and the symmetry of the data (Kramer and Rosenthal, 1998). The box plots of the developed models for cases 1 to 5 are shown in Figs. 21–25, respectively. Also, a comparison between the measured and estimated ROP versus depth for all developed models in the test sections is shown in Fig. 26, along with the average RMSE values.
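A minimal Matplotlib sketch of such a box plot of normalized errors is given below; the model names, the error normalization, and the styling are illustrative and do not reproduce Figs. 21–25 exactly.

```python
# Tukey-style box plot of normalized prediction errors for several models.
import numpy as np
import matplotlib.pyplot as plt

def error_boxplot(errors_by_model):
    """errors_by_model: dict mapping model name -> 1-D array of normalized errors."""
    labels = list(errors_by_model)
    data = [np.asarray(errors_by_model[name], dtype=float) for name in labels]
    fig, ax = plt.subplots(figsize=(9, 4))
    ax.boxplot(data, showfliers=True)   # median, quartiles, whiskers, and outliers
    ax.set_xticklabels(labels, rotation=45, ha="right")
    ax.set_ylabel("Normalized ROP error")
    fig.tight_layout()
    return fig, ax

# Example usage with placeholder errors for hypothetical models:
# fig, ax = error_boxplot({
#     "PSO-MLPNN": np.random.normal(0, 0.10, 300),
#     "GA-RBFNN": np.random.normal(0, 0.15, 300),
#     "DE-BYM": np.random.normal(0, 0.25, 300),
# })
```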
The use of a non-specialized and unsuitable architecture for ML models makes it impossible to achieve the highest level of performance. Therefore, one of the important factors with a significant impact on the accuracy of ML models is the proper determination of their main parameters. Table 12 summarizes the main parameters and the values optimized during the training process for the ML models developed in this study.

As shown in Table 12, the SMO-SVR model has three important hyper-parameters and a weight vector (the alpha values) that significantly impact model performance. For the MLPNN, the optimal number of hidden layers and neurons must be determined before modeling, whereas the RBFNN has only one hidden layer and the main goal is to optimize the number of active centers, the widths of the RBFs, and the weights. For ANFIS, the number of model parameters depends on the number of rules; by considering 9 rules, 234 nonlinear parameters and 126 consequent parameters are generated, which are optimized by the algorithms during the training process. Therefore, each model must be parameterized carefully and separately to achieve the best performance.
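For illustration, the snippet below configures an epsilon-SVR with an RBF kernel in scikit-learn (whose solver is an SMO-type algorithm) using the three hyper-parameters named in Table 12; the numeric values are placeholders rather than the tuned values obtained in this study.

```python
# Illustrative SMO-type epsilon-SVR configuration with an RBF kernel.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

svr_model = make_pipeline(
    StandardScaler(),            # scale the drilling input variables
    SVR(kernel="rbf",            # RBF kernel, as preferred in the results
        C=10.0,                  # box constraint
        epsilon=0.1,             # width of the epsilon-insensitive tube
        gamma=0.5),              # kernel scale
)
# svr_model.fit(X_train, y_train); rop_hat = svr_model.predict(X_test)
```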


Fig. 26. Comparison between measured and estimated ROP versus depth for all developed models in the test sections.


Table 12
Summary of the main parameters and optimizable values in the training process of the developed models in this study.

Models | Number of main parameters | Description of the main parameters | Number of weight factors or optimizable values during the training process
SVR | 3 | Kernel scale (γ), epsilon (ε) and box constraint (C) | 253–1139
RBFNN | 2 | Number of active centers (β) and width of RBFs (σ) | 78–363
ANFIS | 234 | Centers (C) and widths (σ) of MFs | 126
MLPNN | 2 | Number of hidden layers and optimal number of neurons | 99–135
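As a concrete illustration of how weight factors such as those listed in Table 12 can be tuned by a meta-heuristic instead of backpropagation, the following is a simplified, self-contained sketch of the PSO-MLPNN idea: a basic particle swarm searches the flattened weight vector of a one-hidden-layer MLP to minimize the training RMSE. The swarm settings, layer size, and function names are illustrative assumptions and not the implementation used in this study.

```python
# Simplified PSO training of a one-hidden-layer MLP regressor (illustrative).
import numpy as np

rng = np.random.default_rng(0)

def mlp_predict(w, X, n_hidden):
    """Evaluate the MLP for a flattened weight vector w."""
    n_in = X.shape[1]
    i = 0
    W1 = w[i:i + n_in * n_hidden].reshape(n_in, n_hidden); i += n_in * n_hidden
    b1 = w[i:i + n_hidden]; i += n_hidden
    W2 = w[i:i + n_hidden]; i += n_hidden
    b2 = w[i]
    h = np.tanh(X @ W1 + b1)                 # hidden layer with tanh activation
    return h @ W2 + b2                       # linear output neuron

def rmse_loss(w, X, y, n_hidden):
    return np.sqrt(np.mean((y - mlp_predict(w, X, n_hidden)) ** 2))

def pso_train(X, y, n_hidden=8, n_particles=30, n_iter=200,
              w_inertia=0.7, c1=1.5, c2=1.5):
    """Return the weight vector with the lowest training RMSE found by PSO."""
    dim = X.shape[1] * n_hidden + n_hidden + n_hidden + 1
    pos = rng.normal(0, 0.5, (n_particles, dim))        # particle positions
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()
    pbest_val = np.array([rmse_loss(p, X, y, n_hidden) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = (w_inertia * vel
               + c1 * r1 * (pbest - pos)
               + c2 * r2 * (gbest - pos))
        pos += vel
        vals = np.array([rmse_loss(p, X, y, n_hidden) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest

# Usage: weights = pso_train(X_train, y_train)
#        rop_hat = mlp_predict(weights, X_test, n_hidden=8)
```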
6. Conclusion

The importance of estimating the drilling rate of penetration (ROP) is so great that many methods have been developed to predict it as accurately as possible. In this study, after reducing noisy data with a Savitzky-Golay (SG) filter, 13 input variables with 1912 samples were divided into 6 sections. Then, sections 2 to 6 were each used in turn as the test dataset, and all the sections before them were used as the training dataset. Next, ML models including MLPNN, ANFIS, and RBFNN were combined with four meta-heuristic algorithms (ACO, PSO, DE, and GA) to develop ROP prediction models. Besides, the ANFIS and MLPNN models were combined with the BP algorithm, and the SMO-SVR model with three different kernels was also used to estimate ROP. The results of this study are as follows:

• Selecting the two parameters of polynomial order and window size has a significant impact on the SG filter performance. According to the results, increasing the window size beyond 11 caused an overlay of the useful data, which in turn increased the SVR model error.
• ML methods can overcome the complex and nonlinear patterns in the data and demonstrate higher accuracy than traditional models in ROP estimation.
• By comparing the AAPRE of all models, it can be concluded that the PSO-MLPNN model showed the highest accuracy in ROP estimation. Also, the RBFNN and ANFIS models provided their best estimation accuracy in combination with the GA algorithm. In addition, the SMO-SVR model with the RBF kernel showed better performance than the linear and polynomial kernels.
• The BP algorithm, as a conventional algorithm in combination with MLPNN and ANFIS, showed weaker performance than the ACO, PSO, and GA algorithms, but better results than DE.
• One of the key factors in the development of ML-based models is the proper determination of their main parameters. Therefore, each model must be specialized and its parameters determined separately.
• The variety of drilled formations and the low volume of the training dataset lead to an increase in approximation error in cases 1 to 3. However, increasing the volume of the training dataset effectively reduces the error rate due to retraining or better training of the patterns between drilling variables.
Authorship contributions

Ehsan Brenjkar: Conceptualization, Methodology, Software, Formal analysis, Investigation, Resources, Data curation, Writing - Original draft preparation, Writing - Reviewing and Editing, Visualization. Ebrahim Biniaz Delijani: Supervision, Conceptualization, Methodology, Validation, Writing - Reviewing and Editing, Project administration.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors would like to thank all whose comments improved this paper, especially the six anonymous reviewers and the editor for their deep and constructive comments on the earlier versions of this paper.

References

Abbas, A.K., Rushdi, S., Alsaba, M., 2018. Modeling rate of penetration for deviated wells using artificial neural network. In: Day 2 Tue, November 13, 2018. SPE. https://doi.org/10.2118/192875-MS.
Abdulmalek, A.S., Salaheldin, E., Abdulazeez, A., Mohammed, M., Abdulwahab, Z.A., Mohamed, I.M., 2018. Prediction of rate of penetration of deep and tight formation using support vector machine. In: All Days. SPE. https://doi.org/10.2118/192316-MS.
Afshar, A., Madadgar, S., 2008. Ant colony optimization for continuous domains: application to reservoir operation problems. In: 2008 Eighth International Conference on Hybrid Intelligent Systems. IEEE, pp. 13–18. https://doi.org/10.1109/HIS.2008.121.
Aggarwal, C.C., 2018. Radial basis function networks. In: Neural Networks and Deep Learning. Springer International Publishing, Cham, pp. 217–233. https://doi.org/10.1007/978-3-319-94463-0_5.
Ahmed, O.S., Adeniran, A.A., Samsuri, A., 2019. Computational intelligence based prediction of drilling rate of penetration: a comparative study. J. Petrol. Sci. Eng. 172, 1–12. https://doi.org/10.1016/j.petrol.2018.09.027.
Akande, K.O., Owolabi, T.O., Olatunji, S.O., AbdulRaheem, A., 2017. A hybrid particle swarm optimization and support vector regression model for modelling permeability prediction of hydrocarbon reservoir. J. Petrol. Sci. Eng. 150, 43–53. https://doi.org/10.1016/j.petrol.2016.11.033.
Al-AbdulJabbar, A., Elkatatny, S., Mahmoud, M., Abdulraheem, A., 2018. Predicting rate of penetration using artificial intelligence techniques. In: All Days. SPE, pp. 23–26. https://doi.org/10.2118/192343-MS.
AL-Musaylh, M.S., Deo, R.C., Li, Y., Adamowski, J.F., 2018. Two-phase particle swarm optimized-support vector regression hybrid model integrated with improved empirical mode decomposition with adaptive noise for multiple-horizon electricity demand forecasting. Appl. Energy 217, 422–439. https://doi.org/10.1016/j.apenergy.2018.02.140.
AlArfaj, I., Khoukhi, A., Eren, T., 2012. Application of advanced computational intelligence to rate of penetration prediction. In: 2012 Sixth UKSim/AMSS European Symposium on Computer Modeling and Simulation. IEEE, pp. 33–38. https://doi.org/10.1109/EMS.2012.79.
Amer, M.M., Dahab, A.S., El-Sayed, A.-A.H., 2017. An ROP predictive model in nile delta area using artificial neural networks. In: Day 2 Tue, April 25, 2017. SPE. https://doi.org/10.2118/187969-MS.
Anemangely, M., Ramezanzadeh, A., Tokhmechi, B., 2017. Determination of constant coefficients of Bourgoyne and Young drilling rate model using a novel evolutionary algorithm. J. Min. Environ. 8, 693–702. https://doi.org/10.22044/JME.2017.842.
Anemangely, M., Ramezanzadeh, A., Tokhmechi, B., Molaghab, A., Mohammadian, A., 2018. Drilling rate prediction from petrophysical logs and mud logging data using an optimized multilayer perceptron neural network. J. Geophys. Eng. 15, 1146–1159. https://doi.org/10.1088/1742-2140/aaac5d.
Ansari, H.R., Sarbaz Hosseini, M.J., Amirpour, M., 2017. Drilling rate of penetration prediction through committee support vector regression based on imperialist competitive algorithm. Carbonates Evaporites 32, 205–213. https://doi.org/10.1007/s13146-016-0291-8.
Ashrafi, S.B., Anemangely, M., Sabah, M., Ameri, M.J., 2019. Application of hybrid artificial neural networks for predicting rate of penetration (ROP): a case study from Marun oil field. J. Petrol. Sci. Eng. 175, 604–623. https://doi.org/10.1016/j.petrol.2018.12.013.
Ayoub, M., Shien, G., Diab, D., Ahmed, Q., 2017. Modeling of drilling rate of penetration using adaptive neuro-fuzzy inference system. Int. J. Appl. Eng. Res. 12, 12880–12891.
Azad, A., Manoochehri, M., Kashi, H., Farzin, S., Karami, H., Nourani, V., Shiri, J., 2019. Comparative evaluation of intelligent algorithms to improve adaptive neuro-fuzzy inference system performance in precipitation modelling. J. Hydrol. 571, 214–224. https://doi.org/10.1016/j.jhydrol.2019.01.062.
Bahari, M.H., Bahari, A., Moharrami, F.N., Sistani, M.B.N., 2008. Determining Bourgoyne and Young model coefficients using genetic algorithm to predict drilling rate. J. Appl. Sci. 8, 3050–3054. https://doi.org/10.3923/jas.2008.3050.3054.
Bamdad, K., Cholette, M.E., Guan, L., Bell, J., 2017. Ant colony algorithm for building energy optimisation problems and comparison with benchmark algorithms. Energy Build. 154, 404–414. https://doi.org/10.1016/j.enbuild.2017.08.071.
Barbosa, L.F.F.M., Nascimento, A., Mathias, M.H., de Carvalho, J.A., 2019. Machine learning methods applied to drilling rate of penetration prediction and optimization - a review. J. Petrol. Sci. Eng. 183, 106332. https://doi.org/10.1016/j.petrol.2019.106332.
Basarir, H., Tutluoglu, L., Karpuz, C., 2014. Penetration rate prediction for diamond bit drilling by adaptive neuro-fuzzy inference system and multiple regressions. Eng. Geol. 173, 1–9. https://doi.org/10.1016/j.enggeo.2014.02.006.
Bingham, M.G., 1965. A New Approach to Interpreting Rock Drillability. Pet. Publ. Co.
Bodaghi, A., Ansari, H.R., Gholami, M., 2015. Optimized support vector regression for drilling rate of penetration estimation. Open Geosci. 7, 870–879. https://doi.org/10.1515/geo-2015-0054.
Bonanno, F., Capizzi, G., Graditi, G., Napoli, C., Tina, G.M., 2012. A radial basis function neural network based approach for the electrical characteristics estimation of a photovoltaic module. Appl. Energy 97, 956–961. https://doi.org/10.1016/j.apenergy.2011.12.085.
Bourgoyne, A.T., Young, F.S., 1974. A multiple regression approach to optimal drilling and abnormal pressure detection. Soc. Petrol. Eng. J. 14, 371–384. https://doi.org/10.2118/4238-PA.


Brenjkar, E., Biniaz Delijani, E., Karroubi, K., 2021. Prediction of penetration rate in Kennedy, J., Eberhart, R., 1995. Particle swarm optimization. In: Proceedings of ICNN’95
drilling operations: a comparative study of three neural network forecast methods. - International Conference on Neural Networks. IEEE, pp. 1942–1948. https://doi.
J. Pet. Explor. Prod. 11, 805–818. https://doi.org/10.1007/s13202-020-01066-1. org/10.1109/ICNN.1995.488968.
Carbonell, J.G., Michalski, R.S., Mitchell, T.M., 1983. An overview of machine learning. Kim, K., 2003. Financial time series forecasting using support vector machines.
In: Machine Learning. Elsevier, pp. 3–23. https://doi.org/10.1016/B978-0-08- Neurocomputing 55, 307–319. https://doi.org/10.1016/S0925-2312(03)00372-2.
051054-5.50005-4. Kor, K., Altun, G., 2020. Is Support Vector Regression method suitable for predicting rate
Chandra, S., Gaur, P., Pathak, D., 2020. Radial basis function neural network based of penetration? J. Petrol. Sci. Eng. 194, 107542. https://doi.org/10.1016/j.
maximum power point tracking for photovoltaic brushless DC motor connected petrol.2020.107542.
water pumping system. Comput. Electr. Eng. 86, 106730. https://doi.org/10.1016/j. Kor, K., Ertekin, S., Yamanlar, S., Altun, G., 2021. Penetration rate prediction in
compeleceng.2020.106730. heterogeneous formations: a geomechanical approach through machine learning.
Cheng, K., Lu, Z., Wei, Y., Shi, Y., Zhou, Y., 2017. Mixed kernel function support vector J. Petrol. Sci. Eng. 207, 109138. https://doi.org/10.1016/j.petrol.2021.109138.
regression for global sensitivity analysis. Mech. Syst. Signal Process. 96, 201–214. Kramer, O., 2017. Genetic Algorithm Essentials, Studies in Computational Intelligence.
https://doi.org/10.1016/j.ymssp.2017.04.014. Springer International Publishing, Cham. https://doi.org/10.1007/978-3-319-
Cortes, C., Vapnik, V., 1995. Support-vector networks. Mach. Learn. 20, 273–297. 52156-5.
https://doi.org/10.1023/A:1022627411411. Kramer, S.H., Rosenthal, R., 1998. Meta-analytic research synthesis. In: Comprehensive
Darwesh, A.K., Rasmussen, T.M., Al-Ansari, N., 2020. Controllable drilling parameter Clinical Psychology. Elsevier, pp. 351–368. https://doi.org/10.1016/B0080-4270
optimization for roller cone and polycrystalline diamond bits. J. Pet. Explor. Prod. (73)00261-3.
Technol. 10, 1657–1674. https://doi.org/10.1007/s13202-019-00823-1. Li, Y., Samuel, R., 2019. Prediction of penetration rate ahead of the bit through real-time
De Mulder, W., Molenberghs, G., Verbeke, G., 2020. An interpretation of radial basis updated machine learning models. In: Day 1 Tue, March 05, 2019. SPE. https://doi.
function networks as zero-mean Gaussian process emulators in cluster space. org/10.2118/194105-MS.
J. Comput. Appl. Math. 363, 249–255. https://doi.org/10.1016/j.cam.2019.06.011. Li, P., Nie, H., Qiu, L., Wang, R., 2017. Energy optimization of ant colony algorithm in
Dombi, J., Dineva, A., 2018. Adaptive multi-round smoothing based on the savitzky- wireless sensor network, 155014771770483 Int. J. Distributed Sens. Netw. 13.
golay filter. In: Advances in Intelligent Systems and Computing, pp. 446–454. https://doi.org/10.1177/1550147717704831.
https://doi.org/10.1007/978-3-319-62521-8_38. Liang, H., Zou, J., Li, Z., Khan, M.J., Lu, Y., 2019. Dynamic evaluation of drilling leakage
Dorigo, M., Maniezzo, V., Colorni, A., 1996. Ant system: optimization by a colony of risk based on fuzzy theory and PSO-SVR algorithm. Future Generat. Comput. Syst.
cooperating agents. IEEE Trans. Syst. Man, Cybern. Part B 26, 29–41. https://doi. 95, 454–466. https://doi.org/10.1016/j.future.2018.12.068.
org/10.1109/3477.484436. Liu, X., Kong, L., Zhang, P., Zhou, K., 2014. Permeability estimation using relaxation
Drucker, H., Burges, C.J.C., Kaufman, L., Smola, A., Vapnik, V., 1997. Support vector time spectra derived from differential evolution inversion. J. Geophys. Eng. 11
regression machines. In: Advances in Neural Information Processing Systems 9: https://doi.org/10.1088/1742-2132/11/1/015006.
Proceedings of the 1996 Conference, pp. 155–161. Liu, Y., Dang, B., Li, Y., Lin, H., Ma, H., 2016. Applications of savitzky-golay filter for
El-Sebakhy, E., Sheltami, T., Al-Bokhitan, S., Shaaban, Y., Raharja, P., Khaeruzzaman, Y., seismic random noise reduction. Acta Geophys. 64, 101–124. https://doi.org/
2007. Support vector machines framework for predicting the PVT properties of 10.1515/acgeo-2015-0062.
crude-oil systems. In: Proceedings of SPE Middle East Oil and Gas Show and Ma, H., Wang, Y., 2010. Formation drillability prediction based on PSO-SVM. In: IEEE
Conference. Society of Petroleum Engineers. https://doi.org/10.2523/105698-MS. 10th INTERNATIONAL CONFERENCE on SIGNAL PROCESSING PROCEEDINGS.
Elkatatny, S., 2019. Development of a new rate of penetration model using self-adaptive IEEE, pp. 2497–2500. https://doi.org/10.1109/ICOSP.2010.5656700.
differential evolution-artificial neural network. Arab. J. Geosci. 12, 19. https://doi. Momeni, M., Hosseini, S.J., Ridha, S., Laruccia, M.B., Liu, X., 2018. An optimum drill bit
org/10.1007/s12517-018-4185-z. selection technique using artificial neural networks and genetic algorithms to
Enayatollahi, H., Fussey, P., Kha Nguyen, B., 2020. Modelling evaporator in organic increase the rate of penetration. J. Eng. Sci. Technol. 13, 361–372.
Rankine cycle using hybrid GD-LSE ANFIS and PSO ANFIS techniques. Therm. Sci. Moody, J., Darken, C.J., 1989. Fast learning in networks of locally-tuned processing
Eng. Prog. 19, 100570. https://doi.org/10.1016/j.tsep.2020.100570. units. Neural Comput. 1, 281–294. https://doi.org/10.1162/neco.1989.1.2.281.
Eren, T., Ozbayoglu, M.E., 2010. Real time optimization of drilling parameters during Mosavi, M.R., Khishe, M., Hatam Khani, Y., Shabani, M., 2017. Training radial basis
drilling operations. In: All Days. SPE. https://doi.org/10.2118/129126-MS. function neural network using stochastic fractal search algorithm to classify sonar
Feng, R., Grana, D., Balling, N., 2021. Imputation of missing well log data by random dataset. Iran. J. Electr. Electron. Eng. 13, 100–111. https://doi.org/10.22068/
forest and its uncertainty analysis. Comput. Geosci. 152, 104763. https://doi.org/ IJEEE.13.1.10.
10.1016/j.cageo.2021.104763. Naresh, C., Bose, P.S.C., Rao, C.S.P., 2020. ANFIS based predictive model for wire edm
Galle, E.M., Woods, H.B., 1963. Best constant weight and rotary speed for rotary rock responses involving material removal rate and surface roughness of Nitinol alloy.
bits. Drill. Prod. Pract. Mater. Today Proc. 33, 93–101. https://doi.org/10.1016/j.matpr.2020.03.216.
Gan, C., Cao, W.-H., Wu, M., Chen, X., Hu, Y.-L., Liu, K.-Z., Wang, F.-W., Zhang, S.-B., Nascimento, A., Tamas Kutas, D., Elmgerbi, A., Thonhauser, G., Hugo Mathias, M., 2015.
2019. Prediction of drilling rate of penetration (ROP) using hybrid support vector Mathematical modeling applied to drilling engineering: an application of Bourgoyne
regression: a case study on the Shennongjia area, Central China. J. Petrol. Sci. Eng. and Young ROP model to a presalt case study. Math. Probl Eng. 2015, 1–9. https://
181, 106200. https://doi.org/10.1016/j.petrol.2019.106200. doi.org/10.1155/2015/631290.
Gouda, A., Gomaa, S., Attia, A., Emara, R., Desouky, S.M., El-hoshoudy, A.N., 2021. Nettleton, D.F., Orriols-Puig, A., Fornells, A., 2010. A study of the effect of different types
Development of an artificial neural network model for predicting the dew point of noise on the precision of supervised learning techniques. Artif. Intell. Rev. 33,
pressure of retrograde gas condensate. J. Petrol. Sci. Eng. https://doi.org/10.1016/j. 275–306. https://doi.org/10.1007/s10462-010-9156-z.
petrol.2021.109284, 109284. Phirani, J., Roy, S., Pant, H.J., 2018. Predicting stagnant pore volume in porous media
Hegde, C., Daigle, H., Millwater, H., Gray, K., 2017. Analysis of rate of penetration (ROP) using temporal moments of tracer breakthrough curves. J. Petrol. Sci. Eng. 165,
prediction in drilling using physics-based and data-driven models. J. Petrol. Sci. Eng. 640–646. https://doi.org/10.1016/j.petrol.2018.02.066.
159, 295–306. https://doi.org/10.1016/j.petrol.2017.09.020. Quinlan, J.R., 1986. The effect of noise on concept learning. Mach. Learn. An Artif. Intell.
Holland, J.H., 1992. Adaptation in Natural and Artificial Systems, Adaptation in Natural Approach 2, 149–166.
and Artificial Systems. The MIT Press. https://doi.org/10.7551/mitpress/ Rezaei, F., Jafari, S., Hemmati-Sarapardeh, A., Mohammadi, A.H., 2022. Modeling of gas
1090.001.0001. viscosity at high pressure-high temperature conditions: integrating radial basis
Hong, W.-C., 2009. Chaotic particle swarm optimization algorithm in a support vector function neural network with evolutionary algorithms. J. Petrol. Sci. Eng. 208,
regression electric load forecasting model. Energy Convers. Manag. 50, 105–117. 109328. https://doi.org/10.1016/j.petrol.2021.109328.
https://doi.org/10.1016/j.enconman.2008.08.031. Sabah, M., Talebkeikhah, M., Wood, D.A., Khosravanian, R., Anemangely, M.,
Hordri, N.F., Yuhaniz, S.S., Shamsuddin, S.M., Ali, A., 2017. Hybrid biogeography based Younesi, A., 2019. A machine learning approach to predict drilling rate using
optimization—multilayer perceptron for application in intelligent medical diagnosis. petrophysical and mud logging data. Earth Sci. India 12, 319–339. https://doi.org/
Adv. Sci. Lett. 23, 5304–5308. https://doi.org/10.1166/asl.2017.7364. 10.1007/s12145-019-00381-4.
Hu, Z., Zhang, Y., Yao, L., 2014. Radial basis function neural network with particle Savitzky, A., Golay, M.J.E., 1964. Smoothing and differentiation of data by simplified
swarm optimization algorithms for regional logistics demand prediction. Discrete least squares procedures. Anal. Chem. 36, 1627–1639.
Dynam Nat. Soc. 1–13. https://doi.org/10.1155/2014/414058, 2014. Jang-Ho, Seo, Chang-Hwan, Im, Kwak, Sang-Yeop, Lee, Cheol-Gyun, Jung, Hyun-Kyo,
Jacinto, C.M.C., Freitas Filho, P.J., Nassar, S.M., Roisenberg, M., Rodrigues, D.G., 2008. An improved particle swarm optimization algorithm mimicking territorial
Lima, M.D.C., 2013. Optimization models and prediction of drilling rate (ROP) for dispute between groups for multimodal function optimization problems. IEEE Trans.
the Brazilian pre-salt layer. Chem. Eng. Trans. 33, 823–828. https://doi.org/ Magn. 44, 1046–1049. https://doi.org/10.1109/TMAG.2007.914855.
10.3303/CET1333138. Shi, X., Liu, G., Gong, X., Zhang, J., Wang, J., Zhang, H., 2016. An efficient approach for
Jang, J.-S.R., 1993. ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans. real-time prediction of rate of penetration in offshore drilling. Math. Probl Eng.
Syst. Man. Cybern. 23, 665–685. https://doi.org/10.1109/21.256541. 2016, 1–13. https://doi.org/10.1155/2016/3575380.
Kakouei, A., Masihi, M., Sola, B.S., Biniaz, E., 2014. Lithological facies identification in Slema, S., Errachdi, A., Benrejeb, M., 2018. A radial basis function neural network model
Iranian largest gas field: a comparative study of neural network methods. J. Geol. reference adaptive controller for nonlinear systems. In: 2018 15th International
Soc. India 84, 326–334. https://doi.org/10.1007/s12594-014-0136-9. Multi-Conference on Systems, Signals & Devices (SSD). IEEE, pp. 958–964. https://
Karaboga, D., Kaya, E., 2019. Adaptive network based fuzzy inference system (ANFIS) doi.org/10.1109/SSD.2018.8570538.
training approaches: a comprehensive survey. Artif. Intell. Rev. 52, 2263–2293. Soares, C., Gray, K., 2019. Real-time predictive capabilities of analytical and machine
https://doi.org/10.1007/s10462-017-9610-2. learning rate of penetration (ROP) models. J. Petrol. Sci. Eng. 172, 934–959. https://
Karkevandi-Talkhooncheh, A., Hajirezaie, S., Hemmati-Sarapardeh, A., Husein, M.M., doi.org/10.1016/j.petrol.2018.08.083.
Karan, K., Sharifi, M., 2017. Application of adaptive neuro fuzzy interface system Socha, K., Dorigo, M., 2008. Ant colony optimization for continuous domains. Eur. J.
optimized with evolutionary algorithms for modeling CO 2 -crude oil minimum Oper. Res. 185, 1155–1173. https://doi.org/10.1016/j.ejor.2006.06.046.
miscibility pressure. Fuel 205, 34–45. https://doi.org/10.1016/j.fuel.2017.05.026.


Soofastaei, A., Aminossadati, S.M., Arefi, M.M., Kizil, M.S., 2016. Development of a Yang, J.-F., Zhai, Y.-J., Xu, D.-P., Han, P., 2007. SMO algorithm applied in time series
multi-layer perceptron artificial neural network model to determine haul trucks model building and forecast. In: 2007 International Conference on Machine Learning
energy consumption. Int. J. Min. Sci. Technol. 26, 285–293. https://doi.org/ and Cybernetics. IEEE, pp. 2395–2400. https://doi.org/10.1109/
10.1016/j.ijmst.2015.12.015. ICMLC.2007.4370546.
Soofi, A.S., Cao, L., 2002. Modelling and Forecasting Financial Data, Studies in Yılmaz, I., Yuksek, A.G., 2008. An example of artificial neural network (ANN) application
Computational Finance. Springer US, Boston, MA. https://doi.org/10.1007/978-1- for indirect estimation of rock parameters. Rock Mech. Rock Eng. 41, 781–795.
4615-0931-8. https://doi.org/10.1007/s00603-007-0138-7.
Storn, R., Price, K., 1997. Differential evolution - a simple and efficient heuristic for Zadeh, L.A., 1975. The concept of a linguistic variable and its application to approximate
global optimization over continuous spaces. J. Global Optim. 11, 341–359. https:// reasoning—II. Inf. Sci. 8, 301–357.
doi.org/10.1023/A:1008202821328. Zhao, Y., Noorbakhsh, A., Koopialipoor, M., Azizi, A., Tahir, M.M., 2020. A new
Tukey, J.W., 1977. Exploratory Data Analysis. Addison-Wesley Pub. Co. methodology for optimization and prediction of rate of penetration during drilling
Tunkiel, A.T., Sui, D., Wiktorski, T., 2021. Reference dataset for rate of penetration operations. Eng. Comput. 36, 587–595. https://doi.org/10.1007/s00366-019-
benchmarking. J. Petrol. Sci. Eng. 196, 108069. https://doi.org/10.1016/j. 00715-2.
petrol.2020.108069. Zhao, D., Liu, L., Yu, F., Heidari, A.A., Wang, M., Liang, G., Muhammad, K., Chen, H.,
Ulker, E., Sorgun, M., 2016. Comparison of computational intelligence models for 2021. Chaotic random spare ant colony optimization for multi-threshold image
cuttings transport in horizontal and deviated wells. J. Petrol. Sci. Eng. 146, 832–837. segmentation of 2D Kapur entropy. Knowl. Base Syst. 216, 106510. https://doi.org/
https://doi.org/10.1016/j.petrol.2016.07.022. 10.1016/j.knosys.2020.106510.
Wang, L.-X., 1997. A Course in Fuzzy Systems and Control. Prentice Hall PTR Upper, Zhou, Y., Chen, X., 2018. Prediction of ROP and MPV based on support vector regression
Saddle River, NJ. method. In: 2018 37th Chinese Control Conference (CCC). IEEE, pp. 1839–1843.
Wang, R.Y., Storey, V.C., Firth, C.P., 1995. A framework for analysis of data quality https://doi.org/10.23919/ChiCC.2018.8484136.
research. IEEE Trans. Knowl. Data Eng. 7, 623–640. https://doi.org/10.1109/ Zhou, X., Ma, Y., 2013. A study on SMO algorithm for solving ε-SVR with non-PSD
69.404034. kernels. Commun. Stat. Simulat. Comput. 42, 2175–2196. https://doi.org/10.1080/
Wang, S., Dong, X., Sun, R., 2010. Predicting saturates of sour vacuum gas oil using 03610918.2012.695843.
artificial neural networks and genetic algorithms. Expert Syst. Appl. 37, 4768–4771. Zhu, X., Wu, X., Chen, Q., 2003. Eliminating class noise in large datasets. In: Proceedings,
https://doi.org/10.1016/j.eswa.2009.11.073. Twentieth International Conference on Machine Learning, pp. 920–927.
Warsito, B., Santoso, R., Suparti Yasin, H., 2018. Cascade forward neural network for
time series prediction. J. Phys. Conf. Ser. 1025 https://doi.org/10.1088/1742-6596/
1025/1/012097, 012097.
