Farsi 2021

Natural Resources Research (© 2021)
https://doi.org/10.1007/s11053-021-09852-2
Original Paper
Predicting Formation Pore-Pressure from Well-Log Data with Hybrid Machine-Learning Optimization Algorithms
Mohammad Farsi,1 Nima Mohamadian,2 Hamzeh Ghorbani,3 David A. Wood,4,8 Shadfar Davoodi,5
Jamshid Moghadasi,6 and Mehdi Ahmadi Alvar7
Received 23 September 2020; accepted 28 February 2021
Accurate prediction of pore-pressures in the subsurface is paramount for successful planning and
drilling of oil and gas wellbores. It saves cost and time and helps to avoid drilling problems. As it is
expensive and time-consuming to measure pore-pressure directly in wellbores, it is useful to be able to
predict it from various petrophysical input variables on a supervised learning basis calibrated to a
benchmark wellbore. This study developed and compared three hybrid machine-learning optimization
models applied to a diverse suite of 9 petrophysical input variables to predict pore-pressure
across a 273-m-thick, predominantly carbonate, reservoir sequence in the giant Marun oil field (Iran)
using 1972 data records. The analysis identified that the multilayer extreme learning machine model
hybridized with particle swarm optimization (MELM–PSO), applied to seven input variables chosen by
feature selection, provided the most accurate pore-pressure predictions for the full dataset
(RMSE = 11.551 psi (1 psi = 6.8947590868 kPa) for well MN#281). The Savitzky–Golay (SG) filter
was applied to pre-process the data, and the input variables were ranked using the wrapping method.
The MELM–PSO model outperformed the pore-pressure prediction accuracy achieved by commonly used
empirical formulas involving sonic or resistivity log data or calculated pore compressibility.
To further verify and generalize the applicability of the MELM–PSO model, it was applied to
two other Marun oil field wells (MN#297 and MN#378) achieving RMSE prediction accuracy of
10.031 psi and 10.150 psi, respectively. These results confirmed that the trained model can be reliably
applied to multiple locations across the Marun oil field for predicting pore-pressure.
KEY WORDS: Pore-pressure prediction, Petrophysical well-log data, Hybrid machine-learning
optimization models, Feature selection, Empirical model comparisons, Multilayer extreme learning
machine, Particle swarm optimization.
1 Department of Petroleum Engineering, Faculty of Petroleum and Chemical Engineering, Science and
  Research Branch, Islamic Azad University, Tehran, Iran.
2 Young Researchers and Elite Club, Omidiyeh Branch, Islamic Azad University, Omidiyeh, Iran.
3 Young Researchers and Elite Club, Ahvaz Branch, Islamic Azad University, Ahvaz, Iran.
4 DWA Energy Limited, Lincoln LN5 9JP, UK.
5 School of Earth Sciences and Engineering, Tomsk Polytechnic University, Lenin Avenue, Tomsk, Russia.
6 Petroleum Engineering Department, Petroleum Industry University, Ahvaz, Iran.
7 Faculty of Engineering, Department of Computer Engineering, Shahid Chamran University, Ahwaz, Iran.
8 To whom correspondence should be addressed; e-mail: dw@dwasolutions.com
© 2021 International Association for Mathematical Geosciences

M. Farsi et al.
INTRODUCTION

Understanding the detailed variations in subsurface formation pore-pressure is a key factor in
designing accurate drilling plans for the implementation and completion of oil and gas wellbores. The
formulation of drilling plans, subsurface geo-mechanical models, enhanced oil and gas recovery
programs and reservoir development strategies all depend largely on accurate knowledge of formation
pore-pressures. Such knowledge substantially improves the chances of conducting successful and
efficient drilling operations, avoiding hazards and minimizing drilling time and costs.

Stratified layers of sedimentary rocks deposited over geological time scales of millions of years are
compressed progressively over time as they become buried. Gradually, the build-up of the sedimentary
overburden pressurizes the pore fluids of the buried formations. The presence of compacted and
non-permeable layers capping porous zones constrains the pressurized formation fluids from escaping from
those potential reservoirs. Hydrostatic pressure gradually increases in sedimentary formations with
burial depth governed by a functional relationship with overburden pressure (Polito et al. 2008; Shi and
Wang 1986). However, over-pressured systems can evolve in tightly sealed porous zones. A key challenge
in oil and gas drilling is anticipating and mitigating pore-pressure variations in the sections being
drilled, above and through the potential reservoir zones (Ghorbani et al. 2019; Mohamadian et al.
2019, 2021; Rashidi et al. 2020, 2021), some of which may not be in hydrostatic communication with
formations above or below them (Liu 2017).

The availability of subsurface pore-pressure maps, identifying anomalous-pressure regions, both
over-pressured and under-pressured, substantially reduces drilling risks and surprises. It facilitates
drilling plans and pressure management that are more reliable during drilling operations (Rehm et al.
2013). Hazards, such as well blowouts, pressure kicks, circulation losses and fluid influx, may all be
more easily avoided when armed with prior knowledge of the subsurface pressure regimes (Osborne and
Swarbrick 1997).

In recent years, there have been several attempts to use machine-learning algorithms to outperform
empirical equations in the prediction of subsurface pore-pressure profiles. Mustafa et al. (2012)
developed a radial basis function neural network (RBFNN) model to predict soil pore-water pressure
responses to rainfall. Their results showed that RBFNN models can be used to map meaningfully the
nonlinear and complex behavior of pores in subsurface formations. Keshavarzi and Jahanbakhsh (2013)
developed a back-propagation artificial neural network (BPANN) to predict the pore-pressure gradient
in the Asmari (Oligo–Miocene) and Bangestan (Cretaceous) reservoirs of a large oil field in
southwestern Iran. Results demonstrated that their BPANN model predicted the pore-pressure gradient
of that field with higher accuracy than the empirical Eaton-method models.

Ahmed et al. (2019) applied an ANN model using borehole variables and well-logs recorded while
drilling to predict pore-pressure. The well they studied was drilled onshore with data from sections
drilled at two different hole sizes: 8 3/8 and 5 7/8 inches (1 inch = 0.0254 m). The drilling and
well-log variables they evaluated included weight on bit (WOB), rotational speed (RPM), rate of
penetration (ROP), mud weight (MW), bulk density (RHOB), porosity (φ) and compressional sonic travel
time (DT). Their results identified that feature selections among these variables are important in
determining the accuracy of pore-pressure prediction. Applying an optimized ANN model with careful
feature selection, they were able to estimate pore-pressure for that well with high accuracy
(correlation coefficient = 0.998; absolute percentage error = 0.17%). Ahmed, Elkatatny, et al. (2019),
Ahmed, Mahmoud, et al. (2019) applied a support vector machine (SVM) algorithm to predict
pore-pressure from petrophysical data collected from a lateral branch of an offshore wellbore. The
wellbore penetrated a wide range of lithologies including five interbedded shales and sandstones
overlain by a carbonate layer. They predicted pore-pressure and fracture pressure for that well using
an SVM model achieving a coefficient of determination (R²) of about 0.995.

Hutomo et al. (2019) and Andrian et al. (2020) predicted pore-pressure values and mapped their
distribution using 3D seismic data and a neural network method, achieving prediction accuracies of
up to 90%. Yu et al. (2020) developed four machine-learning methods for pore-pressure prediction
involving a multilayer perceptron neural network, support vector machine, random forest and gradient
boosting methods applied to well-log data from a set of offshore exploration wells. Their results
demonstrated that their random forest model outperformed the other algorithms in terms of
goodness-of-fit, generalizability and prediction accuracy.

In this paper, three hybrid algorithms, namely MLP–PSO, LSSVM–PSO and MELM–PSO, were
developed and compared for predicting pore-pressure from well-log data. The dataset evaluated
includes the available information comprising effective stress (δeff), pore compressibility (Cp),
corrected gamma ray (CGR; adjusted for uranium concentrations), uncorrected spectral gamma ray
(SGR), potassium (POTA) from the SGR tool, thorium (THOR) from the SGR tool, uranium
(URAN) from the SGR tool, the photoelectric absorption factor (PEF), bulk formation density
(RHOB), compressional sonic transit time (DT), neutron porosity (NPHI) and deep resistivity (ILD).
The best-performing algorithm was identified for accurate prediction of pore-pressure using optimized
feature selection from the well-log curves.

PORE-PRESSURE MODELS

Calculating subsurface pore-pressure using Eaton's method (Eaton 1975) by solving the equations
of Terzaghi et al. (1996) as a function of overburden pressure and matrix stress is widely used.
Equation (1) expresses the fundamental relationship between overburden pressure and pore-pressure,
thus:

$$S = \sigma + P_p, \qquad (1)$$

where $S$ is overburden pressure; $\sigma$ is matrix stress; and $P_p$ is pore-pressure. As a
benchmark, a "normal" compaction curve is established using Eq. 1 for fine-grained sediments for which
overburden stress and pore-pressure increase with burial depth as "normal" compaction progresses. Such
compaction curves vs. depth provide simplistic effective stress and/or pore-pressure relationships
that tend to vary among sedimentary basins due to distinctive lithologies and prevailing subsurface
stress regimes (Swarbrick 2001). It is useful to make a comparison between the "observed" and "normal"
compaction profiles to identify over-pressured and under-pressured zones (Swarbrick 2001). Eaton's
method provides well-used empirical equations for deriving pore-pressure from basic well-log data. For
example, Eqs. 2 and 3 express Eaton's formulas for calculating pore-pressure from resistivity and
sonic well-logs, respectively.

$$P_p = \delta_v - \left(\delta_v - P_{hyd}\right)\left(\frac{R_{log}}{R_n}\right)^{n}, \qquad (2)$$

where $P_p$ is pore-pressure, $\delta_v$ is total vertical stress, $P_{hyd}$ is normal or hydrostatic
pressure, $R_{log}$ is the observed resistivity log value, $R_n$ is the resistivity log value
associated with a normal pore-pressure profile, and $n$ is an empirical constant, for which a value of
1.2 is commonly used (Yoshida et al. 1996). Eaton's (1975) empirical formula for predicting
pore-pressure from sonic well-log transit time values is expressed as (Yoshida et al. 1996):

$$P_p = \delta_v - \left(\delta_v - P_{hyd}\right)\left(\frac{DT_{log}}{DT_n}\right)^{n}, \qquad (3)$$

where $P_p$ is pore-pressure, $\delta_v$ is total vertical stress, $P_{hyd}$ is normal or hydrostatic
pressure, $DT_{log}$ is the sonic log transit time value, $DT_n$ is the sonic log value when
pore-pressure is normal, and $n$ is an empirical constant, for which a value of 3.0 is commonly used.

Bowers (1995) proposed a model for determining pore-pressure based on effective stress, the main
purpose of which was to calculate effective velocity stresses and use them to calculate pore-pressure.
That method considered compaction disequilibrium and unloading due to fluid expansion as the main
mechanisms involved in generating overpressure. Azadpour et al. (2015) predicted pore-pressure in an
Iranian gas field based on three empirical methods: Eaton (Eq. 2) (Eaton 1975), Bowers
(velocity-based model) (Bowers 1995) and the compressibility (Atashbari and Tingay 2012) methods. They
found that the Eaton method, with an n exponent of about 0.5, provided more accurate predictions than
the other methods. Consequently, they developed a three-dimensional pore-pressure prediction model for
that field applying statistical extrapolations. The estimated pore-pressure values from that 3D model
proved to be in good agreement with pore-pressures measured directly by using a wireline-based modular
dynamic tester. To account for compaction disequilibrium and to resolve unloading conditions, the
Bowers method applies Eqs. (4) to (6).

$$V = V_0 + A\sigma^{B} \qquad (4)$$

$$V = V_0 + A\left[\sigma_{max}\left(\frac{\sigma}{\sigma_{max}}\right)^{1/U}\right]^{B} \qquad (5)$$
$$\sigma_{max} = \left(\frac{V_{max} - 1500}{A}\right)^{1/B}, \qquad (6)$$

where $V$ is velocity, $V_0$ is surface velocity (normally 1500 m/s), $\sigma$ is vertical effective
stress, $A$ and $B$ are obtained from calibrating regional offset velocity vs. effective stress data,
$U$ is the unloading parameter, and $V_{max}$ and $\sigma_{max}$ are velocity and effective stress,
respectively, at the onset of unloading.

Atashbari and Tingay (2012) calculated effective stress using well-log derived bulk density data
for two Iranian carbonate reservoirs. As pore-pressure changes during compaction are, to a degree,
related to the changing magnitude of the pore volume, that volume is also a function of bulk density,
as it increases during compaction. As compaction progresses, the pore space is reduced causing the
pore fluids to become compressed, thereby increasing pore-pressure. Equation (7) expresses the
relationship they evaluated between rock and pore compressibility and pore-pressure, thus:

$$P_p = \left(\frac{(1-\phi)\,C_b\,\sigma_{eff}}{(1-\phi)\,C_b - \phi\,C_p}\right)^{\gamma}, \qquad (7)$$

where $P_p$ is pore-pressure, $\phi$ is fractional porosity, $C_b$ is bulk compressibility in
psi$^{-1}$, $C_p$ is pore compressibility in psi$^{-1}$, $\sigma_{eff}$ is vertical effective stress
in psi, and $\gamma$ is an empirical constant, ranging from 0.9 to 1.0. (Note: 1 psi = 6.8947590868 kPa).

METHODOLOGY

Work Flow

This study combined three machine-learning models with an efficient optimization algorithm to predict
formation pore-pressure from petrophysical data collected from wellbores drilled in the large Marun
oil and gas field in southwest Iran. It combined the basic well-log data and some of the established
pore-pressure relationships with stress and compressibility to derive high-resolution predictions of
formation pore-pressure for the prolific oil-bearing Asmari carbonate reservoir in that field. The
models were developed and evaluated in a novel three-stage sequence summarized in Figure 1. The
approach identifies for the first time the usefulness of gamma ray and PEF log data in accurately
predicting subsurface pore-pressures.

Figure 1. Workflow sequence applied to evaluate hybrid machine-learning/optimization models to predict
accurately formation pore-pressure based on various petrophysical inputs.

Feature Selection and Ranking

In recent years, the number of features selected for a variety of machine-learning applications has
changed from less than 10 features to more than 100 variables in some applications. Eliminating
irrelevant or less influential variables can be computationally efficient, develop simpler models and,
in some cases, improve dependent variable prediction accuracy. This involves selecting from a set of
all available features (variables) a subset that provides the best prediction estimates for variables
of interest (Jain and Zongker 1997). In many cases, such a selection can be NP-hard and requires a
large number of iterations to estimate all possible combinations (Chandrashekar and Sahin 2014). For
example, if N = 20 features are available, then 2^N = 1,048,576 possible combinations exist.

Feature selection can be achieved by several methods, including filtering, wrapping and embedded
methods. The filtering method offers the simplest route to eliminating certain features. However,
care in applying this method is required, because not every evaluated subset is optimal, and different
combinations of features can produce comparable prediction accuracy. In addition, the use of an
efficient machine-learning algorithm is essential for this purpose (John et al. 1994). On the other
hand, wrapping methods can often be more effective in selecting features. This is because, by
evaluating multiple test solution spaces, they more effectively take into account the architecture of
the underlying model. Unlike filtering methods, wrappers are also able to detect attribute
dependencies (Kohavi and John 1997). Wrapping methods tend either to use sequential selection of
features or to employ evolutionary algorithms to test multiple feature combinations. A binary genetic
algorithm (GA) is suitable for identifying ineffective variables for potential feature removal.

In this research, feature selection was conducted using a wrapping method that combines a
genetic algorithm with a multilayer perceptron neural network (MLP–NN). Some subsets of
potential solutions were initially generated. Each such solution contained a different combination of
input variables that were evaluated in normalized terms. Fitness scores for each combination of
variables were expressed as a cost function acting as the objective function for the optimization. The
cost function used in this study was root mean squared error (RMSE) of the dependent variable.
High-performing solutions (those with the lowest RMSE values) were identified by ranking and passed on
to the next optimization iteration. The GA applied three modification processes (crossover,
combination and mutation) to create a new subset of solutions, including the best performers from the
previous iteration, for evaluation in the next iteration (Wahab et al. 2015).

Repeat Formation Tester (RFT) Downhole Direct Pressure Measurements

Wireline or measurement-while-drilling (MWD) techniques can provide direct formation-pressure
measurements. Using two packers and a probe, the repeat formation tester (RFT), and similar tools, can
sequentially record formation pressures from multiple depths. At each depth to be tested, the borehole
interval is isolated and a controlled flow of reservoir fluid is induced. The RFT data, recorded in
situ, provide valuable information about the reservoir condition, such as temperature, pressure and
permeability. The fluid samples recovered by the RFT can be analyzed subsequently to provide
compositional details of the reservoir fluids. However, deploying such tools to record multiple
pressure points in boreholes is time-consuming and expensive and such data are not routinely collected
across the entire reservoir sequence in every well drilled.

This study used the data recorded by the RFT tool to provide a measured pressure vs. depth data
series from a single well. The objective was to use hybrid optimization/machine-learning tools to
predict that measured formation pore-pressure vs.
depth data from the available suite of well-logs. The value of doing this is that reliable and
verified machine-learning solutions can be applied to predict pore-pressure in adjacent wells (already
drilled and yet-to-be drilled) for which RFT data are not available. This would avoid the cost and
time associated with taking direct in situ pressure measurements in the reservoir sections of each
well drilled.

Machine-Learning Algorithms Evaluated

Optimization and machine-learning can help to resolve issues in many areas of the oil and gas
industry, including reservoirs (Ghorbani et al. 2017a), formation damage (Mohammadian and
Ghorbani 2015), wellbore stability (Darvishpour et al. 2019), rheology and filtration (Mohamadian
et al. 2018), production (Ghorbani and Moghadasi 2014; Ghorbani et al. 2014, 2017b) and drilling fluid
(Mohamadian et al. 2019). Machine-learning algorithms are now the mathematical tools of choice to
provide accurate and reliable predictions of dependent variables governed by nonlinear and scattered
relationships with other influential variables. As such they offer valuable solutions in all sectors
of the oil and gas industry (Choubineh et al. 2017; Ghorbani et al. 2017c, 2018, 2019, 2020a, b;
Mohamadian et al. 2021; Rashidi et al. 2020; Farsi et al. 2021; Ranaee et al. 2021).

In this study, three machine-learning algorithms were deployed: (1) least squares support vector
machine (LSSVM); (2) extreme learning machine (ELM); and (3) multilayer perceptron (MLP) neural
network. These algorithms were each coupled with the powerful particle swarm optimization (PSO) to
develop hybrid models that are effective in predicting pore-pressure rapidly and with high accuracy.

Genetic Algorithm (GA)

GA is an evolutionary algorithm that simulates natural selection and solves problems optimally in
an iterative manner as described in Simon (2013). High-performance solutions are identified in each
iteration and are used preferentially to contribute to the modification involved in generating new
solutions to consider for the next GA iteration. Conversely, the worse-performing solutions are
progressively excluded based on their poor fitness comparisons.

Multilayer Perceptron

Artificial neural networks (ANNs) have become widely used since their introduction in the
1990s (Ali 1994). Four factors determine the prediction accuracy of ANN models (Maimon and
Rokach 2009): (a) feature selection (i.e., what input variables should be included); (b) network
architecture (i.e., number of layers and neurons); (c) transfer functions between layers; and (d)
training algorithm selection. Multilayer perceptron (MLP) is the most commonly used feed-forward ANN
because it is easy to set up and adapted for evaluating large and complex datasets (Bishop 2006). The
Levenberg–Marquardt (LM) algorithm is the most common algorithm applied for training MLPs because it
tends to converge rapidly and reliably to find high-performing predictions. However, in large, complex,
nonlinear datasets, the LM algorithm tends to converge too rapidly and it becomes trapped at local
minima. Consequently, to improve the performance of MLPs with such datasets, it is beneficial to
supplement the LM algorithm with a more effective optimization algorithm.

In this study, the PSO is used for that purpose. Based on trials and sensitivity analysis, two hidden
layers were selected as the optimum structure for an MLP to predict pore-pressure, with hidden layer 1
and hidden layer 2 assigned 10 and 5 neurons, respectively. Similarly, based on the sensitivity
analysis performed, the transfer functions "tansig" and "purelin" were selected for hidden layers 1
and 2, respectively.

Extreme Learning Machine

The ELM algorithm was introduced in 2006 as an alternative feed-forward neural network algorithm
offering a high computational speed (Huang et al. 2006). Applications have shown that ELMs can
reduce a network's training time and improve its overall performance. Whereas MLPs use time-consuming
iterative back-propagation algorithms to establish the weights and biases applied to their
hidden layers, ELMs, with a single hidden layer, select weights and biases randomly for that hidden
layer from a uniform distribution. In the ELM, those randomly selected weights and biases are
typically not adjusted during the network tuning process (Huang et al. 2006; Wang et al. 2014). The
output weights of ELM are derived using the Moore–Penrose generalized inverse of the hidden-layer
output. Figure 2 illustrates the structure of a single-hidden-layer ELM (Yeom and Kwak 2017).

More complex variants of ELMs, such as the two-hidden-layer ELM, the four-hidden-layer ELM
and the multiple-hidden-layer ELM (MELM) have been applied successfully to more complex, large,
nonlinear datasets (Liu et al. 2019a, 2019b; Xiao et al. 2017). Tests suggested that an MELM was
more effective in predicting pore-pressure from input data from multiple well-logs. More details about
the theoretical description and construction process of MELM can be found in Xiao et al. (2017).

Least Squares Support Vector Machine

LSSVM was developed in 1999 by Suykens and Vandewalle (1999) as a modified form of the
established regression-based support vector machine (SVM) machine-learning algorithm (Vapnik 2013).
There are three main differences between the LSSVM and SVM algorithms. First, LSSVM uses a
least squares error cost function while SVM uses nonnegative errors for its cost function. Second,
LSSVM uses equality constraints while SVM uses inequality constraints (Yuan et al. 2015). Third,
whereas SVM conducts training using quadratic programming (QP), LSSVM conducts training by solving
a system of linear equations, which reduces the computational complexity and speeds up the algorithm
(Kisi and Parmar 2016). The LSSVM algorithm has been employed successfully to solve a wide range of
regression and classification machine-learning tasks (Adankon and Cheriet 2009; Lima et al. 2010) and
it was adopted here to predict pore-pressure from well-log data.

PSO Algorithm

The PSO algorithm was developed by Kennedy and Eberhart in 1995 (Kennedy 1997; Kennedy and
Eberhart 1995). This algorithm is a stochastic optimization technique originating as an analogy to
animal swarming behaviors observed in the natural world (Yang and Papa 2016). The PSO algorithm, in
combination with other optimization and machine-learning algorithms, has been used widely to solve
many nonlinear problems in the petroleum industry, achieving high degrees of prediction accuracy
(Atashnezhad et al. 2014). Theoretical background and implementation details for the PSO algorithm
are available elsewhere (Anemangely et al. 2017; Kennedy 1997; Mohamadian et al. 2021).

Hybrid Machine-Learning Optimization Algorithms

Hybrid LSSVM–PSO Model

A schematic flow diagram (Fig. 3) illustrates how LSSVM and PSO algorithms were integrated to
operate as a hybrid LSSVM–PSO machine-learning optimization model to predict pore-pressure
effectively in the dataset evaluated. An RBF kernel function provides the best pore-pressure
predictions. For tuning the hyperparameters influencing the LSSVM prediction performance, the PSO
determined optimum values for the regularization
Figure 2. Extreme learning machine (ELM) with single hidden layer structure (Yeom and Kwak 2017).
X represents two input variables, Y represents one output (dependent variable), w is a matrix of
weights applied to each node (G1 to G4) in the hidden layer and b is the bias applied.
Figure 3. Flow diagram for LSSVM–PSO hybrid machine-learning optimization model applied for PP
prediction.
parameter (γ) and the Gaussian RBF kernel's variance σ². These values are listed in Table 1.

Hybrid MELM–PSO Model

MELM neural networks are set up with a specified number of hidden layers, L, each with a
specified number of neurons (n). Trial-and-error sensitivity analysis is often used to establish the
best L and n values to use. However, using an optimization algorithm to conduct this sensitivity
analysis speeds up the process and evaluates a greater number of feasible L and n values. On the one
hand, MELM architectures involving a large number of hidden layers each with many neurons have greater
structural complexity, take longer to compute and are more likely to over-fit the available data.
Table 1. Optimized control parameters for the hybrid LSSVM–PSO model used for pore-pressure prediction

PSO                                        LSSVM
Control parameter                 Value    Control parameter                 Value
Maximum iterations                100      Regularization parameter (γ)      2.2065
Swarm size                        60       Variance of RBF kernel (σ²)       0.6814
Cognitive constant                2.05     Objective function                RMSE
Social constant                   2.05
Inertia weight (damping ratio)    0.98
Var minimum                       -10
Var maximum                       10
Inertia coefficient (w)           1
Maximum velocity                  3
Minimum velocity                  -3
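
To make the role of the two LSSVM parameters in Table 1 concrete, the following is a minimal, pure-Python sketch of LSSVM regression with a Gaussian RBF kernel. It is illustrative only and not the authors' implementation: the toy sine data, the helper names (rbf, solve, lssvm_fit, lssvm_predict) and the kernel convention exp(-||x - z||² / (2σ²)) are assumptions. It uses the standard LSSVM formulation, in which training reduces to solving one linear system for the bias b and the coefficients α.

```python
import math

def rbf(x, z, sigma2):
    """Gaussian RBF kernel exp(-||x - z||^2 / (2 * sigma2)) (assumed convention)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-d2 / (2.0 * sigma2))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def lssvm_fit(X, y, gamma, sigma2):
    """Build and solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # regularization acts on the kernel diagonal
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, coefficients alpha

def lssvm_predict(x, X, b, alpha, sigma2):
    """Evaluate f(x) = b + sum_i alpha_i * K(x, x_i)."""
    return b + sum(a * rbf(x, xi, sigma2) for a, xi in zip(alpha, X))

# Toy one-dimensional data (hypothetical, not well-log data).
GAMMA, SIGMA2 = 2.2065, 0.6814  # values reported in Table 1
X = [[i / 10.0] for i in range(11)]
y = [math.sin(2.0 * math.pi * x[0]) for x in X]
b, alpha = lssvm_fit(X, y, GAMMA, SIGMA2)
print(round(lssvm_predict([0.25], X, b, alpha, SIGMA2), 4))
```

In a hybrid arrangement of the kind described here, an optimizer such as PSO would call a routine like lssvm_fit repeatedly with candidate (γ, σ²) pairs and score each pair by its RMSE.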
Figure 4. Flow diagram for hybrid MELM–PSO model.

Table 2. RMSE values calculated for a range of hidden layers and neurons in the layers of the
MELM–PSO sensitivity analysis for pore-pressure prediction

Number of          Number of neurons in the layers
hidden layers      5              10             15             20             25
3                  0.168009802    0.153221446    0.083342882    0.04963854     0.04958423
5                  0.144563639    0.133925463    0.062135551    0.04563541     0.04935854
7                  0.12542371     0.093628744    0.051254873    0.04545285     0.04985168
9                  0.13582213     0.103258745    0.055231565    0.04854299     0.04974982
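
As a reading aid for Table 2, the short sketch below transcribes the tabulated RMSE values and locates the lowest entry and the three best configurations. The variable names are illustrative assumptions; only the numbers come from Table 2.

```python
# RMSE values transcribed from Table 2: rows are hidden-layer counts,
# columns are neurons per layer.
neurons = [5, 10, 15, 20, 25]
rmse_grid = {
    3: [0.168009802, 0.153221446, 0.083342882, 0.04963854, 0.04958423],
    5: [0.144563639, 0.133925463, 0.062135551, 0.04563541, 0.04935854],
    7: [0.12542371, 0.093628744, 0.051254873, 0.04545285, 0.04985168],
    9: [0.13582213, 0.103258745, 0.055231565, 0.04854299, 0.04974982],
}

# Flatten to (rmse, layers, neurons) tuples so sorting ranks by RMSE first.
cells = [
    (value, layers, n)
    for layers, row in rmse_grid.items()
    for n, value in zip(neurons, row)
]
ranked = sorted(cells)
best = ranked[0]

print(best)        # (0.04545285, 7, 20)
print(ranked[:3])  # the three lowest-RMSE configurations
```

The three lowest values all occur at 20 neurons per layer (with 7, 5 and 9 hidden layers, in that order), which shows how flat the RMSE surface is once the layers hold 20 or more neurons.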
Table 3. Hybrid MELM–PSO control parameters used for predicting pore-pressure

PSO                                        MELM
Maximum iterations                100      Number of inputs                  7
Swarm size                        60       Number of neurons/layer           20
Cognitive constant                2.05     Number of hidden layers           5
Social constant                   2.05     Objective function                RMSE
Inertia weight (damping ratio)    0.99
Var minimum                       -10
Var maximum                       10
Maximum velocity                  2
Minimum velocity                  -2
On the other hand, an MELM with a small number of hidden layers and neurons is less likely to
achieve optimal prediction accuracy of dependent variables. Therefore, a balance must be struck in
network-element selection to achieve an efficient MELM model with optimal performance. In this
study, the PSO algorithm was applied in two steps to construct the optimized MELM network (Fig. 4).

Step 1: Identify optimum narrow ranges for the numbers of hidden layers and neurons in each hidden
layer (replacing the traditional trial-and-error method).

Step 2: Identify the optimum values for the weights assigned to each neuron in each hidden
layer and the biases assigned to each hidden layer (replacing the traditional ELM method of randomly
assigning these values).

The two-step optimization method developed is able to reduce the computational time of this hybrid
algorithm. The method involves an "initial-pass" tuning optimization procedure. This was performed
with the objective of narrowing down the ranges of meaningful numbers of hidden layers and numbers of
neurons in those layers to consider. The established narrow ranges were then used as constraints for
the developed PSO–MELM–PSO model. The PSO (Step 2 procedure) calculates the values of weights
and biases for the limited defined ranges of number of layers and neurons in each hidden layer
established by Step 1.

In the Step 1 analysis, the number of MELM hidden layers was allowed to vary from 3 to 9,
and the number of neurons in each hidden layer was allowed to vary from 5 to 25. Table 2 displays the
results of the Step 1 analysis performed for the MELM–PSO hybrid model. The highest pore-pressure
prediction accuracy (lowest root mean squared error, RMSE value) was achieved in the range of 7–9
hidden layers with 20–25 neurons in each hidden layer. Consequently, in Step 2 of the PSO–MELM–PSO
model developed for prediction of pore-pressure in the studied dataset, the range of the hidden
layers was constrained to vary from 7 to 9, and the range of the number of neurons in each hidden layer
was constrained to vary from 20 to 25.

The values of control parameters established by trial-and-error for the PSO algorithm are listed in
Table 3. The number of iterations for the PSO algorithm used in Step 1 (MELM–PSO) was 40. For
the second-step application of the PSO algorithm, i.e., to select optimum weights and biases in the
MELM–PSO configuration, the number of iterations was set to 100.
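
The PSO update loop that these control parameters govern can be sketched as follows. This is a generic, minimal PSO in pure Python and not the authors' code. The defaults mirror the control parameters reported for the MELM–PSO model (swarm size 60, 100 iterations, cognitive and social constants of 2.05, inertia damping ratio 0.99), with the variable and velocity limits read as symmetric ranges of ±10 and ±2. The starting inertia weight of 1.0, the random seed and the sphere test function are assumptions made for illustration.

```python
import random

def pso(cost, dim, n_particles=60, iters=100, c1=2.05, c2=2.05,
        w=1.0, w_damp=0.99, lo=-10.0, hi=10.0, v_max=2.0, seed=0):
    """Minimize cost(position) with a basic global-best PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])   # cognitive pull
                     + c2 * r2 * (gbest[d] - pos[i][d]))     # social pull
                vel[i][d] = max(-v_max, min(v_max, v))        # clamp velocity
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
        w *= w_damp  # inertia weight shrinks each iteration
        history.append(gbest_cost)
    return gbest, gbest_cost, history

def sphere(p):
    """Toy objective standing in for the RMSE of a trained model."""
    return sum(v * v for v in p)

best_pos, best_cost, history = pso(sphere, dim=3)
print(best_cost)
```

In the hybrid models, the position vector would hold the quantities being tuned (for example, network weights and biases) and cost would return the RMSE of the resulting pore-pressure predictions.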
Hybrid MLP–PSO Model
A schematic flow diagram for implementing the hybrid MLP–PSO model is illustrated in Figure 5.
The PSO algorithm was applied with 100 iterations
to determine the optimum weights and biases for the
specified number of hidden layers and number of
neurons within each hidden layer of the MLP. The
PSO control-parameter values defined in Table 3
were applied, i.e., the same control-parameter values
used for the MELM–PSO model. Table 4 lists the
MLP setup for the hybrid MLP–PSO model used to
predict pore-pressure.
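
When an optimizer such as PSO sets the weights and biases of an MLP, each candidate solution is typically one flat vector that is decoded into the network. The sketch below shows one way to do this for a network with 7 inputs, hidden layers of 10 and 5 neurons and 1 output. It is an illustration under stated assumptions, not the authors' implementation: "tansig" is taken as the hyperbolic tangent and "purelin" as the identity, the output layer is assumed linear, and the parameter ordering (weights then bias, neuron by neuron) is arbitrary.

```python
import math

LAYERS = [7, 10, 5, 1]  # inputs, hidden layer 1, hidden layer 2, output
# tansig ~ tanh, purelin ~ identity; a linear output layer is an assumption.
ACTS = [math.tanh, lambda z: z, lambda z: z]

def n_params(layers):
    """Total number of weights and biases in a fully connected network."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))

def forward(x, params, layers=LAYERS, acts=ACTS):
    """Decode a flat parameter vector and run one forward pass."""
    k = 0
    a = list(x)
    for (n_in, n_out), act in zip(zip(layers, layers[1:]), acts):
        nxt = []
        for _ in range(n_out):
            w = params[k:k + n_in]   # weights feeding this neuron
            bias = params[k + n_in]  # bias of this neuron
            k += n_in + 1
            nxt.append(act(sum(wi * ai for wi, ai in zip(w, a)) + bias))
        a = nxt
    return a[0]

def rmse(pred, target):
    """Root mean squared error, the objective function used in this study."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))

print(n_params(LAYERS))  # 141
```

With this layout, each particle in the swarm would be a vector of 141 numbers, and its cost would be the RMSE between forward() outputs and the measured pore-pressures over the training records.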
Figure 5. Flow diagram for implementing the hybrid MLP–PSO neural network used for pore-pressure prediction.

Table 4. Hybrid MLP–PSO control parameters used for predicting pore-pressure

| PSO parameter | Value | MLP parameter | Value |
|---|---|---|---|
| Maximum iterations | 100 | Input variables | 7 |
| Swarm size | 60 | Number of hidden layers | 2 |
| Cognitive constant | 2.05 | Input layer neurons | 7 |
| Social constant | 2.05 | Hidden layer 1 neurons | 10 |
| Inertia weight (damping ratio) | 0.98 | Hidden layer 2 neurons | 5 |
| Var minimum | -1 | Output layer neurons | 1 |
| Var maximum | 1 | Objective function | RMSE |
| Maximum velocity | 0.2 | | |
| Minimum velocity | -0.2 | | |

DATA COLLECTION AND CHARACTERIZATION

Marun Oil Field Description

Three wellbores (MN#281, MN#297 and MN#378) were evaluated to provide pore-pressure predictions for the giant Marun oil field, onshore southwest Iran (Fig. 6). The field was discovered in 1963 and is one of the largest oil fields in the Zagros Basin, consisting of two oil reservoirs. The Asmari (Oligocene to Early Miocene) and Bangestan (Upper Cretaceous) Formations collectively contain in-place oil resources of some 46 billion barrels. The Khami (Lower Cretaceous) Formation forms an underlying natural gas reservoir with some 462 trillion cubic feet of gas in place (1 cubic foot = 0.0283168466 cubic meter).

Data Collection

In total, 1792 well-log data records from well MN#281, drilled in 2015, were compiled for initial training and testing analysis of the algorithms across the Asmari carbonate reservoir. MN#281 penetrated the reservoir in the measured depth range from 3660 to 3933 m (273 m thick). Datasets from two other Marun oil field wells (MN#297 and MN#378) were used subsequently as additional independent testing subsets to confirm the generality and reliability of the developed algorithms. Pore-pressure was the dependent variable representing the prediction objective. The petrophysical variables were all derived from common well-log tools (Darling 2005; Rubin and Hubbard 2005; Lyons and Plisga 2011; Satter and Iqbal 2015; Ghasemi and Bayuk 2020), recorded or calculated from wireline data. They included effective stress (deff), pore compressibility (Cp), corrected gamma ray (CGR) adjusted for uranium contents, uncorrected spectral gamma ray (SGR), potassium (POTA) from the SGR tool, thorium (THOR) from the SGR tool, uranium (URAN) from the SGR tool, the photoelectric absorption factor (PEF), bulk formation density (RHOB), compressional sonic transit time (DT), neutron porosity (NPHI) and deep resistivity (ILD). Sensitivity analysis suggested that the compositional variables (POTA, THOR and URAN) were not related to pore-pressure. The remaining 9 variables show complex relationships with pore-pressure and depth and were therefore used as inputs for the machine-learning prediction analysis. Table 5 statistically summarizes the distributions of the 9 input variables and the dependent variable, pore-pressure, across the Asmari reservoir formation for the three wells studied: MN#281 (1792 data records), MN#297 (1225 data records) and MN#378 (1225 data records).

Variable Distributions vs. Depth and Pore-pressure

The distributions of the 9 input variables vs. depth for MN#281, contoured for pore-pressure, are displayed individually in Figures 7 and 8. The contour plots identify that the highest pore-pressures occur in the depth interval 3850–3875 m of the Asmari Formation penetrated by well MN#281. Generally, the lowest pore-pressures exist in the depth interval 3750–3775 m and relatively low pore-pressures prevail in the upper part of the formation, above 3800 m depth. Higher effective stress (deff) in the upper part of the formation tended to correspond with lower pore-pressures, whereas lower gamma ray readings (CGR and SGR) tended to correspond with relatively high pore-pressures in that part of the formation (Fig. 7). Higher NPHI and DT in the upper part of the formation tended to correspond with lower pore-pressures, whereas lower RHOB values tended to correspond with relatively high pore-pressures in that part of the formation (Fig. 8). In contrast with the other variables, mid-range ILD values corresponded to relatively high pore-pressures in the upper part of the formation. On the other hand, pore compressibility (Cp) and photoelectric absorption factor (PEF) values did not show obvious relationships with pore-pressure in the upper part of the formation. No obvious relationships existed between the values of the input variables and pore-pressure in the lower part of the Asmari Formation (Figs. 7 and 8). The distributions displayed in Figures 7 and 8 confirm that none of the 9 input variables would be reliable on its own for predicting pore-pressure.

RESULTS AND DISCUSSION

Feature Ranking for Pore-pressure Prediction

Feature selection determines the optimal number of input parameters to combine in hybrid models. The GA–MLP described above was applied to reduce the number of independent variables in the models by evaluation using multiple training and testing subsets of data records. Trial-and-error identified a two-layer MLP with 6 and 5 nodes in its first and second hidden layers, respectively, as the most effective feature selection model at minimizing the RMSE of the predicted pore-pressure values.
Figure 6. Map showing the location of the Marun oil field, onshore southwest Iran.
Table 5. Statistical characterization of the variables constituting the well-log dataset for three Marun oil field wells: MN#281, MN#297 and MN#378. The variable values for all data records are available to download in a supplementary file (see Supplementary Data) (1 psi = 6.8947590868 kPa, 1 ft = 0.3048 m)

| Well | Statistic | Dv (psi) | Cp (psi⁻¹) | CGR (API) | SGR (API) | PEF (barns/cm³) | RHOB (g/cm³) | DT (μs/ft) | NPHI (%) | ILD (ohm-m) | Pore-pressure (psi) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MN#281 (1792 data records) | Mean | 4501.3 | 1.30E-06 | 21.8 | 41.3 | 4.00 | 2.55 | 64.7 | 13.9 | 1162.6 | 4733.7 |
| | Standard deviation | 302.8 | 1.34E-07 | 20.2 | 20.9 | 0.93 | 0.18 | 9.1 | 5.6 | 4271.4 | 340.9 |
| | Variance | 91,639.3 | 1.79E-14 | 408.4 | 438.0 | 0.87 | 0.03 | 82.4 | 31.6 | 18234882.3 | 116137.9 |
| | Minimum | 2146.9 | 1.06E-06 | 1.1 | 12.2 | 1.91 | 1.20 | 51.1 | 1.3 | 0.4 | 4092.2 |
| | Maximum | 5080.9 | 2.07E-06 | 121.4 | 143.6 | 6.33 | 2.87 | 117.2 | 46.7 | 20,000.0 | 5375.4 |
| MN#297 (1225 data records) | Mean | 4860.4 | 1.41E-06 | 52.1 | 36.1 | 4.77 | 2.67 | 72.1 | 11.5 | 814.0 | 4877.0 |
| | Standard deviation | 245.0 | 2.61E-07 | 29.3 | 15.9 | 1.83 | 0.12 | 8.6 | 7.0 | 3446.9 | 268.4 |
| | Variance | 59,952.4 | 6.80E-14 | 856.0 | 254.0 | 3.33 | 0.02 | 73.1 | 48.4 | ######### | 71,994.6 |
| | Minimum | 4371.1 | 1.10E-06 | 5.1 | 6.0 | 2.63 | 2.35 | 41.9 | 1.0 | 0.4 | 4474.0 |
| | Maximum | 5435.0 | 2.25E-06 | 110.2 | 78.3 | 8.68 | 2.98 | 89.3 | 34.7 | 20,000.0 | 5543.0 |
| MN#378 (1225 data records) | Mean | 4605.3 | 1.39E-06 | 60.0 | 37.5 | 14.93 | 2.52 | 60.0 | 12.0 | 1198.2 | 4860.0 |
| | Standard deviation | 325.6 | 1.80E-06 | 9.6 | 13.0 | 2.83 | 0.13 | 9.6 | 5.8 | 4409.9 | 168.1 |
| | Variance | 105,918.8 | 3.25E-12 | 91.8 | 168.8 | 8.02 | 0.02 | 91.8 | 34.1 | ######### | 28,235.3 |
| | Minimum | 3779.8 | 4.02E-05 | 48.3 | 13.6 | 4.29 | 2.12 | 48.3 | 0.0 | 0.5 | 4498.0 |
| | Maximum | 5192.4 | 4.07E-05 | 100.3 | 89.5 | 17.48 | 2.77 | 100.3 | 34.4 | 20,000.0 | 5428.0 |

Having established the best architecture for the MLP, several methods were evaluated for selecting multiple mutually exclusive testing and training subsets of data records for model evaluation. One approach considered was to select randomly 30% of all available data records for the testing subset, and then assign the remaining 70% of data records to the training subset. This method failed to prevent over-fitting during the feature selection process, leading to some features being attributed too much influence in the predictions. A more successful approach was achieved by applying an eightfold validation technique, because it overcame over-fitting issues; eightfold validation divided the entire dataset into 8 non-overlapping sections. A single section was then selected as a subset to be evaluated. The remaining 7 sections of the dataset, in each case, were assigned to the training subset. For each subset selection, the MLP was evaluated 80 times (10 times for each training/testing subset combination). The model with the lowest predicted vs. recorded pore-pressure RMSE was then selected for each of the 10 training/testing subset combinations. The average of the 10-best RMSE values obtained for the eightfold validation was then taken as representative of the prediction performance accuracy of the feature selection evaluated. The eightfold validation sequence applied is illustrated in Figure 9.

The main purpose of this method of implementing feature selection was to determine the optimal combination of the 9 input variables (features) listed in Table 6, i.e., the combination of features that achieved the minimum pore-pressure RMSE value. The results of the feature selection analysis are presented in Table 7 and Figure 10. The lowest pore-pressure RMSE value achieved was 71.79 psi (1 psi = 6.8947590868 kPa). This was associated with a 7-variable combination, which excluded variables Z6 (RHOB) and Z7 (DT). For models with more than 7 features, the RMSE was higher than for the 7-variable combination identified. It is clear from Table 7 that inclusion of a variable in one round of feature selection did not guarantee its inclusion in a subsequent round. For example, feature Z2 was selected as one of the best variables for a 3-variable combination but not for a 4-variable combination. However, variable Z2 was selected as one of the variables for the best 6-variable and 7-variable combinations.

Identifying the Best-performing Algorithm for Pore-pressure Prediction

Evaluating the models with normalized data avoids systematic biases resulting from different value scales among the input variables. Thus, Eq. 8 was used to normalize all the data variables to range from -1 to +1.

$$X_{norm} = 2\,\frac{X - X_{min}}{X_{max} - X_{min}} - 1 \qquad (8)$$

The studied dataset was divided into training, validation and testing subsets. The training subset comprised 70% of all data records, which were selected randomly and distributed evenly across the entire range of the dependent variable's distribution. Then, 15% of data records constituted the validation subset and 15% of data records constituted the testing subset, which were held independently of the training subset. The testing subset was used to verify the prediction accuracy of optimum model solutions derived during training in terms of its prediction repeatability. The testing subset also provided indications of over-fitting, if it occurred.
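The normalization of Eq. 8 and a 70/15/15 allocation can be sketched as follows. This is an illustrative assumption rather than the authors' procedure: the requirement that subsets be spread across the range of the dependent variable is interpreted here as sampling each subset from every quantile bin of the target, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def minmax_scale(x):
    """Scale each column of x linearly to the range [-1, +1] (Eq. 8)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def split_indices(y, fractions=(0.70, 0.15, 0.15), n_bins=10, seed=0):
    """Randomly split record indices into train/validation/test subsets,
    drawing each subset from every quantile bin of the target y."""
    rng = np.random.default_rng(seed)
    order = np.argsort(y)
    subsets = ([], [], [])
    for chunk in np.array_split(order, n_bins):
        chunk = rng.permutation(chunk)
        n_train = int(round(fractions[0] * len(chunk)))
        n_val = int(round(fractions[1] * len(chunk)))
        subsets[0].extend(chunk[:n_train])
        subsets[1].extend(chunk[n_train:n_train + n_val])
        subsets[2].extend(chunk[n_train + n_val:])   # remainder goes to test
    return tuple(np.array(s) for s in subsets)

# Synthetic stand-in for a well-log table: 1792 records, 7 features
rng = np.random.default_rng(42)
X = rng.normal(size=(1792, 7))
y = rng.uniform(4092.2, 5375.4, size=1792)   # MN#281 pore-pressure range (psi)

X_norm = minmax_scale(X)
train, val, test = split_indices(y)
```

Binning by the sorted target before shuffling is one simple way to keep low, mid-range and high pore-pressure records represented in all three subsets; other stratification schemes would serve the same purpose.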
Figure 7. Contour plots of the input variables deff, Cp, CGR, SGR and PEF vs. depth and contoured for pore-pressure for Marun oil field
well MN#281 (1 psi = 6.8947590868 kPa).
Performance accuracy of the three-hybrid machine-learning optimization algorithms was assessed in terms of errors between measured and predicted pore-pressure trends vs. depth. The statistical measures of prediction accuracy used were percentage deviation (PDi), average percentage deviation (APD), average absolute percentage deviation (AAPD), standard deviation (SD), mean squared error (MSE), root mean square error (RMSE) and coefficient of determination (R²). The computation formulas for these statistical accuracy measures are as follows:

$$PD_i = \frac{\xi_{(Measured)} - \xi_{(Predicted)}}{\xi_{(Measured)}} \times 100 \qquad (9)$$

Average percent deviation (APD):

$$APD = \frac{\sum_{i=1}^{n} PD_i}{n} \qquad (10)$$

Absolute average percent deviation (AAPD):

$$AAPD = \frac{\sum_{i=1}^{n} |PD_i|}{n} \qquad (11)$$

Standard Deviation (SD):

$$SD = \sqrt{\frac{\sum_{i=1}^{n} (D_i - D_{imean})^2}{n - 1}} \qquad (12)$$

$$D_{imean} = \frac{1}{n} \sum_{i=1}^{n} \left(\xi_{Measured_i} - \xi_{Predicted_i}\right) \qquad (13)$$

Mean Square Error (MSE):

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left(\xi_{Measured_i} - \xi_{Predicted_i}\right)^2 \qquad (14)$$

Root Mean Square Error (RMSE):

$$RMSE = \sqrt{MSE} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - y_i)^2}{n}} \qquad (15)$$
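For reference, the measures in Eqs. 9 to 15 can be computed as in the following sketch. It assumes that D_i denotes the residual (measured minus predicted value) for the ith record; the function name and the three-record example are hypothetical and are not taken from the study.

```python
import numpy as np

def accuracy_measures(measured, predicted):
    """Return APD, AAPD, SD, MSE and RMSE for paired measured/predicted values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    n = measured.size
    d = measured - predicted                      # residuals D_i
    pd_i = d / measured * 100.0                   # Eq. 9
    mse = np.sum(d ** 2) / n                      # Eq. 14
    return {
        "APD": np.sum(pd_i) / n,                  # Eq. 10
        "AAPD": np.sum(np.abs(pd_i)) / n,         # Eq. 11
        "SD": np.sqrt(np.sum((d - d.mean()) ** 2) / (n - 1)),  # Eqs. 12-13
        "MSE": mse,
        "RMSE": np.sqrt(mse),                     # Eq. 15
    }

# Three made-up pressure readings (psi) and their predictions
stats = accuracy_measures([4000.0, 5000.0, 4500.0], [4010.0, 4990.0, 4500.0])
```

Because positive and negative deviations cancel in APD but not in AAPD, a model can show an APD close to zero while still having a sizeable AAPD, which is why both are reported.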
Figure 8. Contour plots of the input variables RHOB, DT, NPHI and ILD vs. depth and contoured for pore-pressure for Marun oil field
well MN#281 (1 psi = 6.8947590868 kPa).
Coefficient of Determination (R²):

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(\xi_{Predicted_i} - \xi_{Measured_i}\right)^2}{\sum_{i=1}^{n} \left(\xi_{Predicted_i} - \frac{\sum_{j=1}^{n} \xi_{Measured_j}}{n}\right)^2} \qquad (16)$$

where n is the number of data records, x_i is the measured dependent variable value for the ith data record, y_i is the predicted dependent variable value for the ith data record, and ξ is the parameter value.

The most accurate pore-pressure prediction models were assessed based on the feature selection involving 7 input variables (CGR, PEF, Cp, deff, NPHI, ILD and SGR; Table 7). Of the 1792 data records in the dataset sampling Marun oil field well MN#281, for each evaluation of the model 1250 (~70%) data records, spread across the full pore-pressure value range, were allocated to the training subset. Then, 260 data records were allocated to the validation subset. The remaining 282 data records, also spread across the full pore-pressure value range, were allocated to the testing subset (for MN#281). Using these allocations, pore-pressure prediction performance was compared between the empirical equations and the three machine-learning optimization algorithms (MLP–PSO, LSSVM–PSO and MELM–PSO) as presented in Tables 8, 9, 10 and 11 and Figure 11.

The accuracies of pore-pressure prediction for the three machine-learning optimization algorithms and empirical equations evaluated for the training, validation and testing subsets and the full dataset (Tables 8, 9, 10, 11), in terms of six prediction accuracy statistics, highlight the superiority of the hybrid machine-learning methods. A comparison of these accuracy measures clearly reveals MELM–PSO as the best-performing model overall. It achieved an RMSE of 11.551 psi (1 psi = 6.8947590868 kPa) for the full dataset, evaluating a pore-pressure range for the MN#281 well dataset extending across 1283 psi. A comparison of the prediction performance of the three-hybrid machine-learning models and three empirical models developed for predicting pore-pressure (Fig. 11) reveals that only the MELM–PSO model achieved those predictions to a high degree of accuracy. The LSSVM–PSO model performed slightly better than the MLP–PSO model in terms of prediction accuracy but neither compared with the MELM–PSO model's performance (Tables 8, 9, 10, 11).
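As a quick arithmetic check on the figures quoted for well MN#281, the short sketch below recomputes the subset proportions and expresses the full-dataset RMSE as a share of the pore-pressure range. It uses only numbers stated in the text; the variable names are illustrative.

```python
# Record counts allocated to each subset for well MN#281
n_total = 1792
subsets = {"training": 1250, "validation": 260, "testing": 282}
assert sum(subsets.values()) == n_total

# Share of all records held by each subset, in percent
shares = {name: 100.0 * n / n_total for name, n in subsets.items()}

# Full-dataset RMSE of MELM-PSO relative to the pore-pressure range
rmse_psi, range_psi = 11.551, 1283.0
rmse_share = 100.0 * rmse_psi / range_psi

for name, share in shares.items():
    print(f"{name}: {share:.1f}% of records")
print(f"RMSE is {rmse_share:.2f}% of the pore-pressure range")
```

The shares come to roughly 69.8%, 14.5% and 15.7%, and the RMSE is about 0.9% of the 1283 psi range.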
Figure 9. Schematic diagram of the eightfold cross-validation method applied for selecting testing and training subsets to evaluate.
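The eightfold scheme can be sketched in a few lines. The example below is a simplified illustration, not the GA–MLP procedure used in the study: an ordinary least-squares fit stands in for the MLP, the data are synthetic, and the function names are hypothetical. Each of the 8 non-overlapping sections is held out once while the other 7 are used for fitting, and the hold-out RMSE values are averaged to score a candidate feature subset.

```python
import numpy as np

def kfold_indices(n_records, k=8, seed=0):
    """Yield (train, test) index arrays for k non-overlapping folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_records), k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

def cv_rmse(X, y, features, k=8):
    """Mean hold-out RMSE of a least-squares model restricted to `features`."""
    scores = []
    for train, test in kfold_indices(len(y), k):
        # Design matrices: selected feature columns plus an intercept column
        A_train = np.column_stack([X[train][:, features], np.ones(len(train))])
        A_test = np.column_stack([X[test][:, features], np.ones(len(test))])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        scores.append(np.sqrt(np.mean((y[test] - A_test @ coef) ** 2)))
    return float(np.mean(scores))

# Synthetic data: 9 candidate features, of which only two drive the target
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 9))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.1, size=400)

score_informative = cv_rmse(X, y, [0, 2])   # features that drive y
score_irrelevant = cv_rmse(X, y, [5, 6])    # features unrelated to y
```

A wrapper-style search would call a scoring function of this kind for each candidate combination of features and keep the combination with the lowest averaged RMSE.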
Table 6. Nine input variables considered for feature selection in the prediction of pore-pressure for the Marun oil field dataset (1 psi = 6.8947590868 kPa, 1 ft = 0.3048 m)

| Variable | Unit | Symbol code |
|---|---|---|
| deff | psi | Z1 |
| Cp | psi⁻¹ | Z2 |
| CGR | API | Z3 |
| SGR | API | Z4 |
| PEF | barns/cm³ | Z5 |
| RHOB | g/cm³ | Z6 |
| DT | μs/ft | Z7 |
| NPHI | % | Z8 |
| ILD | ohm-m | Z9 |

Table 7. Feature selections applying the eightfold validation method with GA–MLP to find the optimum combination for pore-pressure prediction (1 psi = 6.8947590868 kPa)

| Number of input variables | Input variables | RMSE (psi) |
|---|---|---|
| 1 | Z3 | 207.515 |
| 2 | Z3, Z5 | 162.055 |
| 3 | Z1, Z2, Z3 | 123.325 |
| 4 | Z8, Z3, Z5, Z1 | 111.430 |
| 5 | Z1, Z5, Z8, Z3, Z9 | 97.608 |
| 6 | Z2, Z1, Z5, Z8, Z6, Z3 | 75.428 |
| 7 | Z3, Z5, Z2, Z1, Z8, Z9, Z4 (Best) | 71.790 |
| 8 | Z6, Z1, Z5, Z3, Z8, Z2, Z4, Z9 | 81.876 |
| 9 | Z7, Z3, Z2, Z5, Z4, Z9, Z8, Z6, Z1 | 85.095 |

Figure 10. RMSE prediction accuracy achieved for pore-pressure applying the GA–MLP feature selection combinations of the 9 input variables available (1 psi = 6.8947590868 kPa).

Figure 11 displays the poor pore-pressure prediction performance for the well MN#281 dataset achieved by applying the Eaton sonic-based model (Eq. 3) and Eaton resistivity-based model (Eq. 2) vs. the pore-pressure values measured by RFT tools. As with the other well-log derived empirical formulas, the pore compressibility model also did not achieve acceptable pore-pressure prediction accuracy for the MN#281 wellbore. Figure 11 also reveals the high coefficient of determination achieved by the MELM–PSO method (R² = 0.9987).

Figure 12 displays more clearly the pore-pressure prediction performance comparison among the three-hybrid algorithms evaluated. Figure 13 displays the relative prediction percentage error (Eq. 23) for an example set of training (~70% of data records, 1250 in total), validation (~15% of data records, 260 in total) and testing subsets (~15% of data records, 282 in total) evaluated by each of the three pore-pressure prediction models.

The relative error percentage values were categorized for each model into positive (over-estimate) and negative (under-estimate) ranges. These were sampled by both the training and testing subsets. Note that the vertical scale in Figure 13 for the MELM–PSO model is expanded by an order of magnitude, highlighting the much smaller magnitudes of relative errors generated for individual data records by that model. The results show that the empirical models generated relative errors (%) between 0.2 and 0.6 but the hybrid machine-learning models generated relative errors (%) between 0.004 and 0.025. The MELM–PSO achieved substantially the lowest relative errors (%), between -0.004 and 0.004.

The shale effects leading to substantial pore-pressure underestimates are revealed in Figure 14, in which the Eaton sonic pore-pressure calculated values are plotted vs. depth with the lithologies highlighted. The Eaton sonic model empirical formula (Eq. 3) used to calculate pore-pressure includes effective stress (vertical stress) and sonic log transit time data. To correct for shale impacts, corrections need to be applied, but this complicates the formulaic relationships with lithological relationships that may vary from one shale formation to another.

For the pore-pressure analysis, Figure 15 displays the effect across multiple iterations of combining the PSO algorithm with three ML algorithms: MLP, LSSVM and MELM. The MELM–PSO algorithm converged more efficiently to the lowest RMSE value after 45 iterations.

Table 8. Pore-pressure prediction accuracy achieved by the hybrid machine-learning optimization algorithms and empirical models applied to the training subset (~70% of the data records) for the 7-feature selection for Marun oil field well MN#281 (1 psi = 6.8947590868 kPa)

Pore-pressure prediction accuracy measures based on 7-variable feature selection (training subset: 1250 data records)

| Model group | Model | APD (%) | AAPD (%) | SD (psi) | MSE (psi²) | RMSE (psi) | R² |
|---|---|---|---|---|---|---|---|
| Empirical models | Pore compressibility model | 0.243 | 6.679 | 419.6 | 176,972 | 420.7 | 0.0212 |
| | Eaton model (resistivity) | 1.546 | 6.099 | 369.8 | 137,727 | 371.1 | 0.0003 |
| | Eaton model (sonic) | 1.126 | 5.956 | 356.4 | 128,078 | 357.9 | 0.0018 |
| Hybrid ML-optimizer models | MLP–PSO | 0.013 | 1.366 | 71.7 | 5326 | 73.0 | 0.9455 |
| | LSSVM–PSO | 0.006 | 0.862 | 45.4 | 2136 | 46.2 | 0.9772 |
| | MELM–PSO | 0.006 | 0.223 | 11.1 | 141 | 11.86 | 0.9985 |

Table 9. Pore-pressure prediction accuracy achieved by the hybrid machine-learning optimization algorithms and empirical models applied to the validation subset (~15% of the data records) for the 7-feature selection for Marun oil field well MN#281 (1 psi = 6.8947590868 kPa)

Pore-pressure prediction accuracy measures based on 7-variable feature selection (validation subset: 260 data records)

| Model group | Model | APD (%) | AAPD (%) | SD (psi) | MSE (psi²) | RMSE (psi) | R² |
|---|---|---|---|---|---|---|---|
| Empirical models | Pore compressibility model | 14.985 | 15.794 | 1089.8 | 1,197,539 | 1094.3 | 0.3798 |
| | Eaton model (resistivity) | 17.066 | 17.066 | 1153.6 | 1,337,556 | 1156.5 | 0.3539 |
| | Eaton model (sonic) | 16.785 | 16.785 | 1137.9 | 1,297,441 | 1139.1 | 0.3473 |
| Hybrid ML-optimizer models | LSSVM–PSO | 0.023 | 0.879 | 52.1 | 2827 | 53.2 | 0.9884 |
| | MELM–PSO | 0.023 | 0.206 | 12.1 | 165 | 12.85 | 0.9929 |

Table 10. Pore-pressure prediction accuracy achieved by the hybrid machine-learning optimization algorithms and empirical models applied to the testing subset (~15% of the data records) for the 7-feature selection for Marun oil field well MN#281 (1 psi = 6.8947590868 kPa)

Pore-pressure prediction accuracy measures based on 7-variable feature selection (testing subset: 282 data records)

| Model group | Model | APD (%) | AAPD (%) | SD (psi) | MSE (psi²) | RMSE (psi) | R² |
|---|---|---|---|---|---|---|---|
| Hybrid ML-optimizer models | LSSVM–PSO | 0.003 | 0.892 | 48.4 | 2422 | 49.2 | 0.9666 |
| | MELM–PSO | 0.003 | 0.225 | 11.8 | 157 | 12.53 | 0.9833 |

Table 11. Pore-pressure prediction accuracy achieved by the hybrid machine-learning optimization algorithms and empirical models applied to the full dataset (100% of the data records) for the 7-feature selection for Marun oil field well MN#281 (1 psi = 6.8947590868 kPa)

Pore-pressure prediction accuracy measures based on 7-variable feature selection (full dataset: 1792 data records)

| Model group | Model | APD (%) | AAPD (%) | SD (psi) | MSE (psi²) | RMSE (psi) | R² |
|---|---|---|---|---|---|---|---|
| Hybrid ML-optimizer models | LSSVM–PSO | 0.001 | 0.789 | 44.8 | 2073 | 45.5 | 0.9806 |
| | MELM–PSO | 0.001 | 0.201 | 10.9 | 133 | 11.55 | 0.9987 |

DEVELOPMENT AND GENERALIZATION OF MELM–PSO MACHINE-LEARNING MODEL

The results presented in the preceding section show training, validation and testing of the algorithms in terms of the data records from well MN#281 only. To evaluate the accuracy of the algorithms for general application to the Marun oil field as a whole, additional datasets from wells MN#297 (1225 data records) and MN#378 (1225 data records) were evaluated by applying the best-performing algorithm, the MELM–PSO 7-variable model developed using the MN#281
dataset. Statistical measures of accuracy achieved for data records from these two wells are shown in Table 12. A comparison of the results of Table 12 with those of Tables 8, 9, 10 and 11 confirms the high pore-pressure prediction accuracy achieved by the developed MELM–PSO model, trained with MN#281 data, when applied to other wells in the Marun oil field.

Figures 16 and 17 plot the actual vs. predicted pore-pressure values achieved by the MELM–PSO model trained with MN#281 data records and applied to all the data records available for wells MN#297 and MN#378. The performance accuracy achieved by this algorithm confirms its reliability for application across the Marun oil field in wells, or sections of the reservoir, for which direct formation-pressure measurements are not available. The method can be used in other fields but, of course, would need to be recalibrated initially with some direct formation-pressure measurements from at least one well in each of the fields to which it is applied.

Figures 18 and 19 show the performance of the MELM–PSO model in providing accurate pore-pressure predictions for wells MN#297 and MN#378, respectively. These figures are useful for distinguishing depth intervals in these wells associated with high pore-pressures. They further confirm the reliability of the MELM–PSO model for predicting pore-pressure in wells drilled throughout the Marun oil field.

Figure 11. Predicted vs. measured pore-pressure comparisons for three-hybrid machine-learning optimization models and empirical models applied to the complete MN#281 dataset of 1792 data records with the 7-variable model (1 psi = 6.8947590868 kPa).

Figure 12. Predicted vs. measured pore-pressure compared for the three-hybrid machine-learning optimization models evaluated for 1792 data records through the Asmari carbonate reservoir for Marun oil field well MN#281 (1 psi = 6.8947590868 kPa).

CONCLUSIONS

Subsurface pore-pressure prediction trends with depth can be achieved to high degrees of accuracy for the Asmari carbonate reservoir formation (273 m thick) drilled in the giant onshore Marun oil field in southwest Iran. This was achieved using hybrid machine-learning optimization models applied to a diverse suite of petrophysical input data. The multilayer extreme learning machine model hybridized with a PSO (MELM–PSO) substantially outperformed a multilayer perceptron or a least squares support vector machine coupled with a PSO when applied to a dataset consisting of 1792 data records and 9 petrophysical variables. The MELM–PSO model applies the PSO firstly to select meaningful range constraints for the numbers of MELM layers and neurons in each layer, and subsequently to find the optimum weights and biases to apply to those neurons and layers.

When well-log variables were used collectively in trained models, they were able to predict pore-pressure definitively in the Asmari reservoir across the Marun oil field. Feature selection established that the MELM–PSO model, incorporating 7 input variables (CGR, PEF, Cp, deff, NPHI, ILD and SGR), provided the most accurate pore-pressure predictions for the full dataset of well MN#281 (RMSE = 11.551 psi (1 psi = 6.8947590868 kPa) for a pore-pressure range encountered of 1283 psi). It also achieved similar prediction accuracy when tested with data from two other wells drilled in the Marun oil field (MN#297 and MN#378). The data were pre-processed by applying a Savitzky–Golay (SG) filter, and feature selection was achieved by filter ranking using the wrapping method.

Widely used empirical pore-pressure prediction formulas based on sonic or resistivity well-log data or pore compressibility data were outperformed substantially by even the poorest-performing MELM–PSO feature selections. A significant downside to the empirical models evaluated is that their pore-pressure predictions are impacted substantially by shale effects, which are lithology dependent.

A standard MELM algorithm executed alone was less efficient because its hyperparameters were selected randomly. When hybridized with an optimization algorithm (MELM–PSO) to select optimally the MELM hyperparameters, its performance was improved substantially, particularly with rigorous feature selection. The method proposed can be adapted readily for application to other oil and gas fields. However, in reservoirs that display substantial spatial heterogeneity, the optimum feature selection may vary from well to well across the field. Future studies are planned to test the model with multiple-well datasets focusing on ways to optimize further well-log feature selection for heterogeneous reservoirs.
Figure 13. Pore-pressure prediction relative errors (%) compared for the training, validation and testing subsets of the empirical equation and hybrid machine-learning optimization models evaluated for the MN#281 dataset. Note that the training, validation and testing subsets are all spread across the entire depth interval sampled and are displayed sequentially in these plots for illustrative purposes only. Note the different vertical scales on the right-side plots.

Figure 14. Shale effects in the Asmari reservoir zone of MN#281 impacting the empirically calculated Eaton sonic log pore-pressure values (1 psi = 6.8947590868 kPa).

Figure 15. Comparison of RMSE achieved after each iteration for pore-pressure predictions by the three-hybrid machine-learning optimization models evaluated for the MN#281 dataset (MLP–PSO, LSSVM–PSO and MELM–PSO) (1 psi = 6.8947590868 kPa).
Table 12. Pore-pressure prediction accuracy of the MELM–PSO model, trained with MN#281 data, applied to the complete datasets available for Marun oil field wells MN#297 and MN#378 and treating them as additional independent testing subsets (1 psi = 6.8947590868 kPa)

Pore-pressure prediction accuracy measures for MELM–PSO model trained with MN#281 data applied to other Marun oil field wells

| Well | Data records | APD (%) | AAPD (%) | SD (psi) | MSE (psi²) | RMSE (psi) | R² |
|---|---|---|---|---|---|---|---|
| MN#297 | 1225 | 0.002 | 0.226 | 9.4 | 100.6 | 10.0 | 0.9978 |
| MN#378 | 1225 | 0.004 | 0.232 | 9.6 | 103.0 | 10.1 | 0.9942 |
Figure 16. Cross-plot of predicted vs. measured pore-pressure values compared for the MELM–PSO model trained with data from well MN#281 applied to MN#297 data points (1 psi = 6.8947590868 kPa).

Figure 17. Cross-plot of predicted vs. measured pore-pressure values compared for the MELM–PSO model trained with data from well MN#281 applied to MN#378 data points (1 psi = 6.8947590868 kPa).

Figure 18. Relative error (%) of pore-pressure prediction by the MELM–PSO algorithm vs. data index for all well MN#297 data records.

Figure 19. Relative error (%) of pore-pressure prediction by the MELM–PSO algorithm vs. data index for all well MN#378 data records.
ACKNOWLEDGMENT

This research was supported by Tomsk Polytechnic University under Grant Number VIU-CPPSND-214/2020.

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at https://doi.org/10.1007/s11053-021-09852-2.

REFERENCES

Adankon, M. M., & Cheriet, M. (2009). Model selection for the LS-SVM. Application to handwriting recognition. Pattern Recognition, 42(12), 3264–3270.
Ahmed, A., Elkatatny, S., Ali, A., Mahmoud, M., & Abdulraheem, A. (2019). New model for pore pressure prediction while drilling using artificial neural networks. Arabian Journal for Science and Engineering, 44(6), 6079–6088.
Ahmed, S., Mahmoud, A. A., Elkatatny, S., Mahmoud, M., & Abdulraheem, A. (2019). Prediction of pore and fracture pressures using support vector machine. Paper presented at the international petroleum technology conference. https://doi.org/10.2523/IPTC-19523-MS.
Ali, J. (1994). Neural networks: a new tool for the petroleum industry? Paper presented at the European petroleum computer conference. https://doi.org/10.2118/27561-MS.
Anemangely, M., Ramezanzadeh, A., & Tokhmechi, B. (2017). Shear wave travel time estimation from petrophysical logs using ANFIS-PSO algorithm: A case study from Ab-Teymour Oilfield. Journal of Natural Gas Science and Engineering, 38, 373–387.
Andrian, D., Rosid, M. S., & Septyandy, M. R. (2020). Pore pressure prediction using anfis method on well and seismic data field "Ayah". In IOP Conference Series: Materials Science and Engineering (Vol. 854, No. 1, p. 012041). IOP Publishing. https://doi.org/10.1088/1757-899X/546/3/032017/meta.
Atashbari, V., & Tingay, M. R. (2012). Pore pressure prediction in carbonate reservoirs. Paper presented at the SPE Latin America and Caribbean petroleum engineering conference. https://doi.org/10.2118/150835-MS.
Atashnezhad, A., Wood, D. A., Fereidounpour, A., & Khosravanian, R. (2014). Designing and optimizing deviated wellbore trajectories using novel particle swarm algorithms. Journal of Natural Gas Science and Engineering, 21, 1184–1204.
Azadpour, M., Manaman, N. S., Kadkhodaie-Ilkhchi, A., & Sedghipour, M.-R. (2015). Pore pressure prediction and modeling using well-logging data in one of the gas fields in south of Iran. Journal of Petroleum Science and Engineering, 128, 15–23.
Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.
Bowers, G. L. (1995). Pore pressure estimation from velocity data: Accounting for overpressure mechanisms besides undercompaction. SPE Drilling and Completion, 10(02), 89–95.
Chandrashekar, G., & Sahin, F. (2014). A survey on feature selection methods. Computers & Electrical Engineering, 40(1), 16–28.
Choubineh, A., Ghorbani, H., Wood, D. A., Moosavi, S. R., Khalafi, E., & Sadatshojaei, E. (2017). Improved predictions of wellhead choke liquid critical-flow rates: Modelling based on hybrid neural network training learning based optimization. Fuel, 207, 547–560.
Darling, T. (2005). Well-logging and formation evaluation. Elsevier.
Eaton, B. A. (1975). The equation for geopressure prediction from well-logs. Paper presented at the fall meeting of the Society of Petroleum Engineers of AIME. https://doi.org/10.2118/5544-MS.
Farsi, M., Barjouei, H. S., Wood, D. A., Ghorbani, H., Mohamadian, N., Davoodi, S., Nasriani, H. R., & Ahmadi Alvar, M. (2021). Prediction of oil flow rate through orifice flow meters: Optimized machine-learning techniques. Measurement. https://doi.org/10.1016/j.measurement.2020.108943.
Ghasemi, M., & Bayuk, I. (2020). Bounds for pore space parameters of petroelastic models of carbonate rocks. IZVESTIYA, Physics of the Solid Earth. https://doi.org/10.1134/S1069351320020032.
Ghorbani, H., Moghadasi, J., & Wood, D. A. (2017a). Prediction of gas flow rates from gas condensate reservoirs through wellhead chokes using a firefly optimization algorithm. Journal of Natural Gas Science and Engineering, 45, 256–271.
Ghorbani, H., & Moghadasi, J. (2014). Development of a new comprehensive model for choke performance correlation in Iranian oil wells. Advances in Environmental Biology, 8(17), 877–882.
Ghorbani, H., Moghadasi, J., Dashtbozorg, A., & Abarghoyi, P. G. (2017b). The exposure of new estimating models for bubble point pressure in crude oil of one of the oil fields in Iran. American Journal of Oil and Chemical Technologies, 178–193.
Ghorbani, H., Wood, D. A., Choubineh, A., Mohamadian, N., Tatar, A., Farhangian, H., & Nikooey, A. (2020a). Performance comparison of bubble point pressure from oil PVT data: Several neurocomputing techniques compared. Experimental and Computational Multiphase Flow, 2(4), 225–246.
Ghorbani, H., Moghadasi, J., Dashtbozorg, A., & Kooti, S. (2017c). Developing a new multiphase model for choke function relation for Iran's gas wells. American Journal of Oil and Chemical Technologies.
Ghorbani, H., Wood, D. A., Choubineh, A., Tatar, A., Abarghoyi, P. G., Madani, M., & Mohamadian, N. (2018). Prediction of oil flow rate through an orifice flow meter: Artificial intelligence alternatives compared. Petroleum. https://doi.org/10.1016/j.petlm.2018.09.003.
Ghorbani, H., Wood, D. A., Moghadasi, J., Choubineh, A., Abdizadeh, P., & Mohamadian, N. (2019). Predicting liquid flow-rate performance through wellhead chokes with genetic and solver optimizers: An oil field case study. Journal of Petroleum Exploration and Production Technology, 9(2), 1355–1373.
Ghorbani, H., Moghadasi, J., Mohamadian, N., Mansouri Zadeh, M., Hezarvand Zangeneh, M., Molayi, O., & Kamali, A. (2014). Development of a New Comprehensive Model for Choke Performance Correlation in Iranian Gas Condensate Wells, 8(17), 308–313.
Ghorbani, H., Wood, D. A., Mohamadian, N., Rashidi, S., Davoodi, S., Soleimanian, A., & Mehrad, M. (2020b). Adaptive neuro-fuzzy algorithm applied to predict and control multiphase flow rates through wellhead chokes. Flow Measurement and Instrumentation, 76, 101849.
M. Farsi et al.
Huang, G.-B., Zhu, Q.-Y., & Siew, C.-K. (2006). Extreme learning provement additive for drilling fluids. Journal of Polymer
machine: Theory and applications. Neurocomputing, 70(1–3), Research, 26(2), 33.
489–501. Mohamadian, N., Ghorbani, H., Wood, D. A., Mehrad, M., Da-
Hutomo, P. S., Rosid, M. S., & Haidar, M. W. (2019). Pore voodi, S., Rashidi, S., & Shahvand, A. K. (2021). A geome-
pressure prediction using eaton and neural network method chanical approach to casing collapse prediction in oil and gas
in carbonate field ‘‘X’’ based on seismic data. In IOP con- wells aided by machine learning. Journal of Petroleum Sci-
ference series: Materials science and engineering (Vol. 546, ence and Engineering, 196, 107811.
No. 3, p. 032017). IOP Publishing. https://doi.org/10.1088/17 Mustafa, M., Rezaur, R., Rahardjo, H., & Isa, M. (2012). Pre-
57-899X/546/3/032017/meta. diction of pore-water pressure using radial basis function
Jain, A., & Zongker, D. (1997). Feature selection: Evaluation, neural network. Engineering Geology, 135, 40–47.
application, and small sample performance. IEEE Transac- Osborne, M. J., & Swarbrick, R. E. (1997). Mechanisms for gen-
tions on Pattern Analysis and Machine Intelligence, 19(2), erating overpressure in sedimentary basins: A reevaluation.
153–158. AAPG Bulletin, 81(6), 1023–1041.
John, G. H., Kohavi, R., & Pfleger, K. (1994). Irrelevant features Polito, C. P., Green, R. A., & Lee, J. (2008). Pore pressure gen-
and the subset selection problem. In Machine learning pro- eration models for sands and silty soils subjected to cyclic
ceedings 1994 (pp. 121–129). Morgan Kaufmann. https://doi. loading. Journal of Geotechnical and Geoenvironmental
org/10.1016/B978-1-55860-335-6.50023-4. Engineering, 134(10), 1490–1500.
Kennedy, J. (1997). The particle swarm: social adaptation of Ranaee, E., Ghorbani, H., Keshavarzian, S., Ghazaeipour Abar-
knowledge. Paper presented at the proceedings of 1997 IEEE ghoei, P., Riva, M., Inzoli, F., & Guadagnini, A. (2021).
international conference on evolutionary computation Analysis of the performance of a crude-oil desalting system
(ICEC’97). https://doi.org/10.1080/10.1109/ICEC.1997.59232 based on historical data. Fuel. https://doi.org/10.1016/j.fuel.
6. 2020.120046.
Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Rashidi, S., Mohamadian, N., Ghorbani, H., Wood, D. A., Shah-
Paper presented at the proceedings of ICNN’95-international bazi, K., & Ahmadi Alvar, M. (2020). Shear modulus pre-
conference on neural networks. https://doi.org/10.1080/10.110 diction of embedded pressurize salt layers and pinpointing
9/ICNN.1995.488968. zones at risk of casing collapse in oil and gas wells. Journal of
Keshavarzi, R., & Jahanbakhshi, R. (2013). Real-time prediction Applied Geophysics, 104205.
of pore pressure gradient through an artificial intelligence Rashidi, S., Mehrad, M., Ghorbani, H., Wood, D. A., Mohama-
approach: A case study from one of Middle East oil fields. dian, N., Moghadasi, J., & Davoodi, S. (2021). Determination
European Journal of Environmental and Civil Engineering, of bubble point pressure and oil formation volume factor of
17(8), 675–686. crude oils applying multiple hidden layers extreme learning
Kisi, O., & Parmar, K. S. (2016). Application of least square machine algorithms. Journal of Petroleum Science and
support vector machine and multivariate adaptive regression Engineering. https://doi.org/10.1016/j.petrol.2021.108425.
spline models in long term prediction of river water pollution. Rehm, B., Schubert, J., Haghshenas, A., Paknejad, A. S., &
Journal of Hydrology, 534, 104–112. Hughes, J. (2013). Managed pressure drilling. Elsevier.
Kohavi, R., & John, G. H. (1997). Wrappers for feature subset Rubin, Y., & Hubbard, S. (2005). Hydrogeophysics, water science
selection. Artificial Intelligence, 97(1–2), 273–324. and technology library. Springer.
Lima, C. A., Coelho, A. L., & Eisencraft, M. (2010). Tackling Satter, A., & Iqbal, G. M. (2015). Reservoir engineering: The
EEG signal classification with least squares support vector fundamentals, simulation, and management of conventional
machines: A sensitivity analysis study. Computers in Biology and unconventional recoveries. Gulf Professional Publishing.
and Medicine, 40(8), 705–714. Shi, Y., & Wang, C. Y. (1986). Pore pressure generation in sedi-
Liu, H. (2017). Principles and applications of well-logging. mentary basins: Overloading versus aquathermal. Journal of
Springer. https://doi.org/10.1007/978-3-662-53383-3. Geophysical Research: Solid Earth, 91(B2), 2153–2162.
Liu, J., Liu, X., Liu, C., Le, B. T., & Xiao, D. (2019). Random Simon, D. (2013). Evolutionary optimization algorithms. Wiley.
search enhancement of incremental regularized multiple Suykens, J. A., & Vandewalle, J. (1999). Least squares support
hidden layers ELM. IEEE Access, 7, 36866–36878. vector machine classifiers. Neural Processing Letters, 9(3),
Liu, J., Liu, X., & Le, B. T. (2019b). Rolling force prediction of 293–300.
hot rolling based on GA-MELM. Complexity, 2019. https:// Swarbrick, R. E. (2001). Challenges of porosity-based pore pres-
www.hindawi.com/journals/complexity/2019/3476521/. sure prediction. Paper presented at the 63rd EAGE confer-
Lyons, W. C., & Plisga, G. J. (2011). Standard handbook of pet- ence & exhibition. https://doi.org/10.3997/2214-4609-pdb.15.
roleum and natural gas engineering. Elsevier. O-25.
Maimon, O., & Rokach, L. (2009). Introduction to knowledge Terzaghi, K., Peck, R. B., & Mesri, G. (1996). Soil mechanics in
discovery and data mining. In Data mining and knowledge engineering practice (3rd edn.). John Wiley & Sons.
discovery handbook (pp. 1–15): Springer. https://doi.org/10. Vapnik, V. (2013). The nature of statistical learning theory.
1007/978-0-387-09823-4_1. Springer.
Mohammadian, N., & Ghorbani, H. (2015). An investigation on Wahab, M. N. A., Nefti-Meziani, S., & Atyabi, A. (2015). A
chemical formation damage in Iranian reservoir by focus on comprehensive review of swarm optimization algorithms.
mineralogy role in shale swelling potential in Pabdeh and PLoS ONE, 10(5), 1–36.
Gurpi formations. Advances in Environmental Biology, 9(4), Wang, S.-J., Chen, H.-L., Yan, W.-J., Chen, Y.-H., & Fu, X.
161–166. (2014). Face recognition and micro-expression recognition
Mohamadian, N., Ghorbani, H., Wood, D. A., & Hormozi, H. K. based on discriminant tensor subspace analysis plus extreme
(2018). Rheological and filtration characteristics of drilling learning machine. Neural Processing Letters, 39(1), 25–43.
fluids enhanced by nanoparticles with selected additives: An Xiao, D., Li, B., & Mao, Y. (2017). A multiple hidden layers
experimental study. Advances in Geo-Energy Research, 2(3), extreme learning machine method and its application.
228–236. Mathematical Problems in Engineering, 2017. https://www.h
Mohamadian, N., Ghorbani, H., Wood, D. A., & Khoshmardan, indawi.com/journals/mpe/2017/4670187/.
M. A. (2019). A hybrid nanocomposite of poly (styrene-me- Yang, X.-S., & Papa, J. P. (2016). Bio-inspired computation and
thyl methacrylate-acrylic acid)/clay as a novel rheology-im- applications in image processing. Academic Press.
Yeom, C.-U., & Kwak, K.-C. (2017). Short-term electricity- SPE/IADC Asia Pacific Drilling Technology. https://doi.org/h
load forecasting using a TSK-based extreme learning ttps://doi.org/10.2118/36381-MS.
machine with knowledge representation. Energies, 10(10), Yu, H., Chen, G., & Gu, H. (2020). A machine learning
1613. methodology for multivariate pore-pressure prediction.
Yoshida, C., Ikeda, S., & Eaton, B. A. (1996). An investigative Computers & Geosciences, 143, 104548.
study of recent technologies used for prediction, detection, Yuan, X., Chen, C., Yuan, Y., Huang, Y., & Tan, Q. (2015). Short-
and evaluation of abnormal formation pressure and fracture term wind power prediction based on LSSVM–GSA model.
pressure in North and South America. Paper presented at the Energy Conversion and Management, 101, 393–401.

