https://doi.org/10.1007/s13202-024-01774-y
Abstract
Permeability prediction and distribution are critical to the reservoir modeling process. The conventional method for obtaining permeability data is from cores, which is very costly. It is therefore common to calculate permeability from logs, although this approach has major limitations. The aim of this study is to use artificial intelligence (AI) algorithms to tackle this challenge: conventional logs and routine core analysis results of the core plugs are used as inputs to the extreme gradient boosting algorithm (XGB) to predict permeability in non-cored intervals of the studied wells. This led to promising results as measured by the R² correlation coefficient. The R² correlation coefficient between the predicted and actual permeability was 0.73 when using the porosity measured from core plugs and 0.51 when using the porosity calculated from logs. This study presents the use of the machine-learning extreme gradient boosting algorithm in permeability prediction. To our knowledge, this algorithm has not been used in this formation and field before. In addition, the machine-learning model established is simple and convenient, as only four commonly available logs are required as inputs, and it provides reliable results even if one of the required input logs is synthesized because it is unavailable.
Keywords Rock typing · Permeability prediction · Artificial intelligence · Machine-learning · Nubia formation · October
Field · Gulf of Suez
List of symbols
a          Archie constant (Dimensionless)
Accuracy   Machine-learning metric score (Dimensionless)
BVI        Bulk volume irreducible (Ratio)
C          Constant depending on lithology (Dimensionless)
F1 Score   Machine-learning metric score (Dimensionless)
FFI        Free fluid index (Ratio)
FP         False positive (Dimensionless)
FN         False negative (Dimensionless)
FZI        Flow zone indicator (mD)
GRlog      Gamma-ray log reading (API)
GRsand     Gamma-ray log reading for clean and sand matrix (API)
GRshale    Gamma-ray log reading for shale (API)
k          Horizontal permeability (mD)
NPHIma     Neutron log reading for clean matrix (Ratio)
NPHI       Neutron log reading (Ratio)
NPHIsh     Neutron log reading for shale (Ratio)
NPHIfl     Neutron log reading for drilling fluid (Ratio)
Precision  Machine-learning metric score (Dimensionless)
m          Cementation exponent (Dimensionless)
n          Saturation exponent (Dimensionless)
Recall     Machine-learning metric score (Dimensionless)
RHOBma     Density log reading for clean matrix (g/cc)
RHOB       Density log reading (g/cc)
RHOBsh     Density log reading for shale (g/cc)
RHOBfl     Density log reading for drilling fluid (g/cc)
RW         Resistivity of formation water (Ohm.m)
RT         Resistivity of uninvaded formation (Ohm.m)
RQI        Rock quality index (mD)
SW         Water saturation (Ratio)
SWindo     Indonesian water saturation (Ratio)
TP         True positive (Dimensionless)
TN         True negative (Dimensionless)
Vp         Sonic log reading (ms/ft)

* Mohamed A. Kassab
mkassab68@yahoo.com
1 Department of Exploration, Egyptian Petroleum Research Institute (EPRI), Cairo, Egypt
2 Geology Department, Faculty of Pet. & Min. Eng, Suez University, Suez, Egypt
Journal of Petroleum Exploration and Production Technology
Fig. 1 Location map of the study area, Gulf of Suez, Egypt (EGPC 1996)
them has developed a simple model that only uses four input well logging curves. Also, neither of them chose the machine-learning algorithm chosen in this paper, and the hyper-parameters of their algorithms differ from those discussed here. Finally, neither of the two studies was conducted in October Field, Gulf of Suez area.
The current study aims to apply a workflow that uses machine-learning algorithms to predict rock types in un-cored wells and intervals in a very simple manner in terms of the features/inputs required. It also shows the results of this workflow and compares them with other permeability estimation tools, such as routine core analysis and the nuclear magnetic resonance (NMR) logging tool, to find the optimum permeability prediction. This will help to characterize the permeability of the targeted reservoir, thus improving the choice of perforation intervals, a crucial decision in the oil and gas industry that has a massive effect on hydrocarbon production.
Literature review

Physical and empirical methods for permeability prediction, whether Kozeny–Carman models, nuclear magnetic resonance or induced polarization as used in Weller et al. (2014), rely on physical and geological quantities such as pore radius, tortuosity, polarization and relaxation times. In contrast, machine-learning models are mathematical and statistical models that are numerical in nature: even when dealing with physical input features, they only use the feature values to establish an array of numbers from which they can learn and, therefore, predict.
In addition, the machine-learning model used in Wang et al. (2020) is a single decision tree, which is the building unit of both the random forest and the extreme gradient boosting algorithm used in this study. This reflects the superiority of the algorithm used in the present study compared to the decision tree. Also, the input features required for the former study are different and more numerous: it requires five input features (porosity, shale volume, water saturation, density log and shallow resistivity log), compared to the four input features required for the present study (gamma-ray log, density log, neutron log and sonic log).
Rezaee and Ekundayo (2022) use four algorithms: multi-layer perceptron/neural network, support vector regressor, random forest and gradient boosting regressor; the last is similar to the algorithm used in the present study. Seven input features were used for this similar algorithm (shale volume, density log, photoelectric effect log, neutron log, sonic log, deep resistivity log and effective porosity log), yet the accuracy was lower than that of the present study: the R² correlation score dropped from above 0.9 to below 0.7, while in the present study the R² correlation score dropped from 1 to 0.728, in the training and validation phases, respectively.
Therefore, this is a new method for predicting permeability: it uses statistical and mathematical approaches through machine-learning algorithms, and it adds simple hyper-parameters, a small number of commonly available input features and enhanced accuracy.

Geological setting

The Gulf of Suez region is a narrow water body oriented in a north/northwest–south/southeast direction, serving as a natural division between the northeastern part of the African continent and the Sinai Peninsula. This region is part of the northern compartment of the Red Sea rift. The Gulf of Suez area holds significant importance in Egypt’s exploration efforts and stands as the most extensively drilled and explored portion. It encompasses over 80 oil fields, hosting reservoirs ranging in age from the Precambrian period to the Quaternary period (Elsayed et al. 2021; El Nady et al. 2015). The Gulf of Suez basin is categorized into three distinct structural provinces, determined by the regional dip direction of its tilted fault blocks (Bosworth and McClay 2001). The northern and southern provinces exhibit a southwestward dip, while the central province demonstrates a northeastward dip. The provinces are separated by two accommodation zones that trend in a northeast direction (Moustafa 1976; Patton et al. 1994). With a length of approximately 320 km and a width ranging from 60 to 25 km, the Gulf of Suez basin is classified as a rift basin. It is characterized by intricate tectonic activity, wherein faulted blocks are delineated by significant northwest–southeast faults (Clysmic direction), along with secondary southwest–northeast trending faults. This region is renowned as the most prolific oil rift basin in both the Middle East and Africa, boasting high levels of oil production (Elsayed et al. 2021; El Nady et al. 2015).
October Field is located in the north-central part of the Gulf of Suez, Egypt (Fig. 1) (EGPC 1996), with latitude from 28° 48′ 00″ N to 28° 53′ 00″ N and longitude from 33° 03′ 00″ E to 33° 08′ 00″ E. October Field was discovered in 1977, when the GS 195–1 well (later renamed October A-1) was drilled to test a large, NW-trending, fault-bounded structure that had been identified from a 1976 regional seismic survey in the October area. The October Field, positioned approximately 25 km to the north of the vast Belayim field, exhibits a structural configuration characterized by elongated faulted blocks. These faulted blocks align in a northwest–southeast trend and have a northeast dip. These pre-Miocene faulted blocks extend along the strike of the field for approximately 20 km (Zahran 1986; EGPC 1996; Kassem et al. 2021; Khattab et al. 2023). The field is bounded to the west by a sequence of normal faults with downthrown displacement to the west; it is also divided into a main southern block and a secondary northern block (El‐Ghamri et al. 2002; Radwan et al. 2021b).
The Miocene Nukhul formation, which was produced from the onshore Abu Rudeis Field, 10–15 km to the east, was the main objective, with the Paleozoic–Cretaceous Nubia sandstones as a secondary target (Lelek et al. 1992; Radwan et al. 2020). The Nukhul reservoir was absent, but the Nubia contained 541 ft of net oil pay, which tested 29.6°API oil at 4562 Barrel Oil Per Day (BOPD) on a 34/64 in. choke. Logs showed a single oil water contact (OWC) at 11,670 ft True Vertical Depth Sub Sea (TVDSS) (Zahran 1986). The wells were put on production in October 1977 (Zahran 1986), and a
production platform was installed in 1979 (Borling et al. 1996). The primary oil production at the October Field is derived from the sandstones within the Paleozoic–Cretaceous Nubia Formation (Lelek et al. 1992). In 1989, additional reserves were discovered in a smaller, separate Nubia pool with a shallower OWC in the North October Area by the discovery well, GS172-1 (October J-1), which penetrated Nubia sandstones at 10,723 ft TVDSS and tested at a rate of 7880 BOPD from 254 ft of net oil pay. In the October Field, the basal pre-rift reservoir section is primarily composed of various sandstones known as the “Nubian Sandstone.” The Nubia Formation ranges in age from the Paleozoic to the Lower Cretaceous; it unconformably overlies the Precambrian crystalline basement and is conformably overlain by Upper Cretaceous shales of the Nezzazat Group (Hussein et al. 2017; Zahran 1986). Figure 2 shows the litho-stratigraphic column of the October Field (Peijs et al. 2012).
Deaf (2009) suggested that during the late Cretaceous period in the Gulf of Suez, there was a notable marine transgression, while the lower part of the Gulf of Suez sequence, which is of the latest early Cretaceous age, appears to have been deposited in a continental basin. The late Cretaceous sedimentary record in the Gulf of Suez demonstrates a transition from continental alluvial deposits to marine carbonate sequences.
The Nubian Sandstone can be further subdivided into different groups based on lithological characteristics. The Lower Paleozoic Qebliat Group, which corresponds to the previously used terms Nubia “D” and “C,” is dominated by sandstone lithologies. The Carboniferous Ataqa Group, equivalent to the Nubia “B” and the lower portion of the Nubia “A,” consists of both dolomites and sandstones. This group exhibits a mixture of carbonate and siliciclastic facies. The Lower Cretaceous Malha Formation represents the upper part of the Nubia “A” and is primarily composed of coarse-grained sandstones. These sandstones within the Malha Formation exhibit a coarser grain size compared to the other units.
Overall, the Nubian Sandstone in the October Field comprises different lithological units, including the sandstone-dominated Lower Paleozoic Qebliat Group, the dolomites and sandstones of the Carboniferous Ataqa Group and the coarse-grained sandstones of the Lower Cretaceous Malha Formation. These sandstone units form the dominant reservoir section underlying the October Field (Hasouba et al. 1992; El‐Ghamri et al. 2002).

gamma-ray (GR), caliper (CALI), deep and medium resistivity (RD and RM), neutron (NPHI), density (RHOB) and sonic (DT). The logging data were used for formation evaluation and as inputs for the AI technique. As for the core data, routine core analysis (RCAL) is available for all three cored wells (OCT-A2B, OCT-B8 and OCT-B6), while special core analysis (SCAL) is available for only two wells (OCT-A2B and OCT-B6). The core data were used for the rock typing of the formation so that the rock types can be used as input for the AI technique. For the un-cored well OCT-K5, the same set of logs is available except for the sonic log, which is missing from its dataset, but a nuclear magnetic resonance (NMR) log is available. The NMR log is used to calculate permeability for comparison with the permeability obtained from the AI technique. Table 1 shows the detailed database used in the current study.

Formation evaluation for the studied wells

Well log data were used to perform formation evaluation to obtain petrophysical parameters such as shale volume (Vsh), porosity (Ø), water saturation (Sw), reservoir and pay flags and the ratio between them. Shale volume is calculated by the linear formula (Eq. 1) from gamma-ray for a pessimistic estimation (Atlas 1979) and from the density–neutron log (Eq. 2) (Schlumberger 1972, 2009). Porosity in the study is primarily obtained from density and neutron logs (Wyllie 1963; Asquith and Gibson 1982; Schlumberger 1972) (Eqs. 3–6). Water saturation is derived from the Archie equation (Archie 1942) (Eq. 7) and from the Indonesian equation (Poupon and Leveaux 1971) (Eq. 8). For reservoir and pay flags, standard cutoffs for shale volume (50%), porosity (10%) and water saturation (50%) were applied sequentially, in that order (Darling 2005; Kassab et al. 2020; El-Din et al. 2013).

$$V_{SH} = \frac{GR_{log} - GR_{sand}}{GR_{shale} - GR_{sand}} \tag{1}$$

$$V_{sh} = \frac{X_1 - X_0}{X_2 - X_0} \tag{2}$$

where

$$X_0 = NPHI_{ma}$$

$$X_1 = NPHI + M_1 (RHOB_{ma} - RHOB)$$
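As an illustration of the linear gamma-ray shale volume (Eq. 1), the following is a minimal Python sketch. The function name, the sample readings and the sand/shale baselines are hypothetical, and clipping the result to the 0 to 1 range is an assumption of this sketch rather than a step stated in the text.

```python
def shale_volume_linear(gr_log, gr_sand, gr_shale):
    """Linear gamma-ray shale volume (Eq. 1), clipped to the range 0..1."""
    if gr_shale == gr_sand:
        raise ValueError("gr_shale and gr_sand must differ")
    vsh = (gr_log - gr_sand) / (gr_shale - gr_sand)
    # Clipping is an assumption of this sketch, not a step stated in the text.
    return min(1.0, max(0.0, vsh))


# Hypothetical gamma-ray readings (API) and baselines, for illustration only.
gr_readings = [20.0, 45.0, 70.0, 120.0, 150.0]
vsh_curve = [
    shale_volume_linear(gr, gr_sand=20.0, gr_shale=120.0) for gr in gr_readings
]
print(vsh_curve)
```

Readings above the shale baseline would give values above 1 without the clip, which is why the sketch bounds the result.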
Fig. 2 Tectonostratigraphic history and stratigraphic megasequences of Gulf of Suez (Peijs et al. 2012)
OCT-A2B √ √ × √ √ √ √ √ × √ √
OCT-B6 √ √ √ √ √ √ √ √ × √ √
OCT-B8 √ √ √ √ √ √ √ √ × √ ×
OCT-K5 √ √ √ √ √ × √ √ √ × ×
$$\phi = \sqrt{\frac{\phi_N^2 + \phi_D^2}{2}} \tag{4}$$

$$\phi_e = \phi_t - \phi_{sh} V_{sh} \tag{5}$$

$$\phi_{sh} = \frac{\rho_{ma} - \rho_{sh}}{\rho_{ma} - \rho_{fluid}} \tag{6}$$

where
Øe = effective porosity.
Øt = total porosity.
Øsh = porosity of shale derived.
Vsh = volume of shale at formation.

$$S_W = \sqrt[n]{\frac{R_W}{\phi^m R_T}} \tag{7}$$

where
SW = water saturation, fraction.
Ø = porosity, fraction.
RW = resistivity of formation water, ohm.m.
RT = resistivity of uninvaded formation, ohm.m.
m = cementation exponent.
n = saturation exponent.

$$S_{Windo} = \left\{ \frac{\sqrt{1/R_t}}{\dfrac{V_{sh}^{(1 - 0.5 V_{sh})}}{\sqrt{R_{sh}}} + \sqrt{\dfrac{\phi^m}{a R_w}}} \right\}^{2/n} \tag{8}$$

where
SWindo = water saturation, fraction.
Rt = resistivity of uninvaded formation, ohm.m.
Rsh = resistivity of shale, ohm.m.
Vsh = volume of shale.
Ø = porosity, fraction.
a = Archie constant.
Rw = resistivity of formation water, ohm.m.
m = cementation exponent.

Rock typing/Clustering the formation

Core data were used to perform rock typing by reservoir quality index (RQI), normalized porosity (Øz) and flow zone indicator (FZI), computed from the measured helium porosity and horizontal permeability using the Amaefule formulas (Amaefule et al. 1993; El-Sayed et al. 2021; Abuhagaza et al. 2021; Kassab et al. 2021; Radwan et al. 2021a; El-Gendy et al. 2022; Ismail et al. 2023; Hassan et al. 2023) (Eqs. 9–11). The samples were then sorted in terms of FZI and grouped into rock types/clusters. The rock typing process was validated by comparing the permeability calculated from the actual FZI and the actual core porosity with the actual permeability measured from core.

$$RQI = 0.0314 \sqrt{\frac{k}{\phi}} \tag{9}$$

where
k is the horizontal permeability measured from core.
Ø is the porosity measured from core (preferably helium porosity).
Also, the normalized porosity (Øz) was calculated by the following equation.

$$\phi_z = \frac{\phi}{1 - \phi} \tag{10}$$

Then, the flow zone indicator (FZI) was calculated by the following equation.

$$FZI = \frac{RQI}{\phi_z} \tag{11}$$

Using machine‑learning (ML) for rock type classification

In contrast with conventional programming methods, which typically involve creating a detailed design and implementing it as a program, ML takes a different approach. Instead
of explicitly instructing the computer on how to solve a problem, ML algorithms learn from data and examples to generalize patterns and make informed predictions or decisions (Rebala et al. 2019). The process of machine-learning is divided into the following steps:

Feature engineering and data cleaning

Features are the machine-learning term for the input data to the model, while labels are the term for the output data predicted or classified by the model. The features were conventional logs, while the label was the rock type.
Data cleaning was done by identifying and eliminating plugs with bad test measurements (plug broken or test failure), nearly impermeable plugs (less than 1 mD) and plugs at depths where the logs had recording problems (noise during data acquisition), such as washout intervals in the well where the density and neutron logs would give wrong readings for the formation.
Feature engineering consisted of choosing the conventional logs to be used in the machine-learning model; the gamma-ray (GR), density (RHOB), neutron (NPHI) and sonic (DT) logs were chosen. The resistivity log was excluded because it responds mainly to the formation fluids rather than to the facies, porosity or permeability of the formation (its pore space, size and geometry), and it shows low to no collinearity with these features.
These logs were scaled and normalized for each well by means of the min–max scaler (Dodge 2003; Freedman et al. 2020) (Eq. 12), which uses the following equation:

$$X_{scaled} = \frac{X - X_{min}}{X_{max} - X_{min}} \tag{12}$$

The standard scaler was excluded because it performs better on data that follow a Gaussian distribution, which these data did not. The scaling step is essential for the machine-learning model, as most machine-learning algorithms are sensitive to the input values and could be biased toward large values and inputs with large scales (the GR curve ranges from 0 to 100, while the RHOB curve ranges from 1.95 to 2.95), which would degrade the machine-learning model.

Choosing the machine‑learning algorithm

The generalized buffering algorithm, which considers geometric distance along with characteristics and attributes (Zhou et al. 2021), and the deep learning stochastic framework that aids subsurface structure estimation (Zhan et al. 2022) were examined first. Core data and log data for each well were then concatenated by means of measured depth. Then, the data are split based on the wells, where OCT-A2B and OCT-B8 were the train data, i.e., the data from which the machine-learning algorithm will learn, and OCT-B6 was the test data, i.e., the data on which the machine-learning algorithm will apply what it learned. The following machine-learning algorithms were used for the data after the wrangling:

1. Random forest classifier (RF)
2. Extreme gradient boost classifier (XGB)
3. Multi-layer perceptron/neural network classifier (MLP/NN)
4. Support vector machine classifier (SVC) with radial basis function kernel (RBF)
5. Support vector machine classifier (SVC) with polynomial kernel

A range of hyper-parameters was tried for each algorithm by iterating and observing the results and metrics, namely the accuracy score (Eq. 13), precision score (Eq. 14), recall score (Eq. 15), F1 score (Eq. 16) and confusion matrix. The confusion matrix was the most important and effective metric, as it clarifies how each sample has been classified and what its classification result was (Powers 2020).
These metrics have the following equations:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{13}$$

$$Precision = \frac{TP}{TP + FP} \tag{14}$$

$$Recall = \frac{TP}{TP + FN} \tag{15}$$

$$F1\ Score = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \tag{16}$$

where TP, TN, FP and FN are true positive, true negative, false positive and false negative, respectively.
A true positive (TP) occurs when the model correctly predicts an instance as belonging to the positive class. It means the model identified a positive case correctly.
A true negative (TN) happens when the model correctly predicts an instance as belonging to the negative class. It means the model identified a negative case correctly.
On the other hand, a false positive (FP) occurs when the model incorrectly predicts an instance as belonging to the positive class when it is actually from the negative class. It means the model identified a negative case as positive.
A false negative (FN) happens when the model incorrectly predicts an instance as belonging to the negative
class when it is actually from the positive class. It means the model identified a positive case as negative.
The positive class is the event that the model is aiming to predict/classify, while the negative class is any other possibility. In our case, one of the rock types will be the positive class, while all the other rock types will be considered negative classes by the model. The algorithm and hyper-parameters that produced the best accepted results according to these metrics were those of extreme gradient boosting (XGB) (Chen and Guestrin 2016). XGB is a boosted ensemble of decision trees, the same building unit used by random forests; it is a split-based algorithm that predicts or classifies the labels of a labeled dataset by choosing splits that reduce an impurity or loss measure of the data (the entropy and Gini index in classical decision trees), since the dataset is better classified or predicted as that measure decreases. Each tree is a relatively weak predictor, and the trees work sequentially, not in parallel. The results of each weak predictor are passed to the next weak predictor so that each consecutive predictor learns from the errors of the previous one, thus minimizing the error and achieving the best results possible. The XGB is used with its default hyper-parameters except for Max_Depth, as the best value for it was ten instead of the default value of six.

empirically derived coefficients are found through iterations of sonic and density logs in numerous offset wells. The machine-learning model is then applied to the well, and the permeability curve is calculated. Permeability from the NMR log is calculated by the Coates formula (Coates et al. 1999) (Eq. 19). Finally, the permeability calculated from the machine-learning model is compared to that calculated from the NMR log.

$$\rho = \alpha V_p^{\beta} \tag{18}$$

where
ρ is the density log in g/cm3.
Vp is the sonic log or P-wave velocity.
α is an empirically derived coefficient.
β is an empirically derived coefficient.

$$k = \left( \frac{\phi_{NMR}}{C} \right)^4 \left( \frac{FFI}{BVI} \right)^2 \tag{19}$$

where
k is the horizontal permeability.
C is a constant depending on lithology.
ØNMR is the porosity from NMR log (given from the log).
FFI is the free fluid index (given from the log).
BVI is bulk volume irreducible (given from the log).
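Equations 18 and 19 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the coefficients `ALPHA` and `BETA` and the constant `c` are placeholder values rather than the ones derived from the offset wells, and the unit convention for the NMR porosity (percent here) is an assumption, since the text does not state it. Solving Eq. 18 for Vp is shown because it is presumably how a missing sonic curve would be synthesized from density.

```python
def gardner_density(vp, alpha, beta):
    """Eq. 18: density from P-wave velocity, rho = alpha * Vp**beta."""
    return alpha * vp ** beta


def gardner_velocity(rho, alpha, beta):
    """Eq. 18 solved for Vp: Vp = (rho / alpha)**(1 / beta)."""
    return (rho / alpha) ** (1.0 / beta)


def coates_permeability(phi_nmr, ffi, bvi, c):
    """Eq. 19: k = (phi_nmr / C)**4 * (FFI / BVI)**2."""
    if bvi <= 0 or c <= 0:
        raise ValueError("bvi and c must be positive")
    return (phi_nmr / c) ** 4 * (ffi / bvi) ** 2


# Placeholder coefficients (assumptions); the study derives its own values
# from sonic and density logs in offset wells.
ALPHA, BETA = 0.23, 0.25

# Synthesize a velocity from a hypothetical density reading of 2.45 g/cm3.
vp_synth = gardner_velocity(2.45, ALPHA, BETA)

# Hypothetical NMR readings: porosity in percent, FFI and BVI as fractions.
k_nmr = coates_permeability(phi_nmr=20.0, ffi=0.15, bvi=0.05, c=10.0)
print(vp_synth, k_nmr)
```

Because Eq. 19 raises the porosity term to the fourth power, the result is very sensitive to the porosity units and to C, so both need to follow one consistent convention.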
Fig. 3 OCT-B8 formation evaluation. The caliper log (CALI) and gamma-ray log (GR) are in the first track. Bulk density log (RHOB), neutron porosity log (NPHI) and bulk density correction log (DRHO) are in the second track. The deep resistivity log (RD) and medium resistivity log (RM) are in the third track. Sonic travel time log (DT) is in the fourth track. Bad hole flag (BHF) is in the fifth track, final shale volume (Vsh) is in the sixth track, final porosity (PHI_FINAL) is in the seventh track and Indonesian (final) water saturation (SWE_INDO) is in the eighth track. Finally, the reservoir net flag (RES_NET_FLAG) is in the ninth track, and the pay net flag (PAY_NET_FLAG) is in the tenth track
Fig. 4 OCT-A2B formation evaluation. The caliper log (CALI) and gamma-ray log (GR) are in the first track. Bulk density log (RHOB), neutron porosity log (NPHI) and bulk density correction log (DRHO) are in the second track. The deep resistivity log (RD), deep induction resistivity log (ILD), medium resistivity log (RM) and medium induction resistivity log (ILM) are in the third track. Sonic travel time log (DT) is in the fourth track. Bad hole flag (BHF) is in the fifth track, final shale volume (Vsh) is in the sixth track, final porosity (PHI_FINAL) is in the seventh track and Indonesian (final) water saturation (SWE_INDO) is in the eighth track. Finally, the reservoir net flag (RES_NET_FLAG) is in the ninth track, and the pay net flag (PAY_NET_FLAG) is in the tenth track
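The reservoir and pay net flags shown in the last two tracks come from the sequential cutoffs on shale volume (50%), porosity (10%) and water saturation (50%). A minimal Python sketch of such flagging follows. The function name and the sample values are hypothetical, and two details are assumptions of the sketch because the text does not spell them out: the direction of each inequality, and that the shale and porosity cutoffs define the reservoir flag while the water saturation cutoff is added for the pay flag.

```python
def net_flags(vsh, phi, sw, vsh_cut=0.50, phi_cut=0.10, sw_cut=0.50):
    """Return (reservoir_flag, pay_flag) as 0/1 for one depth sample."""
    # Assumed convention: reservoir rock is clean enough and porous enough.
    reservoir = vsh <= vsh_cut and phi >= phi_cut
    # Assumed convention: pay is reservoir rock with low water saturation.
    pay = reservoir and sw <= sw_cut
    return int(reservoir), int(pay)


# Hypothetical samples: (shale volume, porosity, water saturation) as fractions.
samples = [
    (0.10, 0.18, 0.30),  # clean, porous, hydrocarbon bearing
    (0.20, 0.15, 0.80),  # reservoir rock but wet
    (0.70, 0.16, 0.40),  # too shaly
    (0.15, 0.05, 0.35),  # too tight
]
flags = [net_flags(vsh, phi, sw) for vsh, phi, sw in samples]
net_reservoir = sum(res for res, _ in flags)
net_pay = sum(pay for _, pay in flags)
pay_to_reservoir = net_pay / net_reservoir if net_reservoir else 0.0
print(flags, pay_to_reservoir)
```

Applying the cutoffs in sequence means a sample can only be flagged as pay if it has already passed the reservoir test.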
(n), Archie constant (a) and the formation water resistivity (Rw) are 1.95, 2, 1 and 0.016, respectively. Table 3 shows the results of the average formation evaluation parameters. Figure 3 shows the formation evaluation for well OCT-B8. Figure 4 shows the same logs and curves for well OCT-A2B.

Clustering and rock typing

After removing any impermeable core samples, i.e., any samples with permeability less than 1 mD, 705 samples were obtained. Then, the rock quality index (RQI), normalized porosity (Øz) and flow zone indicator (FZI) were calculated.
These samples were then divided into five rock types or clusters by means of FZI. Table 4 shows the FZI range by which each rock type was classified and the mean FZI for each rock type. Figure 5 shows the samples after being rock typed.
To validate this clustering/rock typing process, the mean FZI of the rock type of each sample, along with the helium porosity measured from the core, is used to calculate permeability, and the permeability calculated by this method is then plotted versus the measured permeability from the core on a log–log scale. The R² correlation coefficient was approximately 0.938, which indicates a very strong correlation; therefore, the rock typing is valid. Figure 6 shows the measured permeability from the core versus the calculated permeability from the rock typing process.

Machine‑learning process

Data cleaning and exploration

A filter is applied to the core samples: any point in the studied wells that has a caliper log (CALI) reading equal to or greater than 9 inches (0.5 inches greater than the bit size, as the bit size in all wells is 8.5 inches) is eliminated. This step aimed to remove noisy readings with errors due to washout, as the density and neutron logs would mostly have incorrect readings there because of their relatively shallow depth of investigation, which would result in readings that represent the drilling mud properties more than the formation properties. This step further reduced the number of samples from 705 to 623. The core samples were depth-matched to the logs via the core gamma-ray.

Table 4 Criteria of FZI range and mean for rock typing
Rock type number   FZI range (cutoff)   Mean FZI
1                  0.488–1.8            1.19
2                  1.8–5                3.37
3                  5–9                  6.9
4                  9–14.92              11.73
5                  14.92–46.222         21.49
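The rock typing step (Eqs. 9–11 combined with the cutoffs of Table 4) can be sketched as follows. This is a minimal Python illustration with hypothetical function names and a hypothetical core plug. How a value exactly on a cutoff is assigned, and how values outside the tabulated 0.488 to 46.222 range are handled, are assumptions of the sketch.

```python
import math
from bisect import bisect_right

# Upper FZI bounds of rock types 1 to 4, taken from Table 4.
FZI_EDGES = [1.8, 5.0, 9.0, 14.92]


def rqi(k_md, phi):
    """Eq. 9: RQI = 0.0314 * sqrt(k / phi), k in mD and phi as a fraction."""
    return 0.0314 * math.sqrt(k_md / phi)


def normalized_porosity(phi):
    """Eq. 10: phi_z = phi / (1 - phi)."""
    return phi / (1.0 - phi)


def fzi(k_md, phi):
    """Eq. 11: FZI = RQI / phi_z."""
    return rqi(k_md, phi) / normalized_porosity(phi)


def rock_type(fzi_value):
    """Rock type 1..5 from the Table 4 cutoffs.

    Assumptions: a value exactly on a cutoff goes to the higher rock type,
    and values outside the tabulated range fall into types 1 or 5.
    """
    return bisect_right(FZI_EDGES, fzi_value) + 1


# Hypothetical core plug: 100 mD horizontal permeability, 20% helium porosity.
sample_fzi = fzi(k_md=100.0, phi=0.20)
sample_type = rock_type(sample_fzi)
print(round(sample_fzi, 3), sample_type)
```

For the hypothetical plug, the FZI is about 2.81, which falls in the 1.8–5 range of rock type 2.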
[Figure 6: calculated permeability from the rock typing process versus measured horizontal permeability (mD) from core, log–log axes; fitted trend y = 1.6391x^0.9267, R² = 0.938]
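The validation behind this cross-plot inverts Eqs. 9–11: a rock type's mean FZI and a sample's porosity give back a permeability, which is then compared with the measured value. A minimal Python sketch is given below. The function names and the example values are hypothetical, and computing R² as the squared correlation of the log10-transformed permeabilities is an assumption, since the text only states that the comparison is made on a log–log scale.

```python
import math

# Mean FZI per rock type, taken from Table 4.
MEAN_FZI = {1: 1.19, 2: 3.37, 3: 6.9, 4: 11.73, 5: 21.49}


def permeability_from_fzi(fzi_value, phi):
    """Invert Eqs. 9-11: k = phi * (FZI * phi_z / 0.0314)**2, k in mD."""
    phi_z = phi / (1.0 - phi)
    rqi = fzi_value * phi_z
    return phi * (rqi / 0.0314) ** 2


def r_squared_log(measured, calculated):
    """Squared Pearson correlation of log10 permeabilities (assumed metric)."""
    x = [math.log10(v) for v in measured]
    y = [math.log10(v) for v in calculated]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical plugs: (rock type, helium porosity as a fraction).
plugs = [(1, 0.12), (2, 0.20), (4, 0.22)]
k_calc = [permeability_from_fzi(MEAN_FZI[rt], phi) for rt, phi in plugs]
print([round(k, 1) for k in k_calc])
```

Using the mean FZI of the rock type instead of each sample's own FZI is what makes the comparison a test of the clustering rather than an identity.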
To select the best inputs/features/logs for the machine-learning model, two approaches were applied to the log data. First, a physical approach is used, which results in excluding the resistivity logs and caliper logs. The former respond more to the fluids inside the formation than to its lithology and facies, while the latter carry no information about facies and lithology, as they only indicate the hole size; the caliper log was therefore used only to identify washout intervals and eliminate their associated data.
The second approach was mathematical and statistical in nature: the correlation between the remaining logs (GR, RHOB, NPHI and DT) was examined to identify any features highly correlated with each other, so that only one of the correlated features is chosen. This step aims to eliminate any duplication in the features, as the mathematical model would be biased toward highly correlated features, which would strongly influence its prediction/classification. Figure 7 shows a heat map of the correlation matrix between the features themselves and between the features and the label/output, which is rock type (RT).
It can be noticed that no feature is highly correlated with any other, whether positively or negatively. Therefore, all four features are chosen to enter the model and are ready for scaling. In addition, the number of features is relatively small, which establishes a simple low-dimension machine-learning model that is easy to use for prediction/classification.
The features (log data) are then scaled: each curve in each well of the study is scaled separately, as each curve has its own range, and each well has its own unique conditions regarding the hole condition, hole size and tool calibrations and corrections. The data are split into train data that the model will learn from and test data by which the model is validated, by comparing the predicted/classified rock types with the actual rock types. The split ratio was 90% train data (561 from 623 samples) and 10% test data (61 from 623 samples). In terms of wells, the data of wells OCT-A2B and OCT-B8 were for training, while the OCT-B6 well was for testing.
Each algorithm had several trials with the data, applying wide ranges for the values of its respective hyper-parameters, and the choice was based on the metrics and results. Table 5 shows the best metrics and results of each algorithm after these trials, and Figs. 8, 9, 10, 11 and 12 show the confusion matrix of each of these algorithms.
As noticed, most of the metrics of most of these algorithms are similar, but the decisive metric was the confusion matrix, as the confusion matrices of the other algorithms showed rock types predicted/classified far from their actual rock types. The exception was the extreme gradient boosting (XGB) algorithm: although it misclassified a similar number of samples to the other algorithms, what it classified incorrectly was not extremely skewed or far away from the actual rock type (mostly a
Table 5 Metric score and results of different algorithms the model with the least important feature being RHOB
Metric/algo- SVM_RBF SVM_Poly XGB RFC MLP_NN
with around 22% importance and the most important feature
rithm being NPHI with around 28% importance. The importance
of the feature comes from how much it reduces the Gini
Train score 0.561 0.404 1 0.33 0.33 index and the entropy of the data with each split regarding
Test score 0.388 0.32 0.398 0.33 0.31 the feature.
Precision score 0.347 0.234 0.367 0.11 0.1
Recall score 0.388 0.32 0.398 0.33 0.31
Calculating permeability
F1 score 0.325 0.235 0.371 0.164 0.156
As mentioned previously, the mean FZI from the predicted/
classified rock types along with the helium porosity from the
difference of one rock type than the actual in either the nega- core were used to calculate the permeability by the Amae-
tive or positive direction). This could be an accepted result, fule equation then compared to the actual measured perme-
as rock types are not sharply separated and could show some ability from the core. This step is vital for the validation of
sort of interference between each rock type and the one that the machine-learning model. The R2 correlation coefficient
precedes or succeeds it. between the actual and calculated permeability was approxi-
Therefore, the best algorithm chosen for this study was mately 0.7281.
the XGB, and it was used with its default hyper-parameters Then, the same step is done, but with the porosity cal-
except for “Max_Depth,” as the best value for it was ten culated from the logs. This step is important due to the
instead of the default value of six. absence of core helium porosity in un-cored intervals and
In addition, it has been confirmed that the feature engi- wells. The R2 correlation coefficient was approximately 0.51
neering and selection was successful, as all features contrib- (exactly 0.5098). The huge difference in the R2 correlation
uted almost equally to the prediction/classification models. coefficient between estimating permeability based on helium
Figure 13 shows the feature importance of each feature to core porosity and the same estimation based on porosity
Journal of Petroleum Exploration and Production Technology
calculated from logs is not due to any defect in the machine- from logs, as in all cases, the same mean FZI is used in the
learning model. However, it is rather in the big difference equation, and in all cases, the same predicted/classified rock
between the helium core porosity and porosity calculated types are used. Figure 14 shows both steps and comparisons.
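The permeability calculation described above can be sketched as follows. This is a minimal illustration, assuming the standard form of the Amaefule et al. (1993) relations, RQI = 0.0314·sqrt(k/φ) and FZI = RQI/(φ/(1 − φ)), with k in mD, FZI in µm and φ as a fraction. The function names, the mean FZI values and the sample data are hypothetical and are not taken from the studied wells. R2 is computed here as the squared Pearson correlation of the log-transformed permeabilities, which is also an assumption, since the text does not state how it was evaluated.

```python
import math

def amaefule_permeability(fzi_um, porosity):
    """Permeability (mD) from a flow zone indicator (um) and fractional porosity.

    Rearranged from RQI = 0.0314 * sqrt(k / phi) and FZI = RQI / (phi / (1 - phi)).
    """
    if not 0.0 < porosity < 1.0:
        raise ValueError("porosity must be a fraction between 0 and 1")
    phi_z = porosity / (1.0 - porosity)   # normalized porosity index
    rqi = fzi_um * phi_z                  # reservoir quality index (um)
    return porosity * (rqi / 0.0314) ** 2

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Hypothetical mean FZI per rock type (um); not values from the study.
mean_fzi = {1: 0.5, 2: 1.5, 3: 4.0}

# Hypothetical samples: (predicted rock type, porosity fraction, measured k in mD).
samples = [(1, 0.10, 0.4), (2, 0.15, 12.0), (3, 0.20, 300.0), (2, 0.18, 20.0)]

k_calc = [amaefule_permeability(mean_fzi[rt], phi) for rt, phi, _ in samples]
k_meas = [k for _, _, k in samples]

# Compare in log space, because permeability spans several orders of magnitude.
r2 = r_squared([math.log10(k) for k in k_meas],
               [math.log10(k) for k in k_calc])
```

Swapping the core porosity for a log-derived porosity in `samples` while keeping `mean_fzi` fixed reproduces the kind of comparison made in the text, where only the porosity source changes.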
To prove that the decrease in the R2 correlation coefficient is due mainly to the porosity, another permeability was calculated using the mean FZI from the actual rock types and the porosity calculated from the logs, and then compared with the actual permeability measured from the core. The R2 correlation coefficient reached over 0.66 (exactly 0.6682).
Fig. 13 XGB feature importance for the four features (NPHI, GR, DT and RHOB). NPHI has the highest importance, about 28%, while RHOB has the lowest, about 22%
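As a rough illustration of the impurity-based notion of feature importance described in the text, the sketch below computes the Gini index and the entropy of a set of rock-type labels and the reduction that a single split achieves. The labels, the split and the function names are hypothetical. It illustrates the general idea only: XGBoost scores splits with its own loss-based gain, so this is not the library's computation.

```python
import math
from collections import Counter

def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class proportions, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def impurity_decrease(parent, left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted

# Hypothetical rock-type labels before and after a split on one log feature.
parent = [1, 1, 1, 2, 2, 2, 3, 3]
left, right = [1, 1, 1, 2], [2, 2, 3, 3]

gini_drop = impurity_decrease(parent, left, right, gini)
entropy_drop = impurity_decrease(parent, left, right, entropy)
```

In an impurity-based scheme, the importance of a feature is obtained by summing such decreases over every split that uses the feature and normalizing so that all features add up to 100%.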
Figure 15 shows the comparison between the measured permeability and the permeability calculated from the actual rock types and the log-derived porosity. Figure 16 shows a comparison between actual core measurements of porosity and horizontal and vertical permeability versus the porosity calculated from logs and the horizontal permeability predicted by the machine-learning algorithm.
Fig. 15 (plot) Measured horizontal permeability (mD) on the x-axis versus the permeability calculated with the porosity calculated from logs on the y-axis, both on logarithmic scales; fitted trend y = 2.9875x^0.8427, R² = 0.6682
Dealing with missing logs using Gardner equation
When attempting to apply the machine-learning model to the un-cored well OCT-K5, it was found that the well has no sonic log, which is an essential input/feature of the machine-learning model. Therefore, the Gardner equation was used to estimate the sonic log from the density log.
To obtain the best α and β, many trials were run on the density logs of the wells offset to OCT-K5, each time comparing the calculated sonic log with the actual sonic log; the best values were found to be α = 0.2 and β = 3.1. The correlation coefficient (R) was approximately −0.71 (exactly −0.708); it is negative because of the inverse relation between density (compactness of the formation) and sonic travel time.
NMR permeability versus estimated core permeability from rock typing
After the sonic log was estimated with the Gardner equation, the machine-learning model could be applied to well OCT-K5, and the rock types of the Nubia formation in the well could be predicted/classified, from which permeability could be estimated as in the previous wells.
In addition, the NMR log run in the well was used to estimate permeability using the Coates method/equation, with the constant “C” taken as 33, the value appropriate for sandstone, the dominant lithology in the formation.
Finally, the permeability estimates from both sources were compared with each other. The R2 correlation coefficient was approximately 0.48 (exactly 0.4755). The reduction in correlation coefficient and accuracy in this well relative to the cored wells is due to the synthesizing of the sonic log, which, like any synthesizing process, introduces some error into the estimation.
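The sonic-log synthesis described above can be sketched as follows. The form of the Gardner relation used in the study is not restated in this section, so this minimal sketch assumes the classic form ρ = α·V^β (ρ in g/cm³, V in ft/s) with the commonly quoted coefficients α = 0.23 and β = 0.25. These are not the calibrated α = 0.2 and β = 3.1 reported in the text, whose units and equation form may differ. The function name and the sample densities are hypothetical.

```python
def sonic_from_density(rhob_gcc, alpha=0.23, beta=0.25):
    """Estimate sonic travel time DT (us/ft) from bulk density (g/cm3).

    Inverts the assumed Gardner form rho = alpha * V**beta for the velocity V
    in ft/s, then converts velocity to travel time in microseconds per foot.
    """
    if rhob_gcc <= 0:
        raise ValueError("density must be positive")
    velocity_ft_s = (rhob_gcc / alpha) ** (1.0 / beta)
    return 1.0e6 / velocity_ft_s

# Hypothetical density readings (g/cm3) from an un-cored interval.
rhob = [2.20, 2.35, 2.50]
dt_synthetic = [sonic_from_density(r) for r in rhob]
```

Denser rock gives a shorter travel time, which is the inverse relation noted in the text. In practice α and β would be calibrated against offset wells that have both logs, as the study describes.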
Fig. 16 OCT-A2B measured permeability versus predicted permeability: the core measurements of helium porosity (COPHI_SHIFT) along with the effective porosity calculated from logs (PHIE_ND) in the first track; the measured horizontal permeability from core (COHK_SHIFT) along with the permeability calculated from the mean flow zone indicators (FZI) given by the rock types predicted through the machine-learning model (COHK_CALC) in the second track; and the measured vertical permeability from core (COVK_SHIFT) in the third track for the well OCT-A2B
References
Abuhagaza AA, Kassab MA, Wanas HA, Teama MA (2021) Reservoir quality and rock type zonation for the Sidri and Feiran members of the Belayim Formation, in Belayim Land Oil Field, Gulf of Suez, Egypt. J Afr Earth Sci 181:104242
Ali J, Ashraf U, Anees A, Peng S, Umar MU, Vo Thanh H, Khan U, Abioui M, Mangi HN, Ali M (2022) Hydrocarbon potential assessment of carbonate-bearing sediments in a meyal oil field, Pakistan: insights from logging data using machine learning and quanti elan modeling. ACS Omega 7:39375–39395
Ali N, Chen J, Fu X, Hussain W, Ali M, Iqbal SM, Anees A, Hussain M, Rashid M, Thanh HV (2023) Classification of reservoir quality using unsupervised machine learning and cluster analysis: example from Kadanwari gas field, SE Pakistan. Geosyst Geoenviron 2:100123
Altunbay MM, Gaafar GR, Ahmad M, Rafek AGM (2018) Development of “Hydraulic Units” (HU) concept in rock typing
Amaefule JO, Altunbay M, Tiab D, Kersey DG, Keelan DK (1993) Enhanced reservoir description: using core and log data to identify hydraulic (flow) units and predict permeability in uncored intervals/wells. In: SPE annual technical conference and exhibition. OnePetro. https://doi.org/10.2118/26436-MS
Amjad MR, Shakir U, Hussain M, Rasul A, Mehmood S, Ehsan M (2023) Sembar formation as an unconventional prospect: new insights in evaluating shale gas potential combined with deep learning. Nat Resour Res 1–29
Archie GE (1942) The electrical resistivity log as an aid in determining some reservoir characteristics. Trans AIME 146:54–62
Asquith GB, Gibson CR (1982) Basic well log analysis for geologists. American Association of Petroleum Geologists, Tulsa
Atlas D (1979) Log Interpretation Charts. Dresser Industries Inc., 107p
Borling DC, Powers BS, Ramadan N (1996) Water shut-off case history using through-tubing bridge plugs; October Field, Nubia Formation, Gulf of Suez, Egypt. In: Abu Dhabi international petroleum exhibition and conference. OnePetro
Bosworth W, McClay K (2001) Structural and stratigraphic evolution of the Gulf of Suez rift, Egypt: a synthesis. Mém Mus Natl D’hist Nat 1993(186):567–606
Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. pp 785–794
Coates GR, Xiao L, Prammer MG (1999) NMR logging: principles and applications. Halliburton Energy Services, Houston
Egyptian General Petroleum Corporation (EGPC) (1996) Gulf of Suez oil fields (a comprehensive overview). EGPC, Cairo, Egypt
Darling T (2005) Well logging and formation evaluation. Elsevier
Deaf AS (2009) Palynology, palynofacies and hydrocarbon potential of the Cretaceous rocks of northern Egypt. University of Southampton
Dodge Y (2003) The Oxford dictionary of statistical terms. OUP Oxford
Ehsan M, Gu H, Ahmad Z, Akhtar MM, Abbasi SS (2019) A modified approach for volumetric evaluation of shaly sand formations from conventional well logs: a case study from the talhar shale, Pakistan. Arab J Sci Eng 44:417–428
Ehsan M, Toor MAS, Hajana MI, Al-Ansari N, Ali A, Elbeltagi A (2023) An integrated study for seismic structural interpretation and reservoir estimation of Sawan gas field, Lower Indus Basin, Pakistan. Heliyon 9
El Nady MM, Ramadan FS, Hammad MM, Lotfy NM (2015) Evaluation of organic matters, hydrocarbon potential and thermal maturity of source rocks based on geochemical and statistical methods: case study of source rocks in Ras Gharib oilfield, central Gulf of Suez, Egypt. Egypt J Pet 24:203–211
El-Din ES, Mesbah MA, Kassb MA, Mohamed IF, Cheadle BA, Teama MA (2013) Assessment of petrophysical parameters of clastics using well logs: the Upper Miocene in El-Wastani gas field, onshore Nile Delta, Egypt. Pet Explor Dev 40:488–494
El-Gendy NH, Radwan AE, Waziry MA, Dodd TJ, Barakat MK (2022) An integrated sedimentological, rock typing, image logs, and artificial neural networks analysis for reservoir quality assessment of the heterogeneous fluvial-deltaic Messinian Abu Madi reservoirs, Salma field, onshore East Nile Delta, Egypt. Mar Pet Geol 145:105910
El-Ghamri MA, Warburton IC, Burley SD (2002) Hydrocarbon generation and charging in the October Field, Gulf of Suez, Egypt. J Pet Geol 25:433–464
El-Sayed AMA, Sayed NAE, Ali HA, Kassab MA, Abdel-Wahab SM, Gomaa MM (2021) Rock typing based on hydraulic and electric flow units for reservoir characterization of Nubia Sandstone, southwest Sinai, Egypt. J Pet Explor Prod Technol 11:3225–3237
Elsayed AG, Kassab M, Osman W (2021) Evaluation of petrophysical and hydrocarbon potentiality for the Nubia A, Ras Budran oil field, Gulf of Suez, Egypt. Egypt J Chem 64:3387–3404
Freedman D, Pisani R, Purves R (2020) Statistics: fourth international student edition. WW Norton & Co
Gao G, Hazbeh O, Davoodi S, Tabasi S, Rajabi M, Ghorbani H, Radwan AE, Csaba M, Mosavi AH (2023) Prediction of fracture density in a gas reservoir using robust computational approaches. Front Earth Sci 10:1023578
Gardner GHF, Gardner LW, Gregory AR (1974) Formation velocity and density: the diagnostic basics for stratigraphic traps. Geophysics 39:770–780
Hasouba M, Abd El Shafy A, Mohamed A (1992) Nezzazat Group: reservoir geometry and rock types in the October field area, Gulf of Suez. In: 11th EGPC petroleum exploration and production conference. pp 293–317
Hassan AR, Radwan AA, Mahfouz KH, Leila M (2023) Sedimentary facies analysis, seismic interpretation, and reservoir rock typing of the syn-rift Middle Jurassic reservoirs in Meleiha concession, North Western Desert, Egypt. J Pet Explor Prod Technol 13:2171–2195
Hussein I, El Kammar AM, Maky AF, Elshafeiy M (2017) Comparative organic geochemical studies on some Miocene and Cretaceous rock units in October field, Gulf of Suez, Egypt
Ismail A, Zein el-Din MY, Radwan AE, Gabr M (2023) Rock typing of the Miocene Hammam Faraun alluvial fan delta sandstone reservoir using well logs, nuclear magnetic resonance, artificial neural networks, and core analysis, Gulf of Suez, Egypt. Geol J
Kassab MA, Abbas A, Ghanima A (2020) Petrophysical evaluation of clastic Upper Safa Member using well logging and core data in the Obaiyed field in the Western Desert of Egypt. Egypt J Pet 29:141–153
Kassab MA, Elgibaly A, Abbas A, Mabrouk I (2021) Identification and distribution of hydraulic flow units of heterogeneous reservoir in Obaiyed gas field, Western Desert, Egypt: a case study. AAPG Bull 105:2405–2424
Kassem AA, Sen S, Radwan AE, Abdelghany WK, Abioui M (2021) Effect of depletion and fluid injection in the Mesozoic and Paleozoic sandstone reservoirs of the October Oil Field, Central Gulf of Suez Basin: implications on drilling, production and reservoir stability. Nat Resour Res 30:2587–2606
Khattab MA, Radwan AE, El-Anbaawy MI, Mansour MH, El-Tehiwy AA (2023) Three-dimensional structural modelling of structurally complex hydrocarbon reservoir in October Oil Field, Gulf of Suez, Egypt. Geol J
Lelek JJ, Shepherd DB, Stone DM, Abdine AS (1992) October Field: The Latest Giant under Development in Egypt’s Gulf of Suez: Chapter 15
Manzoor U, Ehsan M, Radwan AE, Hussain M, Iftikhar MK, Arshad F (2023) Seismic driven reservoir classification using advanced machine learning algorithms: a case study from the lower Ranikot/Khadro sandstone gas reservoir, Kirthar fold belt, lower Indus Basin, Pakistan. Geoenergy Sci Eng 222:211451
Marghani MM, Zairi M, Radwan AE (2023) Facies analysis, diagenesis, and petrophysical controls on the reservoir quality of the low porosity fluvial sandstone of the Nubian formation, east Sirt Basin, Libya: insights into the role of fractures in fluid migration, fluid flow, and enhancing the permeability of low porous reservoirs. Mar Pet Geol 147:105986
Moustafa AM (1976) Block faulting in the Gulf of Suez. In: Proceedings of the 5th Egyptian general petroleum corporation exploration seminar, Cairo, Egypt
Mustafa A, Tariq Z, Mahmoud M, Radwan AE, Abdulraheem A, Abouelresh MO (2022) Data-driven machine learning approach to predict mineralogy of organic-rich shales: an example from Qusaiba Shale, Rub’al Khali Basin, Saudi Arabia. Mar Pet Geol 137:105495
Nabawy B, Abudeif AM, Masoud MM (2021) Petrophysical characterization, microfacies analysis, and diagenetic attributes of the Lower Jurassic surface analog sequence in Gebel El-Maghara area, North Sinai, Egypt
Patton TL, Moustafa AR, Nelson RA, Abdine SA (1994) Tectonic evolution and structural setting of the Suez rift: chapter 1: Part I. Type basin: Gulf of Suez
Peijs J, Bevan TG, Piombino JT (2012) The Gulf of Suez rift basin. In: Regional geology and tectonics: phanerozoic rift systems and sedimentary basins. Elsevier, pp 164–194
Poupon A, Leveaux J (1971) Evaluation of water saturation in shaly formations. In: SPWLA 12th annual logging symposium. OnePetro
Powers DM (2020) Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv preprint arXiv:2010.16061
Radwan AE, Kassem AA, Kassem A (2020) Radwany formation: a new formation name for the Early-Middle Eocene carbonate sediments of the offshore October oil field, Gulf of Suez: contribution to the Eocene sediments in Egypt. Mar Pet Geol 116:104304
Radwan AE, Nabawy BS, Kassem AA, Hussein WS (2021a) Implementation of rock typing on waterflooding process during secondary recovery in oil reservoirs: a case study, El Morgan Oil Field, Gulf of Suez, Egypt. Nat Resour Res 30:1667–1696
Radwan AE, Trippetta F, Kassem AA, Kania M (2021b) Multi-scale characterization of unconventional tight carbonate reservoir: insights from October oil filed, Gulf of Suez rift basin, Egypt. J Pet Sci Eng 197:107968
Radwan AE, Wood DA, Radwan AA (2022) Machine learning and data-driven prediction of pore pressure from geophysical logs: a case study for the Mangahewa gas field, New Zealand. J Rock Mech Geotech Eng 14:1799–1809
Rajabi M, Beheshtian S, Davoodi S, Ghorbani H, Mohamadian N, Radwan AE, Alvar MA (2021) Novel hybrid machine learning optimizer algorithms to prediction of fracture density by petrophysical data. J Pet Explor Prod Technol 11:4375–4397
Rajabi M, Hazbeh O, Davoodi S, Wood DA, Tehrani PS, Ghorbani H, Mehrad M, Mohamadian N, Rukavishnikov VS, Radwan AE (2023) Predicting shear wave velocity from conventional well logs with deep and hybrid machine learning algorithms. J Pet Explor Prod Technol 13:19–42
Rebala G, Ravi A, Churiwala S (2019) An introduction to machine learning. Springer
Rezaee R, Ekundayo J (2022) Permeability prediction using machine learning methods for the CO2 injectivity of the precipice sandstone in Surat Basin, Australia. Energies 15:2053
Schlumberger LI (1972) Volume 1-Principles. Schlumberger Limited, New York, p 113
Schlumberger EUM (2009) Technical description. Schlumberger Ltd, pp 519–538
Thanh HV, Yasin Q, Al-Mudhafar WJ, Lee K-K (2022) Knowledge-based machine learning techniques for accurate prediction of CO2 storage performance in underground saline aquifers. Appl Energy 314:118985
Thanh HV, Rahimi M, Dai Z, Zhang H, Zhang T (2023) Predicting the wettability rocks/minerals-brine-hydrogen system for hydrogen storage: re-evaluation approach by multi-machine learning scheme. Fuel 345:128183
Wang X, Li C, Chen P (2020) Rock typing from cored intervals to all wells with method of decision tree. In: IOP conference series: earth and environmental science. IOP Publishing, p 042003
Weller A, Kassab MA, Debschütz W, Sattler C-D (2014) Permeability prediction of four Egyptian sandstone formations. Arab J Geosci 7:5171–5183
Wyllie MRJ (1963) The fundamentals of well log interpretation. Academic Press
Zahran M (1986) In Geology of October field. In: The 8th Exploration International Conference, Egyptian General Petroleum Cooperation, Cairo
Zhan C, Dai Z, Soltanian MR, de Barros FP (2022) Data-worth analysis for heterogeneous subsurface structure identification with a stochastic deep learning framework. Water Resour Res 58:e2022WR033241
Zhou G, Zhang R, Huang S (2021) Generalized Buffering Algorithm. IEEE Access 9:27140–27157
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.