
The 45th Annual Scientific Meeting of Himpunan Ahli Geofisika Indonesia

17-20 October 2020

Machine Learning Application to Predict Potential Reservoir and Hydrocarbon Zone from Incomplete Well Data

Dimas Andreas Panggabean1a)*, Jihan Hardiyanti Arief1b), MN. Alamsyah2c)


1 University of Brawijaya, Malang, East Java, Indonesia
2 PetroChina International Companies (Indonesia), DKI Jakarta, Indonesia
a) panggabeandimasandreas@gmail.com
b) jihan.hardiyanti@gmail.com
c) m.alamsyah@petrochina.co.id

*panggabeandimasandreas@gmail.com

Abstract. Machine learning can improve the capabilities of the increasingly competitive oil and gas
sector, and one of its most noticeable effects is on the discovery process. This is illustrated by a
case study from the "MajuRoyal" oil field, in which potential reservoirs and hydrocarbons were sought
in different structural compartments using two wells, Well-A and Well-B, with different levels of data
completeness. The k-Nearest Neighbor (KNN) algorithm, one of the simplest machine learning
algorithms, was used to predict electrofacies, lithology, and hydrocarbon zones. Well-A, which has
the more complete data, served as training data and Well-B as test data. Classifications of
electrofacies, lithology, and hydrocarbon zones and their potential were modeled on the training
data, and the KNN models were evaluated quantitatively using a confusion matrix. Comparing actual
and predicted classes on the training data gives accuracy scores above 0.8 for the electrofacies,
lithology, and hydrocarbon zone models, which indicates that the KNN algorithm can be used to
predict these classes in the test data. Applied to the test data, the models give good results:
qualitatively, the electrofacies in Well-A and Well-B can be correlated, and the lithology and
hydrocarbon zone predictions in Well-B show a good correlation. As validation, well tests were
carried out on the predicted hydrocarbon zones in Well-B. The results were very satisfactory and
establish a new oil compartment in the "MajuRoyal" oil field.

Keywords: machine learning, classification, KNN Algorithm, prediction, electrofacies, lithology, HC Zone

1. Introduction
The "MajuRoyal" oil field (Figure 1A) produces from Early Miocene to Middle Miocene intervals deposited in a transitional environment dominated by marine shales and open marine sandstones [3]. Following the success of oil production in this field, a re-evaluation study of abandoned exploration wells was carried out to assess their hydrocarbon potential, especially for wells outside the producing structure. This study uses data from two wells, Well-A and Well-B. As shown in Figure 1A, Well-A and Well-B are 7 km apart and separated by a NE-SW trending normal fault. Well-A is located in the existing producing oil field, while Well-B was abandoned because it lies in a different compartment and is structurally lower than the existing oil field. Moreover, because the Well-B data are limited, the well shows no direct indication of hydrocarbon potential. Since Well-A is a proven production well and Well-B has similar reservoir characteristics, Well-B deserves to be reviewed for its hydrocarbon potential by predicting its electrofacies, lithology, and hydrocarbon zones.
The KNN algorithm is used in this study to predict the reservoir and hydrocarbon potential of Well-B based on control data from Well-A. KNN, or k-Nearest Neighbors, is a machine learning classification method. A machine learning model learns from data, so it requires training and testing to verify that it has learned what the user intended. Training data is the information used to train an algorithm, and consists of input data and the corresponding expected output. Test data is the information used to assess the performance of the algorithm, and consists of input data only [5]. Classification is the process of building a model from a training set made up of database instances and their associated class labels. The resulting model is then used to predict the class labels of test instances for which the values of the predictor features are known. KNN is a supervised algorithm that classifies new objects based on their attributes and the training samples [1]. As shown in Figure 1B, a new object is projected into the feature space that contains the training data (points), and it is classified by observing the training points closest to it.
Because machine learning relies on training and testing, the accuracy of the resulting model needs to be measured, here using a confusion matrix. A confusion matrix (Figure 1C) visualizes the performance of a classifier [4]. It is built from the actual data, which is the expected output in the data set, and the predicted data produced by the model.
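
The nearest-neighbor vote and the confusion matrix described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the two-feature toy samples, the class labels "sand" and "shale", the Euclidean distance, the choice of k = 3, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the labels of the k closest points.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


def confusion_matrix(actual, predicted, labels):
    """Count samples per (actual, predicted) pair; rows are actual classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy_score(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical training samples: two log features already scaled to [0, 1].
train_X = [(0.10, 0.20), (0.15, 0.25), (0.20, 0.15),
           (0.80, 0.70), (0.85, 0.80), (0.90, 0.75)]
train_y = ["sand", "sand", "sand", "shale", "shale", "shale"]

# Predict back on the training samples, then score against the actual labels.
predicted = [knn_predict(train_X, train_y, x, k=3) for x in train_X]
matrix = confusion_matrix(train_y, predicted, labels=["sand", "shale"])
score = accuracy_score(matrix)

# Classify a new, unlabeled sample.
new_label = knn_predict(train_X, train_y, (0.25, 0.30), k=3)
```

In this sketch each training sample counts as its own nearest neighbor when predictions are made on the training set, so a score obtained this way tends to be optimistic compared with a score on held-out samples.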

2. Data and Methodology


Figure 2A shows that Well-A has a complete data set. The electrofacies in Well-A are interpreted qualitatively from the shape of the gamma ray log response to rock grain size [2], as illustrated in Figure 2B. The name "irregular" is used for a gamma ray response that shows aggradation of shale/silts [2]. Figure 2C shows that the Well-B data are less complete than those of Well-A: they consist of wireline logs only. In the workflow (Figure 3), Well-A is used as the training data set and Well-B as the test data set, on the condition that the wireline logs of Well-B have been normalized to the same range of values as Well-A. When applying the KNN algorithm to Well-A, not all data (features) affect the output variable, so the features to be used must be selected. Because this process requires a fitted model, the features used for the KNN algorithm are the gamma ray, true resistivity, density, and neutron-porosity logs. The interpreted electrofacies, the lithology based on shale volume, and the tested zones from Well-A are used as the expected output and to control the accuracy of the model built by the algorithm. The accuracy of the model is obtained from a confusion matrix that compares the expected output of Well-A (hereafter referred to as the actual data) with the data predicted by the KNN algorithm, also on Well-A. The KNN model validated with the confusion matrix is then used to predict class labels for electrofacies, lithology, and potential hydrocarbon zones in Well-B.
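
Two preparation steps mentioned above, normalizing the Well-B logs to the Well-A range and deriving lithology labels from shale volume, can be sketched as follows. The text does not state which formulas were used, so this sketch assumes a simple min-max rescaling, a linear gamma-ray index as the shale volume, and a sand/shale cutoff of 0.5. The function names and all sample values are hypothetical.

```python
def rescale_to_reference(values, ref_min, ref_max):
    """Linearly map `values` so that their min/max match a reference range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant log")
    span = ref_max - ref_min
    return [ref_min + (v - lo) / (hi - lo) * span for v in values]


def shale_volume_linear(gr, gr_clean, gr_shale):
    """Linear gamma-ray index, clipped to the interval [0, 1]."""
    igr = (gr - gr_clean) / (gr_shale - gr_clean)
    return max(0.0, min(1.0, igr))


def lithology_label(vsh, cutoff=0.5):
    """Assumed two-class labeling: shale at or above the cutoff, else sand."""
    return "shale" if vsh >= cutoff else "sand"


# Hypothetical gamma ray readings (API units) from the test well.
well_b_gr = [20.0, 60.0, 100.0]

# Hypothetical gamma ray range observed in the training well.
well_a_gr_min, well_a_gr_max = 30.0, 150.0

# Step 1: bring the test-well log into the training-well range.
well_b_gr_scaled = rescale_to_reference(well_b_gr, well_a_gr_min, well_a_gr_max)

# Step 2: shale volume and lithology label per sample, using the
# training-well extremes as the clean-sand and shale baselines.
vsh = [shale_volume_linear(g, well_a_gr_min, well_a_gr_max) for g in well_b_gr_scaled]
labels = [lithology_label(v) for v in vsh]
```

Rescaling each log independently keeps one feature with a wide numeric range, such as resistivity, from dominating the distance computation in KNN.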

3. Results and Discussion


Figure 4 shows the results for Well-A: the logs predicted by the KNN algorithm from the well log features, and the prediction accuracy between the predicted logs and the interpreted logs in the log definition column. For electrofacies, the log definition and the predicted log agree qualitatively, and quantitatively the accuracy score is 0.82. Lithology had already been interpreted in Well-A, and the predicted lithology agrees with it qualitatively and quantitatively, with an accuracy score of 1. For hydrocarbon zones, because Well-A has an oil-tested zone, this study also predicts potential hydrocarbon zones in Well-A using a resistivity cutoff of 4.5 ohmm. This cutoff is based on the range of resistivity values in the Well-A oil-tested zone, which is around 1.4-17.2 ohmm. The log definition and the predicted log of hydrocarbon zones agree qualitatively, and quantitatively the accuracy score is 0.99. With an accuracy score above 0.8 for each prediction, the KNN algorithm is considered suitable for building the classifier in this study. After this assessment, the KNN algorithm is used to predict the class labels in the test data set of Well-B.
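
The resistivity cutoff used to define potential hydrocarbon zones can be illustrated with a short Python sketch. Only the 4.5 ohmm value comes from the text. The "greater than or equal to" comparison, the grouping of flagged samples into depth intervals, the function names, and the sample depths and resistivity values are assumptions for illustration.

```python
RT_CUTOFF_OHMM = 4.5  # cutoff value quoted in the text


def flag_hc_samples(rt_values, cutoff=RT_CUTOFF_OHMM):
    """Flag each sample whose true resistivity reaches the cutoff."""
    return [rt >= cutoff for rt in rt_values]


def flagged_intervals(depths, flags):
    """Group consecutive flagged samples into (top, base) depth pairs."""
    intervals = []
    start = None   # depth of the first sample in the current flagged run
    prev = None    # depth of the previous sample
    for depth, flag in zip(depths, flags):
        if flag and start is None:
            start = depth
        elif not flag and start is not None:
            intervals.append((start, prev))
            start = None
        prev = depth
    # Close a run that extends to the last sample.
    if start is not None:
        intervals.append((start, prev))
    return intervals


# Hypothetical depth samples (m) and true resistivity readings (ohmm).
depths = [1000.0, 1000.5, 1001.0, 1001.5, 1002.0, 1002.5]
rt = [1.2, 5.0, 9.3, 2.0, 4.5, 17.2]

flags = flag_hc_samples(rt)
zones = flagged_intervals(depths, flags)
```

With these hypothetical values the sketch returns two intervals, which shows how per-sample flags can be summarized as candidate zones for comparison with tested intervals.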
The predicted electrofacies, lithology, and potential hydrocarbon zones of Well-B are shown in Figure 5. For electrofacies, the predictions are compared qualitatively with the Well-B gamma ray log pattern. Some intervals match and others do not, so the qualitative accuracy of the electrofacies prediction in Well-B is not very good. This is because both the electrofacies interpretation of the training data set and the algorithm used to predict electrofacies in Well-B rely on qualitative analysis of the Well-A well logs, without supporting information from core data. The electrofacies predictions in Well-B need to be validated against a core-based electrofacies interpretation, which would also help to increase the accuracy of the prediction. The lithology predicted by the KNN algorithm for Well-B is shown in the lithology prediction column. Compared qualitatively with the existing gamma ray log, it shows a good match. The predicted potential hydrocarbon zones of Well-B are shown in green in the HC zone prediction column and highlighted by the green arrows on the right side of the column. These predictions must be validated by a well test or drill stem test (DST). Well tests or DSTs were conducted at 2 intervals, shown in the prediction accuracy column, and both zones were proven to produce oil at a significant flow rate, accompanied by water at an insignificant flow rate. Based on this validation, the prediction of potential hydrocarbon zones in Well-B has been proven accurate.
From the prediction results in Well-A and especially Well-B, Top-X, Top-Y, and Top-Z, which represent the tops of the productive reservoir intervals, can be structurally correlated from Well-A to Well-B (Figure 6A). The structural correlation is based on the electrofacies and lithology predictions in the two wells and is supported by the structural interpretation of seismic data. The green columns are oil accumulation zones that have been proven productive, while the orange columns are potential oil accumulation zones. Although the initial Well-B log data were considered less than ideal for indicating hydrocarbons, as shown in Figure 2C, the results of this study and the well tests show that Well-B has 2 zones proven to produce oil (Top-Y and Top-Z) and one zone that remains a potential oil accumulation (Top-X), which is not defined by the well data used in this study. These results indicate that the structural compartment around Well-B (Figure 6B) has the potential to become a new hydrocarbon zone that increases the prospect of hydrocarbon accumulation in the "MajuRoyal" oil field.

4. Conclusions
The application of machine learning, specifically the k-Nearest Neighbors (KNN) algorithm, to predict
electrofacies, lithology, and HC zones was successful. It not only sharpened the interpretation of
the well logs in Well-A, the training data set, with accuracy scores above 0.8, but also predicted
electrofacies, lithology, and hydrocarbon zones in Well-B. The validated potential hydrocarbon zones
of Well-B can increase the prospect of oil accumulation in different compartments of the
"MajuRoyal" oil field.

Acknowledgments
The authors would like to acknowledge the government as host authority, the Ministry of Energy & Mineral
Resources and the Special Task Force for Upstream Oil & Gas Business Activities of the Republic of
Indonesia, and the Jabung JV partners, PETRONAS Carigali (Jabung) Ltd., PT. Pertamina Hulu Energi
Jabung, and PT. GPI Jabung Indonesia, for permission to re-utilize and publish these datasets. We
would also like to thank the management of PetroChina International Jabung Ltd. and the PIT
HAGI 45 for their support in publishing this paper.

References
[1] Bhavsar, Hetal and Amit Ganatra. (2012). A Comparative Study of Training Algorithms for
Supervised Machine Learning. International Journal of Soft Computing and Engineering
(IJSCE) 2(4): 74-81. https://doi.org/10.1.1.492.6088
[2] Emery, D. and K. J. Myers (eds). (1996). Sequence Stratigraphy. Blackwell Science Limited,
297 pp. www.sepmstrata.org
[3] Ginger, David and Kevin Fielding. (2005). The Petroleum Systems and Future Potential of The
South Sumatra Basin. Proceedings of Indonesian Petroleum Association 39th Annual
Convention & Exhibition. https://doi.org/10.29118/ipa.2226.05.g.039
[4] Klein, Bernd. (2018). Machine Learning. Python Course.
https://python-course.eu/total_listing_machine_learning.pdf
[5] Smith, Daniel. (2019). What is AI Training Data?. Lion Bridge.
https://lionbridge.ai/articles/what-is-ai-training-data/
Figure 1. (A) "MajuRoyal" Oil Field, Top-Y Time Structure Map. (B) KNN or k-Nearest Neighbors
classification method. (C) Confusion matrix to visualize the performance of a classifier.

Figure 2. (A) Well-A Dataset as Training Data. (B) General gamma ray response to grain size variations
and electrofacies log shapes relating to the sedimentary environment. (C) Well-B dataset as Test Data.
Figure 3. Workflow of the study

Figure 4. The results of processing in Well-A in the form of log predictions from the KNN algorithm
based on features in well log data, then prediction accuracy between log predictions and log
interpretations in the log definition column.
Figure 5. The results of processing in Well-B in the form of predicted electrofacies, lithology, and
potential hydrocarbon zones based on the KNN algorithm, and their prediction accuracy based on well
test or DST data (Green Square).

Figure 6. (A) The structural correlation between Well-A and Well-B based on the electrofacies and
lithology prediction results, showing several tested oil columns (green) and potential hydrocarbon
columns (orange). (B) The new hydrocarbon zone that increases the prospect of hydrocarbon
accumulation in the "MajuRoyal" Oil Field.
