Data Cleaning (Chen2019)

SPE-196487-MS
Reservoir Recovery Estimation Using Data Analytics and Neural Network Based Analogue Study
Yajing Chen and Zhouyuan Zhu, China University of Petroleum, Beijing; Yangxiao Lu, The University of Texas
at Dallas; Changhao Hu, Fei Gao, Wei Li, Nian Sun, and Tian Feng, E&D Research Institute of Liaohe Oilfield
Company of CNPC
Copyright 2019, Society of Petroleum Engineers
This paper was prepared for presentation at the SPE/IATMI Asia Pacific Oil & Gas Conference and Exhibition held in Bali, Indonesia, 29-31 October 2019.
This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents
of the paper have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect
any position of the Society of Petroleum Engineers, its officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written
consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may
not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.
Abstract
Reservoir analogue study, unlike flow physics based prediction methods such as reservoir
simulation, relies on the experience and knowledge of skilled reservoir engineers. In this work,
we present a new workflow that replaces human knowledge-based analogue studies with data analytics
and machine learning techniques.
First, we collect reservoir properties, development parameters, and historical recovery data for 1381
actual U.S. oilfields from Tertiary Oil Recovery Information System (TORIS) by U.S. Department of
Energy. We conduct extensive data cleaning for outliers and missing values. Then, we determine the most
important determining factors for recovery factors. We further use single-variable and bi-variable analysis to
understand the relationship between recovery factor and the determining factors. Finally, we train an Artificial
Neural Network (ANN) model to make recovery factor predictions.
We have found that the recovery factors mostly depend on 19 principal factors, a reduction from a total
of more than 70 properties originally in TORIS. We randomly select data from 90% of these oilfields as
training set for machine learning. The predictability and accuracy of such methodology is tested by making
recovery forecasts for the remaining 10% oilfields and by comparing the forecasts with the actual recovery
factors. Eventually, the average error in recovery factor predicted by the trained ANN model is about 10%.
Overall, this methodology has shown strong performance in computer assisted analogue study, requiring
minimal human knowledge and hands-on work to study these 1381 oilfields.
This work provides a new workflow that uses data analytics and machine learning techniques for reservoir
analogue studies. Reservoir engineering software built systematically on this methodology can serve
as a more efficient and accurate predictive tool for studying reservoir analogues.
Introduction
During the exploration and production of oil and gas assets, the major task for reservoir engineers is to make
predictions of future productions and reservoir performance. Two main categories of methods are usually
used for this purpose. The first one is fluid flow physics based reservoir modeling. The other is experience-based
prediction. The first type of method is based on forward solving of certain dynamic porous media flow
model, which includes most reservoir engineering analytical solutions such as Buckley-Leverett, material
balance calculations, unsteady state rate transient analysis and numerical reservoir simulation (Coats, 1982).
The second type of method is based on human knowledge, experience and fuzzy logic for future predictions,
which includes reservoir analogue study, and decline curve analysis such as Arps method. We focus on the
use of data analytics and artificial intelligence for reservoir analogue study in this work.
The origin of systematic, scientific reservoir analogue study can be traced back to the 1950s. For example,
the method was proposed and applied to bottom water coning in different oil fields (Meyer et al., 1956),
where the behavior of two-phase fluids separated into two different regions by gravity in the reservoir
was analyzed. In recent years, the method has also been used to search for analogous reservoirs with the aid
of modern statistics (Martin Rodriguez et al., 2013). Multivariate statistical techniques were used to find a
unique and reproducible list of reservoirs with properties most similar to those of the selected target. The
entire procedure was systematic and unbiased.
Data mining is the process of discovering underlying patterns in large datasets through the combined
efforts of modern machine learning, applied statistics, and computer database systems (Han et al., 2011).
Integrated with petroleum engineering knowledge, such method was used for deepwater Gulf of Mexico
reservoir recovery predictions (Srivastava et al., 2016). It involved the use of easy to implement data
mining techniques by integration of dimensionless numbers. Artificial Neural Networks (ANN) are machine
learning models that are inspired by, but not necessarily identical to, the biological neural networks that
constitute animal brains (Han et al., 2011). Such systems "learn" to perform tasks through considering
examples, generally without being programmed with any task-specific rules. In reservoir engineering, this
method was used in reservoir management to optimize the injection-production ratio in a Middle East
reservoir (Stundner et al., 2001). A huge Barnett shale dataset for water production was analyzed using
statistical methods to determine hidden structures in well and production data, and a neural network based model
was used to predict the potential for water production from wells drilled in the Barnett shale (Awoleke et al.,
2010). Furthermore, to overcome the disadvantages of deterministic, cumbersome and expensive
(manpower and time consuming) simulation based performance evaluation for SAGD operations, ANN has
also been employed as a data-driven modeling alternative to make recovery predictions in Canadian oil sand
reservoirs (Dzurman et al., 2013). In addition, ANN was used to assist the interpretation of well logs
in order to identify flow units and predict permeability (Aminian et al., 2003). With limited core data
and relevant geological interpretations, statistical techniques and ANN were successfully used to provide
consistent and reliable predictions of permeability and flow units. In the petrophysics discipline, ANN was
also used successfully to predict the rock facies in carbonate well logs (Tang et al., 2011).
In this study, we explore the potential of using data analytics and artificial intelligence for reservoir
analogue studies. This is a new concept combining the statistical data analysis, training and predictions
using ANN and traditional oilfield data analysis. First, we conduct data cleaning for the massive reservoir
properties and performance data from the corresponding database. We conduct several steps to choose a set
of factors, which we deem as the most relevant to reservoir recovery. Then we use single-variable analysis
and bi-variable analysis to perform the initial exploratory data investigation. From the statistical analysis of
numerous existing oil reservoir performances, we find the underlying patterns and behaviors. Finally, we
train an ANN model to make recovery factor predictions accordingly. This work provides a new workflow
for making predictions of reservoir recovery through reservoir analogue study using both data mining and
machine learning.
Reservoir Analogue Study Basics

The reservoir analogue study method uses the known characteristics of developed reservoirs to
predict the dynamic performance of a targeted new reservoir that has certain similarities in geological
features and petrophysical properties to the existing ones. In the early stage of exploration and production,
analogue study is almost the only way for reservoir engineers to make predictions, because it is
difficult to obtain enough information to build reliable flow physics based predictive models, such
as numerical reservoir simulation, with sufficient certainty in the input data. In the analogue study method,
reservoirs within a similar geological basin, with similar geological characteristics, or with similar
petrophysical properties are used to predict the recovery performance of the targeted reservoir. The method can
estimate recovery factors, initial production rates, production decline rates, and reservoir recovery drive
mechanisms. It can also make recommendations for the preferred well pattern arrangements. If two
reservoirs have been developed using a similar development strategy, comparing
them may yield reliable results. However, if different development strategies are deployed for the two
reservoirs, this approach may have difficulty making accurate predictions with enough confidence. The
pros and cons of the reservoir analogue study and its comparison with other types of reservoir prediction
methods are shown in Table 1.
Table 1—Comparison between different predictive methods for reservoir engineering.
Category: Fluid flow physics based reservoir modeling

Method: Numerical reservoir simulation
Principles: Solving the material balance and Darcy's law for discrete reservoir simulation grids.
Advantages: Solving the multi-phase multi-component porous media flow problem discretely on complex reservoir geological models accurately.
Disadvantages: Large number of input parameters is needed. Large uncertainty exists for properties far from the well location. The history match and prediction process needs large amount of human work.

Method: Material balance calculations
Principles: Volume conservations.
Advantages: Solving the reservoir recovery problem in a quick and easy way.
Disadvantages: The model is simple. It is only suitable for reservoirs with relatively uniform average pressure, which is measurable periodically. Multi-tank material balance may be used to solve recovery problems in geologically complex reservoir.

Method: Reservoir engineering analytical solutions
Principles: Solving the material balance and Darcy's law analytically.
Advantages: Solving reservoir recovery problems in a quick and easy way, with strong physical intuitions.
Disadvantages: It is only suitable for reservoir problems with simple geology, geometry and well configurations. Too difficult for reservoirs with complex geological models.

Category: Experience-based predictions

Method: Classical decline curve analysis
Principles: Empirical observations, experience and some simple flow physics.
Advantages: Making the reservoir predictions in a quick and easy way.
Disadvantages: The method is only applicable to the stable decline period in production with minimal changes to the operating conditions. And human experience is important in its applications. It requires large amount of production data in order to summarize the decline behavior and predict the future.

Method: Reservoir analogue study
Principles: Predictions made based on human knowledge, experience and fuzzy logic. It is based on experience and knowledge from various disciplines such as geology, geophysics, reservoir and production engineering.
Advantages: The analysis is simple, which can make predictions in the initial stage of exploration and appraisal to guide decision-making, based on a set of basic reservoir properties. And it can also be used for screening of reservoir development methods.
Disadvantages: The method is based on the analysis of large amount of existing dataset, which takes large amount of human work for data collection and examination. The method is subjective and inaccurate sometimes due to lack of experience. It usually mandates a long training period for engineers.
TORIS Database Description

The database used in this work is the Tertiary Oil Recovery Information System (TORIS), an extensive
public dataset currently maintained by the Department of Energy's (DOE) National Energy Technology
Laboratory (NETL). The dataset is public and can be obtained from the NETL website of US DOE. It
contains an extensive field and reservoir level database to evaluate the technical and economic recovery
potential of different oil reservoirs in the U.S.
TORIS was originally developed by the National Petroleum Council (NPC) for its 1984 assessment of the
enhanced oil recovery (EOR) potential of the U.S., which had been requested by the U.S. Secretary of Energy.
In this effort, the committee built a database of individual oil reservoirs. After augmentation, adaptation,
and validation, this oil reservoir database was transferred to the DOE's Bartlesville Project Office for
maintenance, updating, and subsequent application, where it became a component of a larger system known
as TORIS. The database has been continuously updated and expanded in order to maintain the system's
usefulness as a research tool.
As of 1995, the TORIS database contained 1,381 oil reservoirs, accounting for over 64% of the
original oil-in-place (OOIP) estimated to exist in the then discovered crude oil reservoirs in the U.S.
TORIS utilizes its comprehensive dataset and detailed engineering and economic analysis at the reservoir
level to estimate crude oil recovery, investment and operating costs, and ultimate project
economics for these oil reservoirs. A spreadsheet view of the TORIS dataset, with the
oil reservoirs as rows and the reservoir properties as columns, is shown in Fig. 1. The database
originally contains 70 properties for 1,381 oil reservoirs in the U.S.
Figure 1—Spreadsheet visualization of the TORIS dataset: different oil

reservoir as the rows, multiple reservoir properties as the columns.
Reading through this huge dataset, generating recovery estimates, and making corresponding suggestions for
a new green field development may take one or two experienced reservoir engineers several months if all
the work for the reservoir analogue study is performed manually. This dataset is therefore an ideal candidate
for testing the use of data analytics and machine learning to assist reservoir analogue studies.
Data Mining Process of TORIS Database

Data Cleaning
The main purpose of data cleaning is to remove the redundant, useless or missing values in the database.
Although more data helps to train the model more effectively in later machine learning process, the
incorporation of wrong or physically unreasonable data will reduce the reliability and accuracy of the
predictive capability of the trained model. Including redundant data will also increase the complexity of the
model, thus causing expensive computational workload in machine learning. We take the following specific
steps to treat the unusual data:
1. Eliminate data entries marked as unknown or −1;
2. Eliminate data entries with erroneous or unreasonable values, such as permeability of 0;
3. Eliminate verbal descriptive entries.
After data cleaning, the effective TORIS database subset used in this work for subsequent studies contains
506 reservoirs.
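The three cleaning rules listed above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record layout (one dict per reservoir), the field names, the spellings of the "unknown" flag, and the sample rows are all hypothetical, and strict positivity is the only physical plausibility check shown.

```python
from typing import Optional

UNKNOWN_MARKERS = {"", "unknown"}  # assumed spellings of the "unknown" flag


def to_number(value) -> Optional[float]:
    """Return value as a float, or None if it is unknown, -1, or verbal text."""
    if isinstance(value, str):
        if value.strip().lower() in UNKNOWN_MARKERS:
            return None              # rule 1: entry marked as unknown
        try:
            value = float(value)
        except ValueError:
            return None              # rule 3: verbal descriptive entry
    if value is None or value == -1:
        return None                  # rule 1: -1 used as the unknown sentinel
    return float(value)


def clean(records, numeric_fields, positive_fields):
    """Keep only records whose numeric fields are all known and plausible."""
    kept = []
    for rec in records:
        values = {f: to_number(rec.get(f)) for f in numeric_fields}
        if any(v is None for v in values.values()):
            continue
        if any(values[f] <= 0 for f in positive_fields):
            continue                 # rule 2: e.g. permeability of 0
        kept.append({**rec, **values})
    return kept


# Hypothetical sample rows, for illustration only.
sample = [
    {"name": "A", "permeability_md": 120.0, "porosity": 0.21},
    {"name": "B", "permeability_md": 0, "porosity": 0.18},
    {"name": "C", "permeability_md": "unknown", "porosity": 0.25},
    {"name": "D", "permeability_md": 45.0, "porosity": -1},
    {"name": "E", "permeability_md": 80.0, "porosity": "fair to good"},
]
cleaned = clean(sample, ["permeability_md", "porosity"], ["permeability_md"])
```

In this sketch a reservoir is dropped as a whole when any required field fails a rule, which is one plausible reading of how 1,381 reservoirs shrink to 506; the paper does not state whether rows or individual entries were removed.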
Determining Key Factors for Recovery Predictions

Typically, the presence of a large number of controlling factors in data analytics can significantly increase the
complexity of the model and impair computational efficiency. For example, expensive computational costs
are often associated with running large neural network models. We try to reduce the number of controlling
factors, discard the misleading and irrelevant ones, retain as much information as the initial dataset, and
keep the accuracy of the model predictions.
In this study, we focus on predicting the recovery factors for potential green field primary and secondary
oil reservoir developments. We assume the reservoirs in the TORIS database have been developed using a large
number of regular well patterns. Based on fundamental knowledge of reservoir engineering and water
flooding, we rule out the redundant and irrelevant reservoir properties listed in Table 2 to assist further machine learning
and the training of an accurate and efficient ANN model. Such properties include specific
names (formations, oilfields, locations, geologic province and play, etc.) and reference numbers, extensive
variables (reservoir size, acreage, oil in place, etc.), current conditions (producing gas oil ratio, cumulative
oil, etc.), and other information less related to recovery factors (number of current injection and production
wells, water salinity, etc.). Table 3 shows the key factors that are selected in this reservoir analogue study. As
seen, the selected properties follow physical intuition and are consistent with the basic theories of reservoir
engineering. We have reduced the dimension of controlling factors for recovery predictions
from 70 down to 19.
Table 2—Part of the redundant and irrelevant reservoir properties that are ruled
out in this reservoir analogue study based on reservoir engineering judgments.
Specific names | Reference numbers | Extensive variables | Current conditions | Other unrelated variables
Field name | DOE reference number | Field acres | Current oil, water and gas saturations | Number of production wells
Reservoir name | Preparer's reference number | Proven acres | Current oil Formation Volume Factor (FVF) | Number of injection wells
State | | Reservoir acres | Current formation pressure | Injection water salinity
Geologic province and play | | Original oil in place | Cumulative oil production | Formation temperature
Formation name | | | Current producing GOR and injection rate |
Table 3—Key factors that are selected in this reservoir analogue study based on reservoir engineering judgments.
Lithology type | Well spacing | Net pay | Gross pay | Porosity
Initial oil saturation (Soi) | Initial water saturation (Swi) | Oil Formation Volume Factor (FVF) | Total Vertical Depth (TVD) | Formation temperature
Permeability | API gravity | Viscosity | Initial gas oil ratio (GOR) | Initial pressure
Swept zone residual oil saturation (Sor) | Dykstra-Parsons permeability variation (VDP) | Depositional Environment | Geologic trap type
Single-variable analysis
In order to have an overall understanding of the large number of reservoirs contained in this database, we
conduct single-variable analysis on all the key factors. Single-variable histograms of nine selected important
key factors are shown in Fig. 2. The histograms reveal certain trends about
these reservoirs. Most reservoirs are developed using well spacing from 80 acres down to 5 acres. The oil
API gravities cover a wide range, mostly from 10 to 55 degrees. From the API gravity distribution, we can
conclude that most of the reservoirs in the database contain light oil. The histogram of true vertical depth shows that
most reservoirs are around 5,000 feet deep, with very few being over 10,000 feet deep. The reservoir
permeability values mostly range from 2 mD to 2000 mD. The oil viscosities mostly range from 0.2 cp
to 100 cp. This gives mobility ratios that are favorable for water flooding. The initial oil saturation
of the reservoirs is mostly around 60% to 80%. The inter-layer heterogeneity indicator, the Dykstra-Parsons
permeability variation VDP, is mostly from 0.5 to 0.95.
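Properties such as permeability and viscosity span several orders of magnitude, so a histogram with logarithmically spaced bins is one convenient way to produce the kind of single-variable summary described above. The sketch below is an illustration only: the choice of log-spaced bins, the function name and the sample values are assumptions, and the paper does not state how the bins in Fig. 2 were chosen.

```python
import math
from collections import Counter


def log10_histogram(values, bins_per_decade=1):
    """Count positive values in logarithmically spaced bins.

    Returns {(low, high): count}, where each bin covers [low, high).
    """
    counts = Counter()
    for v in values:
        if v <= 0:
            continue  # logarithm undefined; such entries should already be cleaned out
        idx = math.floor(math.log10(v) * bins_per_decade)
        low = 10 ** (idx / bins_per_decade)
        high = 10 ** ((idx + 1) / bins_per_decade)
        counts[(low, high)] += 1
    return dict(sorted(counts.items()))


# Hypothetical permeability values in mD, for illustration only.
perm_md = [2.5, 8.0, 15.0, 40.0, 90.0, 150.0, 300.0, 750.0, 1800.0]
hist = log10_histogram(perm_md)
for (low, high), count in hist.items():
    print(f"{low:g} to {high:g} mD: {count}")
```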
Figure 2—Histograms of key factors (well spacing, permeability, Soi, Sor, viscosity, API, GOR, initial Pressure).
Bi-variable analysis
In order to find out whether these key factors are correlated with each other, we conduct bi-variable analysis.
The results are shown in Fig. 3. As we can see, initial water saturation Swi is negatively correlated with
initial oil saturation Soi (correlation coefficient of −0.94); TVD is correlated with reservoir temperature
(correlation coefficient of 0.77) and initial pressure (correlation coefficient of 0.94); TVD is also weakly
correlated with API gravity (correlation coefficient of 0.36); which are consistent with common sense.
Most water flooding operations are conducted in reservoirs with minimal gas saturations. Geothermal
gradient determines the temperature. Depth determines the initial pressure to a large extent in most cases.
Heavier oil with lower API gravity exists in shallower formations due to hydrocarbon migration, loss of
light components, and biodegradation at shallow depth. For other variable combinations, the correlations
between any two properties are relatively small. Therefore, it is also possible to conduct another set of
machine learning predictions with a reduced number of independent variables, with Swi, temperature and
initial pressure ruled out from the training process. Sensitivity studies on this option are conducted in the
following sections. Overall, bi-variable analysis shows the inter-dependency between different variables.
Backed by an understanding of the reservoir engineering physical process, it may help to further reduce
the number of unknowns for the following studies.
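The screening described above amounts to computing the Pearson correlation coefficient for every pair of key factors and flagging the strongly correlated pairs. A minimal sketch follows; the threshold of 0.75, the column names and the five sample reservoirs are assumptions made for illustration and are not values taken from TORIS.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(columns, threshold=0.75):
    """List (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 2)))
    return pairs


# Hypothetical values for five reservoirs, for illustration only.
columns = {
    "tvd_ft": [2000, 4000, 5000, 7000, 9000],
    "initial_pressure_psi": [900, 1750, 2200, 3100, 3900],
    "soi": [0.80, 0.62, 0.75, 0.68, 0.71],
    "swi": [0.20, 0.38, 0.25, 0.32, 0.29],
}
flagged = correlated_pairs(columns)
```

For each flagged pair, one member can be kept and the other dropped, which mirrors the choice of keeping TVD and Soi while ruling out Swi, temperature and initial pressure.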
Figure 3—Bi-variable correlation plots of key factors for reservoir recovery (in sequence): well spacing, net pay,
gross pay, Soi, Swi, oil FVF, TVD, temperature, permeability, API gravity, viscosity, initial GOR, initial Pressure,
Sor, VDP. The upper triangle shows the absolute values of correlation coefficients between any two variables.
Figure 4—Bi-variable correlation plots of more related key factors in this study: Swi-Soi (upper
left), Temperature-TVD (upper right), Pressure-TVD (lower left), and API gravity-TVD (lower right).
Artificial Neural Network Model

An artificial neural network is a set of connected neurons organized in
layers: an input layer, hidden layers and an output layer. In this work, we build an ANN model to predict
recovery factors.
ANN Model Training

We implement the Artificial Neural Network model in Python 3 to perform machine learning and forecasting.
This neural network consists of 3 hidden fully connected layers. For this regression problem, the number of
hidden layers is limited to 3 in order to avoid overfitting with too many degrees of freedom, which may
lead to large error. The schematic configuration of our ANN model is shown in Fig. 5. In order to facilitate
machine learning computations and accelerate model training, we standardize
the dataset by linearly mapping all the data entries to values between 0 and 1 (min-max scaling). When predictions are made after
the training process, the same mapping is applied.
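The linear mapping to the range 0 to 1 can be sketched as below. The function names and sample numbers are hypothetical. The point of the sketch is that the per-column minimum and maximum are recorded once from the training rows and then reused unchanged at prediction time, as the text requires.

```python
def fit_min_max(rows):
    """Record the per-column (min, max) of the training rows."""
    columns = list(zip(*rows))
    return [(min(c), max(c)) for c in columns]


def apply_min_max(rows, bounds):
    """Linearly map each column to [0, 1] using previously fitted bounds."""
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for v, (lo, hi) in zip(row, bounds)
        ])
    return scaled


# Hypothetical training rows: (depth in ft, permeability in mD).
train_rows = [[2000, 10], [5000, 60], [8000, 110]]
bounds = fit_min_max(train_rows)
train_scaled = apply_min_max(train_rows, bounds)

# A new reservoir is scaled with the bounds fitted on the training data.
new_scaled = apply_min_max([[6500, 35]], bounds)
```

A new reservoir whose property lies outside the training range will map outside the interval from 0 to 1; whether to clip such values is a design choice the paper does not discuss.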
Figure 5—Schematic Configuration of the ANN model utilized in this study.
We split our dataset into a training dataset and a validation dataset. The training dataset is randomly selected
from the TORIS database and contains 90% of all of the reservoirs. The validation dataset contains the
remaining 10% of the reservoirs. Our ANN model is trained using the training dataset.
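A random 90/10 split of this kind can be sketched with the standard library as follows. The fixed seed and the use of integer reservoir identifiers are assumptions for reproducibility of the example. The count of 506 is the number of reservoirs reported after data cleaning.

```python
import random


def split_train_validation(items, train_fraction=0.9, seed=0):
    """Shuffle a copy of items and split it into (training, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


reservoir_ids = list(range(506))  # 506 reservoirs remain after cleaning
train_ids, val_ids = split_train_validation(reservoir_ids)
print(len(train_ids), len(val_ids))
```

With 506 reservoirs this gives 455 training and 51 validation reservoirs.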
We have implemented 3 sets of trainings in our work:
1. Using all 19 key controlling factors shown in Table 3 (method A);
2. Using a reduced set of 16 key controlling factors, with initial water saturation Swi, temperature and
initial pressure ruled out based on the result from bi-variable analysis (method B);
3. Using another reduced set of 16 controlling factors, with lithology type, depositional environment, and
geological trap type ruled out because they are non-numeric properties (method C).
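Methods A and B keep the non-numeric properties, which must be turned into numbers before they can enter a neural network. The paper does not state which encoding was used; one-hot encoding is shown here purely as an assumption, as a common way to handle labels such as lithology type. The function name and sample labels are hypothetical.

```python
def one_hot(values):
    """Encode a list of category labels as 0/1 indicator vectors."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # exactly one indicator is set per label
        encoded.append(row)
    return categories, encoded


# Hypothetical lithology labels, for illustration only.
lithology = ["sandstone", "carbonate", "sandstone", "dolomite"]
categories, encoded = one_hot(lithology)
```

Under such an encoding each categorical property contributes one input unit per distinct label, so the input layer becomes wider than the number of controlling factors.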
Then, we need to optimize the hyper-parameters of this network using the validation data. Hyper-parameters
are preset parameters, including the learning rate, number of iterations, number of units per layer,
etc. The detailed settings of the different layers, number of units in each layer and corresponding parameters
are shown in Table 4. As seen, the ANN model has been customized in the number of units for each layer.
During training, in order to measure the difference between the model output and the actual values, we
use the mean squared error (MSE) loss function. Furthermore, we employ
the ADAM optimizer with a learning rate of 0.001 to update the parameters of the network based on the
loss. In addition, we have to set an appropriate number of epochs to avoid underfitting and overfitting. Fig. 6,
Fig. 7 and Fig. 8 show the relationship between validation mean absolute error (MAE) and number of epochs for method A,
method B and method C respectively. After tuning the hyper-parameters, we obtain a well-trained ANN
model for our study.
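The two error measures used here, and the idea of choosing the number of epochs from the validation curve, can be written out as a short sketch. The function names, the sample recovery factors and the validation curve are hypothetical. Picking the epoch with the lowest validation MAE is one simple selection rule and is an assumption, since the paper does not state its exact criterion.

```python
def mean_squared_error(actual, predicted):
    """MSE: the loss minimized during training."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """MAE: the metric tracked on the validation set."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def best_epoch(val_mae_per_epoch):
    """Return the 1-based epoch with the lowest validation MAE."""
    best_index = min(range(len(val_mae_per_epoch)),
                     key=val_mae_per_epoch.__getitem__)
    return best_index + 1


# Hypothetical recovery factors in percent.
actual = [30.0, 40.0, 20.0]
predicted = [35.0, 30.0, 20.0]
mse = mean_squared_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)

# Hypothetical validation MAE after each epoch: falls, then rises again.
val_curve = [14.0, 12.0, 11.0, 10.5, 10.8, 11.6]
chosen = best_epoch(val_curve)
```

Stopping before the validation error bottoms out corresponds to underfitting, and continuing well past it corresponds to overfitting.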
Table 4—Setting of basic parameters for ANN training.
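Parameter | Method A | Method B | Method C
Number of units in input layer | 36 | 34 | 16
Number of units in hidden layer 1 | 64 | 64 | 64
Number of units in hidden layer 2 | 64 | 64 | 64
Number of units in hidden layer 3 | 16 | 16 | 16
Number of units in output layer | 1 | 1 | 1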
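To give a feel for the model size implied by these settings, the number of trainable parameters of a fully connected network can be computed from the layer widths. The sketch below takes hidden layers of 64, 64 and 16 units with the input widths 36, 34 and 16 for methods A, B and C, and assumes one bias term per unit, which the paper does not state.

```python
def dense_parameter_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Layer widths: input, three hidden layers, output.
architectures = {
    "A": [36, 64, 64, 16, 1],
    "B": [34, 64, 64, 16, 1],
    "C": [16, 64, 64, 16, 1],
}
counts = {name: dense_parameter_count(sizes)
          for name, sizes in architectures.items()}
print(counts)  # {'A': 7585, 'B': 7457, 'C': 6305}
```

Each variant therefore has several thousand trainable parameters, while 90% of the 506 cleaned reservoirs gives only about 455 training samples, which is consistent with the concern about overfitting and the need to choose the number of epochs carefully.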

Figure 6—The correlation of validation MAE and number of epochs (method A).
Figure 7—The correlation of validation MAE and number of epochs (method B).
Figure 8—The correlation of validation MAE and number of epochs (method C).
Model Prediction Results

The cross plots of actual recovery factors from the TORIS database and predicted values based on our methods
A, B and C are shown in Fig. 9, Fig. 10 and Fig. 11. The mean absolute errors (MAE) of these
methods are 10.77%, 9.64% and 11.41%, respectively. From the results, we can observe that the prediction
accuracy is hardly affected by removing the properties that are highly correlated with TVD and initial
oil saturation Soi. The computational speed, however, is enhanced by reducing the number of properties.
The elimination of non-numeric lithology, trap type and depositional environment has a small impact on the
prediction accuracy. Overall, the ANN model achieves satisfactory predictive accuracy for recovery estimation of
green fields.
Figure 9—Cross plot of predicted versus actual recovery factors using method A.
Figure 10—Cross plot of predicted versus actual recovery factors using method B.
Figure 11—Cross plot of predicted versus actual recovery factors using method C.
Conclusions
In conclusion, we use data mining techniques to analyze the TORIS database and make reservoir recovery
predictions using the trained ANN based model. We demonstrate the integrated process of combining
modern big data technology with traditional oilfield analogue studies to achieve recovery predictions. We
present the following specific conclusions:
1. We perform data cleaning to select the useful dataset from the large number of reservoirs in the TORIS
database. The dominant influencing factors for recovery are also successfully screened out from the
various types of reservoir and geological properties in TORIS. Starting from a database of 70
properties for 1381 oil reservoirs, we reduced the 70 attribute dimensions to 19 attributes,
or even down to 16 properties, through data cleaning and relevant data analysis.
2. We randomly select 90% of the data from the cleaned database to train the ANN model for machine
learning. Then we use the remaining 10% of the data as test cases to verify the accuracy of predictions
made from the ANN model. When finally comparing the predicted recovery factor with the actual
one, the error of the predicted result by the trained ANN model is about 10%.
Acknowledgement
The authors would like to gratefully acknowledge the financial support from National Natural Science
Foundation of China (Grant No. 51804315). We also want to thank Dr. Xingru Wu from University of
Oklahoma for insightful discussions on this work.
References
Aminian, K., Ameri, S., Oyerokun, A., Thomas, B. 2003. Prediction of Flow Units and Permeability Using Artificial Neural Networks. Presented at the SPE Western Regional/AAPG Pacific Section Joint Meeting, Long Beach, California, 19-24 May. SPE-83586-MS. https://doi.org/10.2118/83586-MS
Awoleke, O., Lane, R. 2010. Analysis of Data from the Barnett Shale Using Conventional Statistical and Virtual Intelligence Techniques. SPE Reservoir Evaluation & Engineering, 14(05): 48-49. SPE-127919-PA. https://doi.org/10.2118/127919-PA
Coats, K. 1982. Reservoir Simulation: State of the Art. Journal of Petroleum Technology, 34(08): 1633-1634. SPE-10020-PA. https://doi.org/10.2118/10020-PA
Dzurman, P. J., Leung, J. W., Zanon, S. J., Amirian, E. 2013. Data-Driven Modeling Approach for Recovery Performance Prediction in SAGD Operations. Presented at the SPE Heavy Oil Conference-Canada, Calgary, Alberta, Canada, 11-13 June. SPE-165557-MS. https://doi.org/10.2118/165557-MS
Han, J., Kamber, M., Pei, J. 2011. What Is Data Mining. Data Mining: Concepts and Techniques, 3rd Edition, Morgan Kaufmann Publishers, Waltham, USA.
Martin, H., Escobar, E., Embid, S., Rodriguez, N., Hegazy, M., Lake, L. W. 2013. New Approach to Identify Analogue Reservoirs. Presented at the SPE Annual Technical Conference and Exhibition, New Orleans, Louisiana, USA, 30 September-2 October. SPE-166449-MS. https://doi.org/10.2118/166449-MS
Meyer, H. I., Searcy, D. F. 1956. Analog Study of Water Coning. Journal of Petroleum Technology, 8(04): 61-64. SPE-554-G. https://doi.org/10.2118/554-G
Schuetter, J., Mishra, S., Zhong, M., LaFollette, R. 2018. A Data-Analytics Tutorial: Building Predictive Models for Oil Production in an Unconventional Shale Reservoir. SPE Journal, 23(04): 1075-1089. SPE-189969-PA. https://doi.org/10.2118/189969-PA
Srivastava, P., Wu, X., Amirlatifi, A., Devegowda, D. 2016. Recovery factor prediction for deepwater Gulf of Mexico oilfields by integration of dimensionless numbers with data mining techniques. Presented at the SPE Intelligent Energy International Conference and Exhibition, Aberdeen, Scotland, UK, 6-8 September. SPE-181024-MS. https://doi.org/10.2118/181024-MS
Stundner, M., Al-Thuwaini, J. S. 2001. How Data-Driven Modeling Methods like Neural Networks can help to integrate different Types of Data into Reservoir Management. Presented at the SPE Middle East Oil Show, Manama, Bahrain, 17-20 March. SPE-68163-MS. https://doi.org/10.2118/68163-MS
Tang, H., Meddaugh, W. S., Toomey, N. 2011. Using an Artificial-Neural-Network Method To Predict Carbonate Well Log Facies Successfully. SPE Reservoir Evaluation & Engineering, 14(1): 35-44. SPE-123988-PA. https://doi.org/10.2118/123988-PA
TORIS Data Preparation Guidelines for Management and Operating Contract for the Department of Energy's National Oil and Related Programs. Bartlesville, Oklahoma: BDM-Oklahoma, Inc. https://www.netl.doe.gov/research/oil-and-gas/software/databases#NPC
