PII: S2405-6561(17)30114-1
DOI: 10.1016/j.petlm.2017.11.003
Reference: PETLM 175
Please cite this article as: P. Panja, R. Velasco, M. Pathak, M. Deo, Application of artificial intelligence to
forecast hydrocarbon production from shales, Petroleum (2017), doi: 10.1016/j.petlm.2017.11.003.
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to
our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and all
legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
2
Department of Chemical Engineering, University of Utah
50 Central Campus Dr., Salt Lake City, UT 84112
*ppanja@egi.utah.edu
*Corresponding author
Abstract
Artificial intelligence (AI) methods and applications have recently gained a great deal of
attention in many areas, including mathematics, neuroscience, economics, engineering,
linguistics, gaming, and many others. This is due to the surge of innovative and sophisticated AI
techniques applied to highly complex problems as well as powerful new developments
in high-speed computing. Various applications of AI in everyday life include machine learning,
pattern recognition, robotics, data processing and analysis, etc. The oil and gas industry is not
behind either; in fact, AI techniques have recently been applied to estimate PVT properties,
optimize production, predict recoverable hydrocarbons, optimize well placement using pattern
recognition, optimize hydraulic fracture design, and aid in reservoir characterization efforts. In
this study, three different models are trained and used to forecast hydrocarbon production
from hydraulically fractured wells. Two widely used artificial intelligence methods, namely the
Least Square Support Vector Machine (LSSVM) and Artificial Neural Networks (ANN), are
compared to a traditional curve fitting method known as the Response Surface Model (RSM), which
uses second order polynomial equations, to determine production from shales. The objective of this
work is to further explore the potential of AI in the oil and gas industry. Eight parameters are
considered as input factors to build the model: reservoir permeability, initial dissolved gas-oil
ratio, rock compressibility, gas relative permeability, slope of gas-oil ratio, initial reservoir
pressure, flowing bottom hole pressure, and hydraulic fracture spacing. The range of values used
for these parameters resembles real field scenarios from prolific shale plays such as the Eagle
Ford, Bakken, and Niobrara in the United States. Production data consists of oil recovery
factor and produced gas-oil ratio (GOR) generated from a generic hydraulically fractured
reservoir model using a commercial simulator. The Box-Behnken experimental design was used to
minimize the number of simulations for this study. Five time-based models (for production
periods of 90 days, 1 year, 5 years, 10 years, and 15 years) and one rate-based model (when oil
rate drops to 5 bbl/day/fracture) were considered. A Particle Swarm Optimization (PSO) routine is
used in all three surrogate models to obtain the associated model parameters. Models were trained
using 80% of all data generated through simulation, while 20% was used for testing the models.
All models were evaluated by measuring the goodness of fit through the coefficient of
determination (R²) and the Normalized Root Mean Square Error (NRMSE). Results show that
RSM and LSSVM have very accurate oil recovery forecasting capabilities, while LSSVM shows
the best performance for complex GOR behavior. Furthermore, all surrogate models are shown
to serve as reliable proxy reservoir models useful for fast fluid recovery forecasts and sensitivity
analyses.
Unconventional reservoirs
1. Introduction
Surrogate models are particularly useful for quick predictions given a range of input parameters.
These models are used to forecast oil production and perform sensitivity and uncertainty
analyses. Polynomial equations and other non-linear equations known as response surface
models (RSM) have been popularized for their simple mathematical structure and for easier
engineers and scientists due to their unconventional ways of connecting input data to output.
RSM coupled with a proper design of experiments [1] was proven to be an efficient and fast
proxy model for forecasting production performance and analyzing uncertainties [2]. Oil rate and
water cut results were also predicted using RSM [3]. Response surface models are widely
uncertainty [4], production uncertainty [5-10], finding an optimal scheme for well placement [7,
11-14], history matching [13, 15, 16], and determining the dew point of water in natural gas
processing units [17]. Field cases have been studied using pattern recognition techniques [18] to
determine pressure and production variation according to well locations.
Even though researchers have developed numerical, analytical, and semi-analytical techniques to
understand the physics underlying production from hydraulically fractured tight formations
[19-22], many of these systems grow in complexity, rendering most of these methods
inapplicable. The AI approach, on the other hand, is very useful when dealing with highly
complex systems. At the cost of understanding the physical mechanisms taking place in tight
formations, AI helps us analyze and forecast hydrocarbon production and assess performance.
In this study, two of the most common AI techniques, namely ANN and LSSVM, as well as a
second order polynomial RSM are used to predict oil production and gas-oil ratio from
hydraulically fractured low permeability reservoirs. The comparison of these three methods in
terms of performance and accuracy is also discussed. The application of ANN started before
that of LSSVM: in the early 1990s, data from well tests were already being interpreted using ANN
[23, 24]. Rock characteristics such as lithology were determined from well logs using fuzzy neural
networks [25]. Reservoir heterogeneity with respect to porosity, permeability, and oil saturation
was characterized from geophysical well logs such as gamma ray, bulk density, deep induction,
etc. using ANN [26]. Thermodynamic properties of reservoir fluids such as bubble point
pressures and formation volume factors at the bubble point have been predicted from four inputs
(solution GOR, reservoir temperature, oil gravity, and gas density) using ANN, SVM, and
non-linear regression [27]. Similarly, crude oil viscosity and solution GOR as functions of pressure
have been determined using ANN from 12 variables including oil composition, bubble point
pressure, bubble point viscosity, etc. [28]. Calculations of gas condensate dew point pressures
were also made using gas composition, temperature, and heavy fraction properties [29-31] and
condensate to gas ratio [32]. ANN predictions of asphaltene precipitation [33] showed
promising agreement with experimental studies [34]. Oil rates have also been measured in
the pipeline using ANN for varying pressures and temperatures [35]. Various applications of
LSSVM include porosity and permeability determination [36-39], water coning in horizontal
wells [40, 41], well placement [40], gas-oil relative permeability curves [42], phase equilibrium
calculations of hydrates [43], oil flow rate predictions [44], and temperature-pressure
relationships in natural gas production and processing [45]. Wide applications of artificial
intelligence in improved oil recovery were recently described by researchers [46-49]. Other
applications include the description of CO2 solubility [50] and calcium carbonate [51] in brine
sequestration processes.
Eight important parameters are considered as input data: geological parameters
(initial reservoir pressure, rock compressibility, and permeability), an operational parameter
(bottom hole pressure), a completion parameter (fracture spacing), a rock-fluid property (Corey gas
relative permeability exponent), and fluid properties (initial solution gas-oil ratio and the linear
slope of solution gas-oil ratio versus pressure). These were selected from a previous mechanistic
study [52] that found them to be highly significant. Six models (five
time-based models for production after 90 days, 1 year, 5 years, 10 years, and 15 years and one
rate-based model when oil rate drops to 5 bbl/day/fracture) of oil recovery and produced GOR
are developed for each surrogate model (RSM, ANN, and LSSVM). Data is generated from a
generic reservoir model with one vertical hydraulic fracture placed in the middle of the reservoir
using a commercial reservoir simulator. The mathematical formulations and workflow to create
these surrogate models are discussed in this article. The results obtained for all models are
compared using error analyses in terms of coefficient of determination (R²) and normalized root
mean square error (NRMSE).
2. Reservoir Model
Unconventional reservoirs such as shales and other tight formations are very complex in terms of
Typically, wells are drilled vertically and then directed horizontally for 1 to 2 miles, where as
many as 100 vertical hydraulic fractures are induced to generate highly conductive flow paths to
the wellbore. Simulating an entire reservoir model that consists of 100 hydraulic fractures is very
simulated where production from a single vertical fracture is considered. The reservoir properties
The number of unique input parameter combinations could lead to an enormous number of
experiments or simulations. The Box-Behnken method [53] is chosen in this study to keep the
required number of simulations to a minimum. This simulation design is also suitable for second
order response surface models. Using the Box-Behnken design, 114 simulations are designed for
eight input parameters at three levels (minimum, medium, and maximum) as shown in Table 2.
The values of all input parameters are converted to a -1, 0, and 1 scale using a linear relationship,
except for matrix permeability and rock compressibility (where logarithmic values are used
instead).
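The conversion to coded levels described above can be sketched as follows. This is a minimal illustration only: the function name `to_coded` and the parameter ranges are hypothetical (they are not the values of Table 2), and the logarithmic case assumes that the medium level is the geometric mean of the minimum and maximum.

```python
import math

def to_coded(value, lo, hi, log_scale=False):
    """Map a physical value in [lo, hi] onto the coded range [-1, 1]."""
    if log_scale:
        # Permeability and compressibility are scaled on their logarithms.
        value, lo, hi = math.log10(value), math.log10(lo), math.log10(hi)
    return 2.0 * (value - lo) / (hi - lo) - 1.0

# Hypothetical ranges, for illustration only (not taken from Table 2).
p_lo, p_hi = 3000.0, 7000.0   # initial reservoir pressure, psi
k_lo, k_hi = 1e-5, 1e-3       # matrix permeability, mD

print(to_coded(p_lo, p_lo, p_hi))                   # minimum level
print(to_coded(5000.0, p_lo, p_hi))                 # arithmetic midpoint
print(to_coded(1e-4, k_lo, k_hi, log_scale=True))   # geometric midpoint
```

With this mapping the minimum, medium, and maximum levels of every input fall on -1, 0, and 1, so all inputs enter the surrogate models on a comparable scale.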
Apart from the 114 simulation results that were used to train the models, 30 additional
simulations were run to test the models. Therefore, the training data comprises
approximately 80% of the total data set (114 out of 144) while the testing portion comprises
approximately 20% (30 out of 144). The list of simulations used to train and test the models can
be found in the Appendix (Tables A.3 and A.4). IMEX™ from the Computer Modeling Group,
Calgary, Canada was used to conduct all black-oil simulations. The minimum number of
simulation grid blocks necessary to obtain accurate results and avoid convergence issues was
3. Surrogate Model
As mentioned earlier, three types of surrogate models were developed and compared in this
study: a Response Surface Model (RSM), a Least Square Support Vector Machine (LSSVM) model,
and an Artificial Neural Networks (ANN) model. Simulation results in terms of oil
recovery and gas-oil ratio (GOR) were obtained in two ways: by recording oil recovery and GOR
after certain production times and when the oil rate dropped to 5 bbl/day/fracture. In other words,
five time-based models and one rate-based model were developed as summarized below:
Time-based models: Models for oil recovery and GOR at 90 days, 1 year, 5 years, 10 years, and
15 years.
Rate-based model: Models for oil recovery and GOR when oil rate drops to 5
bbl/day/fracture.
All unknown parameters in the surrogate models (RSM, LSSVM and ANN) are obtained using
(MathWorks® Inc.). The same optimization routine was used for all surrogate models to
eliminate any performance bias. Surrogate models sometimes return physically unacceptable
values such as negative oil recovery factors or gas-oil ratios. To avoid this pitfall,
logarithms of the outcomes (recovery factors and gas-oil ratio) are used to build the models. A
simplified schematic of the methodology used to develop the surrogate models is shown in Figure 1.
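The effect of building the models on logarithms of the outcomes can be illustrated with a small sketch. The data below are synthetic and the one-input linear surrogate is an assumption made purely for illustration; the point is only that exponentiating a model fitted to ln(y) can never return a negative value, whereas a model fitted to y directly carries no such guarantee.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)        # one coded input
y = np.exp(-3.0 + 2.5 * x)                 # synthetic, strictly positive response

A = np.column_stack([np.ones_like(x), x])  # linear surrogate: a0 + a1 * x

# Fit the response directly, and fit its logarithm instead.
coef_raw, *_ = np.linalg.lstsq(A, y, rcond=None)
coef_log, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)

x_new = np.linspace(-1.0, 1.0, 21)
A_new = np.column_stack([np.ones_like(x_new), x_new])
pred_raw = A_new @ coef_raw                # may dip below zero
pred_log = np.exp(A_new @ coef_log)        # positive by construction

print("minimum prediction, direct fit:", pred_raw.min())
print("minimum prediction, log fit:   ", pred_log.min())
```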
Figure 1: Surrogate model development schematic
All unknown parameters are listed in Table 3. These parameters are discussed in more detail in
the upcoming sections.
ANN 126 Weights (8 × 14 for hidden layer + 14 for output layer) All
The first two models (i.e. RSM and LSSVM) are discussed in detail in a previous article [55].
Therefore, these two models are intentionally discussed in brief here and the reader is referred to
The response surface model is the most common method used in many branches of engineering.
Basically, an algebraic equation is fitted to develop a relationship between input and output data.
During equation fitting with training data, the parameters (coefficients, intercepts, etc.) are
determined through an optimization routine to minimize error. A second order polynomial is used
in this study, as shown in Equation 1:
$$ y(\mathbf{x}) = a_0 + \sum_{k=1}^{8} a_k x_k + \sum_{i=1}^{8}\sum_{j=i}^{8} a_{ij}\, x_i x_j \qquad (1) $$
For 8 input variables, there are 8 linear coefficients, a_k, 36 second order interaction
coefficients, a_ij, and one intercept, a_0, as shown in Equation 1. A workflow to develop surrogate
models (RSM and LSSVM) is shown in Figure 2.
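The parameter count of the second order polynomial (1 intercept, 8 linear coefficients, and 36 second order coefficients) can be checked with a short sketch. The helper `quadratic_features` and the synthetic data are hypothetical, and ordinary least squares is used here only as a convenient stand-in for the optimization routine of the study.

```python
import itertools
import numpy as np

def quadratic_features(X):
    """Columns: 1, x_k for every input, and x_i * x_j for all pairs with i <= j."""
    n, d = X.shape
    cols = [np.ones(n)]
    cols += [X[:, k] for k in range(d)]
    cols += [X[:, i] * X[:, j]
             for i, j in itertools.combinations_with_replacement(range(d), 2)]
    return np.column_stack(cols)

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(114, 8))   # 114 synthetic runs, 8 coded inputs
Phi = quadratic_features(X)
print(Phi.shape)                            # one column per unknown coefficient

# Synthetic response generated from known coefficients, then recovered.
true_coef = rng.normal(size=Phi.shape[1])
log_y = Phi @ true_coef
coef, *_ = np.linalg.lstsq(Phi, log_y, rcond=None)
print(np.abs(coef - true_coef).max())
```

Because the model is linear in its coefficients, any optimizer that minimizes the squared error should arrive at essentially the same coefficients on noise-free data.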
Figure 2: Workflow used to develop RSM and LSSVM. Modified from Panja et al. [55]
As part of the development of a model, validation is performed using test data to assess
robustness. An accepted error margin is set for the surrogate model. In this fashion, surrogate
models are continuously improved until the error falls within its acceptance limit.
The Support Vector Machine (SVM) is usually used for classification and regression analysis. A
modified form of SVM, namely the least square support vector machine (LSSVM), is used in this
study. LSSVM is close to the SVM formulation but solves a linear system instead of a quadratic
programming (QP) problem. It has been widely applied in various fields because it is easier to
implement and converges quickly. On the other hand, LSSVM is inherently prone to
overfitting when minimizing error. Various combinations of data training and testing sets such as
90-10 (%), 85-15 (%), 80-20 (%), and 70-30 (%) were tried. Eventually, a data set with 80% used
for training and 20% used for testing yielded the best prediction capabilities in this study. The
same combination was used for the other two surrogate models (RSM and ANN).
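The input and output relationship in LSSVM is given by Equation 2:
$$ y(\mathbf{x}) = \mathbf{w}^{T}\varphi(\mathbf{x}) + b \qquad (2) $$
The final form of LSSVM is given by Equation 3:
$$ \begin{bmatrix} 0 & 1 & \cdots & 1 \\ 1 & K(\mathbf{x}_1,\mathbf{x}_1) + \frac{1}{\gamma} & \cdots & K(\mathbf{x}_1,\mathbf{x}_N) \\ \vdots & \vdots & \ddots & \vdots \\ 1 & K(\mathbf{x}_N,\mathbf{x}_1) & \cdots & K(\mathbf{x}_N,\mathbf{x}_N) + \frac{1}{\gamma} \end{bmatrix} \begin{bmatrix} b \\ \alpha_1 \\ \vdots \\ \alpha_N \end{bmatrix} = \begin{bmatrix} 0 \\ y_1 \\ \vdots \\ y_N \end{bmatrix} \qquad (3) $$
K(x, x_i) is known as the kernel function, which is chosen a priori. The Radial Basis Function
(RBF) kernel is used in this study, as given by Equation 4:
$$ K(\mathbf{x}, \mathbf{x}_i) = \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{x}_i \rVert^{2}}{\sigma^{2}}\right) \qquad (4) $$
Where,
x_i: Input vector of the ith data point
y_i: Output of the ith data point (N data points in total)
w, φ(x): Weight vector and feature mapping
b: Bias term
γ: Regularization parameter
σ²: Kernel parameter
α_i: Support values
It is evident from Equations 2, 3, and 4 that if the regularization parameter, γ, and the kernel
parameter, σ², are provided, the bias term and all support values can be determined from a linear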
values are guessed and iteratively improved as described in Figure 2. For the optimization part,
the training data is further divided into LSSVM training data (80%) and optimization data (20%)
as shown in Figure 3.
Figure 3: Division of total data set into LSSVM training, optimization, and test data
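The further division of the training data into an LSSVM training portion and an optimization portion can be sketched as below. The random shuffling and the rounding of the 20% share are assumptions made for illustration; how the actual runs were assigned to each portion is not reproduced here.

```python
import numpy as np

n_train_total = 114                          # simulations available for training
rng = np.random.default_rng(4)
shuffled = rng.permutation(n_train_total)    # assumption: random assignment

n_opt = round(0.20 * n_train_total)          # optimization share (20%)
opt_idx = shuffled[:n_opt]                   # used to tune the two LSSVM parameters
fit_idx = shuffled[n_opt:]                   # used to solve for bias and support values

print(len(fit_idx), len(opt_idx))
```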
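LSSVM training is over once all parameters in Table 3 are found. At this point, the model
can be applied to any unknown input vector using the RBF kernel as shown in Equation 5,
where the sum runs over the N training data:
$$ y(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{x}_i \rVert^{2}}{\sigma^{2}}\right) + b \qquad (5) $$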
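For a given pair of regularization and kernel parameters, LSSVM training reduces to a single linear solve, which the following sketch illustrates on synthetic data. It follows the standard LSSVM formulation with an RBF kernel; the function names, the data, and the parameter values are hypothetical and are not those of the study.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma2)

def lssvm_train(X, y, gamma, sigma2):
    """Solve the LSSVM linear system for the bias b and support values alpha."""
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_new, X, alpha, b, sigma2):
    """Evaluate the trained model at new input vectors."""
    return rbf_kernel(X_new, X, sigma2) @ alpha + b

rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(40, 2))      # synthetic coded inputs
y = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1]     # synthetic response
gamma, sigma2 = 100.0, 1.0                    # illustrative values only

b, alpha = lssvm_train(X, y, gamma, sigma2)
y_fit = lssvm_predict(X, X, alpha, b, sigma2)
print("largest training residual:", np.abs(y_fit - y).max())
```

In a full workflow an outer optimizer would propose new values of the two parameters, repeat this solve, and score the result on the optimization data.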
The Artificial Neural Networks (ANN) algorithm was developed based on human learning
processes through brain and nerve networks. This is a connectionist technique where input and
output are linked through neurons. The most common feed-forward architecture consists of one
input layer, one or more hidden layers, and one output layer as shown in Figure 4.
Figure 4: Basic structure of ANN with input, hidden, and output layers.
Links between input and output are established through the internal computations in the hidden
layers. The complexity and non-linearity of the model are increased by increasing the number of
hidden layers, where the individual components of a layer are known as nodes. In this study, there
are nine input nodes (eight input parameters as listed in Table 2 and one bias) connected to one
hidden layer. A weight was given to each connection for every node and a bias term was added
to each hidden and output node. Bias and weight values used in this study are summarized for
one output in Table 3. In the process of training the ANN model, all weights and biases are
determined by minimizing the error between the predicted output and the training output via the
activation function at each node. All output data is normalized as shown in Equation 6:
[Equation 6]
Figure 5: Input-to-output structure and calculations inside (a) hidden and (b) output nodes
As shown in Figure 5, computation consists of two calculations: summation and transformation
through activation functions, where activation functions may be linear or non-linear. In this study,
[Equation 7]
The sensitivity of the model to the number of hidden nodes (neurons) was also investigated. As
described earlier, the non-linearity of the relationship between input and output data increases with the
number of hidden nodes. However, increasing non-linearity does not always guarantee higher
prediction accuracy. To find the optimum number of hidden nodes (neurons), a sensitivity
study was conducted on the training and testing data for oil recovery and gas-oil ratio after 5
years (Figure 6).
Figure 6: Coefficients of determination using different numbers of hidden nodes for (a) oil
recovery and (b) gas-oil ratio at 5 years.
It is evident from Figure 6 that R² is close to unity for training data. On the other hand, the R²
value for the test data increases initially with the number of neurons for oil recovery and gas-oil
ratio. A maximum R² value can be clearly identified at 14 neurons for the case of oil recovery.
Therefore, 14 hidden neurons are used in this study. The ANN parameters used in this study are
summarized below:
Number of weights = Number of neurons in the first layer * Number of inputs + Number
of neurons in the first layer * Number of neurons in the second layer = 126
Number of biases = Number of neurons in the first layer + Number of neurons in the
second layer = 15
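The parameter counts above can be verified with a minimal forward-pass sketch of an 8-14-1 network. The weights below are random placeholders, and the choice of a tanh hidden activation with a linear output node is an assumption made for illustration, not a statement of the activation function used in the study.

```python
import numpy as np

n_in, n_hidden, n_out = 8, 14, 1
rng = np.random.default_rng(3)

# Random placeholder parameters; in practice these come from training.
W1 = rng.normal(size=(n_hidden, n_in))    # input -> hidden weights
b1 = rng.normal(size=n_hidden)            # hidden biases
W2 = rng.normal(size=(n_out, n_hidden))   # hidden -> output weights
b2 = rng.normal(size=n_out)               # output bias

def forward(x):
    """Summation followed by activation at each hidden node, then the output node."""
    hidden = np.tanh(W1 @ x + b1)         # assumption: tanh activation
    return W2 @ hidden + b2               # assumption: linear output node

n_weights = W1.size + W2.size
n_biases = b1.size + b2.size
print(n_weights, n_biases, n_weights + n_biases)
print(forward(np.zeros(n_in)))
```

The total of weights and biases is the number of unknowns that the optimizer has to determine for each ANN model.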
The unknown parameters in the ANN structure are summarized in Table 3. During training of the
ANN, these 126 weights and 15 biases are determined using an optimization routine, namely the
Particle Swarm Optimization (PSO).
There are various error measuring tools used in every branch of science and engineering. Their
uses are mostly dependent on the model and purpose of the system. During the fitting portion of
the model (training the model), the Mean Square Error (MSE) is set as the objective function to
determine the optimized model parameters using PSO. Since a minimum value of MSE
indicates a good match between experimental or simulated values and modeled values, MSE
is minimized during optimization in PSO. The MSE is calculated between experimental or
simulated values and modeled values as shown in Equation 8:
$$ \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_{obs,i} - Y_{model,i}\right)^{2} \qquad (8) $$
Yobs and Ymodel are the simulated and modeled values, respectively, and n is the number of data sets.
In this study, the Normalized Root Mean Square Error (NRMSE) and the coefficient of
determination (R²) are adopted to measure the discrepancy between simulated data and model
data. The NRMSE is used over MSE to compare various models (time-based and rate-based
models) on the same scale. The coefficient of determination, R², is defined as shown in Equation
9:
$$ R^{2} = 1 - \frac{SS_{res}}{SS_{tot}} \qquad (9) $$
Where,
$$ SS_{res} = \sum_{i=1}^{n}\left(Y_{obs,i} - Y_{model,i}\right)^{2}, \qquad SS_{tot} = \sum_{i=1}^{n}\left(Y_{obs,i} - \bar{Y}\right)^{2} $$
$$ \mathrm{NRMSE} = \frac{\sqrt{\mathrm{MSE}}}{Y_{obs,max} - Y_{obs,min}} \qquad (10) $$
Where Yobs,max is the maximum value and Yobs,min is the minimum value of the observed data.
The value of R² varies from 0 to 1. R² values close to unity and small NRMSE values
indicate a good fit.
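The three error measures can be written compactly as in the sketch below. The function names and the five-point data set are hypothetical; the NRMSE is normalized by the range of the observed data, as described above.

```python
import numpy as np

def mse(y_obs, y_model):
    """Mean square error between observed and modeled values."""
    return float(np.mean((y_obs - y_model) ** 2))

def r_squared(y_obs, y_model):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_obs - y_model) ** 2)
    ss_tot = np.sum((y_obs - np.mean(y_obs)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def nrmse(y_obs, y_model):
    """Root mean square error normalized by the range of the observed data."""
    return float(np.sqrt(mse(y_obs, y_model)) / (y_obs.max() - y_obs.min()))

# Small illustrative data set (not taken from the study).
y_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_model = np.array([1.1, 1.9, 3.2, 3.8, 5.0])

print(mse(y_obs, y_model), r_squared(y_obs, y_model), nrmse(y_obs, y_model))
```

Because the NRMSE is dimensionless, it allows models with outputs of very different magnitude, such as early-time and late-time recovery, to be compared directly.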
4. Optimization Routine: Particle Swarm Optimization
Inspired by the motion of bird swarms, the Particle Swarm Optimization (PSO) routine was
developed by Eberhart and Kennedy [56]. In this method, each potential solution is treated as a
particle. Each particle is characterized by its position and velocity. The position of a particle is
defined in a hyperspace whose dimension is equal to the number of unknown parameters being
optimized as shown in Table 3. For example, in the case of ANN, particles fly in a
141-dimensional hyperspace. Several particles are initially defined in the hyperspace, where they
iteratively change their position to determine the optimum position. The fitness of a particle is
determined by a fitness function such as the MSE. This algorithm is similar to the method
Two solutions, pbest and gbest, are tracked at every iteration during execution of the algorithm. The
local best or pbest is defined as the best position of a particle in the hyperspace as determined by
the fitness value. The global best or gbest is the overall best value obtained by any particle so far in the
population. At each iteration step, the velocity is updated first and then the position. Accelerating the
particle towards its pbest and gbest by updating velocity is done by two separate random numbers,
r_1 and r_2, as shown in Equation 11:
$$ v_{k+1} = w_i v_k + C_1 r_1 \left(p_{best} - x_k\right) + C_2 r_2 \left(g_{best} - x_k\right) \qquad (11) $$
The cognitive component guides the local search from its local best (pbest) and the social
component is responsible for global search depending on the population best (gbest) [57]. In
Equation 11, v_{k+1} is the velocity in the next iteration step, which is partially preserved from the
current velocity, v_k, by an inertia weight, wi (range 0.4 to 0.9). The acceleration coefficients (C1 and
C2) for cognitive and social components are chosen by trial and error. The position of a particle,
x_k, is then updated as shown in Equation 12:
$$ x_{k+1} = x_k + v_{k+1} \qquad (12) $$
The values of wi, C1, C2, and other parameters used in this study are given in Table 4.
Table 4: Particle swarm optimization parameters in various surrogate models
Surrogate model   Unknown parameters                Number of particles   C1   C2   wi    Maximum iterations
RSM               1 intercept, 44 coefficients      100                   2    2    0.6   1000
LSSVM             1 regularization parameter (γ)    100                   2    2    0.6   1000
The initial position and velocity of each particle are randomly distributed. After the initialization of
positions and velocities of all particles, fitness is calculated. In subsequent steps, positions and
velocities are updated iteratively by the local best and the global best parameters as summarized
in Figure 7.
Figure 7: Particle Swarm Optimization flow chart. Modified from Ahmadi et al. [58]
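The four parts of the flow chart (initialization, fitness evaluation, condition check, and updates of velocity and position) can be sketched as a short routine. This is a generic PSO written under stated assumptions, not the implementation used in the study: the function `pso`, the search bounds, the initial velocity scale, and the fixed iteration count as the only stopping condition are all illustrative choices, and the default inertia weight and acceleration coefficients mirror the values quoted for Table 4.

```python
import numpy as np

def pso(fitness, dim, n_particles=100, n_iter=1000,
        w=0.6, c1=2.0, c2=2.0, bounds=(-1.0, 1.0), seed=0):
    """Minimize `fitness` over a `dim`-dimensional space with a basic PSO."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    # Initialization: random positions and small random velocities.
    x = rng.uniform(lo, hi, size=(n_particles, dim))
    v = 0.1 * rng.uniform(lo - hi, hi - lo, size=(n_particles, dim))
    pbest = x.copy()
    pbest_val = np.array([fitness(p) for p in x])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])

    for _ in range(n_iter):
        # Two separate random numbers per particle and dimension.
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # velocity first
        x = x + v                                                  # then position
        val = np.array([fitness(p) for p in x])
        improved = val < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = val[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
    return gbest, gbest_val

def sphere(p):
    """Simple test objective standing in for the MSE of a surrogate model."""
    return float(np.sum(p ** 2))

best, best_val = pso(sphere, dim=3, n_particles=30, n_iter=200)
print(best, best_val)
```

In the surrogate-model setting, `fitness` would return the MSE of the model built from a candidate parameter vector, and `dim` would equal the number of unknown parameters of that model.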
The entire flowchart can be divided into four parts, namely, initialization, fitness evaluation,
condition check, and updates of velocity and position. Acceptance of any particle as a potential
solution is determined by its fitness value, which is calculated in each iteration step. As described
earlier, one local best, pbest, and one global best, gbest, are recorded during each iteration. The
number of iterations is only limited by time and computational constraints; hence the maximum
Five time-based models (90 days, 1 year, 5 years, 10 years and 15 years) and one rate-based
model (5 bbl/day/fracture) were trained using RSM, LSSVM and ANN. The objective of this
oil recovery and gas-oil ratio are compared with simulation data. Since all time-based models
behave similarly, only two time-based models (one early production model after 90 days and a
long production model after 10 years) along with one rate-based model are discussed here. The
fitness of a model with training data and with test data are both discussed in this section. Once the
model is trained, it is tested against an unknown data set (i.e., test data) to check the robustness
of its forecasting capabilities. As discussed earlier, the fitness is determined by two measures, the
coefficient of determination (R²) and normalized root mean square error (NRMSE) which are
5.1. Training
In this section, model fitness as compared with training data is evaluated and discussed. It is
important to assess individual models to check for overfitting. Three surrogate models (RSM,
LSSVM, and ANN) for oil recovery and gas-oil ratio after 90 days, 10 years of production and at
Figure 8: Comparison of RSM, ANN and LSSVM models using training data for (a) oil
recovery after 90 days, (b) oil recovery after 10 years, (c) oil recovery after oil rate drops to 5
bbl/day/fracture, (d) gas-oil ratio after 90 days, (e) gas-oil ratio after 10 years, and (f) gas-oil
ratio after oil rate drops to 5 bbl/day/fracture
High model fitness as compared with training data can be observed for all cases in Figure
8. However, capturing the production behavior using surrogate models without comprehending
the underlying physics is a great challenge. It is difficult to model the GOR from low
permeability reservoirs [59] due to its complex behavior. At higher values of GOR
(obtained from the 10 years model), flow is mainly boundary dominated. On the other hand, lower
GOR (obtained from the 90 days and 1 year models) occurs when flow is in the transient linear
regime. Overall, both oil recovery and GOR from surrogate models are in good agreement with
simulation results. The errors are calculated in terms of R² and NRMSE as listed in Tables A.1
and A.2. For visual comparison, R² and NRMSE for RSM, ANN and LSSVM models of oil
recovery are shown in Figure 9.
Figure 9: Fitness of RSM, ANN and LSSVM models of oil recovery for training data: (a)
coefficient of determination, R², and (b) NRMSE
R² and NRMSE for oil recovery using RSM, LSSVM and ANN for all time- and rate-based
models are greater than 0.95 and less than 6%, respectively. These values are evidence of
well-trained models.
5.2. Testing
Some models may have the tendency to overfit the training data and consequently fail to predict
unseen test data with high accuracy. In this study, 20 percent of all data was used to check the
forecasting capabilities of all developed models. Results for oil recovery and GOR after 90 days,
10 years of production, and at terminal oil rate (5 bbl/day/fracture) are shown in Figure 10.
Figure 10: Comparison of RSM, ANN and LSSVM models using test data for (a) oil recovery
after 90 days, (b) oil recovery after 10 years, (c) oil recovery after oil rate drops to 5
bbl/day/fracture, (d) gas-oil ratio after 90 days, (e) gas-oil ratio after 10 years, and (f) gas-oil
ratio after the oil rate drops to 5 bbl/day/fracture
Although all models had high fitness compared with training data, they showed relatively lower
fitness when compared with test data. Considering that test data was not accounted for
during training, the models show promising forecasting capabilities without significant
aberrations. R² and NRMSE are calculated as listed in Tables A.1 and A.2. The R² and NRMSE
values for RSM, ANN, and LSSVM oil recovery models are shown in Figures 11 a and b,
respectively.
Figure 11: Fitness of RSM, ANN and LSSVM oil recovery models for testing data: (a)
coefficient of determination, R², and (b) NRMSE
Except for a few cases, the forecast accuracy for all models is within decent ranges. As shown
in the figures above, RSM shows higher accuracy in predicting oil recovery, followed by LSSVM.
As shown in Figures 10 and 11, AI tools have the potential to predict oil recoveries and fluid
ratios given a small training data set. Large amounts of completion, geological, and production
data can indeed be used to train more robust models and complement current conventional tools
to evaluate the potential of tight oil reservoirs. The cost, however, is that AI skips the physical
description and understanding of the multiphase production mechanisms in tight formations. This
cost may not be too high to pay since the current conventional understanding of these systems
may not be sufficiently developed yet. In fact, researchers have recently reported on the
discrepancies between conventional thinking and fluids under nanoconfinement in tight
formations [60, 61].
6. Conclusion
Artificial intelligence tools aimed to predict oil recovery and gas-oil ratio from hydraulically
training framework. In this study, three models were developed based on RSM, ANN, and
LSSVM to predict recovery from wells producing under time-based (90 days, 1 year, 5 years,
10 years, and 15 years) and rate-based (5 bbl/day/fracture) constraints. Eight key factors,
namely matrix permeability, gas relative permeability exponent, rock compressibility, initial
gas-oil ratio, slope of solution gas-oil ratio versus pressure, initial pressure, flowing bottom-hole
pressure, and fracture spacing, were considered as input parameters for all cases. After all
models were trained with the same database, they were used to predict production for different
scenarios. Using simulation as a comparison basis, all models were evaluated in terms of their
oil recovery and producing gas-oil ratio predictive capabilities. It was found that RSM and
LSSVM have better predictive capabilities for oil recovery than ANN. In addition, LSSVM
Field-scale modeling and simulation of hydraulically fractured ultra-low permeability reservoirs
models, on the other hand, are useful for quick oil production forecast and assessment.
Additionally, these models can be used for risk and uncertainty analysis. Overall, artificial
production and reservoir engineering.
Nomenclature
σ²         Kernel Parameter                         -
αi         Support Values                           Unit of Output
Ȳ          Mean Of Observed Values                  Unit of Output
a0         The Intercept Of The Surrogate Model     Unit of Output
AI         Artificial Intelligence                  -
SSres      Residual Sum Of Squares                  Unit of Output
SStot      Total Sum Of Squares                     Unit of Output
SVM        Support Vector Machine                   -
vk         Velocity Of Particle                     -
wi         Inertia Weight                           -
Ymodel,i   Modeled Value                            Unit of Output
Yobs,i     Observed Data                            Unit of Output
Yobs,max   The Maximum Value Of Observed Data       Unit of Output
Yobs,min   The Minimum Value Of Observed Data       Unit of Output
References
[1]. Yeten, B., A. Castellini, B. Guyaguler, and W.H. Chen. A Comparison Study on
Experimental Design and Response Surface Methodologies. in SPE Reservoir Simulation
Symposium. The Woodlands, Texas: Society of Petroleum Engineers Inc. (2005).
[2]. Amorim, T.C.A.D. and D.J. Schiozer. Risk Analysis Speed-Up With Surrogate Models. in
SPE Latin America and Caribbean Petroleum Engineering Conference. Mexico City,
Factor" Analysis for Modeling Nonlinear Responses Caused by Both Reservoir and
Controllable Factors. in SPE Annual Technical Conference and Exhibition. Dallas,
Texas: Society of Petroleum Engineers (2005).
[4]. Peng, C.Y. and R. Gupta. Experimental Design in Deterministic Modelling: Assessing
Significant Uncertainties. in SPE Asia Pacific Oil and Gas Conference and Exhibition.
Jakarta, Indonesia: Society of Petroleum Engineers (2003).
[5]. Dejean, J.P. and G. Blanc. Managing Uncertainties on Production Predictions Using
[6]. Corre, B., P. Thore, V.d. Feraudy, and G. Vincent. Integrated Uncertainty Assessment
For Project Evaluation and Risk Analysis. in SPE European Petroleum Conference.
Paris, France: Society of Petroleum Engineers Inc. (2000).
[7]. Manceau, E., M. Mezghani, I. Zabalza-Mezghani, and F. Roggero. Combination of
Experimental Design and Joint Modeling Methods for Quantifying the Risk Associated
With Deterministic and Stochastic Uncertainties - An Integrated Test Study. in SPE
Annual Technical Conference and Exhibition. New Orleans, Louisiana: Society of
Petroleum Engineers Inc. (2001).
[8]. Venkataraman, R. Application of the Method of Experimental Design to Quantify
Uncertainty in Production Profiles. in SPE Asia Pacific Conference on Integrated
Studies Using a Surrogate Reservoir Model. in SPE Annual Technical Conference and
Exhibition. San Antonio, Texas, USA: Society of Petroleum Engineers (2006).
[11]. Guyaguler, B. and R.N. Horne. Uncertainty Assessment of Well Placement Optimization.
in SPE Annual Technical Conference and Exhibition. New Orleans, Louisiana:
Society of Petroleum Engineers Inc. (2001).
[12]. Manceau, E., F. Roggero, and I. Zabalza-Mezghani. Use Of Experimental Design
Methodology To Make Decisions In An Uncertain Reservoir Environment From
Reservoir Uncertainties To Economic Risk Analysis. World Petroleum Congress (2002).
[13]. Landa, J.L. and B. Güyagüler. A Methodology for History Matching and the Assessment
of Uncertainties Associated with Flow Prediction. in SPE Annual Technical Conference
and Exhibition. Denver, Colorado: Society of Petroleum Engineers (2003).
[14]. Carreras, P.E., S.E. Turner, and G.T. Wilkinson. Tahiti: Development Strategy
Assessment Using Design of Experiments and Response Surface Methods. in SPE
Western Regional/AAPG Pacific Section/GSA Cordilleran Section Joint Meeting.
Anchorage, Alaska, USA: Society of Petroleum Engineers (2006).
[15]. Yang, C., L.X. Nghiem, C. Card, and M. Bremeier. Reservoir Model Uncertainty
Quantification Through Computer-Assisted History Matching. in SPE Annual Technical
Conference and Exhibition. Anaheim, California, U.S.A.: Society of Petroleum Engineers
(2007).
[16]. Slotte, P.A. and E. Smorgrav. Response Surface Methodology Approach for History
Matching and Uncertainty Assessment of Reservoir Simulation Models. in
prediction equilibrium water dew point of natural gas in TEG dehydration systems. Fuel,
137 (2014). p. 145-154.
[18]. Mohaghegh, S.D., J.S. Liu, R. Gaskari, M. Maysami, and O.A. Olukoko. Application of
Well-Base Surrogate Reservoir Models (SRMs) to Two Offshore Fields in Saudi Arabia,
C
Case Study. in SPE Western Regional Meeting. Bakersfield, California, USA: Society of
Petroleum Engineers (2012).
AC
[19]. Velasco, R., P. Panja, and M. Deo. New Production Performance and Prediction Tool for
Unconventional Reservoirs, URTEC-2461718-MS. in Unconventional Resources
Technology Conference, 1-3 August. San Antonio, Texas, USA: Unconventional
Resources Technology Conference (2016).
[20]. Patzek, T.W., F. Male, and M. Marder, Gas production in the Barnett Shale obeys a
simple scaling theory. Proceedings of the National Academy of Sciences, 110(49) (2013).
p. 19731-19736.
[21]. Wattenbarger, R.A., A.H. El-Banbi, M.E. Villegas, and J.B. Maggard. Production
Analysis of Linear Flow Into Fractured Tight Gas Wells, SPE-39931-MS. in SPE Rocky
ACCEPTED MANUSCRIPT
PT
[24]. Juniardi, I.R. and I. Ershaghi. Complexities of Using Neural Network in Well Test
Analysis of Faulted Reservoirs. Society of Petroleum Engineers (1993).
[25]. Zhou, C.D., X.-L. Wu, and J.-A. Cheng. Determining Reservoir Properties in Reservoir
Studies Using a Fuzzy Neural Network. in SPE Annual Technical Conference and
RI
Exhibition. Houston, Texas: Society of Petroleum Engineers (1993).
[26]. Mohaghegh, S., R. Arefi, S. Ameri, and M.H. Hefner. A Methodological Approach for
Reservoir Heterogeneity Characterization Using Artificial Neural Networks. in SPE
SC
Annual Technical Conference and Exhibition. New Orleans, Louisiana: Society of
Petroleum Engineers (1994).
[27]. E. El-Sebakhy, T.S., S. Al-Bokhitan, Y. Shaaban, I. Raharja, Y. Khaeruzzaman. Support
U
Vector Machines Framework for Predicting the PVT Properties of Crude-Oil Systems.
Kingdom of Baharin: 15th SPE Middle East Oil & Gas Show and Conference (2007).
[28]. AN
Oloso, M., A. Khoukhi, A. Abdulraheem, and M. Elshafei. Prediction of Crude Oil
Viscosity and Gas/Oil Ratio Curves Using Recent Advances to Neural Networks. in
SPE/EAGE Reservoir Characterization and Simulation Conference. Abu Dhabi, UAE:
Society of Petroleum Engineers (2009).
M
[29]. Rabiei, A., H. Sayyad, M. Riazi, and A. Hashemi, Determination of dew point pressure in
gas condensate reservoirs based on a hybrid neural genetic algorithm. Fluid Phase
Equilibria, 387 (2015). p. 38-49.
D
[30]. Ahmadi, M.A. and M. Ebadi, Evolving smart approach for determination dew point
pressure through condensate gas reservoirs. Fuel, 117 (2014). p. 1074-1084.
TE
[31]. Ahmadi, M.A., M. Ebadi, and A. Yazdanpanah, Robust intelligent tool for estimating dew
point pressure in retrograded condensate gas reservoirs: Application of particle swarm
optimization. Journal of Petroleum Science and Engineering, 123 (2014). p. 7-19.
EP
[32]. Ahmadi, M.A., M. Ebadi, P.S. Marghmaleki, and M.M. Fouladi, Evolving predictive
model to determine condensate-to-gas ratio in retrograded condensate gas reservoirs.
Fuel, 124 (2014). p. 241-257.
[33]. Ahmadi, M.A., Neural network based unified particle swarm optimization for prediction
C
precipitation due to natural depletion by using evolutionary algorithm concept. Fuel, 102
(2012). p. 716-723.
[35]. Ahmadi, M.A., M. Ebadi, A. Shokrollahi, and S.M.J. Majidi, Evolving artificial neural
network and imperialist competitive algorithm for prediction oil flow rate of the
reservoir. Applied Soft Computing, 13(2) (2013). p. 1085-1098.
[36]. Fatai Adesina Anifowose, A.A. Prediction of Porosity and Permeability of Oil and Gas
Reservoirs using Hybrid Computational Intelligence Models. Cairo, Egypt: North Africa
Technical Conference and Exhibition, SPE (2010).
ACCEPTED MANUSCRIPT
[37]. Fatai Adesina Anifowose, A.O.E., Safiriyu Ijiyemi. Prediction of Oil and Gas Reservoir
Properties using Support Vector Machines. Bangkok, Thailand: International Petroleum
Technology Conference, (2011).
[38]. Ammal F. Al-anazi, G., Ian D, Support-Vector Regression for Permeability Prediction in
a Heterogeneous Reservoir: A Comparative Study. SPE Reservoir Evaluation &
Engineering, 13(03) (2010).
[39]. Mohammad-Ali Ahmadi, M.R.A., Seyed Moein Hosseini, Mohammad Ebadi,
PT
Connectionist model predicts the porosity and permeability of petroleum reservoirs by
means of petro-physical logs: Application of artificial intelligence. Journal of Petroleum
Science and Engineering, 123 (2014). p. 183-200.
[40]. Mohammad-Ali Ahmadi, A.B., A LSSVM approach for determining well placement and
RI
conning phenomena in horizontal wells. Fuel, 153 (2015). p. 276-283.
[41]. Mohammad Ali Ahmadi, M.E., Seyed Moein Hosseini, Prediction breakthrough time of
water coning in the fractured reservoirs by implementing low parameter support vector
SC
machine approach. Fuel, 117 (2014). p. 579-589.
[42]. Ahmadi, M.A., Connectionist approach estimates gas–oil relative simulation in
petroleum reservoirs: Application to reservoir simulation. Fuel, 140 (2015). p. 429-439.
U
[43]. Eslamimanesh, A., F. Gharagheizi, M. Illbeigi, A.H. Mohammadi, A. Fazlali, and D.
Richon, Phase equilibrium modeling of clathrate hydrates of methane, carbon dioxide,
AN
nitrogen, and hydrogen + water soluble organic promoters using Support Vector
Machine algorithm. Fluid Phase Equilibria, 316 (2012). p. 34-45.
[44]. Reza Gholgheysari Gorjaei, R.S., Mohammad Torkaman, Mohsen Safari, Ghassem
Zargar, A novel PSO-LSSVM model for predicting liquid rate of two phase flow through
M
wellhead chokes. Journal of Natural Gas Science and Engineering, 24 (2015). p. 228-237.
[45]. Ahmadi, M.-A., M.Z. Hasanvand, and A. Bahadori, A least-squares support vector
machine approach to predict temperature drop accompanying a given pressure drop for
D
the natural gas production and processing systems. International Journal of Ambient
Energy, 38(2) (2015). p. 122-129.
TE
[47]. Ahmadi, M.-A., M. Masumi, R. Kharrat, and A.H. Mohammadi, Gas Analysis by In Situ
Combustion in Heavy-Oil Recovery Process: Experimental and Modeling Studies.
Chemical Engineering & Technology, 37(3) (2014). p. 409-418.
[48]. Ahmadi, M.A., M. Masoumi, and R. Askarinezhad, Evolving Smart Model to Predict the
C
Combustion Front Velocity for In Situ Combustion. Energy Technology, 3(2) (2015). p.
128-135.
AC
[49]. Ahmadi, M.A., M. Zahedzadeh, S.R. Shadizadeh, and R. Abbassi, Connectionist model
for predicting minimum gas miscibility pressure: Application to gas injection process.
Fuel, 148 (2015). p. 202-211.
[50]. Ali Ahmadi, M. and A. Ahmadi, Applying a sophisticated approach to predict
CO2solubility in brines: application to CO2sequestration. International Journal of Low-
Carbon Technologies, 11(3) (2016). p. 325-332.
[51]. Ahmadi, M.-A., A. Bahadori, and S.R. Shadizadeh, A rigorous model to predict the
amount of Dissolved Calcium Carbonate Concentration throughout oil field brines: Side
effect of pressure and temperature. Fuel, 139 (2015). p. 154-159.
ACCEPTED MANUSCRIPT
[52]. Panja, P., T. Conner, and M. Deo, Factors Controlling Production in Hydraulically
Fractured Low Permeability Oil Reservoirs. International Journal of Oil, Gas and Coal
Technology, 3(1) (2015). p. 18.
[53]. Box, G.E.P. and D.W. Behnken, Some New Three Level Designs for the Study of
Quantitative Variables. Technometrics, 2(4) (1960). p. 455-475.
[54]. Panja, P., T. Conner, and M. Deo, Grid sensitivity studies in hydraulically fractured low
permeability reservoirs. Journal of Petroleum Science and Engineering, 112(0) (2013). p.
PT
78-87.
[55]. Panja, P., M. Pathak, R. Velasco, and M. Deo. Least Square Support Vector Machine: An
Emerging Tool for Data Analysis. in SPE Low Perm Symposium, 5-6 May. Colorado,
Denver: Society of Petroleum Engineers (2016).
RI
[56]. Eberhart, R. and J. Kennedy. A new optimizer using particle swarm theory. in Micro
Machine and Human Science, 1995. MHS '95., Proceedings of the Sixth International
Symposium on. (1995).
SC
[57]. Banerjee, C. and R. Sawal. PSO with dynamic acceleration Coefficient based on Mutiple
Constraint Satisfaction. in International Conference on Advances in Electronics
Computers and Communications. Bangalore, India (2014).
U
[58]. Ahmadi, M.A., R. Soleimani, M. Lee, T. Kashiwao, and A. Bahadori, Determination of
oil well production performance using artificial neural network (ANN) linked to the
[59].
AN
particle swarm optimization (PSO) tool. Petroleum, 1(2) (2015). p. 118-132.
Panja, P. and M. Deo, Unusual behavior of produced gas oil ratio in low permeability
fractured reservoirs. Journal of Petroleum Science and Engineering, 144 (2016). p. 76-
83.
M
[60]. Pathak, M., H. Kweon, P. Panja, R. Velasco, and M.D. Deo. Suppression in the Bubble
Points of Oils in Shales Combined Effect of Presence of Organic Matter and
Confinement. in SPE Unconventional Resources Conference, 15-16 February, . Calgary,
D
Table A.1: Coefficient of determination (R²) of RSM, LSSVM and ANN for all models

                            Training Data          Test Data
Output         Model        RSM   LSSVM  ANN       RSM   LSSVM  ANN
Oil Recovery   90 days      0.99  0.99   0.96      0.69  0.52   0.51
Oil Recovery   1 year       0.98  0.99   0.98      0.78  0.69   0.53
Oil Recovery   5 years      0.99  0.99   0.99      0.63  0.81   0.60
Oil Recovery   10 years     0.99  0.99   0.98      0.91  0.9    0.72
Oil Recovery   15 years     0.99  0.99   0.99      0.97  0.93   0.84
Oil Recovery   Rate Based   0.98  0.98   0.99      0.57  0.54   0.48
Table A.2: Normalized Root Mean Square Error (NRMSE) of RSM, LSSVM and ANN for all models

                            Training Data          Test Data
Output         Model        RSM   LSSVM  ANN       RSM   LSSVM  ANN
Oil Recovery   90 days      1.9   1.9    3.5       16.5  20.3   20.7
Oil Recovery   1 year       2.4   2.3    2.5       12.4  14.7   18.1
Oil Recovery   5 years      2.0   1.9    2.1       16.1  11.5   16.7
Oil Recovery   10 years     1.9   1.7    2.6       7.9   8.5    14
Oil Recovery   15 years     2.7   2.4    2.1       4.9   7.3    10.8
Oil Recovery   Rate Based   3.5   3.3    2.4       20.7  21.2   22.6
Gas Oil Ratio  90 days      2.6   2.0    4.6       8.7   11.8   13.3
Gas Oil Ratio  1 year       3.3   3.3    4.2       7.9   9.3    9.7
Gas Oil Ratio  5 years      3.0   3.1    3.7       24.0  16.1   26.2
Gas Oil Ratio  10 years     5.7   4.6    4.3       16.1  17.2   24.3
Gas Oil Ratio  15 years     5.8   6.8    5.6       14.4  15.5   25.7
Gas Oil Ratio  Rate Based   5.2   4.5    3.8       14.1  18.4   18.8
Figure A.1: Data training comparison of RSM, ANN and LSSVM models for (a) oil recovery after 1 year, (b) oil recovery after 5 years, (c) oil recovery after 15 years, (d) gas oil ratio after 1 year, (e) gas oil ratio after 5 years, (f) gas oil ratio after 15 years
Table A.3: List of simulations using Box-Behnken DOE used to train surrogate models

Sr. No.  Km    ng   Cf        dRs/dp           Rsi        Pi     Pwf    Xf
         (nD)  (-)  (1/psi)   ((SCF/STB)/psi)  (SCF/STB)  (psi)  (psi)  (ft)
1        10    1    4.00E-05  0.65             1900       5250   1000   180
2        10    3    4.00E-05  0.65             1900       5250   1000   180
3        5000  1    4.00E-05  0.65             1900       5250   1000   180
4        5000  3    4.00E-05  0.65             1900       5250   1000   180
5        10    2    4.00E-06  0.65             1900       5250   1000   180
12       5000  2              0.8              1900       5250   1000   180
13       10    2    4.00E-05  0.65             800        5250   1000   180
14       10    2    4.00E-05  0.65             3000       5250   1000   180
15       5000  2    4.00E-05  0.65             800        5250   1000   180
16       5000  2    4.00E-05  0.65             3000       5250   1000   180
17       10    2    4.00E-05  0.65             1900       4000   1000   180
18       10    2    4.00E-05  0.65             1900       6500   1000   180
19       5000  2    4.00E-05  0.65             1900       4000   1000   180
20       5000  2    4.00E-05  0.65             1900       6500   1000   180
21       10    2    4.00E-05  0.65             1900       5250   500    180
22       10    2    4.00E-05  0.65             1900       5250   1500   180
23       5000  2    4.00E-05  0.65             1900       5250   500    180
24       5000  2    4.00E-05  0.65             1900       5250   1500   180
25       10    2    4.00E-05  0.65             1900       5250   1000   60
26       10    2    4.00E-05  0.65             1900       5250   1000   300
27       5000  2    4.00E-05  0.65             1900       5250   1000   60
28       5000  2    4.00E-05  0.65             1900       5250   1000   300
53       225   2              0.5              1900       5250   1000   180
54       225   2    4.00E-06  0.8              1900       5250   1000   180
55       225   2    4.00E-04  0.5              1900       5250   1000   180
56       225   2    4.00E-04  0.8              1900       5250   1000   180
57       225   2    4.00E-06  0.65             800        5250   1000   180
58       225   2    4.00E-06  0.65             3000       5250   1000   180
59       225   2    4.00E-04  0.65             800        5250   1000   180
60       225   2    4.00E-04  0.65             3000       5250   1000   180
61       225   2    4.00E-06  0.65             1900       4000   1000   180
62       225   2    4.00E-06  0.65             1900       6500   1000   180
63       225   2    4.00E-04  0.65             1900       4000   1000   180
64       225   2    4.00E-04  0.65             1900       6500   1000   180
65       225   2    4.00E-06  0.65             1900       5250   500    180
66       225   2    4.00E-06  0.65             1900       5250   1500   180
67       225   2    4.00E-04  0.65             1900       5250   500    180
68       225   2    4.00E-04  0.65             1900       5250   1500   180
69       225   2    4.00E-06  0.65             1900       5250   1000   60
94       225   2              0.65             800        5250   1500   180
95       225   2    4.00E-05  0.65             3000       5250   500    180
96       225   2    4.00E-05  0.65             3000       5250   1500   180
97       225   2    4.00E-05  0.65             800        5250   1000   60
98       225   2    4.00E-05  0.65             800        5250   1000   300
99       225   2    4.00E-05  0.65             3000       5250   1000   60
100      225   2    4.00E-05  0.65             3000       5250   1000   300
101      225   2    4.00E-05  0.65             1900       4000   500    180
102      225   2    4.00E-05  0.65             1900       4000   1500   180
103      225   2    4.00E-05  0.65             1900       6500   500    180
104      225   2    4.00E-05  0.65             1900       6500   1500   180
105      225   2    4.00E-05  0.65             1900       4000   1000   60
106      225   2    4.00E-05  0.65             1900       4000   1000   300
107      225   2    4.00E-05  0.65             1900       6500   1000   60
108      225   2    4.00E-05  0.65             1900       6500   1000   300
109      225   2    4.00E-05  0.65             1900       5250   500    60
110      225   2    4.00E-05  0.65             1900       5250   500    300
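The rows of Table A.3 that survive follow a regular pattern: two parameters at a time take their low and high values while all others stay at their middle value. The Python sketch below reproduces that ordering. It is an illustration under assumptions, not the authors' procedure: the pairwise enumeration is inferred from the listed rows, the three levels per parameter are read from the table, and the function name and the single centre run appended at the end are hypothetical (the table is cut off before its last rows).

```python
from itertools import combinations, product

# (low, middle, high) levels as they appear in Table A.3.
LEVELS = {
    "Km": (10, 225, 5000),           # nD
    "ng": (1, 2, 3),                 # dimensionless
    "Cf": (4.0e-6, 4.0e-5, 4.0e-4),  # 1/psi
    "dRs/dp": (0.5, 0.65, 0.8),      # (SCF/STB)/psi
    "Rsi": (800, 1900, 3000),        # SCF/STB
    "Pi": (4000, 5250, 6500),        # psi
    "Pwf": (500, 1000, 1500),        # psi
    "Xf": (60, 180, 300),            # ft
}


def pairwise_three_level_design(levels, n_center=1):
    """Enumerate runs with two factors at low/high and the rest at middle.

    n_center is an assumption; the number of centre runs used in the
    study is not visible in the extracted table.
    """
    names = list(levels)
    center = {name: levels[name][1] for name in names}
    runs = []
    for a, b in combinations(names, 2):
        low_high_a = (levels[a][0], levels[a][2])
        low_high_b = (levels[b][0], levels[b][2])
        for value_a, value_b in product(low_high_a, low_high_b):
            run = dict(center)
            run[a] = value_a
            run[b] = value_b
            runs.append(run)
    runs.extend(dict(center) for _ in range(n_center))
    return runs


design = pairwise_three_level_design(LEVELS)
```

With eight parameters there are 28 parameter pairs and four low/high combinations per pair, so the sketch yields 112 runs plus the assumed centre run. Run 1 of the generated list has Km = 10 and ng = 1, and run 110 has Pwf = 500 and Xf = 300, matching the corresponding rows of the table.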
Table A.4: List of simulations performed using random input parameter values used to test the surrogate models

Sr. No.  Km    ng    Cf        dRs/dp           Rsi        Pi     Pwf    Xf
         (nD)  (-)   (1/psi)   ((SCF/STB)/psi)  (SCF/STB)  (psi)  (psi)  (ft)
15       13    1.03  3.0E-05   0.75             1273       5156   1136   60
16       16    2.97  1.9E-05   0.58             1413       5061   1445   60
17       256   1.33  6.2E-06   0.68             1323       5152   709    180
18       18    1.21  9.4E-06   0.51             2281       5925   1209   60
19       1618  1.74  1.2E-05   0.63             1552       4806   736    300
20       1612  1.40  3.8E-05   0.59             2527       5962   619    300
21       892   1.98  5.7E-06   0.55             1566       5178   1107   300
22       25    1.68  2.9E-05   0.55             1900       4089   950    60
23       604   2.90  1.8E-05   0.63             1614       4440   959    300
24       251   2.84  9.5E-06   0.53             2944       5804   1162   180
25       4237  1.11  6.2E-06   0.68             1710       5184   1270   300
26       565   2.48  1.1E-05   0.64             1966       4382   850    300
27       1448  1.54  1.2E-05   0.71             1393       4853   1162   300
28       168   1.85  5.3E-06   0.71             2676       5518   916    180
29       147   2.10  1.6E-05   0.69             1431       4479   1342   180
30       1692  2.89  6.7E-06   0.51             2672       5846   1333   300