Accepted Manuscript: 10.1016/j.petlm.2017.11.003

Accepted Manuscript
Application of artificial intelligence to forecast hydrocarbon production from shales
Palash Panja, Raul Velasco, Manas Pathak, Milind Deo
PII: S2405-6561(17)30114-1
DOI: 10.1016/j.petlm.2017.11.003
Reference: PETLM 175
To appear in: Petroleum
Received Date: 15 June 2017

Revised Date: 22 September 2017
Accepted Date: 22 November 2017
Please cite this article as: P. Panja, R. Velasco, M. Pathak, M. Deo, Application of artificial intelligence to
forecast hydrocarbon production from shales, Petroleum (2017), doi: 10.1016/j.petlm.2017.11.003.
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to
our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and all
legal disclaimers that apply to the journal pertain.
Application of Artificial Intelligence to Forecast Hydrocarbon Production from Shales
Palash Panja1*, Raul Velasco1, Manas Pathak2, Milind Deo2
1 Energy & Geoscience Institute, 432 Wakara Way, Suite 300, Salt Lake City, UT 84108
2 Department of Chemical Engineering, University of Utah, 50 Central Campus Dr., Salt Lake City, UT 84112
*Corresponding author: ppanja@egi.utah.edu
Abstract
Artificial intelligence (AI) methods and applications have recently gained a great deal of attention in many areas, including mathematics, neuroscience, economics, engineering, linguistics, and gaming. This is due to the surge of innovative and sophisticated AI techniques applied to highly complex problems, as well as powerful new developments in high-speed computing. Applications of AI in everyday life include machine learning, pattern recognition, robotics, and data processing and analysis. The oil and gas industry is not behind either: AI techniques have recently been applied to estimate PVT properties, optimize production, predict recoverable hydrocarbons, optimize well placement using pattern recognition, optimize hydraulic fracture design, and aid reservoir characterization efforts. In this study, three different surrogate models are trained and used to forecast hydrocarbon production from hydraulically fractured wells. Two widely used artificial intelligence methods, namely the Least Square Support Vector Machine (LSSVM) and Artificial Neural Networks (ANN), are compared to a traditional curve-fitting method known as the Response Surface Model (RSM), which uses second-order polynomial equations, to determine production from shales. The objective of this work is to further explore the potential of AI in the oil and gas industry. Eight parameters are considered as input factors to build the models: reservoir permeability, initial dissolved gas-oil ratio, rock compressibility, gas relative permeability, slope of gas-oil ratio, initial reservoir pressure, flowing bottom hole pressure, and hydraulic fracture spacing. The ranges of values used for these parameters resemble real field scenarios from prolific shale plays such as the Eagle Ford, Bakken, and Niobrara in the United States. Production data consist of oil recovery factor and produced gas-oil ratio (GOR) generated from a generic hydraulically fractured reservoir model using a commercial simulator. The Box-Behnken experimental design was used to minimize the number of simulations for this study. Five time-based models (for production periods of 90 days, 1 year, 5 years, 10 years, and 15 years) and one rate-based model (when the oil rate drops to 5 bbl/day/fracture) were considered. A Particle Swarm Optimization (PSO) routine is used in all three surrogate models to obtain the associated model parameters. Models were trained using 80% of all data generated through simulation, while 20% was used for testing. All models were evaluated by measuring the goodness of fit through the coefficient of determination (R²) and the Normalized Root Mean Square Error (NRMSE). Results show that RSM and LSSVM have very accurate oil recovery forecasting capabilities, while LSSVM shows the best performance for complex GOR behavior. Furthermore, all surrogate models are shown to serve as reliable proxy reservoir models useful for fast fluid recovery forecasts and sensitivity analyses.
Keywords: Surrogate models; LSSVM; ANN; Oil recovery; Artificial intelligence; Unconventional reservoirs
1. Introduction
Surrogate models are particularly useful for quick predictions given a range of input parameters. These models are used to forecast oil production and perform sensitivity and uncertainty analyses. Polynomial equations and other non-linear equations known as response surface models (RSM) have been popularized for their simple mathematical structure and ease of implementation. Recently, artificial intelligence applications have gained the interest of engineers and scientists due to their unconventional ways of connecting input data to output. RSM coupled with a proper design of experiments [1] was proven to be an efficient and fast proxy model for forecasting production performance and analyzing uncertainties [2]. Oil rate and water cut were also predicted using RSM [3]. Response surface models are widely applied to various aspects of reservoir engineering, including estimating initial hydrocarbon uncertainty [4] and production uncertainty [5-10], finding an optimal scheme for well placement [7, 11-14], history matching [13, 15, 16], and determining the dew point of water in natural gas processing units [17]. Field cases have been studied using pattern recognition techniques [18] to determine pressure and production variation according to well locations.
Even though researchers have developed numerical, analytical, and semi-analytical techniques to understand the physics underlying production from hydraulically fractured tight formations [19-22], many of these systems grow in complexity, rendering most of these methods inapplicable. The AI approach, on the other hand, is very useful when dealing with highly complex systems. At the cost of understanding the physical mechanisms taking place in tight formations, AI helps us analyze and forecast hydrocarbon production and assess performance.
In this study, two of the most common AI techniques, namely ANN and LSSVM, as well as a second-order polynomial RSM are used to predict oil recovery and produced gas-oil ratio from a hydraulically fractured low-permeability reservoir. The comparison of these three methods in terms of performance and accuracy is also discussed. The application of ANN started before that of LSSVM: in the early 1990s, data from well tests were already being interpreted using ANN [23, 24]. Rock characteristics such as lithology were determined from well logs using fuzzy neural networks [25]. Reservoir heterogeneity with respect to porosity, permeability, and oil saturation was characterized from geophysical well logs such as gamma ray, bulk density, and deep induction using ANN [26]. Thermodynamic properties of reservoir fluids such as bubble point pressures and formation volume factors at the bubble point have been predicted from four inputs (solution GOR, reservoir temperature, oil gravity, and gas density) using ANN, SVM, and non-linear regression [27]. Similarly, crude oil viscosity and solution GOR as functions of pressure have been determined using ANN from 12 variables, including oil composition, bubble point pressure, and bubble point viscosity [28]. Calculations of gas condensate dew point pressures were also made using gas composition, temperature, and heavy fraction properties [29-31] and condensate-to-gas ratio [32]. ANN predictions of asphaltene precipitation [33] compared well with experimental studies [34]. Oil rates in pipelines have also been estimated using ANN for varying pressures and temperatures [35]. Applications of LSSVM include porosity and permeability determination [36-39], water coning in horizontal wells [40, 41], well placement [40], gas-oil relative permeability curves [42], phase equilibrium calculations of hydrates [43], oil flow rate predictions [44], and temperature-pressure relationships in natural gas production and processing [45]. Wide applications of artificial intelligence in improved oil recovery were recently described by researchers [46-49]. Other applications include the description of CO2 solubility [50] and calcium carbonate [51] in brine sequestration processes.
Eight important parameters are considered as input data. They include geological parameters (initial reservoir pressure, rock compressibility, and permeability), an operational parameter (bottom hole pressure), a completion parameter (fracture spacing), a rock-fluid property (Corey gas relative permeability exponent), and fluid properties (initial solution gas-oil ratio and the linear slope of solution gas-oil ratio versus pressure). These parameters were selected from a previous mechanistic study [52] that revealed them to be highly significant. Six models (five time-based models for production after 90 days, 1 year, 5 years, 10 years, and 15 years, and one rate-based model for when the oil rate drops to 5 bbl/day/fracture) of oil recovery and produced GOR are developed for each surrogate model (RSM, ANN, and LSSVM). Data are generated from a generic reservoir model with one vertical hydraulic fracture placed in the middle of the reservoir using a commercial reservoir simulator. The mathematical formulations and workflow used to create these surrogate models are discussed in this article. The results obtained for all models are compared using error analyses in terms of the coefficient of determination (R²) and the normalized root mean square error (NRMSE).
2. Reservoir Model
Unconventional reservoirs such as shales and other tight formations are very complex in terms of possible natural fracture presence and heterogeneity. However, it is possible to build a homogeneous reservoir model using average properties if the variation is not very large. Typically, wells are drilled vertically and then directed horizontally for 1 to 2 miles, where as many as 100 vertical hydraulic fractures are induced to generate highly conductive flow paths to the wellbore. Simulating an entire reservoir model that consists of 100 hydraulic fractures is computationally very expensive. Hence, only a small representative portion of the reservoir is simulated, in which production from a single vertical fracture is considered. The reservoir properties are assumed to be homogeneous, as listed in Table 1.
Table 1: Simulation parameters

Parameter                      Value
Reservoir top depth (ft)       12000
Reservoir thickness (ft)       200
Reservoir width (ft)           750
Fracture width (ft)            0.05
Fracture height (ft)           200
Fracture half-length (ft)      375
Fracture orientation           Parallel to YZ plane
Reservoir porosity (%)         5
Initial water saturation (%)   16

The number of unique input parameter combinations could lead to an enormous number of experiments or simulations. The Box-Behnken method [53] is chosen in this study to keep the required number of simulations to a minimum. This experimental design is also suitable for second-order response surface models. Using the Box-Behnken design, 114 simulations are run for eight input parameters at three levels (minimum, medium, and maximum), as shown in Table 2. The values of all input parameters are converted to a scale of -1, 0, and +1 using a linear relationship, except for matrix permeability and rock compressibility, for which the logarithms of the values are scaled instead.
Table 2: Input parameters and their values

No.  Variable                               Symbol  Minimum (-1)  Medium (0)  Maximum (+1)
1    Matrix Permeability (nD)               X1      10            225         5000
2    Gas Rel. Permeability Exponent, ng     X2      1             2           3
3    Rock Compressibility (1/psi)           X3      4x10^-6       4x10^-5     4x10^-4
4    dRs/dp (SCF/STB/psi)                   X4      0.50          0.65        0.80
5    Initial Gas Oil Ratio, Rsi (scf/stb)   X5      800           1900        3000
6    Initial Pressure, Pi (psi)             X6      4000          5250        6500
7    BHP (psi)                              X7      500           1000        1500
8    Fracture Spacing (ft)                  X8      60            180         300

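The coding of inputs to the -1, 0, +1 scale can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dictionary keys and the function name `to_coded` are hypothetical, the ranges are taken from Table 2, and base-10 logarithms are assumed for the two log-scaled inputs (any base gives the same coded value).

```python
import math

# (minimum, maximum, use_log) for each input, taken from Table 2.
# The key names are illustrative and do not come from the paper.
RANGES = {
    "permeability_nD": (10.0, 5000.0, True),
    "gas_rel_perm_exponent": (1.0, 3.0, False),
    "rock_compressibility_per_psi": (4e-6, 4e-4, True),
    "dRs_dp": (0.50, 0.80, False),
    "Rsi_scf_per_stb": (800.0, 3000.0, False),
    "initial_pressure_psi": (4000.0, 6500.0, False),
    "bhp_psi": (500.0, 1500.0, False),
    "fracture_spacing_ft": (60.0, 300.0, False),
}


def to_coded(name, value):
    """Map a physical value to the coded [-1, +1] scale."""
    lo, hi, use_log = RANGES[name]
    if use_log:
        # Permeability and compressibility are scaled on their logarithms.
        lo, hi, value = math.log10(lo), math.log10(hi), math.log10(value)
    return 2.0 * (value - lo) / (hi - lo) - 1.0


coded_mid_perm = to_coded("permeability_nD", 225.0)
coded_mid_bhp = to_coded("bhp_psi", 1000.0)
```

On the log scale, the medium permeability of 225 nD lands very close to the coded value 0, which is consistent with the three levels listed in Table 2.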
Apart from the 114 simulation results that were used to train the models, 30 additional simulations were run to test the models. Therefore, the training data comprise approximately 80% of the total data set (114 out of 144), while the testing portion comprises approximately 20% (30 out of 144). The list of simulations used to train and test the models can be found in the Appendix (Tables A.3 and A.4). IMEX™ from the Computer Modeling Group, Calgary, Canada was used to conduct all black-oil simulations. The minimum number of simulation grid blocks necessary to obtain accurate results and avoid convergence issues was used, as prescribed by Panja et al. [54].
3. Surrogate Model
As mentioned earlier, three types of surrogate models, namely a Response Surface Model (RSM), a Least Square Support Vector Machine (LSSVM) model, and an Artificial Neural Network (ANN) model, were developed and compared in this study. Simulation results in terms of oil recovery and gas-oil ratio (GOR) were obtained in two ways: by recording oil recovery and GOR after certain production times, and by recording them when the oil rate dropped to 5 bbl/day/fracture. In other words, five time-based models and one rate-based model were developed, as summarized below:

• Time-based models: models for oil recovery and GOR at 90 days, 1 year, 5 years, 10 years, and 15 years.
• Rate-based model: model for oil recovery and GOR when the oil rate drops to 5 bbl/day/fracture.

All unknown parameters in the surrogate models (RSM, LSSVM, and ANN) are obtained using an optimization routine known as Particle Swarm Optimization (PSO), implemented in Matlab (MathWorks® Inc.). The same optimization routine was used for all surrogate models to eliminate any performance bias. Surrogate models sometimes return physically unacceptable values such as negative oil recovery factors or gas-oil ratios. To avoid this pitfall, the logarithms of the outcomes (recovery factors and gas-oil ratios) are used to build the models, so that back-transformed predictions are always positive. A simplified schematic of the methodology used to develop the surrogate models is shown in Figure 1.
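The benefit of modeling the logarithm of the response can be illustrated with a toy example. The sketch below uses made-up data and a simple straight-line fit (not one of the paper's surrogate models); the helper `fit_line` is hypothetical. A model fitted directly to a positive response can extrapolate to a negative value, while a model fitted to the logarithm is always positive after back-transformation.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


xs = [0.0, 1.0, 2.0, 3.0]
ys = [8.0, 4.0, 2.0, 1.0]  # made-up, strictly positive response

# Fit the response directly, then predict at x = 4.
slope, intercept = fit_line(xs, ys)
direct_pred = slope * 4.0 + intercept  # negative: physically unacceptable

# Fit the logarithm of the response, then back-transform the prediction.
log_slope, log_intercept = fit_line(xs, [math.log(y) for y in ys])
log_pred = math.exp(log_slope * 4.0 + log_intercept)  # always positive
```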
Figure 1: Surrogate model development schematic
All unknown parameters are listed in Table 3. These parameters are discussed in more detail in the upcoming sections.
Table 3: Number of parameters determined in each surrogate model

Method  Number of parameters                                        Optimized parameters
RSM     1 intercept                                                 All
        44 coefficients
LSSVM   1 bias term (b)                                             Regularization parameter (γ)
        1 regularization parameter (γ)                              Kernel parameter (σ²)
        1 kernel parameter (σ²)
        92 support values (αi)
ANN     15 biases (14 hidden layer + 1 output layer)                All
        126 weights (8x14 for hidden layer + 14 for output layer)

The first two models (i.e. RSM and LSSVM) are discussed in detail in a previous article [55].
Therefore, these two models are intentionally discussed in brief here and the reader is referred to
the referenced article for more details.

3.1. Response Surface Model (RSM)
The response surface model is one of the most common methods used in many branches of engineering. An algebraic equation is fitted to develop a relationship between input and output data. While fitting the equation to the training data, the parameters (coefficients, intercept, etc.) are determined through an optimization routine that minimizes the error. A second-order polynomial equation is chosen in this study. The equation is defined as:
$$y(\mathbf{x}) = a_0 + \sum_{k=1}^{8} a_k x_k + \sum_{i=1}^{8}\sum_{j=i}^{8} a_{ij} x_i x_j \qquad (1)$$
For 8 input variables, there are 8 linear coefficients, ak, 36 second-order coefficients, aij (8 quadratic terms and 28 pairwise interaction terms), and one intercept, a0, as shown in Equation 1. A workflow to develop the surrogate models (RSM and LSSVM) is shown in Figure 2.
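The term count of the second-order polynomial can be checked with a short sketch. The function names and the ordering of the coefficient vector below are illustrative assumptions, not the authors' code; the inputs are assumed to be already coded to the -1 to +1 scale.

```python
from itertools import combinations_with_replacement

N_INPUTS = 8


def quadratic_terms(x):
    """Return [1, x_k ..., x_i * x_j for i <= j] for one coded input vector."""
    terms = [1.0]                      # multiplies the intercept a0
    terms.extend(x)                    # 8 linear terms
    terms.extend(                      # 36 second-order terms (i <= j)
        x[i] * x[j]
        for i, j in combinations_with_replacement(range(len(x)), 2)
    )
    return terms


def rsm_predict(coeffs, x):
    """Evaluate the second-order polynomial for one input vector."""
    terms = quadratic_terms(x)
    if len(coeffs) != len(terms):
        raise ValueError("expected %d coefficients" % len(terms))
    return sum(c * t for c, t in zip(coeffs, terms))


# 1 intercept + 8 linear + 36 second-order = 45 unknowns, as in Table 3.
n_terms = len(quadratic_terms([0.0] * N_INPUTS))
```

Because the model is linear in its coefficients, an ordinary least-squares solve would also work here; the paper uses PSO for all three surrogate models so that the optimizer is the same in every case.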
Figure 2: Workflow used to develop RSM and LSSVM. Modified from Panja et al. [55]
As part of the development of a model, validation is performed using test data to assess robustness. An acceptable error margin is set for the surrogate model, and the model is iteratively improved until the error falls within this limit.
3.2. Least Square Support Vector Machine (LSSVM)
The Support Vector Machine (SVM) is usually used for classification and regression analysis. A modified form of SVM, namely the least square support vector machine (LSSVM), is used in this study. The LSSVM formulation is close to that of SVM but solves a linear system instead of a quadratic programming (QP) problem. It has been widely applied in various fields because it is easier to implement and converges quickly. On the other hand, LSSVM has an inherent tendency to overfit when minimizing error. Various splits of the data into training and testing sets, such as 90-10%, 85-15%, 80-20%, and 70-30%, were tried. Eventually, a data set with 80% used for training and 20% used for testing yielded the best prediction capabilities in this study. The same split was used for the other two surrogate models (RSM and ANN).
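The input and output relationship in LSSVM is given by Equation 2:

$$y(\mathbf{x}) = \mathbf{w}^{T}\varphi(\mathbf{x}) + b \qquad (2)$$

The final form of LSSVM is given by Equation 3:

$$\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \boldsymbol{\Omega} + \gamma^{-1}\mathbf{I} \end{bmatrix} \begin{bmatrix} b \\ \boldsymbol{\alpha} \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{y} \end{bmatrix}, \qquad \Omega_{ij} = K(\mathbf{x}_i, \mathbf{x}_j) \qquad (3)$$

K(x, xi) is known as the kernel function, which is chosen a priori. The Radial Basis Function (RBF) kernel is used in this study, as shown in Equation 4:

$$K(\mathbf{x}, \mathbf{x}_i) = \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{x}_i \rVert^{2}}{\sigma^{2}}\right) \qquad (4)$$

Where,
xi: Input vector of ith data
b: Bias term
γ: Regularization parameter
σ²: Kernel parameter
αi: Support values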
It is evident from Equations 2, 3, and 4 that if the regularization parameter, γ, and the kernel parameter, σ², are provided, the bias term and all support values can be determined from a linear system. The values of γ and σ² are found using an optimization technique in which initial values are guessed and then iteratively improved, as described in Figure 2. For the optimization, the training data is further divided into LSSVM training data (80%) and optimization data (20%), as shown in Figure 3.
Figure 3: Division of total data set into LSSVM training, optimization, and test data
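The data division can be checked with simple arithmetic. In the sketch below, rounding the 80% LSSVM training share up to a whole number of runs is an assumption; it is chosen because it reproduces the 92 support values listed in Table 3. The variable names are illustrative.

```python
import math

total_runs = 144   # all simulations
n_train = 114      # Box-Behnken runs used for training
n_test = total_runs - n_train

# The training set is split again for LSSVM: 80% to solve the linear
# system, 20% to tune the regularization and kernel parameters with PSO.
# Rounding up is an assumption that matches the 92 support values in Table 3.
n_lssvm_train = math.ceil(0.8 * n_train)
n_optimization = n_train - n_lssvm_train

train_fraction = n_train / total_runs   # about 0.79
test_fraction = n_test / total_runs     # about 0.21
```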
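LSSVM training is complete once all parameters in Table 3 are found. At this point, the model can be applied to any unknown input vector using the RBF kernel, as shown in Equation 5, where the sum runs over the N LSSVM training data points:

$$y(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{x}_i \rVert^{2}}{\sigma^{2}}\right) + b \qquad (5)$$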
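For fixed values of the regularization and kernel parameters, the LSSVM fit reduces to one linear solve followed by a kernel expansion. The sketch below is a minimal pure-Python illustration under stated assumptions: it uses the standard LSSVM linear system with the RBF kernel written as exp(-||x - xi||²/σ²), a small hand-written Gaussian elimination instead of a linear algebra library, and made-up one-dimensional data. The function names are hypothetical and this is not the authors' Matlab code.

```python
import math


def rbf(a, b, sigma2):
    """RBF kernel between two input vectors (sigma2 is the kernel parameter)."""
    dist2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-dist2 / sigma2)


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def lssvm_fit(X, y, gamma, sigma2):
    """Return (b, alpha) from the LSSVM linear system for fixed gamma, sigma2."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [
            rbf(X[i], X[j], sigma2) + (1.0 / gamma if i == j else 0.0)
            for j in range(n)
        ])
    solution = solve(A, [0.0] + list(y))
    return solution[0], solution[1:]


def lssvm_predict(x, X, b, alpha, sigma2):
    """Kernel expansion over the training points plus the bias term."""
    return b + sum(a * rbf(x, xi, sigma2) for a, xi in zip(alpha, X))


# Made-up data: four one-dimensional points.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 4.0, 9.0]
b, alpha = lssvm_fit(X, y, gamma=1e6, sigma2=1.0)
```

With a very large regularization parameter, the fitted model nearly interpolates the training points, which illustrates the overfitting tendency noted earlier. Holding out optimization data when tuning the two hyperparameters guards against this.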
3.3. Artificial Neural Networks (ANNs)

The Artificial Neural Network (ANN) algorithm was developed based on human learning processes through brain and nerve networks. It is a connectionist technique in which input and output are linked through neurons. The most common feed-forward architecture consists of one input layer, one or more hidden layers, and one output layer, as shown in Figure 4.
Figure 4: Basic structure of ANN with input, hidden, and output layers.
Links between input and output are established through the internal computations in the hidden layers. The individual components of a layer are known as nodes, and the complexity and non-linearity of the model increase with the number of hidden layers. In this study, there are nine input nodes (eight input parameters as listed in Table 2 and one bias node) connected to a single hidden layer. A weight is assigned to each connection between nodes, and a bias term is added to each hidden and output node. The numbers of biases and weights used in this study are summarized for one output in Table 3. In the process of training the ANN model, all weights and biases are determined by minimizing the error between the predicted output and the training output, with an activation function applied at each node. All output data is normalized as shown in Equation 6:

$$Y_{norm} = \frac{Y - \min(Y)}{\max(Y) - \min(Y)} \qquad (6)$$

Computations in hidden nodes and output nodes are shown in Figure 5.

Figure 5: Input-to-output structure and calculations inside (a) hidden and (b) output nodes
As shown in Figure 5, the computation in each node consists of two steps: summation and transformation through an activation function, which may be linear or non-linear. In this study, the sigmoid transfer function was used, as shown in Equation 7:

$$f(x) = \frac{1}{1 + e^{-x}} \qquad (7)$$
The sensitivity of the model to the number of hidden nodes (neurons) was also investigated. The non-linearity of the relationship between input and output data increases with the number of hidden nodes. However, increasing non-linearity does not always guarantee higher prediction accuracy. To find the optimum number of hidden nodes (neurons), a sensitivity study was conducted on the training and testing data for oil recovery and gas-oil ratio after 5 years of production, as shown in Figure 6.

Figure 6: Coefficients of determination using different numbers of hidden nodes for (a) oil recovery and (b) gas-oil ratio at 5 years.
It is evident from Figure 6 that R² is close to unity for the training data. On the other hand, the R² value for the test data initially increases with the number of neurons for both oil recovery and gas-oil ratio. A maximum R² value can be clearly identified at 14 neurons for the case of oil recovery. Therefore, 14 hidden neurons are used in this study. The ANN parameters used in this study are summarized below:

• Number of neurons in the first layer (hidden layer) = 14
• Number of neurons in the second layer (output layer) = 1
• Number of weights = Number of neurons in the first layer * Number of inputs + Number of neurons in the first layer * Number of neurons in the second layer = 14 * 8 + 14 * 1 = 126
• Number of biases = Number of neurons in the first layer + Number of neurons in the second layer = 14 + 1 = 15
The unknown parameters in the ANN structure are summarized in Table 3. During training of the
ANN, these 126 weights and 15 biases are determined using an optimization routine, namely the
Particle Swarm Optimization (PSO) which is discussed in the upcoming sections.
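The 8-14-1 architecture and its parameter count can be sketched as a forward pass over a flat parameter vector, which is the form an optimizer such as PSO would search. This is an illustrative sketch only: the ordering of the parameters inside the flat vector and the use of the sigmoid at the output node are assumptions, and the function names are hypothetical.

```python
import math

N_IN, N_HID, N_OUT = 8, 14, 1

N_WEIGHTS = N_HID * N_IN + N_HID * N_OUT   # 14*8 + 14*1
N_BIASES = N_HID + N_OUT                   # 14 + 1
N_PARAMS = N_WEIGHTS + N_BIASES            # dimension of the search space


def sigmoid(z):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def unpack(params):
    """Split a flat parameter vector into weights and biases.

    The ordering (hidden weights, hidden biases, output weights,
    output bias) is an arbitrary choice made for this sketch.
    """
    it = iter(params)
    w_hid = [[next(it) for _ in range(N_IN)] for _ in range(N_HID)]
    b_hid = [next(it) for _ in range(N_HID)]
    w_out = [next(it) for _ in range(N_HID)]
    b_out = next(it)
    return w_hid, b_hid, w_out, b_out


def forward(params, x):
    """Summation followed by sigmoid transformation in each node."""
    w_hid, b_hid, w_out, b_out = unpack(params)
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hid, b_hid)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


# With all parameters at zero, every node outputs sigmoid(0) = 0.5.
y_zero = forward([0.0] * N_PARAMS, [0.0] * N_IN)
```

Because the sigmoid output lies between 0 and 1, the network output would be compared with the normalized response, and predictions would be mapped back to physical units by inverting the normalization.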
3.4. Goodness of Fit
Various error measures are used across science and engineering, and the choice depends mostly on the model and the purpose of the system. During the fitting portion of the model (training the model), the Mean Square Error (MSE) is set as the objective function used by PSO to determine the optimized model parameters. Because a small MSE indicates a good match between experimental or simulated values and modeled values, MSE is minimized during optimization in PSO. The MSE is calculated between experimental or simulated values and modeled values as shown in Equation 8:

$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(Y_{obs,i} - Y_{model,i}\right)^{2} \qquad (8)$$

Yobs and Ymodel are the simulated and modeled values, respectively, and n is the number of data sets.
In this study, the Normalized Root Mean Square Error (NRMSE) and the coefficient of determination (R²) are adopted to measure the discrepancy between simulated data and model data. The NRMSE is used instead of MSE so that the various models (time-based and rate-based) can be compared on the same scale. The coefficient of determination, R², is defined as shown in Equation 9:

$$R^{2} = 1 - \frac{SS_{res}}{SS_{tot}} \qquad (9)$$

Where,
$SS_{res} = \sum_{i=1}^{n}\left(Y_{obs,i} - Y_{model,i}\right)^{2}$, the residual sum of squares
$SS_{tot} = \sum_{i=1}^{n}\left(Y_{obs,i} - \bar{Y}_{obs}\right)^{2}$, the total sum of squares
$\bar{Y}_{obs} = \frac{1}{n}\sum_{i=1}^{n} Y_{obs,i}$, the mean of observed values

The NRMSE is defined in Equation 10:

$$NRMSE = \frac{\sqrt{MSE}}{Y_{obs,max} - Y_{obs,min}} \qquad (10)$$

Where Yobs,max is the maximum value and Yobs,min is the minimum value of the observed data. The value of R² varies from 0 to 1. R² values close to unity and small NRMSE values indicate a good fit.
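The three error measures can be written as short functions. The sketch below follows the definitions of MSE, R², and NRMSE given in this section; the function names and the sample values are made up for illustration.

```python
import math


def mse(obs, model):
    """Mean square error between observed and modeled values."""
    return sum((o - m) ** 2 for o, m in zip(obs, model)) / len(obs)


def r_squared(obs, model):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - m) ** 2 for o, m in zip(obs, model))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def nrmse(obs, model):
    """Root mean square error normalized by the range of observed values."""
    return math.sqrt(mse(obs, model)) / (max(obs) - min(obs))


# Made-up observed and modeled values.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_model = [1.1, 1.9, 3.2, 3.8]

example_mse = mse(y_obs, y_model)
example_r2 = r_squared(y_obs, y_model)
example_nrmse = nrmse(y_obs, y_model)
```

Normalizing by the observed range makes the error dimensionless, which is what allows models for different production times to be compared on one scale.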
4. Optimization Routine: Particle Swarm Optimization
Inspired by the motion of bird flocks, the Particle Swarm Optimization (PSO) routine was developed by Eberhart and Kennedy [56]. In this method, each potential solution is treated as a particle. Each particle is characterized by its position and velocity. The position of a particle is defined in a hyperspace whose dimension is equal to the number of unknown parameters being optimized, as shown in Table 3. For example, in the case of ANN, particles fly in a 141-dimensional hyperspace. Several particles are initially defined in the hyperspace, where they iteratively change their positions to find the optimum position. The fitness of a particle is determined by a fitness function such as the MSE. This algorithm is similar to the method followed by a group of birds searching for food in a vast area.
Two solutions, pbest and gbest, are tracked at every iteration during execution of the algorithm. The local best, pbest, is the best position that a particle has found so far in the hyperspace, as determined by the fitness value. The global best, gbest, is the best position found so far by any particle in the population. At each iteration step, the velocity is updated first and then the position. The particle is accelerated towards its pbest and gbest by updating its velocity with two separate random numbers (random1 and random2), as shown in Equation 11:

$$v_i^{k+1} = w\,v_i^{k} + C_1\,random_1\,\left(p_{best,i} - x_i^{k}\right) + C_2\,random_2\,\left(g_{best} - x_i^{k}\right) \qquad (11)$$

The cognitive component guides the local search using the particle's local best (pbest), and the social component is responsible for the global search based on the population best (gbest) [57]. In Equation 11, $v_i^{k+1}$ is the velocity in the next iteration step, which is partially preserved from the current velocity, $v_i^{k}$, by an inertia weight, w (range 0.4 to 0.9). The acceleration coefficients (C1 and C2) for the cognitive and social components are chosen by trial and error. The position of a particle for the next iteration step is updated by the following equation:

$$x_i^{k+1} = x_i^{k} + v_i^{k+1} \qquad (12)$$

The values of w, C1, C2, and other parameters used in this study are given in Table 4.
Table 4: Particle swarm optimization parameters in various surrogate models

Surrogate model  Parameters optimized              Number of particles for  C1  C2  w    Maximum
                                                   a single parameter                    iterations
RSM              1 intercept, 44 coefficients      100                      2   2   0.6  1000
LSSVM            1 regularization parameter (γ),   100                      2   2   0.6  1000
                 1 kernel parameter (σ²)
ANN              15 biases, 126 weights            100                      2   2   0.6  1000

The initial position and velocity of each particle are randomly distributed. After the positions and velocities of all particles are initialized, fitness is calculated. In subsequent steps, positions and velocities are updated iteratively using the local best and global best, as summarized in the flowchart shown in Figure 7 [58].
Figure 7: Particle Swarm Optimization flow chart. Modified from Ahmadi et al. [58]
The entire flowchart can be divided into four parts, namely initialization, fitness evaluation, condition check, and updates of velocity and position. Acceptance of any particle as a potential solution is determined by its fitness value, which is calculated in each iteration step. As described earlier, a local best, pbest, for each particle and one global best, gbest, are recorded during each iteration. The number of iterations is limited only by time and computational constraints; hence, the maximum iteration number is defined by the user.
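The four parts of the flowchart can be sketched as a compact PSO loop. This is a minimal illustration rather than the authors' Matlab routine. The default values of w, C1, C2, the number of particles, and the iteration count follow Table 4, while the search bounds, the velocity clamp, the per-dimension random numbers, the fixed random seed, and the sphere test function are assumptions added for this sketch.

```python
import random


def pso(fitness, dim, n_particles=100, n_iter=1000,
        w=0.6, c1=2.0, c2=2.0, bounds=(-1.0, 1.0), seed=0):
    """Minimize `fitness` over a dim-dimensional space with basic PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    vmax = hi - lo  # velocity clamp: an assumption, not stated in the paper

    # Initialization: random positions and velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[rng.uniform(-vmax, vmax) for _ in range(dim)] for _ in range(n_particles)]

    # Fitness evaluation of the initial swarm.
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    history = [gbest_f]

    # Condition check: stop after the user-defined number of iterations.
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive + social components.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-vmax, min(vmax, v))
                # Position update.
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
        history.append(gbest_f)
    return gbest, gbest_f, history


def sphere(p):
    """Toy fitness function with its minimum of 0 at the origin."""
    return sum(v * v for v in p)


best, best_f, history = pso(sphere, dim=2, n_particles=30, n_iter=200,
                            bounds=(-5.0, 5.0))
```

To train a surrogate model, `fitness` would return the MSE of the model for a given parameter vector, and `dim` would be the number of optimized parameters (for example, 141 for the ANN).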

5. Results and Discussion
Five time-based models (90 days, 1 year, 5 years, 10 years, and 15 years) and one rate-based model (5 bbl/day/fracture) were trained using RSM, LSSVM, and ANN. The objective of this study is to compare the performance of the three surrogate models. Production performance in terms of oil recovery and gas-oil ratio is compared with simulation data. Since all time-based models behave similarly, only two time-based models (one early production model after 90 days and one long-term production model after 10 years) along with the rate-based model are discussed here. The fitness of a model with training data and with test data are both discussed in this section. Once a model is trained, it is tested against an unseen data set (i.e., test data) to check the robustness of its forecasting capabilities. As discussed earlier, the fitness is determined by two measures, the coefficient of determination (R²) and the normalized root mean square error (NRMSE), which are used here to compare the different surrogate models.
5.1. Training
In this section, model fitness as compared with training data is evaluated and discussed. It is important to assess individual models to check for overfitting. The three surrogate models (RSM, LSSVM, and ANN) for oil recovery and gas-oil ratio after 90 days of production, after 10 years of production, and at the terminal rate (5 bbl/day/fracture) are shown in Figure 8.

Figure 8: Comparison of RSM, ANN and LSSVM models using training data for (a) oil recovery after 90 days, (b) oil recovery after 10 years, (c) oil recovery after the oil rate drops to 5 bbl/day/fracture, (d) gas-oil ratio after 90 days, (e) gas-oil ratio after 10 years, and (f) gas-oil ratio after the oil rate drops to 5 bbl/day/fracture.
High model fitness with the training data can be observed for all cases in Figure 8. However, capturing production behavior with surrogate models that do not account for the underlying physics is a great challenge. It is difficult to model the GOR from low-permeability reservoirs [59] due to its complex behavior. Higher GOR values (obtained from the 10-year model) occur mainly when flow becomes boundary dominated. On the other hand, lower GOR values (obtained from the 90-day and 1-year models) occur when flow is in the transient linear regime. Overall, both oil recovery and GOR from the surrogate models are in good agreement with simulation results. The errors are calculated in terms of R² and NRMSE, as listed in Tables A.1 and A.2. For visual comparison, R² and NRMSE for the RSM, ANN, and LSSVM models of oil recovery are shown in Figures 9a and 9b, respectively.

Figure 9: Fitness of RSM, ANN and LSSVM models of oil recovery for training data: (a) coefficient of determination, R², and (b) NRMSE.
For oil recovery, R² is greater than 0.95 and NRMSE is less than 6% for RSM, LSSVM, and ANN in all time-based and rate-based models. These values are evidence of well-trained models.
5.2. Testing
Some models may tend to overfit the training data and consequently fail to predict unseen test data with high accuracy. In this study, 20 percent of all data was used to check the forecasting capabilities of all developed models. Results for oil recovery and GOR after 90 days of production, after 10 years of production, and at the terminal oil rate (5 bbl/day/fracture) are shown in Figure 10.
Figure 10: Comparison of RSM, ANN and LSSVM models using test data for (a) oil recovery after 90 days, (b) oil recovery after 10 years, (c) oil recovery after the oil rate drops to 5 bbl/day/fracture, (d) gas-oil ratio after 90 days, (e) gas-oil ratio after 10 years, and (f) gas-oil ratio after the oil rate drops to 5 bbl/day/fracture.
Although all models had high fitness with the training data, they showed relatively lower fitness with the test data. Considering that the test data was not used during training, the models show promising forecasting capabilities without significant aberrations. R² and NRMSE are calculated as listed in Tables A.1 and A.2. The R² and NRMSE values for the RSM, ANN, and LSSVM oil recovery models are shown in Figures 11a and 11b, respectively.
Figure 11: Fitness of RSM, ANN and LSSVM oil recovery models for testing data: (a) coefficient of determination, R², and (b) NRMSE.
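With a few exceptions, the forecast accuracy of all models is within acceptable ranges. As
shown in Figures 10 and 11, RSM predicts oil recovery most accurately, followed by LSSVM.
According to Tables A.1 and A.2, RSM has the best test-data fitness in five of the six oil
recovery models; the exception is the 5-year model, where LSSVM performs best.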
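The ranking can be checked case by case against the test NRMSE values in Table A.2. The following sketch picks, for each oil recovery output model, the surrogate with the lowest test NRMSE and counts how often each surrogate comes first. The numbers are copied from Table A.2; the data layout and the names `best` and `wins` are assumptions made for illustration.

```python
from collections import Counter

# Oil recovery test NRMSE (%) from Table A.2, per output model
nrmse_test = {
    "90 days":    {"RSM": 16.5, "LSSVM": 20.3, "ANN": 20.7},
    "1 year":     {"RSM": 12.4, "LSSVM": 14.7, "ANN": 18.1},
    "5 years":    {"RSM": 16.1, "LSSVM": 11.5, "ANN": 16.7},
    "10 years":   {"RSM": 7.9,  "LSSVM": 8.5,  "ANN": 14.0},
    "15 years":   {"RSM": 4.9,  "LSSVM": 7.3,  "ANN": 10.8},
    "Rate based": {"RSM": 20.7, "LSSVM": 21.2, "ANN": 22.6},
}

# For each output model, the surrogate with the lowest test NRMSE
best = {case: min(errors, key=errors.get) for case, errors in nrmse_test.items()}
wins = Counter(best.values())
print(best)
print(wins)
```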
As shown in Figures 10 and 11, AI tools have the potential to predict oil recoveries and fluid
ratios given a small training data set. Large amounts of completion, geological, and production
data can be used to train more robust models that complement current conventional tools for
evaluating the potential of tight oil reservoirs. The cost, however, is that AI skips the
physical description and understanding of the multiphase production mechanisms in tight
formations. This cost may not be too high to pay, since the current conventional understanding
of these systems may not yet be sufficiently developed. In fact, researchers have recently
reported discrepancies between conventional thinking and the behavior of fluids under
nanoconfinement in tight formations [60, 61].
6. Conclusion
Artificial intelligence tools that predict oil recovery and gas-oil ratio from hydraulically
fractured tight formations can be successfully developed using simulation results as a
training framework. In this study, three models were developed based on RSM, ANN, and
LSSVM to predict recovery from wells producing under time-based constraints (90 days, 1 year,
5 years, 10 years, and 15 years) and a rate-based constraint (5 bbl/day/fracture). Eight key
factors, namely matrix permeability, gas relative permeability exponent, rock compressibility,
initial gas-oil ratio, slope of solution gas-oil ratio versus pressure, initial pressure,
flowing bottom-hole pressure, and fracture spacing, were considered as input parameters for
all cases. After all models were trained with the same database, they were used to predict
production for different scenarios. Using simulation as the basis for comparison, all models
were evaluated in terms of their predictive capabilities for oil recovery and producing
gas-oil ratio. It was found that RSM and LSSVM have better predictive capabilities for oil
recovery than ANN. In addition, LSSVM exhibits the highest accuracy with respect to gas-oil
ratio prediction.
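As an illustration of how a response surface surrogate of this kind can be trained on simulation results, the sketch below fits a full second-order polynomial to input-output data. The polynomial form (intercept a0, linear coefficients ak, second-order coefficients aij) follows the symbols in the Nomenclature. The use of NumPy, the ordinary least-squares fit, the function names, and the synthetic data are assumptions made for illustration; the fitting procedure used in the study may differ.

```python
import itertools
import numpy as np

def quadratic_features(X):
    """Columns: 1, x_k, and all products x_i * x_j with i <= j."""
    n_samples, n_inputs = X.shape
    cols = [np.ones(n_samples)]
    cols += [X[:, k] for k in range(n_inputs)]
    cols += [X[:, i] * X[:, j]
             for i, j in itertools.combinations_with_replacement(range(n_inputs), 2)]
    return np.column_stack(cols)

def fit_rsm(X, y):
    """Least-squares estimate of the response surface coefficients."""
    coeffs, *_ = np.linalg.lstsq(quadratic_features(X), y, rcond=None)
    return coeffs

def predict_rsm(coeffs, X):
    """Evaluate the fitted response surface at new input points."""
    return quadratic_features(X) @ coeffs

# Synthetic check with two scaled inputs (hypothetical data, not from the study)
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + 0.5 * X[:, 0] * X[:, 1]
coeffs = fit_rsm(X, y)
print(np.round(coeffs, 3))  # coefficient order: a0, a1, a2, a11, a12, a22
```

With eight inputs, this feature set has 45 columns (1 intercept, 8 linear terms, 36 second-order terms), so inputs are usually scaled to a common range before fitting.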
Field-scale modeling and simulation of hydraulically fractured ultra-low permeability reservoirs
incur very high computational costs in commercial simulators. Surrogate reservoir models, on
the other hand, are useful for quick oil production forecasting and assessment. Additionally,
these models can be used for risk and uncertainty analysis. Overall, artificial intelligence
methods such as LSSVM have promising applications in various aspects of production and
reservoir engineering.
Nomenclature
Symbol      Description                                         Units
γ           Regularization parameter                            -
σ2          Kernel parameter                                    -
αi          Support values                                      Unit of output
Ȳobs        Mean of observed values                             Unit of output
a0          The intercept of the surrogate model                Unit of output
AI          Artificial intelligence                             -
aij         Coefficient of 2nd order interaction of inputs      -
ak          Coefficient of independent input                    -
ANN         Artificial neural networks                          -
b           Bias term                                           Unit of output
BHP         Bottom hole pressure                                psi
C1          Acceleration coefficient for cognitive components
C2          Acceleration coefficient for social components
DOE         Design of experiments
dRs/dp      Slope of gas/oil ratio in PVT                       (SCF/STB)/psi
gbest       Population's best particle's position
GOR         Gas/oil ratio                                       SCF/STB
LSSVM       Least square support vector machine                 -
MSE         Mean square error                                   Unit of output
ng          Exponent of relative permeability curve for gas     -
NRMSE       Normalized root mean square error                   -
pbest       Particle's best position
Pi          Initial reservoir pressure                          psi
PSO         Particle swarm optimization                         -
PVT         Pressure-volume-temperature                         -
R2          Coefficient of determination                        -
Rsi         Initial gas/oil ratio                               SCF/STB
RSM         Response surface model                              -
SSres       Residual sum of squares                             Unit of output
SStot       Total sum of squares                                Unit of output
SVM         Support vector machine                              -
vk          Velocity of particle                                -
wi          Inertia weight                                      -
Ymodel,i    Modeled value                                       Unit of output
Yobs,i      Observed data                                       Unit of output
Yobs,max    The maximum value of observed data                  Unit of output
Yobs,min    The minimum value of observed data                  Unit of output
References
[1]. Yeten, B., A. Castellini, B. Guyaguler, and W.H. Chen. A Comparison Study on
Experimental Design and Response Surface Methodologies. in SPE Reservoir Simulation
Symposium. The Woodlands, Texas: Society of Petroleum Engineers Inc. (2005).
[2]. Amorim, T.C.A.D. and D.J. Schiozer. Risk Analysis Speed-Up With Surrogate Models. in
SPE Latin America and Caribbean Petroleum Engineering Conference. Mexico City,
Mexico: Society of Petroleum Engineers (2012).
[3]. Li, B. and F. Friedmann. A Novel Response Surface Methodology Based on "Amplitude
Factor" Analysis for Modeling Nonlinear Responses Caused by Both Reservoir and
Controllable Factors. in SPE Annual Technical Conference and Exhibition. Dallas,
Texas: Society of Petroleum Engineers (2005).
[4]. Peng, C.Y. and R. Gupta. Experimental Design in Deterministic Modelling: Assessing
Significant Uncertainties. in SPE Asia Pacific Oil and Gas Conference and Exhibition.
Jakarta, Indonesia: Society of Petroleum Engineers (2003).
[5]. Dejean, J.P. and G. Blanc. Managing Uncertainties on Production Predictions Using
Integrated Statistical Methods. in SPE Annual Technical Conference and Exhibition.
Houston, Texas: Society of Petroleum Engineers (1999).
[6]. Corre, B., P. Thore, V.d. Feraudy, and G. Vincent. Integrated Uncertainty Assessment
For Project Evaluation and Risk Analysis. in SPE European Petroleum Conference.
Paris, France: Society of Petroleum Engineers Inc. (2000).
[7]. Manceau, E., M. Mezghani, I. Zabalza-Mezghani, and F. Roggero. Combination of
Experimental Design and Joint Modeling Methods for Quantifying the Risk Associated
With Deterministic and Stochastic Uncertainties - An Integrated Test Study. in SPE
Annual Technical Conference and Exhibition. New Orleans, Louisiana: Society of
Petroleum Engineers Inc. (2001).
[8]. Venkataraman, R. Application of the Method of Experimental Design to Quantify
Uncertainty in Production Profiles. in SPE Asia Pacific Conference on Integrated
Modelling for Asset Management. Yokohama, Japan: Society of Petroleum Engineers Inc.
(2000).
[9]. Chewaroungroaj, J., O.J. Varela, and L.W. Lake. An Evaluation of Procedures to
Estimate Uncertainty in Hydrocarbon Recovery Predictions. in SPE Asia Pacific
Conference on Integrated Modelling for Asset Management. Yokohama, Japan:
Society of Petroleum Engineers Inc. (2000).
[10]. Mohaghegh, S.D. Quantifying Uncertainties Associated With Reservoir Simulation
Studies Using a Surrogate Reservoir Model. in SPE Annual Technical Conference and
Exhibition. San Antonio, Texas, USA: Society of Petroleum Engineers (2006).
[11]. Guyaguler, B. and R.N. Horne. Uncertainty Assessment of Well Placement Optimization.
in SPE Annual Technical Conference and Exhibition. New Orleans, Louisiana:
Society of Petroleum Engineers Inc. (2001).
[12]. Manceau, E., F. Roggero, and I. Zabalza-Mezghani. Use Of Experimental Design
Methodology To Make Decisions In An Uncertain Reservoir Environment From
Reservoir Uncertainties To Economic Risk Analysis. World Petroleum Congress (2002).
[13]. Landa, J.L. and B. Güyagüler. A Methodology for History Matching and the Assessment
of Uncertainties Associated with Flow Prediction. in SPE Annual Technical Conference
and Exhibition. Denver, Colorado: Society of Petroleum Engineers (2003).
[14]. Carreras, P.E., S.E. Turner, and G.T. Wilkinson. Tahiti: Development Strategy
Assessment Using Design of Experiments and Response Surface Methods. in SPE
Western Regional/AAPG Pacific Section/GSA Cordilleran Section Joint Meeting.
Anchorage, Alaska, USA: Society of Petroleum Engineers (2006).
[15]. Yang, C., L.X. Nghiem, C. Card, and M. Bremeier. Reservoir Model Uncertainty
Quantification Through Computer-Assisted History Matching. in SPE Annual Technical
Conference and Exhibition. Anaheim, California, U.S.A.: Society of Petroleum Engineers
(2007).
[16]. Slotte, P.A. and E. Smorgrav. Response Surface Methodology Approach for History
Matching and Uncertainty Assessment of Reservoir Simulation Models. in
Europec/EAGE Conference and Exhibition. Rome, Italy: Society of Petroleum Engineers
(2008).
[17]. Ahmadi, M.A., R. Soleimani, and A. Bahadori, A computational intelligence scheme for
prediction equilibrium water dew point of natural gas in TEG dehydration systems. Fuel,
137 (2014). p. 145-154.
[18]. Mohaghegh, S.D., J.S. Liu, R. Gaskari, M. Maysami, and O.A. Olukoko. Application of
Well-Base Surrogate Reservoir Models (SRMs) to Two Offshore Fields in Saudi Arabia,
Case Study. in SPE Western Regional Meeting. Bakersfield, California, USA: Society of
Petroleum Engineers (2012).
[19]. Velasco, R., P. Panja, and M. Deo. New Production Performance and Prediction Tool for
Unconventional Reservoirs, URTEC-2461718-MS. in Unconventional Resources
Technology Conference, 1-3 August. San Antonio, Texas, USA: Unconventional
Resources Technology Conference (2016).
[20]. Patzek, T.W., F. Male, and M. Marder, Gas production in the Barnett Shale obeys a
simple scaling theory. Proceedings of the National Academy of Sciences, 110(49) (2013).
p. 19731-19736.
[21]. Wattenbarger, R.A., A.H. El-Banbi, M.E. Villegas, and J.B. Maggard. Production
Analysis of Linear Flow Into Fractured Tight Gas Wells, SPE-39931-MS. in SPE Rocky
Mountain Regional/Low-Permeability Reservoirs Symposium, 5-8 April. Denver,
Colorado, USA: Society of Petroleum Engineers (1998).
[22]. Nobakht, M., L. Mattar, S. Moghadam, and D.M. Anderson, Simplified Forecasting of
Tight/Shale-Gas Production in Linear Flow. Journal of Canadian Petroleum Technology,
51(06) (2012). p. 11.
[23]. Al-Kaabi, A.U. and W.J. Lee, Using Artificial Neural Networks To Identify the Well Test
Interpretation Model (includes associated papers 28151 and 28165). 8(03) (1993).
[24]. Juniardi, I.R. and I. Ershaghi. Complexities of Using Neural Network in Well Test
Analysis of Faulted Reservoirs. Society of Petroleum Engineers (1993).
[25]. Zhou, C.D., X.-L. Wu, and J.-A. Cheng. Determining Reservoir Properties in Reservoir
Studies Using a Fuzzy Neural Network. in SPE Annual Technical Conference and
Exhibition. Houston, Texas: Society of Petroleum Engineers (1993).
[26]. Mohaghegh, S., R. Arefi, S. Ameri, and M.H. Hefner. A Methodological Approach for
Reservoir Heterogeneity Characterization Using Artificial Neural Networks. in SPE
Annual Technical Conference and Exhibition. New Orleans, Louisiana: Society of
Petroleum Engineers (1994).
[27]. E. El-Sebakhy, T.S., S. Al-Bokhitan, Y. Shaaban, I. Raharja, Y. Khaeruzzaman. Support
Vector Machines Framework for Predicting the PVT Properties of Crude-Oil Systems.
Kingdom of Bahrain: 15th SPE Middle East Oil & Gas Show and Conference (2007).
[28]. Oloso, M., A. Khoukhi, A. Abdulraheem, and M. Elshafei. Prediction of Crude Oil
Viscosity and Gas/Oil Ratio Curves Using Recent Advances to Neural Networks. in
SPE/EAGE Reservoir Characterization and Simulation Conference. Abu Dhabi, UAE:
Society of Petroleum Engineers (2009).
[29]. Rabiei, A., H. Sayyad, M. Riazi, and A. Hashemi, Determination of dew point pressure in
gas condensate reservoirs based on a hybrid neural genetic algorithm. Fluid Phase
Equilibria, 387 (2015). p. 38-49.
[30]. Ahmadi, M.A. and M. Ebadi, Evolving smart approach for determination dew point
pressure through condensate gas reservoirs. Fuel, 117 (2014). p. 1074-1084.
[31]. Ahmadi, M.A., M. Ebadi, and A. Yazdanpanah, Robust intelligent tool for estimating dew
point pressure in retrograded condensate gas reservoirs: Application of particle swarm
optimization. Journal of Petroleum Science and Engineering, 123 (2014). p. 7-19.
[32]. Ahmadi, M.A., M. Ebadi, P.S. Marghmaleki, and M.M. Fouladi, Evolving predictive
model to determine condensate-to-gas ratio in retrograded condensate gas reservoirs.
Fuel, 124 (2014). p. 241-257.
[33]. Ahmadi, M.A., Neural network based unified particle swarm optimization for prediction
of asphaltene precipitation. Fluid Phase Equilibria, 314 (2012). p. 46-51.
[34]. Ahmadi, M.A. and S.R. Shadizadeh, New approach for prediction of asphaltene
precipitation due to natural depletion by using evolutionary algorithm concept. Fuel, 102
(2012). p. 716-723.
[35]. Ahmadi, M.A., M. Ebadi, A. Shokrollahi, and S.M.J. Majidi, Evolving artificial neural
network and imperialist competitive algorithm for prediction oil flow rate of the
reservoir. Applied Soft Computing, 13(2) (2013). p. 1085-1098.
[36]. Fatai Adesina Anifowose, A.A. Prediction of Porosity and Permeability of Oil and Gas
Reservoirs using Hybrid Computational Intelligence Models. Cairo, Egypt: North Africa
Technical Conference and Exhibition, SPE (2010).
[37]. Fatai Adesina Anifowose, A.O.E., Safiriyu Ijiyemi. Prediction of Oil and Gas Reservoir
Properties using Support Vector Machines. Bangkok, Thailand: International Petroleum
Technology Conference, (2011).
[38]. Ammal F. Al-anazi, G., Ian D, Support-Vector Regression for Permeability Prediction in
a Heterogeneous Reservoir: A Comparative Study. SPE Reservoir Evaluation &
Engineering, 13(03) (2010).
[39]. Mohammad-Ali Ahmadi, M.R.A., Seyed Moein Hosseini, Mohammad Ebadi,
Connectionist model predicts the porosity and permeability of petroleum reservoirs by
means of petro-physical logs: Application of artificial intelligence. Journal of Petroleum
Science and Engineering, 123 (2014). p. 183-200.
[40]. Mohammad-Ali Ahmadi, A.B., A LSSVM approach for determining well placement and
conning phenomena in horizontal wells. Fuel, 153 (2015). p. 276-283.
[41]. Mohammad Ali Ahmadi, M.E., Seyed Moein Hosseini, Prediction breakthrough time of
water coning in the fractured reservoirs by implementing low parameter support vector
machine approach. Fuel, 117 (2014). p. 579-589.
[42]. Ahmadi, M.A., Connectionist approach estimates gas–oil relative permeability in
petroleum reservoirs: Application to reservoir simulation. Fuel, 140 (2015). p. 429-439.
[43]. Eslamimanesh, A., F. Gharagheizi, M. Illbeigi, A.H. Mohammadi, A. Fazlali, and D.
Richon, Phase equilibrium modeling of clathrate hydrates of methane, carbon dioxide,
nitrogen, and hydrogen + water soluble organic promoters using Support Vector
Machine algorithm. Fluid Phase Equilibria, 316 (2012). p. 34-45.
[44]. Reza Gholgheysari Gorjaei, R.S., Mohammad Torkaman, Mohsen Safari, Ghassem
Zargar, A novel PSO-LSSVM model for predicting liquid rate of two phase flow through
wellhead chokes. Journal of Natural Gas Science and Engineering, 24 (2015). p. 228-237.
[45]. Ahmadi, M.-A., M.Z. Hasanvand, and A. Bahadori, A least-squares support vector
machine approach to predict temperature drop accompanying a given pressure drop for
the natural gas production and processing systems. International Journal of Ambient
Energy, 38(2) (2015). p. 122-129.
[46]. Ahmadi, M.A., M. Masoumi, and R. Askarinezhad, Evolving Connectionist Model to
Monitor the Efficiency of an In Situ Combustion Process: Application to Heavy Oil
Recovery. Energy Technology, 2(9-10) (2014). p. 811-818.
[47]. Ahmadi, M.-A., M. Masumi, R. Kharrat, and A.H. Mohammadi, Gas Analysis by In Situ
Combustion in Heavy-Oil Recovery Process: Experimental and Modeling Studies.
Chemical Engineering & Technology, 37(3) (2014). p. 409-418.
[48]. Ahmadi, M.A., M. Masoumi, and R. Askarinezhad, Evolving Smart Model to Predict the
Combustion Front Velocity for In Situ Combustion. Energy Technology, 3(2) (2015). p.
128-135.
[49]. Ahmadi, M.A., M. Zahedzadeh, S.R. Shadizadeh, and R. Abbassi, Connectionist model
for predicting minimum gas miscibility pressure: Application to gas injection process.
Fuel, 148 (2015). p. 202-211.
[50]. Ali Ahmadi, M. and A. Ahmadi, Applying a sophisticated approach to predict
CO2 solubility in brines: application to CO2 sequestration. International Journal of Low-
Carbon Technologies, 11(3) (2016). p. 325-332.
[51]. Ahmadi, M.-A., A. Bahadori, and S.R. Shadizadeh, A rigorous model to predict the
amount of Dissolved Calcium Carbonate Concentration throughout oil field brines: Side
effect of pressure and temperature. Fuel, 139 (2015). p. 154-159.
[52]. Panja, P., T. Conner, and M. Deo, Factors Controlling Production in Hydraulically
Fractured Low Permeability Oil Reservoirs. International Journal of Oil, Gas and Coal
Technology, 3(1) (2015). p. 18.
[53]. Box, G.E.P. and D.W. Behnken, Some New Three Level Designs for the Study of
Quantitative Variables. Technometrics, 2(4) (1960). p. 455-475.
[54]. Panja, P., T. Conner, and M. Deo, Grid sensitivity studies in hydraulically fractured low
permeability reservoirs. Journal of Petroleum Science and Engineering, 112(0) (2013). p.
78-87.
[55]. Panja, P., M. Pathak, R. Velasco, and M. Deo. Least Square Support Vector Machine: An
Emerging Tool for Data Analysis. in SPE Low Perm Symposium, 5-6 May. Denver,
Colorado: Society of Petroleum Engineers (2016).
[56]. Eberhart, R. and J. Kennedy. A new optimizer using particle swarm theory. in Micro
Machine and Human Science, 1995. MHS '95., Proceedings of the Sixth International
Symposium on. (1995).
[57]. Banerjee, C. and R. Sawal. PSO with dynamic acceleration Coefficient based on Multiple
Constraint Satisfaction. in International Conference on Advances in Electronics
Computers and Communications. Bangalore, India (2014).
[58]. Ahmadi, M.A., R. Soleimani, M. Lee, T. Kashiwao, and A. Bahadori, Determination of
oil well production performance using artificial neural network (ANN) linked to the
particle swarm optimization (PSO) tool. Petroleum, 1(2) (2015). p. 118-132.
[59]. Panja, P. and M. Deo, Unusual behavior of produced gas oil ratio in low permeability
fractured reservoirs. Journal of Petroleum Science and Engineering, 144 (2016). p. 76-
83.
[60]. Pathak, M., H. Kweon, P. Panja, R. Velasco, and M.D. Deo. Suppression in the Bubble
Points of Oils in Shales Combined Effect of Presence of Organic Matter and
Confinement. in SPE Unconventional Resources Conference, 15-16 February. Calgary,
Alberta, Canada: Society of Petroleum Engineers (2017).
[61]. Velasco, R., M. Pathak, P. Panja, and M. Deo. What Happens to Permeability at the
Nanoscale? A Molecular Dynamics Simulation Study. in SPE/AAPG/SEG
Unconventional Resources Technology Conference, 24-26 July. Austin, Texas, USA:
Unconventional Resources Technology Conference (2017).
Appendix A: Supplementary Information

Table A.1: Coefficient of determination (R2) of RSM, LSSVM and ANN for all models
                             Training Data            Test Data
Output         Model         RSM    LSSVM  ANN        RSM    LSSVM  ANN
Oil Recovery   90 days       0.99   0.99   0.96       0.69   0.52   0.51
Oil Recovery   1 year        0.98   0.99   0.98       0.78   0.69   0.53
Oil Recovery   5 years       0.99   0.99   0.99       0.63   0.81   0.60
Oil Recovery   10 years      0.99   0.99   0.98       0.91   0.90   0.72
Oil Recovery   15 years      0.99   0.99   0.99       0.97   0.93   0.84
Oil Recovery   Rate Based    0.98   0.98   0.99       0.57   0.54   0.48
Gas Oil Ratio  90 days       0.98   0.99   0.95       0.92   0.84   0.80
Gas Oil Ratio  1 year        0.98   0.98   0.96       0.93   0.91   0.90
Gas Oil Ratio  5 years       0.98   0.98   0.97       0.41   0.73   0.30
Gas Oil Ratio  10 years      0.88   0.92   0.93       0.76   0.73   0.46
Gas Oil Ratio  15 years      0.83   0.77   0.84       0.79   0.75   0.32
Gas Oil Ratio  Rate Based    0.84   0.88   0.92       0.68   0.45   0.43
Table A.2: Normalized Root Mean Square Error (NRMSE, %) of RSM, LSSVM and ANN for all
models
                             Training Data            Test Data
Output         Model         RSM    LSSVM  ANN        RSM    LSSVM  ANN
Oil Recovery   90 days       1.9    1.9    3.5        16.5   20.3   20.7
Oil Recovery   1 year        2.4    2.3    2.5        12.4   14.7   18.1
Oil Recovery   5 years       2.0    1.9    2.1        16.1   11.5   16.7
Oil Recovery   10 years      1.9    1.7    2.6        7.9    8.5    14.0
Oil Recovery   15 years      2.7    2.4    2.1        4.9    7.3    10.8
Oil Recovery   Rate Based    3.5    3.3    2.4        20.7   21.2   22.6
Gas Oil Ratio  90 days       2.6    2.0    4.6        8.7    11.8   13.3
Gas Oil Ratio  1 year        3.3    3.3    4.2        7.9    9.3    9.7
Gas Oil Ratio  5 years       3.0    3.1    3.7        24.0   16.1   26.2
Gas Oil Ratio  10 years      5.7    4.6    4.3        16.1   17.2   24.3
Gas Oil Ratio  15 years      5.8    6.8    5.6        14.4   15.5   25.7
Gas Oil Ratio  Rate Based    5.2    4.5    3.8        14.1   18.4   18.8
Figure A.1: Training data comparison of RSM, ANN and LSSVM models for (a) oil recovery
after 1 year, (b) oil recovery after 5 years, (c) oil recovery after 15 years, (d) gas-oil
ratio after 1 year, (e) gas-oil ratio after 5 years, and (f) gas-oil ratio after 15 years
Table A.3: List of simulations using Box-Behnken DOE used to train surrogate models
Sr. Km ng Cf dRs/dp Rsi Pi Pwf Xf
No. (nD) - (1/psi) ((SCF/STB)/psi) (SCF/STB) (psi) (psi) (ft.)
1 10 1 4.00E-05 0.65 1900 5250 1000 180
2 10 3 4.00E-05 0.65 1900 5250 1000 180
3 5000 1 4.00E-05 0.65 1900 5250 1000 180
4 5000 3 4.00E-05 0.65 1900 5250 1000 180
5 10 2 4.00E-06 0.65 1900 5250 1000 180
6 10 2 4.00E-04 0.65 1900 5250 1000 180
7 5000 2 4.00E-06 0.65 1900 5250 1000 180
8 5000 2 4.00E-04 0.65 1900 5250 1000 180
9 10 2 4.00E-05 0.5 1900 5250 1000 180
10 10 2 4.00E-05 0.8 1900 5250 1000 180
11 5000 2 4.00E-05 0.5 1900 5250 1000 180
12 5000 2 4.00E-05 0.8 1900 5250 1000 180
13 10 2 4.00E-05 0.65 800 5250 1000 180
14 10 2 4.00E-05 0.65 3000 5250 1000 180
15 5000 2 4.00E-05 0.65 800 5250 1000 180
16 5000 2 4.00E-05 0.65 3000 5250 1000 180
17 10 2 4.00E-05 0.65 1900 4000 1000 180
18 10 2 4.00E-05 0.65 1900 6500 1000 180
19 5000 2 4.00E-05 0.65 1900 4000 1000 180
20 5000 2 4.00E-05 0.65 1900 6500 1000 180
21 10 2 4.00E-05 0.65 1900 5250 500 180
22 10 2 4.00E-05 0.65 1900 5250 1500 180
23 5000 2 4.00E-05 0.65 1900 5250 500 180
24 5000 2 4.00E-05 0.65 1900 5250 1500 180
25 10 2 4.00E-05 0.65 1900 5250 1000 60
26 10 2 4.00E-05 0.65 1900 5250 1000 300
27 5000 2 4.00E-05 0.65 1900 5250 1000 60
28 5000 2 4.00E-05 0.65 1900 5250 1000 300
29 225 1 4.00E-06 0.65 1900 5250 1000 180
30 225 1 4.00E-04 0.65 1900 5250 1000 180
31 225 3 4.00E-06 0.65 1900 5250 1000 180
32 225 3 4.00E-04 0.65 1900 5250 1000 180
33 225 1 4.00E-05 0.5 1900 5250 1000 180
34 225 1 4.00E-05 0.8 1900 5250 1000 180
35 225 3 4.00E-05 0.5 1900 5250 1000 180
36 225 3 4.00E-05 0.8 1900 5250 1000 180
37 225 1 4.00E-05 0.65 800 5250 1000 180
38 225 1 4.00E-05 0.65 3000 5250 1000 180
39 225 3 4.00E-05 0.65 800 5250 1000 180
40 225 3 4.00E-05 0.65 3000 5250 1000 180
41 225 1 4.00E-05 0.65 1900 4000 1000 180
42 225 1 4.00E-05 0.65 1900 6500 1000 180
43 225 3 4.00E-05 0.65 1900 4000 1000 180
44 225 3 4.00E-05 0.65 1900 6500 1000 180
45 225 1 4.00E-05 0.65 1900 5250 500 180
46 225 1 4.00E-05 0.65 1900 5250 1500 180
47 225 3 4.00E-05 0.65 1900 5250 500 180
48 225 3 4.00E-05 0.65 1900 5250 1500 180
49 225 1 4.00E-05 0.65 1900 5250 1000 60
50 225 1 4.00E-05 0.65 1900 5250 1000 300
51 225 3 4.00E-05 0.65 1900 5250 1000 60
52 225 3 4.00E-05 0.65 1900 5250 1000 300
53 225 2 4.00E-06 0.5 1900 5250 1000 180
54 225 2 4.00E-06 0.8 1900 5250 1000 180
55 225 2 4.00E-04 0.5 1900 5250 1000 180
56 225 2 4.00E-04 0.8 1900 5250 1000 180
57 225 2 4.00E-06 0.65 800 5250 1000 180
58 225 2 4.00E-06 0.65 3000 5250 1000 180
59 225 2 4.00E-04 0.65 800 5250 1000 180
60 225 2 4.00E-04 0.65 3000 5250 1000 180
61 225 2 4.00E-06 0.65 1900 4000 1000 180
62 225 2 4.00E-06 0.65 1900 6500 1000 180
63 225 2 4.00E-04 0.65 1900 4000 1000 180
64 225 2 4.00E-04 0.65 1900 6500 1000 180
65 225 2 4.00E-06 0.65 1900 5250 500 180
66 225 2 4.00E-06 0.65 1900 5250 1500 180
67 225 2 4.00E-04 0.65 1900 5250 500 180
68 225 2 4.00E-04 0.65 1900 5250 1500 180
69 225 2 4.00E-06 0.65 1900 5250 1000 60
70 225 2 4.00E-06 0.65 1900 5250 1000 300
71 225 2 4.00E-04 0.65 1900 5250 1000 60
72 225 2 4.00E-04 0.65 1900 5250 1000 300
73 225 2 4.00E-05 0.5 800 5250 1000 180
74 225 2 4.00E-05 0.5 3000 5250 1000 180
75 225 2 4.00E-05 0.8 800 5250 1000 180
76 225 2 4.00E-05 0.8 3000 5250 1000 180
77 225 2 4.00E-05 0.5 1900 4000 1000 180
78 225 2 4.00E-05 0.5 1900 6500 1000 180
79 225 2 4.00E-05 0.8 1900 4000 1000 180
80 225 2 4.00E-05 0.8 1900 6500 1000 180
81 225 2 4.00E-05 0.5 1900 5250 500 180
82 225 2 4.00E-05 0.5 1900 5250 1500 180
83 225 2 4.00E-05 0.8 1900 5250 500 180
84 225 2 4.00E-05 0.8 1900 5250 1500 180
85 225 2 4.00E-05 0.5 1900 5250 1000 60
86 225 2 4.00E-05 0.5 1900 5250 1000 300
87 225 2 4.00E-05 0.8 1900 5250 1000 60
88 225 2 4.00E-05 0.8 1900 5250 1000 300
89 225 2 4.00E-05 0.65 800 4000 1000 180
90 225 2 4.00E-05 0.65 800 6500 1000 180
91 225 2 4.00E-05 0.65 3000 4000 1000 180
92 225 2 4.00E-05 0.65 3000 6500 1000 180
93 225 2 4.00E-05 0.65 800 5250 500 180
94 225 2 4.00E-05 0.65 800 5250 1500 180
95 225 2 4.00E-05 0.65 3000 5250 500 180
96 225 2 4.00E-05 0.65 3000 5250 1500 180
97 225 2 4.00E-05 0.65 800 5250 1000 60
98 225 2 4.00E-05 0.65 800 5250 1000 300
99 225 2 4.00E-05 0.65 3000 5250 1000 60
100 225 2 4.00E-05 0.65 3000 5250 1000 300
101 225 2 4.00E-05 0.65 1900 4000 500 180
102 225 2 4.00E-05 0.65 1900 4000 1500 180
103 225 2 4.00E-05 0.65 1900 6500 500 180
104 225 2 4.00E-05 0.65 1900 6500 1500 180
105 225 2 4.00E-05 0.65 1900 4000 1000 60
106 225 2 4.00E-05 0.65 1900 4000 1000 300
107 225 2 4.00E-05 0.65 1900 6500 1000 60
108 225 2 4.00E-05 0.65 1900 6500 1000 300
109 225 2 4.00E-05 0.65 1900 5250 500 60
110 225 2 4.00E-05 0.65 1900 5250 500 300
111 225 2 4.00E-05 0.65 1900 5250 1500 60
112 225 2 4.00E-05 0.65 1900 5250 1500 300
113 225 2 4.00E-05 0.65 1900 5250 1000 180
114 225 2 4.00E-05 0.65 1900 5250 1000 180
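The layout of Table A.3 is consistent with a three-level design in which each pair of the eight inputs is set to its four low/high combinations while the remaining inputs stay at their medium level, followed by two center runs. The sketch below reproduces that pattern and run order. The level values are read from Table A.3; the function name `pairwise_design` is hypothetical, and whether the table was generated in exactly this way is an assumption.

```python
from itertools import combinations, product

# (minimum, medium, maximum) levels as they appear in Table A.3
levels = {
    "Km":     (10, 225, 5000),
    "ng":     (1, 2, 3),
    "Cf":     (4.0e-6, 4.0e-5, 4.0e-4),
    "dRs/dp": (0.5, 0.65, 0.8),
    "Rsi":    (800, 1900, 3000),
    "Pi":     (4000, 5250, 6500),
    "Pwf":    (500, 1000, 1500),
    "Xf":     (60, 180, 300),
}

def pairwise_design(levels, n_center=2):
    """Every factor pair at its low/high corners, others at medium, plus center runs."""
    names = list(levels)
    center = {name: levels[name][1] for name in names}
    runs = []
    for a, b in combinations(names, 2):
        for ia, ib in product((0, 2), repeat=2):
            run = dict(center)
            run[a] = levels[a][ia]
            run[b] = levels[b][ib]
            runs.append(run)
    runs += [dict(center) for _ in range(n_center)]
    return runs

design = pairwise_design(levels)
print(len(design))  # 28 pairs * 4 corners + 2 center runs = 114
```

The first four generated runs vary Km and ng, matching rows 1 to 4 of Table A.3, and the last two are the center runs in rows 113 and 114.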
Table A.4: List of simulations performed using random input parameter values to test the
surrogate models
Sr. Km ng Cf dRs/dp Rsi Pi Pwf Xf
No. (nD) - (1/psi) ((SCF/STB)/psi) (SCF/STB) (psi) (psi) (ft.)
1 1496 1.17 1.0E-05 0.62 2069 5738 743 300
2 361 1.27 3.8E-05 0.67 1456 4170 1417 180
3 31 1.35 8.0E-06 0.58 1625 4637 769 60
4 44 1.78 2.0E-05 0.59 1724 4560 1266 60
5 2475 2.66 1.9E-05 0.69 2408 5670 689 300
6 12 2.61 1.4E-05 0.58 2820 6111 787 60
7 210 1.12 2.0E-05 0.75 1775 4861 591 180
8 28 1.80 1.9E-05 0.79 2969 5951 1076 60
9 4390 2.05 6.0E-06 0.72 2531 5688 1183 300
10 840 1.83 5.4E-06 0.60 1460 4017 1047 300
11 225 2.31 4.0E-05 0.68 2106 5505 926 180
12 187 2.26 5.9E-06 0.53 1689 4967 1144 180
13 14 1.58 4.3E-06 0.77 2779 6290 1148 60
14 694 1.86 1.5E-05 0.76 1539 4003 1179 300
15 13 1.03 3.0E-05 0.75 1273 5156 1136 60
16 16 2.97 1.9E-05 0.58 1413 5061 1445 60
17 256 1.33 6.2E-06 0.68 1323 5152 709 180
18 18 1.21 9.4E-06 0.51 2281 5925 1209 60
19 1618 1.74 1.2E-05 0.63 1552 4806 736 300
20 1612 1.40 3.8E-05 0.59 2527 5962 619 300
21 892 1.98 5.7E-06 0.55 1566 5178 1107 300
22 25 1.68 2.9E-05 0.55 1900 4089 950 60
23 604 2.90 1.8E-05 0.63 1614 4440 959 300
24 251 2.84 9.5E-06 0.53 2944 5804 1162 180
25 4237 1.11 6.2E-06 0.68 1710 5184 1270 300
26 565 2.48 1.1E-05 0.64 1966 4382 850 300
27 1448 1.54 1.2E-05 0.71 1393 4853 1162 300
28 168 1.85 5.3E-06 0.71 2676 5518 916 180
29 147 2.10 1.6E-05 0.69 1431 4479 1342 180
30 1692 2.89 6.7E-06 0.51 2672 5846 1333 300