
Accepted Manuscript

Application of artificial intelligence to forecast hydrocarbon production from shales

Palash Panja, Raul Velasco, Manas Pathak, Milind Deo

PII: S2405-6561(17)30114-1
DOI: 10.1016/j.petlm.2017.11.003
Reference: PETLM 175

To appear in: Petroleum

Received Date: 15 June 2017


Revised Date: 22 September 2017
Accepted Date: 22 November 2017

Please cite this article as: P. Panja, R. Velasco, M. Pathak, M. Deo, Application of artificial intelligence to
forecast hydrocarbon production from shales, Petroleum (2017), doi: 10.1016/j.petlm.2017.11.003.

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to
our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and all
legal disclaimers that apply to the journal pertain.
Application of Artificial Intelligence to Forecast Hydrocarbon Production from Shales

Palash Panja1*, Raul Velasco1, Manas Pathak2, Milind Deo2

1 Energy & Geoscience Institute, 432 Wakara Way, Suite 300, Salt Lake City, UT 84108
2 Department of Chemical Engineering, University of Utah, 50 Central Campus Dr., Salt Lake City, UT 84112

*Corresponding author: ppanja@egi.utah.edu

Abstract

Artificial intelligence (AI) methods and applications have recently gained a great deal of attention in many areas, including mathematics, neuroscience, economics, engineering, linguistics, gaming, and many others. This is due to the surge of innovative and sophisticated applications of AI techniques to highly complex problems as well as powerful new developments in high-speed computing. Various applications of AI in everyday life include machine learning, pattern recognition, robotics, data processing and analysis, etc. The oil and gas industry is not behind either; in fact, AI techniques have recently been applied to estimate PVT properties, optimize production, predict recoverable hydrocarbons, optimize well placement using pattern recognition, optimize hydraulic fracture design, and aid in reservoir characterization efforts. In this study, three different surrogate models are trained and used to forecast hydrocarbon production from hydraulically fractured wells. Two widely used artificial intelligence methods, namely the Least Square Support Vector Machine (LSSVM) and Artificial Neural Networks (ANN), are compared to a traditional curve fitting method known as the Response Surface Model (RSM), which uses second order polynomial equations, to determine production from shales. The objective of this work is to further explore the potential of AI in the oil and gas industry. Eight parameters are considered as input factors to build the model: reservoir permeability, initial dissolved gas-oil ratio, rock compressibility, gas relative permeability, slope of gas oil ratio, initial reservoir pressure, flowing bottom hole pressure, and hydraulic fracture spacing. The range of values used for these parameters resembles real field scenarios from prolific shale plays such as the Eagle Ford, Bakken, and Niobrara in the United States. Production data consists of oil recovery factor and produced gas-oil ratio (GOR) generated from a generic hydraulically fractured reservoir model using a commercial simulator. The Box-Behnken experiment design was used to minimize the number of simulations for this study. Five time-based models (for production periods of 90 days, 1 year, 5 years, 10 years, and 15 years) and one rate-based model (when oil rate drops to 5 bbl/day/fracture) were considered. A Particle Swarm Optimization (PSO) routine is used in all three surrogate models to obtain the associated model parameters. Models were trained using 80% of all data generated through simulation while 20% was used for testing of the models. All models were evaluated by measuring the goodness of fit through the coefficient of determination (R²) and the Normalized Root Mean Square Error (NRMSE). Results show that RSM and LSSVM have very accurate oil recovery forecasting capabilities while LSSVM shows the best performance for complex GOR behavior. Furthermore, all surrogate models are shown to serve as reliable proxy reservoir models useful for fast fluid recovery forecasts and sensitivity analyses.

Keywords: Surrogate models; LSSVM; ANN; Oil recovery; Artificial intelligence; Unconventional reservoirs

1. Introduction

Surrogate models are particularly useful for quick predictions given a range of input parameters. These models are used to forecast oil production and perform sensitivity and uncertainty analyses. Polynomial equations and other non-linear equations known as response surface models (RSM) have been popularized for their simple mathematical structure and ease of implementation. Recently, artificial intelligence applications have gained the interest of engineers and scientists due to their unconventional ways of connecting input data to output. RSM coupled with a proper design of experiments [1] was proven to be an efficient and fast proxy model for forecasting production performance and analyzing uncertainties [2]. Oil rate and water cut results were also predicted using RSM [3]. Response surface models are widely applied to various aspects of reservoir engineering including estimating initial hydrocarbon uncertainty [4] and production uncertainty [5-10], finding an optimal scheme for well placement [7, 11-14], history matching [13, 15, 16], and determining the dew point of water in natural gas processing units [17]. Field cases have been studied using pattern recognition techniques [18] to determine pressure and production variation according to well locations.

Even though researchers have developed numerical, analytical, and semi-analytical techniques to understand the physics underlying production from hydraulically fractured tight formations [19-22], many of these systems grow in complexity, rendering most of these methods inapplicable. The AI approach, on the other hand, is very useful when dealing with highly complex systems. At the cost of understanding the physical mechanisms taking place in tight formations, AI helps us analyze and forecast hydrocarbon production and assess performance.

In this study, two of the most common AI techniques, namely ANN and LSSVM, as well as a second order polynomial RSM are used to predict oil production and gas-oil ratio from hydraulically fractured low permeability reservoirs. The comparison of these three methods in terms of performance and accuracy is also discussed. The application of ANN started before that of LSSVM; in the early 1990s, data from well tests were already being interpreted using ANN [23, 24]. Rock characteristics such as lithology were determined from well logs using fuzzy neural networks [25]. Reservoir heterogeneity with respect to porosity, permeability, and oil saturation was characterized from geophysical well logs such as gamma ray, bulk density, deep induction, etc. using ANN [26]. Thermodynamic properties of reservoir fluids such as bubble point pressures and formation volume factors at the bubble point have been predicted from four inputs (solution GOR, reservoir temperature, oil gravity, and gas density) using ANN, SVM and non-linear regression [27]. Similarly, crude oil viscosity and solution GOR as functions of pressure have been determined using ANN from 12 variables including oil composition, bubble point pressure, bubble point viscosity, etc. [28]. Calculations of gas condensate dew point pressures were also made using gas composition, temperature, and heavy fraction properties [29-31] and condensate to gas ratio [32]. ANN predictions of asphaltene precipitation [33] showed promising agreement with experimental studies [34]. Oil rates in pipelines have also been measured using ANN for varying pressures and temperatures [35]. Various applications of LSSVM include porosity and permeability determination [36-39], water coning in horizontal wells [40, 41], well placement [40], gas-oil relative permeability curves [42], phase equilibrium calculations of hydrates [43], oil flow rate predictions [44], and temperature-pressure relationships in natural gas production and processing [45]. Wide applications of artificial intelligence in improved oil recovery were recently described by researchers [46-49]. Other applications include the description of CO2 solubility [50] and calcium carbonate [51] in brine sequestration processes.

Eight important parameters are considered as input data. They include geological parameters (initial reservoir pressure, rock compressibility, and permeability), an operational parameter (bottom hole pressure), a completion parameter (fracture spacing), a rock-fluid property (Corey gas relative permeability exponent), and fluid properties (initial solution gas-oil ratio and the linear slope of solution gas-oil ratio versus pressure). These parameters were selected from a previous mechanistic study [52] that revealed them to be highly significant. Six models (five time-based models for production after 90 days, 1 year, 5 years, 10 years, and 15 years and one rate-based model for when the oil rate drops to 5 bbl/day/fracture) of oil recovery and produced GOR are developed for each surrogate model (RSM, ANN and LSSVM). Data is generated from a generic reservoir model with one vertical hydraulic fracture placed in the middle of the reservoir using a commercial reservoir simulator. The mathematical formulations and workflow to create these surrogate models are discussed in this article. The results obtained for all models are compared using error analyses in terms of the coefficient of determination (R²) and the normalized root mean square error (NRMSE).

2. Reservoir Model

Unconventional reservoirs such as shales and other tight formations are very complex in terms of possible natural fracture presence and heterogeneity. However, it is possible to build a homogeneous reservoir model using average properties if the variation is not very large. Typically, wells are drilled vertically and then directed horizontally for 1 to 2 miles, where as many as 100 vertical hydraulic fractures are induced to generate highly conductive flow paths to the wellbore. Simulating an entire reservoir model that consists of 100 hydraulic fractures is computationally very expensive. Hence, only a small representative portion of the reservoir is simulated, where production from a single vertical fracture is considered. The reservoir properties are assumed to be homogeneous as listed in Table 1.

Table 1: Simulation parameters

Parameter                       Value
Reservoir top depth (ft)        12000
Reservoir thickness (ft)        200
Reservoir width (ft)            750
Fracture width (ft)             0.05
Fracture height (ft)            200
Fracture half-length (ft)       375
Fracture orientation            Parallel to YZ plane
Reservoir porosity (%)          5
Initial water saturation (%)    16

The number of unique input parameter combinations could lead to an enormous number of experiments or simulations. The Box-Behnken method [53] is chosen in this study to keep the required number of simulations to a minimum. This simulation design is also suitable for second order response surface models. Using the Box-Behnken design, 114 simulations are run for eight input parameters at three levels (minimum, medium and maximum) as shown in Table 2. The values of all input parameters are converted to a -1, 0, and 1 scale using a linear relationship, except for matrix permeability and rock compressibility, for which logarithmic values are used instead.

Table 2: Input parameters and their values

No.  Variable                               Symbol  Minimum (-1)  Medium (0)  Maximum (+1)
1    Matrix Permeability (nD)               X1      10            225         5000
2    Gas Rel. Permeability Exponent, ng     X2      1             2           3
3    Rock Compressibility (1/psi)           X3      4×10⁻⁶        4×10⁻⁵      4×10⁻⁴
4    dRs/dp (scf/stb/psi)                   X4      0.50          0.65        0.80
5    Initial Gas Oil Ratio, Rsi (scf/stb)   X5      800           1900        3000
6    Initial Pressure, Pi (psi)             X6      4000          5250        6500
7    BHP (psi)                              X7      500           1000        1500
8    Fracture Spacing (ft)                  X8      60            180         300
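The conversion of physical inputs to the coded -1, 0, +1 scale can be sketched as follows. This is a minimal illustration, assuming a base-10 logarithm for matrix permeability and rock compressibility (the text only says "logarithmic values"); the dictionary keys and function names are hypothetical and do not come from the original study.

```python
import math

# Ranges copied from Table 2: (minimum, maximum, use_log_scale).
# Key names are illustrative only.
RANGES = {
    "permeability_nD": (10.0, 5000.0, True),
    "ng": (1.0, 3.0, False),
    "rock_compressibility_1_per_psi": (4e-6, 4e-4, True),
    "dRs_dp": (0.50, 0.80, False),
    "Rsi_scf_per_stb": (800.0, 3000.0, False),
    "Pi_psi": (4000.0, 6500.0, False),
    "BHP_psi": (500.0, 1500.0, False),
    "fracture_spacing_ft": (60.0, 300.0, False),
}


def code_value(name, value):
    """Map a physical value onto the coded [-1, +1] scale."""
    lo, hi, use_log = RANGES[name]
    if use_log:
        # Assumption: base-10 logarithm for the two log-scaled inputs.
        lo, hi, value = math.log10(lo), math.log10(hi), math.log10(value)
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def decode_value(name, coded):
    """Inverse of code_value: coded level back to a physical value."""
    lo, hi, use_log = RANGES[name]
    if use_log:
        lo, hi = math.log10(lo), math.log10(hi)
    value = lo + (coded + 1.0) * (hi - lo) / 2.0
    return 10.0 ** value if use_log else value


mid_pressure = code_value("Pi_psi", 5250.0)             # coded level of the medium pressure
mid_permeability = decode_value("permeability_nD", 0.0)  # geometric mean of 10 and 5000
```

Under this assumption the coded midpoint of permeability is the geometric mean of the two extremes (about 224 nD), which is close to the 225 nD listed in Table 2, and the midpoint of rock compressibility is 4×10⁻⁵ 1/psi.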

Apart from the 114 simulation results that were used to train the models, 30 additional simulations were run to test the models. Therefore, the training data comprises approximately 80% of the total data set (114 out of 144) while the testing portion comprises approximately 20% (30 out of 144). The list of simulations used to train and test the models can be found in the Appendix (Tables A.3 and A.4). IMEX™ from the Computer Modeling Group, Calgary, Canada was used to conduct all black-oil simulations. The minimum number of simulation grid blocks necessary to obtain accurate results and avoid convergence issues was used as prescribed by Panja et al. [54].

3. Surrogate Model

As mentioned earlier, three types of surrogate models were developed and compared in this study: a Response Surface Model (RSM), a Least Square Support Vector Machine (LSSVM) model, and an Artificial Neural Networks (ANN) model. Simulation results in terms of oil recovery and gas oil ratio (GOR) were obtained in two ways: by recording oil recovery and GOR after certain production times and when the oil rate dropped to 5 bbl/day/fracture. In other words, five time-based models and one rate-based model were developed as summarized below:

• Time-based models: Models for oil recovery and GOR at 90 days, 1 year, 5 years, 10 years, and 15 years.
• Rate-based model: Model for oil recovery and GOR when the oil rate drops to 5 bbl/day/fracture.

All unknown parameters in the surrogate models (RSM, LSSVM and ANN) are obtained using an optimization routine known as Particle Swarm Optimization (PSO) implemented in Matlab (MathWorks® Inc.). The same optimization routine was used for all surrogate models to eliminate any performance bias. Surrogate models can sometimes return physically unacceptable values such as negative oil recovery factors or gas oil ratios. To avoid this pitfall, logarithms of the outcomes (recovery factors and gas-oil ratio) are used to build the models. A simplified schematic of the methodology used to develop the surrogate models is shown in Figure 1.

Figure 1: Surrogate model development schematic
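The benefit of building the models on logarithms of the outcomes can be illustrated with a small sketch. The numbers are made up for illustration and are not simulation results from this study; a one-variable least-squares line stands in for the full surrogate model, and the natural logarithm is an assumption.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x for one input variable."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx
    return y_mean - b * x_mean, b


# Hypothetical recovery factors (%) that decline steeply with a coded input x.
xs = [-1.0, 0.0, 1.0]
recovery = [12.0, 3.0, 0.75]

a_raw, b_raw = fit_line(xs, recovery)                          # model of the raw outcome
a_log, b_log = fit_line(xs, [math.log(r) for r in recovery])   # model of log(outcome)

x_new = 1.5                                   # mild extrapolation beyond the sampled range
pred_raw = a_raw + b_raw * x_new              # a raw-space model can go negative
pred_log = math.exp(a_log + b_log * x_new)    # exp(...) is always positive
```

Because the log-space prediction is exponentiated before use, it cannot become negative regardless of the fitted coefficients, whereas the raw-space line returns a negative recovery factor at `x_new`.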

All unknown parameters are listed in Table 3. These parameters are discussed in more detail in the upcoming sections.

Table 3: Number of parameters determined in each surrogate model

Method  Number of parameters                                        Optimized parameters
RSM     1 intercept                                                 All
        44 coefficients
LSSVM   1 bias term (b)                                             Regularization parameter (γ)
        1 regularization parameter (γ)                              Kernel parameter (σ)
        1 kernel parameter (σ)
        92 support values (αi)
ANN     15 biases (14 hidden layer + 1 output layer)                All
        126 weights (8×14 for hidden layer + 14 for output layer)

The first two models (i.e. RSM and LSSVM) are discussed in detail in a previous article [55]. Therefore, these two models are discussed only briefly here and the reader is referred to the referenced article for more details.

3.1. Response Surface Model (RSM)

The response surface model is a common method in many branches of engineering. Basically, an algebraic equation is fitted to develop a relationship between input and output data. During equation fitting with training data, the parameters (coefficients, intercepts, etc.) are determined through an optimization routine to minimize error. A second order polynomial equation is chosen in this study. The equation is defined as:

$$ y(\mathbf{x}) = a_0 + \sum_{k=1}^{8} a_k x_k + \sum_{i=1}^{8} \sum_{j=i}^{8} a_{ij} x_i x_j \qquad (1) $$

For 8 input variables, there are 8 linear coefficients, ak, 36 second order coefficients, aij (8 quadratic terms and 28 two-factor interaction terms), and one intercept, a0, as shown in Equation 1. A workflow to develop surrogate models (RSM and LSSVM) is shown in Figure 2.

Figure 2: Workflow used to develop RSM and LSSVM. Modified from Panja et al. [55]
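The term count of the second order polynomial can be checked with a short sketch. It assumes the second order terms run over all pairs i ≤ j (squares plus two-factor interactions), which is the reading that gives 36 coefficients for 8 inputs; the function names are illustrative and not from the original study.

```python
from itertools import combinations_with_replacement

N_INPUTS = 8


def quadratic_terms(x):
    """Return [1, x_1..x_n, x_i*x_j for i <= j] for one coded input vector."""
    terms = [1.0]                       # multiplies the intercept a_0
    terms.extend(x)                     # linear terms (8 for 8 inputs)
    for i, j in combinations_with_replacement(range(len(x)), 2):
        terms.append(x[i] * x[j])       # second order terms (36 for 8 inputs)
    return terms


def rsm_predict(coefficients, x):
    """Evaluate the second order polynomial for a coded input vector x."""
    return sum(c * t for c, t in zip(coefficients, quadratic_terms(x)))


n_terms = len(quadratic_terms([0.0] * N_INPUTS))   # 1 + 8 + 36
```

With the coefficients stored as one flat list in the same order as `quadratic_terms`, an optimizer only needs to search over a vector of length `n_terms`.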

As part of the development of a model, validation is performed using test data to assess robustness. An accepted error margin is set for the surrogate model. In this fashion, surrogate models are iteratively improved until the error falls within the accepted margin.

3.2. Least Square Support Vector Machine (LSSVM)

The Support Vector Machine (SVM) is usually used for classification and regression analysis. A modified form of SVM, namely the least square support vector machine (LSSVM), is used in this study. LSSVM is close to the SVM formulation but solves a linear system instead of a quadratic programming (QP) problem. It has been widely applied in various fields because it is easier to implement and converges quickly. On the other hand, LSSVM is inherently prone to overfitting when minimizing error. Various combinations of training and testing data sets such as 90-10 (%), 85-15 (%), 80-20 (%), and 70-30 (%) were tried. Eventually, a data set with 80% used for training and 20% used for testing yielded the best prediction capabilities in this study. The same combination was used for the other two surrogate models (RSM and ANN).

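The input and output relationship in LSSVM is given by Equation 2:

$$ y(\mathbf{x}) = \mathbf{w}^{T} \varphi(\mathbf{x}) + b \qquad (2) $$

The final form of LSSVM is given by Equation 3:

$$ \begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \boldsymbol{\Omega} + \gamma^{-1}\mathbf{I} \end{bmatrix} \begin{bmatrix} b \\ \boldsymbol{\alpha} \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{y} \end{bmatrix}, \qquad \Omega_{ij} = K(\mathbf{x}_i, \mathbf{x}_j) \qquad (3) $$

K(x, xi) is known as the kernel function, which is chosen a priori. The Radial Basis Function (RBF) kernel is used in this study as shown in Equation 4:

$$ K(\mathbf{x}, \mathbf{x}_i) = \exp\left(-\frac{\|\mathbf{x}-\mathbf{x}_i\|^{2}}{\sigma^{2}}\right) \qquad (4) $$

Where,
xi: Input vector of the ith data point
w: Weight vector
φ(x): Nonlinear mapping of the input vector
y: Vector of training outputs
b: Bias term
γ: Regularization parameter
σ: Kernel parameter
αi: Support values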

It is evident from Equations 2, 3, and 4 that if the regularization parameter, γ, and the kernel parameter, σ², are provided, the bias term and all support values can be determined from a linear relationship. Suitable values of γ and σ² are found using an optimization technique in which initial values are guessed and iteratively improved as described in Figure 2. For the optimization part, the training data is further divided into LSSVM training data (80%) and optimization data (20%) as shown in Figure 3.

Figure 3: Division of total data set into LSSVM training, optimization, and test data
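The sizes of the three data subsets follow from simple arithmetic. The sketch below assumes that the 80% share of the 114 training simulations is rounded up to a whole simulation, which reproduces the 92 support values listed in Table 3; the rounding rule itself is an assumption, and the variable names are illustrative.

```python
import math

n_simulations_train = 114   # Box-Behnken runs used for training (Section 2)
n_simulations_test = 30     # additional runs held out for testing

# Assumption: the 80% share is rounded up to a whole simulation.
n_lssvm_train = math.ceil(0.8 * n_simulations_train)
n_optimization = n_simulations_train - n_lssvm_train
n_total = n_simulations_train + n_simulations_test
```

Each LSSVM training point contributes one support value, so `n_lssvm_train` is also the number of support values solved for in the linear system.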

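LSSVM training is complete once all parameters in Table 3 are found. At this point, the model can be applied to any unknown input vector using the RBF kernel as shown in Equation 5, where N is the number of LSSVM training data points:

$$ y(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i \exp\left(-\frac{\|\mathbf{x}-\mathbf{x}_i\|^{2}}{\sigma^{2}}\right) + b \qquad (5) $$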
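For fixed values of the regularization and kernel parameters, LSSVM training reduces to one linear solve. The sketch below is a minimal pure-Python illustration under stated assumptions: it uses the standard LSSVM regression system, an RBF kernel normalized by σ² (the exact normalization used in the study is not recoverable from the text), and toy data that are not from the paper. All function names are hypothetical.

```python
import math


def rbf_kernel(x, z, sigma2):
    """RBF kernel exp(-||x - z||^2 / sigma2); this normalization is assumed."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / sigma2)


def solve_linear(A, rhs):
    """Solve A u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):                 # back substitution
        s = sum(M[r][c] * u[c] for c in range(r + 1, n))
        u[r] = (M[r][n] - s) / M[r][r]
    return u


def lssvm_train(X, y, gamma, sigma2):
    """Return (b, alpha) from the LSSVM linear system for fixed gamma, sigma2."""
    n = len(X)
    A = [[0.0] + [1.0] * n]                        # first row: sum(alpha) = 0
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma                  # regularization on the diagonal
        A.append(row)
    solution = solve_linear(A, [0.0] + list(y))
    return solution[0], solution[1:]


def lssvm_predict(x, X, b, alpha, sigma2):
    """Kernel expansion over the training points plus the bias term."""
    return sum(a * rbf_kernel(x, xi, sigma2) for a, xi in zip(alpha, X)) + b


# Toy data (not from the paper): y = x1 + x2 on coded inputs.
X_train = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0], [0.0, 0.0]]
y_train = [-2.0, 0.0, 0.0, 2.0, 0.0]
b, alpha = lssvm_train(X_train, y_train, gamma=1000.0, sigma2=4.0)
```

In a setup like the one described in the text, an outer optimizer would call `lssvm_train` repeatedly with different values of `gamma` and `sigma2` and score each candidate on the optimization subset.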

3.3. Artificial Neural Networks (ANNs)


The Artificial Neural Networks (ANNs) algorithm was developed based on human learning processes through brain and nerve networks. This is a connectionist technique where input and output are linked through neurons. The most common feed forward architecture consists of one input layer, one or more hidden layers, and one output layer as shown in Figure 4.

Figure 4: Basic structure of ANN with input, hidden, and output layers.

Links between input and output are established through the internal computations in the hidden layers. The complexity and non-linearity of the model are increased by increasing the number of hidden layers, where the individual components of a layer are known as nodes. In this study, there are nine input nodes (eight input parameters as listed in Table 2 and one bias) connected to one hidden layer. A weight is assigned to each connection between nodes, and a bias term is added to each hidden and output node. The numbers of biases and weights used in this study are summarized for one output in Table 3. In the process of training the ANN model, all weights and biases are determined by minimizing the error between the predicted output and the training output, computed via the activation function at each node. All output data is normalized as shown in Equation 6:

$$ y_{norm} = \frac{y - \min(y)}{\max(y) - \min(y)} \qquad (6) $$

Computations in hidden nodes and output nodes are shown in Figure 5.


Figure 5: Input-to-output structure and calculations inside (a) hidden and (b) output nodes

As shown in Figure 5, the computation in each node consists of two steps: summation and transformation through an activation function, which may be linear or non-linear. In this study, the sigmoid transfer function was used as shown in Equation 7, where z is the weighted sum of the node inputs plus the bias:

$$ f(z) = \frac{1}{1 + e^{-z}} \qquad (7) $$

The sensitivity of the model to the number of hidden nodes (neurons) was also investigated. As described earlier, the non-linearity of the relationship between input and output data increases with the number of hidden nodes. However, increasing non-linearity does not always guarantee higher prediction accuracy. To find the optimum number of hidden nodes (neurons), a sensitivity study was conducted on the training and testing data for oil recovery and gas oil ratio after 5 years of production as shown in Figure 6.

Figure 6: Coefficients of determination using different numbers of hidden nodes for (a) oil recovery and (b) gas oil ratio at 5 years.

It is evident from Figure 6 that R² is close to unity for training data. On the other hand, the R² value for the test data increases initially with the number of neurons for oil recovery and gas oil ratio. A maximum R² value can be clearly identified at 14 neurons for the case of oil recovery. Therefore, 14 hidden neurons are used in this study. The ANN parameters used in this study are summarized below:

• Number of neurons in the first layer (hidden layer) = 14
• Number of neurons in the second layer (output layer) = 1
• Number of weights = Number of neurons in the first layer × Number of inputs + Number of neurons in the first layer × Number of neurons in the second layer = 14 × 8 + 14 × 1 = 126
• Number of biases = Number of neurons in the first layer + Number of neurons in the second layer = 14 + 1 = 15

The unknown parameters in the ANN structure are summarized in Table 3. During training of the

ANN, these 126 weights and 15 biases are determined using an optimization routine, namely the

Particle Swarm Optimization (PSO) which is discussed in the upcoming sections.
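The 8-14-1 network and its 141 unknowns can be sketched as a forward pass over one flat parameter vector, which is the form an optimizer such as PSO would search over. This is a minimal illustration under assumptions: the logistic sigmoid is applied at both hidden and output nodes, the ordering of weights and biases inside the flat vector is arbitrary, and the random parameters stand in for trained values. The names are hypothetical.

```python
import math
import random

N_INPUTS, N_HIDDEN = 8, 14


def sigmoid(z):
    """Logistic sigmoid, assumed here to be the transfer function used."""
    return 1.0 / (1.0 + math.exp(-z))


def unpack(params):
    """Split a flat 141-element vector into weights and biases."""
    n_w1 = N_INPUTS * N_HIDDEN                                    # 112 input-to-hidden weights
    w_hidden = [params[h * N_INPUTS:(h + 1) * N_INPUTS] for h in range(N_HIDDEN)]
    w_output = params[n_w1:n_w1 + N_HIDDEN]                       # 14 hidden-to-output weights
    b_hidden = params[n_w1 + N_HIDDEN:n_w1 + 2 * N_HIDDEN]        # 14 hidden biases
    b_output = params[n_w1 + 2 * N_HIDDEN]                        # 1 output bias
    return w_hidden, w_output, b_hidden, b_output


def forward(params, x):
    """Forward pass: weighted sum plus bias, then sigmoid, at every node."""
    w_hidden, w_output, b_hidden, b_output = unpack(params)
    hidden = [sigmoid(sum(w * xi for w, xi in zip(w_row, x)) + b)
              for w_row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_output, hidden)) + b_output)


n_params = N_INPUTS * N_HIDDEN + N_HIDDEN + N_HIDDEN + 1          # 126 weights + 15 biases
rng = random.Random(0)
params = [rng.uniform(-1.0, 1.0) for _ in range(n_params)]        # placeholder, untrained
y_normalized = forward(params, [0.0] * N_INPUTS)
```

Because the output node also passes through a sigmoid in this sketch, the prediction lies between 0 and 1 and would be mapped back to physical units by inverting the output normalization.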

3.4. Goodness of Fit

There are various error measures used in every branch of science and engineering. Their use depends mostly on the model and the purpose of the system. During the fitting portion of the model (training the model), the Mean Square Error (MSE) is set as the objective function to determine the optimized model parameters using PSO. Because a small MSE indicates a good match between experimental or simulated values and modeled values, MSE is minimized during optimization in PSO. The MSE is calculated as shown in Equation 8:

$$ MSE = \frac{1}{n}\sum_{i=1}^{n}\left(Y_{obs,i} - Y_{model,i}\right)^{2} \qquad (8) $$

Yobs and Ymodel are the simulated and modeled values, respectively, and n is the number of data sets.

In this study, the Normalized Root Mean Square Error (NRMSE) and the coefficient of determination (R²) are adopted to measure the discrepancy between simulated data and model data. The NRMSE is used instead of MSE so that the various models (time-based and rate-based) can be compared on the same scale. The coefficient of determination, R², is defined as shown in Equation 9:

$$ R^{2} = 1 - \frac{SS_{res}}{SS_{tot}} \qquad (9) $$

Where,

$SS_{res} = \sum_{i=1}^{n}\left(Y_{obs,i} - Y_{model,i}\right)^{2}$, the residual sum of squares

$SS_{tot} = \sum_{i=1}^{n}\left(Y_{obs,i} - \bar{Y}_{obs}\right)^{2}$, the total sum of squares

$\bar{Y}_{obs} = \frac{1}{n}\sum_{i=1}^{n} Y_{obs,i}$, the mean of observed values

The NRMSE is defined in Equation 10:

$$ NRMSE = \frac{\sqrt{MSE}}{Y_{obs,max} - Y_{obs,min}} \qquad (10) $$

Where Yobs,max is the maximum value and Yobs,min is the minimum value of the observed data. The value of R² varies from 0 to 1. R² values close to unity and small NRMSE values indicate a good fit.
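The three error measures can be written compactly as below. This is a minimal sketch with made-up numbers rather than results from this study; it assumes NRMSE is the root of the MSE divided by the range of the observed values, expressed as a fraction, and the function names are illustrative.

```python
import math


def mse(y_obs, y_model):
    """Mean square error between observed (simulated) and modeled values."""
    return sum((o - m) ** 2 for o, m in zip(y_obs, y_model)) / len(y_obs)


def r_squared(y_obs, y_model):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(y_obs) / len(y_obs)
    ss_res = sum((o - m) ** 2 for o, m in zip(y_obs, y_model))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot


def nrmse(y_obs, y_model):
    """Root mean square error normalized by the range of observed values."""
    return math.sqrt(mse(y_obs, y_model)) / (max(y_obs) - min(y_obs))


# Hypothetical values for illustration only.
y_obs = [2.0, 4.0, 6.0, 8.0]
y_model = [2.5, 3.5, 6.5, 7.5]
```

Multiplying the returned NRMSE by 100 gives the value as a percentage of the observed range.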
4. Optimization Routine: Particle Swarm Optimization

Inspired by the motion of bird flocks, the Particle Swarm Optimization (PSO) routine was developed by Eberhart and Kennedy [56]. In this method, each potential solution is treated as a particle. Each particle is characterized by its position and velocity. The position of a particle is defined in a hyperspace whose dimension is equal to the number of unknown parameters being optimized as shown in Table 3. For example, in the case of ANN, particles fly in a 141-dimensional hyperspace. Several particles are initially defined in the hyperspace, where they iteratively change their positions to determine the optimum position. The fitness of a particle is determined by a fitness function such as the MSE. This algorithm is similar to the method followed by a flock of birds searching for food in a vast area.

Two solutions, pbest and gbest, are tracked at every iteration during execution of the algorithm. The local best, or pbest, is the best position found so far by an individual particle as determined by its fitness value. The global best, or gbest, is the best position found so far by any particle in the population. At each iteration step, the velocity is updated first and then the position. The particle is accelerated towards its pbest and gbest by updating its velocity with two separate random numbers (random 1 and random 2) as shown in Equation 11:

$$ v_i^{k+1} = w_i v_i^{k} + C_1 \, random_1 \left(p_{best,i} - x_i^{k}\right) + C_2 \, random_2 \left(g_{best} - x_i^{k}\right) \qquad (11) $$

The cognitive component guides the local search of a particle from its local best (pbest) and the social component is responsible for the global search depending on the population best (gbest) [57]. In Equation 11, $v_i^{k+1}$ is the velocity in the next iteration step, which is partially preserved from the current velocity by an inertia weight, wi (range 0.4 to 0.9). The acceleration coefficients (C1 and C2) for the cognitive and social components are chosen by trial and error. The position of a particle for the next iteration step is updated by the following equation:

$$ x_i^{k+1} = x_i^{k} + v_i^{k+1} \qquad (12) $$

The values of wi, C1, C2, and other parameters used in this study are given in Table 4.

Table 4: Particle swarm optimization parameters in various surrogate models

Surrogate  Parameters optimized               Number of particles      C1  C2  w    Maximum
model                                         for a single parameter                iteration
RSM        1 intercept, 44 coefficients       100                      2   2   0.6  1000
LSSVM      1 regularization parameter (γ),    100                      2   2   0.6  1000
           1 kernel parameter (σ)
ANN        15 biases, 126 weights             100                      2   2   0.6  1000

The initial position and velocity of each particle are randomly distributed. After the initialization of positions and velocities of all particles, fitness is calculated. In subsequent steps, positions and velocities are updated iteratively using the local best and the global best parameters as summarized in the flowchart shown in Figure 7 [58].

Figure 7: Particle Swarm Optimization flow chart. Modified from Ahmadi et al. [58]

The entire flowchart can be divided into four parts, namely, initialization, fitness evaluation, condition check, and updates of velocity and position. Acceptance of any particle as a potential solution is determined by its fitness value, which is calculated in each iteration step. As described earlier, one local best, pbest, and one global best, gbest, are recorded during each iteration. The number of iterations is only limited by time and computational constraints; hence the maximum iteration number is defined by the user.
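The four parts of the flowchart (initialization, fitness evaluation, condition check, and velocity and position updates) can be sketched as a basic global-best PSO. This is a minimal illustration, not the Matlab routine used in the study: the velocity clamp, the search bounds, the seed, and the smaller default swarm size and iteration count are assumptions added to keep the example short and well behaved, and the sphere function stands in for the MSE objective. The inertia weight and acceleration coefficients follow Table 4.

```python
import random


def pso_minimize(fitness, dim, n_particles=30, max_iter=200,
                 w=0.6, c1=2.0, c2=2.0, bound=5.0, seed=0):
    """Minimize fitness(position) with a basic global-best PSO."""
    rng = random.Random(seed)
    v_max = bound  # velocity clamp: an added safeguard, not from the paper
    # Initialization: random positions and velocities.
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]
    for _ in range(max_iter):            # condition check: fixed iteration budget
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive + social components.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                pos[i][d] += vel[i][d]   # position update
            fit = fitness(pos[i])        # fitness evaluation
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history


def sphere(p):
    """Simple test objective standing in for the MSE of a surrogate model."""
    return sum(x * x for x in p)


best_pos, best_fit, history = pso_minimize(sphere, dim=3)
```

The `history` list records the global best fitness after each iteration; it can only decrease or stay flat, which makes it a convenient convergence check when choosing the maximum iteration number.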

5. Results and Discussion

Five time-based models (90 days, 1 year, 5 years, 10 years and 15 years) and one rate-based model (5 bbl/day/fracture) were trained using RSM, LSSVM and ANN. The objective of this study is to compare the performance of the three surrogate models. Production performance in terms of oil recovery and gas-oil ratio is compared with simulation data. Since all time-based models behave similarly, only two time-based models (one early production model after 90 days and one long production model after 10 years) along with one rate-based model are discussed here. The fitness of each model with respect to both training data and test data is discussed in this section. Once the model is trained, it is tested against an unknown data set (i.e., test data) to check the robustness of its forecasting capabilities. As discussed earlier, the fitness is determined by two measures, the coefficient of determination (R²) and the normalized root mean square error (NRMSE), which are used here to compare different surrogate models.

5.1. Training
In this section, model fitness as compared with training data is evaluated and discussed. It is important to assess individual models to check for overfitting. Three surrogate models (RSM, LSSVM, and ANN) for oil recovery and gas oil ratio after 90 days, after 10 years of production, and at a terminal rate (5 bbl/day/fracture) are shown in Figure 8.

Figure 8: Comparison of RSM, ANN and LSSVM models using training data for (a) oil recovery after 90 days, (b) oil recovery after 10 years, (c) oil recovery after oil rate drops to 5 bbl/day/fracture, (d) gas oil ratio after 90 days, (e) gas oil ratio after 10 years, and (f) gas oil ratio after oil rate drops to 5 bbl/day/fracture

High model fitness with respect to training data can be observed for all cases in Figure 8. However, capturing production behavior using surrogate models without accounting for the underlying physics is a great challenge. It is difficult to model the GOR from low permeability reservoirs [59] due to its complex behavior. Higher GOR values (obtained from the 10-year model) occur mainly when flow becomes boundary dominated. On the other hand, lower GOR values (obtained from the 90-day and 1-year models) occur when flow is in the transient linear regime. Overall, both oil recovery and GOR from surrogate models are in good agreement with simulation results. The errors are calculated in terms of R² and NRMSE as listed in Tables A.1 and A.2. For visual comparison, R² and NRMSE for the RSM, ANN and LSSVM models of oil recovery are shown in Figures 9a and 9b, respectively.

Figure 9: Fitness of RSM, ANN and LSSVM models of oil recovery for training data: (a) coefficient of determination, R², and (b) NRMSE

R² and NRMSE for oil recovery using RSM, LSSVM and ANN for all time- and rate-based models are greater than 0.95 and less than 6%, respectively. These values are evidence of well-trained models.

5.2. Testing

Some models may have the tendency to overfit with training data and consequently fail to predict

unseen test data with high accuracy. In this study, 20 percent of all data was used to check the

forecasting capabilities of all developed models. Results for oil recovery and GOR after 90 days,

PT
10 years production, and at terminal oil rate (5 bbl/day/fracture) are shown in Figure 10.

RI
U SC
AN
M

(a) (d)
D
TE
C EP
AC

(b) (e)
ACCEPTED MANUSCRIPT

PT
RI
SC
(c) (f)
Figure 10: Comparison of RSM, ANN and LSSVM models using test data for (a) Oil recovery
after 90 days, (b) Oil recovery after 10 years, (c) Oil recovery after oil rate drops to 5

U
bbl/day/fracture, (d), Gas Oil Ratio after 90 days (e), Gas Oil Ratio after 10 years (f), and Gas
Oil Ratio after the oil rate drops to 5 bbl/day/fracture
Although all models fit the training data closely, their fitness is lower on the test data.
Considering that the test data were not used during training, the models show promising
forecasting capabilities without significant aberrations. R2 and NRMSE for the test data are
listed in Tables A.1 and A.2, and the values for the RSM, ANN, and LSSVM oil recovery models
are shown in Figures 11a and 11b, respectively.
Figure 11: Fitness of RSM, ANN and LSSVM oil recovery models for testing data: (a) coefficient
of determination, R2, and (b) NRMSE
Except for a few cases, the forecast accuracy of all models is within acceptable ranges. As shown
in Figures 10 and 11, RSM predicts oil recovery most accurately, followed by LSSVM.
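
The ordering stated above can be checked against the test-data R2 values for oil recovery in Table A.1. The snippet below is a minimal sketch that averages those six values per method. The simple arithmetic mean is an assumption made here for illustration and is not a ranking procedure described in the study.

```python
# Test-data R2 for oil recovery, copied from Table A.1 in the order
# 90 days, 1 year, 5 years, 10 years, 15 years, rate based.
TEST_R2_OIL = {
    "RSM": [0.69, 0.78, 0.63, 0.91, 0.97, 0.57],
    "LSSVM": [0.52, 0.69, 0.81, 0.90, 0.93, 0.54],
    "ANN": [0.51, 0.53, 0.60, 0.72, 0.84, 0.48],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Average R2 per method (an illustrative summary, not the paper's metric).
mean_r2 = {name: mean(vals) for name, vals in TEST_R2_OIL.items()}

# Methods sorted from highest to lowest mean test R2.
ranking = sorted(mean_r2, key=mean_r2.get, reverse=True)

for name in ranking:
    print(f"{name}: mean test R2 = {mean_r2[name]:.3f}")
```

The per-case values also show the exceptions mentioned in the text: in the 5-year case, for example, LSSVM (0.81) has a higher test R2 than RSM (0.63).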

As shown in Figures 10 and 11, AI tools have the potential to predict oil recoveries and fluid
ratios given a small training data set. Large amounts of completion, geological, and production
data can indeed be used to train more robust models and complement current conventional tools
to evaluate the potential of tight oil reservoirs. The cost, however, is that AI skips the physical
description and understanding of the multiphase production mechanisms in tight formations. This
cost may not be too high to pay, since the conventional understanding of these systems may not
yet be sufficiently developed. In fact, researchers have recently reported discrepancies between
conventional thinking and the behavior of fluids under nanoconfinement in tight
formations [60, 61].

6. Conclusion

Artificial intelligence tools that predict oil recovery and gas-oil ratio from hydraulically
fractured tight formations can be successfully developed using simulation results as a
training framework. In this study, three models were developed based on RSM, ANN, and
LSSVM to predict recovery from wells producing under time-based (90 days, 1 year, 5 years,
10 years, and 15 years) and rate-based (5 bbl/day/fracture) constraints. Eight key factors,
namely matrix permeability, gas relative permeability exponent, rock compressibility, initial
gas-oil ratio, slope of solution gas-oil ratio versus pressure, initial pressure, flowing
bottom-hole pressure, and fracture spacing, were considered as input parameters for all cases.
After all models were trained with the same database, they were used to predict production for
different scenarios. Using simulation as the basis for comparison, all models were evaluated in
terms of their predictive capabilities for oil recovery and producing gas-oil ratio. It was found
that RSM and LSSVM have better predictive capabilities for oil recovery than ANN. In addition,
LSSVM exhibits the highest accuracy with respect to gas-oil ratio prediction.

Field-scale modeling and simulation of hydraulically fractured ultra-low permeability reservoirs
incur a very high computational cost in commercial simulators. Surrogate reservoir
models, on the other hand, are useful for quick oil production forecasting and assessment.
Additionally, these models can be used for risk and uncertainty analysis. Overall, artificial
intelligence methods such as LSSVM have promising applications in various aspects of
production and reservoir engineering.
Nomenclature

Symbol      Description                                         Units
γ           Regularization Parameter                            -
σ2          Kernel Parameter                                    -
αi          Support Values                                      Unit of Output
Ȳobs        Mean Of Observed Values                             Unit of Output
a0          The Intercept Of The Surrogate Model                Unit of Output
AI          Artificial Intelligence                             -
aij         Coefficient Of 2nd Order Interaction Of Inputs      -
ak          Coefficient Of Independent Input                    -
ANN         Artificial Neural Networks                          -
b           Bias Term                                           Unit of Output
BHP         Bottom Hole Pressure                                psi
C1          Acceleration Coefficient For Cognitive Components
C2          Acceleration Coefficient For Social Components
DOE         Design Of Experiments
dRs/dp      Slope Of Gas/Oil Ratio In PVT                       (SCF/STB)/psi
gbest       Population's Best Particle's Position
GOR         Gas/Oil Ratio                                       SCF/STB
LSSVM       Least Square Support Vector Machine                 -
MSE         Mean Square Error                                   Unit of Output
ng          Exponent Of Relative Permeability Curve For Gas     -
NRMSE       Normalized Root Mean Square Error                   -
pbest       Particle's Best Position
Pi          Initial Reservoir Pressure                          psi
PSO         Particle Swarm Optimization                         -
PVT         Pressure-Volume-Temperature                         -
R2          Coefficient Of Determination                        -
Rsi         Initial Gas/Oil Ratio                               SCF/STB
RSM         Response Surface Model                              -
SSres       Residual Sum Of Squares                             Unit of Output
SStot       Total Sum Of Squares                                Unit of Output
SVM         Support Vector Machine                              -
vk          Velocity Of Particle                                -
wi          Inertia Weight                                      -
Ymodel,i    Modeled Value                                       Unit of Output
Yobs,i      Observed Data                                       Unit of Output
Yobs,max    The Maximum Value Of Observed Data                  Unit of Output
Yobs,min    The Minimum Value Of Observed Data                  Unit of Output

References

[1]. Yeten, B., A. Castellini, B. Guyaguler, and W.H. Chen. A Comparison Study on
Experimental Design and Response Surface Methodologies. in SPE Reservoir Simulation
Symposium. The Woodlands, Texas: Society of Petroleum Engineers Inc. (2005).
[2]. Amorim, T.C.A.D. and D.J. Schiozer. Risk Analysis Speed-Up With Surrogate Models. in
SPE Latin America and Caribbean Petroleum Engineering Conference. Mexico City,
Mexico: Society of Petroleum Engineers (2012).
[3]. Li, B. and F. Friedmann. A Novel Response Surface Methodology Based on "Amplitude
Factor" Analysis for Modeling Nonlinear Responses Caused by Both Reservoir and
Controllable Factors. in SPE Annual Technical Conference and Exhibition. Dallas,
Texas: Society of Petroleum Engineers (2005).
[4]. Peng, C.Y. and R. Gupta. Experimental Design in Deterministic Modelling: Assessing
Significant Uncertainties. in SPE Asia Pacific Oil and Gas Conference and Exhibition.
Jakarta, Indonesia: Society of Petroleum Engineers (2003).
[5]. Dejean, J.P. and G. Blanc. Managing Uncertainties on Production Predictions Using
Integrated Statistical Methods. in SPE Annual Technical Conference and Exhibition.
Houston, Texas: Society of Petroleum Engineers (1999).
[6]. Corre, B., P. Thore, V.d. Feraudy, and G. Vincent. Integrated Uncertainty Assessment
For Project Evaluation and Risk Analysis. in SPE European Petroleum Conference.
Paris, France: Society of Petroleum Engineers Inc. (2000).
[7]. Manceau, E., M. Mezghani, I. Zabalza-Mezghani, and F. Roggero. Combination of
Experimental Design and Joint Modeling Methods for Quantifying the Risk Associated
With Deterministic and Stochastic Uncertainties - An Integrated Test Study. in SPE
Annual Technical Conference and Exhibition. New Orleans, Louisiana: Society of
Petroleum Engineers Inc. (2001).
[8]. Venkataraman, R. Application of the Method of Experimental Design to Quantify
Uncertainty in Production Profiles. in SPE Asia Pacific Conference on Integrated
Modelling for Asset Management. Yokohama, Japan: Society of Petroleum Engineers Inc.
(2000).
[9]. Chewaroungroaj, J., O.J. Varela, and L.W. Lake. An Evaluation of Procedures to
Estimate Uncertainty in Hydrocarbon Recovery Predictions. in SPE Asia Pacific
Conference on Integrated Modelling for Asset Management. Yokohama, Japan:
Society of Petroleum Engineers Inc. (2000).
[10]. Mohaghegh, S.D. Quantifying Uncertainties Associated With Reservoir Simulation
Studies Using a Surrogate Reservoir Model. in SPE Annual Technical Conference and
Exhibition. San Antonio, Texas, USA: Society of Petroleum Engineers (2006).
[11]. Guyaguler, B. and R.N. Horne. Uncertainty Assessment of Well Placement Optimization.
in SPE Annual Technical Conference and Exhibition. New Orleans, Louisiana:
Society of Petroleum Engineers Inc. (2001).
[12]. Manceau, E., F. Roggero, and I. Zabalza-Mezghani. Use Of Experimental Design
Methodology To Make Decisions In An Uncertain Reservoir Environment From
Reservoir Uncertainties To Economic Risk Analysis. World Petroleum Congress (2002).
[13]. Landa, J.L. and B. Güyagüler. A Methodology for History Matching and the Assessment
of Uncertainties Associated with Flow Prediction. in SPE Annual Technical Conference
and Exhibition. Denver, Colorado: Society of Petroleum Engineers (2003).
[14]. Carreras, P.E., S.E. Turner, and G.T. Wilkinson. Tahiti: Development Strategy
Assessment Using Design of Experiments and Response Surface Methods. in SPE
Western Regional/AAPG Pacific Section/GSA Cordilleran Section Joint Meeting.
Anchorage, Alaska, USA: Society of Petroleum Engineers (2006).
[15]. Yang, C., L.X. Nghiem, C. Card, and M. Bremeier. Reservoir Model Uncertainty
Quantification Through Computer-Assisted History Matching. in SPE Annual Technical
Conference and Exhibition. Anaheim, California, U.S.A.: Society of Petroleum Engineers
(2007).
[16]. Slotte, P.A. and E. Smorgrav. Response Surface Methodology Approach for History
Matching and Uncertainty Assessment of Reservoir Simulation Models. in
Europec/EAGE Conference and Exhibition. Rome, Italy: Society of Petroleum Engineers
(2008).
[17]. Ahmadi, M.A., R. Soleimani, and A. Bahadori, A computational intelligence scheme for
prediction equilibrium water dew point of natural gas in TEG dehydration systems. Fuel,
137 (2014). p. 145-154.
[18]. Mohaghegh, S.D., J.S. Liu, R. Gaskari, M. Maysami, and O.A. Olukoko. Application of
Well-Base Surrogate Reservoir Models (SRMs) to Two Offshore Fields in Saudi Arabia,
Case Study. in SPE Western Regional Meeting. Bakersfield, California, USA: Society of
Petroleum Engineers (2012).
[19]. Velasco, R., P. Panja, and M. Deo. New Production Performance and Prediction Tool for
Unconventional Reservoirs, URTEC-2461718-MS. in Unconventional Resources
Technology Conference, 1-3 August. San Antonio, Texas, USA: Unconventional
Resources Technology Conference (2016).
[20]. Patzek, T.W., F. Male, and M. Marder, Gas production in the Barnett Shale obeys a
simple scaling theory. Proceedings of the National Academy of Sciences, 110(49) (2013).
p. 19731-19736.
[21]. Wattenbarger, R.A., A.H. El-Banbi, M.E. Villegas, and J.B. Maggard. Production
Analysis of Linear Flow Into Fractured Tight Gas Wells, SPE-39931-MS. in SPE Rocky
Mountain Regional/Low-Permeability Reservoirs Symposium, 5-8 April. Denver,
Colorado, USA: Society of Petroleum Engineers (1998).
[22]. Nobakht, M., L. Mattar, S. Moghadam, and D.M. Anderson, Simplified Forecasting of
Tight/Shale-Gas Production in Linear Flow. Journal of Canadian Petroleum Technology,
51(06) (2012). p. 11.
[23]. Al-Kaabi, A.U. and W.J. Lee, Using Artificial Neural Networks To Identify the Well Test
Interpretation Model (includes associated papers 28151 and 28165). 8(03) (1993).
[24]. Juniardi, I.R. and I. Ershaghi. Complexities of Using Neural Network in Well Test
Analysis of Faulted Reservoirs. Society of Petroleum Engineers (1993).
[25]. Zhou, C.D., X.-L. Wu, and J.-A. Cheng. Determining Reservoir Properties in Reservoir
Studies Using a Fuzzy Neural Network. in SPE Annual Technical Conference and
Exhibition. Houston, Texas: Society of Petroleum Engineers (1993).
[26]. Mohaghegh, S., R. Arefi, S. Ameri, and M.H. Hefner. A Methodological Approach for
Reservoir Heterogeneity Characterization Using Artificial Neural Networks. in SPE
Annual Technical Conference and Exhibition. New Orleans, Louisiana: Society of
Petroleum Engineers (1994).
[27]. E. El-Sebakhy, T.S., S. Al-Bokhitan, Y. Shaaban, I. Raharja, Y. Khaeruzzaman. Support
Vector Machines Framework for Predicting the PVT Properties of Crude-Oil Systems.
Kingdom of Bahrain: 15th SPE Middle East Oil & Gas Show and Conference (2007).
[28]. Oloso, M., A. Khoukhi, A. Abdulraheem, and M. Elshafei. Prediction of Crude Oil
Viscosity and Gas/Oil Ratio Curves Using Recent Advances to Neural Networks. in
SPE/EAGE Reservoir Characterization and Simulation Conference. Abu Dhabi, UAE:
Society of Petroleum Engineers (2009).
[29]. Rabiei, A., H. Sayyad, M. Riazi, and A. Hashemi, Determination of dew point pressure in
gas condensate reservoirs based on a hybrid neural genetic algorithm. Fluid Phase
Equilibria, 387 (2015). p. 38-49.
[30]. Ahmadi, M.A. and M. Ebadi, Evolving smart approach for determination dew point
pressure through condensate gas reservoirs. Fuel, 117 (2014). p. 1074-1084.
[31]. Ahmadi, M.A., M. Ebadi, and A. Yazdanpanah, Robust intelligent tool for estimating dew
point pressure in retrograded condensate gas reservoirs: Application of particle swarm
optimization. Journal of Petroleum Science and Engineering, 123 (2014). p. 7-19.
[32]. Ahmadi, M.A., M. Ebadi, P.S. Marghmaleki, and M.M. Fouladi, Evolving predictive
model to determine condensate-to-gas ratio in retrograded condensate gas reservoirs.
Fuel, 124 (2014). p. 241-257.
[33]. Ahmadi, M.A., Neural network based unified particle swarm optimization for prediction
of asphaltene precipitation. Fluid Phase Equilibria, 314 (2012). p. 46-51.
[34]. Ahmadi, M.A. and S.R. Shadizadeh, New approach for prediction of asphaltene
precipitation due to natural depletion by using evolutionary algorithm concept. Fuel, 102
(2012). p. 716-723.
[35]. Ahmadi, M.A., M. Ebadi, A. Shokrollahi, and S.M.J. Majidi, Evolving artificial neural
network and imperialist competitive algorithm for prediction oil flow rate of the
reservoir. Applied Soft Computing, 13(2) (2013). p. 1085-1098.
[36]. Fatai Adesina Anifowose, A.A. Prediction of Porosity and Permeability of Oil and Gas
Reservoirs using Hybrid Computational Intelligence Models. Cairo, Egypt: North Africa
Technical Conference and Exhibition, SPE (2010).
[37]. Fatai Adesina Anifowose, A.O.E., Safiriyu Ijiyemi. Prediction of Oil and Gas Reservoir
Properties using Support Vector Machines. Bangkok, Thailand: International Petroleum
Technology Conference, (2011).
[38]. Ammal F. Al-anazi, G., Ian D, Support-Vector Regression for Permeability Prediction in
a Heterogeneous Reservoir: A Comparative Study. SPE Reservoir Evaluation &
Engineering, 13(03) (2010).
[39]. Mohammad-Ali Ahmadi, M.R.A., Seyed Moein Hosseini, Mohammad Ebadi,
Connectionist model predicts the porosity and permeability of petroleum reservoirs by
means of petro-physical logs: Application of artificial intelligence. Journal of Petroleum
Science and Engineering, 123 (2014). p. 183-200.
[40]. Mohammad-Ali Ahmadi, A.B., A LSSVM approach for determining well placement and
conning phenomena in horizontal wells. Fuel, 153 (2015). p. 276-283.
[41]. Mohammad Ali Ahmadi, M.E., Seyed Moein Hosseini, Prediction breakthrough time of
water coning in the fractured reservoirs by implementing low parameter support vector
machine approach. Fuel, 117 (2014). p. 579-589.
[42]. Ahmadi, M.A., Connectionist approach estimates gas–oil relative permeability in
petroleum reservoirs: Application to reservoir simulation. Fuel, 140 (2015). p. 429-439.
[43]. Eslamimanesh, A., F. Gharagheizi, M. Illbeigi, A.H. Mohammadi, A. Fazlali, and D.
Richon, Phase equilibrium modeling of clathrate hydrates of methane, carbon dioxide,
nitrogen, and hydrogen + water soluble organic promoters using Support Vector
Machine algorithm. Fluid Phase Equilibria, 316 (2012). p. 34-45.
[44]. Reza Gholgheysari Gorjaei, R.S., Mohammad Torkaman, Mohsen Safari, Ghassem
Zargar, A novel PSO-LSSVM model for predicting liquid rate of two phase flow through
wellhead chokes. Journal of Natural Gas Science and Engineering, 24 (2015). p. 228-237.
[45]. Ahmadi, M.-A., M.Z. Hasanvand, and A. Bahadori, A least-squares support vector
machine approach to predict temperature drop accompanying a given pressure drop for
the natural gas production and processing systems. International Journal of Ambient
Energy, 38(2) (2015). p. 122-129.
[46]. Ahmadi, M.A., M. Masoumi, and R. Askarinezhad, Evolving Connectionist Model to
Monitor the Efficiency of an In Situ Combustion Process: Application to Heavy Oil
Recovery. Energy Technology, 2(9-10) (2014). p. 811-818.
[47]. Ahmadi, M.-A., M. Masumi, R. Kharrat, and A.H. Mohammadi, Gas Analysis by In Situ
Combustion in Heavy-Oil Recovery Process: Experimental and Modeling Studies.
Chemical Engineering & Technology, 37(3) (2014). p. 409-418.
[48]. Ahmadi, M.A., M. Masoumi, and R. Askarinezhad, Evolving Smart Model to Predict the
Combustion Front Velocity for In Situ Combustion. Energy Technology, 3(2) (2015). p.
128-135.
[49]. Ahmadi, M.A., M. Zahedzadeh, S.R. Shadizadeh, and R. Abbassi, Connectionist model
for predicting minimum gas miscibility pressure: Application to gas injection process.
Fuel, 148 (2015). p. 202-211.
[50]. Ali Ahmadi, M. and A. Ahmadi, Applying a sophisticated approach to predict
CO2 solubility in brines: application to CO2 sequestration. International Journal of Low-
Carbon Technologies, 11(3) (2016). p. 325-332.
[51]. Ahmadi, M.-A., A. Bahadori, and S.R. Shadizadeh, A rigorous model to predict the
amount of Dissolved Calcium Carbonate Concentration throughout oil field brines: Side
effect of pressure and temperature. Fuel, 139 (2015). p. 154-159.
[52]. Panja, P., T. Conner, and M. Deo, Factors Controlling Production in Hydraulically
Fractured Low Permeability Oil Reservoirs. International Journal of Oil, Gas and Coal
Technology, 3(1) (2015). p. 18.
[53]. Box, G.E.P. and D.W. Behnken, Some New Three Level Designs for the Study of
Quantitative Variables. Technometrics, 2(4) (1960). p. 455-475.
[54]. Panja, P., T. Conner, and M. Deo, Grid sensitivity studies in hydraulically fractured low
permeability reservoirs. Journal of Petroleum Science and Engineering, 112(0) (2013). p.
78-87.
[55]. Panja, P., M. Pathak, R. Velasco, and M. Deo. Least Square Support Vector Machine: An
Emerging Tool for Data Analysis. in SPE Low Perm Symposium, 5-6 May. Denver,
Colorado: Society of Petroleum Engineers (2016).
[56]. Eberhart, R. and J. Kennedy. A new optimizer using particle swarm theory. in Micro
Machine and Human Science, 1995. MHS '95., Proceedings of the Sixth International
Symposium on. (1995).
[57]. Banerjee, C. and R. Sawal. PSO with dynamic acceleration Coefficient based on Mutiple
Constraint Satisfaction. in International Conference on Advances in Electronics
Computers and Communications. Bangalore, India (2014).
[58]. Ahmadi, M.A., R. Soleimani, M. Lee, T. Kashiwao, and A. Bahadori, Determination of
oil well production performance using artificial neural network (ANN) linked to the
particle swarm optimization (PSO) tool. Petroleum, 1(2) (2015). p. 118-132.
[59]. Panja, P. and M. Deo, Unusual behavior of produced gas oil ratio in low permeability
fractured reservoirs. Journal of Petroleum Science and Engineering, 144 (2016). p. 76-
83.
[60]. Pathak, M., H. Kweon, P. Panja, R. Velasco, and M.D. Deo. Suppression in the Bubble
Points of Oils in Shales Combined Effect of Presence of Organic Matter and
Confinement. in SPE Unconventional Resources Conference, 15-16 February. Calgary,
Alberta, Canada: Society of Petroleum Engineers (2017).
[61]. Velasco, R., M. Pathak, P. Panja, and M. Deo. What Happens to Permeability at the
Nanoscale? A Molecular Dynamics Simulation Study. in SPE/AAPG/SEG
Unconventional Resources Technology Conference, 24-26 July. Austin, Texas, USA:
Unconventional Resources Technology Conference (2017).
EP

Appendix A: Supplementary Information


Table A.1: Coefficient of determination (R2) of RSM, LSSVM and ANN for all models
Output         Model        Training Data            Test Data
                            RSM    LSSVM  ANN        RSM    LSSVM  ANN
Oil Recovery   90 days      0.99   0.99   0.96       0.69   0.52   0.51
Oil Recovery   1 year       0.98   0.99   0.98       0.78   0.69   0.53
Oil Recovery   5 years      0.99   0.99   0.99       0.63   0.81   0.60
Oil Recovery   10 years     0.99   0.99   0.98       0.91   0.90   0.72
Oil Recovery   15 years     0.99   0.99   0.99       0.97   0.93   0.84
Oil Recovery   Rate Based   0.98   0.98   0.99       0.57   0.54   0.48
Gas Oil Ratio  90 days      0.98   0.99   0.95       0.92   0.84   0.80
Gas Oil Ratio  1 year       0.98   0.98   0.96       0.93   0.91   0.90
Gas Oil Ratio  5 years      0.98   0.98   0.97       0.41   0.73   0.30
Gas Oil Ratio  10 years     0.88   0.92   0.93       0.76   0.73   0.46
Gas Oil Ratio  15 years     0.83   0.77   0.84       0.79   0.75   0.32
Gas Oil Ratio  Rate Based   0.84   0.88   0.92       0.68   0.45   0.43

Table A.2: Normalized Root Mean Square Error (NRMSE, %) of RSM, LSSVM and ANN for all
models
Output         Model        Training Data            Test Data
                            RSM    LSSVM  ANN        RSM    LSSVM  ANN
Oil Recovery   90 days      1.9    1.9    3.5        16.5   20.3   20.7
Oil Recovery   1 year       2.4    2.3    2.5        12.4   14.7   18.1
Oil Recovery   5 years      2.0    1.9    2.1        16.1   11.5   16.7
Oil Recovery   10 years     1.9    1.7    2.6        7.9    8.5    14.0
Oil Recovery   15 years     2.7    2.4    2.1        4.9    7.3    10.8
Oil Recovery   Rate Based   3.5    3.3    2.4        20.7   21.2   22.6
Gas Oil Ratio  90 days      2.6    2.0    4.6        8.7    11.8   13.3
Gas Oil Ratio  1 year       3.3    3.3    4.2        7.9    9.3    9.7
Gas Oil Ratio  5 years      3.0    3.1    3.7        24.0   16.1   26.2
Gas Oil Ratio  10 years     5.7    4.6    4.3        16.1   17.2   24.3
Gas Oil Ratio  15 years     5.8    6.8    5.6        14.4   15.5   25.7
Gas Oil Ratio  Rate Based   5.2    4.5    3.8        14.1   18.4   18.8

Figure A.1: Training-data comparison of RSM, ANN and LSSVM models for (a) oil recovery
after 1 year, (b) oil recovery after 5 years, (c) oil recovery after 15 years, (d) gas-oil ratio
after 1 year, (e) gas-oil ratio after 5 years, and (f) gas-oil ratio after 15 years

Table A.3: List of simulations using Box-Behnken DOE used to train surrogate models
Sr. No.  Km (nD)  ng (-)  Cf (1/psi)  dRs/dp ((SCF/STB)/psi)  Rsi (SCF/STB)  Pi (psi)  Pwf (psi)  Xf (ft)
1 10 1 4.00E-05 0.65 1900 5250 1000 180
2 10 3 4.00E-05 0.65 1900 5250 1000 180
3 5000 1 4.00E-05 0.65 1900 5250 1000 180
4 5000 3 4.00E-05 0.65 1900 5250 1000 180
5 10 2 4.00E-06 0.65 1900 5250 1000 180
6 10 2 4.00E-04 0.65 1900 5250 1000 180
7 5000 2 4.00E-06 0.65 1900 5250 1000 180
8 5000 2 4.00E-04 0.65 1900 5250 1000 180
9 10 2 4.00E-05 0.5 1900 5250 1000 180
10 10 2 4.00E-05 0.8 1900 5250 1000 180
11 5000 2 4.00E-05 0.5 1900 5250 1000 180
12 5000 2 4.00E-05 0.8 1900 5250 1000 180
13 10 2 4.00E-05 0.65 800 5250 1000 180
14 10 2 4.00E-05 0.65 3000 5250 1000 180
15 5000 2 4.00E-05 0.65 800 5250 1000 180
16 5000 2 4.00E-05 0.65 3000 5250 1000 180
17 10 2 4.00E-05 0.65 1900 4000 1000 180
18 10 2 4.00E-05 0.65 1900 6500 1000 180
19 5000 2 4.00E-05 0.65 1900 4000 1000 180
20 5000 2 4.00E-05 0.65 1900 6500 1000 180
21 10 2 4.00E-05 0.65 1900 5250 500 180
22 10 2 4.00E-05 0.65 1900 5250 1500 180
23 5000 2 4.00E-05 0.65 1900 5250 500 180
24 5000 2 4.00E-05 0.65 1900 5250 1500 180
25 10 2 4.00E-05 0.65 1900 5250 1000 60
26 10 2 4.00E-05 0.65 1900 5250 1000 300
27 5000 2 4.00E-05 0.65 1900 5250 1000 60
28 5000 2 4.00E-05 0.65 1900 5250 1000 300
29 225 1 4.00E-06 0.65 1900 5250 1000 180
30 225 1 4.00E-04 0.65 1900 5250 1000 180
31 225 3 4.00E-06 0.65 1900 5250 1000 180
32 225 3 4.00E-04 0.65 1900 5250 1000 180
33 225 1 4.00E-05 0.5 1900 5250 1000 180
34 225 1 4.00E-05 0.8 1900 5250 1000 180
35 225 3 4.00E-05 0.5 1900 5250 1000 180
36 225 3 4.00E-05 0.8 1900 5250 1000 180
37 225 1 4.00E-05 0.65 800 5250 1000 180
38 225 1 4.00E-05 0.65 3000 5250 1000 180
39 225 3 4.00E-05 0.65 800 5250 1000 180
40 225 3 4.00E-05 0.65 3000 5250 1000 180
41 225 1 4.00E-05 0.65 1900 4000 1000 180
42 225 1 4.00E-05 0.65 1900 6500 1000 180
43 225 3 4.00E-05 0.65 1900 4000 1000 180
44 225 3 4.00E-05 0.65 1900 6500 1000 180
45 225 1 4.00E-05 0.65 1900 5250 500 180
46 225 1 4.00E-05 0.65 1900 5250 1500 180
47 225 3 4.00E-05 0.65 1900 5250 500 180
48 225 3 4.00E-05 0.65 1900 5250 1500 180
49 225 1 4.00E-05 0.65 1900 5250 1000 60
50 225 1 4.00E-05 0.65 1900 5250 1000 300
51 225 3 4.00E-05 0.65 1900 5250 1000 60
52 225 3 4.00E-05 0.65 1900 5250 1000 300
53 225 2 4.00E-06 0.5 1900 5250 1000 180
54 225 2 4.00E-06 0.8 1900 5250 1000 180
55 225 2 4.00E-04 0.5 1900 5250 1000 180
56 225 2 4.00E-04 0.8 1900 5250 1000 180
57 225 2 4.00E-06 0.65 800 5250 1000 180
58 225 2 4.00E-06 0.65 3000 5250 1000 180
59 225 2 4.00E-04 0.65 800 5250 1000 180
60 225 2 4.00E-04 0.65 3000 5250 1000 180
61 225 2 4.00E-06 0.65 1900 4000 1000 180
62 225 2 4.00E-06 0.65 1900 6500 1000 180
63 225 2 4.00E-04 0.65 1900 4000 1000 180
64 225 2 4.00E-04 0.65 1900 6500 1000 180
65 225 2 4.00E-06 0.65 1900 5250 500 180
66 225 2 4.00E-06 0.65 1900 5250 1500 180
67 225 2 4.00E-04 0.65 1900 5250 500 180
68 225 2 4.00E-04 0.65 1900 5250 1500 180
69 225 2 4.00E-06 0.65 1900 5250 1000 60
70 225 2 4.00E-06 0.65 1900 5250 1000 300
71 225 2 4.00E-04 0.65 1900 5250 1000 60
72 225 2 4.00E-04 0.65 1900 5250 1000 300
73 225 2 4.00E-05 0.5 800 5250 1000 180
74 225 2 4.00E-05 0.5 3000 5250 1000 180
75 225 2 4.00E-05 0.8 800 5250 1000 180
76 225 2 4.00E-05 0.8 3000 5250 1000 180
77 225 2 4.00E-05 0.5 1900 4000 1000 180
78 225 2 4.00E-05 0.5 1900 6500 1000 180
79 225 2 4.00E-05 0.8 1900 4000 1000 180
80 225 2 4.00E-05 0.8 1900 6500 1000 180
81 225 2 4.00E-05 0.5 1900 5250 500 180
82 225 2 4.00E-05 0.5 1900 5250 1500 180
83 225 2 4.00E-05 0.8 1900 5250 500 180
84 225 2 4.00E-05 0.8 1900 5250 1500 180
85 225 2 4.00E-05 0.5 1900 5250 1000 60
86 225 2 4.00E-05 0.5 1900 5250 1000 300
87 225 2 4.00E-05 0.8 1900 5250 1000 60
88 225 2 4.00E-05 0.8 1900 5250 1000 300
89 225 2 4.00E-05 0.65 800 4000 1000 180
90 225 2 4.00E-05 0.65 800 6500 1000 180
91 225 2 4.00E-05 0.65 3000 4000 1000 180
92 225 2 4.00E-05 0.65 3000 6500 1000 180
93 225 2 4.00E-05 0.65 800 5250 500 180
94 225 2 4.00E-05 0.65 800 5250 1500 180
95 225 2 4.00E-05 0.65 3000 5250 500 180
96 225 2 4.00E-05 0.65 3000 5250 1500 180
97 225 2 4.00E-05 0.65 800 5250 1000 60
98 225 2 4.00E-05 0.65 800 5250 1000 300
99 225 2 4.00E-05 0.65 3000 5250 1000 60
100 225 2 4.00E-05 0.65 3000 5250 1000 300
101 225 2 4.00E-05 0.65 1900 4000 500 180
102 225 2 4.00E-05 0.65 1900 4000 1500 180
103 225 2 4.00E-05 0.65 1900 6500 500 180
104 225 2 4.00E-05 0.65 1900 6500 1500 180
105 225 2 4.00E-05 0.65 1900 4000 1000 60
106 225 2 4.00E-05 0.65 1900 4000 1000 300
107 225 2 4.00E-05 0.65 1900 6500 1000 60
108 225 2 4.00E-05 0.65 1900 6500 1000 300
109 225 2 4.00E-05 0.65 1900 5250 500 60
110 225 2 4.00E-05 0.65 1900 5250 500 300
111 225 2 4.00E-05 0.65 1900 5250 1500 60
112 225 2 4.00E-05 0.65 1900 5250 1500 300
113 225 2 4.00E-05 0.65 1900 5250 1000 180
114 225 2 4.00E-05 0.65 1900 5250 1000 180
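
The run layout in Table A.3 follows a regular pattern: each pair of the eight inputs is set to its low and high levels in all four combinations while the other six inputs stay at their center values, and two center runs close the list. The snippet below is a minimal sketch that regenerates that pattern from the three levels of each input as read from the table. The function name `pairwise_design` and the dictionary layout are hypothetical, and the sketch reproduces only the pattern seen in the table, not necessarily the software used in the study.

```python
from itertools import combinations, product

# (low, center, high) levels of each input, as read from Table A.3.
LEVELS = {
    "Km": (10, 225, 5000),
    "ng": (1, 2, 3),
    "Cf": (4.0e-6, 4.0e-5, 4.0e-4),
    "dRs/dp": (0.5, 0.65, 0.8),
    "Rsi": (800, 1900, 3000),
    "Pi": (4000, 5250, 6500),
    "Pwf": (500, 1000, 1500),
    "Xf": (60, 180, 300),
}


def pairwise_design(levels, n_center=2):
    """Vary every pair of factors over low/high with the rest at center."""
    names = list(levels)
    center = [levels[name][1] for name in names]
    runs = []
    for i, j in combinations(range(len(names)), 2):
        # Index 0 is the low level and index 2 is the high level.
        for a, b in product((0, 2), repeat=2):
            run = list(center)
            run[i] = levels[names[i]][a]
            run[j] = levels[names[j]][b]
            runs.append(tuple(run))
    # Append the center-point runs.
    runs.extend(tuple(center) for _ in range(n_center))
    return names, runs


names, runs = pairwise_design(LEVELS)
print(len(runs))  # 28 factor pairs x 4 corners + 2 center runs
```

Each tuple in `runs` lists the inputs in the column order of Table A.3, so the list can be passed row by row to a simulator wrapper.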

Table A.4: List of simulations performed using random input parameter values used to test the
surrogate models
Sr. No.  Km (nD)  ng (-)  Cf (1/psi)  dRs/dp ((SCF/STB)/psi)  Rsi (SCF/STB)  Pi (psi)  Pwf (psi)  Xf (ft)
1 1496 1.17 1.0E-05 0.62 2069 5738 743 300
2 361 1.27 3.8E-05 0.67 1456 4170 1417 180
3 31 1.35 8.0E-06 0.58 1625 4637 769 60
4 44 1.78 2.0E-05 0.59 1724 4560 1266 60
5 2475 2.66 1.9E-05 0.69 2408 5670 689 300
6 12 2.61 1.4E-05 0.58 2820 6111 787 60
7 210 1.12 2.0E-05 0.75 1775 4861 591 180
8 28 1.80 1.9E-05 0.79 2969 5951 1076 60
9 4390 2.05 6.0E-06 0.72 2531 5688 1183 300
10 840 1.83 5.4E-06 0.60 1460 4017 1047 300
11 225 2.31 4.0E-05 0.68 2106 5505 926 180
12 187 2.26 5.9E-06 0.53 1689 4967 1144 180
13 14 1.58 4.3E-06 0.77 2779 6290 1148 60
14 694 1.86 1.5E-05 0.76 1539 4003 1179 300
15 13 1.03 3.0E-05 0.75 1273 5156 1136 60
16 16 2.97 1.9E-05 0.58 1413 5061 1445 60
17 256 1.33 6.2E-06 0.68 1323 5152 709 180
18 18 1.21 9.4E-06 0.51 2281 5925 1209 60
19 1618 1.74 1.2E-05 0.63 1552 4806 736 300
20 1612 1.40 3.8E-05 0.59 2527 5962 619 300
21 892 1.98 5.7E-06 0.55 1566 5178 1107 300
22 25 1.68 2.9E-05 0.55 1900 4089 950 60
23 604 2.90 1.8E-05 0.63 1614 4440 959 300
24 251 2.84 9.5E-06 0.53 2944 5804 1162 180
25 4237 1.11 6.2E-06 0.68 1710 5184 1270 300
26 565 2.48 1.1E-05 0.64 1966 4382 850 300
27 1448 1.54 1.2E-05 0.71 1393 4853 1162 300
28 168 1.85 5.3E-06 0.71 2676 5518 916 180
29 147 2.10 1.6E-05 0.69 1431 4479 1342 180
30 1692 2.89 6.7E-06 0.51 2672 5846 1333 300