
Computational Geosciences (2020) 24:2043–2057

https://doi.org/10.1007/s10596-019-09895-8

ORIGINAL PAPER

Managing geological uncertainty in expensive reservoir simulation optimization

Kashif Rashid1

Received: 21 September 2018 / Accepted: 30 August 2019 / Published online: 5 February 2020
© Springer Nature Switzerland AG 2020

Abstract
A method to manage geological uncertainty as part of an expensive simulation-based optimization process is presented. When the number of realizations representing the uncertainty is high, the computational cost to optimize the system can be considerable, and often prohibitive, as each forward evaluation is expensive to perform. To overcome this limitation, an iterative procedure is developed that selects a subset of realizations, based on a binary nonlinear optimization subproblem, to match the statistical properties of the target function at known sample points. This results in a reduced-order model that is optimized in place of the full system at a much lower computational cost. The result is validated over the ensemble of all realizations, giving rise to one new sample point per iteration. The process repeats until the stipulated stopping conditions are met. Demonstration of the proposed method on a publicly available realistic reservoir model with 50 realizations shows that results comparable to full optimization can be obtained, but far more efficiently.

Keywords Geological uncertainty · Realizations · Expensive reservoir simulation · Optimization

Kashif Rashid, krashid@slb.com
1 Schlumberger-Doll Research, Cambridge, MA, 02139, USA

1 Introduction

This paper presents a method to manage geological uncertainty in a reservoir simulation process. The uncertainty is represented in the form of multiple possible realizations that are usually created as a result of the geostatistics gathered from core and well log data. Geoscientists can generate a multitude of realizations that are constrained to the known data, yet enable the uncertainty in the subsurface to be considered [1]. In reservoir simulation, each realization entails a forward evaluation that is often time consuming to perform. This cost is compounded when tens of realizations are considered [2]. Thus, the intent is usually to provide a sufficient number of realizations that can capture the uncertainty in the model without unduly affecting the computational cost expected in evaluating the model under uncertainty.

The possibility of using hundreds of realizations is precluded by the computational cost associated with a single simulation-based objective evaluation, comprising all the underlying realizations, and particularly during optimization, when many such evaluations are required in order to reach an optimal solution. This paper presents an iterative method that serves to identify the least number of realizations necessary while ensuring the statistical properties of the objective value are retained to a certain degree. The method, adaptive reduced-order modeling (AROM), is demonstrated on an open-literature reservoir simulation optimization problem comprising 50 underlying realizations, along with a simple analytical case with 8 realizations.

2 Background review

An oilfield production system may comprise one or more reservoir models connected to a surface gathering network by many wells. Producer wells serve to extract the reservoir fluids to the surface, while injector wells are used to inject either water or gas into the formation for pressure management purposes. This stimulates the recovery of hydrocarbons by ensuring that the oil component flows preferentially from the reservoir contact zones into the wells and, ideally, up to the surface and further downstream to a collection sink [3].

Most often, various types of sub-surface or surface valves are used to aid the production of reservoir fluids by controlling well rates and boundary pressure conditions. That is, flow from zones or wells with undesirable reservoir fluids is mitigated in favor of those producing valuable hydrocarbons. The long-term goal is often to maximize the net present value (NPV) of the produced hydrocarbons and, in that regard, various design configurations can be tested by simulation, including consideration of the number and location of new wells, or other design choices. However, as reservoir and other petrophysical properties (derived from logs, cores, sensors, and field data) are riddled with uncertainty, viable decisions from a simulation process cannot be made in the absence of uncertainty. Geological uncertainty, for example, is presented in the form of multiple, possibly equi-probable, realizations of the underlying reservoir geology that dictates how the reservoir may behave. Uncertainty then is considered in the simulation step by evaluation of all realizations, each requiring an expensive simulation evaluation, in conjunction with a utility-driven objective [4].

Mathematically, the forward simulation yields a utility-based objective value (F) for a given design imposed by the set of control variables (X) over the set of all realizations (ρ) for a given risk-aversion (or confidence) factor (λ). This results in an expensive simulation-based nonlinear optimization problem that is further complicated if some variables are integer [5]. However, for convenience, and without loss, only continuous variables are considered in this discussion.

It can be noted that the solution process entails three components: (i) the problem definition that is dictated by the choice of controls, decisions to be made, and the stipulated objective with constraints; (ii) the solver choice, based on the availability of gradient information or otherwise; and (iii) some approximation scheme that may be employed for computational expediency. In the following, an outline of key approaches is given as they concern the last two components (solver and approximation choice), as the problem definition is assumed known.

The general utility-based optimization problem can be tackled with a gradient-based approach using derivative information directly from the simulation process, from an adjoint scheme, or by numerical perturbation at much greater cost [6, 7]. However, as this information may be anticipated with a significant computational cost, along with the fact that gradient-based methods are prone to become stuck in local minima, an alternative is to use derivative-free methods from the outset [8]. This includes the downhill-simplex method or population-based heuristics, such as genetic algorithms, simulated annealing, and particle swarm optimization, that can potentially seek the global solution [9, 10]. However, these methods typically entail a great number of evaluations, which makes them computationally prohibitive for use without some approximation scheme, especially when considering uncertainty [5].

The use of proxy or surrogate methods (e.g., kriging, neural networks, and radial basis functions) has proven particularly effective in alleviating the computational cost associated with optimization, especially when applied adaptively, where the proxy model is continually refined [2, 5, 11]. However, these methods can be impractical for problems with high dimensions, as the number of samples required to generate the approximation can be a significant bottleneck. Here, stochastic approximation and perturbation methods can be considered better alternatives to manage problems of scale [12, 13]. The approximate gradient direction established from an ensemble of evaluations enables a gradient-based solver to progress effectively, albeit to a local minimum. The use of covariance matrix adaptation evolution strategy with neighborhood approximation can also be considered in this category [14].

Another method in this class is the ensemble-based stochastic gradient scheme known as EnOpt [15, 16]. It returns an approximate gradient from an approximation constructed over the sensitivity of the ensemble of controls using the mean objective values. Each control sample is drawn from a multi-variate Gaussian distribution in which the prior mean is updated by iteration. The method maximizes the expected objective over an ensemble of realizations and controls. However, it overcomes the computational burden of necessitating m^2 evaluations (with m realizations per sample) by approximating a robust gradient with one realization per sample. The stochastic simplex gradient (StoSAG) method similarly constructs an approximate gradient [17, 18]. Like EnOpt, it samples a distribution of the controls and each sample is connected to one underlying realization. Smoothing conditions can be imposed on the control samples using multipliers, along with cross-covariance conditions. The authors note that StoSAG overcomes the limitations imposed by the assumptions in EnOpt, resulting in StoSAG and its smoothed variants outperforming EnOpt in the tests presented [17]. Nonetheless, both methods can suffer from drastic control perturbations that may lower the reliability of the approximate gradient derived. In addition, m realizations are still maintained, which can be costly if a large number of realizations are considered. Lastly, even if sufficiently accurate, the approximate gradient will only lead to a local solution from the given starting condition.

Retrospective optimization (RO) is one method that selects a subset of realizations for analysis and is potentially better suited to problems with many realizations [19]. A sequence of problems is solved in which an increasing number of realizations are selected randomly or using a clustering scheme on pre-selected static or dynamic

reservoir properties. The latter requires the clustering step to be repeated more frequently, as the control setting will impact the dynamic properties. Here, each subset gives rise to an approximation of the real problem that can be solved with any type of solver. The result from one problem is marked as the starting point for the next, ultimately leading to the final and actual problem comprising all realizations. The authors show that the method is efficient, particularly when using the clustering scheme during the realization selection step, and that a convergent solution is established in only a few iterations. This suggests that all iterations may not be necessary to reach a good solution, potentially making the procedure more effective when the number of realizations in contention is large.

3 Proposed method

3.1 Problem statement

The general optimization under uncertainty problem can be stated as follows:

    max_{X ∈ R^n} F(X|ρ)
    s.t. x_i^L ≤ x_i ≤ x_i^U,  i ∈ [1, n]                      (1)

where F is the simulation-based functional of the control variable set X of dimensionality n and ρ represents the set of m realizations {ρ_1, ρ_2, ..., ρ_m}. The ith variable from n is bound between x_i^L and x_i^U, respectively.

The conventional approach is to launch an optimization process in which each objective will require m simulation evaluations (one for each realization) and take a considerable amount of time to run. For a high-dimensional problem, this entails hundreds of simulation evaluations in order to reach a solution [2]. Hence, the computational cost involved can be considerable.

Although one may resort to the use of high-performance computing to run the required simulation evaluations in parallel for speed-up, another approach is to consider using a subset of the realizations in order to approximate the actual problem for computational advantage. This is the premise of the proposed method: adaptive reduced-order modeling (AROM).

3.2 Reduced-order model

Let us define U ∈ B^m as the array of selected realizations, such that u_j ρ_j indicates an active realization ρ_j if u_j = 1, with u_j ∈ {0, 1} and j ∈ [1, m]. Let the problem be compactly defined for any subset of selected realizations as follows:

    max_{X ∈ R^n} M(X|U, ρ) = M(X|ρ̄)
    m̄ = Σ_{j=1}^{m} u_j
    n = n_v (n_p + n_q)
    s.t. x_i^L ≤ x_i ≤ x_i^U,  i ∈ [1, n]                      (2)

where M is the reduced-order model functional of the control variable set X of dimensionality n and ρ̄ represents the compact set of m̄ selected realizations {ρ̄_1, ρ̄_2, ..., ρ̄_m̄}. Thus, the intent is to establish a reduced-order representation M with a given U as a proxy to the real problem F. However, the solution to this subproblem is contingent on some number of known samples that typify the system behavior and are used to identify U. Hence, the latter problem, stated as Eq. 14 below, is solved each time a new sample becomes available.

3.3 Sample evaluation

Evidently, it is incorrect to take a single distribution of results obtained at a particular design configuration as being representative of the entire domain. For this reason, the first step of the procedure is to randomly generate S samples in the n-dimensional search space. The number of samples is dictated by computational expediency, but a reasonable number should be taken to be meaningful. The full model is evaluated for each sample k in S:

    R_kj = F(X_k | ρ_j),  j ∈ [1, m],  k ∈ [1, S]              (3)

where R_kj is the metric value of the kth sample evaluation with the jth realization ρ_j. Let D represent the set of S known samples in tabular form. That is, D = [X R]_S with D ∈ R^{S×(n+m)}. The desired moments can then be established for each sample as follows:

    μ_k(R_k) = (1/m) Σ_{j=1}^{m} R_kj                          (4)

    σ_k(R_k) = sqrt( (1/m) Σ_{j=1}^{m} (μ_k − R_kj)^2 )        (5)

where μ_k and σ_k are the mean and standard deviation of the kth sample, respectively, and the utility-based objective is given by:

    f_k = μ_k(R_k) − λ σ_k(R_k) = F_λ(X_k | ρ)                 (6)

where λ represents the desired confidence factor in the solution [4]. Note that the objective concerns the mean response when λ = 0.

3.4 Realization selection subproblem

Now consider a measure of the desired moments, specified as follows:

    E(U|D) = ε_μ + ε_σ                                         (7)
    ε_μ(R, U) = μ(R) − μ̄(R, U)                                 (8)
    ε_σ(R, U) = σ(R) − σ̄(R, U)                                 (9)

where ε_μ and ε_σ are the error measures for the first two moments in the known data R with respect to the full and reduced model representation, respectively.^1 Specified as a root mean square error, this gives:^2

    ε_μ(R, U) = sqrt( (1/S) Σ_{k=1}^{S} w_k (μ_k − μ̄_k)^2 )    (10)

    ε_σ(R, U) = sqrt( (1/S) Σ_{k=1}^{S} w_k (σ_k − σ̄_k)^2 )    (11)

where w_k ∈ [0, 1] is a weight term assigned to the kth sample in R. That is, samples yielding better results can be valued more highly than those returning less desirable results (e.g., assigning lower weights to the random samples generated in the early stages). The approximated quantities μ̄_k and σ̄_k are defined as follows:

    μ̄_k(R_k, U) = (1/m̄) Σ_{j=1}^{m} u_j R_kj                  (12)

    σ̄_k(R_k, U) = sqrt( (1/m̄) Σ_{j=1}^{m} u_j (μ̄_k − R_kj)^2 )  (13)

The realization selection subproblem can now be stated as follows:

    arg min_U E(U | D, m̄_L, m̄_U)
    s.t. m̄_L ≤ Σ_{j=1}^{m} u_j ≤ m̄_U
         u_j ∈ {0, 1},  j ∈ [1, ..., m]                        (14)

Here, the inequality constraints enforce the range of the permissible number of active realizations during the search. When m̄_L = m̄_U, a strict equality results, forcing only m̄ realizations to be selected. The solution from Eq. 14 is the set of selected realizations Û. The realization selection subproblem (14) is a pure-binary nonlinear constrained optimization problem that necessitates a suitable solver. This can include the use of a meta-heuristic-based solver or a dedicated integer nonlinear programming solver [20–24]. The former type is typically more robust under the conditions imposed, especially with increasing dimensionality. Note that an optimal solution cannot be guaranteed when m̄ is very large (in the order of thousands of binary variables), as Eq. 14 is NP-hard.^3 However, the set of realizations is expected to be in the order of hundreds of variables, which is manageable by most dedicated solvers [20–23]. In addition, optimality is not a strict requirement for the subproblem, because a near-optimal solution can be sufficient to drive the reduced-order model optimization problem to better states. The reduced problem is then defined:

    arg max_X M(X|Û, ρ) = M(X|ρ̄)                               (15)

as per (2), but with the optimal U established in Eq. 14. The optimal solution from the reduced model is:

    f_M = M(X̂ | Û, ρ)                                          (16)

while the validated result from the full model is:

    f_S = F(X̂ | ρ)                                             (17)

An error measure of the model mismatch:

    ε_m = ‖f_M − f_S‖  or  ‖X_k − X_{k−1}‖                     (18)

can be used for convergence purposes. When the error is greater than a stipulated threshold level, the data repository and sample count are updated:

    D = D ∪ [X̂ R]  and  S = S + 1                              (19)

^1 The mismatch between the actual and estimated CDF should be used if higher moments are required.
^2 Alternative error measures include SSE, MSE, and MAE.
^3 Refers to the computational complexity associated with problems that are deemed non-deterministic polynomial-time hard, i.e., they cannot be solved in polynomial time.
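As an illustration only, for small m the subproblem of Eq. 14 can be solved by exhaustive enumeration; the response matrix below is hypothetical, and unit weights w_k = 1 are assumed:

```python
import math
from itertools import combinations

def selection_error(R, idx):
    """E(U|D) of Eqs. 7-13 with unit weights: RMSE between the full-
    and subset-ensemble mean and standard deviation over the S samples."""
    S = len(R)
    e_mu = e_sig = 0.0
    for R_k in R:
        m = len(R_k)
        mu = sum(R_k) / m
        sig = math.sqrt(sum((mu - r) ** 2 for r in R_k) / m)
        sub = [R_k[j] for j in idx]                          # active subset
        mu_b = sum(sub) / len(sub)                           # Eq. 12
        sig_b = math.sqrt(sum((mu_b - r) ** 2 for r in sub) / len(sub))  # Eq. 13
        e_mu += (mu - mu_b) ** 2
        e_sig += (sig - sig_b) ** 2
    return math.sqrt(e_mu / S) + math.sqrt(e_sig / S)        # Eqs. 10, 11, 7

def select(R, m_bar):
    """Brute-force solve of Eq. 14 with m_bar_L = m_bar_U = m_bar."""
    m = len(R[0])
    return min(combinations(range(m), m_bar),
               key=lambda idx: selection_error(R, idx))
```

For realistic m, the 2^m candidate arrays rule out enumeration (Eq. 14 is NP-hard), which is why the text points to meta-heuristic or dedicated integer nonlinear programming solvers instead.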

The complete method can now be stated as follows:

    S0: set:      n_v, S, ε_m^T, ε_s^T, F_best = 0
        generate: X ∈ R^{S×n}
        for each sample X_k, k ∈ [1 ... S]:
            eval:  F(X_k, ρ) → R_k, F_k
            store: D = D ∪ [X R]_S
    S1: solve:  arg min_U E(U | D, m̄_L, m̄_U) → Û
        solve:  arg max_X M(X|Û, ρ) = M(X|ρ̄) → X̂
        eval:   M(X̂|ρ̄) → f_M
        eval:   F(X̂|ρ) → R, f_S
        eval:   ε_m = ‖f_M − f_S‖ or ‖X_k − X_{k−1}‖
        eval:   ε_s = ‖f_S − F_best‖
        update: D = D ∪ [X̂ R], S = S + 1
        est:    D → [X_best, R_best, F_best]
        if ε_m > ε_m^T: go to S1
        if ε_s > ε_s^T: update m̄_L, m̄_U; go to S1
        return: X_best, R_best, F_best
        if required, update n_v; go to S0                      (20)

4 Test study

In this section, a simple analytical case with 8 realizations is used to contrast the RO and AROM methods for illustrative purposes, followed by the application of AROM on an open-literature reservoir simulation problem comprising 50 realizations [25, 26].

4.1 Analytical benchmark

The analytical test case is given as follows:

    max_{x ∈ R} F(x|U, ρ) = (1/8) Σ_{i=1}^{8} u_i f_i
    s.t. 0.0 ≤ x ≤ 1.2
    f_1 = −(1.4 − 3.0x) sin(18x)
    f_2 = −(1.3 − 1.4x) sin(16x)
    f_3 = −(1.5 − 3.2x) sin(13x)
    f_4 = −(1.9 − 3.5x) sin(19x)
    f_5 = −(1.2 − 1.6x) sin(14x)
    f_6 = −(1.3 − 3.2x) sin(15x)
    f_7 = −(1.8 − 2.1x) sin(22x)
    f_8 = −(1.4 − 3.9x) sin(18x)                               (21)

where F is the mean response of the 8 realizations (f_i) in the uncertainty set ρ and U is the binary array of selections, with u_i ∈ {0, 1}. The one-dimensional multi-modal function is shown in Fig. 1.

Fig. 1 Mean response (bold) over all realizations, with optimum (asterisk) at x̂ = 1.1078 and ŷ = 0.7029

4.2 Analytical results

The RO method is applied with three sample paths of 2, 5, and 8 realizations, where the subset realizations are selected randomly. Evidently, from a random starting point and using a gradient-based sequential quadratic programming (SQP) solver, the procedure will converge to the nearest local minimum (see Figs. 2, 3, and 4). A new region is never explored, as the solution from one problem imposes the starting condition for the next. Hence, while setting

Fig. 2 RO-SQP iteration 1. Solution (asterisk) vs. actual value (circle)
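The benchmark of Eq. 21 is straightforward to reproduce; a minimal sketch of the mean response, with a coarse grid search standing in for the solvers used in the paper:

```python
import math

# (a_i, b_i, c_i) coefficients of the 8 realizations in Eq. 21
COEF = [(1.4, 3.0, 18), (1.3, 1.4, 16), (1.5, 3.2, 13), (1.9, 3.5, 19),
        (1.2, 1.6, 14), (1.3, 3.2, 15), (1.8, 2.1, 22), (1.4, 3.9, 18)]

def f(i, x):
    """Realization i of Eq. 21: f_i = -(a_i - b_i x) sin(c_i x)."""
    a, b, c = COEF[i]
    return -(a - b * x) * math.sin(c * x)

def F(x, u=(1,) * 8):
    """Mean response (1/8) sum u_i f_i over the selected realizations."""
    return sum(ui * f(i, x) for i, ui in enumerate(u)) / 8.0

# A coarse grid search over [0, 1.2] lands near the optimum
# (x ~ 1.108, F ~ 0.70) reported in Fig. 1
best_F, best_x = max((F(k / 1000.0), k / 1000.0) for k in range(1201))
print(best_x, best_F)
```

The grid search is illustrative only; the paper uses SQP and GA solvers on this surface.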

Fig. 3 RO-SQP iteration 2. Solution (asterisk) vs. actual value (circle)
Fig. 4 RO-SQP iteration 3. Solution (asterisk) vs. actual value (circle)
Fig. 5 RO-GA iteration 1. Solution (asterisk) vs. actual value (circle)
Fig. 6 RO-GA iteration 2. Solution (asterisk) vs. actual value (circle)
Fig. 7 RO-GA iteration 3. Solution (asterisk) vs. actual value (circle)

Table 1 RO-GA test results

    Itn   m   x̂        M(x̂)     F(x̂)
    1     2   0.2306   0.8701   0.4920
    2     5   1.0799   0.8232   0.6257
    3     8   1.1075   0.7029   0.7029

Selected realizations m, approximate model M(x), and actual model F(x). Selection array by itn: U^1 = [01000010] and U^2 = [10111100]

Table 2 AROM-GA test results

    It   Ns   m   x̂        M(x̂)     F(x̂)     x_best   F_best
    1    3    3   1.1075   1.0266   0.7029   1.1075   0.7029
    2    4    3   1.0943   0.8398   0.6842   1.1075   0.7029
    3    5    4   1.1399   0.8689   0.5904   1.1075   0.7029
    4    6    5   1.1291   0.6732   0.6534   1.1075   0.7029

Samples Ns, realizations m, approximate model M(x) and actual model F(x), with best known solution (x_best, F_best)

random starting points is likely to be beneficial, it defeats the purpose of the sequence of path problems. The optimum will only be found if the starting point is perchance well located. However, this limitation is readily mitigated using a global (GA) solver, as shown in Figs. 5, 6, and 7. The results are presented in Table 1.

Results using the AROM scheme are shown in Table 2. Here, the procedure commences with 3 random samples of the full model to optimally select three realizations. Subsequently, one new sample is added per iteration and 5 realizations are selected prior to reaching the stopping conditions. The best known solution (post validation) is shown in the last two columns of the table, notably identified at the outset. The AROM progress plots are shown in Figs. 8, 9, 10, and 11.

4.3 Method discussion

Notably, both methods find the expected optimum. Here, a simple genetic algorithm is used that readily identifies the global optimum manifest in each iteration. In effect, the choice and cost of solver application is removed from the discussion. This leaves the premise behind each method.

RO stipulates a sequence of path problems a priori, where the last is the actual problem comprising all realizations. In that sense, it creates improving initial conditions with which to solve the actual problem. The number of path problems and the selection size in RO are user specified. The selection of realizations is random, as clustering was not applicable.

AROM, on the other hand, demands a number of samples S and the starting (minimum) subset size m. m realizations are then sampled from N, and the resulting ROM is optimized. The solution is validated to yield one new sample. This inner-loop process (for given m) repeats until convergence and serves to identify the best solution possible for k realizations for the given convergence conditions (on X or M, etc.). Here, a norm on X was used since, for small

Fig. 8 AROM-GA iteration 1. Solution (asterisk) vs. actual value (circle)
Fig. 9 AROM-GA iteration 2. Solution (asterisk) vs. actual value (circle)
Fig. 10 AROM-GA iteration 3. Solution (asterisk) vs. actual value (circle)

Fig. 11 AROM-GA iteration 4. Solution (asterisk) vs. actual value (circle)

k, it is less likely that the approximate model M will be close to the actual model F. The subset size m is then increased in the outer loop and the procedure repeats for the new size m. This AROM process repeats until the outer-loop stopping conditions are met. That is, when there is little change in the best known solution (F_best) and the expected marginal gain is minimal. In this regard, AROM can halt before reaching the full problem, which is useful when N is large. However, the full model evaluation required, for initial sample generation and for each validation step, can be costly for large N, since it is assumed that the black-box is evaluated fully. To overcome this limitation, a quadrature sampling method is suggested, in which sample properties of the black-box model response can be constructed with an increasing number of realizations. The sampling will halt when a reasonable estimate is obtained [27, 28]. This procedure could be used whenever a full model evaluation is required and N is large.

Fig. 12 Reservoir model of the Olympus case comprising 11 producer and 7 injector wells. Model dimensions: 118 × 181 × 16 = 341,728 cells. The grid property shown is permeability

Fig. 13 Control variable set by block in the reservoir simulation schedule deck. Oil rate is set for producers (ORAT) and water rate for the injectors (RATE)
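In outline, the inner/outer loop just described can be sketched as follows. This is a schematic with hypothetical callable interfaces, and a simple objective-improvement test stands in for the ε-based checks of Eq. 20:

```python
def arom(full_eval, select, optimize, samples, m0, m_max, tol=1e-9, max_iter=50):
    """Sketch of the AROM inner/outer loop (hypothetical interface).

    full_eval(X) -> list of per-realization responses (expensive validation)
    select(D, m_bar) -> indices of m_bar realizations chosen against D
    optimize(idx, D) -> X maximizing the reduced-order model over idx
    """
    def mean(R):
        return sum(R) / len(R)

    D = [(X, full_eval(X)) for X in samples]     # S0: initial full-model samples
    best_X, best_f = max(((X, mean(R)) for X, R in D), key=lambda p: p[1])
    m_bar = m0
    for _ in range(max_iter):
        idx = select(D, m_bar)                   # realization selection (Eq. 14)
        X = optimize(idx, D)                     # optimize the ROM (Eq. 15)
        R = full_eval(X)                         # validate on the full ensemble
        f = mean(R)
        D.append((X, R))                         # one new sample per iteration
        if f > best_f + tol:
            best_X, best_f = X, f                # inner loop: keep the subset size
        elif m_bar < m_max:
            m_bar += 1                           # outer loop: grow the subset
        else:
            break                                # stopping conditions met
    return best_X, best_f
```

The sketch halts once the largest permitted subset no longer improves the validated best solution, mirroring the marginal-gain argument above; the paper's actual stopping tests (Eqs. 18 and 20) are richer.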

Fig. 14 Example layer slices of permeability are shown for four (from 50) realizations

In AROM, the set of samples is used to optimally pick the subset of realizations for the desired objective of interest (e.g., NPV), while in RO, the realizations are picked randomly or on the basis of static (or dynamic) property clustering. For dynamic properties, each realization must be evaluated at cost prior to clustering, and the result may be limited as clustering is performed only once per iteration at a given setting X.

Although any solver can be used, AROM employs an adaptive proxy optimization scheme to solve each ROM problem [5]. Here, n + 1 samples are generated, where n is the number of control variables. This initial data is used to construct an RBF proxy of the objective surface M(X) given by the ROM. The proxy training is fast, as only one linear inversion is required for any choice of the basis spread parameter [5]. In addition, the proxy model can be optimized with minimal cost, as the evaluation cost is cheap. The solution is validated by the ROM M(X) and the proxy representation is updated with the new sample. The procedure repeats until convergence to an optimal solution. The adaptive proxy scheme is by far the most effective way to treat expensive simulation-based problems, often requiring only a handful of evaluations in comparison with the direct application of alternative solvers (e.g., downhill simplex, GA, ES, and PSO [5]).

4.4 Simulation model

The test problem concerns an open-literature model developed by TNO/Delft as part of an industry/academia challenge exercise [25, 26]. Uncertainty is presented in the form of 50 geologically plausible and equiprobable realizations.

Fig. 15 Field oil production total (FOPT) with time for all 50 realizations (at the base configuration)
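The RBF proxy construction referred to above reduces to a single linear solve; a minimal one-dimensional sketch with a Gaussian basis (the spread value and sample data are arbitrary assumptions here):

```python
import math

def rbf_fit(xs, ys, spread=1.0):
    """Fit interpolating weights w solving Phi w = y, where
    Phi[i][j] = exp(-((xs[i]-xs[j])/spread)^2). One linear inversion,
    as noted in the text, for any choice of spread."""
    n = len(xs)
    A = [[math.exp(-((xs[i] - xs[j]) / spread) ** 2) for j in range(n)] + [ys[i]]
         for i in range(n)]
    for col in range(n):                        # Gauss-Jordan with partial pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(n):
            if r != col:
                fac = A[r][col] / A[col][col]
                A[r] = [a - fac * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def rbf_eval(x, xs, w, spread=1.0):
    """Evaluate the proxy at x as a weighted sum of Gaussian bases."""
    return sum(wj * math.exp(-((x - xj) / spread) ** 2) for xj, wj in zip(xs, w))

xs = [0.0, 0.5, 1.0]
ys = [1.0, 0.2, -0.4]
w = rbf_fit(xs, ys)
print([round(rbf_eval(x, xs, w), 6) for x in xs])  # → [1.0, 0.2, -0.4]
```

The interpolant reproduces the training data exactly at the sample points; in the adaptive scheme, each new validated sample is appended and the weights are re-solved.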

Fig. 16 Field water production total (FWPT) with time for all 50 realizations (at the base configuration)

The challenge comprises four levels of study to demonstrate manual design, optimization by well control, field development planning, and lastly, joint field design with optimal well control. This work is concerned with the well control problem only, where the 11 producers (n_p) and 7 injectors (n_q) are at fixed positions with a given drill queue schedule (see Fig. 12). The objective is to maximize the mean net present value (NPV) of the asset over a simulation period of 20 years (details can be found in the Appendix). The control variables concern the oil rate in each producer and the water rate in each injector, respectively, which can be varied at most once per quarter (see Fig. 13). As this granularity results in 1440 variables over a 20-year period, the number of time partitions (n_v) is set to 5 for computational ease, giving 90 control variables: n = n_v(n_p + n_q). A single simulation evaluation takes 7 min and hence a single objective (evaluated sequentially) takes 350 min in this study.^4 Four realizations of the model are shown in Fig. 14. The variation in the field oil production total (FOPT) and the field water production total (FWPT) over the ensemble of realizations at the base configuration are shown in Figs. 15 and 16, respectively. The model parameters are listed in Table 3 for reference.

Table 3 Model parameters

    Label                             Unit      Value
    Well oil rate                     sm3/d     [10.0, 90.0]
    Injector water rate               sm3/d     [10.0, 1600.0]
    Max platform rate                 sm3/d     14000.0
    Produced oil value                USD/bbl   45.0
    Produced water cost               USD/bbl   6.0
    Injected water cost               USD/bbl   2.0
    Discount rate                     %         8.0
    Time period (years), T            –         20
    No. of producers, n_p             –         11
    No. of injectors, n_q             –         7
    No. of temporal partitions, n_v   –         5
    No. of realizations, m            –         50

4.5 Simulation results

The AROM procedure described above is applied to the benchmark test case with 50 underlying realizations [25, 26]. The results are presented in Table 4. First, the time partitions n_v is set as 5, giving a problem dimensionality n = 90, and the initial number of samples is set as 12. The first row of the table shows the result for the first iteration of the procedure. Four realizations are established from the subproblem and the resulting reduced-order model is optimized using an adaptive radial basis function (RBF) proxy method [2, 5]. Here, an initial set of n + 1 samples is used to construct the proxy model of the objective obtained from the ROM. The proxy model is optimized and the solution is validated on the actual problem. The proxy model is updated and the procedure repeats until convergence.

Profiles for the first (and other) iterations are shown in Fig. 17. The reduced-order model (with 4 realizations)

^4 Using an Intel Xeon E5-2667, 2.9GHz, 32GB desktop.

Table 4 Iterative reduced-order model method

    Itn   Samp S   Realizations ρ̄        Model M(X|ρ̄), M$   Actual F(X|ρ), M$   Err ε_m, %
    1     12       11 12 40 47           1511.83            1479.67             2.17
    2     13       14 32 42 50           1530.35            1518.07             0.81
    3     14       14 28 32 38 42        1498.36            1488.81             0.64
    4     15       6 16 36 39 47 49      1511.57            1495.78             1.06
    5     16       31 37 41 42 49 50     1488.21            1488.00             0.01

Table 5 Evaluation cost

    Itn   Reals   Obj   Evals   Re-used   Net
    0     50      12    600     –         600
    1     4       254   1016    0         1016
          50      1     50      0         50
    2     4       296   1184    364       820
          50      1     50      0         50
    3     5       338   1690    364       1326
          50      1     50      0         50
    4     6       366   2196    364       1832
          50      1     50      0         50
    5     6       306   1836    364       1472
          50      1     50      0         50
    Total                                 7316
    Full model  50  150  7500             7500

The full model was limited to 150 iterations due to computational cost, and would otherwise have taken much longer for convergence.

returns a value of 1511.83M$, while the actual model (with all 50 realizations) returns a value of 1479.67M$. The model error (ε_m) of 2.17% is shown in the last column. The procedure is repeated in iteration 2, commencing with 13 samples (see row 2). Four different realizations are selected and the model is optimized, giving a value of 1530.35M$. The actual model returns 1518.07M$ with a model error of 0.81%. As this is below the 1% threshold (ε_m^T), the number of desired realizations (m̄) is increased in the third iteration. The procedure repeats, as shown in the table, and is eventually halted as the best known solution (1518.07M$, established in iteration 2) does not improve with continued computational effort. Clearly, this stopping condition is a trade-off between the expected improvement in the result and the computational effort required to achieve it.

The number of simulation evaluations per iteration is shown in Table 5. Initially, 12 evaluations of the full model are made to sample the system (600 simulation evaluations). The reduced-order model of 4 realizations in the first iteration requires 254 objective calls (1016 simulation evaluations). In the following iteration, the initial data set of 4(n + 1) = 364 evaluations from the first iteration is re-used to warm-start the proxy model. Thus, the 296 objective calls only require 820 new evaluations. The process repeats in this manner for the other iterations and the total number of simulation evaluations required is 7316. This can be contrasted to the full optimization case using the same adaptive RBF proxy method that ran

Fig. 17 Proxy optimization profiles are shown for each of the five iterations performed (as shown in Table 4). An efficient adaptive RBF proxy optimization scheme is used to solve each subproblem in which one new sample is added per iteration. Each case has 90 control variables and uses a different number of active realizations. The bold lines indicate the progress of the best known solution, while the other lines show the search progression.
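The bookkeeping in Table 5 follows from objective calls × active realizations, less evaluations re-used from the previous iteration's proxy data set; a quick check:

```python
def net_evals(obj_calls, n_reals, reused=0):
    """Simulation evaluations for one row of Table 5: each objective call
    runs one simulation per active realization, less any evaluations
    re-used from the previous proxy data set."""
    return obj_calls * n_reals - reused

# (obj calls, active realizations, re-used) per row of Table 5, including
# the one full-ensemble validation (50 realizations) at each iteration
rows = [(12, 50, 0),                # initial sampling: 600
        (254, 4, 0), (1, 50, 0),    # itn 1: 1016 + 50
        (296, 4, 364), (1, 50, 0),  # itn 2: 820 + 50
        (338, 5, 364), (1, 50, 0),  # itn 3: 1326 + 50
        (366, 6, 364), (1, 50, 0),  # itn 4: 1832 + 50
        (306, 6, 364), (1, 50, 0)]  # itn 5: 1472 + 50
print(sum(net_evals(*r) for r in rows))  # → 7316
```

The total of 7316 matches the figure quoted against the 7500 evaluations of the full optimization run.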

This can be contrasted with the full optimization case using the same adaptive RBF proxy method, which ran for 150 iterations (each with 50 realizations), requiring 7500 evaluations. Note that this procedure was halted prior to convergence due to the stipulated maximum number of iterations and would have required many more objective calls to reach full convergence. Nonetheless, it shows that a conventional full optimization approach (using the same adaptive proxy optimization method) is computationally expensive when many geological realizations are involved.
The summary results in Table 6 show that the iterative procedure obtains a solution comparable with that obtained by optimizing the full model directly. Moreover, it is able to find a good solution in ∼2500 evaluations after two iterations (from 7316), compared with full optimization, which finds the best solution after ∼6000 evaluations (from 7500). Note also that the full model necessitates 4550 (91 × 50) evaluations just to kick-start the procedure. Thus, the iterative procedure has the benefit of potentially yielding a good solution far more readily and efficiently. This is especially true if the number of realizations in consideration is large.
For reference purposes, two single evaluation cases are also shown in Table 6. The base case is the default starting case during optimization, in which the control variables take the mid-value of their respective ranges (see Table 3). This yields a solution of 1365.43M$. In the reactive case, the oil and water rates are 900 and 1600 sm3/d, respectively. Here, a default control strategy based on well economic limits (wecon) is imposed. If the water content of any well exceeds the 88% limit, the well is automatically shut down. This simple scheme yields a solution of 1456.15M$. As expected, the optimized solutions beat the simple reactive case. Note that while all tests were performed on a desktop,5 high performance computing (multi-cores, cloud computing, etc.) can be used to improve the time cost involved to reach a solution. That is, the sequential realization evaluation step can be completed in one period with a parallel capability.

Table 6 Result summary

Case       Notes                                  Solution (M$)
Single evaluation
  Base       ORAT=50, RATE=805                      1365.43
  Reactive   ORAT=90, RATE=1600, WECON=0.88         1456.15
Optimization
  Proposed   start=Base, 7316 evals (best ∼2500)    1518.07
  Full       start=Base, 7500 evals (best ∼6000)    1508.74

5 Discussion

The motivation and justification of the proposed scheme is presented in this section. Geoscientists can generate many geological realizations of the reservoir in order to describe and understand the uncertainty in the subsurface structure (e.g., layering, faults, oil or water saturation, and other petrophysical properties). However, this leads to a significant computational cost when a single forward model is to be evaluated comprising simulation evaluations for each of the underlying realizations. Moreover, a great number of forward evaluations are required when optimizing over uncertainty, where the objective value is defined by a utility-based measure that accounts for the degree of certainty in the result [4, 5]. Hence, it is plausible to seek some subset of the realizations that embody the statistical properties associated with the desired outcomes (the NPV objective) while reducing the expected computational cost.
First, we need a collection of data that represents the full system. In particular, one must provide a measure of the distribution of the objective outcomes at various samples in the search space, since a single sample is patently insufficient for realization selection. An error measure of the mismatch in statistical properties is used for selection purposes. In the limit, we may use all the realizations or only a single realization. However, as the former is often impractical and the latter likely to be a very poor estimator, we instead seek some other small number that can represent the ensemble of all realizations.
Generally, for m realizations, there are 2^m possible selections (each realization is either included or excluded), ranging from picking zero to the entire set. The number of combinations for selecting k items from m is given by:

C(m, k) = m! / (k!(m − k)!)    (22)

which can become quite considerable for large m and a given k; it can quickly become impractical even to undertake an exhaustive search starting from small values of k. However, this selection can be made in a more effective manner using a binary optimization scheme that remains viable for m of moderate size.6 Thus, a binary nonlinear optimization problem can be solved to select some k realizations to minimize the error measure accounting for the resulting mismatch in the statistical properties compared with the full ensemble. The solution to this subproblem effectively yields a reduced-order model that can be optimized in turn as a proxy to the full model.

5 Study performed on an Intel Xeon E5-2667, 2.9GHz, 32GB desktop.
6 Advanced binary nonlinear solvers are apt at handling hundreds of binary variables in reasonable time, e.g., Tabu Search, OptQuest and LocalSolver.
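To make the selection step above concrete, the following standalone sketch picks the k realizations whose subset mean and standard deviation best match the full ensemble at each known sample. It is illustrative, not the author's solver: the paper uses a binary nonlinear optimization scheme, whereas this toy enumerates all C(m, k) subsets exhaustively (viable only for small m), and the SSE-style mismatch measure is an assumption.

```python
from itertools import combinations
import math

def select_realizations(F, k):
    """Pick k of m realizations whose per-sample mean/std best match the
    full ensemble. F is an S x m matrix: F[s][j] is the objective value of
    realization j at sample point s."""
    S, m = len(F), len(F[0])
    # Full-ensemble statistics at each sample point.
    mu = [sum(row) / m for row in F]
    sd = [math.sqrt(sum((v - mu[s]) ** 2 for v in row) / m)
          for s, row in enumerate(F)]
    best, best_err = None, float("inf")
    for subset in combinations(range(m), k):     # exhaustive C(m, k) search
        err = 0.0
        for s, row in enumerate(F):
            vals = [row[j] for j in subset]
            mu_k = sum(vals) / k
            sd_k = math.sqrt(sum((v - mu_k) ** 2 for v in vals) / k)
            err += (mu_k - mu[s]) ** 2 + (sd_k - sd[s]) ** 2  # SSE mismatch
        if err < best_err:
            best, best_err = subset, err
    return best, best_err

# Toy example: 3 samples, 4 realizations; realizations 0 and 3 bracket the
# ensemble spread, so the pair (0, 3) reproduces both moments best.
F = [[1.0, 2.0, 2.0, 3.0],
     [2.0, 4.0, 4.0, 6.0],
     [1.5, 3.0, 3.0, 4.5]]
subset, err = select_realizations(F, 2)
```

Replacing the exhaustive loop with a binary solver (Tabu Search, OptQuest, LocalSolver, as footnoted above) is what keeps the subproblem tractable for moderate m.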

A new data sample is gathered when the solution is validated against the full model, and the realization selection subproblem can be repeated with the new data.
Evidently, the best result for k selections will be obtained after many such iterations, and no further improvement will be forthcoming unless k is increased. Clearly, in the limit when k = m, the original problem will appear. However, the assertion of the proposed scheme is that a good solution can be obtained with k ≪ m and at a significantly lower computational cost. For example, if m comprises many duplicate realizations, including those yielding very similar results for the stated objective measure, the value of k required will indeed be much lower. Thus, the proposed method automatically filters less desirable realizations, as the procedure commences from a low value of k and will halt when the marginal gain in the best known objective value no longer warrants further computational effort.
Note that the realization selection subproblem depends on the number and location of samples. At the outset, the sample set will be small and randomly generated, but will improve with respect to the intended objective as more samples are added by iteration. However, a different set of initial samples is likely to return a slightly different result. This observation is true also for full model optimization. Another point to note is that the starting value and range for the subset size (k) will dictate the computational cost required to reach a good solution. Too small a value implies many iterations, whereas too large a value means greater evaluation cost. These elements must be considered by the user as parameters of the method (20).
The prospect of dealing with problems with large m raises two issues. The first concerns the computational cost anticipated for each objective evaluation, while the second concerns the efficacy of solving the large-scale binary optimization subproblem at each iteration. To alleviate the cost of the former, a quadrature estimation procedure can be used to evaluate the model sequentially, in batches of one or more randomly selected realizations, in order to estimate the statistical properties at the given configuration [27, 28]; from a practical perspective, it is sufficient to establish a good estimate of the distribution of the intended objective. For the second issue, continuous improvement in solver capability has allowed consideration of several hundred realizations, and future improvements may well reach a magnitude higher [29].
Lastly, if the size of m is considered prohibitive, the procedure could be applied with zero samples and the use of randomly selected realizations. In particular, the validation step would be circumvented in favor of selecting a greater number of realizations per iteration, ultimately giving rise to the actual problem. This description is indeed that of the RO scheme discussed earlier, and in that regard, AROM can be considered a more general framework.

6 Conclusion

An iterative procedure to manage geological uncertainty in reservoir simulation was presented in this paper. A realization selection subproblem is solved at each iteration using a collection of representative model data. The subset of realizations selected defines a reduced-order model that is optimized and compared to the full system. The additional sample generated during validation is added to the known data and the process repeats, first, with convergence on the stipulated subset size, and subsequently, with continued increments in the desired subset size, to a convergent solution. The procedure stops when the expected marginal gain in the solution becomes negligible. Test results on an open-literature reservoir simulation-optimization model with 50 realizations showed that the iterative method can find a good solution in a computationally efficient manner as compared with full optimization. The method should be used in conjunction with high-performance computing to reduce the overall run-time.
Finally, further work is necessary to test the proposed method on cases with a greater number of realizations and to qualify the performance in comparison to alternative schemes.

Acknowledgment Many thanks to William Bailey (Principal Scientist, Schlumberger-Doll Research) for his insights about the Olympus Challenge problem. I am also grateful to the anonymous reviewers for their suggestions on this manuscript.

Appendix

The continuous-time net present value (NPV) objective function used in this study is stated below for completeness:

F(X) = ∫_0^T e^(−rt) [ Ω(X, t) − Φ(X, t) ] dt    (23)

where Ω(X, t) and Φ(X, t) represent the revenue and cost streams as a function of time t, respectively, given as:

Ω(X, t) = Po Qo(X, t) + Pg Qg(X, t)
Φ(X, t) = Co Qo(X, t) + Cg Qg(X, t) + Cw Qw(X, t) + Bg Kg(X, t) + Bw Kw(X, t) + Dt

where X is the set of control variables ∈ R^n, T is the simulation time period, and r is the discount rate. The oil, gas, and water rates are given by Qo, Qg, and Qw. The unit market price for oil and gas is denoted Po and Pg, respectively, while the costs associated with unit production of oil, gas, and water are given by Co, Cg, and Cw. Kg and Kw indicate the gas and water injection rates, with unit costs Bg and Bw, respectively. Lastly, a fixed cost per time-step is given by Dt.
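As a sketch of how a discrete-time form of this objective might be computed — summing the discounted difference between the revenue stream Ω and the cost stream Φ over report steps — the following is illustrative only; the parameter names mirror the appendix symbols, and the numeric values are assumptions, not values from the study or the Olympus Challenge.

```python
import math

def npv(times, rates, prices, costs, r):
    """Discrete-time analogue of Eq. 23: discounted (Omega - Phi) per step."""
    total = 0.0
    for i, t in enumerate(times):
        omega = (prices["Po"] * rates["Qo"][i]      # oil revenue
                 + prices["Pg"] * rates["Qg"][i])   # gas revenue
        phi = (costs["Co"] * rates["Qo"][i]         # production costs
               + costs["Cg"] * rates["Qg"][i]
               + costs["Cw"] * rates["Qw"][i]
               + costs["Bg"] * rates["Kg"][i]       # injection costs
               + costs["Bw"] * rates["Kw"][i]
               + costs["Dt"])                       # fixed cost per time-step
        total += math.exp(-r * t) * (omega - phi)   # discount at rate r
    return total

# Two illustrative report steps (t in years); values are not from the paper.
times = [0.5, 1.0]
rates = {"Qo": [100.0, 90.0], "Qg": [0.0, 0.0], "Qw": [20.0, 30.0],
         "Kg": [0.0, 0.0], "Kw": [50.0, 50.0]}
prices = {"Po": 45.0, "Pg": 0.0}
costs = {"Co": 6.0, "Cg": 0.0, "Cw": 2.0, "Bg": 0.0, "Bw": 1.0, "Dt": 10.0}
value = npv(times, rates, prices, costs, r=0.08)
```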

Note that only those items pertinent to the model are used, as listed in Table 3, and that discrete-time NPV is anticipated in the Olympus Challenge.
Thus, following the protocols of the well control problem stipulated by TNO (all wells come on stream at t = 0), the optimized solution (Table 6) is equivalent to E(NPV) = $587,875,800, with optimal rig location [524,134.8 618,0130.9] and a drilling cost of $464,101,461.

References

1. Chilès, J.P., Delfiner, P.: Geostatistics: modeling spatial uncertainty, 2nd edn. Wiley, New York (2012)
2. Rashid, K., Bailey, W., Couët, B., Wilkinson, D.: An efficient procedure for expensive reservoir simulation optimization under uncertainty. SPE Econ. Mgmt. 5, 4 (Oct. 2014)
3. Beggs, H.D.: Production optimization using nodal analysis. OGCI Publications (2006)
4. Bailey, W., Couët, B., Wilkinson, D.: Framework for field optimization to maximize asset value. SPE Res. Eval. Eng. 8(1), SPE 87026 (2005)
5. Rashid, K., Ambani, S.E.: An adaptive multiquadric radial-basis function method for expensive black-box mixed-integer nonlinear constrained optimization. Eng. Opt. 45(2), 185–206 (2013)
6. Jansen, J.D.: Adjoint-based optimization of multi-phase flow through porous media - a review. Comput. Fluids 46(1), 40–51 (2011)
7. Sarma, P., Chen, W.H., Durlofsky, L.J., Aziz, K.: Production optimization with adjoint models under nonlinear control-state path inequality constraints. SPE Reserv. Eval. Eng. 11(2), 326–339 (2006)
8. Conn, A.R., Scheinberg, K., Reynolds, A.C.: Introduction to derivative-free optimization, vol. 8. SIAM Publishing (2009)
9. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical recipes in C++: the art of scientific computing. Cambridge University Press, Cambridge (2002)
10. Pham, D.T., Karaboga, D.: Intelligent optimization techniques. Springer, London (2000)
11. Fasshauer, G.E.: Meshfree approximation methods with Matlab. World Scientific Publishing (2008)
12. Bangerth, W., Klie, H., Wheeler, M.F., Stoffa, P.L., Sen, M.K.: On optimization algorithms for the reservoir oil well placement problem. Comput. Geosci. 10(3), 303–319 (2006)
13. Mohamed, L., Christie, M., Demyanov, V.: Comparison of stochastic sampling algorithms for uncertainty quantification. SPE 119139, SPE Reservoir Simulation Symposium, The Woodlands, Texas (2009)
14. Bouzarkouna, Z.: Well placement optimization under uncertainty with CMA-ES using the neighborhood. ECMOR XIII, 13th European Conference on the Mathematics of Oil Recovery, Biarritz (2012)
15. Chen, Y., Oliver, D.S., Zhang, D.: Efficient ensemble-based closed-loop production optimization. SPE J. 14(4), 634–645 (2009)
16. Chen, Y., Oliver, D.S.: Ensemble-based closed-loop optimization applied to Brugge field. SPE Reserv. Eval. Eng. 13(1), 56–71 (2010)
17. Fonseca, R.R., Chen, B., Jansen, J.D., Reynolds, A.: A stochastic simplex approximate gradient (StoSAG) for optimization under uncertainty, vol. 109 (2016)
18. Lu, R., Forouzanfar, F., Reynolds, A.C.: Bi-objective optimization of well placement and controls using StoSAG. SPE 182705, SPE Reservoir Simulation Conference, Montgomery, Texas (2017)
19. Wang, H., Ciaurri, D.E., Durlofsky, L.J., Cominelli, A.: Optimal well placement under uncertainty using a retrospective optimization framework. SPE J. 17(1), 112–121 (2012)
20. Glover, F., Laguna, M.: Tabu search. Kluwer Academic Publishers (1997)
21. Excel Solver, Reference Guide, https://www.microsoft.com
22. OptTek Systems, Reference Guide, https://www.optek.com
23. LocalSolver, Reference Guide, https://www.localsolver.com
24. Solver, B.: Computational infrastructure for operational research, Reference Guide, https://www.coin-or.org
25. Fonseca, R.M., Della Rossa, E., Emerick, A.A., Hanea, R.G., Jansen, J.D.: Overview of the OLYMPUS field development optimization challenge. ECMOR XVI, EAGE (2018)
26. ISAPP: Olympus Field Development Optimization Challenge, https://www.isapp2.com/optimization-challenge.html
27. Gelfand, A.E., Smith, A.F.: Sampling-based approaches to calculating marginal densities. J. Amer. Statist. Assoc. 85, 398–409 (1990)
28. Gilks, W.R., Richardson, S., Spiegelhalter, D.J.: Markov chain Monte Carlo in practice. Chapman & Hall (1996)
29. MemComputing, solver development, https://www.memcpu.com

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acronyms
AROM - adaptive reduced-order modeling
CDF - cumulative distribution function
ES - evolutionary strategy
FOPT - field oil production total
FWPT - field water production total
GA - genetic algorithm
MAE - maximum absolute error
MSE - mean squared error
NPV - net present value
PSO - particle swarm optimization
RO - retrospective optimization
RBF - radial-basis function
RMSE - root mean squared error
SSE - sum of squared error
SM3D - standard meter cubed per day unit
TS - tabu search

Symbols
Bg - gas injection cost by unit
Bw - water injection cost by unit
Cg - gas production cost by unit
Co - oil production cost by unit
Cw - water production cost by unit
D - set of samples with metric values, R^(S×(n+m))
εm - model mismatch error measure
εmT - model mismatch error measure threshold
εs - best known solution error measure
εsT - best known solution error measure threshold
εμ - error measure of the first moment (mean)
εσ - error measure of the second moment (std-dev)
E - function of error measures
fk - utility-based objective function
fM - reduced-model objective value

fS - full-model objective value
F - simulation-based NPV function
Fbest - best known solution objective value
i - variable count
j - realization count
k - sample count
Kg - gas injection quantity
Kw - water injection quantity
λ - confidence factor, ∈ [0, 1]
m - number of realizations
m̄ - number of selected realizations
m̄Li - lower bound on the number of selected realizations
m̄Ui - upper bound on the number of selected realizations
M - reduced-order simulation-based NPV function
μk - kth sample mean
μ̄k - kth sample mean estimate
n - number of control variables
np - number of producer wells
nq - number of injector wells
nv - number of partitions
N - set of all realizations by index
Ω - profit component of NPV function
Φ - cost component of NPV function
Po - oil production value by unit
Pg - gas production value by unit
Qo - oil quantity
Qg - gas quantity
Qw - water quantity
ρj - jth realization
ρ - set of discrete uncertainties (realizations)
ρ̄ - compact set of selected realizations
r - discount rate
Rbest - best known solution realization values, R^m
R - set of sample metric values, R^(S×m)
σk - kth sample standard deviation
σ̄k - kth sample standard deviation estimate
S - number of samples
t - incremental simulation time period
T - time horizon (years)
U - array of selected realizations, B^m
Û - solution array of selected realizations, B^m
xi - ith variable
xiL - ith variable lower bound
xiU - ith variable upper bound
X - set of control variables, R^n
Xbest - best known solution, R^n
X - set of samples, R^(S×n)
wk - kth sample weight, ∈ [0, 1]
