A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF ENERGY RESOURCES
ENGINEERING
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Kwangwon Park
January 2011
© Copyright by Kwangwon Park 2011
All Rights Reserved
I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.
Preface
rejection sampler. The construction and forward evaluation of 3 to 10 prior models were required to generate one posterior model, while the rejection sampler required 300 to 500 prior models to generate one posterior model. We also propose a metric ensemble Kalman filter (Metric EnKF), which applies the ensemble Kalman filter (EnKF) to the parameterizations by the kernel KL expansion in metric space. Metric EnKF overcomes some critical limitations of EnKF: it preserves prior geologic information and provides stable and consistent filtering. However, the results of Metric EnKF applied to various cases, including the Brugge field-scale synthetic reservoir, show the same problem as the EnKF in general: it does not provide a realistic uncertainty model.
Contents
Preface iv
1 Motivation 1
1.1 Modeling Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 A Bayesian approach to modeling uncertainty . . . . . . . . . . . . . 2
1.3 Uncertainty in reservoir modeling . . . . . . . . . . . . . . . . . . . . 5
1.4 Reservoir modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 State-of-the-art review . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.5.1 Inverse modeling approaches . . . . . . . . . . . . . . . . . . . 9
1.5.2 Introducing distances and kernels . . . . . . . . . . . . . . . . 11
1.5.3 Modeling uncertainty in metric space . . . . . . . . . . . . . . 12
1.6 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.8.2 Distance calculation . . . . . . . . . . . . . . . . . . . . . . . . 28
2.8.3 Models in projection space by MDS . . . . . . . . . . . . . . . 30
2.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.5.1 Model selection to reduce ensemble size . . . . . . . . . . . . . 145
5.6 Comparison of metric EnKF with the post-image solution . . . . . . . 163
5.6.1 Comparison of metric EnKF with post-image problem . . . . 163
5.6.2 A case where the "true Earth" is near the boundary of the prior . . . 165
5.6.3 A case where few prior models are near the "true Earth" . . . 170
5.7 Application to Brugge field-scale synthetic data . . . . . . . . . . . . 176
5.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
List of Tables
A.1 Generation of 104 models with different techniques. For facies, YES means the generation of porosity and permeability is based on the facies model and NO means facies are ignored; for fluvial (porosity generation method), MPS means multiple-point geostatistical simulation and SIS means sequential indicator simulation; for permeability (permeability generation method), KS means the permeability model is generated by the single-poroperm regression, KP means the poroperm regression per facies, and KM means the permeability model by co-kriging on porosity. The number in the parentheses represents the number of models generated. . . . 209
List of Figures
1.1 A general way to approach Bayes’ rule: to not explicitly state the
posterior, but to produce posterior samples that follow Bayes’ rule. . 3
1.2 How the rejection sampler works. . . . . . . . . . . . . . . . . . . . . . 5
1.3 Various sources of uncertainty in the oil industry. . . . . . . . . . . . 5
2.1 The size of reservoir model is 310 ft × 310 ft × 10 ft. The domain is
discretized into 961 gridblocks (31 × 31). An injector and a producer
are completed at (45 ft, 45 ft) and (275 ft, 275 ft) respectively. . . . . . 26
2.2 6 out of 1,000 initial log-permeability models generated by SGSIM.
(-3 to 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3 Training image used for SNESIM models. . . . . . . . . . . . . . . . . 28
2.4 6 out of 1,000 initial facies distribution models generated by SNESIM. 29
2.5 Projection of metric space of 1,000 SGSIM and 1,000 SNESIM models
using Euclidean distance. . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.6 Projection of metric space of 1,000 SGSIM and 1,000 SNESIM models
using the connectivity distance. . . . . . . . . . . . . . . . . . . . . . . 32
2.7 Projection of metric space using the connectivity distance (continu-
ous variables). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.8 Projection of metric space using the connectivity distance (binary
variables). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.9 The correlations between the distances defined and the distances
in the low-dimensional projection space. From left: 1D projection
space, 2D, 3D, and 10D. X-axis means the connectivity distance and
y-axes the distance in the projection space by MDS. . . . . . . . . . . 35
2.10 The correlations between the distance defined and the dynamic re-
sponse. The red lines in (b) and (c) represent the mean of difference
in dynamic data of a range of connectivity distance. . . . . . . . . . . 36
2.11 Representation of the difference in dynamic responses between all
models and a specific model. . . . . . . . . . . . . . . . . . . . . . . . 37
3.1 Various solutions of the pre-image problem: all the methods con-
verge to the minimum. . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.2 Various solutions of the pre-image problem: Schölkopf and Smola
(2002) fixed-point iteration algorithm does not converge to the min-
imum. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3 Various solutions of the pre-image problem: Kwok and Tsang (2004)
algorithm does not converge to the minimum. . . . . . . . . . . . . . 48
3.4 Various solutions of the pre-image problem: Conjugate gradient method
does not converge to the minimum. . . . . . . . . . . . . . . . . . . . 51
3.5 Various solutions of the pre-image problem: all the methods other
than the case starting from the proposed initial point do not con-
verge to the minimum. . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.6 4 of 300 Gaussian models generated by SGSIM. . . . . . . . . . . . . . 54
3.7 4 of 300 new models generated by applying unconstrained optimization to the pre-image problem using the Euclidean distance. . . . 55
3.8 4 of 300 new models generated by applying unconstrained opti-
mization to the pre-image problem using connectivity distance. . . . 56
3.9 300 initial Gaussian random function models and 300 new models. . 56
3.10 QQ-plot between an initial prior model and a new model from unconstrained optimization of the pre-image problem. The green line is the 45° line. The Gaussian shape is preserved but not the variance. . . . 57
3.11 Histograms of an initial prior model and a new model. . . . . . . . . 57
3.12 Variograms (standardized) of an initial prior model and a new model. 57
3.13 4 of 300 uniform random function models generated by the DSSIM. . 58
3.14 4 of 300 new models generated by applying unconstrained optimization to the pre-image problem using the Euclidean distance. . . . 59
3.15 4 of 300 new models generated by applying unconstrained opti-
mization to the pre-image problem using connectivity distance. . . . 60
3.16 300 initial uniform random function models and 300 new models. . . 60
3.17 QQ-plot between an initial prior model and a new model from unconstrained optimization of the pre-image problem. The green line is the 45° line. . . . 61
3.18 Histograms of an initial prior model and a new model. . . . . . . . . 61
3.19 Variograms (standardized) of an initial prior model and a new model. 61
3.20 Illustration of the proximity distance transform. . . . . . . . . . . . . 63
3.21 Boolean channel, lobe, and fracture models. . . . . . . . . . . . . . . . 66
3.22 The processes of feature constrained optimization for Boolean chan-
nel models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.23 The processes of feature constrained optimization for Boolean lobe
models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.24 The processes of feature constrained optimization for Boolean frac-
ture models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.25 300 initial models (red) and 300 new models (blue) in the projection
of metric space by MDS. . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.26 The probability perturbation method to solve the pre-image prob-
lem (geologically-constrained optimization). . . . . . . . . . . . . . . 85
3.27 The PPM iterations from a randomly chosen model as an initial model. 88
3.28 The PPM iterations from the best fit model amongst the initial prior
models as an initial model. . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.29 Initial and final models obtained by PPM starting from current best-
fit model. From the same initial model in (a) (best-fit model), PPM
provides the same final model with 10 trials of different random seeds. 90
3.30 The example solutions of the unconstrained optimization, which is
used as initial conditional probability of the PPM. . . . . . . . . . . . 90
3.31 The PPM iterations from the solution of the fixed-point iteration al-
gorithm as an initial probability for the PPM. . . . . . . . . . . . . . . 91
3.32 A diverse set of the pre-image solutions obtained by the PPM start-
ing from the solution of unconstrained optimization. . . . . . . . . . 92
4.16 Watercut curves of 100 initial prior models displayed with data. . . . 112
4.17 The locations of the true Earth and initial prior models in the projec-
tion of metric space. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
4.18 The objective function of post-image problem in the projection of
metric space. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.19 6 of 30 solutions from the fixed-point iteration algorithm used as an
initial probability in the probability perturbation method for solving
post-image problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.20 6 of 30 posterior models constrained to watercut history obtained by
solving post-image and pre-image problems. . . . . . . . . . . . . . . 115
4.21 Watercut curves for 30 posterior models obtained by post-image and
pre-image problems matching the watercut history. . . . . . . . . . . 116
4.22 6 of 30 posterior models obtained by the rejection sampler. . . . . . . 116
4.23 Watercut curves for 30 posterior models obtained by the rejection
sampler. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.24 Comparison of the mean and conditional variance of 30 posterior
models from the post-image problem with those from the rejection
sampler. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
4.25 Watercut curves of 100 initial prior models displayed with data. . . . 119
4.26 The locations of the true Earth and initial prior models in the projec-
tion of metric space. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
4.27 3 solutions from the unconstrained optimization used as an initial
probability in the probability perturbation method for solving post-
image problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.28 3 posterior models constrained to watercut history obtained by solv-
ing post-image and pre-image problems. . . . . . . . . . . . . . . . . 120
4.29 Watercut curves for 3 posterior models obtained by post-image and
pre-image problems matching the watercut history. . . . . . . . . . . 121
4.30 Comparison of the post-image and pre-image solution methods with
other sampling techniques. . . . . . . . . . . . . . . . . . . . . . . . . 122
4.31 6 of 30 posterior models constrained to watercut history obtained by
solving post-image and pre-image problems with iteration. . . . . . . 123
4.32 Watercut curves for 30 posterior models obtained by the post-image and pre-image problems with iteration. Watercut curves of initial prior models and newly added prior models are displayed separately. . . . 124
4.33 6 of 30 posterior models obtained by the rejection sampler. . . . . . . 124
4.34 Watercut curves for 30 posterior models obtained by the rejection
sampler. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.35 The objective function of post-image problem in the projection of
metric space. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.36 Comparison of the mean and conditional variance of 30 posterior
models from the post-image problem (238 forward simulations) with
those from the rejection sampler (11,454 forward simulations). . . . . 126
4.37 Watercut curves of 100 initial prior models displayed with data. . . . 127
4.38 The locations of the true Earth and initial prior models in the projec-
tion of metric space. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.39 The objective function of post-image problem in the projection of
metric space. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
4.40 6 of 15 solutions from the fixed-point iteration algorithm used as an
initial probability in the probability perturbation method for solving
post-image problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
4.41 6 of 15 posterior models constrained to watercut history obtained by
solving post-image and pre-image problems. . . . . . . . . . . . . . . 130
4.42 Watercut curves for 15 posterior models obtained by post-image and
pre-image problems matching the watercut history. . . . . . . . . . . 130
4.43 6 of 15 posterior models obtained by the rejection sampler. . . . . . . 131
4.44 Watercut curves for 15 posterior models obtained by the rejection
sampler. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.45 Comparison of the mean and conditional variance of 15 posterior
models from the post-image problem (238 forward simulations) with
those from the rejection sampler (12,424 forward simulations). . . . . 132
5.1 Watercut data measured every two months. Only the red circles ◦
(noisy data) are available to the algorithm. . . . . . . . . . . . . . . . 146
5.2 2D projection of metric space of 1,000 initial models based on their own distances. Color represents the difference in responses between the initial models and the model located at × (◦: low; ◦: high). Since the connectivity distance is highly correlated with the difference in responses, the models, although mapped based on the connectivity distance, are well sorted by the difference in responses. . . . 147
5.3 Log-permeability of 6 out of 1,000 models which are generated by
SGSIM. All the models are conditioned to hard data: 150 md at (45
ft, 45 ft) and (275 ft, 275 ft). . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.4 The mean (left) and conditional variance (right) of log-permeability
of 1,000 initial models. It is verified that all the models are condi-
tioned to hard data. In the map of the mean (left), the well locations,
or the hard data locations, are easily identified: (45 ft, 45 ft) and (275
ft, 275 ft). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.5 Watercut curves simulated with all 1,000 initial models and the measured watercut data. Red circles (◦) mean the measured data. The green line (−) means the mean of the watercut curves. Grey lines (−) show the 1,000 watercut curves. . . . 148
5.6 Watercut curves calculated by the reservoir simulations and the mea-
sured watercut at 260 days. . . . . . . . . . . . . . . . . . . . . . . . . 149
5.7 Update at 260 days in 2D MDS space. ◦'s represent the a priori models (before correction) and ◦'s the a posteriori models (after correction). . . . 149
5.8 Watercut curves calculated by the reservoir simulations and the mea-
sured watercut at 520 days. . . . . . . . . . . . . . . . . . . . . . . . . 150
5.9 Update at 520 days in 2D MDS space. ◦’s represent the a priori mod-
els (before correction) and ◦’s the a posteriori models (after correc-
tion). Grey lines (−) show the path of update. . . . . . . . . . . . . . . 150
5.10 LEFT: Watercut curves calculated by the reservoir simulations and
the measured watercut from 580 days to 1,095 days; RIGHT: Update
from 580 days to 1,095 days in 2D MDS space. ◦’s represent the a
priori models (before correction) and ◦’s the a posteriori models (after
correction). Grey lines (−) show the path of update. . . . . . . . . . . 153
5.11 The updates of log-permeability ln k, a priori and a posteriori water
saturation Sw of one model amongst 300 models. . . . . . . . . . . . . 156
5.12 The mean (left) and conditional variance (right) of log-permeability
of 300 final models after EnKF. . . . . . . . . . . . . . . . . . . . . . . 156
5.13 Log permeability of reference model. . . . . . . . . . . . . . . . . . . . 157
5.14 Watercut curves predicted by reservoir simulations of 300 final mod-
els from 0 days to 1095 days. . . . . . . . . . . . . . . . . . . . . . . . . 157
5.15 The initial 300 models clustered into 30 clusters by means of the
kernel k-mean clustering. . . . . . . . . . . . . . . . . . . . . . . . . . 158
5.16 The 30 models selected (the medoids). . . . . . . . . . . . . . . . . . . 158
5.17 Watercut curves for the initial 300 models and their p50 , p10 , and p90
(red solid line and dotted lines). . . . . . . . . . . . . . . . . . . . . . . 159
5.18 Watercut curves for the selected initial 30 models and their p50 , p10 ,
and p90 (red solid line and dotted lines). . . . . . . . . . . . . . . . . . 159
5.19 p50 , p10 , and p90 of the initial 300 models (red) and the selected initial
30 models (blue). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
5.20 EnKF update of the selected 30 models at 520 days in 2D projection
of metric space. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
5.21 Watercut curves predicted by the final 30 models of EnKF. . . . . . . 161
5.22 Watercut curves for the final 300 models (original ensemble) and
their p50 , p10 , and p90 (red solid line and dotted lines). . . . . . . . . . 161
5.23 Watercut curves for the final 30 models (reduced ensemble) and
their p50 , p10 , and p90 (red solid line and dotted lines). . . . . . . . . . 162
5.24 p50 , p10 , and p90 of the final 300 models (red) and the final 30 models
(blue). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
5.25 Watercut curves of 100 initial prior models displayed with data. . . . 165
5.26 Initial ensemble of prior models in the projection of metric space. . . 166
5.27 Final ensemble of prior models in the projection of metric space. . . . 167
5.28 Update of initial ensemble of prior models in the projection of metric
space. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
5.29 6 of 100 final models constrained to watercut history obtained by
metric EnKF. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
5.30 Watercut curves for 100 posterior models obtained by metric EnKF
matching the watercut history. . . . . . . . . . . . . . . . . . . . . . . . 169
5.31 6 of 100 posterior models obtained by the rejection sampler. . . . . . 169
5.32 Watercut curves for 100 posterior models obtained by the rejection
sampler. (15,305 forward simulations) . . . . . . . . . . . . . . . . . . 170
5.33 The mean and conditional variance of 100 posterior models from
metric EnKF (100 forward simulations) and from the rejection sam-
pler (15,305 forward simulations). . . . . . . . . . . . . . . . . . . . . . 171
5.34 Watercut curves of 100 initial prior models displayed with data. . . . 172
5.35 Initial ensemble of prior models in the projection of metric space. . . 172
5.36 Final ensemble of prior models in the projection of metric space. . . . 173
5.37 Update of initial ensemble of prior models in the projection of metric
space. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
5.38 6 of 60 final models constrained to watercut history obtained by met-
ric EnKF. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
5.39 Watercut curves for 60 posterior models obtained by metric EnKF
matching the watercut history. . . . . . . . . . . . . . . . . . . . . . . . 174
5.40 6 of 60 posterior models obtained by the rejection sampler. . . . . . . 175
5.41 Watercut curves for 60 posterior models obtained by the rejection
sampler. (38,201 forward simulations) . . . . . . . . . . . . . . . . . . 175
5.42 The mean and conditional variance of 60 posterior models from met-
ric EnKF (100 forward simulations) and from the rejection sampler
(38,201 forward simulations). . . . . . . . . . . . . . . . . . . . . . . . 177
5.43 The Brugge field and wells (Oil saturation). . . . . . . . . . . . . . . . 178
5.44 Production history. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
5.45 Permeability of 4 of 65 prior models. . . . . . . . . . . . . . . . . . . . 180
5.46 The prediction of watercut from 65 initial prior models and the data. 182
5.47 65 initial prior models in the projection of the metric space. . . . . . . 183
5.48 Update of the metric EnKF of 65 models of Brugge data set. . . . . . 183
5.49 The prediction of watercut of 65 final models and the data. . . . . . . 184
5.50 The prediction of oil production rates of 65 final models and the data. 185
5.51 The prediction of bottom-hole pressure of 65 final models and the
data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
5.52 The permeability of 4 of 65 final models obtained by the metric EnKF. 187
5.53 The mean and conditional variance of initial 65 models and final 65
models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
A.15 Field oil and water production curves of 104 models by exhaustive
simulations, which cannot be applied in the field. . . . . . . . . . . . 213
A.16 Field oil and water production curves of 6 representative models
chosen by KKM. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
A.17 p10 , p50 , and p90 of field oil production curves of 104 models (green
dashed lines) and 6 representative models chosen by KKM (blue
solid lines). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
A.18 p10 , p50 , and p90 of field water production curves of 104 models
(green dashed lines) and 6 representative models chosen by KKM
(blue solid lines). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
A.19 Well (p17) oil and water production curves of 104 models by exhaus-
tive simulations, which cannot be applied in the field. . . . . . . . . . 215
A.20 Well (p17) oil and water production curves of 6 representative mod-
els chosen by KKM. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
A.21 p10 , p50 , and p90 of well (p17) oil production curves of 104 models
(green dashed lines) and 6 representative models chosen by KKM
(blue solid lines). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
A.22 p10 , p50 , and p90 of well (p17) water production curves of 104 models
(green dashed lines) and 6 representative models chosen by KKM
(blue solid lines). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
A.23 Checking the type of porosity and permeability model generation
method in the spreadsheet of Pointset. x, y, Depth represent the lo-
cation of each model in the space projected by MDS. The case name,
cluster index, and generation methods of permeability and porosity
are listed in the table. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
A.24 Checking the type of porosity and permeability model generation method for 6 representative models only in the spreadsheet of Pointset. . . . 219
A.25 Projection of metric space with displaying the usage of facies infor-
mation for the generation of porosity model (YES: facies considered;
NO: facies ignored). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
A.26 Projection of metric space with displaying the type of simulation
method to generate porosity model (MPS: multiple-point geostatis-
tical method; SIS: sequential indicator simulation). . . . . . . . . . . . 220
A.27 Projection of metric space with displaying the type of method to
generate permeability model (KS: single poroperm regression; KP:
poroperm regression per facies; KM: coKriging on porosity). . . . . . 220
Chapter 1
Motivation
or not. When the event has happened, it does not tell us in hindsight what the uncertainty was. For instance, although the weather forecast says the probability of rain tomorrow is 60%, there are only two possible events: tomorrow it either rains or it does not. For either event, we cannot say whether the stated probability of rain, 60%, was correct. While there is no correct uncertainty, we still have to model the uncertainty for future prediction.
Bayes' rule states

P(X = x | D = d) = P(D = d | X = x) P(X = x) / P(D = d),   (1.1)

where X is the random vector representing the Earth model and D the random vector representing the data; x represents the outcome of the random vector X, or the model, and d the outcome of the random vector D, or the data. P(X = x) is the prior probability of X; P(D = d|X = x) is the likelihood; P(X = x|D = d) is the posterior probability; P(D = d) is the prior probability of D. The posterior probability represents the uncertainty of the model, given the data. Hence, in order to model the uncertainty, the likelihood and the prior probability need to be determined in the Bayesian framework.
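As a purely illustrative sketch (not from the dissertation; the three-outcome model space and all numbers are hypothetical), Bayes' rule over a small discrete model space can be computed directly from the prior and the likelihood:

```python
# Toy discrete illustration of Bayes' rule (hypothetical numbers):
# P(X = x | D = d) = P(D = d | X = x) * P(X = x) / P(D = d)

prior = {"low": 0.5, "mid": 0.3, "high": 0.2}        # P(X = x)
likelihood = {"low": 0.1, "mid": 0.4, "high": 0.8}   # P(D = d | X = x)

# P(D = d): total probability of the data over all prior models
evidence = sum(likelihood[x] * prior[x] for x in prior)

posterior = {x: likelihood[x] * prior[x] / evidence for x in prior}

print(evidence)   # 0.33
print(posterior)  # the "high" model is most supported by the data
```

Note how the model that best explains the data ("high") gains probability mass relative to its prior, while the others lose it.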
The prior probability is the probability of the model before the data are taken into account. However, not taking any data into account does not mean that no information can be considered for the prior probability. The prior probability represents all 'possible' models, and there are many 'impossible' models. For example, assume that a reservoir model is being constructed. Then, strictly speaking, the prior probability could represent all the reservoirs on all the Earth-like planets in the universe, if there were an infinite number of such planets. By the rules of geology, however, geologically unrealistic reservoir models should not be taken into account for the prior.
It is also convenient to use the terms prior and posterior probability in a relative sense. Assume we have two types of data, D1 and D2, and suppose D1 and D2 are considered sequentially. When we first consider D1, the prior probability is P(X) and the posterior probability is P(X|D1). Then, if D2 is considered only after looking at D1, the prior probability becomes P(X|D1) and the posterior probability is P(X|D1, D2). In this relative sense, the prior probability is the probability before certain data are considered.
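This relative use of prior and posterior can be checked with a small numerical sketch (hypothetical numbers; it assumes D1 and D2 are conditionally independent given X): updating on D1 and then on D2, with P(X|D1) serving as the new prior, gives the same result as updating on both data at once.

```python
# Sequential Bayesian updating: P(X|D1) becomes the prior when D2 arrives.
# Hypothetical two-model example; assumes D1, D2 conditionally independent given X.

def update(prior, likelihood):
    """One application of Bayes' rule over a discrete model space."""
    evidence = sum(likelihood[x] * prior[x] for x in prior)
    return {x: likelihood[x] * prior[x] / evidence for x in prior}

prior = {"a": 0.6, "b": 0.4}
lik_d1 = {"a": 0.2, "b": 0.7}
lik_d2 = {"a": 0.9, "b": 0.3}

# Sequential: D1 first, then D2 with P(X|D1) as the new prior
post_seq = update(update(prior, lik_d1), lik_d2)

# Joint: both data at once; the combined likelihood is the product
post_joint = update(prior, {x: lik_d1[x] * lik_d2[x] for x in prior})

assert all(abs(post_seq[x] - post_joint[x]) < 1e-9 for x in prior)
print(post_seq)
```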
The posterior probability represents all models that honor all the data amongst all possible prior models. Note that this is more than saying "all models that honor the data": there may be models matching the data that are not in the prior. There are two ways to approach Bayes' rule. The first is to state the posterior distribution explicitly by multiplying a prior and a likelihood and then sample from it. The second is to not explicitly state the posterior, but to produce posterior samples that follow Bayes' rule. We will mostly follow the second approach since it is more general (Figure 1.1): there are very few multivariate probability distributions (basically only the multivariate Gaussian) that make the first approach feasible.
Figure 1.1: A general way to approach Bayes’ rule: to not explicitly state the pos-
terior, but to produce posterior samples that follow Bayes’ rule.
There are many sampling techniques that can be used for these two approaches to Bayes' rule, but rejection sampling is the only sampling method that creates samples that represent the posterior probability perfectly. Rejection sampling provides a set of posterior models that represents the posterior probability by rejecting, amongst all the prior models, those models that do not honor the data. "Honoring" is expressed in the likelihood, which contains the data-model relationship as well as any error modeled for that relationship. The data are used in this sense to "falsify" the models (Popper, 2002; Tarantola, 2006). The rejection sampler is in that sense a perfect sampler: it follows Bayes' rule correctly.
The rejection sampler basically follows this procedure (Figure 1.2):

1. Sample a model x from the prior probability P(X = x).
2. Evaluate the forward model to obtain the response g(x).
3. Compare the response g(x) with the data d.
4. Accept the model x with a certain probability (Equation 1.2) depending upon the mismatch between the response and the data:

p = P(D = d|X = x) / pmax = f(‖g(x) − d‖),   (1.2)

where pmax may be chosen as the maximum value of P(D = d|X = x). The acceptance ratio depends on the constant pmax.
Briefly speaking, the rejection sampler rejects or accepts a prior model depending upon the shape of the likelihood, i.e., the nature of the data-model relationship.
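The rejection sampler described above can be sketched in a few lines. This is a minimal illustration under hypothetical assumptions (a one-dimensional Gaussian prior, a made-up forward model g(x) = x², a single datum, and a Gaussian likelihood in the mismatch), not the reservoir workflow of the dissertation:

```python
import math
import random

def rejection_sample(sample_prior, g, d, sigma, n_posterior):
    """Accept prior draws with probability p = L(x)/pmax (cf. Equation 1.2),
    where the likelihood L is Gaussian in the data mismatch ||g(x) - d||."""
    posterior = []
    while len(posterior) < n_posterior:
        x = sample_prior()                            # 1. draw a prior model
        mismatch = abs(g(x) - d)                      # 2.-3. forward-simulate and compare
        p = math.exp(-0.5 * (mismatch / sigma) ** 2)  # L(x)/pmax; pmax = L at zero mismatch
        if random.random() < p:                       # 4. accept with probability p
            posterior.append(x)
    return posterior

random.seed(0)
# Hypothetical setup: prior x ~ N(0, 1), forward model g(x) = x^2, datum d = 1
samples = rejection_sample(lambda: random.gauss(0.0, 1.0),
                           lambda x: x * x, d=1.0, sigma=0.2, n_posterior=200)
# Posterior mass concentrates near x = +/-1, where g(x) matches the datum
print(sum(abs(abs(x) - 1.0) < 0.5 for x in samples) / len(samples))
```

The sketch also shows why the method is expensive: each accepted posterior model costs many rejected prior evaluations, which is exactly the inefficiency discussed next.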
However, if the data-model relationship is complex and/or requires considerable CPU time to be evaluated, then the rejection sampler can be extremely slow. Although many other sampling techniques have been developed to overcome this computational inefficiency of rejection sampling, the gain in computational efficiency often means a loss in adhering to Bayes' rule, which may lead to unrealistic models of uncertainty, often models of uncertainty that have less uncertainty than modeled through Bayes' rule.
Reservoirs are large and complex geological structures in the subsurface of the Earth.
Often, reservoir data used for modeling have already been subject to a great deal of interpretation from raw measurements and hence are themselves uncertain. Moreover, the raw measurements are subject to error.
and the GDM generate posterior models contained within the prior models. Both methods update a current model by combining it with a new possible model through a one-parameter optimization. The difference is that the PPM deals with the probability and the GDM with the random numbers used in geostatistical simulations (Caers, 2007). While the PPM and the GDM can provide a posterior model, they need a relatively large number of forward simulations, which usually take several hours to days. Moreover, they yield only one posterior model, hence a large number of optimizations with different initial models is required in order to obtain multiple models.
In order to provide multiple models, the Ensemble Kalman Filter (EnKF) has recently been applied and researched very actively. The Kalman Filter (KF) is a technique for obtaining the prediction of a linear dynamic system (Kalman, 1960). Though the KF deals with a linear stochastic difference equation, the Extended Kalman Filter (EKF) can be applied when the relationship between the process and the measurements is nonlinear (Welch and Bishop, 2004). However, the EKF is not applicable to highly nonlinear systems. Evensen (1994) developed a modified Kalman filter, the EnKF, for highly nonlinear problems. Due to its high performance and applicability, the EnKF has rapidly spread and has been effectively applied to a variety of fields, such as ocean dynamics, meteorology, hydrogeology, and so forth (Houtekamer and Mitchell, 1998; Reichle et al., 2002; Margulis et al., 2002; Evensen, 2003, 2004, 2009).
Nævdal and Vefring (2002) brought the EnKF into the history matching problem. At first, the EnKF was applied to characterize a near-wellbore reservoir. Since then, the EnKF has been utilized to identify the detailed permeability distribution for the entire reservoir (Nævdal et al., 2005). Gu and Oliver (2005) updated the permeability and porosity of a full 3D reservoir simultaneously in the PUNQ-S3 reservoir, a realistic synthetic reservoir used to verify the performance of history matching. Gao et al. (2005) compared the randomized maximum likelihood method with the EnKF. Liu and Oliver (2005) carried out the EnKF for geologic facies. Park et al. (2005) demonstrated the applicability of the EnKF to aquifer parameter identification through a comparison with the SA and the GDM. Park and Choe (2006) also presented a modified EnKF for a waterflooded reservoir with methods of regeneration and selective use of the observations. As the EnKF showed good performance and diverse advantages in history matching problems, it has been actively researched across petroleum academia and industry (Zhang et al., 2007; Zafari and Reynolds, 2005; Skjervheim et al., 2005; Lorentzen et al., 2005; Jafarpour and McLaughlin, 2008; Sarma and Chen, 2009).
However, the EnKF can neither preserve the prior geologic information, nor
condition to secondary data such as seismic surveys, nor provide a realistic
uncertainty assessment under Bayes' rule: the multiple models obtained by the EnKF
do not represent the posterior probability. It turns out that the models from the
EnKF represent a dramatically reduced uncertainty, because the EnKF updates the
ensemble of models to match the dynamic data only. In addition, the state vectors
representing the model and the measurements should be normally distributed in the
EnKF (Evensen, 1994). Therefore, the EnKF cannot handle discrete properties, e.g.
facies models for channel-bed reservoirs.
a model from feature space to model space. As a result, a new model is obtained
by a nonlinear combination of an ensemble of models. He optimized this relatively
short parameter vector and found a solution conditioned to dynamic data by
gradient-based optimization algorithms. While this technique accounts for some
prior information, the solution of the pre-image problem used in Sarma (2006) does
not sample exactly from the prior; particularly for discrete (facies) problems,
issues related to the reproduction of the crisp facies architecture remain.
Additionally, the approach (embedded in gradient-based optimization) does not
follow Bayes' rule (except possibly for multi-Gaussian models).
A kernel can be understood as a similarity measure, because a kernel is a dot
product of feature vectors, and similar feature vectors yield a large dot product.
If the similarity from a kernel function can represent the similarity of dynamic
data, optimization becomes easier and more efficient. However, the polynomial
kernels used in Sarma (2006) are not well correlated with the difference in
dynamic data. Hence, we have to devise a more efficient and reliable way to
model the Earth.
While a kernel is a similarity measure, a distance is a dissimilarity measure.
As a dissimilarity measure, Suzuki and Caers (2008) introduced the Hausdorff
distance. They showed how a (static) distance between any two models that
correlates with their difference in flow responses can be used to search for
history-matched models by means of efficient search algorithms, such as the
neighborhood search algorithm and the tree-search algorithm. This method was
successfully applied to structurally complex reservoirs (Suzuki et al., 2008).
In order for this method to work, the dissimilarity distance between any two
models should be reasonably correlated with the difference in dynamic data.
1.6 Objectives
This dissertation aims to model the uncertainty in Earth models using distance-
based techniques. For this objective, the following will be kept in mind.
1. The modeled uncertainty should be consistent with the result of rejection
sampling. This means that any set of models should span the same uncertainty
as the models generated by the rejection sampler.
3. The models which describe the uncertainty honor all available data, including
any data other than production data, such as well-log and seismic data.
2.1 Introduction
A new paradigm for modeling uncertainty in Earth modeling has been proposed
by Caers (2008b), called Modeling Uncertainty in Metric Space (MUMS).
The techniques for MUMS make it possible for all the operations involved in
modeling uncertainty to be performed in metric space, where the models are
represented exclusively by their mutual differences in responses, as defined by a
"distance".
MUMS was triggered by the observation that often our main interest in modeling
uncertainty is not uncertainty in the model itself but uncertainty in a future
prediction. Earth models are often represented by various properties assigned to
several million gridblocks in a complex structural grid. Yet the future prediction
of interest consists of relatively few responses at wells over time, which are
one-dimensional time-series vectors. Hence, MUMS focuses on the simple
low-dimensional responses, not the high-dimensional model itself, such that all
the operations in modeling uncertainty are performed within a much
lower-dimensional space.
In MUMS, a distance representing the difference in responses is defined between
any two models in order to focus on the responses rather than the models
themselves. The distance defined between any two models constructs a metric space
in which all the models are mapped and located. Therefore, all the further
operations performed
CHAPTER 2. MODELING UNCERTAINTY IN METRIC SPACE 16
a model is N and the number of models L, then a set (or ensemble) of models is
represented by the matrix X (Equation 2.1). The ensemble of models is generated by
geostatistical (variogram-based, Boolean, or multiple-point) simulation methods
aiming to cover the prior uncertainty space constrained to any prior data, such as
well core and log data as well as seismic data. For example, if we generate 100
models of porosity and permeability and the number of gridblocks in the geological
model grid is 100,000, then xi contains the porosity and permeability values of
each gridblock; the size of xi is therefore 200,000 × 1 and the size of X is
100 × 200,000.
$$\mathbf{X} = \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_i & \cdots & \mathbf{x}_L \end{bmatrix}^{\top} \in \mathbb{R}^{L \times N} \qquad (2.1)$$
$$\mathbf{g}_i = g(\mathbf{x}_i), \quad i = 1, \ldots, L \qquad (2.2)$$
2.3 Distance
MUMS starts by defining a distance. A distance is defined such that the distance
between any two models is reasonably correlated with the difference in the
responses of the two models (Equation 2.3) (for a study of this correlation, refer
to Caers and Scheidt (2010)). We can define any type of function for the distance
calculation as long as Equation 2.3 holds. Note that the function
$d(\mathbf{x}_i, \mathbf{x}_j)$ should be computationally cheap and fast, while
the evaluation of the function $g(\mathbf{x}_i)$ often requires significant time
and effort.
$$d_{ij} = d(\mathbf{x}_i, \mathbf{x}_j) \text{ correlates with } \sqrt{(\mathbf{g}_i - \mathbf{g}_j)^{\top} (\mathbf{g}_i - \mathbf{g}_j)} \qquad (2.3)$$
In most cases, the size of $\mathbf{g}_i$ is much smaller than that of
$\mathbf{x}_i$, as mentioned before ($N \gg N_t$). In other words, a model
$\mathbf{x}_i$ usually lives in a very high-dimensional ($10^5$ to $10^8$
dimensions) space regardless of its response, whereas $\mathbf{g}_i$ usually
lives in a low-dimensional (1 to $10^3$ dimensions) space. Hence, if we construct
a metric space representing the models exclusively by the distance, the models in
metric space are arranged according to the differences in their responses, which
is much more effective in most applications than working in the high-dimensional
model space (Caers et al., 2010). This simplification is the reason why we can
represent the constructed metric space as a projection onto a low-dimensional
(2D to 5D) space through MDS.
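As a numerical illustration of the requirement in Equation 2.3, the correlation between a candidate distance and the response difference can be checked directly on a synthetic ensemble. The following sketch is our own illustration (in Python); the ensemble, the toy response function, and all variable names are invented here, not taken from the dissertation:

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 50, 200              # number of models, model dimension

# Synthetic ensemble: each row is one model x_i.
X = rng.normal(size=(L, N))

# A toy smooth "response" g(x): a mild nonlinearity of 20 random
# projections, standing in for an expensive forward simulation.
W = rng.normal(size=(N, 20))
G = np.tanh(X @ W / np.sqrt(N))

def pairwise_dist(A):
    """Euclidean distance matrix between the rows of A."""
    sq = np.sum(A**2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * A @ A.T
    return np.sqrt(np.maximum(D2, 0.0))

D_static = pairwise_dist(X)   # candidate distance d(x_i, x_j)
D_resp = pairwise_dist(G)     # response difference ||g_i - g_j||

# Correlation over the strictly upper triangle (all pairs i < j);
# a usable distance should give a clearly positive value here.
iu = np.triu_indices(L, k=1)
rho = np.corrcoef(D_static[iu], D_resp[iu])[0, 1]
```

A high correlation is what justifies replacing the expensive response difference with the cheap static distance; Caers and Scheidt (2010) study this correlation in depth.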
where $\mathbf{X}_m$ is defined as

$$\mathbf{X}_m = \begin{bmatrix} \mathbf{x}_{1,m} & \mathbf{x}_{2,m} & \cdots & \mathbf{x}_{i,m} & \cdots & \mathbf{x}_{L,m} \end{bmatrix}^{\top} \in \mathbb{R}^{L \times m} \qquad (2.5)$$
where the subscript $m$ indicates the dimension of the projection space, i.e. the
number of eigenvalues retained. MDS is simply performed by eigenvalue
decomposition, retaining a 'reasonable' number of the largest positive
eigenvalues, as in Equations 2.6 to 2.8. 'Reasonable' means large enough to
capture the variation of the models in metric space; the correlation coefficient
between the distances in metric space and the distances in the projection space
determines this number. Equation 2.6 represents the centering of the distance
matrix.
$$\mathbf{B} = \mathbf{H}\mathbf{A}\mathbf{H} \qquad (2.6)$$

where $\mathbf{H} = \mathbf{I} - \frac{1}{L} \mathbf{1}\mathbf{1}^{\top} \in \mathbb{R}^{L \times L}$ and $a_{ij} = -\frac{1}{2} d_{ij}^2$. Next, the eigenvalue decomposition of $\mathbf{B}$ is

$$\mathbf{B} = \mathbf{V}_B \boldsymbol{\Lambda}_B \mathbf{V}_B^{\top} \qquad (2.7)$$

$$\mathbf{X}_m = \mathbf{V}_{B,m} \boldsymbol{\Lambda}_{B,m}^{1/2} \qquad (2.8)$$
MDS is required for several reasons. First, MDS transforms the defined distance
into an equivalent Euclidean distance: the distance in the projection of metric
space by MDS is Euclidean and almost the same as the defined distance. In most
cases, a three-dimensional projection space is enough ($m = 3$).
L
Copt = argmin ∑ min kc j − xi,m k with j = 1, 2, · · · , Nc (2.9)
C
i =1 j
Therefore, the equations for KKM are obtained by slightly modifying Equations 2.9
and 2.10 (Equations 2.14 and 2.15).
L
Copt = argmax ∑ max k(c j , xi,m ) with j = 1, 2, · · · , Nc (2.14)
C i =1 j
$$\mathbf{C}_X = \frac{1}{L} \sum_{j=1}^{L} \mathbf{x}_j \mathbf{x}_j^{\top} = \frac{1}{L} \mathbf{X}^{\top} \mathbf{X} \qquad (2.16)$$
A. $\mathbf{y}_{\text{new}}$ represents the parameterization for the model
$\mathbf{x}_{\text{new}}$. The parameterization $\mathbf{y}_i$ or
$\mathbf{y}_{\text{new}}$ is a standard Gaussian random vector whose size is
determined by how many eigenvalues are retained
($\mathbf{y}_i \in \mathbb{R}^{m \times 1}$). We do not have to use all L nonzero
eigenvalues; typically only a few large eigenvalues are retained ($m \leq L$). By
Equation 2.19, we can generate many models representing the same covariance
and the same uncertainty space (Caers et al., 2010).
In order to consider higher-order moments or spatial correlation beyond the
point-by-point covariance, the feature expansions of the models can be introduced.
Let $\phi$ be the feature map from the model space $\mathbb{R}^m$ to the feature
space $F$ (Equations 2.20 and 2.21).

$$\phi : \mathbb{R}^m \to F \qquad (2.20)$$

$$\mathbf{x}_m \mapsto \boldsymbol{\phi} := \phi(\mathbf{x}_m) \qquad (2.21)$$
where $\boldsymbol{\phi}$ is the feature expansion of the model. With the feature
expansion of the ensemble $\phi(\mathbf{X}_m)$ (defined by Equation 2.22), a model
is parameterized and a new feature expansion is generated in the same manner as
above (Equations 2.24 and 2.25). The covariance of the feature expansions
$\phi(\mathbf{x}_j)$ of the ensemble and its eigenvalue decomposition are
calculated by Equation 2.23.
$$\mathbf{C}_{\Phi} = \frac{1}{L} \sum_{j=1}^{L} \phi(\mathbf{x}_{j,m}) \phi(\mathbf{x}_{j,m})^{\top} = \frac{1}{L} \phi(\mathbf{X}_m) \phi(\mathbf{X}_m)^{\top} = \frac{1}{L} \boldsymbol{\Phi} \boldsymbol{\Phi}^{\top} = \mathbf{V}_{C_\Phi} \boldsymbol{\Lambda}_{C_\Phi} \mathbf{V}_{C_\Phi}^{\top} \qquad (2.23)$$
$$\mathbf{K} \mathbf{V}_K = \mathbf{V}_K \boldsymbol{\Lambda}_K \qquad (2.28)$$
Then, the eigenvectors and the corresponding eigenvalues of the covariance are
calculated directly from the eigenvectors and eigenvalues of the kernel matrix,
which takes much less time (Equation 2.29).
$$\boldsymbol{\Lambda}_{C_\Phi} = \frac{1}{L} \boldsymbol{\Lambda}_K, \qquad \boldsymbol{\Phi}^{\top} \mathbf{V}_{C_\Phi} = \mathbf{V}_K \boldsymbol{\Lambda}_K^{1/2} \qquad (2.29)$$

$$\boldsymbol{\Phi} = \mathbf{V}_{C_\Phi} \boldsymbol{\Lambda}_{C_\Phi}^{1/2} \mathbf{Y} \qquad (2.30)$$

$$\boldsymbol{\Phi}^{\top} \boldsymbol{\Phi} = \underbrace{\boldsymbol{\Phi}^{\top} \mathbf{V}_{C_\Phi}}_{\mathbf{V}_K \boldsymbol{\Lambda}_K^{1/2}} \; \underbrace{\boldsymbol{\Lambda}_{C_\Phi}^{1/2}}_{\frac{1}{\sqrt{L}} \boldsymbol{\Lambda}_K^{1/2}} \, \mathbf{Y} \qquad (2.31)$$

$$\mathbf{K} = \frac{1}{\sqrt{L}} \mathbf{V}_K \boldsymbol{\Lambda}_K \mathbf{Y} \qquad (2.32)$$
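Equation 2.29 states that the eigenvalues of the feature covariance are those of the L × L kernel matrix divided by L, which is what makes the computation cheap. This can be verified numerically for a kernel whose feature map is explicit; the toy sketch below uses the linear kernel $\phi(\mathbf{x}) = \mathbf{x}$ (the sizes and names are our own):

```python
import numpy as np

rng = np.random.default_rng(3)
L, N = 20, 5

# Feature expansions as columns of Phi (linear kernel: phi(x) = x).
Phi = rng.normal(size=(N, L))

C = Phi @ Phi.T / L      # feature covariance (Equation 2.23), N x N
K = Phi.T @ Phi          # kernel (Gram) matrix, L x L

# Eigenvalues, sorted in descending order; the nonzero spectra agree
# up to the factor 1/L, as in Equation 2.29.
lam_C = np.sort(np.linalg.eigvalsh(C))[::-1]
lam_K = np.sort(np.linalg.eigvalsh(K))[::-1]
```

For a nonlinear kernel the feature space may be huge or infinite-dimensional, so working with the L × L matrix K is the only practical option; the toy linear case merely makes the identity checkable.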
Figure 2.1: Reservoir geometry.
2.8 Example
This section provides an example that illustrates the usefulness of the MUMS
techniques by means of a synthetic reservoir modeling problem.
Consider a reservoir of 310 ft × 310 ft × 10 ft. Uncertainty on facies and
petrophysical properties will be considered. There is strong spatial correlation
along the NE50 direction. An injector and a producer are completed at the
bottom-left corner (45 ft, 45 ft) and the top-right corner (275 ft, 275 ft),
respectively (Figure 2.1). The permeability values at the wells are given as
150 md.
Figure 2.2: 6 out of 1,000 initial log-permeability models generated by SGSIM
(color scale: −3 to 3).
Secondly, 1,000 models are generated by the Single Normal Equation Simulation
(SNESIM) algorithm, conditioned to hard data (facies = sand at the two well
locations). The facies distribution is modeled with a training image with
meandering channels of mostly NE50 direction (Figure 2.3). Constant permeability
values are assigned to each facies: 150 md for the sand facies and 1 md for the
mud background. Figure 2.4 shows the facies distributions of 6 out of the 1,000
models.
Figure 2.4: 6 out of 1,000 initial facies distribution models generated by SNESIM.
in the reservoir. The connectivity distance is devised as a distance correlated
with the difference in the dynamic responses of reservoir models.
In order for the connectivity distance to exhibit a high correlation with the
difference in the dynamic responses of reservoir models, the Time of Flight (TOF;
Datta-Gupta and King, 1995) is employed, which shows satisfactory correlation.
The TOF from an injector to a producer is calculated by streamline tracing
(Thiele et al., 1996) under steady-state conditions. Typically, determining the
pressure field and tracing the streamlines under steady-state conditions requires
only one hundredth to one thousandth of the usual reservoir simulation time.
Equation 2.40 shows the TOF-based injector-to-producer distance calculation. We
choose a percentile among the TOFs of the streamlines that arrive at a producer.
$$\tau_k^{ji} = \int_{w_j^I}^{w_i^P} \frac{d\zeta_k}{v(\zeta_k)} \qquad (2.40)$$

where $\tau_k^{ji}$ represents the TOF of the $k$-th streamline from the $j$-th
injector $w_j^I$ to the $i$-th producer $w_i^P$, $\zeta_k$ is the coordinate along
the $k$-th streamline, and $v(\zeta)$ is the
where $N_{wp}$ and $N_{sl}$ represent the number of producers and the number of
streamlines, respectively, $M^{\circ}$ denotes the end-point mobility ratio, and
$t$ the time.
The analytical fractional flow at producer $w_i^P$ is obtained by
where $q_w$ and $q_o$ denote the water and oil flow rates, $k_{rw}$ and $k_{ro}$
the water and oil relative permeabilities, and $\mu_w$ and $\mu_o$ the water and
oil viscosities, respectively.
Finally, a connectivity distance between models $\mathbf{x}_i$ and $\mathbf{x}_j$
is calculated as the difference in the fractional flow curves:

$$d(\mathbf{x}_i, \mathbf{x}_j) = \frac{1}{N_{wp}} \sum_{k=1}^{N_{wp}} \int_0^t \left( f_w^{x_i}(t; w_k^P) - f_w^{x_j}(t; w_k^P) \right)^2 dt \qquad (2.43)$$
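For a single producer, a discrete analogue of Equation 2.43 is a time integral of the squared difference between two fractional-flow curves. The sketch below is purely illustrative: the two logistic breakthrough curves are invented, and a trapezoidal rule stands in for the integral:

```python
import numpy as np

def connectivity_distance(fw_i, fw_j, t):
    """Trapezoidal-rule analogue of Equation 2.43 for one producer:
    the integral over time of the squared fractional-flow difference."""
    sq = (fw_i - fw_j)**2
    return np.sum(0.5 * (sq[:-1] + sq[1:]) * np.diff(t))

# Two invented watercut curves: earlier vs. later water breakthrough.
t = np.linspace(0.0, 10.0, 200)
fw_a = 1.0 / (1.0 + np.exp(-(t - 4.0)))
fw_b = 1.0 / (1.0 + np.exp(-(t - 6.0)))

d_ab = connectivity_distance(fw_a, fw_b, t)   # positive: curves differ
d_aa = connectivity_distance(fw_a, fw_a, t)   # zero: identical curves
```

The distance is zero for identical curves and symmetric in its two arguments, as any dissimilarity measure must be.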
in Figure 2.5 (a) are distributed normally in 2D space since the 1,000 models are
multivariate Gaussian. The circles in Figure 2.5 (b) represent the 1,000 SNESIM
models. Both 2D maps show 1,000 models.
Figure 2.5: Projection of metric space of 1,000 SGSIM and 1,000 SNESIM models
using Euclidean distance.
Figure 2.6 depicts the projection of the constructed metric space by means of
MDS using the connectivity distance. Likewise, the circles in Figure 2.6 (a)
represent the 1,000 SGSIM models and the circles in Figure 2.6 (b) the 1,000
SNESIM models. Note that there are two groups of models in the projection: in
Figure 2.6 (a), a narrow line on the left and a wide plume on the right; in
Figure 2.6 (b), a narrow line on the bottom and a wide plume on the top, though
not as clearly separated as in Figure 2.6 (a).
Figures 2.7 and 2.8 demonstrate the nature of the two groups of models: in
the second group (the narrow line), the two wells are disconnected; in the first
group (the wide plume), the injector is connected to the producer by a
high-permeability region (hot spot). Furthermore, the models located in the
far-left region of Figure 2.7 (or the bottom region of Figure 2.8) within the
narrow-line region are more disconnected than those located near the center (near
the intersection of the two regions). This is also true for the wide plume
Figure 2.6: Projection of metric space of 1,000 SGSIM and 1,000 SNESIM models
using the connectivity distance.
region. Since the connectivity distance is correlated with the dynamic response,
the models in metric space are ranked by their dynamic response, which makes it
easy and efficient to model the uncertainty in the dynamic response.
Figure 2.9 shows the correlations between the connectivity distance and the
distance in the projection space by MDS. It turns out that the distance in the 2D
projection space is almost the same as the connectivity distance. Since the
connectivity distance is well correlated with the dynamic response, which is a
simple time series, a projection dimension of two ($m = 2$) is enough to capture
the variation of the models in metric space.
Figure 2.10 shows the correlations between the distance and the difference in
dynamic response. Figure 2.10 (a) displays the correlation, over the 1,000 SGSIM
models, between the Euclidean distance and the difference in dynamic response (in
this case, the watercut from an Eclipse run). Figure 2.10 (b) displays the
correlation, over the 1,000 SGSIM models, between the connectivity distance and
the difference in dynamic response. Figure 2.10 (c) displays the correlation,
over the 1,000 SNESIM models, between the connectivity distance and the
difference in dynamic response. As expected, the connectivity distance is well
correlated with the difference in dynamic response.
Figure 2.11 exhibits the difference in dynamic response in 2D projection space.
Figure 2.7: Projection of metric space using the connectivity distance (continuous
variables).
We plot the difference in dynamic data between each model and one specific model
(indicated by × in Figure 2.11). Figures 2.11 (a) through (d) show the projection
spaces using the Euclidean distance for the 1,000 SGSIM models, the Euclidean
distance for the 1,000 SNESIM models, the connectivity distance for the 1,000
SGSIM models, and the connectivity distance for the 1,000 SNESIM models,
respectively. The difference in dynamic response in the 2D projection space using
the connectivity distance varies smoothly and has no local minima, which is
favorable when an optimization is performed in this space (Chapter 4). With the
Euclidean distance, on the other hand, the distribution is not meaningful for
handling dynamic responses.
Figure 2.8: Projection of metric space using the connectivity distance (binary
variables).
2.9 Summary
The mathematical theory behind MUMS has been explained in detail: the distance,
MDS, KKM, and the kernel KL expansion. By using the projection of metric space by
MDS, many models can be displayed effectively in 2D space. The connectivity
distance, which is well correlated with the difference in dynamic response, makes
the distribution of models in metric space favorably ranked for many
applications, such as optimization. These techniques can be applied to any type
of reservoir model defined by either continuous or categorical variables.
Figure 2.9: The correlations between the defined distances and the distances in
the low-dimensional projection space. From left: 1D, 2D, 3D, and 10D projection
space. The x-axes show the connectivity distance and the y-axes the distance in
the projection space by MDS.
Figure 2.10: The correlations between the defined distance and the dynamic
response. (a) 1,000 SGSIM models, Euclidean distance vs. dynamic response;
(b) 1,000 SGSIM models, connectivity distance vs. dynamic response. The red lines
in (b) and (c) represent the mean difference in dynamic data over a range of
connectivity distances.
Figure 2.11: (a) Euclidean distance for 1,000 SGSIM models; (b) Euclidean
distance for 1,000 SNESIM models; (c) connectivity distance for 1,000 SGSIM
models; (d) connectivity distance for 1,000 SNESIM models.
3.1 Introduction
Techniques for MUMS were introduced in Chapter 2. The initial ensemble of prior
models is generated and mapped into metric space. By means of MDS and the kernel
KL expansion, the prior models are parameterized into short Gaussian random
vectors. In this framework, generating a new model from the prior distribution
requires inverting this Gaussian-based parameterization. In other words, a new
model should be obtained from a new Gaussian random vector.
As mentioned before, a parameterization by means of the kernel KL expansion,
i.e. a Gaussian random vector, represents a feature expansion in kernel space.
The problem of generating a model corresponding to an arbitrary Gaussian random
vector is then equivalent to the problem of back-transforming from feature space
to metric space. However, such a back transformation only determines the location
of a new model relative to the existing models, not the model itself; hence an
additional problem of model identification poses itself. This back transformation
is called the pre-image problem, a widely known problem in computer science in
the context of pattern recognition. This chapter focuses on the pre-image
problem, especially as designed for models of the Earth. In a Bayesian context,
the pre-image problem is equivalent to generating an additional model from the
empirical prior
CHAPTER 3. THE PRE-IMAGE PROBLEM 39
where $b_j$ represents the $j$-th element of the vector $\mathbf{b}$. The kernel
function is defined in Equations 2.12 and 2.26.
Schölkopf and Smola (2002) proposed the fixed-point iteration algorithm for
solving the pre-image problem with the Radial Basis Function (RBF) kernel. Kwok
and Tsang (2004) proposed an algorithm which uses the distance constraint in the
optimization and removes the iteration. Since the Gaussian kernel is applied in
our framework, the pre-image problem can be solved by either method.
by iterations: $\mathbf{x}_{m,\text{new}}$ obtained from Equation 3.5 is
iteratively substituted back into Equation 3.5.

$$\mathbf{x}_{m,\text{new}} = \frac{\displaystyle\sum_{j=1}^{L} b_{j,\text{new}}\, k(\mathbf{x}_{m,\text{new}}, \mathbf{x}_{j,m})\, \mathbf{x}_{j,m}}{\displaystyle\sum_{j=1}^{L} b_{j,\text{new}}\, k(\mathbf{x}_{m,\text{new}}, \mathbf{x}_{j,m})} = \mathbf{X}_m \mathbf{a} \qquad (3.5)$$
Since we know the kernel function $k$ in Equation 3.5, these iterations can be
performed efficiently. It turns out that $\mathbf{x}_{m,\text{new}}$ is obtained
by a nonlinear combination of the ensemble members. Note that the nonlinear
weights sum to unity: $\sum_i a_i = 1$.
However, both methods may fail to converge, since convergence depends strongly
on the initial point of the minimization problem and on the possible existence of
local minima. A suitable choice of the initial point resolves these limitations
for the fixed-point iteration algorithm. In our context, a suitable initial point
is the location of the prior model which minimizes the objective function of the
pre-image problem (Equation 3.7).

$$\mathbf{x}_m^{(0)} = \mathbf{x}_{k,m}, \qquad k = \arg\min_{i} \left\| \phi(\mathbf{x}_{i,m}) - \phi(\mathbf{X}_m)\, \mathbf{b}_{\text{new}} \right\|, \quad i = 1, \ldots, L \qquad (3.7)$$
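The fixed-point iteration of Equation 3.5 for the Gaussian kernel can be sketched as follows. One simplification is made relative to Equation 3.7: the initial point is taken as the prior member with the largest weight, a cheap stand-in for the member minimizing the feature-space objective. The ensemble and weights here are synthetic:

```python
import numpy as np

def gaussian_kernel(u, v, sigma):
    return np.exp(-np.sum((u - v)**2) / (2.0 * sigma**2))

def preimage_fixed_point(Xm, b_new, sigma=1.0, n_iter=100, tol=1e-10):
    """Fixed-point iteration of Equation 3.5 (Gaussian kernel).

    Xm    : (L, m) ensemble in the projection space (rows are models).
    b_new : (L,) feature-space combination weights of the new point.
    """
    x = Xm[np.argmax(b_new)].copy()        # simplified initial point
    for _ in range(n_iter):
        k = np.array([gaussian_kernel(x, xj, sigma) for xj in Xm])
        w = b_new * k                      # unnormalized weights a_j
        x_next = (w @ Xm) / np.sum(w)      # nonlinear combination
        if np.linalg.norm(x_next - x) < tol:
            return x_next
        x = x_next
    return x

rng = np.random.default_rng(4)
Xm = rng.normal(size=(30, 2))
b = rng.uniform(0.1, 1.0, size=30)
x_pre = preimage_fixed_point(Xm, b, sigma=2.0)
```

Because the normalized weights are positive and sum to one, the solution is a convex combination of the ensemble members, consistent with the property that the weights $a_i$ sum to unity.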
Figure 3.1: Various solutions of the pre-image problem: all the methods converge
to the minimum.
Figure 3.2: Various solutions of the pre-image problem: Schölkopf and Smola
(2002) fixed-point iteration algorithm does not converge to the minimum.
Figure 3.3: Various solutions of the pre-image problem: Kwok and Tsang (2004)
algorithm does not converge to the minimum.
$$\mathbf{x}_{\text{new}} = \mathbf{X} \mathbf{a} \qquad (3.8)$$

$$d(\mathbf{x}_i, \mathbf{x}_j) \approx \| \mathbf{x}_i - \mathbf{x}_j \| \qquad (3.9)$$

$$d(\mathbf{x}_i, \mathbf{x}_j) \approx \| \mathbf{x}_{m,i} - \mathbf{x}_{m,j} \| \qquad (3.10)$$
In the process of linearly combining the initial models, other constraints
encoding prior geologic information cannot be imposed. However, certain types of
prior probability distributions are preserved under linear combination, such as
the Gaussianity of the models. Indeed, any linear combination of Gaussian models
is again a Gaussian model. Although the linear combination weights that sum up to
unity
Figure 3.5: Various solutions of the pre-image problem: all the methods other than
the case starting from the proposed initial point do not converge to the minimum.
preserve the mean, constraints for preserving the covariance are not imposed in
Equation 3.8. The covariance is preserved when the squared weights sum to unity.
In order to preserve the covariance, or to further apply the unconstrained
optimization scheme to any random function of continuous variables, a rank
transform restoring the unit variance after the linear combination can be
employed, if needed. The following two examples, for Gaussian and uniform random
function models, show the applicability of the proposed unconstrained
optimization of the fixed-point iteration algorithm to the pre-image problem.
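The rank transform mentioned above can be realized as quantile matching: each value of the variance-deflated new model is replaced by the value of equal rank drawn from a reference prior marginal. A sketch under that interpretation (the reference and deflated samples below are synthetic):

```python
import numpy as np

def rank_transform(x_new, x_ref):
    """Map x_new onto the marginal distribution of x_ref by rank
    (quantile matching); the ordering of the values is preserved."""
    ranks = np.argsort(np.argsort(x_new))   # rank of each value, 0..n-1
    return np.sort(x_ref)[ranks]

rng = np.random.default_rng(5)
x_ref = rng.normal(size=10_000)             # target prior marginal
x_new = 0.4 * rng.normal(size=10_000)       # variance-deflated model
x_fixed = rank_transform(x_new, x_ref)      # marginal restored
```

Because the transform is monotone in the ranks, the spatial pattern of the combined model is untouched while its histogram (and hence variance) is restored to that of the reference.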
First, consider 300 Gaussian random function models (101 × 101) generated by
sequential Gaussian simulation (SGSIM) (Figure 3.6). From these models, 300
additional prior models are generated using the Euclidean distance and the
connectivity distance, respectively.
If the Euclidean distance between the model variables is used, the unconstrained
optimization scheme applies, since Equation 3.9 holds. Yet it should be noted
that, although Equation 3.9 holds, the Euclidean distance requires a relatively
large number of eigenvalues for the condition of Equation 3.10 to hold. With a
large number of eigenvalues a relatively high-dimensional space is created and,
consequently, the unconstrained optimization converges slowly in this
high-dimensional space.
If the connectivity distance is employed, 2 or 3 eigenvalues are enough for
Equation 3.10 to hold. However, the connectivity distance is nonlinearly related
to the model variables, so Equation 3.9 does not hold. Therefore, applying this
unconstrained optimization to the connectivity distance provides an approximate
solution of the pre-image problem.
Figures 3.7 and 3.8 show 4 of the 300 new models generated by solving the
pre-image problem using unconstrained optimization. The models in Figure 3.7 are
generated using the Euclidean distance and the models in Figure 3.8 using the
connectivity distance. Since they are generated by linearly combining the initial
Gaussian models, the new models have Gaussian histograms, as seen in the QQ-plot
of Figure 3.10. Figure 3.9 depicts the initial 300 models and the new 300 models
in the projection of metric space by MDS (left: the Euclidean distance; right:
the connectivity distance). It turns out that the new models capture the
variation of the initial models, which means the new models represent the prior
uncertainty space. However, Figure 3.11 compares the histograms of the initial
and new models: the variance is decreased even though the new models follow a
Gaussian distribution. Figure 3.12 displays the variograms of the initial and new
models; the range and shape of the variograms are similar.
The same unconstrained optimization for solving the pre-image problem is applied
to 300 models generated by Direct Sequential Simulation (DSSIM; Deutsch and
Journel, 1998) (Figure 3.13), in which case all the conditional distributions
used are uniform. Here a rank transform is required to preserve the desired prior
Figure 3.9: 300 initial Gaussian random function models and 300 new models.
Figure 3.10: QQ-plot between an initial prior model and a new model from
unconstrained optimization of the pre-image problem. The green line is the
45°-line. The Gaussian shape is preserved but not the variance.
Figure 3.11: Histograms of an initial prior model and a new model from the
pre-image problem (Gaussian prior; legend: Initial, Pre-image).
Figure 3.12: Variograms (standardized) of an initial prior model and a new model.
distribution, since the QQ-plot of Figure 3.17 shows a clear difference between
the distributions of the prior models and the new models. Figures 3.14 and 3.15
show the new models from the pre-image problem when using the Euclidean distance
and the connectivity distance, respectively. Figure 3.16 displays the new 300
models and the initial 300 models in the projection of metric space by MDS.
Similar to the results for the 300 Gaussian models, the pre-image problem
generates the same distribution of new models in metric space as the initial
prior models. Additionally, the variance is decreased in the new models (the
histograms of Figure 3.18), while the (standardized) variogram is reasonably
reproduced (Figure 3.19).
Figure 3.13: 4 of 300 uniform random function models generated by the DSSIM.
Figure 3.16: 300 initial uniform random function models and 300 new models.
Figure 3.17: QQ-plot between an initial prior model and a new model from
unconstrained optimization of the pre-image problem. The green line is the
45°-line.
Figure 3.18: Histograms of an initial prior model and a new model from the
pre-image problem (uniform prior; legend: Initial, Pre-image).
Figure 3.19: Variograms (standardized) of an initial prior model and a new model.
The proximity distance transform assigns to each location the distance to the border of the nearest object. Figure 3.20 illustrates the transform for a simple binary image. As shown in the figure, the proximity distance transform creates a continuous variable from a categorical one by assigning to each gridblock the distance to the nearest object. Its value at the location of an object is zero, since the distance from an object to itself is zero. This property makes the back transform possible once a new image of the proximity distance transform has been obtained (by thresholding): if the value at a gridblock is larger than zero, assign zero (indicating background) to the gridblock; otherwise assign one (indicating object).
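As an illustration, the transform and its back transform can be sketched for a small binary grid. The sketch below is pure Python with a breadth-first search and Manhattan distances; in practice a Euclidean transform (e.g., scipy.ndimage.distance_transform_edt) would be used:

```python
from collections import deque

def proximity_transform(image):
    """Per-gridblock distance to the nearest object cell (value 1).
    A pure-Python BFS sketch with Manhattan distance; a Euclidean
    transform would be used in practice."""
    rows, cols = len(image), len(image[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for i in range(rows):
        for j in range(cols):
            if image[i][j] == 1:          # object cells are at distance zero
                dist[i][j] = 0
                queue.append((i, j))
    while queue:
        i, j = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols and dist[ni][nj] is None:
                dist[ni][nj] = dist[i][j] + 1
                queue.append((ni, nj))
    return dist

def back_transform(dist):
    # Threshold at zero: positive distance -> background (0), zero -> object (1).
    return [[1 if d == 0 else 0 for d in row] for row in dist]

image = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
dist = proximity_transform(image)
assert back_transform(dist) == image   # the transform is exactly invertible
```

The back transform is exact because the zero level set of the transform is precisely the set of object cells.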
To demonstrate the proposed feature constrained optimization, Boolean models are generated by placing various shapes of pre-assigned depositional objects. Figure 3.21 shows the types of Boolean models generated: Boolean channel models, Boolean lobe models, and Boolean fracture models.
Figures 3.22 to 3.24 display the process of the proposed feature constrained optimization: (a) generate the initial Boolean models; (b) perform the proximity distance transform for all the initial models; (c) solve the pre-image problem by the fixed-point iteration algorithm and apply the combination coefficients to linearly combine the images of the proximity distance transform; (d) perform the inverse of the proximity distance transform for all the solutions from the fixed-point iteration algorithm. The new models represent the spatial characteristics of the initial models well, although not perfectly. The new models from the Boolean channel models show channelized objects; similar results are obtained for lobes and fractures, but the representation is worse than in the channel case.
Figure 3.25 depicts the locations of 300 initial models and 300 new models in the
projection of metric space by MDS. The new models represent the prior uncertainty
space regardless of which distance is employed.
Both the unconstrained optimization and the feature constrained optimization are
basically approximations which can become problematic under certain conditions.
(d) 4 of 300 new models from the back transform of proximity distance transform.
Figure 3.22: The processes of feature constrained optimization for Boolean channel
models.
(d) 4 of 300 new models from the back transform of proximity distance transform.
Figure 3.23: The processes of feature constrained optimization for Boolean lobe
models.
(d) 4 of 300 new models from the back transform of proximity distance transform.
Figure 3.24: The processes of feature constrained optimization for Boolean fracture
models.
Figure 3.25: 300 initial models (red) and 300 new models (blue) in the projection of metric space by MDS, under the Euclidean and connectivity distances: (a) Boolean channel models; (b) Boolean lobe models; (c) Boolean fracture models.
First, in most applications Equation 3.9 does not hold. Equation 3.9 essentially states that the distance defined is approximately the same as the Euclidean distance. In MUMS, the application-tailored distance is defined such that the distance is correlated with the response of the model, not with the model itself. Hence, Equation 3.9 holds only if the response of the model is linearly correlated with the model itself, in which case any uncertainty assessment would be trivial.
Secondly, the two optimization methods proposed above do not always preserve the prior information well, particularly for complex geology; see the fracture case in Figure 3.24. The unconstrained optimization applied to Gaussian models does not preserve the covariance. Furthermore, the unconstrained optimization applied to non-Gaussian models with continuous variables requires a rank transform to preserve the marginal distribution, but the rank transform may destroy the minimization of the objective function as defined in the pre-image problem. In other words, the rank-transformed models obtained by unconstrained optimization are not exact pre-image solutions. In the feature constrained optimization, the proximity distance transform and its back transform no longer minimize the objective function of the pre-image problem. Moreover, the feature constrained optimization can be applied only to binary images and is not applicable to a model containing more than two categories. What is missing in the previous methods is that the generating algorithm (SGSIM, DSSIM, SNESIM) used to create the prior set of models is not used at all.
In order to overcome these limitations a geologically constrained optimization
employing this generating algorithm for the pre-image problem is proposed. In
this geologically constrained optimization we apply the Probability Perturbation
Method (PPM, Caers and Hoffman (2006)) to the minimization problem of the pre-
image problem. Such geologically constrained optimization provides the solution
in model space directly. Caers and Hoffman (2006) showed that the PPM algorithm
generates a non-stationary Markov chain that has sampling properties similar to
the rejection sampler (geologically consistent, adhering to Bayes’ rule) but at much
greater efficiency.
Recall that all sequential geostatistical simulation algorithms (variogram-based
To briefly recap what PPM does, we first introduce sequential simulation. In sequential simulation, we draw a prior model x containing outcomes $x_i$ (grid-cell values):

$$\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_i & \cdots & x_N \end{bmatrix}^\top \in \mathbb{R}^{N \times 1} \qquad (3.12)$$

where $X_i$ represents the random variable at the $i$-th node, such as a grid-cell property. Note that the subscripts $1, 2, \cdots, N$ represent nodes that are randomized in sequential simulation.
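A toy sketch of such a sequential draw is given below. The one-step conditional dependence is an assumption made purely for illustration; real algorithms such as SGSIM or SNESIM condition on variogram or training-image statistics:

```python
import random

def sequential_simulation(n, p_sand=0.3, p_same=0.8, seed=1):
    """Draw a binary facies vector x = (x_1, ..., x_N) node by node,
    each node from a conditional probability given the previously
    simulated nodes (here a one-step dependence, assumed for
    illustration only)."""
    rng = random.Random(seed)
    x = []
    for i in range(n):
        if i == 0:
            p1 = p_sand                                   # marginal for the first node
        else:
            p1 = p_same * x[-1] + (1 - p_same) * p_sand   # P{X_i = 1 | x_1, ..., x_{i-1}}
        x.append(1 if rng.random() < p1 else 0)
    return x

x = sequential_simulation(50)
```

Fixing the seed fixes the entire random path, which is what PPM exploits when it perturbs the conditional probabilities while holding the seed constant.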
Consider the minimization problem with an arbitrary objective function O( ):
PPM regards the condition of minimization of the objective function as a new type
of data event C. In other words, $x_{i,C}$ is drawn from the conditional probability. Note that this conditioning event consists of two parts: $\{X_i \le x_i \mid X_{i-1} \le x_{i-1}, \cdots, X_1 \le x_1\}$ and $\{C\}$.

PPM first parameterizes the conditional probability of the latter event $\mathrm{Prob}\{X_i \le x_i \mid C\}$ as

$$\mathrm{Prob}\{X_i \le x_i \mid C\} = (1 - r_C)\,\mathbf{1}\{x_i \le x_i^{(n-1)}\} + r_C\,\mathrm{Prob}\{X_i \le x_i\} \qquad (3.16)$$

where $x_i^{(n-1)}$ denotes the model variable at the previous step $n-1$.
The conditional probability of the first event is combined with that of event $C$ using the ratios

$$a = \frac{1 - \mathrm{Prob}\{X_i \le x_i\}}{\mathrm{Prob}\{X_i \le x_i\}} \qquad (3.19)$$

$$b = \frac{1 - \mathrm{Prob}\{X_i \le x_i \mid X_{i-1} \le x_{i-1}, \cdots, X_1 \le x_1\}}{\mathrm{Prob}\{X_i \le x_i \mid X_{i-1} \le x_{i-1}, \cdots, X_1 \le x_1\}} \qquad (3.20)$$

$$c = \frac{1 - \mathrm{Prob}\{X_i \le x_i \mid C\}}{\mathrm{Prob}\{X_i \le x_i \mid C\}} \qquad (3.21)$$

The perturbation parameter at step $n$ is obtained from the one-dimensional optimization

$$r_C^{(n)} = \arg\min_{r_C} O(\mathbf{x}(r_C)) \qquad (3.22)$$
The evaluation of the objective function to solve the pre-image problem (Equa-
tion 3.11) requires the calculation of the distance between any initial model and
the current model as in Equations 3.26 to 3.29.
4: for i = 1, · · · , L do
5:     x_i^(0) is drawn sequentially from the conditional probability
6: end for
7: while O(x^(n)) is not minimized do
8:     n = n + 1
9:     Change the random seed.
10:    Generate x(r_C^(n)) by
11:    for i = 1, · · · , L do
12:        drawing x_i(r_C^(n)) from the conditional probability
13:    end for
14:    where r_C^(n) = arg min_{r_C} O(x(r_C))   (3.25)
15:    Put x^(n) = x(r_C^(n))
16: end while
17: Put x_C = x^(n)
Figure 3.26: The probability perturbation method to solve the pre-image problem
(geologically-constrained optimization).
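The loop of Figure 3.26 can be mimicked on a toy problem. Everything below is an illustrative stand-in: the "model" is a binary vector, the "objective" is a mismatch in sand-cell count, and the perturbed conditional probability follows Equation 3.16 with a marginal prior p:

```python
import random

def draw_model(prev, r, p, rng):
    # Equation 3.16 per cell: P{X_i = 1 | C} = (1 - r) * 1{x_i_prev = 1} + r * p,
    # so r = 0 reproduces the previous model and r = 1 draws a fresh prior model.
    return [1 if rng.random() < (1 - r) * xi + r * p else 0 for xi in prev]

def objective(x, target):
    return abs(sum(x) - target)   # toy stand-in for a flow-response mismatch

def ppm(target, n_cells=100, p=0.3, max_iter=50, seed=0):
    rng = random.Random(seed)
    x = [1 if rng.random() < p else 0 for _ in range(n_cells)]   # initial prior draw
    best = objective(x, target)
    for n in range(max_iter):
        inner = random.Random(seed + n + 1)        # step 9: change the random seed
        state = inner.getstate()
        best_cand, best_val = None, best
        for r in [i / 10 for i in range(11)]:      # 1-D search on r_C (Equation 3.22)
            inner.setstate(state)                  # same seed for every trial r
            cand = draw_model(x, r, p, inner)
            val = objective(cand, target)
            if val < best_val:
                best_cand, best_val = cand, val
        if best_cand is not None:                  # accept only improving updates
            x, best = best_cand, best_val
        if best == 0:
            break
    return x, best

x, best = ppm(target=45)
```

Because r = 0 reproduces the previous model exactly under the same seed, the objective value never worsens between outer iterations, mirroring the monotone chain of accepted PPM updates.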
(Equation 3.7). In Equation 3.30, we choose the best fit model amongst the initial
set of prior models.
$$\mathbf{x}^{(0)} = \mathbf{x}_k, \qquad k = \arg\min_i \left\| \phi(\mathbf{x}_i) - \phi(\mathbf{X})\,\mathbf{b}_{\mathrm{new}} \right\|, \quad i = 1, \ldots, L \qquad (3.30)$$
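Because only inner products are available in feature space, the search in Equation 3.30 can be carried out entirely with the Gram matrix K, using the kernel-trick expansion of the squared norm. A sketch with a made-up 3×3 Gram matrix:

```python
def best_fit_index(K, b):
    """Index of the prior model closest in feature space to phi(X) b
    (cf. Equation 3.30), computed with the kernel trick:
    ||phi(x_i) - phi(X) b||^2 = K_ii - 2 (K b)_i + b^T K b."""
    L = len(K)
    Kb = [sum(K[i][j] * b[j] for j in range(L)) for i in range(L)]
    bKb = sum(b[i] * Kb[i] for i in range(L))
    d2 = [K[i][i] - 2 * Kb[i] + bKb for i in range(L)]
    return min(range(L), key=lambda i: d2[i])

# Made-up Gram matrix for L = 3 models and combination coefficients b.
K = [[1.0, 0.2, 0.1],
     [0.2, 1.0, 0.3],
     [0.1, 0.3, 1.0]]
b = [0.1, 0.8, 0.1]                 # phi(X) b is dominated by the second model
idx = best_fit_index(K, b)          # -> 1 (the second model)
```

No model needs to be reconstructed in model space; the best-fit index comes from K and b alone.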
However, starting the PPM from the single best model amongst the set of prior models may require sampling the Markov chain of the PPM algorithm for a long time in order to find a suitable model with a lower objective function value, especially when prior models are unlikely to be located in the region where the objective function is low.
Hence we propose to start PPM from the solution of the unconstrained opti-
mization (Section 3.3.1, Equation 3.8):
$$\mathbf{x}_a = \begin{bmatrix} x_{1,a} & x_{2,a} & \cdots & x_{i,a} & \cdots & x_{N,a} \end{bmatrix}^\top \qquad (3.31)$$
Example
Consider 300 models amongst 1,000 channelized prior models generated by SNESIM
in Chapter 2. The prior information imposed by the training image is explicitly
honored in the optimization process of PPM.
First, the initial point is chosen randomly amongst the initial prior models. Figure 3.27 shows the updates the PPM algorithm makes. The green circles denote the locations of initial prior models. The large circles, colored from red to blue, represent the model updates that decrease the objective function. The large gray circles represent model updates that do not decrease the objective function and hence are discarded. The lines between the red-to-blue circles denote the direction of the PPM updates. The contour map in the background displays the objective function over the entire domain (Equation 3.26). From Figure 3.27 it is clear that PPM makes updates that either increase or decrease the connectivity between the injector and the producer as needed. As seen in the figure, starting the PPM algorithm from a random point requires a relatively large number of iterations.
Since we have already evaluated the objective function values at the locations of the initial prior models, selecting the best fit model amongst the initial prior models was tested to increase the efficiency of the PPM updates. Figure 3.28 shows the resulting updates. Although fewer iterations are required to converge, it is more difficult to obtain a diverse set of solutions by starting from a single best initial model. From the single best initial model in Figure 3.29 (a), 10 trials with different random seeds produced the same model, Figure 3.29 (b). In other words, PPM requires many iterations to generate a diverse set of solutions when starting from the same initial model.
In order to obtain a diverse set of solutions efficiently, PPM is applied using the solution of the unconstrained optimization as prior probabilities. Figure 3.30 shows examples of such solutions, and Figure 3.31 shows the updates of PPM. It turns out that fewer iterations are required to converge. Figure 3.32 depicts a diverse set of solutions obtained by applying PPM with the solution of the unconstrained optimization as initial model. All these diverse models are located at the same point in the projection of metric space, which means they exhibit similar connectivity, as seen in Figure 3.32.
3.4 Summary
The pre-image problem generates a new set of prior models from an initial set of
prior models. The fixed-point iteration algorithm to solve the pre-image problem
has been improved for efficiency by choosing the initial point carefully. In order
to obtain geologic models from the pre-image problem, three types of solutions
are proposed: unconstrained optimization, feature constrained optimization, and geologically constrained optimization.
Figure 3.27: The PPM iterations from a randomly chosen model as an initial model.
Figure 3.28: The PPM iterations from the best fit model amongst the initial prior
models as an initial model.
Figure 3.29: Initial and final models obtained by PPM starting from current best-fit
model. From the same initial model in (a) (best-fit model), PPM provides the same
final model with 10 trials of different random seeds.
Figure 3.31: The PPM iterations from the solution of the fixed-point iteration algo-
rithm as an initial probability for the PPM.
Figure 3.32: A diverse set of the pre-image solutions obtained by the PPM starting
from the solution of unconstrained optimization.
4 The Post-Image Problem

4.1 Introduction
Chapter 3 demonstrated how the pre-image problem creates a new model from an
arbitrary feature expansion using the fixed-point iteration algorithm and the prob-
ability perturbation method. By solving the pre-image problem from an arbitrary
feature expansion (φnew ), represented by a short Gaussian vector (ynew ), a new
model is generated from the probability distribution empirically represented by
the initial ensemble of models. For example, if the initial models are constrained
to geologic information and hard data (e.g., well core or log data) as well as soft data (e.g., a seismic probability cube), the pre-image problem generates new models constrained to those data only.
However, when we need posterior models constrained to additional data, the initial models honoring the prior information as well as the additional data have to be re-created from scratch. To make matters worse, if these additional data exhibit a strongly nonlinear and time-varying relationship with the model variables, there is no direct method available to generate such models.
Inverse modeling approaches have been applied to the problem of conditioning
to non-linear time-dependent data. Yet, no techniques for inverse modeling can ef-
ficiently generate multiple models representing the posterior probability distribu-
tion given geologic information, hard and soft data, and nonlinear time-dependent
data. Only the rejection sampler generates multiple models representing the pos-
terior probability distribution but it is inefficient. See Section 1.5.
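For reference, the rejection sampler reads as follows on a toy problem (all names and the toy response are illustrative stand-ins; the acceptance rule here is a hard tolerance on the response mismatch):

```python
import random

def rejection_sampler(draw_prior, forward, data, tol, n_posterior, seed=0):
    """Draw prior models and keep those whose forward response falls
    within a tolerance of the observed data. Sampling is consistent with
    Bayes' rule, but many prior draws are typically needed per accepted
    posterior model, hence the inefficiency noted above."""
    rng = random.Random(seed)
    accepted, tried = [], 0
    while len(accepted) < n_posterior:
        x = draw_prior(rng)
        tried += 1
        if abs(forward(x) - data) <= tol:
            accepted.append(x)
    return accepted, tried

# Toy setup: a model is 10 binary cells, the "response" is the sand count.
draw = lambda rng: [1 if rng.random() < 0.3 else 0 for _ in range(10)]
posterior, tried = rejection_sampler(draw, sum, data=3, tol=0, n_posterior=5)
assert all(sum(x) == 3 for x in posterior)   # every kept model matches the data
```

Each accepted model costs one forward evaluation per trial, which is exactly why the full reservoir case requires thousands of flow simulations.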
This chapter addresses how to sample models from the posterior probability distribution given nonlinear and often time-dependent data in the framework of MUMS. First, we determine the feature expansions representing posterior models by reformulating the inverse problem in metric space. The problem of determining the feature expansion representing posterior models is termed the "post-image problem".
The term post-image problem carries two meanings. First, the post-image problem determines the feature expansion from a location in metric space, while the pre-image problem determines the model from its feature expansion. Hence, the post-image problem is the opposite of the pre-image problem. Secondly, the post-image problem generates posterior models, while the pre-image problem generates additional prior models. Therefore, "post"-image also means "for posterior models", while "pre"-image refers to prior model generation.
The post-image problem generates feature expansions representing posterior
models given nonlinear time-dependent data. The pre-image problem generates
models from arbitrary feature expansions. Therefore, applying the pre-image prob-
lem on the feature expansions obtained by the post-image problem provides mul-
tiple posterior models constrained to all the prior information as well as the non-
linear time-dependent data.
In the following section, we define the post-image problem by reformulating the inverse problem in metric space and propose two solutions to the post-image problem, comparing them with the rejection sampler in applications to Earth models.
We assume there is no error in the data:

$$\mathbf{d} = g(\mathbf{x}_{\mathrm{true}}) \qquad (4.1)$$
Additionally, since we have the data d, or equivalently know $g(\mathbf{x}_{\mathrm{true}})$, the distance between each model and the "true Earth" (Equation 4.4) can be defined. The feature expansion of the "true Earth" in kernel space, $\phi(\mathbf{x}_{m,\mathrm{true}})$, can be obtained by solving Equations 4.5 and 4.6. By analogy with the pre-image problem, we call this problem the "post-image" problem: while the pre-image problem finds the model from its feature expansion, the post-image problem provides the feature expansion from a location in metric space.
into the objective function in Equation 4.6 makes the objective function zero, that is, its minimum.) Setting the derivative of the objective function with respect to b to zero gives

$$\frac{\partial}{\partial \mathbf{b}} \left\{ \phi(\mathbf{x}_{m,\mathrm{true}})^\top \phi(\mathbf{x}_{m,\mathrm{true}}) - 2\,\phi(\mathbf{x}_{m,\mathrm{true}})^\top \phi(\mathbf{X}_m)\,\mathbf{b} + \mathbf{b}^\top \mathbf{K}\,\mathbf{b} \right\} = -2\,\phi(\mathbf{X}_m)^\top \phi(\mathbf{x}_{m,\mathrm{true}}) + 2\,\mathbf{K}\,\mathbf{b} = \mathbf{0}$$

$$\phi(\mathbf{X}_m)^\top \phi(\mathbf{x}_{m,\mathrm{true}}) = \mathbf{K}\,\mathbf{b}_{\mathrm{true}} \qquad (4.7)$$
The feature expansion of the "true Earth" is calculated by using $\mathbf{b}_{\mathrm{true}}$ as linear weights on the ensemble of feature expansions (Equation 4.8). The kernel KL expansion of the "true Earth" is calculated by Equation 4.9. Hence, the feature expansion of the "true Earth" is determined, and the pre-image problem explained in Chapter 3 provides the model for the "true Earth".
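In practice Equation 4.7 is a small L×L linear system in the combination coefficients, K b_true = k, where k_i = φ(x_i)ᵀφ(x_m,true) is evaluated with the kernel. A sketch with a Gaussian kernel on toy vectors (the kernel choice and all data below are assumptions made for illustration):

```python
from math import exp

def rbf_kernel(u, v, sigma=1.0):
    # Gaussian (RBF) kernel, an assumed choice for this sketch.
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return exp(-d2 / (2 * sigma ** 2))

def solve(A, y):
    """Solve A b = y by Gaussian elimination with partial pivoting
    (A is small, symmetric, and positive definite here)."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c:
                M[r] = [vr - M[r][c] * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]

# Toy ensemble of L = 3 "models", already reduced to short vectors.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
x_true = [0.5, 0.2]

K = [[rbf_kernel(xi, xj) for xj in X] for xi in X]   # Gram matrix K
k = [rbf_kernel(xi, x_true) for xi in X]             # phi(X_m)^T phi(x_true)
b_true = solve(K, k)                                 # Equation 4.7: K b_true = k
```

The resulting b_true weights the ensemble feature expansions so that φ(X_m) b_true approximates the feature expansion of the "true Earth".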
Figure 4.1: The gradual deformation method to solve the post-image problem.
$$\phi(\mathbf{x}_{m,\mathrm{true}}) = \phi(\mathbf{X}_m)\,\mathbf{b}_{\mathrm{true}} = \phi(\mathbf{X}_m)\,\frac{1}{\sqrt{L}}\,\mathbf{V}_K\,\mathbf{y}_{\mathrm{true}} \qquad (4.11)$$
4.4 Examples
In this section, the post-image problem is applied to three different problems of modeling uncertainty in reservoir models, with production history as the nonlinear time-dependent data. First, a simple example demonstrates the post-image and pre-image problems. Secondly, two less favorable cases are presented: a case where the "true Earth" is near the boundary of the prior, and a case where there are few prior models near the "true Earth". More importantly, uncertainty in future prediction is investigated in detail by solving the post-image and pre-image problems. All the cases are compared with the rejection sampler.
Consider a reservoir of 310 ft × 310 ft × 10 ft (Figure 4.2). Uncertainty on the facies distribution will be considered. We assume that a few meandering channels are distributed mainly along the NE50 direction (geological knowledge). The training image in Figure 4.3 shows the given geologic scenario for the reservoir of interest. An injector and a producer are completed at the bottom-left corner (45 ft, 45 ft) and the top-right corner (275 ft, 275 ft), respectively, as displayed in Figure 4.2. Core data at the two wells show that sand facies are present near the wells (hard data).
Figure 4.2: The reservoir geometry and two wells: an injector and a producer.
Based on the prior information given (geologic scenario and hard data), 100 prior models are generated by means of the multiple-point geostatistical algorithm SNESIM. Figure 4.4 displays 6 of the 100 prior models, containing various shapes of meandering channels along the NE50 direction. Note that the injector is connected to the producer by a sand-facies channel structure in some of the prior models, while they are disconnected in others. The connectivity between the two wells is a dominant factor affecting the flow behavior in the reservoir, and the nature of such connectivity is highly dependent on the prior geological scenario.
Figure 4.4: 6 out of 100 initial prior models generated by SNESIM, constrained to geologic information and hard data.
Figure 4.7 shows the locations of the 100 initial prior models as well as the "true Earth" in the projection from metric space by means of MDS. Since all the responses of the initial prior models as well as the response of the "true Earth" are available, the location of the "true Earth" is determined, although the "true Earth" itself is unknown.

As explained in Chapter 2, the models in the left, wide plume of points exhibit good connectivity between the injector and producer; the models in the right, narrow line of points have poor connectivity between the injector and producer.
Figure 4.6: Watercut curves of 100 initial prior models displayed with data.
Figure 4.7: The locations of the true Earth and initial prior models in the projection of metric space.
Figure 4.8: The objective function of the post-image problem in the projection of metric space by MDS.
Since 30 feature expansions of the "true Earth" are obtained, the pre-image problem is solved for these feature expansions in order to determine the corresponding models. Figure 4.9 shows 6 of the 30 pre-image solutions obtained by unconstrained optimization (see Section 3.3.1), which are then used as initial probabilities for the PPM when solving the pre-image problem. Figure 4.10 displays 6 of the 30 posterior models obtained by solving the pre-image problem by means of the geologically constrained algorithm proposed in Section 3.3.3. As expected, we obtain various channelized models in which the injector is connected to the producer by a sand-facies body. Note that the obtained posterior models are diverse in channel shape and spatial distribution.
Figure 4.11 depicts the watercut responses of the 30 posterior models. All the 30
posterior models match the data well. To obtain 30 posterior models, 78 additional
forward simulations were needed to solve the pre-image problems (not includ-
ing the 100 forward simulations for calculating distances between prior models).
However, it needs to be verified that the 30 posterior models represent the poste-
rior probability as outlined by Bayes’ rule.
Figure 4.11: Watercut curves for 30 posterior models obtained by the post-image and pre-image problems, matching the watercut history.
Figure 4.13: Watercut curves for 30 posterior models obtained by the rejection sampler.
The conditional mean and variance of the ensemble provide a detailed com-
parison between the post-image problem and the rejection sampler. First of all,
Figure 4.14 shows the conditional mean and variance of the initial ensemble of
100 prior models and the 9,898 prior models generated for the rejection sampler.
The two figures are very similar, which means that the 100 prior models are suf-
ficient to cover the prior uncertainty space. Note that the conditional mean and
variance should be symmetric with regard to the line connecting the injector to the
producer.
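The conditional mean and variance here are simply per-gridblock ensemble statistics; for indicator (sand/mud) models the mean is the probability of sand at each gridblock. A minimal sketch on flattened toy models:

```python
def ensemble_mean_variance(models):
    """Per-gridblock mean and variance across an ensemble of models
    (flattened grids); for 0/1 facies indicators the mean equals the
    probability of sand facies at each gridblock."""
    n = len(models)
    cells = len(models[0])
    mean = [sum(m[c] for m in models) / n for c in range(cells)]
    var = [sum((m[c] - mean[c]) ** 2 for m in models) / n for c in range(cells)]
    return mean, var

# Three toy 4-cell models (1 = sand, 0 = mud).
models = [[1, 0, 1, 0],
          [1, 1, 0, 0],
          [1, 0, 0, 0]]
mean, var = ensemble_mean_variance(models)
assert mean[0] == 1.0 and var[0] == 0.0   # sand is certain in the first cell
```

Cells where all models agree have zero variance, which is why the maps in Figures 4.14 and 4.15 flag regions of remaining uncertainty through elevated variance.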
More importantly, Figure 4.15 depicts the conditional mean and variance of the
30 posterior models obtained by the post-image and pre-image problems and 30
posterior models obtained by the rejection sampler. This comparison provides a
verification of the post-image and pre-image problems in terms of representing
posterior uncertainty.
First, compare Figure 4.14 (c) with Figure 4.15 (c), the conditional mean of 9,898
prior models and 30 posterior models obtained by the rejection sampler. The mean
Figure 4.14: Comparison of the mean and conditional variance of 100 initial prior models with 9,898 prior models for the rejection sampler: (a) mean of 100 initial prior models; (b) conditional variance of 100 initial prior models; (c) mean of 9,898 prior models for the rejection sampler; (d) conditional variance of 9,898 prior models for the rejection sampler.
Figure 4.15: Comparison of the mean and conditional variance of 30 posterior models from the post-image problem with those from the rejection sampler: (a) mean of 30 posterior models from the post-image problem; (b) conditional variance of 30 posterior models from the post-image problem; (c) mean of 30 posterior models from the rejection sampler; (d) conditional variance of 30 posterior models from the rejection sampler.
of models where the indicator of sand facies is one and the indicator of mud back-
ground is zero represents the probability of having sand facies. In these figures,
the posterior probability of having sand facies increases along the line connecting
the injector to the producer. This makes sense, since the watercut is relatively high
so the posterior models exhibit high probability of being connected between the
injector and the producer by sand facies.
Next, compare Figure 4.14 (d) with Figure 4.15 (d), the conditional variance of the 9,898 prior models and of the 30 posterior models obtained by the rejection sampler. The variance around the line connecting the injector to the producer increased, because the injector can be connected to the producer by sand facies with various shapes and spatial distributions of the channel sand. On the other hand, the variance in the remaining region (far from the line connecting the injector to the producer) is almost zero: the injector and the producer are connected by sand facies along a relatively short path, so the probability of sand facies in that region is very low.
Then, compare the results of the post-image and pre-image problems with those of the rejection sampler. Comparing Figure 4.14 (a) and (b) with Figure 4.15 (a) and (b), the change in conditional mean and variance from the 100 prior models to the 30 posterior models obtained by the post-image and pre-image problems is very similar to that obtained with the rejection sampler. Additionally, comparing Figure 4.15 (a) and (b) with Figure 4.15 (c) and (d), the conditional mean and variance of the 30 posterior models obtained by the post-image and pre-image problems are similar to those from the rejection sampler. In conclusion, the reformulation of the post-image and pre-image problems in metric space provides realistic uncertainty models for both the model variables and their responses.
4.4.2 A case where the "true Earth" is near the boundary of the prior

The previous case concerned a situation where the "true Earth" was located almost in the middle of the plume of prior models in metric space. In order to verify the applicability of the post-image and pre-image problems to uncertainty modeling, less favorable cases should be considered.

Consider a new watercut history, displayed in Figure 4.16. These data are much more difficult to match for two reasons. First, the watercut values decrease from 240 days to 400 days; in order to match the data we need specific model characteristics beyond simple connectivity between the injector and the producer, as will be shown later. Furthermore, the watercut history shows almost maximum values amongst the watercut responses of the 100 initial prior models. In the period between 100 days and 240 days, the watercut history lies outside the spread of the prior watercut responses. As a result, the location of the "true Earth" is almost at the boundary of the plume of points of the 100 initial prior models (Figure 4.17).
Figure 4.16: Watercut curves of 100 initial prior models displayed with data.
Figure 4.17: The locations of the true Earth and initial prior models in the projection of metric space.
However, the objective function in Figure 4.18 has a nice shape, hence obtaining the post-image solutions is stable and efficient. Note again how the history matching problem is reduced to a 2D optimization problem. Figures 4.19 and 4.20 show the solutions of the unconstrained optimization (Section 3.3.1) and the geologically constrained optimization (Section 3.3.3), respectively. It turns out that the post-image and pre-image problems provide diverse posterior models constrained to the watercut data. The reason why the watercut decreases from 240 days can now be explained. Most of the posterior models contain two channels which start from the injector and proceed toward the producer separately, and only one of the two channels is connected to the producer. Hence, after the oil has been swept out of the connected channel and before the water in the disconnected channel arrives at the producer, the watercut decreases.

Figure 4.21 shows the watercut responses of the 30 posterior models obtained from the post-image and pre-image problems. They match the data well. The match is very similar to the result of the rejection sampler (Figures 4.22 and 4.23), which required 9,563 forward simulations. In
Figure 4.18: The objective function of the post-image problem in the projection of metric space.
order to obtain 30 posterior models from the post-image and pre-image problems, 97 forward simulations (in addition to the 100 forward simulations initially performed) were required in this case.
Figure 4.24 compares the conditional mean and variance of the posterior models obtained by the post-image and pre-image problems with those obtained by the rejection sampler. It shows that the posterior models are likely to contain a channel along the line between the injector and producer. Unlike Figure 4.15, a high level of variance is observed in the region far from this line, which means that another channel, parallel to the connected channel but not connected to the producer, is likely to exist. The conditional mean and variance of the 30 posterior models from the post-image and pre-image problems reproduce most of the important characteristics of the rejection-sampler results.
Figure 4.21: Watercut curves for 30 posterior models obtained by post-image and
pre-image problems matching the watercut history.
Figure 4.23: Watercut curves for 30 posterior models obtained by the rejection sam-
pler.
4.4.3 A case where few prior models are near the “true Earth”
Consider a new watercut history (Figure 4.25). Very few initial prior models show
watercut responses similar to the watercut history. As shown in Figure 4.26, there
are few points near the location of the “true Earth”.
In this case, the post-image problem can be solved easily; however, the pre-image
problem seldom provides a converged solution. After 127 additional forward
simulations, the pre-image problem generated only 3 posterior models (Figures 4.27
and 4.28). Although the 3 posterior models match the history (Figure 4.29) and show
the major characteristic of disconnection between the injector and the producer,
many more forward simulations are needed to obtain enough posterior models,
especially for comparison with the rejection sampler.
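The remedy described in the next paragraph, adding converged posterior models back into the prior set and re-solving, amounts to a generic enrichment loop. The sketch below is purely illustrative: `sample_prior` and `forward_sim` are hypothetical toy stand-ins (a scalar "model" with an identity "simulator"), not the post-image/pre-image solver itself.

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_prior(n):
    """Hypothetical prior sampler (toy: scalar 'models' ~ N(0, 1))."""
    return rng.normal(0.0, 1.0, size=n)

def forward_sim(models):
    """Hypothetical forward simulation (toy: identity response)."""
    return models

def enrich_until_enough(d_obs, tol, n_target, n_init=100, max_rounds=20):
    """Enrichment loop: keep models whose response matches the data;
    if too few match, add new prior models and re-solve.
    (In practice previously simulated responses would be cached.)"""
    priors = sample_prior(n_init)
    n_sims = n_init
    for _ in range(max_rounds):
        responses = forward_sim(priors)
        posterior = priors[np.abs(responses - d_obs) < tol]
        if posterior.size >= n_target:
            return posterior[:n_target], n_sims
        priors = np.concatenate([priors, sample_prior(n_init)])
        n_sims += n_init
    return posterior[:n_target], n_sims

# A target far in the tail of the prior, so several rounds are needed:
post, n_sims = enrich_until_enough(d_obs=2.0, tol=0.5, n_target=10)
```

With a matching probability of only a few percent per prior model, the loop keeps enlarging the prior set until enough matching models accumulate, mirroring the iteration used in this section.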
One possible solution is to add the 3 posterior models to the set of initial prior
models and re-solve the post-image and pre-image problems with 103 initial models.
This would generate more points near the location of the “true Earth”. If required,
this procedure can be iterated until a sufficient number of posterior models is
obtained. The iteration creates a larger set of initial prior models. The idea of
[Figure 4.24 panels: (a) mean of 30 posterior models from the post-image problem; (b) conditional variance of 30 posterior models from the post-image problem; (c) mean of 30 posterior models from the rejection sampler; (d) conditional variance of 30 posterior models from the rejection sampler. Maps over a 0–300 ft domain (x, ft vs. y, ft).]
Figure 4.24: Comparison of the mean and conditional variance of 30 posterior mod-
els from the post-image problem with those from the rejection sampler.
Figure 4.25: Watercut curves of 100 initial prior models displayed with data.
Figure 4.26: The locations of the true Earth and initial prior models in the projection
of metric space.
[Figure: panels titled “The solution of the fixed-point iteration algorithm”, facies maps over a 0–300 ft domain (x, ft vs. y, ft).]
Figure 4.29: Watercut curves for 3 posterior models obtained by post-image and
pre-image problems matching the watercut history.
Figure 4.30: Comparison of the post-image and pre-image solution methods with
other sampling techniques.
Consider the same watercut history (Figure 4.25). The same methodology, but now
with iteration, is applied. 30 posterior models are obtained after 138 new prior
models are added to the initial set of prior models (Figure 4.31). That is, from
238 prior models, 30 posterior models are obtained with 238 forward simulations,
or about 8 flow simulations per history-matched model. Considering that the prior
models have low likelihood of matching the data, this clearly illustrates the
efficiency of the post-image and pre-image solution method. Figure 4.32 displays
the watercut responses of the 30 posterior models with the data. They match well.
In particular, compared with the result from the rejection sampler (Figures 4.33
and 4.34), the uncertainty in the future prediction of watercut is very similar
to the rejection-sampler results.
Figure 4.36 displays the comparison of the conditional mean and variance be-
tween the posterior models obtained by the methodology with iteration and by
the rejection sampler. The mean of 30 posterior models shows the injector is not
likely to be connected to the producer by a channel sand facies. The conditional
variance of 30 posterior models shows there are channels of various shapes and lo-
cations, since the channels do not have to connect the injector to the producer. The
Figure 4.32: Watercut curves for 30 posterior models obtained by the post-image
and pre-image problems with iteration. Watercut curves of initial prior models
and newly added prior models are displayed separately.
Figure 4.34: Watercut curves for 30 posterior models obtained by the rejection sam-
pler.
Figure 4.35: The objective function of post-image problem in the projection of met-
ric space.
conditional mean and variance of 30 posterior models from the post-image and
pre-image problems represent most of the important characteristics in the results
of the rejection sampler.
[Figure 4.36 panels: (a) mean of 30 posterior models from the post-image problem; (b) conditional variance of 30 posterior models from the post-image problem; (c) mean of 30 posterior models from the rejection sampler; (d) conditional variance of 30 posterior models from the rejection sampler. Maps over a 0–300 ft domain (x, ft vs. y, ft).]
Figure 4.36: Comparison of the mean and conditional variance of 30 posterior mod-
els from the post-image problem (238 forward simulations) with those from the
rejection sampler (11,454 forward simulations).
Figure 4.37: Watercut curves of 100 initial prior models displayed with data.
Assuming the reservoir geometry, geologic information, and the hard data are
the same as in Section 4.4, we start from the same 100 initial prior models generated
in Section 4.4.
Figure 4.38 shows the locations of the “true Earth” and 100 initial prior models.
The post-image and pre-image problems are solved to obtain 15 posterior models
(Figures 4.40 and 4.41). Iteration in the post-image and pre-image problems is
required, since there are few prior models near the “true Earth”. Figure 4.39 shows
how many models are newly added near the location of the “true Earth”.
Figure 4.38: The locations of the true Earth and initial prior models in the projection
of metric space.
Figure 4.42 depicts the match with the watercut history and the future prediction
after 480 days. The 15 posterior models match the history and show the degree of
uncertainty in the future prediction of watercut. Figures 4.43 and 4.44 display the
results from the rejection sampler. The uncertainty in future prediction adequately
represents the uncertainty obtained by the rejection sampler.
Figure 4.45 shows the comparison of conditional mean and variance between
the posterior models from the post-image and pre-image problems and the
rejection sampler. Again, the results are similar. This again shows that, in future
prediction, the techniques for modeling uncertainty in metric space provide
realistic uncertainty models.
Figure 4.39: The objective function of post-image problem in the projection of met-
ric space.
[Figure: panels titled “The solution of the fixed-point iteration algorithm”, facies maps over a 0–300 ft domain (x, ft vs. y, ft).]
Figure 4.42: Watercut curves for 15 posterior models obtained by post-image and
pre-image problems matching the watercut history.
Figure 4.44: Watercut curves for 15 posterior models obtained by the rejection sam-
pler.
[Figure 4.45 panels: (a) mean of 15 posterior models from the post-image problem; (b) conditional variance of 15 posterior models from the post-image problem; (c) mean of 15 posterior models from the rejection sampler; (d) conditional variance of 15 posterior models from the rejection sampler. Maps over a 0–300 ft domain (x, ft vs. y, ft).]
Figure 4.45: Comparison of the mean and conditional variance of 15 posterior mod-
els from the post-image problem (238 forward simulations) with those from the
rejection sampler (12,424 forward simulations).
4.6 Summary
The post-image problem is developed and formulated for efficiently generating
multiple posterior models from the initial ensemble of prior models in metric space.
The uncertainty in the posterior models obtained by solving the post-image and
pre-image problems is similar to that obtained from the rejection sampler.
Even when the “true Earth” is located near the boundary of the prior distribution,
the post-image problem can find multiple posterior models, unless there are few
initial models near the “true Earth” in metric space. When few prior models lie
near the “true Earth”, an iteration scheme is applied. The post-image and pre-image
problems with iteration generate multiple posterior models efficiently in situations
where the prior models are unlikely to match the data. Unlike other sampling
techniques, the post-image problem generates posterior models based on the
information of all previously sampled prior models. The posterior models obtained
by the post-image and pre-image problems represent uncertainty in future
prediction realistically.
Chapter 5
5.1 Introduction
As explained in Chapter 1, the Ensemble Kalman Filter (EnKF) has been introduced
and employed to generate multiple reservoir models constrained to nonlinear
time-dependent data, since it shows good performance and efficiency. The EnKF is
not an iterative algorithm, so it requires exactly one forward simulation per
history-matched model. Additionally, the EnKF makes a real-time update whenever
new data are obtained, without rerunning forward simulations from the beginning.
Owing to these merits, mainly in efficiency, the EnKF has recently gained
popularity for solving history-matching problems.
However, the application of EnKF to modeling uncertainty is fundamentally
flawed, and unfortunately this has not been emphasized in the EnKF literature at
all. First, the EnKF cannot preserve prior information such as geologic
information, since it requires the model variables to be Gaussian. It cannot be
applied to generate reservoir models of facies distributions, or any discrete
properties, without destroying the intended prior geological information. More
importantly, multiple models obtained by the EnKF do not represent the posterior
as formulated under Bayes' rule. The models obtained by EnKF represent a
dramatically reduced uncertainty. This reduction of uncertainty is due to the very
nature of EnKF: EnKF is an estimating/filtering technique; it is not a sampling
technique.
CHAPTER 5. METRIC ENSEMBLE KALMAN FILTER 135
where, z represents the state vector, i.e. the state of the system. Unlike the model
x used in the previous chapters, the state vector z includes static model variables
as well as dynamic model variables. u is the control input (for example, boundary
condition of the system) with the control input operator B, and w is the error in the
model of physical process. The subscript t represents time. d denotes the data (for
instance, bottom hole pressure at a producer) and v the noise in data (for instance,
the noise when measuring the bottom hole pressure). Although we measure the
same properties several times, the measurements are not exactly identical because
of noise. H is called the state-to-data (measurement) operator (for example, if we
measure permeability at some gridblocks, H becomes a matrix of 0s and 1s such that
we can obtain the permeability values at those gridblocks from the state vector by
d = Hz).
The estimation error is defined by Equations 5.3 and 5.4.
e^- = z - \hat{z}^- \qquad (5.3)

e = z - \hat{z} \qquad (5.4)
where, e is the estimation error. The superscript ’−’ indicates the vector for an a
priori state and no superscript denotes the a posteriori state. A priori here is a state
before assimilating and a posteriori after assimilating the data. The hat denotes the
estimation and no hat denotes the true state.
Four error covariances are defined by Equations 5.5 to 5.8.
Q = \overline{w w^\top} \qquad (5.5)

R = \overline{v v^\top} \qquad (5.6)

P^- = \overline{e^- (e^-)^\top} \qquad (5.7)

P = \overline{e \, e^\top} \qquad (5.8)
where, Q is the error covariance of the model (process) error w, R the error
covariance of the data noise v, and P the error covariance of the estimation. The
bar represents the mean.
In deriving the equations for KF, the goal is to find an equation that computes
an a posteriori state estimate as a sum of the a priori estimate and a weighted differ-
ence between the data and the prediction as shown in Equation 5.9 (the assimila-
tion equation).
\hat{z}_t = \hat{z}_t^- + G_t \left( d_t - H \hat{z}_t^- \right) \qquad (5.9)
where, G is named the Kalman gain, which is determined by minimizing the esti-
mation error.
When we substitute Equations 5.4 and 5.9 into Equation 5.8 and assume that
the measurement errors are independent of the estimation error, we obtain Equa-
tion 5.10.
P_t = (I - G_t H) \, P_t^- \, (I - G_t H)^\top + G_t R G_t^\top \qquad (5.10)
Differentiating Equation 5.10 and finding the Kalman gain to minimize the es-
timation error, we finally obtain the Kalman gain equation, Equation 5.11. Addi-
tionally, the a posteriori estimation error covariance is calculated from the a priori
estimation error covariance (Equation 5.12) and vice versa (Equation 5.13).
G_t = P_t^- H^\top \left( H P_t^- H^\top + R \right)^{-1} \qquad (5.11)

P_t = (I - G_t H) \, P_t^- \qquad (5.12)

P_t^- = A P_{t-1} A^\top + Q \qquad (5.13)
Equations 5.1 and 5.13 represent the forecast (predict) step, or time update, and
Equation 5.9, 5.11, and 5.12 represent the assimilation (correct) step, or measure-
ment update. The time update is performed until new data are obtained. When
new data are obtained, a measurement update is performed. Then the time up-
date is performed again starting from the state assimilated by the measurement
update until the time new data are obtained. This recursive nature is one of the
very appealing features of the KF (Welch and Bishop, 2004).
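As a concrete illustration of this recursion, the time update (Equation 5.13) and the measurement update (Equations 5.9, 5.11, and 5.12) can be written in a few lines of linear algebra. This is a generic textbook KF step in NumPy with a toy one-dimensional system, not the dissertation's reservoir implementation.

```python
import numpy as np

def kf_predict(z_hat, P, A, Q, Bu=0.0):
    """Time update: z^- = A z + B u, P^- = A P A^T + Q (Eq. 5.13)."""
    z_prior = A @ z_hat + Bu
    P_prior = A @ P @ A.T + Q
    return z_prior, P_prior

def kf_update(z_prior, P_prior, d, H, R):
    """Measurement update: gain (Eq. 5.11), state (Eq. 5.9), covariance (Eq. 5.12)."""
    S = H @ P_prior @ H.T + R
    G = P_prior @ H.T @ np.linalg.inv(S)                # Eq. 5.11
    z_post = z_prior + G @ (d - H @ z_prior)            # Eq. 5.9
    P_post = (np.eye(len(z_prior)) - G @ H) @ P_prior   # Eq. 5.12
    return z_post, P_post

# Toy 1D system: a nearly static state observed directly with noise.
A = np.eye(1); Q = np.array([[1e-4]]); H = np.eye(1); R = np.array([[0.5]])
z_hat, P = np.zeros(1), np.eye(1)
for d in [1.1, 0.9, 1.05]:
    z_prior, P_prior = kf_predict(z_hat, P, A, Q)
    z_hat, P = kf_update(z_prior, P_prior, np.array([d]), H, R)
```

Each pass through the loop is one "predict until data arrive, then correct" cycle of the recursion described above; the estimation error covariance P shrinks with every assimilated datum.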
z_t = f(z_{t-1}, u_{t-1}) \qquad (5.14)
where, the operator f ( ) represents the nonlinear difference equation, which ob-
tains the current state of model from the state of the previous time step and the
control input. The state vector consists of all static and dynamic model variables,
such as the permeability, porosity, pressure, and water saturation at each gridblock
as well as data (Equation 5.15).
z = \begin{bmatrix} x \\ g_d(x) \\ g(x) \end{bmatrix} \qquad (5.15)
where, x represents the static model variables (e.g., permeability and porosity at
each gridblock) and g_d(x) the dynamic model variables (e.g., pressure and
saturation at each gridblock) obtained from the forward simulator g(·). g_d(·) and
g(·) differ in that g_d(·) provides the dynamic variables at each gridblock, while
g(·) provides the actual responses at wells (e.g., well bottomhole pressure or oil
and water production rates).
Equation 5.16 shows the response at time t which contains measurement noise.
The measurement noise is assumed to be white noise. Since the state vector con-
tains the response, the measurement matrix operator H is composed of 0 and 1.
d_t = H z_t + v_t \qquad (5.16)

v_t = \bar{v}_t + \delta_t \qquad (5.17)
where, the bar means the mean and δ represents the white noise.
The measurement error covariance is calculated by Equation 5.18. If we assume
that the measurement error of a property is independent between all properties
at the same location, then the measurement error covariance is a block diagonal
matrix (for example, the porosity and the permeability at the same location). If we
assume that the measurement error of a variable at one location is independent of
that of other locations, for the same property, the measurement error covariance is
a diagonal matrix. In other words, the measurement error covariance becomes the
measurement error variance.
R = \bar{v} \, \bar{v}^\top \qquad (5.18)
The aim of the assimilation step is to minimize the estimation error. Unlike the
KF, the estimation error and the estimation error covariance are obtained by means
of an ensemble (Equation 5.19), as defined by Equations 5.20 to 5.23, respectively.
Z = \begin{bmatrix} z_1 & z_2 & \cdots & z_i & \cdots & z_L \end{bmatrix} \qquad (5.19)

e_j^- = \frac{1}{L} \sum_{i=1}^{L} z_i^- \; - \; z_j^-, \qquad j = 1, \cdots, L \qquad (5.20)

e_j = \frac{1}{L} \sum_{i=1}^{L} z_i \; - \; z_j, \qquad j = 1, \cdots, L \qquad (5.21)

P^- = \frac{1}{L} \sum_{j=1}^{L} e_j^- \, (e_j^-)^\top \qquad (5.22)

P = \frac{1}{L} \sum_{j=1}^{L} e_j \, e_j^\top \qquad (5.23)
where, L indicates the size of the ensemble or the number of initial models. In
other words, the assimilation step updates an a priori state to an a posteriori state in
which the estimation error is minimized. In EnKF, the true state is assumed to be
the mean of ensemble members. Note that the size of the ensemble should be large
enough to cover the true state.
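The ensemble estimates of Equations 5.20 to 5.23 amount to forming anomalies about the ensemble mean and averaging their outer products. A minimal NumPy sketch, using a synthetic Gaussian ensemble rather than reservoir state vectors:

```python
import numpy as np

def ensemble_error_covariance(Z):
    """Estimate the error covariance from an ensemble Z (n_state x L),
    following Eqs. 5.20-5.23: each member's anomaly is its deviation
    from the ensemble mean, and P is the average outer product."""
    L = Z.shape[1]
    z_mean = Z.mean(axis=1, keepdims=True)
    E = z_mean - Z            # anomalies e_j (Eq. 5.20/5.21)
    return (E @ E.T) / L      # Eq. 5.22/5.23

# Synthetic ensemble: 3 state variables, 500 members.
rng = np.random.default_rng(0)
Z = rng.normal(size=(3, 500))
P = ensemble_error_covariance(Z)
```

For an ensemble drawn from a unit-variance Gaussian, the diagonal of P approaches 1 as L grows, which is the sense in which the ensemble replaces the analytical covariance of the KF.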
The state vector that minimizes the estimation error is obtained from Equa-
tions 5.24 and 5.25.
\hat{z}_t = z_t^- + G_t \left( d_t - H z_t^- \right) \qquad (5.24)

G_t = P_t^- H^\top \left( H P_t^- H^\top + R \right)^{-1} \qquad (5.25)
Y = \begin{bmatrix} y_1 & y_2 & \cdots & y_L \end{bmatrix} \qquad (5.26)

y_i = y_i^- + G_m \left( d - g(x_i) \right) \qquad (5.27)

G_m = C_{y,g(x)} \, C_{g(x),g(x)}^{-1} \qquad (5.28)

C_{y,g(x)} = \frac{1}{L} \sum_{i=1}^{L} y_i \, g(x_i)^\top \qquad (5.29)

C_{g(x),g(x)} = \frac{1}{L} \sum_{i=1}^{L} g(x_i) \, g(x_i)^\top \qquad (5.30)
Since the length of yi is at most the number of models in the ensemble (if we
retain all the positive eigenvalues in the kernel KL expansion), the metric EnKF
updates the set of short Gaussian random vectors stably and efficiently.
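Equations 5.27 to 5.30 can be sketched directly in NumPy. The toy "simulator" below is a hypothetical linear map, chosen so the behavior is easy to verify: with noise-free data, every updated coefficient vector reproduces the datum exactly, a small-scale illustration of how strongly the filter pulls the ensemble toward the data.

```python
import numpy as np

def metric_enkf_update(Y, G_of_X, d):
    """Assimilation on kernel-KL coefficient vectors.
    Y: (n_coef, L) coefficient vectors y_i as columns (Eq. 5.26);
    G_of_X: (n_data, L) simulated responses g(x_i) as columns;
    d: (n_data,) observed data."""
    L = Y.shape[1]
    C_yg = (Y @ G_of_X.T) / L              # Eq. 5.29
    C_gg = (G_of_X @ G_of_X.T) / L         # Eq. 5.30
    Gm = C_yg @ np.linalg.inv(C_gg)        # Eq. 5.28 (noise-free gain)
    return Y + Gm @ (d[:, None] - G_of_X)  # Eq. 5.27

# Toy check with a hypothetical linear "simulator" g(x_i) = H y_i.
rng = np.random.default_rng(1)
Y = rng.normal(size=(2, 50))               # 50 coefficient vectors of length 2
H = np.array([[1.0, 0.0]])
G = H @ Y                                  # one scalar response per member
Y_post = metric_enkf_update(Y, G, np.array([2.0]))
```

In this linear, noise-free toy, every member's response collapses onto the datum after a single update, which previews the uncertainty-reduction issue discussed later in this chapter.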
From yi , we obtain the corresponding static and dynamic model variables by
solving the pre-image problem. For both static and dynamic variables, the same
distance previously defined is employed. For the static model variables, we can
apply the geologically constrained optimization to the pre-image problem in or-
der to create geologically realistic reservoir models. For the dynamic variables,
the unconstrained optimization or the fixed-point iteration algorithm is applied to
solve the pre-image problem. See Chapter 3 for various solutions of the pre-image
problem.
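For the unconstrained Gaussian-kernel case, the fixed-point pre-image iteration referenced here (detailed in Chapter 3) follows the standard scheme: the candidate point is repeatedly replaced by a kernel-weighted average of the prior models. A minimal sketch under that assumption, with toy two-dimensional "models" in place of flattened reservoir grids:

```python
import numpy as np

def preimage_fixed_point(X, gamma, sigma=1.0, n_iter=100):
    """Standard fixed-point iteration for the Gaussian-kernel pre-image:
    find z whose feature-space image best matches sum_i gamma_i * phi(x_i).
    X: (n_models, n_dim) prior models; gamma: expansion weights."""
    z = X.mean(axis=0).copy()     # neutral starting point
    for _ in range(n_iter):
        w = gamma * np.exp(-np.sum((X - z) ** 2, axis=1) / (2 * sigma ** 2))
        denom = w.sum()
        if abs(denom) < 1e-12:    # guard against degenerate weights
            break
        z = (w[:, None] * X).sum(axis=0) / denom
    return z

# Sanity check: the pre-image of a single model's own expansion is itself.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
gamma = np.array([0.0, 1.0, 0.0, 0.0])    # expansion = phi(x_1)
z = preimage_fixed_point(X, gamma, sigma=0.5)
```

For the static (facies) variables the dissertation instead uses a geologically constrained optimization, so this unconstrained scheme applies to the dynamic variables.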
The entire procedure of metric EnKF is as follows.
3. Parameterize the prior models into Gaussian random vectors by the kernel
KL expansion
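This parameterization step can be sketched as kernel PCA: eigendecompose the centered kernel matrix and keep the components with positive eigenvalues. The linear kernel and random toy "models" below are illustrative stand-ins, not the distance-based kernel used in this work:

```python
import numpy as np

def kernel_kl_coefficients(K):
    """Kernel KL (kernel PCA) parameterization: eigendecompose the centered
    kernel matrix and keep positive-eigenvalue components. Returns Y with
    one coefficient vector per model (one column per model)."""
    L = K.shape[0]
    J = np.eye(L) - np.ones((L, L)) / L          # centering matrix
    Kc = J @ K @ J
    vals, vecs = np.linalg.eigh(Kc)              # ascending eigenvalues
    keep = vals > 1e-10                           # positive eigenvalues only
    vals, vecs = vals[keep][::-1], vecs[:, keep][:, ::-1]
    return np.sqrt(vals)[:, None] * vecs.T       # Y = diag(sqrt(lambda)) V^T

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 5))                     # 30 toy "models"
K = X @ X.T                                      # linear kernel, illustrative
Y = kernel_kl_coefficients(K)
```

The property that makes this a faithful parameterization is that the coefficient vectors reproduce the centered kernel matrix (Yᵀ Y = K centered), and each vector's length is at most the number of models, as noted above.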
Metric EnKF has been implemented with 300 (randomly chosen) out of 1,000
prior models. Figure 5.3 shows 6 of 300 prior models and Figure 5.4 displays the
mean and variance of 300 prior models. The mean and variance show the effect
of the hard permeability data. Figure 5.5 displays the watercut curves of prior
models and the data. The data are far from the mean of watercut curves of prior
models.
At 260 days, a first correction is executed based on the data. Reservoir simu-
lations (prediction) from 0 days to 260 days for 300 initial models are performed.
Figure 5.6 shows the watercut curves calculated by the reservoir simulations and
the watercut at 260 days. None of the models exhibit breakthrough, so all water-
cut values at 260 days are zero. Also the watercut data is zero at 260 days, which
means there is no need to correct the state vectors. Figure 5.7 displays the update
at 260 days in metric space. As expected, no change is observed in any of the models.
At 520 days, the second correction is performed based on data. Reservoir sim-
ulations (the prediction step) from 260 days to 520 days for 300 prior models are
performed. For the prediction, the reservoir simulations are conducted starting
from the pressure and water saturation updated at 260 days. Figure 5.8 shows
the watercut curves calculated by the reservoir simulations and the watercut data
at 520 days. The watercut values from the simulations vary from 0.0 to 0.2 and
the mean is around 0.02, while the watercut data is 0.05, which means most of the
models should be updated to increase the watercut.
Figure 5.9 displays the update at 520 days in metric space. Recall that the mod-
els located in left narrow line region are the disconnected ones and the models
located in right wide plume region are the connected ones. As a result, almost all
of the disconnected models are updated to the connected ones.
Figure 5.10 displays the updates after 520 days. Figure 5.11 lists the updates of
log-permeability, a priori and a posteriori water saturation of one model. The update
of permeability is consistent with the water saturation. The pre-image problem,
solved using unconstrained optimization (see Section 3.3.1), was applied to both
the static and dynamic model variables. The pressure and the saturation are also
continuous and a linear combination of pressure or saturation fields provides a
of 300 models, all the 30 models move to the reference point. Figure 5.21 depicts
the flow predictions of the final 30 models. All the models are constrained to the
measurements within the measurement error level (10%).
Figures 5.22 and 5.23 display watercut curves for the final 30 models and the
final 300 models and their p10 , p50 , and p90 , respectively. Figure 5.24 compares the
p10 , p50 , and p90 of the 30 final models with those of the previously obtained 300
final models. EnKF with only 30 selected models reproduces the statistics of flow
responses of final 300 models relatively well. In summary, the ensemble size can
be reduced by kernel k-means clustering and the reduced ensemble reproduces the
similar results as those of full ensemble case.
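Kernel k-means, used above to select the reduced ensemble, clusters models using only the kernel matrix: squared distances to cluster centroids are expanded entirely in feature space, so no explicit coordinates are needed. A minimal Lloyd-style sketch with a toy linear kernel (the dissertation uses its own distance-based kernel):

```python
import numpy as np

def kernel_kmeans(K, n_clusters, n_iter=50):
    """Lloyd-style kernel k-means on a precomputed kernel matrix K.
    ||phi(x_i) - mu_c||^2 = K_ii - 2*mean_{j in c} K_ij + mean_{j,k in c} K_jk."""
    L = K.shape[0]
    labels = np.arange(L) % n_clusters        # deterministic round-robin init
    for _ in range(n_iter):
        dist = np.zeros((L, n_clusters))
        for c in range(n_clusters):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:                 # keep empty clusters unattractive
                dist[:, c] = np.inf
                continue
            dist[:, c] = (np.diag(K)
                          - 2 * K[:, idx].mean(axis=1)
                          + K[np.ix_(idx, idx)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Two well-separated toy groups under a linear kernel:
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.],
              [10., 10.], [10., 11.], [11., 10.], [11., 11.]])
K = X @ X.T
labels = kernel_kmeans(K, 2)
```

One representative per cluster (for example, the member closest to its centroid) then forms the reduced ensemble.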
Figure 5.1: Watercut data measured every two months. Only the red circles ◦
(noisy data) are available to the algorithm.
Figure 5.2: 2D projection of metric space of 1,000 initial models based on their own
distances. Color represents the difference in responses between initial models and
the model located in × (◦: low; ◦: high). Since the connectivity distance is highly
correlated with the difference in responses, although the models are mapped based
on the connectivity distance, the models are well sorted with the difference in re-
sponses.
Figure 5.4: The mean (left) and conditional variance (right) of log-permeability of
1,000 initial models. It is verified that all the models are conditioned to hard data.
In the map of the mean (left), the well locations, or the hard data locations, are
easily identified: (45 ft, 45 ft) and (275 ft, 275 ft).
Figure 5.5: Watercut curves simulated with all 1,000 initial models and the
measured watercut data. Red circles (◦) mark the measured data. The green line (−)
is the mean of the watercut curves. Grey lines (−) show the 1,000 watercut curves.
Figure 5.6: Watercut curves calculated by the reservoir simulations and the mea-
sured watercut at 260 days.
Figure 5.7: Update at 260 days in 2D MDS space. ◦’s represent the a priori models
(before correction) and ◦’s the a posteriori models (after correction).
Figure 5.8: Watercut curves calculated by the reservoir simulations and the mea-
sured watercut at 520 days.
Figure 5.9: Update at 520 days in 2D MDS space. ◦’s represent the a priori models
(before correction) and ◦’s the a posteriori models (after correction). Grey lines (−)
show the path of update.
Figure 5.10: LEFT: Watercut curves calculated by the reservoir simulations and the
measured watercut from 580 days to 1,095 days; RIGHT: Update from 580 days to
1,095 days in 2D MDS space. ◦’s represent the a priori models (before correction)
and ◦’s the a posteriori models (after correction). Grey lines (−) show the path of
update.
Figure 5.12: The mean (left) and conditional variance (right) of log-permeability of
300 final models after EnKF.
Figure 5.14: Watercut curves predicted by reservoir simulations of 300 final models
from 0 days to 1095 days.
Figure 5.15: The initial 300 models clustered into 30 clusters by means of kernel
k-means clustering.
Figure 5.17: Watercut curves for the initial 300 models and their p50 , p10 , and p90
(red solid line and dotted lines).
Figure 5.18: Watercut curves for the selected initial 30 models and their p50 , p10 ,
and p90 (red solid line and dotted lines).
Figure 5.19: p50 , p10 , and p90 of the initial 300 models (red) and the selected initial
30 models (blue).
Figure 5.20: EnKF update of the selected 30 models at 520 days in 2D projection of
metric space.
Figure 5.22: Watercut curves for the final 300 models (original ensemble) and their
p50 , p10 , and p90 (red solid line and dotted lines).
Figure 5.23: Watercut curves for the final 30 models (reduced ensemble) and their
p50 , p10 , and p90 (red solid line and dotted lines).
Figure 5.24: p50 , p10 , and p90 of the final 300 models (red) and the final 30 models
(blue).
Figure 5.25: Watercut curves of 100 initial prior models displayed with data.
5.6.2 A case where the “true Earth” is near the boundary of the prior
Consider the same reservoir presented in Chapter 4 (310 ft × 310 ft × 10 ft). Un-
certainty on facies distribution will be considered under the same channelized ge-
ologic scenario along the NE50 direction. Based on the prior information given
(geologic scenario and hard data), 100 prior models are generated by means of
the multiple-point geostatistical algorithm, SNESIM. Consider the watercut his-
tory over 3 years in Figure 5.25, which is the same data as in Section 4.4.2.
Since watercut is of interest, we employed the connectivity distance, which is
well correlated with the difference in watercut prediction. Figure 5.26 displays
the locations of initial prior models in the projection of metric space. The distri-
bution of models in metric space using the connectivity distance is similar to the
result using the actual difference in watercut as in Section 4.4.2, since both are cor-
related with the dynamic response of the model. The models located in the left re-
gion show poor connectivity between the injector and the producer and vice versa.
Figure 5.27 depicts the locations of final models in the projection of metric space
Figure 5.26: Initial ensemble of prior models in the projection of metric space.
after the update of the metric EnKF. The models are gathered in the right-hand
region, where the good-connectivity models are placed. Since the data show high
watercut, good-connectivity models are good candidates for matching them.
Figure 5.28 shows the update of the EnKF visually in the projection of metric
space. All the initial models moved to the right region after the update of the
metric EnKF. In other words, all the initial models are updated to exhibit good
connectivity between the injector and the producer. By using the projection of
metric space, any optimization or update process can be analyzed effectively in
2D space.
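The 2D projections used throughout for this kind of analysis are obtained by multidimensional scaling of the distance matrix. A classical (Torgerson) MDS sketch, assuming only a symmetric distance matrix as input:

```python
import numpy as np

def classical_mds(D, n_dims=2):
    """Classical (Torgerson) MDS: embed points in n_dims so that Euclidean
    distances approximate the given distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J              # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_dims]  # largest eigenvalues first
    lam = np.clip(vals[order], 0.0, None)    # guard tiny negative eigenvalues
    return vecs[:, order] * np.sqrt(lam)

# Illustrative: points on a line are recovered up to rotation/reflection.
pts = np.array([[0.0], [1.0], [3.0]])
D = np.abs(pts - pts.T)
Y = classical_mds(D, n_dims=2)
```

When the distance is Euclidean the embedding is exact; for a general distance such as the connectivity distance, the projection is the best low-dimensional approximation, which is why the 2D maps preserve the sorting of the models.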
Figure 5.29 demonstrates 6 of 100 final models obtained by the metric EnKF.
All final models honor the geologic prior information and hard data. All final
models show good connectivity between the injector and the producer. Figure 5.30
depicts the watercut prediction of 100 final models and the data. All the models
are matching the watercut data well. However, when compared with the results
of the rejection sampler (Figures 5.31 and 5.32), the watercut predictions between
100 days and 300 days do not match the data. This resulted from the fact that
the watercut data between 100 days and 300 days are outside of the spread of the
watercut curves of the initial 100 prior models. In EnKF, the truth always has to
Figure 5.27: Final ensemble of prior models in the projection of metric space.
Figure 5.28: Update of initial ensemble of prior models in the projection of metric
space.
Figure 5.29: 6 of 100 final models constrained to watercut history obtained by met-
ric EnKF.
be within the spread of the initially generated ensemble of prior models, since
EnKF assumes that the mean of the ensemble statistics is the truth at each update
as mentioned in Section 5.6.1. Yet, the overall match is relatively good, compared
with the results from the post-image problem in Section 4.4.2.
However, the results of the metric EnKF look even more problematic if the mean
and conditional variance of the final models from the metric EnKF are compared
with those from the rejection sampler (Figure 5.33). The mean of the models from
the rejection sampler shows a clear connection between the injector and the
producer, which makes sense since the data show high watercut values. The mean
of the models from the metric EnKF does not display this connection; instead,
there is a region of high probability of sand presence in the top-left region,
meaning that many models from the metric EnKF have a sand channel that connects
the injector to the producer through the top-left region. In other words, the
EnKF provided biased samples. The conditional variances are also very different.
Although the models of the metric EnKF match the data relatively well, the
variation of the obtained models does not cover the posterior probability or the
posterior uncertainty
Figure 5.30: Watercut curves for 100 posterior models obtained by metric EnKF matching the watercut history.
Figure 5.32: Watercut curves for 100 posterior models obtained by the rejection sampler (15,305 forward simulations).
space well. The following example shows this deficiency more clearly.
5.6.3 A case where few prior models are near the "true Earth"
In the same setting, consider the new watercut data used in Section 4.4.3
(Figure 5.34). Figures 5.35 to 5.37 display the update of the metric EnKF in the
projection of metric space. Since the watercut data show low watercut values, all
the initial prior models are updated to the poor-connectivity region, the narrow
line-shaped region on the left of metric space. Figure 5.38 displays 6 of the 60
final models obtained by the metric EnKF. All models show poor connectivity
between the injector and the producer through sand facies bodies. Figure 5.39
shows the watercut predictions of the 60 final models and the data. They match
the data relatively well, but one can observe three groups of matched responses
(thick lines in Figure 5.39). These thick lines indicate that multiple matched
responses plot at the same location; hence, the metric EnKF produced a number
of nearly identical models. By contrast, the watercut predictions from the
rejection-sampler models are more uniformly distributed (Figures 5.40 and 5.41).
Figure 5.33: The mean and conditional variance of 100 posterior models from metric EnKF (100 forward simulations) and from the rejection sampler (15,305 forward simulations). (a) Mean of 100 posterior models from metric EnKF; (b) conditional variance of 100 posterior models from metric EnKF; (c) mean of 100 posterior models from the rejection sampler; (d) conditional variance of 100 posterior models from the rejection sampler.
Figure 5.34: Watercut curves of 100 initial prior models displayed with data.
Figure 5.35: Initial ensemble of prior models in the projection of metric space.
Figure 5.36: Final ensemble of prior models in the projection of metric space.
Figure 5.37: Update of initial ensemble of prior models in the projection of metric space.
Figure 5.38: 6 of 60 final models obtained by metric EnKF.
Figure 5.39: Watercut curves for 60 posterior models obtained by metric EnKF matching the watercut history.
Figure 5.41: Watercut curves for 60 posterior models obtained by the rejection sampler (38,201 forward simulations).
The comparison of the mean and conditional variance reveals the problem of
EnKF clearly (Figure 5.42). The mean and conditional variance of the 60 final
models from metric EnKF clearly show that the metric EnKF generated a number
of very similar models. Since few prior models lie near the "true Earth" in metric
space, there is not enough information to calculate the Kalman gain. As mentioned
before, the Kalman gain can be understood as a sensitivity coefficient. If few prior
models are near the "true Earth", the covariances obtained from the ensemble
statistics cannot accurately capture the relationship between the state vector and
the response. A Kalman gain based on insufficient information leads to a situation
where most of the state vectors are updated to the same location. Therefore,
although the EnKF update provides multiple models that match the data, those
models do not represent the posterior probability. In order to model the
uncertainty realistically, we need to sample the posterior and try to increase the
efficiency of the sampling technique. In fact, the results obtained from the EnKF
are rather disconcerting: the EnKF provides multiple models that all match the
data and honor prior statistical information (channels). However, the results give
a false sense of security about the uncertainty of the resulting models. The
uncertainty is unrealistically low and there is no "test" that can verify this
objectively. This simple observation essentially negates the single appeal of EnKF
for reservoir modeling: providing a model of reservoir uncertainty through
multiple history-matched models.
The following field-scale application of metric EnKF shows this problem as
well. It should be noted that this problem of understating uncertainty is a
fundamental problem of the EnKF, not of metric-space modeling.
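The ensemble estimate of the Kalman gain discussed above can be written compactly. The sketch below is a generic illustration, not this dissertation's implementation (variable names are ours): when too few members carry information about the data, the cross-covariance is poorly estimated and the same gain is applied to nearly every member, collapsing the ensemble.

```python
import numpy as np

def ensemble_kalman_gain(X, D, R):
    """Kalman gain estimated from ensemble statistics.

    X : (n_state, n_ens) ensemble of state vectors
    D : (n_obs, n_ens) corresponding predicted responses
    R : (n_obs, n_obs) observation-error covariance
    """
    n_ens = X.shape[1]
    Xa = X - X.mean(axis=1, keepdims=True)   # state anomalies
    Da = D - D.mean(axis=1, keepdims=True)   # response anomalies
    C_xd = Xa @ Da.T / (n_ens - 1)           # state-response cross-covariance
    C_dd = Da @ Da.T / (n_ens - 1)           # response covariance
    return C_xd @ np.linalg.inv(C_dd + R)    # K = C_xd (C_dd + R)^-1
```

In the degenerate case where the ensemble carries no information (anomalies near zero), the gain vanishes and the update can no longer discriminate between members.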
Figure 5.42: The mean and conditional variance of 60 posterior models from metric EnKF (100 forward simulations) and from the rejection sampler (38,201 forward simulations). (a) Mean of 60 posterior models from metric EnKF; (b) conditional variance of 60 posterior models from metric EnKF; (c) mean of 60 posterior models from the rejection sampler; (d) conditional variance of 60 posterior models from the rejection sampler.
(a) BHP; (b) OPR; (c) WPR.
Figure 5.46: The prediction of watercut from 65 initial prior models and the data.
Figure 5.47: 65 initial prior models in the projection of the metric space.
Figure 5.48: Update of the metric EnKF of 65 models of Brugge data set.
Figure 5.49: The prediction of watercut of 65 final models and the data.
Figure 5.50: The prediction of oil production rates of 65 final models and the data.
Figure 5.51: The prediction of bottom-hole pressure of 65 final models and the data.
Figure 5.52: The permeability of 4 of the 65 final models obtained by the metric EnKF.
Figure 5.53: The mean and conditional variance of the initial 65 models and the final 65 models.
5.8 Summary
The EnKF has four major limitations: it cannot preserve prior geologic information;
its large-scale filtering often makes the update unstable; it does not guarantee a
consistent update of different properties; and it decreases the uncertainty
dramatically. In order to overcome these limitations, we propose the metric EnKF,
which replaces the state vector of the EnKF with the standard Gaussian random
vector (the parameterization) obtained by the kernel KL expansion in metric space.
The metric EnKF preserves prior geologic information and provides stable filtering
and a consistent update of different properties. Additionally, model selection by
kernel k-means clustering in metric space decreases the number of initial prior
models. However, similar to the EnKF, the metric EnKF also provides biased final
models that do not cover the posterior uncertainty space. Updating/filtering
schemes are not appropriate for modeling uncertainty; sampling approaches are
desirable in such cases.
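As a rough illustration of the parameterization named in this summary, the kernel KL (kernel PCA) coordinates of an ensemble can be obtained from a centered kernel matrix; the standardized coordinates then play the role of the EnKF state vector. This is a generic sketch under our own simplifications, not the exact construction of Chapter 5:

```python
import numpy as np

def kernel_kl_coordinates(K, n_keep):
    """Standardized kernel-KL (kernel PCA) coordinates of n models.

    K : (n, n) kernel matrix derived from pairwise distances in metric space
    n_keep : number of leading components to retain
    """
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n         # centering matrix
    Kc = J @ K @ J                              # center kernel in feature space
    w, V = np.linalg.eigh(Kc)
    order = np.argsort(w)[::-1][:n_keep]        # largest eigenvalues first
    w, V = np.maximum(w[order], 0.0), V[:, order]
    coords = V * np.sqrt(w)                     # feature-space projections
    return coords / coords.std(axis=0, ddof=1)  # unit-variance components
```

The standardized components are (approximately) uncorrelated with unit variance, which is what makes them a convenient Gaussian-like state vector for filtering.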
Chapter 6
Conclusions and Future Work
6.1 Conclusions
The Bayesian framework provides a consistent and repeatable mathematical
framework for modeling the uncertainty of future predictions. The only exact
sampling technique following Bayes' rule is the rejection sampler, which is
extremely inefficient. With the work presented in this thesis it is possible to
efficiently obtain a realistic uncertainty model of the facies distribution and
petrophysical properties of a reservoir within the framework of Modeling
Uncertainty in Metric Space (MUMS). Additionally, MUMS provides a realistic
uncertainty model for future oil production with multiple posterior models
constrained to prior geologic information and hard data as well as to nonlinear
time-dependent production history.
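The rejection sampler referred to throughout is the textbook algorithm: draw from the prior, then accept with probability proportional to the likelihood. A minimal sketch of our own (the callables `draw_prior` and `likelihood` are hypothetical placeholders, not objects from this study):

```python
import numpy as np

def rejection_sample(draw_prior, likelihood, L_max, n_post, seed=0):
    """Accept a prior draw m with probability likelihood(m) / L_max,
    where L_max bounds the likelihood from above."""
    rng = np.random.default_rng(seed)
    posterior, n_prior = [], 0
    while len(posterior) < n_post:
        m = draw_prior(rng)     # one prior model = one forward evaluation
        n_prior += 1
        if rng.uniform() < likelihood(m) / L_max:
            posterior.append(m)
    return posterior, n_prior   # n_prior / n_post = cost per posterior model
```

The ratio n_prior / n_post makes the inefficiency concrete: when the likelihood is sharply peaked, hundreds of prior models (each requiring a forward simulation) are drawn per accepted posterior model.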
The main conclusions observed in this study are as follows:
CHAPTER 6. CONCLUSIONS AND FUTURE WORK 191
He stated that we should attempt to falsify not only a model based on data (as
mentioned in Chapter 1) but also a solution to a problem, through many efforts
from various points of view. In this sense, although this study provided results
that help understand why the proposed solution works, further efforts to
"overthrow" the solution are required. For this, studies of the theory and of the
range of applicability are suggested.
This study showed that the posterior models obtained by the post-image and
pre-image solution method exhibit the same uncertainty as provided by the
rejection sampler that follows Bayes' rule, in various cases. However, since this is
an inductive way of showing that the method works, proving that the proposed
method generates posterior samples that follow Bayes' rule would be the best way
to verify it.
This method samples a model neither purely randomly, as the rejection sampler
does, nor by employing a Markov chain, as many other samplers do. It generates
an additional prior model based on the information of all the previously sampled
prior models. At the same time, it merely attempts to increase the probability that
the generated prior model matches the given nonlinear time-dependent data, that
is, to increase the efficiency. A theoretical investigation of this sampling technique
would provide verification and is therefore highly recommended.
Additionally, the investigation and application of the proposed technique in
various cases with different data-model relationships, i.e. different likelihoods,
would help the theoretical verification. This study compared the posterior samples
obtained by the proposed method with those obtained from the rejection sampler,
but the empirical likelihood estimated from the results of the proposed method
was applied to the rejection sampler as its likelihood. Hence, investigating how
well the proposed method reproduces a previously assigned likelihood for the
rejection sampler would be valuable.
Various sources of uncertainty are involved in the process of modeling the Earth:
structural uncertainty from the uncertain structure of an Earth model; geological
uncertainty from the uncertain geological scenario in the Earth model; spatial
uncertainty from the uncertain spatial distribution of facies bodies and
petrophysical properties; physical uncertainty from the uncertain physical forward
model for prediction; and interpretation uncertainty from uncertain data
interpretation. All these sources of uncertainty are of great importance, but this
study handled only spatial uncertainty. The proposed framework of modeling
uncertainty in metric space has the potential to address structural uncertainty and
geological uncertainty, which are often the major sources of uncertainty in Earth
modeling.
Structural uncertainty can be addressed by generating prior models with differ-
ent structures possible for the Earth model. Likewise, geological uncertainty can
be tackled by generating prior models with multiple geological scenarios (specifi-
cally multiple training images or multiple variograms in geostatistics).
Then we calculate the distance between any two models as the difference between
their responses. Although we have multiple structures and multiple geological
scenarios, the distance table is constructed as proposed in this study. The location
of the "true Earth" is therefore determined by solving the post-image problem.
Challenges arise after determining the location of the "true Earth" in metric space.
In order to generate a model corresponding to that location, the pre-image problem
has to be solved using the geologically constrained optimization. The geologically
constrained optimization employs the probability perturbation method, which
requires using a single structure and a single geological scenario (a single training
image or a single variogram).
In order to choose which structure or which geological scenario should be used,
the probability of a certain structure or a certain geological scenario in the
posterior should be determined.
Two possible approaches to determine these probabilities can be suggested.
First, the probability can be identified from the density map of models from each
structure or each geological scenario in metric space. Second, the post-image
problem provides the weights that determine the feature expansion of the "true
Earth" as a linear combination of the feature expansions of the initial set of prior
models. These weights can also provide insight into that posterior probability: the
weights assigned to models of each structure or each geological scenario are
expected to be correlated, in some way, with that probability. Once the probability
near the location of the "true Earth" is determined by either method, it indicates
how often a certain structure or a certain geological scenario should be used in the
probability perturbation method of the geologically constrained optimization for
the pre-image problem.
Currently, there are few techniques that can efficiently and realistically deal with
structural uncertainty and geological uncertainty given nonlinear time-dependent
data. The framework of modeling uncertainty in metric space has potentially wide
applicability to these challenging uncertainty-modeling problems. Hence, attempts
to apply MUMS to a wider range of uncertainty are of central interest.
The various examples included in this study are mainly about history matching,
specifically the uncertainty in Earth models with well log or core data as hard
data and production history as nonlinear time-dependent data. However, there
are many other similar challenging problems which involve highly nonlinear or
time-dependent data and costly forward simulations.
First, we suggest implementing the framework of MUMS for seismic imaging or
seismic inversion. The uncertainty in seismic imaging/inversion is large: besides
the structural/geological/spatial uncertainty, the interpretation of the data as well
as the physical modeling involve a substantial amount of uncertainty. Moreover,
the forward simulation for seismic imaging is extremely costly. The proposed
framework is well suited to these problems.
Secondly, another problem similar to history matching is the aquifer parame-
terization for groundwater flow. Aquifers are also geological structures and flow
In order to locate the "true Earth" in metric space, the post-image problem requires
defining the distance as the actual difference in responses between any two
models, since the only information we have from the "true Earth" is its response,
that is, the data. However, if the distance is defined as the actual difference in
responses, the pre-image problem requires calculating the responses of all newly
generated models, and the forward simulations needed to obtain these responses
are very costly.
In order to improve the efficiency of the post-image and pre-image solution
method, a proxy distance that is correlated with the actual difference in responses
but can be evaluated efficiently may be employed, as in the metric EnKF (Chapter
5). The problem is that there is no distance measure to the "true Earth" if a proxy
distance is used. Yet, the metric EnKF showed that the models are updated toward
a certain point in metric space even though the location of the "true Earth" is
unknown. This is because the proxy distance is defined to be well correlated with
the actual difference in responses, so all the models are distributed in a ranked
way with regard to their responses.
In other words, there should be a location of the "true Earth", or at least a region
where the "true Earth" exists, in the metric space defined by a proxy distance. It is
also expected that, as we increase the number of prior models near the "true
Earth", its location in the metric space defined by the proxy distance becomes
more accurate. Research on the use of a proxy distance for the post-image and
pre-image problems could dramatically increase the efficiency of posterior
sampling.
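Whether a proxy distance is "well correlated with the actual difference in responses" can be quantified directly before committing to it. A minimal sketch of our own (not a procedure from this study): evaluate both distances on a small calibration subset of model pairs and report their rank correlation.

```python
import numpy as np

def proxy_rank_correlation(proxy_d, true_d):
    """Spearman-style rank correlation between a cheap proxy distance and
    the expensive response distance, evaluated on the same model pairs.
    Assumes continuous distances (no ties)."""
    def ranks(v):
        r = np.empty(len(v))
        r[np.argsort(v)] = np.arange(len(v))  # rank of each entry
        return r
    p, t = ranks(np.ravel(proxy_d)), ranks(np.ravel(true_d))
    return np.corrcoef(p, t)[0, 1]
```

A correlation near 1 supports the ranking argument above: models ordered by the proxy are ordered nearly the same way by their actual responses.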
Appendix A
Metrel: A Petrel Plug-in for MUMS
A.1 Introduction
Developing a plan to maximize oil production requires constructing reservoir
models constrained to all available data. Reservoir modeling remains, however, a
vexed question because of the various sources and types of data that need to be
integrated, as well as the uncertainty that persists because the available data
cannot fully constrain the reservoir model.
Many types of data are obtained from today's oil fields. Some of the most
important are provided by geologists, who produce a geological interpretation of
the reservoir from outcrops or other inspections, resulting in, e.g., estimates of
channel dimensions, their stacking patterns, or where turbulent flow in the ocean
dominated deposition. Additionally, direct observations from a few wells are
available in the form of well log, core, or well test data. Indirect observations from
geophysical surveys (especially seismic surveys), often termed "soft" data, provide
a lower-resolution constraint. Finally, production history (bottom-hole pressure,
oil or water rate) is recorded during production. Matching the reservoir model to
the production history is very difficult due to the severe nonlinearity between the
reservoir model and the history. Modeling a reservoir requires the integration of
all available data from varying scales and sources.
In particular at the appraisal stage, where reservoir production data are few
APPENDIX A. METREL: PETREL PLUG-IN FOR MUMS 200
and where critical decisions need to be made, uncertainty about reservoir volume
and prediction performance is still considerable and critical to the decision-making
process. Such uncertainty is captured and represented by generating several
alternative reservoir models by varying key geological, geophysical, and
reservoir-engineering parameters. Hence, a powerful tool for managing multiple
reservoir models is required. In order to assess uncertainty using multiple
reservoir models, Monte Carlo simulation or experimental design is widely used.
However, Monte Carlo simulation demands a large number of flow simulations,
which is practically infeasible. Additionally, experimental design is not applicable
to spatial (geological) variables, which are often categorical and critical to flow
(Caers and Scheidt, 2010).
Metric space modeling means that the processes involved in modeling a reservoir
are reformulated and performed in metric space, where the location of any model
is determined exclusively by mutual differences in responses as defined by a
"distance". The first step of any metric space modeling technique is to define a
distance and construct a metric space for the initial set of multiple models.
Second, the metric space is represented by its projection onto a low-dimensional
space by means of multi-dimensional scaling (MDS). MDS generates a map of
points while maintaining the distance between any two points, making it possible
to analyze the ensemble of multiple models by simple visual inspection as well as
through many statistical analysis techniques. From the constructed metric space, a
series of operations for reservoir modeling becomes available: generating
additional models (Caers, 2008a; Scheidt et al., 2008), selecting a few
representative models by screening and clustering models (Scheidt and Caers,
2009a, 2010), sensitivity analysis and uncertainty assessment for models (Scheidt
and Caers, 2008, 2009b), updating models to constrain them to nonlinear
time-series data (Caers and Park, 2008; Park et al., 2008a), and so forth (for a
detailed summary, refer to Caers et al. (2010)). While a reservoir model is often
represented by millions of parameters (properties at each gridblock), in metric
space a reservoir model is represented by its distances to the other models, which
are correlated with the application output; this representation is simple and of
critical interest. Also, as long as a distance between
any two models is defined, metric space modeling technologies can be applied to
any combination of models, such as models with different structural geometries or
models of different geological scenarios. In this appendix, we show that metric
space modeling is more than an interesting academic idea and that it can, with the
right software engineering, readily be put into practice.
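The MDS step described above can be sketched with classical (Torgerson) scaling, which recovers low-dimensional coordinates from a distance table alone. This is a generic illustration, not the Metrel implementation:

```python
import numpy as np

def classical_mds(D, n_dim=3):
    """Classical (Torgerson) MDS: embed points from a pairwise distance
    matrix while preserving the distances as well as possible.

    D : (n, n) symmetric matrix of pairwise distances
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J        # double-centered squared distances
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:n_dim]
    w = np.maximum(w[order], 0.0)      # clip small negative eigenvalues
    return V[:, order] * np.sqrt(w)    # low-dimensional coordinates
```

For distances that are exactly Euclidean, the embedding reproduces the distance table (up to rotation); for general response distances it is the best low-dimensional approximation in the classical-scaling sense.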
Ocean is an application development framework for building applications tightly
integrated with the Petrel product family (Schlumberger, 2008). Under the
Windows .NET environment, Ocean allows one to develop user-friendly plug-ins
that can be executed in Petrel using all of Petrel's functionality and database.
Petrel is reservoir modeling software that makes it possible to generate multiple
reservoir models (multiple structures, multiple properties, etc.) given almost all
types of geological, geophysical, and petrophysical data. In addition, Petrel has
many strong analysis functions for 3D visualization, reservoir flow simulation,
uncertainty assessment, and so forth.
In this study, we have developed a Petrel plug-in (Metrel) in which core
technologies for metric space modeling are implemented based on the Ocean
framework. First, Metrel allows defining any type of distance from the Petrel
database. Second, Metrel constructs a metric space and maps the initial set of
models into a low-dimensional space by means of MDS; the results are stored in
the Petrel database and each model can be viewed and analyzed as a point in a 3D
display window. Third, Metrel performs kernel k-means clustering (KKM) to
divide the set of models into several groups for further analyses. Finally, based on
the results from MDS and clustering, sensitivity analysis and uncertainty
assessment are available. In Sections A.2 to A.4, we explain how the plug-in works
and how and where to use Metrel, followed by summarizing remarks in Section
A.5.
With Metrel, users can choose a few representative models and determine the
uncertainty in future predictions (e.g., p10, p50, and p90) with a reduced number
of models. Additionally, users can analyze the sensitivity of any type of
parameter, whether continuous (channel width or length) or categorical (type of
structural model, multiple geological scenarios). Finally, users can analyze
multiple models very easily by means of simple visual inspection.
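The p10/p50/p90 summaries mentioned here are per-time-step percentiles over the ensemble of simulated curves. A minimal sketch (our own helper, not Metrel code):

```python
import numpy as np

def prediction_percentiles(curves, probs=(10, 50, 90)):
    """Per-time-step percentile envelope of an ensemble of production curves.

    curves : (n_models, n_t) simulated production profiles
    """
    return {f"p{p}": np.percentile(curves, p, axis=0) for p in probs}
```

Comparing the envelope of the few representative models against that of the full ensemble is exactly the check shown later in Figures A.17 and A.18.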
the reservoir simulation or the uncertainty analysis requires. For instance, we
have to assign to a Case which structural model to use, which permeability model,
which porosity model, which relative permeability curve, which fluid behavior
curves, and so on (Figure A.3). We can define multiple Cases from the input and
model databases in Petrel. The defined Cases are used as the input of Metrel.
Using this set of Cases defined and chosen for Metrel, Metrel can perform the two
core operations of metric space modeling: MDS and KKM (see Chapter 2).
As discussed in Chapter 2, MDS maps the Cases into a low-dimensional space
while preserving the distance between any two Cases (Figure A.4). In Metrel, the
distance is defined by the difference between properties or simulation results of
any two Cases; a single property or multiple properties can be chosen to define it.
For example, the distance can be defined by the difference in oil production and
bottom-hole pressure obtained through streamline simulation. Metrel generates a
new Pointset for displaying the results of MDS. The Pointset, together with each
Case's name and the parameters used to define the Case, can be viewed in the 3D
view of Petrel.
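The distance definition just described — the difference between the simulated responses of any two Cases — can be sketched as a pairwise L2 distance table over the response curves. A generic illustration with hypothetical names, not Metrel's internal code:

```python
import numpy as np

def response_distance_matrix(R):
    """Distance table between Cases: L2 difference of their response curves.

    R : (n_cases, n_t) simulated responses per Case
        (e.g. oil production and bottom-hole pressure curves stacked)
    """
    diff = R[:, None, :] - R[None, :, :]      # pairwise response differences
    return np.sqrt((diff ** 2).sum(axis=-1))  # symmetric, zero diagonal
```

This matrix is the single input both MDS and KKM need; the models themselves never have to be compared gridblock by gridblock.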
The second functionality implemented in Metrel is KKM (Figure A.5). Metrel
Figure A.4: Distance and the projection of Cases from metric space by multidimensional scaling.
makes it possible to apply the clustering to any metric space (Pointset) generated
by means of MDS. KKM generates a new Pointset which identifies the models
closest to each cluster centroid together with their cluster indices; these models
are then selected as representative of the entire set. Additionally, KKM generates
another new Pointset which contains all the models with their cluster indices. We
can also make use of the statistical analysis tools already implemented in Petrel,
e.g. to inspect histograms of input parameters or results, perform sensitivity
analysis on input parameters, and so forth.
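The KKM step can be sketched as kernel k-means operating directly on a kernel matrix: each point joins the cluster whose feature-space centroid is nearest, with centroid distances expanded purely in kernel entries. A simplified illustration with a deterministic initialization of our own choosing, not the Scheidt and Caers implementation:

```python
import numpy as np

def kernel_kmeans(K, n_clusters, n_iter=100):
    """Kernel k-means using only the kernel matrix K (n x n)."""
    n = K.shape[0]
    labels = np.arange(n) % n_clusters       # simple deterministic init
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            if not mask.any():
                continue                     # skip empty clusters
            nc = mask.sum()
            # ||phi(x_i) - mu_c||^2 expanded with kernel entries only
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, mask].sum(axis=1) / nc
                          + K[np.ix_(mask, mask)].sum() / nc ** 2)
        new = dist.argmin(axis=1)
        if np.array_equal(new, labels):
            break                            # converged
        labels = new
    return labels
```

Because only K is touched, the same routine works for any kernel built from the response distances, and the model nearest each centroid can then be reported as the cluster representative.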
using streamline simulation (Frontsim in Petrel) for all the Cases defined.
Evaluation of the distance needs to be relatively efficient.
3. Run Metrel. The Metrel user interface displays all the Cases and the
Properties that are shared by all the Cases.
4. Choose which Cases to use for MDS and which Properties to use for the
distance calculation (Figure A.6). The Map multiple models into the
metric space button executes MDS, and a new Pointset is generated in the
Input tab of the Petrel database. The new Pointset represents the location
of each Case in MDS space along with its name (Figure A.7). Any other
properties can be added to the Pointset using the Edit From Spreadsheet
function of the Pointset.
5. Go to the next tab: Clustering. Choose which metric space to use for the
clustering (Figure A.6). The other parameters (dimension of the metric space
and kernel bandwidth) are determined automatically by clicking the buttons
located to the right of the input boxes (for details on the automatic choice,
see Scheidt and Caers (2009b)). The only parameter that needs to be set by
the user is the number of clusters. The clustering result is then added as a
new Pointset in the Input tab, which additionally contains the cluster
indices and centroid information (Figure A.8).
Table A.1: Generation of 104 models with different techniques. For facies, YES
means the generation of porosity and permeability is based on a facies model and
NO means facies are ignored; for fluvial (porosity generation method), MPS means
multiple-point geostatistical simulation and SIS means sequential indicator
simulation; for permeability (permeability generation method), KS means the
permeability model is generated by a single poroperm regression, KP means
poroperm regression per facies, and KM means permeability by co-kriging on
porosity. The number in parentheses is the number of models generated.
Parameter 104 property models
Facies YES (78) NO (26)
Fluvial MPS (39) SIS (65)
Perm KS (13) KM (13) KP (13) KS (13) KM (13) KP (13) KS (13) KM (13)
2. Run Frontsim for the 104 Cases. Figure A.12 shows the streamlines used in
one of the Frontsim simulations.
4. Define the distance as the difference in oil and water production over 10
years resulting from the Frontsim simulations and choose all 104 Cases.
5. Perform MDS by clicking the button Map multiple models into metric
space. Figure A.13 depicts the result of MDS. Each point represents a Case.
The Cases are arranged in the 3D space (the projection of the metric space)
such that Cases with similar oil and water flow characteristics are located
close to each other.
7. Run full reservoir simulations for the chosen Cases (Eclipse) and analyze the
Pointsets generated in the Input tab for uncertainty assessment and
sensitivity analysis, as presented in the following subsections.
Figure A.13: Projection of metric space by MDS. Each dot represents a reservoir model (Case). The color indicates the z-direction location of each Case.
Figure A.14: Clustering results and a few representative Cases chosen by KKM. Chosen Cases are represented by large circles labeled BRUGGE 33, BRUGGE 48, BRUGGE 68, BRUGGE 78, BRUGGE 88, and BRUGGE 93.
Figure A.15: Field oil and water production curves of all 104 models by exhaustive simulation, which is impractical in the field.
Figure A.16: Field oil and water production curves of the 6 representative models chosen by KKM.
Figure A.17: p10 , p50 , and p90 of field oil production curves of 104 models (green
dashed lines) and 6 representative models chosen by KKM (blue solid lines).
Figure A.18: p10 , p50 , and p90 of field water production curves of 104 models (green
dashed lines) and 6 representative models chosen by KKM (blue solid lines).
Figure A.19: Well (p17) oil and water production curves of all 104 models by exhaustive simulation, which is impractical in the field.
Figure A.20: Well (p17) oil and water production curves of the 6 representative models chosen by KKM.
Figure A.21: p10 , p50 , and p90 of well (p17) oil production curves of 104 models
(green dashed lines) and 6 representative models chosen by KKM (blue solid lines).
Figure A.22: p10 , p50 , and p90 of well (p17) water production curves of 104 models
(green dashed lines) and 6 representative models chosen by KKM (blue solid lines).
Figure A.23: Checking the type of porosity and permeability model generation
method in the spreadsheet of Pointset. x, y, Depth represent the location of each
model in the space projected by MDS. The case name, cluster index, and generation
methods of permeability and porosity are listed in the table.
Figure A.24: Checking the type of porosity and permeability model generation
method for 6 representative models only in the spreadsheet of Pointset.
Figure A.25: Projection of metric space with displaying the usage of facies infor-
mation for the generation of porosity model (YES: facies considered; NO: facies
ignored).
Figure A.26: Projection of the metric space, displaying the simulation method used to generate the porosity model (MPS: multiple-point geostatistical method; SIS: sequential indicator simulation).
Figure A.27: Projection of the metric space, displaying the method used to generate the permeability model (KS: single poroperm regression; KP: poroperm regression per facies; KM: cokriging on porosity).
A.5 Summary
We have developed a Petrel plug-in (Metrel) using the Ocean development framework. Metrel brings the core technologies of metric-space modeling (MDS and KKM) into Petrel, so that multiple models can be analyzed in Petrel's 2D or 3D views or with its other analysis functions. Metrel can select a few representative models from a large set of models, which makes further analyses efficient, such as calculating P10, P50, and P90 predictions from reservoir flow simulations. Metrel also helps analyze the sensitivity of the results of interest to the input parameters and methods. An example run of Metrel with the Brugge field-scale data set shows that the 6 representative models chosen by Metrel, requiring only 6 full flow simulations, are sufficient to assess uncertainty and analyze sensitivity.
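The workflow Metrel automates — embed the models in a low-dimensional Euclidean space by MDS, cluster them, and keep one representative per cluster — can be sketched in plain NumPy. This is an illustrative stand-in, not the plug-in's implementation: classical (Torgerson) MDS replaces the general MDS solver, ordinary k-means in the MDS coordinates stands in for KKM, and all function and variable names are hypothetical.

```python
import numpy as np

def classical_mds(D, dim=2):
    """Embed n models as points in `dim` dimensions from an n-by-n
    matrix of pairwise model distances D (classical/Torgerson MDS)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered squared distances
    w, V = np.linalg.eigh(B)                   # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:dim]            # keep the `dim` largest
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

def pick_representatives(X, k, n_iter=100, seed=0):
    """k-means in the MDS coordinates; return, for each cluster, the
    index of the ensemble member closest to the cluster centroid."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return sorted({int(np.argmin(((X - c) ** 2).sum(-1))) for c in centers})

# Toy example: 30 "models" whose pairwise distance is the Euclidean
# distance between synthetic response vectors (e.g. production curves).
rng = np.random.default_rng(42)
responses = rng.normal(size=(30, 50))
D = np.linalg.norm(responses[:, None] - responses[None], axis=-1)
X = classical_mds(D, dim=2)
reps = pick_representatives(X, k=6)            # indices of representative models
```

Only the representative models then need full flow simulation; their responses serve as a proxy for the whole set when summarizing uncertainty.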
Bibliography
Allard, D., Froidevaux, R., Biver, P., 2005. Accounting for non-stationarity and in-
teractions in object simulation for reservoir heterogeneity characterization. In
Geostatistics Banff 2004, O. Leuangthong and C.V. Deutsch (eds). Springer, New
York, pp. 155–164.
Borg, I., Groenen, P., 2005. Modern multidimensional scaling: theory and applica-
tions. Springer series in statistics. Springer.
URL http://books.google.com/books?id=duTODldZzRcC
Boucher, A., Gupta, R., Caers, J., Satija, A., 2010. Tetris: a training image generator
for SGeMS. 23rd SCRF Annual Meeting Report.
Caers, J., 2003. History matching under training image-based geological model constraints. SPE Journal 8 (3), 218–226.
Caers, J., 2007. Comparison of the gradual deformation with the probability per-
turbation method for solving inverse problems. Mathematical Geology 39 (1),
27–52.
Caers, J., 2008a. Distance-based random field models and their applications. In:
Proceedings of 8th International Geostatistical Congress, J.M. Ortiz and X.
Emery (eds). Gecamin, Santiago, Chile, pp. 109–118.
Caers, J., 2008b. Distance-based stochastic modeling: theory and applications. 21st
SCRF Annual Meeting Report.
Caers, J., Hoffman, T., 2006. The probability perturbation method: a new look at
Bayesian inverse modeling. Mathematical Geology 38 (1), 81–100.
Caers, J., Park, K., Sep. 2008. A distance-based representation of reservoir uncertainty: the metric EnKF. In: Proceedings of 11th European Conference on the Mathematics of Oil Recovery. Bergen, Norway.
Caers, J., Park, K., Scheidt, C., 2010. Modeling uncertainty of complex Earth sys-
tems in metric space. In Handbook of geomathematics, W. Freeden et al. (eds).
Springer, pp. 877–901.
URL http://books.google.com/books?id=nPqzpCs7k5EC
Caers, J., Scheidt, C., 2010. Joint integration of engineering and geological uncertainty for reservoir performance prediction using a distance-based approach. In: AAPG Memoir on Modeling Geological Uncertainty, in press.
Cheng, H., Kharghoria, A., He, Z., Datta-Gupta, A., 2005. Fast history matching
of finite-difference models using streamline-derived sensitivities. SPE Reservoir
Evaluation and Engineering 8 (5), 426–436.
Datta-Gupta, A., King, M., 1995. A semianalytic approach to tracer flow modeling
in heterogeneous permeable media. Advances in Water Resources 18 (1), 9–24.
Deutsch, C., Journel, A., 1998. GSLIB: Geostatistical software library and user's guide. Oxford University Press.
Devegowda, D., Arroyo-Negrete, E., Datta-Gupta, A., 2010. Flow relevant covariance localization during dynamic data assimilation using EnKF. Advances in Water Resources 33 (2), 129–145.
Evensen, G., 2003. The ensemble Kalman filter: theoretical formulation and practical implementation. Ocean Dynamics 53 (4), 343–367.
Evensen, G., 2004. Sampling strategies and square root analysis schemes for the EnKF. Ocean Dynamics 54 (6), 539–560.
Evensen, G., 2009. Data assimilation: The ensemble Kalman filter. Springer.
URL http://books.google.com/books?id=2_zaTb_O1AkC
Gao, G., Zafari, M., Reynolds, A., Jan. 2005. Quantifying uncertainty for the PUNQ-S3 problem in a Bayesian setting with RML and EnKF. In: SPE Reservoir Simulation Symposium. The Woodlands, Texas, U.S.A.
Gill, P., Murray, W., Wright, M., 1981. Practical optimization. Academic Press.
URL http://books.google.com/books?id=xUzvAAAAMAAJ
Gu, Y., Oliver, D., 2005. History matching of the PUNQ-S3 reservoir model using
the ensemble Kalman filter. SPE Journal 10 (2), 217–224.
Hastings, W., 1970. Monte Carlo sampling methods using Markov chains and their
applications. Biometrika 57 (1), 97–109.
Houtekamer, P., Mitchell, H., 1998. Data assimilation using an ensemble Kalman
filter technique. Monthly Weather Review 126, 796–811.
Hu, L., 2008. Extended probability perturbation method for calibrating stochastic
reservoir models. Mathematical Geosciences 40 (8), 875–885.
Hu, L., Blanc, G., Noetinger, B., 2001. Gradual deformation and iterative calibra-
tion of sequential stochastic simulations. Mathematical Geology 33 (4), 475–489.
Jafarpour, B., McLaughlin, D. B., 2008. History matching with an ensemble Kalman
filter and discrete cosine parameterization. Computational Geosciences 12 (2),
227–244.
Kalman, R., 1960. A new approach to linear filtering and prediction problems. Journal of Basic Engineering 82, 35–45.
Kwok, J.-Y., Tsang, I.-H., 2004. The pre-image problem in kernel methods. IEEE Transactions on Neural Networks 15 (6), 1517–1525.
Liu, N., Oliver, D., Jan. 2005. Critical evaluation of the ensemble Kalman filter on history matching of geologic facies. In: SPE Reservoir Simulation Symposium. The Woodlands, Texas, U.S.A.
Lorentzen, R., Nævdal, G., Vallés, B., Berg, A., Grimstad, A.-A., Oct. 2005. Analy-
sis of the ensemble Kalman filter for estimation of permeability and porosity in
reservoir models. In: SPE Annual Technical Conference and Exhibition. Dallas,
Texas, U.S.A., SPE 96375.
Margulis, S., McLaughlin, D., Entekhabi, D., Dunne, S., 2002. Land data assimilation and estimation of soil moisture using measurements from the Southern Great Plains 1997 Field Experiment. Water Resources Research 38 (12), 1–18.
Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., Teller, E., 1953. Equation of state calculations by fast computing machines. Journal of Chemical Physics 21 (6), 1087–1092.
Michael, H., Li, H., Boucher, A., Sun, T., Caers, J., Gorelick, S., 2010. Combining
geologic-process models and geostatistics for conditional simulation of 3-D sub-
surface heterogeneity. Water Resources Research 46, 1–20.
Nævdal, G., Johnsen, L., Aanonsen, S., Vefring, E., 2005. Reservoir monitoring and
continuous model updating using ensemble Kalman filter. SPE Journal 10 (1),
66–74.
Nævdal, G., Vefring, E., Apr. 2002. Near-well reservoir monitoring through en-
semble Kalman filter. In: SPE/DOE improved oil recovery symposium. Tulsa,
Oklahoma, U.S.A.
Park, K., Caers, J., Sep. 2010a. Mathematical reformulation of highly nonlinear
large-scale inverse problems in metric space. In: 12th European Conference on
the Mathematics of Oil Recovery. Oxford, U.K.
Park, K., Caers, J., Aug. 2010b. Sampling multiple non-Gaussian model realizations
constrained to static and highly nonlinear dynamic data using distance-based
techniques. In: Annual meeting of the International Association for Mathemati-
cal Geosciences. Budapest, Hungary.
Park, K., Choe, J., Jun. 2006. Use of ensemble Kalman filter with 3-dimensional
reservoir characterization during waterflooding. In: SPE Europec/EAGE An-
nual Conference and Exhibition. Vienna, Austria.
Park, K., Choe, J., Ki, S., Aug. 2005. Real-time aquifer characterization using
ensemble Kalman filter. In: 2005 Annual Conference of the IAMG. Toronto,
Canada.
Park, K., Scheidt, C., Caers, J., Jun. 2008a. Ensemble Kalman filtering in distance-based kernel space. In: Proceedings of EnKF Workshop 2008. Voss, Norway.
Park, K., Scheidt, C., Caers, J., 2008b. Simultaneous conditioning of multiple non-Gaussian geostatistical models to highly nonlinear data using distances in kernel space. In: Proceedings of 8th International Geostatistical Congress, J.M. Ortiz and X. Emery (eds). Gecamin, Santiago, Chile, pp. 247–256.
Peters, E., Arts, R., Brouwer, G., Geel, C., Feb. 2009. Results of the Brugge bench-
mark study for flooding optimisation and history matching. In: Proceedings of
SPE Reservoir Simulation Symposium. The Woodlands, Texas.
Peters, E., Arts, R., Brouwer, G., Geel, C., Cullick, S., Lorentzen, R., Chen, Y., Dun-
lop, K., Vossepoel, F., Xu, R., Sarma, P., Alhutali, A., Reynolds, A., 2010. Results
of the Brugge benchmark study for flooding optimisation and history matching.
SPE Reservoir Evaluation and Engineering 13 (3), 391–405.
Popper, K., 2002. The logic of scientific discovery. Routledge classics. Routledge.
URL http://books.google.com/books?id=T76Zd20IYlgC
Pyrcz, M., Strebelle, S., 2005. Conditioning event-based fluvial models. In Geo-
statistics Banff 2004, O. Leuangthong and C.V. Deutsch (eds). Springer, New
York, pp. 135–144.
RamaRao, B., LaVenue, A., de Marsily, G., Marietta, M., 1995. Pilot point method-
ology for automated calibration of an ensemble of conditionally simulated trans-
missivity fields. Water Resources Research 31, 475–493.
Reichle, R., McLaughlin, D., Entekhabi, D., 2002. Hydrologic data assimilation
with the ensemble Kalman filter. Monthly Weather Review 130, 103–114.
Sarma, P., 2006. Efficient closed-loop optimal control of petroleum reservoirs under
uncertainty. Ph.D. thesis, Stanford University.
Sarma, P., Chen, W., Feb. 2009. Generalization of the ensemble Kalman filter using kernels for non-Gaussian random fields. In: Proceedings of SPE Reservoir Simulation Symposium. The Woodlands, Texas.
Scheidt, C., Caers, J., Sep. 2008. Joint quantification of uncertainty on spatial
and non-spatial reservoir parameters: comparison between the joint modeling
method and distance kernel method. In: Proceedings of 11th European Confer-
ence on the Mathematics of Oil Recovery. Bergen, Norway.
Scheidt, C., Caers, J., 2009a. A new method for uncertainty quantification using
distances and kernel methods. Application to a deepwater turbidite reservoir.
SPE Journal 14 (4), 680–692.
Scheidt, C., Caers, J., 2009b. Representing spatial uncertainty using distances and kernels. Mathematical Geosciences 41 (4), 397–419.
Scheidt, C., Caers, J., 2010. Bootstrap confidence intervals for reservoir model se-
lection techniques. Computational Geosciences 14 (2), 369–382.
Scheidt, C., Park, K., Caers, J., 2008. Defining a random function from a given set of
model realizations. In: Proceedings of 8th International Geostatistical Congress,
J.M. Ortiz and X. Emery (eds). Gecamin, Santiago, Chile, pp. 469–478.
Schölkopf, B., Smola, A., 2002. Learning with kernels: support vector machines,
regularization, optimization, and beyond. Adaptive computation and machine
learning. MIT Press.
URL http://books.google.com/books?id=y8ORL3DWt4sC
Skjervheim, J.-A., Evensen, G., Aanonsen, S., Ruud, B., Johansen, T., Oct. 2005.
Incorporating 4D seismic data in reservoir simulation models using ensemble
Kalman filter. In: SPE Annual Technical Conference and Exhibition. Dallas,
Texas, U.S.A., SPE 95789.
Suzuki, S., Caers, J., 2008. A distance-based prior model parameterization for constraining solutions of spatial inverse problems. Mathematical Geosciences 40 (4), 445–469.
Suzuki, S., Caumon, G., Caers, J., 2008. Dynamic data integration into structural
modeling: model screening approach using a distance-based model parameteri-
zation. Computational Geosciences 12 (1), 105–119.
Tarantola, A., 2005. Inverse problem theory and methods for model parameter es-
timation. Society for Industrial and Applied Mathematics.
URL http://books.google.com/books?id=kEboSYSU-nAC
Tarantola, A., 2006. Popper, Bayes and the inverse problem. Nature Physics 2, 492–
494.
Thiele, M., Batycky, R., Blunt, M., 1996. Simulating flow in heterogeneous media
using streamtubes and streamlines. SPE Reservoir Engineering 11 (1), 5–12.
Vasco, D., Yoon, S., Datta-Gupta, A., 1999. Integrating dynamic data into high-
resolution reservoir models using streamline-based analytic sensitivity coeffi-
cients. SPE Journal 4, 389–399.
Welch, G., Bishop, G., 2004. An introduction to the Kalman filter. Technical Report TR 95-041, Department of Computer Science, University of North Carolina at Chapel Hill, pp. 1–16.
Wen, X., Deutsch, C., Cullick, A., 2002. Construction of geostatistical aquifer mod-
els integrating dynamic flow and tracer data using inverse technique. Journal of
Hydrology 255, 151–168.
Zafari, M., Reynolds, A., Oct. 2005. Assessing the uncertainty in reservoir descrip-
tion and performance predictions with the ensemble Kalman filter. In: SPE An-
nual Technical Conference and Exhibition. Dallas, Texas, U.S.A., SPE 95750.
Zhang, D., Lu, Z., Chen, Y., 2007. Dynamic reservoir data assimilation with an
efficient, dimension-reduced Kalman filter. SPE Journal 12 (1), 108–129.
Zhang, T., 2006. Filter-based training pattern classification for spatial pattern sim-
ulation. Ph.D. thesis, Stanford University.