
Robust Well-Test Interpretation by Using Nonlinear Regression With Parameter and Data Transformations

Aysegul Dastan, SPE, and Roland N. Horne, SPE, Stanford University

Copyright © 2011 Society of Petroleum Engineers

This paper (SPE 132467) was accepted for presentation at the SPE Western Regional Meeting, Anaheim, California, USA, 27–29 May 2010, and revised for publication. Original manuscript received for review 29 March 2010. Revised manuscript received for review 6 August 2010. Paper peer approved 18 August 2010.

698 September 2011 SPE Journal

Summary

Nonlinear regression is a well-established technique in well-test interpretation. However, this widely used technique is vulnerable to issues commonly observed in real data sets—specifically, sensitivity to noise, parameter uncertainty, and dependence on starting guess. In this paper, we show significant improvements in nonlinear regression by using transformations on the parameter space and the data space. Our techniques improve the accuracy of parameter estimation substantially. The techniques also provide faster convergence, reduced sensitivity to starting guesses, automatic noise reduction, and data compression.

In the first part of the paper, we show, for the first time, that Cartesian parameter transformations are necessary for correct statistical representation of physical systems (e.g., the reservoir). Using true Cartesian parameters enables nonlinear regression to search for the optimal solution homogeneously on the entire parameter space, which results in faster convergence and increases the probability of convergence for a random starting guess. Nonlinear regression using Cartesian parameters also reveals inherent ambiguities in a data set, which may be left concealed when using existing techniques, leading to incorrect conclusions. We proposed suitable Cartesian transform pairs for common reservoir parameters and used a Monte Carlo technique to verify that the transform pairs generate Cartesian parameters.

The second part of the paper discusses nonlinear regression using the wavelet transformation of the data set. The wavelet transformation is a process that can compress and denoise data automatically. We showed that only a few wavelet coefficients are sufficient for improved performance and direct control of nonlinear regression. By using regression on a reduced wavelet basis rather than the original pressure data points, we achieved improved performance in terms of likelihood of convergence and narrower confidence intervals. The wavelet components in the reduced basis isolate the key contributors to the response and, hence, use only the relevant elements in the pressure-transient signal. We investigated four different wavelet strategies, which differ in the method of choosing a reduced wavelet basis.

Combinations of the techniques discussed in this paper were used to analyze 20 data sets to find the technique or combination of techniques that works best with a particular data set. Using the appropriate combination of our techniques provides very robust and novel interpretation techniques, which will allow for reliable estimation of reservoir parameters using nonlinear regression.

Introduction

Well-test interpretation using nonlinear regression is a well-established technique. Despite the ubiquitous application of nonlinear regression, its ambiguous results, sensitivity to noise, and dependence on good starting guesses constrain its use in practical well testing. In addition, with the implementation of permanent downhole gauges, which are capable of acquiring data frequently over a long period of time, analysis of massive data sets is becoming a challenge. This paper is about improving the performance and reliability of nonlinear regression through Cartesian transformations in the parameter space and wavelet transformation in the data space.

Parameter-space transformations discussed previously in the literature are mostly about reparameterization of the nonlinear-regression problem. Lu and Horne (2000) and Sahni and Horne (2005) used the wavelet transform in the parameter space, which consists typically of the spatial distribution of the permeability and the porosity. These authors used only the significant wavelet coefficients of the permeability distribution to reduce the number of parameters, hence stabilizing the parameter-estimation algorithm. In another reparameterization application, Carvalho et al. (1992) investigated the effect of using the grouped parameter set {k, k/C, Ce^(2s)}, typically used in traditional type curves, instead of the primitive parameters {k, C, s}. Using the grouped parameter set did not provide any advantage over the primitive parameters.

In conventional nonlinear-regression analysis of well tests, it is customary to use the primitive parameters for solving the nonlinear-regression problem. However, the logarithm of permeability has been used in some applications (Mathisen et al. 2003; Wong et al. 2002) and is known to improve the performance. The reason behind the effectiveness of the logarithmic transformation is more generally instructive and will be the centerpiece of our discussion on parameter transformations in this paper. In the first part of this paper, we discuss how parameters can be categorized as either Cartesian or Jeffreys (Tarantola 2005) and that it is more appropriate to assume that reservoir parameters are Jeffreys parameters. Optimal performance and correct statistics in nonlinear regression can be achieved only through the use of Cartesian parameters, which means a Cartesian transformation is necessary for all reservoir parameters (not only permeability). By using Cartesian-transformed parameters, we showed improved performance in initial-guess statistics. A much wider range of initial guesses for estimated parameters converged to the correct answer in a smaller number of iterations. We also showed that confidence-interval calculations are more accurate with Cartesian transformation. The Cartesian-transformation table for reservoir parameters frequently used in well testing is provided in the first part of the paper.

The second part of this paper discusses data-space transformations using the wavelet transform. Wavelet transforms were formalized in the early 1920s, but it was not until the 1990s that a wide range of applications was proposed. Many of these applications were related to image processing [see, for example, Mallat (1999)]. Recently, there have been quite a few interesting applications of wavelets in petroleum engineering also. These applications include denoising data, identifying important transients in long permanent-gauge data (Athichanagorn et al. 2002; Kikani and He 1998), and simplifying the parameter space by reducing the number of parameters to be estimated (Lu and Horne 2000; Sahni and Horne 2005; Awotunde and Horne 2008). In our work, we used the wavelet transform for data-space (pressure/time) transformation and carried out nonlinear regression in the transformed space. The technique we present reduces the number of data points used in the nonlinear regression and automatically denoises the data. We used the wavelet transform also to control the amount of information taken into account at different stages during regression. We found that our methods increase the accuracy of the parameter estimation.
We developed techniques for general reservoir types (e.g., reservoirs with skin, storage, and boundary effects, and dual-porosity reservoirs) and for a variety of well tests.

In the third part of the paper, we describe the extensive testing of the parameter-space and data-space transformation techniques by application to 20 well-test data sets (three synthetic and 17 real-field cases).

Parameters in Inverse Problems

Inverse problem solving, in its broadest sense, can be thought of as a probability problem. Any experiment we conduct (e.g., pressure-transient analysis) is a change in our state of knowledge of the parameters we are estimating. Each of the parameters we would like to estimate (i.e., the reservoir parameters) has a prior probability distribution. When an experiment is conducted, we obtain additional knowledge on the parameters and, hence, a posterior probability distribution. Inverse problem solving is actually finding the posterior probability distribution for each parameter given the experimental data. Almost all of the nonlinear-regression techniques discussed in the literature focus on techniques to find the posterior distributions of the parameters. However, in doing that, the procedures often make tacit assumptions regarding the prior distributions of the parameters. In most cases, if nothing is specified, the prior probabilities of the parameters are assumed to be uniformly distributed in the interval (−∞, ∞), which can contradict the physical nature of the parameters (e.g., permeability).

The first step in forming a mathematical model of a system or an experiment is to parameterize the system. Physical relations and fundamental laws provide the relation between the parameters. Nonlinear regression uses the physical relation (i.e., the forward model) and measurement results for the observable parameters to infer the values of other parameters. There is some degree of arbitrariness in choosing parameters. In many cases, you can come up with a function of a parameter and introduce it as a new parameter. Our aim in this part of the research was to investigate the representation of the pressure transient using appropriate parameters consistent with the physical phenomena.

In this part of the paper, we discuss the probabilistic nature of inverse-problem solving in detail. We will discuss an important classification related to prior distributions of parameters, and we will see that most physical parameters, in particular well-testing parameters, can (and should) be considered to be Jeffreys quantities. We can then find a suitable transformation to make those parameters Cartesian. The idea is to start solving the inverse problem with a uniform a priori distribution of parameters, which is valid only for Cartesian parameters. In this paper, we propose a conversion table for the set of parameters frequently used in well testing. The importance of using true Cartesian parameters will be demonstrated.

Jeffreys and Cartesian Parameters. Tarantola (2005) discussed the importance of parameterization in a physical system by considering positive parameters encountered in science with reciprocals that are also meaningful physical parameters corresponding to the same physical phenomenon. Tarantola showed that the only unambiguous way of defining distance for reciprocal pairs is defining the distance in terms of logarithms of parameters.

Definition of distance is central to nonlinear-regression algorithms. In nonlinear regression, an objective function and its derivatives are evaluated at successive points until a minimum is reached—the minimum usually being defined as some function of the distance between the model and the data. Regression algorithms determine the direction and the step size to find the next iteration given a starting point in the parameter space. A different definition of distance results in a different path taken by the algorithm and, hence, a difference in performance. The classification of parameters as Jeffreys and Cartesian is instrumental in understanding the underlying concepts in the definition of distance.

A homogeneous probability distribution is defined as the probability distribution that assigns to each region in the parameter space a probability proportional to the volume of that region (Tarantola 2005). Note that a homogeneous probability distribution is not the probability of occurrence of the parameters. Rather, it is an aspect of the parameter space being used. Transformations on the parameter space result in a different homogeneous probability distribution.

Let us first consider a 3D Euclidian coordinate system with orthogonal coordinates x1, x2, and x3. In this system, the volume element is

dV = dx1 dx2 dx3. . . . . . . . . . . (1)

Because the volume element is the same for all points in the parameter space, the homogeneous probability distribution is constant; i.e., the probability distribution function will be

f(x1, x2, x3) = c . . . . . . . . . . . (2)

for some constant c. Let us now consider a more general coordinate system with coordinates u1, u2, and u3, for which the volume element is expressed as

dV = ν(u1, u2, u3) du1 du2 du3. . . . . . . . . . . (3)

In this case, the homogeneous probability density will be

f(u1, u2, u3) = c ν(u1, u2, u3). . . . . . . . . . . (4)

As an example, in spherical coordinates, we have

dV = r² sin(θ) dr dθ dφ . . . . . . . . . . . (5)

and

f(r, θ, φ) = c r² sin(θ). . . . . . . . . . . (6)

To find these expressions, we simply used the definition of homogeneous probability distribution. Consider a space region S. The volume it defines is found by integrating dV over S. For a homogeneous probability density, the probability corresponding to that region should be proportional to the volume of the region, hence the probability-density function given in Eq. 4.

When the distance for a parameter α is defined as log(α1) − log(α2), we obtain the same distance for the reciprocal of α. Let us define the distance L with respect to some reference α0:

L(α) = log(α/α0). . . . . . . . . . . (7)

Now, we can define a distance element dL:

dL = dα/α. . . . . . . . . . . (8)

Note that the infinitesimal distance element in 1D space is actually a volume element. Combining Eq. 4 with Eq. 8, one can find the homogeneous probability density for α as

f(α) = c/α. . . . . . . . . . . (9)

If we define a new parameter v ≡ log(α), the distance L can be defined as

L(v) = v − v0. . . . . . . . . . . (10)

Consequently, the distance element becomes

dL(v) = dv. . . . . . . . . . . (11)

For this volume element, the associated probability density is constant:

f(v) = c. . . . . . . . . . . (12)
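As a concrete check of this logarithm property, the following short numerical sketch (ours, not from the paper; the range 10⁻³ to 10³, the sample count, and the bin count are arbitrary choices) draws samples with the Jeffreys homogeneous density f(α) = c/α and confirms that v = log(α) is distributed uniformly, as Eqs. 9 through 12 predict:

```python
import math
import random

def sample_jeffreys(alpha_min, alpha_max, n, rng):
    """Draw samples with density f(a) proportional to 1/a on [alpha_min, alpha_max].

    Inverse-CDF sampling: the CDF is log(a/alpha_min)/log(alpha_max/alpha_min),
    so a = alpha_min * (alpha_max/alpha_min)**u for u ~ Uniform(0, 1).
    """
    ratio = alpha_max / alpha_min
    return [alpha_min * ratio ** rng.random() for _ in range(n)]

rng = random.Random(0)
alphas = sample_jeffreys(1e-3, 1e3, 200_000, rng)  # a positive, permeability-like parameter
vs = [math.log(a) for a in alphas]                 # candidate Cartesian parameter v = log(alpha)

# If v is Cartesian (homogeneous density f(v) = c), its samples are uniform on
# [log(1e-3), log(1e3)], so each of four equal-width bins holds about 25% of them.
lo, hi = math.log(1e-3), math.log(1e3)
counts = [0, 0, 0, 0]
for v in vs:
    counts[min(3, int(4 * (v - lo) / (hi - lo)))] += 1
fractions = [c / len(vs) for c in counts]
print(fractions)
```

The same experiment run on the untransformed samples would pile most of the mass into the lowest bin, which is the Jeffreys signature.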


This analysis shows that, when we take the logarithm of a parameter with homogeneous probability distribution in the form of f1(•) = c/•, we obtain a parameter with homogeneous probability distribution of f2(•) = c.

With all this as background, we can now define Jeffreys and Cartesian parameters. Parameters associated with the homogeneous probability distribution f(•) = c/• are called Jeffreys parameters. On the other hand, parameters associated with the homogeneous probability distribution f(•) = c are called Cartesian parameters. We have just proved that the logarithm of a Jeffreys parameter is a Cartesian parameter. This important property will be the basis of many of the parameter transformations we investigated to improve the performance of nonlinear regression in well-test interpretation.

Prior Probabilities. Given the definitions of homogeneous probability distribution and Jeffreys and Cartesian parameters, we can now combine this information with the probability of occurrence of parameters. The homogeneous probability distribution function to be associated with a particular parameter is related to the more-complicated problem of prior probabilities (Jaynes 1968). In realistic decision problems, we usually have prior information about the parameters. For example, looking at the results of an estimation, we can generally determine (by engineering experience) if they are nonsensical. The fact that we are able to say certain values of parameters are nonsensical implies that we must know the values of the variables that are reasonable. This information must be taken into account to avoid obvious inconsistencies (Gill et al. 1981).

Jaynes (1968) contrasts prior probabilities to direct probabilities in an interesting way. Direct probabilities can be interpreted with the frequency of occurrence (i.e., statistical hypothesis testing). In other words, we think about experiments that may not exist physically to find the probabilities. This is the way most commonly used distributions are (objectively) defined, such as Bernoulli and Poisson distributions. On the other hand, prior probabilities are rather private by nature. The subjective nature of prior probabilities comes from the fact that a prior probability is a state of knowledge rather than being something measurable experimentally.

Jeffreys (1939, 1957) was the first to suggest that we assign a prior probability distribution of dα/α to a continuous parameter α known to be positive. The basic reasoning behind his argument is that using the parameter α or any power α^m should not make a difference in essence in describing a physical relation. It can be shown easily that any power of the parameter α has the same homogeneous-distribution volume element dα/α. In other words, any power of a Jeffreys parameter is also a Jeffreys parameter. Note that considering the case m = −1, we see that this argument also applies to the reciprocal of α. This gives us a good reasoning to define physical parameters (especially those having physically meaningful reciprocals) as Jeffreys parameters.

Tarantola (2005) discussed the relation between the homogeneous probability distribution and the dispersion parameters in probability densities. We will consider the normal distribution and the log-normal distribution here, given in Eqs. 13 and 14, respectively.

f(x) = c exp[−(x − x0)²/(2σ²)] . . . . . . . . . . . (13)

g(x) = (c/x) exp[−log²(x/x0)/(2σ²)] . . . . . . . . . . . (14)

The parameter σ is the dispersion parameter in both of those equations. Tarantola argues that, for consistency, as σ → ∞, the probability densities should approach the associated homogeneous probability density for that parameter. We see that

lim (σ → ∞) f(x) = c . . . . . . . . . . . (15)

and

lim (σ → ∞) g(x) = c/x. . . . . . . . . . . (16)

Hence, a quantity with normal probability density is a Cartesian parameter and a quantity with log-normal probability density is a Jeffreys parameter. This important property allows us to check whether a parameter is Cartesian using Monte Carlo simulations. Let us assume that the reservoir parameters θ are related through a forward-model function as

p(t) = f(t, θ1, θ2, …, θk). . . . . . . . . . . (17)

In a Monte Carlo simulation, we consider a large number of realizations θ (typically thousands) sampled uniformly in the parameter space for a given range of the parameter set being used. We can run each point through a threshold algorithm (a random process) to find out whether we should accept or reject that point. For example, using Eq. 17, we can generate a hypothetical data set of N data points (ti, pi) for a given set of true values of parameters. The time range and number of points should be chosen appropriately to reflect the most interesting features of pressure-transient behavior. [Any history data would also be incorporated in the f(t) formulation by use of superposition.] We can define the probability of acceptance as

Pr{θ} = exp{−[1/(2σ²)] Σ_{i=1..N} [f(ti, θ) − pi]²}. . . . . . . . . . . (18)

Eq. 18 can be interpreted as follows. Assume there is normally distributed noise in the pressure measurement, with standard deviation σ. Eq. 18 finds the likelihood of measuring the data set corresponding to a randomly generated parameter set θ. We will accept θ if the pressure function it corresponds to is likely to be measured given the noise. Notice that the probability of acceptance is unity at the true answer and becomes smaller as we go away from the true answer. If this accept/reject test is performed for a sufficiently large number of trial parameter sets, the accepted sets correspond to the probability distribution of the parameter space. Regions with a high density of accepted points correspond to regions with a higher probability of occurrence. Note that the distribution found here is the posterior distribution of the parameters (i.e., the forward model and the true answers are taken into account). However, through Eqs. 15 and 16, we can infer the prior (or homogeneous) probability distribution. In particular, if the probability distribution of a parameter ends up being normal or uniform (uniform distribution can be considered as a special case of a normal distribution with σ → ∞), then, from Eq. 15, we can assume the parameter to be Cartesian. We will use this property in the next section to verify that the transformations we propose are indeed Cartesian. The rejection algorithm works reasonably well in one or two dimensions. In larger-dimensional problems, the chances of acceptance become very low and it is usually necessary to work with a very large number of trials in the Monte Carlo process.

Parameter Sets Used in Well Testing and Their Transformation

Consider that we have an abstract parameter space with each point corresponding to some set of parameters θ = [θ1, θ2, …, θN]^T. For example, each point in this space may correspond to a particular permeability, wellbore-storage coefficient, and skin factor. The distance between two points θ1 and θ2 must reflect the symmetries of the problem. In fact, only rarely will these correspond to a Cartesian coordinate system, where the distance is the Euclidian distance:

‖θ1 − θ2‖ = [(θ1,1 − θ2,1)² + ⋯ + (θ1,N − θ2,N)²]^(1/2). . . . . . . . . . . (19)

In fact, the homogeneous probability distribution that corresponds to a Cartesian space (i.e., uniform distribution) is a very special distribution. One may, of course, assume a uniform prior probability distribution when there is no other information. This assumption is only as good as any other assumption. In well testing, typical parameters we work with are permeability, wellbore-storage coefficient, skin, dual-porosity parameters, transmissivity and storativity ratios, and initial pressure. This list, of course, can be extended further.
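The accept/reject scheme of Eq. 18 is straightforward to sketch in code. In the sketch below (ours, not from the paper), a hypothetical one-parameter forward model stands in for Eq. 17, the data set is generated noise-free at the true parameter value so that the acceptance probability is exactly unity there, and all names and numbers are illustrative assumptions:

```python
import math
import random

def forward(t, k):
    # Hypothetical one-parameter forward model standing in for Eq. 17;
    # the reservoir models used in the paper are far more involved.
    return 100.0 * math.log1p(t / k)

rng = random.Random(1)
k_true, sigma = 5.0, 0.5
times = [0.5 * 2.0 ** i for i in range(5)]
# Hypothetical data set generated at the true parameter value (no added noise),
# so Pr{k_true} = exp(0) = 1, as the text notes for the true answer.
data = [forward(t, k_true) for t in times]

accepted = []
for _ in range(200_000):
    k_trial = rng.uniform(1.0, 20.0)  # uniform sampling of the trial parameter
    sse = sum((forward(t, k_trial) - p) ** 2 for t, p in zip(times, data))
    # Eq. 18: accept with probability exp(-SSE / (2 sigma^2)).
    if rng.random() < math.exp(-sse / (2.0 * sigma ** 2)):
        accepted.append(k_trial)

# Accepted trials cluster around k_true; their histogram is the posterior.
estimate = sum(accepted) / len(accepted)
print(len(accepted), round(estimate, 2))
```

The low acceptance fraction here (well under 1% of trials) also illustrates the closing remark above: in higher dimensions the acceptance rate collapses and very large trial counts become necessary.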


Even though we do not have a good reason to assume any parameter is Cartesian, in the past such an assumption has been made very frequently, in most cases without reasoning. In the preceding section, we have seen that, in the absence of any other information, there are good reasons to believe that a parameter is a Jeffreys quantity, provided, of course, the parameter is positive.

With the exception of skin factor, reservoir parameters are positive quantities, and, therefore, they can be considered as Jeffreys quantities. In addition, the distribution of permeability (k) in nature is known to be log-normal.

In the preceding section, we showed that in the limit as the spread parameter σ → ∞, the posterior probability should approach the homogeneous probability distribution. From Eq. 16, the log-normal distribution approaches the homogeneous distribution of a Jeffreys quantity. Consequently, permeability should be a Jeffreys quantity.

Skin factor, for many practical cases, ranges from −8 to 25. Therefore, we can first transform s to obtain a positive intermediate quantity:

Y = (s + 8)/(25 − s). . . . . . . . . . . (20)

Note that the new variable Y has the property 0 < Y < ∞ and it can be assigned to be a Jeffreys quantity. Similarly, for storativity ratio ω, the intermediate transformation Z = ω/(1 − ω) has a range 0 < Z < ∞, given that 0 < ω < 1. We assume all other parameters, without using any intermediate transformation, to be Jeffreys quantities.

In the preceding sections, we showed that we need to take the logarithm of a Jeffreys quantity to make it Cartesian. Table 1 lists some of the common reservoir parameters used in well testing and the Cartesian transformations we propose for them. We should note that these transformations are not unique, and it may be possible to find other transformations that might work better in certain cases. Nevertheless, in the experience of our study, this set of transformations performed substantially better than the untransformed case in most practical well-test cases.

TABLE 1—CARTESIAN-TRANSFORM PAIRS FOR COMMON RESERVOIR PARAMETERS

Parameter Name                Original Parameter    Cartesian (Transformed) Parameter
Permeability                  k                     k̄ = log(k)
Wellbore storage              C                     C̄ = log(C)
Skin factor                   s                     s̄ = log[(s + 8)/(25 − s)]
Storativity ratio             ω                     ω̄ = log[ω/(1 − ω)]
Transmissivity ratio          λ                     λ̄ = log(λ)
Initial reservoir pressure    pi                    p̄i = log(pi)
Distance to boundary          re                    r̄e = log(re)

In the previous section, we discussed that Monte Carlo simulations can be used to verify that a parameter set is Cartesian. To verify the transformations proposed in Table 1 using Monte Carlo simulations, we generated a true data set using the dual-porosity (Warren and Root 1963) and pseudosteady-state models with ktrue = 100 md, Ctrue = 0.01 STB/psi, strue = 5, ωtrue = 0.1, λtrue = 1×10⁻⁷, re,true = 5,000 ft, and pini,true = 3,000 psi. Other relevant parameters were chosen to be qB = 1,000 RB/D, μ = 1 cp, rw = 0.35 ft, φ = 0.1, ct = 1×10⁻⁶ psi⁻¹, and h = 300 ft. Fig. 1 shows the resulting drawdown data, which consist of 100 data points sampled logarithmically between t = 5×10⁻⁴ hours and t = 100 hours.

In the Monte Carlo analysis, for each of the parameters, we picked random values uniformly distributed in the transformed parameter space within the ranges k̄ = k̄true ± 3, C̄ = C̄true ± 3, s̄ = s̄true ± 3, λ̄ = λ̄true ± 4, ω̄ = ω̄true ± 4, r̄e = r̄e,true ± 1, and p̄ini = p̄ini,true ± 0.5. Using this randomized parameter set, we calculated the pressure values of the trial data set. Using the threshold function in Eq. 18, we decided whether we want to keep that particular set of values. As discussed earlier, the distribution of the accepted values corresponds to the posterior distribution of the parameter. Fig. 2 shows the histograms of accepted values of transformed and untransformed parameters for a Monte Carlo simulation with 1,268,000 trial values, of which 1,041 were accepted.

In Fig. 2, we see that none of the original untransformed parameters shows uniform or normal distribution. Consequently, there is good reason to implement a transformation. On the other hand, all five of the transformed parameters (x-axis terms with overbars) show normal-like distribution. Fig. 3 shows the joint distributions of some of the transformed-parameter pairs. Again, the transformed parameters have normal-like distributions. Consequently, the Monte Carlo simulations verify the validity of the set of transformations we proposed. However, as noted before, the transformations we proposed are by no means unique. It may be possible to find better-performing Cartesian transformations for a variety of data sets.

Performance Improvement Using Parameter Transformation

In the commonly applied Gauss-Marquardt algorithm (or other similar derivative-based Newton methods), at each iteration, we make a quadratic approximation to the objective function at the current point. The next iteration is the minimum of that approximate quadratic function. Consequently, the more quadratic-like the true objective function is, the faster convergence will be. Without transformation, the function will have quadratic behavior only in the close vicinity of the function minimum. Fig. 4 shows the contour plots of the objective function for transformed and untransformed variables. We see that the parameter transforms proposed in the preceding section (Table 1) make the function more quadratic-like for a wider range of values around the function minimum. Not only does being quadratic-like improve the performance by reducing the number of iterations to converge, but also, with transformed parameters, it is possible to choose from a wider range of starting guesses and still find the minimum successfully. This latter concept will be discussed next.

Using transformed parameters, the objective function becomes more quadratic-like, and, therefore, Newton-type derivative-based approaches result in fast convergence. To test the transform pairs we proposed in the preceding section, we generated a synthetic data set using k = 163 md, C = 0.015 STB/psi, and s = 7. We started the iterations from 1,000 different uniformly distributed points within the parameter ranges 1 < k < 1×10⁴ md, 1×10⁻⁶ < C < 10 STB/psi, and −7.8 < s < 25. We used the Gauss-Marquardt (Marquardt 1963) algorithm with line search to seek a minimum of the objective function. Fig. 5 shows the convergence map for those 1,000 points. A three-parameter reservoir model (permeability, storage, and skin) was used for the test.
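The transform pairs in Table 1 are simple to implement. The sketch below (our own illustration; the function names are ours) takes the logarithm of each Jeffreys quantity, using the intermediate variable Y of Eq. 20 for skin and Z = ω/(1 − ω) for the storativity ratio, and checks that each transform inverts cleanly:

```python
import math

# Cartesian transforms from Table 1: the logarithm of each (possibly intermediate)
# Jeffreys quantity.
def to_cartesian(name, value):
    if name == "s":                # skin factor, -8 < s < 25: Eq. 20 then log
        return math.log((value + 8.0) / (25.0 - value))
    if name == "omega":            # storativity ratio, 0 < omega < 1: Z then log
        return math.log(value / (1.0 - value))
    return math.log(value)         # k, C, lambda, p_i, r_e: positive Jeffreys quantities

def from_cartesian(name, v):
    if name == "s":
        y = math.exp(v)            # invert s_bar = log((s + 8)/(25 - s))
        return (25.0 * y - 8.0) / (1.0 + y)
    if name == "omega":
        z = math.exp(v)
        return z / (1.0 + z)
    return math.exp(v)

# Round-trip check on the true values used in the Monte Carlo verification above.
params = {"k": 100.0, "C": 0.01, "s": 5.0, "omega": 0.1,
          "lambda": 1e-7, "r_e": 5000.0, "p_ini": 3000.0}
for name, value in params.items():
    v = to_cartesian(name, value)
    assert abs(from_cartesian(name, v) - value) < 1e-9 * max(1.0, abs(value))
print({n: round(to_cartesian(n, x), 3) for n, x in params.items()})
```

A regression code would run its iterations entirely in the barred variables and map back with `from_cartesian` only when reporting results.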


We observed that the algorithm converges (within 20 iterations) for a much wider range of initial guesses when transformed parameters were used. Fig. 6 shows the number of iterations required to reach the correct answer for the same test. As expected, the number of iterations increases as the starting guesses are farther away from the true answer. Using transformed parameters, it is possible to find the optimum point in fewer iterations.

Fig. 1—The synthetic drawdown data used in the Monte Carlo simulations in Figs. 2 and 3. (Log-log plot of pressure, psi, vs. time, hours.)

Data-Space Transformation in Nonlinear Regression

In the first part of the paper, we described inverse-problem solution as a change in our state of knowledge. We discussed the underlying basis and necessity of making transformations on the parameter space for optimal nonlinear-regression performance. In addition to parameter-space transformations, it is also possible to make transformations on the data space. In the second part of the paper, we will discuss how the wavelet transform can be used to organize the information content of the data and, consequently, how the data can be manipulated in desirable ways to improve the parameter estimation.

The Wavelet Transform. The wavelet transform is an expression of a function in one of many possible basis sets providing a very effective organization of the information contained in the pressure transient, which is especially useful for measuring the time evolution of frequency components. The wavelet transform decomposes signals over a basis that is composed of scaled and translated versions of a single mother function called a wavelet. A wavelet is a normalized function with zero mean. Hence, for a wavelet ψ,

∫_{−∞}^{+∞} ψ(t) dt = 0 . . . . . . . . . . . (21)

and

‖ψ‖² = ∫_{−∞}^{+∞} |ψ(t)|² dt = 1. . . . . . . . . . . (22)

Fig. 2—Histograms of untransformed and transformed parameters showing posterior probability distributions through Monte Carlo simulations. (Panels show k, λ, C, re, s, pini, and ω, each untransformed and transformed.)
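Eqs. 21 and 22 can be verified numerically for a concrete wavelet. The sketch below (ours; the Haar wavelet is an arbitrary simple choice, as the text has not committed to a particular mother wavelet) applies midpoint-rule quadrature over the wavelet's support:

```python
def haar(t):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0

# Midpoint-rule quadrature of Eqs. 21 and 22 over [0, 1], the Haar support.
n = 100_000
dt = 1.0 / n
mean = sum(haar((i + 0.5) * dt) * dt for i in range(n))        # Eq. 21: should be 0
norm2 = sum(haar((i + 0.5) * dt) ** 2 * dt for i in range(n))  # Eq. 22: should be 1
print(mean, norm2)
```

Any scaled and translated family member built from this mother wavelet per the discrete formulas that follow inherits the same zero-mean and unit-norm properties.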


Fig. 3—2D scatter plots of untransformed and transformed parameters. Uniform and normal-like distributions for transformed parameters verify that the transformation is Cartesian. Transformed variables are indicated by symbols with overbars.

For discrete signals, a family of wavelets can be obtained from a mother wavelet ψ as follows:

ψ_{j,n}(t_i) = (1/2^{j/2}) ψ[(t_i − 2^j n)/2^j]. . . . . . . . . . . . (23)

In Eq. 23, j is the scaling parameter, n is the dilation parameter, and i = 1, …, N is the time index. The forward and inverse wavelet transforms, respectively, are

Wf_{j,n} = Σ_{i=1}^{N} f(t_i) (1/2^{j/2}) ψ[(t_i − 2^j n)/2^j] . . . . . . . . . . . . (24)

and

f(t_i) = Σ_{j,n} Wf_{j,n} ψ_{j,n}(t_i). . . . . . . . . . . . (25)
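As an illustration (not code from the paper), the transform pair in Eqs. 24 and 25 can be sketched in a few lines. The Haar wavelet is used here for brevity, whereas the analyses in the paper use Daubechies wavelets with four vanishing moments; the function names are ours.

```python
import math

def haar_step(f):
    """One level of the orthonormal Haar transform: approximations and details."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(f[2 * i] + f[2 * i + 1]) * s for i in range(len(f) // 2)]
    detail = [(f[2 * i] - f[2 * i + 1]) * s for i in range(len(f) // 2)]
    return approx, detail

def haar_forward(f):
    """Forward transform (Eq. 24): cascade haar_step down to one coefficient."""
    coeffs, approx = [], list(f)
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        coeffs = detail + coeffs          # finer scales stored to the right
    return approx + coeffs

def haar_inverse(c):
    """Inverse transform (Eq. 25): rebuild the signal scale by scale."""
    s = 1.0 / math.sqrt(2.0)
    approx, k = c[:1], 1
    while k < len(c):
        detail, nxt = c[k:2 * k], []
        for a, d in zip(approx, detail):
            nxt.extend([(a + d) * s, (a - d) * s])
        approx, k = nxt, 2 * k
    return approx

# Round trip on a length-8 record; orthonormality (Eq. 22) preserves energy.
f = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
c = haar_forward(f)
assert all(abs(a - b) < 1e-9 for a, b in zip(haar_inverse(c), f))
assert abs(sum(x * x for x in c) - sum(x * x for x in f)) < 1e-9
```

The same cascade structure applies to any orthonormal wavelet family; only the filter pair changes.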

Fig. 4—Contour plots of the objective function for transformed and untransformed parameters (C vs. k panels). The white asterisk shows the function minimum. The ranges for the parameters are the same. For the transformed case, the objective function can be approximated better with a quadratic function.
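The effect illustrated in Fig. 4 can be reproduced with a toy one-dimensional objective of our own construction (not from the paper): for a Jeffreys-type parameter k with true value 200 md, the objective E(k) = (ln k − ln 200)² is exactly quadratic in the transformed variable u = ln k, so a single Newton step lands on the minimum, whereas the same step taken in k itself falls well short.

```python
import math

def newton_1d(grad, hess, x0, n_steps):
    """Plain Newton iteration in one variable."""
    x = x0
    for _ in range(n_steps):
        x -= grad(x) / hess(x)
    return x

k_true = 200.0
c = math.log(k_true)

# Untransformed parameter k: E(k) = (ln k - ln k_true)^2 is far from quadratic.
g_k = lambda k: 2.0 * (math.log(k) - c) / k
h_k = lambda k: 2.0 * (1.0 - (math.log(k) - c)) / k ** 2

# Transformed parameter u = ln k: the same objective is exactly quadratic in u.
g_u = lambda u: 2.0 * (u - c)
h_u = lambda u: 2.0

k_untransformed = newton_1d(g_k, h_k, 5.0, 1)                     # still far off
k_transformed = math.exp(newton_1d(g_u, h_u, math.log(5.0), 1))   # lands on 200
```

This mirrors why a least-squares step, which relies on a local quadratic model of the objective, behaves better after a transformation that makes the objective closer to quadratic.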



Fig. 5—Scatter plot of starting guesses (C, STB/psi vs. k, md; untransformed/Jeffreys and transformed/Cartesian panels). Red dots show successful results, and black dots show unsuccessful results. Success was counted as convergence within 20 iterations to within a tolerance of 2% of the true parameter value. With transformed parameters, the algorithm successfully converged 39.4% of the time, whereas the convergence rate was 17.5% when no transformation was used. Starting guesses that lead to successful convergence span the entire range of initial guesses for the transformed case, but only small starting guesses for k lead to success for the untransformed case.

The wavelet transform is a linear transform. Each linear operation can be expressed conveniently in terms of matrix operations. Eq. 24 can be written as

Wf = A_W f, . . . . . . . . . . . . (26)

where each row of A_W is a wavelet-basis function.
One can choose from a number of different wavelet bases to perform a wavelet transform. Daubechies wavelets (Daubechies 1992) have minimum size support for a given number of vanishing moments, allowing for representing discrete functions accurately with a small number of wavelet coefficients. In our analyses, we used Daubechies wavelets with four vanishing moments.
In the discrete wavelet transform (Eq. 26), in a strict sense, the function f must be evenly spaced in time. Pressure gauges in some systems record pressure at fixed time intervals, but, in some systems, pressure is recorded logarithmically in time or recorded only when there is sufficient change in its value. Using even sampling is important for wavelet-based techniques that eventually rely on reconstructing the signal. In our technique, as we will discuss, we use a substantially reduced basis, which is far from providing a sensible reconstruction. The reduced basis contains only a few wavelets selected on the basis of their magnitude or their sensitivities with respect to reservoir parameters. Hence, for the applications presented here, uniform sampling is not necessary. Uniform sampling is important in applications requiring a reconstruction of the signal (e.g., following noise removal). In our analyses, we did not use any interpolation and assumed that the existing sampling in time provided the optimal representation of changes in the pressure transient.

Nonlinear Regression in the Wavelet Domain. Least-squares analysis aims to minimize the objective function in Eq. 27.

E(θ) = Σ_{i=1}^{N} [p_i − p_calc(t_i, θ)]², . . . . . . . . . . . . (27)

where p_i are the pressure data, t_i are the corresponding time data, θ is the vector of reservoir parameters, and p_calc(t, θ) is the model pressure function. As also described by Awotunde and Horne (2008), the basic idea here is to replace p_i and p_calc(t, θ) with their wavelet transforms and carry out least-squares regression in the transformed domain:

E_W(θ) = Σ_{k=1}^{N} (Wp_k − Wp_calc,k)², . . . . . . . . . . . . (28)

where E_W(θ) is the new objective function in the wavelet domain. If we express Eq. 28 as a matrix operation,

E_W(θ) = (Wp − Wp_calc)^T (Wp − Wp_calc), . . . . . . . . . . . . (29)

using Eq. 26 in Eq. 29, we obtain

E_W(θ) = (p − p_calc)^T A_W^T A_W (p − p_calc), . . . . . . . . . . . . (30)

where A_W is a matrix with rows consisting of the orthonormal wavelet basis functions. Because we use an orthonormal basis, there are as many wavelets as the number of data points; hence, A_W is a full-rank, N×N matrix. A_W^T A_W gives the identity matrix, making Eq. 30 the same objective function as in Eq. 27, hence yielding exactly the same performance as regular least squares in nonlinear regression.
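The equivalence of Eqs. 27 and 30 for a full orthonormal basis, and the fact that a reduced basis changes the objective, can be checked numerically with a small sketch (ours, with the Haar wavelet standing in for the Daubechies basis and made-up residual values):

```python
import math

def haar(f):
    """Orthonormal Haar DWT (a stand-in for the Daubechies basis in the paper)."""
    s = 1.0 / math.sqrt(2.0)
    out, approx = [], list(f)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) * s for a, b in pairs]
        approx = [(a + b) * s for a, b in pairs]
        out = detail + out
    return approx + out

# Residuals p_i - p_calc(t_i) at 8 sample times (illustrative numbers).
r = [0.3, -0.1, 0.2, 0.05, -0.4, 0.1, 0.0, 0.25]

E_time = sum(x * x for x in r)           # Eq. 27, time domain
E_full = sum(x * x for x in haar(r))     # Eq. 30 with the full orthonormal basis
assert abs(E_time - E_full) < 1e-9       # A_W^T A_W = I, so the objectives agree

# A reduced basis (here the 3 largest-magnitude coefficients) changes the
# objective, which is exactly the freedom the wavelet strategies exploit.
E_reduced = sum(x * x for x in sorted(haar(r), key=abs, reverse=True)[:3])
assert E_reduced <= E_full
```

Dropping rows of A_W can only remove energy from the residual, so the reduced objective is a lower bound on the full one.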

Fig. 6—Scatter plot of number of iterations (C vs. k, untransformed and transformed panels). Average number of iterations is 12.1 for the untransformed case and 10.9 for the transformed case. Note also that there are twice as many points successfully converging in the transformed-parameter case.



When the pressure data are transformed, it is very likely that quite a few of the wavelet coefficients will be zero. The original pressure data can be reconstructed (by Eq. 25) with no loss using the nonzero components only. Hence, it is only necessary to include in the basis the wavelets giving nonzero coefficients. Furthermore, noise in the data can be eliminated by excluding the high-frequency wavelet components with coefficients below a certain threshold. With the exclusion of zero and noisy components, most of the rows of A_W will be removed; however, the objective function in Eq. 30 will still give the same error as Eq. 27.
The discussion here shows that a pressure signal with many data points can be represented exactly with a much smaller number of wavelets, and nonlinear regression can be conducted using wavelets, achieving the same performance as with regular least squares. This approach, compared to evaluating the model function for each time data point, can be used to reduce the computational load (without any practical change in regression) through development of efficient algorithms to calculate the overlap integral of the model function with the selected wavelets.
Our main motivation in this paper is different in that we exclude not only the zero and noisy wavelet components but also some of the significant wavelet components from the basis function. In our approach, the number of wavelets included in the reduced basis is typically on the order of the number of estimated parameters, which would not be sufficient for a sensible reconstruction of the original pressure signal. As a consequence, the error function in Eq. 30 will be different from that of Eq. 27, giving an opportunity for performance improvement of nonlinear regression in addition to the data-compression and noise-removal benefits discussed. We will show that even a very small number of wavelets can be sufficient to find the best fit to the data. The performance of nonlinear regression in the wavelet domain depends directly on the selection of the reduced wavelet basis, which will be discussed in the next section.

Wavelet-Based Nonlinear-Regression Strategies
In this section, we discuss four different strategies that we proposed to control the reduced wavelet basis set used in nonlinear regression. Note that these strategies are only examples demonstrating the capabilities of the wavelet analysis. On the basis of the requirements of a particular problem, other strategies could be designed. The wavelets in the reduced basis can be chosen with respect to the magnitudes of the wavelet coefficients or the sensitivities of the wavelet coefficients with respect to the reservoir parameters. In Strategies 1 through 3, the reduced basis was chosen on the basis of magnitudes only; in Strategy 4, the reduced basis was chosen on the basis of sensitivities. Real-field data (Bourdet et al. 1983) were used for the demonstration of Strategies 1 through 3, and a synthetic data set was used for Strategy 4.

Strategy 1: Constant Number of Wavelets. In the first strategy, the reduced wavelet basis consists of a fixed number of wavelets that are selected on the basis of their magnitudes. This strategy improves the convergence properties of nonlinear regression. Least-squares regression makes a paraboloidal approximation to the objective function at each iteration, the minimum of which becomes the next iterate. Usually, in the close vicinity of an optimal point, the function shows paraboloidal (quadratic) behavior, ensuring convergence for good starting guesses. However, highly nonlinear functions (e.g., the pressure transient) can quickly deviate from paraboloidal behavior for points away from the optimal point. As a result, poor starting guesses can lead to a local minimum that does not give a good match between the data and the model function. Expressing the objective function in a reduced wavelet basis simplifies the objective function, extending the paraboloidal behavior, avoids local minima, and helps the regression converge to the global minimum. For success of Strategy 1, it is important to keep the number of wavelets as small as possible [i.e., equal to or slightly larger than the number of parameters (up to ~np+3)].

Strategy 2: Coarse Start, Detailed Finish. Strategy 1 can help with convergence, but using an approximate objective function can limit the accuracy of final results. Strategy 2 consists of two steps. We first start with a minimal basis similar to Strategy 1. Once convergence (or a predetermined number of iterations) is achieved, the second step continues from that point with a larger wavelet basis. Consequently, the first step provides a better starting guess and the second step provides an accurate answer.

Strategy 3: Add More Detail at Every Iteration. Strategy 2 can be extended further to increase the amount of detail at every iteration. In Strategy 3, we start with a minimal wavelet basis, as in Strategy 2. At each iteration (or at every specified number of iterations), one more wavelet is added to the basis. Hence, complexity of the objective function is increased as the regression algorithm progresses toward the minimum.
Fig. 7 shows a comparison of Strategies 1 through 3 on real well-test data, which is a classic example of a damaged-well response. The data were reported by Bourdet et al. (1983) and also analyzed by Horne (1995, Test 22, page 249). We used the Gauss-Marquardt (Marquardt 1963) nonlinear-regression algorithm with line search. For reasonable starting guesses, the regular least-squares approach converged to the global minimum. In the

Fig. 7—Pressure-transient fit (a) and progression of the objective function (b). The number of wavelets used is shown in dashed curves. Regular least-squares regression (LS) follows more of a steepest-descent path, ending up in a local minimum. The wavelet strategies converge to the global minimum, resulting in a good fit. Inset table of converged estimates:

              Regular    St. 1     St. 2     St. 3
k (md)        15.12      11.45     11.45     11.46
C (STB/psi)   0.0016     0.0091    0.0091    0.0091
s             −2.0       7.78      7.79      7.79
pi (psi)      6997       3877      3877      3877
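The way Strategies 1 through 3 grow the reduced basis can be summarized in a small schedule function. This is our own sketch; the sizes follow the settings reported later for the 20-data-set study (np+3 for Strategy 1, np+1 until the sixth iteration then np+3 for Strategy 2, and growth from three wavelets up to np+3 for Strategy 3), and the helper names are not from the paper.

```python
def basis_schedule(n_params, n_iters, strategy):
    """Number of wavelets in the reduced basis at each iteration,
    for the magnitude-based Strategies 1 through 3."""
    if strategy == 1:                 # fixed basis size throughout
        return [n_params + 3] * n_iters
    if strategy == 2:                 # coarse start, detailed finish
        return [n_params + 1 if i < 6 else n_params + 3 for i in range(n_iters)]
    if strategy == 3:                 # add one wavelet per iteration
        return [min(3 + i, n_params + 3) for i in range(n_iters)]
    raise ValueError("strategy must be 1, 2, or 3")

def reduced_basis(coeffs, m):
    """Indices of the m largest-magnitude wavelet coefficients."""
    return sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]),
                  reverse=True)[:m]
```

At iteration i of the regression, the objective would then be evaluated on `reduced_basis(wavelet_coeffs, schedule[i])` only.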



              k (md)   C (STB/psi)   s      ω      λ         re (ft)
True Value    200      0.0150        3.50   0.20   5.00e−7   2000
Least Sq.     281      0.0164        7.45   0.27   1.52e−8   2087
Strategy 4    188      0.0147        2.85   0.16   5.24e−7   2149

Fig. 8—(a) Synthetic data and fits using the least-squares and the wavelet approaches (Strategy 4). Strategy 4 gives a better match between the data and the model curve. (b) Wavelet sensitivities for each parameter. The three wavelets in Group 1 were used in the first three iterations. Subsequently, wavelets in Group 2 (wavelets sensitive to re) also were added to the basis. The colors show the relative magnitudes for the quantities in each column such that dark red corresponds to the maximum value and dark blue corresponds to zero in each column.
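The sensitivity-based wavelet selection that Strategy 4 applies (illustrated in Fig. 8b) can be sketched with finite differences. The `wavelet_model` callable below, which returns the wavelet coefficients of the model pressure curve for a given parameter vector, is our assumed interface for illustration, not an API from the paper.

```python
def sensitivity_basis(wavelet_model, theta, param_subset, h=1e-4):
    """For each parameter index in param_subset, pick the index of the
    wavelet coefficient most sensitive to that parameter, using
    one-sided finite-difference sensitivities."""
    base = wavelet_model(theta)
    chosen = set()
    for j in param_subset:
        bumped = list(theta)
        step = h * max(abs(theta[j]), 1.0)
        bumped[j] += step
        pert = wavelet_model(bumped)
        sens = [abs((a - b) / step) for a, b in zip(pert, base)]
        chosen.add(max(range(len(sens)), key=sens.__getitem__))
    return sorted(chosen)

# Toy model: coefficient 0 responds to parameter 0, coefficient 1 to parameter 1.
toy = lambda th: [th[0], 2.0 * th[1], 0.1 * th[0]]
group1 = sensitivity_basis(toy, [1.0, 1.0], [0, 1])
```

Calling this first with a subset that excludes re, and later with the full parameter set, reproduces the Group 1/Group 2 staging described for Strategy 4.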

regressions shown in Fig. 6, to demonstrate the capabilities of wavelets, we intentionally picked a poor starting guess (k = 100 md, C = 0.5 STB/psi, s = 0, and pi = 7,500 psi). All three wavelet strategies converged to the global minimum at k = 11.45 md, C = 9.1×10⁻³ STB/psi, s = 7.78, and pi = 3,877 psi, whereas regular least squares converged to a distant local minimum. A reasonable starting guess would lead regular least squares to the same global minimum.
In Strategy 1, the number of wavelets was chosen to be four (i.e., equal to the number of parameters). With five or more wavelets, Strategy 1 followed the exact same path as the regular least squares, giving a wrong answer. Using fewer wavelets than the number of parameters renders the regression underdetermined and so is not applicable to Strategy 1. However, in Strategies 2 and 3, we started with three wavelets (an underdetermined case) to avoid undesired local minima. In Strategy 2, the number of wavelets was increased to 10, starting with the fifth iteration, after which the regression quickly converged to the global minimum. In Strategy 3, the wavelets were increased one by one at each iteration up to 10 wavelets, after which the number of wavelets was kept constant. Strategy 3 followed a steepest-descent path between Iterations 3 and 10 but then converged to the global minimum. The analysis shows that the performance of regression can be improved by controlling the amount of information taken into account (i.e., the number of wavelets) as the regression progresses.
Choosing the wavelet basis on the basis of the magnitudes of wavelets, as in Strategies 1 through 3, has a potential drawback. When there is an abrupt change in the pressure transient, the wavelet coefficients in the vicinity of the steep change tend to take large values. Using large-magnitude wavelet coefficients corresponding to unrealistic abrupt changes (e.g., outliers) can reduce the performance of nonlinear regression. This problem can be avoided by first detecting and then eliminating outliers using wavelet-based techniques (Athichanagorn et al. 2002).

Strategy 4: Estimation on the Basis of Sensitivity Coefficients. The last strategy demonstrates how wavelet-sensitivity coefficients can be used to select a reduced wavelet basis. The basic idea in Strategy 4 is to limit the degrees of freedom initially until a better guess is achieved. The degrees of freedom can be limited by using a basis that includes wavelets sensitive to a subset of the reservoir parameters only. After a few iterations, a better guess is obtained and wavelets sensitive to other parameters can be added to the basis to converge to the global minimum. Strategy 4 helps improve the stability of nonlinear regression, especially for reservoir models with many parameters.
To test Strategy 4, we generated a synthetic-data set using the dual-porosity model (Warren and Root 1963) with boundary effects (pseudosteady state). Dual-porosity models are known to be difficult to match reliably, hence our choice of this model as a test. The true values for the parameters were k = 200 md, C = 0.015 STB/psi, s = 3.5, ω = 0.2, λ = 5×10⁻⁷, and re = 2,000 ft. For this synthetic-data set, the time data were sampled logarithmically between t = 0.01 hours and t = 100 hours. Fig. 8a shows the synthesized data, and Fig. 8b shows the magnitudes of the wavelet coefficients and the wavelet sensitivities. In Fig. 8b, the leftmost column shows the relative magnitudes of the 12 largest wavelets. The magnitudes of the remaining wavelets were significantly lower than the ones shown here. The wavelets were numbered from 1 (top, largest) to 12 (bottom, smallest). The columns to the right show the wavelet sensitivities of the wavelets for each parameter, calculated at the starting guess of k = 1,000 md, C = 0.1 STB/psi, s = 1, ω = 0.1, λ = 1×10⁻⁶, and re = 1,500 ft. A good fit could not be found with regular least-squares analysis, as shown with the blue curve in Fig. 8a. In particular, the dual-porosity parameters ω and λ deviated significantly from the true answer.
We applied Strategy 4 in the following way: In the first three iterations, we used only the three wavelets in Group 1. This set represents the wavelets most sensitive to each of the parameters, except for re. Leaving re out reduced the complexity of the problem initially. In the first three iterations, we obtained better guesses for k, C, and s. Fig. 9 shows how the parameters changed with the progress of nonlinear regression. Starting with the fourth iteration, we also included the wavelets in Group 2, which are the wavelets most sensitive to changes in re. As seen in Fig. 8a, there is a good match between the model curve and the data. Strategy 4 improved the stability of nonlinear regression and allowed for accurate estimation of reservoir parameters for this synthetic-data set.

Discussion. In this section, we described several different examples capitalizing on the multiresolution property of wavelet analysis. Wavelet transformation provides a suitable platform to control the amount of detail included in the analysis at any moment. We have seen that the stability of the nonlinear regression can be improved greatly by choosing appropriate strategies. As mentioned, what we have included here is only a small subset of possible wavelet-based techniques. Many other suitable strategies could be designed on the basis of the data set being analyzed. We should also note that,



Fig. 9—The change in the parameter values during nonlinear regression using Strategy 4. The wavelet basis included the three wavelets in Group 1 up to Iteration 3 (shown with vertical dotted line). In the subsequent iterations, an additional three wavelets from Group 2 were also added. Using Strategy 4, k, C, and s were refined first and then the remaining parameters were handled.

because a reduced basis set is used, noise elimination is achieved automatically in all of the strategies discussed in this section.

Performance Analysis of Wavelet-Based Nonlinear Regression
Noise Performance. One of the main advantages of conducting nonlinear regression in the wavelet domain is automatic noise elimination. In the previous sections, we have seen that a reduced wavelet basis offers several advantages in terms of stability of nonlinear regression and accuracy of the estimated parameters. In reduced-basis sets, wavelets corresponding to high-frequency components in data are usually excluded because they do not carry critical information. Noise is a zero-mean random component added to each of the data points. In most cases, the noise in each data point is independent from that in the others, making it a high-frequency component. Elimination of high-frequency wavelet components from the basis, therefore, eliminates noise. We should note, however, that the least-squares estimator is a maximum-likelihood estimator for a signal with normally distributed noise. Consequently, if the noise is only in the pressure signal, using the wavelet transform is not expected to improve the confidence intervals.
In the example shown in Fig. 10, we considered a scenario where both time data and pressure data are noisy [see Dastan and Horne (2009) for more details on the effects of time noise]. The figure shows parameter estimates as a function of noise for a synthetic-data set for which the true answers of the parameters are known. For each noise amplitude, 200 realizations were generated and 95% confidence intervals were calculated using these realizations. There are two important aspects in the figure: the agreement with the true result and the width of the confidence intervals. The red and blue dashed lines show the estimated parameter values and should be compared to the black dotted line (the true value). The shaded regions represent the confidence intervals (in the vertical direction). It is desired to have narrower confidence intervals. Fig. 10 reveals that the widths of the confidence intervals are generally comparable for the transformed and untransformed cases. However, the regular least-squares-analysis estimates deviate from the true result with increasing noise, especially for the wellbore-storage estimate, which can be attributed largely to the fact that noise in time affects the early-time data more severely. On the other hand, parameter estimates found using the wavelet transform are much closer to the true answer.

Statistics of Initial Guess. We have discussed that wavelet-transformed nonlinear regression can improve the stability of nonlinear regression. We showed that, using an appropriate strategy, one can increase the chance of achieving a successful convergence to the optimal point in the parameter space. Fig. 11 provides a more complete picture of the statistics of initial guesses for the same data set used with Strategies 1 through 3 (Horne 1995, Test 22, page 249). We have used 1,000 different starting guesses uniformly distributed (Fang 1998) on the parameter space of four variables k, C, s, and pi. Fig. 11 provides a space map of the successful results on the k–C plane. The wavelet-transform approach increases the chance of success to a large extent. Out of the 1,000 uniformly distributed starting guesses, 196 were successful using regular least squares and 258 were successful using Strategy 3. Note that the range of initial guesses was chosen to be artificially large to demonstrate the difference. In practice, usually a better range for possible answers is known, increasing the chance of convergence quite substantially. In Fig. 11, we also observe that the successful starting guesses for the least-squares case lie on a narrow band, resulting from cross correlation between permeability and storage. On the other hand, the successful results for the wavelet case span a much larger space on the k–C plane. Consequently, the wavelet technique is less prone to cross correlation between parameters because permeability and storage affect different portions of the pressure transient, which are easily isolated by the wavelet transform.

Application of Parameter- and Data-Space-Transform Techniques to the Analysis of 20 Data Sets. To test the procedures more extensively, we applied the parameter- and data-transformation techniques discussed in this paper to analyze 20 sets of pressure-transient data, 17 of which were from real well tests (Horne 1995). Incidentally, these data sets were studied earlier also, using different nonlinear-regression techniques (Dastan and Horne 2009).
The 20 tests analyzed were chosen to include ambiguous interpretations, meaning multiple sets of parameters and multiple reservoir models giving equally good fits. There are two pairs of data sets (Data Sets 2 and 10 and Data Sets 3 and 8) for which



Fig. 10—Noise performance comparison of wavelet analysis (red) to regular least-squares analysis (blue) for the k, C, and s estimates. The noise in both the time data and the pressure data is normally distributed. The horizontal axis shows the standard deviation of the noise. The vertical axis shows the estimate (red and blue dashed lines) and the confidence intervals (shaded regions).
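The Monte Carlo confidence-interval procedure used for Fig. 10 (200 noisy realizations per noise amplitude, 95% intervals) can be sketched as follows. The pluggable `estimate` callable stands in for a full nonlinear regression; the toy data and use of the sample mean are our illustration only.

```python
import random
import statistics

def monte_carlo_ci(estimate, data, noise_sd, n_real=200, level=0.95, seed=0):
    """Confidence interval by re-estimating a parameter on noisy
    realizations of the data and taking empirical percentiles."""
    rng = random.Random(seed)
    draws = sorted(
        estimate([d + rng.gauss(0.0, noise_sd) for d in data])
        for _ in range(n_real)
    )
    lo_i = int((1.0 - level) / 2.0 * n_real)
    hi_i = int((1.0 + level) / 2.0 * n_real) - 1
    return draws[lo_i], draws[hi_i]

# Toy use: "estimating" a constant pressure level from 100 noisy samples.
lo, hi = monte_carlo_ci(statistics.mean, [5.0] * 100, noise_sd=1.0)
```

Replacing `statistics.mean` with the wavelet-domain or time-domain regression gives the two sets of shaded bands compared in Fig. 10.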

two different model functions were used to fit the same pressure data, in an effort to analyze the effects of ambiguity. We considered several reservoir models including dual-porosity reservoirs (Data Sets 1, 8, 10, 15, and 20), reservoirs with rectangular and pseudosteady-state boundaries (Data Sets 9 and 17), cyclic production tests (Data Set 16), acidized wells (Data Set 18), hydraulically fractured wells (Data Set 19), falloff tests (Data Set 11), and transition data analysis (Data Set 5). Data Sets 1, 16, and 17 were synthetically generated, and the rest were real well-test data. Specific references to the sources of the data sets can be found in Dastan and Horne (2009).
We considered regular least squares and all four of the wavelet strategies discussed in this paper.
• Strategy 1: We used np+3 wavelets, where np is the number of parameters used in the fit function.
• Strategy 2: We used np+1 wavelets in the first step (up to the sixth iteration) and np+3 wavelets in the second step.
• Strategy 3: We started with three wavelets and incremented until reaching np+3 wavelets.
• Strategy 4: Wavelets most sensitive to k and s were used in the first three iterations. From the fourth iteration through the sixth iteration, wavelets most sensitive to the remaining parameters (excluding k and s) were used. After the sixth iteration, we continued with np+3 wavelets chosen by magnitude, hence representing all the parameters in the basis.
We analyzed each strategy using both untransformed (Jeffreys) and transformed (Cartesian) parameters, making a total of 10 cases for each data set. A Monte Carlo analysis with 100 realizations was used to calculate 95% confidence intervals by adding to the original data realizations of normally distributed noise with 1-psi standard deviation. We have developed a graphical method to summarize the results for all the data sets. The graphs compare the strategies in terms of the confidence intervals and the shifts in the mean values of the parameters.
Fig. 12 shows the results for Data Set 1 and can be used as a guide to understanding Figs. 13 and 14, which give the full results for all 20 data sets. The upper row shows the results for no parameter transformation (Jeffreys parameters), and the lower row shows

Fig. 11—Parameter-space map of starting guesses that successfully converged to the optimal point for regular-least-squares (Regular LS) and wavelet (Strategy 3) approaches. There are four parameters—k, C, s, and pi—but only the k–C plane is shown, for brevity. The color corresponds to the number of iterations it takes to reach the optimal point. Of the 1,000 starting guesses, the wavelet-transform approach was successful for 258 while the regular-least-squares case was successful for 196.
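The starting-guess experiment behind Fig. 11 can be sketched as below. Plain pseudorandom uniform sampling is used here, whereas the paper cites Fang (1998) for the uniformly distributed design, and the parameter ranges shown are illustrative, not the ones used in the study.

```python
import random

def sample_starting_guesses(bounds, n, seed=0):
    """n starting guesses spread uniformly over a parameter box,
    one (lo, hi) pair per parameter."""
    rng = random.Random(seed)
    return [[rng.uniform(lo, hi) for (lo, hi) in bounds] for _ in range(n)]

# Box spanning k (md), C (STB/psi), s, and pi (psi); illustrative ranges.
box = [(0.1, 1000.0), (1e-6, 1.0), (-5.0, 40.0), (2000.0, 8000.0)]
guesses = sample_starting_guesses(box, 1000)
```

Each guess would then be fed to the regression, and the iteration count (or failure) recorded to build the convergence map.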



[Fig. 12 annotations: arrows denote confidence intervals wider than twice the acceptable limit (> ±20% for k and C, > ±40% for ω and λ, > ±2 for s); rows of symbols are k, C, s, ω, and λ; columns are LS, St. 1, St. 2, St. 3, and St. 4; the center dashed line references the mean of the least-squares (Jeffreys) case, and the outer dashed lines show the reference width of "acceptable" confidence intervals.]
Fig. 12—Analysis results for Data Set 1. Mean value of the parameters and the confidence intervals are shown with respect to regular least squares with no parameter transformation. The upper row shows the results for the untransformed case, and the lower row shows the results for Cartesian transformations. The horizontal dashed line in the center provides a reference to the mean values of the untransformed, regular least-squares case. The other two dashed lines show the limits for acceptable confidence intervals for parameters. Note that the vertical-axis scaling is different for each parameter.

the results for Cartesian-transformed regression. The columns show the results for regular least squares, Strategy 1, Strategy 2, Strategy 3, and Strategy 4. There are three horizontal dotted lines in each row to help with the comparison of confidence intervals and shifts in mean values. The regular least-squares case with no parameter transformation was used as the reference for the other cases. The dotted line in the center shows the reference for mean values. The vertical shift of the center of a colored rectangle shows how much the mean deviated with respect to the reference case. The height of each rectangle shows the confidence intervals. The vertical scale can be inferred from the upper and lower dotted lines, which show the acceptable confidence intervals. The acceptable limits were assumed to be ±20% for the dual-porosity parameters (ω and λ), ±1 for the skin factor, and ±10% for all other parameters (Horne 1995). For visual convenience, the vertical scale for each parameter was set such that the acceptable confidence intervals align. Hence, for example, a rectangle that fits exactly between the upper and lower dotted lines would correspond to a ±10% confidence interval for wellbore storage, ±20% confidence interval for storativity, and a confidence interval of ±1 (in absolute value) for the skin factor. The arrow points at the ends of bars are used to denote the cases for which the confidence intervals were too wide to be displayed on the graph (i.e., the confidence interval for the corresponding parameter was wider than twice the acceptable limit for that parameter). The white triangle in the center of a rectangle shows that the mean deviated so much that the rectangle would actually fall outside the graph limits (i.e., the mean shifted more than ±20% for ω and λ, more than ±2 for s, and more than ±10% for the rest of the parameters).
Figs. 13 and 14 summarize the results of the analysis for all 20 data sets. We see that the wavelet transformation provides an advantage over least squares when the reservoir description is relatively complex, such as dual-porosity reservoirs (Data Sets 1, 8, 10, 15, and 20) and reservoirs with boundaries (Data Sets 9 and 17). For relatively simple reservoir descriptions, wavelet-transform analysis and regular least squares showed generally comparable performance. Among the different strategies used for data transformation, Strategy 4 can yield narrow confidence intervals, especially when the number of parameters is large.
Examination of the p-vs.-t graphs for the parameter estimates reveals that both least squares and wavelet transforms result in a

(The very narrow confidence intervals for the least-squares cases are because of the regression failing to move forward from the starting guess.) In Data Set 15, both wavelet strategies provided a good fit, whereas the regular least-squares approach failed. In Data Set 17, both the regular least squares and Strategy 3 provide a visually good fit, whereas Strategy 1 failed.
The objective function used in nonlinear regression is identical for transformed and untransformed parameters. Hence, the global and local minima are in identical positions in the parameter space. Only the gradient and Hessian are different, resulting in a different path in nonlinear regression. In a comparison of untransformed and transformed cases in Figs. 12 and 13, it is possible to observe wider confidence intervals for transformed parameters in some of the data sets. It should be noted that wider confidence intervals do not necessarily mean that the parameter transformation did not work well. On the contrary, wider confidence intervals show that there were indeed multiple plausible solutions that the untransformed case missed. Data Sets 2, 3, 5, 8, 9, and 10 are short of reaching radial-flow behavior, which creates ambiguity in the estimation. Indeed, Data Sets 2 and 10 are the same pressure data, fit using different model functions. Similarly, Data Sets 3 and 8 are the same data sets fit using different model functions. For such ambiguous data, having smaller confidence intervals does not necessarily mean the fit is more reliable and a good visual match does not guarantee a good answer. In these data sets, because there are insufficient data in the infinite-acting part, in many cases, one can choose many different values of permeability and match to a corresponding value of skin, resulting in large confidence intervals in estimation. Relative performance of the various techniques was analyzed further in Dastan and Horne (2010).

Conclusions
In this work, we analyzed parameter-space and data-space transformations during nonlinear regression in well-test interpretation. We showed that Cartesian transformations in the parameter space and wavelet transformation in the data space improve the performance of nonlinear regression significantly.
We proved the necessity of parameter transformations for reservoir parameters. We showed that most physical parameters are Jeffreys quantities, whereas nonlinear regression works best
good fit for all cases except Data Sets 12, 15, 17, and 19. In Data with Cartesian parameters. For the first time, we proposed suitable
sets 12 and 19, both least-squares and wavelet approaches failed. transformation pairs for commonly used reservoir parameters. We

September 2011 SPE Journal 709


[Fig. 13—Analysis of Data Sets 1 through 10. Fig. 12 can be used as a guide to understand this graph. Each panel (one per data set) shows the confidence-interval bars for k, C, and s, plus the model-specific parameters where present (e.g., ω and λ for the dual-porosity cases, pi, re), under least squares (LS) and Strategies 1 through 4.]

We verified the validity of the transformations using a Monte-Carlo-based threshold algorithm. We have shown that the probability distributions of transformed parameters usually exhibit a normal distribution, a strong indication of being Cartesian.
For data transformations, we conducted nonlinear regression on the wavelet transform of the pressure signal and obtained significant performance improvement. We showed that the wavelet transform is not only useful for data reduction and noise elimination but is also suitable for performance improvement. The performance improvement through the wavelet transform is related directly to the choice of the reduced wavelet basis. By appropriate choice of the wavelets to be included in the reduced basis, we achieved direct control over the nonlinear regression. We have proposed a number of strategies to control the information contained in the wavelet basis and have shown examples of how these strategies can be used effectively. We have also discussed the statistics of initial guesses and the noise performance of the wavelet-transform-based nonlinear regression. Finally, we compared the performance of some of the wavelet-based strategies on some 20 real and synthetic data sets.

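The data-space transformation can likewise be sketched in a few lines. This is a simplified stand-in for the paper's approach, assuming a Haar wavelet and a hypothetical semilog drawdown model (the paper's basis choices and selection strategies are more elaborate): both the observed signal and the model response are transformed, only the largest-amplitude coefficients are kept as the reduced basis, and the regression is run on those coefficients.

```python
import numpy as np
from scipy.optimize import least_squares

def haar_transform(x):
    """Full Haar wavelet decomposition of a length-2^n signal (orthonormal)."""
    x = np.asarray(x, dtype=float)
    out = []
    while len(x) > 1:
        s = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # smooth (scaling) part
        d = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # detail (wavelet) part
        out.append(d)
        x = s
    out.append(x)
    return np.concatenate(out[::-1])

def model(params, t):
    """Hypothetical drawdown model p(t) = a*log(t) + b (illustration only)."""
    a, b = params
    return a * np.log(t) + b

# Synthetic noisy data on a dyadic grid (Haar needs a power-of-two length)
t = np.linspace(1.0, 100.0, 64)
rng = np.random.default_rng(0)
p_obs = model([5.0, 20.0], t) + rng.normal(0.0, 0.5, t.size)

# Reduced basis: keep the largest-amplitude wavelet coefficients (a simple
# stand-in for the paper's amplitude-based selection strategies)
w_obs = haar_transform(p_obs)
keep = np.argsort(np.abs(w_obs))[-16:]          # 16 of 64 coefficients

def residuals(params):
    # Regression in the wavelet domain, restricted to the reduced basis
    return haar_transform(model(params, t))[keep] - w_obs[keep]

fit = least_squares(residuals, x0=[1.0, 1.0])
print(fit.x)   # expected to land near [5, 20]
```

Because the transform is orthonormal, discarding small coefficients removes mostly noise, so the fit uses a quarter of the data while still recovering the model parameters.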


[Fig. 14—Analysis of Data Sets 11 through 20. Fig. 12 can be used as a guide to understand this graph. Each panel (one per data set) shows the confidence-interval bars for k, C, and s, plus the model-specific parameters where present (ω and λ for the dual-porosity cases; the rectangular-boundary distances dE, dN, dS, and dW for Data Set 17), under least squares (LS) and Strategies 1 through 4.]

Our conclusions can be summarized as follows:
• Regression using Cartesian variables yields a more quadratic-like objective function, and the regression can be completed in fewer iterations.
• Cartesian transformation increases the probability of convergence substantially when a good starting guess is not available.
• Using Cartesian transformations is especially useful for detecting ambiguous data sets with insufficient infinite-acting data. Transformed analysis finds possible fits over a larger volume, resulting in wide confidence intervals for ambiguous data (as should be the case for uncertain problems).
• Wavelets provide appropriate reweighting of data and improve the stability and performance of nonlinear regression.
• Wavelets provide automatic noise elimination and outlier removal.
• A reduced wavelet basis provides effective reduction of data points in the analysis without having to delete actual data points.
• The reduced wavelet basis should be selected on the basis of wavelet amplitudes and sensitivity coefficients. Selection on the basis of sensitivity coefficients (Strategy 4) allows for decoupling the regression on the basis of parameters and, hence, improves the performance.



• Representing the objective function in the reduced wavelet basis provides improved stability by simplifying the objective function when a good initial guess is not available. The study of the statistics of initial guesses reveals that the wavelet approach is more robust against poor starting guesses.

Nomenclature
 AW = wavelet basis matrix
 C = wellbore storage
 dcirc = distance to circular boundary
 dE = distance to rectangular boundary in the east
 dN = distance to rectangular boundary in the north
 dS = distance to rectangular boundary in the south
 dW = distance to rectangular boundary in the west
 E = objective function of regular least squares
 EW = objective function of wavelet-transformed least squares
 k = permeability
 np = number of estimation parameters
 p = pressure
 pi = initial pressure at t = 0
 re = distance to reservoir boundary
 s = skin factor
 t = time
 Wf = wavelet coefficients
 α = vector of model parameters
 λ = transmissivity ratio
 σ = standard deviation of homogeneous probability distribution
 φ = porosity
 ψ = wavelet basis function
 ω = storativity ratio

Acknowledgments
The authors are indebted to the late Albert Tarantola for his extensive help and guidance in the first half of the paper related to parameter transformations. The support of the members of the Stanford University Petroleum Research Institute-D Consortium on Innovation in Reservoir Testing is gratefully acknowledged.

References
Athichanagorn, S., Horne, R.N., and Kikani, J. 2002. Processing and Interpretation of Long-Term Data Acquired From Permanent Pressure Gauges. SPE Res Eval & Eng 5 (5): 384–391. SPE-80287-PA. doi: 10.2118/80287-PA.
Awotunde, A.A. and Horne, R.N. 2008. A Multiresolution Analysis of the Relationship Between Spatial Distribution of Reservoir Parameters and Time Distribution of Data Measurements. Paper SPE 115795 presented at the SPE Annual Technical Conference and Exhibition, Denver, 21–24 September. doi: 10.2118/115795-MS.
Bourdet, D.L., Whittle, T.M., Douglas, A.A., and Pirard, Y.M. 1983. A New Set of Type Curves Simplifies Well Test Analysis. World Oil 196 (6): 95–106.
Carvalho, R.S., Redner, R.A., Thompson, L.G., and Reynolds, A.C. 1992. Robust Procedures for Parameter Estimation by Automated Type-Curve Matching. Paper SPE 24732 presented at the SPE Annual Technical Conference and Exhibition, Washington, DC, 4–7 October. doi: 10.2118/24732-MS.
Dastan, A. and Horne, R.N. 2009. Significant Improvement in the Accuracy of Pressure-Transient Analysis Using Total Least Squares. SPE Res Eval & Eng 13 (4): 614–625. SPE-125099-PA. doi: 10.2118/125099-PA.
Dastan, A. and Horne, R.N. 2010. A New Look at Nonlinear Regression in Well Test Interpretation. Paper SPE 135606 presented at the SPE Annual Technical Conference and Exhibition, Florence, Italy, 19–22 September. doi: 10.2118/135606-MS.
Daubechies, I. 1992. Ten Lectures on Wavelets, No. 61, CBMS-NSF Regional Conference Series in Applied Mathematics. Philadelphia, Pennsylvania: Society for Industrial and Applied Mathematics (SIAM).
Fang, K.T. 1998. Theory and Applications of the Uniform Design. Experimental Design: Theory and Application. Presented at the Symposium on Statistical Theory of Experimental Designs, Oberwolfach, Germany, 14–15 November.
Gill, P.E., Murray, W., and Wright, M.H. 1981. Practical Optimization. New York: Academic Press.
Horne, R.N. 1995. Modern Well Test Analysis: A Computer-Aided Approach, second edition. Palo Alto, California: Petroway, Inc.
Jaynes, E.T. 1968. Prior Probabilities. IEEE Trans. Systems Sci. Cybernetics 4 (3): 227–241. doi: 10.1109/TSSC.1968.300117.
Jeffreys, H. 1939. Theory of Probability. Oxford, UK: Oxford University Press.
Jeffreys, H. 1957. Scientific Inference, second edition. Cambridge, UK: Cambridge University Press.
Kikani, J. and He, M. 1998. Multiresolution Analysis of Long-Term Pressure Transient Data Using Wavelet Methods. Paper SPE 48966 presented at the SPE Annual Technical Conference and Exhibition, New Orleans, 27–30 September. doi: 10.2118/48966-MS.
Lu, P. and Horne, R.N. 2000. A Multiresolution Approach to Reservoir Parameter Estimation Using Wavelet Analysis. Paper SPE 62985 presented at the SPE Annual Technical Conference and Exhibition, Dallas, 1–4 October. doi: 10.2118/62985-MS.
Mallat, S. 1999. A Wavelet Tour of Signal Processing, second edition. San Diego, California: Academic Press.
Marquardt, D.W. 1963. An Algorithm for Least-Squares Estimation of Nonlinear Parameters. SIAM J. Appl. Math. 11 (2): 431–441. doi: 10.1137/0111030.
Mathisen, T., Lee, S.H., and Datta-Gupta, A. 2003. Improved Permeability Estimates in Carbonate Reservoirs Using Electrofacies Characterization: A Case Study of the North Robertson Unit, West Texas. SPE Res Eval & Eng 6 (3): 176–184. SPE-84920-PA. doi: 10.2118/84920-PA.
Sahni, I. and Horne, R.N. 2005. Multiresolution Wavelet Analysis for Improved Reservoir Description. SPE Res Eval & Eng 8 (1): 53–69. SPE-87820-PA. doi: 10.2118/87820-PA.
Tarantola, A. 2005. Inverse Problem Theory and Methods for Model Parameter Estimation. Philadelphia, Pennsylvania: SIAM.
Warren, J.E. and Root, P.J. 1963. The Behavior of Naturally Fractured Reservoirs. SPE J. 3 (3): 245–255; Trans., AIME, 228. SPE-426-PA. doi: 10.2118/426-PA.
Wong, P.M., Bruce, A.G., and Gedeon, T.D. 2002. Confidence Bounds of Petrophysical Predictions From Conventional Neural Networks. IEEE Trans. Geoscience and Remote Sensing 40 (6): 1440–1444. doi: 10.1109/TGRS.2002.800278.

Aysegul Dastan is a petroleum engineer in the Dynamic Reservoir Characterization Team at Chevron Energy Technology Company in Houston. Her work experience and research areas cover novel well-test interpretation methods, numerical optimization and parameter estimation techniques, and using analytical and numerical reservoir models for production data analysis. Dastan holds MS and PhD degrees from Stanford University and a BS degree from the Middle East Technical University, Ankara, Turkey. Roland N. Horne is the Thomas Davies Barrow Professor of Earth Sciences at Stanford University and was the chairman of Petroleum Engineering from 1995 to 2006. He holds BE, PhD, and DSc degrees from the University of Auckland, New Zealand, all in engineering science. Horne has been an SPE Distinguished Lecturer and has been awarded the SPE Distinguished Achievement Award for Petroleum Engineering Faculty, the Lester C. Uren Award, and the John Franklin Carl Award. Horne is a member of the U.S. National Academy of Engineering and is an SPE Honorary Member.

