This paper was prepared for presentation at the 2009 SPE Latin American & Caribbean Petroleum Engineering Conference, Cartagena, Colombia, 31 May - 3 June 2009
This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents of the paper have not been
reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect any position of the Society of Petroleum Engineers, its
officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written consent of the Society of Petroleum Engineers is prohibited. Permission to
reproduce in print is restricted to an abstract of not more than 300 words; illustrations may not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.
Abstract
This paper discusses a new workflow to stochastically estimate the performance of infill locations in a mature oil or gas
field. Performance evaluations for infill wells are usually conducted using either highly generalized statistical methods or
numerical simulation. Both approaches have a significant drawback: the former is quick but very often lacks accuracy,
whereas the latter is very accurate but usually very complex in setup and computation.
The presented workflow is a new approach to infill well performance prediction that combines speed and reasonable
accuracy. The workflow generates a set of key performance indicators of existing wells derived from historic dynamic data
(fluid production rates, pressures, etc.), static data (reservoir properties, etc.) and predicted data (simplified production
forecasts). The wells are then grouped according to the similarity of their KPIs. The production profiles of the wells within the
same group are combined into a type curve that is described by the most likely production profile and an associated uncertainty
range.
A data-driven expert system is used to identify and capture the correlations of the parameters such as geographic locations,
well spacing, reservoir properties and the group membership (equivalent to type curve). This expert system can then be applied
to any location in the field in order to determine the most likely group membership of a potential infill well. The classification
of an infill well to a group is hereby not necessarily unique; the expert system might classify an infill well into several groups
and assign a probability of occurrence for each of the groups. A Monte Carlo routine is then applied to forecast the
performance of the infill locations honoring the respective probability of occurrence of each type curve.
The presented approach has been successfully applied for infill well selection in a statistical field development study for
YPF in the Argentinean San Jorge Basin.
Introduction
Despite the long production history with more than 20 000 wells, the current methodology to make decisions still takes a
long and costly path consisting of testing, plugging and stimulation procedures. The authors believe that, in parallel to the
reservoir modeling studies, YPF can make use of the immense array of well data and the wisdom acquired over one century of
production history, to produce a set of innovative practices to boost the efficiency of their current operations.
A statistical approach based on emerging computing tools for data mining can therefore assist YPF in
recognizing patterns and developing methodologies that are robust enough to tackle the aforementioned technical challenges,
transforming these opportunities into real bottom-line results.
The goal of the study is to generate sound technical arguments to formulate an innovative strategy to accelerate the
exploitation of oil and gas assets of YPF in the San Jorge Basin, Argentina. The workflows of interest are to
• identify candidates for infill drilling locations,
• propose a field development strategy based on lessons learned from the past in order to know the size of the
business from an economic point of view, and
• identify benefits through optimizing infill locations using data mining methodology.
The objective of this paper is to present the methodology and the results obtained during the analysis phase as a
demonstration of the value this approach promises for field development.
2 SPE 122186
Workflow overview
The objective of the workflow is to select the best infill locations in the study area using an integrated data mining
workflow. The main focus is to systematically investigate past production performance and use identified trends to predict the
future performance of the existing wells and of infill locations.
A dataset of a study area was provided. The dataset contained mainly information about oil, gas and water production as
well as the water injection volumes. Very little petrophysical information was available, and it showed no representative areal
trend. The presented workflow takes advantage of the available production data and systematically investigates past
production performance using statistical indicators as well as significant key performance indicators (KPIs). The predicted
future performance is determined with a simplified well forecast based on a hyperbolic decline curve, leading to a set of
KPIs related to future performance. These two sets of KPIs are combined in an expert system to reach quick decisions about the
future development potential in a certain area of the field. The result of this workflow is a list of infill locations and their
predicted performance. As a final step in this workflow, a comprehensive reasoning logic scores each infill location according
to its performance, taking into account the expert knowledge of the engineers who operate the field.
[Figure 1 plot: histogram; x-axis: Category (1 to 10), y-axis: Frequency (0 to 0.18)]
Figure 1: Histogram of initial oil rates in a certain area of the field. Categories represent oil rates, where category 1
represents lowest oil rates and category 10 the highest.
Figure 2 shows a map of the study area. The color represents the time interval in which a well came on-stream.
The size of the bubbles represents the initial oil rate (the bigger the bubble, the higher the oil rate). Singularities such
as extremely good wells in regions where performance is rather poor can be identified, and further analysis of their production
can be initiated.
[Figure 2 plot: bubble map; axes: X-coordinate, Y-coordinate]
Figure 2: Geographic bubble map depicting the time interval (color) and the initial oil rates (size)
Classification of wells
As mentioned before, the performance of each well is described by two sets of KPIs: the historic KPIs calculated from
historic production information and the future KPIs calculated from predicted performance. The predicted performance is
determined using decline curve analysis.
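The hyperbolic decline relation behind such a forecast (Arps 1945) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function names and the parameter values in the usage lines are hypothetical, and any consistent unit system (e.g. rate per year with time in years) is assumed.

```python
import math

def arps_rate(qi, di, b, t):
    """Rate at time t for an Arps decline curve.

    qi: initial rate, di: initial nominal decline rate (1/time),
    b: hyperbolic exponent (0 = exponential, 1 = harmonic).
    """
    if b == 0:
        return qi * math.exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)

def arps_cumulative(qi, di, b, t):
    """Cumulative production from 0 to t (analytical integral of arps_rate)."""
    if b == 0:
        return qi / di * (1.0 - math.exp(-di * t))
    if b == 1:
        return qi / di * math.log(1.0 + di * t)
    return qi / ((1.0 - b) * di) * (1.0 - (1.0 + b * di * t) ** (1.0 - 1.0 / b))

# Hypothetical well: the values below are for illustration only
rate_after_2 = arps_rate(qi=100.0, di=0.5, b=0.5, t=2.0)
cum_after_2 = arps_cumulative(qi=100.0, di=0.5, b=0.5, t=2.0)
```

Forecast-related KPIs such as the hyperbolic exponent or a cumulative volume over a fixed horizon could be derived from fitted values of `qi`, `di` and `b`.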
Combined, both KPI sets give multiple parameters to describe the performance of every well. After a statistical analysis, a
subset of 9 KPIs was identified that has the biggest influence on the classification of each well. A multidimensional
clustering algorithm, the self-organizing map (SOM), is used to sort the data according to their features.
The 15 clusters detected in this analysis therefore represent groups of similar wells, which means that the
SOM has reduced the amount of data to be handled from more than 700 individual wells to 15 groups of similarly behaving
wells. Each of these 15 groups is clearly defined by particular features that vary significantly from one group to the other (e.g.
significantly different initial oil rates, initial water cut, etc.) but vary only slightly among the wells within a particular
group. Figure 3 shows the result of the clustering process in a cross-plot. The plot shows the hyperbolic exponent from the
decline curve analysis (predicted KPI) vs. the initial oil rate (historic KPI) for four different clusters. The color of the points
represents the cluster membership. As can be seen, the clustering has grouped the wells into bins of similar measurements.
Figure 3: Cross-plot of a predicted KPI vs. a historic KPI; color = cluster membership
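To illustrate how a SOM groups wells by their KPI vectors, the following is a minimal sketch of a one-dimensional self-organizing map in plain Python. It is not the software used in the study: the map topology, learning-rate and neighbourhood schedules are assumptions, the two-KPI sample data are hypothetical, and in practice the KPIs would be normalized to comparable scales first.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to sample x."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))

def train_som(data, n_nodes, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM and return the list of node weight vectors."""
    rng = random.Random(seed)
    # Start every node on a distinct, randomly chosen sample
    weights = [list(p) for p in rng.sample(data, n_nodes)]
    sigma0 = max(n_nodes / 2.0, 1.0)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)               # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 0.01  # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Nodes close to the winner on the map are pulled along with it
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights

# Hypothetical, already-scaled KPI pairs for eight wells
kpis = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1),
        (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.1, 5.1)]
nodes = train_som(kpis, n_nodes=2)
cluster_of_well = [best_matching_unit(nodes, x) for x in kpis]
```

Each well is assigned to the node it matches best, so the node index plays the role of the cluster (group) membership described above.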
This feature generalization imposes a stochastic approach to manage the whole population of wells in a particular group. In
contrast to a conventional approach, where each well is processed individually with its respective deterministic parameters, the
generalized technique implies that the parameters for each well are defined stochastically through the group to which this
particular well belongs.
This requires a statistical analysis of the population of each group. The distribution of the well parameters within each
group was checked for outliers, confidence intervals and consistency, and the mean and standard deviation were determined
(Figure 4). From this point in the workflow onwards, the individual well data are no longer considered; only the
probability distribution parameters of the group to which the well belongs are used in the remaining steps.
[Figure 4 plot; axis label: Hyperbolic Decline Exponent]
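The per-group statistics described above can be sketched as follows. The paper does not state which outlier test was used; the z-score threshold here is an assumption, and the sample values are hypothetical.

```python
import statistics

def group_summary(values, z_max=3.0):
    """Mean, sample standard deviation and z-score outliers of one parameter
    (e.g. the hyperbolic decline exponent) within one group of wells."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    outliers = [v for v in values if sd > 0 and abs(v - mu) > z_max * sd]
    return {"mean": mu, "std": sd, "outliers": outliers}

# Hypothetical decline exponents of the wells in one group
summary = group_summary([0.45, 0.50, 0.55, 0.48, 0.52])
```

Only the resulting distribution parameters (`mean`, `std`) would then be carried forward for the group, in line with the stochastic treatment described in the text.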
A technique is needed to describe how location, the time when a well was drilled (here referred to as
‘vintage interval’) and the initial spacing relate to the cluster membership. A Bayesian Network is used for this task.
Formally, a Bayesian Network is a probabilistic model that represents a set of variables and their probabilistic
interdependencies. The interdependencies can either be entered by an expert (as in various expert or troubleshooting
systems) or be inferred and quantified by a learning algorithm from a provided training data set of input and output
parameters. In the discussed workflow the latter approach is used.
Figure 5: Quantification of areal type curve probability using a Bayesian Network approach
As can be seen in Figure 5, a Bayesian Network is initially set up with four input parameters: x-coordinate, y-coordinate,
spacing and vintage interval. The output parameter is the type curve distribution. The training process investigates and
quantifies the causal dependencies between input parameters and type curve cluster. The training algorithm modifies the
conditional probability tables of the given Bayesian Network according to the observations in the field. The Bayesian Network
is then able to quantify the probability that a well at a certain location, drilled at a certain time, will perform according to a
certain type curve (e.g. in the depicted example the investigated location has a 43.3 % chance of performing like a
cluster 4 well, a 22.4 % chance of being a cluster 8 well, etc.). With the type curve selector in place, it is possible to infer the
most likely characteristics of a new well drilled at any location in the reservoir.
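How a conditional probability table can be learned from field observations may be sketched with simple frequency counting. This is a deliberately simplified stand-in for the trained network: it assumes the four inputs have already been discretized into bins, treats all of them as direct parents of the cluster node, and applies Laplace smoothing; the bin labels and records are hypothetical.

```python
from collections import Counter, defaultdict

def learn_cpt(records, n_clusters, alpha=1.0):
    """Estimate P(cluster | parent states) from observed wells.

    records: iterable of (parent_states, cluster_id), where parent_states is a
    tuple such as (x_bin, y_bin, spacing_bin, vintage_bin).
    alpha: Laplace smoothing so that unseen combinations stay defined.
    """
    counts = defaultdict(Counter)
    for parents, cluster in records:
        counts[parents][cluster] += 1

    def p_cluster(parents):
        c = counts.get(parents, Counter())
        total = sum(c.values()) + alpha * n_clusters
        return [(c[k] + alpha) / total for k in range(n_clusters)]

    return p_cluster

# Hypothetical training wells: (x_bin, y_bin, spacing_bin, vintage_bin) -> cluster
wells = [(("east", "north", "tight", "early"), 0),
         (("east", "north", "tight", "early"), 0),
         (("east", "north", "tight", "early"), 1)]
type_curve_probability = learn_cpt(wells, n_clusters=3)
probs = type_curve_probability(("east", "north", "tight", "early"))
```

The returned list plays the role of the type curve distribution at a location: one probability per cluster, summing to one.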
Model Validation
Once the clustering algorithm and the cluster quantification process are set up, a certain number of wells are picked to
calibrate and blind test the process. These wells have not been part of the model setup. In the model validation step the
production performance at their locations is predicted using the full forecasting workflow. The stochastic results for the
production performance (especially the P30, P50 and P70 lines for oil production rates and cumulative oil production) are
plotted and compared against actual historical values. In case of a general under- or over-performing trend, the model has to be
reviewed. This could result in a different peer grouping of the wells or an adapted forecasting method.
Figure 6: Model validation. Left plots: oil rate, actual (green) vs. predicted P30, P50, P70 (blue lines); right plots:
cumulative oil production, actual (green) vs. predicted P30, P50, P70 (blue lines)
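The stochastic forecast behind such P30, P50 and P70 lines can be sketched as a Monte Carlo loop that first draws a type curve according to its probability and then draws a profile from that curve's uncertainty range. Several points are assumptions rather than the authors' method: uncertainty is modeled as a normal distribution around the mean profile, one random draw scales the whole profile, and P30/P50/P70 are taken as plain 30th/50th/70th percentiles. All names are hypothetical.

```python
import random

def monte_carlo_forecast(type_curves, probabilities, n_trials=5000, seed=1):
    """type_curves: {cluster: (mean_profile, std_profile)}, profiles as lists.
    probabilities: {cluster: probability of occurrence at this location}.
    Returns percentile profiles keyed "P30", "P50", "P70".
    """
    rng = random.Random(seed)
    clusters = list(probabilities)
    weights = [probabilities[c] for c in clusters]
    n_steps = len(next(iter(type_curves.values()))[0])
    samples = [[] for _ in range(n_steps)]
    for _ in range(n_trials):
        # Pick a type curve in proportion to its probability of occurrence
        c = rng.choices(clusters, weights=weights)[0]
        mean, std = type_curves[c]
        z = rng.gauss(0.0, 1.0)  # one draw per trial keeps the profile shape
        for t in range(n_steps):
            samples[t].append(max(0.0, mean[t] + z * std[t]))

    def percentile(values, q):
        s = sorted(values)
        return s[min(len(s) - 1, int(q * len(s)))]

    return {name: [percentile(samples[t], q) for t in range(n_steps)]
            for name, q in (("P30", 0.30), ("P50", 0.50), ("P70", 0.70))}
```

For a blind test, the returned profiles would be plotted against the actual rates of the held-out well, as in Figure 6.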
Expert system
The candidate screening is performed using an expert system based on a Bayesian Belief Network (BBN). The BBN can be
applied to describe and reconstruct a complex decision process involving multiple parameters under uncertainty. The various
input parameters in a BBN can either be conditionally independent (hence not having any influence on each other) or
conditionally dependent. In the latter case, prior knowledge of the dependency of the various parameters has to be quantified
and entered in the BBN. This is done either using a Bayesian learning approach or manually by an expert (Mitchell 1997,
Korb 2004).
The input parameters in a BBN can either be continuously measured variables or discrete variables. Each parameter can
also be entered stochastically, considering uncertainty in the measurements or reasoning process. The result of a BBN is a
decision, which again is represented stochastically considering the inherent uncertainty of the variable measurements and
decision process.
BBNs are applied in various disciplines such as troubleshooting of computer hardware problems, medical diagnosis,
speech recognition, credit fraud detection or spam mail filtering.
In this study a BBN was set up purely by experts to reproduce their reasoning process, considering the various aspects of
their decision such as economic, logistic and reservoir considerations (Neapolitan 2004). The outcome is a score between 0
and 100 that describes the well’s probability of being a good producer, 100 being the best.
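As a rough illustration of how expert-entered probabilities can be turned into a 0 to 100 score, the sketch below applies Bayes' rule to a very small network. It is not the BBN built in the study: it assumes that all pieces of evidence are conditionally independent given the outcome (a naive Bayes structure), and every node name and probability is hypothetical.

```python
def score_location(prior_good, likelihoods, evidence):
    """Posterior probability of a good producer, scaled to 0-100.

    likelihoods[node][state] = (P(state | good), P(state | not good)),
    as entered by the experts. evidence maps node -> observed state.
    """
    p_good = prior_good
    p_bad = 1.0 - prior_good
    for node, state in evidence.items():
        l_good, l_bad = likelihoods[node][state]
        p_good *= l_good
        p_bad *= l_bad
    return 100.0 * p_good / (p_good + p_bad)

# Hypothetical expert inputs
expert_tables = {
    "forecast_recovery": {"high": (0.8, 0.2), "low": (0.2, 0.8)},
    "access_logistics": {"easy": (0.6, 0.5), "hard": (0.4, 0.5)},
}
score = score_location(0.5, expert_tables,
                       {"forecast_recovery": "high", "access_logistics": "easy"})
```

A higher score means the evidence makes a good producer more likely under the experts' stated beliefs.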
Case Study
A case study was set up to validate the approach and to estimate the business impact of this forecasting routine. A
study area was picked to investigate how the field development would have been done differently if this forecasting model
had been available earlier. Three cases were investigated and compared:
1. “Base case”: the last seven wells drilled in the study area were identified. These seven wells were
left out of the model setup process. The forecasting model was set up and the performance of the seven wells
was predicted. The results of the forecasting workflow were cross-checked with the actual performance to validate the
applicability of the model (‘blind test’). Two of the blind test results are depicted in Figure 6. It was
agreed that the model is accurate enough to proceed.
2. “Same number of wells”: a virtual grid of infill locations was created. The locations fulfilled several requirements
(e.g. minimum spacing to neighbor wells, maximum distance from edge of the field, etc.). The forecasting workflow
was automated in order to predict the performance of every virtual location, and the reasoning system was then
applied to score every location. The top seven locations were selected as the picks of the forecasting workflow.
3. “More wells”: this scenario investigates a case where a significantly higher number of wells can be drilled (in this case
23 locations were investigated). Again, a virtual grid of wells was created and every location was forecasted and
scored according to the previously described workflow.
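The grid-and-rank step used in cases 2 and 3 can be sketched as below. Only the minimum-spacing requirement is implemented; the other requirements mentioned (such as distance from the field edge) are omitted, and the grid extent, well positions and score function are hypothetical placeholders for the forecasting and reasoning workflow.

```python
import math

def candidate_grid(x_min, x_max, y_min, y_max, step, wells, min_spacing):
    """Regular grid of virtual infill locations that respect a minimum
    distance to every existing well (wells: list of (x, y) tuples)."""
    candidates = []
    nx = int((x_max - x_min) / step) + 1
    ny = int((y_max - y_min) / step) + 1
    for i in range(nx):
        for j in range(ny):
            x, y = x_min + i * step, y_min + j * step
            if all(math.hypot(x - wx, y - wy) >= min_spacing for wx, wy in wells):
                candidates.append((x, y))
    return candidates

def top_locations(candidates, score, n):
    """The n highest-scoring candidates; score maps (x, y) -> number."""
    return sorted(candidates, key=score, reverse=True)[:n]

# Hypothetical field: one existing well in the middle of a 3 x 3 grid
grid = candidate_grid(0.0, 2.0, 0.0, 2.0, 1.0, wells=[(1.0, 1.0)], min_spacing=1.2)
picks = top_locations(grid, score=lambda p: p[0] + p[1], n=1)
```

In the study's terms, `n` would be seven for the “Same number of wells” case and 23 for the “More wells” case, with `score` supplied by the expert system.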
A comparison of the three scenarios can be seen in Figure 8. Even with the same number of wells, a
significantly higher cumulative production can be achieved in the first 10 years. A significant increase in the number of wells
drilled logically also leads to a significantly higher recovery. However, it is interesting to note that the performance per well
decreases after a certain number of wells, leaving each well with approx. 5 % less recovery in the “More wells” case
compared to the “Same number of wells” case.
Figure 8: Case comparison – cumulative oil production base case (purple), same wells (black) and more wells (orange)
Conclusions
A predictive tool to forecast the performance of infill wells was presented to YPF, showing that it can bring numerous
advantages and new opportunities to the asset:
• As an advisory expert system, this methodology integrates a degree of expertise that was not available to the asset
team before. The methodology can consistently handle large amounts of data and can be customized to the
standards of the asset team. The turnover of experienced personnel can also be mitigated with such an advisory tool
because it captures the knowledge of experts and the lessons learnt from previous decisions.
• Due to the streamlined workflow, the time to find a ranked list of infill drilling locations is reduced to
approximately three to four weeks. The conventional approach could take several weeks or even several months
longer. The current approach saves YPF a significant amount of resources (workforce and time).
• Since several hundred infill locations can be forecasted very quickly, YPF can now perform sensitivity studies to
optimize their infill drilling campaign in a short time period. YPF can thereby formulate the objective function for
the optimization process to maximize oil recovery or the NPV of the field development plan.
Acknowledgement
The authors would like to thank YPF and Schlumberger for the permission to publish the present paper.
References
Arps, J.J.: “Analysis of Decline Curves,” Trans. AIME (1945), 160, 228-247.
Ghoraishy, S.M., Liang, J.T., Green, D.W. and Liang, H.C.: “Application of Bayesian Networks for Predicting the Performance of
Gel-Treated Wells in the Arbuckle Formation, Kansas,” paper SPE 113401 prepared for presentation at the 2008 SPE/DOE Improved
Oil Recovery Symposium, Tulsa, Oklahoma, USA, 19-23 April 2008.
Jensen, F.: “Bayesian Networks and Decision Graphs,” Statistics for Engineering and Information Science, Springer, New York, NY,
USA, 2001.
Korb, K. and Nicholson, A.: “Bayesian Artificial Intelligence,” Chapman & Hall/CRC Press UK, London, United Kingdom, 2004.
Mitchell, T.: “Machine Learning,” International Edition, McGraw-Hill, Singapore, 1997.
Neapolitan, R.: “Learning Bayesian Networks,” Pearson Prentice Hall, Upper Saddle River, NJ, USA, 2004.
Zangl, G. and Hannerer, J.: “Data Mining – Applications in the Petroleum Industry,” Round Oak Publishing, Katy, TX, 2003.