Data-Based Smart Model for Real Time Liquid Loading Diagnostics in Marcellus
Shale via Machine Learning
Conference Paper · January 2018

DOI: 10.2118/189808-MS

SPE-189808-MS
Data-Based Smart Model for Real Time Liquid Loading Diagnostics in Marcellus Shale via Machine Learning
Amir Ansari, Ebrahim Fathi, Fatemeh Belyadi, Ali Takbiri-Borujeni, and Hoss Belyadi, Department of Petroleum and
Natural Gas Engineering, West Virginia University
Copyright 2018, Society of Petroleum Engineers
This paper was prepared for presentation at the SPE Canada Unconventional Resources Conference held in Calgary, Alberta, Canada, 13-14 March 2018.
This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents
of the paper have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect
any position of the Society of Petroleum Engineers, its officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written
consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may
not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.
Abstract
Liquid loading in horizontal gas wells impairs gas production and, if not diagnosed in a timely manner, can
kill the well. Liquid loading occurs when the gas production rate declines and the gas velocity drops below the
critical velocity required to carry liquid to the surface. Models developed for conventional reservoirs, such as
droplet, film, or transient multiphase flow models, are also applied with modifications in unconventional gas
reservoirs. However, none of these models has shown great success when applied to inclined and horizontal wells
in shale gas reservoirs, because they were developed for vertical wells and cannot
identify the correct multiphase flow regime in the inclined and horizontal sections of the well. It is also extremely
difficult to define the correct liquid droplet size and shape or liquid film thickness as the well trajectory changes.
Furthermore, these models cannot accurately predict the transition between annular and slug flow regimes
in horizontal wells. As more wells are produced in shale gas reservoirs, a large amount of information from
production control and monitoring becomes available, which can be used to build a data-based smart model
for real-time diagnostics of liquid loading in new wells. In this new approach, data (pressure, completion,
and production) from many wells in the same area, some of which have experienced liquid loading and some of
which have not, form the basis for developing the smart model.
What is proposed here is an approach that develops a data-based technology for
training neural networks that can be used as a smart model in real time to identify the start of liquid loading
in unconventional gas wells. The technique incorporates a fuzzy pattern recognition
algorithm and an unsupervised analysis technique to identify the most influential parameters affecting liquid
loading in unconventional gas wells. The main objective of this manuscript is to develop a smart model
that can predict the dynamics of the liquid-gas interface and identify the start of liquid loading. Finally, the
minimum gas velocity or rate needed to avoid liquid loading can be determined.
For this study, a Marcellus Shale reservoir is selected, and the production and completion histories of 160 wells
are collected. First, the study is performed on a single well, where 70% of the data is used for
neural network training, 15% for calibration, and 15% for validation of the model. The results
show that the smart model is able to predict the start of liquid loading in the well with high accuracy and to raise
a warning flag when the possibility of liquid loading is high. Next, a series of wells in the region is selected
and a smart model is built using the same split of 70% training, 15% calibration, and 15% validation. This model is
then used to predict liquid loading in a different well in the same region, treated as a completely blind well.
The results show high accuracy and reliability in predicting the start of liquid loading. To overcome the
implicit dependency of the model on the Turner et al. critical velocity criterion during training, an unsupervised
learning algorithm is used to predict the loading and unloading status of the wells. The technique showed
great success in predicting the well status, and its predictions were confirmed by field observations.
The new smart model developed for the Marcellus Shale suggests that this approach can be
applied in other areas where only a limited history of production and liquid loading exists.
Introduction
Unconventional gas reservoirs usually produce water and gas or gas condensate during production. As
reservoir pressure decreases with time due to gas production, the gas flow rate also declines. A critical gas
velocity exists for each well, below which liquid water or condensate cannot be carried to the surface.
Below this critical gas velocity, liquid accumulates at the bottom of the tubing because the gas flow rate
is insufficient to lift it. This accumulation of water, gas condensate, or both in the tubing, known as liquid
loading, leads to an increase in flowing bottom hole pressure and a decrease in gas production rate and, if not
corrected, kills the well. Liquids can enter the well
directly from the reservoir or condense from the gas in the wellbore when the pressure drops below the dew point
pressure of the wet or condensate gas reservoir. Several common signs can be used in the field to
identify liquid loading problems in a well. The easiest technique is surface monitoring, which looks for: (1)
high tubing/casing differential pressure; (2) high flowing bottom hole pressure; (3) observed slugging from
the well; (4) a rapid increase in decline rate; and (5) a fluctuating gas production rate.
In unconventional gas reservoirs, liquid loading is one of the major challenges and can lead to an early
decline in gas production rates. Horizontal well technology and hydraulic fracturing are commonly used to
enhance gas production in unconventional gas reservoirs. Liquid loading in an unconventional reservoir can
happen during flowback after hydraulic fracturing, or during production in the horizontal section of
the well (Belyadi et al., 2016). This can impair production performance before loading is identified in the
vertical section of the well, as discussed earlier, and has also been observed in conventional reservoirs (Jackson
et al., 2011).
In the oil and gas industry, several conventional techniques have been used to remediate wells with
liquid loading problems. These include:
I. Velocity string: Running smaller-diameter tubing, which increases gas velocity and
liquid lift capacity.
II. Soaping: Adding surfactant at the bottom of the tubing generates foam that helps remove water
buildup; however, this is not as effective for condensate loading. There are also limits on water
cut below which the technique might not be economically viable.
III. Venting: Dropping the surface pressure to atmospheric pressure to maximize gas velocity.
IV. Compression: Dropping the surface pressure below line pressure to increase gas velocity.
V. Downhole pumps.
VI. Plunger lift: Using a mechanical plunger to avoid liquid accumulation downhole. This is the least
expensive technique that can be used as a solution to liquid loading, but pressure and rate
limitations apply to it.
There has been great interest in real-time determination of the start of liquid loading in gas wells
to avoid production loss and to take the most practical action for liquid loading remediation. Turner
et al. (1969) investigated and introduced a minimum gas velocity required to prevent liquid loading. Turner
et al. suggested that a droplet movement model fits field data better than a liquid film movement
model, and derived the terminal slip velocity equation based on the terminal free settling velocity of liquid and
the maximum droplet diameter. Comparing with field data, Turner et al. found that the terminal slip velocity
equation underestimates the minimum gas velocity by 20%, owing to the assumptions of neglecting transport
velocity and multiphase flow pressure and to limitations in calculating the fluid density and pressures (Guo et
al., 2005). Later studies, such as Coleman et al. (1991) and Nosseir et al. (2000), tried to improve the
accuracy of the Turner et al. equation by modifying the correction factors and extending the equation to multiphase
flow. Brito et al. (2015) used a film reversal mechanism instead of the droplet movement model used by Turner
et al. and introduced a new critical velocity required to carry the liquid film surrounding the pipe wall.
However, none of the attempts so far could relax the major assumptions used by Turner et al. to develop
the terminal gas velocity equation. Guo et al. (2005) developed a mist-flow model for four-phase flow of gas, oil,
water, and solid particles and concluded that the 20% correction assumed by Turner et al. still underestimates the
minimum gas velocity required to avoid liquid loading. Transient multiphase flow simulations along with
video logging have been used in some cases, such as the Jean Marie formation in Canada. However, these
techniques also involve significant simplifying assumptions, such as dividing the reservoir inflow into five
discrete inflow points, and the video logging used for comparison is only qualitative.
As more wells are produced in unconventional gas reservoirs such as the Marcellus Shale, a significant amount
of information from production control and monitoring becomes available, which can be used to build a
data-based smart model for real-time diagnostics of liquid loading in new wells. In this paper, data (pressure,
completion, and production) from 160 wells in the same area, some of which have experienced liquid loading
and some of which have not, form the basis for developing the smart model. The main objective here is to
develop a smart model that can predict and identify the start of liquid loading in Marcellus gas reservoirs
through the application of machine learning.
Methodology
Machine learning is a process through which a computer learns from data to find a possible pattern in
the data set. This process has three main components: a learning algorithm, data, and a pattern in
the data. If these three components are present, a successful learning process can be achieved, depending on the
capability of the learning algorithm. There are two major types of machine learning: supervised learning
and unsupervised learning. In supervised learning, both input and output are available, and the learning
algorithm tries to find the relationship between them. The supervised learning algorithm
used in this article is the artificial neural network (ANN).
In unsupervised learning, there is no information about the output. The learning algorithm tries to find
the pattern inside the input data alone. The unsupervised learning algorithm used in this
article is k-means clustering.
Artificial Neural Network (ANN)

The idea of the ANN came from the neurons of the brain and the way they communicate with each other to
solve a problem. Each artificial neural network consists of an input layer, one or more hidden layers, and
an output layer, and each layer has a specific number of neurons (processing elements). The numbers of neurons
in the input and output layers are chosen
based on the nature of the problem being solved and the properties to be predicted. Figure 1
shows a typical ANN with three input neurons and two output neurons. To obtain a well-trained network, proper
parameters should be introduced to the network. If improper data are used to train the network, there is no
guarantee that the trained network will produce correct predictions.
Figure 1—Artificial Neural Network schematic
The number of hidden layers and the number of neurons in each hidden layer depend on the complexity of the
problem, the number of parameters, and the number of records. Experience also plays an important role in this
decision, but in general there is no firm rule for choosing them.
Objective function. Regardless of the learning method, each machine learning process needs an
optimization procedure that reduces the error as much as possible. A very common and
simple objective function in supervised learning is the summation of all the differences between the values
predicted by the learning method and the actual values of the output, as shown in equation 1.
(1)
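The body of equation 1 did not survive in this copy of the paper. One common objective that matches the verbal description is the sum of squared differences, given here only as an assumed form; the symbols $N$ (number of samples), $y_i$ (actual output), and $\hat{y}_i$ (predicted output) are introduced for illustration, and the original equation may instead use absolute differences or a mean rather than a sum.

```latex
E = \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^{2}
```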
During the learning process, the learning algorithm assigns different weights to each of the
connections between neurons in Figure 1 in such a way that the global error of the objective function is
minimized. A blind calibration is performed simultaneously to decide when to stop the learning process.
Neural Network setup. To prove the concept of using a data-based model as a predictive tool to detect liquid
loading, an artificial neural network is employed. Since the ANN is a supervised learning algorithm, each sample
in the dataset needs to be labeled as "Loaded" or "Not Loaded". To label the data, the Turner criterion for liquid
loading is used. It is important to mention that the Turner rate is used only as a checkpoint at this stage; we
show later that the model can be made independent of the Turner minimum gas velocity criterion for liquid loading
prediction. If the rate is above the Turner rate, there is no loading in the well, but if the rate drops below the
Turner rate, the well is loaded. A third class, "Alert", can also be added for samples where the rate is close
to the Turner rate but not lower than it. Three binary columns are added to the dataset to show the loading
status. If the well is not loaded, the value in column "Not Loading" becomes "1"; if it is loaded, the value in
column "Loading" becomes "1"; and the same applies to the "Alert" class, as shown in Table 1.
Table 1—Well status classification
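The labeling rule described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the column order, and in particular the `alert_margin` band (how close to the Turner rate counts as "Alert") are assumptions, since the paper does not state the threshold used.

```python
import numpy as np

# Assumed column order for the three binary status columns.
CLASSES = ("Not Loading", "Alert", "Loading")

def label_status(gas_rate, turner_rate, alert_margin=0.10):
    """Return a class index per sample: 0 = Not Loading, 1 = Alert, 2 = Loading.

    alert_margin is a hypothetical band: rates less than (1 + alert_margin)
    times the Turner rate, but not below the Turner rate, are flagged "Alert".
    """
    gas_rate = np.asarray(gas_rate, dtype=float)
    turner_rate = np.asarray(turner_rate, dtype=float)
    labels = np.zeros(gas_rate.shape, dtype=int)               # default: Not Loading
    labels[gas_rate < turner_rate * (1.0 + alert_margin)] = 1  # close to the Turner rate
    labels[gas_rate < turner_rate] = 2                         # below the Turner rate
    return labels

def one_hot(labels, n_classes=len(CLASSES)):
    """Expand class indices into binary columns, one column per class."""
    return np.eye(n_classes, dtype=int)[labels]

rates = np.array([1200.0, 1050.0, 900.0])   # illustrative gas rates
status_columns = one_hot(label_status(rates, turner_rate=1000.0))
```

Each row of `status_columns` contains a single 1 in the column of the assigned class, which is the target format the network is trained on.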

In this case, casing pressure, tubing pressure, and production rate (gas rate) are the inputs to the ANN, and
the binary columns are its outputs, as shown in Figure 2. The ANN consists of one hidden layer with
7 neurons.
Figure 2—Schematic of the Neural Network model
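To make the network shape concrete, the forward pass of a 3-input, 7-hidden-neuron, 3-output network can be sketched as below. The paper does not state the activation functions, weight initialization, or input scaling, so the `tanh` hidden layer, the softmax output, and the random initial weights are assumptions chosen only for illustration.

```python
import numpy as np

def init_network(n_inputs=3, n_hidden=7, n_outputs=3, seed=0):
    """Random initial weights for a network with a single hidden layer."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.5, size=(n_inputs, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_hidden, n_outputs)),
        "b2": np.zeros(n_outputs),
    }

def forward(params, x):
    """Map rows of (casing pressure, tubing pressure, gas rate) to class probabilities."""
    hidden = np.tanh(x @ params["W1"] + params["b1"])   # assumed hidden activation
    logits = hidden @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)          # assumed softmax output

params = init_network()
scaled_inputs = np.array([[0.2, -0.1, 1.0], [1.5, 0.3, -0.7]])  # illustrative, pre-scaled
probabilities = forward(params, scaled_inputs)
```

Training would adjust `W1`, `b1`, `W2`, and `b2` to minimize the objective function; the predicted class for each sample is the column with the largest probability.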
For the first attempt, the data of well #1 were used to train an ANN. The results of this classification are
shown in two ways: as a confusion matrix in Figure 4 and as a well production profile in Figure 5. The confusion
matrix, shown schematically in Figure 3, is a common way to present classification results.
Figure 3—Schematic of confusion matrix
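As a small illustration of how a confusion matrix and the overall accuracy are computed from class indices, consider the sketch below. The convention that rows are actual classes and columns are predicted classes is an assumption; the figures in the paper may use the transposed layout.

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """Count samples per (actual class, predicted class) pair."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        matrix[a, p] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. classified correctly."""
    return np.trace(matrix) / matrix.sum()

# Illustrative labels only (0 = Not Loading, 1 = Alert, 2 = Loading).
actual = [0, 0, 1, 2, 2, 2]
predicted = [0, 1, 1, 2, 2, 0]
cm = confusion_matrix(actual, predicted, n_classes=3)
overall_accuracy = accuracy(cm)
```

Off-diagonal entries show which classes are confused with each other, which is more informative than the single accuracy figure.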
The confusion matrix shows that the data points are classified almost perfectly (99.9% accuracy), with only 2 points
misclassified.
Figure 4—Confusion matrix of well #1
Figure 5 compares the three classes (loaded, unloaded, and warning status of the well) with
the Turner rate criterion. The color-coded production profile is in solid agreement with the Turner rate
criterion. For this specific well, 70% of the production data is picked randomly for training, 15% for
calibration, and the remaining 15% for validation of the predictions. We also performed a sensitivity
analysis on the fraction of the data used for training and could decrease it to only 50% of the
data while maintaining high accuracy.
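A random 70/15/15 partition of the samples can be sketched as follows. The function name and the fixed seed are assumptions for illustration; the paper states only the proportions and that the training samples were picked randomly.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.70, calib_frac=0.15, seed=0):
    """Randomly partition sample indices into training, calibration, and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    n_calib = int(round(calib_frac * n_samples))
    train = order[:n_train]
    calib = order[n_train:n_train + n_calib]
    valid = order[n_train + n_calib:]   # whatever remains is used for validation
    return train, calib, valid

train_idx, calib_idx, valid_idx = split_indices(1000)
```

The sensitivity analysis mentioned above corresponds to repeating the training with a smaller `train_frac`, for example 0.50, and comparing the resulting accuracy.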
Figure 5—Production rate of the well 1
For the next step, 20 wells in a specific region were used to train the ANN model. In this case, to build a
smart model, 70% of the production data from the 20 wells is used for training and the rest for calibration
and validation. The validated model was then used for a completely blind test on a different well in the same
region that was not included in the process of building the smart model. Figure 6 shows the quality of the
model built from the 20 wells.
Figure 6—Confusion matrix for training 20 wells
The model was then deployed on a completely blind well as validation. The model
identifies the loading, not loading, and warning status of the well with an
accuracy of more than 99% (Figure 7).
Figure 7—Confusion matrix for completely blind test well
Figure 8—production rate of the completely blind well #2
Figure 8 compares the three classes (loaded, unloaded, and warning status of the well) with the
Turner rate criterion. The color-coded production profile is in very good agreement with the Turner rate
criterion.
In training the smart model with the neural network, we were still required
to label the data as loaded, unloaded, or warning based on the Turner et al. critical velocity criterion.
This approach implicitly assumes that Turner's criterion is valid in unconventional reservoirs.
To eliminate this implicit dependency on Turner's criterion, in the next section we use an unsupervised
algorithm that does not require any prior information on the status of the well and therefore removes the
dependency of the model on any theoretical or empirical correlation.
K-means Clustering
K-means clustering is used to partition the data into k clusters when the data have no labels (in the ANN, the data
need to be labeled). The output of k-means clustering is k points in space, which are the centroids of the
clusters. Each sample in the data belongs to the cluster whose centroid is closest to it (by Euclidean distance).
The centroid can represent all the points in its cluster.
The value of k and the initial centroids are the controlling parameters of this approach (like the number
of hidden layers and neurons in the ANN). Selecting the k initial centroids is the most challenging part of the
k-means algorithm. K-means can be initialized with random centroids, which are usually
selected from the data set. The final solution of the algorithm depends strongly on the starting points, so
different initial centroids may or may not lead to different results. To avoid finding different clusters from
run to run, the initial centroids should be chosen deliberately rather than randomly.
The algorithm has three simple steps:
I. Find the Euclidean distance between each sample and all the cluster centroids.
II. Assign the label of the nearest cluster centroid to each sample.
III. Update each centroid by averaging all the samples that belong to its cluster.
These steps are repeated until the centroids no longer change.
Although the algorithm seems to be very simple, this technique is powerful in clustering especially when
reasonable parameters are used to perform the analysis.
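The three steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation; the function names, the iteration cap, and the handling of empty clusters are assumptions.

```python
import numpy as np

def assign_labels(data, centroids):
    """Steps I and II: distance from every sample to every centroid, then nearest centroid."""
    distances = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

def kmeans(data, initial_centroids, max_iter=100):
    """Iterate the three steps from given initial centroids until the centroids stop moving."""
    data = np.asarray(data, dtype=float)
    centroids = np.array(initial_centroids, dtype=float)
    for _ in range(max_iter):
        labels = assign_labels(data, centroids)
        updated = centroids.copy()
        for j in range(len(centroids)):
            members = data[labels == j]
            if len(members) > 0:          # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)   # Step III
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, assign_labels(data, centroids)

# Illustrative two-dimensional data with two obvious groups.
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
final_centroids, cluster_labels = kmeans(points, initial_centroids=[[0, 0], [10, 10]])
```

Because Euclidean distance is sensitive to the magnitude of each variable, inputs with different units (pressures and gas rate) are often scaled before clustering; the paper does not state whether scaling was applied.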
K-means Clustering setup. The application of the ANN in the previous sections showed that there is a
hidden pattern in the data that can be extracted using data-based approaches. Initially, we used the ANN
to see whether such a pattern exists. To do so, we had no choice but to label the data using the
Turner et al. minimum gas velocity criterion. In this section, an unsupervised algorithm is employed to
perform the classification without labeling the data using a supervisor (i.e., the Turner criterion). As discussed
in the introduction, the Turner critical gas rate rests on major simplifying assumptions that
limit its application, especially in unconventional gas reservoirs.
For the first attempt, the data were classified into 2 groups using random initial centroids.
The k-means algorithm was run 1000 times to study the effect of random initialization. As expected, different
results were obtained. The Turner rate was used as a comparison tool, and Figure 9 shows the histogram of the
accuracy of the predictions. This graph shows that 4 different classifications (local minima) exist
in our data set, which could be the result of choosing random initial centroids.
Figure 9—Histogram of accuracy of predictions
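The random-restart experiment can be sketched as below: k-means is run many times from initial centroids drawn at random from the data, each result is scored against a reference labeling, and the scores are tallied. The synthetic data, the helper names, and the scoring rule (taking the better of the two possible cluster-to-class mappings for k = 2) are assumptions; in the paper the reference labels come from the Turner rate.

```python
import numpy as np
from collections import Counter

def kmeans_labels(data, centroids, max_iter=100):
    """Plain k-means from the given starting centroids; returns the cluster index per sample."""
    centroids = np.array(centroids, dtype=float)
    for _ in range(max_iter):
        labels = np.linalg.norm(data[:, None] - centroids[None], axis=2).argmin(axis=1)
        updated = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels

def accuracy_vs_reference(labels, reference):
    """Cluster indices are arbitrary for k = 2, so score both mappings and keep the better."""
    agreement = float(np.mean(labels == reference))
    return max(agreement, 1.0 - agreement)

def random_restart_accuracies(data, reference, k=2, n_runs=1000, seed=0):
    """Tally the accuracy reached from each random initialization."""
    rng = np.random.default_rng(seed)
    tally = Counter()
    for _ in range(n_runs):
        start = data[rng.choice(len(data), size=k, replace=False)]
        score = accuracy_vs_reference(kmeans_labels(data, start), reference)
        tally[round(score, 3)] += 1
    return tally

# Synthetic stand-in for (casing pressure, tubing pressure, gas rate) samples.
rng = np.random.default_rng(1)
samples = np.vstack([rng.normal(0.0, 1.0, (50, 3)), rng.normal(4.0, 1.0, (50, 3))])
reference_labels = np.array([0] * 50 + [1] * 50)
accuracy_tally = random_restart_accuracies(samples, reference_labels, n_runs=20)
```

Plotting `accuracy_tally` as a bar chart gives a histogram of the kind described above; several distinct bars indicate that the algorithm converges to different local minima depending on the starting centroids.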
To guarantee the repeatability of this approach, the initial centroids should be supplied to the k-means
algorithm. If a fixed set of initial centroids (a point close enough to one of the local minima) is given to the
k-means algorithm, the results are unique, but a unique result is of no interest if it is wrong. The correct
initial centroids can be obtained using logical reasoning. For this purpose, we revisited the common
signs of liquid loading discussed earlier and used them as guidelines to choose the initial centroids.
For example, between a high rate and a low rate, the high rate is
less likely to be associated with liquid loading and the low rate is a good candidate for loading issues. Similarly,
between two sets of casing and tubing pressures, the case with the larger difference between casing and tubing
pressure is more likely to be loaded. These logical statements can be further confirmed by looking
at the final centroids of the accurate runs in Figure 9. One set of initial centroids is provided in Table 2.
Table 2—Initial centroids
Using constant initial centroids guarantees unique final results, although it is possible to obtain the same
final results using different initial centroids. Plotting the distribution of all the parameters helps us
understand the data better and choose the initial centroids more wisely.
The values in Table 2 were used as the initial centroids, and a successful classification was performed
using the k-means algorithm. The green dots show the not loaded status and the red dots show the liquid loading
status. The Turner rate is also illustrated in the plot to show the difference between the two approaches.
The plot shows that after 2013 the well is loaded and that some remedies are applied to resolve
the issue. The well then produced normally for a short period of time, followed by gas rate decline and
liquid loading issues. As the figure shows, this approach captures the liquid loading problem
knowing only casing pressure, tubing pressure, and gas rate, even though the operational constraints
changed. In this technique we did not include additional information such as tubing size, liquid density,
temperature, etc.; however, such additional information might improve the accuracy of the results.
The same analysis was done using the same initial centroids for different wells in the region on the same
and different pads. The results show high accuracy of the model in predicting liquid loading, as presented
in Figure 10 and Figure 11, which show that the same model can be used to identify the
beginning of liquid loading in two completely blind wells in the region. In general, the well status predictions
(red and green dots) are in close agreement with the Turner predictions; however, there are cases where the Turner
minimum gas rate appears to be underestimated, as in the second half of 2015 in Figure 10, where the gas rate is
above the Turner rate but our model still identifies liquid loading. Similar observations can be made in Figure 11.
Figure 10—Production rate of the completely blind well #3 compared with the traditional Turner criterion
Figure 11—Production rate of the completely blind well #4 compared with the traditional Turner criterion
Conclusions
In this study, we explored the use of different machine learning techniques
to identify and predict liquid loading problems in shale gas reservoirs. For this purpose, production and
completion data from 160 wells in the Marcellus shale gas reservoir were selected, and both supervised and
unsupervised learning algorithms were applied to identify and predict liquid loading problems. The supervised
technique, in this case a neural network, showed that it is possible to identify and predict loading
in shale gas reservoirs. However, since the technique requires prior information regarding the loading and
unloading status of the well, it implicitly relies on the traditional Turner et al. critical gas velocity criterion
for training. Therefore, an unsupervised learning algorithm, in this case k-means clustering, was used to overcome
this problem. The unsupervised algorithm showed great promise for
identifying and predicting liquid loading in real time without any dependency on previously developed analytical
or empirical correlations. The technique uses only hard data collected during production and was
reliable and robust in all the cases examined in this study. The new smart model developed
for the Marcellus Shale suggests that this approach can be applied in other areas where only a limited
history of production and liquid loading exists.
Acknowledgments
The authors would like to thank CNX Resources for providing the data and
permission for this publication.
References
1. Turner, R.G., Hubbard, M.G., and Dukler, A.E. 1969. Analysis and Prediction of
Minimum Flowrate for the Continuous Removal of Liquids from Gas Wells. J Pet Technol 21 (11):
1475–1482; Trans., AIME, 246. SPE-2198-PA. doi: 10.2118/2198-PA.
2. Coleman, S.B., Clay, H.B., McCurdy, D.G., and Norris, L.H. III. 1991. A New Look at Predicting
Gas-Well Load-Up. J Pet Technol 43 (3): 329–333; Trans., AIME, 291. SPE-20280-PA. doi:
10.2118/20280-PA.
3. Nosseir, M.A., Darwich, T.A., Sayyouh, M.H., and El Sallaly, M. 2000. A New Approach for
Accurate Prediction of Loading in Gas Wells Under Different Flowing Conditions. SPE Prod &
Fac 15 (4): 241–246. SPE-66540-PA. doi: 10.2118/66540-PA.
4. Guo, B., Ghalambor, A., and Xu, C. 2006. A Systematic Approach to Predicting Liquid
Loading in Gas Wells. SPE Production & Operations 21 (1). SPE-94081-PA.
https://doi.org/10.2118/94081-PA
5. Brito, R.M. 2015. Effect of Horizontal Well Trajectory on Two-phase Gas-liquid Flow Behavior.
Ph.D. Dissertation, The University of Tulsa, Tulsa, Oklahoma.
6. Jackson, D., Claudio, J.J., and Sask, D. 2011. Investigation of Liquid Loading in Tight Gas
Horizontal Wells With a Transient Multiphase Flow Simulator. SPE-149477-MS, Canadian
Unconventional Resources Conference, 15-17 November, Calgary, Alberta, Canada.
7. Belyadi, H., Fathi, E., and Belyadi, F. 2016. Hydraulic Fracturing in Unconventional Reservoirs:
Theories, Operations, and Economic Analysis. Gulf Professional Publishing.