Abstract
arXiv:1905.09746v1 [physics.data-an] 23 May 2019
Engineering simulators used for steady-state multiphase flows in oil and gas wells and pipelines are commonly utilized to predict
pressure drop and phase velocities. Such simulators are typically based on either empirical correlations (e.g., Beggs and Brill,
Mukherjee and Brill, Duns and Ros) or first-principles mechanistic models (e.g., Ansari, Xiao, TUFFP Unified, Leda Flow Point
model, OLGAS). The simulators allow one to evaluate the pressure drop in a multiphase pipe flow with acceptable accuracy.
However, the main shortcoming of these correlations and mechanistic models is their limited range of applicability (with the exception of steady-state versions
of transient simulators such as Leda Flow and OLGA). Empirical correlations are applicable only within their respective ranges
of data fitting, and mechanistic models are limited by the applicability of the empirically based closure relations that are part of
such models. In order to extend the applicability and the accuracy of the existing accessible methods, a method of pressure drop
calculation in the pipeline is proposed. The method is based on well segmentation and calculation of the pressure gradient in each
segment using three surrogate models based on Machine Learning (ML) algorithms trained on a representative lab data set from the
open literature. The first model predicts the value of a liquid holdup in the segment, the second one determines the flow pattern, and
the third one is used to estimate the pressure gradient. To build these models, several ML algorithms are trained such as Random
Forest, Gradient Boosting Decision Trees, Support Vector Machine, and Artificial Neural Network, and their predictive abilities
are cross-compared. The proposed method for pressure gradient calculation yields R2 = 0.95 using the Gradient Boosting
algorithm, as compared with R2 = 0.92 in the case of the Mukherjee and Brill correlation and R2 = 0.91 when a combination of the Ansari and
Xiao mechanistic models is utilized. The application of the above-mentioned ML algorithms, together with the larger database used for their
training, allows extending the proposed methodology to a wider applicability range of input parameters as compared to standard
accessible techniques. The method for pressure drop prediction based on ML algorithms trained on lab data is also validated on three
real field cases. Validation indicates that the proposed model yields the following coefficients of determination: R2 = 0.806, 0.815
and 0.99 as compared with the highest values obtained by commonly used techniques: R2 = 0.82 (Beggs and Brill correlation),
R2 = 0.823 (Mukherjee and Brill correlation) and R2 = 0.98 (Beggs and Brill correlation). Hence, the method for calculating the
pressure distribution gives comparable or even higher scores on field data than correlations and mechanistic models.
This indicates that the model scales from lab to field conditions without any additional retraining of the ML
algorithms.
Keywords: multiphase flow, machine learning, surrogate modeling, pressure gradient, flow pattern, liquid holdup
Preprint submitted to Journal of Petroleum Science and Engineering May 24, 2019
in [6] the authors applied the Support Vector Machine (SVM) algorithm, and in [7] a fuzzy inference system was used.

Accurately determining the liquid holdup is also important in planning the design of separation equipment. For example, the slug flow regime can be dangerous for the separator when a significant liquid mass comes unexpectedly to the surface from a long near-horizontal wellbore. Several papers were devoted to determining this parameter by machine learning tools. For example, in [5] and [8] the authors applied ANN.

In order to calculate the pressure distribution in a pipe, segmentation and a numerical algorithm are used. During multiphase flow in a tube, the flow pattern and pressure gradient change along the pipe. Therefore, to solve this problem, it is necessary to break down the whole pipe into segments. Within each segment, the flow regime can be considered homogeneous, and the pressure gradient is approximately constant. Each step of the numerical algorithm calculates the pressure gradient within the segment. Multiphase flow correlations and mechanistic models are commonly used for this purpose.

Correlations are developed from laboratory experiments. Among the most widely used are Aziz and Govier [9], Beggs & Brill [10], Mukherjee & Brill [11], and others. Many articles are concerned with the limits of applicability of multiphase flow correlations. The authors of these papers conclude that each correlation can be applied only within its range of input parameters if results of acceptable accuracy are to be obtained. There are also several mechanistic models with semi-empirical closure relations, which are used to predict different multiphase flow characteristics. The most popular ones are Hasan and Kabir [12], Ansari [13], TUFFP Unified [14], [15], and others. Furthermore, there are also two steady-state multiphase flow models with relatively high accuracy of pressure drop calculations, namely the Leda Flow Point model [16] and OLGAS [17]. These mechanistic models are steady-state versions of the corresponding transient simulators.

The literature review shows that researchers have applied ML algorithms both to pressure gradient prediction and to direct output pressure estimation. In the article [18] the authors predicted the bottomhole pressure using ANN. In contrast, in the paper [19] the authors also forecasted the bottomhole pressure via ANN, but they suggested segmenting the well and identifying the flow regime in each segment.

The present work is a continuation and extension of [20]. Compared to the earlier work, all the results are new. We expanded the database by more than 20%, changed the set of input parameters used for constructing the surrogate models, and modified the resulting sub-model for pressure gradient calculation. Now we use the momentum conservation equation as a framework and have a separate sub-model for the friction factor coefficient. The workflow is aimed at the prediction of the pressure distribution. A sensitivity study is carried out. Finally, the model trained on lab data is applied to the field cases without any retraining.

1.2. Precise problem formulation

We consider the class of steady-state multiphase gas-liquid flows in wells and pipelines of circular cross-section at an arbitrary inclination angle to the horizontal, in the gravity field. In oil production, this class of flows can be encountered at various stages of the well life, from cleanup after drilling, through cleanup after fracturing, to the production stage. The key to controlling oilfield services operations is the ability to accurately predict the pressure drop along the well, while typical measurements of pressure are taken at the surface (memory pressure gauges may be installed downhole, but their readings are rarely available in real time, so a tool to calculate the pressure drop from the surface to the bottom hole is required). Another issue is the pressure drop along the horizontal section of a (generally, multistage fractured) well. During flowback and production, the well is controlled by the choke on the surface and by the ESP (if one is installed), but no measurements are taken in the horizontal section. Excessive drawdown (flowing the well at excessively high flow rates) may be dangerous to the well-fracture system, resulting in undesired geomechanical events such as proppant flowback, tensile rock failure, and fracture pinch-out in the near wellbore [3].

A new approach for pressure gradient calculation on an arbitrarily oriented pipe segment is proposed. The method is based on machine learning algorithms. The considered algorithms are trained on a lab data set collected from open sources: articles and dissertations. The new method consists of three surrogate models nested within each other (see a general discussion of surrogate modeling in [21]). The first model predicts the liquid holdup parameter. The second one identifies the flow regime in the segment; the value of liquid holdup is included in the input parameters of this model. The resultant model estimates the pressure gradient; among its input parameters, the flow pattern and the value of liquid holdup are present.

This comprehensive approach to pressure gradient calculation, which incorporates information about the flow pattern and liquid holdup into surrogate models trained on lab data from different sources, yields the output value with high accuracy and allows one to obtain a model with a wider scope of applicability in comparison with standard techniques.

2. Modeling approach

2.1. Details of modeling approach

This part of the article is devoted to the description of the method of calculating the pressure distribution along a pipe. In the open literature (e.g., [22]) this method is called the marching algorithm. It is a numerical algorithm for computing the integral of the pressure gradient function along the pipe path:

    ∆p = ∫_0^L (dp/dL)(l′) dl′.  (1)

To perform the integration, the pipe is divided into n segments of length ∆L_i. Inside each segment, the flow type is approximately homogeneous, and the pressure gradient along the segment is approximately constant. In Fig. 1, an example of a segmented inclined well is represented.
Figure 1: Schematic picture of the segmented well
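The segmentation and piecewise-constant integration described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `pressure_gradient` is a hypothetical stand-in for the pressure-gradient model, evaluated at the midpoint of each segment.

```python
def pressure_drop(pressure_gradient, length, n_segments):
    """Approximate the integral in Eq. (1) by a piecewise-constant sum.

    pressure_gradient -- callable returning dp/dL at a position l' along the pipe
                         (a stand-in for the surrogate model)
    length            -- total pipe length L
    n_segments        -- number of segments n
    """
    dL = length / n_segments
    total = 0.0
    for i in range(n_segments):
        midpoint = (i + 0.5) * dL                   # evaluate dp/dL in the middle of segment i
        total += pressure_gradient(midpoint) * dL   # accumulate (dp/dL)_i * dL_i
    return total

# With a constant gradient the sum reduces to (dp/dL) * L:
print(pressure_drop(lambda l: 3.0, length=50.0, n_segments=5))  # -> 150.0
```

Because the gradient is sampled at segment midpoints, the sketch is exact for any gradient that varies linearly along the pipe, which makes it easy to sanity-check.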
In other words, the integrand function (dp/dL)(l′) is represented by a piecewise-constant approximation, and the integral is transformed into a summation:

    ∆p = Σ_{i=1}^{n} (dp/dL)_i ∆L_i,  (2)

where ∆L_i is the length of the i-th segment and (dp/dL)_i is the pressure gradient along the i-th segment of the pipe.

We show the flowchart of the marching algorithm in Fig. 2.

Figure 2: The flowchart of the marching algorithm.

Before the calculation, the following parameters are known: the pressure and temperature at the inlet, p_in = p_in1 and T_in; the temperature at the outlet T_out; the liquid flow rate Q_lSC; the densities of oil, gas and water at standard conditions; the gas-oil ratio (GOR); the water cut (WC); and other parameters. In the numerical algorithm, heat processes are not accurately considered; that is why linear interpolation of the temperature between the inlet and outlet is used:

    T_i = T_in + ((T_out − T_in)/n) · (i − 1).

To calculate the flow parameters in each segment, a set of PVT correlations is utilized in which the pressure and temperature are taken in the middle of the segment.

To begin with, the first segment is considered. In order to calculate the pressure at the inlet of the second segment p_in2 (it is equal to the pressure at the outlet of the first segment), an iterative algorithm is applied. In each step of this algorithm, the value of the outlet pressure of the first segment p^guess_in2 is guessed, and the pressure at the center of the segment

    p̄_1 = (p^guess_in2 − p_in1)/2

is calculated. Using p̄_1, the temperature at the center of the segment from linear interpolation, and the PVT correlations, the pressure gradient (dp/dL)_1 is estimated and, consequently, the pressure

    p^calc_in2 = p_in1 + (dp/dL)_1 ∆L_1

is obtained. These calculations are continued until the following condition is satisfied:

    |p^calc_in2 − p^guess_in2| < ε,

where ε is the desired accuracy. As a result, the input pressure of the second segment

    p_in2 = p_in1 + (dp/dL)_1 ∆L_1

is derived. Continuing this iterative process, one could compute the output pressure:

    p_out = p_in + Σ_{i=1}^{n} (dp/dL)_i ∆L_i.  (3)

The schematic description of these calculations is shown in Fig. 3.

At each step of the marching algorithm, it is necessary to compute the value of the pressure gradient in the considered segment. For this purpose, a new approach is proposed. It is based on machine learning algorithms and consists of three surrogate models. The resultant model estimates the value of the pressure gradient in the segment. Among the input parameters of this
model we also include the value of liquid holdup and the flow pattern (besides the velocities and PVT properties of liquid and gas at flowing conditions). These parameters are determined by the other two surrogate models. The next subsection of the article is devoted to the detailed description of the proposed method of pressure gradient calculation.

Figure 3: Schematic description of the iterative algorithm for pressure drop calculations

2.2. Method of calculating the pressure gradient in the pipe segment

This subsection is devoted to the description of the proposed method for calculating the pressure gradient in a segment which could be oriented at an arbitrary angle to the horizontal direction (from −90° to 90°). An inclination angle equal to zero corresponds to a horizontal flow. Positive and negative angles correspond to uphill and downhill flows, respectively.

The method of calculating the pressure gradient in the segment consists of three surrogate models nested within each other that are based on machine learning algorithms. The first model predicts the value of the liquid holdup parameter in the pipe segment. The second model identifies the flow regime; it is necessary to highlight that the output of the first model is used among the input parameters of this model. The final model is targeted at the estimation of the pressure gradient. In this model, the outputs of both the first and the second models are utilized.

To create surrogate models applicable to different types of liquids and gases, and to reduce the number of input features when training the ML algorithms, the following set of dimensionless variables was chosen according to the article [23]. These parameters are called the velocity number of gas, the velocity number of liquid and the viscosity number, and are defined by the following equations:

    Nvg = vsg (ρl/(g σlg))^{1/4},  Nvl = vsl (ρl/(g σlg))^{1/4},  Nµ = µl (g/(ρl σlg³))^{1/4},  (4)

where vsg and vsl are the gas and liquid superficial velocities, respectively, ρl is the liquid density, σlg is the surface tension between the liquid and gas phases, µl is the liquid viscosity, and g is the gravitational acceleration.

In this set of dimensionless parameters, the diameter number Nd = d (ρl g/σlg)^{1/2} (d is the tube diameter) is excluded because of its relatively narrow range of values in the case of lab data. This range almost does not overlap with the corresponding one in the case of field data (the comparison will be presented in Section 6). As a result, a surrogate model trained on lab data with the diameter number among its input features could not be applied to calculations connected with field data.

The next input parameter in the surrogate models is the no-slip liquid holdup defined by the following equation:

    λl = vsl/vm,

where vm = vsg + vsl is the mixture velocity.

Finally, we also include the Reynolds (Re) and Froude (Fr) numbers into the input features. They are defined according to the equations:

    Re = ρns vm d/µns,  Fr = vm²/(g d),

where ρns = ρl λl + ρg (1 − λl) and µns = µl λl + µg (1 − λl) are the no-slip density and viscosity, respectively (ρg and µg are the gas density and viscosity).

Thus, in the chosen set of dimensionless parameters, information about the diameter is contained in the Reynolds and Froude numbers. Moreover, we note that in the Beggs & Brill and Mukherjee & Brill correlations the diameter number is also excluded from the models.

Here we would like to justify the order of steps in the proposed method, where first we determine the volume fraction of the liquid, and only then we determine the flow regime. The logic is as follows. A typical flow regime map is built on a 2D plane whose axes are the liquid and gas velocity numbers (or these parameters could be the superficial velocities of gas and liquid) [22], [4]. The velocity number is proportional to the superficial velocity which, in turn, by definition is the linear velocity times the volume fraction. From computational fluid dynamics, it is known that by solving the system of conservation laws with closure relations it is possible to find the fields of the gas and liquid linear velocities and the liquid holdup. Hence, in order to determine where we are on the flow regime map, one needs to determine first the volume fraction of the fluid (i.e., the holdup), and only then one determines the superficial velocities for this volume fraction, which allows determining the point on the flow regime map and, hence, identifying the flow regime. Therefore, in our method we first determine the holdup, then the flow regime, and not vice versa.

2.2.1. Model for liquid holdup

The surrogate model for estimation of the liquid holdup parameter in the pipe segment is a regression. The following set of input parameters is used: the inclination angle of the segment, the liquid and gas velocity numbers, the viscosity number, the Reynolds and Froude numbers, and the no-slip liquid holdup. To improve the predictive ability of the surrogate model, the input data is divided into three categories according to the inclination angle of the segment: data points responsible for horizontal, uphill and downhill flows. For each group a separate surrogate model for liquid holdup prediction is constructed.

2.2.2. Model for flow regime identification

In this subsection, the second surrogate model is discussed. It is represented by a multi-class classifier and predicts the flow regime. Since in the lab dataset there are only four flow regimes distinguished (bubble, slug, annular mist and stratified
regimes), the surrogate model determines this parameter among them. The flow pattern and the value of liquid holdup are linked parameters; that is why the latter is included in the input features of this classification model. Among the other input features, the inclination angle, the dimensionless parameters (Eq. 4), the Reynolds and Froude numbers, and the no-slip liquid holdup are included. Similar to the liquid holdup prediction model, in this case three ML models, for horizontal, upward and downward flow, are constructed.

2.2.3. Model for pressure gradient

In order to estimate the pressure gradient in the segment, the equation for conservation of linear momentum is used. From this conservation law, it is possible to express the pressure gradient, which consists of three components: friction, elevation and acceleration. Using the definition of the friction component, it could be written in the form:

    (dp/dl)_f = f ρns vm²/(2d),

where f is a friction factor. Further, the elevation component is expressed in the form:

    (dp/dl)_el = ρm g sin θ,

where ρm = ρl αl + ρg (1 − αl) is the in-situ density of the mixture (here αl is the liquid holdup) and θ is the inclination angle. The final component is the acceleration, or kinetic energy, component that results from a change in velocity. In many cases this part of the pressure gradient is negligible, but it is significant when a compressible phase is present under relatively low pressures. Doing similar math to [10], we express the acceleration component as

    (dp/dl)_acc = −(ρm vm vsg/p) (dp/dl),

where p is pressure. Thus, the equation for the pressure gradient has the following form:

    dp/dl = − [ (ρl αl + ρg (1 − αl)) g sin θ + f (ρl λl + ρg (1 − λl)) vm²/(2d) ] / [ 1 − (ρl αl + ρg (1 − αl)) vm vsg/p ].  (5)

In Eq. 5, all parameters are known apart from the liquid holdup (αl) and the friction factor (f) for a multiphase flow. Using the first surrogate model, the value of αl could be defined. In order to estimate the friction factor, it is necessary to construct another regression model. Utilizing data from the laboratory database, we recalculate the target values of the friction factor for this model by using Eq. 5. The input parameters of this model include the following set: the liquid velocity number, the viscosity number, the Reynolds and Froude numbers, the liquid holdups (αl and λl), and the relative roughness.

At the end of this section, we would like to discuss an alternative approach, where the pressure gradient is directly predicted by a surrogate model trained on lab data. This approach is discussed in the earlier work [20], where the authors present the set of input parameters of this model, the values of the scores and a cross-plot with the results. Although the results obtained are quite high, the entire method for pressure gradient prediction (liquid holdup, flow pattern and pressure gradient) provides very low scores on the field data, which was used for model validation. This model contains the diameter number among its input parameters, which has narrow variation ranges for lab and field data that only slightly overlap. Excluding this dimensionless diameter number, it is then not possible to predict the pressure gradient directly from the data. The other problem of directly predicting the pressure gradient is its complex structure: gravitational, acceleration and friction components. Hence, in the present work we propose a new approach for pressure gradient estimation.

2.3. Applied Machine Learning algorithms, their tuning and evaluation scores

Let us introduce some notation. The matrix with input features is denoted as X and has size m × d (m is the number of samples and d is the number of features); y is a matrix (in the majority of cases, a column) that contains the target real values. It has size m × r, where r = 1 when we model a single output, and r > 1 when we consider a multiple output task. Moreover, ŷ is the matrix with predicted values and has the same size as the matrix y.

In the present paper four Machine Learning algorithms are applied: Random Forest, Gradient Boosting Decision Trees, Support Vector Machines and Artificial Neural Networks (this set of ML algorithms is the same as utilized in the article [20]). All these algorithms are used to construct each of the models described above for liquid holdup prediction, flow pattern identification and pressure gradient calculation in order to compare their predictive capability. Let us briefly describe these algorithms.

Random Forest [24] can be used to solve both regression and classification tasks by constructing ensembles of ML models: the algorithm constructs several independent decision trees and averages them:

    h_N(x) = (1/N) Σ_{i=1}^{N} h_i(x),  (6)

where h_i(x) is a decision tree and N is the total number of decision trees.

Each decision tree is trained on its own bootstrapped sample: we select data points randomly and with replacement from the initial dataset to construct the bootstrapped sample. Moreover, as input features for this sample we randomly select a subset of the initial d features.

During construction of the surrogate models, the implementation of the Random Forest algorithm in the Scikit-learn library [25] for Python is utilized. For classification and regression problems we use the functions RandomForestClassifier() and RandomForestRegressor(), respectively.

All machine learning algorithms have their own set of parameters that should be tuned based on the utilized dataset. These parameters are called hyperparameters. In the case of the Random Forest algorithm, the following hyperparameters are adjusted: the number of trees in the forest (n_estimators),
the maximum depth of a tree (max_depth), and the number of features that the algorithm considers in the process of tree construction (max_features). The hyperparameters of the Random Forest algorithm are tuned in the same sequence as they are listed.

Gradient Boosting Decision Trees [26]. This technique also constructs an ensemble of ML models and can be used both for regression and classification. The algorithm constructs several decision trees, and the final result is the weighted sum of them:

    h_N(x) = Σ_{i=1}^{N} α_i h_i(x).  (7)

Each decision tree h_i(x) tries to fit the anti-gradient of the loss function (logistic, exponential loss functions):

    − ∂L(f_{i−1}(x_j), y_j)/∂f_{i−1}(x_j),  j = 1, ..., m,

evaluated at f_{i−1}(x_j) = Σ_{k=1}^{i−1} α_k h_k(x_j), the predicted value for the data point with index j on the (i − 1)-th boosting iteration.

The weights in the sum of decision trees (α_i) are found by a simple line search:

    α_i = argmin_{α>0} Σ_{j=1}^{m} L(f_{i−1}(x_j) + α h_i(x_j), y_j).

Besides the Gradient Boosting algorithm implemented in the Scikit-learn library [25] in the functions GradientBoostingClassifier() (for classification models) and GradientBoostingRegressor() (for regression models), the XGBoost [27] implementation (its latest version) is also used. It is a standard gradient boosting with additional regularization to prevent over-fitting.

For the Gradient Boosting algorithm the following set of hyperparameters is tuned: the number of constructed trees (n_estimators), the learning rate that shrinks the contribution of each tree (learning_rate), the maximum depth of a tree (max_depth), the number of features that the algorithm considers in the process of tree construction (max_features), the fraction of samples that are used for learning each tree (subsample), and parameters concerning the building of the trees' structure (min_samples_split, min_samples_leaf). We tune the hyperparameters of the Gradient Boosting algorithm in the following sequence: n_estimators, max_depth, min_samples_split, min_samples_leaf, max_features, subsample and learning_rate.

The Gradient Boosting algorithm is often applied to problems connected with the oil and gas industry. For example, in the paper [28] the authors utilized this method for prediction of the flow rate of a well after hydraulic fracturing. In addition, in the article [29] the authors created a model based on the Gradient Boosting algorithm for calculation of the bottomhole pressure in the case of transient multiphase flow in wells.

Support Vector Machine [30]. Similarly to the previous two algorithms, SVM could be used in classification and regression analysis. In the case of SVM, the input x is very often transformed into a high-dimensional feature space by some non-linear mapping F(·). This transformation is applied in order to make the data linearly separable (in the case of classification) or to make it possible to fit the transformed data with a linear function w · F(x) + b, which is nonlinear in the original input space.

It can be proved that instead of explicitly defining F(·) it is sufficient to define a kernel K(x, x′), being the dot product between x and x′ in the new input space defined by the transformation F(·):

    K(x, x′) = F(x) · F(x′).

When constructing the SVM we use a Gaussian kernel with width σ:

    K(x′, x) = exp(−||x − x′||²/(2σ²)).

The SVM algorithm maximizes the classification margin, equal to 1/||w||, i.e., it minimizes the norm ||w||. In the case of binary classification, the algorithm also minimizes the sum of slack variables that are responsible for misclassification. The optimization problem has the following form:

    min_{w,b,ξ} ||w||²/2 + C Σ_{i=1}^{m} ξ_i
    subject to: y_i (w · F(x_i) + b) ≥ 1 − ξ_i and ξ_i ≥ 0, i ∈ [1, m].  (8)

In the case of regression, the algorithm minimizes the sum of slack variables that are responsible for data points lying outside the ε-tube and minimizes the complexity term ||w||². The formulation of this optimization problem is as follows:

    min_{w,b,ξ} ||w||²/2 + C Σ_{i=1}^{m} (ξ_i + ξ_i*)
    subject to: y_i − w · F(x_i) − b ≤ ε + ξ_i,
                w · F(x_i) + b − y_i ≤ ε + ξ_i*,
                ξ_i, ξ_i* ≥ 0, i ∈ [1, m].  (9)

Further, these optimization problems (8, 9) are solved in the dual formulation with the use of the Lagrangian.

The SVM algorithm is also implemented in Scikit-learn [25] in the functions SVC() and SVR() for classification and regression tasks, respectively, and these functions are used in the present work.

In the case of the SVM algorithm, the width of the Gaussian kernel (σ) and the regularization parameter (C) are adjusted in the process of model construction. We tune these parameters together, meaning that different combinations of σ and C are considered.

Artificial Neural Networks [31]. The schematic picture of an ANN is represented in Fig. 4. This ML algorithm could be applied in classification and regression problems. An ANN consists of several layers (input, output and hidden). Each layer has a certain number of nodes. The first layer contains the input parameters, so its number of nodes is p. The last one contains the output parameters: in the case of regression problems or binary classification there is only one node, while in the case of multiclass classification the number of nodes is equal to the number of classes. The algorithm consists of forward and backward propagation. In the forward propagation process, the algorithm calculates the values at each node in the following way: in order to
with ANN models have been already mentioned in the Section
1.1.
When applying neural networks to construct surrogate mod-
els we tune the following hyperparameters: the number of
hidden layers and number of nodes in hidden layers (hid-
den layer sizes), the value of learning rate (learning rate init).
Also, activation function (activation) and a method of gradient
descent (solver) are adjusted. Let us consider the approach of
tuning the number of nodes in the hidden layers. Firstly, we
take the ANN with one hidden layer and tune the number of
nodes (n1 ) in it. Further, we consider the ANN with two hidden
layers and tune the number of nodes in the second layer (n2 )
presuming that the number of nodes in the first layer equals n1 .
The same procedure is carried out for the third layer. We also
always check that adding of the new layer improves the value
of metric; otherwise the new layer is not inserted. The max-
imum number of hidden layers in our models equals to three.
There is no sense to utilize deeper ANN due to relatively small
data set. After tuning the network structure, we tune the learn-
ing rate init parameter. Finally, we note that for the gradient
descent in the back propagation procedure we utilize the Adam
optimizer [40].
For tuning hyperparameters of ML algorithms, we use the
Figure 4: The schematic picture of artificial neural network
grid search algorithm and the M cross-validation procedure.
Let us start with the description of the grid search. First of
obtain the value at the node k in the layer h + 1, the algorithm carries out a linear combination of the values in the nodes of layer h with some weights. Then it applies the activation function g(·) to this linear combination:

y_{k,h+1} = g(θ_{k,h→h+1}^T · y_h). (10)

In the backward process, the algorithm adjusts the weights by using a gradient descent optimization algorithm in order to decrease the value of the loss function L(f_i, y_i). In classification problems, the sigmoid (g(x) = 1/(1 + exp(−x))) or hyperbolic tangent (g(x) = tanh(x)) activation functions are used. In the case of regression, ReLU (g(x) = max(0, x)) is often applied. Approaches to initializing the parameters of an ANN model are listed in [32]; efficient ANN training algorithms are discussed in [33], and approaches to constructing ensembles of ANNs in [34].

The Artificial Neural Network is implemented in many Python libraries, for example, Keras [35], Theano [36] and PyTorch [37]. In this paper, the functions from the Scikit-learn library [25] are used: MLPClassifier() for the classification model and MLPRegressor() for the regression model. The Keras, Theano and PyTorch libraries are mainly oriented towards convolutional neural networks, so their usage is not necessary in the present simple case.

Similar to the Gradient Boosting algorithm, the Artificial Neural Network is also relatively popular for solving oil and gas industry problems. For example, it was used in [38] to predict the value of bottomhole pressure in a transient multiphase well flow. In another paper [39], the authors applied different ML algorithms for predicting porosity and permeability alteration due to scale deposition, and it was shown that the ANN has the best predictive ability for this problem. Some other studies

First of all, it is necessary to choose the sets of possible values of the hyperparameters. The grid search procedure passes through all combinations of hyperparameter values, and on each iteration it trains the ML algorithm on the training dataset and tests the trained algorithm on the testing part of the dataset. Finally, the grid search algorithm chooses the optimal combination of hyperparameters that yields the best score on the testing dataset. Under the term score, we mean the value of the metric that is applied in the considered regression or classification problem. The set of metrics utilized in our work will be described further.

Instead of dividing the data set into the training and the testing parts just once, we apply the cross-validation technique. In this procedure, the dataset is divided into M equal parts; M − 1 partitions are utilized as a training dataset and the remaining partition as a testing (also called validation) dataset. This process is repeated M times, and on each iteration a different test partition is used. As a result, it is possible to compute M test scores and average them. The averaged score is then utilized in the grid search algorithm for identifying which set of hyperparameters is optimal.

In order to evaluate the score of the tuned model with the best hyperparameters, the N × M cross-validation procedure is applied. In this method, a dataset is divided into M equal parts N times, and at each of the N steps the partitions are different. Thanks to this technique, the calculated score of the model (µ — mean) is more accurate as compared to the M cross-validation, and it also allows constructing confidence intervals of this score: µ ± 1.96·σ/√(M·N), where σ is the standard deviation. In this paper, M = 5 is used for the M cross-validation (this value is the most common), and in the case of the N × M cross-validation we set M = 5 and N = 20.
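In scikit-learn, the tuning and scoring procedure described above can be sketched as follows. This is an illustrative setup on synthetic data: the hyperparameter grid is hypothetical, not the one used in the paper, and the grid search uses M = 5 folds while the repeated (N × M) cross-validation uses M = 5, N = 20.

```python
# Sketch: grid search over a hypothetical MLP hyperparameter grid with M-fold
# cross-validation (M = 5), then N x M repeated cross-validation (N = 20) to
# score the tuned model with a confidence interval. Synthetic data only.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, RepeatedKFold, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=150, n_features=6, noise=0.1, random_state=0)

model = make_pipeline(StandardScaler(),
                      MLPRegressor(solver="lbfgs", max_iter=2000, random_state=0))
# Hypothetical grid; the paper does not report its exact search space.
grid = {"mlpregressor__hidden_layer_sizes": [(8,), (16,)],
        "mlpregressor__alpha": [1e-4, 1e-1]}

# Grid search: every hyperparameter combination is scored by 5-fold CV.
search = GridSearchCV(model, grid, cv=5, scoring="r2").fit(X, y)

# Repeated (N x M) CV of the tuned model gives a mean score and a
# confidence interval: mu +/- 1.96 * sigma / sqrt(M * N).
cv = RepeatedKFold(n_splits=5, n_repeats=20, random_state=0)
scores = cross_val_score(search.best_estimator_, X, y, cv=cv, scoring="r2")
mu, half = scores.mean(), 1.96 * scores.std() / np.sqrt(len(scores))
print(f"R2 = {mu:.3f} +/- {half:.3f}")
```

The pipeline wraps the scaler and the network so that scaling is re-fitted inside every fold, which avoids leaking test-fold statistics into training.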
The following set of evaluation metrics is used. In the case of multi-class classification:

• F1-macro metric:

P = (1/|Y|) Σ_y TP_y / (TP_y + FP_y),  R = (1/|Y|) Σ_y TP_y / (TP_y + FN_y),

F1 = 2PR / (P + R), (11)

where P is the precision, R is the recall, |Y| is the number of classes, TP_y — true positives of class y, FP_y — false positives of class y and FN_y — false negatives of class y.

• Accuracy:

Accuracy = (1/m) Σ_{i=1}^m 1_{f(x_i) = y_i} = Σ_y (TP_y + TN_y) / Σ_y (TP_y + FN_y + TN_y + FP_y), (12)

where TN_y — true negatives of class y.

In the case of regression:

• Coefficient of determination (R²-score):

R² = 1 − Σ_{i=1}^m (y_i − ŷ_i)² / Σ_{i=1}^m (y_i − ȳ)², (13)

where ȳ = (1/m) Σ_{i=1}^m y_i.

flow are taken from [45]. The data includes not only horizontal flows but also pipes at angles from −10° to 10°. The final 535 data points are taken from [46], in which the author conducted experiments with air–water and air–glycerin multiphase horizontal flow.

As a result, the consolidated database contains a total of 2560 points for constructing the ML models for liquid holdup prediction and flow pattern identification. Out of the total number, approximately 1700 points contain information about the value of the measured pressure gradient. In the process of constructing the surrogate model for the pressure gradient, we recalculate the values of the friction factor according to Eq. 5, as mentioned in subsection 2.2.3. For some data samples we obtained non-physical values (e.g., less than zero or too large as compared with the Fanning friction factor for the mixture). That is why we do not use these data points when constructing the surrogate models. As a result, the total number of data points for training, validation and testing of the resultant surrogate model is approximately 1300. In the collected dataset, the flow pattern is provided for about 1400 data points. In order to fill the remaining samples, the flow pattern map created by Mukherjee [11] is used.

As a result of the exploratory analysis of the dataset, we provide the following charts. First of all, let us observe the distribution of the angles at which the pipes are oriented in the experiments. In Fig. 5 the distribution of inclination angles is plotted.
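The metrics of Eqs. (11)–(13) can be written directly in NumPy. This is a minimal sketch that assumes every class occurs among both the true and predicted labels (so no denominator is zero); note that scikit-learn's f1_score(average="macro") instead averages per-class F1 values, which in general differs from Eq. (11), where precision and recall are macro-averaged first.

```python
# Direct NumPy implementation of the evaluation metrics in Eqs. (11)-(13).
import numpy as np

def f1_macro(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes = np.unique(np.concatenate([y_true, y_pred]))
    tp = np.array([np.sum((y_pred == c) & (y_true == c)) for c in classes])
    fp = np.array([np.sum((y_pred == c) & (y_true != c)) for c in classes])
    fn = np.array([np.sum((y_pred != c) & (y_true == c)) for c in classes])
    p = np.mean(tp / (tp + fp))   # macro-averaged precision
    r = np.mean(tp / (tp + fn))   # macro-averaged recall
    return 2 * p * r / (p + r)    # harmonic mean, Eq. (11)

def accuracy(y_true, y_pred):
    # Fraction of correctly classified samples, Eq. (12).
    return np.mean(np.asarray(y_true) == np.asarray(y_pred))

def r2_score(y_true, y_hat):
    # Coefficient of determination, Eq. (13).
    y_true, y_hat = np.asarray(y_true, float), np.asarray(y_hat, float)
    ss_res = np.sum((y_true - y_hat) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```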
Figure 10: Cross-plot for the Gradient Boosting algorithm in the liquid holdup model. Different colours are responsible for different categories by inclination angles

Figure 11: Cross-plot for the Gradient Boosting algorithm in the liquid holdup model. Different colours are responsible for different flow regimes (observed in the experiment)
Table 2: Empirical correlations and mechanistic models: best practices. All the mentioned methods can be applied to calculate parameters of multiphase flow of oil, gas or gas condensate.
Figure 14: Cross-plot for results of the OLGAS model. In this plot, different flow regimes are marked by different colours

Figure 15: Histogram with accuracy scores in the model for flow pattern prediction

horizontal and uphill flows (Fig. 15).

For the classification problem, it is possible to construct a confusion matrix. Each row of this matrix represents samples that belong to the true class labels, while each column represents the predicted class labels. Using the confusion matrix, one could find how well the ML algorithm predicts each class. In Fig. 16 the confusion matrix is depicted. The values in this matrix relate to the result of the flow pattern prediction model in which the Gradient Boosting algorithm and the whole dataset are used.

Analyzing the confusion matrix in Fig. 16, it is possible to reveal that the Gradient Boosting algorithm predicts the annular mist, slug and stratified flow patterns quite well and defines the bubble flow regime the worst.

The list of feature importances in the model for flow pattern prediction is the following:

1. No-slip liquid holdup
2. Gas velocity number
3. Angle of inclination
4. Froude number
5. Liquid velocity number
6. Viscosity number
7. Liquid holdup
8. Reynolds number

This model is based on the Gradient Boosting algorithm constructed using the whole data set.

After the creation of the model for multiphase flow regime prediction, it is possible to draw flow pattern maps. For example, in Fig. 17 the predicted map for horizontal flow of kerosene is represented. This map is generated by the Gradient Boosting algorithm. In this picture, besides the calculated zones of flow regimes, the boundaries between classes obtained by Mukherjee [43] and Mandhane [50] are drawn. Using this figure, one could identify that the boundaries between flow patterns from the Mukherjee and Mandhane maps are quite similar except for the transition boundary between the slug and bubble regimes. The boundaries predicted by the surrogate model almost restore the Mukherjee map because the information about the flow pattern in the laboratory data is taken from the Mukherjee experiments only. The greatest differences between
group. Only one model for friction factor prediction is built; it uses the whole data set in the training, validation and testing stages. In this model, the flow pattern feature is used among the input parameters. The flow regime is encoded by creating several columns (each corresponding to a specific flow regime) and populating them with values 0 and 1; the method is referred to as one-hot encoding. When a sample belongs to the slug flow pattern, it will have the value 1 in the column that indicates membership of the slug flow class. The model for friction factor prediction is embedded in the calculation of the pressure gradient according to equation (5), and the results of the pressure gradient estimation will be discussed further.
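The one-hot encoding step described above can be sketched, for instance, with pandas (the regime names are illustrative; the exact column layout in the paper's pipeline may differ):

```python
# Sketch: one-hot encoding of the flow pattern feature. Each regime becomes
# its own 0/1 indicator column.
import pandas as pd

df = pd.DataFrame({"flow_pattern": ["slug", "annular mist", "slug", "stratified"]})
one_hot = pd.get_dummies(df["flow_pattern"], dtype=int)
# A slug-flow sample gets 1 in the "slug" column and 0 in all other columns.
print(one_hot)
```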
In Fig. 18, the comparative bar chart with the scores of the different ML algorithms is plotted.
Figure 16: Confusion matrix for results of flow pattern prediction model
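As an illustration of how such a confusion matrix is assembled (with hypothetical flow-pattern labels, not the paper's dataset), rows index true classes and columns index predicted classes:

```python
# Sketch: building a confusion matrix. Diagonal entries count correct
# predictions per class; off-diagonal entries show which classes are confused.
import numpy as np

def confusion_matrix(y_true, y_pred, labels):
    index = {c: i for i, c in enumerate(labels)}
    cm = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[index[t], index[p]] += 1   # row = true class, column = predicted
    return cm

labels = ["annular mist", "slug", "stratified", "bubble"]
y_true = ["slug", "slug", "bubble", "stratified", "annular mist"]
y_pred = ["slug", "slug", "slug", "stratified", "annular mist"]
cm = confusion_matrix(y_true, y_pred, labels)
print(cm)
```

Here the single bubble-flow sample is misclassified as slug, which shows up as a 1 in the bubble row, slug column.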
Figure 18: Histogram with R2-scores in the model for pressure gradient prediction
Figure 23: Cross-plot for result of Ansari & Xiao combined model.
5. Sensitivity analysis
Table 3: Results of all applied machine learning algorithms in the case when the whole dataset is used.
Figure 26: Tornado diagram with variations of the liquid holdup value when the base case belongs to uphill flow. On the left of the chart, under the parameter's sequence number, the base value and its variation range are written

Figure 27: Tornado diagram with variations of the pressure gradient value when the base case belongs to uphill flow. On the left of the chart, under the parameter's sequence number, the base value and its variation range are written

Figure 28: Tornado diagram with variations of the liquid holdup value when the base case belongs to downhill flow. On the left of the chart, under the parameter's sequence number, the base value and its variation range are written

Figure 29: Tornado diagram with variations of the pressure gradient value when the base case belongs to downhill flow. On the left of the chart, under the parameter's sequence number, the base value and its variation range are written

Figure 30: Dependence of pressure gradient on superficial gas velocity for all considered base sets of input parameters obtained by using the proposed method and the Beggs & Brill correlation

At the end of this section we represent graphs with the dependencies of the pressure gradient on the input parameters.

6. Case study

Here we will represent the validation results of the constructed method for pressure drop calculation on field data. Three field data sets with measurements on production wells and pipelines are taken from [51]. These data sets include flow rates, bottomhole and wellhead pressures and temperatures.
Figure 31: Dependence of pressure gradient on superficial liquid velocity for all considered base sets of input parameters obtained by using the proposed method and the Beggs & Brill correlation

Figure 32: Dependence of pressure gradient on pipe diameter for all considered base sets of input parameters obtained by using the proposed method and the Beggs & Brill correlation

Figure 33: Dependence of pressure gradient on pipe inclination angle for all considered base sets of input parameters obtained by using the proposed method and the Beggs & Brill correlation
Figure 35: Bar chart with comparison of ranges of diameter number in laboratory and field data

Figure 36: Bar chart with comparison of ranges of average pressure in laboratory and field data

Figure 37: Bar chart with comparison of ranges of average temperature in laboratory and field data

Figure 38: Bar chart with comparison of ranges of Reynolds number in laboratory and field data
From Figs. 35 and 36, one could see that the ranges of the diameter number and the average pressure in the field and lab data almost do not overlap. In this regard, during the construction of the ML models these parameters should be excluded, because machine learning algorithms cannot be applied outside the training ranges. The ranges of the average temperature parameter (Fig. 37) in the lab and field data sets also only slightly overlap; that is why this feature is not included in the training parameters. However, the temperature has already been utilized during the calculation of density, viscosity and surface tension, so it is not necessary to take it into account once again. Since the ranges of the liquid velocity number and the Reynolds number in the lab and field data only partially overlap, there might be potential errors during application of the constructed model to the field data.

Let us consider each data set in order. The first data set consists of production tests on inclined wells from the Forties field (United Kingdom). The characteristic feature of the produced fluid in this field is its relatively low gas content. Computations show that the flow is single phase along a large part of the flow length in each well. At the same time, in the two-phase region slug and bubble flow regimes are encountered.

In Fig. 39, the cross-plot with the obtained results of the bottomhole pressure calculation is depicted. The proposed method yields a coefficient of determination equal to R2 = 0.806 on this data set. This result is slightly worse compared to the application of the Beggs & Brill correlation (R2 = 0.82). However, the created model performs slightly better on this data set than the following set of correlations and mechanistic models: Mukherjee & Brill correlation (R2 = 0.79), Ansari & Xiao combined model (R2 = 0.66), TUFFP Unified (R2 = 0.795), Leda Flow point model (R2 = 0.793).

Let us move on to the results of the application of the constructed model to the second data set. It consists of production tests on inclined wells from the Ekofisk area (Norway). These wells produce light oil and have a smaller tubing diameter than the ones from the Forties field. In Fig. 40, one could observe the results of applying the constructed method to this data set. The model provides R2 =
0.815, which is better than the models with the Beggs & Brill correlation (R2 = 0.65), the Ansari & Xiao combined model (R2 = 0.78) and TUFFP Unified (R2 = 0.73). However, the score obtained by using the surrogate models is slightly worse than those of the Mukherjee & Brill correlation (R2 = 0.823) and the Leda Flow point model (R2 = 0.821).

The third data set contains measurements on flowlines at the Prudhoe Bay field (USA). These flowlines are nearly horizontal with a small inclination angle. The cross-plot with the calculated outlet pressures on the flowlines is represented in Fig. 41. The obtained coefficient of determination on this data set is relatively high (R2 = 0.99). In turn, the Beggs & Brill correlation gives R2 = 0.98, which is also a very high result but a little worse. In contrast, the other considered methods provide even lower scores: Mukherjee & Brill correlation (R2 = 0.94), Ansari & Xiao combined model (R2 = 0.78), TUFFP Unified (R2 = 0.47), Leda Flow point model (R2 = 0.79).

Figure 39: Cross-plot with the result of the bottomhole pressure calculation on the Forties production wells

Figure 41: Cross-plot with the result of the outlet pressure calculation on the Prudhoe Bay flowlines
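The pressure drop calculation underlying these case studies, as described earlier in the paper, segments the well or pipeline and integrates the predicted pressure gradient over the segments. A minimal sketch of that marching scheme, with a hypothetical constant-gradient callable standing in for the ML surrogate:

```python
# Sketch: segment-wise integration of the pressure gradient along a well.
# `dp_dz` is a stand-in for the surrogate model; in the actual method it
# would depend on local pressure, temperature and flow parameters.
def bottomhole_pressure(p_wellhead, depth, n_segments, dp_dz):
    """March pressure from wellhead to bottomhole over equal segments."""
    dz = depth / n_segments
    p = p_wellhead
    for _ in range(n_segments):
        p += dp_dz(p) * dz   # gradient evaluated at the segment inlet
    return p

# Toy gradient: constant 0.01 MPa/m (illustrative units, not field data).
p_bh = bottomhole_pressure(10.0, 2000.0, 20, lambda p: 0.01)
print(p_bh)
```

With a pressure-dependent gradient, a finer segmentation (larger n_segments) reduces the discretization error of this first-order march.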
7. Conclusions
[18] E.-S. A. Osman, M. A. Ayoub, M. A. Aggour, et al., An artificial neural network model for predicting bottomhole flowing pressure in vertical multiphase flow, in: SPE Middle East Oil and Gas Show and Conference, Society of Petroleum Engineers, 2005.
[19] X. Li, J. Miskimins, B. T. Hoffman, et al., A combined bottom-hole pressure calculation procedure using multiphase correlations and artificial neural network models, in: SPE Annual Technical Conference and Exhibition, Society of Petroleum Engineers, 2014.
[20] E. Kanin, A. Vainshtein, A. Osiptsov, E. Burnaev, The method of calculation the pressure gradient in multiphase flow in the pipe segment based on the machine learning algorithms, in: IOP Conference Series: Earth and Environmental Science, Vol. 193, IOP Publishing, 2018, p. 012028.
[21] M. Belyaev, E. Burnaev, E. Kapushev, M. Panov, P. Prikhodko, D. Vetrov, D. Yarotsky, GTApprox: Surrogate modeling for industrial design, Advances in Engineering Software 102 (2016) 29–39. doi:10.1016/j.advengsoft.2016.09.001.
[22] J. P. Brill, H. K. Mukherjee, Multiphase flow in wells, Vol. 17, Society of Petroleum Engineers, 1999.
[23] H. Duns Jr, N. Ros, et al., Vertical flow of gas and liquid mixtures in wells, in: 6th World Petroleum Congress, 1963.
[24] L. Breiman, Random forests, Machine Learning 45 (1) (2001) 5–32.
[25] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: Machine learning in Python, Journal of Machine Learning Research 12 (2011) 2825–2830.
[26] J. Friedman, Greedy function approximation: A gradient boosting machine, The Annals of Statistics 29 (5) (2001) 1189–1232.
[27] T. Chen, C. Guestrin, XGBoost: A scalable tree boosting system, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, ACM, New York, NY, USA, 2016, pp. 785–794. doi:10.1145/2939672.2939785.
[28] I. Makhotin, D. Koroteev, E. Burnaev, Gradient boosting to boost the efficiency of hydraulic fracturing, arXiv preprint arXiv:1902.02223.
[29] D. I. Ignatov, K. Sinkov, P. Spesivtsev, I. Vrabie, V. Zyuzin, Tree-based ensembles for predicting the bottomhole pressure of oil and gas well flows, in: International Conference on Analysis of Images, Social Networks and Texts, Springer, 2018, pp. 221–233.
[30] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (3) (1995) 273–297.
[31] F. Rosenblatt, The perceptron: a probabilistic model for information storage and organization in the brain, Psychological Review 65 (6) (1958) 386.
[32] E. Burnaev, P. Erofeev, The influence of parameter initialization on the training time and accuracy of a nonlinear regression model, Journal of Communications Technology and Electronics 61 (6) (2016) 646–660. doi:10.1134/S106422691606005X.
[33] M. G. Belyaev, E. V. Burnaev, Approximation of a multidimensional dependency based on a linear expansion in a dictionary of parametric functions, Informatics and its Applications 7 (3) (2013) 114–125.
[34] E. V. Burnaev, P. V. Prikhod'ko, On a method for constructing ensembles of regression models, Automation and Remote Control 74 (10) (2013) 1630–1644. doi:10.1134/S0005117913100044.
[35] F. Chollet, et al., Keras, https://keras.io (2015).
[36] Theano Development Team, Theano: A Python framework for fast computation of mathematical expressions, arXiv e-prints abs/1605.02688. URL http://arxiv.org/abs/1605.02688
[37] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, A. Lerer, Automatic differentiation in PyTorch.
[38] P. Spesivtsev, K. Sinkov, I. Sofronov, A. Zimina, A. Umnov, R. Yarullin, D. Vetrov, Predictive model for bottomhole pressure based on machine learning, Journal of Petroleum Science and Engineering 166 (2018) 825–841.
[39] A. Erofeev, D. Orlov, A. Ryzhov, D. Koroteev, Prediction of porosity and permeability alteration based on machine learning algorithms, Transport in Porous Media (2019) 1–24.
[40] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980.
[41] K. Minami, J. Brill, et al., Liquid holdup in wet-gas pipelines, SPE Production Engineering 2 (01) (1987) 36–44.
[42] G. Abdul-Majeed, Liquid holdup in horizontal two-phase gas–liquid flow, Journal of Petroleum Science and Engineering 15 (2–4) (1996) 271–280.
[43] H. Mukherjee, An experimental study of inclined two-phase flow, Ph.D. thesis, U. of Tulsa (1979).
[44] B. Eaton, The prediction of flow patterns, liquid holdup and pressure losses occurring during continuous two-phase flow in horizontal pipelines, Ph.D. thesis, U. of Texas at Austin (1966).
[45] H. Beggs, An experimental study of two-phase flow in inclined pipes, Ph.D. thesis, U. of Tulsa (1973).
[46] N. Andritsos, Effect of pipe diameter and liquid viscosity on horizontal stratified flow, Ph.D. thesis, U. of Illinois (1986).
[47] E. Burnaev, P. Erofeev, A. Papanov, Influence of resampling on accuracy of imbalanced classification, in: Proc. SPIE, Vol. 9875, 2015. doi:10.1117/12.2228523.
[48] D. Smolyakov, A. Korotin, P. Erofeev, A. Papanov, E. Burnaev, Meta-learning for resampling recommendation systems, 2019.
[49] E. Burnaev, P. Erofeev, D. Smolyakov, Model selection for anomaly detection, in: Proc. SPIE, Vol. 9875, 2015. doi:10.1117/12.2228794.
[50] J. Mandhane, G. Gregory, K. Aziz, A flow pattern map for gas–liquid flow in horizontal pipes, International Journal of Multiphase Flow 1 (4) (1974) 537–553.
[51] H. Asheim, et al., MONA, an accurate two-phase well flow model based on phase slippage, SPE Production Engineering 1 (03) (1986) 221–230.
[52] E. Burnaev, V. Vovk, Efficiency of conformalized ridge regression, in: M. F. Balcan, V. Feldman, C. Szepesvári (Eds.), Proceedings of The 27th Conference on Learning Theory, Vol. 35 of Proceedings of Machine Learning Research, PMLR, Barcelona, Spain, 2014, pp. 605–622.
[53] E. Burnaev, I. Nazarov, Conformalized kernel ridge regression, in: 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), 2016, pp. 45–52. doi:10.1109/ICMLA.2016.0017.