FLARING EVENTS PREDICTION AND PREVENTION THROUGH ADVANCED BIG DATA ANALYTICS AND MACHINE LEARNING ALGORITHMS

M. Giuliani, G. Camarda, M. Montini, L. Cadei, A. Bianco, Eni SpA Upstream and Technical Services;
A. Shokry, P. Baraldi, E. Zio, Politecnico di Milano

This paper was presented at the 14th Offshore Mediterranean Conference and Exhibition in Ravenna, Italy, March 27-29, 2019. It was
selected for presentation by OMC 2019 Programme Committee following review of information contained in the abstract submitted by
the author(s). The Paper as presented at OMC 2019 has not been reviewed by the Programme Committee.

ABSTRACT

This paper reports the development and testing of an advanced workflow for predicting hazardous
events such as flaring, and for providing the main prescriptions needed to monitor, mitigate and
root-cause these issues. Thanks to an innovative approach based on Big Data analytics and machine
learning algorithms, the tool forecasts the onset of dangerous process upsets that can severely
affect normal field operations. The actions prescribed by the algorithm can significantly reduce,
or avoid, flaring events, improving operational safety, production and the overall asset value.
The flaring prediction and prescription tool has been developed as a robust pipeline that
implements four modules: model conceptual design, data processing, feature selection and
extraction, and predictive model development and validation. In the first module, the number of
classes for flaring classification, the prediction horizon and the input time window are set in
order to achieve the best functionality of the tool, considering the physical phenomena to be forecast.
The second module prepares, manages and pre-processes real-time raw data from the field,
discarding non-informative signals and applying linear interpolation and ad hoc filters.
Feature selection identifies the best subset of weak and strong signals, which makes the
prediction algorithm robust and accurate. This diagnostic phase is performed by the
pre-application of an innovative classification algorithm. The last module is the
final development of a tuned and cross-validated classification model based on artificial neural
networks.
The pipeline has been applied to real-time data coming from an operating field in southern
Europe. The robust architecture of the tool makes it possible to overcome several major issues,
such as the lack or poor status of data, the rapid dynamics of the physical phenomena analysed
and the complexity of the flaring network. The tool has been able to identify in advance, and
trace to their root causes, the weak signals that subsequently lead to dangerous
overpressures within the producing system, allowing field engineers to highlight the
operating parameters that have to be modified or managed.
Flaring networks represent the main over-pressure relief system of an upstream treatment plant.
Hence, the implementation of this big data analytics framework can maximize the operational
safety of the plant by predicting hazardous events and prescribing mitigation actions.
Moreover, it helps maximize the asset value by ensuring steady operations and, consequently,
optimum production and the lowest environmental impact.

INTRODUCTION

In the oil and gas industry, the flaring system serves as a network to depressurize units that can
constitute critical over-pressure channels (1). Examples of units where the pressure can build up
are production manifolds, compression systems and process equipment such as knock-out drums,
separators and columns (2). Gas flow rates disposed to the flaring system can result from different
sources, including process relief, process flaring and blow-down (2).
The flaring system is one of the most critical parts of an oil and gas facility. It guarantees the safety
of the plant in case of over-pressure transients that can occur during plant shutdowns, startups or
can be caused by abnormal process conditions or component failures. Furthermore, the proper
design and operation of the flaring system are fundamental from the operational, economic and
environmental points of view.
On the operational side, gas flaring represents a continuous challenge for control and operation
engineers, since its root causes are difficult to detect in time and, consequently, the actions
necessary for its mitigation cannot be performed (3). Gas flaring can therefore interrupt the
production process and lead to the shutdown of the entire plant or of specific production lines.
From the economic point of view, gas flaring causes major economic losses and wastes limited
material and energy resources (4). Reports of the Global Bank and the World Bank estimate that
100 to 150 billion cubic meters of natural gas are flared annually, equivalent to 3.5
to 5% of global gas production (5). In monetary terms, gas flaring in the oil and gas industry is
responsible for an annual loss of $10-15 billion (3). From the environmental point of view,
the massive quantities of flared gas generate very large amounts of CO2 (300 to 400 million tons
of CO2 emissions per year, equivalent to the annual CO2 emissions of 77 million cars). Gas flaring
also contributes significantly to the emission of pollutant precursors and greenhouse gases (6).
Given its criticality, gas flaring is a very active research area addressed by a wide spectrum of
different industrial, research and academic institutions. Most of these studies can be categorized
into four research lines:
1. Modelling and assessment of the emissions caused by the combustion of flared gas (7),
and estimation of the associated environmental and economic losses (8).
2. Development and validation of monitoring techniques for tracking the flared gas volumes on
global scales and over relatively long time periods (e.g. satellite-based monitoring
methods), providing continuous and systematic information on this phenomenon (9).
3. Design of robust and reliable flaring systems able to handle possible process safety
hazards (2). To this aim, engineers typically consider design standards and guidelines,
such as NORSOK and API, and commercial simulation software tools specifically tailored
for modeling flare system behaviors in the oil and gas industry. The research is typically
based on the simulation of different process relief scenarios and on the assessment of their
consequences (2). A limitation of most of the existing approaches is that guidelines and
software concern steady state analysis, without considering transient processes during
which flaring scenarios different from those considered in the design phase may occur.
4. Adaptation and retrofit of the plant design for the recovery of the flared gas (10). Here,
the solutions considered are based on the addition of new units and connections for
sending the recovered gas to the national distribution network. These studies integrate the
flared gas conditions and scenarios in the model used for the plant design optimization
(11). Given the economic and practical obstacles to plant design retrofit and updating,
most of these studies are carried out at a theoretical level and do not consider the critical
trade-off between the benefits of flared gas recovery, the associated capital cost
and the effects on the functionality and productivity of the facility.
To the best of our knowledge, no work has yet addressed gas flaring from the standpoint of online
process monitoring and optimization, despite the relevance of this topic in the oil and gas industry.
The final goal of the project is the development of a framework for the online monitoring and
supervision of the flaring system in oil and gas facilities. The framework is also expected to assist
plant operators and engineers in conducting root cause analyses to identify the facility
units most responsible for inducing flaring events, and in defining possible corrective actions
or recommendations for flaring event mitigation.
The objective of the current deliverable is to propose a methodological framework for predicting
the occurrence of critical flaring events using plant monitoring data, starting from one treatment
line and then scaling up the optimal solution.

PROBLEM FORMULATION

The online monitoring and supervision of the flaring system require the development of a model for
the prediction of the discharged gas flow rate entering the flare stack, whose value at time 𝑡 will be
referred to as 𝑦(𝑡) (Figure 1). In this context, the objective of the present work is the development
of a model 𝑓 which, at the present time 𝑡, receives as input the time series of the values of 𝑘
monitored plant signals from an initial time 𝑡0, 𝑥⃗ (𝑡,...,𝑡0)=[𝑥⃗1(𝑡,...,𝑡0),𝑥⃗2(𝑡,...,𝑡0),…,𝑥⃗𝑘(𝑡,...,𝑡0)], and
provides as output the prediction of 𝑦 at the future time 𝑡+Δ𝑡𝑝, Δ𝑡𝑝>0:

$$y(t+\Delta t_p) = f\big(\vec{x}(t,\dots,t_0)\big)$$

Examples of typically monitored plant signals are pressures, fluid levels, temperatures and flow
rates.
In this work, the following assumptions are made:
1. The flaring event can be well characterized by comparing the signal 𝑦(𝑡)∈ℜ with specific
threshold values 𝕐=[𝕪1,𝕪2,…𝕪𝑑] to generate a linguistic description of it:

2. The future value of the flaring flow rate signal 𝑦(𝑡+Δ𝑡𝑝) is not influenced by the entire
past history of the plant signals, but it depends on the plant signal values over a limited
time interval (𝑡−Δ𝑡𝑑,𝑡).
Therefore, the objective of the present work becomes the development of the model 𝑓𝑐 which
receives as input 𝑥⃗ (𝑡,...,𝑡−Δ𝑡𝑑)=[𝑥⃗1(𝑡,...,𝑡−Δ𝑡𝑑),𝑥⃗2(𝑡,...,𝑡−Δ𝑡𝑑),…,𝑥⃗𝑘(𝑡,...,𝑡−Δ𝑡𝑑)] and provides as
output the prediction of the intensity of the flaring event, represented by the categorical signal
𝑦𝑐(𝑡+Δ𝑡𝑝):

$$y_c(t+\Delta t_p) = f_c\big(\vec{x}(t,\dots,t-\Delta t_d)\big)$$

Notice that in practical situations the model prediction horizon, Δ𝑡𝑝, and the input time window,
Δ𝑡𝑑, are multiples of the sampling period Δ𝑡𝑠, i.e. Δ𝑡𝑝=𝑘𝑝∙Δ𝑡𝑠 and Δ𝑡𝑑=𝑘𝑑∙Δ𝑡𝑠, where 𝑘𝑝,𝑘𝑑 are
positive integers.
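
As an illustration of the first assumption, the mapping from the flow rate 𝑦(𝑡) to a class label can be sketched as below. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, the classes are simply numbered 0..𝑑, and the convention that a value exactly equal to a threshold belongs to the upper class is an assumption, since the text does not state how ties are handled.

```python
from bisect import bisect_right


def flaring_class(flow_rate, thresholds):
    """Return the class index (0..d) of one flow-rate value, given d thresholds.

    Assumption: a value exactly equal to a threshold falls in the upper class.
    """
    # bisect_right counts how many thresholds are <= flow_rate
    return bisect_right(sorted(thresholds), flow_rate)


def label_series(flow_rates, thresholds):
    """Turn a flow-rate time series y(t) into the categorical signal y_c(t)."""
    return [flaring_class(value, thresholds) for value in flow_rates]


if __name__ == "__main__":
    # One threshold (d = 1) gives a binary label: 0 = no flaring, 1 = flaring
    print(label_series([0.0, 199.9, 200.0, 350.0], [200.0]))
```

With a single threshold the result is a binary signal; adding thresholds yields one extra class per threshold without changing the code.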

Figure 1 - Schematic representation of the addressed problem

METHODOLOGICAL FRAMEWORK

This Section illustrates the four modules constituting the proposed framework for the prediction of
flaring events:
1. model conceptual design;
2. data preprocessing;
3. features selection and extraction;
4. predictive model development and validation;

1. Model Conceptual Design

This module is dedicated to the configuration of the predictive model, which requires setting:
• the number of classes, 𝑑+1, 𝑑=1,2,…, used for the classification of the flaring event and the
corresponding thresholds 𝕐=[𝕪1,𝕪2,…𝕪𝑑];
• the prediction horizon Δ𝑡𝑝;
• the input time window Δ𝑡𝑑.
The number of model output classes, 𝑑+1, mainly depends on the required model functionality.
Setting the thresholds 𝕐 requires considering several factors, such as operational requirements,
statistical characteristics of the resulting input-output data and of the measurement noise affecting
the flaring flow rate sensor.

Figure 2 - Graphical representation of the prediction horizon Δ𝑡𝑝 and the input time window
Δ𝑡𝑑.
The model prediction horizon, Δ𝑡𝑝, is set by considering an optimal trade-off between the
operational requirement of having sufficient time for performing corrective or preventive actions
and the physical dynamic behavior of the system.
The setting of the input time window, Δ𝑡𝑑, must take into account the need to include
sufficient information on the process history for modeling purposes while, at the same time,
avoiding an unnecessary increase in the number of model inputs. The use of cross-correlation
techniques (12) and of expert knowledge about the approximate time needed for the pressures to build up
inside the units and to propagate into the flaring network can help in setting Δ𝑡𝑑 and Δ𝑡𝑝.
Nevertheless, the specification of a proper time window is typically a complex task, given the
different behavior of each signal 𝑥⃗𝑖(𝑡),𝑖=1,...,𝑘 with respect to the output 𝑦(𝑡+Δ𝑡𝑝), and the different
operational and environmental conditions experienced by the facility units.
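
The cross-correlation idea mentioned above can be sketched as a scan over candidate delays between one plant signal and the flaring flow rate. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, the Pearson coefficient is used as the correlation measure, and both series are assumed to be equally long, complete (no missing values) and sampled at the same period.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_a * var_b)


def lagged_correlations(x, y, max_lag):
    """Correlation between x(t) and y(t + lag) for lag = 0..max_lag (in samples)."""
    return {lag: pearson(x[:len(x) - lag], y[lag:]) for lag in range(max_lag + 1)}


def best_lag(x, y, max_lag):
    """Lag (in samples) at which |correlation| between x and the future y is largest."""
    corr = lagged_correlations(x, y, max_lag)
    return max(corr, key=lambda lag: abs(corr[lag]))


if __name__ == "__main__":
    x = [0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 3, 0, 0, 0, 0]
    y = [0, 0, 0] + x[:-3]  # synthetic output: x delayed by 3 samples
    print(best_lag(x, y, max_lag=6))
```

Multiplying the selected lag by the sampling period gives a rough indication of the delay between a signal and the flare response, which can then be compared with expert estimates.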

2. Data Preparation

This module has the objectives of discarding non-informative signals, such as signals with a
constant value in all the available measurements, and properly managing the missing data and the
NAN (Not-A-Number) values, which are typically present in industrial datasets (13). For
the latter task, we have used a linear interpolation technique in which the missing or NAN values are
replaced by values obtained by linear interpolation between the previous and the
next available values. Other techniques for treating missing and NAN data, such as
those based on cubic and spline interpolation, have not been employed in this work,
since their application requires the setting of parameters, with the associated risk of forcing a
specific behavior onto the data (14).
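
The two operations described above, discarding constant signals and filling gaps by linear interpolation, can be sketched as follows. This is a minimal sketch rather than the authors' code: the function names are hypothetical, missing samples are assumed to be encoded as NaN or None, and gaps at the very start or end of a series (where only one neighbour exists) are filled with the nearest available value, which is an assumption not covered by the text.

```python
from math import isnan


def _missing(value):
    return value is None or isnan(value)


def is_constant(signal):
    """True if all available (non-missing) samples share a single value."""
    return len({v for v in signal if not _missing(v)}) <= 1


def interpolate_missing(signal):
    """Fill missing samples by linear interpolation between their neighbours."""
    out = [float("nan") if v is None else float(v) for v in signal]
    known = [i for i, v in enumerate(out) if not isnan(v)]
    if not known:
        return out  # nothing to interpolate from
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    # Assumption: edge gaps are filled with the nearest available value
    for i in range(known[0]):
        out[i] = out[known[0]]
    for i in range(known[-1] + 1, len(out)):
        out[i] = out[known[-1]]
    return out


if __name__ == "__main__":
    nan = float("nan")
    print(interpolate_missing([1.0, nan, nan, 4.0]))
    print(is_constant([5.0, 5.0, nan]))
```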
Assuming that the signals 𝑥⃗𝑖(𝑡), 𝑖=1,…,𝑘 and 𝑦𝑐(𝑡) are available for a consecutive period of time of
length 𝑛∙Δ𝑡𝑠, the available data are rearranged into input-output patterns. The inputs are organized
in the input data matrix [X] of dimension (𝑛−𝑘𝑝−𝑘𝑑)×(𝑘×(𝑘𝑑+1)) containing the signal values, and
the outputs in the column vector [Yc] of dimension (𝑛−𝑘𝑝−𝑘𝑑)×1 containing the class labels.
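
The rearrangement into input-output patterns can be sketched as a sliding window. This is an illustrative sketch with hypothetical names; it follows the general dimensions given above, i.e. each pattern holds the 𝑘𝑑+1 samples from 𝑡−Δ𝑡𝑑 to 𝑡 of every signal and is paired with the label 𝑘𝑝 steps ahead. Counting exactly 𝑘𝑑 samples per signal instead would only change the slice bounds.

```python
def build_patterns(signals, labels, k_d, k_p):
    """Arrange k signals and a label series into windowed input-output patterns.

    signals: list of k lists, each with n samples
    labels:  list of n class labels y_c(t)
    Returns (X, Y): X has n - k_p - k_d rows of k * (k_d + 1) values,
    Y holds the label observed k_p steps after the end of each window.
    """
    n = len(labels)
    X, Y = [], []
    for t in range(k_d, n - k_p):
        row = []
        for signal in signals:
            row.extend(signal[t - k_d:t + 1])  # samples t-k_d .. t of this signal
        X.append(row)
        Y.append(labels[t + k_p])
    return X, Y


if __name__ == "__main__":
    signals = [list(range(10)), list(range(100, 110))]
    labels = list(range(10))
    X, Y = build_patterns(signals, labels, k_d=2, k_p=3)
    print(len(X), len(X[0]), X[0], Y[0])
```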

3. Feature Selection and/or Extraction

In data-driven modeling, feature selection is a step commonly performed when the number of the
possible input signals of the model is large. Feature selection aims at identifying the subsets of the
measured signals providing the most satisfactory classification performance. In practical scenarios,
at least three reasons call for a reduction in the number of features:
1. irrelevant, non-informative features result in a classification model which is not robust;
2. when the model handles many features, a large number of observation data is required to
properly span the high-dimensional feature space for accurate multivariable interpolation;
3. by eliminating unimportant sensors, the cost and time of collecting data can be reduced (15).
Feature selection methods are typically classified into two categories: filter and wrapper
methods (16). In this work, a filter method based on the RELIEF feature ranking technique is used
(17), given its proven efficiency and its relatively low computational requirements.
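
To make the ranking idea concrete, a basic Relief-style weight update for a binary problem can be sketched as below. This is not necessarily the RELIEF variant used in this work: the function name, the Manhattan distance, the normalisation of each feature by its range and the number of sampled patterns are all assumptions made for illustration. A feature gains weight when it differs on the nearest pattern of the opposite class (nearest miss) and loses weight when it differs on the nearest pattern of the same class (nearest hit).

```python
import random


def relief_weights(X, y, n_iterations=None, seed=0):
    """Relief-style feature weights for a binary classification dataset.

    X: list of feature rows, y: list of class labels.
    Assumptions: numeric features, range-normalised absolute differences.
    """
    n, m = len(X), len(X[0])
    spans = []
    for j in range(m):
        column = [row[j] for row in X]
        span = max(column) - min(column)
        spans.append(span if span > 0 else 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def distance(a, b):
        return sum(diff(j, a, b) for j in range(m))

    rng = random.Random(seed)
    n_iterations = n_iterations or n
    weights = [0.0] * m
    for _ in range(n_iterations):
        i = rng.randrange(n)
        hit, miss = None, None
        hit_d, miss_d = float("inf"), float("inf")
        for other in range(n):
            if other == i:
                continue
            d = distance(X[i], X[other])
            if y[other] == y[i] and d < hit_d:
                hit, hit_d = other, d
            elif y[other] != y[i] and d < miss_d:
                miss, miss_d = other, d
        if hit is None or miss is None:
            continue  # a class with a single pattern gives no update
        for j in range(m):
            weights[j] += (diff(j, X[i], X[miss]) - diff(j, X[i], X[hit])) / n_iterations
    return weights


if __name__ == "__main__":
    # Synthetic data: feature 0 separates the classes, feature 1 is unrelated
    X = [[(i % 2) + 0.01 * (i % 5), ((i * 7) % 11) / 10] for i in range(40)]
    y = [i % 2 for i in range(40)]
    print(relief_weights(X, y))
```

Sorting the features by decreasing weight gives the ranking from which the top features are retained.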
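The outcome of this module is the reduced matrix [𝑿]∗ formed by a reduced number 𝑘∗ of signals
selected from the original set of (𝑘×(𝑘𝑑+1)) signals. Notice that the extracted features 𝑥⃗*(𝑡) can
contain only a fraction of the values collected from a given signal in its time window:

$$\vec{x}^{*}(t) = \left[ \vec{x}^{*}_{i_1}(\tau_{i_1}),\; \vec{x}^{*}_{i_2}(\tau_{i_2}),\; \dots,\; \vec{x}^{*}_{i_{k^*}}(\tau_{i_{k^*}}) \right], \qquad i \in [1,\dots,k], \quad \tau_i \in (t-\Delta t_d,\, t)$$
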
Feature extraction achieves the same objective as feature selection by properly combining
the original signals 𝑥⃗ (𝑡,...,𝑡−Δ𝑡𝑑)=[𝑥⃗1(𝑡,...,𝑡−Δ𝑡𝑑),𝑥⃗2(𝑡,...,𝑡−Δ𝑡𝑑),…𝑥⃗𝑘(𝑡,...,𝑡−Δ𝑡𝑑)]. It is typically
based on the projection of the original signal space into a new feature space. In this
deliverable, we consider the use of Autoencoders (18) for feature extraction, given their
ability to perform non-linear dimensionality reduction.

4. Prediction Model Training and Validation

The objective of this module is the development of the classification model:

$$y_c(t+\Delta t_p) = f_c\big(\vec{x}^{*}(t)\big)$$

mapping the underlying relation between the 𝑘∗ selected or extracted features
𝑥⃗*(𝑡)=[𝑥⃗𝑖1∗(𝜏𝑖1),𝑥⃗𝑖2∗(𝜏𝑖2),…,𝑥⃗𝑖𝑘∗(𝜏𝑖𝑘∗)] with 𝑖∈[1,…,𝑘] and 𝜏𝑖∈(𝑡−Δ𝑡𝑑,𝑡), and the flaring event future
state 𝑦𝑐(𝑡+Δ𝑡𝑝).
Among the various empirical classification techniques that have been developed (19), we consider:
shallow Artificial Neural Networks (ANNs) (20), Support Vector Machines (21) and Deep Artificial
Neural Networks (DANNs) (22). Developing several classifiers allows their performances to be
compared on the specific classification problem addressed in the present work. Since each
technique is based on different statistical and mathematical principles, it is expected to extract
knowledge from the data in different ways and to provide different classification performances.

CASE STUDY & DATA

The oil and gas producing field considered in the current project is located onshore in southern
Europe. The development strategy for the exploitation of the reservoir formations discovered in that
area led to the construction of an oil and gas treatment plant, starting at the end of the 20th
century. Currently, the central process facility (CPF) includes five production lines (trains),
built in separate phases to treat the multiphase flow coming from the wells (Figure 3).
The multiphase flow comes from several producing wells and consists of three main
phases: gas, oil and water. The composition of the oil differs according to the formation
in which the wells have been drilled. This oil has different characteristics from that of other
concessions on the European continent; in particular, it has a different content of:
• H2S – in a range of 0.5-1.5 %mol;
• CO2 – in a range of 5-30 %mol.
The final purpose of the plant is to produce stabilized oil, treated gas and liquid sulphur, which are
sold under strict sale specifications.

Figure 3 - The simplified process block diagram of the field considered


The flaring system of the field considered in this project consists of the following three flares:
the high pressure acid flare, the high pressure sweet flare and the low pressure acid flare. The
three flares receive the discharged gas from five relief gas headers/collectors, which collect the
discharged gas from different stages and treatment units. The attention is focused on the high
pressure flare system, also called BA, which is the relief system related to the most critical
upset events.
The data currently available for the project include the flow rate signal of the flared gas conveyed
to the BA system during 7 years of operation. This project is a proof of concept for the
application of an advanced framework to this type of issue; thus, only one treatment line of the field
has been analyzed in order to reduce the problem complexity. Hence, the measurements of 191
plant signals collected from one line during the same time frame are also available. These signals
have been measured in different stages of the process: 98 signals from the separation stage, 71
signals from the crude stabilization stage and 22 signals from the gas compression stage.
Since the values of most of the signals are missing over long time periods due to operating issues
and plant shutdowns, in the remaining part of this deliverable we consider a significant time interval
within the entire time domain, during which 12995 values are available for each signal.
Notice that, although the high pressure acid flare receives gas disposals from the separation,
stabilization, compression and absorption stages of all five treatment lines, the analysis in the
preliminary deliverable presented in this paper considers only the signals associated with one
treatment line.

FRAMEWORK APPLICATION TO THE CASE STUDY

Model conceptual design

With respect to the number of classes, 𝑑+1, used to characterize the state of the discharged gas flow
rate entering the flare stack, this has been set to 2. Therefore, the objective is to develop a binary
classifier with the following states:
• “0” - ‘no occurrence of the flaring event’ or ‘negative class’
• “1” - ‘occurrence of the flaring event’ or ‘positive class’
Considering practical operational requirements for the management and control of flaring events,
the prediction horizon, Δ𝑡𝑝, of the model is set to 2 ℎ, which corresponds to 𝑘𝑝=12, since the data
are sampled at discrete time intervals, Δ𝑡𝑠=10 min.
Taking into account the physical nature of the input signals (pressures, temperatures, oil/water
levels and flow rates) and a rough estimate of the time dependences between the input signals
and the flaring flow rate, the input time window, Δ𝑡𝑑, is set to 1 ℎ, which corresponds to 𝑘𝑑=6.
In order to set a proper threshold value, 𝕪1, the occurrence of flaring events has been defined
considering several factors, such as the operational requirements, the statistical characteristics of
the resulting input-output data and the noise of the flaring flow rate sensor.

Figure 4 - Distribution of flaring flow rate values and corresponding empirical cumulative
distributions

Figure 4 shows the distribution of the flaring flow rate values and the corresponding cumulative
distribution, which provides the number of positive patterns over the total number of patterns as a
function of the threshold 𝕪1. Notice that threshold values between 150 and 200 provide a fraction
of positive events that matches the operators' experience well (53.0% of positive events if 𝕪1=150
and 37.7% if 𝕪1=200), whereas the occurrence of flaring events for, say, 73% of the operational
time may be considered excessive, and for 18.6% too low.

Figure 5 - Effect of the threshold value to the class-imbalance

Figure 5 shows the variation of the class imbalance ratio, defined as the number of patterns above the
threshold (positive events) divided by the number of patterns below the threshold (negative events),
for threshold values of 100, 150, 200, 250, 300 and 350. Notice that threshold
values between 150 and 200 generate manageable imbalance ratios for the development of an
empirical classification model, which, on the contrary, can become a challenging task in the case of
more extreme imbalance ratios.
The analysis of the time evolution of the flaring flow rate signal of the high pressure acid flare
against the six considered thresholds, during a representative period of days, shows that threshold
values between 150 and 200 avoid positive events caused by sensor measurement
noise and generate class labels that do not oscillate continuously. Therefore, in agreement with the
production engineers, the threshold value has been initially set to 200, which produces 4894
positive events, 8101 negative events and an imbalance ratio of 0.604 over the whole period considered.
Finally, it is important to highlight that the signal sampling period (10 min in this work) affects the
outcomes and conclusions of these analyses.
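
The positive fraction and the imbalance ratio used in this threshold analysis can be computed as sketched below. The function name is hypothetical, and counting a value exactly equal to the threshold as positive is an assumption. The example series is a synthetic placeholder built only to reproduce the counts reported above for a threshold of 200 (4894 positive and 8101 negative patterns); it is not plant data.

```python
def class_balance(flow_rates, threshold):
    """Count positive/negative patterns and derive the imbalance ratio.

    Assumption: values at or above the threshold are counted as positive.
    """
    positives = sum(1 for value in flow_rates if value >= threshold)
    negatives = len(flow_rates) - positives
    return {
        "positives": positives,
        "negatives": negatives,
        "positive_fraction": positives / len(flow_rates),
        "imbalance_ratio": positives / negatives if negatives else float("inf"),
    }


if __name__ == "__main__":
    # Synthetic placeholder series with the same class counts as in the text
    flow = [250.0] * 4894 + [50.0] * 8101
    stats = class_balance(flow, threshold=200.0)
    print(round(stats["positive_fraction"], 3), round(stats["imbalance_ratio"], 3))
```

Repeating the call for each candidate threshold gives the kind of curve shown in Figure 5.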

Data Preparation

In the data preparation step, 14 signals out of the available 191 are discarded, given that they show
constant values over the selected time interval and therefore do not provide any useful
knowledge to the model. The ‘NAN’ values in the remaining 177 signals have been replaced by
applying a linear interpolation between the previous and next available values. Then, the
plant monitored signals, 𝑥⃗ , and the flaring class signal, 𝑦𝑐, are arranged into input-output data
matrices considering the model prediction horizon (12 steps ahead) and the input time window (6
steps back). The resulting input matrix [𝑋] is of size 12977×1062, where 12977 is the number of
available complete patterns and 1062 is the number of features resulting from the multiplication of the
177 signals by the 6-step time window. The output vector [𝑌𝑐] is a binary column vector made of
12977 ‘0’ or ‘1’ labels.
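
The sizes quoted in this step can be checked with the following arithmetic, which uses only figures stated in the text (191 signals of which 14 are constant, 𝑛 = 12995 values per signal, 𝑘𝑝 = 12, 𝑘𝑑 = 6):

```latex
\begin{aligned}
\text{retained signals} &= 191 - 14 = 177 \\
\text{patterns} &= n - k_p - k_d = 12995 - 12 - 6 = 12977 \\
\text{features} &= 177 \times 6 = 1062
\end{aligned}
```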

Feature Selection and Extraction

Given the large number of the possible input features (1062), a feature selection step is necessary
to enable the construction of an accurate classification model. To this purpose, the RELIEF
algorithm is used to rank their importance. Figure 6 shows the weight assigned by the algorithm to
all the features, which represents a measure of the feature importance with respect to the binary
output (occurrence or non-occurrence of the flaring event).

Figure 6 - Features ranking according to their weights
Considering our experience in similar classification problems, we have used as input of the
classification model up to 200 of the top ranked features, which are all characterized by non-
negligible weights larger than 0.01. Notice that, in general, low numbers of features allow building
models easily interpretable and characterized by small computational efforts, but with, possibly,
unsatisfactory classification performance, since they can exclude signals containing useful
information. On the contrary, if the number of the top ranked features used as input of the model is
unnecessarily increased, the computational efforts also increase and the classification
performance decreases due to the presence of irrelevant, non-informative features which reduce
the model robustness and due to the lack of enough data to properly span the high-dimensional
input feature space.
The tags of the 50 top ranked features, with the corresponding engineering interpretation, the selected
delays, 𝜏𝑖, and the RELIEF ranking, have been analyzed. It can be noticed that:
1) The 50 top ranked features have been derived from 12 signals: RELIEF has
selected the same signals multiple times with different delays;
2) The 50 top ranked features are mainly related to the separation stage;
3) In several cases, the larger the signal delay, the better the feature ranking.

Table 1 – Correlation coefficients matrix among flaring flow rate signal and the 12 features

With respect to observations 1) and 2), it seems worthwhile to share the feature selection results
with the plant production and process operation engineers, in order to further interpret and validate
them. From the numerical point of view, Table 1 shows the correlation coefficients between the
flaring flow rate signal and the 12 selected features. To further investigate observation 3) above,
we have performed an analysis of the cross-correlation between the flaring flow rate signal and the
top ranked 12 features considering longer time intervals.
This further analysis confirms that, for some signals, the larger the delay, the more correlated
the signal is with the flaring flow rate. The fact that the RELIEF top ranked features
include the same signals with different delays can indicate that: a) the RELIEF algorithm is not able
to discard redundant features, b) the dynamic behavior of the signal, i.e. its variation over time,
contains important information for the prediction of flaring events.

Model training and test

This Section illustrates the development of empirical models for the prediction of the flaring events
at the high pressure acid flare. Different empirical modelling approaches are considered for the
construction of the classifiers, which associate the features selected in the previous paragraphs with
the occurrence or non-occurrence of the flaring events, defined using the threshold value. As
mentioned, shallow Artificial Neural Networks, Support Vector Machines
and Deep Artificial Neural Networks have been considered.
The model architecture and hyperparameters, such as the number of layers and the number of neurons in
each layer of the ANNs and DANNs and the kernel function type of the SVM, have been set according
to the authors' experience (LASAR group of Politecnico di Milano) and by following a trial-and-error
approach. In this phase of the model development, the candidate classification models are
trained with the first 2/3 of the input-output data and tested using the remaining 1/3. A sequential
division between the training and test data has been preferred to a random division to better mimic
a real situation in which the model is going to be employed for time series predictions.
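
The sequential division described above can be sketched as follows. The function name and its parameters are hypothetical; integer arithmetic is used for the cut point so that the 2/3 fraction is not affected by floating-point rounding.

```python
def sequential_split(X, Y, train_parts=2, total_parts=3):
    """Split time-ordered patterns into a leading training set and a trailing test set.

    No shuffling is applied, so every test pattern is later in time than
    every training pattern.
    """
    cut = len(X) * train_parts // total_parts
    return (X[:cut], Y[:cut]), (X[cut:], Y[cut:])


if __name__ == "__main__":
    X = [[t] for t in range(9)]
    Y = [t % 2 for t in range(9)]
    (X_train, Y_train), (X_test, Y_test) = sequential_split(X, Y)
    print(len(X_train), len(X_test))
```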
With respect to the estimation of the performance of an empirical classification model, we have
considered the following performance metrics (TP, FN, FP and TN indicate the numbers of true
positive, false negative, false positive and true negative test patterns, respectively):
1) 𝑅𝑒𝑐𝑎𝑙𝑙 (𝑇𝑟𝑢𝑒 𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒 𝑟𝑎𝑡𝑒) = 𝑇𝑃/(𝑇𝑃 + 𝐹𝑁): this metric measures the rate with which the
classifier is able to correctly predict the occurrence of flaring events. Therefore, it indicates
the ability of the classifier to trigger real alarms. 1 − 𝑅𝑒𝑐𝑎𝑙𝑙 is the missed alarm rate.
2) 𝑆𝑝𝑒𝑐𝑖𝑓𝑖𝑐𝑖𝑡𝑦 (𝑇𝑟𝑢𝑒 𝑛𝑒𝑔𝑎𝑡𝑖𝑣𝑒 𝑟𝑎𝑡𝑒) = 𝑇𝑁/(𝑇𝑁 + 𝐹𝑃): This metric measures the ability of the
classifier to correctly predict the non-occurrence of the flaring event. Therefore, it indicates
the ability of the classifier to avoid false alarms, being 1 − 𝑆𝑝𝑒𝑐𝑖𝑓𝑖𝑐𝑖𝑡𝑦 the false alarm rate.

3) 𝐹1-𝑠𝑐𝑜𝑟𝑒 = 2 ∙ 𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 ∙ 𝑅𝑒𝑐𝑎𝑙𝑙/(𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 + 𝑅𝑒𝑐𝑎𝑙𝑙): this metric estimates the accuracy of a
binary classifier by balancing recall and precision; the latter is defined as 𝑇𝑃/(𝑇𝑃 + 𝐹𝑃) and
indicates the fraction of patterns classified as positive that are actually positive.
Once the best performing architecture and model hyperparameters
have been identified for each one of the three considered approaches (ANNs, SVM and DANNs),
their performances are further investigated and compared by considering different 10-fold cross
validation procedures, in which the data are assigned to the folds using different strategies.
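
The performance metrics defined above can be computed directly from the four confusion-matrix counts, as in the sketch below. The function name and the convention of returning 0.0 when a denominator is zero are assumptions; the counts in the usage example are arbitrary illustrative numbers, not results from this work.

```python
def classification_metrics(tp, fn, fp, tn):
    """Recall, specificity, precision and F1-score from confusion-matrix counts."""
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "recall": recall,                        # 1 - recall is the missed alarm rate
        "specificity": specificity,              # 1 - specificity is the false alarm rate
        "precision": precision,
        "f1": f1,
    }


if __name__ == "__main__":
    # Arbitrary illustrative counts
    print(classification_metrics(tp=8, fn=2, fp=4, tn=6))
```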

Comparison of the results and Cross Validation

ANN
Artificial Neural Networks have been commonly used for building classification models in various
engineering fields. ANNs have been shown to be able to capture complex input-output mappings thanks to
their universal approximation capabilities and the flexibility of their structures (23). In this work, the
Matlab Artificial Neural Networks toolbox is used to build ANN-based classifiers. The function
“patternnet” is used to build an ANN with specific numbers of hidden layers and neurons, and the
function “trainscg”, based on the “Scaled Conjugate Gradient” algorithm, is used to train the
network. Different ANN architectures, characterized by different numbers of layers, neurons in each
layer and subsets of the RELIEF top ranked features, have been considered.
According to field engineers, the chosen model should, first of all, avoid missed
predictions of flaring event occurrence and, then, not exceed a tolerable fraction of false flaring
event predictions. This can be obtained by maximizing the recall (i.e. minimizing the
missed-prediction rate) while keeping the specificity relatively high. On this basis, the most
satisfactory ANN model is ANN3, characterized by a recall of 66.3% and a specificity of 81.8%. It has a
3-layer architecture and a number of input features between 60 and 80. Figure 7 shows the
confusion matrix of model ANN3.

Figure 7 - Confusion matrix based on model ANN3

SVMs
Support Vector Machines (SVMs) are margin-based pattern recognition techniques, initially
developed by Vapnik (1995). SVMs have gained popularity as a reliable and efficient tool for
detection and diagnosis purposes in a wide range of studies (24). SVMs establish the classification
space based on the maximization of the margin between the training patterns and the decision
boundaries. As for the ANN models, different sets of top ranked features and
different model settings, characterized by different combinations of kernel types and allowed
percentages of outlier samples, have been considered. In this work, the function “fitcsvm” of the
Matlab Statistics and Machine Learning Toolbox is used to train the SVM models. The obtained
SVM-based classifiers have performances in the following ranges: recall = [60.1%, 95.9%], specificity
= [16.5%, 91.1%] and F1-score = [64.4%, 76.0%]. According to the criteria illustrated in the model test
paragraph, model SVM6, whose confusion matrix is given in Figure 8, is the most satisfactory.

Figure 8 - Confusion matrix based on model SVM6


We can conclude that the SVM models provide larger recall values than the
ANNs, but their capability of correctly classifying the non-occurrence of the flaring events (true
negative rate or specificity) is poor.

DANNs
Deep learning aims at extracting hierarchical representations from input data by building “deep”
neural networks with multiple layers of non-linear transformations. It has been successfully applied
in various areas thanks to its capability of discovering intricate hidden structures in high-dimensional
data. Recently, Deep Artificial Neural Networks (DANNs), based on the use of Autoencoders, have
been applied with success to Prognostic Health Monitoring (PHM) problems (25) (Figure 9).

Figure 9 - Typical Autoencoder Layout

Figure 10 - Confusion matrix based on model DANN4

A stacked Autoencoders-based DANN is characterized by a deep architecture of consecutive
layers constructed via two steps: pre-training and fine-tuning (26). In this work, the Matlab ANNs
toolbox is used: the function “trainAutoencoder” is used to construct and train the different
Autoencoders, the function “trainSoftmaxLayer” is employed to construct and train a classification
layer, the function “encode” is used to extract features from each Autoencoder, the function
“stack” is used to stack all the layers and, finally, the entire system is fine-tuned using the function
“train”. Different sets of top-ranked features and different model hyperparameter settings,
characterized by different numbers of Autoencoders, neurons per Autoencoder and parameter
values (e.g. “SparsityRegularization” (SR), “SparsityProportion” (SP) and
“L2WeightRegularization” (L2WR)), have been considered. The most satisfactory DANN model is
DANN4, thanks to an acceptable trade-off between recall and specificity (Figure 10).
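
The three parameters SR, SP and L2WR enter the cost function minimized during the pre-training of each sparse Autoencoder. A sketch of that cost, in the form commonly used for sparse Autoencoders, is given below; the symbols N, K, D, x, w and the hatted quantities are notation introduced here for illustration and do not appear in the original text.

```latex
E = \frac{1}{N}\sum_{n=1}^{N}\sum_{k=1}^{K}\left(x_{kn}-\hat{x}_{kn}\right)^{2}
    + \lambda\,\Omega_{\text{weights}} + \beta\,\Omega_{\text{sparsity}},
\qquad
\Omega_{\text{weights}} = \frac{1}{2}\sum_{l}\sum_{j}\sum_{i}\left(w^{(l)}_{ji}\right)^{2},
\qquad
\Omega_{\text{sparsity}} = \sum_{i=1}^{D}\left[\rho\log\frac{\rho}{\hat{\rho}_i}
    + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_i}\right]
```

In this notation λ plays the role of L2WR, β that of SR and ρ that of SP, while \hat{\rho}_i is the average activation of hidden neuron i over the N training patterns. The first term is the reconstruction error, the second penalizes large weights, and the third (a Kullback-Leibler divergence) pushes each hidden neuron to be active only on a fraction ρ of the patterns.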

Comparison
Table 2 compares the performance measures of the models ANN3, SVM6 and DANN4. Notice that
DANN4 achieves the best trade-off between recall and specificity, which provides
a prediction model with a good ability to detect the occurrence of flaring events (recall of 71.4%) with a
tolerable fraction of false alarms (18.4% = 100% - 81.6%). Although SVM6 provides a significantly
larger recall value than ANN3 and DANN4, its capability of correctly classifying
the non-occurrence of flaring events (true negative rate or specificity) is too poor.
Table 2 - Comparison of the different models developed

Figure 11 presents a visual comparison of the flaring event predictions provided by models ANN3,
SVM6 and DANN4 over a consecutive part of the test set. Notice that the tendency of the SVM
model to predict the positive class when the true class is the negative one is confirmed.
Furthermore, the results show that flaring events taking place over relatively short
periods of time are difficult to predict for all the models.

Figure 11 - Flaring event predictions by the 3 different models

Moreover, three different 10-fold cross-validation procedures are applied to models ANN3, SVM6
and DANN4 to further assess their performance and robustness. The procedures differ in the
way in which the data are assigned to the 10 folds.
1) In Procedure 1, the patterns are randomly assigned to the 10 folds and, then, 9 folds are
used for model training and one for the computation of the performance metrics. The final value of
a performance metric is the average of the 10 values obtained in the 10 cross-validations,
performed by iteratively varying the training and validation folds. The random assignment of
the data to the folds yields homogeneous training and validation sets, both
characterized by patterns extracted from the entire time domain and with very similar
imbalance ratios. Notice (Table 3) that all the performance metrics are more satisfactory
than those which would be obtained by random assignment of the class labels (tossing a
coin), except for the SVM6 specificity. Notice also that DANN4 provides slightly more
satisfactory performance than ANN3 in terms of recall and F1-score.
2) The second cross-validation procedure is based on the segmentation of the total time
domain of the signals into 10 folds formed by consecutive disjoint segments. Then, similarly
to the random cross-validation procedure, the models are iteratively trained with 9 folds and
validated with the remaining one. In this case, the imbalance ratios in the validation sets
change remarkably, whereas they remain almost constant in the training sets. The results show
that the performance metrics are not constant across the folds and that they follow a trend
correlated to that of the imbalance ratio. In particular, the ANN and DANN models show less
satisfactory performance when the imbalance ratio is smaller.
3) The third cross-validation procedure uses the same partition of the patterns into folds used
in the second procedure but, in this case, the model is trained only with the folds that precede
the validation fold. Notice that this procedure mimics well a real situation in which data for
model training become available sequentially. Notice also that, similarly to the other
procedures, SVM shows larger recalls but too poor specificity values. Furthermore, the
performances tend to increase with the fold number, since the number of
training patterns increases and therefore the models can learn more information.
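
The three ways of assigning patterns to folds can be sketched as follows. This is an illustrative Python outline, not the Matlab code used in the work: the function names are hypothetical, patterns are represented only by their time-ordered indices, and skipping the first fold as a validation fold in the third procedure (it has no preceding data) is an assumption of this sketch.

```python
import random


def random_folds(n_patterns, n_folds=10, seed=0):
    """Procedure 1: shuffle the pattern indices, then deal them into folds."""
    idx = list(range(n_patterns))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[k::n_folds]) for k in range(n_folds)]


def consecutive_folds(n_patterns, n_folds=10):
    """Procedures 2 and 3: cut the time axis into consecutive disjoint segments."""
    bounds = [(k * n_patterns) // n_folds for k in range(n_folds + 1)]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(n_folds)]


def leave_one_fold_out(folds):
    """Procedures 1 and 2: train on all the other folds, validate on fold k."""
    for k, validation in enumerate(folds):
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield training, validation


def forward_chaining(folds):
    """Procedure 3: train only on the folds that precede the validation fold."""
    for k in range(1, len(folds)):
        training = [i for fold in folds[:k] for i in fold]
        yield training, folds[k]


# Example with 100 time-ordered patterns (hypothetical size).
folds = consecutive_folds(100)
for training, validation in forward_chaining(folds):
    print(len(training), "training patterns ->", len(validation), "validation patterns")
```

In the third procedure the training set grows with the fold number, which is consistent with the improvement of the performances observed for later folds.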
It is worth noticing that in the second and third cross-validation procedures, the models show low
performances (especially for recall and F1-score) in folds where the test set is characterized by
low imbalance ratios (i.e. low numbers of flaring event occurrences). A possible cause of this
behavior is that low imbalance ratios can indicate occurrences of flaring events over short periods of
time, which are more difficult to predict. Finally, Table 3 reports the means and standard deviations of
the performance metrics for each cross-validation procedure. Despite the limited information
contained in the mean values, due to the changes in the statistical characteristics of the
training data across the folds, it can be observed that: i) the specificity of the SVM model is not
satisfactory; ii) the ANN and DANN models provide similar performances, but the DANN model is slightly
superior in terms of recall and F1-score.
Table 3 - Means (µ) and standard deviations (σ) of the performance metrics (values in %) in the
1st, 2nd and 3rd cross-validation procedures.

Type             Technique    Recall          Specificity     F1-score
                              µ       σ       µ       σ       µ       σ
Random folds     ANN3         68.7    3.8     91.9    1.0     75.4    3.2
Random folds     SVM6         63.3    10.8    49.9    11.3    65.1    9.1
Random folds     DANN4        71.9    9.7     89.4    2.1     15.7    5.9
II procedure     ANN3         51.6    24.6    74.7    7.9     50.4    23.3
II procedure     SVM6         59.0    17.6    60.2    21.3    64.0    16.5
II procedure     DANN4        48.3    22.6    81.7    12.0    52.0    22.4
III procedure    ANN3         44.5    15.0    70.7    9.5     45.4    17.7
III procedure    SVM6         59.4    19.6    59.1    15.1    62.2    9.9
III procedure    DANN4        43.1    18.8    75.8    12.4    45.8    22.1
Toss a coin                   49.9    0.2     49.9    0.1     42.9    0.6
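
The “Toss a coin” baseline can be reasoned about as follows: labels drawn at random with probability 0.5 give an expected recall and specificity of 50% whatever the class balance, whereas the expected F1-score depends on the fraction of flaring patterns. The sketch below shows this under the assumption of a fair coin and large-sample (expected) values; the function name and the fractions tried are illustrative and are not values reported in the paper.

```python
def coin_toss_expected_metrics(positive_fraction):
    """Expected recall, specificity and F1-score of fair-coin labelling.

    positive_fraction: share of patterns belonging to the flaring class.
    """
    recall = 0.5        # half of the flaring patterns are flagged by chance
    specificity = 0.5   # half of the non-flaring patterns are cleared by chance
    # A random alarm is correct with probability equal to the positive fraction.
    precision = positive_fraction
    f1 = 2 * precision * recall / (precision + recall)
    return recall, specificity, f1


# Illustrative class balances (hypothetical values).
for fraction in (0.10, 0.25, 0.375, 0.50):
    recall, specificity, f1 = coin_toss_expected_metrics(fraction)
    print(f"positive fraction {fraction:.3f}: F1 = {100 * f1:.1f}%")
```

This is why a model's F1-score has to be compared against the baseline computed on the same data, rather than against a fixed 50% threshold.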

CONCLUSIONS
In this paper, a novel framework for the prediction of flaring events in oil and gas plants has
been presented. The framework consists of the application of the RELIEF algorithm for the selection of
the plant signals to be used as inputs of the model predicting the occurrence of flaring
events, and of a systematic procedure for the choice of the best-performing machine learning
approach and the setting of the corresponding optimal architecture and hyperparameter values.
The framework has been validated using data collected from the flaring system of an operating
Upstream Plant. The results show that the best performance is obtained by a
deep neural network with three layers of stacked autoencoders.
Since the developed model for flaring event prediction uses only signals measured on 1 out of 5
plant lines, due to the modular approach adopted to reduce the complexity of the phenomenon, we
expect that the obtained performance may be further improved by scaling up the tool.
Among the various models developed with different machine learning techniques, the DANN4
model appears to be a promising candidate for further fine-tuning using the data available from all
lines and, then, for deployment for flare prediction. Also, it is expected that plant experts can mine useful
knowledge on the characteristics of the flaring process from the analysis of the feature ranking provided
by the RELIEF algorithm. For example, considering the flaring events in the COVA plants, the
analysis of the top 50 ranked features and of the corresponding signals can allow identifying
specific plant units that are likely to have remarkable effects on the flaring events.

ACKNOWLEDGEMENTS

The authors wish to acknowledge Eni HQ for the possibility to develop the approach presented
within the company digital transformation program. They are also thankful for the valuable
contribution of the relevant Eni field business unit. The authors also wish to acknowledge all
the colleagues and the management of the Eni Production department, who allowed them to achieve a
great result in tool development and the consequent field application. Finally, special thanks go to the
Politecnico di Milano LASAR team, and in particular to Professor Enrico Zio, for their help in project
development and in enhancing Eni know-how.

REFERENCES

1. Banerjee, K., Cheremisinoff, N. & Cheremisinoff, P., 1985. Flare Gas Systems Pocket
Handbook, Michigan: Gulf Pub Co;
Elvidge, C. D. et al., 2009. A Fifteen Year Record of Global Natural Gas Flaring Derived from
Satellite Data. Energies, Volume 2, pp. 595-622;
Casadio, S., Arino, O. & Serpe, D., 2012. Gas Flaring Monitoring from Space Using the ATSR
Instrument Series. Remote Sensing of Environment, Volume 116, p. 239–249;
2. Meindinyo, R.-E., 2012. Thermo-hydraulic Modeling of Flow in Flare Systems. Master Thesis,
University of Stavanger;
Sahoo, M., 2013. High Back Pressure on Pressure Safety Valves (PSVs) in a Flare System:
Developing the Simulation model, Identifying and analyzing the back-pressure build-up. Master
Thesis Process Technology, Department of Physics and Technology, University of Bergen,
Norway;
3. Eljack, F. T., El-Halwagi, M. M. & Xu, Q., 2014. An Integrated Approach to the Simultaneous
Design and Operation of Industrial Facilities for Abnormal Situation Management. Computer
Aided Chemical Engineering, Volume 34, pp. 771-776;
Kazi, M.-K. et al., 2018. A process design approach to manage the uncertainty of industrial flaring
during abnormal operations. Computers and Chemical Engineering, Volume 117, p. 191–208;
4. Elvidge, et al., 2009;
5. Fawole, O. G., Cai, X.-M. & MacKenzie, A., 2016. Gas Flaring and Resultant Air Pollution: A
Review Focusing on Black Carbon. Environmental Pollution, Volume 216, pp. 182-197;
Elvidge, C. D. et al., 2018. The Potential Role of Natural Gas Flaring in Meeting Greenhouse Gas
Mitigation Target. Energy Strategy Reviews, Volume 20, pp. 156-162;
6. CarbonLimits, 2013. Associated Petroleum Gas Flaring Study for Russia, Kazakhstan,
Turkmenistan and Azerbaijan. Project report, Carbon Limits;
Faruolo, M. et al., 2014. A satellite-based analysis of the Val d’Agri Oil Center (southern Italy)
gas flaring emissions. Nat. Hazards Earth Syst. Sci., Volume 14, p. 2783–2793;
7. Indriani, G., 2005. Gas Flaring Reduction In The Indonesian Oil And Gas Sector: Technical and
Economic Potential of Clean Development Mechanism (CDM) Projects. HWWA Reports 253,
Hamburg Institute of International Economics (HWWA);
Johnson, M. R. & Coderre, A. R., 2012. Opportunities For Co2 Equivalent Emissions Reductions
Via Flare and Vent Mitigation: A Case Study for Alberta, Canada. International Journal of
Greenhouse Gas Control, Volume 8, p. 121–131;
Fawole, et al., 2016;
Huang, K. & Fu, J. S., 2016. A global gas flaring black carbon emission rate dataset from 1994
to 2012. Scientific Data, Volume 3;
Wang, A. et al., 2016. Combustion mechanism development and CFD simulation for the
prediction of soot emission during flaring. Frontiers of Chemical Science and Engineering,
Volume 10, p. 459–471;
Deetz, K. & Vogel, B., 2017. Development of a new gas-flaring emission dataset for southern
West Africa. Geosci. Model Dev., Volume 10, p. 1607–1620;
Elvidge, et al., 2018;
8. Indriani, 2005;
WorldBankGroup, 2010. Monitoring & Reporting Guidelines for Flare Reduction. Oil & Gas
CDM/JI Methodology Workgroup, Report 2, the World Bank Group, Oil & Gas Policy
Division, Pennsylvania, USA;
CarbonLimits, 2013;
Soltanieh, M., Zohrabian, A., Gholipour, M. J. & Kalnay, E., 2016. A review of global gas flaring
and venting and impact on the environment: Case study of Iran. International Journal of
Greenhouse Gas Control, Volume 49, p. 488–509;
Casadio, et al., 2012;
Faruolo, et al., 2014;
9. Deetz & Vogel, 2017;
10. Eljack, et al., 2014;
Tahouni, N., Gholami, M. & Panjeshahi, M. H., 2016. Integration of flare gas with fuel gas
network in refineries. Energy, Volume 111, pp. 82-91;
11. Heidari, et al., 2016;
12. Daniel, M. M., Buck, J. R. & Singer, A., 2001. Computer Explorations in Signals and Systems
Using MATLAB. 2nd ed. Upper Saddle River, New Jersey: Prentice-Hall;
13. García-Laencina, P. J., Sancho-Gómez, J.-L. & Figueiras-Vidal, A. R., 2010. Pattern
classification with missing data: A review. Neural Computing and Applications, Volume 19, p.
263–282;
14. Ardakani, M. et al., 2016. Imputation of Missing Data with Ordinary Kriging for Enhancing Fault
Detection and Diagnosis. Computer Aided Chemical Engineering, Volume 38, pp. 1377-1382;
15. Na, M., 1997. Failure detection using a fuzzy neural network with an automatic input selection
algorithm. in Intelligent Hybrid Systems. Fuzzy Logic, Neural Network, and Genetic Algorithms,
D. Rua, Ed. New York: Springer;
Emmanouilidis, C., Hunter, A., MacIntyre, J. & Cox, C., 1999. Selecting features in neurofuzzy
modelling by multiobjective genetic algorithms. in Proc. 9th Int. Conf. Artificial Neural Networks
(ICANN’99), Edinburgh, U.K., p. 7–10;
Buckner, M., Gribok, A., Urmanov, A. & Hines, J. W., 2002. Application of generalized ridge
regression for nuclear power plant sensor calibration monitoring. in 5th Int. Conf. Fuzzy Logic
and Intelligent Technologies in Nuclear Science (FLINS), Gent, Belgium;
Verikas, A. & Bacauskiene, M., 2002. Feature selection with neural networks. Pattern Recognit.
Lett., Volume 23, p. 1323–1335;
16. Zio, E., 2006. A study of the bootstrap method for estimating the accuracy of artificial neural
networks in predicting nuclear transient processes. IEEE Transactions on Nuclear Science,
Volume 53, pp. 1460-1478;
Zio, E., Baraldi, P. & Gola, G., 2008. Feature-based classifier ensembles for diagnosing multiple
faults in rotating machinery. Applied Soft Computing , Volume 8, pp. 1365-1380;
17. Urbanowicz, R. J. et al., 2018. Relief-Based Feature Selection: Introduction and Review. Journal
of Biomedical Informatics, Volume in press;
18. Haidong, S., Hongkai, J., Ying, L. & Xingqiu, L., 2018. A novel method for intelligent fault
diagnosis of rolling bearings using ensemble deep auto-encoders. Mechanical Systems and
Signal Processing, Volume 102, pp. 278-297;
19. Pérez-Ortiz, M. et al., 2016. A Review of Classification Problems and Algorithms in Renewable
Energy Applications. Energies, 9(8), pp. 1-17;
20. Masters, T., 1993. Practical Neural Network Recipes in C++. 1st ed. San Diego New York:
Academic Press;
21. Liu, J. & Zio, E., 2018. A scalable fuzzy support vector machine for fault detection in
transportation systems. Expert Systems with Applications, Volume 102, pp. 36-43;
22. Jia, F. et al., 2016. Deep neural networks: A promising tool for fault characteristic mining and
intelligent diagnosis of rotating machinery with massive data. Mechanical Systems and Signal
Processing, Volume 72–73, p. 303–315;
23. Zio, 2006;
Razavi-Far, R., Baraldi, P. & Zio, E., 2012. ENSEMBLE OF NEURAL NETWORKS FOR
DETECTION AND CLASSIFICATION OF FAULTS IN NUCLEAR POWER SYSTEMS.
Uncertainty Modeling in Knowledge Engineering and Decision Making (Proceedings of the 10th
International FLINS Conference), Volume 7, pp. 1202-1207;
Baraldi, P., Compare, M., Sauco, S. & Zio, E., 2013. Ensemble neural network-based particle
filtering for prognostics. Mechanical Systems and Signal Processing, Volume 41, pp. 288-300;
Shokry, A. et al., 2017. Dynamic kriging based fault detection and diagnosis approach for
nonlinear noisy dynamic processes. Computers & Chemical Engineering, Volume 106, pp. 758-
776;
24. Liu, J., Li, Y.-F. & Zio, E., 2017. A SVM framework for fault detection of the braking system in a
high speed train. Mechanical Systems and Signal Processing, Volume 87, pp. 401-409;
Shokry, et al., 2017;
Liu & Zio, 2018;
25. Jia, et al., 2016;
Haidong, et al., 2018;
Yang, Z., Baraldi, P. & Zio, E., 2018. Automatic Extraction of a Health Indicator from
Vibrational Data by Sparse Autoencoders. Annual Conference of the Prognostics and Health
Management Society 2018, Philadelphia, Pennsylvania, USA;
26. Yang, et al., 2018;
27. Cadei, L., Montini, M., Landi, F., Porcelli, F., Michetti, V., Origgi, M., Tonegutti, M. & Duranton, S., 2018. Big
Data Advanced Analytics to Forecast Operational Upsets in Upstream Production System. The
Abu Dhabi International Exhibition & Conference, 12-15 November 2018, Abu Dhabi, UAE.
