
Society of Petroleum Engineers

SPE-193181-MS

A Novel Hybrid Artificial Intelligence Predictive Multi-Stages Model for Gas Compressors Based on Multi-Factors

Wael Almadhoun and Ameed Alashqar, ADNOC Offshore

Copyright 2018, Society of Petroleum Engineers

This paper was prepared for presentation at the Abu Dhabi International Petroleum Exhibition & Conference held in Abu Dhabi, UAE, 12-15 November 2018.

This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents
of the paper have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect
any position of the Society of Petroleum Engineers, its officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written
consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may
not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.

Abstract
The objective of this paper is to build a hybrid predictive model for a gas compressor to overcome
operational challenges, optimize maintenance demand and reduce failure risks. This can be achieved by
combining multi-layer, multi-model artificial intelligence techniques to capture all the historical events
and improve the overall accuracy of the model. Statistical time series analytics is commonly used in the
process industry for early prediction of potential machine failures or dips in performance, by learning from
events that occurred in the past and by monitoring the signal trend over time. Recently, other techniques
have been used to perform predictive analytics, including artificial neural networks, naive Bayes and
logistic regression methods. Statistical time series forecasting methods are categorized into exponential
smoothing methods, autoregressive integrated moving average (ARIMA) methods, regression methods, and
recurrent neural network (RNN) methods. Currently, time series events are typically predicted by statistical
methods or by artificial intelligence techniques such as artificial neural networks (ANN). A hybrid approach
combining more than one technique is applied to overcome the gaps in each method.
The hybrid model consists of two stages and three models; Models 1 and 2 are generalized, and Model
3 is detailed. In the first stage, "Model 1" is a nonlinear classification model using a support vector machine
with a Gaussian kernel; it functions as an anomaly detector and a machine status classifier. The objective is to
build a robust model that can be predictive, deterministic and diagnostic. Therefore, RNN multivariate
signal prediction using long short-term memory (LSTM) is added in parallel as "Model 2". Models 1 and 2
form the first stage as the overall detector. If anomalies are detected, the algorithm moves to the second
stage, which consists of individual time series sensor signals. A model for each signal is added as "Model
3" in the second stage, which is deterministic.
The individual models show accuracies ranging from 70% to 96%, but the overall confidence range of the
model is increased. The new hybrid approach provides a better understanding of each event
prediction and has minimized the number of misleading alarms.
Keywords: Artificial Intelligence, Artificial Neural Network, Machines Failure, Predictive Maintenance,
Condition Monitoring

Introduction
Digital transformation is changing every process industry. The oil and gas industry is slowly adopting
new technologies and moving from process control automation to a new data-driven decision-making
approach. This will lead many operators to upgrade their operations and operating models with the industrial
internet of things (IIoT) and analytics-based awareness to improve operations, reduce risk and increase
safety. Predictive maintenance would then become an integrated part of operators’ maintenance strategy.
Predictive maintenance will empower operators to pursue their production plans with better accuracy and
to reduce machine downtime and unplanned maintenance. In addition, operators
will be able to reduce maintenance cost by knowing the exact machine health status and by safely
extending the period between preventive maintenance activities and the interval between major overhauls.
In the upstream and downstream oil and gas industries, compressors are the heart of the process,
and when a compressor fails, it brings production down. Due to the complexity of the machine, a
compressor failure cannot be diagnosed and repaired immediately. Each machine therefore has routine
sensor-based monitoring, which produces data essential for developing information about the condition and
status of the equipment. However, compressor condition monitoring usually examines each measurement
separately and against static limits. Intelligent systems have therefore become critical for early
prediction of failure.
Soft computing, artificial intelligence and machine learning are playing an important role in solving
engineering problems. One important use of hybrid techniques is in training an artificial neural
network (ANN) model, which is a useful tool for solving non-linear problems: optimization algorithms
are used to optimize the training hyperparameters of the neural network. The other use
of hybrid techniques is to combine two or more different types of algorithms to complement each other
or to solve a non-rule-based, non-linear problem. A common example is a hybrid model in which a
mathematical model or fuzzy logic is used together with a neural network algorithm.

Theory and Definitions


Support Vector Machine (SVM)
The machine learning model uses the sensor readings from the condition-based monitoring system as input
and the machine performance and running log as output. It is developed from the historical
data, with the machine health condition classified into two classes: class one is the normal operating
condition, and class two is a failure risk region or anomaly. There are a number of machine learning algorithms
that can be used to solve classification problems. Different nonlinear algorithms were tested, and the
support vector machine (SVM) was found suitable for this classification problem using the important
features (Almadhoun, W. 2017). The learning algorithm used the radial basis function (RBF) kernel,
which is based on the Gaussian function and is also known as the squared exponential or Gaussian kernel.
The squared Euclidean distance between the two feature vectors is calculated as in Equation 1, and the
kernel is given in Equation 2 (Mlkernels.readthedocs.io, 2017).

\[ \lVert x - x' \rVert^2 \tag{1} \]

\[ K(x, x') = \exp\left( -\frac{\lVert x - x' \rVert^2}{2\sigma^2} \right) \tag{2} \]

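As a minimal sketch of Equations 1 and 2, the Gaussian kernel can be computed directly from two feature vectors. The function names, the default `sigma`, and the sample sensor vectors below are illustrative assumptions and are not taken from the paper's implementation.

```python
import math

def squared_euclidean(x, y):
    """Equation 1: squared Euclidean distance between two feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, y))

def gaussian_kernel(x, y, sigma=1.0):
    """Equation 2: K(x, x') = exp(-||x - x'||^2 / (2 * sigma^2))."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return math.exp(-squared_euclidean(x, y) / (2.0 * sigma ** 2))

# Hypothetical, normalized sensor snapshots (not values from the paper).
normal_a = [0.50, 0.20, 0.90]
normal_b = [0.52, 0.18, 0.88]
abnormal = [0.95, 0.80, 0.10]

print(gaussian_kernel(normal_a, normal_b))  # near 1: similar operating points
print(gaussian_kernel(normal_a, abnormal))  # noticeably smaller: dissimilar points
```

The kernel value acts as a similarity score in the range (0, 1], which is what allows the SVM to separate the two classes with a nonlinear boundary.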
Time-Series
Each sensor's data is a time series, represented by a reading at every timestamp.
Time series analysis is a statistical forecasting approach that uses stochastic models to predict the value
of an input sensor or feature from previous historical observations. This is often accomplished using linear
stochastic difference equation models with random inputs (Connor, J.T., 1994), but the series should be a
stationary signal. A time series is said to be stationary if its statistical properties, such as the mean
and variance, remain constant over time (Jain, 2018). Time-dependent events or trends and
seasonality are further challenges for time series forecasting. The most important models in
time series analysis are the Auto-Regressive Integrated Moving Average (ARIMA) models. ARIMA
forecasting depends on three values. The first is AR (Auto-Regressive), the number of
lags of the dependent variable. The second is MA (Moving Average), the number of lagged forecast errors.
The third is Differences, the number of non-seasonal differences.
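To illustrate the Differences term, the sketch below applies non-seasonal differencing to a series with a steady drift. The function name and the temperature values are hypothetical, and this is only the differencing step, not a full ARIMA fit.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: y[t] = x[t] - x[t-1]."""
    if order < 0:
        raise ValueError("order must be non-negative")
    out = list(series)
    for _ in range(order):
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out

# Hypothetical temperature readings with a steady upward drift.
trend = [30.0, 30.5, 31.0, 31.5, 32.0, 32.5]
print(difference(trend, order=1))  # constant steps: the linear trend is removed
```

Each pass shortens the series by one sample, and one pass is enough to remove a linear trend, which is the usual reason for choosing a differencing order of 1.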

Recurrent Neural Network (RNN)


The Recurrent Neural Network (RNN) is a neural sequence model that has recently demonstrated state-of-the-art
performance on a variety of tasks, including language modeling (Mikolov, T., 2012), speech recognition,
and machine translation (Graves, A., 2013), (Zaremba, W., 2014). The RNN model differs from other
neural network models by maintaining a hidden layer of neurons with recurrent connections to their own
previous values. Recurrent models are powerful models for sequential data. Regular backpropagation
is a challenge for recurrent networks, because learning to store information over extended time intervals takes
a very long time. To address this problem, which Hochreiter analyzed in 1991, Hochreiter and Schmidhuber
introduced a novel gradient-based method called long short-term memory (LSTM). Figure 1 shows a memory
cell and Figure 2 shows the information flow in the LSTM. The LSTM can learn to bridge minimal time lags
in excess of 1000 discrete time steps by enforcing constant error flow (Hochreiter, S. 1997).

Figure 1— Long Short-term Memory Cell (Graves, A., 2013).

Figure 2— The information path flow in the LSTM ( Zaremba, W , 2014).

The standard recurrent neural network (RNN) is composed of input, hidden layer and output vectors.
Given an input sequence $x = (x_1, x_2, x_3, \dots, x_T)$, the hidden vector sequence $h = (h_1, h_2, h_3, \dots, h_T)$ and
the output vector sequence $y = (y_1, y_2, y_3, \dots, y_T)$ are computed by iterating Equations 3 and 4
from t = 1 to T:

\[ h_t = \mathcal{H}(W_{xh} x_t + W_{hh} h_{t-1} + b_h) \tag{3} \]

\[ y_t = \mathcal{H}(W_{hy} h_t + b_y) \tag{4} \]

where the $W$ terms are the weight matrices (for example, $W_{xh}$ is the input-hidden weight matrix), $\mathcal{H}$ is
the hidden layer function, and the $b$ terms are the bias vectors (Graves, A., 2013).
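The iteration of Equations 3 and 4 can be sketched in a few lines of plain Python. This is an illustration only: it assumes tanh as the hidden layer function, a zero initial hidden state, and toy weights, none of which are specified in the paper, and it is a plain RNN step rather than an LSTM cell.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y, H=math.tanh):
    """Iterate Equations 3 and 4 for t = 1..T; return (hidden states, outputs)."""
    h = [0.0] * len(b_h)  # assumption: initial hidden state h_0 is all zeros
    hidden_states, outputs = [], []
    for x_t in xs:
        # Equation 3: h_t = H(W_xh x_t + W_hh h_{t-1} + b_h)
        h = [H(a + b + c)
             for a, b, c in zip(matvec(W_xh, x_t), matvec(W_hh, h), b_h)]
        # Equation 4 as written in this paper: y_t = H(W_hy h_t + b_y)
        y = [H(a + b) for a, b in zip(matvec(W_hy, h), b_y)]
        hidden_states.append(h)
        outputs.append(y)
    return hidden_states, outputs

# Toy dimensions and weights, chosen only for illustration.
W_xh = [[0.5, -0.2], [0.1, 0.3]]   # 2 inputs -> 2 hidden units
W_hh = [[0.1, 0.0], [0.0, 0.1]]    # hidden -> hidden recurrence
W_hy = [[1.0, -1.0]]               # 2 hidden units -> 1 output
b_h, b_y = [0.0, 0.0], [0.0]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

hidden_states, outputs = rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y)
print(len(hidden_states), len(outputs))  # one hidden state and one output per step
```

The recurrence is visible in the `matvec(W_hh, h)` term: each hidden state depends on the previous one, which is how the network carries information across time steps.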

Multivariate
In statistical analysis, multivariate analysis involves the simultaneous observation and analysis of more
than one variable. Multivariate time series analysis is important for forecasting future values from
the historical data of several variables. Studying many related variables together, rather than only one,
gives a better understanding (Chakraborty K, 1992). A multivariable model has
multiple variables on the right side of the equation. Such a model can be used to assess the relationship between
a number of variables; one can assess independent relationships while adjusting for potential correlations with
the dependent variable (Hidalgo B, 2013). A multivariable (multiple) linear regression model is shown in Equation
5, where $y$ is the continuous dependent variable, $x_1, \dots, x_n$ are the predictors, $\alpha$ is the intercept,
$\beta_1, \dots, \beta_n$ are the regression coefficients, and $\varepsilon$ is the error term.

\[ y = \alpha + x_1 \beta_1 + x_2 \beta_2 + \dots + x_n \beta_n + \varepsilon \tag{5} \]

Hybrid
Hybrid methods that combine more than one technique are being adopted in the industry to overcome the gaps in
each method (Wagner, N. 2011). Hybrid techniques combine two or more different types of algorithms to
complement each other or to solve a non-rule-based, non-linear problem. A hybrid algorithm combines
two or more algorithms to solve the same problem, drawing on the features of each algorithm to achieve better
overall results. In this paper, the authors do not use a hybrid algorithm, but propose a hybrid
technique that combines several of the algorithms described earlier in this section.

The Compressor
Process and Operation
Centrifugal gas compressors are considered the prime mover for natural gas in the gas export network
and the gas injection system, and an essential component in many gas and chemical processes. High-speed,
high-pressure centrifugal compressors are machines that combine the challenges of rotor dynamics with the
volatility of gas dynamics and behavior. Because of this complexity, several
parameters need comprehensive monitoring, such as suction and discharge pressure
and temperature, speed, anti-surge, lube oil pressure and temperature, shaft vibration, bearing temperature,
primary seal gas leakage rate, secondary seal, and others. These parameters are given static limits
in the form of trips and alarms to protect the machine from permanent failure or damage. The compressor
process flow diagram and the lube oil system and vibration instrumentation diagram are shown in Figures 3 and 4.

[Diagram labels: Suction K.O. Drum, Anti Surge Valve, ASC, Speed Controller, After Cooler; instrument tags PT, TT, PC, FE]
Figure 3: Typical Centrifugal Compressor Process Flow Diagram.
Figure 4— Typical Centrifugal Compressor Lube Oil System and Vibration Instrumentation Diagram.

In spite of these limits, compressors still experience premature failures, and the majority of failures in
centrifugal compressors are located in the dry gas seal or sleeve bearings, or arise from mechanical or process
vibration. These components tend to fail suddenly, and in most cases the trip and alarm cannot
save them, especially if the deterioration rate is faster than the operator's reaction in the time between alarm and trip.
Such failures require expert monitoring and vigilance; however, humans are limited in maintaining
a consistent level of attention and focus over an extended period of time. In this paper, artificial intelligence
in the form of computer algorithms is used to provide such vigilance, close monitoring, comparisons and
statistics.

Historical Data
The real-time sensor data for 130 tags was collected from the historian: 2 years of data for training
and testing, and six months of data for verification. Typically, the following data
sources are used to build a predictive model for machinery equipment:
• Failure history: a log of the failure history of the machine or of components within the machine.
• Maintenance history: the repair history of the machine, including the error log, previous
maintenance events or parts replacements.
• Machine conditions and usage: the operating conditions of the machine, including data collected
from sensors.
• Machine features: the features of the machine, including the size, the model, and the location.

Note: the data presented in this paper is anonymized but indicative. For illustration, the variables are
not referenced by their real names, the graphs are not to actual scale, and some numbers are normalized. A
selected number of variables is listed in Table 1, which presents summary statistics of the sample
data.

Table 1— Sample Data with Summary Statistics.

      Pressure1  Pressure2  Pressure3  Temperature1  Temperature2  Temperature3  Temperature4
mean     51.856      1.527      1.491        29.983         3.136        50.307        77.793
std      31.586      1.033      1.048         6.649        14.395         5.414        27.868
min       0.000      0.000      0.000        12.260       -12.580        12.600        19.390
50%      69.470      0.990      0.970        30.415        -1.170        52.340        91.405
max      93.710      8.000      8.000        48.230        45.120        62.770       110.430

The Model
Following previous work to build a predictive model for nonlinear classification using the support vector
machine algorithm, the authors propose a hybrid model consisting of two stages in series, with
parallel models in each stage. The hybrid model consists of three models: Models 1 and 2 act as the first detector
in the first stage, mainly for anomaly detection, and Model 3 consists of individual time series predictions for
selected sensor variables chosen by their importance. Figure 5 shows the high-level building blocks of the hybrid model.

[Block diagram: Input Signals -> Overall Detector Stage (Model 1: Status Classifier; Model 2: Combined RNN) -> Failure Diagnostic Stage (Model 3.0 to Model 3.n: Time series analytics)]

Figure 5 — The Hybrid Model Building Blocks.

In the first stage, the first model is a nonlinear classifier combined with the second model in parallel. The
second model is a multivariate recurrent neural network, which acts as the overall detector by calculating
the difference percentage between the predicted and the actual signals. The hybrid model will move to the
next stage if any anomaly signals are detected. The second stage consists of individual time series in the
third model, which provides an individual sensor prediction. A deviation in one or more sensor signals can
point to the root cause and provide indicative information well in advance of failures.
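The two-stage flow described above can be sketched as a simple gate. The paper does not state how the outputs of Models 1 and 2 are combined or what deviation threshold is used, so the logical OR, the 10% threshold, the function names and the stand-in models below are all assumptions made for illustration.

```python
def percent_difference(predicted, actual):
    """Absolute difference between predicted and actual, as a percentage of actual."""
    if actual == 0:
        return 0.0 if predicted == 0 else float("inf")
    return abs(predicted - actual) / abs(actual) * 100.0

def run_hybrid(sample, classifier, rnn_predict, sensor_models, threshold_pct=10.0):
    """Stage 1 runs Model 1 and Model 2 in parallel; stage 2 runs Model 3.x per sensor."""
    flagged_by_classifier = classifier(sample) == "anomaly"      # Model 1
    predicted = rnn_predict(sample)                              # Model 2
    deviations = {name: percent_difference(predicted[name], value)
                  for name, value in sample.items()}
    flagged_by_rnn = any(d > threshold_pct for d in deviations.values())
    if not (flagged_by_classifier or flagged_by_rnn):            # assumed OR rule
        return {"status": "normal", "diagnosis": {}}
    # Stage 2 is reached only when stage 1 raises an anomaly.
    diagnosis = {name: model(sample[name]) for name, model in sensor_models.items()}
    return {"status": "anomaly", "diagnosis": diagnosis}

# Hypothetical stand-ins for the trained models.
sample = {"Temperature1": 30.0, "Pressure1": 52.0}
result = run_hybrid(
    sample,
    classifier=lambda s: "normal",
    rnn_predict=lambda s: {"Temperature1": 30.5, "Pressure1": 70.0},
    sensor_models={"Pressure1": lambda value: {"actual": value, "forecast": 70.0}},
)
print(result["status"])  # anomaly: Pressure1 deviates by more than the threshold
```

Keeping the per-sensor models behind the stage 1 gate means the detailed forecasts are only evaluated when the overall detector has raised a flag.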

Analysis and Results


In this section, the data analysis, the methods and the results are illustrated. Due to the large number of
variables and the large size of the dataset, only sample information and plots are presented.

Data Pre-Processing
The following steps are essential for preparing and improving the data before building the model:
1. Loading all data sources: sensor data, operating log, and maintenance log,
2. Transforming all sensor data into numeric values,
3. Transforming the date into a timestamp to prepare the sensor data as time series,
4. Filling missing data, case by case, but mainly with the mean value,
5. Removing the data points recorded when the motor speed was low,
6. Adding a status column to indicate the events raised at any point in time,
7. Mapping the shutdown and maintenance data to the sensor data, and updating the status column with
values of normal, maintenance, shutdown trip or alarm,
8. Plotting the data for high-level observation,
9. Dividing the dataset into groups.
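Steps 4 to 7 above can be sketched on a small list of records. The column names (`t`, `speed`, `temp`), the speed cut-off and the sample rows are hypothetical; the paper does not give the actual tags or thresholds.

```python
from statistics import mean

def fill_missing_with_mean(rows, column):
    """Step 4: replace None in `column` with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def drop_low_speed(rows, min_speed):
    """Step 5: keep only rows recorded at or above `min_speed`."""
    return [r for r in rows if r["speed"] >= min_speed]

def add_status(rows, events):
    """Steps 6-7: label each row with the logged event covering its timestamp."""
    labelled = []
    for r in rows:
        status = "normal"
        for start, end, label in events:
            if start <= r["t"] <= end:
                status = label
                break
        labelled.append({**r, "status": status})
    return labelled

# Hypothetical hourly records; `t` is an hour index, `temp` has one gap.
rows = [
    {"t": 0, "speed": 9000, "temp": 30.0},
    {"t": 1, "speed": 9100, "temp": None},
    {"t": 2, "speed": 200, "temp": 20.0},
    {"t": 3, "speed": 9050, "temp": 34.0},
]
events = [(3, 3, "shutdown trip")]  # (start, end, label) from the event log

cleaned = add_status(drop_low_speed(fill_missing_with_mean(rows, "temp"), 1000), events)
for row in cleaned:
    print(row)
```

The sketch follows the order of the list, so the mean used for filling is computed before the low-speed rows are dropped; reversing the two steps would give a different fill value.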

Cluster Analysis
Following the data pre-processing, the statistical summary of the dataset is calculated to identify
outliers and statistical highlights (minimum, maximum, mean, standard deviation). Unsupervised cluster
analysis is then performed using the k-means algorithm to find out into how many
clusters the data is distributed. As shown in Figure 6, the data can be distributed in 2 clusters. This means that
the data points are distributed with high correlation to events, which could be due to the compressor either working in
normal condition or having issues. The cluster analysis is beyond the scope of this paper, and the illustration
is provided as part of the data analysis process. After the complete analysis, the number of clusters was 8,
each associated with different operating conditions, speed, and maintenance or failure events.


Figure 6— Cluster Analysis.
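For readers unfamiliar with k-means, the sketch below shows the assign-then-update loop on two well-separated groups of points. The data, the explicit initial centroids and the fixed iteration count are illustrative assumptions; the paper does not describe its clustering configuration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = [list(c) for c in initial_centroids]
    assignment = []
    for _ in range(iterations):
        assignment = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignment

# Hypothetical normalized (speed, temperature) snapshots: running vs. idle.
points = [(0.95, 0.80), (0.97, 0.82), (0.93, 0.78),
          (0.05, 0.20), (0.07, 0.22), (0.03, 0.18)]
centroids, assignment = kmeans(points, initial_centroids=[points[0], points[3]])
print(assignment)  # cluster index for each point
```

In practice the number of clusters is chosen by inspecting a metric over several values of k, which is consistent with the paper moving from 2 clusters in the first pass to 8 after the complete analysis.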

Time Series Analysis


Time series analysis was used to check each sensor reading over the given period of time. The analysis
was performed to understand the signal type and to determine whether the signal is stationary, whether it has
anomalies or spikes, and whether it shows seasonality. The plots in Figures 7 and 8 are a sample representation
for temperature parameters, with trend plots including the rolling mean and rolling standard deviation
for a window of 12 hours. The signals observed are non-stationary and require
preparation before they can be used for prediction.

Figure 7— Sample Time series analysis for Temperature.

Results of Dickey-Fuller Test:
Test Statistic                   -2.602847
p-value                           0.092410
Lags Used                        28.000000
Number of Observations Used    2564.000000

Figure 8— Sample Time series analysis for Lube Oil Temperature.
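The rolling mean and rolling standard deviation used in the trend plots can be sketched as follows. The sketch assumes hourly samples, so a window of 12 corresponds to 12 hours, and it uses the population standard deviation; both choices and the sample values are assumptions. It does not reproduce the Dickey-Fuller test itself.

```python
from statistics import mean, pstdev

def rolling_stats(series, window):
    """Return (rolling_mean, rolling_std) over a trailing window of `window` samples."""
    if window < 1:
        raise ValueError("window must be at least 1")
    means, stds = [], []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        means.append(mean(chunk))
        stds.append(pstdev(chunk))
    return means, stds

# Hypothetical hourly temperature readings with an upward drift over 24 hours.
hourly_temperature = [30.0 + 0.5 * i for i in range(24)]
means, stds = rolling_stats(hourly_temperature, window=12)
print(means[0], means[-1])  # a rolling mean that drifts suggests non-stationarity
```

A rolling mean or rolling standard deviation that changes over time is the visual counterpart of a Dickey-Fuller p-value above the chosen significance level.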

Correlations
The dataset is divided into smaller groups to improve the visualization and simplify the observations. The
provided plot is a sample of the many plots used to understand the correlation between the sensor
readings. The heat map represents the strength of correlation between different parameters; the same parameters
are arranged on the x and y axes, so that each pair meets in one square. The correlation between vibration
parameters is shown in Figure 9a, which represents highly correlated parameters. Figure 9b represents the
correlation relations for other parameters, for example, the temperature with the valve position.

Figure 9a— Correlation Analysis for Vibration.


Figure 9b: Correlation Analysis for other parameters.
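The values behind such a heat map are pairwise correlation coefficients. The sketch below assumes the Pearson coefficient, which the paper does not name explicitly, and uses made-up sensor columns.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_a * var_b)

def correlation_matrix(columns):
    """Correlate every column with every other column, as in a heat map grid."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}

# Hypothetical readings (not from the paper's dataset).
columns = {
    "Vibration1": [1.0, 1.2, 1.1, 1.6, 1.8],
    "Vibration2": [2.1, 2.3, 2.2, 2.9, 3.1],
    "ValvePosition": [40.0, 35.0, 42.0, 20.0, 18.0],
}
matrix = correlation_matrix(columns)
print(round(matrix["Vibration1"]["Vibration2"], 3))
```

Each cell of the heat map corresponds to one entry of `matrix`, with the diagonal equal to 1 because every parameter is perfectly correlated with itself.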

Highlights and Anomaly Detection


Before performing the prediction, pre-processing of the historical and real-time data is
necessary to clean the dataset, fill in the missing values, and map the maintenance events and trips onto the data
points. The overlap of these events is shown in Figure 10, where the red data points indicate shutdown events
and the green data points indicate maintenance events. The plots in Figure 10 show sample sensor signals
for temperature and vibration, respectively, mapped to the historically recorded events.


Figure 10— Sample Temperature and Vibration Trends with Events.



The trip events and the maintenance log are used as indicators of issues during compressor operation;
therefore, the two weeks of data preceding each event are considered a dataset that can lead to an anomaly event.
Figure 11 shows a sample plot of a sensor signal trend after filtering out low-speed operation and shutdown
events; the data points that could lead to an anomaly event are highlighted in orange.

Figure 11— Sample Temperature Trends with Anomaly Events.
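The two-week lead window described above can be sketched as a labelling step over timestamps. The label names, the dates and the choice to exclude the event timestamp itself are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def label_pre_event_windows(timestamps, event_times, lead=timedelta(weeks=2)):
    """Mark each timestamp that falls within `lead` before any logged event."""
    labels = []
    for ts in timestamps:
        leading = any(event - lead <= ts < event for event in event_times)
        labels.append("pre-anomaly" if leading else "normal")
    return labels

# Hypothetical daily samples for 1-25 January 2018 and one logged trip.
timestamps = [datetime(2018, 1, 1) + timedelta(days=i) for i in range(25)]
trip_events = [datetime(2018, 1, 20)]

labels = label_pre_event_windows(timestamps, trip_events)
print(labels.count("pre-anomaly"))  # number of samples inside the lead window
```

These labels are the kind of target a status classifier could be trained on, so that it learns the signal behavior that precedes a trip rather than the trip itself.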

Different models were built and tested for prediction based on the methodology covered in this paper.
This includes the nonlinear classification of events and the multivariate time series prediction. The accuracy
of each model was tested: the higher-accuracy models scored 88-96% and predicted few anomalies, while the
lower-accuracy models scored 70-76% and showed more anomalies.

Conclusion
Predictive analytics has advanced over the last 10 years as machine learning and artificial
intelligence have been revolutionized. Affordable, available and more powerful hardware has made it possible to
build and deploy large-scale predictive models. It was possible to build, train and test the multi-stage models
in reasonable time. Deploying the same model in real-time operation is achievable with today’s existing
infrastructure. Deploying the hybrid predictive model as part of predictive maintenance could lead to
improved equipment reliability and availability, which results in an optimized and longer Remaining Useful
Life (RUL) of the compressor and the machinery.

References
Almadhoun, W. and Alashqar, A., 2017, November. Machines Performance Algorithmic Modeling for Anticipating
Machines Health Using Real Time Condition Monitoring Data. In SPE Abu Dhabi International Petroleum Exhibition
& Conference. Society of Petroleum Engineers.
Connor, J.T., Martin, R.D. and Atlas, L.E., 1994. Recurrent neural networks and robust time series prediction. IEEE
Transactions on Neural Networks, 5(2), pp.240-254.
Chakraborty, K., Mehrotra, K., Mohan, C.K. and Ranka, S., 1992. Forecasting the behavior of multivariate time series
using neural networks. Neural Networks, 5(6), pp.961-970.
Graves, A., Mohamed, A.R. and Hinton, G., 2013, May. Speech recognition with deep recurrent neural networks. In
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on (pp. 6645-6649). IEEE.
Hidalgo, B. and Goodman, M., 2013. Multivariate or Multivariable Regression? American Journal of Public Health,
103(1). doi: 10.2105/AJPH.2012.300897.
Hochreiter, S. and Schmidhuber, J., 1997. Long short-term memory. Neural Computation, 9(8), pp.1735-1780.
Jain, A. (2018). Complete guide to create a Time Series Forecast (with Codes in Python). [online] Analytics Vidhya.
Available at: https://www.analyticsvidhya.com/blog/2016/02/time-series-forecasting-codes-python/ [Accessed 14
Sep. 2018].
Mikolov, T. and Zweig, G., 2012. Context dependent recurrent neural network language model. SLT, 12(234-239), p.8.
Mlkernels.readthedocs.io. (2017). Kernel Functions, Machine Learning Kernels 0.0.3 documentation. [online] Available
at: http://mlkernels.readthedocs.io/en/latest/kernels.html [Accessed 10 Sep. 2018].
Wagner, N., Michalewicz, Z., Schellenberg, S., Chiriac, C. and Mohais, A., 2011. Intelligent techniques for forecasting
multiple time series in real-world systems. International Journal of Intelligent Computing and Cybernetics, 4(3),
pp.284-310.
Zaremba, W., Sutskever, I. and Vinyals, O., 2014. Recurrent neural network regularization. arXiv preprint
arXiv:1409.2329.

Author Biographies
Wael Almadhoun is a Technology Specialist Engineer at ADNOC Offshore. He holds a BS degree in Control
and Telecom Engineering and an MSc degree in Software Engineering from Heriot-Watt University.
Almadhoun is a data scientist with hands-on experience and a demonstrated ability to deliver valuable insights
via data analytics, artificial intelligence, advanced revolutionary algorithms and a data-driven approach. He
has developed machine learning algorithms for industrial equipment and provided innovative solutions.
Almadhoun is a certified project management professional who has managed software implementation and
development projects. His research interests are in the areas of artificial intelligence, machine learning,
predictive analytics, information management, and big data. He has received awards from IEEE, ADNOC, and
RDPetro. Almadhoun is a member of PEO, IEEE, PMI and AIIM.

Ameed Alashqar is a Rotating Machinery Engineer at ADNOC Offshore. He holds a B.S. degree with honors
in Mechanical Engineering and an MSc degree from Heriot-Watt University. Alashqar is an ISO Vibration
and Troubleshooting Analyst Level III with hands-on experience and a demonstrated ability
to investigate failures using investigation and root cause analysis approaches. He has
developed a number of mechanisms and procedures to control and guide machine development and health
monitoring. Alashqar is a project management professional who supports scope development for a variety of
mechanical equipment in several projects, as well as new technology evaluation and acceptance. His research
interests are in the areas of vibration and acoustic emission analysis, human-centered machine design, human
factors engineering, and predictive analytics. He has received awards from ASME, ADNOC, and RDPetro. Alashqar is
a member of ASME, PMI and BINT.
