
Mario R. Eden, Marianthi Ierapetritou and Gavin P. Towler (Editors)
Proceedings of the 13th International Symposium on Process Systems Engineering – PSE 2018
July 1-5, 2018, San Diego, California, USA © 2018 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/B978-0-444-64241-7.50369-4

Deep Learning Based Soft Sensor and Its Application on a Pyrolysis Reactor for
Compositions Predictions of Gas Phase Components
Wenbo Zhu^a, Yan Ma^a, Yizhong Zhou^b, Michael Benton^a, Jose Romagnoli^a,*
^a Department of Chemical Engineering, Louisiana State University, 3307 Patrick F.
Taylor Hall, Baton Rouge 70803, USA
^b Shanghai SupeZET Engineering Technology Corp., Ltd., 268 Linxin Rd NO.3, Shanghai,
200335, China
* jose@lsu.edu

Abstract
In this work, we propose a data-driven soft sensor based on a deep learning technique,
namely the convolutional neural network (CNN). Instead of building only
time-independent correlations between the key variable and other measurements, the
proposed soft sensor uses a moving window to describe the most recent process
dynamics, from which time-dependent correlations can be extracted. In addition,
a signal recovery scheme is developed to improve model robustness against
common sensor faults. The proposed soft sensing technique was tested on
composition data of gas-phase components from an ethylene pyrolysis reactor. The
model was also verified with manually introduced sensor faults.
Keywords: Soft sensor; Deep learning; Data-Driven; Big data

1. Introduction
In recent decades, increased accessibility of process measurements has boosted the
development of data-driven approaches for process monitoring. The quality and
quantity of the measurements are of paramount importance for chemical processes.
In practice, however, some key variables are difficult to measure in real time due to
physical or technical limitations. Soft sensing techniques have therefore been
developed to supplement real measurements. Generally speaking, soft sensors
are predictive models that find correlations between key measurements
and other variables. Two types of models are commonly used as soft sensors:
model-based models and data-driven models. Model-based approaches are designed
from the physical features of the dynamical system, while in data-driven approaches
features are obtained solely from process data through machine learning. Although
a model-based approach carries the physical meaning of the system, it is generally
labour-intensive and difficult, as it requires experienced researchers to develop the
mathematical formulation as well as extensive experimental validation. In
contrast, machine learning algorithms can extract features directly from
raw data. Feature selection algorithms are then trained on these machine-learned
features to select the most meaningful subset of features to represent the
system. The combination of feature learning and feature selection is the core of how
machine learning algorithms learn the system from data.
Recent developments in deep learning (LeCun, 2015) show great potential for
feature learning. Deep learning methods stack multiple layers of nonlinear estimators
(neurons) to represent features at different levels, from shallow to deep. Such an
architecture can represent complicated functions and correlations, and it is
already a state-of-the-art approach for processing images and natural language.
Although deep learning methods have achieved great success, training a deep
network remains challenging. A typical deep network has thousands of
parameters to tune, which requires a large amount of training data. Moreover,
activation functions such as the sigmoid, which are used to introduce nonlinearity
in each layer, can cause vanishing or exploding gradients, leaving
gradient-based backpropagation unable to tune the earlier layers of the network. This
problem was addressed by the development of the restricted Boltzmann machine
(RBM) (Hinton, 2006, 2010) and its use in greedy layer-wise training.
Shang et al. (2014) proposed a deep learning based soft sensor built on the RBM and
found that it outperforms many conventional methods with
shallow architectures. Yan et al. (2017) applied stacked denoising autoencoders
(SDAEs) to train a network that estimates the oxygen content in flue gases.
Another approach, using a hierarchical extreme learning machine, was proposed by Yao
(2017).
However, capturing time-correlation information remains a challenge in all of the above
approaches. Hence, in this work, we propose a soft sensor that extracts
time-correlation information of the process as well as correlations among other
measurements, so that the dynamics of the system can be well captured. A convolutional
neural network (CNN) is used to obtain process information from a moving window
at each time point. The CNN has not only shown improved performance over the traditional
recurrent neural network (RNN) (Zhang, 2015), but also reduces training time
relative to the RNN (Baidu-Research, 2017). In addition, we developed a signal recovery
scheme to handle the short-term sensor faults that occur in real processes.

2. Background
2.1. Rectified linear unit (ReLU)
ReLU is a piecewise linear activation function used in neural networks, defined as:
$$ f(x) = \max(0, x) \qquad (1) $$
where x is the input. Because the nonlinear expression is so simple, its gradient is
much cheaper to compute than that of many other activation functions such as the
sigmoid. In addition, ReLU effectively reduces the vanishing gradient problem, since
its derivative is constant (equal to 1) for all positive inputs and does not saturate.
As of 2017, ReLU is widely used in deep learning for multiple types of tasks.
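As a rough illustration of the vanishing gradient argument above, the following sketch compares the derivative of ReLU with that of the sigmoid when it is multiplied through a chain of layers. The depth of 10 and the input value are arbitrary choices for illustration, and layer weights are ignored, so this is only a simplified picture rather than a full backpropagation.

```python
import math


def relu(x: float) -> float:
    """Eq. (1): f(x) = max(0, x)."""
    return max(0.0, x)


def relu_grad(x: float) -> float:
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise
    (the value at exactly 0 is set to 0 by convention here)."""
    return 1.0 if x > 0 else 0.0


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid s(x) = 1 / (1 + exp(-x)), i.e. s(x) * (1 - s(x))."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


# Product of activation derivatives across a chain of layers (weights ignored).
depth = 10   # illustrative depth, not taken from the paper
x = 2.0      # illustrative positive pre-activation value
relu_chain = relu_grad(x) ** depth
sigmoid_chain = sigmoid_grad(x) ** depth

print(f"ReLU chain:    {relu_chain:.3e}")
print(f"Sigmoid chain: {sigmoid_chain:.3e}")
```

Since the sigmoid derivative never exceeds 0.25, its product shrinks geometrically with depth, while the ReLU factor stays at 1 for active units.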
2.2. Convolutional neural network (CNN)
The CNN is a deep neural network originally designed for image analysis. More recently,
the CNN has also been shown to perform very well on sequential data analysis such as
natural language processing (Zhang, 2015). A CNN typically contains two basic operations,
namely convolution and pooling. The convolution operation uses multiple filters to
extract features (feature maps) from the data set while preserving their
corresponding spatial information. The pooling operation, also called
subsampling, is used to reduce the dimensionality of feature maps from the convolution
operation. Max pooling and average pooling are the most common pooling operations
used in CNNs. Due to the complexity of CNNs, ReLU is the common choice of
activation function, as it helps gradients propagate during training by backpropagation.
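To make the two basic operations concrete, here is a minimal pure-Python sketch of a 1D convolution followed by ReLU and pooling. The signal, the difference filter and the pooling size are hypothetical values chosen for illustration; a real CNN learns its filter weights during training.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel along the signal without padding (cross-correlation,
    as is usual in deep learning). Output length: len(signal) - len(kernel) + 1."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, signal[i:i + k]))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    """Non-overlapping max pooling; a shorter trailing block is pooled as well."""
    return [max(values[i:i + size]) for i in range(0, len(values), size)]


def avg_pool(values, size):
    """Non-overlapping average pooling; a shorter trailing block is pooled as well."""
    blocks = [values[i:i + size] for i in range(0, len(values), size)]
    return [sum(b) / len(b) for b in blocks]


signal = [1.0, 2.0, 4.0, 3.0, 1.0, 0.0]   # hypothetical measurement sequence
kernel = [-1.0, 1.0]                      # hypothetical filter: first difference

feature_map = relu(conv1d_valid(signal, kernel))
print(feature_map)               # feature map, same ordering as the input sequence
print(max_pool(feature_map, 2))  # reduced dimensionality, strongest response kept
print(avg_pool(feature_map, 2))
```

The feature map keeps the ordering of the input, which is how positional information is preserved, and pooling then shortens it.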

3. Methods
3.1. CNN soft sensor
Based on the recent success of the CNN in sequence data analysis (Zhang, 2015), we
propose a soft sensor built on a similar idea. In the proposed soft sensor, convolutional
filters with multiple region sizes are used to extract features at various granularities
from moving windows. Hence, the influence of different past time periods can be
accounted for when predicting future steps. Based on the extracted features, the target
measurement at the next time point is estimated. The scheme of the proposed soft sensor
is illustrated in Figure 1.

Figure 1. Overall architecture of the proposed soft sensor. V denotes the variables
related to the key variables Y. Their historical records are extracted by moving windows, which
form the model input. Feature maps are generated by 1D convolutional filters of multiple lengths.
Max pooling and a flatten operation (the fc layer) are applied to the feature maps before the
final prediction.
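The data flow in Figure 1 can be sketched as follows. This is a simplified forward pass under stated assumptions, not the authors' implementation: the helper names (`make_windows`, `conv_feature`, `predict`), the toy two-variable series and the hand-picked filter and output weights are all hypothetical, and a trained network would learn these weights from data.

```python
def make_windows(series, window):
    """series: one row of variable values per time step.
    Returns (past_rows, next_row) pairs: `window` past rows and the row to predict."""
    return [(series[t - window:t], series[t]) for t in range(window, len(series))]


def conv_feature(window_rows, filt):
    """Apply one 1D filter (len(filt) time steps x n_vars weights) along the time
    axis, then ReLU, then max pooling over all positions: one scalar feature."""
    k = len(filt)
    responses = []
    for start in range(len(window_rows) - k + 1):
        total = sum(
            w * x
            for frow, xrow in zip(filt, window_rows[start:start + k])
            for w, x in zip(frow, xrow)
        )
        responses.append(max(0.0, total))
    return max(responses)


def predict(window_rows, filters, out_weights, out_bias=0.0):
    """Concatenate the pooled features and apply a linear output layer."""
    features = [conv_feature(window_rows, f) for f in filters]
    return sum(w * f for w, f in zip(out_weights, features)) + out_bias


# Toy series with two variables per time step (hypothetical data).
series = [[float(t), 0.5 * t] for t in range(6)]
samples = make_windows(series, window=4)

# Two filter region sizes: a short one (2 steps) and one covering the whole window.
short_filter = [[0.0, -1.0], [0.0, 1.0]]         # change of variable 2 over one step
long_filter = [[0.25, 0.0] for _ in range(4)]    # mean of variable 1 over the window
filters = [short_filter, long_filter]

past_rows, next_row = samples[0]
features = [conv_feature(past_rows, f) for f in filters]
estimate = predict(past_rows, filters, out_weights=[1.0, 1.0])
print(features, estimate, next_row)
```

A filter as long as the window yields a single response, while shorter filters yield several responses that max pooling reduces to one value, so each filter contributes one feature regardless of its length.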
3.2. Loss function
Since datasets from industrial processes inherently contain measurement noise,
conventional loss functions such as the mean squared error (MSE) are not ideal, because
they would simply reproduce the noise in the model. Inspired by the work of
Salas (2017), we use the following loss function to reduce the effect of measurement
noise for each variable.

$$ \sum_{i=1} (y_i - \bar{y}_i)\, R^{-1}\, (y_i - \bar{y}_i)^{T} \qquad (2) $$

where $y_i$ is the predicted value, $\bar{y}_i$ is the target value, and $R$ is the covariance
matrix of the measurement error, which is estimated as $\mathrm{diag}(\sigma_1, \sigma_2, \cdots, \sigma_i)$.
It should be noted that the standard deviations used here should be taken from steady-state
operation, so that the measurement errors can be properly estimated (Chen, 1997).
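For a diagonal R, the quadratic form in Eq. (2) reduces to a sum of squared errors, each divided by the corresponding diagonal entry of R. The sketch below assumes exactly that; the function names, the toy numbers and the use of the sample standard deviation are illustrative assumptions, and the diagonal entries are passed in as `r_diag` so that the caller decides how R is estimated.

```python
import statistics


def estimate_sigmas(steady_rows):
    """Per-variable sample standard deviation over steady-state rows
    (one row per time step, one column per variable)."""
    columns = list(zip(*steady_rows))
    return [statistics.stdev(col) for col in columns]


def weighted_loss(predictions, targets, r_diag):
    """Eq. (2) with diagonal R: sum over samples i and variables j of
    (y_ij - ybar_ij)^2 / R_jj, where r_diag holds the diagonal of R."""
    total = 0.0
    for y, ybar in zip(predictions, targets):
        total += sum((a - b) ** 2 / r for a, b, r in zip(y, ybar, r_diag))
    return total


# Hypothetical predictions and targets for two variables over two samples.
predictions = [[1.0, 2.0], [2.0, 4.0]]
targets = [[1.5, 2.0], [2.0, 3.0]]

# The noisier second variable (larger diagonal entry) is down-weighted.
print(weighted_loss(predictions, targets, r_diag=[0.5, 2.0]))
# With R equal to the identity, the loss is the plain sum of squared errors.
print(weighted_loss(predictions, targets, r_diag=[1.0, 1.0]))

# Hypothetical steady-state segment used to estimate the noise level.
steady = [[1.0, 10.0], [2.0, 10.0], [3.0, 13.0]]
print(estimate_sigmas(steady))
```

The effect is that errors on variables with larger measurement noise contribute less to the loss, so the model is penalized less for failing to fit that noise.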
3.3. Implementation
Sensor faults are common in process operation. In practice, soft sensors still take data
from real process measurements, and hence faulty data can still degrade soft
sensor performance. To improve the robustness of the soft sensor, we integrated a
fault detection step with the soft sensor. In our work, we use a simple fault detection
method that detects faults based on the mean and standard deviation of the values in the
moving window. Any other suitable fault detection method can be substituted, ranging
from knowledge-based methods to multivariate statistical methods (e.g. PCA T^2). With such
a method, the model remains robust to sensor faults over a short time period.

Figure 2. Integration of the fault detection method with the soft sensor. The incoming
data is tested against the current window. Any faulty variables are replaced with the value
generated from the sensor model.
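The scheme in Figure 2 can be sketched as below. This is an assumption-laden illustration: the paper only states that faults are detected from the window mean and standard deviation, so the 3-sigma threshold, the `min_std` floor, the function name `recover` and the stand-in persistence model are all hypothetical choices, with `model` representing the trained soft sensor.

```python
import statistics


def recover(window, incoming, model, n_sigma=3.0, min_std=1e-6):
    """Test each variable of the incoming row against the current window and
    replace any flagged value with the model estimate for that variable.

    window:   past rows (one row per time step, one column per variable)
    incoming: the newly measured row
    model:    callable mapping the window to one estimate per variable
              (stand-in for the soft sensor)
    n_sigma:  assumed threshold, not given in the paper
    """
    estimate = None
    cleaned = []
    for j, value in enumerate(incoming):
        history = [row[j] for row in window]
        mean = statistics.mean(history)
        std = max(statistics.pstdev(history), min_std)  # floor avoids a zero band
        if abs(value - mean) > n_sigma * std:
            if estimate is None:
                estimate = model(window)  # only evaluated when a fault is found
            cleaned.append(estimate[j])
        else:
            cleaned.append(value)
    return cleaned


def persistence_model(window):
    """Hypothetical stand-in for the soft sensor: repeat the last row."""
    return list(window[-1])


window = [[1.0, 10.0], [1.1, 10.2], [0.9, 9.8], [1.0, 10.0]]
incoming = [1.05, 50.0]  # second variable carries a simulated spike fault

cleaned = recover(window, incoming, persistence_model)
window = window[1:] + [cleaned]  # slide the window using the recovered row
print(cleaned)
```

Because the recovered row, not the faulty one, enters the next window, a short fault does not contaminate later predictions. A tight threshold on a short window may also flag genuine process changes, which is one reason to consider the alternative detection methods mentioned above.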

4. Case Study
In this work, the proposed method is tested on an industrial pyrolysis reactor that
cracks naphtha to ethylene. In the process, the naphtha is first mixed with steam
before being cracked in the furnace. In practice, due to physical limitations, the
compositions of the gas-phase components in the reactor outlet flow cannot be measured in
real time. The measurement takes about 16 minutes for each sample from the stream.
Hence, the proposed soft sensor is applied to this process and its performance is tested
on historical process data.
In total, 21 variables are selected, including 5 key variables (volume percentage
measurements of methane, ethylene, ethane, propylene and propane) that require
real-time prediction, and 16 related variables (e.g. feedstock compositions, feed flow rate,
pressure, and temperature in different coils) that can provide extra process
information to the model. Sensor faults are manually introduced into the dataset to verify
the effectiveness of the proposed method.

5. Results and Discussion
5.1. Test results
The proposed model was trained on 9,000 data points from an operation cycle recorded
at 1-minute intervals. The model was validated on 1,000 data points from the same
operation cycle, and 500 data points from another operation cycle were used for testing.
In addition, sensor faults were introduced into the test data to verify the robustness of
the proposed method. The overall test result is summarized in Figure 3, where it is also
compared with a partial least squares (PLS) model that does not account for
time-correlation information.
The result demonstrates the capability of the proposed soft sensor to predict the key
variables. Even under the drifting condition (Fig. 3e), the proposed model can still
track the process trend. Additionally, when confronted with the sensor faults manually
introduced into the dataset from time 190 to 210, the proposed signal recovery scheme
proves effective. The comparison with the PLS model also indicates that the proposed
model outperforms this conventional linear, time-independent method.

[Figure 3 panels: (a) Methane, (b) Ethylene, (c) Ethane, (d) Propylene, (e) Propane]
Figure 3. Composition prediction of gas-phase components under sensor faults
introduced from time 190 to 210. The moving window size for the model is 10, and 60
filters for two filter region sizes of 5 and 10 are used in the convolutional layer.
5.2. Parameter selection
In the proposed soft sensor, the window size and filter configuration are important
parameters that affect the final performance. Hence, their effects are analysed (see
Table 1). Based on the experimental results, a relatively small window size is recommended.
A large window makes the model less sensitive to process changes and
drifting, so that an obvious delay can be observed, while a very small window makes the
model oversensitive to any fluctuations in the data, including potential noise. Regarding
the filter size selection, a large filter size that covers the whole moving window seems
very necessary, since in all parallel experiments for different window lengths, this filter
size setting achieves better performance than the others. In addition, using multiple filter
sizes, which provide multi-granularity features, can improve model performance.
Table 1. Effect of window size and filter combination on the model loss.
Window
5 10 40 60
size
20, 20, 40,
Filter 3, 5, 20, 30, 20,
2, 3 2, 5 3, 5 5, 10 10 30, 30, 50,
combination 10 30 40 30
40 40 60

Loss 0.008 0.01 0.01 0.004 0.006 0.014 0.04 0.04 0.03 0.08 0.09 0.07

6. Conclusions
In this work, a soft sensor based on the convolutional neural network was proposed, which
predicts the measurements at the next time step by extracting time-dependent correlations
from a moving window. The proposed method was verified on a dataset from an
industrial pyrolysis reactor. The model shows remarkable performance in predicting
the composition values of five gas-phase components. Additionally, in the
experiment with manually introduced sensor faults, the proposed method proved
robust to short-term sensor faults.

References
Baidu-Research. (2017, June). DeepBench. Retrieved from GitHub:
https://github.com/baidu-research/DeepBench
Chen, J. B. (1997). Robust estimation of measurement error variance/covariance from process
sampling data. Computers & Chemical Engineering, 593-600.
Hinton, G. (2010). A practical guide to training restricted Boltzmann machines. Momentum, 926.
Hinton, G. E. (2006). Reducing the dimensionality of data with neural networks. Science,
504-507.
LeCun, Y. Y. (2015). Deep learning. Nature, 436-444.
Salas, S. D. (2017). Online DEKF for state estimation in semi-batch free-radical
polymerization reactors. Computer Aided Chemical Engineering, 1465-1470.
Shang, C. Y. (2014). Data-driven soft sensor development based on deep learning technique.
Journal of Process Control, 223-233.
Yan, W. T. (2017). A data-driven soft sensor modeling method based on deep learning and its
application. IEEE Transactions on Industrial Electronics, 4237-4245.
Yao, L. &. (2017). Deep learning of semi-supervised process data with hierarchical extreme
learning machine and soft sensor application. IEEE Transactions on Industrial
Electronics.
Zhang, Y. &. (2015). A sensitivity analysis of (and practitioners' guide to) convolutional neural
networks for sentence classification. arXiv.
