
Dynamic Modeling of Complex Industrial Processes: Data-driven Methods and
Application Research
1st Edition
Chao Shang (Auth.)
Springer Theses
Recognizing Outstanding Ph.D. Research


Aims and Scope

The series “Springer Theses” brings together a selection of the very best Ph.D.
theses from around the world and across the physical sciences. Nominated and
endorsed by two recognized specialists, each published volume has been selected
for its scientific excellence and the high impact of its contents for the pertinent field
of research. For greater accessibility to non-specialists, the published versions
include an extended introduction, as well as a foreword by the student’s supervisor
explaining the special relevance of the work for the field. As a whole, the series will
provide a valuable resource both for newcomers to the research fields described,
and for other scientists seeking detailed background information on special
questions. Finally, it provides an accredited documentation of the valuable
contributions made by today’s younger generation of scientists.

Theses are accepted into the series by invited nomination only
and must fulfill all of the following criteria

• They must be written in good English.
• The topic should fall within the confines of Chemistry, Physics, Earth Sciences,
Engineering and related interdisciplinary fields such as Materials, Nanoscience,
Chemical Engineering, Complex Systems and Biophysics.
• The work reported in the thesis must represent a significant scientific advance.
• If the thesis includes previously published material, permission to reproduce this
must be gained from the respective copyright holder.
• They must have been examined and passed during the 12 months prior to
nomination.
• Each thesis should include a foreword by the supervisor outlining the
significance of its content.
• The theses should have a clearly defined structure including an introduction
accessible to scientists not expert in that particular field.

More information about this series at http://www.springer.com/series/8790


Chao Shang

Dynamic Modeling of Complex Industrial Processes: Data-driven Methods and
Application Research

Doctoral Thesis accepted by Tsinghua University, Beijing, China

Author
Dr. Chao Shang
Department of Automation
Tsinghua University
Beijing
China

Supervisor
Prof. Dexian Huang
Department of Automation
Tsinghua University
Beijing
China

ISSN 2190-5053 ISSN 2190-5061 (electronic)


Springer Theses
ISBN 978-981-10-6676-4 ISBN 978-981-10-6677-1 (eBook)
https://doi.org/10.1007/978-981-10-6677-1
Library of Congress Control Number: 2018931440

© Springer Nature Singapore Pte Ltd. 2018


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
part of Springer Nature
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Supervisor’s Foreword

Safe and steady operation of complex industrial processes is of significant
importance for maximizing economic and social benefits. In the past decades,
data-driven modeling methods have received much attention in the process industry
and have been widely applied to areas such as process monitoring, fault diagnosis,
and quality prediction. An active and exciting research field termed process data
analytics has hence emerged. Its popularity owes not only to the increasing
availability of routinely collected process data, but also to the significant
difficulty of developing reliable first-principle models with clear mechanisms.
While process data analytics literally refers to tools and techniques for modeling
industrial processes from datasets, an important implication is that the information
extracted must bear clear interpretations. In other words, data-based methods should
be useful for improving our understanding of industrial processes. However,
traditional process data analytics techniques fall short of leveraging the specific
characteristics of process data and yield results with less interpretable physical
meanings, thereby leaving a large quantity of information unused. This leads to a
series of practical limitations and erects obstacles to the development of
industrialization and informatization in the process industry.
In this doctoral thesis, Chao Shang investigates specific dynamic characteristics
of industrial process data, develops data-driven methods tailored to process data
analytics, and studies several practical problems including process monitoring, fault
diagnosis, control performance assessment, and soft sensing. Based on the slowly
varying nature of the essential variations of industrial processes, slow feature
analysis (SFA) is introduced to simultaneously describe static and dynamic
relationships among process variables. New monitoring statistics are then put forward
for concurrent monitoring of operating condition deviations and process dynamics
anomalies. A control performance monitoring approach is developed on the basis of
SFA, along with a real-time diagnosis method using contribution plots, thereby
alleviating smearing effects and making full use of the information within all kinds
of process variables. For time-varying processes, a recursive SFA algorithm is
proposed, along with the associated adaptive monitoring scheme, which turns out to be
beneficial for long-run operation of monitoring systems. Temporal coherence is
incorporated a priori into the states and parameters of soft sensing models,
respectively, yielding two new dynamic soft sensing models. First, probabilistic slow
feature regression is proposed, which extracts useful dynamic features from time
series data and synthesizes information from multi-rate data. Second, the classical
dynamic partial least squares model is enhanced by imposing temporal smoothness,
which alleviates over-fitting. Finally, a new dynamic nonlinear soft sensing
approach is developed, which describes process dynamics and nonlinearity by
first-order systems with pure delays and support vector machines, respectively.
Extensive simulations and real industrial case studies are used to verify the
effectiveness of the proposed methods. This thesis not only contributes a new
theoretical framework for process data analytics but also provides effective
strategies for process monitoring, fault diagnosis, control performance assessment,
and soft sensing in industrial practice.
Beyond this thesis, the work done by Chao Shang during his Ph.D. program is
also excellent in other respects. For example, the new modeling methods introduced
in this thesis have already been implemented in commercial software for advanced
process control. The deep learning technique for process data modeling developed
by Chao Shang, albeit not included in this thesis, has been widely applied to quality
prediction, crude classification, and carbon capture process modeling, and the
associated journal article has been among the most downloaded and cited articles in
the Journal of Process Control. To date, his publications have been cited about 200
times in total according to Google Scholar, which reflects the academic influence of
Chao Shang’s doctoral research.

Beijing, China, December 2017
Prof. Dexian Huang
Preface

This thesis develops a systematic data-based dynamic modeling framework for
industrial processes. Based on this framework, it then proposes novel strategies to
deal with control monitoring and quality prediction problems in industrial
production. The author reveals the slowly varying nature of industrial production
processes under feedback control and integrates it with process data analytics to
offer powerful prior knowledge that gives rise to statistical methods tailored to
industrial data. The thesis addresses several issues of immediate interest in
industrial practice, including process monitoring, control performance assessment
and diagnosis, monitoring system design, and product quality prediction. In
particular, it proposes a holistic and pragmatic design framework for industrial
monitoring systems, which enables effective elimination of false alarms, as well as
intelligent self-running by fully utilizing the information underlying the data. One
of the strengths of this thesis is the integration of knowledge from statistics,
machine learning, control theory and engineering, providing a new scheme for
industrial process modeling in the era of big data.

Beijing, China, December 2017
Dr. Chao Shang

Part of this thesis has been published in the following journal articles:

1. Chao Shang, Fan Yang, Xinqing Gao, Xiaolin Huang, Johan A.K. Suykens,
Dexian Huang, Concurrent monitoring of operating condition deviations and
process dynamics anomalies with slow feature analysis, AIChE Journal,
2015, 61(11):3666–3682. (Reproduced with permission).
2. Chao Shang, Biao Huang, Fan Yang, Dexian Huang, Slow feature analysis for
monitoring and diagnosis of control performance, Journal of Process Control,
2016, 39:21–34. (Reproduced with permission).
3. Chao Shang, Biao Huang, Fan Yang, Dexian Huang, Probabilistic slow feature
analysis-based representation learning from massive process data for soft sensor
modeling, AIChE Journal, 2015, 61(12):4126–4139. (Reproduced with
permission).
4. Chao Shang, Xiaolin Huang, Johan A.K. Suykens, Dexian Huang, Enhancing
dynamic soft sensors based on DPLS: A temporal smoothness regularization
approach, Journal of Process Control, 2015, 28:17–26. (Reproduced with
permission).
5. Chao Shang, Xinqing Gao, Fan Yang, Dexian Huang, Novel Bayesian framework
for dynamic soft sensor based on support vector machine with finite
impulse response, IEEE Transactions on Control Systems Technology, 2014,
22(4):1550–1557. (Reproduced with permission).

Acknowledgements

First of all, I would like to express my deepest gratitude to my supervisor, Prof.
Dexian Huang, for his unforgettable guidance and support during my Ph.D. program.
Professor Huang’s modest and gentlemanly manner is admirable and exemplary. He
not only guided me in shaping a rigorous and conscientious attitude toward academic
research, but also took meticulous care of me in my daily life, which has always
influenced and benefited me.
I am indebted to Prof. Biao Huang and Prof. Johan A. K. Suykens for their
valuable supervision during my visits to the University of Alberta and KU Leuven.
They provided me with highly stimulating environments where many exciting ideas
were shaped and fruitful outcomes were achieved. Beyond technical guidance, Prof.
Biao Huang has provided constructive suggestions and important help in planning
my early academic career, which I appreciate from the heart.
Thanks are also due to Prof. Fan Yang and Prof. Yongheng Jiang for their
generous support and help in various forms. I would also like to thank Prof. Hao
Ye, Prof. Shuning Wang, Prof. Ling Wang, Dr. Wenxiang Lyu, Prof. Li Li, Prof.
Xiao He, Prof. Jin Gu, Dr. Xiaodong Yu, Dr. Yujie Wei, Dr. Zhen Hu, Dr. Lei Shi,
Han Li, and Xinqing Gao at Tsinghua University and Prof. Sirish L. Shah, Prof.
Jinfeng Liu, Anahita Sdeghian, Dr. Nima Sammaknejad, Yaojie Lu, Ming Ma,
Xunyuan Yin, Ruomu Tan, Jing Zhang, Yanjun Ma, and Zheyuan Liu at the University
of Alberta for their consistent care and help with my research work.
Last but not least, I would like to express my sincere and deepest thanks to
my family. Without their encouragement and patience, I would not have been able
to finish this thesis.
Financial support from the National Basic Research Program of China
(No. 2012CB7505), the National Natural Science Foundation of China (Nos.
61433001, 21276137), the National High-Tech 863 Program of China
(No. 2013AA040702), and the Tsinghua University Initiative Scientific Research
Program is also gratefully acknowledged.

Contents

1 Introduction
  1.1 Background
  1.2 Literature Review on Multivariate Statistical Process Monitoring
    1.2.1 Principal Component Analysis
    1.2.2 Current Research Status of Process Monitoring
  1.3 Literature Review on Data-driven Soft Sensor Modeling
    1.3.1 Partial Least Squares
    1.3.2 Current Research Status of Soft Sensing
  1.4 Challenges and Opportunities
  1.5 Contents and Outline
  1.6 Conclusions
  References
2 Monitoring of Operating Condition and Process Dynamics with Slow Feature Analysis
  2.1 Introduction
  2.2 Slow Feature Analysis—Revisit
    2.2.1 Definition of Slowness and Some Notations
    2.2.2 Optimization Problem
    2.2.3 Solution Algorithm
    2.2.4 Geometric and Statistical Properties of SFA
    2.2.5 Comparison with Classic Latent Variable Models
  2.3 Process Monitoring with SFA
    2.3.1 Dimension Reduction of SFA
    2.3.2 Monitoring Statistics Design with Slow Features
  2.4 Case Studies
    2.4.1 Simulated CSTR
    2.4.2 TE Benchmark Process
  2.5 Conclusions
  References
3 Control Performance Monitoring and Diagnosis Based on SFA and Contribution Plot
  3.1 Introduction
  3.2 Control Performance Monitoring with SFA
  3.3 Control Performance Diagnosis with Contribution Analysis
    3.3.1 Revisiting Contribution Analysis
    3.3.2 Control Performance Diagnosis
    3.3.3 Comparison with the Covariance-Based Approach
  3.4 Case Studies
    3.4.1 A Simple Simulated Multivariate Process
    3.4.2 TE Benchmark Process
  3.5 Conclusions
  References
4 Recursive SFA Algorithm and Adaptive Monitoring System Design
  4.1 Introduction
  4.2 Updating Law for Covariance Matrices
  4.3 The Recursive SFA Algorithm
    4.3.1 Improved Monitoring Statistics
    4.3.2 Rank-One Modification for the First Decomposition
    4.3.3 OIP Algorithm for the Second Decomposition
    4.3.4 Complexities of the RSFA Algorithm
  4.4 Adaptive Monitoring System Design
    4.4.1 Control Limits Updating
    4.4.2 An Improved Stopping Criterion for Model Updating
  4.5 Case Studies
    4.5.1 Simulated CSTR
    4.5.2 An Industrial Heating Furnace System
  4.6 Conclusions
  References
5 Probabilistic Slow Feature Regression for Dynamic Soft Sensing
  5.1 Introduction
  5.2 Probabilistic SFA
    5.2.1 Mathematical Definition
    5.2.2 Parameter Estimation Using the EM Algorithm
  5.3 Soft Sensor Modeling Based on Probabilistic Slow Feature Regression
    5.3.1 Probabilistic Slow Feature Regression
    5.3.2 Online Implementation
  5.4 Case Studies
    5.4.1 A Hybrid Tank System
    5.4.2 SRU Process
  5.5 Conclusions
  References
6 Enhanced Dynamic PLS with Temporal Smoothness for Soft Sensing
  6.1 Introduction
  6.2 Dynamic PLS with Temporal Smoothness Regularization
    6.2.1 Problem Description
    6.2.2 Optimization Problem
    6.2.3 Implementation Procedure in Soft Sensing
  6.3 Case Studies
    6.3.1 Numerical Simulations
    6.3.2 TE Benchmark Process
  6.4 Conclusions
  References
7 Nonlinear Dynamic Soft Sensing Based on Bayesian Inference
  7.1 Introduction
  7.2 Support Vector Machine and Its Probabilistic Interpretation
    7.2.1 Problem Description
    7.2.2 Bayesian Interpretation
  7.3 Nonlinear Dynamic Soft Sensor Model Based on FIR and SVM
  7.4 Parameter Optimization Based on Bayesian Inference
    7.4.1 Optimizing Model Parameters of SVM
    7.4.2 Optimizing Regularization Parameters
    7.4.3 Optimizing Kernel Parameter
    7.4.4 Optimizing Dynamic Parameters in FIR
  7.5 Case Studies
    7.5.1 A Simulated Example
    7.5.2 Online Prediction of Propylene Melt Index
  7.6 Conclusions
  References
8 Conclusions and Recommendations
  8.1 Concluding Remarks
  8.2 Recommendations for Future Work
Acronyms

AR(1)   First-order autoregressive
CPA     Control performance assessment
CPM     Control performance monitoring
CSTR    Continuous stirred tank reactor
CV      Controlled variable
CVA     Canonical variate analysis
DCS     Distributed control system
DPCA    Dynamic principal component analysis
DPLS    Dynamic partial least squares
DV      Disturbance variable
EM      Expectation maximization
EWMA    Exponentially weighted moving average
FIR     Finite impulse response
GEP     Generalized eigenvalue problem
ICA     Independent component analysis
LDS     Linear dynamical system
LS      Least squares
MI      Melt index
MSPM    Multivariate statistical process monitoring
MV      Manipulated variable
OIP     Orthogonal iteration procedure
PCA     Principal component analysis
PCR     Principal component regression
PID     Proportional-integral-derivative
PLS     Partial least squares
PSFA    Probabilistic slow feature analysis
RBC     Reconstruction-based contribution
RMSE    Root-mean-squared error
RSFA    Recursive slow feature analysis
SFA     Slow feature analysis
SPE     Squared prediction error
SVD     Singular value decomposition
SVM     Support vector machine
TE      Tennessee Eastman
Chapter 1
Introduction

Abstract This chapter provides an overview of the entire thesis. First, the research
background of process data analytics for modeling, monitoring and fault diagnosis in
industrial processes is introduced. Then an extensive review of existing research on
multivariate statistical process monitoring and soft sensor modeling is given. Third,
the opportunities and challenges in advancing industrial applications of process data
analytics are highlighted. Finally, the main contents and the layout of this thesis are
provided.

1.1 Background

The process industry plays an indispensable role in the national economy of
China. Typical examples include the petroleum industry, power systems, the
metallurgical industry, and the paper-making industry, to name just a few. With the
development of technology, society and the economy, an increasing number of
industrial enterprises have been emerging in China. Moreover, production plants have
become more complicated and more densely distributed, which poses significant
challenges for operations and management. As a result, any accident in industrial
practice may lead to disastrous consequences, such as serious casualties, economic
losses and environmental pollution. Hence the safety and efficiency of operations in
the process industry are of great importance in safeguarding economic interests and
social benefits.
Control technologies have been extensively adopted in the process industry.
Although various disturbances universally exist, processes can still operate around
ideal working points that are specified in advance, since the influence of
disturbances can be well compensated by the feedback behavior of control systems. It
is hence crucial for process operators to monitor the operating states of processes
online. For the sake of maintenance, any deviation from the nominal operating
condition should be detected in a timely manner. Moreover, some key variables in
industrial production, such as the boiling point of petroleum products and the
pollutant content in emissions, are directly related to final product quality or
environmental indices; they deserve more attention than other variables and must be
carefully controlled. To achieve this goal, real-time measurements are necessary.
However, due to technological limitations or economic concerns, such key variables
are often difficult to measure directly. The only feasible source of information is
laboratory analysis, which is typically time-consuming and involves significant
delays. Therefore, it cannot satisfy the requirement of real-time monitoring and
control in practice. In this sense, online monitoring of the operating status of
processes and real-time prediction of key variables are two quintessential issues in
the production of modern process industries [1].

© Springer Nature Singapore Pte Ltd. 2018
C. Shang, Dynamic Modeling of Complex Industrial Processes: Data-driven Methods and
Application Research, Springer Theses, https://doi.org/10.1007/978-981-10-6677-1_1

Model-based methods can provide accurate descriptions of process dynamics
and input-output relationships. Based on this, techniques for parameter estimation
and state estimation have been systematically developed [2–5] and successfully
applied to power systems and mechanical engineering, in which the research objects
are amenable to clear physical mechanisms. Nevertheless, modern industrial processes
typically involve a great number of measurements, and their inherent mechanisms are
extremely complicated. This erects obvious obstacles to establishing accurate
mathematical models, thereby heavily compromising the applicability of model-based
methods. Since 1990, with the ongoing advancement of sensor technology and
distributed control systems (DCS), massive data have been collected in the process
industry. The explosive availability of process data provides an asset for
decision-making in the process industry. Under such circumstances, statistical
process control approaches emerged and became more and more prevalent. The most
representative techniques are multivariate statistical process monitoring (MSPM) and
soft sensing, which have received tremendous attention in both academia and industry
[6–9].
The standpoint of MSPM is to describe the probability distribution of process data
via multivariate statistical analysis. Based on the statistical properties of data,
one can achieve online monitoring and evaluation of the operating condition of
industrial processes, including abnormal event detection, fault diagnosis and fault
reconstruction. As for soft sensing technology, the core idea is to establish a
mathematical mapping between fast-sampled process variables and key variables that
are hard to measure, which can be used for online estimation as an alternative to a
hardware sensor. To build a soft sensor model, one typically first down-samples the
process variables to synchronize them with the key variables, and then treats the
problem as an ordinary regression task, which can be tackled using pattern
recognition and machine learning techniques. Next, we proceed with a comprehensive
overview of MSPM and soft sensing techniques.
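
The conventional soft sensor workflow described above (down-sample the process variables to the laboratory sampling instants, then fit a static regression) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic data, the sampling ratio of 20, the names `X_fast`, `lab_idx` and `true_w`, and the use of ordinary least squares as the regression method. It is not the modeling approach developed in this thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 3 process variables sampled at every step, and one
# quality variable that the laboratory reports only every 20 steps.
n_fast, n_vars, ratio = 2000, 3, 20
X_fast = rng.normal(size=(n_fast, n_vars))
true_w = np.array([1.5, -0.7, 0.3])  # assumed ground truth, illustration only
y_fast = X_fast @ true_w + 0.05 * rng.normal(size=n_fast)

# Down-sample: keep only the instants at which a laboratory value exists.
lab_idx = np.arange(ratio - 1, n_fast, ratio)
X_lab, y_lab = X_fast[lab_idx], y_fast[lab_idx]

# Static least-squares regression on the synchronized pairs (with intercept).
A = np.column_stack([X_lab, np.ones(len(lab_idx))])
coef, *_ = np.linalg.lstsq(A, y_lab, rcond=None)

# Online use: estimate the quality variable at every fast sampling instant.
y_hat = np.column_stack([X_fast, np.ones(n_fast)]) @ coef
rmse = float(np.sqrt(np.mean((y_hat - y_fast) ** 2)))
print(f"lab samples used: {len(lab_idx)}, RMSE on all fast samples: {rmse:.3f}")
```

Note that the down-sampling step discards the fast-sampled data between laboratory instants, so a model built this way uses only a small fraction of the available measurements.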

1.2 Literature Review on Multivariate Statistical Process Monitoring

When an industrial process has a relatively simple structure and a small number
of process variables, it is practical to set a control limit for each process
variable individually in order to detect abnormal behavior of the process. For
example, for a certain process variable $x$, an upper limit $x_{\max}$ and a lower
limit $x_{\min}$ can be set, and an abnormal condition is considered to occur when
$x > x_{\max}$ or $x < x_{\min}$. However, in modern industrial processes, there are
usually hundreds of process variables measured simultaneously. It hence becomes
unrealistic to set control limits for process variables individually. Moreover,
under closed-loop control, process variables may show significant correlations. On
some occasions, it is likely that correlations between variables are violated while
the control limits for individual variables are still met. To address this
challenge, the principal component analysis (PCA)-based MSPM technique can help
extract low-dimensional information from high-dimensional data, based on which
simple monitoring statistics can be defined and applied. In the past twenty years,
it has been successfully applied in chemical engineering, the pharmaceutical
industry, and the iron-making industry, and it has laid a solid foundation for the
implementation of advanced process control (APC) technology.
PCA was first proposed by Pearson in the early 20th century, which is the most
widely adopted method in multivariate statistical analysis. Next, we introduce its
principle and the resulted monitoring scheme, and then revisit some recent advances
in MSPM.

1.2.1 Principal Component Analysis

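Assume that the process data constitute a matrix $\mathbf{X} \in \mathbb{R}^{N \times m}$,
where $N$ and $m$ stand for the number of data samples and the data dimension,
respectively, and that each dimension of the data has already been normalized.¹
In the PCA model, the matrix $\mathbf{X}$ is decomposed as [10]:

$$\mathbf{X} = \hat{\mathbf{X}} + \mathbf{E} = \mathbf{T}\mathbf{P}^{T} + \mathbf{E} \tag{1.1}$$

where $\mathbf{T} = \mathbf{X}\mathbf{P} \in \mathbb{R}^{N \times A}$ is the score matrix of
principal components (PCs), and $\mathbf{P} \in \mathbb{R}^{m \times A}$ is the loading
matrix. The number of PCs is given by $A \le m$. $\mathbf{E} \in \mathbb{R}^{N \times m}$
is the residual matrix, which contains the information left unexplained by the PCs.
By performing singular value decomposition (SVD) on the covariance matrix of
$\mathbf{x}$, one obtains the PCA model:

$$\mathbf{R}_{xx} = \frac{1}{N-1}\mathbf{X}^{T}\mathbf{X} := \begin{bmatrix}\mathbf{P} & \tilde{\mathbf{P}}\end{bmatrix} \begin{bmatrix}\boldsymbol{\Lambda} & \mathbf{0}\\ \mathbf{0} & \tilde{\boldsymbol{\Lambda}}\end{bmatrix} \begin{bmatrix}\mathbf{P} & \tilde{\mathbf{P}}\end{bmatrix}^{T}, \tag{1.2}$$

$$\boldsymbol{\Lambda} = \mathrm{diag}\{\lambda_1, \ldots, \lambda_A\}, \quad \tilde{\boldsymbol{\Lambda}} = \mathrm{diag}\{\lambda_{A+1}, \ldots, \lambda_m\}, \quad \lambda_1 \ge \cdots \ge \lambda_m \tag{1.3}$$

That is, the matrix $\mathbf{P}$ is formed by the singular vectors associated with the
$A$ largest singular values. According to the properties of SVD, the projection
directions of different PCs are orthogonal, and the scores of different PCs are
mutually uncorrelated. There are a variety of approaches for selecting the number
$A$ of PCs, and a comprehensive introduction in this vein is given by [11].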
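
The decomposition in (1.1)–(1.3) can be illustrated with a short NumPy sketch. It is only a minimal example under stated assumptions: the function name `fit_pca` and the synthetic data are hypothetical and not taken from the text, and an eigendecomposition is used in place of SVD, which gives the same factors for a symmetric positive semi-definite covariance matrix.

```python
import numpy as np


def fit_pca(X, A):
    """Fit a PCA model with A principal components to normalized data X (N x m)."""
    N = X.shape[0]
    R = X.T @ X / (N - 1)              # sample covariance matrix R_xx, Eq. (1.2)
    lam, V = np.linalg.eigh(R)         # eigh returns eigenvalues in ascending order
    order = np.argsort(lam)[::-1]      # re-sort so that lambda_1 >= ... >= lambda_m
    lam, V = lam[order], V[:, order]
    P = V[:, :A]                       # loading matrix (m x A)
    T = X @ P                          # score matrix (N x A)
    E = X - T @ P.T                    # residual matrix, Eq. (1.1)
    return P, T, E, lam


# Hypothetical data: three correlated variables driven by two latent factors.
rng = np.random.default_rng(0)
raw = rng.normal(size=(500, 2)) @ np.array([[1.0, 0.9, 0.5], [0.0, 0.3, 0.8]])
raw += 0.05 * rng.normal(size=raw.shape)
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)   # zero mean, unit variance

P, T, E, lam = fit_pca(X, A=2)
print("eigenvalues:", lam)
```

With this construction the columns of `P` are orthonormal and the sample covariance of the scores is diagonal, which mirrors the orthogonality properties stated above.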

¹ Correlation-based PCA requires that each dimension has zero mean and unit variance,
while covariance-based PCA only asks for zero-mean normalization.

Fig. 1.1 Explicit explanations of PCA in a 2-D data space (axes $x_1$ and $x_2$; labels: direction of 1st PC $\mathbf{p}_1$, direction of 2nd PC $\mathbf{p}_2$)

PCA can be well understood both geometrically and statistically, as indicated
in Fig. 1.1. From a geometric point of view, the coordinate axes $\{x_1, \ldots, x_m\}$
of the $m$-dimensional space are rotated so that they align with the directions of
largest variance. From a statistical point of view, PCA can be seen as fitting an
ellipsoid that represents a multivariate Gaussian distribution, with the directions
of the PCs corresponding to the $A$ major axes of the ellipsoid.
The use of PCA in modeling process data has a physical interpretation. Under
closed-loop control, inherent disturbances underlying the process not only affect
controlled variables (CVs) but also exert an implicit influence on manipulated
variables (MVs) via feedback control. Therefore, all types of process variables tend
to be subject to inherent disturbances and are thus significantly correlated. By
extracting the subspace with the largest variations in PCA, we can derive
low-dimensional PCs that abstractly represent such “common cause” disturbances.
Consequently, it is easy to define the well-known Hotelling’s $T^2$ statistic based
on the variation in the PC subspace:

$$T^2 = \mathbf{x}^{T}\mathbf{P}\boldsymbol{\Lambda}^{-1}\mathbf{P}^{T}\mathbf{x} \le T_\alpha^2, \tag{1.4}$$

which essentially evaluates the “magnitude” of nominal variations of the process.
The control limit $T_\alpha^2$ can be calculated according to the following formula:

$$T_\alpha^2 = \frac{A(N^2 - 1)}{N(N - A)} F_{A,N-A,\alpha} \tag{1.5}$$

where $F_{A,N-A,\alpha}$ denotes the critical value of the $F$-distribution with $A$ and
$N - A$ degrees of freedom at confidence level $\alpha$. In addition, the variations in
the residual subspace can be characterized by the squared prediction error (SPE)
statistic:

$$\mathrm{SPE} = \|(\mathbf{I} - \mathbf{P}\mathbf{P}^{T})\mathbf{x}\|^2 \le Q_{A,\alpha}, \tag{1.6}$$



where the control limit $Q_{A,\alpha}$ is calculated as [12]:

$$Q_{A,\alpha} = \theta_1 \left[ \frac{c_\alpha \sqrt{2\theta_2 h_0^2}}{\theta_1} + 1 + \frac{\theta_2 h_0 (h_0 - 1)}{\theta_1^2} \right]^{1/h_0} \tag{1.7}$$

$$h_0 = 1 - \frac{2\theta_1\theta_3}{3\theta_2^2}, \quad \theta_j = \sum_{i=A+1}^{m} \lambda_i^{j}, \quad j = 1, 2, 3 \tag{1.8}$$

where $c_\alpha$ is the $\alpha$-quantile of the standard Gaussian distribution.
Alternatively, Qin [13] proposed a simplified approach to computing $Q_{A,\alpha}$:

$$Q_{A,\alpha} = g\chi^2_{h,\alpha}, \tag{1.9}$$

where $g = \theta_2/\theta_1$ and $h = \theta_1^2/\theta_2$. Different from the $T^2$
statistic, the SPE statistic describes the violation of correlations between process
variables. The $T^2$ and SPE statistics are complementary to each other, in that
together they account for all variations in the data space. For convenience in
practical use, Yue and Qin [14] put forward a combined index, which synthesizes the
information within both the $T^2$ and SPE statistics:

$$\varphi(\mathbf{x}) = \frac{\mathrm{SPE}(\mathbf{x})}{\delta_\alpha^2} + \frac{T^2(\mathbf{x})}{T_\alpha^2} = \mathbf{x}^{T}\boldsymbol{\Phi}\mathbf{x}. \tag{1.10}$$
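
The monitoring statistics in (1.4) and (1.6) and the SPE control limit in (1.7)–(1.8) can be sketched as follows. This is an illustrative example only: the function names `spe_limit` and `t2_and_spe` and the synthetic data are assumptions, not part of the original text. The $T^2$ limit in (1.5) is left out because it needs an $F$-distribution quantile, which would require a statistics library beyond NumPy and the Python standard library.

```python
from statistics import NormalDist

import numpy as np


def spe_limit(lam_res, alpha=0.99):
    """SPE control limit Q_{A,alpha} from Eqs. (1.7)-(1.8).

    lam_res holds the residual eigenvalues lambda_{A+1}, ..., lambda_m.
    """
    th1, th2, th3 = (float(np.sum(lam_res ** j)) for j in (1, 2, 3))
    h0 = 1.0 - 2.0 * th1 * th3 / (3.0 * th2 ** 2)
    c_alpha = NormalDist().inv_cdf(alpha)          # standard Gaussian quantile
    inner = (c_alpha * np.sqrt(2.0 * th2 * h0 ** 2) / th1
             + 1.0 + th2 * h0 * (h0 - 1.0) / th1 ** 2)
    return th1 * inner ** (1.0 / h0)


def t2_and_spe(x, P, lam_pc):
    """Hotelling's T^2 (Eq. 1.4) and SPE (Eq. 1.6) for one normalized sample x."""
    t = P.T @ x                                    # scores in the PC subspace
    resid = x - P @ t                              # (I - P P^T) x
    return float(t @ (t / lam_pc)), float(resid @ resid)


# Hypothetical normal-operation data: 5 variables driven by 2 latent factors.
rng = np.random.default_rng(1)
mix = rng.normal(size=(2, 5))
raw = rng.normal(size=(2000, 2)) @ mix + 0.1 * rng.normal(size=(2000, 5))
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

A = 2
lam, V = np.linalg.eigh(X.T @ X / (X.shape[0] - 1))
lam, V = lam[::-1], V[:, ::-1]                     # descending eigenvalues
P, lam_pc, lam_res = V[:, :A], lam[:A], lam[A:]

Q_lim = spe_limit(lam_res, alpha=0.99)
spe_train = np.array([t2_and_spe(x, P, lam_pc)[1] for x in X])
frac_below = float(np.mean(spe_train <= Q_lim))    # share of training samples inside the limit

# A sample lying purely in the residual subspace breaks the learned correlations.
x_fault = 5.0 * V[:, -1]
t2_fault, spe_fault = t2_and_spe(x_fault, P, lam_pc)
print(Q_lim, frac_below, t2_fault, spe_fault)
```

The artificial sample `x_fault` has a $T^2$ value of essentially zero but a large SPE, which illustrates why the two statistics are complementary.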

1.2.2 Current Research Status of Process Monitoring

Figure 1.2 summarizes the framework of existing studies on MSPM, which primarily
comprises two research streams. The first stream designs statistical models based on
different properties of process data. Although linear correlation has already been
well addressed by PCA-based monitoring approaches, their underlying assumptions
are still subject to some obvious limitations.
In order to address the non-Gaussianity of process data, many researchers have
proposed to use independent component analysis (ICA) and its extensions for process
data modeling and monitoring [15–17]. ICA uses high-order moment information to
measure the independence between latent variables. Due to the non-Gaussianity, the
control limits in ICA-based monitoring approaches are rather difficult to establish,
and one needs to resort to non-parametric methods such as kernel density estimation
[18, 19] to obtain reliable estimates of the control limits.
Focusing on the nonlinear correlations among process variables, Kramer [20] first
proposed the concept of autoassociative neural networks to achieve nonlinear
dimension reduction of high-dimensional data. Later, Dong and McAvoy [21] developed
the principal curve model, in which nonlinear curves and neural networks are used to
capture most of the information in process data. With the prosperity of machine
learning techniques, kernel learning methods, notably kernel PCA (KPCA) [22], have
been widely adopted for process monitoring [23–25]. The basic idea of kernel learning
methods is to map the original input to a high-dimensional reproducing kernel
Hilbert space (RKHS), and thereby translate the problem of nonlinear dimension
reduction into a linear PCA modeling task, which can be tackled conveniently by
means of the kernel trick [22]. A potential limitation of kernel models is that kernel
parameters are commonly difficult to choose. Meanwhile, the model complexity grows
linearly with the number of available data samples, thereby inevitably incurring high
costs in practical implementations.

Fig. 1.2 Framework of existing studies on MSPM (labels: fault detection, fault diagnosis, fault reconstruction, fault prognosis, ...; linear correlation, non-Gaussianity, nonlinearity, dynamics, time-varying behaviors)

Industrial processes often exhibit obvious dynamic characteristics. To address this
issue, Ku et al. [26] proposed the dynamic PCA (DPCA) model for process monitoring,
which includes lagged measurements in the classic PCA model. State-space models with
Markov properties offer an efficient way of describing process dynamics. To build
state-space models from time series data, researchers have proposed to employ
canonical variate analysis (CVA) [27, 28] and partial least squares (PLS) [29]. Dunia
and Qin discussed the problem of fault diagnosis based on state-space models [30].
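
The lagged-measurement idea behind DPCA can be sketched as a simple data augmentation step. The helper name `lagged_matrix` and the lag order `d` are illustrative assumptions; the text does not specify how [26] chooses the number of lags.

```python
import numpy as np


def lagged_matrix(X, d):
    """Stack each sample with its d predecessors.

    Row k of the result is [x_k, x_{k-1}, ..., x_{k-d}], so an (N x m) matrix
    becomes an ((N - d) x m(d + 1)) matrix on which ordinary PCA can be applied.
    """
    N = X.shape[0]
    return np.hstack([X[d - lag: N - lag] for lag in range(d + 1)])


X = np.arange(10.0).reshape(5, 2)     # five samples of two variables
Xd = lagged_matrix(X, d=2)
print(Xd)
```

Each additional lag adds $m$ columns, so the augmented dimension grows quickly with the lag order.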
Industrial processes are inherently time-varying. To accommodate such time-varying
behaviors, adaptive monitoring approaches have been proposed and well studied.
Wold [31] first proposed to use the exponentially weighted moving average (EWMA)
approach to update the model online. An alternative method called moving window
(MW) was proposed by Wang et al. [32]. The common idea behind these adaptive
methods is to gradually discount the effect of historical data on the model, thereby
enabling the model to adapt to recent data and track inherent changes within the
process. In order to reduce the computational burden of online implementation,
Li et al. [33] proposed two different algorithms for recursive PCA. A comprehensive
summary of existing adaptive monitoring methods is given by Jeng [34].
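
The idea of gradually discounting historical data can be sketched with a generic exponentially weighted update of the mean and covariance. This is not the specific algorithm of [31], [32], or [33]; the function name `ewma_update` and the forgetting factor value are assumptions chosen for illustration.

```python
import numpy as np


def ewma_update(mean, cov, x, forget=0.99):
    """One exponentially weighted update of the mean and covariance with sample x.

    A forgetting factor close to 1 gives a long memory; smaller values adapt faster.
    """
    new_mean = forget * mean + (1.0 - forget) * x
    dev = x - new_mean
    new_cov = forget * cov + (1.0 - forget) * np.outer(dev, dev)
    return new_mean, new_cov


# Hypothetical stream whose operating point has moved to [3, -1].
rng = np.random.default_rng(2)
mean, cov = np.zeros(2), np.eye(2)
for _ in range(2000):
    x = np.array([3.0, -1.0]) + rng.normal(size=2)
    mean, cov = ewma_update(mean, cov, x, forget=0.99)
print(mean)
```

A PCA model can then be refreshed from the updated covariance whenever needed, at the cost of repeating the decomposition.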

The other stream in Fig. 1.2 follows the functionality of monitoring systems, from
fault detection and fault diagnosis to fault reconstruction and fault prognosis.
After detecting anomalies with MSPM, we need to locate the possibly faulty variables
as the “root cause”, or identify the fault type with some prior fault information
at hand. In this way, useful information within abnormal data can be further
extracted and leveraged to assist the decision-making of process operators. The
contribution plot is the most widely used method for fault diagnosis; it
quantitatively evaluates the contribution of each process variable to the fault by
decomposing the monitoring statistics. Fault reconstruction aims to determine the
magnitude of the fault after the fault has been correctly classified. Fault prognosis
focuses on modeling the evolution of incipient faults and predicting the remaining
useful life, which is helpful for taking maintenance actions and avoiding hazardous
events. Notice that the methodologies of fault diagnosis, fault reconstruction, and
fault prognosis heavily depend on the statistical model that is used for modeling
process data. Existing studies are primarily based on the basic PCA model and its
extensions. For other models, it is not a trivial task to repeat the methodologies
applicable to the PCA case, and one needs to design tailored techniques to achieve
fault diagnosis, reconstruction, and prognosis.
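
A contribution plot for the SPE statistic can be sketched by splitting the squared residual into per-variable terms. The function name `spe_contributions` and the toy loading matrix are hypothetical; this is one common way of decomposing SPE and not necessarily the exact scheme used in the cited works.

```python
import numpy as np


def spe_contributions(x, P):
    """Per-variable contributions to SPE: squared entries of the residual (I - P P^T) x."""
    resid = x - P @ (P.T @ x)
    return resid ** 2


# Toy model: four variables that normally move together along one direction.
P = np.full((4, 1), 0.5)                  # single unit-norm loading vector
x_normal = np.array([1.0, 1.0, 1.0, 1.0])
x_faulty = x_normal.copy()
x_faulty[2] += 3.0                        # bias added to the third variable

contrib = spe_contributions(x_faulty, P)
print(contrib)
```

Here the largest contribution points at the biased variable, although part of the fault also spreads to the other variables through the projection.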

1.3 Literature Review on Data-driven Soft Sensor Modeling

The idea of soft sensing was first proposed by Brosilow in 1978 when he and his
colleagues studied the inferential control problem of multi-rate systems [35, 36].
Figure 1.3 illustrates the typical multi-rate sampling phenomenon in industrial
processes. Process variables, such as temperature, pressure, and flow rate, are
usually sampled at short intervals, shown as circles in Fig. 1.3. In contrast,
measurements of quality variables are unavailable most of the time, as indicated by
red crosses in Fig. 1.3. Typically they can be obtained only by manual laboratory
analysis, which leads to large, sometimes even irregular, sampling intervals. In order
to obtain real-time predictions of quality variables at all sampling instants, we can
establish functional mappings from process variables to quality variables by
analyzing limited historical data. Such mathematical models can be used in place of
hardware sensors, which is why they are commonly referred to as “soft sensors”.
In this way, real-time monitoring and control of key quality indices become feasible.
Åström and McAvoy regarded soft sensing as the most challenging and fundamental
problem in chemical process control [37]. After decades of development, significant
progress has been made in both the theory and the industrial applications of soft
sensing technology. International enterprises such as Aspen and Honeywell have
launched commercial software products that have been widely adopted in a variety of
industrial production processes.

Fig. 1.3 Multi-rate sampling in industrial processes (time axis; rows: process variables, quality variables)

As a basic approach in soft sensor modeling, PLS first emerged in the field of
multivariate statistical analysis. In 1975, Wold, a famous Swedish statistician, first
introduced PLS to path analysis problems in economics and social science [38].
The following decades witnessed a remarkable growth in practical applications of
PLS, especially in economics and chemometrics. In the process industries, PLS-based
soft sensor models have been applied to both continuous production processes [39]
and batch processes [40]. In the following subsection, we first briefly introduce the
basic principle behind the PLS model, and then review the current research status of
soft sensing methodologies.

1.3.1 Partial Least Squares

Assume that process data and quality data are collected in matrices
$\mathbf{X} \in \mathbb{R}^{N \times m}$ and $\mathbf{Y} \in \mathbb{R}^{N \times p}$,
respectively, where $p$ denotes the number of quality variables. In industrial
practice, the sampling frequency of process data is much higher than that of quality
data, so the actual number of available process data samples is much greater than
$N$. This means that the matrix $\mathbf{X}$ has to be obtained by down-sampling the
fast-rate process data. Assuming that each dimension of both $\mathbf{X}$ and
$\mathbf{Y}$ has been scaled to zero mean and unit variance in advance, the PLS model
makes use of an $A$-dimensional subspace that explains information in both the input
$\mathbf{X}$ and the output $\mathbf{Y}$. Mathematically, the PLS model is described
as follows:

$$\begin{aligned} \mathbf{X} &= \mathbf{T}\mathbf{P}^{T} + \mathbf{E} \\ \mathbf{Y} &= \mathbf{T}\mathbf{Q}^{T} + \mathbf{F} \end{aligned} \tag{1.11}$$

where $\mathbf{T} \in \mathbb{R}^{N \times A}$ is the score matrix, and
$\mathbf{P} \in \mathbb{R}^{m \times A}$ and $\mathbf{Q} \in \mathbb{R}^{p \times A}$ stand
for the loading matrices of $\mathbf{X}$ and $\mathbf{Y}$, respectively. The optimal
number of PCs $A$ can be determined by means of cross-validation. To derive the
parameters of the PLS model, the nonlinear iterative partial least squares (NIPALS)
algorithm is a basic and widely adopted approach, whose implementation details are
given in Table 1.1.
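
Table 1.1 The NIPALS algorithm [41]
Initialization: Let $i = 1$, $\mathbf{X}_1 = \mathbf{X}$, and $\mathbf{Y}_1 = \mathbf{Y}$
Step 1: Let $\mathbf{u}_i$ be an arbitrary column of $\mathbf{Y}_i$
Step 2: $\mathbf{w}_i = \mathbf{X}_i^{T}\mathbf{u}_i / \|\mathbf{X}_i^{T}\mathbf{u}_i\|$
Step 3: $\mathbf{t}_i = \mathbf{X}_i\mathbf{w}_i$
Step 4: $\mathbf{q}_i = \mathbf{Y}_i^{T}\mathbf{t}_i / (\mathbf{t}_i^{T}\mathbf{t}_i)$
Step 5: $\mathbf{u}_i = \mathbf{Y}_i\mathbf{q}_i / (\mathbf{q}_i^{T}\mathbf{q}_i)$
Examine whether $\mathbf{u}_i$ has converged. If not, return to Step 2; otherwise continue with the following steps
Step 6: $\mathbf{p}_i = \mathbf{X}_i^{T}\mathbf{t}_i / (\mathbf{t}_i^{T}\mathbf{t}_i)$
Step 7: $\mathbf{X}_{i+1} = \mathbf{X}_i - \mathbf{t}_i\mathbf{p}_i^{T}$, $\mathbf{Y}_{i+1} = \mathbf{Y}_i - \mathbf{t}_i\mathbf{q}_i^{T}$
Let $i = i + 1$, and return to Step 1 until $i > A$

After obtaining the model parameters $\mathbf{W} = [\mathbf{w}_1, \ldots, \mathbf{w}_A]$,
$\mathbf{P} = [\mathbf{p}_1, \ldots, \mathbf{p}_A]$ and
$\mathbf{Q} = [\mathbf{q}_1, \ldots, \mathbf{q}_A]$, given a new input data sample
$\mathbf{x}_{\mathrm{new}}$, the corresponding prediction is given by:

$$\hat{\mathbf{y}} = \mathbf{Q}(\mathbf{W}^{T}\mathbf{P})^{-1}\mathbf{W}^{T}\mathbf{x}_{\mathrm{new}}, \tag{1.12}$$

which is a linear function of the input data and can be directly adopted for online
prediction of key variables.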
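
The steps of Table 1.1 and the prediction formula (1.12) can be sketched in NumPy as follows. The function names `nipals_pls` and `pls_predict`, the convergence tolerance, the iteration cap, and the choice of the first column of $\mathbf{Y}_i$ as the starting vector are all assumptions made for illustration; the demo data are only centered, not scaled to unit variance.

```python
import numpy as np


def nipals_pls(X, Y, A, tol=1e-10, max_iter=500):
    """NIPALS for PLS following Table 1.1; returns W, P, Q with A columns each."""
    X, Y = X.copy(), Y.copy()
    m, p = X.shape[1], Y.shape[1]
    W, P, Q = np.zeros((m, A)), np.zeros((m, A)), np.zeros((p, A))
    for i in range(A):
        u = Y[:, 0].copy()                          # Step 1: a column of Y_i
        for _ in range(max_iter):
            w = X.T @ u
            w /= np.linalg.norm(w)                  # Step 2
            t = X @ w                               # Step 3
            q = Y.T @ t / (t @ t)                   # Step 4
            u_new = Y @ q / (q @ q)                 # Step 5
            converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p_i = X.T @ t / (t @ t)                     # Step 6
        X = X - np.outer(t, p_i)                    # Step 7: deflate X
        Y = Y - np.outer(t, q)                      #         and Y
        W[:, i], P[:, i], Q[:, i] = w, p_i, q
    return W, P, Q


def pls_predict(x_new, W, P, Q):
    """Prediction according to Eq. (1.12)."""
    return Q @ np.linalg.solve(W.T @ P, W.T @ x_new)


# Hypothetical centered data with a linear input-output relation plus noise.
rng = np.random.default_rng(4)
X = rng.normal(size=(200, 3))
Y = X @ np.array([[1.0, -0.5], [0.3, 2.0], [-1.2, 0.7]]) + 0.1 * rng.normal(size=(200, 2))
X, Y = X - X.mean(axis=0), Y - Y.mean(axis=0)

W, P, Q = nipals_pls(X, Y, A=3)
x_new = np.array([0.2, -1.0, 0.5])
y_pls = pls_predict(x_new, W, P, Q)
y_ols = np.linalg.lstsq(X, Y, rcond=None)[0].T @ x_new
print(y_pls, y_ols)
```

When $A$ equals the rank of $\mathbf{X}$, as in this demo, the PLS prediction coincides with ordinary least squares; choosing a smaller $A$ by cross-validation is what gives PLS its regularizing effect.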
It is worth mentioning that process monitoring and fault diagnosis can also be
achieved based on the subspace decomposition made by PLS. In comparison with the
PCA-based monitoring scheme, here we can incorporate additional information about
quality variables of special interest. First, given an input sample $\mathbf{x}$, we
calculate the score in the low-dimensional subspace and the residual:

$$\mathbf{t} = \mathbf{R}^{T}\mathbf{x} \tag{1.13}$$

$$\tilde{\mathbf{x}} = (\mathbf{I}_m - \mathbf{P}\mathbf{R}^{T})\mathbf{x} \tag{1.14}$$

where $\mathbf{R} = \mathbf{W}(\mathbf{P}^{T}\mathbf{W})^{-1}$. Then Hotelling’s $T^2$
statistic and the $Q$-statistic can be defined as follows [42]:

$$T^2 = \mathbf{t}^{T}\boldsymbol{\Lambda}^{-1}\mathbf{t} \le \frac{A(N^2 - 1)}{N(N - A)} F_{A,N-A,\alpha} \tag{1.15}$$

$$Q = \|\tilde{\mathbf{x}}\|^2 \le g\chi^2_{h,\alpha} \tag{1.16}$$

The $T^2$ and $Q$-statistics in PLS look similar to the $T^2$ and SPE statistics in
PCA; however, their physical interpretations are fundamentally different. In the
context of PLS, the $T^2$ statistic is commonly conceived as capturing variations that
are highly correlated with quality variables, while the $Q$-statistic is defined
based on variations that are irrelevant to quality variables [43]. Since the main
focus of this thesis is on the application of unsupervised learning models in MSPM,
we do not cover further details of PLS-based monitoring and diagnosis approaches,
and interested readers are referred to [44, 45].
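
The role of the matrix $\mathbf{R}$ in (1.13) can be motivated by a short derivation. It is a sketch that assumes the NIPALS steps of Table 1.1 and an invertible $\mathbf{P}^{T}\mathbf{W}$; it is not quoted from the cited references.

```latex
% Steps 3 and 6 of Table 1.1 give p_i^T w_i = 1, so each deflation removes w_i:
\mathbf{p}_i^{T}\mathbf{w}_i
  = \frac{\mathbf{t}_i^{T}\mathbf{X}_i\mathbf{w}_i}{\mathbf{t}_i^{T}\mathbf{t}_i} = 1
\;\Longrightarrow\;
\mathbf{X}_{i+1}\mathbf{w}_i
  = \mathbf{X}_i\mathbf{w}_i - \mathbf{t}_i\mathbf{p}_i^{T}\mathbf{w}_i = \mathbf{0}.

% Later deflations only multiply from the left, so w_i stays in the null space:
\mathbf{X}_{k+1}
  = \Bigl(\mathbf{I} - \frac{\mathbf{t}_k\mathbf{t}_k^{T}}{\mathbf{t}_k^{T}\mathbf{t}_k}\Bigr)\mathbf{X}_k
\;\Longrightarrow\;
\mathbf{X}_{A+1}\mathbf{W} = \mathbf{0}.

% Combining this with X = T P^T + X_{A+1} yields the scores directly from X:
\mathbf{X}\mathbf{W} = \mathbf{T}\mathbf{P}^{T}\mathbf{W}
\;\Longrightarrow\;
\mathbf{T} = \mathbf{X}\mathbf{W}(\mathbf{P}^{T}\mathbf{W})^{-1} = \mathbf{X}\mathbf{R},
\qquad
\hat{\mathbf{y}} = \mathbf{Q}\mathbf{t} = \mathbf{Q}(\mathbf{W}^{T}\mathbf{P})^{-1}\mathbf{W}^{T}\mathbf{x}.
```

The last expression agrees with the prediction formula (1.12), so the score in (1.13) is the same latent quantity that drives the soft sensor prediction.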

In this thesis, we pay particular attention to the application of PLS in soft sensor
modeling. Under closed-loop control, both process variables and quality variables
are affected by the intrinsic disturbances underlying the process, thereby exhibiting
obvious correlations and cross-correlations. In the PLS model, “common cause”
disturbances are interpreted as variations in the low-dimensional subspace obtained
by maximizing the covariance between input scores and output scores. Therefore,
similar to the application of PCA in MSPM, PLS-based soft sensors also carry a clear
physical meaning.

1.3.2 Current Research Status of Soft Sensing

Studies on soft sensing can also be categorized into two different streams, as shown
in Fig. 1.4. The first stream concentrates on making various assumptions on the
statistical properties of models according to different process characteristics. Take
PLS as an example. Strictly speaking, PLS essentially delineates the linear
cross-correlation between process variables and quality variables under a single
operating condition. However, nonlinear correlations may exist, and hence many
nonlinear generalizations of PLS have been proposed, for instance, kernel PLS
(KPLS) [46] and neural network PLS (NNPLS) [47]. Qin [48] first pointed out the great
potential of artificial neural networks (ANN) in solving soft sensing problems.
According to the review article by Kadlec et al. [6], about one third of the soft
sensors implemented in practice are based on ANN. Despite such prevalence, the
training of classic ANN is often trapped in local optima, and it is usually difficult
to determine an optimal network structure. The support vector machine (SVM), based
on statistical learning theory, has advantages over ANN owing to its ease of global
optimization and parameter selection. Because of these merits, SVM has been a hot
topic in the machine learning community [49]. Besides, SVM has an excellent
generalization ability in dealing with small-sample problems, making it particularly
suitable for soft sensor modeling [50, 51].

Fig. 1.4 Framework of existing studies on soft sensing

Industrial processes are often characterized by long settling times, and the causal
relationship between process variables and quality variables may have evident
dynamics. In other words, quality measurements can be affected by process
measurements over a period of time. Kano et al. [52] first extended PLS to its
dynamic counterpart, namely dynamic PLS (DPLS), to enhance the prediction accuracy
of product quality in industrial distillation columns. Similar to DPCA, DPLS
incorporates lagged historical input measurements into the model and thereby takes
process dynamics into account. Ma et al. [8] first proposed to filter the input by
means of an impulse response (IR), and then feed the filtered input to classic static
soft sensing models. This strategy describes the dynamics and the steady-state
relationship between process variables and quality variables separately, which
amounts to a typical Wiener structure, and many research efforts have been made in
this vein [53, 54]. Other dynamic soft sensing methods include recurrent neural
networks [55, 56] and wavelet transformations [57].
To meet varying product demands in the market, industrial processes usually have
multiple operating modes. Under such circumstances, soft sensor models based on a
single operating condition have obvious limitations. Hence, multi-mode soft sensor
models have been proposed. In order to partition the different operating modes, one
first has to cluster the process data based on a certain similarity criterion, and
then establish an individual soft sensor model for each data cluster. The most
popular approach to data clustering is C-means clustering [58, 59]. An alternative
strategy is to employ probabilistic models such as the Gaussian mixture model
[60–62]. Recently, Ge et al. [63] proposed to use external analysis to extract common
information from multi-mode data, which finally yields a unified prediction model
for multi-mode soft sensing.
The other research stream in soft sensing focuses on implementation problems in
industrial practice, such as variable selection, data preprocessing, and model
maintenance, as indicated by the horizontal axis in Fig. 1.4. If all available
process variables are used for soft sensor modeling, the resulting model tends to be
excessively complicated and to generalize poorly. Meanwhile, a high implementation
cost is incurred. Therefore, it is necessary to pick out a small percentage of
dominant process variables as soft sensor inputs for the sake of model simplicity.
A pragmatic approach is to resort to both expert knowledge and correlation
analysis [39]. In a recent work [64], variable selection methodologies for PLS-based
soft sensor models are comprehensively revisited.
Moreover, before fitting a soft sensor model, one typically needs to carry out data
preprocessing to handle potential outliers and missing data, which arise from the
imperfection of industrial data. The classic $3\sigma$ approach is believed to be
practically efficient in detecting outliers [65]. Later, robust methods based on
quantile estimation were proposed, for example, the Hampel test [66]. Besides, a
promising approach is to model outliers statistically by virtue of the heavy tails
of Student’s t distribution [67, 68]. Hodge and Austin [69] provided a comprehensive
survey of commonly used techniques for outlier detection and identification.
Targeting the issue of missing data, researchers have proposed to make use of the
low-dimensional subspace in PCA/PLS to recover missing values [70, 71]. In addition,
the maximum likelihood estimation (MLE) method in statistical inference is also
effective in handling missing data [72, 73].
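
The difference between the $3\sigma$ rule and a median-based Hampel-type test can be illustrated on a small made-up series. The function names and the data are hypothetical, and the Hampel variant shown here (median plus scaled median absolute deviation with threshold 3) is one common formulation that may differ in detail from [66].

```python
import numpy as np


def three_sigma_outliers(x):
    """Flag samples farther than three standard deviations from the mean."""
    return np.abs(x - x.mean()) > 3.0 * x.std(ddof=1)


def hampel_outliers(x, k=3.0):
    """Flag samples farther than k scaled MADs from the median."""
    med = np.median(x)
    mad = 1.4826 * np.median(np.abs(x - med))   # scaled MAD estimates sigma for Gaussian data
    return np.abs(x - med) > k * mad


# Eight ordinary readings followed by two gross errors.
x = np.array([10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 50.0, 60.0])
print(np.flatnonzero(three_sigma_outliers(x)))
print(np.flatnonzero(hampel_outliers(x)))
```

In this example the two gross errors inflate the mean and standard deviation so much that the $3\sigma$ rule flags nothing, whereas the median-based test isolates both of them.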
The accuracy of soft sensor models may deteriorate over time. Therefore, the online
updating and maintenance of soft sensors are of special interest in practice.
Classic model updating methods include MW [74], recursive least squares (RLS) [58],
and recursive PLS (RPLS) [75]. In recent years, local learning and just-in-time
learning have received increasing attention [76–78], as they can naturally handle
time-varying prediction tasks. The basic idea is to maintain a database of
input-output data samples, which is updated once new quality data are obtained. For
online prediction, the similarity between the current query input and the historical
inputs is evaluated, and only the most similar portion of the historical input-output
samples is selected, which potentially captures the time-varying behavior of the
industrial process. A real-time model is then built on the selected samples to
calculate the predicted value. Besides, a novel time-difference model was proposed
by Kaneko et al. [78], which describes the relationship between the relative changes
in process variables and those in quality variables. The merit of the
time-difference model is that time-varying effects can be alleviated to some extent,
so that the model can still enjoy desirable prediction accuracy even without
adaptation, which is a significant convenience in practice. Kadlec et al. [79]
provided a comprehensive overview of recent research efforts on the adaptation
mechanisms of soft sensor models.
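
The just-in-time learning procedure described above can be sketched as a nearest-neighbour selection followed by a local linear fit. Everything specific here is an assumption for illustration: the function name `jit_predict`, Euclidean distance as the similarity measure, the number of neighbours `k`, the small ridge term, and the synthetic database.

```python
import numpy as np


def jit_predict(x_query, X_db, y_db, k=20, ridge=1e-6):
    """Predict y for x_query from a local linear model fitted on its k nearest samples."""
    dist = np.linalg.norm(X_db - x_query, axis=1)        # similarity via Euclidean distance
    idx = np.argsort(dist)[:k]                           # most similar historical samples
    X_loc = np.hstack([X_db[idx], np.ones((k, 1))])      # local regressors with intercept
    y_loc = y_db[idx]
    gram = X_loc.T @ X_loc + ridge * np.eye(X_loc.shape[1])
    theta = np.linalg.solve(gram, X_loc.T @ y_loc)       # lightly regularized least squares
    return float(np.append(x_query, 1.0) @ theta)


# Hypothetical database generated by a nonlinear input-output relation.
rng = np.random.default_rng(3)
X_db = rng.uniform(-2.0, 2.0, size=(2000, 2))
y_db = np.sin(X_db[:, 0]) + 0.5 * X_db[:, 1] ** 2

x_q = np.array([0.5, -1.0])
y_hat = jit_predict(x_q, X_db, y_db, k=20)
y_true = np.sin(0.5) + 0.5
print(y_hat, y_true)
```

Because a new local model is built for every query, appending fresh input-output pairs to `X_db` and `y_db` is all that is needed to keep the predictor up to date.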

1.4 Challenges and Opportunities

From the literature review in Sects. 1.2 and 1.3, one can see that in the past twenty
years MSPM and soft sensing techniques have experienced rapid development and have
been applied in a large variety of industrial scenarios. Nevertheless, there is still
room to enhance their applicability, with the aim of further raising the level of
automation and intelligent decision-making in the process industry. Next we
highlight two primary research challenges.
(1) Research challenges in MSPM
Classic MSPM techniques show significant limitations in dealing with the frequent
switches of operating conditions in the modern process industry. On the one hand,
the process objectives may be actively changed according to planning and scheduling,
with the aim of meeting changing demands for the products. For example, the setpoint
of the melt index in a propylene production process may be adjusted to meet the
real-time requirements of the market. On the other hand, some variations or
disturbances, for instance, temperature changes in the ambient environment or
operating condition changes in the upstream unit, might also influence the operating
condition of the process. In both cases, the steady distribution of process data
under the normal operating condition is disrupted, and therefore alarms would be
persistently triggered by basic MSPM methods. However, if the process remains well
controlled owing to the compensation of controllers, further actions are no longer
necessary, and hence alarms in this context should be treated as less important
than, and clearly distinguished from, those indicating real faults. Most hazardously,
industrial practitioners cannot distinguish real faults from such nominal changes in
operating conditions simply based on monitoring results, which poses a serious hidden
hazard to safe production in industrial processes.
(2) Research challenges in soft sensing
Classic soft sensing models fall short of fully utilizing the information that
underlies multi-rate data, and there is room for further improvement. In static soft
sensing models, process data are down-sampled and aligned with quality data on
account of the multi-rate phenomenon in Fig. 1.3. Therefore, the same number of
process data samples and quality data samples are used for building soft sensors,
which leaves abundant information within the fast-rate process data unused. When the
process has evident dynamics, the prediction accuracy of such soft sensors degrades
significantly. In dynamic soft sensing models, the dynamic influence of process
variables on quality variables during the settling process can be characterized.
However, the complexity of the resulting model explodes, leading to severe
over-fitting, especially when the settling time is rather long and a large number of
lagged measurements are involved. Therefore, appropriate dynamic soft sensing models
cannot be devised by trivially extending classic static soft sensing models.
The performance of data-driven models in industrial applications heavily depends
on whether their underlying assumptions accord with physical reality. In this
thesis, we intend to provide a new perspective on process data analytics and its
industrial applications based on process dynamics, and to develop effective
data-driven dynamic modeling approaches that agree physically with the working
principle of industrial processes.
In fact, process dynamics embodies valuable information for effective discrimination
between nominal operating condition deviations and real faults. If the process is
still under control after operating condition deviations, its dynamic behavior ought
to be intact. Conversely, in the case of real faults, process dynamics will change
significantly, mainly because the control system cannot completely compensate for
the effect of the faults, so the process exhibits discrepant dynamic characteristics.
In other words, a reasonable characterization of process dynamics can furnish
abundant and comprehensive process information for practitioners to reach rational
decisions. In addition, if we can incorporate prior knowledge about process dynamics
into dynamic soft sensing models in a reasonable way, the physical meaning of the
models will become much clearer, which is beneficial for reducing model complexity
and further improving prediction precision.
We would like to point out that process dynamics can be interpreted in two different
ways, which reveals some opportunities for future investigations. One could
understand process dynamics either as temporal correlations between model states, or
as temporal correlations between model parameters. In both regards, classic
data-driven models such as DPCA and DPLS have obvious limitations. For example, in
DPCA, the introduction of lagged measurements merely implies that the
low-dimensional latent variables are related to process data within the process
settling time; however, the latent variables themselves are still assumed to be
temporally independent in the model structure. Since latent variables essentially
stand for the process states, they should be temporally correlated. In this sense,
the generic independence assumption in latent variable models is inadequate.
Moreover, model parameters should exhibit some temporal smoothness, since model
dynamics are quite unlikely to change abruptly owing to the inertia of industrial
processes. Unfortunately, this issue remains unaddressed by existing dynamic
data-driven models.

1.5 Contents and Outline

The contents to be covered in this thesis are summarized in Fig. 1.5. We intend to
impose prior knowledge of temporal smoothness on model states and model parame-
ters, respectively, to devise new data-based methods for dynamic modeling of indus-
trial processes. An in-depth integration of both prior knowledge and information
within massive process data is to be achieved, based on which the applicability of
MSPM and quality prediction techniques can be further enhanced. By incorporating
temporal smoothness information on model states, we first propose a new approach
to process data modeling and the associated MSPM technology based on slow feature

Fig. 1.5 Layout of this thesis


IMANDRA

Then I, I would dance with you by the spring. Can you play the flute?

THE PRINCE

I love knowledge, skill, music and... you, my gracious bliss.

IMANDRA

Not again! But listen, let us play the magpie hop!

THE PRINCE

The magpie hop? I don't know how!

IMANDRA (begins to hop)

Look, like this!

OTRO

Ha ha!

IMANDRA

Who laughed?

OTRO

The magpie!

IMANDRA

This is how I shall hop at my wedding too. Ha ha!

LADY-IN-WAITING (she and the chamberlain hurry forward)

What a disgrace! Princess! Get up at once! (Court bows.)

CHAMBERLAIN

Dear little princess, do remember that you are the sovereign of great
Suvikunta!

OTRO

On behalf of the Prince of Kaukovalta I have the honour of asking for the hand
of the sovereign of Suvikunta, and as a morning gift I bring this wondrous
mirror, which makes the ugly beautiful and vice versa.

IMANDRA (glances quickly into the mirror and shrieks)

What do I see? A caricature of myself! Ugh! You dare to mock me!
(Throws the mirror to the floor.) There!

OTRO

Oh, what have you done! You have broken the magic mirror!

IMANDRA

A magic mirror? Fiddlesticks! (Runs off.)

HOVIROUVA
Kultakruunuineen ja kultakengät jaloissa täytyy prinsessan heti
palata pyytämään anteeksi suur'armolliselta, hänen korkeudeltaan
Kaukovallan prinssiltä. (Menee hullunkurisesti niiaten.)

PRINCE (gathering up pieces of the mirror, which he hides in his pocket)

Oh, please don't…! Listen, my lord courtier, what shall we do now? I am quite infatuated with your hot-tempered princess!

OTRO

The princess's own words have given me a splendid idea. Your Highness, dress up as a shepherd boy! I have with me, out there in the pillared vestibule, a shepherd's cloak, a hat, a flute, and a yellow wig.

PRINCE

Otro, you are always resourceful! Go and fetch my disguise!

OTRO

Your old disguise, which you have often worn on your travels. I will run and fetch it at once! (Exits.)

PRINCE

Courtier, we often performed pastoral plays at my court.

COURTIER

It is one thing to act the shepherd's life and another to live it. That would be instructive for the princess as well.

PRINCE

Quite so, my good courtier.

OTRO (returns)

Here it is, Your Highness, come this way! (They withdraw to one side; the prince puts on the shepherd's clothes and shoes.) I know you to be as good an actor as you are a ruler and a master of the female heart. There, just a little more yellow paint on the eyebrows!

PRINCE

I am ready. Now begins a play that I hope will end in my happiness.

COURTIER

Good, I will go and tell the princess that a certain shepherd boy is waiting here. (Exits.)

OTRO

And I will go on ahead to prepare the hunting party. (Exits.)

IMANDRA (enters reluctantly, a golden crown on her head and golden shoes on her feet, wiping away her tears and looking at the ground; suddenly she begins to laugh in a forced way)

I came to beg… No! Oh, I am laughing when I ought to cry. Why, the prince is not here! Shepherd boy, what are you looking for?

PRINCE

Heh, I don't know.

IMANDRA

What is your name? Do you know that?

PRINCE

They call me Metsä-Matti.

IMANDRA

You carry the scent of the forest.

PRINCE

They say that in this castle there is a very beautiful but ill-natured princess.

IMANDRA

I, ill-natured! But where have I heard your voice before?

PRINCE

Perhaps in your dreams.

IMANDRA

How strange; yes, I have heard it as if in my dreams. It has seemed to call me to the forest, to the mountains, to the streams, to the spring beside which I have sat weaving a garland of flowers.

PRINCE

I have so often, so often walked past the castle and peeped in at the windows.

IMANDRA (roguishly)

What for? Oh, I can guess.

PRINCE

Yes, do guess!

IMANDRA

Perhaps for my sake, hee hee.

PRINCE (sighing)

Ye-es.

IMANDRA

Ye-es. To think that you dared to peep at a high-born princess with a golden crown on her head and golden shoes on her feet.

PRINCE

I am only a poor shepherd boy. Heigh-ho, yes.

IMANDRA (sighing)

Heigh-ho, yes. But it is remarkable how much you look like the Prince of Kaukovalta.

PRINCE

So people say. We could be taken for twins.

IMANDRA

Have you ever seen the Prince of Kaukovalta?

PRINCE

Yes, yes, often… I was once at the court of Kaukovalta.

IMANDRA

What did you do there?

PRINCE

I will tell you. I once served at the prince's court…

IMANDRA

And learned a little of courtly manners?

PRINCE

I had been taken on there because of my likeness to him.

IMANDRA

I don't understand.

PRINCE

My office was that of chief hand-shaker.

IMANDRA

Hand-shaker? Now I understand you even less.

PRINCE

I had to shake hands with all the petitioners when the prince, during the throne festivities, grew tired of shaking hands with the people.

IMANDRA

You lived by the work of your hands, ha ha!

PRINCE

But then the courtiers began to suspect me because of my likeness.

IMANDRA

And then?

PRINCE

Then they drove me away.

IMANDRA

Poor boy!

PRINCE

But the prince obtained for me the post of royal herdsman.

IMANDRA

Truly! You are the very image of the prince, and one could hardly tell you apart except by your clothes and hair.

PRINCE

What should you, a high-born princess, care about a poor shepherd who wears his best on his back and carries the rest under his arm?

IMANDRA

What have you there under your arm?

PRINCE (wiping his eyes)

My shepherd's flute, my only joy!

IMANDRA

But why are you crying?

PRINCE (stifling his sobs)

Are you very fond of the prince?

IMANDRA

Listen, boy! Play your flute and I will dance. This will be fun!

PRINCE

I will play only on one condition.

IMANDRA

Tell me quickly!

PRINCE

I am too shy.

IMANDRA

You need not be shy.

PRINCE

Well, that I get one buss for every shepherd's polska.

IMANDRA

A buss, hee hee, what is that?

PRINCE

It means that I place my lips on yours.

IMANDRA

Nothing stranger than that? Well, then!

PRINCE

Well, stand on tiptoe. There!

IMANDRA

Ah, how sweet that was, sweeter than all the royal master cook's honey cakes. Well?

PRINCE

Well, but I shall play first.

IMANDRA

You need not play; play later in the forest! Well? (Rises on tiptoe; the prince is about to kiss her. The court lady and the courtier enter.)

COURT LADY

But Princess, this is dreadful: you are kissing a strange shepherd boy!

IMANDRA

So I am, and I no longer care for princes, tall or small.

COURTIER

As your guardian I take the golden crown from your head; you are no longer worthy to wear it.

IMANDRA

I am leaving with this shepherd; I shall weave a crown of flowers for my head.

COURT LADY

Your will be done! I wash my hands of it.

IMANDRA

In perfume! The forest has a lovelier scent.

COURT LADY (pretending to be angry)

Go, then; we no longer answer for your actions!

IMANDRA

I answer for my own actions. I know that I am not walking on the wrong paths.

COURTIER

All paths lead home in the end. Farewell, Princess!

IMANDRA

Farewell, then, my good courtier, and you, court lady! (Makes clumsy court bows.) Your most humble servant! Shepherd boy, play your flute now! (The prince plays; Imandra takes his arm, and the two exit as if dancing.)

COURT LADY (using her perfume bottle)

Let us send the chambermaid along all the same.

COURTIER

Let us do so. May I offer you my arm? (The courtier and the court lady make comical bows.) Oh, youth, oh, green youth, my queen! Achoo!

COURT LADY

What did you do?

COURTIER

Pardon me, I sneezed.

COURT LADY

Oh, you…

COURTIER

Your perfume…

COURT LADY (sneezes)

Achoo! Oh, pardon me…!

COURTIER (sneezes)

Achoo! Bless you!

ACT II

A forest.

IMANDRA (enters with the prince, who is dressed as a shepherd)

Where are we now?

PRINCE

In a forest, in the realm of the Prince of Kaukovalta.

IMANDRA

The Prince of Kaukovalta's, oh!

PRINCE

Do you still remember him?

IMANDRA

Everything is like a dream to me!

PRINCE

This is a lovely waking dream.

IMANDRA

We have walked long and far. I am tired; I will not take another step.

PRINCE

We are lost.

IMANDRA

What shall we do now?

PRINCE

Let us rest!

IMANDRA

Oh, here is a spring, and flowers! Weave me a garland!

PRINCE (weaving a garland)

That courtier took your golden crown. It was my fault. Why did I come to the castle?

IMANDRA

No, the fault was mine alone. This summer garland is lighter. Am I not now like that wondrously lovely forest maiden?

PRINCE

You were not born in the forest but at court.

IMANDRA

There is such a strange peace here. You and I, alone. But I do not know you yet. It seems to me that you were not born in the forest either. I suppose I have seen you in my dreams.

PRINCE

May this dream go on forever.

IMANDRA

But at court I always awoke to a nightmare. Am I wicked? Now I want to see my reflection. Do you have a mirror?

PRINCE

Only a sliver of a mirror. I found it on the floor of the court.

IMANDRA

The Prince of Kaukovalta's witch's mirror? I dare not look into it.

PRINCE

Go on, look!

IMANDRA

Did the prince think he could bewitch me with this witch's mirror? There he was mistaken!

PRINCE

You do not dare.

IMANDRA

I should like to look, I dare not, but… I will look all the same. Oh, how ugly I am! My nose is crooked, it stretches and stretches, and my eyes! My eyes are upside down in my head! (Flings the mirror to the ground.)

PRINCE (hides the mirror)

There, there! No one can see himself in any mirror. The happiest thing is to see one's image in the eye of another being.

IMANDRA

But the eye of the spring! I will look in that. Why, I am standing right on my head! Is this forest bewitched?

PRINCE

Yes indeed, this forest is strange.

IMANDRA

I feel something so strange… here. (Points to her stomach.) I hear something like a little kitten mewing. What is it?

PRINCE

It is a little goblin.

IMANDRA

A goblin, ugh!

PRINCE

Hunger.

IMANDRA

I have never suffered hunger.

PRINCE

Thank the master cook.

IMANDRA

Thank the master cook? Why?

PRINCE

Here the master cook is Master Hunger. (Shrugs his shoulders.)

IMANDRA

The master cook made such sweet honey cakes.

PRINCE

Here are wood sorrel, alpine currants, and bog bilberries.

IMANDRA (eats greedily)

Give me some, give me some! I eat and eat, but that goblin only wails. Oh, how hungry I am! And I thought this was so lovely. You are a bad boy. Bring me honey cakes this instant!

PRINCE

Where am I to get them? But the princess said back at court that a buss was sweeter than honey cakes. Let us try!

IMANDRA

Yes, at court. Can one live on that?

PRINCE

One can no more live on that than on moonlight.

IMANDRA

Shame, you are teasing me, you are mocking me, you, you…

PRINCE

There, there, you yourself agreed to come with me.

IMANDRA

I wonder what the courtier and the court lady are doing now?

PRINCE
