
604 Y. Liu et al. / Chemical Engineering Science 102 (2013) 602–612

S_1 = {X_1, Y_1}, where X_1 = {x_{1,i} = [x_{1,i,1}, x_{1,i,2}, ⋯, x_{1,i,m_1}]^T ∈ R^{m_1}}_{i=1,⋯,N} with m_1 input variables and Y_1 = {y_{1,i} ∈ R}_{i=1,⋯,N} are the input and output variables of the first reactor with N samples, respectively. Generally, the product quality Y_l = {y_{l,i} ∈ R}_{i=1,⋯,N} in the lth reactor can be mainly controlled by the related manipulated variables X_l = {x_{l,i} ∈ R^{m_l}}_{i=1,⋯,N} in this reactor. Additionally, it can also be affected by some process variables {[x_{1,i}^T, ⋯, x_{l−1,i}^T]^T ∈ R^{∑_{j=1}^{l−1} m_j}}_{i=1,⋯,N} in the previous reactors (Lou et al., 2012; Zhang et al., 2006). Without enough process knowledge, the generalized input variables can be formulated as x_l = [x_1^T, ⋯, x_{l−1}^T, x_l^T]^T ∈ R^{∑_{j=1}^{l} m_j}, with m_j input variables in the jth reactor. Correspondingly, the modeling data set of the lth reactor can be further represented as S_l = {X_l, Y_l}, l = 1, ⋯, L, where X_l = {x_{l,i} = [x_{1,i}^T, ⋯, x_{l−1,i}^T, x_{l,i}^T]^T ∈ R^{∑_{j=1}^{l} m_j}}_{i=1,⋯,N} and Y_l = {y_{l,i} ∈ R}_{i=1,⋯,N}.

2.2. Basic LSSVR modeling method

Generally, the soft sensor model development of an SR process based on the SVR/LSSVR framework can be described as a problem aiming to learn a mapping f: X → Y using a modeling set S = [X, Y], mainly using the kernel learning method. For the lth reactor, a general nonlinear model for process modeling can be formulated as (Liu et al., 2009, 2012)

y_{l,i} = f_l(w_l, b_l, x_{l,i}) + e_{l,i} = w_l^T ϕ(x_{l,i}) + b_l + e_{l,i},  i = 1, ⋯, N    (1)

where f_l is the wanted model of the lth reactor; ϕ is a feature map; y_{l,i} and e_{l,i} denote the output measurement and the approximation error for the ith sample of the lth reactor; and x_{l,i} is a general input vector usually composed of several measured variables. The symbols w_l and b_l are the model parameter vector and the bias term of the lth reactor, respectively. When the philosophy of statistical learning theory and the LSSVR framework is applied to Eq. (1) (Schölkopf and Smola, 2002; Suykens et al., 2002), the following optimization problem is formulated:

min J_l(w_l, b_l, e_{l,N}) = (1/2)‖w_l‖² + (γ_l/2)‖e_{l,N}‖²
s.t.  y_{l,i} − w_l^T ϕ(x_{l,i}) − b_l − e_{l,i} = 0,  i = 1, ⋯, N    (2)

where e_{l,N} = [e_{l,1}, e_{l,2}, ⋯, e_{l,N}]^T is the approximation error. The formulation consists of equality constraints instead of the inequality constraints in the conventional SVR algorithm and takes into account a squared error with the regularization term. Therefore, this reformulation greatly simplifies the modeling problem and makes it computationally efficient (Suykens et al., 2002). The user-defined regularization parameter γ_l (γ_l > 0) determines the trade-off between the model's complexity and approximation accuracy (Suykens et al., 2002). The Lagrangian can be constructed to solve the optimization problem in Eq. (2) (Schölkopf and Smola, 2002), then the solution can be expressed as

α_l = P_{l,N} [ y_{l,N} − 1_{l,N} (1_{l,N}^T P_{l,N} y_{l,N}) / (1_{l,N}^T P_{l,N} 1_{l,N}) ]    (3)

b_l = (1_{l,N}^T P_{l,N} y_{l,N}) / (1_{l,N}^T P_{l,N} 1_{l,N})    (4)

where α_l = [α_{l,1}, ⋯, α_{l,N}]^T are Lagrange multipliers; y_{l,N} = [y_{l,1}, ⋯, y_{l,N}]^T; I_{l,N} ∈ R^{N×N} is a unit matrix and 1_{l,N} is a vector of ones; P_{l,N} = H_{l,N}^{−1} and H_{l,N} = K_{l,N} + I_{l,N}/γ_l with the kernel matrix K_{l,N}, whose elements are K_{l,N}(i, j) = ⟨ϕ(x_{l,i}), ϕ(x_{l,j})⟩, ∀ i, j = 1, ⋯, N (Schölkopf and Smola, 2002; Suykens et al., 2002).

Finally, for a new test input x_{l,t}, the LSSVR model estimation of the lth reactor, i.e., ŷ_{l,t}, can be obtained (Suykens et al., 2002)

ŷ_{l,t} = f_l(w_l, b_l, x_{l,t}) = ∑_{i=1}^{N} α_{l,i} ⟨ϕ(x_{l,i}), ϕ(x_{l,t})⟩ + b_l = α_l^T k_{l,t} + b_l    (5)

where k_{l,t}(i) = ⟨ϕ(x_{l,i}), ϕ(x_{l,t})⟩, ∀ i = 1, ⋯, N is a kernel vector for estimation of the test sample.

In summary, the development of a soft sensor model amounts to solving a set of linear equations in the high-dimensional feature space introduced by the kernel transform. The LSSVR model for the first reactor can be built in a simple way. However, as for the Lth reactor, it is not a trivial task because the set of the variables {x_{L,i} ∈ R^{∑_{j=1}^{L} m_j}}_{i=1,⋯,N}, including the variables at the current reactor and the variables from the previous reactors, becomes more complicated. Without enough process knowledge, it is difficult to determine a suitable input variable for constructing a model.

2.3. Traditional two-step LSSVR-based soft sensors

Generally, in the construction of a data-driven model for a chemical process with many input variables, two traditional approaches have been investigated (Gustafsson, 2005; Kadlec et al., 2009; Shi and Liu, 2005; Zamprogna et al., 2005; Zhao et al., 2010). One can be considered as the variable or feature selection method (Ojeda et al., 2008; Pan et al., 2012). In the context of modeling for the lth reactor, variable selection aims at finding a subset of variables X_{sub,l} = {x_{sub,l,i}}_{i=1,⋯,N} that can result in more accurate and compact predictors. The variable selection method should capture the relevant information and filter out those inputs that are irrelevant to the specific regression model (Ojeda et al., 2008; Pan et al., 2012). Actually, there is no single definition of the best subset, as different algorithms will lead to different subsets. The main disadvantage of variable selection methods is the high computational cost, because the search space grows to 2^{∑_{j=1}^{l} m_j} as the number of variables in the lth reactor increases (Ojeda et al., 2008).

The other classical method can be considered as variable or feature extraction. Among them, LV-based methods in the form of PCA and PLS have a dominating role in process modeling (Cao et al., 2003; Gustafsson, 2005; Kadlec et al., 2009; Shi and Liu, 2005; Zamprogna et al., 2005; Zhao et al., 2010). In these methods, correlation patterns among the variables can be modeled, and underlying features are used to build and explain a quantitative relationship relevant to the quality properties. However, LV-based analysis approaches cannot tell which type of LVs best fits the model for the observed data, since there are different ways to define LVs (Gustafsson, 2005; Zhao et al., 2010). Besides, for a nonlinear modeling problem, most of the previous investigations often used PCA or PLS methods to extract LVs as a preprocessing step (Kadlec et al., 2009; Shi and Liu, 2005; Zamprogna et al., 2005). However, as for an SR process, the extracted important LVs X_{LVs,l} = {x_{LVs,l,i}}_{i=1,⋯,N} may capture most of the process variations, but not necessarily explain the quality properties. Furthermore, this two-step strategy divides the soft sensor modeling task into two separate sub-tasks, regardless of their connection in a unified framework.

Similar to the above approaches, the LSSVR-based modeling method, which combines corresponding soft sensors with the above variable or feature selection methods, is shown in Fig. 2, where X_{sub,l} = {x_{sub,l,i}}_{i=1,⋯,N} and X_{LVs,l} = {x_{LVs,l,i}}_{i=1,⋯,N} are the transformed variables and extracted feature variables, respectively. Correspondingly, two kinds of LSSVR-based soft sensors can be constructed. However, both of them can be considered as indirect two-step modeling approaches. Additionally, the sequential relationship in the whole SR process has not been investigated before because it was difficult to characterize.
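As a concrete illustration, the closed-form LSSVR solution of Eqs. (3)–(5) can be sketched in a few lines of numpy. This is a minimal sketch, not the paper's implementation: the Gaussian kernel convention exp(−‖x_i − x_j‖²/s) and the example values of γ and s are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, s):
    # K(i, j) = exp(-||a_i - b_j||^2 / s); the width convention is an assumption
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / s)

def lssvr_fit(X, y, gamma, s):
    # Eqs. (3)-(4): P = (K + I/gamma)^-1, b = 1'Py / 1'P1, alpha = P(y - 1b)
    N = len(y)
    P = np.linalg.inv(gaussian_kernel(X, X, s) + np.eye(N) / gamma)
    ones = np.ones(N)
    b = (ones @ P @ y) / (ones @ P @ ones)
    alpha = P @ (y - ones * b)
    return alpha, b

def lssvr_predict(X_train, alpha, b, X_test, s):
    # Eq. (5): y_hat(x_t) = alpha' k_t + b, with k_t(i) = <phi(x_i), phi(x_t)>
    return gaussian_kernel(X_test, X_train, s) @ alpha + b
```

With a large γ the model nearly interpolates the training targets; decreasing γ trades approximation accuracy for smoothness, which is exactly the trade-off discussed after Eq. (2).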

[Fig. 2 appears here. It depicts three pipelines for the lth reactor, each ending in an LSSVR model (Eq. (5)) that maps a test sample x_{l,t} to a prediction ŷ_{l,t}: (i) the basic soft sensor, trained on the input and output data S_l = {X_l, Y_l}, l = 1, ⋯, L; (ii) traditional two-step soft sensor A, trained on Y_l and a subset of variables or features X_{sub,l}; and (iii) traditional two-step soft sensor B, trained on Y_l and LVs X_{LVs,l} extracted by PCA.]

Fig. 2. Flow diagram of the traditional LSSVR-based two-step soft sensor modeling methods for the lth reactor of a sequential process.
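The two-step soft sensor B of Fig. 2 (PCA extraction followed by LSSVR regression on the scores) can be sketched as below. This is a hedged sketch, not the paper's code: the 85% cumulative-variance cut-off and the kernel/regularization values are illustrative assumptions, and the second step reuses the closed-form LSSVR solution of Eqs. (3)–(4).

```python
import numpy as np

def pca_scores(X, var_ratio=0.85):
    # Step 1: extract latent variables (LVs) keeping `var_ratio` of the variance
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = int(np.searchsorted(np.cumsum(S ** 2) / np.sum(S ** 2), var_ratio)) + 1
    return Xc @ Vt[:k].T, Vt[:k]   # scores X_LVs and the retained loadings

def lssvr_on_scores(T, y, gamma=100.0, s=1.0):
    # Step 2: LSSVR (Eqs. (3)-(4)) on the extracted scores; gamma, s illustrative
    N = len(y)
    d2 = ((T[:, None, :] - T[None, :, :]) ** 2).sum(axis=-1)
    P = np.linalg.inv(np.exp(-d2 / s) + np.eye(N) / gamma)
    ones = np.ones(N)
    b = (ones @ P @ y) / (ones @ P @ ones)
    alpha = P @ (y - ones * b)
    return alpha, b
```

As Section 2.3 cautions, the LVs kept by a variance criterion capture input variation but need not explain the quality variable; this is precisely the weakness of the two-step strategy.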

3. Just-in-time sequential LSSVR model for an SRMG process

3.1. Sequential LSSVR modeling framework and its efficient training for an SR process

This paper presents a novel soft sensor modeling method which can integrate the variable extraction and quality prediction into a unified framework. Additionally, the sequential relationship in an SR process is investigated. For the lth reactor of an SR process, the soft sensor modeling framework can be described as follows:

min J_l(ω_l, β_l, ξ_{l,N}) = (1/2)‖ω_l‖² + (γ_l/2)‖ξ_{l,N}‖²
s.t.  y_{l,i} − ω_l^T ϕ(z_{l,i}) − β_l − ξ_{l,i} = 0,  i = 1, ⋯, N    (6)

where (ω_l, β_l) are the model parameters of the final SLSSVR soft sensor model f̃_l, and ξ_{l,N} = [ξ_{l,1}, ξ_{l,2}, ⋯, ξ_{l,N}]^T is the approximation error. The new input variable z_{l,i} can be formulated as

z_{l,i} = [y_{1,i}, ⋯, y_{l−1,i}, x_{l,i}^T]^T ∈ R^{l−1+m_l},  i = 1, ⋯, N    (7)

where

y_{1,i} = g_1(w_1, b_1, x_{1,i}) = w_1^T ϕ(x_{1,i}) + b_1,  i = 1, ⋯, N
y_{2,i} = g_2(w_2, b_2, x_{2,i}) = w_2^T ϕ(x_{2,i}) + b_2,  i = 1, ⋯, N
⋯
y_{l−1,i} = g_{l−1}(w_{l−1}, b_{l−1}, x_{l−1,i}) = w_{l−1}^T ϕ(x_{l−1,i}) + b_{l−1},  i = 1, ⋯, N    (8)

and {w_j, b_j}_{j=1,⋯,l−1} are the model parameters of the transform models {g_j}_{j=1,⋯,l−1}, respectively. For the jth (j = 1, ⋯, l−1) reactor, the LSSVR model can be trained to obtain the nonlinear transform of x_{j,i} to the "virtual" quality variables y_{j,i}, i = 1, ⋯, N, j = 1, ⋯, l−1. Then the new input variable z_{l,i} = [y_{1,i}, ⋯, y_{l−1,i}, x_{l,i}^T]^T ∈ R^{l−1+m_l} and the corresponding new training set Z_l = {z_{l,i}}_{i=1,⋯,N} are constructed. Note that this variable z_{l,i} is different from the previous x_{l,i} in Eq. (1). Based on this transform, the dimension of the new extracted variable z_{l,i} is generally much less than that of the previous x_{l,i} for many SR processes because l−1+m_l << ∑_{j=1}^{l} m_j, especially when l is large. Moreover, the process information contained in the variables of the previous reactors can be represented by the corresponding LSSVR transform models {g_j}_{j=1,⋯,l−1}. Consequently, z_{l,i} can be considered as a new transformed input variable, including the extracted sequential relationship of the previous reactors and the information in the current reactor.

After training, the aforementioned LSSVR models of the lth reactor can be obtained. Then, a new input x_{l,t} can be transformed to z_{l,t} first using Eqs. (7) and (8):

z_{l,t} = [y_{1,t}, ⋯, y_{l−1,t}, x_{l,t}^T]^T ∈ R^{l−1+m_l}    (9)

Finally, the prediction ŷ_{l,t} can be obtained using the trained SLSSVR model f̃_l

ŷ_{l,t} = f̃_l(ω_l, β_l, z_{l,t}) = ∑_{i=1}^{N} α̃_{l,i} ⟨ϕ(z_{l,i}), ϕ(z_{l,t})⟩ + β_l = α̃_l^T k̃_{l,t} + β_l    (10)

where k̃_{l,t}(i) = ⟨ϕ(z_{l,i}), ϕ(z_{l,t})⟩, ∀ i = 1, ⋯, N is a kernel vector and α̃_l = [α̃_{l,1}, ⋯, α̃_{l,N}]^T are the Lagrange multipliers of the SLSSVR model.

Generally, the development of a good LSSVR model depends on suitable selection of the related model parameters. For an LSSVR model with the common Gaussian kernel K(x_i, x_j) = exp(−‖x_i − x_j‖²/s), there are two parameters to be chosen: the regularization parameter γ > 0 and the related kernel parameter s > 0. There exist some efficient model selection criteria, e.g., cross-validation and Bayesian inference (Cawley and Talbot, 2004; Schölkopf and Smola, 2002; Suykens et al., 2002). Here, the parameter selection procedure is mainly based on the fast leave-one-out cross-validation (FLOO-CV) criterion (Cawley and Talbot, 2004, 2007).

The traditional leave-one-out cross-validation (LOO-CV) method has been shown to give an approximately unbiased estimation of the generalization properties of statistical models. It can provide a sensible criterion for model selection. Especially for the conventional SVR method, bounds on the LOO-CV estimator have been proved to be an effective criterion for model selection (Cawley and Talbot, 2004, 2007). However, conventional LOO-CV for SVR/LSSVR is computationally expensive, so it is not suitable for online implementation. Therefore, applying the FLOO-CV criterion (Cawley and Talbot, 2004) as the foundation of the parameter selection is feasible.

[Fig. 3 appears here.] Fig. 3. Schematic diagram of the sequential LSSVR modeling method for an SR process.

For the lth reactor, the total FLOO-CV error E^{FLOO-CV}_{l,N} of the LSSVR model with a candidate parameter (γ, s) and N training samples can be obtained (Cawley and Talbot, 2004)

E^{FLOO-CV}_{l,N} = ∑_{i=1}^{N} ‖ξ^{FLOO-CV}_{l,i}‖ = ∑_{i=1}^{N} | α_{l,i} / (P_{l,N,ii} + s_{l,i}²/o_l) |    (11)

where P_{l,N,ii} is the item at the ith row and ith column of P_{l,N}, s_l = P_{l,N} 1_{l,N} = [s_{l,1}, ⋯, s_{l,N}]^T, and o_l = 1_{l,N}^T P_{l,N} 1_{l,N}. The related terms (i.e., P_{l,N} and α_{l,i}) are already available. Additionally, the computational load of s_l and o_l is small. Consequently, compared to the conventional LOO-CV method, the computation of E^{FLOO-CV}_{l,N} is much more
efficient, which allows further selection of the parameters of a sequence of LSSVR models.

The proposed sequential modeling method is shown in Fig. 3. For the lth (l > 1) reactor, since the transform of the first (l−1) reactors, y_{j,i}, i = 1, ⋯, N, j = 1, ⋯, l−1, can be mainly determined by the corresponding LSSVR models {g_j}_{j=1,⋯,l−1}, the prediction model f̃_l and the first (l−1) transform LSSVR models {g_j}_{j=1,⋯,l−1} are trained simultaneously. For online prediction of a query sample x_{l,t}, the transformed variable z_{l,t} = [y_{1,t}, ⋯, y_{l−1,t}, x_{l,t}^T]^T ∈ R^{l−1+m_l} can describe the input variables in a sequential way with an extracted form. Furthermore, this new transformed variable contains quality information of the first (l−1) reactors. Consequently, the online prediction with the model f̃_l can be achieved in a straightforward way.

As shown in Fig. 3, a forward FLOO-CV-based optimization method for solving the sequential modeling framework is proposed. As for the lth reactor, the modeling and prediction can be implemented in a unified way instead of being implemented on the reactors one by one. First, the candidate parameter set can be defined as Θ_{l,K} = {θ_{l,k} = [γ_{1,k}, s_{1,k}, ⋯, γ_{l−1,k}, s_{l−1,k}, γ_{l,k}, s_{l,k}]^T}_{k=1,⋯,K} with K groups of parameters. Among the set Θ_{l,K}, (γ_{1,k}, s_{1,k}, ⋯, γ_{l−1,k}, s_{l−1,k})_{k=1,⋯,K} are for the first (l−1) LSSVR models {g_j}_{j=1,⋯,l−1} and (γ_{l,k}, s_{l,k})_{k=1,⋯,K} are for the model f̃_l, respectively. Then, based on the FLOO-CV criterion, the proposed SLSSVR model can be efficiently trained in a forward manner. The final SLSSVR model can be obtained as follows:

min E^{FLOO-CV}_{l,N}
for  Θ_{l,K} = {θ_{l,k} = [γ_{1,k}, s_{1,k}, ⋯, γ_{l−1,k}, s_{l−1,k}, γ_{l,k}, s_{l,k}]^T}_{k=1,⋯,K}    (12)

On one hand, with the transformed LSSVR models {g_j}_{j=1,⋯,l−1}, the "virtual" quality variables are connected with the input variables in the previous reactors. They can provide the related quality information for the final reactor. On the other hand, note that the main purpose is to achieve good quality prediction for the final reactor. As can be seen in Eq. (12), the "virtual" quality variables, y_{j,i}, i = 1, ⋯, N, j = 1, ⋯, l−1, of the first (l−1) reactors can be adjusted among Θ_{l,K}. That is, they can also be considered as "adjustable" input variables (rather than fixed ones) in z_{l,t} = [y_{1,t}, ⋯, y_{l−1,t}, x_{l,t}^T]^T to make the final model f̃_l more suitable for quality prediction. Consequently, the "virtual" quality variables provide a sequential link from the first reactor to the last one in an SR process.

Additionally, it can be noted that, for N training data with the candidate parameter set Θ_{l,K}, the complexity of FLOO-CV can be reduced to about O(KN³) operations, compared to the traditional LOO-CV with about O(KN⁴) operations for training an LSSVR model. Consequently, the computational load can be greatly reduced for this sequential modeling problem. This allows the
proposed sequential modeling framework to be implemented in an efficient manner.

[Fig. 4 appears here.] Fig. 4. Illustrations of the proposed JS-LSSVR and SLSSVR modeling methods and other traditional methods for modeling of an SRMG process (with sequential-reactor and multi-grade characteristics).

3.2. Just-in-time LSSVR modeling approach for multi-grade processes

As the market is highly competitive, time-to-market, quality and differentiation are the main competitive factors in the market. Many different grades have to be produced in a polymer manufacturing process. In many cases, there will not be sufficient data to train the model throughout the entire input space. The distribution of modeling data is generally uneven. Sufficient modeling data can be obtained in some steady-state operating conditions, but few data are available beyond these areas. Consequently, it is difficult to describe the whole multi-grade (MG) process with multiple grades using only a single model (Kaneko et al., 2011; Kim et al., 2005; Liu, 2007; Yu, 2012b).

This problem may be overcome by fixing models a priori in the given areas, and applying learning methods only where the samples are available and reliable. Local models can provide the advantages of learning efficiency and generalization. However, it is difficult to automatically find the suitable level of locality for a given subspace of an arbitrary problem. Therefore, in this section, a JITL modeling strategy combined with LSSVR is proposed for the whole SRMG process. The JITL method, which is inspired by ideas from local modeling and database technology, has been developed as an attractive alternative for nonlinear process modeling, monitoring and control (Atkeson et al., 1997; Bontempi et al., 1999; Cheng and Chiu, 2004; Fujiwara et al., 2009; Ge and Song, 2010; Hu et al., 2013; Kano and Ogawa, 2010; Liu et al., 2007, 2012). The just-in-time LSSVR (JLSSVR) model was first proposed by Liu et al. (2007) for batch processes and then extended by Ge and Song (2010) for continuous processes. Generally, for a query sample x_t, there are three main steps to build a JLSSVR model (Liu et al., 2007, 2012):

Step 1: Select the relevant samples to construct a similar set S_sim in the database S based on some defined similarity criteria.
Step 2: Build a JLSSVR model f_JLSSVR(x_t) using the relevant dataset S_sim.
Step 3: Predict the output ŷ_t online for the current query sample x_t and then discard the JLSSVR model f_JLSSVR(x_t).

With the same three-step procedure, a new JLSSVR model can be built for the next query sample. Generally, the Euclidean distance-based similarity is the most commonly utilized index (Atkeson et al., 1997; Bontempi et al., 1999; Cheng and Chiu, 2004). The similarity factor (SF) s_ti between the query sample x_t and the sample x_i in the dataset is defined below (Atkeson et al., 1997; Bontempi et al., 1999)

s_ti = exp(−d_ti) = exp(−‖x_i − x_t‖),  i = 1, ⋯, N    (13)

where d_ti is the distance similarity between x_t and x_i in X. The value of s_ti is bounded between 0 and 1, and when s_ti approaches 1, x_t resembles x_i closely. Although the distance-and-angle-based SF can obtain better performance than utilization of the Euclidean distance alone (Cheng and Chiu, 2004; Liu et al., 2012), an additional parameter for the balance of the distance and the angle should be determined. Additionally, some correlation-based similarity criteria have been proposed recently (Fujiwara et al., 2009;
Kano and Ogawa, 2010). However, this is not the main scope of this study, so it is not investigated here.

Based on the similarity criterion, the n (n_min ≤ n ≤ n_max) related samples should be chosen for building a JLSSVR model. Generally, two parameters n_min and n_max are chosen. Only the relevant data sets formed by the n_min-th relevant data to the n_max-th relevant data are used in the model (Cheng and Chiu, 2004). However, how to determine the number of similar samples is not straightforward. Cheng and Chiu (2004) adopted a linear search method with a search step Δn to choose n. Nevertheless, the computational burden may become large if the range of [n_min, n_max] is wide.

To solve the problem, an improved similarity criterion which was recently proposed for batch processes is adopted (Liu et al., 2012). First, the n_max most similar samples can be selected according to the SF criterion in Eq. (13). Then, the n_max similar samples are ranked according to the degree of similarity. Define a cumulative similarity factor (CSF) S_tn as below (Liu et al., 2012)

S_tn = (∑_{i=1}^{n} s_ti) / (∑_{i=1}^{n_max} s_ti),  n ≤ n_max    (14)

which represents the cumulative similarity of the n most similar samples compared to the relevant dataset S_sim. The CSF index can be utilized to assess the cumulative similarity, and then it can determine the n most similar samples in a more reasonable way. As an alternative, the search of [n_min, n_max] can be substituted by the choice of S_tn, e.g., S_tn = 0.9. This means 90% of the similar samples have been selected. Especially for an MG process, the relevant sets of a query sample for different grades are generally different. It is difficult to determine the range of [n_min, n_max] beforehand. Therefore, compared to the method of searching n in the range of [n_min, n_max] directly, the CSF index is more meaningful and its computational burden can be reduced by simply selecting S_tn (Liu et al., 2012).

With a suitable choice of S_tn, the n similar samples are determined and the relevant dataset S_sim can be obtained. Then, with a candidate parameter, the JLSSVR model can be constructed. As aforementioned, the model parameter can also be selected based on the FLOO-CV criterion (Cawley and Talbot, 2004). Finally, if the SRMG process is in batch operations, the selection of similar samples can be further implemented in a time-scale window due to the frequent repetition of batch runs. Additionally, the JITL method can be combined with the moving-window strategy to save the searching time for similar samples, especially for process monitoring (Hu et al., 2013).

3.3. Implementation steps of just-in-time sequential LSSVR for an SRMG process

As aforementioned, reliable online quality prediction of an SRMG process often encounters different challenges, including process nonlinearity, input variable selection/extraction, the sequential relationship in reactors, and shifting operating modes caused by multiple grades. In this section, a just-in-time sequential LSSVR (JS-LSSVR) nonlinear modeling method is proposed for the whole SRMG process. It can integrate sequential learning and local learning. The step-by-step procedures of the JS-LSSVR approach for online quality predictions of an SRMG process are summarized as below. The flow diagram of the corresponding procedures of the JS-LSSVR and SLSSVR methods is shown in Fig. 4.

Step 1: Collect the process input and output data, i.e., S_l = {X_l, Y_l}, for model training of the lth reactor.
Step 2: Train an SLSSVR model for the sequential relationship in an SR process using the efficient FLOO-CV training strategy (Eqs. (11) and (12)). The model data set, i.e., S̃_l = {Z_l, Y_l}, can be obtained (Eqs. (6)–(9)), where the input variables in the previous reactors are substituted with "virtual" quality variables (Z_l) via the LSSVR transform models.
Step 3: For a new input measurement in the test set, i.e., x_{l,t}, transform it into a new input variable z_{l,t} = [y_{1,t}, ⋯, y_{l−1,t}, x_{l,t}^T]^T ∈ R^{l−1+m_l} (Eq. (9)) before online prediction.
Step 4: Further search its similar set S̃_{l,sim} = {Z_{l,sim}, Y_{l,sim}} using the similarity criterion (Eqs. (13) and (14)). A JS-LSSVR model for z_{l,t} is constructed and the prediction ŷ_{l,t} can be obtained (Eq. (10)). Go to Step 3 and repeat the same procedure for the online prediction of the next new input measurement.

Table 1
Comparisons of several soft sensors, including JS-LSSVR, SLSSVR, JLSSVR, PCA–LSSVR, and LSSVR, for an SRMG process (with sequential-reactor and multi-grade characteristics).

Method | Abbreviation | Input variables transformation | Sequential relationship captured | Suitable for multiple grades
Just-in-time sequential LSSVR (proposed) | JS-LSSVR | Yes | Yes | Yes
Sequential LSSVR (proposed) | SLSSVR | Yes | Yes | No
Just-in-time LSSVR (Liu et al., 2012) | JLSSVR | No | No | Yes
Two-step PCA-based LSSVR (Cao et al., 2003) | PCA–LSSVR | Yes | No | No
Basic LSSVR (Shi and Liu, 2006; Suykens et al., 2002) | LSSVR | No | No | No

[Fig. 5 appears here, showing process units including exchangers E-201, vessels V-232 and V-234, and reactors R-229, R-231, R-233, and R-234.] Fig. 5. A simplified flowchart of the industrial SRMG polymerization process in an industrial plant in Taiwan.

In Steps 1–3, the SLSSVR modeling method can integrate the variable extraction and quality prediction into a unified framework for an SR process. The process variables (X_j, j = 1, 2, ⋯, l) are transformed into the "virtual" quality variables Z_j, j = 1, 2, ⋯, l under the constraints of the sequential relationships. As shown in Fig. 4, using the proposed transform strategy, the number of input variables (X_j, j = 1, 2, ⋯, l) can be reduced from ∑_{j=1}^{l} m_j to l−1+m_l. Step 4 is for the JS-LSSVR model. A similar set with n rather than N samples (n << N) is utilized to construct the model, to improve the local learning ability for the online prediction of a query sample.

Additionally, the main properties of these modeling methods for an SRMG process are listed in Table 1. For detailed comparisons, the other modeling methods, including the aforementioned LSSVR, PCA–LSSVR, and JLSSVR, are also shown at the top of Fig. 4. As shown in Fig. 4 and Table 1, none of the LSSVR, PCA–LSSVR or JLSSVR methods can capture the sequential relationship in an SR process. The main property of SLSSVR is that it can integrate the variable extraction and quality prediction into a unified framework through the "virtual" quality variables as the connecting bridge. This is the first contribution of this work, and it is different from the traditional LSSVR and two-step modeling methods. However, as aforementioned, SLSSVR alone is not enough for modeling of an MG process, although the sequential relationship is captured. Due to the MG data, the distribution of the "virtual" quality variables is wide and non-uniform. On the other hand, as shown in Table 1, compared with a single global model, the JLSSVR model can offer a more accurate prediction for an MG process because of its local learning ability, but the input variables and the sequential relationship in an SR process are not explored. Naturally, the JS-LSSVR method, which is the second main contribution of this
work, combines the advantages of both SLSSVR and JLSSVR. It has sequential, global–local, and quality-relevant properties, so it is more suitable for a process with both sequential-reactor and multi-grade characteristics.

In this work, the LSSVR model is adopted as the basic nonlinear model. In general, SVR, GPR, and other comparable nonlinear models (Chen and Ren, 2009; Ge et al., 2011; Schölkopf and Smola, 2002; Tipping, 2001; Yu, 2012b; Zhang et al., 2010) can also be applied to this framework. As for these nonlinear methods, efficient training of a related just-in-time nonlinear model (e.g., a just-in-time sequential GPR model) should be developed. Alternatively, the SVR model with time difference recently proposed by Kaneko and Funatsu (2011) can tackle the nonlinearity in process variables and the degradation of the model. These would be interesting future studies when applied to the modeling of SRMG processes.

4. Online prediction of MI in an industrial process

In this section, the proposed JS-LSSVR and SLSSVR soft sensor modeling methods are validated by predicting MI of an industrial polyethylene production process in a plant in Taiwan. All the data samples have been collected from daily process records and the corresponding laboratory analysis. After simple preprocessing of the set with a 3-sigma criterion, about 300 samples collected from a product line from July 2009 to June 2011 are investigated in this study. Among them, the data from July 2009 to December 2010 are used for training and the data in 2011 are for testing. The data are available on the web at http://wavenet.cycu.edu.tw/~cpse/web/Resources.htm.

A simplified flowchart of the polyethylene production process is shown in Fig. 5. There are three main reactors in this polyethylene production process. Their product quality variables (melt indices) for the sequential reactors are denoted as MI1, MI2 and MI3, respectively. A total of 29 process variables correlated with the product quality have been chosen: 12 variables in the 1st reactor, 8 variables in the 2nd reactor, and 9 variables in the 3rd reactor, respectively. Besides, there are 3 main product grades for this production line. MI is sampled and analyzed in the lab once a day, always in the morning. Owing to the lack of online analyzers for MI, the operating variables have to be manipulated until the assay results are available. Consequently, off-grade products and materials have inevitably been produced.

The JS-LSSVR and SLSSVR methods are applied to predict MI2 and MI3, compared with the traditional JLSSVR, PCA–LSSVR, LSSVR, and PLS methods. For PCA–LSSVR and PLS, the number of LVs for each reactor is chosen by the common cumulative percentage of variance criterion of 85% (Kadlec et al., 2009). Then, all the SLSSVR, PCA–LSSVR and LSSVR models are trained using the FLOO-CV criterion. To assess the prediction performance, two performance indices, the root-mean-square error (RMSE) and the relative RMSE (RE), are both considered and defined respectively as follows:

RMSE_l = sqrt( ∑_{i=1}^{N} (ŷ_{l,i} − y_{l,i})² / N ),  l = 2, 3    (15)

RE_l = sqrt( ∑_{i=1}^{N} [(ŷ_{l,i} − y_{l,i})/y_{l,i}]² / N ),  l = 2, 3    (16)

Table 2
Comparisons of prediction between MI2 and MI3 using different soft sensors (the best results are in bold italics).

MI and its range | Method | Input variables description | RMSE | RE (%)
MI3 (0–0.3) | JS-LSSVR | 1+1+9 = 11 | 0.020 | 21.1
MI3 (0–0.3) | SLSSVR | 1+1+9 = 11 | 0.026 | 29.1
MI3 (0–0.3) | JLSSVR | 12+8+9 = 29 | 0.033 | 44.0
MI3 (0–0.3) | PCA–LSSVR | 13 | 0.041 | 46.9
MI3 (0–0.3) | LSSVR | 12+8+9 = 29 | 0.041 | 50.0
MI3 (0–0.3) | PLS | 10 | 0.055 | 61.9
MI2 (0–10) | JS-LSSVR | 1+8 = 9 | 0.768 | 23.9
MI2 (0–10) | SLSSVR | 1+8 = 9 | 0.777 | 34.8
MI2 (0–10) | JLSSVR | 12+8 = 20 | 0.849 | 32.4
MI2 (0–10) | PCA–LSSVR | 8 | 1.035 | 42.2
MI2 (0–10) | LSSVR | 12+8 = 20 | 0.943 | 37.8
MI2 (0–10) | PLS | 8 | 1.069 | 45.3

Details of the comparisons between MI2 and MI3 prediction among the JS-LSSVR, SLSSVR, JLSSVR, PCA–LSSVR, LSSVR and PLS methods are tabulated in Table 2. First, the results show that the conventional PLS method, represented as a multivariable LV method, is insufficient for this case, especially for the prediction of MI3. Although some improved PLS approaches have been utilized for MI prediction of polyethylene processes (Ahmed et al., 2009), they may be unsuitable for modeling of complex nonlinear SRMG problems. The basic LSSVR shows better prediction performance than PLS, mainly because of its nonlinear modeling ability. PCA–LSSVR (Cao et al., 2003), represented as a two-step modeling method, does not always perform better than LSSVR because the extracted LVs may fail to explain the quality properties. Moreover, the sequential relationship in the SRMG process has not been captured by PCA–LSSVR, LSSVR or PLS. JLSSVR (Liu et al., 2012), as a local nonlinear modeling method, can achieve higher prediction accuracy than the PCA–LSSVR and LSSVR approaches. As for the prediction of MI2, JLSSVR is only a little inferior to SLSSVR in terms of RMSE. Nevertheless, as for the prediction of MI3, JLSSVR is much inferior to SLSSVR both in two
