Lth reactor, it is not a trivial task because the set of the variables $\{x_{L,i} \in \mathbb{R}^{\sum_{j=1}^{L} m_j}\}_{i=1,\cdots,N}$, including the variables at the current reactor and the variables from the previous reactors, becomes more and more complicated. Without enough process knowledge, it is difficult to determine a suitable input variable for constructing a model.

jth reactor. Correspondingly, the modeling data set of the lth reactor can be further represented as $S_l = \{X_l, Y_l\},\ l = 1, \cdots, L$, where $X_l = \{x_{l,i} = [x_{1,i}^T, \cdots, x_{l-1,i}^T, x_{l,i}^T]^T \in \mathbb{R}^{\sum_{j=1}^{l} m_j}\}_{i=1,\cdots,N}$ and $Y_l = \{y_{l,i} \in \mathbb{R}\}_{i=1,\cdots,N}$.
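As a concrete illustration of the stacked data set $S_l$ above, the construction can be sketched with NumPy. The sample count and the per-reactor variable counts below are hypothetical, chosen only to show how the input dimension $\sum_{j=1}^{l} m_j$ grows with $l$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: N samples; reactor j has m_j measured variables.
N = 50
m = [12, 8, 9]                       # three reactors (values assumed)
X_per_reactor = [rng.normal(size=(N, mj)) for mj in m]
y = [rng.normal(size=N) for _ in m]  # quality variable of each reactor

def modeling_set(l):
    """S_l = {X_l, Y_l}: inputs of reactors 1..l stacked sample-wise."""
    X_l = np.hstack(X_per_reactor[:l])   # shape (N, sum_{j<=l} m_j)
    Y_l = y[l - 1]                       # quality of the l-th reactor
    return X_l, Y_l

X_L, Y_L = modeling_set(len(m))
# The input dimension grows with l, which is why variable
# selection/extraction becomes necessary for downstream reactors.
assert X_L.shape == (N, sum(m)) and Y_L.shape == (N,)
```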
[Fig. 2 compares three soft sensors for a test sample $x_{l,t}$: a basic soft sensor trained on the full input and output data $S_l = \{X_l, Y_l\},\ l = 1, \cdots, L$; a traditional two-step soft sensor A trained on $Y_l$ and a subset of variables or features $X_{sub,l}$; and a traditional two-step soft sensor B trained on $Y_l$ and latent variables (LVs) $X_{LVs,l}$ extracted by PCA. Each feeds an LSSVR model for the related reactor (Eq. 5) to obtain the prediction $\hat{y}_{l,t}$.]
Fig. 2. Flow diagram of the traditional LSSVR-based two-step soft sensor modeling methods for the lth reactor of a sequential process.
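The two-step soft sensor B of Fig. 2 (PCA feature extraction followed by LSSVR regression) can be sketched as follows. The LSSVR is trained here by solving the standard dual linear system of the least-squares SVM formulation (Suykens et al., 2002) with a Gaussian kernel; the synthetic data, the number of latent variables, and the $(\gamma, \sigma)$ values are illustrative assumptions, not values from the paper.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # K(x_i, x_j) = exp(-||x_i - x_j||^2 / sigma)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma)

def lssvr_fit(X, y, gamma=100.0, sigma=1.0):
    """Solve the LSSVR dual system
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    N = len(y)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = gaussian_kernel(X, X, sigma) + np.eye(N) / gamma
    sol = np.linalg.solve(A, np.r_[0.0, y])
    return sol[1:], sol[0]            # alpha, b

def lssvr_predict(X_train, alpha, b, X_query, sigma=1.0):
    return gaussian_kernel(X_query, X_train, sigma) @ alpha + b

def pca_fit(X, n_lv):
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:n_lv]              # loadings of the n_lv leading LVs

# Two-step soft sensor B: extract LVs first, then regress on them.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 10))
y_out = np.sin(X[:, 0]) + 0.5 * X[:, 1]

mu, P = pca_fit(X, n_lv=5)
T = (X - mu) @ P.T                    # latent variables X_LVs
alpha, b = lssvr_fit(T, y_out, gamma=1e3, sigma=5.0)
y_hat = lssvr_predict(T, alpha, b, T, sigma=5.0)
```

A useful sanity check of the dual solution: on the training set the residuals satisfy $y - \hat{y} = \alpha/\gamma$ exactly, which follows directly from the linear system above.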
3. Just-in-time sequential LSSVR model for an SRMG process

3.1. Sequential LSSVR modeling framework and its efficient training for an SR process

This paper presents a novel soft sensor modeling method which can integrate the variable extraction and quality prediction into a unified framework. Additionally, the sequential relationship in an SR process is investigated. For the lth reactor of an SR process, the soft sensor modeling framework can be described as below:

$$\min J_l(\omega_l, \beta_l, \xi_{l,N}) = \frac{1}{2}\|\omega_l\|^2 + \frac{\gamma_l}{2}\|\xi_{l,N}\|^2 \quad \text{s.t.}\ \ y_{l,i} - \omega_l^T \phi(z_{l,i}) - \beta_l - \xi_{l,i} = 0,\ i = 1, \cdots, N \tag{6}$$

where $\omega_l, \beta_l$ are the model parameters of the final SLSSVR soft sensor model $\tilde{f}_l$, and $\xi_{l,N} = [\xi_{l,1}, \xi_{l,2}, \cdots, \xi_{l,N}]^T$ is the approximation error. The new input variable $z_{l,i}$ can be formulated as

$$z_{l,i} = [y_{1,i}, \cdots, y_{l-1,i}, x_{l,i}^T]^T \in \mathbb{R}^{l-1+m_l},\ i = 1, \cdots, N \tag{7}$$

where

$$\begin{cases} y_{1,i} = g_1(w_1, b_1, x_{1,i}) = w_1^T \phi(x_{1,i}) + b_1, & i = 1, \cdots, N \\ y_{2,i} = g_2(w_2, b_2, x_{2,i}) = w_2^T \phi(x_{2,i}) + b_2, & i = 1, \cdots, N \\ \qquad \cdots \\ y_{l-1,i} = g_{l-1}(w_{l-1}, b_{l-1}, x_{l-1,i}) = w_{l-1}^T \phi(x_{l-1,i}) + b_{l-1}, & i = 1, \cdots, N \end{cases} \tag{8}$$

where $\{w_j, b_j\}_{j=1,\cdots,l-1}$ are the model parameters of the transform models $\{g_j\}_{j=1,\cdots,l-1}$, respectively. For the jth $(j = 1, \cdots, l-1)$ reactor, the LSSVR model can be trained to obtain the nonlinear transform of $x_{j,i}$ to the "virtual" quality variables $y_{j,i},\ i = 1, \cdots, N,\ j = 1, \cdots, l-1$. Then the new input variable $z_{l,i} = [y_{1,i}, \cdots, y_{l-1,i}, x_{l,i}^T]^T \in \mathbb{R}^{l-1+m_l}$ and the corresponding new training set $Z_l = \{z_{l,i}\}_{i=1,\cdots,N}$ are constructed. Note that this variable $z_{l,i}$ is different from the previous $x_{l,i}$ in Eq. (1). Based on this transform, the dimension of the new extracted variable $z_{l,i}$ is generally much less than that of $x_{l,i}$ for many SR processes because $l-1+m_l \ll \sum_{j=1}^{l} m_j$, especially when $l$ is large. Moreover, the process information contained in the variables of the previous reactors can be represented by the corresponding LSSVR transform models $\{g_j\}_{j=1,\cdots,l-1}$. Consequently, $z_{l,i}$ can be considered as a new transformed input variable, including the extracted sequential relationship of the previous reactors and the information in the current reactor.

After training, the aforementioned LSSVR models of the lth reactor can be obtained. Then, a new input $x_{l,t}$ can be transformed to $z_{l,t}$ first using Eqs. (7) and (8):

$$z_{l,t} = [y_{1,t}, \cdots, y_{l-1,t}, x_{l,t}^T]^T \in \mathbb{R}^{l-1+m_l} \tag{9}$$

Finally, the prediction $\hat{y}_{l,t}$ can be obtained using the trained SLSSVR model $\tilde{f}_l$:

$$\hat{y}_{l,t} = \tilde{f}_l(\omega_l, \beta_l, z_{l,t}) = \sum_{i=1}^{N} \tilde{\alpha}_{l,i} \langle \phi(z_{l,i}), \phi(z_{l,t}) \rangle + \beta_l = \tilde{\alpha}_l^T \tilde{k}_{l,t} + \beta_l \tag{10}$$

where $\tilde{k}_{l,t}(i) = \langle \phi(z_{l,i}), \phi(z_{l,t}) \rangle,\ \forall i = 1, \cdots, N$ is a kernel vector and $\tilde{\alpha}_l = [\tilde{\alpha}_{l,1}, \cdots, \tilde{\alpha}_{l,N}]^T$ are the Lagrange multipliers of the SLSSVR model.

Generally, the development of a good LSSVR model depends on a suitable selection of the related model parameters. For an LSSVR model with the common Gaussian kernel $K(x_i, x_j) = \exp(-\|x_i - x_j\|^2/\sigma)$, there are two parameters to be chosen: the regularization parameter $\gamma > 0$ and the related kernel parameter $\sigma > 0$. There exist some efficient model selection criteria, e.g., cross-validation and Bayesian inference (Cawley and Talbot, 2004; Schölkopf and Smola, 2002; Suykens et al., 2002). Here, the parameter selection procedure is mainly based on the fast leave-one-out cross-validation (FLOO-CV) criterion (Cawley and Talbot, 2004, 2007).

The traditional leave-one-out cross-validation (LOO-CV) method has been shown to give an approximately unbiased estimate of the generalization properties of statistical models. It can provide a sensible criterion for model selection. Especially for the conventional SVR method, bounds on the LOO-CV estimator have been proved to be an effective criterion for model selection (Cawley and Talbot, 2004, 2007). However, conventional LOO-CV for SVR/LSSVR is computationally expensive, so it is not suitable for online implementation. Therefore, the FLOO-CV criterion (Cawley and Talbot, 2004) is adopted as the foundation of the parameter selection.

For the lth reactor, the total FLOO-CV error $E^{\mathrm{FLOO-CV}}_{l,N}$ of the LSSVR model with a candidate parameter pair $(\gamma, \sigma)$ and $N$ training samples can be obtained as (Cawley and Talbot, 2004)

$$E^{\mathrm{FLOO-CV}}_{l,N} = \sum_{i=1}^{N} \|\xi^{\mathrm{FLOO-CV}}_{l,i}\|^2 = \sum_{i=1}^{N} \left( \frac{\alpha_{l,i}}{P_{l,N,ii} + s_{l,i}^2/o_l} \right)^2 \tag{11}$$

where $P_{l,N,ii}$ is the entry at the ith row and ith column of $P_{l,N}$, $s_l = P_{l,N} 1_{l,N} = [s_{l,1}, \cdots, s_{l,N}]^T$ and $o_l = 1_{l,N}^T P_{l,N} 1_{l,N}$. The related terms (i.e., $P_{l,N}$ and $\alpha_{l,i}$) are already available, and the computational load of $s_l$ and $o_l$ is small. Consequently, compared to the conventional LOO-CV method, the computation of $E^{\mathrm{FLOO-CV}}_{l,N}$ is much more efficient, which allows further selection of the parameters of a sequence of LSSVR models.

606 Y. Liu et al. / Chemical Engineering Science 102 (2013) 602–612

Fig. 3. Schematic diagram of the sequential LSSVR modeling method for an SR process.

The proposed sequential modeling method is shown in Fig. 3. For the lth $(l > 1)$ reactor, since the transforms of the first $(l-1)$ reactors, $y_{j,i},\ i = 1, \cdots, N,\ j = 1, \cdots, l-1$, are mainly determined by the corresponding LSSVR models $\{g_j\}_{j=1,\cdots,l-1}$, the prediction model $\tilde{f}_l$ and the first $(l-1)$ transform LSSVR models $\{g_j\}_{j=1,\cdots,l-1}$ are trained simultaneously. For online prediction of a query sample $x_{l,t}$, the transformed variable $z_{l,t} = [y_{1,t}, \cdots, y_{l-1,t}, x_{l,t}^T]^T \in \mathbb{R}^{l-1+m_l}$ can describe the input variables in a sequential way with an extracted form. Furthermore, this new transformed variable contains the quality information of the first $(l-1)$ reactors. Consequently, the online prediction with the model $\tilde{f}_l$ can be achieved in a straightforward way.

As shown in Fig. 3, a forward FLOO-CV-based optimization method for solving the sequential modeling framework is proposed. For the lth reactor, the modeling and prediction can be implemented in a unified way instead of being implemented on the reactors one by one. First, the candidate parameter set can be defined as $\Theta_{l,K} = \{\theta_{l,k} = [\gamma_{1,k}, \sigma_{1,k}, \cdots, \gamma_{l-1,k}, \sigma_{l-1,k}, \gamma_{l,k}, \sigma_{l,k}]^T\}_{k=1,\cdots,K}$ with $K$ groups of parameters. Among the set $\Theta_{l,K}$, $(\gamma_{1,k}, \sigma_{1,k}, \cdots, \gamma_{l-1,k}, \sigma_{l-1,k})_{k=1,\cdots,K}$ are for the first $(l-1)$ LSSVR models $\{g_j\}_{j=1,\cdots,l-1}$ and $(\gamma_{l,k}, \sigma_{l,k})_{k=1,\cdots,K}$ are for the model $\tilde{f}_l$, respectively. Then, based on the FLOO-CV criterion, the proposed SLSSVR model can be efficiently trained in a forward manner. The final SLSSVR model can be obtained as follows:

$$\min E^{\mathrm{FLOO-CV}}_{l,N} \quad \text{for}\ \ \Theta_{l,K} = \{\theta_{l,k} = [\gamma_{1,k}, \sigma_{1,k}, \cdots, \gamma_{l-1,k}, \sigma_{l-1,k}, \gamma_{l,k}, \sigma_{l,k}]^T\}_{k=1,\cdots,K} \tag{12}$$

On the one hand, with the transform LSSVR models $\{g_j\}_{j=1,\cdots,l-1}$, the "virtual" quality variables are connected with the input variables in the previous reactors. They can provide the related quality information for the final reactor. On the other hand, note that the main purpose is to achieve good quality prediction for the final reactor. As can be seen in Eq. (12), the "virtual" quality variables $y_{j,i},\ i = 1, \cdots, N,\ j = 1, \cdots, l-1$ of the first $(l-1)$ reactors can be adjusted within $\Theta_{l,K}$. That is, they can also be considered as "adjustable" (rather than fixed) input variables in $z_{l,t} = [y_{1,t}, \cdots, y_{l-1,t}, x_{l,t}^T]^T$ to make the final model $\tilde{f}_l$ more suitable for quality prediction. Consequently, the "virtual" quality variables provide a sequential link from the first reactor to the last one in an SR process.

Additionally, it can be noted that, for $N$ training data with the candidate parameter set $\Theta_{l,K}$, the complexity of FLOO-CV can be reduced to about $O(KN^3)$ operations, compared to the traditional LOO-CV with about $O(KN^4)$ operations for training an LSSVR model. Consequently, the computational load can be greatly reduced for this sequential modeling problem. This allows the
proposed sequential modeling framework to be implemented in an efficient manner.

3.2. Just-in-time LSSVR modeling approach for multi-grade processes

As the market is highly competitive, time-to-market, quality and differentiation are the main competitive factors. Many different grades have to be produced in a polymer manufacturing process. In many cases, there are not sufficient data to train the model throughout the entire input space. The distribution of modeling data is generally uneven: sufficient modeling data can be obtained in some steady-state operating conditions, but few data are available beyond these areas. Consequently, it is difficult to describe a whole multi-grade (MG) process with multiple grades using only a single model (Kaneko et al., 2011; Kim et al., 2005; Liu, 2007; Yu, 2012b).

This problem may be overcome by fixing models a priori in the given areas, and applying learning methods only where the samples are available and reliable. Local models can provide the advantages of learning efficiency and generalization. However, it is difficult to automatically find the suitable level of locality for a given subspace of an arbitrary problem. Therefore, in this section, a JITL modeling strategy combined with LSSVR is proposed for the whole SRMG process. The JITL method, which is inspired by ideas from local modeling and database technology, has been developed as an attractive alternative for nonlinear process modeling, monitoring and control (Atkeson et al., 1997; Bontempi et al., 1999; Cheng and Chiu, 2004; Fujiwara et al., 2009; Ge and Song, 2010; Hu et al., 2013; Kano and Ogawa, 2010; Liu et al., 2007, 2012). The just-in-time LSSVR (JLSSVR) model was first proposed by Liu et al. (2007) for batch processes and then extended by Ge and Song (2010) to continuous processes. Generally, for a query sample $x_t$, there are three main steps to build a JLSSVR model (Liu et al., 2007, 2012):

Step 1: Select the relevant samples to construct a similar set $S_{sim}$ in the database $S$ based on some defined similarity criterion.
Step 2: Build a JLSSVR model $f_{JLSSVR}(x_t)$ using the relevant dataset $S_{sim}$.
Step 3: Predict the output $\hat{y}_t$ online for the current query sample $x_t$ and then discard the JLSSVR model $f_{JLSSVR}(x_t)$.

With the same three-step procedure, a new JLSSVR model can be built for the next query sample. Generally, the Euclidean distance-based similarity is the most commonly utilized index (Atkeson et al., 1997; Bontempi et al., 1999; Cheng and Chiu, 2004). The similarity factor (SF) $s_{ti}$ between the query sample $x_t$ and the sample $x_i$ in the dataset is defined below (Atkeson et al., 1997; Bontempi et al., 1999)

$$s_{ti} = \exp(-d_{ti}) = \exp(-\|x_i - x_t\|),\ i = 1, \cdots, N \tag{13}$$

where $d_{ti}$ is the distance between $x_t$ and $x_i$ in $X$. The value of $s_{ti}$ is bounded between 0 and 1, and when $s_{ti}$ approaches 1, $x_t$ resembles $x_i$ closely. Although the distance-and-angle-based SF can obtain better performance than utilization of the Euclidean distance only (Cheng and Chiu, 2004; Liu et al., 2012), an additional parameter for balancing the distance and the angle has to be determined. Additionally, some correlation-based similarity criteria have been proposed recently (Fujiwara et al., 2009;
Fig. 4. Illustrations of the proposed JS-LSSVR and SLSSVR modeling methods and other traditional methods for modeling of an SRMG process (with sequential-reactor and
multi-grade characteristics).
Kano and Ogawa, 2010). However, this is not the main scope of this study, so it is not investigated here.

Based on the similarity criterion, the $n$ $(n_{min} \le n \le n_{max})$ related samples should be chosen for building a JLSSVR model. Generally, two parameters $n_{min}$ and $n_{max}$ are chosen, and only the relevant data sets formed by the $n_{min}$th to the $n_{max}$th relevant data are used in the model (Cheng and Chiu, 2004). However, how to determine the number of similar samples is not straightforward. Cheng and Chiu (2004) adopted a linear search method with a search step $\Delta n$ to choose $n$. Nevertheless, the computational burden may become large if the range of $[n_{min}, n_{max}]$ is wide.

To solve this problem, an improved similarity criterion recently proposed for batch processes is adopted (Liu et al., 2012). First, the $n_{max}$ most similar samples are selected using the SF criterion in Eq. (13). Then, the $n_{max}$ similar samples are ranked according to the degree of similarity. Define a cumulative similarity factor (CSF) $S_{tn}$ as below (Liu et al., 2012)

$$S_{tn} = \frac{\sum_{i=1}^{n} s_{ti}}{\sum_{i=1}^{n_{max}} s_{ti}},\quad n \le n_{max} \tag{14}$$

which represents the cumulative similarity of the $n$ most similar samples compared to the relevant dataset $S_{sim}$. The CSF index can be utilized to assess the cumulative similarity and then determine the $n$ most similar samples in a more reasonable way. As an alternative, the search over $[n_{min}, n_{max}]$ can be substituted by the choice of $S_{tn}$, e.g., $S_{tn} = 0.9$, which means that samples covering 90% of the cumulative similarity have been selected. Especially for an MG process, the relevant sets of a query sample for different grades are generally different, and it is difficult to determine the range of $[n_{min}, n_{max}]$ beforehand. Therefore, compared to the method of searching $n$ in the range of $[n_{min}, n_{max}]$ directly, the CSF index is more meaningful and its computational burden can be reduced by simply selecting $S_{tn}$ (Liu et al., 2012).

With a suitable choice of $S_{tn}$, the $n$ similar samples are determined and the relevant dataset $S_{sim}$ can be obtained. Then, with a candidate parameter, the JLSSVR model can be constructed. As aforementioned, the model parameter can also be selected based on the FLOO-CV criterion (Cawley and Talbot, 2004). Finally, if the SRMG process is operated in batches, the selection of similar samples can be further implemented in a time-scale window due to the frequent repetition of batch runs. Additionally, the JITL method can be combined with a moving-window strategy to save the searching time of similar samples, especially for process monitoring (Hu et al., 2013).

3.3. Implementation steps of just-in-time sequential LSSVR for an SRMG process

As aforementioned, reliable online quality prediction of an SRMG process often encounters several challenges, including process nonlinearity, input variable selection/extraction, sequential relationships in reactors, and shifting operating modes caused by multiple grades. In this section, a just-in-time sequential LSSVR (JS-LSSVR) nonlinear modeling method is proposed for the whole SRMG process. It can integrate sequential learning and local learning. The step-by-step procedures of the JS-LSSVR approach for online quality prediction of an SRMG process are summarized below. The flow diagram of the corresponding procedures of the JS-LSSVR and SLSSVR methods is shown in Fig. 4.

Step 1: Collect the process input and output data, i.e., $S_l = \{X_l, Y_l\}$, for model training of the lth reactor.
Step 2: Train an SLSSVR model for the sequential relationship in an SR process using the efficient FLOO-CV training strategy (Eqs. (11) and (12)). The model data set, i.e., $\tilde{S}_l = \{Z_l, Y_l\}$, can be obtained (Eqs. (6)–(9)), where the input variables in the previous reactors are substituted with the "virtual" quality variables $(Z_l)$ via LSSVR transform models.
Step 3: For a new input measurement in the test set, i.e., $x_{l,t}$, transform it into a new input variable $z_{l,t} = [y_{1,t}, \cdots, y_{l-1,t}, x_{l,t}^T]^T \in \mathbb{R}^{l-1+m_l}$ (Eq. (9)) before online prediction.
Step 4: Search its similar set $\tilde{S}_{l,sim} = \{Z_{l,sim}, Y_{l,sim}\}$ using the similarity criterion (Eqs. (13) and (14)). A JS-LSSVR model for $z_{l,t}$ is constructed and the prediction $\hat{y}_{l,t}$ can be obtained (Eq. (10)). Go to Step 3 and repeat the procedure for the online prediction of the next new input measurement.

In Steps 1–3, the SLSSVR modeling method can integrate the variable extraction and quality prediction into a unified framework for an SR process. The process variables $(X_j,\ j = 1, 2, \cdots, l)$ are transformed into the "virtual" quality variables $Z_j,\ j = 1, 2, \cdots, l$ under the constraints of the sequential relationships. As shown in Fig. 4, using the proposed transform strategy, the number of input variables $(X_j,\ j = 1, 2, \cdots, l)$ can be reduced from $\sum_{j=1}^{l} m_j$ to $l-1+m_l$. Step 4 builds the JS-LSSVR model: a similar set with $n$ rather than $N$ samples $(n \ll N)$ is utilized to construct the model, so as to improve the local learning ability for the online prediction of a query sample.

Additionally, the main properties of these modeling methods for an SRMG process are listed in Table 1. For detailed comparisons, other modeling methods, including the aforementioned LSSVR, PCA–LSSVR, and JLSSVR, are also shown at the top of Fig. 4. As shown in Fig. 4 and Table 1, none of the LSSVR, PCA–LSSVR, and JLSSVR methods can capture the sequential relationship in an SR process. The main property of SLSSVR is that it can integrate the variable extraction and quality prediction into a unified framework through the "virtual" quality variables as the connecting bridge. This is the first contribution of this work and it is different from the traditional LSSVR and two-step modeling methods. However, as aforementioned, SLSSVR alone is not enough for modeling an MG process even though the sequential relationship is captured, because with MG data the distribution of the "virtual" quality variables is wide and non-uniform. On the other hand, as shown in Table 1, compared with a single global model, the JLSSVR model can offer a more accurate prediction for an MG process because of its local learning ability, but the input variables and the sequential relationship in an SR process are not explored. Naturally, the JS-LSSVR method, which is the second main contribution of this
Table 1
Comparisons of several soft sensors, including JS-LSSVR, SLSSVR, JLSSVR, PCA–LSSVR, and LSSVR, for an SRMG process (with sequential-reactor and multi-grade characteristics).

Method abbreviation | Input variables transformation | Sequential relationship captured | Suitable for multiple grades
[Table entries not recoverable from the extracted text.]
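The CSF-based similar-set search of Eqs. (13) and (14), used in Step 4 of the JS-LSSVR procedure, can be sketched as follows. The threshold, sample sizes, and the two synthetic clusters are hypothetical values chosen only to exercise the selection logic.

```python
import numpy as np

def csf_select(Z, z_query, S_tn=0.9, n_max=20):
    """Pick the n most similar samples whose cumulative similarity
    factor (CSF, Eq. 14) first reaches the chosen threshold S_tn."""
    s = np.exp(-np.linalg.norm(Z - z_query, axis=1))   # SF, Eq. (13)
    order = np.argsort(-s)[:n_max]                     # n_max most similar
    csf = np.cumsum(s[order]) / s[order].sum()         # CSF, Eq. (14)
    n = int(np.searchsorted(csf, S_tn)) + 1            # smallest n with CSF >= S_tn
    return order[:n]

# Toy check: samples near the query should be selected, far ones ignored.
rng = np.random.default_rng(2)
Z = np.vstack([rng.normal(0.0, 0.1, size=(30, 4)),     # near cluster
               rng.normal(5.0, 0.1, size=(30, 4))])    # far cluster
idx = csf_select(Z, np.zeros(4), S_tn=0.9, n_max=15)
```

Because the CSF is normalized by the similarity mass of the $n_{max}$ retained samples, it always reaches 1, so the returned $n$ never exceeds $n_{max}$.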
[Fig. 5: process units E-201, V-232 and V-234, and reactors R-229, R-231, R-233 and R-234.]
Fig. 5. A simplified flowchart of the industrial SRMG polymerization process in an industrial plant in Taiwan.
work, combines the advantages of both SLSSVR and JLSSVR. It has sequential, global–local, and quality-relevant properties, so it is more suitable for a process with both sequential-reactor and multi-grade characteristics.

In this work, the LSSVR model is adopted as the basic nonlinear model. In general, SVR, GPR, and other comparable nonlinear models (Chen and Ren, 2009; Ge et al., 2011; Schölkopf and Smola, 2002; Tipping, 2001; Yu, 2012b; Zhang et al., 2010) can also be applied to this framework. For these nonlinear methods, efficient training of a related just-in-time nonlinear model (e.g., a just-in-time sequential GPR model) should be developed. Alternatively, the SVR model with time difference recently proposed by Kaneko and Funatsu (2011) can tackle the nonlinearity in process variables and the degradation of the model. These would be interesting future studies when applied to the modeling of SRMG processes.

Table 2
Comparisons of prediction between MI2 and MI3 using different soft sensors (the best results are in bold italics).

MI and its range | Method | Input variables description | RMSE | RE (%)
MI3 (0–0.3) | JS-LSSVR | 1+1+9 = 11 | 0.020 | 21.1
MI3 (0–0.3) | SLSSVR | 1+1+9 = 11 | 0.026 | 29.1
MI3 (0–0.3) | JLSSVR | 12+8+9 = 29 | 0.033 | 44.0
MI3 (0–0.3) | PCA–LSSVR | 13 | 0.041 | 46.9
MI3 (0–0.3) | LSSVR | 12+8+9 = 29 | 0.041 | 50.0
MI3 (0–0.3) | PLS | 10 | 0.055 | 61.9
MI2 (0–10) | JS-LSSVR | 1+8 = 9 | 0.768 | 23.9
MI2 (0–10) | SLSSVR | 1+8 = 9 | 0.777 | 34.8
MI2 (0–10) | JLSSVR | 12+8 = 20 | 0.849 | 32.4
MI2 (0–10) | PCA–LSSVR | 8 | 1.035 | 42.2
MI2 (0–10) | LSSVR | 12+8 = 20 | 0.943 | 37.8
MI2 (0–10) | PLS | 8 | 1.069 | 45.3
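Steps 1–4 of the JS-LSSVR procedure can be put together in a minimal end-to-end sketch. The two-reactor data below are synthetic, and all sizes and $(\gamma, \sigma)$ values are illustrative assumptions; the LSSVR solver follows the standard dual formulation, and the transform/similarity steps mirror Eqs. (9), (13) and (14).

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma)

def lssvr_fit(X, y, gamma, sigma):
    # Dual system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    N = len(y)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = gaussian_kernel(X, X, sigma) + np.eye(N) / gamma
    sol = np.linalg.solve(A, np.r_[0.0, y])
    return sol[1:], sol[0]

def lssvr_predict(X_tr, alpha, b, Xq, sigma):
    return gaussian_kernel(np.atleast_2d(Xq), X_tr, sigma) @ alpha + b

# --- Step 1: synthetic two-reactor training data (sizes assumed) ---
rng = np.random.default_rng(3)
N, m1, m2 = 80, 6, 4
X1 = rng.normal(size=(N, m1))                 # reactor-1 inputs
X2 = rng.normal(size=(N, m2))                 # reactor-2 inputs
y1 = np.tanh(X1[:, 0])                        # reactor-1 quality
y2 = y1 + 0.5 * np.tanh(X2[:, 0])             # reactor-2 quality

# --- Step 2: transform model g_1 and training set Z_2 = [y1_hat, X2] ---
a1, b1 = lssvr_fit(X1, y1, gamma=1e2, sigma=m1)
y1_virtual = lssvr_predict(X1, a1, b1, X1, sigma=m1)  # "virtual" quality
Z2 = np.column_stack([y1_virtual, X2])                # dim 1 + m2 << m1 + m2

def js_lssvr_predict(x1_t, x2_t, S_tn=0.9, n_max=40):
    # --- Step 3: transform the query to z_{2,t} (Eq. 9) ---
    z_t = np.r_[lssvr_predict(X1, a1, b1, x1_t, sigma=m1), x2_t]
    # --- Step 4: CSF similar-set search (Eqs. 13-14) + local LSSVR ---
    s = np.exp(-np.linalg.norm(Z2 - z_t, axis=1))
    order = np.argsort(-s)[:n_max]
    csf = np.cumsum(s[order]) / s[order].sum()
    n = int(np.searchsorted(csf, S_tn)) + 1
    idx = order[:n]
    a_loc, b_loc = lssvr_fit(Z2[idx], y2[idx], gamma=1e2, sigma=len(z_t))
    return lssvr_predict(Z2[idx], a_loc, b_loc, z_t, sigma=len(z_t))[0]

y_hat = js_lssvr_predict(X1[0], X2[0])
```

Note the design choice sketched here: the local model in Step 4 is discarded after each query, in keeping with the just-in-time idea, while the transform model $g_1$ from Step 2 is reused for every query.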