
Proceeding of the 11th World Congress on Intelligent Control and Automation

Shenyang, China, June 29 - July 4 2014

A Modified PLS Regression Model for Quality Prediction*


Yingwei Zhang and Lingjun Zhang
State Laboratory of Synthesis Automation of Process Industry,
Northeastern University, Shenyang City, Liaoning Province,
110819, P. R. China
zhangyingwei@mail.neu.edu.cn
978-1-4799-5825-2/14/$31.00 2014 IEEE

Abstract - In this paper, a modified partial least-squares (PLS) regression modeling method is proposed. The method builds a modified regression model that extracts the information remaining in the residual subspace that is useful for predicting the output variables, so that the quality variables are predicted more accurately. In a simulation experiment, both the proposed modified PLS method and the conventional PLS method are applied to a penicillin fermentation process. The results show that the proposed method is more effective than the conventional PLS method.

Index Terms - Modified PLS regression model, residual subspace, prediction, quality variables.

I. INTRODUCTION
As a data-driven method, partial least squares (PLS) has been widely used in the modeling, monitoring and fault diagnosis of industrial processes, where it has shown good performance [1]-[3]. In a complex multivariable system, the PLS method extracts the information needed to build the relationship between input and output variables [4]. With the resulting model, new output variables can be predicted when the input variables are known. This property makes PLS important for predicting and controlling product quality.
In this paper, the PLS method is analyzed from the point of view of space decomposition. After the input and output spaces are decomposed in PLS, the residual of the input variables still contains information that is relevant to the output variables [5]. This remaining information affects the accuracy of the model and of the prediction. The relationship between the residual subspace and the output variables is derived here from the perspective of projection, and a modified PLS method is then proposed. Compared with the conventional PLS method, the modified method captures a more accurate relationship between input and output variables, improving the precision and predictive power of the model. In the simulation study, both the proposed method and the conventional PLS method are applied to the penicillin fermentation process. The results show that the proposed modified PLS method predicts the outputs more accurately than the conventional PLS method.
The remaining sections of this paper are organized as follows. Section II analyzes the correlation between the residual and the quality variables and then presents the modified PLS algorithm. Section III gives a simulation example to illustrate the feasibility of the proposed method. The conclusions are given in Section IV.

II. MODIFIED PLS ALGORITHM

A. Correlation between Input Residual and Output Variables
The PLS method extracts the relationship between input and output variables by maximizing the covariance between the two spaces. The latent variables are computed with this objective, and the model that reflects the relationship between input and output variables is then built from them. Given an input matrix $X \in \mathbb{R}^{n \times m}$ consisting of $n$ samples with $m$ process variables per sample, and an output matrix $Y \in \mathbb{R}^{n \times p}$ with $p$ quality variables per sample, PLS projects $X$ and $Y$ to a low-dimensional space defined by the latent variables as follows [6]:
$$X = TP^T + E, \qquad Y = TQ^T + F \tag{1}$$
where $T = [t_1, t_2, \dots, t_h]$ is the score matrix, $t_i$ is the latent

variable. $P = [p_1, p_2, \dots, p_h]$ is the loading matrix for $X$ and $Q = [q_1, q_2, \dots, q_h]$ is the loading matrix for $Y$. $h$ is the number of latent variables. In this paper, the number of principal components in PLS is determined by cross-validation [7]. PLS is usually computed with the nonlinear iterative partial least squares (NIPALS) algorithm, which is described in Table I [8]. The objective of PLS embedded in this algorithm is to solve the following problem:
$$\max \; w_i^T X_i^T Y_i q_i \quad \text{s.t. } \|w_i\| = 1, \; \|q_i\| = 1 \tag{2}$$
where $w_i$ and $q_i$ are weight vectors which yield $t_i = X_i w_i$ and $u_i = Y_i q_i$, respectively. Denoting $W = [w_1, \dots, w_h]$, $T$ cannot be calculated from $X$ directly with $W$. Let
$$r_1 = w_1, \qquad r_i = \prod_{j=1}^{i-1} (I_m - w_j p_j^T) w_i, \quad i > 1 \tag{3}$$
and $R = [r_1, \dots, r_h]$. Then the score matrix $T$ can be computed from the original $X$ as follows:
$$T = XR \tag{4}$$
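The construction in (3) and the identity (4) can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the common NIPALS convention in which each weight $w_i$ is scaled to unit length (as in the constraint of (2)), the score $t_i = E_i w_i$ is left unnormalized, and the loading is $p_i = E_i^T t_i / t_i^T t_i$. The function names `nipals_pls` and `weights_to_r` and the synthetic data are hypothetical.

```python
import numpy as np


def nipals_pls(X, Y, h, tol=1e-10, max_iter=500):
    """NIPALS PLS with unit-norm weights w_i and scores t_i = E_i w_i (assumed convention)."""
    E, F = X.copy(), Y.copy()
    n, m = X.shape
    W, P, T = np.zeros((m, h)), np.zeros((m, h)), np.zeros((n, h))
    for i in range(h):
        u = F[:, [0]]                          # start from a column of the output residual
        t_old = np.zeros((n, 1))
        for _ in range(max_iter):
            w = E.T @ u
            w = w / np.linalg.norm(w)          # unit-length weight, as in (2)
            t = E @ w
            c = F.T @ t
            u = F @ c
            u = u / np.linalg.norm(u)
            if np.linalg.norm(t - t_old) <= tol * np.linalg.norm(t):
                break
            t_old = t
        p = E.T @ t / (t.T @ t)                # loading of the current residual
        E = E - t @ p.T                        # deflate the input residual
        F = F - t @ (t.T @ F) / (t.T @ t)      # deflate the output residual
        W[:, [i]], P[:, [i]], T[:, [i]] = w, p, t
    return W, P, T


def weights_to_r(W, P):
    """Build R column by column from (3): r_i = prod_{j<i} (I - w_j p_j^T) w_i."""
    m, h = W.shape
    R = np.zeros((m, h))
    M = np.eye(m)                              # running product over j < i
    for i in range(h):
        R[:, i] = M @ W[:, i]
        M = M @ (np.eye(m) - np.outer(W[:, i], P[:, i]))
    return R


# Synthetic, mean-centered data (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 5))
X = X - X.mean(axis=0)
Y = X @ rng.standard_normal((5, 2)) + 0.1 * rng.standard_normal((60, 2))
Y = Y - Y.mean(axis=0)

W, P, T = nipals_pls(X, Y, h=3)
R = weights_to_r(W, P)
score_error = np.max(np.abs(T - X @ R))        # should be at rounding level if (4) holds
```

Under this convention, (4) holds algebraically for every deflation step, so `score_error` depends only on floating-point rounding and not on how tightly the inner loop has converged.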
$P$, $R$ and $W$ have the following relationships:
$$R = W (P^T W)^{-1} \tag{5}$$
$$P^T R = R^T P = W^T W = I_h \tag{6}$$
From the computation above, the decomposition of the $X$ space is determined by $P$ and $R$, so the influence of $Y$ on the decomposition of the $X$ space is reflected by the angle between $p_i$ and $r_i$. For every dimension,
$$r^T p = 1 \tag{7}$$
so the cosine of the angle between $p$ and $r$ is calculated as follows:
$$\cos\angle(r, p) = \frac{1}{\|r\| \, \|p\|} \tag{8}$$
The PLS weight vector $r$ is in $\mathrm{Span}\{a_1, \dots, a_l\}$ according to the properties of PLS. Therefore,
$$r = \sum_{i=1}^{h} \alpha_i a_i \|r\| \tag{9}$$
and
$$\sum_{i=1}^{h} \alpha_i^2 = 1 \tag{10}$$
Then,
$$p = X^T t / t^T t = X^T X r / t^T t = \frac{\sum_{i=1}^{h} \lambda_i \alpha_i a_i}{\|r\| \sum_{i=1}^{h} \lambda_i \alpha_i^2} \tag{11}$$
Combining (8)-(11),
$$\cos\angle(r, p) = \frac{\sum_{i=1}^{h} \lambda_i \alpha_i^2}{\sqrt{\sum_{i=1}^{h} \lambda_i^2 \alpha_i^2}} \tag{12}$$
Minimizing (12) subject to (10), the maximum angle between $p$ and $r$ is obtained as follows:
$$\angle_{\max}(r, p) = \arccos\left(\frac{2\sqrt{\lambda_1 \lambda_h}}{\lambda_1 + \lambda_h}\right) \tag{13}$$
To visualize the results geometrically, consider the special case of two inputs and one output. Suppose $X = [x_1, x_2]$, $X = t_1 a_1^T + t_2 a_2^T$ and $Y = c_1 t_1 + c_2 t_2$; then (13) takes the following form:
$$\angle_{\max}(r, p) = \arccos\left(\frac{2\sqrt{\lambda_1 \lambda_2}}{\lambda_1 + \lambda_2}\right) \tag{14}$$
From (14), it can be seen that the angle between $p$ and $r$ becomes close to 0 when $\lambda_2$ is close to $\lambda_1$. Some different angles between $p$ and $r$ are depicted in Fig. 1. In PLS, the decomposition of the $X$ space is determined by both $X$ and $Y$. If $Y$ is only relevant to $t_1$ and not to the other $t_i$, then $r$ coincides with $a_1$. In this case $p_1$ and $r_1$ coincide, which gives the same decomposition as PCA; $p_1$ and $r_1$ in Fig. 1 denote this case. If $Y$ is more relevant to $t_2$ than to $t_1$, then $r$ is chosen farther from $a_1$ and closer to $a_2$, making $\angle(r, p)$ larger. This case is depicted by $p_2$ and $r_2$ in Fig. 1. Therefore, $\angle(r, p)$ varies with how $Y$ is relevant to each $t_i$. The case of maximum $\angle(r, p)$ is depicted by $p^*$ and $r^*$ in Fig. 1. When $Y$ is only relevant to $t_2$, $p$ and $r$ coincide again, which is depicted by $p_4$ and $r_4$ in Fig. 1.

Fig. 1 Effect on X-space decomposition by Y

Table I
NIPALS algorithm
Step 1: Set $u$ equal to any column of $Y$
Step 2: $w = X^T u$
Step 3: $t = Xw$, $t \leftarrow t / \|t\|$
Step 4: $c = Y^T t$
Step 5: $u = Yc$, $u \leftarrow u / \|u\|$
Step 6: If $t$ converges, go to Step 7, else return to Step 2
Step 7: $X \leftarrow X - t t^T X$, $Y \leftarrow Y - t t^T Y$

According to the analysis above, the decomposition of $X$ is influenced by $Y$. If $Y$ is more relevant to the leading PCA scores of $X$, the PLS decomposition of the $X$ space is similar to the PCA decomposition of the $X$ space. In this case, the obtained scores can describe the model characteristics. If $Y$ is more relevant to the non-leading PCA scores of $X$, the PLS decomposition of the $X$ space can be very different from the PCA decomposition of the $X$ space, and the variance left in the residual $E$ can be very large. If this remaining information is not orthogonal to the output variables, it will influence the prediction of $Y$. In this case, $E$ should be handled in order to predict $Y$ accurately.
Now the relationship between the residual $E$ and the output variables $Y$ is analyzed from the perspective of projection. As shown in Fig. 2, $R(X)$ and $R(Y)$ denote the spaces of $X$ and $Y$, respectively. $E_{i-1}$ and $F_{i-1}$ denote the residuals after $i-1$ iterations, with $E_0 = X$ and $F_0 = Y$ initially. $\hat{E}_{i-1}$ and $\hat{F}_{i-1}$ are the projections of $E_{i-1}$ and $F_{i-1}$ on the score direction. $\tilde{E}_{i-1}$ is the projection of $E_{i-1}$ on the residual subspace. $\theta$ is the angle between the residual subspace of PLS and the direction of $F_{i-1}$. When the covariance of $X$ and $Y$ is maximized, the possible positions of $t_i$ are shown in the figure; $t_i$ varies along the direction drawn with a dotted line. At the same time, the direction of the residual subspace $E_i$ changes with $t_i$. The angle $\theta$ reflects the correlation between $E_i$ and $F_{i-1}$. The correlation between $E_i$ and $F_{i-1}$ is the same as the correlation between $E_i$ and $Y$, which is proved as follows:

$$\begin{aligned} E_{i-1}^T F_{i-1} F_{i-1}^T E_{i-1} &= E_{i-1}^T [F_{i-2} - t_{i-1} r_{i-1}^T][F_{i-2} - t_{i-1} r_{i-1}^T]^T E_{i-1} \\ &= E_{i-1}^T F_{i-2} F_{i-2}^T E_{i-1} - E_{i-1}^T F_{i-2} r_{i-1} t_{i-1}^T E_{i-1} \\ &\quad - E_{i-1}^T t_{i-1} r_{i-1}^T F_{i-2}^T E_{i-1} + E_{i-1}^T t_{i-1} r_{i-1}^T r_{i-1} t_{i-1}^T E_{i-1} \end{aligned} \tag{15}$$
Because
$$t_i^T E_i = t_i^T [E_{i-1} - t_i p_i^T] = t_i^T E_{i-1} - t_i^T t_i (t_i^T E_{i-1} / t_i^T t_i) = 0 \tag{16}$$
and
$$E_{i-1}^T t_{i-1} = (t_{i-1}^T E_{i-1})^T = 0 \tag{17}$$
(15) takes the following form:
$$E_{i-1}^T F_{i-1} F_{i-1}^T E_{i-1} = E_{i-1}^T F_{i-2} F_{i-2}^T E_{i-1} \tag{18}$$
so that
$$E_i^T F_{i-1} F_{i-1}^T E_i = E_i^T F_i F_i^T E_i \tag{19}$$
then
$$E_i^T F_{i-1} F_{i-1}^T E_i = E_i^T F_i F_i^T E_i = E_i^T F_0 F_0^T E_i = E_i^T Y Y^T E_i \tag{20}$$

Fig. 2 Geometrical analysis of principal and residual subspace

When the score $t_i$ gets close to the direction of $E_{i-1}$, $\theta$ becomes smaller, which indicates that the residual $E_{i-1}$ contains information relevant to $F_{i-1}$ or $Y$. When the score $t_i$ gets close to the direction of $F_{i-1}$, the residual $E_{i-1}$ gradually becomes perpendicular to $F_{i-1}$. When $t_i$ and $F_{i-1}$ coincide, $E_{i-1}$ is perpendicular to $F_{i-1}$, indicating that there is no information relevant to $Y$ in $E_{i-1}$.
B. Modified PLS Algorithm
The PLS input-output regression model is built to estimate the output information directly. According to the decomposition of $X$ and $Y$ described above, the regression model is obtained as follows:
$$Y = XB + H \tag{21}$$
$$B = W (P^T W)^{-1} C^T \tag{22}$$
where $H$ is the residual of the model, which is usually ignored. Based on NIPALS, the following equations are obtained:
$$W = X^T U \tag{23}$$
$$P = X^T T (T^T T)^{-1} \tag{24}$$
$$C = Y^T T (T^T T)^{-1} \tag{25}$$
Combining (22), (23), (24) and (25), the specific form of $B$ is obtained as follows:
$$B = X^T U (T^T X X^T U)^{-1} T^T Y \tag{26}$$
Therefore, for $X = (x_1, x_2, \dots, x_p)$ and $Y = (y_1, y_2, \dots, y_q)$, a sketch of the PLS regression model is shown in Fig. 3.

Fig. 3 PLS regression model (inputs $x_1, \dots, x_p$; outputs $y_1, \dots, y_q$; $\hat{Y} = XB$)

According to the discussion above, there is correlation between $Y$ and the scores, so some information related to $Y$ is left in the residual subspace in the decomposition of $X$. This remaining information affects the accuracy of the model. To improve the accuracy, a modified method is proposed in this paper. The new method further decomposes $Y$ to obtain the information that is related to the residual subspace $E$. The related information is then added to the original model to obtain a more accurate result.
$P_E$ is defined to express the projection relationship between the residual subspace and the output space, and is written as follows:
$$P_E = Y^T E (E^T E)^{-1} \tag{27}$$
With the inner product transform above, the part of $E$ that is orthogonal to $Y$ is removed; this part is irrelevant to $Y$ and is useless for predicting $Y$. The part $Y'$ that is relevant to $Y$ is then obtained as follows:
$$Y' = E P_E^T \tag{28}$$
Inserting (27) into (28), the modified part in $Y$ is obtained as follows:
$$Y' = E P_E^T = E [(E^T E)^{-1}]^T E^T Y = E (E^T E)^{-1} E^T Y \tag{29}$$
As the residual has the form $E = X - TP^T$, the regression relationship between $Y'$ and the input variables is described as follows:
$$Y' = (X - XRP^T)(E^T E)^{-1} E^T Y = X (I - RP^T)(E^T E)^{-1} E^T Y = XB' \tag{30}$$
where $B'$ is the regression coefficient of the $E$-$Y$ regression equation, whose specific form is as follows:
$$B' = (I - RP^T)(E^T E)^{-1} E^T Y \tag{31}$$
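The coefficient matrices in (22) and (31) can be assembled directly from the NIPALS outputs. The sketch below is illustrative and rests on two assumptions that are not taken from the paper. First, the NIPALS variant uses unit-norm weights and unnormalized scores, so that $T = XR$ holds with $R = W(P^T W)^{-1}$. Second, because $E = X(I - RP^T)$ satisfies $ER = 0$, the matrix $E^T E$ is singular, and the sketch therefore substitutes the Moore-Penrose pseudo-inverse for the inverse written in (27)-(31). The helper `pls_fit`, the variable names and the synthetic data are hypothetical.

```python
import numpy as np


def pls_fit(X, Y, h, n_iter=200):
    """NIPALS PLS with unit-norm weights; returns W, P, T and the final input residual E."""
    E, F = X.copy(), Y.copy()
    W, P, T = [], [], []
    for _component in range(h):
        u = F[:, [0]]                           # any column of the output residual
        for _sweep in range(n_iter):
            w = E.T @ u
            w = w / np.linalg.norm(w)
            t = E @ w
            q = F.T @ t / (t.T @ t)
            u = F @ q / (q.T @ q)
        p = E.T @ t / (t.T @ t)
        E = E - t @ p.T                         # input residual after deflation
        F = F - t @ q.T                         # output residual after deflation
        W.append(w)
        P.append(p)
        T.append(t)
    return np.hstack(W), np.hstack(P), np.hstack(T), E


# Synthetic, mean-centered data (illustrative only).
rng = np.random.default_rng(1)
n, m = 80, 6
X = rng.standard_normal((n, m))
X = X - X.mean(axis=0)
Y = X @ rng.standard_normal((m, 2)) + 0.05 * rng.standard_normal((n, 2))
Y = Y - Y.mean(axis=0)

W, P, T, E = pls_fit(X, Y, h=2)
R = W @ np.linalg.inv(P.T @ W)                  # weights acting on the original X
C = Y.T @ T @ np.linalg.inv(T.T @ T)            # output loadings, as in (25)
B = R @ C.T                                     # conventional PLS coefficient, as in (22)

# Assumption: pinv replaces the inverse in (31) because E^T E is rank deficient.
B_res = (np.eye(m) - R @ P.T) @ np.linalg.pinv(E.T @ E, rcond=1e-10) @ E.T @ Y

rmse_pls = np.sqrt(np.mean((Y - X @ B) ** 2))            # training error, conventional
rmse_mod = np.sqrt(np.mean((Y - X @ (B + B_res)) ** 2))  # training error with correction
```

With the pseudo-inverse, $XB'$ is the orthogonal projection of $Y$ onto the column space of $E$, which is orthogonal to the scores by (16). On the training data the correction therefore cannot increase the fitting error; the sketch says nothing about the error on new samples.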

Considering the principal components part, the final modified regression model is obtained as follows:
$$\begin{aligned} Y &= XB + E P_E^T \\ &= XB + XB' \\ &= X [X^T U (T^T X X^T U)^{-1} T^T Y + (I - RP^T)(E^T E)^{-1} E^T Y] \\ &= X B_M \end{aligned} \tag{32}$$
where $B_M$ is the modified regression coefficient. It is written as:
$$B_M = X^T U (T^T X X^T U)^{-1} T^T Y + (I - RP^T)(E^T E)^{-1} E^T Y \tag{33}$$
Now the useful information in the residual has been extracted. The new regression model has a form similar to the conventional PLS model, and it improves the accuracy in predicting $Y$.
To sum up, the modified PLS regression method is calculated as follows:
(1) Normalize the original data $X$ and $Y$; initialize $E_0 = X$, $F_0 = Y$, $i = 0$;
(2) Select a column of $F_i$ as $u_i$;
(3) Repeat the steps below until $t_i$ converges to a satisfactory degree:
    $w_i = E_i^T u_i / u_i^T u_i$,
    $t_i = E_i w_i / \|E_i w_i\|$,
    $q_i = F_i^T t_i / \|F_i^T t_i\|$,
    $u_i = F_i q_i$;
(4) Calculate the loading vector: $p_i = E_i^T t_i / t_i^T t_i$;
(5) Calculate the loading vector: $q_i = F_i^T t_i / t_i^T t_i$;
(6) Calculate the residual matrices: $E_{i+1} = E_i - t_i p_i^T$, $F_{i+1} = F_i - t_i q_i^T$;
(7) Return to Step (2) until $i = h$, where $h$ is the number of principal components determined by cross-validation;
(8) Obtain the prediction model: $X = TP^T + E_{h+1}$, $Y = TQ^T + F_{h+1}$, where $T = [t_1, t_2, \dots, t_h]$, $P = [p_1, p_2, \dots, p_h]$ and $Q = [q_1, q_2, \dots, q_h]$;
(9) Calculate the projection matrix of $E_{h+1}$ in $Y$: $P_E = Y^T E_{h+1} (E_{h+1}^T E_{h+1})^{-1}$;
(10) Calculate the non-orthogonal part in $Y$: $Y' = E P_E^T$;
(11) Obtain the new prediction model: $Y = TQ^T + Y'$.

III. SIMULATION STUDY

The penicillin fermentation process is a complex biochemical reaction process [9]. There are many variables in this process, which are strongly correlated and coupled. The data are time-varying and uncertain, so it is hard to build an accurate model of the process. For these reasons, the proposed method is applied to the control and prediction of the process. The experimental results show that the proposed method is effective.
The conventional PLS method is also applied to model the process. Fig. 4 and Fig. 5 show the training errors of the samples, and Fig. 6 and Fig. 7 show the prediction errors. The tracking of the true values by the two methods is shown in Fig. 8 and Fig. 9. These figures show that the training and prediction errors are smaller with the modified PLS method than with the conventional PLS method. The modified PLS method predicts the quality variables better and improves the model accuracy. The two PLS methods are compared in Table II. The RMSE value is calculated as follows:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \tag{34}$$
From the values in Table II, it is concluded that the modified PLS has higher accuracy.

Table II
Comparison of the performance of the two PLS methods
Model               Training RMSE    Prediction RMSE
Conventional PLS    0.0017           0.0043
Modified PLS        9.1689e-004      0.0017

Fig. 4 Training error of penicillin concentration (g/L) with conventional PLS and modified PLS (horizontal axis: samples)
Fig. 5 Training error of generated heat (kcal) with conventional PLS and modified PLS (horizontal axis: samples)
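The RMSE used to compare the two models can be evaluated separately for each quality variable. The following minimal sketch uses made-up numbers purely for illustration; they are not data from the penicillin simulation, and the function name `rmse` is hypothetical.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error, computed column by column (one value per quality variable)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))


# Three samples, two quality variables (illustrative values only).
y_true = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
y_pred = np.array([[1.0, 10.0], [2.0, 20.0], [5.0, 27.0]])
errors = rmse(y_true, y_pred)   # squared errors sum to 4 and 9, each divided by n = 3
```

Reporting one RMSE per column keeps quality variables with different units, such as g/L and kcal, from being mixed in a single number.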

Fig. 6 Prediction error of penicillin concentration (g/L) with conventional PLS and modified PLS (horizontal axis: samples)
Fig. 7 Prediction error of generated heat (kcal) with conventional PLS and modified PLS (horizontal axis: samples)
Fig. 8 Quality variable track conditions for penicillin concentration (g/L): PLS, modified PLS and true value (horizontal axis: samples)
Fig. 9 Quality variable track conditions for generated heat (kcal): PLS, modified PLS and true value (horizontal axis: samples)

IV. CONCLUSION

In this paper, a modified PLS modeling method is proposed. By analyzing the impact of the quality variables on the decomposition of the input variable space, the relationship between the quality variables and the residual is revealed. The proposed method develops a more accurate regression model between input variables and quality variables by making full use of the information left in the residual. A case study on the penicillin fermentation process is performed to test the prediction performance of the modified PLS algorithm, with the conventional PLS method applied for comparison. The results of the case study show that the modified PLS algorithm performs better than the conventional PLS method.

ACKNOWLEDGMENT

REFERENCES
[1] Q. Chen, U. Kruger, Analysis of extended partial least squares for monitoring large-scale processes, IEEE Transactions on Control Systems Technology, vol. 13, no. 5, pp. 807-813, September 2005.
[2] J.H. Chen, K.C. Liu, On-line batch process monitoring using dynamic PCA and dynamic PLS models, Chemical Engineering Science, vol. 57, no. 1, pp. 63-75, January 2002.
[3] Y. Zhang, L. Zhang, Fault identification of nonlinear processes, Industrial & Engineering Chemistry Research, vol. 52, no. 34, pp. 12072-12081, August 2012.
[4] S.J. Qin, Survey on data-driven industrial process monitoring and diagnosis, Annual Reviews in Control, vol. 36, no. 2, pp. 220-234, December 2012.
[5] S.J. Qin, Y. Zheng, Quality-relevant and process-relevant fault monitoring with concurrent projection to latent structures, AIChE Journal, vol. 59, no. 2, pp. 496-504, February 2013.
[6] G. Li, S.J. Qin, D. Zhou, Geometric properties of partial least squares for process monitoring, Automatica, vol. 46, no. 1, pp. 204-210, January 2010.
[7] S. Wold, Cross-validatory estimation of components in factor and principal components model, Technometrics, vol. 20, pp. 397-405, 1978.
[8] B.S. Dayal, J.F. MacGregor, Improved PLS algorithms, Journal of Chemometrics, vol. 11, no. 1, pp. 73-85, 1997.
[9] Y. Zhang, C. Wang, R. Lu, Modeling and monitoring of multimode process based on subspace separation, Chemical Engineering Research & Design, vol. 91, no. 5, pp. 831-842, May 2013.