Abstract: A quality intelligent prediction model for small-batch producing processes is proposed in this article, after comparing the commonly used approaches to intelligent process prediction and their characteristics. The prediction process and algorithm are also presented. The fuzzy least square support vector machine (FLS-SVM) is taken as the intelligent kernel of the model. On one hand, it handles learning from small batches better and avoids the disadvantages of artificial-neural-network prediction, such as over-training and weak generalization capability. On the other hand, it fuzzifies the samples by a membership function to choose the optimum samples and to make the history data obey the rule "nearer is weightier". After many prediction experiments and comparisons with other common prediction methods, the method proposed in this article proved to have good generalization capability and to be more rapidly built and more easily realized. It offers a feasible way to predict and control small-batch machining processes online.
Key Words: small-batch; support vector machine (SVM); fuzzy least square support vector machine (FLS-SVM); quality prediction
ters but the latest processing parameters. We call this rule "nearer data weigh heavier and farther data weigh lighter". The concept of membership is introduced in this article, and a fuzzy LS-SVM prediction model based on time-domain membership is presented. The model assigns different memberships to history data according to the extent of their impact; it reduces the influence of early data on the current producing process and improves the accuracy of real-time prediction. The model was studied on the producing process of a bearing's outer ring, and its prediction accuracy was compared with that of the traditional model. The research demonstrates that the model is applicable and reasonable for small-batch producing processes.

2 Fuzzy least square support vector machine

Suppose a data set $\{x_i, y_i\}$, $i = 1, 2, \cdots, N$, is to be regressed, where $x_i \in R^n$ and $y_i \in R$ are the input and output of the system, respectively. The $n$-dimensional input samples are mapped from the original space to a high-dimensional space $F$ by a nonlinear transformation $\varphi(\cdot)$, and the best linear regression function is constructed in that space. The standard SVM regression problem is

$$\min_{\omega, b, \xi, \xi^*} \frac{1}{2}\omega^T\omega + c\sum_{i=1}^N(\xi_i + \xi_i^*),\qquad
\text{s.t.}\quad \begin{cases} y_i - \omega^T\varphi(x_i) - b \le \varepsilon + \xi_i \\ \omega^T\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^* \\ \xi_i \ge 0,\ \xi_i^* \ge 0,\ i = 1, \cdots, N \end{cases} \tag{2}$$

where $c$ is called the equilibrium factor and is usually set to 1; $\xi_i$ and $\xi_i^*$ are the errors on the training set, and they indicate how far samples lie beyond the fit precision $\varepsilon$.

The Lagrange equation can be set up from Eq.(2):

$$l(\omega, b, \xi, \xi^*) = \frac{1}{2}\omega^T\omega + c\sum_{i=1}^N(\xi_i + \xi_i^*) - \sum_{i=1}^N \alpha_i\left[\varepsilon + \xi_i - y_i + \omega^T\varphi(x_i) + b\right] - \sum_{i=1}^N \alpha_i^*\left[\varepsilon + \xi_i^* + y_i - \omega^T\varphi(x_i) - b\right] - \sum_{i=1}^N(\eta_i\xi_i + \eta_i^*\xi_i^*), \tag{3}$$

where $\alpha_i, \alpha_i^* \ge 0$, $\eta_i, \eta_i^* \ge 0$, $i = 1, 2, \cdots, N$.

By setting the first partial derivatives to zero, Eq.(3) can be transformed into a dual optimization problem. The samples $x_i$ with $(a_i - a_i^*) \ne 0$ are the support vectors. The variable $\omega$ controls the complexity of the function and is a linear combination of the mapped samples $\varphi(x_i)$. Therefore, the computational complexity of system identification by SVM depends not on the dimension of the space but on the number of samples.

A kernel function is introduced in place of the nonlinear mapping $\varphi(\cdot)$:

$$\psi(x_i, x_j) = \varphi(x_i)^T\varphi(x_j). \tag{4}$$

Eq.(3) is then transformed into the dual optimization problem

$$\max_{\alpha, \alpha^*} J = -\frac{1}{2}\sum_{i,j=1}^N (\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j)\psi(x_i, x_j) - \varepsilon\sum_{i=1}^N(\alpha_i^* + \alpha_i) + \sum_{i=1}^N y_i(\alpha_i^* - \alpha_i),\qquad
\text{s.t.}\quad \begin{cases} \sum_{i=1}^N(\alpha_i - \alpha_i^*) = 0 \\ \alpha_i, \alpha_i^* \in [0, c] \end{cases} \tag{5}$$

and the standard SVM regression model becomes

$$y(x) = \sum_{i=1}^N (a_i - a_i^*)\psi(x_i, x) + b,\qquad
b = -\frac{1}{2}\sum_{\mathrm{SVs}} (a_i - a_i^*)\left[\psi(x_j, x_i) + \psi(x_k, x_i)\right], \tag{6}$$

where the sum for $b$ runs over the support vectors.

In LS-SVM the optimization problem is instead

$$\min \frac{1}{2}\omega^T\omega + \frac{1}{2}\gamma\sum_{i=1}^N \xi_i^2,\qquad
\text{s.t.}\quad y_i = \omega^T\varphi(x_i) + b + \xi_i,\ i = 1, 2, \cdots, N, \tag{7}$$

where the positive real number $\gamma$ is a tuning constant: it strikes a compromise between the training error and the model complexity, and gives the model better generalization ability. The larger the value of $\gamma$, the smaller the error of the regression model. The loss function of LS-SVM changes the inequality constraints into equality constraints, which distinguishes it from the standard SVM. The Lagrange function is introduced as follows:

$$L(\omega, b, \xi, a) = \frac{1}{2}\omega^T\omega + \frac{1}{2}\gamma\sum_{i=1}^N \xi_i^2 - \sum_{i=1}^N a_i\left[\omega^T\varphi(x_i) + b + \xi_i - y_i\right], \tag{8}$$

where $a_i$, $i = 1, \cdots, N$, are the Lagrange multipliers. The best $a$ and $b$ can be acquired from the KKT conditions:

$$\begin{cases} \dfrac{\partial L}{\partial \omega} = 0 \rightarrow \omega = \sum_{i=1}^N a_i\varphi(x_i) \\[4pt] \dfrac{\partial L}{\partial b} = 0 \rightarrow \sum_{i=1}^N a_i = 0 \\[4pt] \dfrac{\partial L}{\partial \xi} = 0 \rightarrow a_i = \gamma\xi_i \\[4pt] \dfrac{\partial L}{\partial a} = 0 \rightarrow \omega^T\varphi(x_i) + b + \xi_i - y_i = 0 \end{cases} \tag{9}$$

The optimization can then be transformed into the linear system

$$\begin{bmatrix} 0 & \Theta^T \\ \Theta & \Omega + \gamma^{-1}I \end{bmatrix}\begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}. \tag{10}$$
DONG Hua, et al./Systems Engineering – Theory & Practice, 2007, 27(3): 98–104
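As Eqs.(9)–(10) show, training an LS-SVM reduces to solving a single linear system. The following sketch (not the authors' original code, which used Matlab) solves Eq.(10) with an RBF kernel; the parameter values $\gamma = 500$ and an RBF parameter of 0.1 follow the experiment section, while the function names are our own.

```python
import numpy as np

def rbf(a, b, sigma=0.1):
    # RBF kernel psi(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2))
    return np.exp(-np.sum((a - b) ** 2) / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma=500.0, sigma=0.1):
    """Solve the LS-SVM linear system of Eq.(10):
    [[0, Theta^T], [Theta, Omega + I/gamma]] [b; a] = [0; y],
    where Theta is a vector of ones and Omega_ij = psi(x_i, x_j)."""
    N = len(X)
    Omega = np.array([[rbf(X[i], X[j], sigma) for j in range(N)]
                      for i in range(N)])
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0                       # Theta^T
    A[1:, 0] = 1.0                       # Theta
    A[1:, 1:] = Omega + np.eye(N) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    b, a = sol[0], sol[1:]
    return a, b

def lssvm_predict(X, a, b, x, sigma=0.1):
    # Regression function: y(x) = sum_i a_i psi(x_i, x) + b
    return sum(ai * rbf(xi, x, sigma) for ai, xi in zip(a, X)) + b
```

With a large $\gamma$ the diagonal term $I/\gamma$ is small, so the model nearly interpolates the training data, consistent with the paper's remark that a larger $\gamma$ yields a smaller regression error.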
The fuzzy membership $\mu_i$ is introduced into the LS-SVM optimization problem:

$$\min \frac{1}{2}\omega^T\omega + \frac{1}{2}\gamma\sum_{i=1}^N \mu_i\xi_i^2,\qquad
\text{s.t.}\quad y_i = \omega^T\varphi(x_i) + b + \xi_i,\ i = 1, 2, \cdots, N. \tag{12}$$

The Lagrange function is structured as follows:

$$L(\omega, b, \xi, a) = \frac{1}{2}\omega^T\omega + \frac{1}{2}\gamma\sum_{i=1}^N \mu_i\xi_i^2 - \sum_{i=1}^N a_i\left[\omega^T\varphi(x_i) + b + \xi_i - y_i\right]. \tag{13}$$

According to the optimum conditions, the matrix equation is structured as

$$\begin{bmatrix} 0 & \Theta^T \\ \Theta & \Omega + (\gamma\mu_i)^{-1}I \end{bmatrix}\begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}, \tag{14}$$

where $y$, $a$, $\Theta$, $\Omega$, and $I$ have the same meaning as in the LS-SVM expression.

The estimation function of the fuzzy LS-SVM can be obtained by solving the matrix Eq.(14). Comparing Eq.(14) with Eq.(10), the only difference is the fuzzy membership $\mu_i$; hence we call this method the fuzzy least square support vector machine (FLS-SVM).

3 Quality prediction model of small-batch producing process based on FLS-SVM

3.1 Time-sequence predicting model

The commonly used quality analysis and prediction models are nowadays usually based on SPC. Generally, about 20 to 50 process data are extracted from the quality control charts to set up the training sample set. The sample set is then learned by an intelligent model (such as an artificial neural network), which masters the potential rules of the process quality in order to predict the quality in the future.

A small-batch process produces few products, usually within the dozens, so it is difficult to provide a large number of samples. Therefore, this research decreases the number of continuous data (called the data width) extracted from the control chart from 20–50 data to 3–8 data. The specific process is: establish a data window on the quality control chart to set up the training sample set, and keep the window width unchanged.

The data window moves with the process. Each time a new datum comes into the window on one side, another datum correspondingly goes out of the window on the other side, so the number of data in the window (set as $n$) stays the same. The data in the window are taken as an $n$-dimensional input sample vector.

Suppose the position of the current time window is $i$; the input vector $x_i$ of the sample is the data from $i$ to $i + n - 1$, and the output is $y_i = z_{i+n}$. First, move the window to get the next training sample $\{x_{i+1}, y_{i+1}\}$, which is put into the LS-SVM learning machine to get the regression parameters $a$ and $b$. Second, put $a$ and $b$ into the LS-SVM prediction machine, extract the history data $z_N \sim z_{N+n-1}$ from the time window, and take them as the input $x_N$ to get the prediction response $\hat{y}_N$, which is the prediction value $\hat{z}_{N+n}$ at moment $N + n$. Third, take $\hat{z}_{N+n}$ as the real value at moment $N + n$ to get the next prediction value at moment $N + n + 1$. The process of quality prediction based on LS-SVM is shown in Figure 1.

Figure 1. Quality prediction for a small-batch process based on FLS-SVM: (a) modeling method based on FLS-SVM; (b) prediction method based on FLS-SVM

3.2 Fuzzy membership of process sample sequences

Generally, for a sample sequence, the importance of history data and their impact on future data increase from far to near. Different memberships are therefore assigned to history samples according to their positions in the time domain: nearer data samples are given larger memberships, while farther data samples are given smaller memberships. Through this fuzzification, nearer samples are enhanced and farther samples are weakened.

To determine the fuzzy membership of history samples, two schemes were designed in our research: one is the index (exponential) distribution in the time domain, the other is the linear distribution in
Table 1. Real deviation sequence for bearings' outer ring size (unit: mm)

No.       1     2     3     4     5     6     7     8     9     10    11    12    13    14
∆Ds max   0.38  0.37  0.38  0.39  0.42  0.41  0.40  0.38  0.36  0.42  0.44  0.40  0.37  0.40

No.       15    16    17    18    19    20    21    22    23    24    25    26    27    28
∆Ds max   0.42  0.45  0.42  0.41  0.40  0.42  0.41  0.44  0.43  0.43  0.42  0.43  0.44  0.46
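The sliding-window construction of training samples described in Section 3.1 can be sketched on the deviation data of Table 1. The window width $n = 5$ follows the experiment section; the function name is our own, and this is a minimal sketch rather than the authors' implementation.

```python
# Deviation sequence for the bearings' outer ring from Table 1 (mm)
z = [0.38, 0.37, 0.38, 0.39, 0.42, 0.41, 0.40, 0.38, 0.36, 0.42,
     0.44, 0.40, 0.37, 0.40, 0.42, 0.45, 0.42, 0.41, 0.40, 0.42,
     0.41, 0.44, 0.43, 0.43, 0.42, 0.43, 0.44, 0.46]

def make_window_samples(z, n=5):
    """Slide a width-n data window over the sequence:
    input x_i = (z_i, ..., z_{i+n-1}), output y_i = z_{i+n}."""
    X = [z[i:i + n] for i in range(len(z) - n)]
    y = [z[i + n] for i in range(len(z) - n)]
    return X, y

# The first 25 data are used for modeling, giving 20 training samples
X_train, y_train = make_window_samples(z[:25], n=5)
```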
time domain.

The membership of the index distribution in the time domain is given by the following formula:

$$\mu_i = \alpha(1 - \alpha)^{N - i}, \tag{15}$$

where $\alpha$ is the index coefficient, $0 < \alpha < 1$, and $i = 1, 2, \cdots, N$.

The membership $\mu_i$ of the linear distribution in the time domain is:

$$\mu_i = \alpha + i\,(1 - \alpha)/N. \tag{16}$$

The parameter $\alpha$ of the membership distribution in the time domain is very important for solving the multivariate linear Eq.(14), since it affects the accuracy of the results. Under normal circumstances, if the time-sequence data are stable, the linear distribution with a smaller value of $\alpha$ can be selected; if the data sequence fluctuates more severely, the index distribution with a smaller value of $\alpha$ can be selected. The value of $\alpha$ can also be obtained by computer optimization, choosing the $\alpha$ that gives the least fitting error.

4 Experiments of process quality prediction

A semi-automatic turning process machines twenty-eight pieces of the outer ring of a bearing; obviously, it is a typical small-batch process. The real deviation sequence of the bearings' outer-ring size is shown in Table 1. The tolerance of the outer ring is $90^{+0.45}_{+0.30}$, and $\Delta D_{s\,max}$ is the largest measured deviation of the outer ring's diameter. In the turning process, there are system factors (such as tool wear, thermal deformation, etc.) and random factors (such as system vibration, rough material, etc.) impacting the turning quality.

Prediction models based on three-order polynomial regression, four-order polynomial regression, BP neural network, LS-SVM, and fuzzy LS-SVM were set up, respectively, with the first twenty-five data in Table 1, and were then used to predict the values of the 26th, 27th, and 28th work pieces. The prediction values were compared with the real values.

In the experiment, the polynomial models and the BP neural network call the polyfit function and the artificial neural network toolbox in Matlab, respectively. The time-window width of the BP neural network model, the LS-SVM model, and the FLS-SVM model was set to five. In the BP neural network, the learning rate and the number of training steps were set to 0.1 and 10,000, and six and ten hidden-layer neurons were used in the experiments, respectively. The LS-SVM and FLS-SVM algorithms were implemented by the matrix equations, with the tuning constant set to 500; the high-precision radial basis function (RBF) was chosen as the kernel function, with the RBF parameter set to 0.1. The index-distribution membership function in the time domain was selected to fuzzify the samples of FLS-SVM, with the distribution parameter set to 0.3.

The mean square error

$$MSE = \frac{1}{n}\sum_{i=1}^n (\hat{y}_i - y_i)^2$$

was defined as the testing indicator in order to compare the regression precision of each model, where $\hat{y}_i$, $y_i$, and $n$ are the output of the models, the output of the training sample sets, and the number of training samples, respectively. The computer used in the experiment has a Pentium M-436M CPU and 128M memory. The time consumption and regression precision of each model are shown in Table 2.

From Table 2, it is obvious that the MSE of the LS-SVM model is the smallest, followed by that of the FLS-SVM model; they are one or two orders of magnitude lower than those of the BP neural network models and the polynomial regression model. The time consumption of the polynomial regression model is the smallest, followed by those of the LS-SVM model and the FLS-SVM model; the time consumption of the BP neural network model is the largest.

The predicted deviations of the polynomial regression model, the BP neural network models, the LS-SVM model, and the FLS-SVM model are shown in Figure 2.

The regression capacity of the BP neural network models, LS-SVM, and FLS-SVM model was tested with the inputs of the training sample sets. These tests are actually not predictions but sample tests. That is to say, the 5th to 25th work pieces are regression tests on the samples, and the 26th to 28th work pieces are prediction values.

Conclusions from Figure 2:

(1) The polynomial model can predict the quality deviation caused by system factors but cannot predict that caused by random factors in the turning process.

(2) The regression and prediction capability for random error based on the BP neural networks is better than that of the polynomial model.

(3) The LS-SVM model has high regression precision on the training samples; the output sequence of the LS-SVM model almost coincides with the real deviation sequence. For the three future data (26th to 28th work pieces), the prediction precision is higher than that of the BP neural network.

(4) There are larger regression errors on the first half of the sample sequence with the FLS-SVM model but much higher precision on the second half. That proves the rule "nearer data weigh heavier and farther data weigh lighter". The FLS-SVM model is better at predicting the three future data.

In addition, the parameters $\gamma$ and $\delta$ were not optimized; they were only set to test the FLS-SVM prediction model. In applications, the parameters should be adjusted according to the conditions and experience to improve precision.

Table 2. Time-consumption and regression precision of each model

Method   3rd polynomial   BP(6,1)   BP(10,1)   LS-SVM    FLS-SVM
CPU      0.1 s            60.3 s    64.3 s     0.6 s     0.6 s
MSE      3.5e-4           2.7e-4    1.3e-4     1.0e-5    4.3e-5
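The two membership schemes of Eqs.(15) and (16), and the way the membership enters the FLS-SVM system of Eq.(14), can be sketched as follows. The value $\alpha = 0.3$ follows the experiment section; the function names are our own, and this is a sketch rather than the authors' implementation.

```python
import numpy as np

def index_membership(N, alpha=0.3):
    # Eq.(15): mu_i = alpha * (1 - alpha)^(N - i), i = 1, ..., N
    i = np.arange(1, N + 1)
    return alpha * (1.0 - alpha) ** (N - i)

def linear_membership(N, alpha=0.3):
    # Eq.(16): mu_i = alpha + i * (1 - alpha) / N, i = 1, ..., N
    i = np.arange(1, N + 1)
    return alpha + i * (1.0 - alpha) / N

def fls_svm_matrix(Omega, mu, gamma=500.0):
    """Left-hand matrix of Eq.(14). Compared with the LS-SVM system
    of Eq.(10), only the diagonal regularization changes: each sample
    i is penalized by 1 / (gamma * mu_i) instead of 1 / gamma."""
    N = len(mu)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0                                # Theta^T
    A[1:, 0] = 1.0                                # Theta
    A[1:, 1:] = Omega + np.diag(1.0 / (gamma * mu))
    return A

mu = index_membership(5)   # memberships grow toward the newest sample
```

Both schemes give the newest sample (i = N) the largest membership, so its error is penalized most heavily and it dominates the regression, which is exactly the "nearer data weigh heavier" rule.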