
Powder Technology 347 (2019) 114–124

Contents lists available at ScienceDirect

Powder Technology

journal homepage: www.elsevier.com/locate/powtec

An experimental modeling of cyclone separator efficiency with PCA-PSO-SVR algorithm
Wei Zhang a,⁎, Linlin Zhang a, Jingxuan Yang a, Xiaogang Hao a, Guoqing Guan b,⁎⁎, Zhihua Gao c
a Department of Chemical Engineering, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China
b Energy Conversion Engineering Laboratory, Institute of Regional Innovation (IRI), Hirosaki University, 2-1-3, Matsubara, Aomori 030-0813, Japan
c Key Laboratory Coal Science and Technology, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China

Article info

Article history:
Received 15 October 2018
Received in revised form 21 January 2019
Accepted 25 January 2019
Available online 1 March 2019

Keywords:
Cyclone separator
Grade efficiency
Support vector regression algorithm
Particle swarm optimization
Principal component analysis

Abstract

Accurate prediction of the complicated nonlinear relationship among the grade efficiency, geometrical dimensions, and operating parameters based on limited experimental data is the most effective way to design a high-efficiency cyclone separator. Herein, a hybrid PCA-PSO-SVR model is proposed to predict the grade efficiency of cyclone separators with the operating parameters based on 217 sets of experimental data provided in the literature. The experimental data are preprocessed using the random sampling technique together with the normalization method and principal component analysis (PCA) at first; subsequently, the particle swarm optimization (PSO) algorithm is incorporated to optimize the parameters for the support vector regression (SVR), including the penalty factor C, kernel function parameter g, and insensitive loss ε. Finally, the SVR model with the optimized parameters is trained with 80% pretreatment data, and the generalization ability of the model is tested with the remaining 20% data. The mean squared error of the test sets is 6.948 × 10⁻⁴ with a correlation coefficient of 0.982. The comparison results show that the PCA-PSO-SVR model has higher accuracy, better generalization ability, and stronger robustness than the existing models for predicting the cyclone separator efficiency in the case with only a few experimental data.
© 2019 Elsevier B.V. All rights reserved.

1. Introduction

Cyclone separator efficiency is considered one of the major criteria for designing cyclone geometry and evaluating its performance. As shown in Table 1, four approaches have been developed to estimate the separation efficiency of cyclone separators. In approach (1), the theoretical and semi-empirical models [1–9] are derived from physical descriptions of the gas flow pattern and energy dissipation mechanisms in the cyclone. Although this is the conventional way, the assumptions and simplifications used in these models easily lead to significant errors between the experimental data and predicted results. Considering that the efficiency model of a cyclone separator should ideally be established through experimental data [10], approach (2) has been applied. For instance, using this approach, Zhu et al. [11] studied the effects of the flow rate, cylinder height, and exit tube length on the collection efficiency with a set of experimental data; Chen et al. [12] conducted an experimental investigation to predict the influence of the operating temperature on overall cyclone efficiency; Lim et al. [13] performed a trial examination of the effects of cylinder- and cone-shaped vortex finders on the particle collection efficiencies of cyclones with different flow rates; Luo et al. [14] derived the efficiency formula for a particular reverse-flow cyclone with a plane top and volute inlet (i.e., the PV-cyclone) separator by applying the similarity theory and regression analysis based on a large set of experimental data. However, a series of assumptions has to be made to facilitate the similarity analysis, which then cannot fit the actual situation, and the high accuracy of the regression model depends on a large set of data. Moreover, a series of experiments [14–16] made on the PV-cyclone have proved that the factors affecting the separation efficiency of the cyclone separator are too complex to be represented only by the Stokes number.

To understand the effect of the geometrical ratios on the flow field pattern and separation performance, approach (3), i.e., the computational fluid dynamics (CFD) study, is commonly applied [17–22]. However, obtaining modeling data through CFD simulation is too time-consuming, and the obtained data deviate considerably from the real situation. Furthermore, the fully elastic particle-wall collision and the idealized assumptions of dust collection at the bottom usually result in over-prediction of the separation efficiency for smaller particles.

⁎ Corresponding author at: College of Chemistry and Chemical Engineering, Taiyuan University of Technology, Taiyuan 030024, China.
⁎⁎ Corresponding author at: Energy Conversion Engineering Laboratory, Institute of Regional Innovation (IRI), Hirosaki University, 2-1-3, Matsubara, Aomori 030-0813, Japan.
E-mail addresses: zhangwei01@tyut.edu.cn (W. Zhang), guan@hirosaki-u.ac.jp (G. Guan).

https://doi.org/10.1016/j.powtec.2019.01.070
0032-5910/© 2019 Elsevier B.V. All rights reserved.

Table 1
Summary of the different approaches for separation efficiency estimation.

Approach | Comments | References
(1) Theoretical and semi-empirical models | These models are derived from physical descriptions of gas flow pattern and energy dissipation mechanisms in the cyclone. However, assumptions and simplifications used in these models lead to significant errors between the experimental data and predicted results. | Zhao [1,2]; Göran et al. [3]; Qiu [4]; Sun et al. [5]; Yang et al. [6]; Barth [7]; Dietz [8]; Leith-Licht [9]
(2) Experimental and statistical models | These models are developed through a statistical regression analysis based on an experimental data set for different cyclone configurations; however, determining the optimal correlation function for fitting experimental data is difficult. | Rafiee et al. [10]; Zhu et al. [11]; Chen et al. [12]; Lim et al. [13]; Luo et al. [14]; Jin et al. [15,16]
(3) Computational fluid dynamics (CFD) | CFD can provide the performance parameters and the detailed information of the flow field inside the cyclone; the main drawback of CFD is that solving the Navier–Stokes equations of fluid mechanics is computationally expensive. | Sun [17]; Francesco et al. [18]; Huang et al. [19]; Misiulia [20]; Mazyana [21]; Zhou et al. [22]
(4) Artificial intelligence (AI) models and machine learning algorithms | AI models (such as artificial neural networks and genetic algorithms) and machine learning algorithms (such as support vector machines, SVM) have become powerful tools of scientific research and technology without the need to understand the nature of the phenomenon. | Elsayed [23,25]; Zhao [24]; Yetilmezsoy [26]; Khalkhali [27]

With the development of modern computer technology, the implementation of big data processing has become very easy. As such, approach (4), i.e., Artificial Neural Network (ANN) and Support Vector Regression (SVR) [23,24] algorithms, is becoming a hot topic because it can process complex nonlinear mathematical models based on sample data without knowing the mechanism. These algorithms have been successfully applied to model the efficiency of the cyclone separator based on CFD samples or experimental data [25–27]. In particular, Elsayed et al. [25] successfully applied two radial basis function neural networks (RBFNNs) to model the pressure drop and cut-off diameter for cyclone separators, in which they studied the effect of seven geometrical parameters on the cyclone separator performance (the pressure drop and cut-off diameter) without considering the other operating parameters. However, the effects of particle size, particle density, gas velocity, surface roughness, kinematic viscosity, and so on, and the effects of the dimensions and geometry of the cyclone on the performance still need to be taken into account for accurate modeling.

Based on the above review, an accurate mathematical model is required to effectively predict the complex and nonlinear relationship between the separation efficiency and both the geometrical and operating parameters. Accurate prediction of the complicated nonlinear relationship among the grade efficiency, geometrical dimensions, and operating parameters based on limited experimental data is considered the most effective way to design a high-efficiency cyclone separator. In this study, a hybrid model is proposed to predict the grade efficiency of cyclone separators with the operating parameters based on 217 sets of experimental data provided in the literature. It is expected to reach the following three objectives. (1) To model the separation efficiency of a cyclone separator with experimental data, considering both the geometrical and operating parameters. (2) To use principal component analysis (PCA) to reduce the eight factors that affect the grade efficiency to five independent factors for the modeling, with minimal information loss. (3) To create accurate mathematical models with limited experimental data using SVR, with the particle swarm optimization (PSO) algorithm applied to improve the modeling accuracy. The paper is organized as follows. After the introduction section, the procedure of using the PCA-PSO-SVR model to predict the grade efficiency of cyclone separators is described in Section 2. Then, in Section 3, to evaluate the universality and accuracy of PCA-PSO-SVR, it is compared with the classical theoretical models and several ANN models, and the results are discussed. Finally, the conclusions of this study are given.

2. SVR modeling hybrid PCA and PSO

Fig. 1 shows the flowchart of grade efficiency modeled with the proposed PCA-PSO-SVR.

Fig. 1. Flowchart of the PCA-PSO-SVR proposed method.

2.1. Experimental data

The experimental data used in this study come from the research on a particular reverse-flow cyclone with a plane top and volute inlet, which is an efficient gas-solid separator jointly developed by the University of Petroleum (China) and China Petrochemical Corporation in 1990 [28]. It is widely used in fluidized catalytic cracking (FCC) units, coal combustion, gasification and petrochemical reaction processes for gas-solid separation under high temperature, high pressure, and high dust concentration. The cyclone separator shown in Fig. 2 has an inlet height a, an inlet width b, a vortex finder diameter dr, a vortex finder height S, a cyclone diameter D, a particle exit diameter B, a separation space height Hs, a cylindrical part height H1, and a conical part height H2. According to the cyclone flow field studies and performance tests for the FCC catalyst-gas system, the optimum ratios are B/D = 0.4–0.5, Hs/D = 2.8–3.0, S/a = 0.8–1.0, and a/b = 2.2–2.5 [5].

A set of 217 experimental data from literature [14–16] is extracted for investigating the effect of different modeling methods on modeling

the accuracy. The factors influencing the efficiency of the cyclone separator include the geometrical and operating parameters. Three geometrical parameters strongly affect the separation efficiency, namely, the cyclone diameter D, the ratio of the cyclone cross-sectional area to the inlet cross-sectional area Ka = πD²/(4ab), and the ratio of the vortex finder diameter to the cyclone diameter d̃r = dr/D. Moreover, six operating parameters affect the collection efficiency: the gas velocity at the cyclone inlet vi, the concentration of inlet particles Ci, the particle diameter δ, the particle density ρp, the median particle size dm, and the mean square error of the dust particle size distribution σ. However, the effect of the mean square error of the dust particle size distribution on the separation performance can be neglected with regard to both physical and mathematical aspects. To sum up, there are a total of eight input variables. The grade efficiency of particles ηi is selected as the output variable. Table 2 summarizes the input and output variables of SVR and gives some experimental data.

Fig. 2. The structure of PV cyclone separator.

2.2. Support vector regression

Support Vector Regression (SVR) is a powerful learning model that minimizes the structural risk and offers better generalization capability based on statistical learning theory. The core concept of SVR is first to map the original data nonlinearly into a high-dimensional feature space, and then to find an optimal linear regression function in this feature space. In short, the process achieves linearization by raising the dimension. Thus, the problem of finding the optimal high-dimensional linear plane is transformed into a convex quadratic programming problem. SVR problems with kernel functions are represented in Fig. 3.

Solving the nonlinear regression problem is actually the process of solving for the weight vector ω and the threshold value u. The values of ω and u are estimated by minimizing Eq. (1) based on the Structural Risk Minimization Principle,

\[ R_{\mathrm{reg}}[f] = R_{\mathrm{emp}}[f] + \lambda \|\omega\|^2 = \sum_{i=1}^{m} C(e_i) + \lambda \|\omega\|^2 \tag{1} \]

where Rreg is the structural risk; Remp is the empirical risk; f is the nonlinear regression function; ei is the error between the predicted value and the true value, ei = f(xi) − yi; λ is the regularization constant; m is the number of samples; and C(⋅) is the ε-insensitive loss function defined as [29]

\[ C(e_i) = \max(0, |e_i| - \varepsilon) \tag{2} \]

where ε is the insensitive loss, which denotes the fault tolerance level of the model. The larger the value, the greater the tolerance of the model to the error and the higher the probability of under-learning; conversely, the smaller the value, the higher the probability of over-fitting. Therefore, it is crucial to select a proper insensitive loss ε for support vector regression.
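As a minimal sketch of Eqs. (1) and (2), the ε-insensitive loss and the regularized risk can be written in a few lines of plain Python. The function names and the numbers in the usage lines are illustrative assumptions and are not taken from the original study; the weight vector is passed in explicitly only to make the penalty term λ‖ω‖² visible.

```python
def eps_insensitive_loss(error, eps):
    """Eq. (2): zero inside the tube |e| <= eps, linear outside it."""
    return max(0.0, abs(error) - eps)


def regularized_risk(predictions, targets, weights, eps, lam):
    """Eq. (1): summed eps-insensitive loss plus lam * ||w||^2."""
    empirical = sum(
        eps_insensitive_loss(f - y, eps)  # e_i = f(x_i) - y_i
        for f, y in zip(predictions, targets)
    )
    penalty = lam * sum(w * w for w in weights)
    return empirical + penalty


# Hypothetical usage: an error smaller than eps costs nothing,
# while a larger error is penalized only by the part beyond eps.
inside_tube = eps_insensitive_loss(0.02, eps=0.026)
outside_tube = eps_insensitive_loss(-0.10, eps=0.026)
risk = regularized_risk([1.0, 2.0], [1.0, 1.5], [3.0, 4.0], eps=0.1, lam=0.01)
```

A larger ε widens the zero-cost tube, so more residuals are ignored, which matches the under-learning behavior described above.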

Table 2
Input and output variables of the SVR model with corresponding experimental data.

Input variables: x1–x8; output variable: y1

x1 x2 x3 x4 x5 x6 x7 x8 y1
D Ka d̃r vi Ci δ ρp dm ηi

800 4.4 0.25 15.16 10 6 2876 13.570 93.264


800 4.4 0.312 15.16 10 6 2876 13.570 91.334
800 4.4 0.445 15.16 10 6 2876 13.570 85.826
400 4.4 0.44 15.99 10 9 2876 11.986 95.215
400 4.4 0.44 15.96 30 9 2876 11.986 96.294
400 4.4 0.44 16 50 9 2876 11.986 96.746
800 4.4 0.44 11.12 10 5 2876 14.27 82.939
800 4.4 0.44 11.12 10 6 2876 14.27 86.591
800 4.4 0.44 11.12 10 7 2876 14.27 89.320
800 4.4 0.44 11.12 10 8 2876 14.27 91.121
800 4.4 0.44 15.16 10 8 2876 14.27 93.264
800 4.4 0.44 19 10 8 2876 14.27 94.241



800 4.4 0.44 14.97 10 5 2876 11.986 82.451
800 4.4 0.44 16.10 10 5 2876 13.540 85.140
800 4.4 0.44 15.16 10 5 2876 14.270 84.727
800 7.2 0.44 15.16 10 8 2876 13.57 97.153
800 4.26 0.44 15.16 10 8 2876 13.57 94.435
400 5.5 0.44 20 10 8 2876 9.976 97.487
400 4.4 0.44 10.73 10 5 3050 11.835 87.415
800 4.4 0.44 11.01 10 5 3050 11.835 82.496
800 4.4 0.44 14.97 10 9 2876 11.986 89.138
800 4.4 0.44 14.59 10 9 3050 11.835 90.249

The significance of [bold] in the table is to emphasize the change of variables
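The two derived geometric inputs defined in Section 2.1, Ka = πD²/(4ab) and d̃r = dr/D, and the column order x1–x8 of Table 2 can be sketched as follows. This is only an illustration: the function names are hypothetical, and the inlet dimensions a and b and the vortex finder diameter in the usage lines are assumed values, since the source tabulates Ka and d̃r directly rather than a, b, and dr.

```python
import math


def area_ratio(D, a, b):
    """Ka: cyclone cross-sectional area over inlet cross-sectional area."""
    return math.pi * D ** 2 / (4.0 * a * b)


def vortex_finder_ratio(d_r, D):
    """Ratio of vortex finder diameter to cyclone diameter."""
    return d_r / D


def input_vector(D, a, b, d_r, v_i, C_i, delta, rho_p, d_m):
    """Assemble the eight inputs in the column order x1..x8 of Table 2."""
    return [
        D,
        area_ratio(D, a, b),
        vortex_finder_ratio(d_r, D),
        v_i,
        C_i,
        delta,
        rho_p,
        d_m,
    ]


# Hypothetical geometry (a, b, d_r are assumed, not from the source);
# the operating values follow the pattern of a Table 2 row.
x = input_vector(D=800.0, a=400.0, b=180.0, d_r=352.0,
                 v_i=15.16, C_i=10.0, delta=8.0, rho_p=2876.0, d_m=14.27)
```

All lengths must share one unit so that both ratios come out dimensionless.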



Fig. 3. Schematic of SVR model.

Solving the minimization problem of Eq. (1) is then transformed into solving the quadratic programming problem of Eq. (3) after introducing the slack variables ξi and ξi*,

\[ \min J = \frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{m} (\xi_i + \xi_i^*) \quad \text{s.t.} \begin{cases} y_i - (\omega \cdot \phi(x_i)) - u \le \varepsilon + \xi_i \\ (\omega \cdot \phi(x_i)) + u - y_i \le \varepsilon + \xi_i^* \\ \xi_i, \ \xi_i^* \ge 0 \end{cases} \tag{3} \]

where ω is the weight vector and (1/2)‖ω‖² represents the model complexity. C is the penalty factor, which keeps a balance between the complexity and the empirical risk [30]. Increasing the value of C means that more attention is paid to the empirical risk, so over-fitting becomes more likely; conversely, with a small C, under-fitting easily occurs. Therefore, choosing a suitable value of C is crucial for establishing a favorable SVR model.

The Lagrangian multipliers method and the KKT conditions can be used to transform the quadratic programming problem of Eq. (3) into the dual optimization problem of Eq. (4),

\[ \max J(\alpha) = \max \left\{ -\frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) - \varepsilon \sum_{i=1}^{m} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{m} y_i (\alpha_i - \alpha_i^*) \right\} \quad \text{s.t.} \begin{cases} \sum_{i=1}^{m} (\alpha_i - \alpha_i^*) = 0 \\ 0 < \alpha_i < C \\ 0 < \alpha_i^* < C \end{cases} \tag{4} \]

where αi, αi*, αj and αj* are Lagrange multipliers, and K(xi, xj) is the kernel function, with which the input space of the data can be transformed into a nonlinear, high-dimensional space. According to the αi and αi* calculated from Eq. (4), the support vectors xi (for which αi and αi* are not both 0) and the standard support vectors xi (for which one of αi and αi* is C) can be determined, and then the threshold value u can be calculated according to Eq. (5),

\[ u = \frac{1}{N_{\mathrm{NSV}}} \left\{ \sum_{0 < \alpha_i < C} \left[ y_i - \sum_{j=1}^{l} (\alpha_j - \alpha_j^*) K(x_i, x_j) - \varepsilon \right] + \sum_{0 < \alpha_i^* < C} \left[ y_i - \sum_{j=1}^{l} (\alpha_j - \alpha_j^*) K(x_i, x_j) + \varepsilon \right] \right\} \tag{5} \]

where N_NSV is the number of standard support vectors and l is the number of support vectors. The resulting approximation function can then be written as Eq. (6):

\[ f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) K(x_i, x) + u \tag{6} \]

Although the feature space and the nonlinear mapping are used in the derivation process, their explicit expressions are not required in the actual calculation [31]. The nonlinear regression function is computed from the kernel function and the coefficients αj, αj*, and u, which correspond to the support vectors in the sample data. The selection of the kernel function is crucial to support vector regression because it directly affects the nonlinear mapping of samples. Different kernel functions, including the polynomial function, the Gaussian radial basis function (RBF), and the sigmoid (s-shaped) kernel function, can be selected in the SVR algorithm [32]. For better generalization and nonlinear regression ability, the RBF kernel function is selected for the SVR modeling. The expression is shown in Eq. (7),

\[ K(x, x_i) = \exp\left( -g \|x - x_i\|^2 \right) \tag{7} \]

where g is the kernel function parameter. Changing the value of g indirectly changes the nonlinear mapping function, which directly determines the complexity and performance of the model.

In this study, the purpose of SVR model training is to find an appropriate correspondence f that satisfies Eq. (8) once the input and output variables are settled.

\[ \eta_i = f\left( D, K_a, \tilde{d}_r, v_i, C_i, \delta, \rho_p, d_m \right) \tag{8} \]

The 217 sets of experimental data from literature [14–16], some of which are shown in Table 2, are used to train and test the SVR model. The range of each input parameter is shown in Table 3.

According to the random sampling technique, 80% of the data are randomly selected as the training set of SVR, and the remaining 20% are used as the test set to verify the generalization ability of the model. Before training, the input data need to be normalized so that each variable is converted into a number between 0 and 1. The output results after training should be inversely normalized.

2.3. Dimensionality reduction based on PCA

When modeling multivariate data, a large number of variables increases the model complexity and computation time. To solve this problem, principal component analysis (PCA) is adopted to reduce the dimension of the dataset. PCA is one of the most commonly used dimensionality reduction algorithms and can overcome the computational complexity resulting from too many dependent variables. The idea of PCA is to map the n-dimensional features to k dimensions (k < n) according to the maximum variance theory. The k new features are called the principal components, and each is a linear combination of the original features. The new k features are independent and reflect most of the information in the sample space.

Deciding the reduced number of dimensions in PCA is a critical step. If only a few dimensions are retained, some

information could be lost by the dimension-reduced matrix. Conversely, if the number of dimensions remains high, the complexity of the regression model also becomes too high. In both cases, the generalization ability of the regression model is low. In this study, the original SVR model has an eight-dimensional input space, which could be reduced from eight to three dimensions by PCA. The performance parameters of the model on the test set after the dimension reduction and the corresponding SVR parameters are listed in Table 4. These performance parameters include the information retention ratio and the mean square error and correlation coefficient of the test set. Fig. 4 shows directly that the information retention ratio decreases as the number of dimensions decreases, and that the greatest information loss occurs when the dimension is reduced from five to four. After the dimensionality-reduced models are tested one by one using the test set, it is found that the correlation coefficient of the model is the largest and the root mean square error is the smallest when the dimensions are reduced to five. Thus, the dimension reduction from eight to five is the best choice.

Table 3
Range of input parameters.

 | D (mm) | Ka | d̃r | vi (m/s) | Ci (g/m³) | δ (μm) | ρp (kg/m³) | dm (μm)
Min | 300 | 4 | 0.25 | 5 | 5 | 1 | 2000 | 8
Max | 1200 | 8 | 0.5 | 50 | 1000 | 15 | 3500 | 15

Table 4
Performance parameters of the SVR model after dimension reduction.

Dimension | Information retention ratio | Mean square error of test set | Correlation coefficient of test set | Penalty factor C | RBF kernel parameter g | Insensitive loss ε
8 | 1 | 1.840e-3 | 0.951 | 650 | 0.673 | 0.03
7 | 0.9997 | 1.095e-3 | 0.970 | 217.62 | 0.62 | 0.01
6 | 0.9988 | 9.383e-4 | 0.975 | 368 | 0.9 | 0.021
5 | 0.9985 | 6.984e-4 | 0.982 | 660 | 0.673 | 0.026
4 | 0.9605 | 3.100e-3 | 0.916 | 800 | 1.5 | 0.01
3 | 0.9543 | 3.736e-3 | 0.900 | 25.63 | 240 | 0.001

In this study, the dimension reduction matrix W is obtained by the following steps.

Step 1: Normalizing the training set.
Step 2: Centralizing the training set.
Step 3: Calculating the covariance matrix of the training set.
Step 4: Calculating the eigenvalues of the covariance matrix and the corresponding eigenvectors.
Step 5: Sorting the eigenvalues from large to small and taking the eigenvectors corresponding to the first five eigenvalues. The five eigenvectors form the dimension reduction matrix W shown in Eq. (9).

According to Eq. (10), the input matrix A consisting of eight input variables is reduced to a five-dimensional feature matrix N by the 8 × 5 dimension reduction matrix W. The newly generated matrix N is composed of five independent variables N1, N2, N3, N4 and N5.

\[ W = \begin{bmatrix} -0.0068 & 0.0547 & 0.1711 & 0.3651 & 0.9095 \\ 0.6800 & -0.2824 & -0.4245 & 0.3984 & -0.0366 \\ 0.1585 & 0.8006 & -0.3241 & 0.1317 & -0.0345 \\ 0.5024 & -0.2846 & 0.2136 & -0.2094 & 0.0256 \\ -0.3822 & -0.3918 & -0.0870 & 0.2790 & -0.0314 \\ -0.2003 & -0.0450 & -0.1815 & 0.5427 & -0.2155 \\ -0.2123 & -0.0941 & -0.1701 & 0.2292 & -0.1155 \\ 0.1696 & 0.1758 & 0.7553 & 0.4711 & -0.3300 \end{bmatrix} \tag{9} \]

\[ N = A\,W \tag{10} \]

where N = [N1 N2 N3 N4 N5]ᵀ and A = [δ D ρp dm vi Ci Ka d̃r].

2.4. Parameter optimization of SVR by PSO

The generalization capacity of SVR greatly depends on the hyper-parameters, i.e., the penalty factor C, kernel function parameter g, and insensitive loss ε. However, it is difficult to determine proper values of these parameters from prior knowledge, and tuning them manually is time-consuming. Furthermore, the effect of these three parameters on the model performance is still uncertain. Thus, particle swarm optimization (PSO) is adopted for the parameter optimization.

The PSO algorithm was first proposed by Kennedy and Eberhart [33], inspired by the foraging behavior of birds. In the optimization process, each particle has its own speed, location, and fitness value determined by the target function. In each iteration, the particle updates its speed and position based on the best historical position that the particle itself has passed through (individual best) and the best position found by all particles (global best). The formulas for updating the position and speed are as follows,

\[ x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) \tag{11} \]

\[ v_{id}(t+1) = \omega \cdot v_{id}(t) + C_1 \cdot r_1 \cdot \left( P_{id} - x_{id}(t) \right) + C_2 \cdot r_2 \cdot \left( G_{id} - x_{id}(t) \right) \tag{12} \]

where i denotes the ith particle, d is the dimension, t is the iteration number, C1 and C2 are learning factors, r1 and r2 are random numbers between 0 and 1, ω is the linearly decreasing inertia weight, Pid is the individual extreme value of the ith particle in the dth dimension, and Gid is the global extreme value of all particles.

Five-fold cross-validation is used to evaluate the fitness of each particle in order to balance the computation cost against the effectiveness of the parameter optimization. The training set is randomly divided into five non-intersecting subsets with a roughly equal number of data patterns. For every set of SVR parameters C, g and ε extracted from the corresponding particle, four subsets are selected to be the training set for establishing the SVR model, and the performance of this SVR model is measured by calculating the RMSE on the remaining subset according to Eq. (13).

\[ \mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2 } \tag{13} \]

where n is the number of samples; yi is the true value; f(xi) is the predicted value of the model.

This process is repeated five times until each of the five subsets has been used exactly once as the testing subset. Eventually,

the fitness value of each particle is estimated by averaging the RMSE value over the five subsets [34]. However, to prevent over-fitting of the SVR model, a lower limit is set for the root mean square error during the particle swarm optimization, and the optimization ends when the root mean square error falls below this lower limit.

Fig. 4. Model performance corresponding to the different dimensions.

In this study, the SVR parameter optimization with PSO is described as follows:

Step 1: The PSO parameters are set and the particle swarm is initialized as shown in Table 5. The parameters include the swarm size, the maximum iterations, the acceleration coefficients c1 and c2, the inertia weight, the penalty factor C ∈ [0.1, 800], the RBF kernel parameter g ∈ [0.1, 10], and the ε-insensitive loss function parameter ε ∈ [0, 1]. Then, a population of initial particles is generated with random positions and velocities.
Step 2: For the training set, five-fold cross-validation is used to calculate the fitness value of each parameter combination, and the calculated result is taken as the initial individual best (pbest) of each particle. The best pbest in the particle swarm is set as the initial global best (gbest).
Step 3: The speed and position of each particle are updated according to Eqs. (11) and (12), and then the fitness value is calculated before pbest and gbest are updated.
Step 4: Step 3 is repeated until the end condition is met, and the optimal parameters are finally obtained.

Table 5
PSO parameter settings.

Particle swarm size | 50
Maximum iterations | 50
(C, g, ε) search range | Min = (0.1, 0.1, 0); Max = (800, 10, 1)
The initial position of the particle swarm | randomly generated
The initial velocity of the particle swarm | randomly generated

Fig. 5 shows how the optimization result varies with the number of iterations. The curve illustrates the trend of the best population fitness during the evolution process. The fitness decreases with increasing generation number and converges at about generation 25. After 50 iterations, the RMSE obtained on the training set through the five-fold cross-validation is 3.123 × 10⁻⁴, and the value of {C, g, ε} in the final optimization results is {660, 0.673, 0.026}.

Fig. 5. Fitness curve.

The SVR model configured with the optimal value {C, g, ε} obtained by the particle swarm optimization is trained on the randomly selected training data until it meets the convergence conditions. Subsequently, the trained model is used to predict the results for the inputs of the testing data, and the predictions are compared with the true values. Finally, the performance of the model is evaluated according to the evaluation parameters.

2.5. Evaluation parameters

For evaluating the performance of the model for the grade efficiency prediction, the normalized mean squared error MSE and the correlation coefficient R are defined as

\[ \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2 \tag{14} \]

\[ R^2 = \frac{ \sum_{i=1}^{n} (y_i - \bar{y}) \left( f(x_i) - \bar{f} \right) }{ \sqrt{ \sum_{i=1}^{n} (y_i - \bar{y})^2 \sum_{i=1}^{n} \left( f(x_i) - \bar{f} \right)^2 } } \tag{15} \]

where n is the number of samples; yi is the true value; f(xi) is the predicted value of the model; ȳ is the average of the true values; f̄ is the average of the predicted values. The smaller the mean squared error MSE, the higher the accuracy of the model prediction. Meanwhile, the greater the correlation coefficient, the higher the correlation between the experimental data and the predicted values. Moreover, R² = 1 indicates that the predicted value is completely correlated with the experimental data; that is, there is a linear relationship in the sense that the probability is 1. In addition, the simulation time (CPU time) t is also considered in the following to evaluate the computational efficiency.

3. Comparison and discussion

3.1. Comparison between the prediction results of PCA-PSO-SVR and experimental data

Fig. 6 shows the comparison between the predicted results of the PCA-PSO-SVR model and the experimental data for the grade efficiency of cyclone separators. The abscissa represents the experimental data of grade efficiency as reported in the literature [14–16], and the ordinate represents the predicted values of grade efficiency output by the PCA-PSO-SVR model. The red balls illustrate the predicted results of grade efficiency by the PCA-PSO-SVR model for the training samples. The green triangles are the grade efficiency values predicted by the PCA-PSO-SVR model for the test samples. They are all concentrated near the x = y line, indicating that the predicted results are consistent with the experimental data. The normalized mean squared error MSE and the correlation coefficient R of the training samples and testing samples

are shown in Fig. 6. In particular, both correlation coefficients are close to 1. This indicates that the PCA-PSO-SVR model can be used as a new method to fit the complex nonlinear relationship between the grade efficiency and the other influencing factors of the cyclone separator, with improved generalization ability and robustness.

Fig. 6. Comparison on the prediction results of grade efficiency between training sample and test sample.

3.2. Comparison of the dimensionality reduction between PCA and Stokes number

It is well known that the cyclone efficiency is greatly influenced by the Stokes number, a dimensionless number characterizing the behavior of particles suspended in a fluid flow. The Stokes number is defined as Eq. (16).

To compare the modeling accuracy of the two dimensionality reduction methods, PCA and the Stokes number, the PSO-SVR hybrid algorithm is used for regression modeling on the two five-dimensional models with 80% of the 217 sets of experimental data. Subsequently, both models are tested separately with the remaining 20% of the data. The test results are shown in Fig. 7. The values of {C, g, ε} in the PCA-PSO-SVR algorithm are {660, 0.673, 0.026}, and the values of {C, g, ε} in the Stokes number-PSO-SVR are {203, 3, 0.01}. The black squares indicate the predicted results of the PCA-PSO-SVR model for the testing samples. The closer the data points are to the line x = y, the closer the prediction results are to the experimental data. Most of the black squares are found to cluster near the x = y line. The PCA-PSO-SVR model maintains high accuracy even where the experimental data are sparse (in the region of grade efficiency less than 80%). The results show that PCA method requires more data for the process of dimensionality reduction.

3.3. Comparison among the PCA-PSO-SVR and classical theoretical models

Three theoretical models of the cyclone separator (Barth [7], Dietz [8] and Leith-Licht [9]) are compared with the PCA-PSO-SVR. Figs. 8(a), (c), (d), (e), (f) show the influence of Ka, D, δ, ρp and vi on the grade efficiency obtained by these models. One can see that the PCA-PSO-SVR model is consistent with the trend reflected by the experimental data. However, the Barth model gives overly optimistic predictions of the grade efficiency, and in contrast, the Dietz and Leith-Licht models give more pessimistic predictions. Most of the predicted values of the PCA-PSO-SVR model are closer to the experimental data.

Fig. 8(b) shows that the grade efficiency of the experimental data decreases with the increase in d̃r in the range of 0.25 to 0.45. The trends of the PCA-PSO-SVR and Barth models are consistent with the experimental data. However, it is interesting to note that the other two models give the opposite trend in the same range.

The concentration of inlet particles and the particle size distribution are two important parameters that affect the grade efficiency, but they are not considered in the three theoretical models. In contrast, the PCA-PSO-SVR model is used to deal with any factor that has been mea-
ρp δ2 vi sured by the experiments. The particle size distribution is expressed by
Stk ¼ ð16Þ
18μD the median diameter dm and the root mean square difference of the
dust particle size. With the increase in the median particle size, the
where ρp is the particle density, δ is the particle diameter, D is the cylin- large particles will play a certain drag effect on the small particles, which
der diameter and vi is the gas velocity at cyclone inlet. The Stokes will improve the separation efficiency in a certain range. The influence of
number-based method can be regarded as a dimensionality reduction the mean square error of the dust particle size distribution on the sepa-
method that integrates the four factors affecting efficiency into a dimen- ration performance is negligible according to the experimental data anal-
sionless variable. Thus, the eight variables, {δ, D, ρp, dm, vi, Ci, Ka, d r̃ }, that ysis. As the concentration increases, the drag force generated by the large
affect the cyclone efficiency are reduced to five, {Stk, dm, Ci, Ka, d ̃r}. particles moving toward the wall will entrain the small particles toward
the wall. As a result, the collision, interception, and agglomeration be-
tween particles increase. The viscous force of the gas stream on the par-
ticles is relatively reduced, which will lead to an increase in the particle
separation efficiency. Prediction data of the median particle size dm and
the inlet concentration ci on grade efficiency from the PCA-PSO-SVR
model are compared with the experimental data in Figs. 8(g) and (h).
The trend shows that the PCA-PSO-SVR model can better reflect the
effect of these two parameters on the grade efficiency.
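As an illustration of the Stokes-number reduction described in Section 3.2, the following Python sketch evaluates Eq. (16) and assembles the five inputs {Stk, dm, Ci, Ka, d̃r} from the eight measured variables. It is only a sketch under stated assumptions: the function names, the unit conversions (δ from μm and D from mm, following the Notation), the default gas viscosity, and the sample operating point are illustrative choices and are not taken from the paper.

```python
def stokes_number(rho_p, delta, v_i, mu, D):
    """Eq. (16): Stk = rho_p * delta**2 * v_i / (18 * mu * D), all inputs in SI units."""
    return rho_p * delta ** 2 * v_i / (18.0 * mu * D)


def stokes_features(delta_um, D_mm, rho_p, d_m, v_i, C_i, K_a, d_r_ratio, mu=1.8e-5):
    """Reduce the eight variables to the five inputs [Stk, d_m, C_i, K_a, d_r_ratio].

    Assumptions (not from the paper): delta is given in micrometres and D in
    millimetres, and mu = 1.8e-5 Pa*s is a typical value for air near room
    temperature.
    """
    stk = stokes_number(rho_p, delta_um * 1e-6, v_i, mu, D_mm * 1e-3)
    return [stk, d_m, C_i, K_a, d_r_ratio]


# Hypothetical operating point, chosen only to exercise the functions.
features = stokes_features(delta_um=5.0, D_mm=300.0, rho_p=2000.0,
                           d_m=10.0, v_i=20.0, C_i=10.0, K_a=5.5, d_r_ratio=0.3)
print(features)
```

Because δ, D, ρp, and vi enter only through Stk, two operating points with the same Stokes number and the same remaining four inputs are indistinguishable to a model trained on these features, which is the sense in which the method reduces dimensionality.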

Fig. 7. Comparison of the prediction results of PCA-PSO-SVR and Stokes-PSO-SVR.

Fig. 8. Performance comparison between the PCA-PSO-SVR and theoretical models.

Table 6
Evaluation parameters and hyper-parameters of SVR hybrid with PSO and PCA.

        PCA-PSO-SVR     PCA-SVR         PSO-SVR         SVR
MSE     6.948 × 10⁻⁴    1.617 × 10⁻³    1.010 × 10⁻³    1.736 × 10⁻³
R²      0.982           0.957           0.929           0.885
C       660             512             313             256
g       0.673           0.5             0.131           0.125
ε       0.026           0.031           0.010           0.016

Fig. 9. Comparison of the PCA-PSO-SVR with PCA-SVR, PSO-SVR and SVR models for the grade efficiency.

Fig. 10. Time consumption of parameter optimization.

Fig. 11. Comparison of the PCA-PSO-SVR model with the BP, RBF and GRNN models for the grade efficiency.

Table 7
Evaluation parameters of different models.

        PCA-PSO-SVR     BP              RBF             GRNN
MSE     6.948 × 10⁻⁴    1.200 × 10⁻²    4.680 × 10⁻²    3.780 × 10⁻²
R²      0.982           0.901           0.8245          0.7532

Notation

a       Inlet height, mm
b       Inlet width, mm
H1      Cylinder height, mm
H2      Cone height, mm
S       Length of vortex finder, mm
B       Particle exit diameter, mm
C       The penalty factor
Ci      Concentration of inlet particles, g/m³
D       Cyclone diameter, mm
dr      Cyclone gas outlet diameter, mm
d̃r      The ratio of diameter of vortex finder to that of cyclone (dimensionless), d̃r = dr/D
g       The parameter of the kernel function
Ka      The ratio of cyclone cross-section area to inlet cross-sectional area (dimensionless), Ka = πD²/(4ab)
vi      Gas velocity at cyclone inlet, m/s
δ       Particle diameter, μm
dm      Median size of particle, μm
ε       The insensitive loss
η       Overall efficiency, %
ηi      Graded efficiency of particles, %
ρp      Particle density, kg/m³
ωi      The weight vector
u       The threshold value

3.4. Comparison among PCA-PSO-SVR, PCA-SVR, PSO-SVR, and SVR models

To test how PCA and PSO each improve the SVR performance, the prediction results of the PCA-PSO-SVR, PCA-SVR, PSO-SVR, and SVR models for the test samples are shown in Fig. 9. The red circles indicate the predicted results of the PCA-PSO-SVR model for the test samples. Most of them are concentrated near the line x = y, which means that the predicted results agree well with the experimental data. The PCA-PSO-SVR model achieves the minimum mean squared error and the highest correlation among the four models. The correlation coefficient of the PCA-SVR model, 0.957, is higher than those of the PSO-SVR and SVR models, which shows that PCA effectively reduces the dimensionality of the feature space and improves the generalization ability of the model. The mean squared error of the PSO-SVR method is 1.010 × 10⁻³, lower than those of the PCA-SVR and SVR models, which means that the particle swarm optimization improves the modeling accuracy of the SVR. Table 6 lists the mean squared error MSE and the correlation coefficient R used to evaluate the performance of the models, together with the SVR hyper-parameters {C, g, ε} for the grade efficiency prediction.

Fig. 10 shows the CPU time for SVR and PCA-SVR with the standard grid method (t = 145.07 s and 3508.85 s, respectively) and the CPU time for PSO-SVR and PCA-PSO-SVR with the PSO algorithm (t = 25.63 s and 502.65 s, respectively). The PSO algorithm requires far less time than the standard grid method. With 50 iterations and 50 particles, the particle swarm algorithm needs 2500 evaluations of the fitness function, each computed with 5-fold cross-validation. In contrast, a standard grid search with 50 levels for each optimization parameter needs 125,000 evaluations of the fitness function with 5-fold cross-validation. In summary, as an advanced evolutionary algorithm, the particle swarm optimization can replace the standard grid search method to find better model parameters and to improve the optimization speed and accuracy.
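The evaluation counts quoted in Section 3.4 follow from simple arithmetic, sketched here in Python. The variable names are illustrative, and the assumption that each fitness evaluation costs five SVR fits (one per fold) is an interpretation of the 5-fold cross-validation mentioned in the text, not a figure reported by the authors.

```python
# PSO settings from Table 5 and the grid resolution stated in Section 3.4.
swarm_size = 50
max_iterations = 50
levels_per_parameter = 50
n_parameters = 3  # C, g, epsilon
cv_folds = 5      # assumed cost: one SVR fit per fold

# PSO evaluates every particle once per iteration.
pso_evaluations = swarm_size * max_iterations
# A full grid evaluates every combination of parameter levels.
grid_evaluations = levels_per_parameter ** n_parameters

print("PSO fitness evaluations: ", pso_evaluations)
print("Grid fitness evaluations:", grid_evaluations)
print("Grid / PSO ratio:        ", grid_evaluations // pso_evaluations)
print("SVR fits with 5-fold CV: ", pso_evaluations * cv_folds,
      "vs", grid_evaluations * cv_folds)
```

The CPU times reported for Fig. 10 differ by a smaller factor than this count ratio, so the number of evaluations is best read as a rough guide to relative cost and not as a timing prediction.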

3.5. Comparison among PCA-PSO-SVR, BP, RBF and GRNN models

To test the validity of the PCA-PSO-SVR model, three types of ANN (Artificial Neural Network) models are adopted to model the cyclone grade efficiency, namely, back propagation (BP), radial basis function (RBF), and generalized regression neural network (GRNN). The BP neural network adopts a single hidden layer structure with 10 neurons. The spread parameter of the radial basis function in the radial basis neural network is 7.5, and the spread parameter of the probabilistic neural network in the generalized regression neural network is set to 0.1. Most of the prediction results of the PCA-PSO-SVR model cluster near the x = y line in Fig. 11, which means that the accuracy of the PCA-PSO-SVR model is superior to that of the three neural networks. Some values predicted by RBF are lower than the experimental data, while some values predicted by GRNN are higher than the experimental data. This phenomenon is especially noticeable where there are only a few data (grade efficiency below 80%). Table 7 lists the evaluation parameters of the BP, RBF, GRNN, and PCA-PSO-SVR models. It shows that the PCA-PSO-SVR model achieves the minimum mean squared error and the highest correlation compared with the three ANN models.

4. Conclusions

The PCA-PSO-SVR modeling method, which combines principal component analysis, particle swarm optimization, and the support vector regression algorithm, is proposed to model the cyclone efficiency using experimental data. The simulation results show that PCA, as an unsupervised dimensionality reduction algorithm, can effectively reduce the dimensionality of the feature space, eliminate part of the noise in the data, reduce the complexity of the model, and improve the generalization ability of the model. As an optimization algorithm, PSO shows excellent ability to find proper parameters for the SVR model. With the optimized parameters, SVR is successfully used to predict the grade efficiency of the cyclone separator. The prediction results show that the PCA-PSO-SVR model has strong predictive ability, high stability, high generalization ability, and robustness compared with the classical theoretical models, the PSO-SVR, SVR, and PCA-SVR models, and some types of ANN models. As a future extension of this work, the development of higher-performance artificial intelligence models and advanced optimal search algorithms is necessary to predict the grade efficiency of the cyclone separator more accurately and to guide its optimization design.

Acknowledgments

The authors acknowledge support from the National Key Research and Development Program of China (2018YFB0604603-03), the National Natural Science Foundation of China (No. 21506139), the NSFC-Shanxi Joint Fund for Coal-Based Low-Carbon Technology (No. U1710101), and the Special Talent Program of Shanxi Province (No. 201605D211005).

References

[1] B. Zhao, Development of a dimensionless logistic model for predicting cyclone separation efficiency, Aerosol Sci. Technol. 44 (12) (2010) 1105–1112, https://doi.org/10.1080/02786826.2010.512027.
[2] B. Zhao, Prediction of gas-particle separation efficiency for cyclones: a time-of-flight model, Sep. Purif. Technol. 85 (2012) 171–177, https://doi.org/10.1016/j.seppur.2011.10.006.
[3] G. Lidén, A. Gudmundsson, Semi-empirical modelling to generalise the dependence of cyclone collection efficiency on operating conditions and cyclone design, J. Aerosol Sci. 28 (5) (1997) 853–874, https://doi.org/10.1016/S0021-8502(96)00479-X.
[4] Y.F. Qiu, B.Q. Deng, N.K. Chang, Numerical study of the flow field and separation efficiency of a divergent cyclone, Powder Technol. 217 (2012) 231–237, https://doi.org/10.1016/j.powtec.2011.10.031.
[5] G.G. Sun, J.Y. Chen, M.X. Shi, Optimization and applications of reverse-flow cyclones, China Particuology 3 (2005) 43–46, https://doi.org/10.1016/S1672-2515(07)60162-6.
[6] J.X. Yang, G.G. Sun, M.S. Zhan, Prediction of the maximum-efficiency inlet velocity in cyclones, Powder Technol. 286 (2015) 124–131, https://doi.org/10.1016/j.powtec.2015.07.024.
[7] W. Barth, Design and layout of the cyclone separator on the basis of new investigations, Brennstoff-Warme-Kraft 8 (1956) 1–9, http://refhub.elsevier.com/s0032-5910(17)30882-3/rf0100.
[8] P.W. Dietz, Collection efficiency of cyclone separators, AIChE J. 27 (1981) 888–892, http://refhub.elsevier.com/s0032-5910(17)30882-3/rf0105.
[9] D. Leith, W. Licht, The collection efficiency of cyclone type particle collectors: a new theoretical approach, AIChE Symp. Ser. 68 (1972) 196–206, http://refhub.elsevier.com/s0032-5910(16)30086-9/rf0030.
[10] S.E. Rafiee, M.M. Sadeghiazad, Efficiency evaluation of vortex tube cyclone separator, Appl. Therm. Eng. 114 (2017) 300–327, https://doi.org/10.1016/j.applthermaleng.2016.11.110.
[11] Y. Zhu, K.W. Lee, Experimental study on small cyclones operating at high flowrates, J. Aerosol Sci. 30 (10) (1999) 1303–1315, https://doi.org/10.1016/S0021-8502(99)00024-5.
[12] J.Y. Chen, M.X. Shi, Analysis on cyclone collection efficiencies at high temperatures, China Particuology 1 (2003) 20–26, https://doi.org/10.1016/S1672-2515(07)60095-5.
[13] K.S. Lim, H.S. Kim, K.W. Lee, Characteristics of the collection efficiency for a cyclone with different vortex finder shapes, J. Aerosol Sci. 35 (2004) 743–754, https://doi.org/10.1016/j.jaerosci.2003.12.002.
[14] X.L. Luo, J.Y. Chen, Research on the effect of the particle concentration in gas upon the performance of cyclone separators, J. Eng. Thermophys-rus. 13 (3) (1992) 282–285, http://jetp.iet.cn/EN/Y1992/V13/I3/282.
[15] Y.H. Jin, J.Y. Chen, Computation method of PV™ cyclone performance, Acta Pet. Sin. 2 (1995) 93–99, http://lib.cqvip.com/qk/81668X/200001/1878380.html.
[16] Y.H. Jin, M.X. Shi, Experimental studies on scale-up of cyclone separator, J. China Univ. Pet. Ed. Nat. Sci. 5 (1990) 46–55, http://qikan.cqvip.com/article/detail.aspx?id=353292.
[17] X. Sun, Y.Y. Joon, Multi-objective optimization of a gas cyclone separator using genetic algorithm and computational fluid dynamics, Powder Technol. 325 (2018) 347–360, https://doi.org/10.1016/j.powtec.2017.11.012.
[18] M. Francesco, R. Francesco, N.G. Carlo, Separation efficiency and heat exchange optimization in a cyclone, Sep. Purif. Technol. 179 (2017) 393–402, https://doi.org/10.1016/j.seppur.2017.02.024.
[19] A.N. Huang, I. Keiya, F. Tomonori, F. Kunihiro, K. Hsiu-Po, Effects of particle mass loading on the hydrodynamics and separation efficiency of a cyclone separator, J. Taiwan Inst. Chem. E. 90 (2018) 61–67, https://doi.org/10.1016/j.jtice.2017.12.016.
[20] D. Misiulia, A.G. Andersson, T.S. Lundström, Effects of the inlet angle on the collection efficiency of a cyclone with helical-roof inlet, Powder Technol. 305 (2017) 48–55, https://doi.org/10.1016/j.powtec.2016.09.050.
[21] W.I. Mazyana, A. Ahmadib, J. Brinkerhoffa, H. Ahmedc, M. Hoorfar, Enhancement of cyclone solid particle separation performance based on geometrical modification: numerical analysis, Sep. Purif. Technol. 191 (2018) 276–285, https://doi.org/10.1016/j.seppur.2017.09.040.
[22] F. Zhou, G.G. Sun, Y. Zhang, H. Ci, Q. Wei, Experimental and CFD study on the effects of surface roughness on cyclone performance, Sep. Purif. Technol. 193 (2018) 175–183, https://doi.org/10.1016/j.seppur.2017.11.017.
[23] K. Elsayed, C. Lacor, CFD modeling and multi-objective optimization of cyclone geometry using desirability function, artificial neural networks and genetic algorithms, Appl. Math. Model. 37 (8) (2013) 5680–5704, https://doi.org/10.1016/j.apm.2012.11.010.
[24] B. Zhao, Modeling pressure drop coefficient for cyclone separators: a support vector machine approach, Chem. Eng. Sci. 64 (2009) 4131–4136, https://doi.org/10.1016/j.ces.2009.06.017.
[25] K. Elsayed, C. Lacor, Modeling and pareto optimization of gas cyclone separator performance using RBF type artificial neural networks and genetic algorithms, Powder Technol. 217 (2) (2012) 84–99, https://doi.org/10.1016/j.powtec.2011.10.015.
[26] K. Yetilmezsoy, Determination of optimum body diameter of air cyclones using a new empirical model and a neural network approach, Environ. Eng. Sci. 23 (4) (2006) 680–690, https://doi.org/10.1089/ees.2006.23.680.
[27] A. Khalkhali, H. Safikhani, Pareto based multi-objective optimization of a cyclone vortex finder using CFD, GMDH type neural networks and genetic algorithms, Eng. Optim. 44 (1) (2012) 105–118, https://doi.org/10.1080/0305215X.2011.564619.
[28] G.G. Sun, M.X. Shi, The proper design and application of PV cyclone, Pet. Refin. Eng. 32 (9) (2002) 4–7 (in Chinese), https://doi.org/10.3969/j.issn.1002-106X.2002.09.002.
[29] M.P. Wang, Q. Tian, Dynamic heat supply prediction using support vector regression optimized by particle swarm optimization algorithm, Math. Probl. Eng. 1 (2016) 1–10, https://doi.org/10.1155/2016/3968324.
[30] R. Dash, P.K. Sa, B. Majhi, Particle swarm optimization based support vector regression for blind image restoration, J. Comput. Sci. Technol. 27 (5) (2012) 989–995, https://doi.org/10.1007/s11390-012-1279-z.
[31] Y.Y. Chen, Q.F. Xiong, Support Vector Machine Method and Application Course [M]. Beijing, 2011.
[32] Y. Yajima, H. Ohi, M. Mori, Extracting feature subspace for kernel based linear programming support vector machines, J. Oper. Res. Soc. Jpn. 46 (4) (2003) 395–408, https://doi.org/10.15807/jorsj.46.395.
[33] E. Russell, K. James, Particle swarm optimization, in: IEEE proceedings, Neural Netw. 4 (1995) 1942–1948, https://doi.org/10.1109/ICNN.1995.488968.
[34] Z. Zhong, D. Pi, Forecasting satellite attitude volatility using support vector regression with particle swarm optimization, IAENG Int. J. Comput. Sci. 41 (3) (2014) 153–162, http://www.iaeng.org/IJCS/issues_v41/issue_3/IJCS_41_3_01.pdf.