A method for more accurate prediction of crystallization kinetics is greatly needed in the field of industrial
crystallization. Traditional empirical correlations, although long in use, cannot give reliable predictions because
crystallization kinetics is highly nonlinear. In this paper, the development
of a neural network model is presented. The model was trained with limited data obtained from an anti-solvent
crystallization system (ciprofloxacin hydrochloride, H2O, and ethanol). The predictions from the network
were then validated against newly measured data. The results confirm that this approach gives much more
accurate predictions of the kinetics, taking crystal growth and agglomeration as examples. The mean
relative error of the predicted growth rates from this model, versus the measured data, is generally <10%
and, in some cases, as low as 5%. This is a significant improvement on the relative error of 20% or more
that is typically achieved by traditional correlations.
1. Introduction

The artificial neural network (ANN) has been under development for many years and has an increasingly significant role across a wide range of fields. ANN has been applied to chemical engineering over the last 16 years. Since 1988, when Hosking and Himmelblau1 first applied it to error diagnosis for a chemical process, it has been widely used for various applications in chemical engineering.2

Crystallization kinetics (e.g., nucleation rate or crystal growth) is essential for the analysis, design, and operation of industrial crystallization processes. Traditionally, kinetics is described by correlations that are obtained from experimental data using assumed functional forms. The mechanisms of nucleation and crystal growth are extremely complex, with high nonlinearity and parameter interactions, which are not easily reflected by simple empirical correlations, so that the use of those correlations normally leads to large errors in predictions. However, the neural network approach is better able to reproduce the overall system behavior, with its nonlinearities and parameter interaction effects. It can not only capture the interactions between each influential element but also provide the mapping procedure from input to output, so it is expected to be able to predict the complex interplays of the influential factors in crystallization processes.

This paper describes the development of a neural network model to predict crystal nucleation, growth, and agglomeration rates. An anti-solvent crystallization system that is composed of ciprofloxacin hydrochloride, H2O, and ethanol was used for the purposes of training and validating the model.

2. Construction of the Network

Several papers have suggested that using a multiple-layer neural network is a more effective way to approximate nonlinear relationships between parameters.4-8 In fact, 80%-90% of ANN applications use a multiple-layer neuron structure or derivatives thereof.

Multilayered networks can perform well on any linear or nonlinear problems and can approximate any reasonable function very well. Such networks can overcome the problems associated with linear and perceptron networks. The network constructed in this work adopts such a multilayer structure, to approach a function with definite discrete points. The training strategy of the network is shown in Figure 1.

Determining a suitable learning rate for a nonlinear network is important. A large learning rate may lead to instability. Conversely, if the learning rate is too small, an impractically long training time may be needed. Unlike linear networks, there is no easy, prescriptive way of selecting a suitable learning rate for nonlinear multilayer networks.

The network constructed in this work generalizes the Widrow-Hoff learning rule; it trains the weightings of nonlinear differentiable functions and is trained by input vector and target vector pairs. The ANN model can finally approximate a function that relates the input vectors to specific target vectors.

A multiple-layer neural network consists of input vectors, hidden layers, and output layers. In the current work, the network is constructed with two layers, which is believed to give sufficiently good approximations for predicting kinetics.

The detailed structure of the network is illustrated in Figure 2, in which P is the input vector and the parameters b1 and b2 represent the hidden-layer and output-layer bias vectors, respectively. N1 represents the hidden-layer parameter vectors, which can be defined as

$$N_1 = W_1 P + b_1 \tag{1}$$

where W1 is the hidden-layer weighting matrix.

The hidden-layer propagation function, f1(x), is expressed as

$$f_1(x) = \frac{2}{1 + e^{-2x}} - 1 \tag{2}$$

Its purpose is to send information in the hidden layer to the output layer. The output parameter, N2, can be calculated using the following equation:

$$N_2 = W_2 P + b_2 \tag{3}$$

where W2 is the output-layer weighting matrix.

* To whom correspondence should be addressed. Tel.: +86 22 27405754. Fax: +86 22 27400287. E-mail address: david.wei@tju.edu.cn.
10.1021/ie0487944 CCC: $33.50 © 2006 American Chemical Society
Published on Web 11/30/2005
Ind. Eng. Chem. Res., Vol. 45, No. 1, 2006 71
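The layer equations (eqs 1-3) can be illustrated with a minimal Python sketch of one forward pass through a two-layer network. This is only a sketch under stated assumptions, not the code used in this work: the function names, the weights, and the input values are hypothetical placeholders, and the sizes (five inputs, three hidden neurons, three outputs) are chosen for illustration only. The sketch also assumes that the output layer receives the hidden-layer result f1(N1) and is followed by a linear output function, as is usual for networks of this type, whereas eq 3 as printed writes the output-layer input as P.

```python
import math


def tansig(x):
    """Hidden-layer function of eq 2: 2 / (1 + exp(-2x)) - 1 (equal to tanh(x)).

    math.exp overflows for x below roughly -350; math.tanh is the robust equivalent.
    """
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


def purelin(x):
    """Linear output function: returns its argument unchanged."""
    return x


def affine(W, v, b):
    """Compute W v + b for a matrix W (list of rows) and vectors v and b."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def forward(P, W1, b1, W2, b2):
    """One forward pass through a two-layer network."""
    N1 = affine(W1, P, b1)            # eq 1: N1 = W1 P + b1
    A1 = [tansig(n) for n in N1]      # eq 2 applied to each hidden neuron
    # Assumption: the output layer is fed the hidden activations A1, not P.
    N2 = affine(W2, A1, b2)
    return [purelin(n) for n in N2]


# Hypothetical, untrained weights and a hypothetical scaled input vector.
P = [0.2, -0.1, 0.5, 0.3, 0.0]
W1 = [[0.1, 0.2, -0.3, 0.0, 0.4],
      [-0.2, 0.1, 0.0, 0.3, 0.1],
      [0.0, -0.1, 0.2, 0.1, -0.4]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.5, -0.2, 0.1],
      [0.0, 0.3, -0.4],
      [0.2, 0.2, 0.2]]
b2 = [0.01, 0.02, 0.03]

outputs = forward(P, W1, b1, W2, b2)
print(outputs)
```

In a trained network, the entries of W1, b1, W2, and b2 would come from the training procedure rather than being set by hand as they are here.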
In this application, supersaturation (S), temperature (t), volume fraction (a), solid suspension density (Mt), and agitation rate (Nstr) are defined as the input vectors (suspension density is defined as the mass of solid particles per unit volume of slurry), whereas growth rate (G), nucleation rate (B0), and agglomeration coefficient (Ka) comprise the output vectors. The use of the tansig function (f1(x)) as the hidden propagation function, and the purelin function (f2(x)) as the output propagation function, enables this network to approximate any function reasonably well.

All weights and biases in the network are obtained through a training procedure that is based on the measured data. After a predefined level of precision is achieved, the training procedure can be terminated and the multiple-layer network is then ready to be used for prediction.

To evaluate the accuracy of the network constructed, the following mean relative error, Q, is used:

$$Q = \frac{1}{N} \sum_{1}^{N} \left| \frac{\text{the prediction value} - \text{the experimental value}}{\text{the experimental value}} \right|$$

$$G = \frac{-\left[ \dfrac{dC(t)}{dt} + \left( C(t) + M_t(t) \right) \dfrac{d \ln V(t)}{dt} \right]}{\left[ M_t(t) \displaystyle\sum_{i=1}^{n} \dfrac{\mathrm{wt}_i\%}{L_{i+1} - L_i} \right]} \tag{6}$$

$$B_0 = n(L_1, t) \times G \tag{7}$$

The agglomeration coefficient (Ka) is the constant (assuming size independence) for describing the agglomeration rate (ragg), as shown in eq 8, and can be obtained by discretizing the population balance:

$$r_{agg} = K_a \left( \frac{1}{2} \sum_{j=1}^{i-1} N_{i-j} N_j - N_i m_0 \right) \tag{8}$$
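As an illustration of how eq 8 might be evaluated on a discretized size distribution, a minimal Python sketch follows. It rests on assumptions that are not stated in the text above: N is taken as a list of particle number concentrations per size class (N[0] holding class 1), the classes are assumed to be spaced so that a particle of class j and a particle of class i - j combine into one particle of class i, and m0 is taken as the total number concentration (the sum of all N). The function name and the numerical values are hypothetical.

```python
def agglomeration_rates(N, Ka):
    """Evaluate eq 8 for every size class i = 1..len(N).

    N  : number concentration in each size class (assumed meaning of N_i)
    Ka : size-independent agglomeration coefficient
    """
    m0 = sum(N)  # assumption: m0 is the total number concentration
    rates = []
    for i in range(1, len(N) + 1):
        # Birth term: pairs of classes (i - j, j) that combine into class i.
        birth = 0.5 * sum(N[i - j - 1] * N[j - 1] for j in range(1, i))
        # Death term: class-i particles lost by combining with any particle.
        death = N[i - 1] * m0
        rates.append(Ka * (birth - death))
    return rates


# Hypothetical three-class distribution and coefficient.
rates = agglomeration_rates([2.0, 1.0, 0.0], Ka=0.5)
print(rates)  # [-3.0, -0.5, 1.0]
```

In this example the smallest class is only depleted, because its birth sum is empty, while the largest class, which starts empty, is only populated.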