
Fluid Phase Equilibria 502 (2019) 112282


A note on artificial neural network modeling of vapor-liquid equilibrium in multicomponent mixtures

Ivan Argatov a,c,*, Vitaly Kocherbitov b,c

a Faculty of Technology and Society, Malmö University, SE-205 06, Malmö, Sweden
b Faculty of Health and Society, Malmö University, SE-205 06, Malmö, Sweden
c Biofilms – Research Center for Biointerfaces, Malmö University, SE-205 06, Malmö, Sweden

* Corresponding author. Faculty of Technology and Society, Malmö University, SE-205 06, Malmö, Sweden. E-mail address: ivan.argatov@gmail.com (I. Argatov).

https://doi.org/10.1016/j.fluid.2019.112282

Article info

Article history: Received 21 May 2019; Received in revised form 8 August 2019; Accepted 15 August 2019; Available online 20 August 2019.

Keywords: Vapor–liquid equilibrium; Ternary system; Excess Gibbs energy; Activity coefficients; Artificial neural network.

Abstract

Application of artificial neural networks (ANNs) for modeling of vapor-liquid equilibrium in multicomponent mixtures is considered. Two novel ANN-based models are introduced, which can be seen as generalizations of the Wilson model and the NRTL model. A unique feature of the proposed approach is that an ANN approximation for the molar excess Gibbs energy generates approximations for the activity coefficients. A special case of the ternary acetic acid–n-propyl alcohol–water system (at 313.15 K) is used to illustrate the efficiency of the different models, including Wilson's model, Focke's model, and the introduced generalized degree-1 homogeneous neural network model. Also, the latter one-level NN model is compared to the Wilson model on 10 binary systems. The efficiency of the two-level NN model is assessed by a comparison with the NRTL model.

© 2019 Elsevier B.V. All rights reserved.

1. Introduction

The problem of vapor-liquid equilibrium (VLE) in multicomponent mixtures is of practical interest and has been considered in a number of studies [1]. For instance, the prediction of VLE in mixtures involved in vapor–liquid separation is of special importance for modeling distillation in alcoholic beverage production [2].

In the last decade, artificial neural networks (ANNs) have been intensively applied for modeling different physical and chemical phenomena [3,4], including VLE (see, e.g., Ref. [5]). There are a number of successful examples of applications of feed-forward ANNs, in particular, for direct prediction of the Fick diffusivity in binary liquid systems [6] and for vapor-liquid phase equilibrium flash calculations [7]. Recently, an ANN model was proposed [8] for the prediction of the activity coefficient at infinite dilution of binary systems with the properties of the individual components used as inputs to the neural network. Besides the mixture composition, temperature and pressure can be employed as the ANN inputs [9,10].

In Refs. [11,12], ANNs were utilized to estimate the activity coefficients of ternary liquid–liquid equilibrium data. In the majority of the mentioned studies, a multilayer perceptron (MLP) was employed with different numbers of nodes in one hidden layer. However, two-hidden-layer feedforward networks have also been used [10,13]. It is known [14] that the approximation capability of MLP networks severely depends on the number of hidden neurons, Nh, which, in turn, determines the number of learnable parameters (weights and biases). On the other hand, with increasing Nh, more experimental data will be needed for training the MLP network.

In the present paper, we pursue the approach proposed by Focke [15] to limit the number of neurons Nh to the number m of components in the mixture. This crucial constraint inevitably lowers the approximation capability of ANN models significantly, requiring, however, much less data to determine the model parameters. At the same time, the flexibility of ANN models is supported by a certain arbitrariness in the choice of the neuron's activation function. Another idea that enabled Focke to advance the ANN modelling of mixtures was the use of the inputs as substitutes for weights in the output neuron. In what follows, this suggestion has been utilized as well.

Another important feature of our approach is the incorporation of prior knowledge about VLE into the ANN architecture (an idea mentioned previously by Petersen et al. [16]). In particular, the ANN
model for the excess Gibbs energy of a mixture possesses the property of degree-1 homogeneity and has been forced to obey the known constraint for a pure-component system.

As a result, we introduce two new ANN-based thermodynamic models for approximating the excess Gibbs energy of a multicomponent mixture, which generalize the well-known Wilson model [17] and the NRTL model by Renon and Prausnitz [18]. We note that, recently, Reynel-Avila et al. [19] developed a hybrid ANN–NRTL model, which utilizes the feed composition and temperature, and in which a feedforward ANN is used for estimating the NRTL model parameters. Our approach differs from theirs in a key aspect, namely, it generalizes the neuron's structure.

2. Theory

2.1. Wilson's model

According to Wilson [17], the molar excess Gibbs energy of a liquid mixture with m components is approximated by

$$\frac{g^E}{RT} = -\sum_{i=1}^{m} x_i \ln\Bigl(\sum_{j=1}^{m} x_j \Lambda_{ij}\Bigr), \qquad (1)$$

where x_i denotes the mole fraction of the i-th component, Λ_ij ≠ Λ_ji are adjustable parameters, and Λ_ii = 1 (no summation on repeated indices).

The activity coefficients are related to the excess Gibbs energy as

$$\frac{g^E}{RT} = \sum_{k=1}^{m} x_k \ln\gamma_k \qquad (2)$$

and

$$\left(\frac{\partial\, n g^E}{\partial n_k}\right)_{T,p,\,n_{j\neq k}} = RT \ln\gamma_k. \qquad (3)$$

Here, n = n_1 + n_2 + ... + n_m = Σ_{i=1}^m n_i is the total number of moles of all compounds, R is the ideal gas constant, and T and p are the temperature and pressure, which are held constant.

From Eqs. (1) and (3), it follows that

$$\ln\gamma_k = -\ln\Bigl(\sum_{j=1}^{m} x_j \Lambda_{kj}\Bigr) + 1 - \sum_{i=1}^{m} \frac{x_i \Lambda_{ik}}{\sum_{j=1}^{m} x_j \Lambda_{ij}}. \qquad (4)$$

An important advantage of Wilson's model is that it requires only parameters which can be evaluated from binary mixture data [1].

Observe that Wilson's approximation (1) employs the operation of linear weighted summation of the variables x_i, which can be treated as inputs to the Wilson model. The same operation (usually with the addition of an extra constant term, called bias) is performed by an artificial neuron as the first of its two basic operations. The other basic operation of artificial neurons is called activation and is performed by application of a nonlinear function, called the activation function, to the result of the summation operation. In formula (1), the logarithmic function acts as an activation function.
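As a minimal illustration of Eqs. (1)–(4), the following Python sketch (an editorial addition, not code from the article) evaluates g^E/RT and the activity coefficients for a given composition and Wilson parameter matrix, and checks the consistency relation (2). The parameter values are hypothetical.

```python
import numpy as np

def wilson_gE(x, Lam):
    """Molar excess Gibbs energy g^E/RT from Eq. (1)."""
    # x: mole fractions (m,); Lam: Wilson parameters (m, m) with Lam[i, i] = 1
    return -np.dot(x, np.log(Lam @ x))

def wilson_lngamma(x, Lam):
    """Activity coefficients ln(gamma_k) from Eq. (4)."""
    s = Lam @ x                                # s_i = sum_j Lam_ij x_j
    return -np.log(s) + 1.0 - (x / s) @ Lam    # last term: sum_i x_i Lam_ik / s_i

# Hypothetical ternary composition and Wilson parameters (illustration only)
x = np.array([0.2, 0.3, 0.5])
Lam = np.array([[1.00, 0.70, 0.34],
                [1.42, 1.00, 0.03],
                [1.10, 0.56, 1.00]])

lng = wilson_lngamma(x, Lam)
# Consistency check, Eq. (2): g^E/RT = sum_k x_k ln(gamma_k)
assert np.isclose(wilson_gE(x, Lam), np.dot(x, lng))
```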
2.2. Focke's NN model

Recently, Focke [15] developed a special neural network model for describing mixture properties. The architecture of the Focke NN model is shown in Fig. 1. The key features of Focke's approach are: (1) each of the m components in the mixture is associated with a specific hidden layer neuron, (2) the hidden neurons may use an arbitrary but strictly monotonic function, f, as the transfer (activation) function, (3) the output neuron employs the inputs as weights on the connections from the hidden layer neurons, and (4) the activation function of the output neuron is the inverse function of f, which is denoted by f^{-1}. The latter two conditions are decisive for the Focke model.

Fig. 1. Focke's NN model for mixture property modeling. This scheme illustrates the process of calculation in accordance with formula (5). The symbol Σ denotes the operation of weighted summation.

The general Focke model, which is presented in Fig. 1, is formalized as

$$y = f^{-1}\Bigl(\sum_{i=1}^{m} x_i\, f\Bigl(\sum_{j=1}^{m} a_{ij} x_j\Bigr)\Bigr). \qquad (5)$$

Thus, the choice of the activation function f determines the Focke model. For instance, taking the logarithmic function f(u) = ln u, the general Focke model (5) yields

$$y = \prod_{i=1}^{m} \Bigl(\sum_{j=1}^{m} a_{ij} x_j\Bigr)^{x_i}, \qquad (6)$$

which is an exponential form of the semi-theoretical model introduced by Wilson [17] (see Appendix A).

We employ the special Focke model (6) under the constraint

$$y = 0 \quad \text{for} \quad x_i = 1,\; x_j = 0,\; j \neq i. \qquad (7)$$

From here it follows that a_ii = 0. Hence, the total number of adjustable parameters a_ij is equal to m(m - 1).

Observe that under the zero constraint (7), formula (6) can be rewritten as

$$y = \prod_{i=1}^{m} \Bigl(\sum_{j \neq i} a_{ij} x_j\Bigr)^{x_i}. \qquad (8)$$

Here, it is noteworthy that the inputs x_1, x_2, ..., x_m are constrained to obey the equation

$$\sum_{i=1}^{m} x_i = 1. \qquad (9)$$

Therefore, any variation of x_i leads to variation of the base of the i-th factor on the right-hand side of Eq. (8), even though this base does not explicitly depend on the variable x_i.

It should be emphasized that, though the Wilson model (1) and the Focke model (6) are mathematically interrelated (see Appendix A), they will be treated as different approximations.
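A small Python sketch of the Focke model (written for this note, not taken from Ref. [15]) makes the structure of Eqs. (5) and (6) explicit: a generic version parameterized by f and its inverse, and the logarithmic special case (6). The coefficient values are hypothetical.

```python
import numpy as np

def focke_general(x, a, f, finv):
    """General Focke model, Eq. (5): y = f^{-1}( sum_i x_i f( sum_j a_ij x_j ) )."""
    return finv(np.dot(x, f(a @ x)))

def focke_log(x, a):
    """Special Focke model, Eq. (6), obtained for f(u) = ln u."""
    return np.prod((a @ x) ** x)

# Hypothetical ternary coefficients with a_ii = 0 (constraint (7))
x = np.array([0.2, 0.3, 0.5])
a = np.array([[0.00, 0.05, 0.21],
              [0.19, 0.00, 0.46],
              [0.70, 1.49, 0.00]])

y1 = focke_general(x, a, np.log, np.exp)
y2 = focke_log(x, a)
assert np.isclose(y1, y2)   # Eq. (6) is Eq. (5) with the logarithmic activation
```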
2.3. Degree-1 homogeneous NN model

In our modeling approach, we consider an ANN as a special kind of nonlinear regression model, whose building blocks can be regarded as generalized artificial neurons, which receive weighted inputs and sequentially perform the operation of summation and the activation operation of nonlinear transformation.

We take advantage of the flexibility of ANNs and construct a special architecture that represents the output variable y as a homogeneous function of degree 1 in the input variables x_1, x_2, ..., x_m (see Fig. 2). The units in the input layer receive m input signals x_i, which are transferred to each of the m hidden neurons with individual weights W_ij. The neuron outputs, with weights equal to the corresponding inputs, are fed into the output neuron to produce the ANN model output y. The presented neural network model differs from an MLP by the work of its hidden neurons, which is explained below.

To be more specific, we make use of the hyperbolic tangent tanh u = (1 - e^{-2u})/(1 + e^{-2u}) as the activation function of the hidden layer neurons, which is one of the most widely used activation functions. This choice allows us to avoid the presence of biases, thereby significantly reducing the number of fitting parameters. Moreover, since tanh(0) = 0, we put

$$f_1(u) = \tanh(u - 1), \qquad (10)$$

so that

$$f_1(1) = 0. \qquad (11)$$

Additionally, it is required that

$$W_{ii} = 1, \qquad (12)$$

where W_ij are the weights of the hidden layer connections, and no summation over repeated indices is implied.

The number of hidden neurons equals the number of mixture components, and the inputs are used as the weights of the output layer. Finally, the identity activation function is employed for the single output unit. Thus, the introduced ANN model works as follows:

$$y = \sum_{i=1}^{m} x_i\, f_1\!\left(\frac{\sum_{j=1}^{m} x_j}{\sum_{j=1}^{m} W_{ij} x_j}\right). \qquad (13)$$

Fig. 2. Degree-1 homogeneous NN model for mixture property modeling. This scheme illustrates the process of calculation in accordance with formula (13).

We emphasize that the total number of fitting parameters in (13) is m(m - 1), which is the same as that of Wilson's model (1). In view of (11) and (12), the ANN-type approximation (13) satisfies the zero limit condition (7).

Strictly speaking, in Eq. (13), the activation function f_1 is applied not directly to the weighted sum of inputs but rather to its inverse value, since the sum of all inputs is equal to unity (see Eq. (9)). By setting F_1(u) = f_1(1/u), the ANN model (13) can be recast in the form of a multilayer perceptron with an activation function possessing a step-like discontinuity.

Observe that because tanh(u) ∈ (-1, 1) for any u ∈ (-∞, ∞), the following inequality holds:

$$|y| \equiv \left|\sum_{i=1}^{m} x_i\, f_1\!\left(1 \Big/ \sum_{j=1}^{m} W_{ij} x_j\right)\right| \le 1.$$

From here it follows that the ANN model (13) with the activation function (10) is a priori applicable for approximating functions y(x_1, x_2, ..., x_m) which take values between -1 and 1. Therefore, in the general case, an appropriate normalization of the data is required.

Further, we apply the ANN model (13) for approximating the molar excess Gibbs energy, i.e., we put

$$\frac{g^E}{RT} = \sum_{i=1}^{m} x_i\, f_1\!\left(\frac{\sum_{j=1}^{m} x_j}{\sum_{j=1}^{m} W_{ij} x_j}\right), \qquad (14)$$

so that the function G^E = n g^E will be given by

$$\frac{G^E}{RT} = \sum_{i=1}^{m} n_i\, f_1\!\left(\frac{\sum_{j=1}^{m} n_j}{\sum_{j=1}^{m} W_{ij} n_j}\right).$$

By the definition (see Eq. (3)), the activity coefficients can be evaluated as

$$\ln\gamma_k = f_1\!\left(\frac{\sum_{j=1}^{m} n_j}{\sum_{j=1}^{m} W_{kj} n_j}\right) + \sum_{i=1}^{m} n_i\, f_1'\!\left(\frac{\sum_{j=1}^{m} n_j}{\sum_{j=1}^{m} W_{ij} n_j}\right) \frac{\sum_{j=1}^{m} W_{ij} n_j - W_{ik} \sum_{j=1}^{m} n_j}{\Bigl(\sum_{j=1}^{m} W_{ij} n_j\Bigr)^{2}}$$

or, which is the same,

$$\ln\gamma_k = f_1\!\left(\frac{\sum_{j=1}^{m} x_j}{\sum_{j=1}^{m} W_{kj} x_j}\right) + \sum_{i=1}^{m} x_i\, f_1'\!\left(\frac{\sum_{j=1}^{m} x_j}{\sum_{j=1}^{m} W_{ij} x_j}\right) \frac{\sum_{j=1}^{m} W_{ij} x_j - W_{ik} \sum_{j=1}^{m} x_j}{\Bigl(\sum_{j=1}^{m} W_{ij} x_j\Bigr)^{2}}. \qquad (15)$$

Now, let us underline that Eq. (9) can be applied to simplify formula (15), since the sum of all x_j (j = 1, 2, ..., m) equals 1. However, this equation may not be utilized to simplify the numerator of the argument of the activation function in formula (14), because the relation ∂(Σ_{j=1}^m x_j)/∂x_i = 1 was used in deriving formula (15).

Let us consider a special case of the activation function, namely, f_1(u) = ln u, so that f_1'(u) = 1/u. It is easy to show that in this case Eqs. (14) and (15) coincide with Eqs. (1) and (4), provided that W_ij = Λ_ij. Thus, the ANN model (13) generalizes Wilson's model for an arbitrary monotonic activation function satisfying the centering condition (11).

Observe that at the beginning we employ a tanh-based activation function, and then we generalize our equations for an arbitrary activation function f_1(u) which satisfies Eq. (11). Finally, we consider the special case of the logarithmic function to show a link to the Wilson model. The logic of this approach is that both the form of the activation function and its centering play a role in designing an ANN model.
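The following Python sketch (an illustration added in editing, not code from the article) implements Eqs. (14) and (15) with the activation function (10) and checks two properties stated above: the consistency relation (2) and the reduction to Wilson's model (1), (4) when f_1(u) = ln u and W_ij = Λ_ij. The weight values are hypothetical.

```python
import numpy as np

def nn1_gE(x, W, f1):
    """One-level NN model, Eq. (14): g^E/RT = sum_i x_i f1( sum_j x_j / sum_j W_ij x_j )."""
    return np.dot(x, f1(x.sum() / (W @ x)))

def nn1_lngamma(x, W, f1, df1):
    """Activity coefficients ln(gamma_k) from Eq. (15); df1 is the derivative of f1."""
    N, S = x.sum(), W @ x                         # S_i = sum_j W_ij x_j
    u = N / S
    c = x * df1(u) / S                            # c_i = x_i f1'(u_i) / S_i
    return f1(u) + c.sum() - N * ((c / S) @ W)    # = f1(u_k) + sum_i x_i f1'(u_i)(S_i - N W_ik)/S_i^2

# Hypothetical ternary weights with W_ii = 1 (condition (12))
x = np.array([0.2, 0.3, 0.5])
W = np.array([[1.00, 0.44, 0.38],
              [2.28, 1.00, 0.03],
              [0.95, 0.55, 1.00]])

f1  = lambda u: np.tanh(u - 1.0)                  # Eq. (10)
df1 = lambda u: 1.0 - np.tanh(u - 1.0) ** 2

lng = nn1_lngamma(x, W, f1, df1)
assert np.isclose(nn1_gE(x, W, f1), np.dot(x, lng))          # consistency, Eq. (2)

# Logarithmic activation recovers Wilson's model (1), (4) with W_ij = Lambda_ij
lng_wilson = -np.log(W @ x) + 1.0 - (x / (W @ x)) @ W
assert np.allclose(nn1_lngamma(x, W, np.log, lambda u: 1.0 / u), lng_wilson)
```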

2.4. Generalized two-level degree-1 homogeneous NN model

Let f_0(u) be an arbitrary monotonic function such that

$$f_0(0) = 0. \qquad (16)$$

The centering condition (16) establishes the connection of the two-level NN model with the NRTL model. The choice of the activation function adds an extra degree of freedom to the model. The general requirement is that the activation function f_0(u) should be nonlinear, easily computable, and with a good fitting performance.

Along with the weights W_ij, let us introduce a second set of weights V_ij, such that

$$V_{ii} = 0. \qquad (17)$$

We put

$$y = \sum_{i=1}^{m} x_i\, f_0\!\left(\frac{\sum_{j=1}^{m} V_{ij} x_j}{\sum_{j=1}^{m} W_{ij} x_j}\right), \qquad (18)$$

which, apart from the different choice of the centering condition for the activation function (cf. Eqs. (11) and (16)), differs from formula (13) by generalizing the numerator of the argument of the activation function as well as by using weights of two levels.

It is readily seen that, by definition, the right-hand side of Eq. (18) is a homogeneous function of degree one. Formula (18) represents the generalized ANN model shown in Fig. 3. The corresponding neural network contains one hidden layer of generalized neurons, each of which receives inputs via two channels of weighted connections (see Figs. 3 and 4).

Fig. 3. Generalized degree-1 homogeneous NN model for mixture property modeling. This scheme illustrates the process of calculation in accordance with formula (18).

Fig. 4. Schematic of the generalized neuron with two kinds of input connections with weights V_ij and W_ij. The neuron output is obtained by application of the activation function f_0 to the ratio of the corresponding weighted sums of inputs (see formula (18)).

If formula (18) is applied for approximating the relative molar excess Gibbs energy g^E/(RT), then the activity coefficients will be given by

$$\ln\gamma_k = f_0\!\left(\frac{\sum_{j=1}^{m} V_{kj} x_j}{\sum_{j=1}^{m} W_{kj} x_j}\right) + \sum_{i=1}^{m} x_i\, f_0'\!\left(\frac{\sum_{j=1}^{m} V_{ij} x_j}{\sum_{j=1}^{m} W_{ij} x_j}\right) \left\{\frac{V_{ik}}{\sum_{j=1}^{m} W_{ij} x_j} - \frac{W_{ik} \sum_{j=1}^{m} V_{ij} x_j}{\Bigl(\sum_{j=1}^{m} W_{ij} x_j\Bigr)^{2}}\right\}. \qquad (19)$$

The process of derivation of formula (19) from Eq. (18) follows that of formula (15) from Eq. (13).

Note that, in view of (16) and (17), the limit condition (7) is satisfied automatically.

It is interesting that for a special choice of the parameters W_ij and V_ij, the ANN model (18) reduces to the NRTL model [18] in the case f_0(u) = u.
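As an editorial illustration (not from the original article), the sketch below implements Eqs. (18) and (19) and verifies numerically that, with f_0(u) = u, the model reproduces the NRTL expression (see Eq. (23) in Section 3.3) under the identification V_ij = τ_ji G_ji and W_ij = G_ji. This mapping is one possible choice consistent with conditions (12) and (17), since G_ii = 1 and τ_ii = 0; the paper itself only states that a special choice of W_ij and V_ij gives the reduction. All parameter values are made up.

```python
import numpy as np

def nn2_gE(x, V, W, f0):
    """Two-level NN model, Eq. (18)."""
    return np.dot(x, f0((V @ x) / (W @ x)))

def nn2_lngamma(x, V, W, f0, df0):
    """Activity coefficients from Eq. (19); df0 is the derivative of f0."""
    A, S = V @ x, W @ x                        # A_i = sum_j V_ij x_j, S_i = sum_j W_ij x_j
    g = x * df0(A / S)
    corr = (g / S) @ V - (g * A / S ** 2) @ W  # sum_i x_i f0'(.)( V_ik/S_i - W_ik A_i/S_i^2 )
    return f0(A / S) + corr

# Hypothetical ternary NRTL parameters (tau_ii = 0, alpha symmetric)
tau = np.array([[0.00, 0.26, 0.48],
                [0.27, 0.00, 0.76],
                [1.49, 2.31, 0.00]])
alpha = np.array([[0.00, 0.30, 0.25],
                  [0.30, 0.00, 0.47],
                  [0.25, 0.47, 0.00]])
G = np.exp(-alpha * tau)

# Reduction to NRTL for f0(u) = u with V_ij = tau_ji G_ji, W_ij = G_ji
V, W = (tau * G).T, G.T
x = np.array([0.2, 0.3, 0.5])
m = 3
gE_nrtl = sum(x[i] * sum(x[j] * tau[j, i] * G[j, i] for j in range(m))
              / sum(x[k] * G[k, i] for k in range(m)) for i in range(m))
assert np.isclose(nn2_gE(x, V, W, lambda u: u), gE_nrtl)

# Consistency with Eq. (2) for the tanh activation used later in the paper
lng = nn2_lngamma(x, V, W, np.tanh, lambda u: 1.0 - np.tanh(u) ** 2)
assert np.isclose(nn2_gE(x, V, W, np.tanh), np.dot(x, lng))
```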
3. Results

3.1. Case of binary systems

The best way to evaluate the performance of a model is to test it on available literature data and compare the fitting results with the predictions of existing models. For this we use the one-level NN model (13) with the activation function f_1(u) defined by Eq. (10) and its derivative f_1'(u) = 1 - tanh²(u - 1). The binary VLE literature data selected for this evaluation include systems with both positive and negative deviations from ideal behavior in the liquid phase. Approximations of ideal behavior in the gas phase were utilized for the calculation of activity coefficients. Parameters of the ANN and Wilson models were calculated from the excess Gibbs energy values (2) using a nonlinear least-squares solver implemented in MATLAB. The results of the calculations are presented in Table 1. To characterize the performances of the two models, we used the mean absolute deviation in g^E and also the mean absolute percentage error (MAPE) defined as

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left|\frac{Y_i - \hat{Y}_i}{Y_i}\right|,$$

where Y_i is the actual value and Ŷ_i is the predicted value. It should be noted here that the mean absolute deviation is a more robust characteristic for this type of data, since in some cases the MAPE can be less reliable due to unreasonably high errors arising from points where the g^E values are very close to zero.

Table 1. Comparison of the one-level NN regression model (13) with the Wilson model for several binary systems (columns 4–7: Wilson model; columns 8–11: NN model).

System | Ref. | T (°C) | Λ12 | Λ21 | Error (%) | Δg^E (J/mol) | W12 | W21 | Error (%) | Δg^E (J/mol)
Acetone-Chloroform | [20] | 55 | 1.0969 | 1.8052 | 1.8049 | 7.0319 | 1.3210 | 1.6745 | 1.473 | 6.1125
Acetone-Hexane | [21] | 55 | 0.3980 | 0.3351 | 2.2554 | 15.0724 | 0.4774 | 0.3629 | 1.9657 | 11.18
Toluene-Dioxane | [22] | 100 | 1.1651 | 0.6517 | 6.0137 | 6.8682 | 1.0401 | 0.7657 | 6.1928 | 7.0385
Toluene-Isobutanol | [22] | 100 | 0.5985 | 0.4694 | 1.3988 | 7.283 | 0.6221 | 0.5198 | 1.5409 | 8.0809
Nitromethane-Tetrachloromethane | [23] | 45 | 0.1477 | 0.2493 | 0.2455 | 1.5114 | 0.2675 | 0.2452 | 9.7108 | 57.4062
Tetrahydrofuran-Propanol | [24] | 25 | 1.2652 | 0.4031 | 11.5507 | 9.4097 | 1.2505 | 0.4724 | 10.5295 | 9.5073
Benzene-Isooctane | [25] | 45 | 1.1967 | 0.3780 | 0.77721 | 1.3395 | 1.0780 | 0.5032 | 2.8067 | 3.7381
Propanol-Propylacetate | [26] | 40 | 0.5782 | 0.5251 | 2.2208 | 7.1915 | 0.6048 | 0.5685 | 2.7768 | 8.2608
Chloroform-Hexane | [20] | 55 | 1.1642 | 0.4867 | 2.5815 | 4.895 | 1.0647 | 0.5926 | 1.7216 | 3.6074
Methyl Acetate-Benzene | [27] | 50 | 0.9654 | 0.6899 | 8.7575 | 9.6196 | 0.9099 | 0.7547 | 8.727 | 9.6215
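The paper reports that the model parameters were fitted with a nonlinear least-squares solver in MATLAB. As a hedged, purely illustrative analogue (the actual data, solver settings, and error definitions of the study are not reproduced here), one could fit the off-diagonal W_ij of model (14) to tabulated g^E/RT data with SciPy and report the MAPE as defined above:

```python
import numpy as np
from scipy.optimize import least_squares

def nn1_gE(x, W):
    """Eq. (14) with f1(u) = tanh(u - 1) for a single composition vector x."""
    return np.dot(x, np.tanh(x.sum() / (W @ x) - 1.0))

def fit_weights(X, y):
    """Fit off-diagonal W_ij (W_ii = 1) to data (X: compositions, y: g^E/RT values)."""
    m = X.shape[1]
    off = ~np.eye(m, dtype=bool)

    def unpack(p):
        W = np.eye(m)
        W[off] = p
        return W

    def residuals(p):
        W = unpack(p)
        return np.array([nn1_gE(x, W) - yi for x, yi in zip(X, y)])

    sol = least_squares(residuals, x0=np.ones(m * (m - 1)))   # m(m-1) free parameters
    return unpack(sol.x)

def mape(y_true, y_pred):
    """Mean absolute percentage error, as defined in Section 3.1."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Synthetic binary example (illustrative only, not the literature data of Table 1)
x1 = np.linspace(0.05, 0.95, 19)
X = np.column_stack([x1, 1.0 - x1])
y = 0.8 * x1 * (1.0 - x1)            # a made-up positive-deviation g^E/RT curve
W_fit = fit_weights(X, y)
print(W_fit, mape(y, [nn1_gE(x, W_fit) for x in X]))
```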
The results (see Table 1) show that in most cases the ANN and Wilson models show similar performances. In several cases, the ANN model provides a better fit of the experimental data compared to the Wilson model. An interesting exception is the nitromethane-tetrachloromethane system, where the ANN model performs substantially worse. The latter system shows very high positive deviations from ideal behavior (with high values of the activity coefficients), and the shape of the selected activation function seems to be not optimal for describing these values. A detailed discussion of the optimal choice of activation function for fitting different types of experimental data is, however, out of the scope of this article. Observe that this effect can be explained by the asymmetry in the discrepancy between the activation function f_1(u) and the logarithmic function used in the Wilson model (see Fig. 5a).

Fig. 5. Activation functions f_1(u) = tanh(u - 1) and f_0(u) = tanh u for the one-level (a) and two-level (b) NN models, respectively. The two-dot-dash line indicates the common tangent line. The subscripts 0 and 1 of the activation function refer to the centering condition where the function has null value.

3.2. Case of a ternary system

Based on the binary data [28], the following Wilson coefficients have been previously obtained for the acetic acid–n-propyl alcohol–water system at 313.15 K:

$$\Lambda_{11} = 1, \quad \Lambda_{12} = 0.7046, \quad \Lambda_{13} = 0.33892,$$
$$\Lambda_{21} = 1.41926, \quad \Lambda_{22} = 1, \quad \Lambda_{23} = 0.03447, \qquad (20)$$
$$\Lambda_{31} = 1.10338, \quad \Lambda_{32} = 0.56097, \quad \Lambda_{33} = 1.$$

These data provide a reference point for testing the newly designed approximations.

In our case study, we consider the ternary acetic acid–n-propyl alcohol–water system (at 313.15 K) and make use of a set of 21 examples from Ref. [29] that are uniformly distributed over the entire interior of the input domain (see red points in Fig. 6). Because, by their construction, the ANN-type models satisfy the zero limiting condition (7), the vertices of the input polyhedron (see blue points in Fig. 6) are automatically included into the training data set with nil contribution to the training error.

Fig. 6. Ternary composition diagram.

The following Focke coefficients have been obtained for the acetic acid–n-propyl alcohol–water system at 313.15 K:

$$a_{11} = 0, \quad a_{12} = 0.04764, \quad a_{13} = 0.21113,$$
$$a_{21} = 0.18613, \quad a_{22} = 0, \quad a_{23} = 0.45959, \qquad (21)$$
$$a_{31} = 0.6983, \quad a_{32} = 1.48588, \quad a_{33} = 0.$$

The six non-zero coefficients a_ij written out above were evaluated by fitting the Focke model (6) to the experimental data [29].
Further, the following ANN weights have been obtained for the acetic acid–n-propyl alcohol–water system at 313.15 K:

$$W_{11} = 1, \quad W_{12} = 0.43649, \quad W_{13} = 0.38015,$$
$$W_{21} = 2.28382, \quad W_{22} = 1, \quad W_{23} = 0.02779, \qquad (22)$$
$$W_{31} = 0.94631, \quad W_{32} = 0.55159, \quad W_{33} = 1.$$

An unconstrained least-squares optimization technique was used to minimize the error of the ANN model (13).

In all cases, the problem of determining the model parameters was solved as a minimization problem for the mean-squared error function using the PTC Mathcad software. For this purpose, we made use of the internal Mathcad Minimize function, which utilizes the conjugate gradient method. The numerical experiments were carried out several times in order to cope with the possibility of local optima. Initial guesses of the sought model parameters, which should be set before the optimization procedure, were specified in a random fashion the first time, and then the optimization results were used as initial guesses for the subsequent optimizations until the stabilization of the iteration results. The whole optimization iteration process was repeated three to four times in order to ensure that the final optimal solution does not depend on the initial guess values.
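The multi-start procedure described above (random initial guess, re-optimization from the previous optimum, repetition until the result stabilizes) can be sketched in Python as follows. This is an editorial paraphrase of the workflow, not the authors' Mathcad worksheet, and the helper `mse_of_params` stands for any of the model error functions discussed in this section.

```python
import numpy as np
from scipy.optimize import minimize

def multistart_fit(mse_of_params, n_params, n_restarts=4, rng=None):
    """Repeatedly minimize the mean-squared error, reusing the previous optimum
    as the next initial guess, and keep the best solution found."""
    rng = np.random.default_rng() if rng is None else rng
    best = None
    p0 = rng.uniform(-1.0, 1.0, n_params)                 # random first guess
    for _ in range(n_restarts):
        res = minimize(mse_of_params, p0, method="CG")    # conjugate gradient, as in Mathcad
        if best is None or res.fun < best.fun:
            best = res
        p0 = res.x                                        # warm start for the next round
    return best

# Toy usage: recover parameters of a quadratic bowl (stand-in for a model MSE)
target = np.array([0.5, -0.2, 1.3])
best = multistart_fit(lambda p: np.sum((p - target) ** 2), n_params=3)
print(best.x)   # should be close to `target`
```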
To assess the performance of the Wilson regression model as well as the ANN models, we utilize the mean absolute percentage error (MAPE). The results of the model predictions are shown in Table 2. Observe that the MAPE is a generally accepted measure, in percent, of prediction accuracy and provides an intuitive way of judging the extent of error. At the same time, however, the MAPE is highly influenced by a few outlying errors and has no upper bound, while its minimum value is zero. Note that, to deal with the limitations of the MAPE, other statistical measures have been designed [30].

Table 2. Mean absolute percentage error of the regression models.

Model | g^E | γ1 | γ2 | γ3
Wilson | 10.327 | 5.287 | 5.23 | 4.073
Focke | 11.471 | N/A | N/A | N/A
ANN | 7.965 | 7.721 | 4.782 | 4.199

The absolute relative percentage errors of the three approximations for the excess Gibbs energy are illustrated in Fig. 7. It appears that the ANN model (13) outperforms the Focke model (6).

Fig. 7. Absolute relative percentage errors of the regression models for the excess Gibbs energy.

It is interesting to observe that the accuracy of the ANN predictions for the activity coefficients γ_k (see Fig. 8a and b) is kept in the same range as that achieved for the excess Gibbs energy g^E. This can be regarded as a sign of the model robustness.

Fig. 8. Absolute relative percentage errors of the regression models for the activity coefficients.

3.3. Comparison of the two-level NN model with the NRTL model

In this section, we consider the comparison with the NRTL (Non-Random Two-Liquid) model

$$\frac{g^E}{RT} = \sum_{i=1}^{m} x_i\, \frac{\sum_{j=1}^{m} x_j \tau_{ji} G_{ji}}{\sum_{k=1}^{m} x_k G_{ki}}, \qquad (23)$$

where τ_ij are the dimensionless interaction parameters, G_ij = exp(-α_ij τ_ij), and α_ij are the so-called non-randomness parameters. We note that α_ij = α_ji, α_ii = 0, and τ_ii = 0 (no summation on repeated indices).
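For completeness, a short editorial Python sketch evaluates the NRTL expression (23) and obtains the corresponding activity coefficients directly from definition (3) by numerical differentiation of G^E = n g^E, which illustrates how a g^E model of this kind generates its activity coefficients. The parameter values are made up and are not the fitted set (24)-(25).

```python
import numpy as np

def nrtl_gE(x, tau, alpha):
    """NRTL molar excess Gibbs energy g^E/RT, Eq. (23)."""
    G = np.exp(-alpha * tau)
    return np.dot(x, ((tau * G).T @ x) / (G.T @ x))

def lngamma_numeric(gE_of_x, n, h=1e-6):
    """ln(gamma_k) from Eq. (3): derivative of G^E/RT = n_total * gE(n / n_total)
    with respect to n_k, by central finite differences (model-agnostic)."""
    def GE(nn):
        return nn.sum() * gE_of_x(nn / nn.sum())
    lng = np.empty_like(n)
    for k in range(n.size):
        dn = np.zeros_like(n)
        dn[k] = h
        lng[k] = (GE(n + dn) - GE(n - dn)) / (2.0 * h)
    return lng

# Hypothetical ternary NRTL parameters (illustration only)
tau = np.array([[0.00, 0.26, 0.48],
                [0.27, 0.00, 0.76],
                [1.49, 2.31, 0.00]])
alpha = np.array([[0.00, 0.30, 0.25],
                  [0.30, 0.00, 0.47],
                  [0.25, 0.47, 0.00]])

x = np.array([0.2, 0.3, 0.5])
lng = lngamma_numeric(lambda xx: nrtl_gE(xx, tau, alpha), x.copy())
assert np.isclose(nrtl_gE(x, tau, alpha), np.dot(x, lng))    # consistency, Eq. (2)
```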
Based on the binary data [28], the following NRTL coefficients have been previously obtained for the acetic acid–n-propyl alcohol–water system at 313.15 K:

$$\tau_{11} = 0, \quad \tau_{12} = 0.261694, \quad \tau_{13} = 0.479356,$$
$$\tau_{21} = 0.274028, \quad \tau_{22} = 0, \quad \tau_{23} = 0.76377, \qquad (24)$$
$$\tau_{31} = 1.48644, \quad \tau_{32} = 2.31422, \quad \tau_{33} = 0,$$

$$\alpha_{12} = 0.3, \quad \alpha_{13} = 0.25, \quad \alpha_{23} = 0.47. \qquad (25)$$

In our calculations we put f_0(u) = tanh u, so that f_0'(u) = 1 - tanh² u. The motivation for this choice is given by the asymptotic formula tanh u = u + o(u), so that the difference between the two-level NN model and the NRTL model is minimized (see Fig. 5b).

Based on the least-squares minimization procedure, the following ANN weights have been obtained for the same system:

$$W_{11} = 1, \quad W_{12} = 0.1791, \quad W_{13} = 0.19602,$$
$$W_{21} = 1.19937, \quad W_{22} = 1, \quad W_{23} = 0.14391, \qquad (26)$$
$$W_{31} = 0.29698, \quad W_{32} = 0.09722, \quad W_{33} = 1,$$

$$V_{11} = 0, \quad V_{12} = 0.13818, \quad V_{13} = 0.10375,$$
$$V_{21} = 5.069 \times 10^{-4}, \quad V_{22} = 0, \quad V_{23} = 0.83639, \qquad (27)$$
$$V_{31} = 0.32063, \quad V_{32} = 0.10109, \quad V_{33} = 0.$$

For the two-level NN model, we obtain MAPE = 5.775%, which is a better result than that produced by the one-level NN model (see Table 2). Evidently, this is the result of the larger number of fitting parameters used in the two-level NN model. At the same time, the NRTL model, which utilizes the dimensionless parameters (24) and (25), yields MAPE = 10.047%, which is a little better than the mean absolute percentage error of the Wilson model (see Table 2).

Still, it should be emphasized that the NRTL model has 9 degrees of freedom (six parameters τ_ij and three different parameters α_ij), whereas formula (18), in view of (12) and (17), employs six parameters W_ij and six parameters V_ij (see (26) and (27)). Therefore, it makes sense to additionally require that V_ij = V_ji, thereby reducing the number of fitting parameters V_ij to three. The corresponding optimal values of the fitting parameters are
$$W_{11} = 1, \quad W_{12} = 0.12296, \quad W_{13} = 3.682 \times 10^{-4},$$
$$W_{21} = 1.30562, \quad W_{22} = 1, \quad W_{23} = 0.26197, \qquad (28)$$
$$W_{31} = 0.21916, \quad W_{32} = 0.70728, \quad W_{33} = 1,$$

$$V_{12} = 7.243 \times 10^{-3}, \quad V_{13} = 0.21483, \quad V_{23} = 0.33941. \qquad (29)$$

It is interesting that the constrained two-level NN model (with 9 fitting parameters (28) and (29)) only slightly reduces its accuracy, which is shown by MAPE = 5.819%.

Nevertheless, this example does not imply that the NRTL model is worse compared with the two-level NN model, because the NRTL data (24) and (25) were collected from the binary subsystems and partially taken from different sources.

4. Discussion and conclusions

The logic of our analysis of the outlined ANN-based models in application to describing the excess Gibbs energy is as follows. It is well known that the Wilson coefficients Λ_ij can be determined solely from binary mixture data, which is much simpler and apparently more precise. However, the concept of neural networks requires that the training data should span the entire domain of the input parameters. In the case under consideration, the inputs x_i are assumed to be non-negative, that is x_i ≥ 0, and are subject to the constraint x_1 + x_2 + ... + x_m = 1, which describes a hyperplane in the m-dimensional Euclidean space. Therefore, the input domain is a polyhedron, whose edges represent the binary mixtures. That is why, generally speaking, the availability of binary data alone is not enough to train neural network models.

Since Focke's model (6) as well as our ANN-type model (13) contain the same number of parameters as Wilson's model, their comparison will reveal the model performance efficiency. The case of the ANN-type model (18), which generalizes the NRTL model, is not included into the computational analysis, since the latter model has more degrees of freedom.

Apparently, the fit of the Focke model can be improved by a more suitable choice of the activation function in (5). The selection of the logarithmic function reduces the general Focke model (5) to the special Focke model (6), which after the application of the logarithmic operation recovers Wilson's model (1). However, it should be underlined that, since the right-hand side of Eq. (6) is always positive, the special Focke model (6), generally speaking, is not applicable to modeling the excess Gibbs energy, since the latter may take negative values. Nevertheless, since all entries for g^E in the data set for the ternary system are positive, the use of the Focke regression model (6) is not compromised.

The crucial point is to observe that the structure of the degree-1 homogeneous neural network model (13) mimics that of the Wilson model (1). Therefore, the question arises regarding the efficiency of training this ANN model based only on binary mixture data, but this requires a separate study using a complete data set, which covers not only the edges of the ternary diagram, but also the entire interior (for testing and validation purposes).

When we discuss the ANN models and compare them against the Wilson model (1), (20), it should be emphasized that they are not competing on equal terms: the parameters of the Wilson model (20) have been obtained from binary-component data, while the ANN weights (21) and (22) have been evaluated from ternary data. At the same time, the comparison is performed on the three-component data, that is, in the interpolation domain for the Wilson model, whereas for the Focke and ANN models we obtain their parameters on the same data set as that used for the comparison. This circumstance is of paramount importance for any claim that may subsequently be drawn from this specific comparison.

Therefore, it should be made clear that our performance analysis (see Fig. 8) does not lead to any conclusion that the tanh-based ANN model performs better than the Wilson model (or otherwise), and further analysis is needed to fully assess the performance potential of the generalized degree-1 homogeneous NN model. However, the main advantage of the latter ANN model is that it generalizes the Wilson model, and, generally speaking, a better fit can be achieved by a suitable choice of the activation function. It is also of interest to note the apparent similarity in the structures of the NN models (13) and (18), if we abstract from the centering conditions (11) and (16). Indeed, when V_ij = δ_ij (cf. Eq. (17)), where δ_ij is the Kronecker delta, and f_0(u) = ln u, the generalized two-level ANN model reduces to the Wilson model. Thus, the ANN approach presented here includes the Wilson and NRTL models as particular cases and provides a new generalized modeling platform for the description of VLE.

Observe that, though the presented ANN modeling approach is illustrated on a single ternary system and a few binary systems, the considered examples reveal the underlying principles of practical application of the developed ANN models. A further study on a larger set of complete data is required to find activation functions which are in a sense optimal. In particular, what is really needed is a study of the model sensitivities to the choice of the activation function. However, this question is more a mathematical question than a practical one, and its consideration falls outside the limits of this paper.

The main novelty of this paper lies in exploiting the homogeneity property of the introduced neural network models, which allowed us to derive corresponding approximations for the
activity coefficients.

Acknowledgment

IA is grateful to the Biofilms center for the hospitality during his stay at Malmö University, where this research was carried out.

Nomenclature

a_ij   adjustable parameters in the Focke model
α_ij   non-randomness parameters in the NRTL model
f, f_0, f_1   transfer (activation) functions
G^E   excess Gibbs energy
g^E   molar excess Gibbs energy
γ_k   activity coefficient of the k-th component
Λ_ij   adjustable parameters in the Wilson model
m   number of components
n   total number of moles of all components
n_i   number of moles of the i-th component
p   pressure
R   ideal gas constant
T   temperature
τ_ij   dimensionless interaction parameters in the NRTL model
V_ij   weights in the neural network model
W_ij   weights in the neural network model
x_i   inputs of the neural network model
y   output of the neural network model

Appendix A. Relation between the models of Focke and Wilson

Taking exponents of both sides of Eq. (1), we get

$$\exp\Bigl(-\frac{g^E}{RT}\Bigr) = \exp\Bigl(\sum_{i=1}^{m} x_i \ln\Bigl(\sum_{j=1}^{m} x_j \Lambda_{ij}\Bigr)\Bigr),$$

from where it follows that

$$\exp\Bigl(-\frac{g^E}{RT}\Bigr) = \prod_{i=1}^{m} \exp\Bigl(x_i \ln\Bigl(\sum_{j=1}^{m} x_j \Lambda_{ij}\Bigr)\Bigr),$$

or, which is the same,

$$\exp\Bigl(-\frac{g^E}{RT}\Bigr) = \prod_{i=1}^{m} \Bigl(\sum_{j=1}^{m} x_j \Lambda_{ij}\Bigr)^{x_i}.$$

Now, it is readily seen that the latter equation coincides with Eq. (6), provided that a_ij = Λ_ij and y = exp(-g^E/RT).

However, it should be emphasized that we directly apply the Focke model (6) for modeling the excess Gibbs energy g^E by imposing the zero constraint (7). That is why Eqs. (6) and (7) provide a model which is different from that of Wilson [17].
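A two-line numerical check of the Appendix relation may be useful (an editorial addition; the Wilson parameters below are the ternary set (20) reported in Section 3.2, and the composition is arbitrary): with a_ij = Λ_ij (diagonal included, i.e., before imposing constraint (7)), the product form (6) reproduces exp(-g^E/RT) computed from Wilson's formula (1).

```python
import numpy as np

# Wilson parameters (20) for acetic acid - n-propyl alcohol - water at 313.15 K
Lam = np.array([[1.0,     0.7046,  0.33892],
                [1.41926, 1.0,     0.03447],
                [1.10338, 0.56097, 1.0]])

x = np.array([0.25, 0.35, 0.40])          # an arbitrary interior composition

gE_wilson = -np.dot(x, np.log(Lam @ x))   # Eq. (1)
focke_form = np.prod((Lam @ x) ** x)      # product form (6) with a_ij = Lambda_ij

assert np.isclose(np.exp(-gE_wilson), focke_form)
```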
References

[1] R. Orye, J. Prausnitz, Multicomponent equilibria – the Wilson equation, Ind. Eng. Chem. 57 (5) (1965) 18–26.
[2] C.A. Faúndez, F.A. Quiero, J.O. Valderrama, Phase equilibrium modeling in ethanol + congener mixtures using an artificial neural network, Fluid Phase Equilib. 292 (2010) 29–35.
[3] M. Dehghani, H. Modarress, A. Bakhshi, Modeling and prediction of activity coefficient ratio of electrolytes in aqueous electrolyte solution containing amino acids using artificial neural network, Fluid Phase Equilib. 244 (2) (2006) 153–159.
[4] R. Beigzadeh, M. Rahimi, S. Shabanian, Developing a feed forward neural network multilayer model for prediction of binary diffusion coefficient in liquids, Fluid Phase Equilib. 331 (2012) 48–57.
[5] M. Moghadam, S. Asgharzadeh, On the application of artificial neural network for modeling liquid-liquid equilibrium, J. Mol. Liq. 220 (2016) 339–345.
[6] R. Beigzadeh, M. Rahimi, S.R. Shabanian, Developing a feed forward neural network multilayer model for prediction of binary diffusion coefficient in liquids, Fluid Phase Equilib. 331 (2012) 48–57.
[7] J.P. Poort, M. Ramdin, J. van Kranendonk, T.J. Vlugt, Solving vapor-liquid flash problems using artificial neural networks, Fluid Phase Equilib. 490 (2019) 39–47.
[8] H.A. Behrooz, R.B. Boozarjomehry, Prediction of limiting activity coefficients for binary vapor-liquid equilibrium using neural networks, Fluid Phase Equilib. 433 (2017) 174–183.
[9] S. Mohanty, Estimation of vapour liquid equilibria of binary systems, carbon dioxide–ethyl caproate, ethyl caprylate and ethyl caprate using artificial neural networks, Fluid Phase Equilib. 235 (1) (2005) 92–98.
[10] V. Nguyen, R. Tan, Y. Brondial, T. Fuchino, Prediction of vapor–liquid equilibrium data for ternary systems using artificial neural networks, Fluid Phase Equilib. 254 (1–2) (2007) 188–197.
[11] H. Ghanadzadeh, M. Ganji, S. Fallahi, Mathematical model of liquid–liquid equilibrium for a ternary system using the GMDH-type neural network and genetic algorithm, Appl. Math. Model. 36 (9) (2012) 4096–4105.
[12] A. Özmen, Correlation of ternary liquid–liquid equilibrium data using neural network-based activity coefficient model, Neural Comput. Appl. 24 (2) (2014) 339–346.
[13] M. Hakim, G. Behmardikalantari, H. Najafabadi, G. Pazuki, A. Vosoughi, M. Vossoughi, Prediction of liquid–liquid equilibrium behavior for aliphatic + aromatic + ionic liquid using two different neural network-based models, Fluid Phase Equilib. 394 (2015) 140–147.
[14] S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice-Hall, New Jersey, 1999.
[15] W. Focke, Mixture models based on neural network averaging, Neural Comput. 18 (1) (2006) 1–9.
[16] R. Petersen, A. Fredenslund, P. Rasmussen, Artificial neural networks as a predictive tool for vapor-liquid equilibrium, Comput. Chem. Eng. 18 (1994) S63–S67.
[17] G. Wilson, Vapor-liquid equilibrium. XI. A new expression for the excess free energy of mixing, J. Am. Chem. Soc. 86 (2) (1964) 127–130.
[18] H. Renon, J. Prausnitz, Local compositions in thermodynamic excess functions for liquid mixtures, AIChE J. 14 (1) (1968) 135–144.
[19] H. Reynel-Avila, A. Bonilla-Petriciolet, J. Tapia-Picazo, An artificial neural network-based NRTL model for simulating liquid-liquid equilibria of systems present in biofuels production, Fluid Phase Equilib. 483 (2019) 153–164.
[20] L. Kudryavtseva, M. Susarev, Liquid-vapor equilibrium in chloroform-hexane and acetone-chloroform systems, Russ. J. Appl. Chem. 36 (1963) 1231–1237.
[21] L. Kudryavtseva, M. Susarev, Liquid-vapor equilibriums in the systems acetone-hexane and hexane-ethyl alcohol at 35, 45, and 55 °C and 760 mm Hg, Russ. J. Appl. Chem. 36 (1963) 1471–1477.
[22] M. Susarev, A. Toikka, Liquid-vapor equilibrium in the toluene-dioxane-isobutyl alcohol system at 80 and 100 degrees, Russ. J. Appl. Chem. 46 (11) (1973) 2461–2464.
[23] I. Brown, F. Smith, Liquid-vapour equilibria. VII. The systems nitromethane + benzene and nitromethane + carbon tetrachloride at 45 °C, Aust. J. Chem. 8 (4) (1955) 501–505.
[24] J.A. Salas, E.L. Arancibia, M. Katz, Excess molar volumes and isothermal vapor-liquid equilibria in the tetrahydrofuran with propan-1-ol and propan-2-ol systems at 298.15 K, Can. J. Chem. 75 (2) (1997) 207–211.
[25] S. Weissman, S.E. Wood, Vapor-liquid equilibrium of benzene-2,2,4-trimethylpentane mixtures, J. Chem. Phys. 32 (4) (1960) 1153–1160.
[26] V. Sokolov, N. Markuzin, Experimental data on vapour–liquid equilibrium and chemical reaction in the system acetic acid–n-propanol–water–propyl acetate, 35–82, Soviet Institute of Science Information, 1982, pp. 1–12.
[27] I. Nagata, H. Hayashida, Vapor-liquid equilibrium data for the ternary systems: methyl acetate-2-propanol-benzene and methyl acetate-chloroform-benzene, J. Chem. Eng. Jpn. 3 (2) (1970) 161–166.
[28] V. Kocherbitov, A. Toikka, Liquid-vapor and liquid-liquid phase equilibria in the system acetic acid-n-propanol-water-n-propyl acetate at 313.15 K, Russ. J. Appl. Chem. 72 (10) (1999) 1706–1708.
[29] V. Kocherbitov, A. Toikka, Liquid-vapor equilibrium in the acetic acid–n-propyl alcohol–water system at 313.15 K, Russ. J. Appl. Chem. 70 (11) (1997) 1777–1781.
[30] J. Tayman, D.A. Swanson, On the validity of MAPE as a measure of population forecast accuracy, Popul. Res. Policy Rev. 18 (4) (1999) 299–322.
