
Journal of CO₂ Utilization 33 (2019) 83–95
Predicting solubility of CO2 in brine by advanced machine learning systems: Application to carbon capture and sequestration

Nait Amar Menad (a), Abdolhossein Hemmati-Sarapardeh (b), Amir Varamesh (c), Shahaboddin Shamshirband (d,e,⁎)

(a) Département Etudes Thermodynamiques, Division Laboratoires, Sonatrach, Boumerdes, Algeria
(b) Department of Petroleum Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
(c) Department of Chemical & Petroleum Engineering, University of Calgary, Calgary, AB, T2N 1N4 Canada
(d) Department for Management of Science and Technology Development, Ton Duc Thang University, Ho Chi Minh City, Viet Nam
(e) Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Viet Nam

ARTICLE INFO

Keywords: CO2 sequestration; CO2 solubility in brine; MLP; RBFNN; Metaheuristic algorithms

ABSTRACT

Carbon dioxide (CO2) capture and sequestration in saline aquifers has become a key focus as an effective way to reduce CO2 in the atmosphere. The solubility of CO2 in brine plays a vital role in monitoring CO2 sequestration. In this study, CO2 solubility in brine was modeled as a function of NaCl molality, pressure and temperature using a multilayer perceptron (MLP) and a radial basis function neural network (RBFNN). The Levenberg-Marquardt (LM) algorithm was implemented to optimize the MLP model, while a genetic algorithm (GA), particle swarm optimization (PSO) and artificial bee colony (ABC) were applied to optimize the RBFNN model. To this end, a wide experimental databank of 570 data sets gathered from the literature was used to implement the proposed models. Graphical and statistical assessment criteria were used to investigate the performance of these models. The results revealed that all the proposed techniques are in excellent agreement with the experimental data. In addition, the performance analyses showed that the RBFNN-ABC model exhibits the highest accuracy in predicting CO2 solubility in brine compared with the other proposed smart approaches and the existing well-known models. The RBFNN-ABC model yields a root mean square error (RMSE) of 0.0289 and an R2 of 0.9967. Finally, the validity of the RBFNN-ABC model was confirmed and a small number of probably doubtful data points was detected.

1. Introduction

Carbon dioxide (CO2) is the main greenhouse gas (GHG). It results mainly from the consumption of fossil fuels, which are the primary energy source for industry and transport activities. Indisputably, the increase in CO2 emissions is related to sensitive environmental issues such as climate change and the increase of the global average surface temperature [1–5]. Therefore, in the last few decades there has been major interest in ways to bring down the CO2 level in the atmosphere, and one of these methods is to capture carbon and store it underground. This process is known as carbon capture and sequestration (CCS) [6].

CCS has proven to be a vital procedure for decreasing the CO2 levels in the atmosphere. Technically, carbon capture is carried out by means of cryogenic separation, adsorption/absorption and membrane separation [7], while one of the cutting-edge approaches to CO2 sequestration that is receiving much attention is to inject it into deep saline aquifers [8,9].

To manage the sequestration of CO2 in saline aquifers, an accurate representation of the brine and CO2 related parameters is required [10–12]. CO2 solubility in brine is considered the main parameter in CO2 sequestration in saline aquifers [10–12]; hence, its accurate prediction is necessary. Owing to the importance of CO2 solubility in brine, numerous theoretical and experimental studies have investigated a wide variety of topics related to this vital parameter. In short, Yan et al. [13] carried out experimental and modeling research on the solubility of CO2 in NaCl brine. In their study, CO2 solubilities were measured in 0, 1, and 5 m NaCl brines at pressures varying from 5 to 40 MPa and at temperatures of 323, 373, and 413 K. Wang et al. [14] measured the solubility of CO2 in synthesized brine samples at temperatures of 318–348 K and pressures of 80–110 bar. Mohammadian et al. [15] conducted experimental surveys of the solubility of CO2 in NaCl brines
⁎ Corresponding author at: Department for Management of Science and Technology Development, Ton Duc Thang University, Ho Chi Minh City, Viet Nam.
E-mail address: shahaboddin.shamshirband@tdtu.edu.vn (S. Shamshirband).
https://doi.org/10.1016/j.jcou.2019.05.009
Received 15 January 2019; Received in revised form 16 March 2019; Accepted 3 May 2019
2212-9820/ © 2019 Elsevier Ltd. All rights reserved.
Nomenclature

Acronyms
ABC      artificial bee colony
AI       artificial intelligence
ANN      artificial neural network
BP       back propagation
GA       genetic algorithm
LMA      Levenberg-Marquardt algorithm
MLP      multilayer perceptron
MSE      mean square error
P        pressure
PSO      particle swarm optimization
RBFNN    radial basis function neural network
RMSE     root mean squared error
R2       coefficient of determination
T        temperature

Variables
xj       position of jth particle or bee
vj       velocity of jth particle
c1, c2   learning rates
rand     random number
d        dimension for search
Spred    predicted solubility
Sexp     experimental solubility

Subscripts
j        particle j
Min      minimum
Max      maximum
t        iteration

in low salinity conditions (0–15 000 ppm) with pressures up to 25 MPa at temperatures from 60 to 100 °C. Mosavat et al. [16] reported measurements of CO2 solubility in water, brine, and oil mixtures at pressures ranging from 0.7 to 10.3 MPa and temperatures between 21 and 40 °C. Zhao et al. [17] compared the solubility of CO2 in three kinds of brine, including natural and synthetic formation brines as well as synthetic NaCl + CaCl2, at temperatures from 323 to 423 K and pressures from 100 to 200 bar. Their experimental results indicated that the solubility of CO2 in synthetic formation brine can be consistently substituted by that in synthetic NaCl + CaCl2.

In addition to the experimental investigations, the past few decades have seen increasingly rapid advances in developing correlations and thermodynamic paradigms to estimate the solubility of CO2 in brine. In brief, the well-known thermodynamic models established in the past century are those of Li and Nghiem [18] and Zuo and Guo [19], which are based on the Peng–Robinson and Patel–Teja equations of state, respectively. However, besides the limitation of their applicability domains, these models fail to provide accurate values of CO2 solubility [11]. These shortcomings can be explained by the limited variety of data employed in developing these models. Recently, some other models have been introduced with the aim of predicting CO2 solubility in brine, such as those of Sørensen et al. [20], Portier and Rochelle [21] and Duan et al. [22]. The detailed overview reported in [23] revealed that the aforementioned thermodynamic models and the preexisting ones suffer from a lack of generalization, as they are valid only within limited ranges of temperature, pressure and NaCl molality. A new activity–fugacity phase equilibrium model that outperforms the previous ones was developed in [23]. Despite the performance shown by this model, it must be underlined that it can be applied only when the NaCl molality does not exceed 4.5 mol/kg. The other pragmatic model, which has the advantage of calculation simplicity, is the Bahadori et al. correlation [11]. Nevertheless, this correlation has not escaped criticism because of its restrictive NaCl molality conditions (it is only valid for mNaCl equal to 1, 2 or 4 mol/kg).

On the other hand, modeling studies in many engineering areas using artificial neural networks (ANNs), which are recognized as one of the principal categories of machine learning, have shown the high ability of this technique to provide more user-friendly, inexpensive, reliable and faster solutions to problems of any kind of complexity [24–30]. Multilayer perceptron (MLP) and radial basis function neural network (RBFNN) are the common ANN types [31]. In addition, there are different optimization techniques to improve the ANN learning process by finding appropriate values for the control parameters, such as the spread coefficient and the number of neurons for RBFNN, and the biases and weights for MLP. The Levenberg-Marquardt (LM) algorithm [25], the genetic algorithm (GA) [32,33], and smart nature-swarm algorithms such as particle swarm optimization (PSO) and artificial bee colony (ABC) [24,34] are optimization methods which have provided relevant results in such tasks.

In this study, several smart techniques are developed to serve as a rapid and accurate means of estimating CO2 solubility in brine. These techniques are based on MLP and RBFNN models, which are examined to predict the solubility of CO2 in brine in terms of NaCl molality (mNaCl), pressure (P), and temperature (T) using an extensive experimental databank. LMA is employed to train the MLP model, while three nature-inspired algorithms, namely PSO, GA and ABC, are employed to optimize the number of neurons as well as the spread coefficient of the RBFNN model. Moreover, to assess the consistency of the implemented paradigms, statistical criteria and graphical error analyses were considered, and the accuracy is compared against preexisting models, namely those of Mao et al. [23] and Bahadori et al. [11]. Furthermore, to examine the physical behavior of the best-established model, the variations of CO2 solubility in brine were plotted in terms of the employed variables under several conditions. Finally, the Leverage technique was used to evaluate the quality of the data used as well as to define the applicability realm of our best paradigm. The present paper differs from the previous studies in the literature in three respects: (1) the techniques applied to model the CO2 solubility in brine, in which MLP and RBFNN were introduced for this purpose and were shown to be robust; (2) the wide intervals of pressure (up to 1400 bar), temperature (up to 723.15 K) and NaCl molality (up to 6.14 mol/kg) considered in the development, which ensure an extensive applicability domain for the proposed models; and (3) besides the application of PSO and GA to select the RBFNN control parameters, ABC was implemented for this objective in this study, and it was found that ABC outperforms the well-known nature-inspired algorithms. The optimization mechanism of ABC, which is based on three different families of bees and so enriches the search for optimal parameters, is the principal reason for choosing ABC in this study. In addition, GA and PSO, being well-known metaheuristic algorithms, were selected for comparison with ABC. Fig. 1 exhibits the problem sketch presented in this study.

The outline of the paper is as follows. The data-driven techniques applied in the modeling, namely MLP and RBFNN, are highlighted in Section 2. Section 3 briefly presents the optimization techniques employed in the training of RBFNN and MLP. Section 4 describes the gathered data used in the implementation of the models. Section 5
discloses the followed methodology, while Section 6 provides and discusses our results. Section 7 delivers some brief conclusions.

Fig. 1. A schematic of the general sketch of the problem.

2. Modeling techniques

Artificial neural networks (ANNs) are light mathematical models for recognizing nonlinear relationships linking inputs to outputs in complex systems [35]. This class of machine learning was inspired by the human brain, mainly by adapting the concept of biological neurons [25]. Multilayer perceptron (MLP) and radial basis function neural networks (RBFNN) are the best-known and most used ANN types for modeling purposes in many complex engineering tasks. These two kinds of ANN differ mainly in the way they process information and in their structure [36].

2.1. Multilayer perceptron (MLP)

An MLP is composed of many neurons organized in three sorts of layers: input, hidden (one or more) and output layers. The role of the hidden layer is to identify the relationship between the system inputs and outputs by means of activation functions [37]. The numbers of neurons in the first and the last layers are equal to the numbers of inputs and outputs, respectively. The proper choice of the number of hidden layers and their corresponding numbers of neurons is a key aspect that affects the performance of the MLP model. Generally, one hidden layer is enough to model systems of moderate complexity, while more than one hidden layer is required for highly complex systems [38]. Trial and error is widely used to find a suitable number of hidden layers and their numbers of neurons. It consists of varying the number of neurons in each hidden layer until reaching a satisfactory
stopping criterion. The value generated by a hidden or output neuron is the sum of the outputs of the previous layer multiplied by their weights, plus a bias. "Sigmoid" and "Tanh" are the frequently employed activation functions for hidden layers, while the linear function (purelin) is mostly used for the output layer [39].

The back-propagation (BP) training procedure is used to obtain an appropriate set of MLP weights and biases. The Levenberg-Marquardt (LM) algorithm is the most employed BP technique for optimizing MLP weights and biases thanks to its mathematical formulation and the results it provides [31]. Therefore, LM is applied in the present work for this purpose. For a detailed description of LM, readers may refer to [25].

2.2. Radial basis function neural network (RBFNN)

RBFNN is a branch of ANNs which has a strong ability to categorize highly complex processes and capture the non-linearity describing them. Indeed, this learning capacity and the good generalization expected are due to the mapping of the data into a high-dimensional space [40].

As in MLP, RBFNN contains the same kinds of layers, viz. input, hidden and output. However, the main difference between the RBFNN and MLP structures is that RBFNN has only one hidden layer, while MLP can include more than one. The RBFNN hidden layer is formed by nodes (hn) and biases (hb). In addition, a specific RBF is associated with each node (hn). These RBFs have two main parameters, the center and the width.

RBFNN training consists of transferring the input data introduced in the input layer into the hidden layer, where a nonlinear transformation is made to capture the complexity. The Gaussian function is the most widely employed RBF. It is characterized by the center ($c_i$) and the spread coefficient ($\sigma^2$). To calculate the position of the input vector ($x$) relative to the center ($c_i$) of the Gaussian function, the Euclidean norm is applied:

$$r_i = \sqrt{\sum_{k=1}^{d} (x_k - c_{ki})^2} \tag{1}$$

where $c_{ki}$ and $d$ represent the centers and the number of variables, respectively. Then, this distance is injected into the Gaussian function, which is defined as shown below:

$$\phi(r) = \exp\left(-\frac{r^2}{2\sigma^2}\right) \tag{2}$$

where $\phi$, $r$ and $\sigma^2$ denote the Gaussian function, the Euclidean distance and the spread coefficient, respectively.

The results generated by the output layer are given by the equation below:

$$Y_j = \sum_{i=1}^{hn} w_{ji}\,\phi_i(r) + b_j, \quad i = 1, \dots, hn \ \text{and} \ j = 1, \dots, N \tag{3}$$

where $N$ and $hn$ stand for the training sample size and the number of neurons in the hidden layer, respectively; $Y_j$ is the jth output for input vector $x$, $b_j$ is the bias and $w_{ji}$ denotes the weight connecting the hidden node $i$ to the output layer.

RBFNN performance depends greatly on the number of nodes in the hidden layer and the spread coefficient of the Gaussian RBF [25]. For this reason, in the present work we propose an implementation of three metaheuristic algorithms, namely PSO, GA and artificial bee colony (ABC), to optimize the spread coefficient and the number of neurons in the hidden layer of the RBFNN model. The MSE error function is taken as the fitness function for these algorithms.

3. Optimization techniques

3.1. Genetic algorithm (GA)

GA is one of the earliest evolutionary algorithms; it was proposed by Holland [41] and extended later by Goldberg and Holland [42]. As its name indicates, GA was inspired by genetic principles, which are adapted and applied as the main operators while optimizing the problem being studied. GA starts by creating an initial population of individuals, which are encoded in the form of chromosomes. A fitness function, which is generally the objective function to minimize, is implemented as an index that distinguishes the quality of the population elements. The genetic operators applied to the population of chromosomes guide GA iteratively towards unvisited regions of the search space, and hence towards better solutions. The principal GA operators are selection, crossover, mutation and elitism. A full description of these operators is given in many relevant published sources [43–45].

These operators are repeated in each iteration until a predefined stopping condition is satisfied.

3.2. Particle swarm optimization (PSO)

PSO is a metaheuristic algorithm that originated from the research of Kennedy [46] and was updated and improved by Clerc [47]. PSO was inspired by the self-organization and collaboration of birds (or fish) during their movements, where exploiting the best individual position and the best position of the whole swarm are the two main rules followed to make a new movement. By analogy, in PSO a population (swarm) of candidate solutions (particles) moves iteratively through the search space according to the objective function to be optimized. Besides its position $x_{j,t}$, each particle is described by its velocity $v_{j,t}$, which guides it through the search steps [46,47]. For each new iteration ($t + 1$), and by exploiting the information gathered in the previous iteration ($t$), the velocity and the position of the particles are updated using the following formulas [47]:

$$v_{j,t+1} = \chi \left( v_{j,t} + c_1\, rand_1\, (pbest_{j,t} - x_{j,t}) + c_2\, rand_2\, (gbest_t - x_{j,t}) \right) \tag{4}$$

$$x_{j,t+1} = x_{j,t} + v_{j,t+1} \tag{5}$$

where $x_{j,t}$ and $v_{j,t}$ are the position and the velocity of particle $j$, respectively; $pbest_{j,t}$ denotes the best position of individual $j$ found up to iteration $t$, while $gbest_t$ is the best global position; $c_1$ and $c_2$ are the learning rates, which define the impact of the cognitive and social mechanisms, respectively; $rand_1$ and $rand_2$ are random numbers from [0;1]; and $\chi$ is equal to 0.729.

Finally, in each iteration, the best position of each particle and the best position in the population are updated based on the following equations, which are written for a minimization problem:

$$pbest_{j,t+1} = \begin{cases} pbest_{j,t}, & \text{if } h(pbest_{j,t}) \le h(x_{j,t+1}) \\ x_{j,t+1}, & \text{otherwise} \end{cases} \tag{6}$$

$$gbest_{t+1} = \min\{ h(pbest_{j,t+1}) \} \tag{7}$$

where $h$ is the objective function.

These steps are repeated until the stopping criterion is satisfied.

3.3. Artificial bee colony (ABC)

ABC is a smart swarm optimization technique that was developed by Karaboga [48]. This algorithm mimics the self-organization and the foraging behavior of honeybees. The adaptation of these mechanisms in the ABC algorithm for a given optimization problem is done by
initially generating a population of bees that correspond to candidate solutions. This colony is divided into three families that exchange information to enhance the search quality. These families are the employed, onlooker, and scout bees. A brief description of the ABC steps follows:

- Initialization: a position $x_j$ is attributed to each employed bee $j$. The number of employed bees is equal to the number of onlooker bees, which is also equal to the number of solutions in the colony [49]. The quality of the solution obtained for each bee is evaluated with the fitness function.
- Employed bees: for a new iteration ($t + 1$), each bee $j$ updates its position using the information obtained in the previous iteration ($t$). The following equation is applied:

$$x_{j,t+1} = x_{j,t} + \varphi_j\, (x_{j,t} - x_{k,t}) \tag{8}$$

  where $k$ and $\varphi_j$ are random values from {1, 2, …, colony size} and [0;1], respectively, and $k$ must differ from $j$. Afterward, the fitness value of the new position is calculated, and if it is better than the old one, the bee takes this new position as the current food source; otherwise it discards it.
- Onlooker bees: after the employed bees finish their exploration, they share their fitness performance with the onlooker bees. The latter select solutions with a probability $P_j$, which is related to the fitness values as shown below:

$$P_j = \frac{fit_j}{\sum_{j=1}^{NE} fit_j} \tag{9}$$

  where $fit_j$ denotes the fitness value of the jth bee and $NE$ stands for the number of employed bees. Then, the onlooker bees modify the positions kept in their memories following the principles described in the employed bees phase.
- Scout bees: an employed bee converts to a scout bee if its fitness quality stays the same after a number of iterations. In this case, its position is replaced by a random one from the search space.

The fittest bee is the one which provides the best food source, and hence it is the most qualified to be the optimum solution. This sequence of phases is repeated in each iteration until a stopping condition is fulfilled.

4. Data acquisition and preparation

In order to develop the proposed techniques for predicting CO2 solubility in brine, we collected 570 experimental points from authenticated published literature [13,50–63]. A statistical overview of the employed data is given in Table 1. Molality of NaCl (mNaCl), pressure (P) and temperature (T) were taken as the inputs of the proposed models and the CO2 solubility in brine as the output. Before the modeling phase, the data were normalized to the interval [-1,1]. Afterwards, 80% of the experimental data were chosen randomly for developing the models and the remaining data were set aside to test them.

Table 1
Statistical description of the input/output data.

                                 Max      Min      Mean     SD
Output  Solubility (mol·kg−1)    12.35    0.0106   0.27     0.39
Input   Temperature (K)          723.15   273.15   379.5    97.6
        Pressure (bar)           1400     0.98     153.41   272.39
        mNaCl (mol·kg−1)         6.14     0.016    1.69     1.91

5. Methodology

Before presenting the outcomes of the proposed smart techniques for predicting the solubility of CO2 in brine, it is necessary to explain the procedure followed for each technique. As previously mentioned, the data were normalized between -1 and 1, and they were split randomly into training and testing sets.

To implement the MLP-LM model, trial and error was applied to determine the number of hidden layers, their corresponding activation functions and their optimal numbers of neurons. The most efficient choice for MLP-LM corresponds to a network with three hidden layers, all using tansig as the activation function, and purelin for the output layer. The best architecture for the MLP-LM model was 3-11-9-9-1, in which the first and the last numbers stand for the numbers of inputs and outputs, respectively, while the second, third and fourth numbers are the numbers of neurons in the three hidden layers.

The RBFNN model has two main control parameters, viz. the spread coefficient and the number of neurons in the hidden layer. The GA, PSO and ABC algorithms were applied to optimize these parameters. During the optimization, the control parameters of RBFNN were represented in the form of chromosomes (binary representation) in GA, while they were assigned as positions in PSO and ABC. MSE was taken as the fitness function in these algorithms:

$$MSE = \frac{\sum_{i=1}^{N} (T_i - Y_i)^2}{N} \tag{10}$$

where $N$ is the number of training data, and $T_i$ and $Y_i$ are the actual and the predicted values, respectively.

Fig. 2 highlights the procedure followed by these nature-inspired algorithms to train the RBFNN model, and Table 2 lists their control parameters, which were obtained by tuning. The resulting models are denoted RBFNN-GA, RBFNN-PSO and RBFNN-ABC, respectively. The optimum number of neurons in the RBFNN hidden layer and the optimum spread coefficient achieved by the evolutionary algorithms are stated in Table 3.

6. Results and discussion

6.1. Statistical and graphical assessment

To examine and compare the performance of the proposed models, statistical assessment criteria and graphical error analysis were considered. Root mean squared error (RMSE) and coefficient of determination (R2), which are frequently used in regression analysis, have been adopted in our study. The mathematical expressions of these statistical criteria are specified in Appendix A.

A detailed comparison of the performance of the proposed models using the above-mentioned criteria is reported in Table 4. For a visual verification of the models' accuracy, cross plots of the predicted CO2 solubility in brine against the corresponding experimental measurements are depicted in Fig. 3 for the established models; in this type of plot, an accumulation of points near the unit slope line indicates close agreement between the model predictions and the real data. In addition, for a better visual comparison, bar plots of RMSE and R2 are depicted in Fig. 4(a) and (b), respectively. Fig. 3 exhibits that
Fig. 2. Flowchart of the proposed RBFNN model optimized by GA, PSO and ABC.
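
Each box of such a flowchart ultimately evaluates one candidate pair of RBFNN control parameters (number of hidden nodes, spread coefficient) and returns its training MSE. A minimal sketch of that evaluation, built on the Gaussian RBF of Section 2.2, is given below. It rests on assumptions that the text does not state: centres are sampled from the training points, and output weights and bias are solved by linear least squares. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def gaussian_design_matrix(X, centers, sigma2):
    """Hidden-layer outputs phi(r) = exp(-r^2 / (2 * sigma2)) for every sample/node."""
    # Squared Euclidean distance between each sample and each centre.
    diff = X[:, None, :] - centers[None, :, :]
    r2 = np.sum(diff ** 2, axis=2)
    return np.exp(-r2 / (2.0 * sigma2))


def fit_rbfnn(X, y, n_nodes, sigma2, rng):
    """Fit output weights and bias by least squares (assumption, see lead-in)."""
    # Assumption: centres are drawn without replacement from the training samples.
    idx = rng.choice(len(X), size=n_nodes, replace=False)
    centers = X[idx]
    phi = gaussian_design_matrix(X, centers, sigma2)
    design = np.hstack([phi, np.ones((len(X), 1))])  # last column carries the bias
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return centers, coef[:-1], coef[-1]


def predict_rbfnn(X, centers, weights, bias, sigma2):
    """Output layer: weighted sum of the Gaussian nodes plus a bias."""
    return gaussian_design_matrix(X, centers, sigma2) @ weights + bias


def mse_fitness(params, X, y, seed=0):
    """Training MSE for a candidate (n_nodes, sigma2), as seen by GA/PSO/ABC."""
    n_nodes = int(round(params[0]))
    sigma2 = float(params[1])
    rng = np.random.default_rng(seed)
    centers, weights, bias = fit_rbfnn(X, y, n_nodes, sigma2, rng)
    residual = y - predict_rbfnn(X, centers, weights, bias, sigma2)
    return float(np.mean(residual ** 2))


# Synthetic stand-in for three inputs normalized to [-1, 1]; not the paper's data.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.uniform(-1.0, 1.0, size=(200, 3))
y_demo = np.sin(X_demo[:, 0]) + X_demo[:, 1] * X_demo[:, 2]
print(mse_fitness([20, 1.5], X_demo, y_demo))
```

Fixing the seed inside `mse_fitness` keeps the fitness deterministic for a given candidate, which makes comparisons between candidates repeatable; whether the original study did so is not known.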
Table 2
Evolutionary algorithms setting parameters used in the study.

Algorithm  Parameter                             Value/setting
GA         Population size                       60
           Crossover probability                 90 %
           Mutation probability                  15 %
           Elitism                               10 %
           Type of selection                     Linear ranking
           Max number of generations             100
           Coding                                Binary
PSO        Size of the swarm                     60
           Max number of iterations              100
           C1                                    2.05
           C2                                    2.05
           χ                                     0.729
ABC        Number of employed bees               10
           Number of onlooker bees               10
           Maximum number of iterations          30
           Number of iterations to scout bees    4

Table 3
Key parameters of the established RBFNN models for CO2 solubility in brine.

                      RBFNN-GA   RBFNN-PSO   RBFNN-ABC
Number of nodes       167        182         206
Spread coefficient    11.8452    14.2156     14.3946

Table 4
Statistical indexes of the established models for CO2 solubility in brine.

                               RBFNN-ABC   RBFNN-PSO   RBFNN-GA   MLP-LM
Training (456 points)   RMSE   0.0218      0.0240      0.0243     0.0472
                        R2     0.9984      0.9981      0.9980     0.9926
Test (114 points)       RMSE   0.0572      0.0597      0.0925     0.0304
                        R2     0.9896      0.9887      0.9724     0.9968
All (570 points)        RMSE   0.0289      0.0311      0.0380     0.0439
                        R2     0.9967      0.9962      0.9929     0.9934
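
To make the PSO settings of Table 2 concrete, the sketch below applies the velocity and position updates of Section 3.2 to a generic objective `h`. It is an illustration only: the function name, the per-dimension random numbers, the zero initial velocities and the clipping of positions to the search box are assumptions that the text does not specify, and the demo objective is a simple sphere function rather than the RBFNN fitness.

```python
import numpy as np


def pso_minimize(h, lower, upper, swarm_size=60, n_iter=100,
                 c1=2.05, c2=2.05, chi=0.729, seed=0):
    """Minimize h over a box using the scaled velocity update of Section 3.2."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    x = rng.uniform(lower, upper, size=(swarm_size, dim))  # positions
    v = np.zeros_like(x)                                   # velocities (assumed zero at start)
    pbest = x.copy()
    pbest_val = np.array([h(p) for p in pbest])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])

    for _ in range(n_iter):
        r1 = rng.random((swarm_size, dim))
        r2 = rng.random((swarm_size, dim))
        # Velocity update: c1 weights the particle's own best, c2 the swarm's best.
        v = chi * (v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x))
        # Position update; clipping to the box is an added assumption.
        x = np.clip(x + v, lower, upper)

        vals = np.array([h(p) for p in x])
        # Keep the old personal best unless the new position is strictly better.
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        # The global best is the best of the personal bests.
        g = int(np.argmin(pbest_val))
        gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])

    return gbest, gbest_val


# Demo on a 2-D sphere function (hypothetical objective, minimum at the origin).
best_x, best_val = pso_minimize(lambda p: float(np.sum(p ** 2)),
                                lower=[-5.0, -5.0], upper=[5.0, 5.0])
print(best_x, best_val)
```

For the RBFNN case, `h` would be the training MSE as a function of the two control parameters, with the first coordinate rounded to an integer number of nodes.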
majority of the data points predicted by all the implemented smart techniques are very close to the unit slope line, proving their high prediction ability. Closer analysis of the statistical criteria reported in Table 4 confirms the high performance of all the proposed techniques. Moreover, according to Table 4 and Fig. 4, RBFNN-ABC is the most reliable model thanks to its lowest overall RMSE value of 0.0289 and its highest overall R2 value of 0.9967. It can also be seen from Table 4 and Fig. 4 that our proposed models for predicting CO2 solubility in brine follow the accuracy ranking shown below:

RBFNN-ABC > RBFNN-PSO > RBFNN-GA > MLP-LM

Fig. 3. Cross plots of the established RBFNN models (CO2 solubility in brine).

Fig. 4. Comparison of RMSE and R2 of the established models.

In order to get a deeper insight into the quality of the RBFNN-ABC predictions, Fig. 5 compares the CO2 solubility in brine estimated by RBFNN-ABC with the experimental data for the training (Fig. 5(a)) and testing (Fig. 5(b)) phases. In addition, histogram plots of the error distribution for the train and test datasets are shown in Fig. 6. As can be seen from the two subplots of Fig. 5, an excellent agreement between experimental and RBFNN-ABC predicted data is achieved in both training and testing phases. Moreover, Fig. 6 reveals that the errors of the RBFNN-ABC model are normally distributed with a near-zero mean in both train and test sets, indicating the excellent performance of this model.

6.2. Comparison of RBFNN-ABC model with existing models

Being the best-performing model for predicting the CO2 solubility in brine, RBFNN-ABC was further compared with two recent approaches, namely the activity–fugacity phase equilibrium model introduced by Mao et al. [23] and the Bahadori et al. [11] correlation. For this purpose, the "executable program" accompanying the Mao et al. [23] model was used to predict the corresponding values of the employed data when the latter were within the applicability domain of the Mao et al. approach. To ensure the fairness of the comparison with the Bahadori et al. [11] correlation, it should be noted that we considered only the points of the employed data that respect its NaCl molality conditions. Statistical comparison between RBFNN-ABC and
Fig. 5. The comparison between RBFNN-ABC outcomes and real values of CO2 solubility for each sample of dataset (a) training data and (b) testing data.
these models is reported in Table 5. In addition, for better readily ob- accurate in predicting the solubility of CO2 in brine with variations of
serve the comparison, Fig. 7(a) and (a) display, respectively, RMSE and mNaCl , P and T.
R2 bar plots for the evaluated models. It can be said from the statistical
criteria stated in Table 5 and the graphical comparison shown in Fig. 6 that the Mao et al. [23] model gives satisfactory predictions compared to the Bahadori et al. [11] correlation, which shows the worst performance. This poor performance may be explained by the fact that the correlation was established using data covering only a modest variety of conditions. In addition, it can clearly be seen that the RBFNN-ABC model yields the lowest RMSE and the highest coefficient of determination, and that it significantly outperforms the considered pre-existing models.
To test the performance of RBFNN-ABC in predicting the solubility of CO2 in brine against the Mao et al. model over wide-ranging temperature and pressure conditions, trend analyses of these two models are depicted in Fig. 8. As shown, the outputs of RBFNN-ABC follow the real variation better than the outputs generated by the Mao et al. model.

6.3. RBFNN-ABC model validity in terms of the employed variables

To check the behavior of the RBFNN-ABC predictions in terms of the employed variables, trend analyses that compare the results of our best model for estimating the CO2 solubility in brine with the reported experimental data over an extended variety of NaCl molality, pressure and temperature conditions are presented in Figs. 8 and 9. Closer inspection of these figures shows that RBFNN-ABC is highly reliable and

Fig. 6. Histogram plot for the datasets applied in constructing RBFNN-ABC: (a) train and (b) test.

Table 5
Statistical parameters of various models for CO2 solubility in brine.
Model/correlation RMSE R2
Bahadori et al. (2009) (124 data points) 6.4255 0.1354
Mao et al. (2013) (457 data points) 0.2442 0.9301
RBFNN-ABC (570 data points) 0.0289 0.9967

6.4. Applicability domain of the RBFNN-ABC model

As experimental measurements reported in the literature may be subject to some uncertainties, it is worthwhile to ascertain the validity of these data and to identify the realm of application of RBFNN-ABC. To do so, the Leverage approach was considered. In this method, the Williams plot, which shows the applicability realm of the model, is obtained by plotting the standardized residual (R) values versus the hat values, which correspond to the diagonal elements of the so-called Hat matrix [64,65]. The computational procedure for the Leverage technique can be found elsewhere [66]. Fig. 10 depicts the Williams plot for the RBFNN-ABC model. As can be observed from this figure, the large majority of the data lie in the ranges 0 ≤ H ≤ 0.0211 and −3 ≤ R ≤ 3, which signifies the high trustworthiness of the RBFNN-ABC model and its statistical validity. In addition, only 22 points are detected as outliers from the feasibility domain of the RBFNN-ABC model, and their associated parameters are reported in Table 6. This number of suspected data is equivalent to only 3.86% of the whole data points.
Finally, it should be mentioned that the established RBFNN-ABC paradigm should be applied to predict the solubility of CO2 in brine when P, T and mNaCl fall within the ranges of application. Indeed, this model can be applied to situations outside this region only with care, as its accuracy there is not ensured: it may provide accurate predictions for some values of the employed variables and inaccurate predictions for others. Nevertheless, as widespread P, T and mNaCl conditions were included when the RBFNN-ABC model was established, it can be applied to many cases having properties located within the range of the aforementioned inputs.
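The Leverage approach of Section 6.4 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic inputs and the way residuals are standardized (division by their sample standard deviation) are assumptions made here for illustration. The warning leverage H* = 3(p + 1)/n, with p inputs and n data points, is a commonly used convention that is assumed rather than quoted from the paper; for p = 3 and n = 570 it reproduces the reported limit of 0.0211.

```python
import numpy as np


def warning_leverage(n_points, n_inputs):
    """Commonly used leverage limit H* = 3(p + 1)/n (assumed convention)."""
    return 3.0 * (n_inputs + 1) / n_points


def hat_values(X):
    """Diagonal of the hat matrix H = X (X^T X)^-1 X^T, without forming H."""
    X = np.asarray(X, dtype=float)
    gram_inv = np.linalg.pinv(X.T @ X)
    return np.einsum("ij,jk,ik->i", X, gram_inv, X)


def standardized_residuals(y_exp, y_pred):
    """Residuals divided by their sample standard deviation (assumption)."""
    res = np.asarray(y_exp, dtype=float) - np.asarray(y_pred, dtype=float)
    return res / res.std(ddof=1)


def outside_domain(X, y_exp, y_pred, r_limit=3.0):
    """Flag points with H > H* or |R| > r_limit, as read off a Williams plot."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    h = hat_values(X)
    r = standardized_residuals(y_exp, y_pred)
    h_star = warning_leverage(n, p)
    return (h > h_star) | (np.abs(r) > r_limit), h, r, h_star


# Figures quoted in the text: 3 inputs (T, P, mNaCl), 570 points, 22 suspected points.
print(round(warning_leverage(570, 3), 4))  # 0.0211
print(round(100 * 22 / 570, 2))            # 3.86

# Hypothetical synthetic data, only to exercise the functions.
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(50, 3))
y_demo = X_demo.sum(axis=1)
y_hat = y_demo + rng.normal(scale=0.01, size=50)
mask, h, r, h_star = outside_domain(X_demo, y_demo, y_hat)
print(int(mask.sum()), "of", mask.size, "synthetic points flagged")
```

Plotting `r` against `h`, with horizontal lines at ±3 and a vertical line at `h_star`, gives the Williams plot; points flagged by `mask` fall outside the rectangle bounded by those lines.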
Fig. 7. Comparison of RMSE and R2 for RBFNN-ABC and existing models for predicting CO2 solubility in brine.
Fig. 8. CO2 solubility in brine based on several conditions: a comparison between predictions of RBFNN-ABC and those of Mao et al. (2013) model against experimental data.
Fig. 9. CO2 solubility in brine based on several conditions: a comparison between RBFNN-ABC and experimental data.
Fig. 10. The Williams plot of the utilized dataset for RBFNN-ABC model.
Table 6
Experimental data which are out of the applicability domain of the proposed RBFNN-ABC.
No. T (K) P (bar) mNaCl (mol·kg−1) Solubility (mol·kg−1) Ref.
1 313.38 56.2470 0.52 0.92167 [62]
2 313.38 67.7630 0.52 1.04611 [62]
3 353.08 5.4726 0.52 0.05611 [62]
4 353.08 83.3480 0.52 0.83722 [62]
5 352.77 83.5410 4.34 0.39056 [62]
6 303.15 1.0132 0.2 0.029 [51]
7 308.15 1.0132 0.2 0.0259 [51]
8 473.15 1000 1.09 2.568 [56]
9 623.15 1400 4.28 3.8605 [56]
10 473.15 200 1.09 1.091 [56]
11 473.15 300 1.09 1.439 [56]
12 673.15 400 1.09 1.973 [56]
13 298.15 1.01325 5.096 0.0106 [50]
14 308.15 1.01325 0.449 0.0246 [50]
15 308.15 1.01325 0.88 0.0223 [50]
16 308.15 1.01325 1.438 0.0197 [50]
17 323.15 1.0132 1 0.0158 [51]
18 283.15 1.0132 3 0.0277 [51]
19 288.15 1.0132 3 0.0241 [51]
20 308.15 1.0132 3 0.0151 [51]
21 323.15 1.0132 3 0.0111 [51]
22 298.15 3.95 3 0.0008 [63]
7. Conclusions

In this investigation, four smart techniques were proposed to develop easy-to-use, extended and reliable models to predict the solubility of CO2 in brine based on two machine-learning techniques, namely MLP and RBFNN. LMA was implemented to optimize the weights and biases of the MLP model, and three evolutionary algorithms, GA, PSO and ABC, were considered to optimize the RBFNN control parameters. Based on the achieved outputs, the following important conclusions can be drawn:

1 The implemented smart models showed satisfactory performance in the prediction of CO2 solubility in brine.
2 The RBFNN-ABC model was found to be the most reliable model for predicting the solubility of CO2 in brine.
3 RBFNN-ABC exhibited a very small RMSE value of 0.0289 and a high coefficient of determination (0.9967).
4 Comparison of the RBFNN-ABC predictions against pre-existing models in the literature highlighted its superiority over the latter.
5 Trend analyses demonstrated that the outputs generated by the RBFNN-ABC model follow the expected variations in terms of the independent variables.
6 The application of the Leverage approach affirmed that the implemented RBFNN-ABC model is reliable and statistically valid, and that less than 4% of the data may be outliers.
7 The constructed RBFNN-ABC tool can be used by means of the provided Excel macro to predict the CO2 solubility in brine.
Appendix A. Statistical formulas
The statistical indexes employed in the evaluation of the implemented models are defined as follows:
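$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(S_{i,\mathrm{exp}} - S_{i,\mathrm{pred}}\right)^{2}} \tag{A.1}$$

$$R^{2} = 1 - \frac{\sum_{i=1}^{N}\left(S_{i,\mathrm{exp}} - S_{i,\mathrm{pred}}\right)^{2}}{\sum_{i=1}^{N}\left(S_{i,\mathrm{pred}} - \bar{S}\right)^{2}} \tag{A.2}$$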
where S denotes the CO2 solubility in brine; subscripts exp and pred point out the experimental and predicted values, correspondingly; S̄ stands for
average solubility and N is the number of data.
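As a minimal sketch, the two indexes above can be computed as follows. The function names and the sample values are hypothetical. The denominator of R2 follows Eq. (A.2) as printed, that is, predicted values minus the average solubility; taking S̄ as the mean of the experimental values is an assumption, since the text only calls it the average solubility. Many texts instead use the experimental values in the denominator, so the two conventions can differ slightly.

```python
import numpy as np


def rmse(s_exp, s_pred):
    """Root mean square error, Eq. (A.1)."""
    s_exp = np.asarray(s_exp, dtype=float)
    s_pred = np.asarray(s_pred, dtype=float)
    return float(np.sqrt(np.mean((s_exp - s_pred) ** 2)))


def r2_as_printed(s_exp, s_pred):
    """Coefficient of determination following Eq. (A.2) as printed."""
    s_exp = np.asarray(s_exp, dtype=float)
    s_pred = np.asarray(s_pred, dtype=float)
    s_bar = s_exp.mean()  # assumption: S-bar is the mean experimental solubility
    numerator = np.sum((s_exp - s_pred) ** 2)
    denominator = np.sum((s_pred - s_bar) ** 2)
    return float(1.0 - numerator / denominator)


# Hypothetical values, not taken from the databank.
s_exp_demo = [1.0, 2.0, 3.0, 4.0]
s_pred_demo = [1.1, 1.9, 3.2, 3.8]
print(round(rmse(s_exp_demo, s_pred_demo), 4))           # 0.1581
print(round(r2_as_printed(s_exp_demo, s_pred_demo), 4))  # 0.9778
```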
Appendix B. RBFNN-ABC generated model
To use the model, open the macro-enabled file titled "CO2_solubility_in_brine_Calculator.xltm".

Note: Macros must be enabled in Excel. The program computes the CO2 solubility in brine once values for NaCl molality (mol/kg), pressure (bar) and temperature (K) are entered and the Calculate button is clicked.
Appendix C. Supplementary data
Supplementary material related to this article can be found, in the online version, at doi:https://doi.org/10.1016/j.jcou.2019.05.009.
References

[1] N. Abas, N. Khan, Carbon conundrum, climate change, CO2 capture and consumptions, J. CO2 Util. 8 (2014) 39–48.
[2] A.A. Olajire, Recent progress on the nanoparticles-assisted greenhouse carbon dioxide conversion processes, J. CO2 Util. 24 (2018) 522–547.
[3] V. Venkatraman, B.K. Alsberg, Predicting CO2 capture of ionic liquids using machine learning, J. CO2 Util. 21 (2017) 162–168.
[4] B. Liu, X. Fu, Z. Li, Impacts of CO2-brine-rock interaction on sealing efficiency of sand caprock: a case study of Shihezi formation in Ordos basin, Adv. Geo-Energy Res. 2 (2018) 380–392.
[5] H. Singh, Impact of four different CO2 injection schemes on extent of reservoir pressure and saturation, Adv. Geo-Energy Res. 2 (2018) 305–318.
[6] M.R. Soltanian, M.A. Amooie, D.R. Cole, D.E. Graham, S.A. Hosseini, S. Hovorka, S.M. Pfiffner, T.J. Phelps, J. Moortgat, Simulating the Cranfield geological carbon sequestration project with high-resolution static models and an accurate equation of state, Int. J. Greenh. Gas Control 54 (2016) 282–296.
[7] J.D. Figueroa, T. Fout, S. Plasynski, H. McIlvried, R.D. Srivastava, Advances in CO2 capture technology - the US department of energy's carbon sequestration program, Int. J. Greenh. Gas Control 2 (2008) 9–20.
[8] N.I. Gershenzon, R.W. Ritzi, D.F. Dominic, M. Soltanian, E. Mehnert, R.T. Okwen, Influence of small-scale fluvial architecture on CO2 trapping processes in deep brine reservoirs, Water Resour. Res. 51 (2015) 8240–8256.
[9] N.I. Gershenzon, M. Soltanian, R.W. Ritzi Jr, D.F. Dominic, Influence of small scale heterogeneity on CO2 trapping processes in deep saline aquifers, Energy Procedia 59 (2014) 166–173.
[10] M. Dejam, H. Hassanzadeh, The role of natural fractures of finite double-porosity aquifers on diffusive leakage of brine during geological storage of CO2, Int. J. Greenh. Gas Control 78 (2018) 177–197.
[11] A. Bahadori, H.B. Vuthaluru, S. Mokhatab, New correlations predict aqueous solubility and density of carbon dioxide, Int. J. Greenh. Gas Control 3 (2009) 474–480.
[12] M. Dejam, H. Hassanzadeh, Diffusive leakage of brine from aquifers during CO2 geological storage, Adv. Water Resour. 111 (2018) 36–57.
[13] W. Yan, S. Huang, E.H. Stenby, Measurement and modeling of CO2 solubility in NaCl brine and CO2-saturated NaCl brine density, Int. J. Greenh. Gas Control 5 (2011) 1460–1477.
[14] L. Wang, Z. Shen, L. Hu, Q. Yu, Modeling and measurement of CO2 solubility in salty aqueous solutions and application in the Erdos Basin, Fluid Phase Equilib. 377 (2014) 45–55.
[15] E. Mohammadian, H. Hamidi, M. Asadullah, A. Azdarpour, S. Motamedi, R. Junin, Measurement of CO2 solubility in NaCl brine solutions at different temperatures and pressures using the potentiometric titration method, J. Chem. Eng. Data 60 (2015) 2042–2049.
[16] N. Mosavat, A. Abedini, F. Torabi, Phase behaviour of CO2–brine and CO2–oil systems for CO2 storage and enhanced oil recovery: experimental studies, Energy Procedia 63 (2014) 5631–5645.
[17] H. Zhao, R. Dilmore, D.E. Allen, S.W. Hedges, Y. Soong, S.N. Lvov, Measurement and modeling of CO2 solubility in natural and synthetic formation brines for CO2 sequestration, Environ. Sci. Technol. 49 (2015) 1972–1980.
[18] Y.-K. Li, L.X. Nghiem, Phase equilibria of oil, gas and water/brine mixtures from a cubic equation of state and Henry's law, Can. J. Chem. Eng. 64 (1986) 486–496.
[19] Y.-X. Zuo, T.-M. Guo, Extension of the Patel-Teja equation of state to the prediction of the solubility of natural gas in formation water, Chem. Eng. Sci. 46 (1991) 3251–3258.
[20] H. Sørensen, K.S. Pedersen, P.L. Christensen, Modeling of gas solubility in brine, Org. Geochem. 33 (2002) 635–642.
[21] S. Portier, C. Rochelle, Modelling CO2 solubility in pure water and NaCl-type waters from 0 to 300 °C and from 1 to 300 bar: application to the Utsira Formation at Sleipner, Chem. Geol. 217 (2005) 187–199.
[22] Z. Duan, R. Sun, C. Zhu, I.-M. Chou, An improved model for the calculation of CO2 solubility in aqueous solutions containing Na+, K+, Ca2+, Mg2+, Cl-, and SO42-, Mar. Chem. 98 (2006) 131–139.
[23] S. Mao, D. Zhang, Y. Li, N. Liu, An improved model for calculating CO2 solubility in aqueous NaCl solutions and the application to CO2–H2O–NaCl fluid inclusions, Chem. Geol. 347 (2013) 43–58.
[24] F. Ameli, A. Hemmati-Sarapardeh, M. Schaffie, M.M. Husein, S. Shamshirband, Modeling interfacial tension in N2/n-alkane systems using corresponding state theory: application to gas injection processes, Fuel 222 (2018) 779–791.
[25] A. Hemmati-Sarapardeh, A. Varamesh, M.M. Husein, K. Karan, On the evaluation of the viscosity of nanofluid systems: modeling and data assessment, Renewable Sustainable Energy Rev. 81 (2018) 313–329.
[26] M. Nait Amar, N. Zeraibi, K. Redouane, Optimization of WAG process using dynamic proxy, genetic algorithm and ant colony optimization, Arab. J. Sci. Eng. 43 (2018) 6399–6412.
[27] M. Nait Amar, N. Zeraibi, K. Redouane, Pure CO2-oil system minimum miscibility pressure prediction using optimized artificial neural network by differential evolution, Pet. Coal 60 (2018).
[28] M. Nait Amar, N. Zeraibi, K. Redouane, Bottom hole pressure estimation using hybridization neural networks and grey wolves optimization, Petroleum 4 (2018) 419–429.
[29] M. Nait Amar, Z. Noureddine, A. Hemmati-Sarapardeh, S. Shamshirband, Modeling temperature-based oil-water relative permeability by integrating advanced intelligent models with grey wolf optimization: application to thermal enhanced oil recovery processes, Fuel 242 (2019) 649–663, https://doi.org/10.1016/J.FUEL.2019.01.047.
[30] Y. Sun, G. Yang, C. Wen, L. Zhang, Z. Sun, Artificial neural networks with response surface methodology for optimization of selective CO2 hydrogenation using K-promoted iron catalyst in a microchannel reactor, J. CO2 Util. 24 (2018) 10–21.
[31] A. Rostami, A. Hemmati-Sarapardeh, S. Shamshirband, Rigorous prognostication of natural gas viscosity: smart modeling and comparative study, Fuel 222 (2018) 766–778.
[32] Z. Liu, A. Liu, C. Wang, Z. Niu, Evolving neural network using real coded genetic algorithm (GA) for multispectral image classification, Future Gener. Comput. Syst. 20 (2004) 1119–1129.
[33] K. Redouane, N. Zeraibi, M. Nait Amar, Automated optimization of well placement via adaptive space-filling surrogate modelling and evolutionary algorithm, Abu Dhabi Int. Pet. Exhib. Conf. (2018).
[34] M. Nait Amar, N. Zeraibi, Application of hybrid support vector regression artificial bee colony for prediction of MMP in CO2-EOR process, Petroleum (2018) In Press.
[35] A. Hemmati-Sarapardeh, F. Ameli, A. Varamesh, S. Shamshirband, A.H. Mohammadi, B. Dabir, Toward generalized models for estimating molecular weights and acentric factors of pure chemical compounds, Int. J. Hydrogen Energy 43 (2018) 2699–2717.
[36] A. Tatar, A. Shokrollahi, M. Mesbah, S. Rashid, M. Arabloo, A. Bahadori, Implementing radial basis function networks for modeling CO2-reservoir oil minimum miscibility pressure, J. Nat. Gas Sci. Eng. 15 (2013) 82–92.
[37] D.A. Wood, A transparent open-box learning network provides insight to complex systems and a performance benchmark for more-opaque machine learning algorithms, Adv. Geo-Energy Res. 2 (2018) 148–162.
[38] A. Hemmati-Sarapardeh, M.-H. Ghazanfari, S. Ayatollahi, M. Masihi, Accurate determination of the CO2-crude oil minimum miscibility pressure of pure and impure CO2 streams: a robust modelling approach, Can. J. Chem. Eng. 94 (2016) 253–261.
[39] S.S. Haykin, Neural Networks and Learning Machines, Pearson, Upper Saddle River, NJ, USA, 2009.
[40] A.M. Elsharkawy, Modeling the properties of crude oil and gas systems using RBF network, SPE Asia Pacific Oil Gas Conf. Exhib. (1998).
[41] J. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, MI, 1975, p. 5 1.
[42] D.E. Goldberg, J.H. Holland, Genetic algorithms and machine learning, Mach. Learn. 3 (1988) 95–99.
[43] S.N. Sivanandam, S.N. Deepa, Introduction to Genetic Algorithms, Springer, 2007.
[44] M. Mitchell, An Introduction to Genetic Algorithms, MIT Press, 1998.
[45] D.A. Coley, An Introduction to Genetic Algorithms for Scientists and Engineers, World Scientific Publishing Company, 1999.
[46] J. Kennedy, R. Eberhart, Particle swarm optimization, Proceedings of IEEE International Conference on Neural Networks (ICNN'95), (1995).
[47] M. Clerc, The Swarm and the Queen: Towards a Deterministic and Adaptive Particle Swarm Optimization, in: Evol. Comput. 1999. CEC 99. Proc. 1999 Congr. (1999), pp. 1951–1957.
[48] D. Karaboga, An Idea Based on Honey Bee Swarm for Numerical Optimization, Tech. Rep. TR06, Erciyes Univ., 2005 10. doi:citeulike-article-id:6592152.
[49] D. Karaboga, B. Basturk, A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm, J. Glob. Optim. 39 (2007) 459–471, https://doi.org/10.1007/s10898-007-9149-x.
[50] A. Yasunishi, F. Yoshida, Solubility of carbon dioxide in aqueous electrolyte solutions, J. Chem. Eng. Data 24 (1979) 11–14.
[51] A. Findlay, T. Williams, LXXI. The influence of colloids and fine suspensions on the solubility of gases in water. Part III. Solubility of carbon dioxide at pressures lower than atmospheric, J. Chem. Soc. Perkin Trans. I 103 (1913) 636–645.
[52] G. Ferrentino, D. Barletta, F. Donsì, G. Ferrari, M. Poletto, Experimental measurements and thermodynamic modeling of CO2 solubility at high pressure in model apple juices, Ind. Eng. Chem. Res. 49 (2010) 2992–3000.
[53] A.J. Ellis, R.M. Golding, The solubility of carbon dioxide above 100 degrees C in water and in sodium chloride solutions, Am. J. Sci. 261 (1963) 47–60.
[54] S. Bando, F. Takemura, M. Nishio, E. Hihara, M. Akai, Solubility of CO2 in aqueous solutions of NaCl at (30 to 60) °C and (10 to 20) MPa, J. Chem. Eng. Data 48 (2003) 576–579.
[55] S.-Y. Yeh, R.E. Peterson, Solubility of carbon dioxide, krypton, and xenon in lipids, J. Pharm. Sci. 52 (1963) 453–458.
[56] S. Takenouchi, G.C. Kennedy, The solubility of carbon dioxide in NaCl solutions at high temperatures and pressures, Am. J. Sci. 263 (1965) 445–454.
[57] B. Rumpf, H. Nicolaisen, C. Öcal, G. Maurer, Solubility of carbon dioxide in aqueous solutions of sodium chloride: experimental results and correlation, J. Solution Chem. 23 (1994) 431–448.
[58] K. Onda, E. Sada, T. Kobayashi, S. Kito, K. Ito, Salting-out parameters of gas solubility in aqueous salt solutions, J. Chem. Eng. Japan 3 (1970) 18–24.
[59] J.A. Nighswander, N. Kalogerakis, A.K. Mehrotra, Solubilities of carbon dioxide in water and 1 wt.% sodium chloride solution at pressures up to 10 MPa and temperatures from 80 to 200 °C, J. Chem. Eng. Data 34 (1989) 355–360.
[60] Y. Liu, M. Hou, G. Yang, B. Han, Solubility of CO2 in aqueous solutions of NaCl, KCl, CaCl2 and their mixed salts at different temperatures and pressures, J. Supercrit. Fluids 56 (2011) 125–129.
[61] D. Koschel, J.-Y. Coxam, L. Rodier, V. Majer, Enthalpy and solubility data of CO2 in water and NaCl (aq) at conditions of interest for geological sequestration, Fluid Phase Equilib. 247 (2006) 107–120.
[62] J. Kiepe, S. Horstmann, K. Fischer, J. Gmehling, Experimental determination and prediction of gas solubility data for CO2 + H2O mixtures containing NaCl or KCl at temperatures between 313 and 393 K and pressures up to 10 MPa, Ind. Eng. Chem. Res. 41 (2002) 4393–4398.
[63] S. He, J.W. Morse, The carbonic acid system and calcite solubility in aqueous Na-K-Ca-Mg-Cl-SO4 solutions from 0 to 90 °C, Geochim. Cosmochim. Acta 57 (1993) 3533–3554.
[64] A.M. Leroy, P.J. Rousseeuw, Robust Regression and Outlier Detection, J. Wiley & Sons, New York, 1987.
[65] C.R. Goodall, 13 Computation using the QR decomposition, Chapman Hall. Handb. Mod. Stat. Methods 9 (1993) 467–508, https://doi.org/10.1016/S0169-7161(05)80137-3.
[66] A. Hemmati-Sarapardeh, F. Ameli, B. Dabir, M. Ahmadi, A.H. Mohammadi, On the evaluation of asphaltene precipitation titration data: modeling and data assessment, Fluid Phase Equilib. 415 (2016) 88–100.