
Chemical Engineering Science 223 (2020) 115752
Contents lists available at ScienceDirect
Chemical Engineering Science

journal homepage: www.elsevier.com/locate/ces
Prediction of CO2 solubility in ionic liquids using machine learning methods
Zhen Song a, Huaiwei Shi a,b, Xiang Zhang c, Teng Zhou a,b,⇑
a Process Systems Engineering, Otto-von-Guericke University Magdeburg, Universitätsplatz 2, D-39106 Magdeburg, Germany
b Process Systems Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstr. 1, D-39106 Magdeburg, Germany
c Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China
Highlights
• A dataset containing 10,116 CO2 solubilities in ionic liquids is collected.
• Two machine learning models, ANN-GC and SVM-GC, are developed.
• Both of the models can give reliable predictions on the CO2 solubility.
• The ANN-GC model performs slightly better than the SVM-GC model.
Article info
Article history: Received 31 January 2020; Received in revised form 23 April 2020; Accepted 25 April 2020; Available online 30 April 2020
Keywords: CO2 solubility; Ionic liquids; Machine learning; Group contribution; Artificial neural network; Support vector machine

Abstract
A comprehensive database containing 10,116 CO2 solubility data measured in various ionic liquids (ILs) at different temperatures and pressures is established. Based on this database, the relationship between CO2 solubility and IL structure, temperature and pressure is correlated using group contribution (GC) methods. Two different machine learning algorithms, namely artificial neural network (ANN) and support vector machine (SVM), are employed to develop the GC models. For the 2023 test-set data, the estimated MAE and R² are 0.0202 and 0.9836, respectively for the ANN-GC model and for the SVM-GC model they are 0.0240 and 0.9783, respectively. The distributions of prediction errors are plotted for both models to provide more comprehensive knowledge on the model performance. The results indicate that both of the models can give reliable predictions on the CO2 solubilities in ILs and the ANN-GC model performs slightly better than the SVM-based model.
© 2020 Elsevier Ltd. All rights reserved.
⇑ Corresponding author at: Process Systems Engineering, Otto-von-Guericke University Magdeburg, Universitätsplatz 2, D-39106 Magdeburg, Germany.
E-mail address: zhout@mpi-magdeburg.mpg.de (T. Zhou).
https://doi.org/10.1016/j.ces.2020.115752
0009-2509/© 2020 Elsevier Ltd. All rights reserved.

1. Introduction

In the past few years, ionic liquids (ILs) have attracted intensive attention as alternatives for conventional organic solvents in carbon capture due to their unique properties, such as thermal stability, negligible volatility and high gas solution capacity (Theo et al., 2016; Zeng et al., 2017). However, the large number of cations and anions makes it challenging to rationally select and design the most suitable ILs. In the past decade, the computer-aided molecular design (CAMD) method (Austin et al., 2016; Gani et al., 2016) has been extensively used for the optimal design of ILs for various separation processes (Chávez-Islas et al., 2011; Chen et al., 2019a, b; Liu et al., 2020; Roughton et al., 2012) including CO2 capture (Chong et al., 2015, 2016; Valencia-Marquez et al., 2017).

When using CAMD to design ILs for carbon capture, the development of reliable models for predicting the CO2 solubility in ILs is of central importance. Traditional thermodynamic models, such as PSRK (Holderbaum and Gmehling, 1991), group contribution-based SAFT (Mourah et al., 2010), cubic equation of state combined with UNIFAC (Fredenslund et al., 1975) or COSMO-RS (Eckert and Klamt, 2002), can estimate gas solubilities. Importantly, they can well represent the temperature and pressure effects due to their very sound thermodynamic basis. However, these models sometimes cannot give quantitatively satisfying solubility predictions. Besides the rigorous thermodynamic modeling, another powerful method for solubility (or widely speaking property) prediction is the quantitative structure-property relationship (QSPR) approach where the property of interest is quantitatively correlated with certain structural descriptors of the molecules. Due to the development of advanced software packages, such as CODESSA (Katritzky et al., 2002), for the calculation of molecular descriptors, the QSPR method has been widely used to predict the physical and chemical
properties of ILs, including heat capacity (Sattari et al., 2013), electrical conductivity (Gharagheizi et al., 2013), thermal decomposition temperature (Gharagheizi et al., 2012), toxicity (Zhao et al., 2014), and viscosity (Zhao et al., 2015).

The most commonly used QSPR methods in CAMD are group contribution (GC) methods, where the molecular descriptors are the occurrences of functional groups in the molecule. A large number of GC models have been developed for pure-component property predictions (e.g., Gharagheizi et al., 2014; Hukkerikar et al., 2012; Marrero and Gani, 2001; Sattari et al., 2016). Traditionally, GC methods correlate properties with the group numbers using a linear expression (Ceriani et al., 2009; Lazzus and Pulgar-Villarroel, 2015; Huang et al., 2013). However, certain properties cannot be appropriately described by linear models; in these cases, nonlinear GC models are required to generate more accurate predictions. Recently, machine learning (ML) algorithms have advanced significantly and are now very popular for building complex nonlinear QSPR or GC models for estimating various types of properties including CO2 solubility (Tatar et al., 2016; Sedghamiz et al., 2015), H2S solubility (Faúndez et al., 2016; Zhao et al., 2016), surface tension (Mulero et al., 2017), octanol-water partition coefficient (Wang et al., 2019), critical properties (Su et al., 2019), odor of fragrance molecules (Zhang et al., 2018), and toxicity of ILs (Cao et al., 2018).

Artificial neural networks (ANN) belong to a set of very popular ML algorithms inspired by biological neural systems. Numerous works have shown the high accuracy of the ANN-based GC models for property predictions (Mondejar et al., 2017; Gharagheizi and Salehi, 2011; Fatehi et al., 2017). Recently, another ML method, the support vector machine (SVM), has also attracted substantial attention for data classification and regression. Due to its remarkable generalization performance, the SVM approach has found extensive application in QSPR or GC modeling (Zhao et al., 2014; Zhao et al., 2015). A rigorous comparison of the predictive abilities of ANN- and SVM-based GC models is thus highly desirable.

The main objective of this work is to develop ML-based GC models that can be used for predicting the solubility of CO2 in ILs. To do this, a comprehensive database consisting of 10,116 CO2 solubility data measured in various types of ILs over wide temperature and pressure ranges is first established. Based on the collected solubility data, an ANN-GC model and an SVM-GC model are developed for predicting the CO2-in-IL solubilities. The performances of the obtained models are evaluated and compared. An application example is provided at the end to guide readers in predicting CO2 solubility using our models. Moreover, the original MATLAB codes are also included in the Supporting Information. Interested readers can directly make use of the code for their own predictions.

2. Experimental data

In total, 10,116 data points on the CO2 solubility (mole fraction 0.0000648–0.9516) measured in various types of ILs (in total 124 different ILs) from 243.2 K to 453.15 K and from 0.00798 bar up to 499.9 bar are collected. These data are updated from an earlier work performed by Lei et al. (2014). The detailed CO2 solubility data are tabulated in the Supporting Information together with the corresponding IL abbreviation, temperature and pressure. The abbreviations, full names, and structures of all the involved cations and anions are also provided in the Supporting Information. The cations include imidazolium, pyrrolidinium, pyridinium, piperidinium, ammonium, phosphonium, and sulfonium, and the anions include tetrafluoroborate [BF4], chloride [Cl], dicyanamide [DCA], nitrate [NO3], hexafluorophosphate [PF6], thiocyanate [SCN], tricyanomethanide [C(CN)3], hydrogen sulfate [HSO4], bis(trifluoromethylsulfonyl)amide [Tf2N], methylsulfate [MeSO4], etc.

To build a GC model for predicting CO2 solubility in ILs, IL molecules must be decomposed into building groups in advance. In order to broaden the applicability of the obtained model, we decompose ILs into cation cores, cation substituents, and anions. Performing the decomposition for all the involved ILs results in 51 different building groups: 13 cation cores, 28 anions and 10 substituent groups. A full list of the groups is provided in the Supporting Information along with their numbers of occurrence in each IL.

The experimental data are divided into a training set (8093 points, 80% of the data) to build the model and a test set of the remaining 2023 data points to evaluate the predictive capability of the obtained model. Considering the sparsity of the feature (group number) matrix, instead of performing random selection, we employ a hybrid artificial-random strategy to split the dataset. Specifically, the data points containing the least frequently used groups are equally divided into five folds. Subsequently, all the other data points are randomly distributed into the five folds. Finally, one fold of data is used for testing and the other four folds of data are used for model training. The resulting training and test data are also tabulated in the Supporting Information.

3. ANN-GC modeling

An artificial neural network (ANN) consists of many layers and each layer contains a certain number of neurons that receive data and generate data using transformation functions. In the past few decades, the number of applications of ANN for quantitative property predictions has increased significantly. The ANN-based GC models have been shown to perform better than the conventional linear and polynomial ones (Zhou et al., 2018; Mondejar et al., 2017). The applications of ANN-GC methods include the prediction of surface tension (Gharagheizi et al., 2011c), density (Valderrama et al., 2009), critical property (Gharagheizi et al., 2011a), solubility in water (Gharagheizi et al., 2011b), viscosity (Paduszyński and Domańska, 2014), etc.

As shown in Fig. 1, a popular ANN architecture comprising a three-layer feed-forward network is employed in this work. The input layer receives IL structure information (represented as 51 group numbers) and the temperature and pressure, which in total gives an input vector p with a size of 53 × 1. The hidden layer transfers and delivers this input information to the output layer where the solubility is predicted and the summation of errors is quantified. For a known input vector p, the output from the hidden layer f1(a1) is determined by Eq. (1) and the output of the output layer f2(a2) (i.e., CO2-in-IL solubility) is calculated by Eq. (2).

$$f_1(a_1) = f_1(W_1 p + b_1) \tag{1}$$

$$f_2(a_2) = f_2(W_2 f_1(a_1) + b_2) \tag{2}$$

The MATLAB tansig and purelin transfer functions are employed in the hidden and output layers, respectively. This combination of transfer functions is very effective for building a three-layer neural network. In fact, we also tried other functions and found that the tansig and purelin functions are indeed the best for our regression. The expressions of the functions are given below.

$$f_1(x) = \frac{2}{1 + e^{-2x}} - 1 \tag{3}$$

$$f_2(x) = x \tag{4}$$

There are five parameters in this ANN, i.e., the number of neurons in the hidden layer, weight matrices W1 and W2, and bias vectors b1 and b2. The number of neurons in the hidden layer is normally specified before the weight and bias parameters are
regressed. Generally, with too few neurons the network may not be powerful enough for predicting the data. However, with too many neurons, the network tends to overfit. In this work, we start to train the ANN-GC model with two neurons in the hidden layer and gradually increase the number until no significant improvement in the performance of the network is achieved for both training and test set data. Following this strategy, 7 neurons in the hidden layer are finally identified and used. Afterwards, W1, W2, b1 and b2 are obtained by minimizing the summation of absolute errors between the experimental and model-predicted CO2 solubility in the training dataset using the Levenberg-Marquardt algorithm implemented in MATLAB. The obtained W1, W2, b1, and b2 are provided in the Supporting Information.

Fig. 1. Schematic structure of the employed three-layer ANN (the dimensions of W1, W2, b1, and b2 are given in the brackets). Input p (53×1); hidden layer: a1 = W1×p + b1 with W1 (7×53), b1 (7×1), a1 (7×1), output f1(a1); output layer: a2 = W2×f1(a1) + b2 with W2 (1×7), b2 (1×1), a2 (1×1); output = f2(a2).

Fig. 2 compares the experimental and model-predicted CO2 solubility for the 8093 data points in the training set and 2023 points in the test set. It is clear that both the training and test data distribute closely along the diagonal line, except for a small number of outliers which may be attributed to experimental deviations among different measurements. To quantify the predictive power of the model, the mean absolute error (MAE) and the coefficient of determination (R²) are determined. The estimated MAE and R² for the training set are 0.0200 and 0.9842, and for the test set, they are 0.0202 and 0.9836, respectively, which demonstrates that the model is not overfitted on the training data. Moreover, these statistical indicators show an overall high-quality prediction of the model.

For a better illustration of the model performance, the errors between the predicted and experimental solubilities ($x_{CO_2}^{pred} - x_{CO_2}^{exp}$) are depicted in Fig. 3. Moreover, the histogram of the prediction deviations is presented in Fig. 4. As depicted, most of the deviations are close to zero and a very small proportion shows an absolute error higher than 0.1. The largest absolute deviation is around 0.20. It is clear that the ANN-GC model can provide an accurate prediction on the CO2 solubilities in ILs at various temperatures and pressures.
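The forward pass of Eqs. (1)–(4) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' MATLAB code: the weights below are random placeholders rather than the regressed values from the Supporting Information, the helper names (`tansig`, `purelin`, `ann_forward`) are chosen here to mirror the MATLAB transfer-function names, and any input or output scaling applied in the original code is not described in the text and is therefore omitted.

```python
import math
import random

N_INPUTS = 53   # temperature, pressure, and 51 group counts
N_HIDDEN = 7    # hidden-layer size reported in the text


def tansig(x: float) -> float:
    """Eq. (3): 2 / (1 + exp(-2x)) - 1, which is algebraically equal to tanh(x)."""
    return math.tanh(x)


def purelin(x: float) -> float:
    """Eq. (4): identity transfer function."""
    return x


def ann_forward(p, W1, b1, W2, b2):
    """Evaluate Eqs. (1) and (2) for one input vector p of length 53."""
    # Hidden layer: f1(a1) with a1 = W1 p + b1 (W1 is 7x53, b1 has 7 entries)
    hidden = [
        tansig(sum(w * x for w, x in zip(row, p)) + bias)
        for row, bias in zip(W1, b1)
    ]
    # Output layer: f2(a2) with a2 = W2 f1(a1) + b2 (W2 is 1x7, b2 is a scalar)
    return purelin(sum(w * h for w, h in zip(W2, hidden)) + b2)


# Placeholder parameters for illustration only (NOT the regressed values).
rng = random.Random(0)
W1 = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
b1 = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
W2 = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
b2 = 0.0

# Hypothetical input: T in K, P in bar, then 51 group counts (all zero here).
p = [298.2, 49.3] + [0.0] * 51
x_pred = ann_forward(p, W1, b1, W2, b2)
```

With 7 hidden neurons the network has 7×53 + 7 + 7 + 1 = 386 adjustable weights and biases, which follows directly from the dimensions given in Fig. 1.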
Fig. 2. Comparison between the experimental and ANN-GC predicted CO2 solubility.
Fig. 3. Error of the ANN-GC model for predicting the CO2 solubility.
Fig. 4. Distribution of the prediction error of the ANN-GC model.
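The statistical indicators used to assess the models (MAE, R², and the share of predictions within a given absolute error) can be computed as in the following sketch. It assumes the common definition R² = 1 − SS_res/SS_tot, since the text does not state which formula was used, and the five data pairs are made-up numbers for illustration, not values from the database. The function names are hypothetical.

```python
def mean_absolute_error(y_exp, y_pred):
    """Mean of |predicted - experimental| over all data points."""
    return sum(abs(p - e) for e, p in zip(y_exp, y_pred)) / len(y_exp)


def r_squared(y_exp, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_exp = sum(y_exp) / len(y_exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(y_exp, y_pred))
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    return 1.0 - ss_res / ss_tot


def fraction_within(y_exp, y_pred, tol):
    """Share of points whose absolute error is strictly below tol."""
    hits = sum(1 for e, p in zip(y_exp, y_pred) if abs(p - e) < tol)
    return hits / len(y_exp)


# Made-up mole fractions for illustration only (not taken from the paper).
y_exp = [0.10, 0.25, 0.40, 0.55, 0.70]
y_pred = [0.12, 0.24, 0.43, 0.49, 0.71]

mae = mean_absolute_error(y_exp, y_pred)
r2 = r_squared(y_exp, y_pred)
share_005 = fraction_within(y_exp, y_pred, 0.05)
share_010 = fraction_within(y_exp, y_pred, 0.10)
```

Evaluating `fraction_within` over a grid of tolerances gives the kind of cumulative error curve that is used to compare two models on the same test set.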
4. SVM-GC modeling

Support vector machine (SVM) is another popular ML algorithm and has attracted wide attention in many science and engineering areas due to its remarkable generalization performance. In the early years, SVM was mainly employed for pattern recognition and now this algorithm is extensively used for data regression and classification (Wang, 2005). Concisely, the SVM method first maps the input into a high-dimensional feature space by a nonlinear kernel function and then conducts linear regression in this feature space. The performance of the fitted SVM model depends on the selected kernel function and the corresponding kernel parameters as well as a few internal parameters of the algorithm.

The same type of input descriptors and the same sets of training and test data are employed for developing the SVM-GC model. The prediction or approximation function of the SVM model is given by Eq. (5)

$$f(x) = \sum_{i=1}^{l} \alpha_i K(x, x_i) + b \tag{5}$$

where x denotes the input vector and f(x) is the output. xi is the i-th feature vector and αi is the weight for this feature vector. Training points with nonzero weights are called support vectors. Vector α and the constant b (bias) are adjustable model parameters that are optimized during the training process. K(x, xi) is a selected kernel function. The Gaussian radial basis function (RBF) kernel is applied in this work because it has been widely demonstrated to be very effective.

$$K(x, x_i) = \exp\left(-\sigma \lVert x - x_i \rVert^2\right) \tag{6}$$

σ is an adjustable parameter of the kernel controlling the amplitude of the RBF function. By default, the kernel parameter σ equals $1/s_c^2$ where the kernel scale $s_c$ is fitted by the MATLAB SVM solver. All of the determined parameters for the SVM-GC model are presented in the Supporting Information.

The comparison between experimental and SVM-GC predicted CO2 solubility for both training and test data is shown in Fig. 5. It can be seen that except for a few outliers, most of the solubility data can be well reproduced by the obtained model. The estimated MAE and R² are 0.0231 and 0.9807 for the training set and 0.0240 and 0.9783 for the test set, respectively. These two measures also indicate a high performance of the SVM-GC model for CO2-in-IL solubility predictions.

Similarly, in order to provide more comprehensive knowledge on the model prediction errors, the deviations between the predicted and experimental solubilities ($x_{CO_2}^{pred} - x_{CO_2}^{exp}$) are plotted against the experimental observations in Fig. 6. In addition, the distribution of the prediction error is presented in Fig. 7. As seen, most of the errors fall into the range of [−0.05, 0.05] and a very small number of them are outside of [−0.1, 0.1]. The largest absolute deviation is around 0.25. These plots demonstrate that the obtained SVM-GC model can well represent the CO2 solubility in ILs at various temperature and pressure conditions.

Fig. 5. Comparison between the experimental and SVM-GC predicted CO2 solubility.
Fig. 6. Error of the SVM-GC model for predicting the CO2 solubility.

5. Model comparison

From the above sections, we know that both the ANN-GC and SVM-GC models can well represent the CO2-in-IL solubility. Table 1 summarizes the estimated MAE and R² for both models. These statistical indicators generally show a slightly better performance of the ANN model.

In order to better compare the models, Fig. 8 plots the cumulative probability of the absolute error between experimental and model-predicted CO2 solubility ($|x_{CO_2}^{pred} - x_{CO_2}^{exp}|$) in the test set for both models. As seen, the error probability curve of the ANN-GC model is slightly above that of the SVM-GC model in the entire absolute error range. 91.2% of the ANN-GC predictions and 89.2% of the SVM-GC predictions show an absolute error less than 0.05. Besides, 98.9% of the ANN-GC predictions show an absolute error of less than 0.10, and for the SVM-GC model this value is 97.8%. It is observed that both of the models can provide reliable predictions on the CO2 solubility and the ANN-GC model performs slightly better than the SVM-GC model. Besides, ANN-GC has a smaller number of model parameters and is thus preferentially recommended.

6. Application examples

In this section, an application example is provided where the ANN-GC and SVM-GC models are employed to predict the CO2 solubility in [Bmim][BF4] at 298.2 K and 49.3 bar (Entry 1525 in the collected database). As indicated in the Supporting Information, the first two elements of the input vector are the temperature and pressure, followed by the numbers of the 51 different groups present in the IL molecule. Following this, it is known that
for the ANN model, the input vector p is the transpose of [298.2, 49.3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]. According to Eqs. (1)–(4), we have:

$$x_{ANN} = W_2 \left( \frac{2}{1 + e^{-2 (W_1 p + b_1)}} - 1 \right) + b_2 = 0.464$$

For the obtained SVM-GC model, there are in total 4153 support vectors (xi), as presented in the Supporting Information. The fitted kernel scale is 70.93, thus σ equals 0.0002. The regressed weight vector α and bias b are also provided in the Supporting Information. Combining Eqs. (5) and (6), for a given input vector x (i.e., vector p in the ANN model), we have:

$$x_{SVM} = \sum_{i=1}^{4153} \left[ \alpha_i \exp\left( -\sigma \lVert x - x_i \rVert^2 \right) \right] + b = 0.443$$

The experimental CO2 solubility is 0.477. Therefore, the relative deviations of the ANN-GC and SVM-GC models are 2.72% and 7.13%, respectively.

Fig. 7. Distribution of the prediction error of the SVM-GC model.

Table 1
Statistical indicators for the regressed ANN-GC and SVM-GC models.
Model     MAE      R²
ANN-GC    0.0202   0.9836
SVM-GC    0.0240   0.9783

Fig. 8. Cumulative probability of the absolute error between experimental and model-predicted CO2 solubility in the test set.

7. Advantages and limitations

To date, various thermodynamic models have been used for modeling the CO2 solubility in different solvents. Due to the very sound thermodynamic basis, these models can well represent the influences of temperature and pressure on the CO2 solubility. Most of the time, they can also provide qualitatively correct predictions, which is essential and sufficient for preliminary solvent selection and process synthesis purposes. For rigorous process design or optimization, the quantitative accuracy of the thermodynamic models can be improved by reducing the application range and regressing system-specific model parameters.

Despite the high quantitative accuracy of the obtained ML-GC models, their limitations should not be ignored. Unlike the conventional models, the developed ML models are not derived from thermodynamic principles. Temperature and pressure are simply considered as two inputs for the models, as illustrated in the above application example. Having been trained with a large number of solubilities measured at different conditions, the ML models can indeed, most of the time, correctly predict the temperature and pressure effects. However, there is no theoretical guarantee on this. The other limitation of the ML models is that they are currently not available in process simulators, and using them directly for optimization is not easy. Nevertheless, many researchers (e.g., Schweidtmann et al., 2019; Eason and Biegler, 2018) are developing advanced algorithms to solve optimization problems involving data-driven models. With such algorithms, we expect that the developed ML models can be useful for computer-aided IL and process design where the optimal solvent and operating conditions are identified for a given carbon capture process.

8. Conclusion

Two GC models have been developed using the ANN and SVM algorithms for the prediction of the solubility of CO2 in different types of ILs at various temperatures and pressures. The results show that both of the models can give reliable predictions and the ANN-GC model is slightly better than the SVM-GC model. In the cases where experimental measurements are difficult, costly or even infeasible, the obtained models can provide fast solubility estimations. On the other hand, these models can be incorporated into a CAMD framework to identify the best ILs for carbon capture processes.

An accurate model prediction is normally limited to the compounds and conditions similar to those used in model training. Since the training data already cover most of the known types of ILs measured at extremely broad ranges of temperatures and pressures (from 243.2 K to 453.15 K and from 0.00798 bar up to 499.9 bar), the obtained models are expected to possess a wide application range. The regressed model parameters as well as the original MATLAB codes are provided in the Supporting Information; interested readers can thus make use of the models for their own predictions.

It is worth mentioning that the experimental solubility data, especially the cases of very low solubilities, may have large uncertainties. Usually, these uncertainties should be taken into account in model development. However, since most of the available literature has not reported the data uncertainties, they are not considered in the present work.

CRediT authorship contribution statement

Zhen Song: Conceptualization, Methodology, Writing - original draft. Huaiwei Shi: Data curation, Investigation. Xiang Zhang: Software, Validation. Teng Zhou: Supervision, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ces.2020.115752.

References

Austin, N.D., Sahinidis, N.V., Trahan, D.W., 2016. Computer-aided molecular design: An introduction and review of tools, applications, and solution techniques. Chem. Eng. Res. Des. 116, 2–26.
Cao, L., Zhu, P., Zhao, Y., Zhao, J., 2018. Using machine learning and quantum chemistry descriptors to predict the toxicity of ionic liquids. J. Hazard. Mater. 352, 17–26.
Ceriani, R., Gani, R., Meirelles, A.J.A., 2009. Prediction of heat capacities and heats of vaporization of organic liquids by group contribution methods. Fluid Phase Equilib. 283, 49–55.
Chávez-Islas, L.M., Vasquez-Medrano, R., Flores-Tlacuahuac, A., 2011. Optimal molecular design of ionic liquids for high-purity bioethanol production. Ind. Eng. Chem. Res. 50, 5153–5168.
Chen, Y.Q., Gani, R., Kontogeorgis, G.M., Woodley, J.M., 2019a. Integrated ionic liquid and process design involving azeotropic separation processes. Chem. Eng. Sci. 203, 402–414.
Chen, Y.Q., Koumaditi, E., Gani, R., Kontogeorgis, G.M., Woodley, J.M., 2019b. Computer-aided design of ionic liquids for hybrid process schemes. Comput. Chem. Eng. 130, 106556.
Chong, F.K., Foo, D.C.Y., Eljack, F.T., Atilhan, M., Chemmangattuvalappil, N.G., 2015. Ionic liquid design for enhanced carbon dioxide capture by computer-aided molecular design approach. Clean Technol. Environ. Policy 17, 1301–1312.
Chong, F.K., Foo, D.C.Y., Eljack, F.T., Atilhan, M., Chemmangattuvalappil, N.G., 2016. A systematic approach to design task-specific ionic liquids and their optimal operating conditions. Mol. Syst. Des. Eng. 1, 109–121.
Eason, J.P., Biegler, L.T., 2018. Advanced trust region optimization strategies for glass box/black box models. AIChE J. 64, 3934–3943.
Eckert, F., Klamt, A., 2002. Fast solvent screening via quantum chemistry: COSMO-RS approach. AIChE J. 48, 369–385.
Faúndez, C.A., Fierro, E.N., Valderrama, J.O., 2016. Solubility of hydrogen sulfide in ionic liquids for gas removal processes using artificial neural networks. J. Environ. Chem. Eng. 4, 211–218.
Fatehi, M.-R., Raeissi, S., Mowla, D., 2017. Estimation of viscosities of pure ionic liquids using an artificial neural network based on only structural characteristics. J. Mol. Liq. 227, 309–317.
Fredenslund, A., Jones, R.L., Prausnitz, J.M., 1975. Group-contribution estimation of activity coefficients in nonideal liquid mixtures. AIChE J. 21, 1086–1099.
Gani, R., Zhang, L., Kalakul, S., Cignitti, S., 2016. Chapter 6 - Computer-aided molecular design and property prediction. In: Martín, M., Eden, M.R., Chemmangattuvalappil, N.G. (Eds.), Computer Aided Chemical Engineering. Elsevier, pp. 153–196.
Gharagheizi, F., Eslamimanesh, A., Mohammadi, A.H., Richon, D., 2011a. Determination of critical properties and acentric factors of pure compounds using the artificial neural network group contribution algorithm. J. Chem. Eng. Data 56, 2460–2476.
Gharagheizi, F., Eslamimanesh, A., Mohammadi, A.H., Richon, D., 2011b. Representation/prediction of solubilities of pure compounds in water using artificial neural network-group contribution method. J. Chem. Eng. Data 56, 720–726.
Gharagheizi, F., Eslamimanesh, A., Mohammadi, A.H., Richon, D., 2011c. Use of artificial neural network-group contribution method to determine surface tension of pure compounds. J. Chem. Eng. Data 56, 2587–2601.
Gharagheizi, F., Ilani-Kashkouli, P., Kamari, A., Mohammadi, A.H., Ramjugernath, D., 2014. A group contribution model for the prediction of the freezing point of organic compounds. Fluid Phase Equilib. 382, 21–30.
Gharagheizi, F., Salehi, G.R., 2011. Prediction of enthalpy of fusion of pure compounds using an Artificial Neural Network-Group Contribution method. Thermochim. Acta 521, 37–40.
Gharagheizi, F., Sattari, M., Ilani-Kashkouli, P., Mohammadi, A.H., Ramjugernath, D., Richon, D., 2012. Quantitative structure-property relationship for thermal decomposition temperature of ionic liquids. Chem. Eng. Sci. 84, 557–563.
Gharagheizi, F., Sattari, M., Ilani-Kashkouli, P., Mohammadi, A.H., Ramjugernath, D., Richon, D., 2013. A "non-linear" quantitative structure-property relationship for the prediction of electrical conductivity of ionic liquids. Chem. Eng. Sci. 101, 478–485.
Holderbaum, T., Gmehling, J., 1991. PSRK: A group contribution equation of state based on UNIFAC. Fluid Phase Equilib. 70, 251–265.
Huang, Y., Dong, H., Zhang, X., Li, C., Zhang, S., 2013. A new fragment contribution-corresponding states method for physicochemical properties prediction of ionic liquids. AIChE J. 59, 1348–1359.
Hukkerikar, A.S., Sarup, B., Ten Kate, A., Abildskov, J., Sin, G., Gani, R., 2012. Group-contribution+ (GC+) based estimation of properties of pure components: Improved property estimation and uncertainty analysis. Fluid Phase Equilib. 321, 25–43.
Katritzky, A.R., Jain, R., Lomaka, A., Petrukhin, R., Karelson, M., Visser, A.E., Rogers, R.D., 2002. Correlation of the melting points of potential ionic liquids (Imidazolium Bromides and Benzimidazolium Bromides) using the CODESSA Program. J. Chem. Inf. Comput. Sci. 42, 225–231.
Lazzus, J.A., Pulgar-Villarroel, G., 2015. A group contribution method to estimate the viscosity of ionic liquids at different temperatures. J. Mol. Liq. 209, 161–168.
Lei, Z.G., Dai, C.N., Chen, B.H., 2014. Gas solubility in ionic liquids. Chem. Rev. 114, 1289–1326.
Liu, X., Chen, Y., Zeng, S., Zhang, X., Zhang, S., Liang, X., Gani, R., Kontogeorgis, G.M., 2020. Structure optimization of tailored ionic liquids and process simulation for shale gas separation. AIChE J. 66, e16794.
Marrero, J., Gani, R., 2001. Group-contribution based estimation of pure component properties. Fluid Phase Equilib. 183–184, 183–208.
Mondejar, M.E., Cignitti, S., Abildskov, J., Woodley, J.M., Haglind, F., 2017. Prediction of properties of new halogenated olefins using two group contribution approaches. Fluid Phase Equilib. 433, 79–96.
Mourah, M., NguyenHuynh, D., Passarello, J.P., de Hemptinne, J.C., Tobaly, P., 2010. Modelling LLE and VLE of methanol+n-alkane series using GC-PC-SAFT with a group contribution kij. Fluid Phase Equilib. 298, 154–168.
Mulero, Á., Cachadiña, I., Valderrama, J.O., 2017. Artificial neural network for the correlation and prediction of surface tension of refrigerants. Fluid Phase Equilib. 451, 60–67.
Paduszyński, K., Domańska, U., 2014. Viscosity of ionic liquids: an extensive database and a new group contribution model based on a feed-forward artificial neural network. J. Chem. Inf. Model. 54, 1311–1324.
Roughton, B.C., Christian, B., White, J., Camarda, K.V., Gani, R., 2012. Simultaneous design of ionic liquid entrainers and energy efficient azeotropic separation processes. Comput. Chem. Eng. 42, 248–262.
Sattari, M., Gharagheizi, F., Ilani-Kashkouli, P., Mohammadi, A.H., Ramjugernath, D., 2013. Estimation of the heat capacity of ionic liquids: a quantitative structure-property relationship approach. Ind. Eng. Chem. Res. 52, 13217–13221.
Sattari, M., Kamari, A., Hashemi, H., Mohammadi, A.H., Ramjugernath, D., 2016. A group contribution model for prediction of the viscosity with temperature dependency for fluorine-containing ionic liquids. J. Fluorine Chem. 186, 19–27.
Schweidtmann, A.M., Huster, W.R., Lüthje, J.T., Mitsos, A., 2019. Deterministic global process optimization: Accurate (single-species) properties via artificial neural networks. Comput. Chem. Eng. 121, 67–74.
Sedghamiz, M.A., Rasoolzadeh, A., Rahimpour, M.R., 2015. The ability of artificial neural network in prediction of the acid gases solubility in different ionic liquids. J. CO2 Util. 9, 39–47.
Su, Y., Wang, Z., Jin, S., Shen, W., Ren, J., Eden, M.R., 2019. An architecture of deep learning in QSPR modeling for the prediction of critical properties using molecular signatures. AIChE J. 65, e16678.
Tatar, A., Naseri, S., Bahadori, M., Hezave, A.Z., Kashiwao, T., Bahadori, A., Darvish, H., 2016. Prediction of carbon dioxide solubility in ionic liquids using MLP and radial basis function (RBF) neural networks. J. Taiwan Inst. Chem. Eng. 60, 151–164.
Theo, W.L., Lim, J.S., Hashim, H., Mustaffa, A.A., Ho, W.S., 2016. Review of pre-combustion capture and ionic liquid in carbon capture and storage. Appl. Energy 183, 1633–1663.
Valderrama, J.O., Reátegui, A., Rojas, R.E., 2009. Density of ionic liquids using group contribution and artificial neural networks. Ind. Eng. Chem. Res. 48, 3254–3259.
Valencia-Marquez, D., Flores-Tlacuahuac, A., Vasquez-Medrano, R., 2017. An optimization approach for CO2 capture using ionic liquids. J. Cleaner Prod. 168, 1652–1667.
Wang, L., 2005. Support Vector Machines: Theory and Applications.
Wang, Z., Su, Y., Shen, W., Jin, S., Clark, J.H., Ren, J., Zhang, X., 2019. Predictive deep learning models for environmental properties: the direct calculation of octanol–water partition coefficients from molecular graphs. Green Chem. 21, 4555–4565.
Zeng, S., Zhang, X., Bai, L., Zhang, X., Wang, H., Wang, J., Bao, D., Li, M., Liu, X., Zhang, S., 2017. Ionic-liquid-based CO2 capture systems: structure, interaction and process. Chem. Rev. 117, 9625–9673.
Zhang, L., Mao, H., Liu, L., Du, J., Gani, R., 2018. A machine learning based computer-aided molecular design/screening methodology for fragrance molecules. Comput. Chem. Eng. 115, 295–308.
Zhao, Y., Gao, J., Huang, Y., Afzal, R.M., Zhang, X., Zhang, S., 2016. Predicting H2S solubility in ionic liquids by the quantitative structure–property relationship method using Sσ-profile molecular descriptors. RSC Adv. 6, 70405–70413.
Zhao, Y., Zhao, J., Huang, Y., Zhou, Q., Zhang, X., Zhang, S., 2014. Toxicity of ionic liquids: Database and prediction via quantitative structure–activity relationship method. J. Hazard. Mater. 278, 320–329.
Zhao, Y.S., Huang, Y., Zhang, X.P., Zhang, S.J., 2015. A quantitative prediction of the viscosity of ionic liquids using Sσ-profile molecular descriptors. PCCP 17, 3761–3767.
Zhou, T., Jhamb, S., Liang, X., Sundmacher, K., Gani, R., 2018. Prediction of acid dissociation constants of organic compounds using group contribution methods. Chem. Eng. Sci. 183, 95–105.
