
Journal of Natural Gas Science and Engineering 21 (2014) 442-450



journal homepage: www.elsevier.com/locate/jngse

Prediction of amines capacity for carbon dioxide absorption in gas sweetening processes
Mohammadreza Momeni, Siavash Riahi*
Institute of Petroleum Engineering, Faculty of Chemical Engineering, College of Engineering, University of Tehran, Tehran, Iran

Article info

Article history:
Received 23 July 2014
Received in revised form 30 August 2014
Accepted 1 September 2014
Available online 26 September 2014

Abstract

Almost all gas reservoirs around the world produce sour gas that contains considerable amounts of acid gases, including carbon dioxide and hydrogen sulfide. Because carbon dioxide in water tends to cause corrosion and the presence of CO2 in natural gas reduces its heating value, it must be removed before natural gas is prepared for marketing. Many technologies remove carbon dioxide from natural gas using regenerable amine-based solvents. To make these technologies more efficient and economical, further experimental and modeling research is required to identify the main parameters that influence the capacity of amines for CO2 absorption. Numerous studies of amines have shown that relationships exist between the structure of an amine and its capacity for carbon dioxide absorption. The Quantitative Structure Property/Activity Relationship (QSPR/QSAR) approach provides an effective method for predicting the capacity of amines for CO2 absorption. In this paper, density functional theory (DFT) at the B3LYP level with the 6-311G(d,p) basis set was first employed to perform molecular geometry optimization. Then, the quantitative relationship between the absorption capacity data and the calculated descriptors was obtained by multiple linear regression (MLR), with model variables selected by a genetic algorithm (GA). The accuracy of the model was verified by different statistical methods, and the results confirmed the high statistical quality of the model. Unlike other QSPR studies, the equation reported in this paper consists of simple and easily calculated descriptors, which form a robust model for predicting the capacity of amines for carbon dioxide absorption.
© 2014 Elsevier B.V. All rights reserved.

Keywords:
Gas sweetening
Rich loading
Carbon dioxide
Absorption
Amines
QSPR

1. Introduction
Amines are molecules containing nitrogen atoms attached to a carbon-based chain structure. They are applied in various fields of engineering and science. One of their most important applications is as an acid gas absorption liquid for removing carbon dioxide from natural gas or from oxygen-containing systems such as flue gas (Singh et al., 2007, 2009). The absorption capacity of amines is an important characteristic. Moreover, different aspects of molecular behavior, from toxicity and environmental protection to technical issues, can be affected by this feature. The solubility and absorption rate of carbon dioxide in amine-based CO2 absorbents are important not only for technical reasons but also for environmental ones. Since experimental determination of absorption capacity (or rich loading) is very time-consuming and expensive and the values are not always available in the literature, estimation plays an important role (Pourbasheer et al., 2011). Hence, the development of capable methods for predicting the absorption capacity of different amines becomes an urgent task.

* Corresponding author. University of Tehran, Tehran 11365-4563, Iran. Tel.: +98 21 61114714.
E-mail address: riahi@ut.ac.ir (S. Riahi).
http://dx.doi.org/10.1016/j.jngse.2014.09.002
1875-5100/© 2014 Elsevier B.V. All rights reserved.

Gas sweetening, or acid gas removal (for instance of CO2 and H2S), is conventionally used in various industries (Bohloul et al., 2014). Almost all gas reservoirs around the world produce sour gas that contains considerable amounts of acid gases, including carbon dioxide and hydrogen sulfide. Because carbon dioxide in water tends to cause corrosion and the presence of CO2 in natural gas reduces its heating value, it must be removed before natural gas is prepared for marketing (Mokhatab and Poe, 2012). The most common absorption media for this purpose are aqueous amine solutions. Amine derivatives including monoethanolamine (MEA), diethanolamine (DEA) and methyldiethanolamine (MDEA) are widely used in commercial and industrial applications (Kohl and Nielsen, 1997). Given the importance of amines in acid gas removal technologies, a descriptive and novel model has to be developed from which amine chemical properties can be predicted.
There is evidence in the literature of relationships between the structure of an amine and its capacity for carbon dioxide absorption (rich loading). A significant contribution to analyzing the relationship between the structure and absorption capacity of amines was made by Chakraborty et al. They showed that substituents at the α-carbon cause carbamate instability, which results in accelerated hydrolysis; as a result, the amount of bicarbonate increases, which leads to higher carbon dioxide loading (Chakraborty et al., 1986). Sartori and Savage explained that steric hindrance effects produced by the α-substituent are responsible for these instabilities (Sartori and Savage, 1983). In addition, Chakraborty et al. studied the electronic effects of substituents and suggested that substitution at the carbon atom causes an interaction of the π and π* methyl group orbitals with the lone pair of the nitrogen. Since the nitrogen charge is reduced by this interaction, the strength of the N-H bond is reduced, which results in increased hydrolysis in aqueous solution. It seems that the rate of the initial reaction can be reduced by steric hindrance effects; however, the amount of amine available to react with CO2 grows noticeably (Chakraborty et al., 1988). Furthermore, solvent screening experiments and an investigation of the effects of variables such as chain length, the number of functional groups, and the position of side chains and functional groups were conducted by Singh et al., who performed a semi-quantitative study of these effects on the capacity of amines for CO2 absorption (Singh et al., 2007, 2009). In addition, a computational study of the reactions between functionalized amines and CO2 was performed by Lee and Kitchin. They highlighted the molecular descriptors from which reactivity trends can be obtained. Their work revealed that electron-withdrawing and electron-donating groups tend to destabilize and stabilize CO2 reaction products, respectively (Lee and Kitchin, 2012). All of the results in this paper are based on mathematical calculations and model development. To the best of the authors' knowledge, this work is the first quantitative study of the capacity of amines for CO2 absorption based on a simple and robust model.
To achieve this goal, a close examination of the relationship between the chemical structure and the activity of different amine-based solutions is required. Quantitative Structure Property/Activity Relationship (QSPR/QSAR) analysis provides an effective method for processing, analyzing and predicting the characteristics of different molecules (Beheshti et al., 2012, 2009; Freire et al., 2010; Liang et al., 2013; Godavarthy et al., 2006; Riahi, 2009, 2008; Riahi et al., 2008). The quantitative structure-property relationship technique relates chemical or physical properties of compounds to their molecular structures. It is used to develop a quantitative correlation that can predict specific molecular properties, for example environmental functions or physico-chemical behaviors. The QSPR approach is based on the assumption that differences in the behavior of molecules can be correlated with variations in molecular features that are technically termed descriptors. The descriptors are numerical values that characterize the shape and structure of the molecule. To use the QSPR method, knowledge of the chemical structures of the molecules is sufficient, and there is no need to conduct experiments. QSPR requires a sequence of procedures; accordingly, the following steps were taken (Fini et al., 2012):
1. A data set of molecules was taken from the literature with their corresponding absorption capacities.
2. The structural properties of the molecules were extracted and calculated using computer software.
3. The best model, containing an optimum number of descriptors, was selected by means of algorithms such as the genetic algorithm (GA) and MLR.
4. The selected model was validated using statistical tests and validation methods such as leave-one-out cross-validation.
In QSPR approaches, selecting the proper method for constructing a robust and precise model is very important. Multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS) are the methods most widely used in QSPR modeling (Katritzky et al., 2000; Marengo et al., 1992). Variable selection for building a well-fitted model is a further step, and the genetic algorithm (GA) is a well-known method for accomplishing this task. This paper focuses on the development of a descriptive, novel QSPR model for predicting the absorption capacity (or rich loading) of various amines used in industrial carbon capture units. The quantitative relationship between the absorption capacity data and the calculated descriptors was obtained by multiple linear regression (MLR), and the model variables were selected by a genetic algorithm (GA) (Depczynski et al., 2000; Jouan-Rimbaud et al., 1995). The accuracy of the model was verified by different statistical methods, and the results confirmed the high statistical quality of the model. One of the main disadvantages of the QSPR technique is that, in most studies in this area, the final equation reported as the best model contains unfamiliar descriptors that are not only hard to calculate but also difficult or impossible to interpret. The equation reported in this paper, by contrast, consists of descriptors that are simple in terms of both calculation and interpretation. The model also shows high statistical quality, which supports its predictive power and robustness.
2. Materials and methods
The absorption capacities (rich loading) of 23 amine-based solvents for carbon dioxide absorption (Table 1) were taken from the literature (Singh et al., 2007). First, density functional theory (DFT) at the B3LYP level with the 6-311G(d,p) basis set was employed to perform geometry optimization (Cramer, 2005; da Silva and Svendsen, 2004). These calculations were performed with the Gaussian software (Frisch et al., 1998). The input to Gaussian was molecular structures pre-optimized with the semi-empirical AM1 geometry optimization method. This process yields a group of precise and applicable descriptors representing electronic and quantum chemical properties of the molecules. Quantum chemical descriptors include properties such as the dipole moment, the sum of the electronic and thermal free energies, atomic charges, the HOMO energy (highest occupied molecular orbital energy), the LUMO energy (lowest unoccupied molecular orbital energy), the exact polarizability, etc. In total, 31 quantum chemical descriptors were calculated for each molecule.
Table 1
Experimental and calculated values of amine capacities for CO2 absorption (rich loading). Structure drawings are not reproduced.

No   Name                                       Exp.   Eq. (2) (Model)   Eq. (1)
1    1,2-diamino propane                        1.27   1.29              1.23
2    1,3-diamino propane                        1.30   1.29              1.25
3    1,4-Diamino butane(T)                      1.26   1.37              1.29
4    2-Amino-1-butanol                          0.88   0.79              0.80
5    2-Methyl pyridine                          0.06   0.09              0.08
6    2-Pyridylamine                             0.28   0.59              0.23
7    3-Amino-1-Propanol                         0.88   0.71              0.72
8    4-Amino-1-butanol                          0.83   0.79              0.76
9    5-Amino-1-pentanol(T)                      0.84   0.87              0.85
10   Butylamine                                 0.86   0.79              0.84
11   Diethylenetriamine                         1.83   1.81              1.77
12   Ethylamine                                 0.91   0.63              0.82
13   Ethylenediamine                            1.08   1.21              1.20
14   Hexamethylenediamine                       1.48   1.53              1.46
15   Isobutylamine(T)                           0.78   0.79              0.82
16   Monoethanolamine                           0.72   0.63              0.61
17   N-(2-Hydroxyethyl)ethylenediamine          1.15   1.23              1.17
18   N,N'-bis(2-hydroxyethyl)ethylenediamine    1.20   1.25              1.27
19   N-Pentylamine                              0.72   0.87              0.90
20   Propylamine(T)                             0.77   0.71              0.80
21   Pyridine(T)                                0.05   0.01              0.12
22   sec-Butylamine                             0.84   0.79              0.87
23   Triethylenetetramine                       2.51   2.41              2.47

All absorption capacity (rich loading) values are in mol CO2/mol amine. Names marked with (T) are test set molecules.

Next, the geometrically optimized structure of each molecule was fed into the Dragon software developed by the Milano Chemometrics and QSAR Research Group (Todeschini et al., 2002). As a result, more than 1486 theoretical molecular descriptors were calculated for each molecule. These descriptors can be divided into different groups, for instance constitutional descriptors, topological descriptors, functional group counts, molecular properties, etc. Because such a large amount of numerical data makes further calculations imprecise and slow, the number of calculated descriptors was reduced by the following accepted procedure:
1. Constant and near-constant descriptors were eliminated (361 excluded).
2. Among collinear descriptors (R > 0.98), the one that had the better correlation with absorption capacity was retained and the others were eliminated (611 excluded).
After applying the above constraints, a total of 514 descriptors were retained for each molecule as the output of this stage.
Finally, the calculated descriptors formed a (23 × 545) data matrix, where 23 is the number of compounds and 545 is the number of descriptors (consistent with the 514 retained descriptors plus the 31 quantum chemical descriptors).
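The two-step descriptor reduction described in Section 2 (dropping near-constant descriptors, then keeping one member of each collinear group) can be illustrated with the minimal sketch below. It is not the authors' implementation: the function names, the variance tolerance `var_tol`, and the greedy ordering by correlation with the target are assumptions made for illustration; only the 0.98 collinearity threshold comes from the text.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)

def reduce_descriptors(columns, target, r_max=0.98, var_tol=1e-8):
    """columns: dict mapping descriptor name -> list of values (one per molecule).

    Returns the names of the retained descriptors.
    var_tol is a hypothetical cut-off for "near-constant" (not given in the paper).
    """
    # Step 1: eliminate constant and near-constant descriptors.
    kept = {}
    for name, values in columns.items():
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        if variance > var_tol:
            kept[name] = values

    # Step 2: visit descriptors from best to worst correlation with the target
    # and keep one only if it is not collinear (|R| > r_max) with any kept so far.
    # This greedy ordering is an assumption about how "the better one" is chosen.
    ranked = sorted(kept, key=lambda nm: abs(pearson(kept[nm], target)), reverse=True)
    selected = []
    for name in ranked:
        if all(abs(pearson(kept[name], kept[s])) <= r_max for s in selected):
            selected.append(name)
    return selected

# Toy data (hypothetical, five "molecules"), not taken from the paper.
toy_columns = {
    "const": [1, 1, 1, 1, 1],   # removed in step 1
    "a":     [1, 2, 3, 4, 5],   # best correlated with the target
    "a_dup": [1, 2, 3, 4, 6],   # collinear with "a", removed in step 2
    "b":     [5, 1, 4, 2, 3],   # weakly correlated, not collinear, kept
}
toy_target = [1, 2, 3, 4, 5]
print(reduce_descriptors(toy_columns, toy_target))
```

On the toy data the constant column and the near-duplicate of `a` are dropped, which mirrors the two exclusion steps listed above on a much smaller scale.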
3. Model development
After the descriptors were calculated, GA-MLR was applied as the variable selection and model development procedure to obtain the model with the highest predictive power, based on the training set. The construction of the training and test sets is discussed in the Results section. The GA-MLR analysis led to one model with three variables. The following linear equation was built from the molecules in the training set:

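\[ \mathrm{AC} = [\pm]\,0.36 \;[\pm]\; 1.57 \times \mathrm{Mor09v} \;[\pm]\; 0.32 \times \mathrm{RDF035m} \;[\pm]\; 0.43 \times \mathrm{nN} \tag{1} \]

where [±] marks a sign that is not legible in the source text.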
AC denotes the absorption capacity. Mor09v is one of the 3D-MoRSE descriptors and is defined as signal 09 weighted by van der Waals volume. RDF035m belongs to the group of RDF descriptors and describes the radial distribution function-035 weighted by mass, and nN represents the number of nitrogen atoms. Two of the descriptors in the above model are difficult to calculate, because the calculations must be performed by computer. It is also not easy to describe the relationship between these two descriptors and the absorption capacity of amines. In QSPR studies, interpretation of the model and its descriptors is a necessary and important step, so it was decided to investigate new models with simpler descriptors. In addition, given the chemical reaction of amines with carbon dioxide, the number of amino groups can be expected to affect the capacity of amines for carbon dioxide absorption. The chemistry of carbon dioxide reactions with amine-based solvents is presented in the Discussion section. After developing numerous simple equations and evaluating them with different statistical methods, the following model was selected:

\[ \mathrm{AC} = -0.19 + 0.04 \times \mathrm{nH} + 0.54 \times \mathrm{nRNH2} + 0.40 \times \mathrm{nRNHR} \tag{2} \]
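As a quick illustration, Eq. (2) can be evaluated directly from atom and group counts. The sketch below reads the equation as AC = -0.19 + 0.04 nH + 0.54 nRNH2 + 0.40 nRNHR; the negative intercept is an assumption, checked against the Eq. (2) column of Table 1. The function name is hypothetical, and the descriptor counts are taken from the standard molecular formulas of the amines rather than from the paper.

```python
def absorption_capacity(n_h, n_rnh2, n_rnhr):
    """Rich loading (mol CO2/mol amine) from Eq. (2).

    n_h    : number of hydrogen atoms (nH)
    n_rnh2 : number of primary aliphatic amine groups (nRNH2)
    n_rnhr : number of secondary aliphatic amine groups (nRNHR)
    The sign of the intercept is an assumption inferred from Table 1.
    """
    return -0.19 + 0.04 * n_h + 0.54 * n_rnh2 + 0.40 * n_rnhr

# (nH, nRNH2, nRNHR) from molecular formulas: C2H7N, C2H8N2, C4H13N3, C6H18N4
amines = {
    "Ethylamine":           (7, 1, 0),
    "Ethylenediamine":      (8, 2, 0),
    "Diethylenetriamine":   (13, 2, 1),
    "Triethylenetetramine": (18, 2, 2),
}

for name, counts in amines.items():
    print(f"{name}: {absorption_capacity(*counts):.2f}")
```

For these four amines the computed values are 0.63, 1.21, 1.81 and 2.41, matching the corresponding Eq. (2) entries in Table 1.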

Table 2 shows some statistical parameters in order to provide a better comparison between the two models. The first equation shows better statistical parameters, but the simpler descriptors of the second model, in terms of both calculation and interpretation of results, are more important. Therefore, we introduce the second equation as the preferred model for predicting the absorption capacity of amines, and the rest of this paper, including the Discussion and Conclusions sections, focuses on this model.
The molecular descriptors and their definitions are given in Table 3. The correlation matrix of the descriptors is shown in Table 4. The linear correlation between every pair of descriptors is less than 0.65, which indicates that these descriptors are sufficiently independent of each other to be used in developing a QSPR model.
As can be observed, the three descriptors that appear in the model are easily calculated, and thus there is no need for computational calculation. Moreover, this model shows high statistical quality. Indeed, to the best of our knowledge, the above model is the simplest equation available for predicting the capacity of amines for carbon dioxide absorption under specific conditions.
4. Results
Table 2
Some basic statistical values for the two models.

Model     Descriptors            R2      Q2      F        s
Eq. (1)   nN, Mor09v, RDF035m    0.979   0.971   300.96   0.082
Eq. (2)   nH, nRNH2, nRNHR       0.950   0.945   121.54   0.127

All statistical parameters in this table were calculated before the training and test procedure.

Table 3
The three molecular descriptors used in Eq. (2).

Descriptor   Type                      Definition
nH           Constitutional indices    Number of hydrogen atoms
nRNH2        Functional group counts   Number of primary amines (aliphatic)
nRNHR        Functional group counts   Number of secondary amines (aliphatic)

Table 4
Correlation matrix of the three descriptors used in Eq. (2).

Descriptor   nH      nRNH2   nRNHR
nH           1.000   0.344   0.642
nRNH2        0.344   1.000   0.005
nRNHR        0.642   0.005   1.000

One of the most critical factors that influence the quality of a regression model is how the training and test sets are selected and constructed so as to ensure molecular diversity in both. To take this into account, of the 23 amine-based carbon dioxide absorbents, 18 molecules (about 80%) were selected to form the training set and 5 molecules (about 20%) formed the test set. The test set was used for external cross-validation of the model. One of the common techniques in the QSPR approach for constructing training and test sets under the constraint of structural diversity is principal component analysis (PCA) (Hu et al., 2009; Riahi et al., 2008). In the current work, PCA was employed to classify the data set of molecules into training and test sets. For this purpose, PC1 and PC2 were calculated based on the descriptors in the model. These two principal components accounted for 57.5% and 33.5% of the variation in the data, respectively, and played the main roles. Fig. 1 shows the distribution of the data over PC1 and PC2; from this figure it can be concluded that the compounds in the training and test sets are representative of the whole data set.
The training set was used to build the model, while the test set was used to validate its predictive power. During model development, the leave-one-out cross-validation (LOO-CV) method was applied to assess the performance of the different resulting models. Q2LOO was calculated for each equation obtained, and the best model was then selected on the basis of a high value of this parameter. Several statistical tests and parameters need to be considered. The coefficient of determination (R2), the adjusted R2, the coefficient of leave-one-out cross-validation (Q2), the slopes of the regression lines forced through zero (k, k'), the root mean square error (RMSE) and the standard error of the estimate (s) are the most important ones. The first five parameters should be close to unity, while RMSE and s should be close to zero. Furthermore, the intercepts of the model should be close to zero. The Fisher function (F) is another vital statistical test; high values of the F-ratio indicate reliable models. The formulas for the statistical parameters used in this paper are given below:

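\[ R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2}{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - \bar{y}^{\mathrm{exp}} \right)^2} \tag{3} \]

\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2}{n}} \tag{4} \]

\[ F = \frac{\sum_{i=1}^{n} \left( \bar{y}^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2 / df_M}{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2 / df_E} \tag{5} \]

\[ k = \frac{\sum y_i^{\mathrm{exp}} \, y_i^{\mathrm{calc}}}{\sum \left( y_i^{\mathrm{calc}} \right)^2} \tag{6} \]

\[ k' = \frac{\sum y_i^{\mathrm{exp}} \, y_i^{\mathrm{calc}}}{\sum \left( y_i^{\mathrm{exp}} \right)^2} \tag{7} \]

\[ s = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2}{df_E}} \tag{8} \]

where \bar{y}^{exp} is the mean experimental value, and dfM and dfE refer to the degrees of freedom of the model and error, respectively.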
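The statistics in Eqs. (3)-(8) are straightforward to compute from paired experimental and calculated values. The following is a minimal sketch, not the authors' code: the function name is hypothetical, and it assumes dfM = p (the number of descriptors) and dfE = n - p - 1, a choice that is consistent with the ratio of s to RMSE reported in Table 5 but is not stated explicitly in the text.

```python
from math import sqrt

def regression_statistics(y_exp, y_calc, n_descriptors):
    """Return R2, RMSE, F, k, k' and s for paired experimental/calculated values.

    Assumes df_M = p and df_E = n - p - 1 (p = number of descriptors).
    """
    n = len(y_exp)
    mean_exp = sum(y_exp) / n
    sse = sum((e - c) ** 2 for e, c in zip(y_exp, y_calc))   # residual sum of squares
    sst = sum((e - mean_exp) ** 2 for e in y_exp)            # total sum of squares
    ssr = sum((mean_exp - c) ** 2 for c in y_calc)           # regression sum of squares
    cross = sum(e * c for e, c in zip(y_exp, y_calc))
    df_m = n_descriptors
    df_e = n - n_descriptors - 1
    return {
        "R2": 1.0 - sse / sst,
        "RMSE": sqrt(sse / n),
        "F": (ssr / df_m) / (sse / df_e),
        "k": cross / sum(c * c for c in y_calc),
        "k_prime": cross / sum(e * e for e in y_exp),
        "s": sqrt(sse / df_e),
    }

# Small made-up example (not data from the paper).
stats = regression_statistics(
    y_exp=[1.0, 2.0, 3.0, 4.0, 5.0],
    y_calc=[1.1, 1.9, 3.2, 3.8, 5.0],
    n_descriptors=1,
)
for key, value in stats.items():
    print(f"{key}: {value:.4f}")
```

Applying the same function to the Exp. and Eq. (2) columns of Table 1 with three descriptors should give values close to the "Overall" row of Table 5, up to rounding of the tabulated predictions.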
In addition, the following criteria described by Golbraikh and Tropsha were applied to check the predictive ability of the QSPR model (Golbraikh and Tropsha, 2002):

\[ Q^2 > 0.5 \tag{9} \]

\[ R^2 > 0.6 \tag{10} \]

\[ \frac{R^2 - R_0^2}{R^2} < 0.1 \quad \text{and} \quad 0.85 \le k \le 1.15 \tag{11} \]

where R_0^2 is the coefficient of determination characterizing linear regression with the Y-intercept set at zero. The predicted results for all molecules in the training and test sets, together with the statistical parameters, are given in Table 5.
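The acceptance criteria in Eqs. (9)-(11) amount to a simple conjunction of threshold checks, sketched below. The function name is hypothetical. In the usage example, Q2, R2 and k are the overall values for Eq. (2) reported in Tables 2 and 5, whereas the value of R0^2 is not reported in the text and is a made-up placeholder.

```python
def passes_golbraikh_tropsha(q2, r2, r2_0, k):
    """Check the criteria of Eqs. (9)-(11) and return (overall, details)."""
    checks = {
        "Q2 > 0.5": q2 > 0.5,
        "R2 > 0.6": r2 > 0.6,
        "(R2 - R0^2)/R2 < 0.1": (r2 - r2_0) / r2 < 0.1,
        "0.85 <= k <= 1.15": 0.85 <= k <= 1.15,
    }
    return all(checks.values()), checks

# q2, r2 and k from Tables 2 and 5 (Eq. (2), overall); r2_0 is a hypothetical value.
ok, details = passes_golbraikh_tropsha(q2=0.945, r2=0.950, r2_0=0.949, k=0.999)
print("Model accepted:", ok)
for criterion, satisfied in details.items():
    print(f"  {criterion}: {satisfied}")
```

Returning the individual checks alongside the overall verdict makes it easy to see which criterion fails when a candidate model is rejected.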

Fig. 1. The principal component analysis of the molecules in training and test sets. Some points belong to more than one molecule.

Table 5
Validation parameters and statistical results of the GA-MLR model.

Set       N    R2      R2adj   RMSE    F        k       k'      s
Train     18   0.942   0.930   0.127   76.50    1.004   0.984   0.144
Test      5    0.976   0.904   0.060   17.31    0.962   1.035   0.135
Overall   23   0.950   0.942   0.116   123.79   0.999   0.990   0.128


A Y-scrambling test (Table 6) was applied in order to examine the robustness of the model (Tropsha et al., 2003). In the Y-scrambling test, the dependent variable (absorption capacity) is randomly reassigned to different amines and new QSPR models are built on the original matrix of independent variables. The newly developed QSPR models are expected to have low R2 and Q2 values. If this is not the case, the reported model is not reliable for the particular data set and modeling method.
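The Y-scrambling procedure can be sketched as follows: the responses are permuted at random, the same descriptor matrix is refitted by ordinary least squares, and the R2 of each scrambled fit is recorded. This is an illustrative sketch under stated assumptions, not the authors' procedure: it uses NumPy, the function names are hypothetical, and the data are synthetic stand-ins rather than the paper's descriptor matrix.

```python
import numpy as np

def r2_of_fit(X, y):
    """R2 of an ordinary least-squares fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ coef
    return float(1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2))

def y_scrambling(X, y, n_iter=10, seed=0):
    """Refit the model n_iter times on randomly permuted responses."""
    rng = np.random.default_rng(seed)
    return [r2_of_fit(X, rng.permutation(y)) for _ in range(n_iter)]

# Synthetic stand-in data (NOT the paper's data): 18 "molecules", 3 count descriptors.
rng = np.random.default_rng(42)
X = rng.integers(0, 5, size=(18, 3)).astype(float)
y = 0.2 + X @ np.array([0.04, 0.54, 0.40]) + rng.normal(0.0, 0.05, size=18)

print("R2 with true responses:", round(r2_of_fit(X, y), 3))
print("R2 after scrambling:", [round(v, 3) for v in y_scrambling(X, y)])
```

A robust model is expected to show a large gap between the R2 obtained with the true responses and the R2 values of the scrambled fits, which is the pattern reported in Table 6.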
The applicability domain of the model was studied using the Williams plot in Fig. 2 (OECD, 2007). In the Williams plot, the standardized residuals (R) are plotted versus the leverage (hat diagonal) values (h). Leverage indicates the distance of a compound from the centroid of X, where X is the descriptor matrix. The leverage of a compound is calculated by the following equation (Netzeva et al., 2005):

\[ h_i = x_i^{T} \left( X^{T} X \right)^{-1} x_i \tag{12} \]

where x_i is the descriptor vector of the relevant compound. The warning leverage (h*) is defined as (Eriksson et al., 2003):

\[ h^{*} = \frac{3(p + 1)}{n} \tag{13} \]

where n is the number of training objects and p is the number of descriptors in the model. The Williams plot is used to identify both response outliers and structurally influential chemicals in the model. A compound with h_i > h* influences the regression line, but it is not necessarily an outlier, as its standardized residual might be small. In this data set the warning leverage is around 0.67 (3 × (3 + 1)/18). Furthermore, a compound whose standardized residual exceeds three standard deviation units (>3s) is considered an outlier. It is common in the literature to use 3 as the cut-off value for evaluating the prediction results of a model.
Fig. 2 shows that no chemical has a leverage higher than the warning value h* of 0.67. It also shows that there are no outliers in the training or test sets, and all compounds lie between the two horizontal lines.
The experimental absorption capacity (rich loading) values of the amines are plotted in Fig. 3 against the corresponding values calculated by the QSPR model.
Furthermore, the mean effect (MF) is another term that helps to interpret the results and shows the effect of each descriptor individually or relative to other descriptors. Fig. 4 shows the standardized coefficients (also called beta coefficients) (XLSTAT, 2013). The figure is used to compare the relative weights of the descriptors. The higher the standardized coefficient of a descriptor, the greater the weight of the corresponding variable in the model. This figure shows the mean effect of the descriptors in the model.
From this figure it can be concluded that the numbers of primary amines (nRNH2) and secondary amines (nRNHR) play the main roles in the capacity of amines for carbon dioxide absorption, in that order, and the number of hydrogen atoms (nH) has the least effect. The figure also shows that all descriptors in the model have positive effects, so the capacity of amines for CO2 absorption is directly related to each of these descriptors.
Finally, it should be noted that the present work focuses exclusively on developing a simple model by which the capacities of amines for carbon dioxide absorption can be predicted. In fact, the predominant difference between this study and previous ones is that this work concentrates first on a quantitative and then on a qualitative representation of structural effects on the capacity of amines for CO2 absorption.
5. Discussion

Although high statistical parameters are significant in demonstrating the capability of the model, QSPR should also provide insight into the mechanism of carbon dioxide solubility in amine-based solvents. For this reason, an acceptable interpretation of the descriptors in the QSPR model should be provided. It is useful to determine which parameters affect the capacity of amines and which descriptors could appear in the model on the basis of the principal chemical reactions between carbon dioxide and an amine-based solvent.
The overall reaction mechanism for chemical absorption of CO2 in amine solvent systems is still under debate. A mechanism involving the formation of a zwitterion intermediate, followed by proton removal by a base B, through reactions (1) and (2) below, was suggested by Caplow (1968):

\[ \mathrm{CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NH^{+}COO^{-}} \tag{1} \]

\[ \mathrm{R_1R_2NH^{+}COO^{-} + B \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{2} \]

R1 and R2 denote the substituent groups attached to the amine group. B is a base molecule, which can be a water molecule. The intermediate in the reaction is the zwitterion. More recent studies, however, showed that the zwitterion seems to be short-lived and may be an entirely transient state (da Silva and Svendsen, 2004). This led to the assumption of a single-step mechanism for these reactions (reaction (3)). A termolecular single-step mechanism was suggested by Crooks and Donnellan (1989):

\[ \mathrm{B + CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{3} \]

where B is again the base molecule. In this mechanism, the NH group is attacked by the base molecule and deprotonation of the amine occurs. The bonding between the amine and carbon dioxide takes place simultaneously.

Table 6
R2 train values after several Y-scrambling tests.

Iteration   R2 train
1           0.060
2           0.074
3           0.119
4           0.027
5           0.188
6           0.102
7           0.119
8           0.209
9           0.096
10          0.039


Fig. 2. Williams plot of GA-MLR model development.

As can be noticed, the reaction between CO2 and an amine-based solvent takes place because of the presence of the NH bond. The NH group is thus the active site of the amine molecule, where the base molecule (water) takes part in a termolecular chemical reaction. Consequently, the number of NH bonds, or in other words the number of primary and secondary amine groups in the amine molecule, plays an important role in the capacity of amines for CO2 absorption. The numbers of primary (nRNH2) and secondary (nRNHR) amines are the two main descriptors appearing in the model. According to Fig. 4, these two descriptors have a positive effect and a higher mean effect. All these results show that the chemical reaction mechanism is consistent with the proposed model.
The model also contains the number of hydrogen atoms (nH) as another descriptor. Fig. 4 shows that the nH descriptor has a positive effect which is considerably smaller than that of the two other descriptors. The presence of the nH descriptor in the model can be explained by the experimental work of Singh et al., who showed that an increase in the chain length between the amine and other functional groups in the amine structure results in an increase in amine capacity for CO2 absorption (Singh et al., 2007). Increasing the chain length increases the number of hydrogen atoms, so a positive effect is consistent with the experimental work.
Finally, it should be noted that the model is attractively simple and its results are quite acceptable for predicting the capacity of amines for carbon dioxide absorption. Although the accuracy of the model is good for linear amine compounds, it is poorer for unsaturated cyclic amines. This can be explained by two main reasons. First, the three descriptors in the model are not sensitive to ring-type functional groups and simply count the number of hydrogen atoms and of primary and secondary amines. Second, unsaturated cyclic amines show poor absorption rate and capacity and are not potential absorbents for CO2 absorption (2). Therefore, from the industrial point of view, it is preferable to use linear amines for CO2 absorption, and it is more important for the model to predict the CO2 absorption capacity of linear amines than of unsaturated cyclic amines.
The results of the first equation (Eq. (1)) for predicting the capacity of amines for CO2 absorption are largely acceptable for both linear and aromatic ring-type amines (items 5, 6 and 21). This is because of the presence of the RDF descriptor in this model. RDF descriptors are based on the distance distribution in the geometrical representation of a molecule. This function is independent of the number of atoms and is invariant against translation and rotation of the entire molecule. The RDF code provides valuable information, e.g., about bond distances, ring types, planar and non-planar systems and atom types, so it is sensitive to aromatic rings (Todeschini and Consonni, 2008).

Fig. 3. Experimental vs. predicted rich loading values (mol CO2/mol amine) - regression line.

Fig. 4. Mean effects of model descriptors (standardized coefficient values).

6. Conclusions
One of the main concerns of the natural gas industry is to have a robust and accurate model that can predict the chemical behavior of amines in gas treatment processes. This study attempted to identify the effects of the chemical structure of amines on their capacity for carbon dioxide absorption and to develop a model for this purpose that is not only robust and accurate but also simple and applicable. Therefore, the QSPR approach was chosen as the modeling technique, and the model was developed with a linear method for its simplicity. As a result, two linear equations were developed. The first model shows high predictive power, while the second is notably simpler and readily interpretable in terms of the chemistry of the reaction of amines with carbon dioxide. Consequently, the second equation is introduced as the preferred model of this study. The most important descriptors appearing in the model, in order of the weight of the corresponding variable, are the number of primary aliphatic amines (nRNH2), the number of secondary aliphatic amines (nRNHR) and the number of hydrogen atoms (nH). The accuracy and predictive performance of the model, validated with various statistical tests and examined with a test set of five molecules, permit this model to be used to estimate the rich loading of other amines under specific conditions. According to the results, it could be argued that a good amine solvent for carbon dioxide absorption should have a linear structure with a high number of primary and secondary amine groups as side chains. In other words, increasing the number of primary and secondary amine groups increases the number of NH-bond active sites at which the reaction of the amine with CO2 takes place.
The promising results of this study might help other researchers in the fields of chemistry and natural gas engineering to design and synthesize new potential amine-based solvents and to investigate the feasibility of using them in gas removal processes. New improved solvents should also be compared with more conventional ones from the corrosivity, energy efficiency and operability points of view.
Acknowledgment
The authors gratefully acknowledge the support of the Institute of Petroleum Engineering (IPE), University of Tehran.

List of symbols
CO2        carbon dioxide
QSPR/QSAR  quantitative structure property/activity relationship
DFT        density functional theory
MLR        multiple linear regression
GA         genetic algorithms
PCR        principal component regression
PLS        partial least squares
HOMO       highest occupied molecular orbital
LUMO       lowest unoccupied molecular orbital
AC         absorption capacity
PCA        principal component analysis
LOO-CV     leave-one-out cross-validation
RMSE       root mean square error
dfM        degrees of freedom of the model
dfE        degrees of freedom of the error
References
Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative
structure–property relationship study on first reduction and oxidation potentials
of donor-substituted phenylquinolinylethynes and phenylisoquinolinylethynes:
quantum chemical investigation. Electrochim. Acta 54 (23), 5368–5375.
Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family; electrophilicity index effect.
Int. J. Electrochem. Sci. 7, 4811–4821.
Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase
Equilibr. 365, 106–111.
Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am.
Chem. Soc. 90 (24), 6795–6803.
Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in
amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947–6954.
Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997–1003.
Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and
Models. Wiley.com.
Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction
between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin
Trans. 2 (4), 331–333.
da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of
carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43
(13), 3413–3418.
Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2),
217–227.
Eriksson, Lennart, Jaworska, Joanna, Worth, Andrew P., Cronin, Mark TD.,
McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and
uncertainty assessment and for applicability evaluations of classification- and
regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.
Fini, Mojtaba Fallah, Riahi, Siavash, Bahramian, Alireza, 2012. Experimental and
QSPR studies on the effect of ionic surfactants on n-Decane–water interfacial
tension. J. Surfact. Deterg. 15 (4), 477–484.
Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and
correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234–240.
Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian
Incorporated.
Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled AM., 2006.
SVRC–QSPR model for predicting saturated vapor pressures of pure fluids. Fluid
Phase Equilibr. 246 (1), 39–51.
Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model.
20 (4), 269–276.
Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles
and congeners HIV-1 reverse transcriptase inhibitors based on linear and
nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158–2171.
Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength
selection in multivariate calibration. Anal. Chem. 67 (23), 4295–4301.
Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention
indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72
(1), 101–109.
Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).
Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the
reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609–13618.
Liang, Guijie, Xu, Jie, Liu, Li, 2013. QSPR analysis for melting point of fatty acids
using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase
Equilibr. 353, 15–21.
Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors
and variable selection approaches using partial least squares in quantitative
structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225–233.
Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and
Processing (access online via Elsevier).
Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Benigni, Romualdo,
Cronin, Mark TD., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current
status of methods for defining the applicability domain of (quantitative)
structure–activity relationships. ATLA 33, 155–173.
OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-operation and Development, Paris.
Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various
organic solvents by genetic algorithm-multiple linear regression. Fullerenes
Nanotubes Carbon Nanostruct. 19 (7), 585–598.
Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008.
Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B: Chem. 132 (1), 13–19.
Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009.
Investigation of different linear and nonlinear chemometric methods for
modeling of retention index of essential oil components: concerns to support
vector machine. J. Hazard. Mater. 166 (2), 853–859.
Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz,
2008. A novel QSPR study of normalized migration time for drugs in capillary
electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027–4035.
Riahi, Siavash, et al., 2008. QSAR study of 2-(1-Propylpiperidin-4-yl)-1H-Benzimidazole-4-Carboxamide as PARP inhibitors for treatment of cancer. Chem.
Biol. Drug Design 72 (6), 575–584.
Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239–249.
Singh, Prachi, Niederer, John PM., Versteeg, Geert F., 2007. Structure and activity
relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control
1 (1), 5–10.
Singh, Prachi, Niederer, John PM., Versteeg, Geert F., 2009. Structure and activity
relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2),
135–144.
Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the
calculation of molecular descriptors version 2.1.
Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors.
John Wiley & Sons.
Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of
being earnest: validation is the absolute essential for successful application and
interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69–77.
XLSTAT 2013 software, XLSTAT-CCR module, Trial version.
