Professional Documents
Culture Documents
org/JCTC Article
Cite This: J. Chem. Theory Comput. 2020, 16, 2635−2646 Read Online
1. INTRODUCTION refs 21, 22). While COSMO-based models are useful tools for
The calculation of thermodynamic properties of multicompo- predicting properties of mixtures, their correct implementation
nent mixtures is of great importance for the chemical industry. can be tricky and time-consuming. The purpose of this work is
The reason for this is that experimental measurements of therefore to provide a reference implementation of three
mixtures are very time-consuming and may involve a high risk COSMO-SAC models, which are the original COSMO-SAC
when measuring under extreme conditions or considering toxic model by Lin and Sandler10 and its modification by Hsieh et
fluids. In the past, many successful models have been developed. al.23,24 These models will be discussed in more detail in section
The accuracy of results of predictive models such as the group 3.1 (the original model COSMO-SAC-200210), section 3.2
contribution method (GCM) is based on numerous adjustable (COSMO-SAC-201023), and section 3.3 (COSMO-SAC-
parameters which have to be fitted to experimental data. If no dsp24). The COSMO-SAC model can in principle be applied
adjusted parameters for specific group interactions exist, the to all types of liquid mixtures. Fingerhut et al. examined
model cannot be used. More predictive alternatives to GCM thoroughly its performance for over 10 000 binary mixtures
such as UNIFAC1−6 are based on quantum mechanical based on 2295 compounds, including water.25 The method can
conductor-like screening model (COSMO) calculations, also be used for polymers26 and, when combined with the Pitzer-
originally proposed by Klamt et al. (conductor-like screening Debye−Hückel model, for electrolyte27,28 and ionic liquids.29,30
model for real solvents, COSMO-RS).7−9 Based on the The article is organized thematically by addressing the following
COSMO-RS model, Lin and Sandler 10 developed the aspects of the COSMO-SAC model:
COSMO segment activity coefficient model (COSMO-SAC). Part I: Preprocessing
In these models, the interactions of molecules in a mixture are
not modeled as pairwise molecular group interactions but rather 1. Generate sigma profile(s) from the results of the
as pairwise interactions of charged surface segments of the COSMO quantum mechanical calculation
molecule that can be obtained from quantum mechanical
calculations when the molecule is placed in a perfect conductor.
COSMO-based models for liquid mixtures typically depend on Received: October 10, 2019
temperature and composition only, i.e., the pressure-depend- Published: February 14, 2020
ence is usually neglected. However, note that pressure-
dependence can be introduced into the model by either
combining COSMO-based models with equations of state
(see, e.g., refs 11−20) or by modifying the model itself (see, e.g.,
2. Split the profile into hydrogen bonding parts if The pairwise distance between each pair of nuclei m and n, in
desired Å, is calculated from
3. Calculate dispersive contributions
Part II: Use dmn = (xm − xn)2 + (ym − yn )2 + (zm − zn)2 (1)
1. Calculate activity coefficients and the excess Gibbs
energy Whether atoms m and n are covalently bonded is determined
2. Calculate phase equilibria by combining COSMO- by comparing their distance and the sums of the covalent radii of
SAC with the ideal gas law the atoms of the pair. If the distance between atoms is less than
the sum of the covalent radii times 1.15, the atoms are assumed
to be bonded together. Covalent radii were obtained from ref 32,
and these values are also used in the OpenBabel (v2.3.1)
cheminformatics library. The covalent radius for carbon was
taken to be 0.76 Å (equal to that of the sp3 hybridization).
2.1.1. Hydrogen Bonding. COSMO-SAC-2010 and
COSMO-SAC-dsp take different types of hydrogen bonding
into account. The molecular surface is separated into segments
that are non-hydrogen-bonding (NHB) and segments that form
hydrogen bonds. The hydrogen-bonding segments are further
divided into hydrogen bonds of a hydroxyl group (OH) and
other hydrogen bonds (OT). Other hydrogen bonds (OT)
consider surface segments of the atoms nitrogen (N), oxygen
(O), and fluorine (F) as well as hydrogen (H) atoms bonded to
Figure 1. Three-dimensional view of ibuprofen and its segment charge N or F. Therefore, information about bonding needs to be
densities (not averaged charge densities) with the visualization tool obtained from the COSMO file. The source code for
developed in this work. An interactive HTML file is available at https://
github.com/usnistgov/COSMOSAC.
determining nhb, OT, and OH segments is organized as follows:
Once it has been determined which atoms are bonded to each
other, the hydrogen bonding class of each atom is determined. If
the atom is not in the set of (O, H, N, F), the atom is not
considered to be a candidate for hydrogen bonding in the
2. PART I: COSMO FILE PROCESSING
COSMO-SAC framework and is given the hydrogen bonding
The COSMO file obtained as the output of a quantum flag of “NHB” (non-hydrogen-bonding). If the atom is an N or
mechanical density-functional-theory calculation is a text file of an F, the atom is considered to hydrogen bond, but is not in the
nonstandardized format containing the results of the calculation OH family, and therefore, it is given the “OT” designation
and information about the molecule. The information needed (hydrogen bonding, but not OH). If the atom is O or H, the
for creating the sigma profile (see section 2.2) in order to hydrogen bonding class of the atom is
conduct a COSMO-SAC calculation is essentially:
1. OH: if the atom is O and is bonded to an H, or vice versa
• Volume and surface area of the molecule
• Positions of all nuclei 2. OT: if the atom is O and is bonded to an atom other than
• Location of each segment patch of the molecule, together H, or if the atom is H and is bonded to N or F
with its area and charge 3. NHB: otherwise
The units of the parameters in the COSMO files are 2.1.2. Dispersion. The models COSMO-SAC-dsp24 and
frequently not specified. This unfortunate historical design COSMO-SAC 201333 take the dispersion contribution to the
decision has led to many mistakes in publications and activity coefficient into account. The dispersive interactions
implementations (see for instance section 2.2). Therefore, have been considered by Hsieh et al.24 by assuming equally sized
users must be exceptionally careful to ensure that a consistent set atoms with a size parameter σ = 3 Å of the Lennard-Jones
of units is used. The most frequent source of confusion is the potential and assigned a dispersion parameter ϵAtom to each atom
length unit, which is sometimes given in Bohr radius (atomic forming a molecule. They proposed to compute the dispersion
units) and sometimes in Ångstrom (Å). The conversion factor parameter of the molecule ϵMolecule from the relation
from Bohr radius to Å is not large in magnitude (1 Bohr radius a0 n
≈ 0.52918 Å31), further muddying the waters. ϵMolecule 1 ϵAtom, i
2.1. Atoms, Bonds, and Dispersion. The structural
= ∑
kB NAtom i=1
kB (2)
information on the molecules is not required for the original
model COSMO-SAC-2002 (see section 3.1). However, the where ϵAtom,i/kB is the dispersion parameter of atom i, n is the
structure of the molecule is important for the more advanced number of atoms in molecule i, and NAtom is the total number of
models COSMO-SAC-2010 (see section 3.2) and COSMO- atoms for which |ϵAtom,i/kB| > 0. The molecular dispersion
SAC-dsp (see section 3.3). parameter for the molecule ϵMolecule/kB depends on the atomic
Position data of all nuclei are read from the COSMO file and structure of the molecule and on the dispersion parameters of
converted to the Ångstrom scale by multiplying with the each atom ϵAtom,i/kB. A dispersion parameter is attached to each
conversion factor 0.52917721067 Å/(bohr radius) as given by atom, depending on its orbital hybridization. The orbital
CODATA.31 These nuclei position data are used to determine hybridization of an atom in a molecule is determined by the
(a) which atoms are bonded to each other and (b) what type of number of atoms that are bonded to it. In the case of carbon, sp3
hybridization of the electron orbitals is present in the atom, used hybridization corresponds to four, sp2 to three, and sp to two
in the analysis of dispersion. bonded neighbor atoms. In the case of nitrogen, sp3 and sp2
2636 https://dx.doi.org/10.1021/acs.jctc.9b01016
J. Chem. Theory Comput. 2020, 16, 2635−2646
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article
Figure 2. Atom locations (in Å) from the .cosmo file for propane (C3H8) from DMol3.
hybridization corresponds to three and two bonded neighbor file instead of recalculating the charge density by dividing the
atoms, respectively, and sp to one bonded neighbor atom. charge by the surface area of the segment, the differences can be
In addition, the w parameter for the COSMO-SAC-dsp model on the order of a few percent. Therefore, for full replicability of
contains additional molecule specific information (see section numerical values with the values in this work, the charge density
3.3). To calculate w, the dispersive nature of the molecules are of each segment must be calculated from the charge given in the
classified into categories: COSMO file divided by the area given in the COSMO file.
• DSP_WATER indicates water The following averaging equation was originally used by
Klamt et al.8 for COSMO-RS
• DSP_COOH indicates a molecule with a carboxyl group
• DSP_HB_ONLY_ACCEPTOR indicates that the mol- rn2rav2 2
is aborted. rn2reff
2 2
exp(−f )
dmn
2.1.3. Example. The block of the COSMO file with atom ∑n 2 2 decay rn2 + reff
2
rn + reff (4)
locations looks something like the example in Figure 2 taken
from the database of Mullins et al.34 In case of a .cosmo file from where dmn is the distance (in Å) between the centers of the
DMol3, the location of the atoms (x, y, z) are given in Å. For surface segments n and m, and reff = (aeff/π)0.5 in Å. Nonetheless,
instance, the x, y, z position of the first hydrogen nucleus (H1) is this equation remains dimensionally inconsistent (the argument
(0.888162953 Å, −1.326789759 Å, −0.880602803 Å). of the exponential function has units of bohr2/Å2); the
2.2. Sigma Profile Construction. When doing a quantum parameter f decay should therefore be thought of as a
mechanical COSMO calculation, the output file supplies a dimensionless scaling quantity (only). Hence, the practical
charge density σ*m on each surface element with area an as well as interpretation of fdecay is that, first, the coordinates of the
its charge. These numerical values for the surface charge density segments given in bohr are converted to Å and, second, fdecay is
are truncated to a few significant digits in the COSMO file. used to scale the values given in Å back to the numerical values of
When using the charge density values as given in the COSMO bohr (keeping Å as units).
2637 https://dx.doi.org/10.1021/acs.jctc.9b01016
J. Chem. Theory Comput. 2020, 16, 2635−2646
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article
Figure 3. Sigma values for ethanol (single profile, with averaging scheme of Mullins). Top: Values of averaged charge densities on segments (randomly
jittered in the vertical direction). Vertical lines correspond to the nodes at which the sigma profile will be generated. Bottom: Sigma profile generated
from these σ values.
Figure 4. First five segments of a COSMO file from DMol3. The charge is in units of e, the area in units of Å2, and segment positions are in Bohr radius
(from the atomic units (a.u.) system of measurement).
The model of Lin and Sandler10,35 was made available as a It is furthermore important to note that in the Fortran code
Fortran source code together with a comprehensive sigma- for the computation of the sigma profiles, Mullins et al.34 used
profile database from the very useful work of Mullins et al.34 the same averaging equation as Klamt (eq 3), but with rav =
They computed the sigma profiles with density-functional- 0.81764 Å. The use of an averaging radius of rav = 0.81764 Å and
theory calculations using the software DMol3. Note that the fdecay = 1 is equivalent (when assuming r2n ≪ r2av) to the
parametrization and the results of COSMO models in general assumption of dividing numerator and denominator by fdecay,
depend on the underlying method and software with which the such that the fdecay correction is applied to the averaging radius.
sigma profiles are calculated.36 Hence, it is very important for the Once the averaged value of σm has been obtained for each
comparability and evaluation of these models to use exactly the segment m, the p(σ)Ai values (probability p(σ) of finding a given
same set of sigma profiles. segment with specified value of σ multiplied with the entire
2638 https://dx.doi.org/10.1021/acs.jctc.9b01016
J. Chem. Theory Comput. 2020, 16, 2635−2646
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article
surface area Ai of molecule i, which gives the surface areas Ai(σm) NHB, OH, or OT sigma profiles depending on its averaged
of molecule i with charge densities σm) need to be obtained on charge density in the following manner:
gridded values. The values of σ for which the p(σ)A values are to • If the segment belongs to an O atom, and the hydrogen-
be obtained is generally −0.025 to 0.025 e/Å2 in increments of bonding class of the atom is OH, and the averaged charge
0.001 e/Å2, forming a set of 51 points. density value of the segment is greater than zero, the
Subsequently for each value of σ (in e/Å2), the (0-based) segment goes into the OH profile.
ÅÅ Ñ
Å σ − ( −0.025 e/Å2) ÑÑÑ
index corresponding to the left of the value is obtained from • If the segment belongs to an H atom, and the hydrogen-
ileft = ÅÅÅÅ ÑÑ
ÑÑ
ÅÅÅ
bonding class of the atom is OH, and the averaged charge
Ç ÑÑÖ
density value of the segment is less than zero, the segment
0.001 e/Å2 (5) goes into the OH profile.
where ⌊...⌋ indicates the mathematical floor function and the • If the segment belongs to an O, N, or F atom, and the
fractional distance of the value between the left and right edges hydrogen-bonding class of the atom is OT, and the
of the cell is given by averaged charge density value of the segment is greater
than zero, the segment goes into the OT profile.
w=
σ[ileft + 1] − σ • If the segment belongs to an H atom, and the hydrogen-
0.001 e/Å2 (6) bonding class of the atom is OT, and the averaged charge
density value of the segment is less than zero, the segment
which is by definition between 0 and 1. The area of the segment goes into the OT profile.
is then distributed between the gridded sigma values above and • Otherwise, the segment goes into the NHB profile.
below the value, according to the weighting parameter w
By definition, the superpositioned profiles (NHB + OH +
p(σ )Ai [ileft ] += wA n (7) OT) must equal the original profile
p(σ )Ai [ileft + 1] += (1 − w)A n (8) p(σ ) = p NHB (σ ) + pOH (σ ) + pOT (σ ) (9)
38
Figure 3 illustrates the construction of the sigma profile for Wang et al. proposed the use of a Gaussian-type function for
ethanol. For each patch, the value of sigma is obtained, and from the probability p of a hydrogen-bonding segment to indeed form
ji σ zy
that, the value of sigma is then distributed among the gridded a hydrogen bond
p hb (σ ) = 1 − expjjj− 2 zzz
values. As can be seen in this plot, the histogram is constructed
j 2σ z
2
k 0 {
from a relatively small number of patches.
2.2.1. Example. Figure 4 gives an example of a COSMO file (10)
from DMol3. It is important to note that the units are not
specified for the areas or the charge. It is especially problematic with σ0 = 0.007 e/Å2. This probability function considers that
that the length units differ within the same line of the file (the not all surfaces belonging to potentially hydrogen-bond forming
area in units of Å2, and segment positions are in Bohr radius atoms in fact do form hydrogen bonds. With the probability
(from the atomic units (a.u.) system of measurement)), a source function for hydrogen bonding according to eq 10, it follows
of confusion for many authors, ourselves included. Nonetheless, Ainhb(σ ) A hb(σ )
this is a standard file format. p nhb (σ ) = + i [1 − p hb (σ )]
2.3. Splitting of Profiles. Lin et al.37 proposed to split the Ai Ai (11)
sigma profile into hydrogen-bonding (hb) and non-hydrogen-
bonding (nhb) segments with pi(σ) = pnhb hb AiOH(σ ) hb
i (σ) + pi (σ). pOH (σ ) = p (σ )
Hydrogen-bonding atoms were defined to be oxygen, nitrogen, Ai (12)
and fluorine atoms as well as hydrogen atoms bound to one of
oxygen, nitrogen, or fluorine. Hence, all surfaces belonging to AiOT(σ ) hb
the aforementioned atoms contribute to the sigma profile phb pOT (σ ) = p (σ )
i (σ) Ai (13)
and the other atoms forming molecule i contribute to the sigma
profile pnhbi (σ). Hsieh et al.
23
suggested to further split the where Anhb
i (σ) is the surface area of non-hydrogen-bonding
hydrogen-bonding sigma profile into interactions of surfaces atoms of molecule i, Ahb OH OT
i (σ) = Ai (σ) + Ai (σ) is the surface
belonging to groups of oxygen and hydrogen (OH) and surfaces area of hydrogen-bonding segments of molecule i, AOH i (σ) is the
belonging to other groups (OT). surface area of hydrogen-bonding segments of OH-groups of
Each atom of the molecule is assigned to be in one of the molecule i, and AOT i (σ) is the surface area of hydrogen-bonding
hydrogen-bonding classes: segments other than OH.
• NHB: the atom is not a candidate to hydrogen bond 2.4. Implementation, Validation, and Verification. A
Python script profiles/to_sigma.py was written that
• OH: the atom is either the oxygen or the hydrogen in a
fully automates the process of reading in a COSMO file from a
OH hydrogen-bonding pair
DMol3 calculation and generates the split profiles as well as
• OT: the atom is N, F, or an O that is not part of an OH calculates the dispersion parameters. The output file with sigma
bonding group profiles is a space-delimited text file with additional metadata
Though these classes are consistent with the work of Hsieh et stored in JSON (JavaScript Object Notation) format in the
al.,23 they do not consider the fact that the H of a COOH group header of the sigma profile. In case of discrepancies between the
(likewise for other similar groups) is delocalized between the description above and the Python code, the latter should be used
two oxygens of the group. as the reference. This Python script makes heavy use of
A currently undocumented feature of the profile splitting is vectorized matrix operations of the numpy matrix library,
that the contribution for a given segment is deposited into the especially in the case of the sigma averaging, the computationally
2639 https://dx.doi.org/10.1021/acs.jctc.9b01016
J. Chem. Theory Comput. 2020, 16, 2635−2646
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article
most expensive part of sigma profile generation. Furthermore, ln(γi ,S) = ln(γic,S) + ln(γi r,S) (14)
the sigma profile generation from COSMO files is automated
with the Python script profiles/generate_all_pro- where the combinatorial part ln(γci,S) accounts for the size and
files.py, which generates sigma profiles in parallel. All of shape differences of the molecules. This quantity is usually
ij θ yz
described by the Staverman−Guggenheim combinatorial term
ij ϕ yz
these scripts are available in the provided code.
σm is the screening charge density of segment m, which is the where the superscripts t and s denote different types of sigma
average screening charge of the surface segment divided by aeff, profiles. The first modification concerns the electrostatic
pi(σm) denotes the probability of finding a segment with interaction parameter cES, which was made temperature
screening charge density σm on the surface of component i, dependent
ΓS(σm) is the activity coefficient of segment m in the mixture, BES
and Γi(σm) is the activity coefficient of segment m in the mixture c ES = AES +
T2 (32)
of segments of only pure component i. The quantity pi(σm) is
called the sigma profile of pure component i and is defined as with AES = 6525.69 kcal Å4 mol−1 e−2 and BES = 1.4859 × 108 kcal
Å4 K2 mol−1 e−2. The second modification concerns the
Ai (σm) hydrogen-bonding term given in eq 31. By considering different
pi (σm) =
Ai (26) types of hydrogen-bonding, the parameter chb is defined as
follows
where Ai(σm) is the surface area with screening charge density σm
l
chb(σmt , σns) =
o cOH−OH if s = t = OH and σmt·σns < 0
o
of a molecule of species i and Ai is again the entire surface area of
o
o
o
o
the molecule of species i. The sigma profile of the mixture S can
o
o
o cOT−OT if s = t = OT and σm·σn < 0
then be obtained by
m
o
o
t s
o
o
o
N
o
∑i = 1 xiAi pi (σm)
o
o
o0
pS (σm) = cOH−OT if s = OH, t = OT, and σmt·σns < 0
n
N
∑i = 1 xiAi (27)
otherwise (33)
where xi is the mole fraction of component i. The segment 4 −1 −2
where cOH−OH = 4013.78 kcal Å mol e , cOT−OT = 932.31 kcal
ÄÅ Éo
l ÅÅ ΔW (σm , σn) ÑÑÑ|
activity coefficient in the mixture can be calculated from
o
Å4 mol−1 e−2, and cOH−OT = 3016.43 kcal Å4 mol−1 e−2. Due to
o
o Å ÑÑo o
o∑ pS (σn)ΓS(σn)expÅÅÅÅ−
ln(ΓS(σm)) = − lnm ÑÑ}
ÑÑÖo
the separation of the sigma profiles into nhb, OT, and OH
o
o σn Å
Ç o
o
n ~
contributions, an additional sum over the different sigma-
RT profiles needs to be introduced in eqs 24, 28, and 30. Equation
(28) 24 becomes
nhb,OH,OT
where the sum on the right-hand side goes over all charge
densities σn in the mixture. Note that eq 28 needs to be solved ln(γi r,S) = ni ∑ ∑ pit (σmt)[ln(ΓSt (σmt)) − ln(Γit(σmt))]
numerically for the segment activity coefficients ΓS(σm). The t σm
quantity ΔW(σm, σn) is called the exchange energy and can be (34)
l
calculated from and eq 29 becomes
i α′ y o
ΔW (σm , σn) = jjj zzz(σm + σn)2 o
ln(ΓSt (σmt)) = − lno
m
k2{ o
o
nhb,OH,OT
o
n
∑ ∑ pSs (σns)
ÄÅ Éo
ÅÅ ΔW (σ t , σ s) ÑÑÑ|
s σn
Å ÑÑo o
+ chb max[0, σacc − σhb]min[0, σdon + σhb]
Γ S(σn )expÅÅÅ− ÑÑ}
ÅÅÇ ÑÑÖo
o
o
(29)
~
s s m n
where the first term on the right-hand side is the misfit energy, RT
(35)
accounting for the electrostatic interactions, and the second
term on the right-hand side accounts for hydrogen-bonding Equation 30 can again be obtained by changing the index S to i
interactions. The values of the generalized parameters are α′ = in eq 35. For COSMO-SAC 2010, Hsieh et al.23 used the sigma
16466.72 kcal Å4 mol−1 e−2, chb = 85580 kcal Å4 mol−1 e−2, and profile database of Mullins et al.34
σhb = 0.0084 e Å−2 (with 1 kcal = 4184 J). Note thatin Furthermore, the value for aeff was changed to 7.25 Å2 and the
accordance with the Fortran code supplied by Mullins et al.34 sigma profile averaging equation according to eq 4 was used. A
the value for α′ given in the article of Lin and Sandler10 was used summary of the model parameters is given in Table 2.
rather than the different value provided in the article by Mullins 3.3. Model of Hsieh at el.COSMO-SAC-dsp. On the
et al.34 Furthermore, σacc = max(σm, σn) and σdon = min(σm, σn) basis of their model modification summarized in section 3.2,
holds. Γi(σm) and ΓS(σm) have a similar form Hsieh et al.24 proposed to also take dispersive interactions
ÄÅ Éo
l
o
o ÅÅ ΔW (σm , σn) ÑÑÑ| o
o Å ÑÑÑo
between the molecules into account, which were entirely
m∑ pi (σn)Γi(σn)expÅÅ−
ln(Γi(σm)) = − lno Å }
o
o Ñ
Å ÑÖo
Ñ
neglected before. The activity coefficient of component i in the
o σn ÅÇ o
n ~
mixture S then becomes
RT
(30) ln(γi ,S) = ln(γic,S) + ln(γi r,S) + ln(γidsp
,S
) (36)
3.2. Model of Hsieh et al.COSMO-SAC 2010. In 2010, with γdsp
i,S being the contribution to the activity coefficient due to
Hsieh et al.23 suggested an improvement of COSMO-SAC for dispersion. The combinatorial part ln(γci,S) and the residual part
phase equilibrium calculations based on the modifications ln(γri,S) are calculated in the same way and with the same
published by Lin et al.37 and Wang et al.38 Hsieh et al. proposed parameters as given in sections 3.1 and 3.2, respectively. Hsieh et
two modifications for eq 29, which in their model reads al.24 suggest the use of the one-constant Margules equation for
2641 https://dx.doi.org/10.1021/acs.jctc.9b01016
J. Chem. Theory Comput. 2020, 16, 2635−2646
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article
l
o
o
o
o
o
o
Table 2. Parameters of COSMO-SAC 2010 and Their SI −0.27027 K−1
o
if water + hb‐only‐acceptor
o
o
Equivalentsa
o
o
w=m
−0.27027 K−1
o
if COOH
o
o
parameter value value (SI)
o
o
+ (nhb or hb‐donor‐acceptor)
o −0.27027 K−1
o
79.53 × 10−20 m2
o
q0 79.53 Å2
o
o
o
66.69 × 10−30 m3
o
r0 66.69 Å3 if water + COOH
n 0.27027 K
z 10 10 −1
aeff 7.25 Å2 7.25 × 10−20 m2 otherwise
reff (aeff/π)0.5 (41)
fdecay 3.57 3.57 The dispersion parameters of the atoms have been fitted to
cOH−OH 4013.78 kcal Å4 mol−1 e−2 6.542209585204081 × 104 experimental data by Hsieh et al.24 and are listed in Table 3
J m4 mol−1 C−2
cOT−OT 932.31 kcal Å4 mol−1 e−2 1.5196068091379239 × 104
(already considering the corrigendum40). The other model
J m4 mol−1 C−2 parameters stayed the same as already given in Table 2.
cOH−OT 3016.43 kcal Å4 mol−1 e−2 4.916591656517583 × 104
J m4 mol−1 C−2 Table 3. Dispersion Parameters for Atom Types in COSMO-
σ0 0.007 e Å−2 0.11215236437999998 C m−2 SAC-dsp as Implemented by Hsieh et al.24
AES 6525.69 kcal Å4 mol−1 e−2 1.06364652940795 × 105
J m4 mol−1 C−2 atom type i (εAtom,i/kB) / K note
BES 1.4859 × 108 2.421923778247623 × 109 C (sp3) 115.7023 bonded to four others
kcal Å4 K2 mol−1 e−2 J m4 K2 mol−1 C−2
C (sp2) 117.4650 bonded to three others
NA 6.022140758 × 1023 mol−1 6.022140758 × 1023 mol−1
C (sp) 66.0691 bonded to two others
kB 1.38064903 × 10−23 J K−1 1.38064903 × 10−23 J K−1
N (sp3) 15.4901 bonded to three others
R kBNA/4184 kcal mol−1 K−1 kBNA J mol−1 K−1
N (sp2) 84.6268 bonded to two others
a
The value for the charge of the electron is e = 1.602 176 634 × 10−19 N (sp) 109.6621 bonded to one other
C. −O− 95.6184 bonded to two others
=O −11.0549 double-bonded O
F 52.9318 bonded to one other
the calculation of the dispersive interaction and give the Cl 104.2534 bonded to one other
following equations for a binary mixture of components 1 and 2 H (water) 58.3301 H in water
dsp
ln(γ1,S ) = Ax 22 and dsp
ln(γ2,S ) = Ax12 (37)
H (OH) 19.3477 H−O bond (not water)
H (NH) 141.1709 H bonded to N
24
As given in the article by Hsieh et al., the parameter A can be H (other) 0
calculated according to other invalid
l
with the definition of w given in a corrigendum by Hsieh et al.40 Delaware database33 were used for the development of the
o
o
o
o
o
COSMO-SAC-dsp model. This database can be considered as a
o
o −0.27027
−0.27027 if water + hb‐only‐acceptor
w=o
revised and extended version of the VT-database34,39 and was
m
o
o
o
if COOH + (nhb or hb‐only‐acceptor) developed with the cooperation of Stanley Sandler’s research
o
o
o
o
group at the University of Delaware and Dr. S. Lustig, formerly
o 0.27027
n
−0.27027 if water + COOH at DuPont. The basis set used in Dmol3 was the GGA/VWN-
otherwise BP/DNP functional with double numerical basis with polar-
(39) ization functions (DNP). The detailed procedure for obtaining
24
Hsieh et al. define substances as hb-only acceptor, if they are the equilibrium geometry and the screening charges can be
able to form a hydrogen bond by accepting a proton from its found in ref 34. The effects from using different combinations of
neighbor and hb-donor-acceptors as substances that are able to DFT methods and basis set were studied by Chen et al.42 More
form hydrogen bonds by either providing or accepting a proton details are available in the work of Xiong et al.33 A fully open-
from its neighbors. All substances containing a carboxyl group source set of sigma profiles generated from an open-source
are denoted with COOH. Note that there is a typographical quantum chemical tool is available43 at https://github.com/
error in the corrigendum, where the value w = −0.27027 is lvpp/sigma and https://doi.org/10.5281/zenodo.3613785, and
proposed for the combination of “COOH + nhb or hb-only- their use in a COSMO approach has been investigated
acceptor”, whereas in the article, this value for w is given for the previously.44,45
combination of “COOH + nhb or hb-donor-acceptor”.
4. IMPLEMENTATION DETAILS FOR THE COSMO-SAC
Furthermore, note that there also is a confusion of units in the
original article, as the constant A should be dimensionless and MODELS
the dispersion parameters are usually divided by Boltzmann’s The numerical method used to solve the nonlinear system of eqs
constant kB = 1.380649 × 10−23 J K−1 (value taken from ref 41). 28 and 35 is the successive substitution method. This method is
ÄÅ ÉÑ
the solver used in the code of Mullins et al.34 forming the basis of
ÅÅ i ϵ ϵ1 ϵ2 ÑÑÑÑ
Therefore, we implemented eqs 38 and 39 as follows
Å
Å jj 1 ϵ2 yzz
A = wÅÅ0.5jj + zz − Ñ
this implementation. Successive substitution is characterized as
ÅÅ j kB
ÅÇ k kB z{ kB kB ÑÑÑÑÖ
being both reliable and slowly convergent. Analysis of the
successive substitution method and a comparison with the
(40)
Newton−Raphson method is provided in the work of Possani
and and de P. Soares.46
2642 https://dx.doi.org/10.1021/acs.jctc.9b01016
J. Chem. Theory Comput. 2020, 16, 2635−2646
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article
To solve the segment activity coefficient of eqs 28 and 35, φi vapxivapp = γi liqxiliqφi liq
,0
p (45)
initial values have to be specified for ΓS(σn) and ΓsS(σsn),
respectively. Therefore, all values of Γ are set to unity before with φliq
i,0 being the fugacity coefficient of the liquid phase of pure
initiating the calculation for all intervals of the considered pure component i, φvap i the fugacity coefficient of the vapor phase of
molecule or mixture. After the first iteration, the newly component i in the mixture, and γliq
i the activity coefficient of the
calculated Γ will be averaged with the previous values and the liquid phase of component i. The first assumption is φvap i = 1 for
differences between the averaged and former values ΔΓ will the vapor phase as it is assumed to be an ideal gas. The second
serve as convergence criteria. Only when ΔΓ of every interval assumption is φliq i,0(T) being independent of pressure p. Equating
reaches the convergence criterion (here, that the maximum the fugacity of pure component i in the liquid phase f i,0(T, p) =
absolute difference between values of Γ is less than 10−8) will the φliq
i,0 p to that of the pure fluid at saturation, and then also to that
successive substitution be terminated; otherwise, the iteration of the vapor phase
starts again with the averaged Γ substituting the previous initial
values.
Furthermore, the equations can be rewritten to remove the
( )
fi ,0 T , p = pφi liq
,0
≈ psat, i φi liq
,0,sat
≈ psat, i (46)
evaluation of the logarithm, as the evaluation of log(exp(x)) is pφliq
i,0 becomes the vapor pressure psat,i of the pure component i.
computationally much more expensive than division. Therefore, Equation 45 is now given by
a slightly more efficient implementation (here, demonstrated for
the case of the mixture segment activity coefficients with one xivapp = γi liqxiliqpsat, i (47)
sigma profile) is
l
o |
o
o
o o
o
For a binary mixture this leads to
=m ∑ A ΓS,old(σn)}oo
−1
o
o
o σn o
n ~
ΓS,new (+)
x1vapp = γ1liqx1liqpsat,1 (48)
(42)
In the present C++ code, the sum on the right-hand side is x 2vapp = γ2liqx 2liqpsat,2 (49)
carried out in a vectorized form with matrix-vector operations
ÄÅ É
Their sum is equal to
f ivap = filiq (44) Figure 5. p−x diagram for the mixture ethanol + water. The
experimental data are from Barr-David and Dodge,47 and the vapor
The fugacity is defined as f i = φixip and the activity coefficient for pressure of the pure components are obtained from the ancillary
the liquid phase as γliq
i = φi /φi,0 , so that eq 44 becomes
liq liq
equations of the respective fluid.48
2643 https://dx.doi.org/10.1021/acs.jctc.9b01016
J. Chem. Theory Comput. 2020, 16, 2635−2646
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article
6. CODE AND VALIDATION code to calculate the sigma-profiles from the COSMO output
files is also provided. Thus, this work intends to provide an open-
The development code including the sigma profiles and
source reference implementation of state-of-the-art COSMO-
COSMO-SAC postprocessing is contained in a git repository
SAC models.
■
at https://github.com/usnistgov/COSMOSAC. The archival
version of the code used in this paper is furthermore stored at
https://doi.org/10.5281/zenodo.3669311. Permission from AUTHOR INFORMATION
BioVia was obtained to make the .cosmo files available for Corresponding Author
academic and noncommercial use. Additional information about Ian H. Bell − Applied Chemicals and Materials Division, National
the .cosmo file database is given in a README file in the file Institute of Standards and Technology, Boulder, Colorado 80305,
profiles/UD/README.txt relative to the root of the United States; orcid.org/0000-0003-1091-9080;
code. Email: ian.bell@nist.gov
The code workflow mirrors the analysis described in this
paper. As a preprocessing step, the sigma profiles are generated Authors
from each of the .cosmo files according to the Python script in Erik Mickoleit − Institute of Power Engineering, Faculty of
profiles/to_sigma.py. This script has a command-line Mechanical Science and Engineering, Technische Universität
interface that allows for selection of the charge averaging scheme Dresden, 01069 Dresden, Germany; orcid.org/0000-0001-
and how many contributions the sigma profiles should be 7223-6205
subdivided into (1 or 3), and from this script, a single .sigma Chieh-Ming Hsieh − Department of Chemical & Materials
profile is generated. Engineering, National Central University, Taoyuan 32001,
Once the sigma profile has been generated, the COSMO-SAC Taiwan; orcid.org/0000-0002-3063-2135
analysis is applied. This code allows for the calculation of the Shiang-Tai Lin − Department of Chemical Engineering, National
activity coefficients, among other outputs. Either the C++ or Taiwan University, 10617 Taipei City, Taiwan; orcid.org/
Python interfaces may be used, according to the user’s 0000-0001-8513-8196
preference. A wide range of other numerical analysis Jadran Vrabec − Thermodynamics and Process Engineering,
programming environments now support calling Python in a Technische Universität Berlin, 10587 Berlin, Germany;
nearly native fashion, so between C++ and Python most users orcid.org/0000-0002-7947-4051
should be able to find a way to call the COSMO code. Cornelia Breitkopf − Institute of Power Engineering, Faculty of
Examples of the use of the COSMO-SAC implementation are Mechanical Science and Engineering, Technische Universität
provided in jupyter notebooks provided with the code, along Dresden, 01069 Dresden, Germany
with the calculation of phase equilibrium calculations, for which Andreas Jäger − Institute of Power Engineering, Faculty of
the saturation pressure curves for the pure fluids provided by Mechanical Science and Engineering, Technische Universität
CoolProp48 are used. A limitation of this method is that vapor Dresden, 01069 Dresden, Germany; orcid.org/0000-0001-
pressure curves must be available for the given fluid. 7908-4160
Alternatively, vapor pressure curves could be obtained with Complete contact information is available at:
the consistent alpha function parameters for the Peng− https://pubs.acs.org/10.1021/acs.jctc.9b01016
Robinson equation of state of Bell et al.49
Further verification is provided by a large set of calculated Notes
values from our model. Users should first ensure that they can The authors declare no competing financial interest.
precisely regenerate these values prior to making use of the The development code including the sigma profiles and
library. The script that generates the verification data is in the file COSMO-SAC postprocessing is contained in a git repository
profiles/generate_validation_data.py relative at https://github.com/usnistgov/COSMOSAC. The archival
to the root of the code. version of the code used in this paper is furthermore stored at
https://doi.org/10.5281/zenodo.3669311.
7. CONCLUSIONS
Since Lin and Sandler10 proposed the original COSMO-SAC
model, many modifications and improvements for this model
■ ACKNOWLEDGMENTS
I.H.B. would like to thank the German Excellence Initiative for
have been proposed. The reproduction of COSMO-SAC funding the research stay at TU Dresden.
models from the literature is often challenging, because on the
one hand the model results strongly depend on the sigma
profiles, which themselves depend on the program used to
■ REFERENCES
(1) Fredenslund, A.; Jones, R. L.; Prausnitz, J. M. Group-Contribution
calculate them. Therefore, it is crucial to use the same sigma Estimation of Activity Coefficients in Nonideal Liquid Mixtures. AIChE
profiles as the authors of the COSMO-SAC model in order to J. 1975, 21, 1086−1099.
reproduce their model. On the other hand, some misunder- (2) Fredenslund, A.; Gmehling, J.; Michelsen, M. L.; Rasmussen, P.;
standings regarding the description of COSMO-SAC models Prausnitz, J. M. Computerized Design of Multicomponent Distillation
exist in the literature, which further complicate the reimple- Columns Using the UNIFAC Group Contribution Method for
mentation of COSMO-SAC models. In this work, we provide an Calculation of Activity Coefficients. Ind. Eng. Chem. Process Des. Dev.
1977, 16, 450−462.
open source C++ and python implementation of three different
(3) Gmehling, J.; Li, J.; Schiller, M. A modified UNIFAC model. 2.
COSMO-SAC models,10,23,24 together with a detailed doc- Present parameter matrix and results for different thermodynamic
umentation of the implemented models. Furthermore, we properties. Ind. Eng. Chem. Res. 1993, 32, 178−193.
provide a consistent set of sigma profiles calculated with the (4) Gmehling, J.; Wittig, R.; Lohmann, J.; Joh, R. A Modified
software DMol3 based on the database provided by Mullins et UNIFAC (Dortmund) Model. 4. Revision and Extension. Ind. Eng.
al.34 The corresponding COSMO output files and computer Chem. Res. 2002, 41, 1678−1688.
2644 https://dx.doi.org/10.1021/acs.jctc.9b01016
J. Chem. Theory Comput. 2020, 16, 2635−2646
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article
(5) Hansen, H. K.; Rasmussen, P.; Fredenslund, A.; Schiller, M.; (26) Kuo, Y.-C.; Hsu, C.-C.; Lin, S.-T. Prediction of Phase Behaviors
Gmehling, J. Vapor-Liquid Equilibria by UNIFAC Group Contribu- of Polymer−Solvent Mixtures from the COSMO-SAC Activity
tion. 5. Revision and Extension. Ind. Eng. Chem. Res. 1991, 30, 2352− Coefficient Model. Ind. Eng. Chem. Res. 2013, 52, 13505−13515.
2355. (27) Hsieh, M.-T.; Lin, S.-T. A predictive model for the excess gibbs
(6) Weidlich, U.; Gmehling, J. A modified UNIFAC Model. 1. free energy of fully dissociated electrolyte solutions. AIChE J. 2011, 57,
Prediction of VLE, hE, and y∞. Ind. Eng. Chem. Res. 1987, 26, 1372− 1061−1074.
1381. (28) Wang, S.; Song, Y.; Chen, C.-C. Extension of COSMO-SAC
(7) Klamt, A. Conductor-like screening model for real solvents: a new Solvation Model for Electrolytes. Ind. Eng. Chem. Res. 2011, 50, 176−
approach to the quantitative calculation of solvation phenomena. J. 187.
Phys. Chem. 1995, 99, 2224−2235. (29) Lee, B.-S.; Lin, S.-T. A Priori Prediction of Dissociation
(8) Klamt, A.; Jonas, V.; Bürger, T.; Lohrenz, J. C. Refinement and Phenomena and Phase Behaviors of Ionic Liquids. Ind. Eng. Chem. Res.
parametrization of COSMO-RS. J. Phys. Chem. A 1998, 102, 5074− 2015, 54, 9005−9012.
5085. (30) Lee, B.-S.; Lin, S.-T. Prediction of phase behaviors of ionic liquids
(9) Klamt, A.; Eckert, F. COSMO-RS: a novel and efficient method for over a wide range of conditions. Fluid Phase Equilib. 2013, 356, 309−
the a priori prediction of thermophysical data of liquids. Fluid Phase 320.
Equilib. 2000, 172, 43−72. (31) Mohr, P. J.; Newell, D. B.; Taylor, B. N. CODATA
(10) Lin, S.-T.; Sandler, S. I. A Priori Phase Equilibrium Prediction Recommended Values of the Fundamental Physical Constants: 2014.
from a Segment Contribution Solvation Model. Ind. Eng. Chem. Res. J. Phys. Chem. Ref. Data 2016, 45, No. 043102.
2002, 41, 899−913. (32) Cordero, B.; Gómez, V.; Platero-Prats, A. E.; Revés, M.;
(11) Lee, M.-T.; Lin, S.-T. Prediction of mixture vapor−liquid Echeverría, J.; Cremades, E.; Barragán, F.; Alvarez, S. Covalent radii
equilibrium from the combined use of Peng−Robinson equation of revisited. Dalton Trans. 2008, 2832−2838.
state and COSMO-SAC activity coefficient model through the Wong− (33) Xiong, R.; Sandler, S. I.; Burnett, R. I. An Improvement to
Sandler mixing rule. Fluid Phase Equilib. 2007, 254, 28−34. COSMO-SAC for Predicting Thermodynamic Properties. Ind. Eng.
(12) Hsieh, C.-M.; Lin, S.-T. Determination of cubic equation of state Chem. Res. 2014, 53, 8265−8278.
parameters for pure fluids from first principle solvation calculations. (34) Mullins, E.; Oldland, R.; Liu, Y. A.; Wang, S.; Sandler, S. I.; Chen,
AIChE J. 2008, 54, 2174−2181. C.-C.; Zwolak, M.; Seavey, K. C. Sigma-Profile Database for Using
(13) Hsieh, C.-M.; Lin, S.-T. First-Principles Predictions of Vapor- COSMO-Based Thermodynamic Methods. Ind. Eng. Chem. Res. 2006,
Liquid Equilibria for Pure and Mixture Fluids from the Combined Use 45, 4389−4415.
of Cubic Equations of State and Solvation Calculations. Ind. Eng. Chem. (35) Lin, S.-T.; Sandler, S. I. A Priori Phase Equilibrium Prediction
Res. 2009, 48, 3197−3205. from a Segment Contribution Solvation Model. - Additions and
(14) Wang, L.-H.; Hsieh, C.-M.; Lin, S.-T. Improved Prediction of Corrections. Ind. Eng. Chem. Res. 2004, 43, 1322−1322.
Vapor Pressure for Pure Liquids and Solids from the PR+COSMOSAC (36) Mu, T.; Rarey, J.; Gmehling, J. Performance of COSMO-RS with
Equation of State. Ind. Eng. Chem. Res. 2015, 54, 10115−10125. Sigma Profiles from Different Model Chemistries. Ind. Eng. Chem. Res.
(15) Wang, L.-H.; Hsieh, C.-M.; Lin, S.-T. Prediction of Gas and 2007, 46, 6612−6629.
Liquid Solubility in Organic Polymers Based on the PR+COSMOSAC (37) Lin, S.-T.; Chang, J.; Wang, S.; Goddard, W. A.; Sandler, S. I.
Equation of State. Ind. Eng. Chem. Res. 2018, 57, 10628−10639. Prediction of Vapor Pressures and Enthalpies of Vaporization Using a
(16) Hsieh, C.-M.; Lin, S.-T. First-Principles Prediction of Vapor- COSMO Solvation Model. J. Phys. Chem. A 2004, 108, 7429−7439.
Liquid-Liquid Equilibrium from the PR+COSMOSAC Equation of (38) Wang, S.; Sandler, S. I.; Chen, C.-C. Refinement of COSMO−
State. Ind. Eng. Chem. Res. 2011, 50, 1496−1503. SAC and the Applications. Ind. Eng. Chem. Res. 2007, 46, 7275−7288.
(17) Silveira, C. L.; Sandler, S. I. Extending the range of COSMO-SAC (39) Mullins, E.; Liu, Y. A.; Ghaderi, A.; Fast, S. D. Sigma Profile
to high temperatures and high pressures. AIChE J. 2018, 64, 1806− Database for Predicting Solid Solubility in Pure and Mixed Solvent
1813. Mixtures for Organic Pharmacological Compounds with COSMO-
(18) Jäger, A.; Bell, I. H.; Breitkopf, C. A theoretically based departure Based Thermodynamic Methods. Ind. Eng. Chem. Res. 2008, 47, 1707−
function for multi-fluid mixture models. Fluid Phase Equilib. 2018, 469, 1725.
56−69. (40) Hsieh, C.-M.; Lin, S.-T.; Vrabec, J. Corrigendum to: Considering
(19) Jäger, A.; Mickoleit, E.; Breitkopf, C. A Combination of Multi- the dispersive interactions in the COSMO-SAC model for more
Fluid Mixture Models with COSMO-SAC. Fluid Phase Equilib. 2018, accurate predictions of fluid phase behavior [Fluid Phase Equilib. 367
476, 147−156. (2014) 109−116]. Fluid Phase Equilib. 2014, 384, 14−15.
(20) Jäger, A.; Mickoleit, E.; Breitkopf, C. Accurate and predictive (41) Mohr, P. J.; Newell, D. B.; Taylor, B. N.; Tiesinga, E. Data and
mixture models applied to mixtures with CO2. Proceedings 3rd European analysis for the CODATA 2017 special fundamental constants
supercritical CO2 conference, Paris, France, 2019, DOI: 10.17185/ adjustment. Metrologia 2018, 55, 125−146.
duepublico/48891. (42) Chen, W.-L.; Hsieh, C.-M.; Yang, L.; Hsu, C.-C.; Lin, S.-T. A
(21) Shimoyama, Y.; Iwai, Y. Development of activity coefficient Critical Evaluation on the Performance of COSMO-SAC Models for
model based on COSMO method for prediction of solubilities of solid Vapor−Liquid and Liquid−Liquid Equilibrium Predictions Based on
solutes in supercritical carbon dioxide. J. Supercrit. Fluids 2009, 50, Different Quantum Chemical Calculations. Ind. Eng. Chem. Res. 2016,
210−217. 55, 9312−9322.
(22) de P. Soares, R.; Baladão, L. F.; Staudt, P. B. A pairwise surface (43) Ferrarini, F.; Flôres, G. B.; Muniz, A. R.; de Soares, R. P. An open
contact equation of state: COSMO-SAC-Phi. Fluid Phase Equilib. 2019, and extensible sigma-profile database for COSMO-based models.
488, 13−26. AIChE J. 2018, 64, 3443−3455.
(23) Hsieh, C.-M.; Sandler, S. I.; Lin, S.-T. Improvements of (44) Gerber, R. P.; Soares, R. P. Assessing the reliability of predictive
COSMO-SAC for vapor−liquid and liquid−liquid equilibrium activity coefficient models for molecules consisting of several functional
predictions. Fluid Phase Equilib. 2010, 297, 90−97. groups. Braz. J. Chem. Eng. 2013, 30, 1−11.
(24) Hsieh, C.-M.; Lin, S.-T.; Vrabec, J. Considering the dispersive (45) Wang, S.; Lin, S.-T.; Watanasiri, S.; Chen, C.-C. Use of
interactions in the COSMO-SAC model for more accurate predictions GAMESS/COSMO program in support of COSMO-SAC model
of fluid phase behavior. Fluid Phase Equilib. 2014, 367, 109−116. applications in phase equilibrium prediction calculations. Fluid Phase
(25) Fingerhut, R.; Chen, W.-L.; Schedemann, A.; Cordes, W.; Rarey, Equilib. 2009, 276, 37−45.
J.; Hsieh, C.-M.; Vrabec, J.; Lin, S.-T. Comprehensive Assessment of (46) Possani, L. F. K.; de P. Soares, R. Numerical and Computational
COSMO-SAC Models for Predictions of Fluid-Phase Equilibria. Ind. Aspects of COSMO-based Activity Coefficient Models. Braz. J. Chem.
Eng. Chem. Res. 2017, 56, 9868−9884. Eng. 2019, 36, 587−598.
2645 https://dx.doi.org/10.1021/acs.jctc.9b01016
J. Chem. Theory Comput. 2020, 16, 2635−2646
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article
2646 https://dx.doi.org/10.1021/acs.jctc.9b01016
J. Chem. Theory Comput. 2020, 16, 2635−2646