New Solvation Radii for the CPCM Solvation Model: Addition of Nitrogen
by
Ruveid Rizvic
A Research Paper
Submitted in Partial Fulfillment of the
Requirements for the
Bachelor of Science Degree
in
Biochemistry
_____________________________ _____________________________
Dr. Adam Moser Dr. Andrew Kehr
Loras College
May 2022
dissolve in solvent through multiple processes that are difficult to understand physically. These
processes can be better understood through computational solvation modeling. Implicit solvation
is one computational approach, and it requires a defined cavity. The Conductor-like Polarizable
Continuum Model (CPCM) places the solute in a cavity within the solvent. The solvent is represented
as a continuous dielectric medium with no explicit molecular structure. The solvation radii of the
solute’s atoms set the extent of the cavity, and a set of these radii is called a radii definition.
Existing radii definitions (UFF, Bondi, Pauling) were not made for solvation modeling, as they use
atomic radii instead of solvation radii. These radii definitions could not replicate experimental
Gibbs solvation energies for both neutral and charged molecules, a challenge for
solvation models. New solvation radii definitions were made using molecules containing carbon,
hydrogen, and oxygen to improve accuracy compared to these previous radii definitions. In this
research, nitrogen-containing solutes were added, giving a total of 565 solutes from the
Loras Solvation Database evaluated in 420 solvation radii combinations. These combinations were
modeled using the CPCM solvation model. This resulted in 9 new solvation radii definitions that
were within reasonable error ranges and represented a realistic radii combination for each unique
atom. For the individual subgroups, 21 solvation radii combinations for neutrals, 3 combinations
for cations, and 5 combinations for anions met the error-analysis criteria for their own subgroup.
Results varied between charged and neutral solute subgroups based on the size of oxygen’s solvation
radius. Larger radii combinations performed best for neutral solutes, and the opposite held for the
anion and cation solutes.
Acknowledgments
I would like to thank the Loras College chemistry and biology departments throughout
these past few years for helping me get to this point in my education. I am thankful to my peers,
the class of 2023, and the Loras Computational Chemistry Lab, for their extensive help and
support. A big thank you to Nicholas Haskin and Emma Hoefer for helping with data
organization and getting me started with this research. I would also like to thank my friends
sincerest gratitude to my research mentor, Dr. Adam Moser. You have helped me immensely
throughout my chemistry education, research, and thesis preparation. I would not be where I am
Table of Contents
Abstract
List of Figures
Introduction
Methods
Conclusion
References
Appendix
List of Figures
Figure 3. Model of the molecule methoxide presenting individual solvation radii spheres…
Figure 4. Comparison of experimental vs. calculated Gibbs solvation energy for solutes in radii
combination for carbon 1.7, hydrogen 1.1, oxygen 1.4, and nitrogen 1.6
Figure 5. Comparison of experimental vs. calculated Gibbs solvation energy for solutes in radii
combination for carbon 1.9, hydrogen 1.1, oxygen 1.4, and nitrogen 1.6
List of Tables
Table 2. Neutral, Cation, and Anion Errors for the three existing solvation radii definitions.
Table 5. The resulting 9 radii definitions for all solutes including nitrogen.
Table 6. Solutes that consistently failed within the 9 new radii definitions.
Introduction
Solutions, where a solute is dissolved in a solvent, are crucial in the world of chemistry
because most important chemical and biochemical processes occur in solution. As an example,
the human body conducts many important chemical reactions in water. Cells in the human body
contain 70% of the body’s water, with the rest of the water mostly being contained around the
cells. Important biochemical reactions, such as protein interactions and ion channels, occur in
and around these cells. Reactions on the solute can sometimes only be observed or discovered in
London dispersion forces, but other stronger interactions such as hydrogen bonding and
dipole-induced dipole forces. The process by which the solvent dissolves the solute is known as
solvation. Several changes happen to the solute along the way, and following them helps explain
the process of solvation.
Solvation is a process with multiple steps. The first step is the breaking of the solvent’s
surface tension as the solute enters the solvent. This step is energetically unfavorable because
energy is needed to break the intermolecular forces between solvent molecules. Once the solute
manages to intrude into the solvent, a cavity is formed in the solvent for the solute to reside in.
The gain and loss of energy here result from the making and breaking of intermolecular forces
between the solvent molecules.2 Following this, repulsion and dispersion forces occur between
the solvent and the solute, along with electrostatic contributions and polarization.
These four processes (breaking the solvent’s surface tension, formation of the cavity,
dispersion, and repulsion) all affect the intermolecular forces as the solute enters and travels in
the solvent. The solute in a solution is affected both geometrically and electronically.
Once the solute is within the cavity, the solvent causes the solute’s electronic and geometric
structure to change in response. Geometric responses happen in solution phase reactions between the
solute and the solvent, compared to the reactions in the gas phase with no solvent. However, when the
solute transitions from the gas phase to the solution phase, both responses will occur. When the
solute enters the solvent, there is a change in polarization among the atoms of the solute and
in the solvent. The solutes in the solution phase also undergo a change in geometry
resulting from the change in polarization. Additionally, the solvent responds to the solute by
arranging itself around the solute, depending on whether they are polar or nonpolar.
The four steps mentioned above result in a favorable or unfavorable solvation process
depending on the strength of these interactions for various solutes. The energy needed to break
the surface tension and form the cavity in the solvent is costly for the solute, making it less
stable. When inside the cavity, dispersion and repulsion are typically a gain in energy for the
solute. The sum of energy gained or lost during the process of solvation is known as the Gibbs
energy of the solute in the aqueous phase (ΔGaq). Using measurements of other changes such as
measure the change in energy for large solutes, such as proteins, when there are many physical
processes occurring. Therefore, to better understand how the solvent interacts with the solute,
these interactions can be modeled in computational calculations. The use of quantum chemistry
study the physical and chemical properties of molecules and their reactions. 1 These physical and
chemical properties are determined by the electronic structure. The Schrödinger equation, once
solved, allows the prediction of a molecule’s properties and behavior. This allows physical
processes that would otherwise be studied through physical experimentation to be modeled. Quantum
mechanics accurately represents the solute. It is one of the most accurate methods used to predict
chemistry in the gas phase. However, there are major limitations to using quantum chemistry. It
requires a significant amount of computational power and time to perform calculations for
systems with large molecules. It also requires choices such as the level of theory and the basis
set, and these choices affect how complex the calculation becomes. Also, quantum
chemistry on its own is mainly beneficial for chemistry in the gas phase, whereas most complex
chemical processes occur in the solution phase, and it does not give insight into the loss or gain of
energy in the solution phase.3 Nevertheless, quantum mechanical methods use a range of theories
to investigate how atoms and molecules behave. With a method for determining the properties and
behavior of the solute established, the next step is to determine how to represent the solvent.
Solvation models describe how solvents behave and affect the solute in a solution. They are
designed to explain how the solvent and the solute interact. Some computational models simplify
the calculation and minimize the number of terms needed to perform it, like the implicit
solvation model, which will be discussed shortly. Although these models are not as accurate as
full quantum chemical treatments because they do not focus as closely on electronic and
geometric structures, they are more practical given the shorter computational time required.
Since this research covers a large range of molecules, big and small, it uses solvation models,
which are more practical and could allow progress in developing a more effective method of
representing the solvent.
There are two main types of solvation models used to represent the solvent. The first
type, explicit solvation, represents the individual solvent molecules around the solute (Figure 2).4,9
This model shows the specific locations where solvent molecules interact with the solute.
As previously explained, the ability to accurately model the interactions between the solvent and
the solute can allow for more accurate calculations. However, the computational time needed is
extensive and therefore not effective for this research.9
Figure 2. Explicit Solvation vs. Implicit Solvation. The Explicit Solvation Model contains explicit solvent molecules
(blue) surrounding the solute molecule (red and white). The placement of the individual solvent molecules does
not visually leave a defined cavity in which the solute resides. The Implicit Solvation Model replaces the solvent
with a continuous medium described by a dielectric constant (blue stripe). The effect of this dielectric on all
points of the solute gives the solvent cavity a defined boundary (solid black).
Implicit solvation models replace the explicit solvent molecules with a continuous
dielectric medium (Figure 2). This means that the system is only as big as the solute. 5 The
solvent now has no information regarding its structure. The construction of the implicit solvation
with a cavity. First, there is the surface construction, which sets the boundary where the cavity ends.
The most common construction is the van der Waals surface, which is built from a sphere
centered on each atom. Then, there is the surface type, which defines the shape of the
cavity depending on which atoms are assigned spheres. The cavity shape corresponds to the molecule
as a whole; however, it should not be mistaken for the solvation radius of any single atom.
Together, the size and shape of the cavity reflect the solvation radii definition. There are
different definitions that are used to define the shape of the cavity, one being the all-atom
definition (AA). This definition allows the cavity to be defined by all atoms in a solute,
whereas other definitions do not account for all of them.
Changing the size of the cavity could lead to stronger or weaker interactions between
the solvent and solute.10 The solvation radius can be pictured as a sphere around each atom, as seen in
Figure 3. Changing, or scaling, the size of the solvation radii is known as alpha scaling.
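The idea of a cavity built from scaled atomic spheres can be illustrated with a short Python sketch. It assumes that alpha scaling simply multiplies each solvation radius by a constant factor and that the cavity is the union of the resulting spheres. The two-atom geometry, the function names, and the test point are hypothetical and are not taken from this research.

```python
import math


def scaled_radii(radii, alpha):
    """Multiply every element's solvation radius (angstroms) by alpha."""
    return {element: alpha * radius for element, radius in radii.items()}


def inside_cavity(point, atoms, radii):
    """Return True if the point lies inside at least one atomic sphere.

    atoms is a list of (element, (x, y, z)) tuples in angstroms. The cavity
    is treated as the union of the spheres centered on the atoms.
    """
    return any(
        math.dist(point, center) <= radii[element]
        for element, center in atoms
    )


# Hypothetical C-O fragment along the z axis (not a real optimized geometry).
atoms = [("C", (0.0, 0.0, 0.0)), ("O", (0.0, 0.0, 1.4))]
base_radii = {"C": 1.9, "O": 1.4}  # unscaled radii in angstroms

point = (0.0, 0.0, -2.0)  # 2.0 angstroms from carbon
unscaled = inside_cavity(point, atoms, scaled_radii(base_radii, 1.0))
scaled = inside_cavity(point, atoms, scaled_radii(base_radii, 1.1))
print(unscaled, scaled)
```

With alpha at 1.0 the point sits just outside the carbon sphere, while scaling by 1.1 enlarges the cavity enough to enclose it, which is the sense in which scaling the radii moves the boundary between solute and solvent.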
If the overlapping portions of each atom’s solvation sphere were removed, what remains could be
visualized as the shape of the cavity. The size of the cavity affects the solvation energy. For
example, if the cavity becomes smaller, allowing the solvent to interact more closely
with the solute, the intermolecular forces become stronger, leading to a greater
change in free energy between the solution phase and the gas phase. Then, there are ions and
neutrals to consider when modeling solvation. Neutrals are simpler than ions, so the effect of
adjusting the size of the cavity is more predictable. Ions, on the other hand, involve more
variables and are therefore less predictable. A change in solvation radius does not have to
apply to all the atoms defining the solute; radii can be changed for selected atoms, in
combination. This could allow the reproduction of experimental solvation values through radii
combinations in a model.
With the cavity included in all the terms discussed (electrostatics, repulsion, and
dispersion), the Gibbs energy in the aqueous phase can be calculated as in Equation 1.
The Gibbs solvation energy has experimental values, which can be used as a reference for
comparing calculated values from computational methods. The Gibbs energy of solvation can then be
calculated through the difference between the solution phase energy (∆aqG) and the gas phase energy
To obtain the Gibbs solvation energy error of the solutes in these radii definitions, the difference
between the calculated and experimental solvation energy was calculated, as shown in Equation
3.
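The two relations described in this passage can be written compactly. This is a sketch under stated assumptions: the symbol for the gas phase energy and the order of subtraction in the error (calculated minus experimental) are inferred from the wording of the text, not copied from the original equations. Reversing the order would flip the sign of every error without changing its magnitude.

```latex
\begin{align}
\Delta_{\mathrm{solv}}G &= \Delta_{\mathrm{aq}}G - \Delta_{\mathrm{gas}}G \\
\mathrm{error} &= \Delta_{\mathrm{solv}}G_{\mathrm{calc}} - \Delta_{\mathrm{solv}}G_{\mathrm{exp}}
\end{align}
```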
to replicate the experimental solvation energy of various solutes. 3 The CPCM model is the
implicit solvation model with the solute in the cavity that also contains the options listed
previously for a solvent model. She tested the performance of each boundary definition when
MUE 6.83 2.02 2.60 18.88 5.98 4.80 15.88 6.39 4.25
RMSD 7.57 2.49 3.29 19.40 7.29 6.26 17.42 7.81 5.27
Table 2. Neutral, Cation, and Anion Errors for the three default solvation radii definitions with non-electrostatics
on. The Mean Signed Errors (MSE), Mean Unsigned Errors (MUE), and Root Mean Squared Deviations (RMSD)
are listed under each radii definition.3
The existing boundary definitions used in Sloan’s work did not give promising results,
suggesting that they were not designed for solvation modeling.3 As shown in Table 2, the
average solvation errors for both cations and anions came out too positive or too negative
with the Bondi and UFF definitions in the CPCM solvation model. The
errors for both cations and anions were smaller with Pauling; however, they still were not
acceptable. For the neutrals, Bondi and Pauling obtained mean signed error (MSE) values within
±1 kcal/mol. Overall, the UFF definition did not perform well for any molecule type, while
the Pauling and Bondi definitions performed well only for neutrals.
It is difficult to generate an acceptable radii combination that performs well for all
molecules. As shown in Sloan’s research, charged molecules are difficult to represent with
these existing radii definitions. The leading reason is that
these three radii definitions were based on atomic radii instead of solvation radii. This led to the
subsequent research of Nick Haskin. His goal was to improve the agreement between calculated and
experimental Gibbs solvation energies for the CPCM solvation model using new solvation radii
definitions. To achieve this, Haskin’s work consisted of choosing solvation radii ranges based on
the existing radii definitions, calculating the solvation energy through computational modeling,
and comparing the calculated solvation energy with the experimental solvation energy using the
average error and absolute average error. In total, Haskin searched 281 new
radii combinations for 354 molecules containing carbon, hydrogen, and oxygen.12 Looking at
mean signed errors for radii definitions can be deceiving because molecules can be over-solvated
and under-solvated in equal measure, making the average error seem small while the
magnitude and spread of the error are large. It was crucial to look for radii definitions where
all solutes in that specific radii combination are within acceptable error ranges. A molecule
“fails” when its solvation error exceeds a threshold. For neutrals the threshold is
±5 kcal/mol, because neutral solvation energies are relatively simple to replicate
within this range, so any error beyond it is unacceptable. Ions have a harder time replicating
experimental solvation energies because of the large magnitude of their solvation energy values,
which leads to larger fluctuations in the calculated values, so their threshold is ±10
fewer for the charged solutes. A total of 11 possible radii definitions were found for all solutes
within acceptable statistical error ranges (Table 3). As shown, all 11 radii definitions have
average error values within the acceptable MSE range. It is currently unknown which radii
definitions were optimal for specific solute subgroups like cations and anions; however, with 281
possible solvation radii combinations, there could be combinations that accurately represent
them.
In this research, nitrogen-containing solutes were added and modeled in new radii combinations. This
gave more charged molecules to analyze compared to Haskin’s set of molecules. This research had
two main goals. The first goal was to search for new solvation radii
combinations that work for all molecule subgroups (neutrals, cations, and anions) with the addition
of nitrogen, as well as to find which radii combinations worked best for each subgroup
individually. These radii combinations were then compared to the
three existing radii definitions that Neuzil tested. The second goal was to suggest a new subset of
radii for future searches. Solvation radii ranges from Haskin’s research were taken into
consideration to find any similarities and differences between his radii combinations and these
newer ones. Once the calculated Gibbs solvation energy is determined, it can be compared to the
experimental value to get the solvation energy error (Equation 3). This research
intends to replicate experimental solvation values by constructing a new set of solvation radii,
with nitrogen-containing solutes added, designed for the CPCM solvation model.
Methods
A total of 565 solutes consisting of carbon, hydrogen, oxygen, and nitrogen were
obtained from the Loras Solvation Database. These were all derived from the Minnesota and
Mobley Molecular Databases.6-7 Of these 565 solutes, 475 were neutrals, 48 were cations,
and 42 were anions. The first step toward the Gibbs solvation energy was to obtain a
coordinate map for every atom in all 565 solutes, giving the geometry, or location of the atoms, in
the molecule. An example of the cavitation, repulsion, and dispersion variables used in these
coordinate maps is shown in Appendix A. The molecular geometries, as well as experimental
Gibbs aqueous and gaseous energies, were already stored in the Loras Solvation Database.
To represent the solute in the solvation model, quantum mechanical methods were used. The
geometry of the solutes was represented in the gas phase because this takes less time to compute
compared to solutes in the aqueous phase. Hartree-Fock theory and the 6-31G(d) basis set were used
to calculate the solute due to their simplicity and their use in previous research.
Solvent cavities were created using Gaussian16 software. The solvent was also in the gas
phase geometry, utilizing the van der Waals surface construction for creating the boundary or
shape of the cavity. To generate radii definitions, the all-atom surface type was applied to every
atom in the solute, giving each atom a solvation radii sphere. In implicit solvation methods, alpha
solvation radii size. The alpha value was set to default (1.1), like Sloan Neuzil’s alpha value.
With Haskin’s radii definitions being scaled to a 1.0 value, it would be inaccurate to compare
how well these new radii definitions replicated solvation with his without knowing precisely
The solvation radii range for each atom was similar to Haskin’s;12 however, the addition
of nitrogen contributes to smaller radii ranges. In this research, carbon’s solvation radius
ranged from 1.7 to 2.1 angstroms, hydrogen’s from 1.0 to 1.2 angstroms, oxygen’s
from 1.4 to 1.7 angstroms, and nitrogen’s from 1.4 to 2.0 angstroms, in 0.1-angstrom
increments (Table 4). With ranges from these four elements with 0.1 increment changes,
Atom                          C          H          O          N
Solvation Radii (angstroms)   1.7 - 2.1  1.0 - 1.2  1.4 - 1.7  1.4 - 2.0
Table 4. Solvation radii ranges for each unique atom (in angstroms), with 0.1 angstrom increment changes within
the range.
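The ranges in Table 4 can be expanded into the full search grid with a few lines of Python. This is a minimal sketch, not the script used in this research; the function and variable names are hypothetical. It shows that 5 carbon, 3 hydrogen, 4 oxygen, and 7 nitrogen values give the 420 radii combinations reported.

```python
from itertools import product


def radii_range(start_tenths, stop_tenths):
    """Return radii in angstroms from start to stop (inclusive), step 0.1.

    Integer tenths are used so floating-point drift cannot add or drop
    a grid point.
    """
    return [tenths / 10 for tenths in range(start_tenths, stop_tenths + 1)]


# Ranges from Table 4, in angstroms.
RANGES = {
    "C": radii_range(17, 21),
    "H": radii_range(10, 12),
    "O": radii_range(14, 17),
    "N": radii_range(14, 20),
}


def all_combinations(ranges):
    """Build every radii combination as a dict keyed by element symbol."""
    elements = list(ranges)
    return [dict(zip(elements, values)) for values in product(*ranges.values())]


combinations = all_combinations(RANGES)
print(len(combinations))  # 5 * 3 * 4 * 7 = 420
```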
Each radii combination gave a calculated solvation energy for each solute, which can be
compared to the experimental energy. Calculating the difference between the experimental and
the calculated Gibbs solvation energy values gives the error between the two.
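\[ \mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{error}_i \]  (Equation 4)
\[ \mathrm{MUE} = \frac{1}{N}\sum_{i=1}^{N} \left|\mathrm{error}_i\right| \]  (Equation 5)
\[ \mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \mathrm{error}_i^{2}} \]  (Equation 6)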
across all the calculated Gibbs solvation energy error values were determined. The mean
signed and unsigned errors show the average error for signed and unsigned values,
respectively. The root mean squared deviation shows the spread of errors. The MSE of a radii
definition can appear acceptable even when the errors are scattered widely across
both negative and positive values, because errors of opposite sign cancel and make it
look as if the average error is close to zero. It can be determined if the values are truly closer to
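The three statistics, and the way a small MSE can hide large errors, can be illustrated with a short Python sketch. The function name and the error values below are hypothetical and are chosen only to show the cancellation effect; they are not results from this research.

```python
import math


def error_statistics(errors):
    """Return (MSE, MUE, RMSD) for a list of signed errors in kcal/mol."""
    n = len(errors)
    mse = sum(errors) / n                              # mean signed error
    mue = sum(abs(e) for e in errors) / n              # mean unsigned error
    rmsd = math.sqrt(sum(e * e for e in errors) / n)   # root mean squared deviation
    return mse, mue, rmsd


# Hypothetical errors: equal over- and under-estimates that cancel out.
scattered = [-6.0, 6.0, -6.0, 6.0]
# Hypothetical errors: genuinely small deviations.
tight = [-0.5, 0.5, -0.5, 0.5]

scattered_stats = error_statistics(scattered)
tight_stats = error_statistics(tight)
print(scattered_stats)
print(tight_stats)
```

Both sets have an MSE of zero, but only the MUE and RMSD reveal that the first set is far from the experimental values, which is why all three statistics are examined together.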
When choosing which radii combinations have small errors between experimental and
calculated solvation energy values, each molecule’s error is checked against a limit for
neutral or ionic molecules, and a molecule whose error passes that limit is known as a fail. After
looking at the errors and at decisions from previous research,3,12 errors of ≥5 kcal/mol
for neutrals and ≥10 kcal/mol for ions were counted as fails. The ionic limit being greater than
the limit for neutrals is reasonable because of the greater ∆solvG values of the ionic molecules. These
calculated values were then plotted with the experimental values to see how close the calculated
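The fail criterion can be sketched in Python as follows. This is an illustration under assumptions: the limits are applied to the magnitude of the error, the boundary value itself counts as a fail (following the ≥ signs in the text), and the example errors are hypothetical.

```python
NEUTRAL_LIMIT = 5.0  # kcal/mol, limit for solutes with charge 0
ION_LIMIT = 10.0     # kcal/mol, limit for cations and anions


def is_fail(charge, error):
    """Return True if the error magnitude reaches the limit for this charge."""
    limit = NEUTRAL_LIMIT if charge == 0 else ION_LIMIT
    return abs(error) >= limit


def count_fails(results):
    """Count fails per subgroup from (charge, error) pairs."""
    counts = {"neutral": 0, "cation": 0, "anion": 0}
    for charge, error in results:
        if not is_fail(charge, error):
            continue
        if charge == 0:
            counts["neutral"] += 1
        elif charge > 0:
            counts["cation"] += 1
        else:
            counts["anion"] += 1
    return counts


# Hypothetical (charge, error) pairs for one radii combination.
example = [(0, -5.6), (0, 2.1), (1, 12.0), (1, -8.4), (-1, 11.5), (-1, 3.0)]
fails = count_fails(example)
print(fails)
```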
A goal of this research was to narrow down the range of solvation radii for each atom in
the database, so that when more unique atoms are added they do not have to be searched over too
large a solvation radii range, and so that more radii definitions can be generated within that
range by lowering the increment. To do this, acceptable radii definitions covering all the molecules
analyzed had to have an MSE value within ±1 kcal/mol. This shrank the search to 82
of the 420 radii definitions. Furthermore, radii definitions with MUE values greater than 3
kcal/mol were excluded. Finally, radii definitions with an RMSD value greater than 4 kcal/mol were
excluded. These decisions are based on previous research from Haskin, so that solvation radii
trends can be seen more easily using the same error ranges. This left 36 out of the 82 radii
However, unlike with the three default radii definitions, it must be taken into account that
radii definitions should represent the physical reality of atoms in molecules. Atoms
with a bigger atomic radius should have a bigger solvation radius than atoms with a
smaller atomic radius. It is also known that charged
molecules need a smaller cavity to be able to interact with the solvent. This means that atoms
like oxygen and nitrogen cannot be close in size to a hydrogen atom, and oxygen
cannot be greater in size than nitrogen or carbon. In order, carbon will have the biggest solvation
radius. Following that, nitrogen and oxygen would either be equal in size, or nitrogen
would be slightly bigger than oxygen, in line with their atomic radii. Lastly, hydrogen has the smallest
atomic radius and therefore the smallest solvation radius. Therefore, any radii definitions that did not
represent this reality were excluded from the analysis. This resulted in 9 radii combinations that
Table 5. The resulting 9 radii definitions for all solutes. These radii definitions listed are the result of
excluding radii definitions that were not within ±1 MSE, 3 MUE, and 4 RMSD (kcal/mol), as well as excluding
radii definitions that did not follow the rules of radii size corresponding to atoms.
Radii Combinations        Errors                   Failures                  Total Failures
C    H    O    N          MSE    MUE    RMSD       Neutral  Cation  Anion    /565
1.7  1.1  1.4  1.5        -0.90  2.33   3.26       36       4       1        41
1.7  1.1  1.4  1.6        -0.66  2.22   3.10       34       4       1        39
1.8  1.1  1.4  1.5        -0.68  2.11   3.10       31       5       1        37
1.8  1.1  1.4  1.6        -0.47  2.04   2.98       29       4       1        34
1.9  1.1  1.4  1.5        -0.10  1.99   3.09       20       6       2        28
1.9  1.1  1.4  1.6         0.09  1.98   3.01       19       6       2        27
2.0  1.0  1.4  1.5        -0.80  2.39   3.44       28       7       5        40
2.0  1.0  1.4  1.6        -0.58  2.30   3.28       27       6       5        38
2.0  1.1  1.4  1.5         0.83  2.22   3.41       24       7       7        38
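The two-stage screening described here, statistical limits followed by the physical size ordering, can be sketched in Python. This is an illustration under assumptions: the ordering rule is simplified to carbon > nitrogen ≥ oxygen > hydrogen, and the condition that nitrogen be only slightly larger than oxygen is not encoded, so the sketch is not claimed to reproduce the 9 definitions exactly. The first candidate is a row from Table 5; the other two are hypothetical.

```python
def passes_error_screen(mse, mue, rmsd):
    """Apply the limits from the text: |MSE| <= 1, MUE <= 3, RMSD <= 4 (kcal/mol)."""
    return abs(mse) <= 1.0 and mue <= 3.0 and rmsd <= 4.0


def is_physically_ordered(radii):
    """Simplified ordering rule (an assumption): C > N >= O > H."""
    return radii["C"] > radii["N"] >= radii["O"] > radii["H"]


def screen(candidates):
    """Keep candidates that pass both the statistical and the physical checks."""
    return [
        c for c in candidates
        if passes_error_screen(*c["stats"]) and is_physically_ordered(c["radii"])
    ]


candidates = [
    # Row from Table 5: passes both checks.
    {"radii": {"C": 1.9, "H": 1.1, "O": 1.4, "N": 1.6}, "stats": (0.09, 1.98, 3.01)},
    # Hypothetical: acceptable statistics, but oxygen is larger than nitrogen.
    {"radii": {"C": 1.7, "H": 1.1, "O": 1.6, "N": 1.5}, "stats": (0.20, 2.00, 3.00)},
    # Hypothetical: sensible ordering, but the MSE is outside ±1 kcal/mol.
    {"radii": {"C": 1.9, "H": 1.1, "O": 1.4, "N": 1.6}, "stats": (1.50, 2.00, 3.00)},
]

kept = screen(candidates)
print(len(kept))
```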
Error analysis alone is not enough to determine which radii combinations are optimal for
the 565 molecules. Some radii combinations are more favored toward neutral molecules while
molecules, or fails, can be seen as the atom solvation radii size changes. Following previous
research, a neutral molecule with an error greater than ±5 kcal/mol is considered a failure while
cations and anions with an error greater than ±10 kcal/mol are considered failures. Among the
trends that were seen, neutral molecules had fewer failures when carbon’s solvation radius
increased in size (>1.7) and oxygen’s increased in size (>1.4). Cations had fewer failures when
carbon was larger (>1.8) and oxygen was also larger (≤1.6). Anions had the fewest
failures when oxygen’s radius was 1.4 angstroms (Table 5). When comparing these
solvation radii to the existing radii definitions that Sloan tested, carbon’s solvation
radius is, on average, closer to that of UFF. No existing radii definition matched the most
common solvation radius for hydrogen (1.1 angstroms). For oxygen’s most common
solvation radius, Pauling was directly comparable, with an atomic radius of 1.4 angstroms.
Finally, both Pauling and Bondi’s radii, 1.5 and 1.55 angstroms respectively, were
When looking at which molecules failed consistently within the 9 new solvation radii
combinations, most were neutral solutes. Neutrals were too under-solvated, showing average
error values beyond -5 kcal/mol. The calculated solvation energies were far too positive, and
therefore the difference between the experimental and calculated values was negative.
Solutes                                 Charge  Gsolv exp    Amount   Average Error
                                                (kcal/mol)   Failed   (kcal/mol)
[2-benzhydryloxyethyl]-dimethylamine     0        -9.34      9          8.33
1,2-dinitroxypropane                     0        -5.00      9         -6.76
1,3-bis-[nitrooxy]butane                 0        -4.29      9         -7.44
1,4,5,8-tetraminoanthraquinone           0        -8.90      8         -9.49
1-acetoxyethylacetate                    0        -4.97      3         -5.89
1-amino-4-hydroxy-9,10-anthracenedione   0        -9.53      8         -9.87
1-methyl-3-nitrobenzene                  0        -3.45      3         -5.28
1-methylthymine                          0       -10.40      4         -5.94
1-nitrobutane                            0        -3.08      3         -5.15
1-nitropropane                           0        -3.34      6         -5.25
2-methoxyphenol                          0        -5.94      4         -5.44
2-nitrophenol                            0        -4.58      9        -10.50
3-nitrooxypropyl nitrate                 0        -4.80      9         -7.99
Amitriptyline                            0        -7.43      9          8.72
Cyanuric acid                            0       -18.06      4         -5.81
Dicyandiamide                            0       -10.95      9        -11.66
Dinitrogen tetroxide                     0        -2.14      9        -11.47
Dinoseb                                  0        -6.20      9         -9.65
Fenbufen                                 0       -12.75      4         -6.02
Glycerol triacetate                      0        -8.84      4         -5.53
Isobutyl formate                         0        -2.22      8         -6.29
Isopropyl formate                        0        -2.02      8         -7.04
N,N-dimethylpiperazine                   0        -7.58      5         -7.04
Nitroethane                              0        -3.71      5         -5.76
Nitroglycol                              0        -5.70      9         -6.57
Nitromethane                             0        -3.95      7         -5.91
Nitroxyacetone                           0        -6.00      3         -5.74
Peracetic acid                           0        -5.88      8         -7.11
Propyl formate                           0        -2.48      8         -6.66
Trimethoxymethane                        0        -4.42      4         -5.49
Urea                                     0       -13.80      2         -5.45
Water                                    0        -6.31      2         -5.55
4-nitroaniline                           1       -75.90      5        -11.76
Diethyl ether                            1       -71.50      5         11.27
Dimethyl ether                           1       -79.50      7         12.02
Ethanol (cation)                         1       -88.40      9         12.22
Hydronium                                1      -110.30      9         13.27
Methanol (cation)                        1       -93.00      9         12.12
Benzyl alcohol                          -1       -85.10      9         12.24
Ethanol (anion)                         -1       -90.70      3         11.63
Methanol (anion)                        -1       -95.00      5         11.94
Table 6. Solutes that consistently failed within the 9 new solvation radii definitions. Solutes are
sorted by charge (neutrals, cations, and then anions), with their respective experimental solvation
energy. Each solute displays an average solvation energy error in the combinations they failed in.
Meanwhile, most cations and all anions that failed consistently in these 9 new solvation radii
combinations had average errors that were too positive. This means that these cations and anions
had calculated values greater than the experimental values, giving a positive error, meaning they
were over-solvated. Comparing the solutes that failed in more than half of the radii combinations
with the solutes that failed in fewer than half, trends can be difficult to spot. If the solute has
a complex structure, such as one containing both nitrogen and oxygen atoms, the molecule seems to
fail more often in these 9 radii combinations. There doesn’t seem to be a trend on how big the solute is
dimethylamine is a large solute containing two aryl groups. Compared to Fenbufen, which
contains two aryl groups and a hydroxyl group and is just as large, it has twice the
number of failures. However, the solutes that seem to fail more contain more than one oxygen
or nitrogen atom, with hydroxyl or amino groups at the end. These solutes also tend to have
double or even triple bonds between oxygen and nitrogen atoms. This makes sense as the radii
definition has to account for both atoms, which tend to have difficulty having a small enough
Increasing the solvation radius of oxygen drastically increases the number of failures
among anions. The solvation radius of oxygen must be 1.4 angstroms or lower to have minimal fails
for anions. However, oxygen’s solvation radius was only analyzed between 1.4 and 1.7 angstroms.
Cations and anions cannot both perform well at the same radii definition, as increasing
oxygen’s and carbon’s solvation radii works well for cations but drastically increases the number
of fails for anions. However, with 48 cations and 42 anions, accepting a subtle increase in cation
fails to reduce the number of anions failing is acceptable. It is not acceptable to allow an
increase in fails for cations and anions in order to decrease the number of fails of neutral
molecules, as there are over 5 times as many neutral molecules as charged molecules, heavily biasing error
Looking at specific radii combinations using a 45-degree plot, we can see the trends of
bigger solvation radii sizes due to having no charge. On the other hand, cation solutes seem to be
better at replicating experimental solvation energies at this definition than neutral solutes.
combination, many trends can be found (Figure 5). Looking at each subgroup of solutes, neutrals
compared to the previous radii combination, also being in the spot where calculated solvation
combinations for each subgroup of molecules, cations and anions need to be analyzed
individually. With cation and anion solutes, there were no radii combinations that were within
the statistical error standards and did not have nonideal atomic radii relative to neighboring
unique atoms. However, with values slightly higher than ±4 kcal/mol RMSD, there were a few radii
definitions for cation solutes (Table 7). Among the radii definitions that were most optimal for
cations, hydrogen’s solvation radius must be 1.1 angstroms to come close to the statistical
error range. Unlike in the radii definitions representing all solutes, oxygen’s radius appears
at 1.5 angstroms. Finally, we see radii values in common with the definitions for all 565
solutes, with carbon at 1.4 angstroms, hydrogen at 1.1 angstroms, oxygen at 1.4 angstroms, and
nitrogen at 1.5 and 1.6 angstroms. In these definitions, we see that only 4
were only seen on the smaller side for these definitions. An interesting finding is that hydrogen’s
radius was mostly at 1.0 angstroms, but that is to be expected for charged molecules, as they need
smaller radii to be able to interact with the solvent more strongly. Oxygen’s radius lay at the
typical size of 1.4 angstroms, a common trend in the radii definitions concerning all 565
solutes. Out of the 9 radii definitions found for all solutes, only one radii definition for anion
solutes matched, with carbon at 1.7 angstroms, hydrogen at 1.1 angstroms, oxygen at 1.4
angstroms. However, as seen in the table, most of it has to do with the drastic increase in
oxygen’s and nitrogen’s solvation radii. When oxygen was at 1.4 angstroms, most of the
definitions had hydrogen’s solvation radius at 1.1 angstroms. There are not many combinations
with oxygen’s radius at 1.4 angstroms, except when hydrogen is at 1.1 angstroms.
There is a visible trend across the table where the smallest errors are seen in the
largest radii combinations. This supports the claim that neutrals tend to need bigger solvation
radii. Neutrals have several times more combinations with oxygen’s solvation radius greater than
1.4 angstroms than cations and anions do. This also leads to more combinations
where nitrogen’s solvation radius is bigger than 1.6 angstroms. There is a wider range of
carbon solvation radii in neutrals compared to cations and anions. Finally, bigger radii
combinations tend to perform better with neutrals and cations (only when
nitrogen’s solvation radius increases), while the opposite can be said for anions.
Conclusion
Solutions are crucial for biochemical processes to function, and therefore it is important to
improve the understanding of solvation through computational modeling. The purpose of this
research was to generate new solvation radii for the CPCM solvation model. In this research,
565 solutes containing carbon, hydrogen, oxygen, and nitrogen were analyzed: 475 neutral
molecules, 48 cations, and 42 anions. These new solvation
radii definitions considered previous radii ranges as well as the atoms’ radii size.
There were 420 radii definitions analyzed for their performance in replicating
experimental solvation energies. Some radii definitions were optimal for neutral molecules while
others were optimal for ions. Ultimately, 9 new solvation radii definitions were found
to be acceptable for all molecules analyzed in this research. All 9 radii definitions had MSE
values within ±1 kcal/mol, MUE values within ±3 kcal/mol, and RMSD values within ±4 kcal/mol.
Results show that when oxygen’s radius was increased, neutral molecules failed less. Decreasing
oxygen’s radius did the contrary. There were enough anions and cations analyzed in this research
to see trends in molecular failures. Setting hydrogen’s radius to 1.0 angstroms while also
increasing oxygen’s radius showed better performance for cations. Decreasing oxygen’s radii size
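The three error statistics named above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in this research: the function names are hypothetical, the error is assumed to be calculated minus experimental (consistent with a positive error meaning the calculated value is greater), and the example energies are invented placeholders rather than values from the Loras Solvation Database.

```python
import math

def error_statistics(calculated, experimental):
    """Return (MSE, MUE, RMSD) in the units of the inputs (kcal/mol here).

    Assumption: error = calculated - experimental for each solute.
    """
    errors = [c - e for c, e in zip(calculated, experimental)]
    n = len(errors)
    mse = sum(errors) / n                              # mean signed error
    mue = sum(abs(x) for x in errors) / n              # mean unsigned error
    rmsd = math.sqrt(sum(x * x for x in errors) / n)   # root-mean-square deviation
    return mse, mue, rmsd

def is_acceptable(mse, mue, rmsd, mse_limit=1.0, mue_limit=3.0, rmsd_limit=4.0):
    """Check one radii definition against the 1 / 3 / 4 kcal/mol limits."""
    return abs(mse) <= mse_limit and mue <= mue_limit and rmsd <= rmsd_limit

# Hypothetical solvation free energies in kcal/mol, for illustration only.
example_calculated = [-5.0, -3.0, -1.0]
example_experimental = [-4.0, -4.0, -2.0]
stats = error_statistics(example_calculated, example_experimental)
print(stats, is_acceptable(*stats))
```

Running the same check separately on the neutral, cation, and anion subsets would show which subgroup pushes a radii definition outside the limits.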
Based on the results, searching at a smaller minimum solvation radius
for oxygen is predicted to have a significant effect in decreasing anion failures and
overall improving statistical errors. It is not worth changing the range of carbon or hydrogen, as
doing so greatly affects the number of neutral failures while not providing much improvement for
ions. Carbon’s radii range is wide enough to analyze the effect of decreasing oxygen’s radius
to 1.3 angstroms. Having nitrogen’s radius any higher than 1.6 angstroms would require
oxygen’s radius to be at least 1.5 angstroms to be within 0.2 angstroms of
nitrogen’s radius. Since a radius of 1.5 angstroms or higher for oxygen makes
anions perform worse, searching higher radii values for nitrogen is not worth
considering.
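The reasoning about nitrogen and oxygen can be illustrated with a small filter. This sketch assumes the rule is one-sided (nitrogen’s radius may exceed oxygen’s by at most 0.2 angstroms), which is how it is stated above. The function name is hypothetical, and radii are stored as integer hundredths of an angstrom to avoid floating-point rounding.

```python
MAX_GAP = 20  # 0.2 angstrom, expressed in hundredths of an angstrom

def allowed_pairs(oxygen_radii, nitrogen_radii, max_gap=MAX_GAP):
    """Keep (O, N) pairs in which nitrogen exceeds oxygen by at most max_gap."""
    return [(o, n) for o in oxygen_radii for n in nitrogen_radii if n - o <= max_gap]

oxygen = [140, 150, 160, 170]  # 1.4 to 1.7 angstroms, the range analyzed here
nitrogen = [170]               # a hypothetical nitrogen radius above 1.6 angstroms

# Oxygen at 1.4 angstroms is filtered out; only 1.5 angstroms and above remain.
print(allowed_pairs(oxygen, nitrogen))
```

Under this assumption, every surviving pair has oxygen at 1.5 angstroms or higher, which is the region where anions were found to perform worse.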
Further research in this field could add one additional atom at a time, such as
phosphorus or sulfur, as they are also common in biological molecules and help biochemical
processes function. Adding one atom at a time could help in understanding the solvation trends
when molecules become bigger or more complex. This also allows for replicating solvation
energies for molecules that are too long to experiment on or too dangerous to analyze. To expand
on research for these new radii definitions, it could also prove
beneficial to change the atoms’ solvation radii in smaller increments (such as 0.05 angstroms)
while generating radii combinations, as default radii definitions were specific for the solvation
radii size of atoms. Using smaller increments can drastically increase the
number of combinations generated, so shortening the solvation radii range for each atom is
crucial.
To narrow down the next search for solutes containing nitrogen, carbon’s solvation radii
range should stay the same (1.7 – 2.1 angstroms), as should hydrogen’s (1.0 – 1.2 angstroms),
while oxygen’s minimum and maximum solvation radii should decrease (1.3 – 1.4 angstroms),
as should nitrogen’s maximum radius (1.4 – 1.6 angstroms) (Table 5).
Atom                          C           H           O           N
Solvation Radii (angstroms)   1.7 – 2.1   1.0 – 1.2   1.3 – 1.4   1.4 – 1.6
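To show how the increment size affects the size of the next search, the sketch below enumerates every radii combination over the proposed ranges. It is an illustration under stated assumptions rather than the procedure used in this research: the dictionary and function names are hypothetical, both endpoints of each range are assumed to be included, and radii are stored as integer hundredths of an angstrom so that a 0.05 angstrom step is exact.

```python
from itertools import product

# Proposed ranges in hundredths of an angstrom (170 means 1.70 angstroms).
PROPOSED_RANGES = {"C": (170, 210), "H": (100, 120), "O": (130, 140), "N": (140, 160)}

def radii_combinations(ranges, step):
    """List every (C, H, O, N) radii combination on a grid with the given step."""
    axes = [range(low, high + 1, step) for low, high in ranges.values()]
    return list(product(*axes))

coarse = radii_combinations(PROPOSED_RANGES, 10)  # 0.1 angstrom increments
fine = radii_combinations(PROPOSED_RANGES, 5)     # 0.05 angstrom increments
print(len(coarse), len(fine))  # 90 675
```

Halving the increment raises the count from 90 to 675 combinations for these ranges, which is why narrowing each atom’s range matters before refining the step.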
There needs to be a safe range for carbon’s solvation radii, as carbon is shown to perform
well for solutes at almost every size, so not changing the solvation radii range for carbon would
be ideal. Hydrogen’s radii range will remain the same to make sure decreasing the radius of
oxygen does not significantly increase neutral failures. This means that if oxygen’s radius were to
be 1.3 angstroms, we may see greater values for carbon’s and hydrogen’s solvation radii,
so it is important not to exclude sizes that might be present in radii definitions. While most
cation solutes were introduced by adding nitrogen into the set of atoms, changing the solvation
radius of nitrogen did not affect the number of failures for these ions, and since most radii
definitions had nitrogen at 1.5 to 1.6 angstroms, that will be the recommended
range.
References
(2) Moser, A. Charge-Dependent Radii for Improved Treatment of Solvation Effects
in Quantum Mechanical Calculations, University of Minnesota, 2004.
(3) Neuzil, S. Benchmarking Boundary Definitions of the CPCM Implicit Solvation
Model, Loras College, Dubuque, IA, 2019.
(4) Kelly, C. P.; Cramer, C. J.; Truhlar, D. G. SM6: A Density Functional Theory
Continuum Solvation Model for Calculating Aqueous Solvation Free Energies of
Neutrals, Ions, and Solute−Water Clusters. J. Chem. Theory Comput. 2005, 1 (6),
1133–1152. https://doi.org/10.1021/ct050164b.
(5) Marenich, A. V.; Cramer, C. J.; Truhlar, D. G. Universal Solvation Model Based on
Solute Electron Density and on a Continuum Model of the Solvent Defined by the
Bulk Dielectric Constant and Atomic Surface Tensions. J. Phys. Chem. B 2009, 113
(18), 6378–6396. https://doi.org/10.1021/jp810292n.
(7) Sander, R. Henry’s Law Constants. In NIST Chemistry WebBook, NIST Standard
Reference Database Number 69; Linstrom, P. J., Mallard, W. G., Eds.; National
Institute of Standards and Technology: Gaithersburg, MD, 20899.
(8) Sander, R. Compilation of Henry’s Law Constants (Version 4.0) for Water as
Solvent. Atmos. Chem. Phys. 2015, 15 (8), 4399–4981.
https://doi.org/10.5194/acp-15-4399-2015.
(9) Gupta, M.; da Silva, E. F.; Svendsen, H. F. Explicit Solvation Shell Model and
Continuum Solvation Models for Solvation Energy and pKa Determination of Amino
Acids. J. Chem. Theory Comput. 2013, 9 (11), 5021–5037.
https://doi.org/10.1021/ct400459y.
(10) Zhang, B. W.; Matubayasi, N.; Levy, R. M. Cavity Particle in Aqueous Solution with
a Hydrophobic Solute: Structure, Energetics, and Functionals. J. Phys. Chem. B 2020,
124 (25), 5220–5237. https://doi.org/10.1021/acs.jpcb.0c02721.
Appendixes