Introduction To Computational Chemistry: Cleopas Machingauta, PHD

Introduction to
Computational Chemistry
Cleopas Machingauta, PhD

Chem. 301
2018
Definition of Computational Chemistry
• Computational Chemistry: uses mathematical approximations
and computer programs to obtain results relevant to chemical
problems.
• Molecular Mechanics: uses classical mechanics to
model molecular systems. The Born-Oppenheimer
approximation is assumed valid and the potential energy of all
systems is calculated as a function of the nuclear coordinates
using force fields.
• Quantum Mechanics: focuses specifically on equations and
approximations derived from the postulates of quantum
mechanics; solves the Schrödinger equation for molecular
systems.
• Ab Initio Quantum Chemistry: Uses methods that do not
include any empirical parameters or experimental data.
Computational Chemistry
Different levels of theory (Method)
• Molecular Mechanics (MM)
– MM2, MM3, AMBER, etc.
• Semi-empirical methods
– CNDO, AM1, ZINDO, etc.
• Quantum Mechanics (QM)
– Hartree-Fock (HF), CI, CISD, MP, etc.
• ab initio methods
– Density functional theory (DFT)
• B3LYP, B3PW91, BLYP, etc
Basis Sets
• MM methods do not require you to choose a basis set.
• Semi-empirical (SE) methods need no basis set selection.
• QM and ab initio methods require basis set selection.
• Examples of basis sets
– STO
– 6-31G (also 6-31G*, 6-31G**)
– 6-311G
– cc-pVTZ
– aug-cc-pVTZ
CH 301 Lecture #1 Outline
• Introduction to Computational Chemistry
• Introduction to Molecular Modeling: Force field methods
1. Why force field methods?
2. Components of a force field
3. Types of force fields
Introduction to Computational Chemistry
• Computational chemistry seeks to predict the structure, properties and reactivity of matter
• A variety of computational methods are available, depending on the size of the system, the level of accuracy desired, and the computational cost
• Methods range from accurate and expensive highly correlated ab initio methods to hybrid methods to methods based solely on classical mechanics
• Computational chemistry is most useful when combined with experiment
Advantages of Computational Chemistry
• Resolving experimental controversies
– Example: structure of CH2
• Optimizing an experimental program
– Design of experiments or synthetic procedures, etc.
– 1 week of calculations may save months of experiments!
• Prediction of properties which cannot easily or safely be obtained from experiment
What do we wish to calculate?
• Molecular structures: governed by the Potential Energy Surface (PES)
• Molecular properties:
– Charge distributions (dipole moment)
– Electron affinities/ionization potentials
– Vibrational spectra (IR/Raman)
– Electronic spectra (UV/visible)
– NMR chemical shifts
• Thermochemistry and chemical reactivity: related to the PES
Force Field Methods: 1
• Problems addressed by these methods:
(1) Calculating energies for given structures
(2) Finding stable geometries of molecules
• Molecules are modeled as balls held together with springs
• All atoms are treated by classical mechanics, using Newton’s equations of motion (molecular mechanics)
• Quantum effects of nuclear motions are neglected
• What is the basis of force field (FF) methods?
• Molecules tend to be composed of units which are structurally similar in different molecules.
• Example: the C-H bond
– bond length: 1.06 – 1.10 Å
– stretch vibration: 2900 – 3300 cm⁻¹
• Heat of formation for CH3–(CH2)n–CH3 molecules
– Almost a straight line when plotted against n
• Force field methods need to be parameterized, i.e., they are based on input from experimental data
• This differentiates these methods from ab initio methods, which have no input from experiment
• Parameterization means that these methods are typically only accurate for molecules similar to those included in the parameter set (i.e., benchmarks)
Common Force Field Methods
• MM2/MM3 (general purpose organics)

U. Burkert and N. L. Allinger, Molecular Mechanics, ACS Monograph 177, 198N. L. Allinger, Y.
H. Yuh, and J.–H. Lii, J. Am. Chem. Soc. 111 (1989) 8551.
• OPLS (general purpose organics/proteins)
W. L. Jorgensen and J. Tirado-Rives, J. Am. Chem. Soc. 110 (1988) 1657.
• AMBER (proteins and nucleic acids)
S. J. Weiner, P. A. Kollman, D. T. Nguyen, and D. A. Case, J. Comput. Chem. 7 (1986) 230.
• CHARMM (proteins, nucleic acids, sugars)
B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus, J.
Comput. Chem. 4 (1983) 187. L. Nilsson and M. Karplus, J. Comput. Chem. 7 (1986) 591. S. N.
Ha, A. Giammona, M. Field, and J. W. Brady, Carbohydr. Res. 180 (1988) 207.
• ECEPP/ECEPP2 (proteins)
G. Nemethy, M. S. Pottle, and H. A. Scheraga, J. Phys. Chem. 87 (1983) 1883.
M. J. Sippl, G. Nemethy, and H. A. Scheraga, J. Phys. Chem. 88 (1984) 6231.
The Force Field Energy
• Expressed as a sum of terms
– $U_{str}$ : bond stretching energy (bonded interaction)
– $U_{bend}$ : bending energy (bonded interaction)
– $U_{tors}$ : torsion energy (bonded interaction)
– $U_{cross}$ : cross terms
– $U_{vdw}$ : van der Waals energy (non-bonded interaction)
– $U_{el}$ : electrostatic energy (non-bonded interaction)

$$U_{FF} = U_{str} + U_{bend} + U_{tors} + U_{vdw} + U_{el} + U_{cross}$$

Bond stretching energy: 1
• $U_{str}$ : the energy function for stretching a bond between two atom types A and B
• The equilibrium bond length is the point of minimum energy
The Bending Energy
• $U_{bend}$ : the energy required for bending an angle formed by three atoms A-B-C
• Harmonic approximation:
$$U_{bend}(\theta^{ABC} - \theta_0^{ABC}) = k^{ABC}\,(\theta^{ABC} - \theta_0^{ABC})^2$$
• Improvement can be observed when higher-order terms are included
• For most applications, the simple harmonic approximation is quite adequate
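The harmonic bending term can be evaluated directly. The following is a minimal sketch, assuming the form k(θ − θ0)² given above with the angle difference in radians; the force constant and equilibrium angle are hypothetical values chosen only for illustration, not parameters of any published force field.

```python
import math


def harmonic_bend_energy(theta_deg, theta0_deg, k_bend):
    """Return U_bend = k * (theta - theta0)^2, with the angle difference in radians."""
    delta = math.radians(theta_deg - theta0_deg)
    return k_bend * delta ** 2


# Hypothetical parameters for illustration only.
K_BEND = 50.0    # kcal mol^-1 rad^-2 (assumed)
THETA0 = 109.5   # equilibrium angle in degrees (assumed)

for theta in (99.5, 109.5, 119.5):
    energy = harmonic_bend_energy(theta, THETA0, K_BEND)
    print(f"theta = {theta:6.1f} deg  U_bend = {energy:.4f} kcal/mol")
```

The energy is zero at the equilibrium angle and rises symmetrically on either side, which is exactly what the harmonic approximation assumes.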
The out-of-plane bending energy
• sp2-hybridized atoms (ABCD)
– there is a significant energy penalty associated with making the center pyramidal
– ABD, ABC, CBD angle distortion should reflect the energy cost associated with pyramidalization
$$U_{oop}(\chi^{B}) = k^{B}\,(\chi^{B})^2$$
The torsional energy: 1
• Rotation around the B-C bond for the four-atom sequence A-B-C-D
• Differences between bending and torsional energy:
– The energy function must be periodic in the torsional angle $\omega$
– The energy cost of distortion is often low
• Large deviations from the minimum can easily occur
• Fourier series expansion:
$$U_{tors}(\omega) = \sum_{n=1} V_n \cos(n\omega)$$
The torsional energy: 2
$$U_{tors}(\omega) = \sum_{n=1} V_n \cos(n\omega)$$
• Usually only a few of the $V_n$ terms are non-zero
– n = 1 : periodic by 360 degrees
– Ethane: three minima and three maxima, so only the n = 3, 6, 9, … terms can have non-zero $V_n$
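The Fourier form of the torsional energy is easy to tabulate. This is a minimal sketch, assuming the expansion U_tors(ω) = Σ V_n cos(nω) used above; the single V_3 value is a hypothetical number chosen to mimic the threefold symmetry described for ethane.

```python
import math


def torsion_energy(omega_deg, v_terms):
    """Return U_tors = sum over n of V_n * cos(n * omega); v_terms maps n to V_n."""
    omega = math.radians(omega_deg)
    return sum(v_n * math.cos(n * omega) for n, v_n in v_terms.items())


# Hypothetical threefold barrier term (kcal/mol), for illustration only.
ETHANE_LIKE = {3: 1.5}

for omega in range(0, 360, 60):
    print(f"omega = {omega:3d} deg  U_tors = {torsion_energy(omega, ETHANE_LIKE):+.3f}")
```

With a positive V_3 in this sign convention, the maxima fall at 0, 120 and 240 degrees and the minima at 60, 180 and 300 degrees, giving the three minima and three maxima per full rotation.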
The van der Waals energy: 1
• $U_{vdw}$ : energy describing the repulsion and attraction between non-bonded atoms
– The interaction energy is not related to the electrostatic energy due to atomic charges
– Involves both repulsion and attraction
• Small distances: very repulsive, due to overlap of electron clouds
• Intermediate distances: slight attraction
– the motion of electrons creates an induced dipole moment
The Electrostatic Energy
• Related to atomic charges and thus distribution of electrons

– positive and negative part of molecule
– long range force is Coulombic
• Two modeling approaches
– Point charge Model
– Bond Dipole Model

[Figure: Lennard-Jones and Coulomb potential energy curves]
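The two non-bonded terms can be compared numerically. The sketch below assumes the standard 12-6 Lennard-Jones form for the van der Waals energy and a point charge Coulomb model for the electrostatic energy; neither formula is written out in the slides, and the conversion constant and the ε, σ and charge values are illustrative assumptions.

```python
# Approximate Coulomb conversion factor (assumed): kcal mol^-1 Angstrom e^-2.
COULOMB_CONSTANT = 332.06


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy: 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, q_i, q_j, dielectric=1.0):
    """Point charge Coulomb energy for charges in units of e and r in Angstrom."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r)


# Hypothetical parameters for illustration only.
EPSILON, SIGMA = 0.2, 3.4   # kcal/mol, Angstrom
for r in (3.0, 3.4, 3.8, 5.0, 8.0):
    print(f"r = {r:4.1f}  U_vdw = {lennard_jones(r, EPSILON, SIGMA):+8.4f}"
          f"  U_el = {coulomb(r, 0.3, -0.3):+8.4f}")
```

The Lennard-Jones term is steeply repulsive below σ, weakly attractive near its minimum, and decays quickly, while the Coulomb term falls off only as 1/r and is therefore the long-range contribution.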
Cross terms
• Bonds, angles and torsions are not isolated: their motions are coupled
• Examples:
– Stretch/bend coupling
$$U_{cross}(R^{AB}, \theta^{ABC}) = k^{AB,ABC}\,(R^{AB} - R_0^{AB})(\theta^{ABC} - \theta_0^{ABC})$$
– Stretch/stretch coupling
– Bend/bend coupling
– Stretch/torsion coupling
– Bend/torsion coupling
– Bend/torsion/bend coupling …
What is the relative importance of these terms?
Term                Scale (kcal mol⁻¹)
Bond stretching     100
Bending             10
Torsional           1
Hydrogen bonding    2
Electrostatic       0.5
Van der Waals       0.1
What is Geometry Optimization?
• Given a starting geometry for the molecule, what is (are) the geometry(ies) of lowest (minimum) energy?
• “Global” vs. “local” minima: the global minimum is the lowest energy on the entire PES
• Given a force field, the problem becomes one of minimizing the force field energy with respect to geometry. There are a number of algorithms available to accomplish this
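One of the simplest minimization algorithms is steepest descent. The sketch below is an illustration under stated assumptions, not the method of any particular program: it minimizes a single harmonic bond-stretch energy in one dimension, and the force constant, equilibrium length, step size and starting point are all hypothetical.

```python
def bond_energy(r, k=300.0, r0=1.09):
    """Hypothetical harmonic bond energy k*(r - r0)^2 (kcal/mol, Angstrom)."""
    return k * (r - r0) ** 2


def bond_gradient(r, k=300.0, r0=1.09):
    """Analytic derivative of bond_energy with respect to r."""
    return 2.0 * k * (r - r0)


def steepest_descent(r_start, step=1e-3, tol=1e-8, max_iter=10_000):
    """Move downhill along the negative gradient until the gradient is below tol."""
    r = r_start
    n_steps = 0
    for n_steps in range(max_iter):
        grad = bond_gradient(r)
        if abs(grad) < tol:
            break
        r -= step * grad
    return r, n_steps


r_opt, n_steps = steepest_descent(1.50)
print(f"optimized bond length = {r_opt:.6f} after {n_steps} steps, "
      f"energy = {bond_energy(r_opt):.3e}")
```

A harmonic surface has a single minimum, so the starting point does not matter here. On a real PES with several minima, a downhill method like this one finds the local minimum nearest the starting geometry, which is not necessarily the global minimum.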
The Hartree-Fock Method
• The Hartree–Fock (HF) method is an approximation method for determining the wave function and the energy of a quantum many-body system in a stationary state.
• Especially in the older literature, the Hartree–Fock method is also called the self-consistent field (SCF) method.
The Hartree Fock-SCF approach
• In the most common implementation of this method, the Fock operator for one electron is given by:
$$f_i = -\frac{1}{2}\nabla_i^2 - \sum_{k}^{\text{nuclei}} \frac{Z_k}{r_{ik}} + V_i^{HF}\{j\}$$
– first term: electron kinetic energy
– second term: electron-nuclear potential energy
– third term: electron-electron potential energy, with
$$V_i^{HF} = 2J_i - K_i$$
1. An effective nuclear charge is used, to account for shielding of the nucleus by other electrons
2. Each one-electron orbital is expressed as a linear combination of hydrogen atom-like orbitals, with the coefficients used as parameters
The Hartree Fock-SCF approach
• In the mean field approach, the effective potential for a given electron depends on the functional form of the orbitals for all other electrons!
• The primary limitation of the HF method is the one-electron nature of the Fock operator. Electron correlation is neglected.
• Another limitation is the scaling with basis set size. Because of the four-index integrals, the scaling is $N^4$, where N is the number of basis functions
• Let’s take a look at the limitations of the HF-SCF theory

Limitations of the HF method
• The HF method will tend to give an energy which is too high, since the electron-electron repulsion is overestimated
• In practice, we can improve the solution by increasing the size of our basis set (i.e., the set of atomic basis functions used to construct our molecular orbitals). However, we have already seen that the scaling is not favorable.
• Moreover, in increasing the basis set size we eventually reach a limit (the HF limit).
• HF methods recover about 99% of the total energy; however, the remaining 1% is comparable to the energy changes in chemical reactions and thus is very important.
• Let’s now see how well the HF model works for various types of calculations.
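A quick calculation shows why a 1% error matters. This is a rough sketch only: the total energy of water is taken as approximately -76.4 hartree (an assumed round value), the 1% figure is the one quoted above, and 1 hartree is converted as 627.509 kcal/mol.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509


def missing_energy_kcal(total_energy_hartree, fraction_missing):
    """Energy not recovered, in kcal/mol, for a given fraction of the total energy."""
    return abs(total_energy_hartree) * fraction_missing * HARTREE_TO_KCAL_PER_MOL


# Approximate total electronic energy of water (assumed, for illustration).
E_WATER_HARTREE = -76.4

missing = missing_energy_kcal(E_WATER_HARTREE, 0.01)
print(f"1% of the total energy of water is about {missing:.0f} kcal/mol")
```

The result is several hundred kcal/mol, far larger than the energy of a typical covalent bond (on the order of 100 kcal/mol), so useful chemistry depends on how well errors cancel between reactants and products.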
HF method and bond dissociation energies
• The HF method performs poorly for simple bond dissociation reactions, due to the fact that electron correlation is very different for reactants and products.
HF method and isomerization reactions
• The HF method performs much better for reactions where the total
number and types of bonds are conserved (correlation is similar).
HF method and bond distances
• The HF method tends to underestimate bond distances. Why?

HF method and vibrational frequencies
• The HF method overestimates vibrational frequencies. Why?

Beyond the Hartree-Fock method
• Moving beyond Hartree-Fock theory, two approaches have generally been developed:
1. Introduce approximations to simplify the solution and improve the accuracy of the HF equations
→ Semi-empirical MO theory
2. Use HF theory as a stepping stone on the way to a (hopefully) exact solution of the Schrödinger equation. This approach recognizes that HF gets ~99% of the problem correct; the missing ingredient is electron correlation.
→ Ab initio MO theory
Semiempirical Methods
• Semiempirical calculations are set up with the same
general structure as a HF calculation in that they have a
Hamiltonian and a wave function.
• Within this framework, certain pieces of information are
approximated or completely omitted.
• Usually, the core electrons are not included in the
calculation and only a minimal basis set is used. Also,
some of the two-electron integrals are omitted.
• In order to correct for the errors introduced by omitting
part of the calculation, the method is parameterized.
• The advantage of semiempirical calculations is that they
are much faster than ab initio calculations.
• The disadvantage of semiempirical calculations is that the
results can be erratic and fewer properties can be predicted
reliably.
• If the molecule being computed is similar to molecules in
the database used to parameterize the method, then the
results may be very good.
• Geometry and energy (usually the heat of formation) are
mostly calculated.
• Some researchers have extended this by including dipole
moments, heats of reaction, and ionization potentials in the
parameterization set.
• A few methods have been parameterized to reproduce a
specific property, such as electronic spectra or NMR
chemical shifts.
• Semiempirical calculations can be used to compute
properties other than those in the parameterization set.
• Semiempirical calculations have been very successful in
the description of organic chemistry, where there are only
a few elements used extensively and the molecules are of
moderate size.
• Some semiempirical methods have been devised
specifically for the description of inorganic chemistry as
well.
• The following are some of the most commonly used
semiempirical methods:
1. Hückel
2. Extended Hückel
3. PPP
4. CNDO
5. MINDO
6. ZINDO
7. AM1
8. PM3
9. Gaussian methods (G1, G2 and G3)
Comparison: Semi-empirical vs. ab initio
Semi-empirical MO theory                               | Ab initio MO theory
All core electrons are ignored                         | Core electrons are treated
Slater-type orbitals are used                          | Gaussian-type orbitals are used
3 and 4 center integrals are completely neglected      | 3 and 4 center integrals are calculated explicitly
1 and 2 center integrals are parameterized             | 1 and 2 center integrals are calculated explicitly
Scaling with basis set size N is usually no worse than N^2 | Scaling with basis set size N goes as N^4
Capable of treating relatively large systems (> 100 atoms) | Limited to relatively small systems (< 100 atoms)
Basis Set
• A basis set is a set of functions used to describe the shape
of the orbitals in an atom.
• Molecular orbitals and entire wave functions are created
by taking linear combinations of basis functions and
angular functions.
• Most semiempirical methods use a predefined basis set.
• When ab initio or density functional theory calculations
are done, a basis set must be specified.
• The type of calculation performed and basis set chosen are
the two biggest factors in determining the accuracy of
results.
Basis Set
• The choice of basis set also has a large effect on the
amount of CPU time required to perform a calculation.
• In general, the amount of CPU time for Hartree-Fock calculations scales as $N^4$.
• This means that making the calculation twice as large will make the calculation take $2^4 = 16$ times as long to run.
• Making the calculation twice as large can occur by
– switching to a molecule with twice as many electrons, or
– switching to a basis set with twice as many functions.
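The effect of the scaling exponent can be tabulated directly. This short sketch assumes idealized power-law scaling with no prefactor or overhead; the exponents 2, 3 and 4 are the ones quoted in these slides for semi-empirical, DFT and HF calculations respectively.

```python
def cost_ratio(size_factor, scaling_power):
    """Relative run time when the problem size grows by size_factor under N^p scaling."""
    return size_factor ** scaling_power


for label, power in (("N^2", 2), ("N^3", 3), ("N^4", 4)):
    print(f"{label} scaling: doubling N costs {cost_ratio(2, power)}x, "
          f"tripling N costs {cost_ratio(3, power)}x")
```

Real timings depend on the program, the hardware and integral screening, so these ratios are upper-level guides rather than predictions.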
Basis Set
• Approximating an STO with several GTOs.
• The exact solution to the Schrödinger equation for the hydrogen atom is a Slater-type orbital (STO), of the form $e^{-\zeta r}$.
Basis Set
• However, the integrals over GTOs can be computed analytically, which is so much faster than the numerical integrals over STO functions that any given accuracy can be obtained most quickly using GTO functions.
• As such, STO basis sets are sometimes used for high-accuracy work, but most calculations are now done with GTO basis sets.
Basis Sets in ab initio MO theory
• What types of basis functions are used?
• It might appear obvious that we would continue to use Slater-type orbitals (STOs), with a radial dependence given by:
$$e^{-r/a_0}$$
• However, the integrals of STOs are difficult to evaluate (costly in terms of computer time). All that is required for analytic solutions to the four-center integral equations is that we make the small change:
$$e^{-r} \rightarrow e^{-r^2}$$
• Thus, typically the AOs are expanded in terms of Gaussian functions. These Gaussian-type orbitals (GTOs) are not as accurate; thus, more basis functions must be used. However, this is offset by the greater efficiency of calculation
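How well a few Gaussians can mimic a Slater function can be checked numerically. The sketch below compares a normalized 1s STO (ζ = 1.0) with a three-Gaussian contraction; the exponents and coefficients are the widely tabulated STO-3G fit for ζ = 1.0, quoted here as an assumption since they do not appear in the slides.

```python
import math

# Widely tabulated STO-3G fit to a 1s Slater function with zeta = 1.0 (assumed values).
STO3G_EXPONENTS = (0.109818, 0.405771, 2.22766)
STO3G_COEFFS = (0.444635, 0.535328, 0.154329)


def slater_1s(r, zeta=1.0):
    """Normalized 1s Slater-type orbital: sqrt(zeta^3/pi) * exp(-zeta*r)."""
    return math.sqrt(zeta ** 3 / math.pi) * math.exp(-zeta * r)


def gaussian_1s(r, alpha):
    """Normalized 1s Gaussian primitive: (2*alpha/pi)^(3/4) * exp(-alpha*r^2)."""
    return (2.0 * alpha / math.pi) ** 0.75 * math.exp(-alpha * r * r)


def contracted_1s(r):
    """Fixed linear combination (contraction) of the three primitives."""
    return sum(d * gaussian_1s(r, a) for d, a in zip(STO3G_COEFFS, STO3G_EXPONENTS))


for r in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(f"r = {r:3.1f}  STO = {slater_1s(r):.4f}  3 GTOs = {contracted_1s(r):.4f}")
```

The contraction tracks the Slater function closely at intermediate distances but cannot reproduce the cusp at the nucleus, where every Gaussian has zero slope. This is one reason more Gaussian functions are needed for the same accuracy.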
Pople-style basis sets
• Named for Prof. John Pople, who won the Nobel Prize in Chemistry (1998) for his work in quantum chemistry.
• Notation: 6-31G
– 6: use 6 primitives contracted to a single contracted Gaussian to describe inner (core) electrons (1s in C)
– 31: use 2 functions to describe valence orbitals (2s, 2p in C). One is a contracted Gaussian composed of 3 primitives, the second is a single primitive.
• 6-311G: use 3 functions to describe valence orbitals...
• 6-31G*: add functions of angular momentum type one greater than occupied in bonding atoms (for N2 we’d add a d)
• 6-31G(d): same as 6-31G* for 2nd and 3rd row atoms
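The notation translates directly into a count of basis functions. The following sketch covers only H, He and Li through Ne, and assumes that the polarization set on heavy atoms consists of six Cartesian d functions; the helper names are hypothetical.

```python
HYDROGEN_LIKE = {"H", "He"}
LI_TO_NE = {"Li", "Be", "B", "C", "N", "O", "F", "Ne"}


def functions_per_atom(symbol, polarization=False):
    """Contracted basis functions for one atom in 6-31G, or 6-31G(d) if polarization."""
    if symbol in HYDROGEN_LIKE:
        return 2                      # valence 1s split into two functions
    if symbol in LI_TO_NE:
        count = 1 + 2 * (1 + 3)       # 1s core + two sets of (2s, 2px, 2py, 2pz)
        if polarization:
            count += 6                # six Cartesian d functions (assumed convention)
        return count
    raise ValueError(f"element {symbol!r} is not covered by this sketch")


def count_basis_functions(atoms, polarization=False):
    """Total number of basis functions for a list of element symbols."""
    return sum(functions_per_atom(a, polarization) for a in atoms)


methanol = ["C", "O", "H", "H", "H", "H"]
print("6-31G   :", count_basis_functions(methanol))
print("6-31G(d):", count_basis_functions(methanol, polarization=True))
```

Adding a single polarization set to the two heavy atoms of methanol raises the count from 26 to 38 functions, which already has a noticeable effect on cost under steep scaling.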

Correlation-Consistent Basis Sets
• Designed such that they have the unique property of forming a
systematically convergent set.
• Calculations with a series of correlation consistent (cc) basis sets can
lead to accurate estimates of the Complete Basis Set (CBS) limit.
• Notation: cc-pVnZ
– correlation consistent polarized valence n-zeta
• n = D, T, Q, 5,... (double, triple, quadruple, quintuple, ...)
– double zeta: use 2 Gaussians to describe valence orbitals; triple zeta: use 3 Gaussians...
– aug-cc-pVnZ: add an extra diffuse function of each angular momentum type
• Relation between Pople and cc basis sets
– cc-pVDZ ≈ 6-31G(d,p)
– cc-pVTZ ≈ 6-311G(2df,2pd)
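Because the cc-pVnZ series converges systematically, energies from two consecutive basis sets can be extrapolated toward the CBS limit. The sketch below uses one common two-point scheme that assumes the correlation energy behaves as E(n) = E_CBS + A/n³, with n = 2, 3, 4, … for D, T, Q, …; this functional form is an assumption not stated in the slides, and the energies are synthetic numbers, not results of any calculation.

```python
def cbs_two_point(e_large, n_large, e_small, n_small):
    """Two-point extrapolation assuming E(n) = E_CBS + A / n^3."""
    x3, y3 = n_large ** 3, n_small ** 3
    return (x3 * e_large - y3 * e_small) / (x3 - y3)


# Synthetic correlation energies (hartree) that follow the assumed 1/n^3 law exactly.
E_CBS_TRUE, A = -0.300, 0.050
e_tz = E_CBS_TRUE + A / 3 ** 3   # n = 3, triple zeta
e_qz = E_CBS_TRUE + A / 4 ** 3   # n = 4, quadruple zeta

estimate = cbs_two_point(e_qz, 4, e_tz, 3)
print(f"TZ = {e_tz:.6f}  QZ = {e_qz:.6f}  extrapolated CBS = {estimate:.6f}")
```

On real data the 1/n³ behavior holds only approximately, so the extrapolated value is an estimate whose quality should be checked against a larger basis set where possible.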
Electron Correlation
• Electron correlation: the difference between the energy calculated with the exact wave function and the energy from using the Hartree-Fock wave function.
$$E_{corr} = E_{exact} - E_{HF}$$
• Accounts for the neglect of instantaneous electron-electron interactions in the Hartree-Fock method.
• In general, we get the correlation energy by adding additional Slater determinants to our expansion of $\Psi$.
$$\Psi_{el} = d_0 \Psi_{HF} + \sum_{i=1} d_i \Psi_i$$
• The Hartree-Fock wave function is often used as our starting point.
• Additional Slater determinants are often called “excited.”
– The mental picture of orbitals and electron configurations must be abandoned.
• Different correlation methods differ in how they choose which $\Psi_i$ to include and in how they calculate the coefficients, $d_i$.
Configuration Interaction
• Write $\Psi$ as a linear combination of Slater determinants and calculate the expansion coefficients such that the energy is minimized.
$$\Psi_{el} = d_0 \Psi_{HF} + \sum_{i=1} d_i \Psi_i$$
• Makes use of the linear variational principle: no matter what wave function is used, the energy is always equal to or greater than the true energy.
• If we include all excited $\Psi_i$ we will have a full CI, and an exact solution for the given basis set we are using.
• Full-CI calculations are generally not computationally feasible, so we must truncate the number of $\Psi_i$ in some way.
• CISD: configuration interaction with single and double excitations.
– Include all determinants of S- and D-type.
• MRCI: multi-reference configuration interaction
• CI methods can be very accurate, but require long (and therefore expensive) expansions.
– hundreds of thousands, millions, or more determinants
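The smallest possible CI expansion has two determinants, and its energy can be found in closed form. The sketch below is a toy model under stated assumptions: the HF determinant and one excited determinant have hypothetical energies and a hypothetical coupling element, and the lowest eigenvalue of the 2×2 Hamiltonian matrix is taken as the CI energy.

```python
import math


def two_determinant_ci(e_hf, e_excited, coupling):
    """Lowest eigenvalue and normalized coefficients (d0, d1) of a 2x2 CI matrix."""
    mean = 0.5 * (e_hf + e_excited)
    half_gap = 0.5 * (e_excited - e_hf)
    e_ground = mean - math.sqrt(half_gap ** 2 + coupling ** 2)
    if coupling == 0.0:
        return e_ground, 1.0, 0.0     # no mixing (assumes e_hf <= e_excited)
    ratio = (e_ground - e_hf) / coupling   # d1/d0 from the first secular equation
    norm = math.sqrt(1.0 + ratio ** 2)
    return e_ground, 1.0 / norm, ratio / norm


# Hypothetical energies in hartree, for illustration only.
E_HF, E_EXCITED, COUPLING = -1.000, 0.000, 0.100

e_ci, d0, d1 = two_determinant_ci(E_HF, E_EXCITED, COUPLING)
print(f"E_HF = {E_HF:.6f}  E_CI = {e_ci:.6f}  E_corr = {e_ci - E_HF:.6f}")
print(f"d0 = {d0:.4f}  d1 = {d1:.4f}")
```

Mixing in the excited determinant lowers the energy below the HF value, as the variational principle requires, while the HF determinant keeps by far the largest coefficient.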
Møller-Plesset Perturbation Theory
• Perturbation methods, like Møller-Plesset (MP) perturbation theory, assume that the problem we’d like to solve (correlated $\Psi$ and E) differs only slightly from a problem we’ve already solved (HF $\Psi$ and E).
• The energy is calculated to various orders of approximation.
– Second order: MP2; third order: MP3; fourth order: MP4...
– Computational cost increases strongly with each successive order.
– At infinite order the energy should be equal to the exact solution of the Schrödinger equation (for the given basis set). However, there is no guarantee the series is actually convergent.
– In general only MP2 is recommended
• MP2 ~ including all single and double excitations
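The idea behind a second-order correction can be shown on a two-level model. This is a sketch of generic second-order perturbation theory, not of MP2 itself (which partitions the Hamiltonian using the Fock operator); the energies and coupling are hypothetical values.

```python
import math


def exact_two_level(e0, e1, coupling):
    """Exact lowest eigenvalue of the 2x2 matrix [[e0, coupling], [coupling, e1]]."""
    mean = 0.5 * (e0 + e1)
    half_gap = 0.5 * (e1 - e0)
    return mean - math.sqrt(half_gap ** 2 + coupling ** 2)


def second_order_energy(e0, e1, coupling):
    """Zeroth-order energy plus the second-order correction -coupling^2 / (e1 - e0)."""
    return e0 - coupling ** 2 / (e1 - e0)


# Hypothetical values in hartree, for illustration only.
E0, E1, COUPLING = -1.000, 0.000, 0.100

e_pt2 = second_order_energy(E0, E1, COUPLING)
e_exact = exact_two_level(E0, E1, COUPLING)
print(f"second order = {e_pt2:.6f}  exact = {e_exact:.6f}  "
      f"difference = {e_pt2 - e_exact:.2e}")
```

For a weak coupling the second-order estimate is very close to the exact value, but it lies slightly below it: perturbation theory is not variational, so its energies are not guaranteed to be upper bounds.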
Coupled Cluster (CC) Theory
• An exponential operator is used in constructing the
expansion of determinants.
• Leads to accurate and compact wave function expansions
yielding accurate electronic energies.
• Common Variants:
– CCSD: singles and doubles CC
– CCSD(T): CCSD with approximate treatment of triple excitations.
This method, when used with large basis sets, can generally
provide highly accurate results. With this method, it is often
possible to get thermochemistry within chemical accuracy, 1
kcal/mol (4.184 kJ/mol)
Density Functional Theory
• The methods we’ve been discussing can be grouped
together under the heading “Wavefunction methods.”
– They all calculate energies/properties by calculating/improving
upon the wavefunction.
• Density Functional Theory (DFT) instead solves for the
electron density.
– Generally computational cost is similar to the cost of HF
calculations.
– Most DFT methods involve some empirical parameterization.
– Generally lacks the systematics that characterize wavefunction
methods.
– Often the best choice when dealing with very large molecules
(proteins, large organic molecules...)
• DFT tends to be classified either as an ab initio method or in a class
by itself.
• The advantage of using electron density is that the integrals for
Coulomb repulsion need be done only over the electron density,
which is a three-dimensional function, thus scaling as $N^3$.
• Furthermore, at least some electron correlation can be included in
the calculation.
• This results in faster calculations than HF calculations (which scale
as $N^4$) and computations that are a bit more accurate as well.
• The better DFT functionals give results with an accuracy similar to
that of an MP2 calculation.
Performance of DFT methods: H2O
n    Method   r_OH (Å)   θ_HOH (°)
3    HF       0.940      106.1
3    CIS      1.078      114.0
3    CISD     0.954      104.1
3    MP2      0.959      103.5
3    MP3      0.955      104.0
3    MP4      0.957      103.7
3    CCD      0.956      103.9
3    LSDA     0.969      104.4
3    BLYP     0.972      103.8
…    Exp.     0.958      104.5
Performance of DFT methods: 1
Job cost comparison: DFT vs. MO theory
• Wall clock times for geometry optimization of methanol:
Advantages of DFT
• For the “average” problem, DFT will generally achieve a given level of accuracy at the lowest cost
• Unlike HF, DFT includes (some level of) electron correlation
• DFT is generally more robust than HF theory in dealing with open shell systems
• Scaling behavior of DFT is no worse than $N^3$: DFT thus offers good cost-scaling with system size
• Convergence with basis set size is typically more rapid with DFT (good results obtained with double- or triple-zeta basis sets)
Disadvantages of DFT
• DFT optimizes the electron density, MO theory the wavefunction. To determine a property, in DFT we need to know its connection to the density, while in MO theory we need the operator
• DFT is problematic for cases which aren’t well described by a single determinantal wavefunction
• With DFT, it is often unclear how to do a “better” calculation; i.e., how to systematically improve the calculation
• DFT methods do not properly describe long-range forces, because of the focus on the local density
