
Introduction to

Computational Chemistry

Cleopas Machingauta, PhD


Chem. 301
2018
Definition of Computational Chemistry
• Computational Chemistry: Uses mathematical approximations
and computer programs to obtain results relevant to chemical
problems.
• Molecular Mechanics: uses classical mechanics to
model molecular systems. The Born-Oppenheimer
approximation is assumed valid and the potential energy of all
systems is calculated as a function of the nuclear coordinates
using force fields.
• Quantum Mechanics: Focuses specifically on equations and
approximations derived from the postulates of quantum
mechanics. Solves the Schrödinger equation for molecular
systems.
• Ab Initio Quantum Chemistry: Uses methods that do not
include any empirical parameters or experimental data.
Computational Chemistry
Different levels of theory (Method)
• Molecular Mechanics (MM)
– MM2, MM3, AMBER, etc.

• Semi-empirical methods
– CNDO, AM1, ZINDO, etc

• Quantum Mechanics (QM)


– Hartree-Fock (HF), CI, CISD, MP, etc.

• ab initio methods
– Density functional theory (DFT)
• B3LYP, B3PW91, BLYP, etc
Basis Sets
• MM methods do not require you to choose a basis set.
• Semi-empirical (S.E.) methods need no basis set selection.
• QM and ab initio methods require basis set selection.

• Examples of Basis set


– STO
– 6-31G (can also have 6-31G*, 6-31G**)
– 6-311G
– cc-pVTZ
– aug-cc-pVTZ


CH 301 Lecture #1 Outline

• Introduction to Computational Chemistry

• Introduction to Molecular Modeling: Force field methods

1. Why force field methods?
2. Components of a force field
3. Types of force fields
Introduction to Computational Chemistry

• Computational chemistry seeks to predict the structure, properties and reactivity of matter

• A variety of computational methods are available, depending on the size of the system, the level of accuracy desired, and the computational cost

• Methods range from accurate and expensive highly correlated ab initio methods to hybrid methods to methods based solely on classical mechanics

• Computational chemistry is most useful when combined with experiment
Advantages of Computational Chemistry

• Resolving experimental controversies
Example: structure of CH2

• Optimizing an experimental program
Design of experiments or synthetic procedures, etc.
1 week of calculations may save months of experiments!

• Prediction of properties which cannot easily or safely be obtained from experiment
What do we wish to calculate?

• Molecular structures: governed by the Potential Energy Surface (PES)
What do we wish to calculate?

• Molecular properties:

Charge Distributions (Dipole moment)

Electron affinities/Ionization Potentials

Vibrational Spectra (IR/Raman)

Electronic Spectra (UV/Visible)

NMR Chemical Shifts


What do we wish to calculate?

• Thermochemistry and chemical reactivity: related to the PES
Force Field Methods: 1

• Problems addressed by these methods:
(1) Calculating energies for given structures
(2) Finding stable geometries of molecules

• Molecules are modeled as balls held together with springs

• All atoms are treated by classical mechanics, using Newton’s equations of motion (Molecular mechanics)

• Quantum effects of nuclear motions are neglected


Force Field Methods: 2

• What is the basis of FF methods?

• Molecules tend to be composed of units which are structurally similar in different molecules.

• Example: The C-H bond
– bond length: 1.06 – 1.10 Å
– stretch vibration: 2900 – 3300 cm⁻¹

• Heat of formation for CH3 – (CH2)n – CH3 molecules
– Almost straight line when plotted against n
Force Field Methods: 3

• Force field methods need to be parameterized, i.e., they are based on input from experimental data

• This differentiates these methods from ab initio methods, which have no input from experiment

• Parameterization means that these methods are typically only accurate for molecules similar to those included in the parameter set (i.e., benchmarks)
Common Force Field Methods

• MM2/MM3 (general purpose organics)


U. Burkert and N. L. Allinger, Molecular Mechanics, ACS Monograph 177, 198N. L. Allinger, Y.
H. Yuh, and J.–H. Lii, J. Am. Chem. Soc. 111 (1989) 8551.
• OPLS (general purpose organics/proteins)
W. L. Jorgensen and J. Tirado-Rives, J. Am. Chem. Soc. 110 (1988) 1657.
• AMBER (proteins and nucleic acids)
S. J. Weiner, P. A. Kollman, D. T. Nguyen, and D. A. Case, J. Comput. Chem. 7 (1986) 230.
• CHARMM (proteins, nucleic acids, sugars)
B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus, J.
Comput. Chem. 4 (1983) 187. L. Nilsson and M. Karplus, J. Comput. Chem. 7 (1986) 591. S. N.
Ha, A. Giammona, M. Field, and J. W. Brady, Carbohydr. Res. 180 (1988) 207.
• ECEPP/ECEPP2 (proteins)
G. Nemethy, M. S. Pottle, and H. A. Scheraga, J. Phys. Chem. 87 (1983) 1883.
M. J. Stipple, G. Nemethy, and H. A. Scheraga, J. Phys. Chem. 88 (1984) 6231.
The Force Field Energy
• Expressed as a sum of terms

– Ustr : Bond stretching energy (bonded interaction)
– Ubend : Bending energy (bonded interaction)
– Utor : Torsion energy (bonded interaction)
– Ucross : Cross terms
– Uvdw : van der Waals energy (non-bonded interaction)
– Uel : Electrostatic energy (non-bonded interaction)

$$U_{FF} = U_{str} + U_{bend} + U_{tor} + U_{vdw} + U_{el} + U_{cross}$$


Bond stretching energy: 1

• Ustr : The energy function for stretching a bond between two atom types A and B

• The equilibrium bond length is the point of minimum energy


The Bending Energy

• Ubend : The energy required for bending an angle formed by three atoms A-B-C

• Harmonic Approximation

$$U_{bend}(\theta^{ABC} - \theta_0^{ABC}) = k^{ABC}\,(\theta^{ABC} - \theta_0^{ABC})^2$$

• Improvement can be observed when higher order terms are included

• For most applications, simple harmonic approximation is quite adequate
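
The harmonic bending term can be sketched in a few lines of Python. This is only an illustration: the force constant `K_HOH` and reference angle `THETA0` below are hypothetical values chosen for the example, not parameters from any published force field.

```python
import math


def harmonic_bend_energy(theta_deg, theta0_deg, k_bend):
    """Harmonic bending energy U = k * (theta - theta0)^2.

    Angles are given in degrees and converted to radians, so k_bend
    is assumed to be in energy units per radian squared.
    """
    delta = math.radians(theta_deg - theta0_deg)
    return k_bend * delta ** 2


# Hypothetical parameters, for illustration only.
K_HOH = 50.0     # kcal mol^-1 rad^-2 (assumed)
THETA0 = 104.5   # degrees (assumed reference angle)

for theta in (94.5, 104.5, 114.5):
    energy = harmonic_bend_energy(theta, THETA0, K_HOH)
    print(f"theta = {theta:6.1f} deg  U_bend = {energy:.3f} kcal/mol")
```

The energy is zero at the reference angle and rises symmetrically on either side, which is exactly the limitation that higher order terms are meant to relax.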
The out-of-plane bending energy

• sp2-hybridized atoms (ABCD)


– there is a significant energy penalty associated with making the center pyramidal
– ABD, ABC, CBD angle distortion should reflect the energy cost associated with pyramidalization

$$U_{oop}(\chi^{B}) = k^{B}\,(\chi^{B})^2$$

where χ denotes the out-of-plane angle at the central atom B.
The torsional energy: 1

• Rotation around B-C bond for four atom sequence A-B-C-D
• Difference between bending and torsional energy:
– The energy function must be periodic about the torsional angle ω
– The cost of energy for distortion is often low
• Large deviations from the minimum can easily occur
• Fourier series expansion

$$U_{tors}(\omega) = \sum_{n=1} V_n \cos(n\omega)$$
The torsional energy: 2

$$U_{tors}(\omega) = \sum_{n=1} V_n \cos(n\omega)$$

• Usually only a few of the Vn terms are non-zero
– n=1 : periodic by 360 degrees
– n=2 : periodic by 180 degrees
– n=3 : periodic by 120 degrees

– Ethane : three minima, three maxima
• only n = 3, 6, 9, … can have non-zero Vn
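
As a rough illustration of the periodicity argument, the sketch below evaluates the Fourier series for a single n = 3 term and counts the minima over one full rotation. The barrier parameter in `ETHANE_LIKE` is a hypothetical value, and the plain cosine form follows the slide's expression rather than any particular force field's convention.

```python
import math


def torsion_energy(omega_deg, v_terms):
    """U_tors(omega) = sum over n of V_n * cos(n * omega).

    v_terms maps the periodicity n to its coefficient V_n.
    """
    omega = math.radians(omega_deg)
    return sum(v_n * math.cos(n * omega) for n, v_n in v_terms.items())


# Hypothetical three-fold term (kcal/mol), for illustration only.
ETHANE_LIKE = {3: 1.5}

angles = list(range(0, 360, 60))
profile = [torsion_energy(a, ETHANE_LIKE) for a in angles]
lowest = min(profile)
n_minima = sum(1 for e in profile if abs(e - lowest) < 1e-9)

for angle, energy in zip(angles, profile):
    print(f"omega = {angle:3d} deg  U_tors = {energy:+.3f}")
print("number of minima in 360 degrees:", n_minima)
```

With only the n = 3 term, the profile repeats every 120 degrees, giving three equivalent minima and three equivalent maxima.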
The van der Waals energy: 1

• Uvdw : energy describing the repulsion and attraction between non-bonded atoms
– Interaction energy is not related to electrostatic energy due to atomic charges
– Involves both repulsion and attraction
• Small distances, very repulsive (overlap of electron clouds)
• Intermediate distance, slight attraction
– motion of electrons creates induced dipole moments

[Figure: repulsion at short distance; attraction between induced dipoles (+− +−)]
The Electrostatic Energy

• Related to atomic charges and thus distribution of electrons


– positive and negative part of molecule
– long range force is Coulombic
• Two modeling approaches
– Point charge Model

– Bond Dipole Model


[Figure: Lennard-Jones and Coulomb potential curves]
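
The two non-bonded terms can be compared numerically. The sketch below assumes the common 12-6 Lennard-Jones form and a point charge Coulomb model; the slides do not state which functional forms or parameters were used, so `epsilon`, `sigma`, the charges, and the unit-conversion constant are all assumptions made for illustration.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy: 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Conversion factor giving kcal/mol for charges in units of e and
# distances in Angstrom (commonly quoted value, assumed here).
COULOMB_CONSTANT = 332.06


def coulomb(r, q_a, q_b, dielectric=1.0):
    """Point charge electrostatic energy between two atoms."""
    return COULOMB_CONSTANT * q_a * q_b / (dielectric * r)


# Hypothetical parameters: epsilon in kcal/mol, sigma and r in Angstrom.
EPSILON, SIGMA = 0.24, 3.4
for r in (3.0, 3.4, 3.8, 5.0, 8.0):
    u_lj = lennard_jones(r, EPSILON, SIGMA)
    u_el = coulomb(r, 0.1, -0.1)
    print(f"r = {r:4.1f}  U_vdw = {u_lj:+8.4f}  U_el = {u_el:+8.4f}")
```

The Lennard-Jones term is strongly repulsive below `sigma`, weakly attractive near its minimum, and dies off quickly, while the Coulomb term decays only as 1/r and therefore dominates at long range.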
Cross terms

• Bonds, angles and torsions are not isolated – motions are coupled
• Examples:
– Stretch/bend coupling

$$U_{cross}(R^{AB}, \theta^{ABC}) = k^{AB,ABC}\,(R^{AB} - R_0^{AB})(\theta^{ABC} - \theta_0^{ABC})$$


– Stretch/stretch coupling
– Bend/bend coupling
– Stretch/torsion coupling
– Bend/torsion coupling
– Bend/torsion/bend coupling …
What is the relative importance of these terms?

Term | Scale (kcal mol⁻¹)
Bond stretching | 100
Bending | 10
Torsional | 1
Hydrogen Bonding | 2
Electrostatic | 0.5
Van der Waals | 0.1
What is Geometry Optimization?

• Given a starting geometry for the molecule, what is (are) the geometry(ies) of lowest (minimum) energy?

• “Global” vs. “local” minima – the global minimum is the lowest energy on the entire PES

• Given a force field, the problem becomes one of minimizing the force field energy with respect to geometry. There are a number of algorithms available to accomplish this
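
To make the idea of minimizing the energy with respect to geometry concrete, here is a minimal sketch using steepest descent on a one-dimensional problem: the distance between two atoms interacting through a 12-6 Lennard-Jones potential. Steepest descent is just one simple choice among the available algorithms, and the potential parameters, step size, and tolerance are assumptions picked for this example.

```python
def lennard_jones(r, epsilon=0.24, sigma=3.4):
    """12-6 Lennard-Jones energy; epsilon and sigma are illustrative values."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def numerical_gradient(energy, x, h=1.0e-5):
    """Central-difference estimate of dU/dx."""
    return (energy(x + h) - energy(x - h)) / (2.0 * h)


def steepest_descent(energy, x0, step=0.5, tol=1.0e-6, max_iter=10000):
    """Step downhill along -dU/dx until the gradient magnitude is below tol."""
    x = x0
    for _ in range(max_iter):
        grad = numerical_gradient(energy, x)
        if abs(grad) < tol:
            break
        x -= step * grad
    return x


# Start from a stretched geometry and relax to the nearest minimum.
r_opt = steepest_descent(lennard_jones, x0=4.0)
print(f"optimized distance: {r_opt:.4f}")
print(f"analytic minimum:   {2 ** (1 / 6) * 3.4:.4f}")
```

Like any local optimizer, this only finds the minimum closest to the starting geometry; on a surface with several minima, locating the global one requires trying different starting points or a dedicated search method.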
• The Hartree–Fock (HF) method is a method of
approximation for the determination of the wave function
and the energy of a quantum many-body system in
a stationary state.

• Especially in the older literature, the Hartree–Fock method is also called the self-consistent field method (SCF).
The Hartree Fock-SCF approach
• In the most common implementation of this method, the Fock operator for one electron is given by:

$$f_i = -\frac{1}{2}\nabla_i^2 - \sum_{k}^{\text{all nuclei}} \frac{Z_k}{r_{ik}} + V_i^{HF}$$

– First term: electron kinetic energy
– Second term: electron-nuclear potential energy
– Third term: electron-electron potential energy, with $V_i^{HF} = 2J_i - K_i$

1. An effective nuclear charge is used, to account for shielding of the nucleus by other electrons
2. Each one-electron orbital is expressed as a linear combination of hydrogen atom-like orbitals, with the coefficients used as parameters
The Hartree Fock-SCF approach
• In the mean field approach, the effective potential for a given
electron depends on the functional form of the orbitals for all other
electrons!

• The primary limitation of the HF method is the one-electron nature of the Fock operator. Electron correlation is neglected.

• Another limitation is the scaling with basis set size. Because of the four-index integrals, the scaling is N⁴, where N is the number of basis functions

• Let’s take a look at the limitations of the HF-SCF theory


Limitations of the HF method

• The HF method will tend to give an energy which is too large, since
the electron-electron repulsion is overestimated

• In practice, we can improve the solution by increasing the size of our basis set (i.e., the set of atomic basis functions used to construct our molecular orbitals). However, we have already seen that the scaling is not favorable.

• Moreover, in increasing the basis set size we eventually reach a limit (the HF limit).

• HF methods recover about 99% of the total energy; however, the remaining 1% is comparable to the energy changes in chemical reactions and thus is very important.

• Let’s now see how well the HF model works for various types of
calculations.
HF method and bond dissociation energies

• The HF method performs poorly for simple bond dissociation reactions, due to the fact that electron correlation is very different for reactants and products.
HF method and isomerization reactions

• The HF method performs much better for reactions where the total
number and types of bonds are conserved (correlation is similar).
HF method and bond distances

• The HF method tends to underestimate bond distances. Why?


HF method and vibrational frequencies

• The HF method overestimates vibrational frequencies. Why?


Beyond the Hartree-Fock method

• Moving beyond Hartree-Fock theory, two approaches have generally been developed:

1. Introduce approximations to simplify the solution and improve the accuracy of the HF equations
→ Semi-empirical MO theory

2. Use HF theory as a stepping stone on the way to (hopefully) exact solution of the Schrödinger equation. This approach recognizes that HF gets ~ 99% of the problem correct – the missing ingredient is electron correlation.
→ Ab initio MO theory
Semiempirical Methods
• Semiempirical calculations are set up with the same
general structure as a HF calculation in that they have a
Hamiltonian and a wave function.
• Within this framework, certain pieces of information are
approximated or completely omitted.
• Usually, the core electrons are not included in the
calculation and only a minimal basis set is used. Also,
some of the two-electron integrals are omitted.
• In order to correct for the errors introduced by omitting
part of the calculation, the method is parameterized.
Semiempirical Methods
• The advantage of semiempirical calculations is that they
are much faster than ab initio calculations.
• The disadvantage of semiempirical calculations is that the
results can be erratic and fewer properties can be predicted
reliably.
• If the molecule being computed is similar to molecules in
the database used to parameterize the method, then the
results may be very good.
Semiempirical Methods
• Geometry and energy (usually the heat of formation) are
mostly calculated.
• Some researchers have extended this by including dipole
moments, heats of reaction, and ionization potentials in the
parameterization set.
• A few methods have been parameterized to reproduce a
specific property, such as electronic spectra or NMR
chemical shifts.
• Semiempirical calculations can be used to compute
properties other than those in the parameterization set.
Semiempirical Methods
• Semiempirical calculations have been very successful in
the description of organic chemistry, where there are only
a few elements used extensively and the molecules are of
moderate size.
• Some semiempirical methods have been devised
specifically for the description of inorganic chemistry as
well.
• The following are some of the most commonly used
semiempirical methods:
Semiempirical Methods
1. Hückel
2. Extended Hückel
3. PPP
4. CNDO
5. MINDO
6. ZINDO
7. AM1
8. PM3
9. Gaussian methods (G1, G2 and G3)
Comparison: Semi-empirical vs. ab initio

Semi-empirical MO theory | Ab initio MO theory
All core electrons are ignored | Core electrons are treated
Slater-type orbitals are used | Gaussian-type orbitals are used
3 and 4 center integrals are completely neglected | 3 and 4 center integrals are calculated explicitly
1 and 2 center integrals are parameterized | 1 and 2 center integrals are calculated explicitly
Scaling with basis set size N is usually no worse than N² | Scaling with basis set size N goes as N⁴
Capable of treating relatively large systems (> 100 atoms) | Limited to relatively small systems (< 100 atoms)
Basis Set
• A basis set is a set of functions used to describe the shape
of the orbitals in an atom.
• Molecular orbitals and entire wave functions are created
by taking linear combinations of basis functions and
angular functions.
• Most semiempirical methods use a predefined basis set.
• When ab initio or density functional theory calculations
are done, a basis set must be specified.
• The type of calculation performed and basis set chosen are
the two biggest factors in determining the accuracy of
results.
Basis Set
• The choice of basis set also has a large effect on the
amount of CPU time required to perform a calculation.
• In general, the amount of CPU time for Hartree-Fock
calculations scales as N⁴.
• This means that making the calculation twice as large will
make the calculation take 16 times as long to run.
• Making the calculation twice as large can occur by
– switching to a molecule with twice as many electrons
or
– by switching to a basis set with twice as many
functions.
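
The arithmetic behind that factor of 16 can be checked with a tiny sketch. The function name `relative_cost` is made up for this example, and the power-law model is the idealized formal scaling, not a measured timing.

```python
def relative_cost(size_factor, exponent=4):
    """Relative run time when the problem size grows by size_factor,
    assuming cost scales as N**exponent."""
    return size_factor ** exponent


# Doubling the number of basis functions under different formal scalings.
for label, exponent in (("N^2", 2), ("N^3", 3), ("N^4", 4)):
    print(f"{label} scaling: doubling N costs {relative_cost(2, exponent)}x")
```

Under N⁴ scaling, tripling the size of the calculation would already cost `relative_cost(3)`, i.e. 81 times as much.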
Basis Set

• Approximating a STO with several GTOs.
• The exact solution to the Schrödinger equation for the hydrogen atom is a Slater type orbital, or STO, of the form exp(−ζr).
Basis Set

• However, the integrals over GTOs can be computed analytically, which is so much faster than the numeric integrals over STO functions that any given accuracy can be obtained most quickly using GTO functions.
• As such, STO basis sets are sometimes used for high-
accuracy work, but most calculations are now done with
GTO basis sets.
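
To see how well a few Gaussians mimic a Slater function, the sketch below compares a normalized 1s STO (ζ = 1) with a three-Gaussian contraction. The exponents and coefficients are the commonly tabulated STO-3G values for a ζ = 1 1s function as I recall them; treat them as assumptions and check them against a basis set library before relying on them.

```python
import math

# Assumed STO-3G parameters for a 1s function with zeta = 1.0.
STO3G_EXPONENTS = (0.109818, 0.405771, 2.22766)
STO3G_COEFFS = (0.444635, 0.535328, 0.154329)


def slater_1s(r, zeta=1.0):
    """Normalized 1s Slater-type orbital."""
    return math.sqrt(zeta ** 3 / math.pi) * math.exp(-zeta * r)


def gaussian_1s(r, alpha):
    """Normalized 1s Gaussian primitive."""
    return (2.0 * alpha / math.pi) ** 0.75 * math.exp(-alpha * r * r)


def sto3g_1s(r):
    """Contraction of three Gaussian primitives approximating the STO."""
    return sum(c * gaussian_1s(r, a)
               for c, a in zip(STO3G_COEFFS, STO3G_EXPONENTS))


for r in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(f"r = {r:3.1f}  STO = {slater_1s(r):.4f}  STO-3G = {sto3g_1s(r):.4f}")
```

The fit is close at intermediate distances but the Gaussian contraction is too low at the nucleus, where it cannot reproduce the cusp of the Slater function; this is one reason more GTOs than STOs are needed for comparable accuracy.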
Basis Sets in ab initio MO theory

• What types of basis functions are used?


• It might appear obvious that we would continue to use Slater-type orbitals (STOs), with a radial dependence given by:

$$e^{-r/a_0}$$

• However, the integrals of STOs are difficult to evaluate (costly in terms of computer time). All that is required for analytic solutions to the four center integral equations is that we make the small change:

$$e^{-r} \rightarrow e^{-r^2}$$
• Thus, typically the AO’s are expanded in terms of Gaussian
functions. These Gaussian-type orbitals (GTOs) are not as
accurate; thus, more basis functions must be used. However,
this is offset by the greater efficiency of calculation
Pople-style basis sets
• Named for Prof. John Pople who won the Nobel Prize in Chemistry
for his work in quantum chemistry (1998).
• Notation: 6-31G
– "6": use 6 primitives contracted to a single contracted-Gaussian to describe inner (core) electrons (1s in C)
– "31": use 2 functions to describe valence orbitals (2s, 2p in C). One is a contracted-Gaussian composed of 3 primitives, the second is a single primitive.

• 6-311G: Use 3 functions to describe valence orbitals...

• 6-31G*: Add functions of angular momentum type 1 greater than occupied in bonding atoms (for N2 we’d add a d)

• 6-31G(d): Same as 6-31G* for 2nd and 3rd row atoms
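
The notation can be turned into a simple count of basis functions. The sketch below handles only plain split-valence names such as 6-31G or 6-311G (no polarization or diffuse suffixes) and defaults to an atom like carbon, with one core orbital (1s) and four valence orbitals (2s, 2px, 2py, 2pz). The helper name `pople_counts` is hypothetical.

```python
def pople_counts(name, n_core_orbitals=1, n_valence_orbitals=4):
    """Return (contracted functions, primitives) for a name like '6-31G'.

    Defaults describe a first-row atom such as carbon:
    one core orbital (1s) and four valence orbitals (2s, 2p x 3).
    """
    core_digit, valence_digits = name.rstrip("G").split("-")
    core_primitives = int(core_digit)
    valence_split = [int(d) for d in valence_digits]

    # One contracted function per core orbital, one per digit per valence orbital.
    contracted = n_core_orbitals + n_valence_orbitals * len(valence_split)
    primitives = (n_core_orbitals * core_primitives
                  + n_valence_orbitals * sum(valence_split))
    return contracted, primitives


for basis in ("6-31G", "6-311G"):
    functions, primitives = pople_counts(basis)
    print(f"{basis}: {functions} contracted functions from {primitives} primitives")
```

For carbon this gives 9 contracted functions built from 22 primitives in 6-31G, and 13 functions from 26 primitives in 6-311G.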


Correlation-Consistent Basis Sets
• Designed such that they have the unique property of forming a
systematically convergent set.
• Calculations with a series of correlation consistent (cc) basis sets can
lead to accurate estimates of the Complete Basis Set (CBS) limit.
• Notation: cc-pVnZ
– correlation consistent polarized valence n-zeta
• n = D, T, Q, 5,... (double, triple, quadruple, quintuple, ...)
– double zeta-use 2 Gaussians to describe valence orbitals; triple zeta-use 3
Gaussians...
– aug-cc-pVnZ: add an extra diffuse function of each angular momentum
type
• Relation between Pople and cc basis sets
– cc-pVDZ ≈ 6-31G(d,p)
– cc-pVTZ ≈ 6-311G(2df,2pd)
Electron Correlation
• Electron Correlation: Difference between energy calculated with exact
wave-function and energy from using Hartree-Fock wavefunction.

$$E_{corr} = E_{exact} - E_{HF}$$

• Accounts for the neglect of instantaneous electron-electron interactions of
Hartree-Fock method.
• In general, we get correlation energy by adding additional Slater
determinants to our expansion of Ψ.

$$\Psi_{el} = d_0 \Psi_{HF} + \sum_{i=1} d_i \Psi_i$$

• Hartree-Fock wavefunction is often used as our starting point.


• Additional Slater determinants are often called “excited.”
– Mental picture of orbitals and electron configurations must be abandoned.
• Different correlation methods differ in how they choose which Ψi to
include and in how they calculate the coefficients, di.
Configuration Interaction
• Write Ψ as a linear combination of Slater Determinants and calculate the
expansion coefficients such that the energy is minimized.

$$\Psi_{el} = d_0 \Psi_{HF} + \sum_{i=1} d_i \Psi_i$$

• Makes use of the linear variational principle: no matter what wave
function is used, the energy is always equal to or greater than the true
energy.
• If we include all excited Ψi we will have a full-CI, and an exact solution
for the given basis set we are using.
• Full-CI calculations are generally not computationally feasible, so we
must truncate the number of Ψi in some way.
• CISD: Configuration interaction with single- and double-excitations.
– Include all determinants of S- and D- type.
• MRCI: Multi-reference configuration interaction
• CI methods can be very accurate, but require long (and therefore
expensive) expansions.
– hundreds of thousands, millions, or more
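
A quick count shows why full-CI becomes infeasible. The sketch below uses the standard combinatorial count of determinants for a fixed number of alpha and beta electrons distributed over a set of spatial orbitals, ignoring any reduction from spatial or spin symmetry. The 10-electron, 24-orbital example is hypothetical.

```python
from math import comb


def full_ci_determinants(n_orbitals, n_alpha, n_beta):
    """Number of Slater determinants in a full-CI expansion.

    Alpha and beta electrons are placed independently in the
    n_orbitals spatial orbitals; symmetry reductions are ignored.
    """
    return comb(n_orbitals, n_alpha) * comb(n_orbitals, n_beta)


# Hypothetical closed-shell system: 10 electrons in 24 spatial orbitals.
print(full_ci_determinants(24, 5, 5))

# The count explodes as the orbital space grows.
for n_orbitals in (10, 20, 40):
    print(n_orbitals, full_ci_determinants(n_orbitals, 5, 5))
```

Even this small example gives more than a billion determinants, which is why truncated expansions such as CISD are used in practice.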
Møller-Plesset Perturbation Theory
• Perturbation methods, like Møller-Plesset (MP)
perturbation theory, assume that the problem we’d like to
solve (correlated Ψ and E) differs only slightly from a
problem we’ve already solved (HF Ψ and E).
• The energy is calculated to various orders of
approximation.
– Second order MP2; Third order MP3; Fourth order MP4...
– Computational cost increases strongly with each successive order.
– At infinite order the energy should be equal to the exact solution
of the Schrödinger equation (for the given basis set). However, there is no
guarantee the series is actually convergent.
– In general only MP2 is recommended
• MP2 ~ including all single and double excitations
Coupled Cluster (CC) Theory
• An exponential operator is used in constructing the
expansion of determinants.
• Leads to accurate and compact wave function expansions
yielding accurate electronic energies.
• Common Variants:
– CCSD: singles and doubles CC
– CCSD(T): CCSD with approximate treatment of triple excitations.
This method, when used with large basis sets, can generally
provide highly accurate results. With this method, it is often
possible to get thermochemistry within chemical accuracy, 1
kcal/mol (4.184 kJ/mol)
Density Functional Theory
• The methods we’ve been discussing can be grouped
together under the heading “Wavefunction methods.”
– They all calculate energies/properties by calculating/improving
upon the wavefunction.
• Density Functional Theory (DFT) instead solves for the
electron density.
– Generally computational cost is similar to the cost of HF
calculations.
– Most DFT methods involve some empirical parameterization.
– Generally lacks the systematics that characterize wavefunction
methods.
– Often the best choice when dealing with very large molecules
(proteins, large organic molecules...)
• DFT tends to be classified either as an ab initio method or in a class
by itself.
• The advantage of using electron density is that the integrals for
Coulomb repulsion need be done only over the electron density,
which is a three-dimensional function, thus scaling as N³.
• Furthermore, at least some electron correlation can be included in
the calculation.
• This results in faster calculations than HF calculations (which scale
as N⁴) and computations that are a bit more accurate as well.
• The better DFT functionals give results with an accuracy similar to
that of an MP2 calculation.
Performance of DFT methods: H2O

n | Method | rOH (Å) | θHOH (°)
3 | HF | 0.940 | 106.1
3 | CIS | 1.078 | 114.0
3 | CISD | 0.954 | 104.1
3 | MP2 | 0.959 | 103.5
3 | MP3 | 0.955 | 104.0
3 | MP4 | 0.957 | 103.7
3 | CCD | 0.956 | 103.9
3 | LSDA | 0.969 | 104.4
3 | BLYP | 0.972 | 103.8
… | Exp. | 0.958 | 104.5
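
The deviations from experiment are easier to read when computed explicitly. The sketch below uses only the values from the table above; the dictionary and function names are made up for this example.

```python
# (r_OH in Angstrom, HOH angle in degrees), copied from the table above.
WATER_GEOMETRY = {
    "HF": (0.940, 106.1),
    "CIS": (1.078, 114.0),
    "CISD": (0.954, 104.1),
    "MP2": (0.959, 103.5),
    "MP3": (0.955, 104.0),
    "MP4": (0.957, 103.7),
    "CCD": (0.956, 103.9),
    "LSDA": (0.969, 104.4),
    "BLYP": (0.972, 103.8),
}
EXPERIMENT = (0.958, 104.5)


def deviations(results, reference):
    """Signed errors (calculated minus experimental) for each method."""
    return {
        method: (round(r_oh - reference[0], 3), round(angle - reference[1], 1))
        for method, (r_oh, angle) in results.items()
    }


errors = deviations(WATER_GEOMETRY, EXPERIMENT)
for method, (d_r, d_angle) in errors.items():
    print(f"{method:5s}  d(rOH) = {d_r:+.3f} A   d(HOH) = {d_angle:+.1f} deg")

worst_bond = max(errors, key=lambda m: abs(errors[m][0]))
print("largest bond-length error:", worst_bond)
```

The negative bond-length error for HF matches the earlier observation that HF tends to underestimate bond distances, while the two DFT functionals in the table overestimate it slightly.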
Performance of DFT methods: 1
Performance of DFT methods: 2
Performance of DFT methods: 3
Job cost comparison: DFT vs. MO theory
• Wall clock times for geometry optimization of methanol:
Advantages of DFT

• For the “average” problem, DFT will generally achieve a given level of accuracy at the lowest cost

• Unlike HF, DFT includes (some level of) electron correlation

• DFT is generally more robust than HF theory in dealing with open shell systems

• Scaling behavior of DFT is no worse than N³: DFT thus offers good cost-scaling with system size

• Convergence with basis set size is typically more rapid with DFT
(good results obtained with double- or triple-zeta basis sets)
Disadvantages of DFT

• DFT optimizes the electron density, MO theory the wavefunction. To determine a property, in DFT we need to know its connection to the density, while in MO theory we need the operator

• DFT is problematic for cases which aren’t well described by a single determinantal wavefunction

• With DFT, it is often unclear how to do a “better” calculation; i.e., how to systematically improve the calculation

• DFT methods do not properly describe long-range forces, because of the focus on the local density
