Summary FMCP: 1 Statistical Mechanics

Summary FMCP
victor.parra
February 2021
1 Statistical mechanics
1.1 Define the partition functions.
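The molecular partition function $q = \sum_j e^{-\beta\varepsilon_j}$ measures how the molecules are distributed over the available states, i.e. how many states are thermally accessible. The canonical partition function $Q = \sum_i e^{-\beta E_i}$ normalizes the probability $p_i = e^{-\beta E_i}/Q$ of finding the entire system in state i with energy $E_i$.
• For distinguishable independent molecules: $Q = q^N$
• For indistinguishable independent molecules: $Q = q^N / N!$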
1.2 Relation between Isothermal-Isobaric and Canonical ensembles.
• $\Delta = \sum_{i=1}^{n} e^{-\beta(E_i + P V_i)} = \sum_{i=1}^{n} e^{-\beta P V_i}\, Q$
The NPT ensemble has an additional constraint on the total volume of the ensemble.
1.3 Define a microstate

A microstate of a system of molecules (particles) is a complete specification of the positions and momenta of all molecules.
1.4 Ensemble and bridge equation

• Microcanonical ensemble: S = k ln(W (N, V, E))
• Canonical ensemble: A = −kT ln(Q(N, V, T ))
• Isothermal-Isobaric: G = −kT ln(∆(N, T, P ))
The logarithm of the partition function of any ensemble, multiplied by k (microcanonical) or −kT (canonical and isothermal-isobaric), gives the corresponding macroscopic thermodynamic potential.
1.5 Electronic Partition Function
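$$q^E = \sum_{\text{states } j} e^{-\beta\varepsilon_j^E} = \sum_{\text{levels } j} g_j\, e^{-\beta\varepsilon_j^E}$$
Electronic energy separations from the (degenerate) ground state are usually very large, so in most cases, at temperatures that are not very high, $q^E = g_0$.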
1.6 Translational Partition Function

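• For a 1-D container of length a:
$$\varepsilon_n^T = \frac{n^2 h^2}{8 m a^2}$$
• For a 3-D cubic container of side a:
$$\varepsilon^T = \frac{h^2}{8m}\left(\frac{n_x^2}{a^2} + \frac{n_y^2}{a^2} + \frac{n_z^2}{a^2}\right)$$
$$q_i^T = a\left(\frac{2\pi m k T}{h^2}\right)^{1/2}, \quad i = x, y, z$$
$$q^T = q_x^T q_y^T q_z^T = \frac{V}{\Lambda^3}$$
$$\Lambda = \left(\frac{h^2}{2\pi m k T}\right)^{1/2}$$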
1.7 Rotational Partition Function

• Rotational energy:
$$\varepsilon_J^R = hcBJ(J+1)$$
• B is a rotational constant that can be computed for linear and non-linear molecules.
• $B = \frac{h}{8\pi^2 c I}$, where the moment of inertia can be computed as $I = \sum_i m_i r_i^2$
• Rotational temperature (the partition functions below hold for $T \gg \Theta_R$):
$$k\Theta_R = hcB$$
• Rotational partition function for linear homonuclear and heteronuclear diatomic molecules:
$$q^R = \frac{T}{\sigma\Theta_R}$$
• σ is the number of indistinguishable orientations of the molecule.
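For non-linear molecules we have three moments of inertia and three rotational constants A, B and C:
$$q^R = \frac{\pi^{1/2}}{\sigma}\,\frac{T^{3/2}}{(\Theta_A\Theta_B\Theta_C)^{1/2}}$$
where
$$\Theta_X = \frac{h^2}{8\pi^2 I_X k} = \frac{hcX}{k}, \quad X = A, B, C$$
and $I_X$ is the moment of inertia associated with the rotational constant X.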
1.8 Vibrational Partition Function

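• Vibrational energy for a diatomic linear molecule treated as a harmonic oscillator:
$$\varepsilon_n^V = \left(n + \frac{1}{2}\right)h\nu$$
• Vibrational partition function, with $k\Theta_V = h\nu$:
$$q^V = \frac{e^{-\Theta_V/2T}}{1 - e^{-\Theta_V/T}} \approx \frac{T}{\Theta_V} \quad (T \gg \Theta_V)$$
For polyatomic molecules, each molecule can be treated as a system of f harmonic oscillators using the vibrational normal modes:
$$\varepsilon^V = \sum_i^f \left(n_i + \frac{1}{2}\right)h\nu_i$$
$$q^V = \prod_i^f q_i^V = \prod_i^f \frac{e^{-\Theta_i^V/2T}}{1 - e^{-\Theta_i^V/T}}$$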
1.9 Mean Energies
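When the molecular partition function can be factorized into contributions from each mode, the mean energy of each mode M is:
$$\langle\varepsilon^M\rangle = -\frac{1}{q^M}\left(\frac{\partial q^M}{\partial\beta}\right)_V$$
with M = T, R, V, or E modes.
E.g. translational partition function, mean energy in 1-D:
$$q^T = \frac{X}{\Lambda}$$
$$\langle\varepsilon^T\rangle = -\frac{\Lambda}{X}\left(\frac{\partial (X/\Lambda)}{\partial\beta}\right)_V = -\beta^{1/2}\frac{d}{d\beta}\left(\frac{1}{\beta^{1/2}}\right) = \frac{1}{2}kT$$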
1.10 Internal Energy, Entropy and Canonical Partition Function
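• Internal Energy
$$U(T) = U(0) + E(T)$$
$$U(T) = U(0) + \sum_i p_i E_i$$
$$U(T) = U(0) + \sum_i \frac{e^{-\beta E_i}}{Q}E_i = U(0) - \frac{1}{Q}\left(\frac{\partial Q}{\partial\beta}\right)_V$$
$$U(T) - U(0) = -\left(\frac{\partial\ln Q}{\partial\beta}\right)_V$$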
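• Entropy
$$S(T) = k\ln W = k\left(\ln N! - \sum_i \ln n_i!\right)$$
Stirling approximation: $\ln N! \approx N\ln N - N$. Using $\sum_i n_i = N$:
$$S(T) = k\left(N\ln N - \sum_i n_i\ln n_i\right)$$
$$S(T) = k\sum_i\left(n_i\ln N - n_i\ln n_i\right) = -k\sum_i n_i\ln\frac{n_i}{N}$$
With $p_i = n_i/N$:
$$S(T) = -kN\sum_i p_i\ln p_i$$
$$S(T) = -kN\sum_i p_i\ln\frac{e^{-\beta E_i}}{q}$$
$$S(T) = -kN\sum_i p_i\ln e^{-\beta E_i} + kN\sum_i p_i\ln q$$
$$S(T) = kN\sum_i p_i\beta E_i + kN\ln q$$
With $k\beta = 1/T$, $N\sum_i p_i E_i = E(T)$ and $Q = q^N$:
$$S(T) = \frac{E(T)}{T} + k\ln Q$$
$$S(T) = \frac{U(T) - U(0)}{T} + k\ln Q$$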
1.11 Mono-atomic gas:

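For indistinguishable independent molecules:
$$S = \frac{U(T) - U(0)}{T} + k\ln Q = \frac{U(T) - U(0)}{T} + kN\ln q - k\ln N!$$
Using:
• Stirling approximation
• $n = N/N_A$
• $k = R/N_A$
$$S(T) = \frac{U(T) - U(0)}{T} + nR\ln q - nR\ln N + nR$$
The only mode of motion for a gas of atoms is translational:
$$q = \frac{V}{\Lambda^3} \quad\text{with}\quad \Lambda = \frac{h}{(2\pi m k T)^{1/2}}$$
$$U(T) = U(0) + \frac{3}{2}nRT$$
$$S(T) = nR\ln\left(\frac{e^{5/2}\,V}{n N_A \Lambda^3}\right)$$
$$G = -kT\ln Q + kTV\left(\frac{\partial\ln Q}{\partial V}\right)_T$$
$$G - G(0) = -kT\ln Q + nRT$$
$$G - G(0) = -nRT\ln\frac{q_m}{N_A}$$
where $q_m = q/n$ is the molar partition function.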
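As a numerical illustration of the entropy expression for a mono-atomic gas, the sketch below evaluates it for one mole of argon. The choice of argon, its atomic mass (39.948 u), the state point (298.15 K, 1 bar) and the function names are assumptions made for this example only; the ideal-gas law is used to obtain V/N.

```python
import math

# Physical constants in SI units
H = 6.62607015e-34        # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
N_A = 6.02214076e23       # Avogadro constant, 1/mol
R = K_B * N_A             # gas constant, J/(mol K)
AMU = 1.66053906660e-27   # atomic mass unit, kg


def thermal_wavelength(mass_kg, temperature):
    """Lambda = h / sqrt(2 pi m k T)."""
    return H / math.sqrt(2.0 * math.pi * mass_kg * K_B * temperature)


def molar_entropy_monatomic(mass_amu, temperature, pressure):
    """S/n = R ln(e^{5/2} V / (n N_A Lambda^3)) for an ideal mono-atomic gas."""
    lam = thermal_wavelength(mass_amu * AMU, temperature)
    volume_per_particle = K_B * temperature / pressure  # V/N from pV = NkT
    return R * (2.5 + math.log(volume_per_particle / lam**3))


# Hypothetical example: argon at 298.15 K and 1 bar
s_argon = molar_entropy_monatomic(39.948, 298.15, 1.0e5)
print(f"S_m(Ar) = {s_argon:.1f} J/(mol K)")
```

The result should come out close to 155 J/(mol K), which is the order of magnitude expected for a noble gas at ambient conditions.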
2 Molecular Dynamics
2.1 Limitations of classical MD
1. Use of classical laws of motion: the average distance a between particles must be much larger than the thermal de Broglie wavelength: $\sqrt{\frac{2\pi\hbar^2}{M k_B T}} \ll a$
2. Realism of forces: a simulation is realistic if the inter-atomic forces mimic the behavior that the real system would experience when arranged in the same configuration.
3. The simulation time must be much larger than any relaxation time of the quantities we are interested in, and the MD box size must be much larger than any correlation length of the spatial correlation functions of interest.
2.2 Statistical ensembles

Physical quantities A are represented by averages over configurations distributed according to a certain statistical ensemble. For a sufficiently long simulation time, we expect the time average to equal the ensemble average (ergodic hypothesis):
$$\langle A\rangle_t = \langle A\rangle_{\text{ensemble}}$$
2.3 Short-range vs Long range potentials

The potential of a two-body interaction decays as the relative distance r becomes larger. For a system of dimensionality d, we call it:
• Short-range potential if this decay is proportional to $r^{-n}$ as $r \to \infty$ with n > d − 1. For example, the Lennard-Jones potential: $\varphi(r) \propto r^{-6}$
• Long-range potential if this decay is proportional to $r^{-n}$ as $r \to \infty$ with n < d − 1. For example, the Coulomb potential: $\varphi(r) \propto r^{-1}$
2.4 Periodic Boundary Conditions

1. Periodic boundary conditions are necessary to obtain good simulations: no matter how big the system is, its number of atoms N is negligible compared with the number of atoms contained in a macroscopic piece of matter.
2. We can use periodic boundary conditions rather than reflecting rigid walls. When a particle leaves the unit cell through one side of the box, an image particle with exactly the same velocity re-enters through the opposite side.
3. In computer simulations, one of the periodic replicas is the original simulation box, and the others are copies called images.
4. During the simulation, only the properties of the original simulation box need to be recorded and propagated.
2.5 The minimum image criterion

This simulation condition guarantees that each particle interacts only with the closest image of each of the remaining particles in the system.
To implement this condition, the following are required:

1. Periodic boundary conditions
2. Short-range potential
3. Box size > 2R_c along each Cartesian direction, where R_c is the cut-off radius of the potential
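The minimum image criterion can be sketched in a few lines for a cubic box. The function names and the example coordinates are hypothetical, and a cubic box of side `box_length` is assumed; non-cubic boxes need a more general treatment.

```python
import math


def minimum_image(delta, box_length):
    """Fold one Cartesian component of a separation into [-L/2, L/2]."""
    return delta - box_length * round(delta / box_length)


def minimum_image_distance(r_i, r_j, box_length):
    """Distance between particle i and the closest periodic image of j."""
    d2 = 0.0
    for a, b in zip(r_i, r_j):
        d = minimum_image(a - b, box_length)
        d2 += d * d
    return math.sqrt(d2)


# Two particles close to opposite faces of a box of side 10:
# the direct distance is 9, the minimum-image distance is 1.
print(minimum_image_distance((0.5, 0.0, 0.0), (9.5, 0.0, 0.0), 10.0))
```

With a cut-off R_c smaller than half the box side, any pair found within R_c by this function is guaranteed to be the only image of that pair inside the cut-off sphere.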
2.6 Approximations for MD simulations

1. Classical laws of motion
2. Time discretization
3. Periodic Boundary Condition
4. Potential Truncation (Cut-off)
5. Finite representation of numbers
6. Finite time averages (ergodic hypothesis)
7. Pairwise potentials (LJ)
2.7 Files to Run an MD simulation

1. Topology file: How is the system to be run? (e.g. force-field parameters)
2. Coordinate file: Where is the system to be run? (coordinates of the atoms forming the system)
3. Simulation parameter file: What do you want to run? (e.g. time step, cutoffs)
2.8 Initial Structure
A unit cell is a 3D repeating unit that fills the whole lattice, without gaps or overlaps, when translated by the lattice vectors. The conventional cubic unit cells contain:
1. Simple cubic = 1 particle
2. Body-centered cubic (bcc) = 2 particles
3. Face-centered cubic (fcc) = 4 particles
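As an illustration of how such a unit cell is used to build an initial structure, the sketch below replicates the 4-particle fcc cell along each Cartesian direction. The function name, the number of cells and the lattice constant are assumptions chosen for this example.

```python
def fcc_lattice(n_cells, lattice_constant):
    """Return the positions of 4 * n_cells**3 particles on an fcc lattice."""
    # Fractional coordinates of the 4 particles in the conventional fcc cell
    basis = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]
    positions = []
    for ix in range(n_cells):
        for iy in range(n_cells):
            for iz in range(n_cells):
                for bx, by, bz in basis:
                    positions.append((
                        (ix + bx) * lattice_constant,
                        (iy + by) * lattice_constant,
                        (iz + bz) * lattice_constant,
                    ))
    return positions


# Hypothetical example: 3 x 3 x 3 cells with unit lattice constant
positions = fcc_lattice(3, 1.0)
print(len(positions))  # 4 particles per cell times 27 cells
```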
2.9 Time integration algorithm errors

• Truncation errors: related to the accuracy of the finite difference method. Methods are usually based on a Taylor expansion.
• Round-off errors: related to a particular implementation of the algorithm, for instance the finite number of digits used in computer arithmetic.
2.10 Finite difference algorithms

1. Euler: not stable enough; it does not conserve energy.
$$r(t+\Delta t) = r(t) + v(t)\Delta t + a(t)\frac{\Delta t^2}{2}$$
$$v(t+\Delta t) = v(t) + a(t)\Delta t$$
2. Verlet algorithm: does not use the velocity to compute the new position. It combines one step forward and one step backwards:
$$r(t+\Delta t) = 2r(t) - r(t-\Delta t) + a(t)\Delta t^2$$
$$v(t) = \frac{r(t+\Delta t) - r(t-\Delta t)}{2\Delta t}$$
• It requires larger storage capacity (two previous points are needed to compute the new position).
• The new position is obtained by adding a small term $O(\Delta t^2)$ to the difference of two larger terms $O(\Delta t^0)$, leading to a loss of precision. It is not very accurate for long time steps, but it conserves energy for short time steps.
• The kinetic energy (from the velocities) and the potential energy (from the positions) cannot be calculated simultaneously: the velocity at a given time is only available once the position one step later is known.
— Available in GROMACS —
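3. Leap-frog algorithm: a half-step method. It is a variant of the Verlet algorithm.
$$r(t+\Delta t) = r(t) + v\left(t+\frac{\Delta t}{2}\right)\Delta t$$
$$v\left(t+\frac{\Delta t}{2}\right) = v\left(t-\frac{\Delta t}{2}\right) + \frac{f(t)}{m}\Delta t$$
• Leap-frog explicitly includes the velocity in the evolution scheme.
• It does not require differences of large numbers.
• The total energy E = T + U cannot be calculated directly in this scheme, because positions and velocities are known at different times.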
4. Velocity Verlet algorithm
$$r(t+\Delta t) = r(t) + v(t)\Delta t + \frac{f(t)}{2m}\Delta t^2$$
$$v(t+\Delta t) = v(t) + \frac{a(t+\Delta t) + a(t)}{2}\Delta t$$
• Best estimate of velocities among the Verlet-like algorithms.
• More function evaluations with respect to leap-frog.
• Positions and velocities are known at the same time, so the total energy E = T + U can be calculated directly in this scheme.

5. Velocity Verlet (Ave K) algorithm:
• Same kinetic energy as leap-frog.
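To make the difference in energy conservation concrete, the sketch below integrates a 1-D harmonic oscillator with the Euler scheme and with velocity Verlet, as written in the update formulas of this section. The oscillator, the reduced units (m = k = 1), the time step and the function names are assumptions made for illustration; this is not how a production MD code is organized.

```python
def euler(x, v, force, mass, dt, n_steps):
    """Euler scheme: both updates use the acceleration at time t only."""
    for _ in range(n_steps):
        a = force(x) / mass
        x, v = x + v * dt + 0.5 * a * dt * dt, v + a * dt
    return x, v


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Velocity Verlet: the velocity update averages a(t) and a(t + dt)."""
    a = force(x) / mass
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt
        a_new = force(x) / mass
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
    return x, v


def total_energy(x, v, mass, k_spring):
    """E = T + U for a harmonic oscillator."""
    return 0.5 * mass * v * v + 0.5 * k_spring * x * x


# Hypothetical test system in reduced units
mass, k_spring, dt, n_steps = 1.0, 1.0, 0.01, 10000


def force(x):
    return -k_spring * x


e0 = total_energy(1.0, 0.0, mass, k_spring)
x_e, v_e = euler(1.0, 0.0, force, mass, dt, n_steps)
x_vv, v_vv = velocity_verlet(1.0, 0.0, force, mass, dt, n_steps)

euler_drift = abs(total_energy(x_e, v_e, mass, k_spring) - e0) / e0
vv_drift = abs(total_energy(x_vv, v_vv, mass, k_spring) - e0) / e0
print(f"relative energy error: Euler {euler_drift:.2e}, velocity Verlet {vv_drift:.2e}")
```

In this setup the Euler energy grows steadily with the number of steps, while the velocity Verlet energy only oscillates within a small bound around its initial value.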
2.11 Total Energy Comparison

• The kinetic energy is more accurate with half-step-averaged methods, meaning that it changes less with larger time steps.
• The root mean squared deviation (RMSD) of the total energy (its oscillation) is higher with the half-step-averaged kinetic energy than with the full-step kinetic energy.
2.12 Temperature
The equipartition theorem links the concept of macroscopic temperature with the microscopic dynamics:
$$E_k = \sum_i^N \frac{1}{2}m_i v_i^2 = \frac{3}{2}N k_B T$$
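The equipartition relation can be inverted to obtain an instantaneous temperature from the particle velocities. The sketch below does this in SI units; the function names and the two-particle test system are hypothetical, and 3N degrees of freedom are assumed (no constraints and no removal of the centre-of-mass motion).

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def kinetic_energy(masses, velocities):
    """E_k = sum_i (1/2) m_i v_i^2 with 3-D velocity vectors."""
    return sum(
        0.5 * m * (vx * vx + vy * vy + vz * vz)
        for m, (vx, vy, vz) in zip(masses, velocities)
    )


def instantaneous_temperature(masses, velocities):
    """T = 2 E_k / (3 N k_B), assuming 3N degrees of freedom."""
    n_particles = len(masses)
    return 2.0 * kinetic_energy(masses, velocities) / (3.0 * n_particles * K_B)


# Hypothetical check: give each particle exactly (3/2) k_B T of kinetic energy at 300 K
mass = 6.63e-26  # kg, roughly one argon atom
speed = math.sqrt(3.0 * K_B * 300.0 / mass)
masses = [mass, mass]
velocities = [(speed, 0.0, 0.0), (0.0, -speed, 0.0)]
temperature = instantaneous_temperature(masses, velocities)
print(f"T = {temperature:.1f} K")
```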
2.13 NVT Ensemble

• Along the trajectory, the energy is no longer conserved.
• Each point in phase space, which corresponds to a state with total energy E, is visited with the Boltzmann probability:
$$p(E) = \frac{e^{-\beta E}}{\int dx_1 \dots dx_N\, e^{-\beta E}}$$
2.14 Thermostat algorithms

• Velocity scaling
Isokinetic:
• Measure the instantaneous T.
• Compare it with $T_{\text{ref}}$.
• Multiply the velocity of each particle by $\lambda = \sqrt{T_{\text{ref}}/T}$
Berendsen:
• Weak coupling with first-order kinetics to an external heat bath at a given temperature.
• Strongly damped exponential relaxation.
$$\lambda = \sqrt{1 + \frac{n\Delta t}{\tau}\left(\frac{T_{\text{ref}}}{T(t-\Delta t/2)} - 1\right)}$$
$$\frac{dT}{dt} = \frac{T_{\text{ref}} - T}{\tau}$$
$$\tau = \frac{2C_V}{3Nk_B}\,\tau^{*}$$
• This algorithm does not correctly sample the canonical ensemble. If we add a stochastic term that ensures a correct kinetic energy distribution, the sampling improves:
$$dK = (K_{\text{ref}} - K)\frac{dt}{\tau^{*}} + \sqrt{\frac{4KK_{\text{ref}}}{3N\tau^{*}}}\,dW$$
Andersen:
• Select a particle (randomly).
• Replace its current velocity by a velocity drawn from the Maxwell-Boltzmann distribution at $T_{\text{ref}}$.
• Repeat at periodic intervals.
Extended ensemble
Nosé-Hoover
• The time it takes to relax with Nosé-Hoover coupling is several times larger than τ.
• It produces an oscillatory relaxation.
• It is not as efficient for relaxing a system to the target temperature (in comparison with the Berendsen algorithm), but it can be used to correctly sample a canonical ensemble.
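The velocity-scaling thermostats above differ only in the scaling factor λ. The sketch below evaluates the Berendsen factor and follows the temperature of a hypothetical system whose only dynamics is the rescaling itself, so that the exponential relaxation towards the reference temperature is visible. The function name and all numerical values are assumptions for illustration; the temperature is passed in as an argument rather than taken from a half-step velocity.

```python
import math


def berendsen_lambda(t_current, t_ref, dt, tau, n=1):
    """Scaling factor lambda = sqrt(1 + (n dt / tau) (T_ref / T - 1))."""
    return math.sqrt(1.0 + (n * dt / tau) * (t_ref / t_current - 1.0))


# Scaling all velocities by lambda multiplies the kinetic energy, and hence T, by lambda^2.
temperature, t_ref = 250.0, 300.0   # K, hypothetical values
dt, tau = 0.002, 0.1                # ps, hypothetical values
for step in range(2000):
    lam = berendsen_lambda(temperature, t_ref, dt, tau)
    temperature *= lam * lam

print(f"T after coupling: {temperature:.3f} K")
```

Choosing tau equal to n·dt recovers the isokinetic factor sqrt(T_ref/T), which rescales to the reference temperature in a single step; larger values of tau give a weaker coupling.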
2.15 Constant Pressure

1. Along the trajectory, the volume, and consequently the density of the system, is no longer constant.
2. The fluctuations in the (instantaneous) pressure are connected with volume and density fluctuations.
3. It is normally coupled with constant temperature (isothermal-isobaric ensemble).
2.16 Barostat algorithms

Coordinate scaling
Berendsen
• Weak coupling with first-order kinetic relaxation of the pressure towards a reference pressure.
• Every n steps, re-scale the coordinates of each particle and the box vectors with the scaling matrix:
$$\mu_{ij} = \delta_{ij} - \frac{n\,\delta t}{\tau_p}\,\beta\left(P_{\text{ref},ij} - P_{ij}(t)\right)$$
where β is here the isothermal compressibility.
• It yields the correct average pressure, but not the exact isobaric ensemble.
• It is very useful when the initial pressure is very far from the reference equilibrium pressure.
Extended ensemble
1. Andersen
2. Parrinello-Rahman
• The Hamiltonian is extended by introducing additional degrees of freedom and terms. The box vectors, represented by the matrix h, obey the matrix equation of motion:
$$\frac{d^2 h}{dt^2} = V W^{-1} (h^T)^{-1} (P - P_{\text{ref}})$$
• It gives the true isobaric ensemble.
Using the leap-frog algorithm, the thermostat and barostat can be combined as follows:
1. Berendsen thermostat + Berendsen barostat (weak coupling): does not yield the exact NPT ensemble.
2. Nosé-Hoover thermostat + Parrinello-Rahman barostat (extended ensemble): does yield the exact NPT ensemble.
2.18 Equilibration Procedure

1. Preparation of the input files (topology, coordinates, simulation parameters).
2. System equilibration, i.e. bringing the system to the requested initial T and P:
2.1 Heat the system to the desired initial T by using a thermostat.
2.2 Employ a barostat to find the correct density compatible with the thermodynamic conditions of the system.
3. Production run.
4. Trajectory analysis, from which the desired information is obtained.
2.19 Radial Distribution Function
- It can be used to characterize the phases of a substance.
$$g(r) = \frac{1}{\rho\,4\pi r^2\,dr}\sum\langle\delta(r - |r_i - r_j|)\rangle$$
with δ(x) = 1 if x = 0 (and 0 otherwise).
It accounts for the probability of finding two particles a distance r apart.
1. It is sufficient information to calculate thermodynamic properties, particularly energy and pressure.
2. There are very well developed integral-equation theories that permit the estimation of the RDF for a given molecular model.
3. The RDF can be measured experimentally by using neutron-scattering techniques.
For an ideal gas, finding two molecules separated by a distance r is equally probable for any r, so g(r) = 1 everywhere.
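A minimal sketch of how g(r) can be estimated from a single configuration is given below: pair distances are binned into shells of width dr using the minimum image criterion, and the counts are normalized by the ideal-gas expectation ρ 4πr² dr. The function name, the cubic box and the normalization per particle (factor 2/N for pairs counted once) are assumptions of this example. Random uniform positions stand in for an ideal gas, so the result should fluctuate around 1.

```python
import math
import random


def radial_distribution(positions, box_length, n_bins):
    """Histogram estimate of g(r) up to r = L/2 in a cubic periodic box."""
    n = len(positions)
    r_max = 0.5 * box_length
    dr = r_max / n_bins
    counts = [0] * n_bins
    for i in range(n - 1):
        for j in range(i + 1, n):
            d2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                d = a - b
                d -= box_length * round(d / box_length)  # minimum image
                d2 += d * d
            r = math.sqrt(d2)
            if r < r_max:
                counts[min(int(r / dr), n_bins - 1)] += 1
    rho = n / box_length**3
    radii, g = [], []
    for k, c in enumerate(counts):
        r_mid = (k + 0.5) * dr
        shell_volume = 4.0 * math.pi * r_mid * r_mid * dr
        radii.append(r_mid)
        # each pair was counted once, hence the factor 2 / N
        g.append(2.0 * c / (n * rho * shell_volume))
    return radii, g


# Hypothetical ideal-gas configuration: 300 uniformly random particles
random.seed(0)
box = 10.0
ideal_gas = [tuple(random.uniform(0.0, box) for _ in range(3)) for _ in range(300)]
radii, g = radial_distribution(ideal_gas, box, 20)
tail_mean = sum(g[5:]) / len(g[5:])
print(f"mean g(r) over the outer bins: {tail_mean:.2f}")
```

For a liquid or a crystal the same function would show the characteristic peaks at the neighbour-shell distances; in practice the histogram is also averaged over many configurations of the trajectory.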
2.20 Heat Capacity

It can be shown that $C_v$ is related to the fluctuations of the total energy in an NVT ensemble by:
$$N k_B T^2 C_v = \langle E^2\rangle - \langle E\rangle^2$$
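The fluctuation relation can be checked on a system that is solvable exactly. The sketch below uses a single two-level system in reduced units (k_B = 1, so N = 1 and C_v is the heat capacity of that one system) and compares the heat capacity from the energy variance with the numerical derivative d⟨E⟩/dT. The two-level model, the units and the function names are assumptions made for this check.

```python
import math

K_B = 1.0  # reduced units (assumption of this example)


def canonical_moments(levels, temperature):
    """Return <E> and <E^2> for a set of energy levels in the canonical ensemble."""
    beta = 1.0 / (K_B * temperature)
    weights = [math.exp(-beta * e) for e in levels]
    q = sum(weights)
    mean = sum(e * w for e, w in zip(levels, weights)) / q
    mean_sq = sum(e * e * w for e, w in zip(levels, weights)) / q
    return mean, mean_sq


def heat_capacity_from_fluctuations(levels, temperature):
    """C_v = (<E^2> - <E>^2) / (k_B T^2)."""
    mean, mean_sq = canonical_moments(levels, temperature)
    return (mean_sq - mean * mean) / (K_B * temperature**2)


def heat_capacity_from_derivative(levels, temperature, d_t=1e-4):
    """C_v = d<E>/dT by central finite difference."""
    e_plus, _ = canonical_moments(levels, temperature + d_t)
    e_minus, _ = canonical_moments(levels, temperature - d_t)
    return (e_plus - e_minus) / (2.0 * d_t)


levels = [0.0, 1.0]  # hypothetical two-level system
c_fluct = heat_capacity_from_fluctuations(levels, 0.5)
c_deriv = heat_capacity_from_derivative(levels, 0.5)
print(f"C_v from fluctuations: {c_fluct:.6f}, from d<E>/dT: {c_deriv:.6f}")
```

In an MD simulation the averages ⟨E⟩ and ⟨E²⟩ would instead be accumulated over the frames of an NVT trajectory.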
3 Free Energy Calculations

3.1 Perturbation Theory
1. Initial problem: the unperturbed or reference problem.
2. The problem of interest, called the target problem, is presented in terms of a perturbation to the reference problem.
3. The effect of the perturbation is expressed as a series expansion with respect to a small quantity, called the perturbation parameter.
Goal: calculate the free energy difference between the reference system, characterized by the Hamiltonian $H_0$, and the target system, characterized by the Hamiltonian $H_1(p^N, r^N)$.
3.2 Free Energy and Energy Probability Distribution

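$$\Delta A = -kT\ln\frac{Q_1}{Q_0}$$
Writing $H_1 = H_0 + \Delta H$, the Boltzmann factor of the reference Hamiltonian can be factored out:
$$\Delta A = -kT\ln\frac{\iint dr^N dp^N\, e^{-\beta\Delta H}\, e^{-\beta H_0}}{\iint dr^N dp^N\, e^{-\beta H_0}}$$
$$\Delta A = -kT\ln\iint dr^N dp^N\, e^{-\beta\Delta H}\, P_0$$
$$\Delta A = -\frac{1}{\beta}\ln\langle e^{-\beta\Delta H}\rangle_0$$
If the number of particles and their masses are constant, the kinetic energy cancels out:
$$\Delta A = -\frac{1}{\beta}\ln\langle e^{-\beta\Delta U}\rangle_0$$
It is also possible to obtain the free energy difference by sampling the probability distribution of the energy difference ∆U:
$$\Delta A = -\frac{1}{\beta}\ln\int e^{-\beta\Delta U}\, P_0(\Delta U)\, d(\Delta U)$$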
3.3 Limitations of FEP

1. Reasonably accurate evaluation of ∆A via direct numerical integration is possible if the probability distribution in the low-∆U region is sufficiently well known up to two standard deviations from the peak of the integrand, or βσ + 2 standard deviations from the peak of $P_0(\Delta U)$, located at $\langle\Delta U\rangle_0$.
2. If the standard deviation is small enough, e.g. equal to kT, then 97% of the sampled values of ∆U are within ±2σ from the peak at $\langle\Delta U\rangle_0$.
3. If $P_0(\Delta U)$ is approximately a Gaussian distribution, then:
$$\Delta A = \langle\Delta U\rangle_0 - \frac{1}{2}\beta\sigma^2$$
• Direct use of the forward FEP equation can be successful only if $P_0(\Delta U)$ is a narrow function of ∆U. This does not imply that the difference between the reference and the target states must be small.
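The forward FEP estimator and its Gaussian limit can be compared on synthetic data. In the sketch below the samples of ∆U are drawn from a Gaussian with mean 2 kT and standard deviation 1 kT instead of coming from a simulation; these numbers, the units (β = 1) and the function names are assumptions for illustration only.

```python
import math
import random


def fep_forward(delta_u_samples, beta):
    """Delta A = -(1/beta) ln < exp(-beta Delta U) >_0, shifted for numerical stability."""
    shift = min(delta_u_samples)
    mean_exp = sum(math.exp(-beta * (du - shift)) for du in delta_u_samples) / len(delta_u_samples)
    return shift - math.log(mean_exp) / beta


def fep_gaussian(delta_u_samples, beta):
    """Delta A = <Delta U>_0 - beta sigma^2 / 2, valid if P_0(Delta U) is Gaussian."""
    n = len(delta_u_samples)
    mean = sum(delta_u_samples) / n
    variance = sum((du - mean) ** 2 for du in delta_u_samples) / (n - 1)
    return mean - 0.5 * beta * variance


# Synthetic reference-state samples of Delta U, in units of kT
random.seed(1)
beta = 1.0
samples = [random.gauss(2.0, 1.0) for _ in range(20000)]
delta_a = fep_forward(samples, beta)
delta_a_gauss = fep_gaussian(samples, beta)
print(f"exponential average: {delta_a:.3f} kT, Gaussian formula: {delta_a_gauss:.3f} kT")
```

Both estimates should be close to 2 − 1/2 = 1.5 kT here. Repeating the experiment with a larger σ makes the exponential average converge much more slowly, which is the practical meaning of the narrow-distribution requirement above.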
Non-overlapping states
What if $H_0 - H_1 \gg kT$?
We can extend this approach to include multiple intermediate states with increasing overlap. Labelling the N intermediate states by $\lambda_1, \dots, \lambda_N$:
$$\Delta A = A(1) - A(0) = (A(1) - A(\lambda_N)) + (A(\lambda_N) - A(\lambda_{N-1})) + \dots + (A(\lambda_1) - A(0))$$
1. In this method, intermediate states do not need to correspond to actual physical states.
2. The Hamiltonian can be considered a function of some parameter λ.
3. λ is defined between 0 and 1, such that λ = 0 and λ = 1 correspond to the reference and target states, respectively.
4. As we change λ, we move from state 0 to state 1.
3.4 Free Energy Calculations
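$$H(\lambda_i) = H_0 + \lambda_i\Delta H$$
At each intermediate step $\lambda_i$ we perform an MD simulation: first a short equilibration run and then a "production" run, where we calculate:
$$\Delta A(\lambda_i \to \lambda_{i+1}) = -kT\ln\langle e^{-\beta\Delta\lambda\Delta H}\rangle_{\lambda_i}$$
Total free energy difference (integrating out the kinetic term):
$$\Delta A = -kT\sum_{i=1}^{N-1}\ln\langle e^{-\beta\Delta\lambda\Delta U}\rangle_{\lambda_i}$$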
Conditions
1. If the important region of the target system fully overlaps with, or is a subset of, the important region of the reference system, $P_0(\Delta U)$ estimated from FEP should be reliable.
2. Good sampling of the important region of the reference system will then also yield good sampling of the important region of the target system.
3. If the important regions of the two systems do not overlap, the important region of the target state is not expected to be sufficiently sampled during a simulation of the reference state. Satisfactory estimates of ∆A are unlikely to be obtained.
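3.5 Thermodynamic Integration

$$\Delta A = \int_{\lambda_0}^{\lambda_1}\frac{\partial A(\lambda)}{\partial\lambda}\,d\lambda$$
$$A(\lambda) = -kT\ln Q(\lambda) = -kT\ln\iint dr^N dp^N\, e^{-\beta H(\lambda)}$$
$$\frac{\partial A(\lambda)}{\partial\lambda} = -\frac{1}{\beta}\frac{1}{Q}\frac{\partial Q}{\partial\lambda} = \frac{\int dr^N\, \frac{\partial H(\lambda)}{\partial\lambda}\, e^{-\beta H(\lambda)}}{\int dr^N\, e^{-\beta H(\lambda)}}$$
$$\Delta A = \int_{\lambda_0}^{\lambda_1}\left\langle\frac{\partial H(\lambda)}{\partial\lambda}\right\rangle_\lambda d\lambda$$
where $\langle\partial H(\lambda)/\partial\lambda\rangle_\lambda$ is the ensemble average of the derivative of H with respect to λ. This derivative is calculated analytically or approximated by finite differences. The free energy difference is the area under the curve of $\langle\partial H/\partial\lambda\rangle_\lambda$ versus λ.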
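The "area under the curve" can be illustrated with a model for which everything is known analytically. The sketch below takes a 1-D harmonic oscillator whose spring constant is switched linearly from k0 to k1, so that ⟨∂U/∂λ⟩ = (k1 − k0) kT / (2 k(λ)) and the exact result is ∆A = (kT/2) ln(k1/k0). The model, the trapezoidal rule, the 21 λ points and the function names are assumptions of this example; in a real calculation each average would come from a separate simulation at fixed λ.

```python
import math


def mean_du_dlambda(lam, k0, k1, k_t):
    """Analytic <dU/dlambda> for U = k(lambda) x^2 / 2 with k(lambda) = k0 + lambda (k1 - k0)."""
    k_lam = k0 + lam * (k1 - k0)
    mean_x2 = k_t / k_lam  # equipartition: <x^2> = kT / k
    return 0.5 * (k1 - k0) * mean_x2


def thermodynamic_integration(lambdas, averages):
    """Trapezoidal estimate of the integral of <dH/dlambda> over lambda."""
    total = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        total += 0.5 * width * (averages[i] + averages[i + 1])
    return total


k0, k1, k_t = 1.0, 4.0, 1.0  # hypothetical values in reduced units
lambdas = [i / 20 for i in range(21)]
averages = [mean_du_dlambda(lam, k0, k1, k_t) for lam in lambdas]

delta_a_ti = thermodynamic_integration(lambdas, averages)
delta_a_exact = 0.5 * k_t * math.log(k1 / k0)
print(f"TI: {delta_a_ti:.4f}, exact: {delta_a_exact:.4f}")
```

The residual difference between the two numbers is the quadrature error of the trapezoidal rule; it shrinks as more λ points are used, especially where the integrand is steep.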
3.6 The Potential of Mean Force (PMF)

1. Often not only the free energy difference is desired, but the free energy along some specific reaction coordinate ξ.
2. The PMF is calculated for a physically achievable process.
3. The free energy function $A(\xi) = -kT\ln P(\xi) + A_0$ is defined for a particular value of the coordinate $\xi(r^N, p^N)$, with P(ξ) the probability density of finding the chemical system of interest at ξ.
If
$$A(\xi_0) = -kT\ln\iint dr^N dp^N\, e^{-\beta H}\,\delta(\xi - \xi_0)$$
then
$$A(\xi_0) = -kT\ln Q - kT\ln\langle\delta(\xi - \xi_0)\rangle$$
Important:
1. Small changes in the free energy A(ξ) may correspond to P(ξ) changing by an order of magnitude from its "most-likely" value.
2. Monte Carlo and MD do not properly sample the regions where P(ξ) differs drastically from its "most-likely" value.
3.7 Umbrella Sampling

1. To improve sampling efficiency, the reaction pathway is broken down into windows, or ranges of ξ, where individual free-energy profiles are determined.
• Main idea: we can force (bias) the system to sample a particular small region by applying a biasing potential. This additional potential shall depend only on the reaction coordinate.
2. Umbrella sampling is applied to a pre-defined reaction coordinate that best describes the transition of the system from the original state to the target state. This reaction coordinate summarizes all the degrees of freedom in a few parameters called collective variables.
$$A(\xi_0) = -kT\ln P^{*}(\xi_0) - V(\xi_0) + K$$
The unbiased free energy can be calculated as follows:
1. Calculate the ensemble average $P^{*}(\xi_0)$ with the biased potential.
2. Subtract the biasing potential $V(\xi_0)$.
3. Add the term $K = -kT\ln\langle e^{-\beta V}\rangle_{H+V}$
3.8 Weighted Histogram Analysis Method (WHAM)

1. It is useful for combining sets of simulations with different biasing potentials in such a way that the unbiased potential of mean force can be found.
2. It is based on multiple histograms generated from MD simulations.
3. Histograms at different parameters (one histogram at each particular value of the specific parameter) are combined to form a single histogram.
4. The starting point is required to have a perfect match to minimize the total error.
