• Thermodynamic integration
University of Wisconsin-Madison Lecture 15
CBE 710, Fall 2019 - Prof. R. C. Van Lehn October 24, 2019
\[
w_i(x) = \frac{1}{2} k (x - x_i)^2 \tag{15.3}
\]
The potential energy function of the system is then given by $E(r^N) + w_i(x)$, so that the
weight function sharply increases the energy of any configuration whose reaction coordinate
differs appreciably from the restrained value $x_i$. In other words, we add a fictitious
force (a spring force) that is not meant to model a physical force in the system, but rather is added
solely to force the system to sample a particular value of the reaction coordinate. Conceptually,
the idea behind this is to effectively “flatten” the free energy landscape by forcing the system to
explore a local region near $x_i$, thus allowing the sampling of values of $x$ that would not be explored
in an unbiased simulation.
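As a minimal sketch of the restraint described above, the harmonic weight function of eq. (15.3) and the spring force it produces along the reaction coordinate can be written as follows. The function names are illustrative only, and the reaction coordinate is assumed to be a single scalar already computed from the configuration.

```python
def harmonic_bias(x, x_i, k):
    """Weight function w_i(x) = (1/2) k (x - x_i)^2, as in eq. (15.3)."""
    return 0.5 * k * (x - x_i) ** 2


def harmonic_bias_force(x, x_i, k):
    """Fictitious spring force along the reaction coordinate, -dw_i/dx."""
    return -k * (x - x_i)


def biased_energy(energy, x, x_i, k):
    """Total energy E(r^N) + w_i(x) that drives the i-th biased simulation."""
    return energy + harmonic_bias(x, x_i, k)
```

The bias vanishes at $x = x_i$ and grows quadratically away from it, so a larger spring constant $k$ confines sampling more tightly around the restrained value.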
We can now write the biased probability of finding the system at a particular value of the
reaction coordinate $x(r^N) = x'$ for the $i$th simulation using the modified potential energy function:
\[
p_{\mathrm{bias},i}(x') = \frac{\int dr^N \, e^{-\beta[E(r^N) + w_i(x)]} \, \delta(x(r^N) - x')}{\int dr^N \, e^{-\beta[E(r^N) + w_i(x)]}} \tag{15.4}
\]
The delta function selects only configurations with $x(r^N) = x'$, so if $x' \approx x_i$, then $p_{\mathrm{bias},i}(x')$ will be large;
otherwise, the weight function will lead to large values of the total energy and thus negligible values
of $p_{\mathrm{bias},i}(x')$. We can sample this probability distribution directly in a simulation by adding the
weight function to the system dynamics (i.e., adding a spring force to the relevant particles) to increase
sampling near $x' = x_i$. However, we need the unbiased probability to calculate
the potential of mean force, so we must relate $p_{\mathrm{bias},i}(x')$ to $p(x')$. To do so, we can first rewrite
the biased probability distribution as:
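\begin{align}
p_{\mathrm{bias},i}(x') &= \frac{\int dr^N \, e^{-\beta E(r^N)} \, e^{-\beta w_i(x)} \, \delta(x(r^N) - x')}{\int dr^N \, e^{-\beta E(r^N)} \, e^{-\beta w_i(x)}} \tag{15.5} \\
&= \frac{\int dr^N \, e^{-\beta E(r^N)} \, e^{-\beta w_i(x)} \, \delta(x(r^N) - x')}{Z} \cdot \frac{Z}{\int dr^N \, e^{-\beta E(r^N)} \, e^{-\beta w_i(x)}} \tag{15.6}
\end{align}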
Inspecting the second factor, we see that it is the reciprocal of an integral over all phase space of $e^{-\beta w_i(x)}$
multiplied by the normalized Boltzmann weight $e^{-\beta E(r^N)}/Z$; that integral is exactly the unbiased ensemble average $\langle e^{-\beta w_i(x)} \rangle$,
so we simplify to:
\[
p_{\mathrm{bias},i}(x') = \frac{\int dr^N \, e^{-\beta E(r^N)} \, e^{-\beta w_i(x)} \, \delta(x(r^N) - x')}{Z} \, \langle e^{-\beta w_i(x)} \rangle^{-1} \tag{15.7}
\]
Next, we can recognize that the delta function in the integral selects only those states for which
$x(r^N) = x'$ (unlike the previous ensemble average, where the integral includes all values of $r^N$ and
thus all values of $x(r^N)$). As a result, the weight function inside the integral can be evaluated analytically
at $x'$ and removed from the integral, yielding:
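\begin{align}
p_{\mathrm{bias},i}(x') &= \frac{e^{-\beta w_i(x')} \int dr^N \, e^{-\beta E(r^N)} \, \delta(x(r^N) - x')}{Z} \, \langle e^{-\beta w_i(x)} \rangle^{-1} \tag{15.8} \\
&= e^{-\beta w_i(x')} \, p(x') \, \langle e^{-\beta w_i(x)} \rangle^{-1} \tag{15.9}
\end{align}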
Note that we are being careful to distinguish between the Boltzmann factor of the weight function for a
specific value of the reaction coordinate, $e^{-\beta w_i(x')}$, which is analytically defined, and its
ensemble average over all possible values of the reaction coordinate, $\langle e^{-\beta w_i(x)} \rangle$,
which depends on the entire phase space. We can then rearrange this expression for the unbiased
probability to write:
\[
p(x') = e^{\beta w_i(x')} \, p_{\mathrm{bias},i}(x') \, \langle e^{-\beta w_i(x)} \rangle \tag{15.10}
\]
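Eq. (15.10) can be checked numerically on a toy problem. The sketch below assumes a one-dimensional system in which the configuration is the reaction coordinate itself, a hypothetical double-well energy, and illustrative values for $k$, $x_i$, and $\beta$; all densities are evaluated on a grid rather than sampled, so it illustrates the identity and is not a simulation protocol.

```python
import math


def boltzmann_density(energies, dx, beta):
    """Normalized density exp(-beta * E) / Z on a uniform grid with spacing dx."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights) * dx
    return [w / z for w in weights]


beta = 1.0
k, x_i = 20.0, 0.0          # hypothetical spring constant and window center
dx = 0.01
grid = [-2.0 + j * dx for j in range(401)]

energy = [5.0 * (x * x - 1.0) ** 2 for x in grid]   # toy double well, barrier at x = 0
bias = [0.5 * k * (x - x_i) ** 2 for x in grid]     # w_i(x) from eq. (15.3)

p_unbiased = boltzmann_density(energy, dx, beta)
p_biased = boltzmann_density([e + w for e, w in zip(energy, bias)], dx, beta)

# <exp(-beta * w_i)> evaluated in the unbiased ensemble
avg_exp_bias = sum(p * math.exp(-beta * w) for p, w in zip(p_unbiased, bias)) * dx

# Right-hand side of eq. (15.10)
p_recovered = [math.exp(beta * w) * pb * avg_exp_bias for w, pb in zip(bias, p_biased)]

max_rel_error = max(abs(pr - pu) / pu for pr, pu in zip(p_recovered, p_unbiased))
```

Here `max_rel_error` should be at the level of floating-point round-off, while `p_biased` is far larger than `p_unbiased` at the barrier top, which is the point of applying the bias. In a real simulation the ensemble average is not available in this form, which is why the constants $K_i$ must be obtained by other means.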
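This expression relates the biased probability distribution from the $i$th biased simulation
to the unbiased probability distribution. We can then write the value of the PMF, $F_i(x')$, associated
with $x'$ based on the $i$th simulation (i.e., the simulation with a bias centered at $x_i$). Since
$F(x') = -k_B T \ln[Z \, p(x')]$, substituting eq. (15.10) gives:
\[
F_i(x') = -k_B T \ln\left[p_{\mathrm{bias},i}(x')\right] - w_i(x') - k_B T \ln \langle \exp[-\beta w_i(x)] \rangle - k_B T \ln Z
\]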
Let’s consider each of these terms in turn. The first term, $-k_B T \ln[p_{\mathrm{bias},i}(x')]$, can be estimated
directly from the $i$th biased molecular simulation, in which the weight function restrains the
system to sample configurations with $x(r^N) \approx x_i$, allowing $p_{\mathrm{bias},i}(x')$ to be calculated even
if $x'$ is normally not sampled in an unbiased simulation. The second term, $-w_i(x')$, is calculated
analytically since the expression for the weight function is specified. The fourth term, $-k_B T \ln Z$,
is a constant that does not depend on $x'$ and can be eliminated by considering only differences in the
PMF. Finally, the third term, $-k_B T \ln \langle \exp[-\beta w_i(x)] \rangle$, involves the ensemble average of the exponential
of the weight function over values of $x$ sampled from the unbiased ensemble. As we will show below, this term is
equal to the free energy cost associated with introducing the weight function. We can define this
term as $K_i$ and drop the constant fourth term to write our final expression as:
\[
F_i(x') = -k_B T \ln\left[p_{\mathrm{bias},i}(x')\right] - w_i(x') + K_i
\]
simulations (i.e., different restrained values $x_i$) and adjust the values of $K_i$ such that the estimate
for $F(x')$ matches across all biased windows. This approach requires that the biased simulations
overlap; that is, $p_{\mathrm{bias},i}(x')$ must be non-negligible at the same value of $x'$
in each of the overlapping windows. In practice, this means that the harmonic weight function
must allow the system to sample configurations slightly different from $x_i$ to ensure that $x'$ can be
sampled in multiple different biased simulations.
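The matching of windows can be illustrated with a deliberately simple scheme: compute $-k_B T \ln p_{\mathrm{bias},i}(x') - w_i(x')$ in each window, then shift each new window by the constant that makes it agree, on average, with the already-stitched PMF over the bins they share. This is a sketch under stated assumptions and not the Weighted Histogram Analysis Method: the energy is a hypothetical 1D double well, the biased densities are evaluated on a grid instead of being histogrammed from a simulation, and the names `window_pmf`, `stitch`, and `p_min` are illustrative.

```python
import math


def window_pmf(grid, energy, x_i, k, beta, p_min=1e-4):
    """Map grid index -> -kT ln p_bias,i(x') - w_i(x') for bins window i populates."""
    dx = grid[1] - grid[0]
    bias = [0.5 * k * (x - x_i) ** 2 for x in grid]
    weights = [math.exp(-beta * (e + w)) for e, w in zip(energy, bias)]
    z = sum(weights) * dx
    pmf = {}
    for j, (wt, w) in enumerate(zip(weights, bias)):
        p = wt / z
        if p > p_min:  # mimic finite sampling: poorly populated bins are discarded
            pmf[j] = -math.log(p) / beta - w
    return pmf


def stitch(pmf_a, pmf_b):
    """Shift pmf_b by a constant so it matches pmf_a on average over shared bins."""
    shared = sorted(set(pmf_a) & set(pmf_b))
    if not shared:
        raise ValueError("windows do not overlap")
    offset = sum(pmf_a[j] - pmf_b[j] for j in shared) / len(shared)
    merged = dict(pmf_a)
    for j, f in pmf_b.items():
        merged.setdefault(j, f + offset)  # keep pmf_a where both windows have data
    return merged, offset


beta = 1.0
dx = 0.01
grid = [-2.0 + j * dx for j in range(401)]
energy = [5.0 * (x * x - 1.0) ** 2 for x in grid]   # toy double well
centers = [-1.0, -0.5, 0.0, 0.5, 1.0]               # restrained values x_i

pmf = window_pmf(grid, energy, centers[0], k=20.0, beta=beta)
for x_i in centers[1:]:
    pmf, _ = stitch(pmf, window_pmf(grid, energy, x_i, k=20.0, beta=beta))

# In 1D the PMF equals the energy up to a constant, so the residual should be flat.
residual = [pmf[j] - energy[j] for j in sorted(pmf)]
spread = max(residual) - min(residual)
```

Each offset returned by `stitch` plays the role of a difference between two $K_i$ values. If the windows are too far apart or the spring is too stiff, `stitch` raises an error because no bins are shared, which is the overlap requirement in concrete form.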
So, to recap: umbrella sampling allows us to calculate the PMF (i.e., the change in the free
energy) associated with any arbitrary process by sampling configurations at different
values of a reaction coordinate that describes the process. The key advantage of the umbrella
sampling approach is that any value of the reaction coordinate can be sampled by applying weight
functions, thus enabling the estimate of the PMF even for very low probability (high free energy)
states. From a computational standpoint, this method requires a series of independent simulations
to be performed and then free energies to be determined by matching estimates of the PMF from
overlapping biased simulations. The requirement of overlap renders this technique computationally
inefficient, although more efficient methods (such as the Weighted Histogram Analysis Method)
have been developed to compute the set of $K_i$. These techniques are outside the scope of this
discussion. Umbrella sampling is very commonly used to compute PMFs for processes with large energy
barriers, such that the processes cannot be directly observed in unbiased simulations. For example,
one could apply umbrella sampling to calculate the free energy change associated with adsorbing
a molecule to a surface by defining the distance to the surface as the reaction coordinate, choosing
multiple values of this distance, then performing multiple independent simulations in which the
molecule of interest is restrained to each value of the reaction coordinate using a harmonic spring.
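The adsorption protocol above can be sketched with a toy model. The code below assumes a single particle whose only degree of freedom is its distance $z$ from the surface, a hypothetical attractive well standing in for the surface interaction, and Metropolis Monte Carlo in place of molecular dynamics; the spring constant, window spacing, and step size are illustrative values, not recommendations.

```python
import math
import random


def toy_surface_energy(z):
    """Hypothetical energy profile: a well of depth 5 (units of k_B T) at z = 1."""
    return -5.0 * math.exp(-((z - 1.0) ** 2) / 0.1)


def run_window(z_i, k, beta, n_steps, max_step=0.2, seed=0):
    """Metropolis sampling of z under the biased energy E(z) + (1/2) k (z - z_i)^2."""
    rng = random.Random(seed)
    z = z_i
    u = toy_surface_energy(z) + 0.5 * k * (z - z_i) ** 2
    samples = []
    for _ in range(n_steps):
        z_trial = z + rng.uniform(-max_step, max_step)
        u_trial = toy_surface_energy(z_trial) + 0.5 * k * (z_trial - z_i) ** 2
        # Metropolis criterion applied to the biased energy
        if u_trial <= u or rng.random() < math.exp(-beta * (u_trial - u)):
            z, u = z_trial, u_trial
        samples.append(z)
    return samples


beta, k = 1.0, 50.0
centers = [1.0 + 0.25 * i for i in range(9)]   # restrained distances 1.0, 1.25, ..., 3.0

# One independent simulation per restrained value of the reaction coordinate
trajectories = {z_i: run_window(z_i, k, beta, n_steps=5000, seed=i)
                for i, z_i in enumerate(centers)}
```

Each trajectory would then be histogrammed to estimate $p_{\mathrm{bias},i}$ for its window. With these values the thermal width of a window on a flat part of the landscape is about $\sqrt{k_B T / k} \approx 0.14$, comparable to the 0.25 spacing, so neighboring windows sample shared values of $z$.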
For this calculation, we will first define two partition functions, $Z_0$ and $Z_1$, corresponding to two
different systems with potential energy functions $E_0(r^N)$ and $E_1(r^N)$. Note that while the potential
energy functions are different, we assume that the set of possible values of $r^N$ is the same (i.e.,
both systems access the same phase space). For example, one could imagine computing the free
energy difference between an ideal gas and a non-ideal gas with the same number of particles, with
the interactions associated with the non-ideal gas leading to a different potential energy function.
The Helmholtz free energy change for transforming from system 0 to 1 is then:
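\begin{align}
\Delta F &= F_1 - F_0 \tag{15.15} \\
&= -k_B T \ln \frac{Z_1}{Z_0} \tag{15.16} \\
&= -k_B T \ln \left[ \frac{\int dr^N \exp\left[-\beta E_1(r^N)\right]}{\int dr^N \exp\left[-\beta E_0(r^N)\right]} \right] \tag{15.17}
\end{align}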
Next, we define $p_1(\Delta E')$ as the probability density for the energy difference $\Delta E(r^N) =
E_1(r^N) - E_0(r^N)$ with configurations sampled using $E_1$. In other words, we can imagine generating
a large number of configurations using the potential energy function for system 1, calculating
the energy of those configurations according to both $E_1$ and $E_0$, then finding the probability of
identifying a particular energy difference $\Delta E'$. Similarly, $p_0(\Delta E')$ is the probability density for the
same energy difference with configurations sampled using $E_0$. We then write:
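The intermediate steps are not reproduced here, so the following is a reconstruction from the definitions above rather than the original equations. It follows the same pattern as the umbrella sampling derivation (insert $E_1 = E_0 + \Delta E$, use the delta function to pull the exponential out of the integral, then use eq. (15.16) for the ratio of partition functions) and is consistent with eq. (15.28):

```latex
\begin{align}
p_1(\Delta E') &= \frac{\int dr^N \, e^{-\beta E_1(r^N)} \, \delta\!\left(\Delta E(r^N) - \Delta E'\right)}{Z_1} \\
&= \frac{\int dr^N \, e^{-\beta E_0(r^N)} \, e^{-\beta \Delta E(r^N)} \, \delta\!\left(\Delta E(r^N) - \Delta E'\right)}{Z_1} \\
&= e^{-\beta \Delta E'} \, \frac{Z_0}{Z_1} \, \frac{\int dr^N \, e^{-\beta E_0(r^N)} \, \delta\!\left(\Delta E(r^N) - \Delta E'\right)}{Z_0} \\
&= \exp(\beta \Delta F) \, p_0(\Delta E') \, \exp(-\beta \Delta E')
\end{align}
```

The last line uses $Z_0 / Z_1 = \exp(\beta \Delta F)$, which is eq. (15.16) rearranged.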
From this equation alone, which relates $p_1(\Delta E')$ to $p_0(\Delta E') \exp(-\beta \Delta E')$ through the factor
$\exp(\beta \Delta F)$, we can see that calculating the two probability densities from simulations
in both ensembles would allow for the calculation of the free energy change $\Delta F$. Finally, we
can integrate both sides over all possible values of $\Delta E'$ to yield a more concise expression:
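\begin{align}
\int_{-\infty}^{\infty} d\Delta E' \, p_1(\Delta E') &= \exp(\beta \Delta F) \int_{-\infty}^{\infty} d\Delta E' \, p_0(\Delta E') \exp(-\beta \Delta E') \tag{15.28} \\
1 &= \exp(\beta \Delta F) \, \langle \exp[-\beta \Delta E] \rangle_0 \tag{15.29} \\
\exp[-\beta \Delta F] &= \langle \exp[-\beta \Delta E] \rangle_0 \tag{15.30}
\end{align}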
Here, the left-hand side integrates the probability density $p_1(\Delta E')$ over all possible energy differences
sampled in system 1; since the probability density is normalized, this just equals 1. The factor
$\exp(\beta \Delta F)$ is independent of $\Delta E'$, so it can be taken outside the integral on the right-hand side, which
is then equal to the ensemble average of $\exp[-\beta \Delta E]$, yielding the final expression.
Note again that this ensemble average is sampled using the energy function of system 0.
The final free energy perturbation expression relates the free energy change for transforming
from system 0 to system 1 to the ensemble average of the Boltzmann factor of the energy change for this
transformation, for configurations sampled from system 0. Free energy perturbation can be used directly in molecular
simulations by defining systems 0 and 1, generating configurations according to the potential energy
function of system 0, calculating the energy of each configuration using both $E_1$
and $E_0$, then averaging $\exp[-\beta(E_1 - E_0)]$ to get $\Delta F$ according to eq. (15.30). Note that, beyond
sharing the same phase space, there are no constraints on what the potential energy functions of systems 0 and 1 can be, so it is possible to
use this approach to completely change the chemical identity of molecules during a simulation and
measure the corresponding free energy change. Such transformations are called alchemical free
energy calculations.
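A minimal sketch of eq. (15.30) on a case with a known answer may help. The example assumes two one-dimensional harmonic systems, $E_0 = \frac{1}{2} k_0 x^2$ and $E_1 = \frac{1}{2} k_1 x^2$, with hypothetical spring constants; system 0 is sampled by drawing Gaussian random numbers directly rather than by running a simulation, and the function name `fep_delta_f` is illustrative.

```python
import math
import random


def fep_delta_f(delta_e_samples, beta):
    """Free energy perturbation estimate: dF = -kT ln < exp(-beta * dE) >_0."""
    mean_boltzmann = sum(math.exp(-beta * de) for de in delta_e_samples) / len(delta_e_samples)
    return -math.log(mean_boltzmann) / beta


beta = 1.0
k0, k1 = 1.0, 2.0            # hypothetical spring constants of systems 0 and 1
rng = random.Random(2019)

# Configurations from system 0: for E0 = k0 x^2 / 2 the Boltzmann distribution
# is a Gaussian with variance 1 / (beta * k0), so it can be sampled directly.
sigma0 = math.sqrt(1.0 / (beta * k0))
xs = [rng.gauss(0.0, sigma0) for _ in range(100_000)]

# Energy difference E1 - E0 evaluated on each system-0 configuration
delta_e = [0.5 * (k1 - k0) * x * x for x in xs]

delta_f_fep = fep_delta_f(delta_e, beta)
delta_f_exact = 0.5 * math.log(k1 / k0) / beta   # from Z = sqrt(2 pi / (beta k))
mean_delta_e = sum(delta_e) / len(delta_e)       # plain average of dE, for comparison
```

The estimate `delta_f_fep` should agree with `delta_f_exact` to within sampling noise, whereas the plain average `mean_delta_e` does not: it is the exponential of the energy difference that must be averaged, not the energy difference itself.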
Alchemical free energy calculations are often used to compute the free energy difference between
two states that have no clear reaction coordinate connecting them, and for which only differences in
free energy (and not complete free energy pathways) are necessary. A typical example is the design of
drug inhibitors that bind proteins: free energy perturbation can be used to calculate the free energy
change between a molecule bound to a receptor and a slightly different molecule bound to the same
receptor to quantify relative binding affinities. Alternatively, the same technique could be used to
calculate the absolute free energy of binding by defining a difference in free energy between the
bound drug molecule and a drug molecule free in solution.