Professional Documents
Culture Documents
Abstract
In the past decade, great progress has been made in the development of enhanced sampling methods, aimed
at overcoming the time-scale limitations of molecular dynamics (MD) simulations. Many sampling schemes
rely on adding an external bias to favor the sampling of transitions and to estimate the underlying free
energy landscape. Nevertheless, sampling molecular processes described by many order parameters, or
collective variables (CVs), such as complex biomolecular transitions, remains often very challenging. The
computational cost has a prohibitive scaling with the dimensionality of the CV-space. Inspiration can be
taken from methods that focus on localizing transition pathways: the CV-space can be projected onto a
path-CV that connects two stable states, and a bias can be exerted onto a one-dimensional parameter that
captures the progress of the transition along the path-CV. In principle, such a sampling scheme can handle
an arbitrarily large number of CVs. A standard enhanced sampling technique combined with an adaptive
path-CV can then locate the mean transition pathway and obtain the free energy profile along the path. In
this chapter, we discuss the adaptive path-CV formalism and its numerical implementation. We apply the
path-CV with several enhanced sampling methods—steered MD, metadynamics, and umbrella sampling—
to a biologically relevant process: the Watson–Crick to Hoogsteen base-pairing transition in double-
stranded DNA. A practical guide is provided on how to recognize and circumvent possible pitfalls during
the calculation of a free energy landscape that contains multiple pathways. Examples are presented on how
to perform enhanced sampling simulations using PLUMED, a versatile plugin that can work with many
popular MD engines.
Key words Enhanced sampling, Metadynamics, Path sampling, Path collective variable, Free energy,
Molecular dynamics, PLUMED, DNA, Hoogsteen base-pairing
1 Introduction
Massimiliano Bonomi and Carlo Camilloni (eds.), Biomolecular Simulations: Methods and Protocols, Methods in Molecular Biology,
vol. 2022, https://doi.org/10.1007/978-1-4939-9608-7_11, © Springer Science+Business Media, LLC, part of Springer Nature 2019
255
256 Alberto Pérez de Alba Ortı́z et al.
2 Theory
2.1 Defining the Path Let us consider an L-particle system with positions qðtÞ∈3L and
Collective Variable velocities vðtÞ∈3L . The dynamics of the system is governed by a
potential U(q) and follows a canonical distribution at a temperature
T. Let us also assume that the system has an underlying free energy
surface (FES) with two stable states A and B. The FES can be fully
described by a set of N key descriptive degrees of freedom, the
collective variables (CVs) {zi(q)}, with i ¼ 1,. . .,N. We aim to find
the average transition path between A and B in the space of the
CVs, zi, and define the progress along it as a reaction coordinate.
Provided that the CVs are sufficiently good descriptors of
the system, the reaction coordinate is well-defined in terms of
258 Alberto Pérez de Alba Ortı́z et al.
2.2 Finding In principle, Eq. 1 enables the calculation of the average transition
the Average path by sampling and making a histogram of z during an MD run,
Transition Path or a Monte Carlo simulation. However, the transition is a rare event
on the time-scales accessible to standard simulations. The flux
density away from the neighborhood of the A or the B basins will
be poorly sampled. This problem is typically overcome with
enhanced sampling methods—e.g., metadynamics—by biasing
directly the dynamics of the CVs, z. But in practice, such an
approach is limited to low-dimensional CV-spaces. The path-CV
aims to overcome this limitation.
Let us exert a time-dependent metadynamics bias,
X
ð ðσ 2W Þ,
2
σ ðtÞÞ
g g
V bias ðσ g , tÞ ¼ H ðtÞexp 2 ð2Þ
t
We can now optimize the path toward the average transition path
by iteratively relocating the guess path to the cumulative average
density sg ðσ g Þ ¼ hziσ g (Fig. 1). Simultaneously, the metadynamics
algorithm will adapt the bias potential after each path update as it
keeps adding layers of Gaussian potentials to the total bias, over-
writing the free energy at previous trial paths [48], and continu-
ously converging toward the free energy at the average transition
path F ¼ Vbias(σ, t).
Most of the extensive machinery developed in the last years for
metadynamics can be applied directly with the PMD method. For
example, the well-tempered method [2, 3] can aid in converging a
free energy profile by gradually reducing the height of the Gaus-
sians on the fly. The same effect can be obtained by manually
reducing the Gaussian height at every recrossing from A to
B [37] (see Note 1). Another interesting feature is multiple-walker
metadynamics [43], which can be applied with the path-CV to
speed up the simulation. Here, several replicas of the system are
simulated simultaneously in parallel for the exploration of different
regions of CV-space, while each replica communicates its updates
260 Alberto Pérez de Alba Ortı́z et al.
Fig. 1 Graphical representation of an initial guess path-CV section (red) converging to the average transition
path (green) between basins A and B. The curve points sg(σ g) are relocated to the cumulative average density
hziσg in CV-space, which peaks at the valley in the free energy F (yellow) at each hyperplane S σ g
on both the path and on the bias potential to the other replicas. By
initializing walkers both in the A and B basins, the shape of the path
can be rendered significantly faster (see Note 2).
The presented recipe to locate the average transition path with
an adaptive path-CV is not exclusively coupled to metadynamics.
Other enhanced sampling methods, such as steered MD, con-
strained MD, umbrella sampling, or even TPS, can be used with
the path-CV, without changes to the path formalism. In this chap-
ter, we will exemplify the use of the path-CV with steered MD,
metadynamics, and with umbrella sampling.
where sm is the closest path node to z, and sm1 and sm+1 are its
neighboring nodes. This expression implies that points beyond the
first or last nodes are mapped to values of σ g < 0 and σ g > 1,
respectively. To have control over this mapping at and outside the
stable states, extra trailing nodes can be added at both ends of the
original path. If necessary, wall potentials can be added to restrict
the sampling on a particular σg-region. Note also that the projec-
tion in Eq. 4 requires that the nodes are equidistant. This require-
ment is imposed by a reparameterization step [26] after each path
update.
2.3.2 Evolution The path update step, which sets sg ðσ g Þ ¼ hziσ g , uses the time
of the Path-CV averaged distance between the sampled z-points and their projected
points on the path, sg(σ g(z)). This distance is weighted by a weight,
w, which is only non-zero for the two closest nodes, giving the
following path node propagation equation:
P
wj , k ðzk st i ðσðzk ÞÞÞ
t iþ1 ti
sj ¼ sj þ k P , with
k wj , k
2 0 13
ti ð5Þ
sj st i ðσðzk ÞÞ
6 B C 7
wj , k ¼ max40, @1 ti
A5,
sj stj þ1
i
2.3.3 Tube Potential, When facing landscapes with multiple or ill-defined transition
CV-space Scaling, Multiple valleys, it can be beneficial to not only bias the sampling along the
Walkers, and Corner- path, but also restrain the sampling to the path vicinity. A harmonic
Cutting restraint potential set at jjzk st i ðσðzk ÞÞjj, either at zero distance or
allowing some freedom, can help in converging a transition path.
We refer to this restraint as a “tube” potential. In the limit of an
infinitesimally narrow tube potential, the path is optimized by
following the local free energy gradient—in a similar fashion to
the string method [26]—and PMD converges to the MFEP closest
to the initial guess path instead of to the average transition path.
Thus, the tube potential allows to control the behavior of PMD by
switching between a path optimization based on CV density, quick
262 Alberto Pérez de Alba Ortı́z et al.
Fig. 2 Graphical explanation of a path node update. The sampled average density hziσg is projected onto the
path at sg(σ g). The two closest path nodes sm1 and sm are relocated according to weights that depend on
their distances from sg(σ g). A subsequent reparameterization step redistributes the nodes along the path to
make them equidistant again
nodes at the other end of the path, moving them gradually back to a
straight path and thus undoing previous path optimization. This
side effect is much reduced in the multiple-walker PMD implemen-
tation, because in that case the sampling is more continuous along
the entire path. Apart from preventing the information loss, of
course the multiple-walker option also results in an almost trivial
parallelization speedup for the sampling of the path and the free
energy, thus providing a powerful extension to the original method
[41] (see Note 5).
In summary, the path-CV consists of a set of ordered nodes
describing the transition from basin A to basin B in the high-
dimensional CV-space. The system can be biased to move along
the path, while the positions of the nodes in CV-space can be
optimized by following the average density of the sampled points,
which peaks at the free energy valley. By means of this sampling
along and around the path, we can converge the average transition
path and the free energy along it. Additional actions can be taken to
control the extent of the sampling and the flexibility of the path
when facing challenging, forking free energy landscapes. Namely,
we can add a tube potential to restrain the sampling in the direction
perpendicular to the path and switch from a density-based optimi-
zation toward the average transition path, to a gradient-based
optimization toward the closest MFEP to the initial guess path.
2.4 Path-CV The theoretical and numerical framework discussed above has been
Implementation implemented into the PLUMED software [42] as a function of
in PLUMED CVs. Invoking the action PATHCV in PLUMED requires that the
following keywords are specified:
l LABEL: sets the identifier for this instance.
l ARG: sets the list of (priorly defined) CVs that span the space in
which the path exists.
l GENPATH: generates a straight path between two points in
CV-space; the two points typically marking the stable states. It
takes 3 integers as arguments, corresponding to the number of
anterior trailing nodes, actual transition nodes, and posterior
trailing nodes, followed by the CV-space coordinates of the
initial and final transition nodes separated by commas.
l INFILE: points to a file containing an input path.
l FIXED: indicates the two fixed nodes corresponding to the
initial and final states. The default values are the first and last
nodes, thus assuming no trailing nodes.
l OUTFILE (PATH) : points to a file where the path updates are
printed, concatenated one after the other.
l SCALE: lists the scaling factors to normalize the CV-space. The
default value is one for each CV.
264 Alberto Pérez de Alba Ortı́z et al.
3 Materials
4 Methods
Fig. 3 (a) Double-stranded A6-DNA; (b) WC A16–T9 bp; (c) HG A16–T9 bp after the 180∘ rotation of the
adenine; (d) Base flipping of the adenine in the A16–T9 bp
266 Alberto Pérez de Alba Ortı́z et al.
4.1 System The setup of the aqueous double-stranded DNA system for classical
Preparation MD simulation with the GROMACS software was done as part of a
previous work [59, 60] and will be described in full detail elsewhere.
In brief, an ideal B-DNA duplex structure is created with
the make_na webtool [61]. The nucleotide sequence 50 -
CGATTTTTTGGC-30 —and its complementary strand—is repro-
duced from ref. 53. The chain is placed in a periodic dodecahedron
box and solvated in 6691 TIP3P water molecules [62] with 28 Na+
and 6 Cl ions (25 mM NaCl), resulting in a charge-neutral system of
20868 atoms. The AMBER03 force field [63] is used to describe
atomic interactions. Non-bonded interactions are treated with a
cut-off at 0.8 nm and long-range electrostatics are handled by the
particle mesh Ewald method [64, 65]. Energy minimization is per-
formed by conjugate gradient (with a threshold of 100 N), and
followed by a 1 ns position restrained run (with a force constant of
1000 kJ/(mol nm2) for DNA heavy atoms). Equilibration is per-
formed in nine, 200 ns long, MD runs starting with different random
The Adaptive Path Collective Variable 267
4.2 Collective Determining a good set of CVs to describe the slow dynamics of a
Variables system is a highly non-trivial problem. For a long time, chemical
intuition was the main way to identify geometric order parameters
to describe a molecular transition. Today, state-of-the-art efforts in
automating the discovery of CVs include: the spectral gap optimi-
zation of order parameters (SGOOP) [69], the time-structure
based independent component analysis (tICA) [70], and the har-
monic linear discriminant analysis (HLDA) [71]. Regardless of the
method for choosing CVs, one big advantage of the path approach
is that it enables to use a large set of CVs (i.e., more than the usual
two or three). We may even have some redundancy or include some
irrelevant CVs, without running immediately into problems. Here,
we constructed the set of CVs after a careful analysis of available
TPS trajectories of the transition and using a trial-and-error
approach at the hand of test runs with steered MD and PMD.
Key aspects aimed for in the test run included: (1) structural con-
sistency of the stable states and (2) of the overall DNA double helix,
(3) consistent capture of the transition mechanism, and (4) mini-
mizing hysteresis effects in the path finding and free energy estima-
tion. The set contains the following 7 CVs (Fig. 4 for a graphical
illustration):
l dWC: The distance of the characteristic WC H-bond between
A16 (N1) and T9 (N3).
l dHG: The distance of the characteristic HG H-bond between
A16 (N7) and T9 (N3).
l dHB: The distance of the conserved H-bond, present both in
WC and in HG, between A16 (N6) and T9 (O4).
l dCC: The distance between A16 (C1’) and T9 (C1’).
l dNB: The distance between the centers of mass P1’ and P2’.
l tGB: The torsion around the glycosidic bond defined by the
pseudo-dihedral angle formed by the axis A16 (C1’-N9) and
the vectors P2-P1 and P5-P6.
l tBF: The base-flipping torsion defined by the pseudo-dihedral
angle (P1+P2)-P3-P4-P5.
The first two CVs—dWC and dHG—discriminate the H-bond
forming and breaking that distinguishes each of the stable states.
On the other hand, dHB and dCC ensure the alignment and
268 Alberto Pérez de Alba Ortı́z et al.
Fig. 4 (a) WC bp with graphical representations of the CVs: dWC, dHB, and dCC; (b) HG bp with graphical
representations of the CVs: dHG, dHB, and dCC; (c) A16–T9 bp and its two neighboring bps with graphical
representations of the centers of mass involved in the calculation of CVs: dNB, tGB, and tBF
4.3 Stable States Before we can embark on introducing and optimizing a transition
path in CV-space, we need to define the two stable states. To this
end, we perform two 100 ns equilibration simulations of the WC
and HG states with the parmbsc1 force field. The resulting trajec-
tories are stored for analysis using the PLUMED driver (see Note
7). The PLUMED driver is the stand-alone feature of the
PLUMED package, which does not require linking to an MD
code to operate on configurations generated during a simulation,
but rather operates on configurations from an input trajectory. This
tool allows us to compute and print CVs from a pre-existent trajec-
tory (.xtc file). The inclusion of a protein data bank (PDB) file
[72] is required to provide atomic masses to the PLUMED driver.
We execute:
# set units
UNITS LENGTH=A TIME=ps ENERGY=kcal/mol
270 Alberto Pérez de Alba Ortı́z et al.
# define CVs
dWC: DISTANCE ATOMS=495,276
dHG: DISTANCE ATOMS=489,276
dHB: DISTANCE ATOMS=275,492
dCC: DISTANCE ATOMS=264,484
dNB: DISTANCE ATOMS=p1_prime,p2_prime
tGB: TORSION VECTOR1=p2,p1 AXIS=484,486 VECTOR2=p5,p6
tBF: TORSION ATOMS=p1p2,p3,p4,p5
# output
PRINT ARG=dWC,dHG,dHB,dCC,dNB,tGB,tBF STRIDE=10 FILE=COLVAR
Table 1
Average values of the seven CVs in the stable WC and HG states
bp/CV dWC (Å) dHG (Å) dHB (Å) dCC (Å) dNB (Å) tGB (rad) tBF (rad)
WC 2.9 6.4 3.0 10.6 7.9 1.5 0.1
HG 5.9 3.0 3.0 9.0 7.8 1.7 0.0
The Adaptive Path Collective Variable 271
# define path-CV
PATHCV LABEL=pcv ARG=dWC,dHG,dHB,dCC,dNB,tGB,tBF
GENPATH= 0,50,0,2.9,6.4,3.0,10.6,7.9,1.5,-0.1,5.9,
3.0,3.0,9.0,7.8,-1.7,0.0 HALFLIFE=100 PACE=100
# do steered MD
MOVINGRESTRAINT ...
LABEL=steer
ARG=pcv.s
STEP0=50000 AT0=0.00 KAPPA0=2000.0
STEP1=550000 AT1=1.00 KAPPA1=2000.0
... MOVINGRESTRAINT
# output
PRINT ARG=dWC,dHG,dHB,dCC,dNB,tGB,tBF,pcv.s,pcv.z,
steer.pcv.s_cntr,steer.bias,steer.work,steer.
force2,tube.bias STRIDE=10 FILE=COLVAR
1.4 2 12
sampled config. sampled config. sampled config.
1.2 restraint center 1.5
1 10
1
0.8
0.5 8
tBF (rad)
dHG (Å)
0.6 HG WC
0 WC
σ
0.4
−0.5 6
0.2
0 −1
4
HG
−0.2 −1.5
−0.4 −2 2
0 200 400 600 800 1000 1200 −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2 4 6 8 10 12
t (ps) tGB (rad) dWC (Å)
Fig. 5 Left: time evolution of the path progress parameter, σ (gray) and of the target restraint value (purple)
during the steered MD simulation; Center: sampled configurations projected onto the torsion angle CVs, tGB
and tBF (gray); Right: sampled configurations projected onto the distance CVs, dWC and dHG (gray). The stable
states are indicated by crosses
4.5 Metadynamics After the steered MD simulation, we set out for an exploratory
metadynamics run. We use the multiple-walker PMD approach,
using 16 walkers—8 starting in the WC state and 8 in the HG
state—all probing and updating the same path and biasing poten-
tial. Gaussians with a height of 0.05 kcal/mol and a width of 0.05
normalized path units are deposited every 0.5 ps (see Notes 2 and
10). Since the range of the biased variable is known a priori, it is
easy and computation-efficient to use a grid to store the potential.
The Adaptive Path Collective Variable 273
For the initial guess path, we take the same linear interpolation as
used before in the steered MD case, but now add 20 trailing nodes
at the beginning and at the end of the original 50 transition nodes
(see Notes 8 and 11). The path update has a half-life of 1 ps (see
Note 9) and is performed every 0.5 ps (see Note 10). No tube
potential is set, so that the CV-space perpendicular to the path
is sampled freely. Harmonic walls with force constants of 1000
kcal/mol per squared normalized path unit are set on σ to limit
the sampling inside the [0.2,1.2] interval. Similarly, harmonic
walls with a force constant of 1000 kcal/(mol rad2) are set on tGB
to prevent counter-clockwise rotations, which are mapped by the
path as sudden jumps from negative values to values greater than
one. Note that, when putting a wall on a periodic CV, it is necessary
to define a non-periodic instance of it to avoid force artifacts.
We generate 16 plumed.{ID}.dat files that include the pre-
vious CV definitions, followed by:
# define path-CV
PATHCV LABEL=pcv ARG=dWC,dHG,dHB,dCC,dNB,tGB,tBF GEN-
PATH=20,50,20,2.9,6.4,3.0,10.6,7.9,1.5,-0.1,5.9,3.0,3.0,9.0,7.8,-1.7,0.0
FIXED=21,70 HALFLIFE=500 PACE=250 WALKERS_RSTRIDE=250 WALKERS_ID={ID} WALK-
ERS_N=16 WALKERS_DIR=.
# do metadynamics
METAD LABEL=metadyn ARG=pcv.s SIGMA=0.05 HEIGHT=0.05 PACE=250 GRID_MIN=-1.0
GRID_MAX=2.0 WALKERS_MPI
#output
PRINT ARG=dWC,dHG,dHB,dCC,dNB,tGB,tBF,pcv.s,pcv.z,metadyn.bias,s_lwall.
bias,s_uwall.bias,tGB_lwall.bias,tGB_uwall.bias STRIDE=10 FILE=COLVAR
1.4 2 20 1
1.2 1.5 18
0.5
1 1 16
0.8 14 0
0.5
tBF (rad)
tBF (rad)
dHG (Å)
0.6 HG WC 12
0 −0.5
σ
0.4 10
−0.5
0.2 8 WC −1
0 −1 6
−1.5
−0.2 −1.5 4 HG
−0.4 −2 2 −2
0 1000 2000 3000 4000 −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2 4 6 8 10 12 14 16 18
t (ps) tGB (rad) dWC (Å)
Fig. 6 Left: time evolution of the path progress parameter, σ, for each of the 16 PMD walkers (green-pink
colormap); Center: sampled configurations by all walkers projected onto tGB and tBF; Right: sampled
configurations by all walkers projected onto dWC and dHG. The stable states are indicated by crosses
4.6 Converging Starting again from the guess path based on the linear interpolation
a Path between the stable states, we will now aim to converge the path-CV
to a specific mechanistic transition pathway and its corresponding
free energy profile. To do so, we use a tube potential to control the
sampling in the neighborhood of the path. This restraint prevents
bifurcation, such as that seen in Fig. 6, and maintains all walkers
exploring the same valley. The force constant for the tube, after
several trials, is set to 20 kcal/mol per squared normalized path unit
(see Note 3). The number of walkers is decreased to eight (4 starting
in each stable state), and the Gaussian height to 0.04 kcal/mol, as
smaller and more infrequent increases of the bias potential favor the
ability of metadynamics to overwrite and self-heal a free energy
profile [48]. We also reduce to 500 kcal/mol per squared normal-
ized path unit the force constant of the wall potentials on σ that
restrict the sampling to the [0.2,1.2] range. The rest of the
parameters are the same as in Subheading 4.5. Eight parallel runs
are started with different random Maxwell–Boltzmann distributed
velocities.
The Adaptive Path Collective Variable 275
1.4 2 8
sampled config. sampled config.
1.2 1.5 path at t=3500 ps path at t=3500 ps
7
1 WC
1
0.8 6
0.5
tBF (rad)
dHG (Å)
0.6 HG WC
0 5
σ
0.4
−0.5
0.2 4
0 −1 HG
3
−0.2 −1.5
−0.4 −2 2
0 1000 2000 3000 4000 −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2 3 4 5 6 7 8
t (ps) tGB (rad) dWC (Å)
Fig. 7 Left: time evolution of the path progress parameter, σ, for each of the 8 PMD walkers (in colors);
Center: sampled configurations by all walkers projected onto tGB and tBF (gray), and the optimized path
(purple) with trailing nodes (black); Right: sampled configurations by all walkers projected onto dWC and dHG
(gray), and the optimized path (purple) with trailing nodes (black). The stable states are indicated by crosses
node
0 10 20 30 40 50 60 70 80 90
8.0
dWC (Å)
6.0
4.0
2.0
8.0
dHG (Å)
6.0
4.0
2.0
5.0
dHB (Å)
4.0
3.0
2.0
12.0
dCC (Å)
10.0
8.0
12.0
dNB (Å)
10.0
8.0
6.0
2.0
tGB (rad)
0.0
−2.0
0.4
tBF (rad)
0.0
−0.4
−0.4 −0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4
σ
Fig. 8 Sampled configurations by the 8 PMD walkers projected onto each of the
7 CVs and σ (gray), and the optimized path nodes for each CV (in colors)
The Adaptive Path Collective Variable 277
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
1.4
1.2
1
0.8
0 0.6
500 0.4
1000 0.2 σ
1500
2000 0
2500 −0.2
3000
t (ps) 3500
4000 −0.4
Fig. 9 Time evolution of the distance from the path, jjzk st i ðσðzk ÞÞjj, with
respect to the path progress parameter, σ, averaged over the 8 PMD walkers.
The color map on the z-axis shows the average distance from the path,
jjzk st i ðσðzk ÞÞjj, weighted according to the inverse distance to the grid points
278 Alberto Pérez de Alba Ortı́z et al.
# define path-CV
PATHCV LABEL=pcv ARG=dWC,dHG,dHB,dCC,dNB,tGB,tBF
INFILE=PMD_path.input HALFLIFE=-1 PACE=0
# set umbrella
RESTRAINT LABEL=umbrella ARG=pcv.s AT={WINDOW} KAPPA=2000.0
# output
PRINT ARG=dWC,dHG,dHB,dCC,dNB,tGB,tBF,pcv.s,pcv.z,
umbrella.bias,tube.bias STRIDE=10 FILE=COLVAR
20
16
FE (kcal/mol)
12
0
0 0.2 0.4 0.6 0.8 1
σ
Fig. 10 WHAM free energy profile obtained from the umbrella sampling simula-
tions along the WC-to-HG path without base flipping optimized by PMD
4.8 A Posteriori Path As a final illustration of optimizing the adaptive path-CV to find a
Optimization transition mechanism, we will employ the path-CV as an a poster-
iori analysis tool on a pre-existing trajectory (see Note 12). Of
course, this trajectory should contain configurations from at least
one transition between two stable states, for example, from a very
long (or very lucky) brute-force MD or Monte Carlo simulation, or
from an enhanced sampling simulation. Here, we will analyze a
reactive trajectory obtained from a TPS simulation [59], and opti-
mize a path for one WC-to-HG transition involving the base open-
ing to the major groove (described in more detail in Subheading
4.5). We start with a straight path and use the PLUMED driver to
optimize it using the trajectory as input. The driver can compute
and print CVs, and, for the current purpose, also compute the
distance to the path-CV in each trajectory frame to optimize the
path. Typically, a single run of the PLUMED driver over the
trajectory is not enough to converge the path-CV. But by running
the driver several times on the trajectory, while every time providing
the previously optimized path as the initial guess for the next run,
the path-CV can be optimized efficiently in a few iterations.
The plumed.dat file contains the CV definitions and the
following commands:
#!/bin/bash
# do 100 iterations
for (( i=0; i<100; i++ )); do
# run optimization
sed ’s/#RESTART//’ plumed.dat > plumed.input
plumed driver --pdb dna.pdb --mf_xtc traj.xtc --
plumed plumed.input
done
# clean up
mv path.out path.out_$i
rm bck.*
After 100 iterations a good fit is obtained for all CVs. Notice that
this simulation was done without a tube, which explains the fluc-
tuating sampling (Fig. 11).
To obtain the free energy along the optimized path, we use
umbrella sampling, similar as we did before for the mechanism
without base flipping found by PMD. Harmonic window potentials
are placed at every 0.1 normalized path units, from σ ¼ 0 to σ ¼ 1,
with force constants of 2000 kcal/mol per squared normalized path
unit. Additional windows with the same force constant are placed at
σ values of 0.35, 0.65, and 0.95; and with a force constant of 3000
kcal/mol per squared normalized path unit at σ ¼ 0.1.
Figure 12 shows the resulting free energy profile for the base-
flipping transition, which is characterized by additional small bar-
riers close to σ ¼ 0.1 and σ ¼ 0.9. These barriers mark the flipping
of the nucleotide out of, and back into, the confines of the double
helix. In between these steps, the base rotation takes place. In this
mechanism, the top of the central barrier is not located at the
mid-rotation state, but to a stage in which the base starts to
282 Alberto Pérez de Alba Ortı́z et al.
node
0 10 20 30 40 50
20.0
15.0
dWC (Å)
10.0
5.0
20.0
15.0
5.0
20.0
15.0
dHB (Å)
10.0
5.0
16.0
14.0
dCC (Å)
12.0
10.0
8.0
10.0
9.0
dNB (Å)
8.0
7.0
6.0
2.0
tGB (rad)
0.0
−2.0
0.0
tBF (rad)
−0.5
−1.0
−1.5
0 0.2 0.4 0.6 0.8 1
σ
Fig. 11 Sampled configurations by the TPS trajectory projected onto each of the
7 CVs and σ (gray), and the optimized path nodes for each CV (in colors)
The Adaptive Path Collective Variable 283
20
no base−flipping path
base−flipping path
FE (kcal/mol) 16
12
0
0 0.2 0.4 0.6 0.8 1
σ
Fig. 12 WHAM free energy profiles obtained from umbrella sampling simulations along WC-to-HG transition
paths with and without base flipping
5 Notes
Acknowledgements
References
1. Laio A, Parrinello M (2002) Escaping free- 7. Darve E, Pohorille A (2001) Calculating free
energy minima. Proc Natl Acad Sci USA 99 energies using average force. J Chem Phys
(20):12562–12566 115:9169
2. Barducci A, Bussi G, Parrinello M (2008) Well- 8. Carter EA, Ciccotti G, Hynes JT, Kapral R
tempered metadynamics: a smoothly converg- (1989) Constrained reaction coordinate
ing and tunable free-energy method. Phys Rev dynamics for the simulation of rare events.
Lett 100(2):020603 Chem Phys Lett 156:472
3. Bonomi M, Barducci A, Parrinello M (2009) 9. den Otter WK, Briels WJ (1998) The calcula-
Reconstructing the equilibrium Boltzmann tion of free-energy differences by constrained
distribution from well-tempered metady- molecular dynamics simulations. J Chem Phys
namics. J Comput Chem 30:1615–1621 109:4139
4. Grubmüller H, Heymann B, Tavan P (1996) 10. Huber T, Torda A, van Gunsteren W (1994)
Ligand binding: molecular mechanics calcula- Local elevation: a method for improving the
tion of the streptavidin-biotin rupture force. searching properties of molecular dynamics sim-
Science 271(5251):997–999 ulation. J Comput Aided Mol Des 8:695–708
5. Jarzynski C (1997) Nonequilibrium equality 11. Grubmüller H (1995) Predicting slow struc-
for free energy differences. Phys Rev Lett 78 tural transitions in macromolecular systems:
(14):2690 conformational flooding. Phys Rev E 52:2893
6. Torrie GM, Valleau JP (1977) Nonphysical 12. Voter A (1997) Hyperdynamics: accelerated
sampling distributions in Monte Carlo free- molecular dynamics of infrequent events. Phys
energy estimation: umbrella sampling. J Com- Rev Lett 78:3908
put Phys 23(2):187–199
288 Alberto Pérez de Alba Ortı́z et al.
13. Babin V, Roland C, Sagui C (2008) Adaptively 28. Jónsson H, Mills G, Jacobsen KW (1998)
biased molecular dynamics for free energy cal- Nudged elastic band method for finding mini-
culations. J Chem Phys 128:134101 mum energy paths of transitions. In: Berne B,
14. Wang F, Landau DP (2001) Efficient, multiple- Ciccotti G, Coker DF (eds) Classical and quan-
range random walk algorithm to calculate the tum dynamics in condensed phase simulations.
density of states. Phys Rev Lett 86:2050 World Scientific, Singapore, pp 385–404
15. Hansmann UH (1997) Parallel tempering 29. Crooks GE, Chandler D (2001) Efficient tran-
algorithm for conformational studies of sition path sampling for nonequilibrium sto-
biological molecules. Chem Phys Lett 281 chastic dynamics. Phys Rev E 64:026109
(1):140–150 30. Van Erp TS, Moroni D, Bolhuis PG (2003) A
16. Sugita Y, Okamoto Y (1999) Replica-exchange novel path sampling method for the calculation
molecular dynamics method for protein fold- of rate constants. J Chem Phys 118:7762
ing. Chem Phys Lett 314:141–151 31. Faradjian AK, Elber R (2004) Computing time
17. Berg B, Neuhaus T (1992) Multicanonical scales from reaction coordinates by mileston-
ensemble: a new approach to simulate first- ing. J Chem Phys 120:10880
order phase transitions. Phys Rev Lett 68:9–12 32. Allen RJ, Frenkel D, ten Wolde PR (2006)
18. Maragliano L, Vanden-Eijnden E (2006) A Simulating rare events in equilibrium or non-
temperature accelerated method for sampling equilibrium stochastic systems. J Chem Phys
free energy and determining reaction pathways 124:94111
in rare events simulations. Chem Phys Lett 33. Branduardi D, Gervasio FL, Parrinello M
426:168–175 (2007) From A to B in free energy space. J
19. Kirkpatrick S, Gelatt C, Vecchi M (1983) Opti- Chem Phys 126:054103
mization by simulated annealing. Science 34. Pan AC, Sezer D, Roux B (2008) Finding
220:671–680 transition pathways using the string method
20. Sorensen M, Voter A (2000) Temperature- with swarms of trajectories. J Phys Chem B
accelerated dynamics for simulation of infre- 112(11):3432–3440
quent events. J Chem Phys 112:9599–9606 35. Bussi G, Gervasio FL, Laio A, Parrinello M
21. Rosso L, Minary P, Zhu Z, Tuckerman M (2006) Free-energy landscape for beta hairpin
(2002) On the use of the adiabatic molecular folding from combined parallel tempering and
dynamics technique in the calculation of free metadynamics. J Am Chem Soc
energy profiles. J Chem Phys 116:4389–4402 128:13435–13441
22. Dellago C, Bolhuis PG, Csajka FS, Chandler D 36. Piana S, Laio A (2007) A bias-exchange
(1998) Transition path sampling and the calcu- approach to protein folding. J Phys Chem B
lation of rate constants. J Chem Phys 108 111:4553–4559
(5):1964–1977 37. Dı́az Leines G, Ensing B (2012) Path finding
23. Bolhuis PG, Chandler D, Dellago C, Geissler on high-dimensional free energy landscapes.
PL (2002) Transition path sampling: throwing Phys Rev Lett 109(2):020601
ropes over rough mountain passes, in the dark. 38. Gallet GA, Pietrucci F, Andreoni W (2012)
Annu Rev Phys Chem 53:291 Bridging static and dynamical descriptions of
24. Weinan E, Ren W, Vanden-Eijnden E (2002) chemical reactions: an ab initio study of CO2
String method for the study of rare events. interacting with water molecules. J Chem The-
Phys Rev B 66(5):052301 ory Comput 8:4029–4039
25. Weinan E, Ren W, Vanden-Eijnden E (2005) 39. Pietrucci F, Saitta AM (2015) Formamide reac-
Finite temperature string method for the study tion network in gas phase and solution via a
of rare events. J Phys Chem B 109 unified theoretical approach: toward a reconcil-
(14):6688–6693 iation of different prebiotic scenarios. Proc
26. Maragliano L, Fischer A, Vanden-Eijnden E, Natl Acad Sci USA 112:15030–15035
Ciccotti G (2006) String method in collective 40. Chen C (2017) Fast exploration of an optimal
variables: minimum free energy paths and iso- path on the multidimensional free energy sur-
committor surfaces. J Chem Phys 125 face. PLoS One 12(5):e0177740
(2):024106 41. Pérez de Alba Ortı́z A, Tiwari A,
27. Vanden-Eijnden E, Venturoli M (2009) Revi- Puthenkalathil R, Ensing B (2018) Advances
siting the finite temperature string method for in enhanced sampling along adaptive paths of
the calculation of reaction tubes and free ener- collective variables. J Chem Phys 149
gies. J Chem Phys 130(19):194103 (7):072320
42. Tribello GA, Bonomi M, Branduardi D,
Camilloni C, Bussi G (2014) PLUMED 2:
The Adaptive Path Collective Variable 289
new feathers for an old bird. Comput Phys a structure-based survey. Nucleic Acids Res 43
Commun 185(2):604–613 (7):3420–3433
43. Raiteri P, Laio A, Gervasio FL, Micheletti C, 57. Yang C, Kim E, Pak Y (2015) Free energy
Parrinello M (2006) Efficient reconstruction of landscape and transition pathways from Wat-
complex free energy landscapes by multiple son–Crick to Hoogsteen base pairing in free
walkers metadynamics. J Phys Chem B 110 duplex DNA. Nucleic Acids Res 43
(8):3533–3539 (16):7769–7778
44. Grossfield A (2013) WHAM: the weighted his- 58. Chakraborty D, Wales DJ (2017) Energy land-
togram analysis method, version 2.0.9. http:// scape and pathways for transitions between
membrane.urmc.rochester.edu/content/ Watson–Crick and Hoogsteen base pairing in
wham DNA. J Phys Chem Lett 9(1):229–241
45. Ferrario M, Ciccotti G, Binder K (2007) Com- 59. Vreede J, Bolhuis PG, Swenson DW (2016)
puter simulations in condensed matter: from Predicting the mechanism and kinetics of the
materials to chemical biology, vol 1. Springer, Watson-Crick to Hoogsteen base pairing tran-
Berlin sition. Biophys J 110(3):563a–564a
46. Onsager L (1938) Initial recombination of 60. Vreede J, Bolhuis PG, Swenson DW (2017)
ions. Phys Rev 54(8):554 Path sampling simulations of the mechanisms
47. Bolhuis PG, Dellago C, Chandler D (2000) and rates of transitions between Watson-Crick
Reaction coordinates of biomolecular isomeri- and Hoogsteen base pairing in DNA. Biophys J
zation. Proc Natl Acad Sci USA 97 112(3):214a
(11):5877–5882 61. Macke TJ, Case DA (1998) Modeling unusual
48. Ensing B, Laio A, Parrinello M, Klein ML nucleic acid structures. In: Leontes NB, Santa-
(2005) A recipe for the computation of the Lucia J Jr (eds) Molecular modeling of nucleic
free energy barrier and the lowest free energy acids. American Chemical Society, Washington,
path of concerted reactions. J Phys Chem B DC, pp 379–393
109(14):6676–6687 62. Jorgensen WL, Chandrasekhar J, Madura JD,
49. Berendsen HJC, van der Spoel D, van Drunen Impey RW, Klein ML (1983) Comparison of
R (1995) GROMACS: a message-passing par- simple potential functions for simulating liquid
allel molecular dynamics implementation. water. J Chem Phys 79:926–935
Comput Phys Commun 91(1–3):43–56 63. Duan Y, Wu C, Chowdhury S, Lee MC,
50. Williams T, Kelley C et al (2013) Gnuplot 4.6: Xiong G, Zhang W, Yang R, Cieplak P,
an interactive plotting program. http:// Luo R, Lee T, Caldwell J, Wang J, Kollman P
gnuplot.sourceforge.net/ (2003) A point-charge force field for molecular
51. Watson JD, Crick FH et al (1953) Molecular mechanics simulations of proteins based on
structure of nucleic acids. Nature 171 condensed-phase quantum mechanical calcula-
(4356):737–738 tions. J Comput Chem 24:1999–2012
52. Hoogsteen K (1959) The structure of crystals 64. Darden T, York D, Pedersen L (1993) Particle
containing a hydrogen-bonded complex of mesh Ewald: an NLog(N) method for Ewald
1-methylthymine and 9-methyladenine. Acta sums in large systems. J Chem Phys
Crystallogr 12(10):822–823 98:10089–10092
53. Nikolova EN, Kim E, Wise AA, O’Brien PJ, 65. Essmann U, Perera L, Berkowitz ML,
Andricioaei I, Al-Hashimi HM (2011) Tran- Darden T, Lee H, Pedersen LG (1995) A
sient Hoogsteen base pairs in canonical duplex smooth particle mesh Ewald method. J Chem
DNA. Nature 470(7335):498–502 Phys 103:8577–8593
54. Nikolova EN, Zhou H, Gottardo FL, Alvey 66. Bussi G, Donadio D, Parrinello M (2007)
HS, Kimsey IJ, Al-Hashimi HM (2013) A his- Canonical sampling through velocity rescaling.
torical account of Hoogsteen base-pairs in J Chem Phys 126:014101
duplex DNA. Biopolymers 99(12):955–968 67. Parrinello M, Rahman A (1981) Polymorphic
55. Alvey HS, Gottardo FL, Nikolova EN, transitions in single crystals: a new molecular
Al-Hashimi HM (2014) Widespread transient dynamics method. J Appl Phys 52:7182–7190
Hoogsteen base-pairs in canonical duplex 68. Ivani I, Dans PD, Noy A, Pérez A, Faustino I,
DNA with variable energetics. Nat Commun Walther J, Andrio P, Goñi R, Balaceanu A, Por-
5:4786 tella G et al (2016) Parmbsc1: a refined force
56. Zhou H, Hintze BJ, Kimsey IJ, field for DNA simulations. Nat Methods 13
Sathyamoorthy B, Yang S, Richardson JS, (1):55–58
Al-Hashimi HM (2015) New insights into 69. Tiwary P, Berne B (2016) Spectral gap optimi-
Hoogsteen base pairs in DNA duplexes from zation of order parameters for sampling
290 Alberto Pérez de Alba Ortı́z et al.
complex molecular systems. Proc Natl Acad Sci 74. Swenson D, Prinz JH, Noe F, Chodera JD,
USA 113:2839–2844 Bolhuis PG (2019) OpenPathSampling: a
70. Sultan MM, Pande VS (2017) tICA- Python framework for path sampling simula-
metadynamics: accelerating metadynamics by tions. I. Basics. J Chem Theory Comput
using kinetically selected collective variables. J 15:813–836
Chem Theory Comput 13(6):2440–2447 75. Swenson D, Prinz JH, Noe F, Chodera JD,
71. Mendels D, Piccini G, Parrinello M (2018) Bolhuis PG (2019) OpenPathSampling: a
Collective variables from local fluctuations. J Python framework for path sampling simula-
Phys Chem Lett 9(11):2776–2781 tions. II. Building and customizing path
72. Berman HM, Westbrook J, Feng Z, ensembles and sample schemes. J Chem The-
Gilliland G, Bhat TN, Weissig H, Shindyalov ory Comput 15:837–856
IN, Bourne PE (2000) The Protein Data Bank. 76. Pérez de Alba Ortı́z A (2017) PLUMED
Nucleic Acids Res 28:235–242 Wrapper for OpenPathSampling. https://e-
73. Dama JF, Rotskoff G, Parrinello M, Voth GA cam.readthedocs.io/en/latest/Classical-MD-
(2014) Transition-tempered metadynamics: Modules/modules/OpenPathSampling/ops_
robust, convergent metadynamics via on-the- plumed_wrapper/readme.html
fly transition barrier estimation. J Chem The-
ory Comput 10(9):3626–3633