MD Intro 1 PDF
Jarosław Meller, Cornell University, Ithaca, New York, USA; Nicholas Copernicus University, Toruń, Poland

Molecular dynamics is a technique for computer simulation of complex systems, modelled at the atomic level. The equations of motion are solved numerically to follow the time evolution of the system, allowing the derivation of kinetic and thermodynamic properties of interest by means of computer experiments. Biologically important macromolecules and their environments are routinely studied using molecular dynamics simulations.

Article Contents
. Introduction
. Computer Simulation Is a Powerful Research Tool
. Atomic Force Field Model of Molecular Systems
. Molecular Dynamics Algorithm
. Numerical Integration of the Equations of Motion
. Force Calculation and Long-range Interactions
. Molecular Dynamics Is a Statistical Mechanics Method
. Limitations of Molecular Dynamics
. Molecular Modeller Kit

more than two point masses can be solved only approximately. This is what physicists call the many-body problem.

It is intuitively clear that less accurate approximations become inevitable with growing complexity. We can compute a more accurate wave function for the hydrogen molecule than for large molecules such as porphyrins, which occur at the active centres of many important biomolecules. It is also much harder to include explicitly the electrons in the model of a protein, rather than representing the atoms as balls and the bonds as springs. The use of the computer makes less drastic approximations feasible. Thus, bridging experiment and theory by means of computer simulations makes possible testing and improving our models using a more realistic representation of nature. It may also bring new insights into mechanisms and processes that are not directly accessible through experiment.

On the more practical side, computer experiments can be used to discover and design new molecules. Testing properties of a molecule using computer modelling is faster and less expensive than synthesizing and characterizing it in a real experiment. Drug design by computer is commonly used in the pharmaceutical industry.

Atomic Force Field Model of Molecular Systems

The atomic force field model describes physical systems as collections of atoms kept together by interatomic forces. In particular, chemical bonds result from the specific shape of the interactions between atoms that form a molecule. The interaction law is specified by the potential $U(\mathbf{r}_1, \ldots, \mathbf{r}_N)$, which represents the potential energy of N interacting atoms as a function of their positions $\mathbf{r}_i = (x_i, y_i, z_i)$. Given the potential, the force acting upon the ith atom is determined by the negative gradient (vector of first derivatives) with respect to atomic displacements, as shown in eqn [1].

$$
\mathbf{F}_i = -\nabla_{\mathbf{r}_i} U(\mathbf{r}_1, \ldots, \mathbf{r}_N) = -\left( \frac{\partial U}{\partial x_i}, \frac{\partial U}{\partial y_i}, \frac{\partial U}{\partial z_i} \right) \tag{1}
$$

The notion of atoms in molecules is only an approximation of the quantum-mechanical picture, in which molecules are composed of interacting electrons and nuclei. Electrons are to a certain extent delocalized and shared by many nuclei and the resulting electronic cloud determines chemical bonding. It turns out, however, that to a very good approximation, known as the adiabatic (or Born–Oppenheimer) approximation and based on the difference in mass between nuclei and electrons, the electronic and nuclear problems can be separated.

The electron cloud equilibrates quickly for each instantaneous (but quasistatic on the time scale of electron motions) configuration of the heavy nuclei. The nuclei, in turn, move in the field of the averaged electron densities. As a consequence, one may introduce a notion of the potential energy surface, which determines the dynamics of the nuclei without taking explicit account of the electrons. Given the potential energy surface, we may use classical mechanics to follow the dynamics of the nuclei.

Identifying the nuclei with the centres of the atoms and the adiabatic potential energy surface with the implicit interaction law, we obtain a rigorous justification of the intuitive representation of a molecule in terms of interacting atoms. The separation of the electronic and nuclear variables implies also that, rather than solving the quantum electronic problem (which may be in practice infeasible), we may apply an alternative strategy, in which the effect of the electrons on the nuclei is expressed by an empirical potential.

The problem of finding a realistic potential that would adequately mimic the true energy surfaces is nontrivial but it leads to tremendous computational simplifications. Atomic force field models and the classical MD are based on empirical potentials with a specific functional form, representing the physics and chemistry of the systems of interest. The adjustable parameters are chosen such that the empirical potential represents a good fit to the relevant regions of the ab initio Born–Oppenheimer surface, or they may be based on experimental data. A typical force field, used in the simulations of biosystems, takes the form shown in eqn [2].

$$
\begin{aligned}
U(\mathbf{r}_1, \ldots, \mathbf{r}_N) ={} & \sum_{\text{bonds}} \frac{a_i}{2}\,(l_i - l_{i0})^2 + \sum_{\text{angles}} \frac{b_i}{2}\,(\theta_i - \theta_{i0})^2 + \sum_{\text{torsions}} \frac{c_i}{2}\,\bigl[1 + \cos(n\omega_i - \gamma_i)\bigr] \\
& + \sum_{\text{atom pairs}} 4\varepsilon_{ij} \left[ \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6} \right] + \sum_{\text{atom pairs}} k\,\frac{q_i q_j}{r_{ij}}
\end{aligned}
\tag{2}
$$

In the first three terms summation indices run over all the bonds, angles and torsion angles defined by the covalent structure of the system, whereas in the last two terms summation indices run over all the pairs of atoms (or sites occupied by point charges $q_i$), separated by distances $r_{ij} = |\mathbf{r}_i - \mathbf{r}_j|$ and not bonded chemically.

Physically, the first two terms describe energies of deformations of the bond lengths $l_i$ and bond angles $\theta_i$ from their respective equilibrium values $l_{i0}$ and $\theta_{i0}$. The harmonic form of these terms (with force constants $a_i$ and $b_i$) ensures the correct chemical structure, but prevents modelling chemical changes such as bond breaking. The third term describes rotations around the chemical bond, which are characterized by periodic energy terms (with periodicity determined by n and heights of rotational barriers defined by $c_i$). The fourth term describes the van
der Waals repulsive and attractive (dispersion) interatomic forces in the form of the Lennard–Jones 12-6 potential, and the last term is the Coulomb electrostatic potential. Some effects due to specific environments can be accounted for by properly adjusted partial charges $q_i$ (and an effective value of the constant k) as well as the van der Waals parameters $\varepsilon_{ij}$ and $\sigma_{ij}$.

Molecular Dynamics Algorithm

In MD simulations the time evolution of a set of interacting particles is followed via the solution of Newton's equations of motion, eqn [3], where $\mathbf{r}_i(t) = (x_i(t), y_i(t), z_i(t))$ is the position vector of the ith particle, $\mathbf{F}_i$ is the force acting upon the ith particle at time t and $m_i$ is the mass of the particle.

$$
\mathbf{F}_i = m_i \frac{d^2 \mathbf{r}_i(t)}{dt^2} \tag{3}
$$

Particles usually correspond to atoms, although they may represent any distinct entities (e.g. specific chemical groups) that can be conveniently described in terms of a certain interaction law. To integrate the above second-order differential equations the instantaneous forces acting on the particles and their initial positions and velocities need to be specified. Due to the many-body nature of the problem the equations of motion are discretized and solved numerically. The MD trajectories are defined by both position and velocity vectors and they describe the time evolution of the system in phase space. Accordingly, the positions and velocities are propagated with a finite time interval using numerical integrators, for example the Verlet algorithm. The time-dependent position of each particle in space is defined by $\mathbf{r}_i(t)$, whereas the velocities $\mathbf{v}_i(t)$ determine the kinetic energy and temperature in the system. As the particles move their trajectories may be displayed and analysed (Figure 1), providing averaged properties. The dynamic events that may influence the functional properties of the system can be directly traced at the atomic level, making MD especially valuable in molecular biology.

Numerical Integration of the Equations of Motion

The aim of the numerical integration of Newton's equations of motion is to find an expression that defines positions $\mathbf{r}_i(t + \Delta t)$ at time $t + \Delta t$ in terms of the already known positions at time t. Because of its simplicity and stability, the Verlet algorithm is commonly used in MD simulations. The basic formula for this algorithm can be derived from the Taylor expansions for the positions $\mathbf{r}_i(t)$; it reads as in eqn [4].

$$
\mathbf{r}_i(t + \Delta t) \cong 2\mathbf{r}_i(t) - \mathbf{r}_i(t - \Delta t) + \frac{\mathbf{F}_i(t)}{m_i}\,\Delta t^2 \tag{4}
$$

Equation [4] is accurate up to terms of the fourth power in $\Delta t$. Velocities can be calculated from the positions or propagated explicitly as in alternative leapfrog or velocity Verlet schemes.

The exact trajectories correspond to the limit of an infinitesimally small integration step. It is, however, desirable to use larger time steps to sample longer trajectories. In practice $\Delta t$ is determined by fast motions in the system. Bonds involving light atoms (e.g. the O–H bond) vibrate with periods of several femtoseconds, implying that $\Delta t$ should be on a subfemtosecond scale to ensure stability of the integration. Although the fastest, noncrucial vibrations can be eliminated by imposing constraints on the bond length in the integration algorithm, a time step of more than 5 fs can rarely be achieved in simulations of biomolecules.

Force Calculation and Long-range Interactions

Updating the positions and velocities in the stepwise numerical integration procedure requires that the forces acting upon the atoms (which change their relative positions each time frame) be recomputed at each step. Biomolecular force fields include long-range electrostatic and dispersion interactions and a summation of order $N^2$ has to be performed in a straightforward implementation to account for all nonbonded pairs. Therefore the repeated calculation of the forces defines the overall complexity of the MD algorithm and many clever techniques have been developed to deal with the problem of long-range forces.

An aqueous solution is the typical environment for biological macromolecules and it has to be accounted for in realistic simulations, preferably by using an atomically detailed representation of the solvent. Because of limited computer memory and also to speed up the calculations, only a finite sample of an extended (infinite) system can be represented explicitly in a computer model. The treatment of long-range forces is related to the choice of boundary conditions imposed on a system to deal with its finite size and surface effects. The two common approaches are based on either periodic boundary conditions (Figure 2) and the Ewald method for lattice summations or on spherical boundary conditions and the reaction field method. Recent algorithmic developments, for example the so-called particle mesh Ewald method or fast multipole method, allow efficient computation of the long-range interactions without resorting to the crude cut-off approximation, in which
Figure 1 Ligand diffusion pathway through myoglobin mutant (Phe29) as observed in molecular dynamics (MD) simulation (Meller and Elber,
unpublished results). The positions of the carbonmonoxy ligand with respect to the protein, as the ligand escapes from the haem (marked in red) to the
external medium, are recorded and overlapped in order to obtain a suggestive view of the trajectory (carbon monoxide is represented by spheres). Several
alternative diffusion pathways have been reported in MD simulations of myoglobin. The MOIL package (Elber et al., 1994) was used to perform the
simulation and the figure was generated using the MOIL-View program (Simmerling et al., 1995).
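
The Verlet update discussed in the text builds each new position from the two previous positions and the current force, r(t + Δt) = 2r(t) − r(t − Δt) + (F(t)/m)Δt². It can be illustrated on a one-dimensional harmonic oscillator. The sketch below is only an illustration under stated assumptions: the spring constant, mass, time steps and the function name `verlet_trajectory` are hypothetical choices in reduced units and do not come from the article.

```python
import math

# Assumed toy system in reduced units: U(x) = k x^2 / 2, so F(x) = -k x.
K_SPRING = 1.0
MASS = 1.0


def harmonic_force(x):
    """Force of the assumed harmonic spring."""
    return -K_SPRING * x


def verlet_trajectory(x0, v0, force, mass, dt, n_steps):
    """Propagate one coordinate with the position Verlet scheme.

    The first step uses a Taylor expansion to obtain x(dt) from x(0) and
    v(0); afterwards only the two latest positions and the force are needed.
    Returns the list [x(0), x(dt), ..., x(n_steps * dt)].
    """
    xs = [x0, x0 + v0 * dt + 0.5 * force(x0) / mass * dt * dt]
    for _ in range(n_steps - 1):
        x_prev, x_curr = xs[-2], xs[-1]
        xs.append(2.0 * x_curr - x_prev + force(x_curr) / mass * dt * dt)
    return xs


if __name__ == "__main__":
    dt, n_steps = 0.01, 1000
    traj = verlet_trajectory(1.0, 0.0, harmonic_force, MASS, dt, n_steps)
    exact = math.cos(math.sqrt(K_SPRING / MASS) * dt * n_steps)
    print(f"Verlet x(t) = {traj[-1]:.5f}, exact = {exact:.5f}")
```

With a step that is small compared with the oscillation period the numerical trajectory stays close to the analytic cosine. Choosing a step larger than 2/ω (here, larger than 2 in reduced units) makes the same recurrence diverge, which is one way to see why the fastest vibrations in a system dictate Δt.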
contributions of sites separated by distance larger than a certain cut-off are neglected (Sagui and Darden, 1999).

Molecular Dynamics Is a Statistical Mechanics Method

According to statistical mechanics, physical quantities are represented by averages over microscopic states (configurations) of the system, distributed in accord with a certain statistical ensemble. One important example is the microcanonical ensemble, in which only the different states corresponding to a specific energy E have nonzero probability of occurring. Another example is the canonical ensemble, in which the temperature T is constant whereas the energy can be exchanged with the surroundings (thermal bath) and the distribution of states is given by the Boltzmann function.

Newtonian dynamics implies the conservation of energy and MD trajectories provide a set of configurations distributed according to the microcanonical ensemble. Therefore, a physical quantity can be measured by MD simulation by taking an arithmetic average over instantaneous values of that quantity obtained from the trajectories. In the limit of infinite simulation time such averages converge to the true value of the measured thermodynamical properties. In practice the quality of sampling and the accuracy of the interatomic potentials used in simulations are always limited. In fact the quality of sampling
Figure 2 Simulation model of C peptide (Chakrabartty and Baldwin, 1995) in aqueous solution. The peptide (in the native conformation) is put into a box
with about 1400 water molecules and periodic boundary conditions are employed to define the implicit lattice copies of the simulation box. Molecular
dynamics folding simulations of peptides and small proteins are becoming feasible.
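
Periodic boundary conditions of the kind shown in Figure 2 are commonly combined with the minimum image convention, in which each pair interaction is evaluated with the nearest periodic copy of the partner atom. The sketch below uses a plain cut-off and a straightforward sum over all pairs (not the Ewald summation) for a cubic box. The function names, the reduced Lennard–Jones units (ε = σ = 1 by default) and the cut-off value are assumptions made for this example and are not taken from the article.

```python
import math


def minimum_image(delta, box_length):
    """Map a coordinate difference onto the nearest periodic image."""
    return delta - box_length * round(delta / box_length)


def pair_distance(r_i, r_j, box_length):
    """Distance between two atoms in a cubic periodic box."""
    return math.sqrt(sum(minimum_image(a - b, box_length) ** 2
                         for a, b in zip(r_i, r_j)))


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 potential: 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def total_lj_energy(positions, box_length, r_cut):
    """Order N^2 sum over all distinct pairs closer than the cut-off."""
    energy = 0.0
    n = len(positions)
    for i in range(n - 1):
        for j in range(i + 1, n):
            r = pair_distance(positions[i], positions[j], box_length)
            if r < r_cut:
                energy += lennard_jones(r)
    return energy


if __name__ == "__main__":
    # Two atoms near opposite faces of the box are close neighbours
    # once the periodic copies are taken into account.
    box = 10.0
    atoms = [(0.5, 0.0, 0.0), (9.5, 0.0, 0.0)]
    print(pair_distance(atoms[0], atoms[1], box))
    print(total_lj_energy(atoms, box, r_cut=2.5))
```

The double loop makes the quadratic cost of the straightforward pair sum explicit; the cut-off only skips the energy evaluation, not the distance test, so it does not by itself remove the N² scaling.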
may be very poor, especially for processes of time scale larger than typical MD simulations, and caution should be exercised when drawing conclusions from such computer experiments.

Generalizations of MD can be used to sample other statistical ensembles. For example, by introducing a coupling to the thermal bath, we may obtain a set of trajectories representing the canonical ensemble. It is found that the energy is never strictly conserved, even in the microcanonical simulation, because of the accumulating errors in the numerical integration (estimating the fluctuation of energy is a good test for the quality of MD simulations!). Tricks of the trade such as occasional rescaling of velocities may need to be used to adjust the energy, making a thorough equilibration of the system an important issue.

Perceived from the point of view of statistical mechanics, MD is merely a method of conformational sampling that yields average structural and thermodynamical properties. However, sampling many possible configurations can also be used as an optimization method. The (quasi)equilibrium configurations correspond to local minima in the potential energy U (and not the total energy E) and the most stable configuration corresponds to a global minimum of U. With a sufficiently long MD run (and some additional tricks) we may have enough luck to overcome numerous energy barriers and reach the global minimum on the complicated energy surface of a protein.

Limitations of Molecular Dynamics

It is important to be aware of the limitations of MD in order to make reasonable use of it. In addition, addressing current problems gives a flavour of the new frontiers and developments that will define the future of MD simulations.

Quantum effects

Oxygen binding to haemoglobin, catalytic cleavage of the peptide bond by chymotrypsin or the light-induced charge transfer in the photosynthetic reaction centre are well-known examples of biologically important processes. These dynamical events involve quantum effects such as changes in chemical bonding, the presence of important noncovalent intermediates and tunnelling of protons or electrons. Straightforward atomic force field simulations cannot be used to model such phenomena.

The changes in bonding and the existence of intermediates characteristic of enzyme reactions can be accounted for using first-principles MD. However, quantum (or ab initio) MD simulations for all valence electrons are still impractical for large systems. Besides, first-principles MD techniques such as the Car–Parrinello method are based on the ground state density functional theory (DFT) and they are at present restricted to the dynamics of the ground-state adiabatic surfaces. Therefore, various schemes of quantum-mechanical calculations for an embedded quantum subsystem (with the rest of the degrees of freedom treated classically) have been proposed as an alternative approach.

Reliability of the interatomic potentials

The atomic force field defines the physical model of the simulated system. The results of simulations will be realistic only if the potential energy function mimics the forces experienced by the real atoms. On the other hand, the potential should have a simple functional form to speed up the evaluation of the forces. Ideally empirical potentials should also be transferable and applicable to as many systems as possible under different conditions. Designing a good force field is a challenging task!

The standard approach combines experimental data and the results of ab initio calculations on model systems that can be used as building blocks for a macromolecule. For example, structural parameters such as the equilibrium bond lengths can be taken from crystallographic studies and the initial guess of the partial atomic charges can be based on ab initio electronic densities. Further refinement of the parameters is possible through computer simulations of well-characterized systems and comparison of calculated and experimental results. The available force fields, such as AMBER (Cornell et al., 1995), CHARMM (Brooks et al., 1983) or GROMOS (van Gunsteren et al., 1996), have proved to be sufficiently accurate in terms of kinetic and thermodynamic properties derived by means of MD simulations for proteins or nucleic acids. Yet, it is clear that there is vast room for improvement. The coming era of long-time simulations will pose new challenges in terms of stability and accuracy of MD and systematic techniques of improving the quality of the atomic force fields are highly desirable.

Time and size limitations

The time limitation is the most severe problem in MD simulations. Relevant time scales for biologically important processes extend over many orders of magnitude. For example, the nitric oxide (NO) rebinding to myoglobin takes tens of picoseconds, the R to T conformational transition in haemoglobin takes tens of microseconds, whereas protein folding may take minutes. However, the presence of significant fast motions limits the time step in numerical integration to about one femtosecond. Thus, following the allosteric transition in haemoglobin requires tens of billions of steps for a system of about 10 000 atoms! While this may become feasible in the near future, the nanosecond time scale for biosystems comprising several tens of thousands of atoms is the current domain of standard MD simulations. The desired length of simulations also places limits on increasing the size of the problem to tackle the relatively strong interactions of macromolecules with their water and lipid environments. Computationally more demanding evaluation of the forces for large systems implies that each integration step takes a longer time. Many alternative strategies and extensions of MD are being explored to study slow conformational changes and activated processes.

Molecular Modeller Kit

Molecular dynamics is only one of a number of computer methods available to molecular modellers. Global optimization techniques, free energy methods and alternative approaches to conformational analysis enable applications to a much broader range of problems.

Monte Carlo method

Thermodynamic and structural equilibrium properties are static averages independent of the dynamics of the system. Hence, they can be calculated by any (efficient and correct) sampling method. One such method is the widely used Monte Carlo (MC) technique, which was developed even before MD. The core of the MC algorithm is a heuristic prescription for a plausible pattern of changes in the configurations assumed by the system. Such an elementary move depends on the type of problem. In the realm of protein structure it may be, for instance, a rotation around a randomly chosen backbone bond. A long series of random moves is generated with only some of them considered to be good moves. In the standard Metropolis MC a move is accepted unconditionally if the new configuration results in a better (lower) potential energy. Otherwise it is accepted with a probability given by the Boltzmann factor, as in eqn [5], where $\Delta U = U(\mathbf{r}') - U(\mathbf{r})$ denotes the change in the potential energy associated with a move $\mathbf{r} \to \mathbf{r}'$ and $k_B$ is the Boltzmann constant.

$$
P = \exp\left( -\frac{\Delta U}{k_B T} \right) \tag{5}
$$

As a result average properties obtained from the accepted configurations are consistent with the canonical ensemble at temperature T. The advantage of the MC method is its generality and a relatively weak dependence on the dimensionality of the system. However, finding a move that would ensure efficient sampling may be a nontrivial problem.

Besides conformational sampling, MC can also be used as a global optimization method in which we seek a global minimum of the potential energy. Global optimization is an important and difficult problem in protein structure determination. The MC optimization protocol is particularly efficient in conjunction with the so-called simulated annealing method based on the sequential
heating and cooling of the system. Indeed, the chances of accepting energetically unfavourable moves (or in other words, the chances to overcome barriers) can be increased by raising the effective temperature.

Free energy methods

The Helmholtz (canonical ensemble) free energy is defined as $F = E - TS$, where E and S are the energy and entropy of a system, respectively. Relative free energies determine the relative stability of different states of a system whereas free energy barriers determine the rates of transitions between different states. For example, the relative affinities of ligand binding to receptor molecules result from the differences in free energy, in accord with the relation $\log K \propto -\Delta F / k_B T$, where K is the equilibrium constant for association (defined as the relative concentration of liganded and free species under equilibrium conditions). Therefore atomic simulations allowing estimation of the relative free energies are of great importance. The standard techniques for calculating free energy differences include the so-called thermodynamic integration and thermodynamic perturbation methods. Both approaches usually imply a series of MD or MC simulations (for intermediate states) with extensive sampling and they are computationally very intensive. The development of better free energy methods is an active research field fuelled by the interest of the pharmaceutical industry.

... ligand diffusion (Figure 1) have been studied, adding to our understanding of the discrimination mechanism that prevents (dysfunctional) binding of alternative ligands. The thermodynamics of the cooperativity of oxygen binding to haemoglobin units has been studied and changes in the free energy of cooperativity upon crucial point mutations have been quantitatively reproduced (for a review, see Kuczera, 1996).

Folding of peptides and small proteins (Figure 2) has also been investigated extensively. The structure and kinetics of folding of many peptides have been studied using MD simulations in conjunction with free energy and global optimization methods. Unfolding MD simulations, often performed at high temperatures to enhance the rate of unfolding, provide insights into forces that stabilize proteins. Important kinetic intermediates can be identified and subsequently studied in refolding computer experiments (for a recent review, see Brooks, 1998). Initial stages in folding of a 36-residue protein (with explicit representation of water) have been observed directly in a 1-μs MD simulation (Duan and Kollman, 1998). Other important applications of MD to conformational changes in proteins include studies of ion channel proteins such as gramicidin and other transmembrane proteins (for a recent review, see Sansom, 1998) and studies on the ligand-induced or solvent-induced conformational changes such as the hinge bending observed in many enzymes.
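
The Metropolis rule described under "Monte Carlo method" (accept a downhill move unconditionally, otherwise accept it with probability exp(−ΔU/k_BT)) can be sketched for a single coordinate in a harmonic well. This is an illustration only: the potential, the maximum displacement `max_step`, the random seed and the reduced units with k_B = 1 are assumptions chosen for the example, and the function names are hypothetical.

```python
import math
import random


def acceptance_probability(delta_u, kT):
    """Metropolis criterion: 1 for downhill moves, Boltzmann factor otherwise."""
    if delta_u <= 0.0:
        return 1.0
    return math.exp(-delta_u / kT)


def metropolis_sample(potential, x0, kT, max_step, n_steps, seed=0):
    """Generate a chain of configurations from random trial displacements."""
    rng = random.Random(seed)
    x, u = x0, potential(x0)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-max_step, max_step)
        u_new = potential(x_new)
        if rng.random() < acceptance_probability(u_new - u, kT):
            x, u = x_new, u_new
        samples.append(x)  # after a rejected move the old state is counted again
    return samples


def harmonic(x):
    """Assumed toy potential U(x) = x^2 / 2 in reduced units."""
    return 0.5 * x * x


if __name__ == "__main__":
    chain = metropolis_sample(harmonic, 0.0, kT=1.0, max_step=1.5,
                              n_steps=200_000)
    mean_u = sum(harmonic(x) for x in chain) / len(chain)
    print(f"<U> = {mean_u:.3f} (canonical value kT/2 = 0.5)")
```

For this potential the canonical average of U is k_BT/2, so the sampled mean gives a quick consistency check. Raising `kT` increases the acceptance probability of uphill moves, which is the effect that simulated annealing exploits to cross barriers.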