Chapman & Hall/CRC
Mathematical and Computational Biology Series
Computational Biology
A Statistical Mechanics Perspective
Second Edition
Ralf Blossey
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Preface to the Second Edition
I hope that with its modifications, including updated additional notes to the
literature and references, this book will continue to find its readership and
help to introduce its readers to this exciting field of research.
Preface to the First Edition
The readers of this book will indeed find a number of topics in it which are commonly classified as biophysics; an example is the discussion of the electrostatic properties of biomolecules. But the book's ambition is different from being yet another introduction to biophysics. I would therefore like to explain my motivation for the selection of topics I made.
The title of the book establishes a link between computational biology on the
one side, and statistical mechanics on the other. Computational biology is
the name of a new discipline. It is probably fair to say that it is, in fact, an
emerging one, since a new name is not sufficient to establish a new discipline.
Quantitative methods have been applied to biological problems for many years; this has led to a vast number of different subdisciplines of established fields of science: there is mathematical biology, biomathematics, biostatistics, bioinformatics, biophysics, theoretical biology, quantitative biology... this list is certainly not exhaustive.
All these subdisciplines emerged out of the existing fields and developed by
an act of transfer: the use of a method or a mathematical approach within a
new context, its application to new problems. One may expect that at some
point in the future all these different subdisciplines may merge into a common
discipline. For the time being and for the lack of a definite name, let us call
this discipline computational biology.
This book aims to contribute a particular element to this field: the use of statistical mechanics methods for the modelling of the properties of biological systems. Statistical physics is the scientific discipline which was developed in order to understand the properties of matter composed of many particles. Traditionally, it has been applied to non-living matter: gases, liquids, and solids. Meanwhile, it is increasingly applied to what is nowadays called "soft matter".
But there is a second aspect for which these methods can prove important, and this relates to the information content of biological systems. Biology is built on recognition processes: DNA strands have to recognize each other, proteins have to identify DNA binding sites, etc. In bioinformatics, these recognition problems are commonly modelled as pattern recognition problems: this mapping is the basis of the enormous success of the field of modern genomics.
The first part of the book gives a concise introduction to the main concepts of statistical mechanics, both equilibrium and non-equilibrium. The exposition is introductory in the choice of topics addressed, but still mathematically challenging. Whenever possible, I have tried to illustrate the methods with simple examples.
The second part of the book is devoted to biomolecules: DNA, RNA, proteins, and chromatin; i.e., the progression of topics follows more or less what is commonly known as the central dogma of molecular biology. In this part, mostly equilibrium statistical mechanics is needed. The concern here is to understand and model the processes of base-pair recognition and (supra-)molecular structure formation.
The third part of the book is devoted to biological networks. Here, both equilibrium and non-equilibrium concepts introduced in the first part are used. The presentation covers several of the non-equilibrium statistical physics approaches described in Chapter 2 of Part I, and illustrates them on biologically motivated and relevant model systems.
Throughout the book, Exercises and Tasks are scattered, most of them in the
first part. They are intended to motivate the readers to participate actively
in the topics of the book. The distinction between Exercises and Tasks is the
following: Exercises should be done in order to verify that the concept that
was introduced has been understood. Tasks are more ambitious and usually
require either a more involved calculation or an additional idea to obtain the
answer.
A final technical note: the book is complemented by a detailed key word list.
Key words listed are marked in italics throughout the text.
In the course of shaping the idea of this book and writing it, I profited from discussions with many colleagues. I am especially thankful to Arndt Benecke, Dennis Bray, Martin Brinkmann, Luca Cardelli, Enrico Carlon, Avi Halperin, Andreas Hildebrandt, Martin Howard, Oliver Kohlbacher, Hans Meinhardt, Thomas Lengauer, Hans-Peter Lenhof, Annick Lesne, Ralf Metzler, Johan Paulsson, Andrew Phillips, Wilson Poon, Helmut Schiessel, Bernard Vandenbunder, Jean-Marc Victor, Pieter Rein ten Wolde, Edouard Yeramian.
Lille, 2006
Part I

Equilibrium Statistical Mechanics

CHAPTER 1

Equilibrium Statistical Mechanics
After all, you are here, and that is what matters! We are going to do good work together [...]! Before long, the dazzled world will discover the power of the great Z!
This section introduces the basic physical concepts and mathematical quantities needed to describe systems composed of many particles, when only a statistical description remains possible.
set of these state variables x, irrespective of their physical nature, and treat x here as a set of discrete variables;1 we call such a collection of variables characterizing the state of a system a microstate. The possible microstates a system can assume, typically under certain constraints, can be subsumed under the name of a statistical ensemble. One has to distinguish the microstate the system can assume in each of its realisations from the ultimate macrostate that is to be characterized by a macroscopic physical observable after a suitable averaging procedure over the microstates.2
Coming back to the conceptual question: what quantity governs the probability P(x) of each state occurring?

Probability P and information. We call P(x) ≥ 0 a probability; consequently, ∑_{x} P(x) = 1, where the summation is over all possible states. To each probability we can associate an information measure3

s(x) = − ln P(x) . (1.1)
way; we will encounter this later and will pass liberally from one notation to the other.
2 The averaging procedure means that we can determine the macroscopic state by an average over the ensemble of microstates; the resulting ensemble average gives a correct description of the macrostate of the system if the system has had sufficient time in its dynamic evolution to sample all its microstates; this condition is called ergodicity.
3 The choice of the logarithm will become clear in the section on thermodynamics. Note that we do not distinguish in our notation between log and ln; the meaning should be clear from the context.
Consider first the case of equiprobabilities, as we have assumed for our simple DNA example: there is no other constraint. In order to apply Jaynes' concept, we have to maximize S(P) under the only natural constraint of normalization,4 ∑_{x} P(x) = 1. This leads to the variational problem

δ[S + λ(∑_{x} P(x) − 1)] = 0 . (1.3)

Its solution is the equidistribution over the Ω available microstates, P(x) = 1/Ω (1.4), for which the entropy takes the value

S = ln Ω . (1.5)
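This maximization can be checked numerically. The sketch below assumes the Gibbs–Shannon form S(P) = −∑_x P(x) ln P(x) for the entropy (its defining equation falls into a page gap above); the number of states Ω = 5, the starting distribution, and the choice of optimizer are illustrative, not from the text.

```python
import numpy as np
from scipy.optimize import minimize

# Maximize S(P) = -sum_x P(x) ln P(x) subject to normalization,
# by minimizing the negative entropy. Omega = 5 is an arbitrary choice.
Omega = 5

def neg_entropy(P):
    P = np.clip(P, 1e-12, None)   # guard against log(0)
    return float(np.sum(P * np.log(P)))

res = minimize(
    neg_entropy,
    x0=np.array([0.5, 0.2, 0.1, 0.1, 0.1]),   # arbitrary non-uniform start
    method="SLSQP",
    bounds=[(0.0, 1.0)] * Omega,
    constraints=[{"type": "eq", "fun": lambda P: P.sum() - 1.0}],
)

P_opt = res.x
# The maximizer is the equidistribution P(x) = 1/Omega, with S = ln(Omega),
# in agreement with Eqs. (1.4) and (1.5).
print(P_opt)
print(-neg_entropy(P_opt))
```

The equality constraint implements the normalization condition; the Lagrange multiplier of Eq. (1.3) is handled internally by the optimizer.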
4 For those confused by the δ-notation common to variational calculations, the same result can be obtained by ordinary differentiation with respect to the P(x).
The canonical ensemble. We now repeat the above argument and calculation for the so-called canonical ensemble. Here, one additional constraint appears, since we want to characterize the states of the system now by an energy E(x), which we allow to fluctuate with average ⟨E⟩ = E_0.

In this case we must maximize S(P) under two constraints, the normalization condition, and in addition the condition on the average energy

⟨E⟩ = ∑_{x} E(x) P(x) . (1.6)
β^{-1} = k_B T (1.10)

F ≡ −k_B T ln Z . (1.11)
The meaning of this definition will become clear in the following section. Eq. (1.9) is the key quantity in all of the first chapter of this part of the book; we will mostly be concerned with methods for computing it.
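Eq. (1.9) itself falls into a page gap above; assuming it is the canonical partition function in its standard form, Z = ∑_{x} e^{−βE(x)}, the whole chain Z → P → ⟨E⟩ → F can be sketched for a toy system (the four energy levels below are illustrative values, not from the text):

```python
import numpy as np

# Toy canonical ensemble: four discrete microstates with illustrative
# energies, in units where k_B = 1.
E = np.array([0.0, 1.0, 1.0, 2.0])   # energies E(x) of the microstates
beta = 1.0                           # inverse temperature 1/(k_B T)

Z = np.sum(np.exp(-beta * E))        # partition function (assumed Eq. 1.9)
P = np.exp(-beta * E) / Z            # canonical probabilities
E_avg = np.sum(E * P)                # average energy <E>, cf. Eq. (1.6)
F = -np.log(Z) / beta                # free energy F = -k_B T ln Z, Eq. (1.11)

print(Z, E_avg, F)
```

Since F = ⟨E⟩ − TS and S ≥ 0, the free energy always lies below the average energy, which is a quick sanity check on any such computation.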
Task. Determine P_{β,µ} if not only the average energy ⟨E⟩ is taken as a constraint condition, but also the number of particles N is allowed to fluctuate with average ⟨N⟩. The associated Lagrange parameter µ is called the chemical potential.5
dE = dQ + dW (1.12)

where Q is the heat flowing into the system, and W the work done on the system. If a process is adiabatic, dQ is zero.
• Law 2: In a thermally isolated macroscopic system, entropy never de-
creases.
5 The chemical potential often confuses: it is a measure of the availability of particles, and thus plays a role analogous to that of temperature, which provides an energy source.
6 There are exceptions for the equivalence of ensembles, a point we do not pursue here.
And, obviously, for systems in which N is very much smaller than Avogadro’s constant, the
deviations from the thermodynamic limit can be important.
Consider now the first statement, the notion of the extensivity of S. We can express it mathematically by writing S as a first-order homogeneous function of its extensive parameters,

S(λE, λV, λN) = λ S(E, V, N) ,

where λ is a scalar.
7 E is also extensive, but this is actually a subtle point if systems become strongly
interacting. We are not concerned with sophisticated problems of this sort here.
8 The equality is restricted to a set of measure zero.
Exercise. Perform the generalization of the cyclic rule, Eq. (1.23), to more
variables.
dE = (∂E/∂S)_{V,N} dS + (∂E/∂V)_{S,N} dV + (∂E/∂N)_{S,V} dN (1.24)
where the subscripts indicate the variables to be held fixed. We define the
following intensive parameters:
T = (∂E/∂S)_{V,N} (1.25)

is temperature;

P = −(∂E/∂V)_{S,N} (1.26)

is pressure;

µ = (∂E/∂N)_{S,V} (1.27)
is chemical potential; hence
dE = T dS − P dV + µdN . (1.28)
The intensive parameters are all functions of S, V, N, and the functional relationships in Eqs. (1.25)–(1.27) are called equations of state.
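The derivatives in Eqs. (1.25)–(1.27) can be carried out symbolically for a concrete fundamental relation E(S, V, N). The relation used below is a Sackur–Tetrode-like expression for a monatomic ideal gas (with k_B = 1 and all constants absorbed into a prefactor a); it is an illustrative choice, not taken from the text:

```python
import sympy as sp

# Equations of state from an (illustrative) fundamental relation
# E(S, V, N) = a * N^{5/3} * V^{-2/3} * exp(2S/(3N)), k_B = 1.
S, V, N, a = sp.symbols('S V N a', positive=True)
E = a * N**sp.Rational(5, 3) * V**sp.Rational(-2, 3) * sp.exp(2*S/(3*N))

T = sp.diff(E, S)    # temperature, Eq. (1.25)
P = -sp.diff(E, V)   # pressure, Eq. (1.26)

# The familiar monatomic ideal-gas relations follow:
# E = (3/2) N T and the equation of state P V = N T.
print(sp.simplify(E - sp.Rational(3, 2) * N * T))   # 0
print(sp.simplify(P * V - N * T))                   # 0
```

Both differences simplify to zero, i.e. this fundamental relation reproduces E = (3/2) N k_B T and the ideal-gas law as its equations of state.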
E = T S − P V + µN . (1.29)
9 The expressions for E show that T S has the dimension of an energy. Hence the in-
After this formal definition we want to apply this concept. This is best done with simple cases, ignoring the physical context for the moment.
y = f′(x) = df/dx ≡ g(x) . (1.32)
We can understand y = g(x) as a variable transformation from x to y. How
can we express f in terms of y?
For the function f(x) we can write, at each point x along the curve,

f(x) = x f′(x) + b(x) . (1.33)
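A quick numerical illustration, using the common convention f*(y) = max_x [x y − f(x)] for the Legendre transform of a convex function; in this convention the intercept b(x) of Eq. (1.33) equals −f*(y). The quadratic f and the search grid are illustrative choices:

```python
import numpy as np

# Legendre transform of the convex function f(x) = a*x^2/2 (illustrative).
# Analytically: y = f'(x) = a*x, and f*(y) = max_x [x*y - f(x)] = y^2/(2a).
a = 3.0

def f(x):
    return 0.5 * a * x**2

def legendre(y):
    # Brute-force maximization of x*y - f(x) over a fine grid.
    x = np.linspace(-10.0, 10.0, 200001)
    return np.max(x * y - f(x))

y = 2.5
print(legendre(y), y**2 / (2 * a))  # the two values agree
```

The maximum is attained at the point where y = f′(x), which is exactly the variable transformation of Eq. (1.32).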
Exercise. Give the Legendre transform of the harmonic form with the vector x,

u(x) = (1/2) x^T · A · x , (1.36)
We now return to the physical context. For the energy E, the four most
common thermodynamic potentials that result from the application of the
Legendre transform are the
• Helmholtz free energy (T given, canonical ensemble)
• Enthalpy (P given)
dH = T dS + V dP + µdN (1.40)
• Grand canonical potential (T, µ given, grand canonical ensemble)

dΦ = −S dT − P dV − N dµ (1.44)
This ends our brief look into thermodynamics. The message that we want to
retain is that
1.3 COMPUTING Z
The parameter A > 0 controls the width of P(x) and, together with B, the peak position. Introducing the average µ_1 and variance σ² we find

µ_1 = −B/A , σ² = 1/A (1.47)

and can thus write the normalized form of the Gaussian or normal distribution

P(x) = (1/√(2πσ²)) exp(−(x − µ_1)²/(2σ²)) . (1.48)
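Eq. (1.47) can be verified by direct numerical integration. The sketch assumes the unnormalized form P(x) ∝ exp(−A x²/2 − B x) (its defining equation falls into a page gap above, but this form is consistent with Eq. (1.47)); the values of A and B are illustrative:

```python
import numpy as np

# Numerical check of Eq. (1.47) for P(x) ∝ exp(-A*x^2/2 - B*x),
# with illustrative parameter values.
A, B = 2.0, 1.5

x = np.linspace(-20.0, 20.0, 400001)
w = np.exp(-0.5 * A * x**2 - B * x)
w /= w.sum()                      # normalize the weights on the grid

mu1 = np.sum(x * w)               # mean
var = np.sum((x - mu1)**2 * w)    # variance

print(mu1, -B / A)    # both ≈ -0.75
print(var, 1.0 / A)   # both ≈ 0.5
```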
Based on these results we can now easily write down the mean of each of the x_i,

⟨x_i⟩ = − ∑_j (A^{-1})_{ij} B_j . (1.51)

The inverse of A is the covariance matrix. Its diagonal elements are the variances, the off-diagonals are the covariances.
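Both Eq. (1.51) and the covariance statement can be checked by direct summation on a grid, assuming the multivariate form P(x) ∝ exp(−x·A·x/2 − B·x) consistent with the single-variable case; the 2×2 matrix A and the vector B below are illustrative:

```python
import numpy as np

# Grid check of <x_i> = -(A^{-1} B)_i, Eq. (1.51), and cov = A^{-1},
# for P(x) ∝ exp(-x.A.x/2 - B.x) with illustrative A, B.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])        # symmetric, positive definite
B = np.array([1.0, -0.5])

g = np.linspace(-8.0, 8.0, 801)
X, Y = np.meshgrid(g, g, indexing="ij")
quad = 0.5 * (A[0, 0] * X**2 + 2 * A[0, 1] * X * Y + A[1, 1] * Y**2)
w = np.exp(-(quad + B[0] * X + B[1] * Y))
w /= w.sum()                      # normalize the weights on the grid

mean = np.array([np.sum(X * w), np.sum(Y * w)])
cov00 = np.sum((X - mean[0])**2 * w)           # variance of x_0
cov01 = np.sum((X - mean[0]) * (Y - mean[1]) * w)  # covariance

Ainv = np.linalg.inv(A)
print(mean, -Ainv @ B)       # mean equals -A^{-1} B, Eq. (1.51)
print(cov00, Ainv[0, 0])     # variance equals (A^{-1})_{00}
print(cov01, Ainv[0, 1])     # covariance equals (A^{-1})_{01}
```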
where the symbol ⟨...⟩ was used to abbreviate the average over the distribution. Obviously, G(k) is a Fourier transform of P. It exists for real k and obeys

where

µ_n = ⟨x^n⟩ = ∫ dx P(x) x^n . (1.56)
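The connection between G(k) and the moments µ_n of Eq. (1.56) can be illustrated numerically: the moments are recovered from derivatives of G at k = 0, via µ_n = (−i)^n dⁿG/dkⁿ|₀. The standard normal distribution is used below purely as an example:

```python
import numpy as np

# Characteristic function G(k) = <exp(i*k*x)> of the standard normal,
# computed by direct summation over a grid.
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
P = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def G(k):
    return np.sum(np.exp(1j * k * x) * P) * dx

# Second moment mu_2 = -G''(0), via a central second difference at k = 0:
h = 1e-3
mu2 = -((G(h) - 2 * G(0) + G(-h)) / h**2).real
print(mu2)   # ≈ 1, the second moment <x^2> of the standard normal
```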