(solitons, chaos, discrete breathers)
N. Theodorakopoulos
Konstanz, June 2006
Contents
Foreword
1 Background: Hamiltonian mechanics
1.1 Lagrangian formulation of dynamics
1.2 Hamiltonian dynamics
1.2.1 Canonical momenta
1.2.2 Poisson brackets
1.2.3 Equations of motion
1.2.4 Canonical transformations
1.2.5 Point transformations
1.3 Hamilton-Jacobi theory
1.3.1 Hamilton-Jacobi equation
1.3.2 Relationship to action
1.3.3 Conservative systems
1.3.4 Separation of variables
1.3.5 Periodic motion. Action-angle variables
1.3.6 Complete integrability
1.4 Symmetries and conservation laws
1.4.1 Homogeneity of time
1.4.2 Homogeneity of space
1.4.3 Galilei invariance
1.4.4 Isotropy of space (rotational symmetry of Lagrangian)
1.5 Continuum field theories
1.5.1 Lagrangian field theories in 1+1 dimensions
1.5.2 Symmetries and conservation laws
1.6 Perturbations of integrable systems
2 Background: Statistical mechanics
2.1 Scope
2.2 Formulation
2.2.1 Phase space
2.2.2 Liouville’s theorem
2.2.3 Averaging over time
2.2.4 Ensemble averaging
2.2.5 Equivalence of ensembles
2.2.6 Ergodicity
3 The FPU paradox
3.1 The harmonic crystal: dynamics
3.2 The harmonic crystal: thermodynamics
3.3 The FPU numerical experiment
4 The Korteweg-de Vries equation
4.1 Shallow water waves
4.1.1 Background: hydrodynamics
4.1.2 Statement of the problem; boundary conditions
4.1.3 Satisfying the bottom boundary condition
4.1.4 Euler equation at top boundary
4.1.5 A solitary wave
4.1.6 Is the solitary wave a physical solution?
4.2 KdV as a limiting case of anharmonic lattice dynamics
4.3 KdV as a field theory
4.3.1 KdV Lagrangian
4.3.2 Symmetries and conserved quantities
4.3.3 KdV as a Hamiltonian field theory
5 Solving KdV by inverse scattering
5.1 Isospectral property
5.2 Lax pairs
5.3 Inverse scattering transform: the idea
5.4 The inverse scattering transform
5.4.1 The direct problem
5.4.2 Time evolution of scattering data
5.4.3 Reconstructing the potential from scattering data (inverse scattering problem)
5.4.4 IST summary
5.5 Application of the IST: reflectionless potentials
5.5.1 A single bound state
5.5.2 Multiple bound states
5.6 Integrals of motion
5.6.1 Lemma: a useful representation of a(k)
5.6.2 Asymptotic expansions of a(k)
5.6.3 IST as a canonical transformation to action-angle variables
6 Solitons in anharmonic lattice dynamics: the Toda lattice
6.1 The model
6.2 The dual lattice
6.2.1 A pulse soliton
6.3 Complete integrability
6.4 Thermodynamics
7 Chaos in low dimensional systems
7.1 Visualization of simple dynamical systems
7.1.1 Two dimensional phase space
7.1.2 4-dimensional phase space
7.1.3 3-dimensional phase space; non-autonomous systems with one degree of freedom
7.2 Small denominators revisited: KAM theorem
7.3 Chaos in area preserving maps
7.3.1 Twist maps
7.3.2 Local stability properties
7.3.3 Poincaré-Birkhoff theorem
7.3.4 Chaos diagnostics
7.3.5 The standard map
7.3.6 The Arnold cat map
7.3.7 The baker map; Bernoulli shifts
7.3.8 The circle map. Frequency locking
7.4 Topology of chaos: stable and unstable manifolds, homoclinic points
8 Solitons in scalar field theories
8.1 Definitions and notation
8.1.1 Lagrangian, continuum field equations
8.2 Static localized solutions (general KG class)
8.2.1 General properties
8.2.2 Specific potentials
8.2.3 Intrinsic Properties of kinks
8.2.4 Linear stability of kinks
8.3 Special properties of the SG field
8.3.1 The Sine-Gordon breather
8.3.2 Complete Integrability
9 Atoms on substrates: the Frenkel-Kontorova model
9.1 The Commensurate-Incommensurate transition
9.1.1 The continuum approximation
9.1.2 The special case = 0: kinks and antikinks
9.1.3 The general case > 0: the soliton lattice
9.2 Breaking of analyticity
9.2.1 FK ground state as minimizing periodic orbit of the standard map
9.2.2 Small amplitude motion
9.2.3 Free end boundary conditions
9.3 Metastable states: spatial chaos as a model of glassy structure
10 Solitons in magnetic chains
10.1 Introduction
10.2 Classical spin dynamics
10.2.1 Spin Poisson brackets
10.2.2 An alternative representation
10.3 Solitons in ferromagnetic chains
10.3.1 The continuum approximation
10.3.2 The classical, isotropic, ferromagnetic chain
10.3.3 The easy-plane ferromagnetic chain in an external field
10.4 Solitons in antiferromagnets
10.4.1 Continuum dynamics
10.4.2 The isotropic antiferromagnetic chain
10.4.3 Easy axis anisotropy
10.4.4 Easy plane anisotropy
10.4.5 Easy plane anisotropy and symmetry-breaking field
11 Solitons in conducting polymers
11.1 Peierls instability
11.1.1 Electrons decoupled from the lattice
11.1.2 Electron-phonon coupling; dimerization
11.2 Solitons and polarons in (CH)_x
11.2.1 A continuum approximation
11.2.2 Dimerization
11.2.3 The soliton
11.2.4 The polaron
12 Solitons in nonlinear optics
12.1 Background: Interaction of light with matter, Maxwell-Bloch equations
12.1.1 Semiclassical theoretical framework and notation
12.1.2 Dynamics
12.2 Propagation at resonance. Self-induced transparency
12.2.1 Slow modulation of the optical wave
12.2.2 Further simplifications: Self-induced transparency
12.3 Self-focusing off-resonance
12.3.1 Off-resonance limit of the MB equations
12.3.2 Nonlinear terms
12.3.3 Space-time dependence of the modulation: the nonlinear Schrödinger equation
12.3.4 Soliton solutions
13 Solitons in Bose-Einstein Condensates
13.1 The Gross-Pitaevskii equation
13.2 Propagating solutions. Dark solitons
14 Unbinding the double helix
14.1 A nonlinear lattice dynamics approach
14.1.1 Mesoscopic modeling of DNA
14.1.2 Thermodynamics
14.2 Nonlinear structures (domain walls) and DNA melting
14.2.1 Local equilibria
14.2.2 Thermodynamics of domain walls
15 Pulse propagation in nerve cells: the Hodgkin-Huxley model
15.1 Background
15.2 The Hodgkin-Huxley model
15.2.1 The axon membrane as an array of electrical circuit elements
15.2.2 Ion transport via distinct ionic channels
15.2.3 Voltage clamping
15.2.4 Ionic channels controlled by gates
15.2.5 Membrane activation is a threshold phenomenon
15.2.6 A qualitative picture of ion transport during nerve activation
15.2.7 Pulse propagation
16 Localization and transport of energy in proteins: The Davydov soliton
16.1 Background. Model Hamiltonian
16.1.1 Energy storage in C=O stretching modes. Excitonic Hamiltonian
16.1.2 Coupling to lattice vibrations. Analogy to polaron
16.2 Born-Oppenheimer dynamics
16.2.1 Quantum (excitonic) dynamics
16.2.2 Lattice motion
16.2.3 Coupled exciton-phonon dynamics
16.3 The Davydov soliton
16.3.1 The heavy ion limit. Static Solitons
16.3.2 Moving solitons
17 Nonlinear localization in translationally invariant systems: discrete breathers
17.1 The Sievers-Takeno conjecture
17.2 Numerical evidence of localization
17.2.1 Diagnostics of energy localization
17.2.2 Internal dynamics
17.3 Towards exact discrete breathers
A Impurities, disorder and localization
A.1 Definitions
A.1.1 Electrons
A.1.2 Phonons
A.2 A single impurity
A.2.1 An exact result
A.2.2 Numerical results
A.3 Disorder
A.3.1 Electrons in disordered one-dimensional media
A.3.2 Vibrational spectra of one-dimensional disordered lattices
Bibliography
Foreword
The fact that most fundamental laws of physics, notably those of electrodynamics and quantum mechanics, have been formulated in mathematical language as linear partial differential equations has resulted historically in a preferred mode of thought within the physics community: a “linear” theoretical bias. The Fourier decomposition (an admittedly powerful procedure for describing an arbitrary function in terms of sines and cosines, but nonetheless a mathematical tool) has been firmly embedded in the conceptual framework of theoretical physics. Photons, phonons and magnons are prime examples of how successive generations of physicists have learned to describe properties of light, lattice vibrations, or the dynamics of magnetic crystals, respectively, during the last 100 years.
This conceptual bias notwithstanding, engineers or physicists facing specific problems in classical mechanics, hydrodynamics or quantum mechanics were never shy of making particular approximations which led to nonlinear ordinary or partial differential equations. Therefore, by the 1960s, significant expertise had been accumulated in the field of nonlinear differential and/or integral equations; in addition, major breakthroughs had occurred on some fundamental issues related to chaos in classical mechanics (Poincaré, Birkhoff, KAM theorems). Due to the underlying linear bias, however, this substantial progress took unusually long to find its way to the core of physical theory. This changed rapidly with the advent of electronic computation and the new possibilities of numerical visualization which accompanied it. Computer simulations became instrumental in catalyzing the birth of nonlinear science.
This set of lectures does not even attempt to cover all areas where nonlinearity has proved to be of importance in modern physics. I will however try to describe some of the basic concepts mainly from the angle of condensed matter / statistical mechanics, an area which provided an impressive list of nonlinearly governed phenomena over the last fifty years, starting with the Fermi-Pasta-Ulam numerical experiment and its subsequent interpretation by Zabusky and Kruskal in terms of solitons (“paradox turned discovery”, in the words of J. Ford).
There is widespread agreement that both solitons and chaos have achieved the status of theoretical paradigm. The third concept introduced here, localization in the absence of disorder, is a relatively recent breakthrough related to the discovery of intrinsic (nonlinear) localized modes (ILMs), a.k.a. “discrete breathers”.
Since neither the development of the field nor its present state can be described in terms of a unique linear narrative, both the exact choice of topics and the digressions necessary to describe the wider context are to a large extent arbitrary. The latter are however necessary in order to provide a self-contained presentation which will be useful for the non-expert, i.e. typically the advanced undergraduate student with an elementary knowledge of quantum mechanics and statistical physics.
Konstanz, June 2006
1 Background: Hamiltonian mechanics
Consider a mechanical system with s degrees of freedom.
The state of the mechanical system at any instant of time is described by the coordinates $\{Q_i(t),\ i = 1, 2, \dots, s\}$ and the corresponding velocities $\{\dot{Q}_i(t)\}$.
In many applications that I will deal with, this may be a set of N point particles which
are free to move in one spatial dimension. In that particular case s = N and the coordinates
are the particle displacements.
The rules for temporal evolution, i.e. for the determination of particle trajectories, are described in terms of Newton’s law or, more generally, in terms of the Lagrangian and Hamiltonian formulations. The more general formulations are necessary in order to develop and/or make contact with fundamental notions of statistical and/or quantum mechanics.
1.1 Lagrangian formulation of dynamics
The Lagrangian is given as the difference between kinetic and potential energies. For a particle system interacting by velocity-independent forces

$$ L(\{Q_i, \dot{Q}_i\}) = T - V \tag{1.1} $$

$$ T = \frac{1}{2} \sum_{i=1}^{s} m_i \dot{Q}_i^2 , \qquad V = V(\{Q_i\}, t) , $$

where an explicit dependence of the potential energy on time has been allowed. Lagrangian dynamics derives particle trajectories by determining the conditions for which the action integral

$$ S(t, t_0) = \int_{t_0}^{t} d\tau \, L(\{Q_i, \dot{Q}_i\}, \tau) \tag{1.2} $$

has an extremum. The result is

$$ \frac{d}{dt} \frac{\partial L}{\partial \dot{Q}_i} = \frac{\partial L}{\partial Q_i} \tag{1.3} $$

which for Lagrangians of the type (1.1) becomes

$$ m_i \ddot{Q}_i = - \frac{\partial V}{\partial Q_i} \tag{1.4} $$

i.e. Newton’s law.
1.2 Hamiltonian dynamics
1.2.1 Canonical momenta
Hamiltonian mechanics uses, instead of velocities, the canonical momenta conjugate to the coordinates $\{Q_i\}$, defined as

$$ P_i = \frac{\partial L}{\partial \dot{Q}_i} . \tag{1.5} $$
In the case of (1.1) it is straightforward to express the Hamiltonian function (the total energy) $H = T + V$ in terms of $P$’s and $Q$’s. The result is

$$ H(\{P_i, Q_i\}) = \sum_{i=1}^{s} \frac{P_i^2}{2 m_i} + V(\{Q_i\}) . \tag{1.6} $$
1.2.2 Poisson brackets
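Hamiltonian dynamics is described in terms of Poisson brackets

$$ \{A, B\} = \sum_{i=1}^{s} \left( \frac{\partial A}{\partial Q_i} \frac{\partial B}{\partial P_i} - \frac{\partial A}{\partial P_i} \frac{\partial B}{\partial Q_i} \right) \tag{1.7} $$

where $A$, $B$ are any functions of the coordinates and momenta. The momenta are canonically conjugate to the coordinates because they satisfy the relationships

$$ \{Q_i, P_j\} = \delta_{ij} , \qquad \{Q_i, Q_j\} = \{P_i, P_j\} = 0 , $$

which follow directly from the definition (1.7).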
1.2.3 Equations of motion
According to Hamiltonian dynamics, the time evolution of any function $A(\{P_i, Q_i\}, t)$ is determined by the linear differential equations

$$ \dot{A} \equiv \frac{dA}{dt} = \{A, H\} + \frac{\partial A}{\partial t} , \tag{1.8} $$

where the second term denotes any explicit dependence of $A$ on the time $t$. Application of (1.8) to the cases $A = P_i$ and $A = Q_i$ respectively leads to

$$ \dot{P}_i = \{P_i, H\} , \qquad \dot{Q}_i = \{Q_i, H\} , \tag{1.9} $$

which can be shown to be equivalent to (1.4). The time evolution of the Hamiltonian itself is governed by

$$ \frac{dH}{dt} = \frac{\partial H}{\partial t} \left( = \frac{\partial V}{\partial t} \right) . \tag{1.10} $$
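The bracket (1.7) and the equations of motion (1.9) can be checked numerically. The following is a minimal sketch for a single degree of freedom, assuming a harmonic oscillator Hamiltonian with illustrative values for the mass and spring constant; the function names and the finite-difference step `h` are choices made here, not part of the notes.

```python
def poisson_bracket(A, B, q, p, h=1e-5):
    """Central-difference estimate of {A, B} as defined in (1.7), one degree of freedom."""
    dA_dq = (A(q + h, p) - A(q - h, p)) / (2 * h)
    dA_dp = (A(q, p + h) - A(q, p - h)) / (2 * h)
    dB_dq = (B(q + h, p) - B(q - h, p)) / (2 * h)
    dB_dp = (B(q, p + h) - B(q, p - h)) / (2 * h)
    return dA_dq * dB_dp - dA_dp * dB_dq


M, K_SPRING = 2.0, 3.0  # illustrative mass and spring constant (assumptions)


def coordinate(q, p):
    return q


def momentum(q, p):
    return p


def hamiltonian(q, p):
    """Harmonic oscillator: H = p^2 / (2 M) + K_SPRING q^2 / 2."""
    return p * p / (2 * M) + 0.5 * K_SPRING * q * q


q0, p0 = 0.7, -1.3  # arbitrary phase-space point
print(poisson_bracket(coordinate, momentum, q0, p0))     # close to 1
print(poisson_bracket(coordinate, hamiltonian, q0, p0))  # close to p0 / M
print(poisson_bracket(momentum, hamiltonian, q0, p0))    # close to -K_SPRING * q0
```

The last two values reproduce $\dot{Q} = P/m$ and $\dot{P} = -\partial V/\partial Q$, i.e. Newton's law (1.4) for this particular potential.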
1.2.4 Canonical transformations
The Hamiltonian formalism is important because the “symplectic” structure of the equations of motion (from the Greek συμπλέκω, to interlink, here the momenta and coordinate variables) remains invariant under a class of transformations obtained from a suitable generating function (“canonical” transformations). As an example, consider the transformation from old coordinates and momenta $\{P, Q\}$ to new ones $\{p, q\}$ via a generating function $F_1(Q, q, t)$ which depends on old and new coordinates, but not on old and new momenta (NB there are three more forms of generating functions):

$$ P_i = \frac{\partial F_1(q, Q, t)}{\partial Q_i} , \qquad p_i = - \frac{\partial F_1(q, Q, t)}{\partial q_i} , \qquad K = H + \frac{\partial F_1}{\partial t} . \tag{1.11} $$

The new coordinates are obtained by solving the first of the above equations, and the new momenta by introducing the solution in the second. It is straightforward to verify that the dynamics remains form-invariant in the new coordinate system, i.e.

$$ \dot{p}_i = \{p_i, K\} , \qquad \dot{q}_i = \{q_i, K\} \tag{1.12} $$

and

$$ \frac{dK(p, q, t)}{dt} = \frac{\partial K(p, q, t)}{\partial t} . \tag{1.13} $$

Note that if there is no explicit dependence of $F_1$ on time, the new Hamiltonian $K$ is equal to the old $H$.
1.2.5 Point transformations
A special case of canonical transformations are point transformations, generated by

$$ F_2(Q, p, t) = \sum_i f_i(Q, t) \, p_i . \tag{1.14} $$

New coordinates depend only on old coordinates, not on old momenta; in general new momenta depend on both old coordinates and momenta. A special case of point transformations are orthogonal transformations, generated by

$$ F_2(Q, p) = \sum_{i,k} a_{ik} Q_k p_i \tag{1.15} $$

where $a$ is an orthogonal matrix. It follows that

$$ q_i = \sum_k a_{ik} Q_k , \qquad p_i = \sum_k a_{ik} P_k . \tag{1.16} $$

Note that, in the case of orthogonal transformations, coordinates transform among themselves; so do the momenta. Normal mode expansion is an example of (1.16).
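As an illustration of (1.16), the sketch below applies an orthogonal transformation to two identical coupled oscillators and checks that the Hamiltonian, written in the new variables, is a sum of two uncoupled oscillators. The model (on-site springs `K0`, coupling spring `KC`, mass `M`) and all names are assumptions chosen for the example; they do not come from the notes.

```python
import math

M, K0, KC = 1.0, 2.0, 0.5  # illustrative mass, on-site and coupling spring constants


def h_old(Q, P):
    """Two coupled oscillators in the original variables (Q, P)."""
    kinetic = (P[0] ** 2 + P[1] ** 2) / (2 * M)
    potential = 0.5 * K0 * (Q[0] ** 2 + Q[1] ** 2) + 0.5 * KC * (Q[0] - Q[1]) ** 2
    return kinetic + potential


def h_new(q, p):
    """Same system in normal-mode variables (q, p): two uncoupled oscillators."""
    kinetic = (p[0] ** 2 + p[1] ** 2) / (2 * M)
    potential = 0.5 * K0 * q[0] ** 2 + 0.5 * (K0 + 2 * KC) * q[1] ** 2
    return kinetic + potential


# Orthogonal matrix a_ik of (1.15): symmetric and antisymmetric combinations.
r = 1 / math.sqrt(2)
A = [[r, r], [r, -r]]


def transform(a, x):
    """Apply (1.16): y_i = sum_k a_ik x_k, the same rule for coordinates and momenta."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]


Q, P = [0.3, -1.1], [0.8, 0.4]  # arbitrary phase-space point
q, p = transform(A, Q), transform(A, P)
print(h_old(Q, P), h_new(q, p))  # the two values agree up to rounding
```

Because the generating function (1.15) has no explicit time dependence, the new Hamiltonian equals the old one evaluated at the same phase-space point, which is what the comparison confirms.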
1.3 Hamilton-Jacobi theory
1.3.1 Hamilton-Jacobi equation
Hamiltonian dynamics consists of a system of 2N coupled first-order differential equations, which are in general nonlinear. In general, a complete integration would involve 2N constants (e.g. the initial values of coordinates and momenta). Canonical transformations enable us to play the following game [1]: look for a transformation to a new set of canonical coordinates where the new Hamiltonian is zero and hence all new coordinates and momenta are constants of the motion [2]. Let $(p, q)$ be the set of original momenta and coordinates in the equations of the previous section, and $(\alpha, \beta)$ the set of new constant momenta and coordinates generated by the generating function $F_2(q, \alpha, t)$, which depends on the original coordinates and the new momenta. The choice of $K \equiv 0$ in (1.11) means that

$$ \frac{\partial F_2}{\partial t} + H\left( q_1, \dots, q_s ; \frac{\partial F_2}{\partial q_1}, \dots, \frac{\partial F_2}{\partial q_s} ; t \right) = 0 . \tag{1.17} $$

[1] Hamilton-Jacobi theory is not a recipe for integration of the coupled ODEs; nor does it in general lead to a more tractable mathematical problem. However, it provides fresh insight into the general problem, including important links to quantum mechanics and practical applications on how to deal with mechanical perturbations of a known, solved system.
[2] Does this seem like too many constants? We will later explore what independent constants mean in mechanics, but at this stage let us just note that the original mathematical problem of integrating the 2N Hamiltonian equations does indeed involve 2N constants.
Suppose now that you can [miraculously] obtain a solution of the first-order, in general nonlinear, PDE (1.17), $F_2 = S(q, \alpha, t)$. Note that the solution in general involves $s$ constants $\{\alpha_i,\ i = 1, \dots, s\}$. The $(s+1)$st constant involved in the problem is a trivial one, because if $S$ is a solution, so is $S + A$, where $A$ is an arbitrary constant.
It is now possible to use the defining equation of the generating function $F_2$

$$ \beta_i = \frac{\partial S(q, \alpha, t)}{\partial \alpha_i} \tag{1.18} $$

to obtain the new [constant] coordinates $\{\beta_i,\ i = 1, \dots, s\}$; finally, “turning inside out” (1.18) yields the trajectories

$$ q_j = q_j(\alpha, \beta, t) . \tag{1.19} $$

In other words, a solution of the Hamilton-Jacobi equation (1.17) provides a solution of the original dynamical problem.
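The steps (1.17) to (1.19) can be followed explicitly in the simplest possible case. The worked example below assumes a free particle in one dimension, $H = p^2/2m$; this choice is an illustration added here and is not taken from the notes.

```latex
% Free particle, H = p^2 / (2m): an illustrative assumption
\begin{align*}
  &\frac{\partial S}{\partial t}
     + \frac{1}{2m}\left(\frac{\partial S}{\partial q}\right)^{2} = 0
  &&\text{Hamilton-Jacobi equation (1.17)} \\
  &S(q,\alpha,t) = \alpha q - \frac{\alpha^{2}}{2m}\, t
  &&\text{a solution with one constant } \alpha \\
  &\beta = \frac{\partial S}{\partial \alpha} = q - \frac{\alpha}{m}\, t
  &&\text{new constant coordinate (1.18)} \\
  &q(\alpha,\beta,t) = \beta + \frac{\alpha}{m}\, t ,
   \qquad p = \frac{\partial S}{\partial q} = \alpha
  &&\text{trajectory (1.19)}
\end{align*}
```

Here $\alpha$ is the conserved momentum and $\beta$ the initial position, so the two constants are just the initial data of uniform motion.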
1.3.2 Relationship to action
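It can be easily shown that the solution of the Hamilton-Jacobi equation satisfies

$$ \frac{dS}{dt} = L , \tag{1.20} $$

since, along a trajectory, $dS/dt = \partial S/\partial t + \sum_i (\partial S/\partial q_i)\, \dot{q}_i = -H + \sum_i p_i \dot{q}_i$, where (1.17) and $p_i = \partial S/\partial q_i$ have been used. Equivalently,

$$ S(q, \alpha, t) - S(q, \alpha, t_0) = \int_{t_0}^{t} d\tau \, L(q, \dot{q}, \tau) \tag{1.21} $$

where the r.h.s. involves the actual particle trajectories; this shows that the solution of the Hamilton-Jacobi equation is indeed the extremum of the action function used in Lagrangian mechanics.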
1.3.3 Conservative systems
If the Hamiltonian does not depend explicitly on time, it is possible to separate out the time variable, i.e.

$$ S(q, \alpha, t) = W(q, \alpha) - \lambda_0 t \tag{1.22} $$

where now the time-independent function $W(q)$ (Hamilton’s characteristic function) satisfies

$$ H\left( q_1, \dots, q_s ; \frac{\partial W}{\partial q_1}, \dots, \frac{\partial W}{\partial q_s} \right) = \lambda_0 , \tag{1.23} $$

and involves $s - 1$ independent constants; more precisely, the $s$ constants $\alpha_1, \dots, \alpha_s$ depend on $\lambda_0$.
1.3.4 Separation of variables
The previous example separated out the time coordinate from the rest of the variables of the HJ function. Suppose $q_1$ and $\partial W / \partial q_1$ enter the Hamiltonian only in the combination $\phi_1\left( q_1, \partial W / \partial q_1 \right)$. The Ansatz

$$ W = W_1(q_1) + W'(q_2, \dots, q_s) \tag{1.24} $$

in (1.23) yields

$$ H\left( q_2, \dots, q_s ; \frac{\partial W'}{\partial q_2}, \dots, \frac{\partial W'}{\partial q_s} ; \phi_1\left( q_1, \frac{\partial W_1}{\partial q_1} \right) \right) = \lambda_0 ; \tag{1.25} $$

since (1.25) must hold identically for all $q$, we have

$$ \phi_1\left( q_1, \frac{\partial W_1}{\partial q_1} \right) = \lambda_1 , \qquad H\left( q_2, \dots, q_s ; \frac{\partial W'}{\partial q_2}, \dots, \frac{\partial W'}{\partial q_s} ; \lambda_1 \right) = \lambda_0 . \tag{1.26} $$

The process can be applied recursively if the separation condition holds. Note that cyclic coordinates lead to a special case of separability; if $q_1$ is cyclic, then $\phi_1 = \partial W / \partial q_1 = \partial W_1 / \partial q_1$, and hence $W_1(q_1) = \lambda_1 q_1$. This is exactly how the time coordinate separates off in conservative systems (1.23).
Complete separability occurs if we can write Hamilton’s characteristic function, in some set of canonical variables, in the form

$$ W(q, \alpha) = \sum_i W_i(q_i, \alpha_1, \dots, \alpha_s) . \tag{1.27} $$
1.3.5 Periodic motion. Action-angle variables
Consider a completely separable system in the sense of (1.27). The equation

$$ p_i = \frac{\partial S}{\partial q_i} = \frac{\partial W_i(q_i, \alpha_1, \dots, \alpha_s)}{\partial q_i} \tag{1.28} $$

provides the phase space orbit in the subspace $(q_i, p_i)$. Now suppose that the motion in all subspaces $\{(q_i, p_i),\ i = 1, \dots, s\}$ is periodic, not necessarily with the same period. Note that this may mean either a strict periodicity of $p_i$, $q_i$ as a function of time (such as occurs in the bounded motion of a harmonic oscillator), or a motion of the freely rotating pendulum type, where the angle coordinate is physically significant only mod $2\pi$. The action variables are defined as

$$ J_i = \frac{1}{2\pi} \oint p_i \, dq_i = \frac{1}{2\pi} \oint dq_i \, \frac{\partial W_i(q_i, \alpha_1, \dots, \alpha_s)}{\partial q_i} \tag{1.29} $$

and therefore depend only on the integration constants, i.e. they are constants of the motion. If we can “turn inside out” (1.29), we can express $W$ as a function of the $J$’s instead of the $\alpha$’s. Then we can use the function $W$ as a generating function of a canonical transformation to a new set of variables with the $J$’s as new momenta, and new “angle” coordinates

$$ \theta_i = \frac{\partial W}{\partial J_i} = \frac{\partial W_i(q_i, J_1, \dots, J_s)}{\partial J_i} . \tag{1.30} $$
In the new set of canonical variables, Hamilton’s equations of motion are

$$ \dot{J}_i = 0 , \qquad \dot{\theta}_i = \frac{\partial H(J)}{\partial J_i} \equiv \omega_i(J) . \tag{1.31} $$

Note that the Hamiltonian cannot depend on the angle coordinates, since the action coordinates, the $J$’s, are by construction all constants of the motion. In the set of action-angle coordinates, the motion is as trivial as it can get:

$$ J_i = \text{const} , \qquad \theta_i = \omega_i(J) \, t + \text{const} . \tag{1.32} $$
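For a concrete instance of (1.29) to (1.32), consider the one-dimensional harmonic oscillator, which the text mentions as the prototype of bounded periodic motion. The worked example below is a sketch under that assumption; the symbols $m$, $\omega$ and $E$ (mass, frequency, energy) are introduced here for illustration.

```latex
% Harmonic oscillator, H = p^2/(2m) + m \omega^2 q^2 / 2 = E (illustrative assumption)
\begin{align*}
  p(q) &= \pm\sqrt{2mE - m^{2}\omega^{2}q^{2}}
  &&\text{orbit in the } (q,p) \text{ plane: an ellipse} \\
  J &= \frac{1}{2\pi}\oint p\, dq
     = \frac{1}{2\pi}\,\pi\,\sqrt{2mE}\,\sqrt{\frac{2E}{m\omega^{2}}}
     = \frac{E}{\omega}
  &&\text{area of the ellipse divided by } 2\pi \\
  H(J) &= \omega J , \qquad
  \dot{\theta} = \frac{\partial H}{\partial J} = \omega
  &&\text{cf. (1.31)} \\
  q &= \sqrt{\frac{2J}{m\omega}}\,\sin\theta , \qquad
  p = \sqrt{2m\omega J}\,\cos\theta , \qquad
  \theta = \omega t + \text{const}
  &&\text{cf. (1.32)}
\end{align*}
```

Substituting the last line into $H$ gives $\omega J(\cos^{2}\theta + \sin^{2}\theta) = \omega J$, confirming that the Hamiltonian depends on the action alone.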
1.3.6 Complete integrability
A system is called completely integrable in the sense of Liouville if it can be shown to have $s$ independent conserved quantities in involution (this means that their Poisson brackets, taken in pairs, vanish identically). If this is the case, one can always perform a canonical transformation to action-angle variables.
1.4 Symmetries and conservation laws
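A change of coordinates, if it reflects an underlying symmetry of physical laws, will leave the form of the equations of motion invariant. Because Lagrangian dynamics is derived from an action principle, any such infinitesimal change which changes the particle coordinates

$$ q_i \to q_i' = q_i + \epsilon f_i(q, t) , \qquad \dot{q}_i \to \dot{q}_i' = \dot{q}_i + \epsilon \dot{f}_i(q, t) \tag{1.33} $$

and adds a total time derivative to the Lagrangian, i.e.

$$ L' = L + \epsilon \frac{dF}{dt} , \tag{1.34} $$

will leave the equations of motion invariant. On the other hand, the transformed Lagrangian will generally be equal to

$$ \begin{aligned} L'(\{q_i, \dot{q}_i\}) &= L(\{q_i', \dot{q}_i'\}) \\ &= L(\{q_i, \dot{q}_i\}) + \epsilon \sum_{i=1}^{s} \left( \frac{\partial L}{\partial q_i} f_i + \frac{\partial L}{\partial \dot{q}_i} \dot{f}_i \right) \\ &= L(\{q_i, \dot{q}_i\}) + \epsilon \sum_{i=1}^{s} \left( \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_i} \right) f_i + \frac{\partial L}{\partial \dot{q}_i} \dot{f}_i \right) \\ &= L(\{q_i, \dot{q}_i\}) + \epsilon \sum_{i=1}^{s} \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_i} f_i \right) , \end{aligned} $$

where the equations of motion (1.3) have been used in the third line, and therefore the quantity

$$ \sum_{i=1}^{s} \frac{\partial L}{\partial \dot{q}_i} f_i - F \tag{1.35} $$

will be conserved.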
Such underlying symmetries of classical mechanics are:
1.4.1 Homogeneity of time
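$L' = L(t + \epsilon) = L(t) + \epsilon \, dL/dt$, i.e. $F = L$; furthermore, $q_i' = q_i(t + \epsilon) = q_i + \epsilon \dot{q}_i$, i.e. $f_i = \dot{q}_i$. As a result, the quantity

$$ H = \sum_{i=1}^{s} \frac{\partial L}{\partial \dot{q}_i} \dot{q}_i - L \tag{1.36} $$

(Hamiltonian) is conserved.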
1.4.2 Homogeneity of space
The transformation $q_i \to q_i + \epsilon$ (hence $f_i = 1$) leaves the Lagrangian invariant ($F = 0$). The conserved quantity is

$$ P = \sum_{i=1}^{s} \frac{\partial L}{\partial \dot{q}_i} \tag{1.37} $$

(total momentum).
1.4.3 Galilei invariance
The transformation $q_i \to q_i - \epsilon t$ (hence $f_i = -t$) does not generally change the potential energy (if it depends only on relative particle positions). It adds to the kinetic energy a term $-\epsilon P$, i.e. $F = -\sum_i m_i q_i$. The conserved quantity is

$$ \sum_{i=1}^{s} m_i q_i - P t \tag{1.38} $$

(uniform motion of the center of mass).
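The three conserved quantities (1.36) to (1.38) can be monitored in a small simulation. The sketch below assumes two particles on a line coupled by a harmonic spring, so that the potential depends only on the relative position, and integrates the motion with the velocity-Verlet scheme; the model, the parameter values and the function names are illustrative assumptions, not part of the notes.

```python
M1, M2, K_SPRING = 1.0, 3.0, 2.0  # illustrative masses and spring constant


def force_on_1(q1, q2):
    """Force on particle 1 from V = K_SPRING (q1 - q2)^2 / 2; particle 2 feels the opposite."""
    return -K_SPRING * (q1 - q2)


def energy(q1, q2, v1, v2):
    """Total energy T + V, cf. (1.36)."""
    return 0.5 * M1 * v1 ** 2 + 0.5 * M2 * v2 ** 2 + 0.5 * K_SPRING * (q1 - q2) ** 2


def invariants(q1, q2, v1, v2, t):
    """Return (H, P, sum_i m_i q_i - P t), cf. (1.36), (1.37) and (1.38)."""
    total_momentum = M1 * v1 + M2 * v2
    center_of_mass_term = M1 * q1 + M2 * q2 - total_momentum * t
    return energy(q1, q2, v1, v2), total_momentum, center_of_mass_term


def evolve(q1, q2, v1, v2, dt=1e-3, steps=5000):
    """Velocity-Verlet integration; returns the final state and the elapsed time."""
    f = force_on_1(q1, q2)
    for _ in range(steps):
        v1 += 0.5 * dt * f / M1   # half kick
        v2 -= 0.5 * dt * f / M2
        q1 += dt * v1             # drift
        q2 += dt * v2
        f = force_on_1(q1, q2)
        v1 += 0.5 * dt * f / M1   # half kick with the updated force
        v2 -= 0.5 * dt * f / M2
    return q1, q2, v1, v2, steps * dt


start = (0.0, 1.5, 0.4, -0.1)  # q1, q2, v1, v2 at t = 0
before = invariants(*start, 0.0)
after = invariants(*evolve(*start))
print(before)
print(after)
```

With this integrator the total momentum and the center-of-mass quantity are preserved up to rounding, because the two forces cancel exactly at every step, while the energy is conserved only approximately, with an error that shrinks as the time step is reduced.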
1.4.4 Isotropy of space (rotational symmetry of Lagrangian)
Let the position of the $i$th particle in space be represented by the vector coordinate $\vec{q}_i$. Rotation around an axis parallel to the unit vector $\hat{n}$ is represented by the transformation $\vec{q}_i \to \vec{q}_i + \epsilon \vec{f}_i$ where $\vec{f}_i = \hat{n} \times \vec{q}_i$. The change in kinetic energy is

$$ \epsilon \sum_i m_i \, \dot{\vec{q}}_i \cdot \dot{\vec{f}}_i = 0 . $$

If the potential energy is a function of the interparticle distances only, it too remains invariant under a rotation. Since the Lagrangian is invariant, the conserved quantity (1.35) is

$$ \sum_{i=1}^{s} \frac{\partial L}{\partial \dot{\vec{q}}_i} \cdot \vec{f}_i = \sum_{i=1}^{s} m_i \, \dot{\vec{q}}_i \cdot (\hat{n} \times \vec{q}_i) = \hat{n} \cdot \vec{I} , $$

where

$$ \vec{I} = \sum_{i=1}^{s} m_i \, (\vec{q}_i \times \dot{\vec{q}}_i) \tag{1.39} $$

is the total angular momentum.
1.5 Continuum ﬁeld theories
1.5.1 Lagrangian ﬁeld theories in 1+1 dimensions
Given a Lagrangian in 1+1 dimensions,
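\[
L = \int dx\, \mathcal L(\phi, \phi_x, \phi_t) \tag{1.40}
\]
where the Lagrangian density $\mathcal L$ depends only on the field $\phi$ and its first space and time derivatives, the equations of motion can be derived by minimizing the total action
\[
S = \int dt\, dx\, \mathcal L \tag{1.41}
\]
and have the form
\[
\frac{d}{dt}\left( \frac{\partial \mathcal L}{\partial \phi_t} \right)
+ \frac{d}{dx}\left( \frac{\partial \mathcal L}{\partial \phi_x} \right)
- \frac{\partial \mathcal L}{\partial \phi} = 0 . \tag{1.42}
\]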
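As a short illustration of how (1.42) is used, consider a scalar field with a local potential $V(\phi)$. This density is an assumed example chosen for simplicity; it is not one introduced in this section.
\[
\mathcal L = \tfrac12 \phi_t^2 - \tfrac12 \phi_x^2 - V(\phi) ,
\qquad
\frac{\partial \mathcal L}{\partial \phi_t} = \phi_t , \quad
\frac{\partial \mathcal L}{\partial \phi_x} = -\phi_x , \quad
\frac{\partial \mathcal L}{\partial \phi} = -V'(\phi) .
\]
Inserting these into (1.42) gives the nonlinear wave equation
\[
\phi_{tt} - \phi_{xx} + V'(\phi) = 0 .
\]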
1.5.2 Symmetries and conservation laws
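The form (1.42) remains invariant under a transformation which adds to the Lagrangian
density a term of the form
\[
\epsilon\, \partial_\mu J_\mu \tag{1.43}
\]
where the implied summation is over $\mu = 0, 1$, because this adds only surface boundary terms
to the action integral. If the transformation changes the field by $\delta\phi$, and the derivatives by
$\delta\phi_x$, $\delta\phi_t$, the same argument as in discrete systems leads us to conclude that
\[
\frac{\partial \mathcal L}{\partial \phi} \delta\phi
+ \frac{\partial \mathcal L}{\partial \phi_x} \delta\phi_x
+ \frac{\partial \mathcal L}{\partial \phi_t} \delta\phi_t
= \epsilon \left( \frac{dJ_0}{dt} + \frac{dJ_1}{dx} \right) \tag{1.44}
\]
which can be transformed, using the equations of motion, to
\[
\frac{d}{dt}\left( \frac{\partial \mathcal L}{\partial \phi_t} \right) \delta\phi
+ \frac{\partial \mathcal L}{\partial \phi_t} \delta\phi_t
+ \frac{d}{dx}\left( \frac{\partial \mathcal L}{\partial \phi_x} \right) \delta\phi
+ \frac{\partial \mathcal L}{\partial \phi_x} \delta\phi_x
= \epsilon \left( \frac{dJ_0}{dt} + \frac{dJ_1}{dx} \right) \tag{1.45}
\]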
Examples:
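1. homogeneity of space (translational invariance), $x \to x + \epsilon$:
\[
\begin{aligned}
\delta\phi &= \phi(x+\epsilon) - \phi(x) = \epsilon \phi_x \\
\delta\phi_t &= \phi_t(x+\epsilon) - \phi_t(x) = \epsilon \phi_{xt} \\
\delta\phi_x &= \phi_x(x+\epsilon) - \phi_x(x) = \epsilon \phi_{xx} \\
\delta\mathcal L &= \frac{d\mathcal L}{dx} \delta x = \epsilon \frac{d\mathcal L}{dx}
\;\Rightarrow\; J_1 = \mathcal L , \quad J_0 = 0 .
\end{aligned} \tag{1.46}
\]
Eq. (1.45) becomes
\[
\frac{d}{dt}\left( \frac{\partial \mathcal L}{\partial \phi_t} \right) \phi_x
+ \frac{\partial \mathcal L}{\partial \phi_t} \phi_{xt}
+ \frac{d}{dx}\left( \frac{\partial \mathcal L}{\partial \phi_x} \right) \phi_x
+ \frac{\partial \mathcal L}{\partial \phi_x} \phi_{xx}
= \frac{d\mathcal L}{dx} \tag{1.47}
\]
or
\[
\frac{d}{dt}\left( \frac{\partial \mathcal L}{\partial \phi_t} \phi_x \right)
+ \frac{d}{dx}\left( \frac{\partial \mathcal L}{\partial \phi_x} \phi_x - \mathcal L \right) = 0 ; \tag{1.48}
\]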
integrating over all space, this gives
\[
\int dx\, \frac{\partial \mathcal L}{\partial \phi_t} \phi_x \equiv -P \tag{1.49}
\]
i.e. the total momentum is a constant.
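2. homogeneity of time, $t \to t + \epsilon$:
\[
\begin{aligned}
\delta\phi &= \phi(t+\epsilon) - \phi(t) = \epsilon \phi_t \\
\delta\phi_t &= \phi_t(t+\epsilon) - \phi_t(t) = \epsilon \phi_{tt} \\
\delta\phi_x &= \phi_x(t+\epsilon) - \phi_x(t) = \epsilon \phi_{xt} \\
\delta\mathcal L &= \frac{d\mathcal L}{dt} \delta t = \epsilon \frac{d\mathcal L}{dt}
\;\Rightarrow\; J_0 = \mathcal L , \quad J_1 = 0 .
\end{aligned} \tag{1.50}
\]
Eq. (1.45) becomes
\[
\frac{d}{dt}\left( \frac{\partial \mathcal L}{\partial \phi_t} \right) \phi_t
+ \frac{\partial \mathcal L}{\partial \phi_t} \phi_{tt}
+ \frac{d}{dx}\left( \frac{\partial \mathcal L}{\partial \phi_x} \right) \phi_t
+ \frac{\partial \mathcal L}{\partial \phi_x} \phi_{tx}
= \frac{d\mathcal L}{dt} \tag{1.51}
\]
or
\[
\frac{d}{dt}\left( \frac{\partial \mathcal L}{\partial \phi_t} \phi_t - \mathcal L \right)
+ \frac{d}{dx}\left( \frac{\partial \mathcal L}{\partial \phi_x} \phi_t \right) = 0 ; \tag{1.52}
\]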
integrating over all space, this gives
\[
\int dx \left( \frac{\partial \mathcal L}{\partial \phi_t} \phi_t - \mathcal L \right) \equiv H \tag{1.53}
\]
i.e. the total energy is a constant.
3. Lorentz invariance
1.6 Perturbations of integrable systems
Consider a conservative Hamiltonian system $H_0(J)$ which is completely integrable, i.e. it
possesses $s$ independent integrals of motion. Note that I use the action-angle coordinates,
so that $H_0$ is a function of the (conserved) action coordinates $J_j$. The angles $\theta_j$ are cyclic
variables, so they do not appear in $H_0$.
Suppose now that the system is slightly perturbed by a time-independent perturbation
Hamiltonian $\mu H_1$ ($\mu \ll 1$). A sensible question to ask is: what exactly happens to the integrals
of motion? We know of course that the energy of the perturbed system remains constant,
since $H_1$ has been assumed to be time independent. But what exactly happens to the other
$s - 1$ constants of motion?
The question was first addressed by Poincaré in connection with the stability of the
planetary system. He succeeded in showing that there are no analytic invariants of the
perturbed system, i.e. that it is not possible, starting from a constant $\Phi_0$ of the unperturbed
system, to construct quantities
\[
\Phi = \Phi_0(J) + \mu \Phi_1(J, \theta) + \mu^2 \Phi_2(J, \theta) + \cdots , \tag{1.54}
\]
where the $\Phi_n$'s are analytic functions of $J, \theta$, such that
\[
\{\Phi, H\} = 0 \tag{1.55}
\]
holds, i.e. $\Phi$ is a constant of motion of the perturbed system. The proof of Poincaré's
theorem is quite general. The only requirement on the unperturbed Hamiltonian is that it
should have functionally independent frequencies $\omega_j = \partial H_0 / \partial J_j$. Although the proof itself
is lengthy and I will make no attempt to reproduce it, it is fairly straightforward to see
where the problem with analytic invariants lies.
To second order in $\mu$, the requirement (1.55) implies
\[
\begin{aligned}
\{\Phi_0 + \mu \Phi_1 + \mu^2 \Phi_2 ,\; H_0 + \mu H_1\} &= 0 \\
\{\Phi_0, H_0\} + \mu \left( \{\Phi_1, H_0\} + \{\Phi_0, H_1\} \right)
+ \mu^2 \left( \{\Phi_2, H_0\} + \{\Phi_1, H_1\} \right) &= 0 .
\end{aligned}
\]
The coefficients of all powers must vanish. Note that the zeroth order term vanishes by
definition. The higher order terms will do so, provided
\[
\begin{aligned}
\{\Phi_1, H_0\} &= -\{\Phi_0, H_1\} \\
\{\Phi_2, H_0\} &= -\{\Phi_1, H_1\} .
\end{aligned} \tag{1.56}
\]
The process can be continued iteratively to all orders, by requiring
\[
\{\Phi_{n+1}, H_0\} = -\{\Phi_n, H_1\} . \tag{1.57}
\]
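Consider the lowest-order term generated by (1.57), i.e. the case $n = 0$. Writing down the Poisson brackets
gives
\[
\sum_{j=1}^{s} \left( \frac{\partial \Phi_1}{\partial \theta_j} \frac{\partial H_0}{\partial J_j}
- \frac{\partial \Phi_1}{\partial J_j} \frac{\partial H_0}{\partial \theta_j} \right)
= - \sum_{j=1}^{s} \left( \frac{\partial \Phi_0}{\partial \theta_j} \frac{\partial H_1}{\partial J_j}
- \frac{\partial \Phi_0}{\partial J_j} \frac{\partial H_1}{\partial \theta_j} \right) . \tag{1.58}
\]
The second term on the left-hand side and the first term on the right-hand side vanish
because the $\theta$'s are cyclic coordinates in the unperturbed system. The rest can be rewritten
as
\[
\sum_{j=1}^{s} \omega_j(J) \frac{\partial \Phi_1}{\partial \theta_j}
= \sum_{j=1}^{s} \frac{\partial \Phi_0}{\partial J_j} \frac{\partial H_1}{\partial \theta_j} . \tag{1.59}
\]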
For notational simplicity, let me now restrict myself to the case of two degrees of freedom.
The perturbation Hamiltonian can be written as a double Fourier series
\[
H_1 = \sum_{n_1, n_2} A_{n_1, n_2}(J_1, J_2) \cos(n_1 \theta_1 + n_2 \theta_2) . \tag{1.60}
\]
Similarly, one can make a double Fourier series ansatz for $\Phi_1$,
\[
\Phi_1 = \sum_{n_1, n_2} B_{n_1, n_2}(J_1, J_2) \cos(n_1 \theta_1 + n_2 \theta_2) . \tag{1.61}
\]
Now apply (1.59) to the case $\Phi_0(J) = J_1$. Using the double Fourier series and comparing coefficients of $\sin(n_1 \theta_1 + n_2 \theta_2)$ on both sides, I obtain
\[
B^{(J_1)}_{n_1, n_2} = \frac{n_1}{n_1 \omega_1 + n_2 \omega_2} A_{n_1, n_2} , \tag{1.62}
\]
which in principle determines the first-order term in the $\mu$ expansion of the constant of
motion $J_1'$ which should replace $J_1$ in the new system. It is straightforward to show, using
the same process for $J_2$, that the perturbed Hamiltonian can be written in terms of the new
constants $J_j'$ as
\[
H = H_0(J_1', J_2') + O(\mu^2) . \tag{1.63}
\]
Unfortunately, what looks like the beginning of a systematic expansion suffers from a fatal
flaw. If the frequencies are functionally independent, the denominator in (1.62) will in general vanish on a denumerably infinite number of surfaces in phase space. This however means
that $\Phi_1$ cannot be an analytic function of $J_1, J_2$. Analytic invariants are not possible. All
integrals of motion, other than the energy, are irrevocably destroyed by the perturbation.
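The small-denominator problem in (1.62) can be made concrete numerically. The sketch below is only an illustration under assumed values: the frequencies $\omega_1 = 1$, $\omega_2 = \sqrt 2$ and the function name `smallest_denominator` are not taken from the text. It scans integer pairs $(n_1, n_2)$ and reports how small $|n_1 \omega_1 + n_2 \omega_2|$ becomes as the cutoff grows.

```python
import math


def smallest_denominator(omega1, omega2, n_max):
    """Return (min |n1*omega1 + n2*omega2|, (n1, n2)) for 1 <= n1 <= n_max, 0 < |n2| <= n_max."""
    best = (math.inf, (0, 0))
    for n1 in range(1, n_max + 1):            # n1 > 0 suffices: (n1, n2) and (-n1, -n2) are equivalent
        for n2 in range(-n_max, n_max + 1):
            if n2 == 0:
                continue
            d = abs(n1 * omega1 + n2 * omega2)
            if d < best[0]:
                best = (d, (n1, n2))
    return best


if __name__ == "__main__":
    # Assumed, incommensurate frequencies (illustrative only).
    for n_max in (5, 50, 500):
        d, pair = smallest_denominator(1.0, math.sqrt(2.0), n_max)
        print(n_max, pair, d)
```

The minimum keeps shrinking as higher harmonics are admitted, which is why the coefficients $B_{n_1, n_2}$ cannot stay bounded unless the $A_{n_1, n_2}$ decay fast enough.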
2 Background: Statistical mechanics
2.1 Scope
Classical statistical mechanics attempts to establish a systematic connection between the microscopic theory which governs the dynamical motion of individual entities (atoms, molecules,
local magnetic moments on a lattice) and the macroscopically observed behavior of matter.
Microscopic motion is described, depending on the particular scale of the problem, either
by classical or by quantum mechanics. The rules of macroscopically observed behavior under
conditions of thermal equilibrium have been codified in the study of thermodynamics.
Thermodynamics will tell you which processes are macroscopically allowed, and can establish relationships between material properties. In principle, it can reduce everything
(everything which can be observed under varying control parameters, such as temperature, pressure or other external fields) to the “equation of state” which describes one of the relevant
macroscopic observables as a function of the control parameters.
Deriving the form of the equation of state is beyond thermodynamics. It needs a link to
microscopic theory, i.e. to the underlying mechanics of the individual particles. This link
is provided by equilibrium statistical mechanics. A more general theory of nonequilibrium
statistical mechanics is necessary to establish a link between nonequilibrium macroscopic
behavior (e.g. a steady state flow) and microscopic dynamics. Here I will only deal with
equilibrium statistical mechanics.
2.2 Formulation
A statistical description always involves some kind of averaging. Statistical mechanics is
about systematically averaging over hopefully nonessential details. What are these details
and how can we show that they are nonessential? In order to decide this you have to look
first at a system in full detail and then decide what to throw out, and how to go about it
consistently.
2.2.1 Phase space
A Hamiltonian system with $s$ degrees of freedom is fully described at any given time if we
know all coordinates and momenta, i.e. a total of $2s$ quantities ($= 6N$ if we are dealing with
$N$ point particles moving in three-dimensional space). The microscopic state of the system
can be viewed as a point, a vector in $2s$-dimensional space. The dynamical evolution of the
system in time can be viewed as a motion of this point in the $2s$-dimensional space (phase
space). I will use the shorthand notation $\Gamma \equiv (q_i, p_i,\ i = 1, \ldots, s)$ to denote a point in phase
space. More precisely, $\Gamma(t)$ will denote a trajectory in phase space with the initial condition
$\Gamma(t_0) = \Gamma_0$ (footnote 1).
Footnote 1: Note that trajectories in phase space do not cross. A history of a Hamiltonian system is determined by
differential equations which are first-order in time, and is therefore reversible, and hence unique.
2.2.2 Liouville’s theorem
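Consider an element of volume $d\sigma_0$ in phase space; the set of trajectories starting at time
$t_0$ at some point $\Gamma_0 \in d\sigma_0$ leads, at time $t$, to points $\Gamma \in d\sigma$. Liouville's theorem asserts
that $d\sigma = d\sigma_0$ (invariance of phase space volume). The proof consists of showing that the
Jacobi determinant
\[
D(t, t_0) \equiv \frac{\partial(q, p)}{\partial(q_0, p_0)} \tag{2.1}
\]
corresponding to the coordinate transformation $(q_0, p_0) \Rightarrow (q, p)$ is equal to unity. Using
general properties of Jacobians
\[
\frac{\partial(q, p)}{\partial(q_0, p_0)}
= \frac{\partial(q, p)}{\partial(q_0, p)} \, \frac{\partial(q_0, p)}{\partial(q_0, p_0)}
= \left. \frac{\partial(q)}{\partial(q_0)} \right|_{p = \mathrm{const}}
\left. \frac{\partial(p)}{\partial(p_0)} \right|_{q_0 = \mathrm{const}} \tag{2.2}
\]
and
\[
\left. \frac{\partial D(t, t_0)}{\partial t} \right|_{t = t_0}
= \sum_{i=1}^{s} \left. \left( \frac{\partial \dot q_i}{\partial q_i} + \frac{\partial \dot p_i}{\partial p_i} \right) \right|_{t = t_0}
= \sum_{i=1}^{s} \left( \frac{\partial^2 H}{\partial q_i \partial p_i} - \frac{\partial^2 H}{\partial p_i \partial q_i} \right) = 0 , \tag{2.3}
\]
and noting that $D(t_0, t_0) = 1$, it follows that $D(t, t_0) = 1$ at all times.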
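Liouville's theorem can be checked numerically on a small example. The sketch below makes several assumptions that are not part of the text: a pendulum Hamiltonian $H = p^2/2 - \cos q$, a leapfrog integrator standing in for the exact flow (leapfrog is itself an area-preserving map), and a central-difference estimate of the Jacobi determinant (2.1).

```python
import numpy as np


def leapfrog_flow(q, p, dt, n_steps):
    """Advance the assumed pendulum H = p**2/2 - cos(q) with the leapfrog scheme."""
    for _ in range(n_steps):
        p = p - 0.5 * dt * np.sin(q)   # dp/dt = -dH/dq = -sin(q)
        q = q + dt * p                 # dq/dt =  dH/dp = p
        p = p - 0.5 * dt * np.sin(q)
    return q, p


def jacobian_determinant(q0, p0, dt=0.01, n_steps=1000, h=1e-6):
    """Central-difference estimate of D = d(q, p)/d(q0, p0) for the flow above."""
    qa, pa = leapfrog_flow(q0 + h, p0, dt, n_steps)
    qb, pb = leapfrog_flow(q0 - h, p0, dt, n_steps)
    qc, pc = leapfrog_flow(q0, p0 + h, dt, n_steps)
    qd, pd = leapfrog_flow(q0, p0 - h, dt, n_steps)
    jac = np.array([[qa - qb, qc - qd],
                    [pa - pb, pc - pd]]) / (2.0 * h)
    return np.linalg.det(jac)


if __name__ == "__main__":
    print(jacobian_determinant(1.0, 0.5))   # expected to be close to 1
```

The individual entries of the Jacobian matrix change substantially along the trajectory; only their determinant stays fixed.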
2.2.3 Averaging over time
Consider a function $A(\Gamma)$ of all coordinates and momenta. If you want to compute its long-time average under conditions of thermal equilibrium, you need to follow the state of the
system over a long time, record it, evaluate the function $A$ at each instant of time, and take
a suitable average. Following the trajectory of the point in phase space allows us to define
a long-time average
\[
\bar A = \lim_{T \to \infty} \frac{1}{T} \int_0^T dt\, A[\Gamma(t)] . \tag{2.4}
\]
Since the system is followed over infinite time this can then be regarded as a true equilibrium
average. More on this later.
2.2.4 Ensemble averaging
On the other hand, we could consider an ensemble of identically prepared systems and
attempt a series of observations. One system could be in the state $\Gamma_1$, another in the state
$\Gamma_2$. Then perhaps we could determine the distribution of states $\rho(\Gamma)$, i.e. the probability
$\rho(\Gamma)\delta\Gamma$ that the state vector is in the neighborhood $(\Gamma, \Gamma + \delta\Gamma)$. The average of $A$ in this
case would be
\[
\langle A \rangle = \int d\Gamma\, \rho(\Gamma) A(\Gamma) \tag{2.5}
\]
Note that since $\rho$ is a probability distribution, its integral over all phase space should be
normalized to unity:
\[
\int d\Gamma\, \rho(\Gamma) = 1 \tag{2.6}
\]
A distribution in phase space must obey further restrictions. Liouville's theorem states that
if we view the dynamics of a Hamiltonian system as a flow in phase space, elements of
volume are invariant; in other words the fluid is incompressible:
\[
\frac{d}{dt} \rho(\Gamma, t) = \{\rho, H\} + \frac{\partial}{\partial t} \rho(\Gamma, t) = 0 . \tag{2.7}
\]
For a stationary distribution $\rho(\Gamma)$, as one expects to obtain for a system at equilibrium,
\[
\{\rho, H\} = 0 , \tag{2.8}
\]
i.e. $\rho$ can only depend on the energy (footnote 2). This is a very severe restriction on the forms of
allowed distribution functions in phase space. Nonetheless it still allows for any functional
dependence on the energy. A possible choice (Boltzmann) is to assume that any point on
the phase space hypersurface defined by $H(\Gamma) = E$ may occur with equal probability. This
corresponds to
\[
\rho(\Gamma) = \frac{1}{\Omega(E)} \delta\{H(\Gamma) - E\} \tag{2.9}
\]
where
\[
\Omega(E) = \int d\Gamma\, \delta\{H(\Gamma) - E\} \tag{2.10}
\]
is the volume of the hypersurface $H(\Gamma) = E$. This is the microcanonical ensemble. Other
choices are possible, e.g. the canonical (Gibbs) ensemble defined as
\[
\rho(\Gamma) = \frac{1}{Z(\beta)} e^{-\beta H(\Gamma)} \tag{2.11}
\]
where the control parameter $\beta$ can be identified with the inverse temperature and
\[
Z(\beta) = \int d\Gamma\, e^{-\beta H(\Gamma)} \tag{2.12}
\]
is the classical partition function.
2.2.5 Equivalence of ensembles
The choice of ensemble, although it may appear arbitrary, is meant to reflect the actual
experimental situation. For example, the Gibbs ensemble may be “derived” in the sense
that it can be shown to correspond to a small (but still macroscopic) system in contact
with a much larger “reservoir” of energy, which in effect holds the smaller system at a fixed
temperature $T = 1/\beta$. Ensembles must, and to some extent can, be shown to be equivalent,
in the sense that the averages computed using two different ensembles coincide if the control
parameters are appropriately chosen. For example, a microcanonical average of a function
$A(\Gamma)$ over the energy surface $H(\Gamma) = E$ will be equal to the canonical average at a certain
temperature $T$ if we choose $E$ to be equal to the canonical average of the energy at that
temperature, i.e. $\langle A(\Gamma) \rangle^{\mathrm{micro}}_{E} = \langle A(\Gamma) \rangle^{\mathrm{canon}}_{T}$ if $E = \langle H(\Gamma) \rangle^{\mathrm{canon}}_{T}$.
If ensembles can be shown to be equivalent to each other in this sense, we do not need to
perform the actual experiment of waiting and observing the realization of a large number
of identical systems as postulated in the previous section. We can simply use the most
convenient ensemble for the problem at hand as a theoretical tool for calculating averages. In
general one uses the canonical ensemble, which is designed for computing average quantities
as functions of temperature.
Footnote 2: or, in principle, on other conserved quantities; in dealing with large systems it may well be necessary to
account for other macroscopically conserved quantities in defining a proper distribution function.

2.2.6 Ergodicity
The usage of ensemble averages, and therefore of the whole edifice of classical statistical
mechanics, rests on the implicit assumption that they somehow coincide with the more
physical time averages. Since the various ensembles can be shown to be equivalent (cf.
above), it would be sufficient to provide a microscopic foundation for the ensemble most
directly accessible to Hamiltonian dynamics, i.e. the microcanonical ensemble. The ergodic
hypothesis states that
\[
\lim_{T \to \infty} \frac{1}{T} \int_0^T dt\, A[\Gamma(t)]
= \frac{1}{\Omega(E)} \int d\Gamma\, \delta\{H(\Gamma) - E\}\, A(\Gamma) \tag{2.13}
\]
i.e. that time averages and microcanonical averages coincide. This requires that as a point
$\Gamma$ moves around phase space, it spends, on the average, equal times on equal areas of
the energy hypersurface (recall that the phase point must stay on the energy hypersurface
because $H(\Gamma)$ is a constant of the motion). This seems like a strong and rather nonobvious
assertion; Boltzmann had a rough time when he tried to sell it as a plausible basis for the
emerging theory of statistical mechanics.
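The two sides of (2.13) can be compared directly in the simplest possible case. The sketch below assumes a single harmonic oscillator $H = (p^2 + q^2)/2$ and the observable $A = q^2$; neither choice comes from the text. With one degree of freedom the energy surface is a single closed orbit, so ergodicity holds trivially here; the example only illustrates what the two averages are, not a proof for many-body systems.

```python
import numpy as np


def time_average(E, T=2000.0, n=200_001):
    """Left-hand side of (2.13) for A = q**2 along the exact trajectory (unit frequency)."""
    t = np.linspace(0.0, T, n)
    q = np.sqrt(2.0 * E) * np.sin(t)
    return np.mean(q ** 2)


def microcanonical_average(E, n=100_000):
    """Right-hand side of (2.13): uniform measure on the circle p**2 + q**2 = 2E."""
    theta = 2.0 * np.pi * (np.arange(n) + 0.5) / n
    q = np.sqrt(2.0 * E) * np.sin(theta)
    return np.mean(q ** 2)


if __name__ == "__main__":
    E = 1.5
    print(time_average(E), microcanonical_average(E))   # both close to E
```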
One of the reasons why (2.13) appears implausible was a theorem proved by Poincaré which
stated that if a Hamiltonian system is bounded, its trajectory in phase space, although not
allowed to cross itself, will return arbitrarily close to any point already traveled, provided
one waits long enough. Therefore, even statistically improbable microstates may recur. The
catch is that Poincaré recurrence times for rare events in large systems are of order $e^N$ and
may easily exceed the age of the universe [1].
In fact, ergodicity was later shown by Birkhoff to hold if the energy surface cannot be
divided into two invariant regions of nonzero measure (i.e. regions such that the trajectories
in phase space always remain in one of them). The energy surface is then called metrically
indecomposable. One way this decomposition could occur might be if further integrals of
motion are present.
3 The FPU paradox
3.1 The harmonic crystal: dynamics
Consider a chain of $N$ point particles, each of unit mass. Each of the particles is coupled to
its nearest neighbors via a harmonic spring of unit strength; let $Q_i$ be the displacement of
the $i$th particle; the Hamiltonian (1.6) is
\[
H(P, Q) = \frac{1}{2} \sum_{i=1}^{N} P_i^2 + \frac{1}{2} \sum_{i=0}^{N} (Q_{i+1} - Q_i)^2 , \tag{3.1}
\]
where the canonical momenta are $P_i = \dot Q_i$ and the end particles are held fixed, i.e. $Q_0 = Q_{N+1} = 0$ (NB: $N$ degrees of freedom).
The Fourier decomposition
\[
\begin{aligned}
Q_i &= \sqrt{\frac{2}{N+1}} \sum_{\lambda=1}^{N} \sin\left( \frac{i \pi \lambda}{N+1} \right) A_\lambda \\
P_i &= \sqrt{\frac{2}{N+1}} \sum_{\lambda=1}^{N} \sin\left( \frac{i \pi \lambda}{N+1} \right) B_\lambda
\end{aligned} \tag{3.2}
\]
is a canonical transformation (cf. above) to a new set of coordinates and momenta $\{A_\lambda, B_\lambda\}$.
(NB: exercise: check properties, i.e. orthogonality, trigonometric sums, boundary conditions
satisfied). In this new set of coordinates, the Hamiltonian can be written as
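\[
H = \sum_{\lambda=1}^{N} H_\lambda \equiv \frac{1}{2} \sum_{\lambda=1}^{N} \left( B_\lambda^2 + \Omega_\lambda^2 A_\lambda^2 \right) \tag{3.3}
\]
where
\[
\Omega_\lambda^2 = 4 \sin^2\left( \frac{\pi \lambda}{2(N+1)} \right) . \tag{3.4}
\]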
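The exercise suggested above (orthogonality of the sine transform and the spectrum (3.4)) can be checked numerically. The following is a minimal sketch; the function names are illustrative, and the tridiagonal force matrix is simply the quadratic form of the potential energy in (3.1) with fixed ends.

```python
import numpy as np


def force_matrix(N):
    """Matrix K with potential energy (1/2) Q^T K Q for the fixed-end chain (3.1)."""
    return 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)


def sine_transform_matrix(N):
    """Matrix S of the transformation (3.2): Q = S A, with S symmetric."""
    i = np.arange(1, N + 1)
    return np.sqrt(2.0 / (N + 1)) * np.sin(np.pi * np.outer(i, i) / (N + 1))


def analytic_frequencies_squared(N):
    """Right-hand side of (3.4) for lambda = 1..N."""
    lam = np.arange(1, N + 1)
    return 4.0 * np.sin(np.pi * lam / (2.0 * (N + 1))) ** 2


if __name__ == "__main__":
    N = 16
    S, K = sine_transform_matrix(N), force_matrix(N)
    print(np.allclose(S @ S.T, np.eye(N)))                              # orthogonality
    print(np.allclose(S @ K @ S.T, np.diag(analytic_frequencies_squared(N))))
```

Orthogonality of $S$ is what makes the transformation canonical when the same matrix is applied to coordinates and momenta.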
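This is a case of a separable Hamiltonian, where Hamilton-Jacobi theory can be trivially
applied, i.e.
\[
\frac{1}{2} \left( \frac{\partial W_\lambda}{\partial A_\lambda} \right)^2 + \frac{1}{2} \Omega_\lambda^2 A_\lambda^2 = E_\lambda
\qquad \forall\, \lambda = 1, \ldots, N , \tag{3.5}
\]
where each $E_\lambda$ is a constant representing the energy stored in the $\lambda$th normal mode. The
substitution
\[
A_\lambda = \frac{\sqrt{2 E_\lambda}}{\Omega_\lambda} \sin \bar\theta_\lambda \tag{3.6}
\]
transforms (3.5) to
\[
\frac{\partial W_\lambda}{\partial \bar\theta_\lambda} = \frac{2 E_\lambda}{\Omega_\lambda} \cos^2 \bar\theta_\lambda . \tag{3.7}
\]
The corresponding action variable
\[
J_\lambda = \frac{1}{2\pi} \oint B_\lambda\, dA_\lambda = \frac{1}{2\pi} \oint \frac{\partial W_\lambda}{\partial A_\lambda}\, dA_\lambda \tag{3.8}
\]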
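can now be evaluated as
\[
J_\lambda = \frac{1}{2\pi} \frac{2 E_\lambda}{\Omega_\lambda} \int_0^{2\pi} d\bar\theta_\lambda \cos^2 \bar\theta_\lambda = \frac{E_\lambda}{\Omega_\lambda} \tag{3.9}
\]
by integrating over a full cycle of the substitution variable $\bar\theta_\lambda$. The Hamiltonian can be
rewritten in terms of the action variables
\[
H = \sum_\lambda E_\lambda = \sum_\lambda \Omega_\lambda J_\lambda \tag{3.10}
\]
The angle variables conjugate to the action variables can be found from (1.30)
\[
\theta_\lambda = \frac{\partial W_\lambda(A_\lambda, J_\lambda)}{\partial J_\lambda} . \tag{3.11}
\]
It can be shown explicitly that $\theta_\lambda = \bar\theta_\lambda$.
The Hamiltonian equations in action-angle variables are
\[
\begin{aligned}
\dot J_\lambda &= 0 \\
\dot\theta_\lambda &= \frac{\partial H}{\partial J_\lambda} = \Omega_\lambda ,
\end{aligned} \tag{3.12}
\]
i.e. the $\Omega_\lambda$'s are the natural frequencies of the normal modes. Note that we did not need
the explicit form of the solution of the Hamilton-Jacobi equation to derive this.
More explicitly, the time evolution of the normal mode coordinates is
\[
A_\lambda(t) = \left( \frac{2 J_\lambda}{\Omega_\lambda} \right)^{1/2} \sin\left( \Omega_\lambda t + \theta_\lambda^0 \right) , \tag{3.13}
\]
with an analogous expression for the momenta $B_\lambda$.
In the action-angle representation, the $2N$ constants of integration are the $N$ action
variables $\{J_\lambda\}$ and the $N$ initial phases $\{\theta_\lambda^0\}$.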
3.2 The harmonic crystal: thermodynamics
The average energy of the harmonic chain at any given temperature $T$ is given by the
canonical average
\[
\langle H \rangle = \frac{1}{Z} \int d\Gamma\, e^{-H(\Gamma)/T} H(\Gamma) , \tag{3.14}
\]
where $Z$ is the partition function
\[
Z(T) = \int d\Gamma\, e^{-H(\Gamma)/T} . \tag{3.15}
\]
It is possible to transform the integrals in both numerator and denominator of (3.14) to
action-angle coordinates (cf. previous section). Because of the separability property of the
Hamiltonian, the denominator splits into a product over all $N$ normal modes
\[
Z = \prod_{\lambda=1}^{N} Z_\lambda \tag{3.16}
\]
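where
\[
Z_\lambda = \int_0^\infty dJ_\lambda \int_0^{2\pi} d\theta_\lambda\, e^{-\Omega_\lambda J_\lambda / T} = \frac{2\pi T}{\Omega_\lambda} \tag{3.17}
\]
whereas the numerator transforms to a sum of the form
\[
\sum_{\lambda=1}^{N} \Big( \prod_{\lambda' \neq \lambda} Z_{\lambda'} \Big) \mathcal N_\lambda \tag{3.18}
\]
where
\[
\mathcal N_\lambda = \int_0^\infty dJ_\lambda \int_0^{2\pi} d\theta_\lambda\, e^{-\Omega_\lambda J_\lambda / T}\, \Omega_\lambda J_\lambda = \frac{2\pi T^2}{\Omega_\lambda} . \tag{3.19}
\]
It follows that
\[
\langle H \rangle = \sum_{\lambda=1}^{N} \langle E_\lambda \rangle = \sum_{\lambda=1}^{N} \mathcal N_\lambda / Z_\lambda = \sum_{\lambda=1}^{N} T = N T , \tag{3.20}
\]
i.e. the average energy which corresponds to each degree of freedom is equal to $T$
(equipartition property).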
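The single-mode integrals (3.17) and (3.19) are simple enough to verify by direct quadrature. The sketch below uses a midpoint rule on a truncated action range; the cutoff, the grid size and the sample values of $\Omega_\lambda$ and $T$ are arbitrary assumptions made for illustration.

```python
import numpy as np


def mode_average_energy(omega, T, n=200_000, cutoff=40.0):
    """Return (<E>, Z) for one normal mode by numerical integration over the action J."""
    j_max = cutoff * T / omega                   # Boltzmann weight is ~exp(-40) beyond this
    dJ = j_max / n
    J = (np.arange(n) + 0.5) * dJ                # midpoint rule
    w = np.exp(-omega * J / T)
    Z = 2.0 * np.pi * np.sum(w) * dJ             # the angle integral contributes 2*pi
    numerator = 2.0 * np.pi * np.sum(w * omega * J) * dJ
    return numerator / Z, Z


if __name__ == "__main__":
    for omega in (0.1, 0.5, 2.0):
        E_avg, Z = mode_average_energy(omega, T=2.0)
        print(omega, E_avg, Z, 2.0 * np.pi * 2.0 / omega)
```

The average energy comes out independent of the mode frequency, which is the content of the equipartition result (3.20).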
The “statistical mechanics of the harmonic chain” has a fundamental flaw: although
canonical averages are straightforward to obtain, there is obviously no basis for assuming
ergodicity in the presence of $N$ integrals of motion. Now, this might not be a serious
problem if one could argue that a tiny generic perturbation, as might arise from e.g. a small
nonlinearity of the interactions, could drive the system away from complete integrability,
and into an ergodic regime. If this turned out to be the case, one could still argue that the
computed canonical averages reflect the intrinsic thermodynamic properties of the harmonic
chain, in the “programmatic” sense of statistical mechanics. Fermi, Pasta and Ulam decided
to put this implicit assumption to a numerical test.
3.3 The FPU numerical experiment
Fermi, Pasta and Ulam (FPU [2]) investigated the Hamiltonian
\[
H(P, Q) = \frac{1}{2} \sum_{i=1}^{N-1} P_i^2 + \frac{1}{2} \sum_{i=0}^{N-1} (Q_{i+1} - Q_i)^2 + \frac{\alpha}{3} \sum_{i=0}^{N-1} (Q_{i+1} - Q_i)^3 , \tag{3.21}
\]
where the canonical momenta are $P_i = \dot Q_i$ and the end particles are held fixed, i.e. $Q_0 = Q_N = 0$. Their work, undertaken as a suitable “test” problem for one of the very first
electronic computers, the Los Alamos “MANIAC”, is considered to be the first numerical experiment. In other words, it is the first case where physicists observed and analyzed the
numerical output of Newton's equations, rather than the properties of a mechanical system
described by these same equations.
The dynamics of the Hamiltonian (3.21) was studied as an initial value problem; the initial
configuration was a half-sine wave $Q_i = \sin(i\pi/N)$, with $N = 32$ and all particles at rest;
the nonlinearity parameter was chosen as $\alpha = 1/4$. Energy was thus pumped at the lowest
Fourier mode, $\lambda = 1$, in the notation of (). The objective of the experiment was to study
the energies stored in the first few Fourier modes, i.e. the quantities
\[
H_\lambda \equiv \frac{1}{2} \left( \dot A_\lambda^2 + \Omega_\lambda^2 A_\lambda^2 \right) \tag{3.22}
\]
where
\[
A_\lambda = \sqrt{\frac{2}{N}} \sum_{i=1}^{N} \sin\left( \frac{i \pi \lambda}{N} \right) Q_i \tag{3.23}
\]
as a function of time, i.e. to test the onset of equipartition. Note that the decomposition of
the total energy in Fourier modes is not exact, but as long as $\alpha$ stays small, $H \approx \sum_\lambda H_\lambda$
will hold.

Figure 3.1: The quantity plotted is the energy (kinetic plus potential) in each of the first four
modes. The time is given in thousands of computational cycles. Each cycle is $1/(2\sqrt{2})$
of the natural time unit. The initial form of the string was a single sine wave (mode
1). The energy of the higher modes never exceeded 6% of the total (from [2]).
Fig. 3.1 shows the time dependence of the energies of the ﬁrst four modes. After an initial
redistribution, all of the energy (within 3%) returns to the lowest mode. The energy residing
in higher modes never exceeded 6 % of the total. Longer numerical studies have shown the
return of the energy to the initial mode to be a periodic phenomenon; the period is about
157 times the period of the lowest mode. The phenomenon is known as FPU recurrence.
The results of a more recent numerical study on FPU recurrence[3] are summarized in
Fig. 3.2.
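The setup described above is small enough to reproduce on any modern machine. The following is a minimal sketch, not the original FPU code: it assumes a velocity-Verlet integrator and a time step `dt = 0.05`, and it uses $\Omega_\lambda = 2\sin(\pi\lambda/2N)$, i.e. the analogue of (3.4) for the boundary condition $Q_0 = Q_N = 0$ of (3.21). Function names are illustrative.

```python
import numpy as np


def fpu_force(q, alpha):
    """Force on the N-1 moving particles of (3.21); the ends Q_0 = Q_N = 0 are fixed."""
    full = np.concatenate(([0.0], q, [0.0]))
    d = np.diff(full)                  # bond stretches Q_{i+1} - Q_i, i = 0..N-1
    tension = d + alpha * d ** 2       # derivative of the bond potential
    return tension[1:] - tension[:-1]


def total_energy(q, p, alpha):
    d = np.diff(np.concatenate(([0.0], q, [0.0])))
    return 0.5 * np.sum(p ** 2) + 0.5 * np.sum(d ** 2) + alpha / 3.0 * np.sum(d ** 3)


def mode_energies(q, p, n_modes):
    """Harmonic mode energies (3.22), with A_lambda from (3.23)."""
    N = len(q) + 1
    i = np.arange(1, N)
    lam = np.arange(1, n_modes + 1)
    S = np.sqrt(2.0 / N) * np.sin(np.pi * np.outer(lam, i) / N)
    omega = 2.0 * np.sin(np.pi * lam / (2.0 * N))
    return 0.5 * ((S @ p) ** 2 + omega ** 2 * (S @ q) ** 2)


def simulate(N=32, alpha=0.25, dt=0.05, n_steps=2000):
    """Velocity-Verlet integration starting from the half-sine wave, particles at rest."""
    q = np.sin(np.pi * np.arange(1, N) / N)
    p = np.zeros(N - 1)
    f = fpu_force(q, alpha)
    for _ in range(n_steps):
        p += 0.5 * dt * f
        q += dt * p
        f = fpu_force(q, alpha)
        p += 0.5 * dt * f
    return q, p


if __name__ == "__main__":
    q, p = simulate(n_steps=20000)
    print(mode_energies(q, p, 4), total_energy(q, p, 0.25))
```

Recording `mode_energies` at regular intervals over a much longer run is what would be needed to look for the recurrence; the run length required is left to the reader to explore.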
The Hamiltonian (3.21) is fairly generic. In fact, the original FPU paper describes a further study with quartic, rather than cubic, anharmonicities which exhibits similar behavior.
FPU recurrence has been shown to be a robust phenomenon. The upshot of those exhaustive
numerical observations is that anharmonic corrections to the Hamiltonian, contrary to the
original expectation which held them as agents that might help establish ergodicity, actually
appear to generate new forms of approximately periodic behavior. The process of understanding the source of this behavior (also known as the FPU paradox) and relating it to
other manifestations of nonlinearity [4] has led to a profound change in theoretical physics.

Figure 3.2: FPU recurrence time, divided by $N^3$, vs a scaling variable $R = \alpha (E/N)^{1/2} N^2$, where
$E/N \approx [\pi B/(2N)]^2$ is the energy density. Typical values used by FPU correspond to
R 1. The asymptotic regime is well described by the relationship $T_r / N^3 = R^{-1/2}$
(from Ref. [3]).
4 The Korteweg-de Vries equation
4.1 Shallow water waves
Original context: wave motion in shallow channels, cf. Scott Russell (footnote 1).
Mathematical description due to Korteweg and de Vries (KdV [6]). The equation arises in a
wide variety of physical contexts (e.g. plasma physics, anharmonic lattice theory). Hence it
counts as one of the “canonical” soliton equations.
Long waves (typical length $l$) in a shallow channel of depth $h$, $l \gg h$.
Small-amplitude ($\ll h$) waves (weak nonlinearity).
Two-dimensional fluid flow (motion in lateral dimension of channel neglected).
$x$: horizontal direction, $y$: vertical direction.
4.1.1 Background: hydrodynamics
Fluid velocity
\[
\mathbf V \equiv u \hat{\mathbf x} + v \hat{\mathbf y} \tag{4.1}
\]
Equations of (Eulerian) incompressible fluid dynamics
• continuity equation
\[
\nabla \cdot \mathbf V = 0 \tag{4.2}
\]
• Euler equation
\[
\frac{\partial \mathbf V}{\partial t} + (\mathbf V \cdot \nabla) \mathbf V = -\frac{1}{\rho} \nabla p + \mathbf g \tag{4.3}
\]
where $\mathbf g = -g \hat{\mathbf y}$, plus
• irrotational flow (no vortices)
\[
\nabla \times \mathbf V = 0 \;\Rightarrow\; \mathbf V = \nabla \Phi . \tag{4.4}
\]
Using the vector identity
\[
(\mathbf V \cdot \nabla) \mathbf V = \frac{1}{2} \nabla V^2 - \mathbf V \times (\nabla \times \mathbf V) \tag{4.5}
\]
in (4.3) (only the first term survives due to (4.4)), and (4.4) in (4.2), transforms the hydrodynamic
equations to
1. continuity
\[
\Delta \Phi = 0 , \tag{4.6}
\]
2. Euler
\[
\frac{\partial \Phi}{\partial t} + \frac{1}{2} (\nabla \Phi)^2 + \frac{p}{\rho} + g y = 0 . \tag{4.7}
\]

Footnote 1: “I was observing the motion of a boat which was rapidly drawn along a narrow channel by a pair of horses,
when the boat suddenly stopped - not so the mass of water in the channel which it had put in motion; it
accumulated round the prow of the vessel in a state of violent agitation, then suddenly leaving it behind,
rolled forward with great velocity, assuming the form of a large solitary elevation, a rounded, smooth
and well-defined heap of water, which continued its course along the channel apparently without change
of form or diminution of speed. I followed it on horseback, and overtook it still rolling on at a rate of
some eight or nine miles an hour, preserving its original figure some thirty feet long and a foot to a foot
and a half in height. Its height gradually diminished, and after a chase of one or two miles I lost it in
the windings of the channel. Such, in the month of August 1834, was my first chance interview with that
singular and beautiful phenomenon which I have called the Wave of translation.” [5]
4.1.2 Statement of the problem; boundary conditions
The above eqs (4.6) and (4.7) must now be solved subject to the boundary conditions
1. bottom: no vertical motion of the fluid
\[
v(x, y = 0) = 0 \quad \forall x \tag{4.8}
\]
2. top: free surface defined as
\[
y = h + \eta(x, t) . \tag{4.9}
\]
Velocity of free boundary coincides with fluid velocity,
\[
\frac{dy}{dt} = \frac{\partial \eta}{\partial t} + \frac{\partial \eta}{\partial x} \frac{dx}{dt} ,
\]
hence
\[
v = \frac{\partial \eta}{\partial t} + \frac{\partial \eta}{\partial x} u \tag{4.10}
\]
holds at the free surface.
The solution will involve two steps: first, find a general class of solutions of (4.6) which
satisfy the bottom BC (4.8), and then use this general class to determine the height profile
(4.9) by demanding that the Euler equation (4.7) be satisfied at the free surface, where $p = 0$
holds. The Euler equation can then be used to determine the pressure at any point.
4.1.3 Satisfying the bottom boundary condition
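Consider the general form of an expansion (the height $O(h)$ is small in a sense which will
be made precise below) of the type
\[
\begin{aligned}
u &= f(x) + f_1(x) y + f_2(x) y^2 + f_3(x) y^3 + \cdots \\
v &= g_1(x) y + g_2(x) y^2 + g_3(x) y^3 + \cdots .
\end{aligned} \tag{4.11}
\]
The conditions $\partial u / \partial y = \partial v / \partial x$ and $\partial u / \partial x = -\partial v / \partial y$, imposed by irrotationality (4.4) and continuity (4.2) and together equivalent to (4.6), can now be written, respectively,
as
\[
f_1 + 2 f_2 y + 3 f_3 y^2 = g_{1x} y + g_{2x} y^2 \tag{4.12}
\]
and
\[
f_x + f_{1x} y + f_{2x} y^2 = -g_1 - 2 g_2 y - 3 g_3 y^2 \tag{4.13}
\]
from which
\[
\begin{aligned}
f_1 &= 0 & (4.14) \\
2 f_2 &= g_{1x} & (4.15) \\
3 f_3 &= g_{2x} & (4.16)
\end{aligned}
\]
and
\[
\begin{aligned}
f_x &= -g_1 & (4.17) \\
f_{1x} &= -2 g_2 & (4.18) \\
f_{2x} &= -3 g_3 & (4.19)
\end{aligned}
\]
follow. Using the second set in the first results in $f_1 = 0$, $2 f_2 = -f_{xx}$, $3 f_3 = -\tfrac12 f_{1xx}\ (= 0)$;
it follows that $g_2 = 0$ and $g_3 = -\tfrac13 f_{2x} = \tfrac{1}{3!} f_{xxx}$. Collecting terms,
\[
\begin{aligned}
u &= f - \frac{1}{2} f_{xx} y^2 + O(y^4) & (4.20) \\
v &= -f_x y + \frac{1}{3!} f_{xxx} y^3 . & (4.21)
\end{aligned}
\]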
4.1.4 Euler equation at top boundary
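Set $p = 0$ and $y = h + \eta$ in (4.7) and differentiate with respect to $x$:
\[
\frac{\partial u}{\partial t} + \frac{1}{2} \frac{\partial}{\partial x} (u^2 + v^2) + g \frac{\partial \eta}{\partial x} = 0 . \tag{4.22}
\]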
The problem is now to solve the system of coupled differential equations (4.22) and (4.10)
using the expressions (4.20) and (4.21). Key: follow the scale of variation of the physical
quantities involved. First note that if the water height is not much different from $h$ (small
nonlinearity), it will be useful to set
\[
\eta = \epsilon h \bar\eta \tag{4.23}
\]
Note that $\epsilon$ is not a parameter of the problem. It simply serves as a “tag” to let us keep
track of scales. At the end we will have to check the consistency of the assumptions and
approximations made.
According to our assumption, the length scale on which the fluid profile varies along the $x$
direction is of the order $l \gg h$. In order to incorporate this assumption in the approximation,
I define a rescaled variable via
\[
x = l \bar x . \tag{4.24}
\]
Dimensional considerations determine a natural velocity scale $c = \sqrt{g h}$. The motion should
be slow with respect to that scale, in agreement with small amplitude variations of the
profile. In other words, we expect $u \ll c$. Note that from the leading orders of (4.20) and
(4.21) it follows that $v$ is typically smaller than $u$ by a factor of order $h/l \equiv \delta$. It is therefore reasonable
to rescale
\[
\begin{aligned}
f &= \epsilon c \bar f & (4.25) \\
u &= \epsilon c \bar u & (4.26) \\
v &= \epsilon \delta c \bar v . & (4.27)
\end{aligned}
\]
Finally I use a rescaled time
\[
t = \bar t\, l / c . \tag{4.28}
\]
With these rescalings, keeping lowest order terms, i.e. of $O(\epsilon)$ and $O(\delta^2)$, the rescaled
equations (4.20) and (4.21) become, on the surface,
\[
\begin{aligned}
\bar u &= \bar f - \frac{1}{2} \delta^2 \bar f_{\bar x \bar x} & (4.29) \\
\bar v &= -(1 + \epsilon \bar\eta) \bar f_{\bar x} + \frac{1}{6} \delta^2 \bar f_{\bar x \bar x \bar x} ; & (4.30)
\end{aligned}
\]
accordingly, the top boundary condition (4.10) and the Euler equation (4.22) transform to
\[
\begin{aligned}
\bar f_{\bar x} + \bar\eta_{\bar t} + \epsilon (\bar f \bar\eta)_{\bar x} - \frac{1}{6} \delta^2 \bar f_{\bar x \bar x \bar x} &= 0 & (4.31) \\
\bar f_{\bar t} + \bar\eta_{\bar x} + \frac{\epsilon}{2} (\bar f^2)_{\bar x} - \frac{1}{2} \delta^2 \bar f_{\bar x \bar x \bar t} &= 0 . & (4.32)
\end{aligned}
\]
First we note that in the absence of nonlinearity ($\epsilon = 0$) and dispersion ($\delta = 0$), free
wave propagation with unit velocity (in dimensionless units) occurs; in that (zeroth) order,
$\bar f = \bar\eta$. But of course this is hypothetical because $\delta$ and $\epsilon$ are not parameters of the problem;
they just help us keep track of things! However, the zeroth order approximation is useful
in the sense that it suggests a coordinate transformation which absorbs the fastest time
dependence; let
\[
\begin{aligned}
\xi &= \bar x - \bar t & (4.33) \\
\tau &= \epsilon \bar t . & (4.34)
\end{aligned}
\]
Keeping terms to first order in $\epsilon$ and $\delta^2$, we use the property
\[
\begin{aligned}
\bar\eta_{\bar x} &= \bar\eta_\xi & (4.35) \\
\bar\eta_{\bar t} &= -\bar\eta_\xi + \epsilon \bar\eta_\tau & (4.36)
\end{aligned}
\]
(which holds for $\bar f$ as well) to transform the system (4.31), (4.32) to
\[
\begin{aligned}
\bar f_\xi - \bar\eta_\xi + \epsilon \bar\eta_\tau + \epsilon (\bar\eta^2)_\xi - \frac{1}{6} \delta^2 \bar\eta_{\xi\xi\xi} &= 0 & (4.37) \\
-\bar f_\xi + \bar\eta_\xi + \epsilon \bar\eta_\tau + \frac{\epsilon}{2} (\bar\eta^2)_\xi + \frac{1}{2} \delta^2 \bar\eta_{\xi\xi\xi} &= 0 , & (4.38)
\end{aligned}
\]
where we have used the property $\bar f = \bar\eta$ in terms which contain $\epsilon$ or $\delta^2$ factors. The sum of
(4.37) and (4.38) is
\[
2 \epsilon \bar\eta_\tau + \frac{3}{2} \epsilon (\bar\eta^2)_\xi + \frac{1}{3} \delta^2 \bar\eta_{\xi\xi\xi} = 0 . \tag{4.39}
\]
The three terms in (4.39) will be of the same order if $\delta^2 = O(\epsilon)$, i.e. if the nonlinearity
balances the dispersion. We choose $\epsilon = \delta^2 / 6$. Note that the choice must be tested at the
end to check whether it satisfies the original requirements (small amplitude, long waves).
With this choice and the substitution $\bar\eta = 4\phi$ I arrive at the “canonical” KdV form,
\[
\phi_\tau + 6 \phi \phi_\xi + \phi_{\xi\xi\xi} = 0 . \tag{4.40}
\]
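For readers who want the intermediate algebra, the step from (4.39) to (4.40) can be spelled out as follows. This is only a worked restatement of the substitutions named above ($\epsilon = \delta^2/6$ and $\bar\eta = 4\phi$), with no additional assumptions.
\[
\begin{aligned}
\delta^2 = 6\epsilon : \quad & 2 \epsilon \bar\eta_\tau + \tfrac{3}{2} \epsilon (\bar\eta^2)_\xi + 2 \epsilon \bar\eta_{\xi\xi\xi} = 0
\;\Rightarrow\; \bar\eta_\tau + \tfrac{3}{2} \bar\eta \bar\eta_\xi + \bar\eta_{\xi\xi\xi} = 0 , \\
\bar\eta = 4\phi : \quad & 4 \phi_\tau + \tfrac{3}{2} \cdot 16\, \phi \phi_\xi + 4 \phi_{\xi\xi\xi} = 0
\;\Rightarrow\; \phi_\tau + 6 \phi \phi_\xi + \phi_{\xi\xi\xi} = 0 .
\end{aligned}
\]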
4.1.5 A solitary wave
At this stage, without recourse to advanced mathematical techniques, it is possible to follow
the path of KdV and look for special, exact, propagating solutions of (4.40) of the type $\phi(s)$,
where $s = \xi - \lambda\tau$. Eq. (4.40) becomes
\[
-\lambda \phi_s + 3 (\phi^2)_s + \phi_{sss} = 0 \tag{4.41}
\]
which has an obvious first integral
\[
-\lambda \phi + 3 \phi^2 + \phi_{ss} = \mathrm{const.} \tag{4.42}
\]
If we are looking for solutions which vanish at infinity ($\lim_{s \to \pm\infty} \phi(s) = 0$ and $\lim_{s \to \pm\infty} \phi_s(s) = 0$) the constant will be zero, i.e.
\[
\phi_{ss} = \lambda \phi - 3 \phi^2 = \frac{d}{d\phi} \left( \frac{1}{2} \lambda \phi^2 - \phi^3 \right) \tag{4.43}
\]
Multiplying both sides by $2\phi_s$ we can integrate once more, obtaining
\[
\phi_s^2 = \lambda \phi^2 - 2 \phi^3 \tag{4.44}
\]
where the integration constant must vanish once again (cf. above). Note that, if a solution
exists, the parameter $\lambda$ must be $> 0$ and $\phi < \lambda/2$. Taking the square root of (4.44) and
inverting the fractions I obtain
\[
ds = \pm \frac{d\phi}{\phi \sqrt{\lambda - 2\phi}} \tag{4.45}
\]
which can be integrated directly, resulting in
\[
\phi(s) = \frac{\lambda}{2 \cosh^2\left[ \frac{\sqrt{\lambda}}{2} (s - s_0) \right]} \tag{4.46}
\]
where $s_0$ is an arbitrary constant. (The plus sign in (4.45) has been chosen for $s < s_0$ and
the minus for $s > s_0$.)
Note that the properties of the propagating solution (4.46), except for its initial position,
which is determined by $s_0$, are all governed by a single parameter. If the velocity $\lambda$ is
given, the amplitude is fixed at $\lambda/2$ and the spatial extent at $2\lambda^{-1/2}$. In other words, in the
canonical units of (4.40), a slow pulse will also have a small amplitude and a large spatial
extent.
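The claim that (4.46) solves the travelling-wave equation can be checked numerically without redoing the integration. The sketch below evaluates the residual of (4.43) with a central second difference; the step size, the grid and the sample values of $\lambda$ are arbitrary choices made for illustration.

```python
import numpy as np


def soliton(s, lam, s0=0.0):
    """Solitary-wave profile (4.46)."""
    return lam / (2.0 * np.cosh(0.5 * np.sqrt(lam) * (s - s0)) ** 2)


def ode_residual(s, lam, h=1e-3):
    """Residual phi_ss - lam*phi + 3*phi**2 of (4.43), using a central second difference."""
    phi = soliton(s, lam)
    phi_ss = (soliton(s + h, lam) - 2.0 * phi + soliton(s - h, lam)) / h ** 2
    return phi_ss - lam * phi + 3.0 * phi ** 2


if __name__ == "__main__":
    s = np.linspace(-10.0, 10.0, 401)
    for lam in (0.5, 1.0, 4.0):
        print(lam, soliton(0.0, lam), np.max(np.abs(ode_residual(s, lam))))
```

The printed peak value equals $\lambda/2$ and the residual stays at the level of the finite-difference error, consistent with the one-parameter family described above.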
4.1.6 Is the solitary wave a physical solution?
Eq. (4.46) is an exact, propagating, pulse-like solution of (4.40). But is it an acceptable
solution of the original problem? In other words, is the surface profile of low amplitude and is
it a long wave? To answer this, we have to go back to the original variables, and convince ourselves
that (4.46) generates (some) acceptable solutions for the original problem (Exercise).
4.2 KdV as a limiting case of anharmonic lattice dynamics
Consider the 1d anharmonic chain; atomic displacements are denoted by $\{q_n\}$; neighboring atoms of mass $m$ interact via anharmonic potentials of the type
$$V(r) = \frac{1}{2}kr^2 + \frac{1}{3}kbr^3 \tag{4.47}$$
where $r$ is the distance between nearest neighbors. The equations of motion are
$$\begin{aligned} m\ddot q_n &= -\frac{\partial}{\partial q_n}\left[V(q_{n+1}-q_n) + V(q_n-q_{n-1})\right] \\ &= k(q_{n+1}+q_{n-1}-2q_n) - kb\left[-(q_{n+1}-q_n)^2 + (q_n-q_{n-1})^2\right] \\ &= k(q_{n+1}+q_{n-1}-2q_n) + kb\,(q_{n+1}+q_{n-1}-2q_n)(q_{n+1}-q_{n-1}) . \end{aligned} \tag{4.48}$$
If the displacements do not vary appreciably on the scale of the lattice constant $a$, we can use a continuum approximation; keeping terms of fourth order in the lattice constant,
$$m\ddot q \equiv m q_{tt} = ka^2 q_{xx} + ka^4\frac{2}{4!}q_{xxxx} + kba^2 q_{xx}\,2a q_x ,$$
where $x = na$ is the continuum space variable; defining $c^2 = ka^2/m$, this can be written as
$$\frac{1}{c^2}q_{tt} - q_{xx} = \frac{1}{12}a^2 q_{xxxx} + 2\alpha q_x q_{xx} , \tag{4.49}$$
where α = ab provides a dimensionless measure of the anharmonicity.
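The continuum approximation rests on Taylor-expanding the lattice differences in powers of $a$. The sketch below compares the exact differences with their truncated expansions for a smooth test function; the choice $q(x) = \sin(\kappa x)$ and all names are assumptions made only for illustration.

```python
import math

def discrete_second_difference(q, x, a):
    """q_{n+1} + q_{n-1} - 2 q_n for a smooth field sampled at x = n*a."""
    return q(x + a) + q(x - a) - 2.0 * q(x)

def continuum_second_difference(q_xx, q_xxxx, x, a):
    """Expansion kept in the text: a^2 q_xx + (2/4!) a^4 q_xxxx."""
    return a ** 2 * q_xx(x) + (2.0 / math.factorial(4)) * a ** 4 * q_xxxx(x)

def discrete_first_difference(q, x, a):
    """q_{n+1} - q_{n-1}."""
    return q(x + a) - q(x - a)

def continuum_first_difference(q_x, x, a):
    """Leading term of the expansion: 2 a q_x."""
    return 2.0 * a * q_x(x)

# Hypothetical smooth test field, slowly varying on the scale of a.
kappa = 0.5
q = lambda x: math.sin(kappa * x)
q_x = lambda x: kappa * math.cos(kappa * x)
q_xx = lambda x: -kappa ** 2 * math.sin(kappa * x)
q_xxxx = lambda x: kappa ** 4 * math.sin(kappa * x)

if __name__ == "__main__":
    a, x = 0.1, 0.7
    print(discrete_second_difference(q, x, a) - continuum_second_difference(q_xx, q_xxxx, x, a))
    print(discrete_first_difference(q, x, a) - continuum_first_difference(q_x, x, a))
```

The first printed error is of order $a^6$ and the second of order $a^3$, consistent with keeping terms up to fourth order in the lattice constant.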
I now look for solutions which vary smoothly in space, i.e. over a typical length of many lattice spacings, and where the main time dependence is contained in the wave equation part, i.e. of the form
$$q(\xi,\tau) \equiv q\left(\epsilon\,\frac{x-ct}{a},\ \delta\omega_0 t\right) , \tag{4.50}$$
where $\omega_0 = c/a = \sqrt{k/m}$, $\epsilon \ll 1$ and $\delta \ll \epsilon$; the exact dependence of $\delta$ on $\epsilon$ will be fixed later.
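The relevant derivatives transform according to
$$\begin{aligned} q_x &= \frac{\epsilon}{a}q_\xi \\ q_{xx} &= \left(\frac{\epsilon}{a}\right)^2 q_{\xi\xi} \\ q_{xxx} &= \left(\frac{\epsilon}{a}\right)^3 q_{\xi\xi\xi} \\ q_{tt} &= \omega_0^2\left[\epsilon^2 q_{\xi\xi} - 2\epsilon\delta q_{\xi\tau} + O(\delta^2)\right] . \end{aligned}$$
Using them in (4.49) gives
$$2\delta q_{\xi\tau} + \frac{1}{12}\epsilon^3 q_{\xi\xi\xi\xi} + 2\alpha\frac{\epsilon^2}{a}q_\xi q_{\xi\xi} = 0 , \tag{4.51}$$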
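which, after a rescaling
$$q_\xi\left(= \frac{a}{\epsilon}q_x\right) = -\frac{\epsilon}{4\alpha}a\phi \tag{4.52}$$
and setting (cf. footnote 2)
$$\delta = \frac{1}{24}\epsilon^3$$
can be reduced to the canonical KdV form
$$\phi_\tau - 6\phi\phi_\xi + \phi_{\xi\xi\xi} = 0 . \tag{4.53}$$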
Note that the rescaling of length, i.e. the value of the small parameter $\epsilon$, is still a matter of free choice, depending on the (initial) conditions of the problem.
The above analysis shows that one may legitimately suspect that nonlinear propagating
solitary waves will be generic in anharmonic lattices, at least for certain parameter ranges.
Again, one has to make sure that the solutions found from solving the KdV equation (4.53)
are appropriate for the original problem (4.49) (check consistency of approximations made).
4.3 KdV as a ﬁeld theory
4.3.1 KdV Lagrangian
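The KdV equation
$$u_t - 3(u^2)_x + u_{xxx} = 0 \tag{4.54}$$
can be derived from the Lagrangian
$$L = \int dx\,\mathcal{L}(\phi,\phi_t,\phi_x,\phi_{xx}) \tag{4.55}$$
where
$$\mathcal{L} = \frac{1}{2}\phi_x\phi_t - \phi_x^3 - \frac{1}{2}\phi_{xx}^2 . \tag{4.56}$$
(Footnote 2, on the choice $\delta = \epsilon^3/24$ leading to (4.53): note that this guarantees $\delta \ll \epsilon$, as demanded above.)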
Note that because the Lagrangian density depends on the second derivative of the field, (1.42) contain an extra term
$$-\frac{d^2}{dx^2}\left(\frac{\partial\mathcal{L}}{\partial\phi_{xx}}\right) . \tag{4.57}$$
Minimization of the action leads to the field equations of motion
$$\phi_{xt} - 3(\phi_x^2)_x + \phi_{xxxx} = 0 \tag{4.58}$$
which reduces to (4.54) upon the substitution
$$\phi_x = u . \tag{4.59}$$
Continuous symmetries of the Lagrangian will again give rise to an equation like (1.44), with an extra term
$$\frac{\partial\mathcal{L}}{\partial\phi_{xx}}\delta\phi_{xx} \tag{4.60}$$
on the left-hand side. The above modifications generate an extra contribution
$$\frac{\partial\mathcal{L}}{\partial\phi_{xx}}\delta\phi_{xx} - \frac{d^2}{dx^2}\left(\frac{\partial\mathcal{L}}{\partial\phi_{xx}}\right)\delta\phi \tag{4.61}$$
to the left-hand side of (1.45).
4.3.2 Symmetries and conserved quantities
For some infinitesimal transformations (cf. section ) one can verify explicitly that $\delta\phi_{xx} = d^2\delta\phi/dx^2$. If this is the case, the integral over all space of the extra contribution (4.61) can easily be seen to vanish (repeated integration by parts of either of the two terms). In this case, the standard symmetries are reflected in the same standard conservation laws (with the same densities of conserved quantities), as in section .... .
Translational invariance in space
Conservation of the total momentum
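$$P = -\int_{-\infty}^{\infty}dx\,\frac{\partial\mathcal{L}}{\partial\phi_t}\phi_x = -\frac{1}{2}\int_{-\infty}^{\infty}dx\,\phi_x^2 = -\frac{1}{2}\int_{-\infty}^{\infty}dx\,u^2 . \tag{4.62}$$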
Translational invariance in time
Conservation of the total energy
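$$H = \int_{-\infty}^{\infty}dx\left(\frac{\partial\mathcal{L}}{\partial\phi_t}\phi_t - \mathcal{L}\right) = \int_{-\infty}^{\infty}dx\left(\frac{1}{2}\phi_{xx}^2 + \phi_x^3\right) = \int_{-\infty}^{\infty}dx\left(\frac{1}{2}u_x^2 + u^3\right) . \tag{4.63}$$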
Conservation of mass
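The symmetry $\phi \to \phi + \epsilon$ generates $\delta\phi = \epsilon$, and all other variations are zero. From (1.45), conservation of
$$M = \int_{-\infty}^{\infty}dx\,\frac{\partial\mathcal{L}}{\partial\phi_t} = \frac{1}{2}\int_{-\infty}^{\infty}dx\,\phi_x = \frac{1}{2}\int_{-\infty}^{\infty}dx\,u , \tag{4.64}$$
the total "mass", is deduced.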
Galilei invariance
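The transformation $x \to x - \epsilon t$, $\phi(x,t) \to \phi(x - \epsilon t) - \epsilon x$ (or, in terms of the $u$-field, $u(x,t) \to u - \epsilon$), generates (cf. section ....)
$$\begin{aligned} x &\to x - \epsilon t \\ \delta\phi &= \phi(x-\epsilon t) - \phi(x) - \epsilon x = -\epsilon t\phi_x - \epsilon x \\ \delta\phi_t &= \phi_t(x-\epsilon t) - \phi_t(x) = -\epsilon t\phi_{xt} \\ \delta\phi_x &= \phi_x(x-\epsilon t) - \phi_x(x) - \epsilon = -\epsilon t\phi_{xx} - \epsilon \\ \delta\mathcal{L} &= \frac{d\mathcal{L}}{dx}\delta x = -\frac{d\mathcal{L}}{dx}\epsilon t \ \Rightarrow\ J_1 = -\epsilon t\mathcal{L} ,\quad J_0 = 0 . \end{aligned} \tag{4.65}$$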
Owing to $\delta\phi_{xx} = (\delta\phi)_{xx}$ there are no extra terms in the conserved currents. Eq. (1.45) applies. Since $\delta\phi_x = (\delta\phi)_x$ the two last terms in the left-hand side of (1.45) combine to form a total space derivative; similarly, because of $\delta\phi_t = (\delta\phi)_t$, the first two terms combine to form a total time derivative, i.e. the conserved density is
$$\frac{\partial\mathcal{L}}{\partial\phi_t}\,\delta\phi/\epsilon = \frac{1}{2}\phi_x(-t\phi_x - x) , \tag{4.66}$$
or, integrating over all space, and dividing by the total mass $M$,
$$\bar{X} = \frac{1}{M}\int_{-\infty}^{\infty}dx\,x\,\frac{u}{2} = \frac{P}{M}t + \text{const.} \tag{4.67}$$
which expresses the fact that the center of mass moves at a constant velocity.
4.3.3 KdV as a Hamiltonian ﬁeld theory
5 Solving KdV by inverse scattering
5.1 Isospectral property
Given the KdV equation
$$u_t - 6uu_x + u_{xxx} = 0 \tag{5.1}$$
and a well behaved initial condition $u(x,0)$, which vanishes at infinity, it is possible to determine the time evolution $u(x,t)$ in terms of a general scheme, which is known as inverse scattering theory.
The scheme is based on the following particular property of (5.1):
Given the linear operator
$$L(t) = -\partial_x^2 + u(x,t) \tag{5.2}$$
whose parametric time dependence is governed by (5.1), and the associated eigenvalue equation
$$L(t)\psi_j(x,t) = \lambda_j(t)\psi_j(x,t) , \tag{5.3}$$
it can be shown that
$$\frac{d\lambda_j}{dt} = 0 . \tag{5.4}$$
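The isospectral property can be illustrated numerically, without any of the theory that follows. The sketch below assumes the one-soliton solution $u = -2\kappa^2\,\mathrm{sech}^2[\kappa(x - 4\kappa^2 t)]$ of (5.1), discretizes $L = -\partial_x^2 + u$ on a finite grid with Dirichlet boundaries, and locates the lowest eigenvalue by bisection on a Sturm sequence count. The grid size, box width and function names are illustrative assumptions; the eigenvalue is only reproduced up to discretization error.

```python
import math

def soliton(x, t, kappa=1.0):
    """Assumed one-soliton solution of (5.1): -2 kappa^2 sech^2(kappa (x - 4 kappa^2 t))."""
    return -2.0 * kappa ** 2 / math.cosh(kappa * (x - 4.0 * kappa ** 2 * t)) ** 2

def count_below(diag, off2, lam):
    """Sturm count: number of eigenvalues below lam of a symmetric tridiagonal
    matrix with diagonal `diag` and constant squared off-diagonal `off2`."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        q = d - lam - (off2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-300  # avoid division by zero in the next step
        if q < 0.0:
            count += 1
    return count

def lowest_eigenvalue(t, kappa=1.0, half_width=20.0, n=2001, tol=1e-10):
    """Lowest eigenvalue of the finite-difference version of L(t) = -d^2/dx^2 + u(x, t)."""
    h = 2.0 * half_width / (n - 1)
    xs = [-half_width + i * h for i in range(n)]
    diag = [2.0 / h ** 2 + soliton(x, t, kappa) for x in xs]
    off2 = (1.0 / h ** 2) ** 2
    lo, hi = -2.0 * kappa ** 2 - 1.0, 0.0  # spectrum is bounded below by min u
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(diag, off2, mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for t in (-1.0, 0.0, 1.0):
        print(t, lowest_eigenvalue(t))
```

For $\kappa = 1$ the printed values stay close to $-\kappa^2 = -1$ while the potential itself travels across the grid, which is the content of (5.4) for this particular solution.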
5.2 Lax pairs
The "isospectral" property can be formulated somewhat more generally: Suppose we can construct a linear, self-adjoint operator $B = B^\dagger$, dependent on $u$ and such that
$$iL_t \equiv i\frac{dL}{dt} \equiv i\lim_{\Delta\to0}\frac{L(t+\Delta) - L(t)}{\Delta} = [L,B] \tag{5.5}$$
holds as an operator identity, i.e.
$$iL_tf = [L,B]f \quad \forall f \ \Leftrightarrow\ (5.1) . \tag{5.6}$$
The operators $L$ and $B$ are then called a Lax pair. The time evolution of $L$ is governed by
$$L(t) = U(t)L(0)U^\dagger(t) \tag{5.7}$$
where
$$U = e^{iBt} . \tag{5.8}$$
Consider (5.3) at $t = 0$, and apply the operator $U(t)$ to both sides from the left, i.e.
$$U(t)L(0)U^\dagger(t)U(t)\psi_j(0) = \lambda_j(0)U(t)\psi_j(0) \tag{5.9}$$
where, in addition, I have inserted a factor $U^\dagger U = 1$. It can be recognized immediately that the l.h.s. of (5.9) and (5.3) are identical, provided
$$\psi_j(t) = U(t)\psi_j(0) , \tag{5.10}$$
and that, in order for the r.h. sides to coincide, I must have
$$\lambda(t) = \lambda(0) \quad \forall t \tag{5.11}$$
(isospectral property).
The form of the operator $B$ in the KdV case is
$$B = 4i\partial_x^3 - 3i(u\partial_x + \partial_xu) \tag{5.12}$$
(verify explicitly (5.6)).
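One possible route through the verification is sketched below. It is a worked outline rather than the author's own derivation; $\partial$ stands for $\partial_x$ acting on everything to its right, and $u\partial_x + \partial_x u$ has been rewritten as $2u\partial + u_x$.

```latex
\begin{aligned}
L &= -\partial^2 + u , \qquad B = 4i\partial^3 - 3i\,(2u\,\partial + u_x) , \\
[L,\,4i\partial^3] &= 4i\,[u,\partial^3]
   = -4i\,\bigl(u_{xxx} + 3u_{xx}\partial + 3u_x\partial^2\bigr) , \\
[-\partial^2,\,-3i(2u\partial + u_x)] &= 3i\,[\partial^2,\,2u\partial + u_x]
   = 3i\,\bigl(u_{xxx} + 4u_{xx}\partial + 4u_x\partial^2\bigr) , \\
[u,\,-3i(2u\partial + u_x)] &= -6i\,u\,[u,\partial] = 6i\,u u_x , \\
[L,B] &= -i\,u_{xxx} + 6i\,u u_x .
\end{aligned}
```

All terms containing $\partial$ and $\partial^2$ cancel, so $[L,B]$ is a pure multiplication operator. Since $iL_t = iu_t$, the identity (5.5) then reads $u_t = 6uu_x - u_{xxx}$, i.e. (5.1).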
5.3 Inverse scattering transform: the idea
The isospectral property tentatively suggests that it might be possible to proceed as follows:
• solve the linear problem (5.3) at time $t = 0$, i.e. determine the eigenvalues $\{\lambda_j\}$ and the eigenfunctions $\{\psi_j(x,0)\}$ from the known $u(x,0)$.
• determine the evolution of the eigenfunctions from (5.10) at a later time t.
• try to solve the “inverse problem” of determining the “potential” u(x, t) from the
known spectra and eigenfunctions at the time t.
In fact, the last step is the well known problem of inverse scattering theory in quantum mechanics, where physicists had tried to extract information on the nature of interparticle interactions from analyzing particle scattering data. The one-dimensional problem (corresponding to a spherically symmetric potential in 3 dimensions) was completely solved in the 1950's (Gel'fand, Levitan & Marchenko). I will present the solution below, but before doing that, let me outline some broad features:
"Scattering data" in the mathematical sense are the asymptotic properties of the solution
of the associated linear problem, i.e. the properties far from the source of scattering, where
the potential is eﬀectively zero. What GLM have shown is that you can reconstruct the
potential from the scattering data. Furthermore, it turns out that the operator B takes
an especially simple form in the asymptotic limit, which allows us to write down an exact,
analytic formula for the time evolution of scattering data. Evolution of the scattering data
is the easy part of the game. But then if I only need scattering data at time t, and I know
how these data evolve in time, all the input I need is the scattering data for the potential
u(x, 0). This is exactly the program of the inverse scattering transform (IST). Because it is
based only on the asymptotic part of the solution of the associated linear problem, it can
be written down in closed form. I summarize the IST program schematically:
1. determine the scattering data S of the linear problem (5.3) at time t = 0, from the
known u(x, 0).
2. determine the evolution of the scattering data S(t) at a later time t from the asymptotic form of the operator B.
3. do the inverse problem at time t, i.e. determine the potential u(x, t) from the known
scattering data S(t).
5.4 The inverse scattering transform
5.4.1 The direct problem
This is just a summary of properties known from elementary quantum mechanics.
Jost solutions
The linear eigenvalue problem
$$\left[-\frac{\partial^2}{\partial x^2} + u(x)\right]\psi(x) = k^2\psi(x) \tag{5.13}$$
has, in general, a discrete and a continuum spectrum, corresponding to imaginary and real values of $k$ respectively. For real $k$ there are in general two linearly independent solutions. Such a linearly independent set is provided by the Jost solutions:
$$\begin{aligned} f_1(x,k) &\sim e^{ikx} & x &\to \infty \\ f_2(x,k) &\sim e^{-ikx} & x &\to -\infty . \end{aligned} \tag{5.14}$$
The Jost solutions of (5.13) satisfy the integral equations
$$\begin{aligned} f_1(x,k) &= e^{ikx} - \int_x^\infty dx'\,G(x,x')f_1(x',k) \\ f_2(x,k) &= e^{-ikx} + \int_{-\infty}^x dx'\,G(x,x')f_2(x',k) \end{aligned} \tag{5.15}$$
where
$$G(x,x') = \frac{\sin k(x-x')}{k}\,u(x') . \tag{5.16}$$
Eqs. (5.15) can be analytically continued to the upper half plane of complex $k$. Some information on the analytic properties can be obtained by considering the lowest iteration, where we substitute $f_1(x',k) = e^{ikx'}$ in the r.h.s. of the first equation. This gives
$$\begin{aligned} f_1(x,k) &\approx e^{ikx} - \int_x^\infty dx'\,\frac{e^{ik(x-x')} - e^{-ik(x-x')}}{2ik}\,u(x')e^{ikx'} \\ &\approx e^{ikx} - e^{ikx}\frac{1}{2ik}\int_x^\infty dx'\,\left\{1 - e^{2ik(x'-x)}\right\}u(x') \end{aligned} \tag{5.17}$$
which can be thought of as the beginning of a systematic expansion in inverse powers of $k$.
Note that since $x' - x > 0$, the exponential will be convergent in the upper half plane of $k$; therefore, if the potential vanishes sufficiently rapidly at infinity, I estimate
$$g_1(x,k) \equiv f_1(x,k) - e^{ikx} \sim e^{ikx}h(x,k) \tag{5.18}$$
where $h$ vanishes as $1/k$ for high values of $k$.
The property
$$f_2(x,k) = a(k)f_1(x,-k) + b(k)f_1(x,k) \tag{5.19}$$
will be useful.
For bound states, corresponding to $k = i\kappa$, the Jost solutions are degenerate.
Asymptotic scattering data
The asymptotic (scattering) data of (5.3) are defined as follows:
• discrete spectrum (bound states)
$$\lambda_n = -\kappa_n^2 \qquad n = 1,\dots,N , \tag{5.20}$$
where $\kappa_n > 0$;
$$\begin{aligned} \psi_n(x) &= f_1(x,i\kappa_n) \sim e^{-\kappa_nx} & x &\to \infty \\ &= C_nf_2(x,i\kappa_n) \sim C_ne^{\kappa_nx} & x &\to -\infty . \end{aligned} \tag{5.21}$$
I will also need the normalization integral of each bound state
$$\frac{1}{\alpha_n} = \int_{-\infty}^{\infty}dx\,\psi_n^2(x) = \int_{-\infty}^{\infty}dx\,f_1^2(x,i\kappa_n) \tag{5.22}$$
• continuous spectrum (scattering states)
$$\lambda(k) = k^2 \qquad -\infty < k < \infty . \tag{5.23}$$
The "physical" scattering states, corresponding to waves incident from the right, are
$$\begin{aligned} \psi(x,k) &\sim e^{-ikx} + R(k)e^{ikx} & x &\to \infty \\ &\sim T(k)e^{-ikx} & x &\to -\infty , \end{aligned} \tag{5.24}$$
where $R(k)$, $T(k)$ are, respectively, the reflection and transmission coefficients, which satisfy
$$|R(k)|^2 + |T(k)|^2 = 1 .$$
The Jost solutions are related to the physical solution (5.24) via
$$\psi(x,k) = T(k)f_2(x,k) = f_1(x,-k) + R(k)f_1(x,k) \quad \forall x . \tag{5.25}$$
This identifies $a(k) = 1/T(k)$ and $b(k) = R(k)/T(k)$.
The complete set of scattering data for any one-dimensional potential of a Schroedinger-type equation is
$$S \equiv [\{\kappa_n, C_n, \alpha_n\},\ n = 1,\dots,N;\ T(k), R(k)] . \tag{5.26}$$
In fact, for the purposes of performing the inverse scattering transform I will only need the reduced set (cf. footnote 1)
$$S \equiv [\{\kappa_n, \alpha_n\},\ n = 1,\dots,N;\ R(k)] \tag{5.27}$$
5.4.2 Time evolution of scattering data
I promised this will be the easy part. The operator $B$ has the property
$$\lim_{x\to\pm\infty}B = B^* = 4i\partial_x^3 . \tag{5.28}$$
Since
$$\frac{\partial}{\partial t}\psi_j(x,t) = iB\psi_j(x,t) \tag{5.29}$$
holds for all eigenfunctions, we can apply it in the asymptotic regime, where $B \sim B^*$.
• In the case of a discrete eigenfunction, this gives
$$\begin{aligned} \psi_n(x) &\sim e^{-\kappa_nx + 4\kappa_n^3t} & x &\to \infty \\ &\sim C_ne^{\kappa_nx - 4\kappa_n^3t} & x &\to -\infty , \end{aligned} \tag{5.30}$$
or, in keeping with the agreed normalization of the type (5.21), I multiply with a factor $e^{-4\kappa_n^3t}$, and obtain
$$\begin{aligned} \psi_n(x) &\sim e^{-\kappa_nx} & x &\to \infty \\ &\sim C_n(t)e^{\kappa_nx} & x &\to -\infty , \end{aligned} \tag{5.31}$$
with
$$C_n(t) = C_n(0)e^{-8\kappa_n^3t} . \tag{5.32}$$
(Footnote 1: Note that if scattering theory is to make sense, the potential must be vanishing at ($\pm$) infinity. I have not specified the minimal exact mathematical conditions which satisfy this demand.)
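• In the case of Jost solutions I obtain
$$\begin{aligned} f_1(x,k) &\sim e^{ikx + 4ik^3t} & x &\to \infty \\ f_2(x,k) &\sim e^{-ikx - 4ik^3t} & x &\to -\infty ; \end{aligned} \tag{5.33}$$
the physical solution therefore evolves according to
$$\begin{aligned} \psi(x,k) &\sim e^{-ikx - 4ik^3t} + R(k)e^{ikx + 4ik^3t} & x &\to \infty \\ &\sim T(k)e^{-ikx - 4ik^3t} & x &\to -\infty , \end{aligned}$$
or, multiplying both by a factor $e^{4ik^3t}$, to keep the standard normalization of (),
$$\begin{aligned} \psi(x,k) &\sim e^{-ikx} + R(k,t)e^{ikx} & x &\to \infty \\ &\sim T(k)e^{-ikx} & x &\to -\infty , \end{aligned}$$
where
$$R(k,t) = R(k)e^{8ik^3t} . \tag{5.34}$$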
The scattering data evolve according to (5.32) and (5.34). The transmission coeﬃcient T(k)
stays constant in time.
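The evolution laws (5.32) and (5.34) are simple enough to state as two small functions. In the sketch below the function names are hypothetical, and the growth of the normalization constant $\alpha_n$ is taken from the later remark that the product $C_n\alpha_n$ stays constant; it is not derived here.

```python
import cmath
import math

def evolve_bound_state(kappa, C0, alpha0, t):
    """Return (C_n(t), alpha_n(t)) for a bound state with eigenvalue -kappa^2.

    C_n(t) follows (5.32); alpha_n(t) is chosen so that C_n * alpha_n is constant.
    """
    decay = math.exp(-8.0 * kappa ** 3 * t)
    return C0 * decay, alpha0 / decay

def evolve_reflection(k, R0, t):
    """Reflection coefficient at time t, eq. (5.34): a pure phase rotation."""
    return R0 * cmath.exp(8j * k ** 3 * t)

if __name__ == "__main__":
    print(evolve_bound_state(1.0, 2.0, 3.0, 0.1))
    print(abs(evolve_reflection(0.7, 0.3 + 0.4j, 1.3)))  # modulus is unchanged
```

Because $|R(k,t)| = |R(k)|$, the relation $|R|^2 + |T|^2 = 1$ is compatible with a time-independent transmission coefficient.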
5.4.3 Reconstructing the potential from scattering data (inverse scattering problem)
Reconstruction of the potential from scattering data is an old problem in quantum mechanics. A complete solution has been given in one dimension, subject to fairly general conditions, by Gelfand and Levitan [7] and Marchenko [8]; for reviews see Faddeyev [9] and Scott [10]. Definition of the problem: Given the eigenvalue equation
$$\left[-\frac{d^2}{dx^2} + u(x)\right]\psi(x) = k^2\psi(x) \tag{5.35}$$
determine $u(x)$ from scattering data in the form of eq. (5.27).
Fourier transforms of the $g(x,k)$ functions
$$\hat{g}_j(x,y) = \frac{1}{2\pi}\int_{-\infty}^{\infty}dk\,e^{-iky}g_j(x,k) , \tag{5.36}$$
where $j = 1, 2$, with an inverse
$$g_j(x,k) = \int_{-\infty}^{\infty}dy\,e^{iky}\hat{g}_j(x,y) . \tag{5.37}$$
Note that, due to the analytic properties of $f_1$ (cf. (5.18), which allows one to close the contour of (5.36) from above without finding any singularities)
$$\hat{g}_1(x,y) = 0 \quad \text{if } y < x . \tag{5.38}$$
Similarly,
$$\hat{g}_2(x,y) = 0 \quad \text{if } y > x . \tag{5.39}$$
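Relating $\hat{g}_1(x,x)$ to the potential $u(x)$
The starting point is to recognize that
$$\left[\frac{\partial^2}{\partial x^2} + k^2 - u\right]g_1(x,k) = \left[\frac{\partial^2}{\partial x^2} + k^2 - u\right]\left(f_1(x,k) - e^{ikx}\right) = u(x)e^{ikx} ; \tag{5.40}$$
multiplying both sides by $e^{-iky}/2\pi$ and integrating over all $k$, I obtain
$$\left[\frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - u(x)\right]\hat{g}_1(x,y) = u(x)\,\delta(x-y) ; \tag{5.41}$$
defining new variables $\zeta = (x+y)/2$, $\eta = y - x$, I use $\partial_x^2 - \partial_y^2 = -2\partial_\eta\partial_\zeta$ to transform (5.41) to
$$-2\frac{\partial^2}{\partial\eta\,\partial\zeta}\hat{g}_1\left(\zeta - \frac{\eta}{2}, \zeta + \frac{\eta}{2}\right) - u\left(\zeta - \frac{\eta}{2}\right)\hat{g}_1\left(\zeta - \frac{\eta}{2}, \zeta + \frac{\eta}{2}\right) = u\left(\zeta - \frac{\eta}{2}\right)\delta(-\eta) ,$$
which can be integrated over an interval of length $\epsilon$ around $\eta = 0$. Since $\hat{g}_1$ vanishes for $\eta < 0$, the result is
$$-2\frac{\partial}{\partial\zeta}\hat{g}_1(\zeta,\zeta) - \epsilon u(\zeta)\hat{g}_1(\zeta,\zeta) = u(\zeta) , \tag{5.42}$$
which in the limit $\epsilon \to 0$ becomes
$$u(x) = -2\frac{d}{dx}\hat{g}_1(x,x) \tag{5.43}$$
where I have reverted to the original variables.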
Relating $\hat{g}_1(x,y)$ to the scattering data
Define Fourier transforms
$$\hat{T}(y) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{-iky}\,[T(k) - 1] \tag{5.44}$$
$$\hat{R}(y) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{iky}R(k) . \tag{5.45}$$
Rewrite (5.25) as
$$T(k)f_2(x,k) = f_1(x,-k) + R(k)\left[f_1(x,k) - e^{ikx} + e^{ikx}\right] ; \tag{5.46}$$
adding $-f_2(x,k)$ to both sides and adding and subtracting $e^{-ikx}$ to the right hand side gives
$$(T(k) - 1)f_2(x,k) = g_1(x,-k) - g_2(x,k) + R(k)g_1(x,k) + R(k)e^{ikx} ; \tag{5.47}$$
multiplying both sides by $e^{iky}/2\pi$ and integrating over all $k$ produces
$$\int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{iky}\,[T(k) - 1]f_2(x,k) = \hat{g}_1(x,y) - \hat{g}_2(x,y) + \int_{-\infty}^{\infty}dy'\,\hat{R}(y+y')\hat{g}_1(x,y') + \hat{R}(x+y) \tag{5.48}$$
Lemma: $T(k)$ has only simple poles
It can be shown that the transmission coefficient $T(k)$ is analytic in the upper half plane, including the real axis, except for simple poles which correspond to the bound states, i.e. at $k = i\kappa_n$. In the neighborhood of such a pole
$$T(k) \approx i\,\frac{C_n\alpha_n}{k - i\kappa_n} , \tag{5.49}$$
where
$$\frac{1}{\alpha_n} = \int_{-\infty}^{\infty}dx\,[f_1(x,i\kappa_n)]^2 \tag{5.50}$$
is the normalization integral of the bound state.
Using the above lemma, it is possible to compute the integral in the left-hand side of (5.48) by closing the contour from above. The contribution from each bound state is
$$\begin{aligned} 2\pi i\,e^{-\kappa_ny}\,i\,\frac{1}{2\pi}\,C_n\alpha_nf_2(x,i\kappa_n) &= -\alpha_ne^{-\kappa_ny}f_1(x,i\kappa_n) \\ &= -\alpha_ne^{-\kappa_ny}g_1(x,i\kappa_n) - \alpha_ne^{-\kappa_n(x+y)} \end{aligned}$$
where I have used the fact that Jost states are degenerate if $k = i\kappa_n$ (cf. Eq. (5.21)). Moreover, I only need (5.48) for $x \le y$, since otherwise $\hat{g}_1(x,y) = 0$. Defining a combined kernel which incorporates all scattering data,
$$\hat{K}(z) = \hat{R}(z) + \sum_{n=1}^{N}\alpha_ne^{-\kappa_nz} , \tag{5.51}$$
I finally obtain
$$\hat{g}_1(x,y) + \hat{K}(x+y) + \int_{-\infty}^{\infty}dy'\,\hat{K}(y+y')\hat{g}_1(x,y') = 0 \quad \text{if } x \le y \tag{5.52}$$
$$\hat{g}_1(x,y) = 0 \quad \text{if } x > y \tag{5.53}$$
(Gel'fand, Levitan, Marchenko equation).
5.4.4 IST summary
In order to solve the initial value problem of the KdV equation (5.1) we proceed as follows:
• Extract initial scattering data for the associated linear problem (5.13),
$$\{\kappa_n, \alpha_n;\ n = 1,\dots,N\},\quad \{R(k),\ -\infty < k < \infty\} \tag{5.54}$$
from the potential $u(x)$ at time 0.
• Define the scattering kernel at time $t$:
$$\hat{K}(z;t) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{ikz + 8ik^3t}R(k) + \sum_{n=1}^{N}\alpha_ne^{-\kappa_nz + 8\kappa_n^3t} . \tag{5.55}$$
• Solve the Gel'fand, Levitan, Marchenko equation for $x \le y$,
$$\hat{g}_1(x,y;t) + \hat{K}(x+y;t) + \int_x^\infty dy'\,\hat{K}(y+y';t)\hat{g}_1(x,y';t) = 0 . \tag{5.56}$$
• Extract the limit
$$u(x,t) = -2\frac{d}{dx}\hat{g}_1(x,x^+;t) . \tag{5.57}$$
Note that (i) I have explicitly included the parametric dependence on time, and (ii) the normalization integral $\alpha_n$ has a time dependence such that the product $C_n\alpha_n$ stays constant (cf. time-independence of the transmission coefficient).
5.5 Application of the IST: reﬂectionless potentials
Suppose the scattering kernel (5.51) contains only bound states, i.e. R(k) = 0. This would
correspond to a reﬂectionless potential in the original quantum mechanical context. In this
case it turns out that I can systematically derive a whole class of solutions to the original
KdV equation by just solving the GLM equation (5.56) and taking the appropriate limit
(5.57).
5.5.1 A single bound state
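The scattering kernel has the form
$$\hat{K}(z;t) = \alpha e^{-\kappa z + 8\kappa^3t} . \tag{5.58}$$
I will look for solutions of (5.56) which are separable, i.e.
$$\hat{g}_1(x,y;t) = e^{-\kappa y}h(x,t) ; \tag{5.59}$$
with the above Ansatz, (5.56) transforms to
$$h(x,t) + \alpha e^{8\kappa^3t}\left[e^{-\kappa x} + h(x,t)\int_x^\infty dy'\,e^{-2\kappa y'}\right] = 0 ;$$
inserting the expression for the integral, $(2\kappa)^{-1}e^{-2\kappa x}$, and setting $\alpha = 2\kappa e^{2\delta}$, I obtain
$$h(x,t)\left[1 + e^{2\delta}e^{-2\kappa x + 8\kappa^3t}\right] = -\alpha e^{-\kappa x + 8\kappa^3t}$$
$$\hat{g}_1(x,y;t) = -2\kappa\,\frac{e^{-\kappa(x+y) + 8\kappa^3t + 2\delta}}{1 + e^{-[2\kappa x - 8\kappa^3t - 2\delta]}} ;$$
the limiting form for $y = x$ is
$$\hat{g}_1(x,x;t) = -2\kappa\,\frac{1}{1 + e^{2[\kappa x - 4\kappa^3t - \delta]}} ;$$
it follows that
$$u(x;t) = -2\frac{d}{dx}\hat{g}_1(x,x;t) = -\frac{2\kappa^2}{\cosh^2[\kappa(x - 4\kappa^2t) - \delta]} \tag{5.60}$$
which, apart from the overall sign (the nonlinear terms in (4.40) and (5.1) have opposite signs, so that $u$ corresponds to $-\phi$), is identical to the solitary wave (4.46), if we identify $\lambda$ in (4.46) with $4\kappa^2$ in (5.60).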
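The last step of the calculation can be checked numerically. The sketch below takes the diagonal value $\hat{g}_1(x,x;t)$ obtained above, differentiates it with a central difference, and compares the result with the closed form (5.60). Function names and the step size `h` are illustrative assumptions.

```python
import math

def g1_diag(x, t, kappa, delta=0.0):
    """g1_hat(x, x; t) = -2 kappa / (1 + exp(2 [kappa x - 4 kappa^3 t - delta]))."""
    return -2.0 * kappa / (1.0 + math.exp(2.0 * (kappa * x - 4.0 * kappa ** 3 * t - delta)))

def u_from_glm(x, t, kappa, delta=0.0, h=1e-5):
    """u = -2 d/dx g1_hat(x, x; t), cf. (5.57), with a central-difference derivative."""
    dg = (g1_diag(x + h, t, kappa, delta) - g1_diag(x - h, t, kappa, delta)) / (2.0 * h)
    return -2.0 * dg

def u_soliton(x, t, kappa, delta=0.0):
    """Closed form (5.60): -2 kappa^2 / cosh^2(kappa (x - 4 kappa^2 t) - delta)."""
    return -2.0 * kappa ** 2 / math.cosh(kappa * (x - 4.0 * kappa ** 2 * t) - delta) ** 2

if __name__ == "__main__":
    for x in (-2.0, 0.0, 1.0, 3.0):
        print(x, u_from_glm(x, 0.5, 1.0), u_soliton(x, 0.5, 1.0))
```

The two columns agree to within the finite-difference error, and the minimum of $u$ sits at $x = 4\kappa^2 t + \delta/\kappa$, moving to the right with velocity $4\kappa^2$.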
Comments:
• The velocity of the wave corresponds (to within a factor of 4) to the eigenvalue of the associated problem.
• The velocity coincides with the ratio $P/M$ (cf. symmetries and conservation laws; show this (exercise)).
• Note that I made no attempt to guess the form of the wave. The form was "imposed" by the separation ansatz (5.59), i.e. it is "built in" in the association of the KdV equation with the linear eigenvalue problem. This will be useful in the next subsection, where I will try to construct solutions that correspond to multiple bound states.
• It is of course possible to treat the "proper" initial value problem. Starting with any localized potential which may support a bound state, one can perform the IST steps. Exercise: do this (i) for an attractive delta function potential $-\mu\delta(x)$, and (ii) for a potential of the type $-N(N+1)\,\mathrm{sech}^2x$, where $N$ is an integer (hint: in case (ii) the form of the solution is an example of the case discussed in the next subsection).
5.5.2 Multiple bound states
Again, I will restrict myself to the case where the reflection coefficient vanishes. The scattering kernel has the form
$$\hat{K}(z;t) = \sum_{i=1}^{N}\alpha_i(t)\,e^{-\kappa_iz} , \tag{5.61}$$
where $\alpha_i(t) = \alpha_ie^{8\kappa_i^3t}$ carries the time dependence. A generalized form of the separation ansatz (5.59)
$$\hat{g}_1(x,y;t) = \sum_{i=1}^{N}e^{-\kappa_iy}h_i(x,t) \tag{5.62}$$
transforms the GLM equation (5.56) to
$$\sum_{i=1}^{N}e^{-\kappa_iy}\left\{h_i(x,t) + \alpha_i(t)e^{-\kappa_ix} + \sum_{j=1}^{N}\frac{\alpha_i(t)}{\kappa_i+\kappa_j}\,e^{-(\kappa_i+\kappa_j)x}h_j(x,t)\right\} = 0 ,$$
which must hold for all $y > x$; hence, in nonsymmetric matrix form,
$$\sum_{j=1}^{N}A_{ij}h_j(x,t) = C_i(x,t) \tag{5.63}$$
where
$$A_{ij}(x,t) = \delta_{ij} + \frac{\alpha_i(t)}{\kappa_i+\kappa_j}\,e^{-(\kappa_i+\kappa_j)x} \tag{5.64}$$
and
$$C_i(x,t) = -\alpha_i(t)e^{-\kappa_ix} . \tag{5.65}$$
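Thus
$$h_j = \frac{1}{\det A}\,\det\begin{pmatrix} A_{11} & A_{12} & \cdots & C_1 & A_{1\,j+1} & \cdots \\ A_{21} & A_{22} & \cdots & C_2 & A_{2\,j+1} & \cdots \\ \cdots & \cdots & \cdots & C_3 & \cdots & \cdots \\ \vdots & & & \vdots & & \end{pmatrix} , \tag{5.66}$$
where the $j$th column in the matrix $A$ has been substituted by the vector $C$; it follows that
$$\hat{g}_1(x,x) = \frac{1}{\det A}\sum_{j=1}^{N}\det\begin{pmatrix} A_{11} & A_{12} & \cdots & C_1e^{-\kappa_jx} & A_{1\,j+1} & \cdots \\ A_{21} & A_{22} & \cdots & C_2e^{-\kappa_jx} & A_{2\,j+1} & \cdots \\ \cdots & \cdots & \cdots & C_3e^{-\kappa_jx} & \cdots & \cdots \\ \vdots & & & \vdots & & \end{pmatrix} ; \tag{5.67}$$
note however that, since
$$\frac{dA_{ij}}{dx} = -\alpha_ie^{-(\kappa_i+\kappa_j)x} = C_ie^{-\kappa_jx} ,$$
this is equivalent to
$$\hat{g}_1(x,x) = \frac{d}{dx}\ln\det A . \tag{5.68}$$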
At this stage it is convenient to introduce the symmetrized form of the matrix $A$, obtained by $\hat{A} = DAD^{-1}$, where $D_{ij} = (\alpha_i)^{-1/2}\delta_{ij}$, i.e.
$$\hat{A}_{ij}(x,t) = \delta_{ij} + \frac{(\alpha_i\alpha_j)^{1/2}}{\kappa_i+\kappa_j}\,e^{-(\kappa_i+\kappa_j)x} , \tag{5.69}$$
whereupon
$$u(x,t) = -2\frac{d^2}{dx^2}\ln\det\hat{A} \tag{5.70}$$
where I have reintroduced the time dependence, with the understanding that it arises solely from the $\alpha_i$'s.
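Application: $N = 2$, the two-soliton solution
In the case $N = 2$
$$\det\hat{A} = 1 + \frac{\alpha_1}{2\kappa_1}e^{-2\kappa_1x} + \frac{\alpha_2}{2\kappa_2}e^{-2\kappa_2x} + \frac{\alpha_1\alpha_2}{4\kappa_1\kappa_2}\left(\frac{\kappa_1-\kappa_2}{\kappa_1+\kappa_2}\right)^2e^{-2(\kappa_1+\kappa_2)x}$$
or, setting
$$\frac{\kappa_1-\kappa_2}{\kappa_1+\kappa_2} \equiv e^{-\Delta} , \qquad \alpha_j \equiv 2\kappa_je^{2\theta_j+\Delta} , \tag{5.71}$$
$$\det\hat{A} = 1 + e^{-2(\kappa_1x - \theta_1 - \frac{\Delta}{2})} + e^{-2(\kappa_2x - \theta_2 - \frac{\Delta}{2})} + e^{-2[(\kappa_1+\kappa_2)x - (\theta_1+\theta_2)]} . \tag{5.72}$$
Note that now the time dependence is carried by the $\theta_j$'s, i.e.,
$$\theta_j \to \theta_j(t) = \theta_j^0 + 4\kappa_j^3t \tag{5.73}$$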
In order to extract the asymptotic behavior of $u(x,t)$ at early and late times, I proceed as follows: Assume $\kappa_1 > \kappa_2$ without loss of generality. Then, as $t \to -\infty$, at sufficiently early times, it is possible to satisfy the strong inequality
$$\frac{\theta_1}{\kappa_1} \ll \frac{\theta_2}{\kappa_2} .$$
It is easy to see that, unless $x \approx \theta_1/\kappa_1$ or $x \approx \theta_2/\kappa_2$, the 2nd derivative of the logarithm of the expression (5.72) will be vanishingly small. This is true
• for $x \gg \theta_2/\kappa_2$, because the last three terms vanish, leaving $\det\hat{A} = 1$;
• for $x \ll \theta_1/\kappa_1$, because, although the three last terms are all exponentially large, the last one will be dominant. This leaves $\ln\det\hat{A} \propto x$ and the second derivative vanishes;
• for $\theta_1/\kappa_1 \ll x \ll \theta_2/\kappa_2$, because the second term will be exponentially small, and the third term will be much larger than the last. Again, $\ln\det\hat{A} \propto x$ and the second derivative vanishes.
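This leaves the cases where $x$ is appreciably near either $\theta_1/\kappa_1$ or $\theta_2/\kappa_2$. In the first case, the contributions to (5.72) come from the 3rd and 4th terms, i.e.
$$\det\hat{A} \approx e^{-2(\kappa_2x - \theta_2 - \frac{\Delta}{2})}\left[1 + e^{-2(\kappa_1x - \theta_1 + \frac{\Delta}{2})}\right] ,$$
or,
$$\ln\det\hat{A} \approx -2\left(\kappa_2x - \theta_2 - \frac{\Delta}{2}\right) - \left(\kappa_1x - \theta_1 + \frac{\Delta}{2}\right) + \ln\left[2\cosh\left(\kappa_1x - \theta_1 + \frac{\Delta}{2}\right)\right] ,$$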
[Figure 5.1: left panel, 3d plot of $-u(x,t)$ against $x$ and $t$; right panel, contour plot of the same function against $x$ and $t$. Legend: $\kappa_1 = 2$, $\kappa_2 = 1$, $\theta_1^0/\kappa_1 = 2$, $\theta_2^0/\kappa_2 = 1$.]
Figure 5.1: The two-soliton solution $[-u(x,t)]$ of the KdV equation as a function of space and time. Left panel: a 3d plot shows the collision of the two solitons. Right panel: a contour plot of the same function; note the asymptotic motion of the local maxima and the phase shifts as a result of the interaction.
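and hence
$$u(x,t) \approx -2\kappa_1^2\,\mathrm{sech}^2\left(\kappa_1x - \theta_1 + \frac{\Delta}{2}\right) \quad \text{if } x \approx \theta_1/\kappa_1 .$$
Similarly, it can be shown that
$$u(x,t) \approx -2\kappa_2^2\,\mathrm{sech}^2\left(\kappa_2x - \theta_2 - \frac{\Delta}{2}\right) \quad \text{if } x \approx \theta_2/\kappa_2 .$$
Combining the above, and reintroducing the explicit time dependence, I can write that
$$u(x,t) \sim -2\sum_{j=1}^{2}\kappa_j^2\,\mathrm{sech}^2\left(\kappa_jx - 4\kappa_j^3t - \theta_j^0 \pm \frac{\Delta}{2}\right) \quad \text{if } t \to -\infty , \tag{5.74}$$
where the upper sign holds for $j = 1$, and the lower for $j = 2$. The above analysis can be repeated almost verbatim for asymptotically late times and leads to
$$u(x,t) \sim -2\sum_{j=1}^{2}\kappa_j^2\,\mathrm{sech}^2\left(\kappa_jx - 4\kappa_j^3t - \theta_j^0 \mp \frac{\Delta}{2}\right) \quad \text{if } t \to \infty . \tag{5.75}$$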
The above equations describe the soliton property in a mathematically exact fashion. As we follow the evolution from very early to very late times, we see the larger (and faster) local compression reach the smaller (and slower) one, interact with it in an apparently intricate fashion, and then disengage itself and resume its motion with the same velocity. Both waves maintain shape, amplitude and speed. The interaction does however leave a signature. The center of mass of each wave becomes slightly displaced; the faster by an amount of $\Delta/\kappa_1$ (forwards), the slower by an amount of $-\Delta/\kappa_2$ (backwards). Note that because the mass of each soliton is proportional to $\kappa_j$, the center of mass of the combined two-soliton system moves at a constant speed before and after the two-soliton collision. This type of elastic, transparent interaction which leaves velocities unchanged and results only in spatial shifts (cf. footnote 2) is characteristic of soliton-bearing systems, and accounts for their remarkable dynamical properties.
The analysis can be generalized to the $N$-soliton solution. It can be shown that phase shifts are pairwise additive, i.e. the total phase shift of any soliton as a result of its interaction with the other $N-1$ solitons is the sum of the $N-1$ phase shifts resulting from the $N-1$ collisions.
Fig. 5.1 exhibits graphically the dependence of the two-soliton solution on space and time.
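The asymptotic statements (5.74) and (5.75) can be probed numerically from the closed form (5.72). The sketch below evaluates $\ln\det\hat{A}$ in a numerically stable way, obtains $u$ from (5.70) by a central second difference, and computes the soliton centers predicted by (5.74) and (5.75). The default parameters ($\kappa_1 = 2$, $\kappa_2 = 1$, $\theta_j^0 = 0$), the step `h` and all function names are illustrative assumptions.

```python
import math

def log_det_A(x, t, k1=2.0, k2=1.0, th1_0=0.0, th2_0=0.0):
    """ln det A_hat from (5.72), with theta_j(t) = theta_j^0 + 4 kappa_j^3 t, eq. (5.73)."""
    delta = math.log((k1 + k2) / (k1 - k2))  # exp(-Delta) = (k1 - k2)/(k1 + k2), eq. (5.71)
    th1 = th1_0 + 4.0 * k1 ** 3 * t
    th2 = th2_0 + 4.0 * k2 ** 3 * t
    e1 = -2.0 * (k1 * x - th1 - 0.5 * delta)
    e2 = -2.0 * (k2 * x - th2 - 0.5 * delta)
    e3 = -2.0 * ((k1 + k2) * x - (th1 + th2))
    m = max(0.0, e1, e2, e3)  # log-sum-exp shift to avoid overflow
    return m + math.log(math.exp(-m) + math.exp(e1 - m) + math.exp(e2 - m) + math.exp(e3 - m))

def u_two_soliton(x, t, h=1e-3, **params):
    """u = -2 d^2/dx^2 ln det A_hat, eq. (5.70), via a central second difference."""
    f = lambda y: log_det_A(y, t, **params)
    return -2.0 * (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

def soliton_centers(t, sign, k1=2.0, k2=1.0, th1_0=0.0, th2_0=0.0):
    """Centers from (5.74) (sign=+1, early times) or (5.75) (sign=-1, late times)."""
    delta = math.log((k1 + k2) / (k1 - k2))
    x1 = (4.0 * k1 ** 3 * t + th1_0 - sign * 0.5 * delta) / k1
    x2 = (4.0 * k2 ** 3 * t + th2_0 + sign * 0.5 * delta) / k2
    return x1, x2

if __name__ == "__main__":
    for t, sign in ((-2.0, +1), (2.0, -1)):
        x1, x2 = soliton_centers(t, sign)
        print(t, u_two_soliton(x1, t), u_two_soliton(x2, t))
```

Well before and well after the collision the values at the predicted centers approach $-2\kappa_1^2$ and $-2\kappa_2^2$, and comparing `soliton_centers(t, +1)` with `soliton_centers(t, -1)` at the same `t` gives the shifts $\Delta/\kappa_1$ and $-\Delta/\kappa_2$ quoted above.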
5.6 Integrals of motion
It is possible to deal with integrals of motion in a systematic fashion, by following the
analytic structure of the scattering data. Recall that the transmission coeﬃcient does not
carry any time dependence under the IST, i.e. it can be treated as a constant of the motion!
5.6.1 Lemma: a useful representation of a(k)
Given the fact that $a(k)$ (recall that $a$ is the inverse of the transmission coefficient) has simple zeros in the upper half of the complex plane, the following identity holds:
$$\ln a(k) = \frac{1}{2\pi i}\int_{-\infty}^{\infty}dk'\,\frac{\ln|a(k')|^2}{k'-k} + \sum_{j=1}^{N}\ln\left(\frac{k-k_j}{k-k_j^*}\right) \tag{5.76}$$
(cf. appendix ...).
5.6.2 Asymptotic expansions of a(k)
The asymptotic expansion
$$\ln a(k) \sim \sum_{n=1}^{\infty}\frac{J_n}{(2ik)^n} \tag{5.77}$$
holds for $|k| > \max\{|k_j|\}$.
Multiply both sides of (5.77) by $k^{l-1}/(2\pi i)$ and integrate over a circle of radius $R > \max\{|k_j|\}$ centered at the origin of the complex $k$-plane. The only term which survives in the sum is that with $n = l$, hence
$$(2i)^{-l}J_l = \frac{1}{2\pi i}\oint dk\,k^{l-1}\ln a(k) ;$$
performing the $dk$ integration in the first term of (5.76) generates a contribution $-k'^{\,l-1}$. The second term can be integrated by parts and generates contributions from all poles. This results in
$$\frac{J_l}{(2i)^l} = -\frac{1}{2\pi i}\int_{-\infty}^{\infty}dk\,k^{l-1}\ln|a(k)|^2 + \sum_{j=1}^{N}\frac{1}{l}\left(k_j^{*\,l} - k_j^l\right) . \tag{5.78}$$
(Footnote 2: the term "phase shifts" is generically applicable.)
So far this has been general. Applying to the KdV equation, I set $k_j = i\kappa_j$; note that the terms in the discrete sum vanish if $l$ is even. Due to the reflection symmetry $|a(k)| = |a(-k)|$, the integrals vanish as well for even $l$. This leaves
$$\frac{J_{2m+1}}{2^{2m+1}} = -(-1)^m\,\frac{1}{2\pi}\int_{-\infty}^{\infty}dk\,k^{2m}\ln|a(k)|^2 + 2\sum_{j=1}^{N}\frac{\kappa_j^{2m+1}}{2m+1} . \tag{5.79}$$
In what follows, I will relate the $J_l$'s (which are by definition constants of the motion, since they only depend on $a(k)$ and bound state eigenvalues) to the family of conserved quantities generated in the previous section, independently of the IST.
From the general property of the Jost functions
$$f_2(x,k) = a(k)f_1(x,-k) + b(k)f_1(x,k) \tag{5.80}$$
I deduce that if $\mathrm{Im}\,k > 0$
$$\lim_{x\to\infty}\left[f_2(x,k)e^{ikx}\right] = a(k) . \tag{5.81}$$
This is because in the limit of large positive $x$ the first term in (5.80), $f_1(x,-k) \sim e^{-ikx}$, is exponentially large and the second, $f_1(x,k) \sim e^{ikx}$, is exponentially small. On the other hand,
$$\lim_{x\to-\infty}\left[f_2(x,k)e^{ikx}\right] = 1 \tag{5.82}$$
holds by definition. Knowledge of the two limits allows me to define
$$\sigma(x) = \frac{d}{dx}\left\{\ln\left[f_2(x,k)e^{ikx}\right]\right\} = \frac{f_2'}{f_2} + ik \tag{5.83}$$
with the property
$$\int_{-\infty}^{\infty}dx\,\sigma(x) = \ln a(k) . \tag{5.84}$$
2
is a solution of the associated linear problem, to derive
a diﬀerential equation for σ in terms of u. To do this I multiply both sides of (5.83) and
diﬀerentiate with respect to x. This gives
f
2
+ikf
2
= f
2
σ +f
2
σ
,
or
uf
2
−k
2
f
2
+ikf
2
= f
2
σ +f
2
σ
;
substituting f
2
= (σ −ik)f
2
generates
σ
−2ikσ +σ
2
−u = 0 (5.85)
a nonlinear ordinary ﬁrst order diﬀerential equation of the Ricatti type.
I can now try to generate an asymptotic solution of the Ricatti equation (5.85),
σ(x, k) ∼
∞
n=1
σ
n
(x, k)
(2ik)
n
(5.86)
where I note that, because of (5.77) and (5.84),
_
∞
−∞
dx σ
n
(x) = J
n
. (5.87)
40
5 Solving KdV by inverse scattering
Indeed, I note that the asymptotic ansatz (5.86) in (5.85) generates the recurrence relationships
$$\sigma_{n+1}(x) = \frac{d}{dx}\sigma_n(x) + \sum_{j=1}^{n-1}\sigma_j(x)\sigma_{n-j}(x) , \qquad n = 1, 2, 3, \dots \tag{5.88}$$
(the sum being empty for $n = 1$), with $\sigma_1(x) = -u(x)$. This generates a countable infinity of conserved densities. The first few are
$$\sigma_2 = -u_x \tag{5.89}$$
$$\sigma_3 = -u_{xx} + u^2 \tag{5.90}$$
$$\sigma_4 = -u_{xxx} + 2(u^2)_x \tag{5.91}$$
$$\sigma_5 = -u_{xxxx} + 2(u^2)_{xx} + u_x^2 + 2uu_{xx} - 2u^3 . \tag{5.92}$$
Note that the even $\sigma_n$'s are total derivatives, i.e. they generate trivial, vanishing integrals; we know this, since the corresponding $J_n$'s vanish. The first few odd $\sigma_n$'s generate the mass, momentum, and energy integrals of section ....
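For a reflectionless potential $\ln|a(k)|^2 = 0$, so (5.79) predicts $J_{2m+1} = 2^{2m+1}\cdot 2\sum_j\kappa_j^{2m+1}/(2m+1)$. The sketch below checks this for a single soliton, assumed to be $u = -2\kappa^2\,\mathrm{sech}^2(\kappa x)$, by integrating the odd densities numerically. Total derivatives are dropped, so that $\sigma_1 \to -u$, $\sigma_3 \to u^2$ and $\sigma_5 \to -u_x^2 - 2u^3$ (the last after one integration by parts of $2uu_{xx}$). The Simpson integrator, the cutoff `half_width` and the function names are illustrative choices.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def soliton_integrals(kappa, half_width=40.0):
    """Numerical J_1, J_3, J_5 for u = -2 kappa^2 sech^2(kappa x)."""
    u = lambda x: -2.0 * kappa ** 2 / math.cosh(kappa * x) ** 2
    u_x = lambda x: 4.0 * kappa ** 3 * math.tanh(kappa * x) / math.cosh(kappa * x) ** 2
    a, b = -half_width, half_width
    J1 = integrate(lambda x: -u(x), a, b)
    J3 = integrate(lambda x: u(x) ** 2, a, b)
    J5 = integrate(lambda x: -u_x(x) ** 2 - 2.0 * u(x) ** 3, a, b)
    return J1, J3, J5

def J_reflectionless(l, kappas):
    """Discrete part of (5.79) for odd l: J_l = 2^l * 2 * sum_j kappa_j^l / l."""
    return 2.0 ** l * 2.0 * sum(k ** l for k in kappas) / l

if __name__ == "__main__":
    kappa = 1.0
    print(soliton_integrals(kappa))
    print([J_reflectionless(l, [kappa]) for l in (1, 3, 5)])
```

For $\kappa = 1$ both lines give $4$, $16/3$ and $64/5$ to within the accuracy of the quadrature.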
5.6.3 IST as a canonical transformation to action-angle variables
It can be shown [] that the inverse scattering transform is a canonical transformation from the original field variables to action-angle variables. The scattering data of the IST are in essence action-angle variables. This demonstrates that the KdV Hamiltonian system is completely integrable.
6 Solitons in anharmonic lattice dynamics: the Toda lattice
The Toda lattice [11] is a unique example of a nonlinear discrete particle system which is completely integrable. Although the property of complete integrability is certainly a singular feature due to the peculiarity of the lattice potential, the model has been extremely useful as a theoretical laboratory for the exploration of a number of novel concepts and phenomena related to loss-free supersonic pulse propagation.
6.1 The model
The Hamiltonian
$$H = \sum_n\left[\frac{p_n^2}{2m} + \phi(q_n - q_{n-1})\right] \tag{6.1}$$
where
$$\phi(r) = \frac{a}{b}\left[e^{-br} + br - 1\right] \tag{6.2}$$
describes a chain of $N$ particles with equal mass $m$, which interact via a nearest-neighbor repulsive potential of exponential form. The range of the potential is given by $1/b$ and its strength by $a/b$. The linear term in the potential represents an external force which is necessary to achieve confinement. Important limiting cases of (6.2) are:
• the harmonic limit
$$a \to \infty , \quad b \to 0 , \quad ab \to k$$
which leads to
$$\phi(r) = \frac{1}{2}kr^2 ;$$
• the hard-sphere limit
$$b \to \infty$$
(i.e. the range approaches zero) with $a$ finite, which leads to
$$\phi(r) = \begin{cases} 0 & \text{if } r > 0 \\ \infty & \text{if } r < 0 . \end{cases}$$
I will, from now on, set $a = 1$, $b = 1$, $m = 1$. Units will be reintroduced when appropriate.
The equations of motion are
$$\begin{aligned} \dot{q}_n &= p_n \\ \dot{p}_n &= e^{-(q_n - q_{n-1})} - e^{-(q_{n+1} - q_n)} \end{aligned} \tag{6.3}$$
6.2 The dual lattice
Consider the variables
$$r_n = q_n - q_{n-1}$$
which describe the difference in displacements of neighboring sites. Differentiating both sides with respect to time gives
$$\begin{aligned} \dot{r}_1 &= \dot{q}_1 - \dot{q}_0 \\ \dot{r}_2 &= \dot{q}_2 - \dot{q}_1 \\ &\ \,\vdots \\ \dot{r}_j &= \dot{q}_j - \dot{q}_{j-1} . \end{aligned}$$
Summing left and right sides separately allows me to express the velocity coordinates as
$$\dot{q}_j = \sum_{l=1}^{j}\dot{r}_l , \tag{6.4}$$
where I have assumed ˙ q
0
= 0. The total kinetic energy is
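$$ T = \frac{1}{2} \sum_{n=1}^{N} \left( \sum_{l=1}^{n} \dot r_l \right)^2 . $$
If I now define new momentum variables, conjugate to the $r_n$ coordinates, via
$$ s_j = \frac{\partial T}{\partial \dot r_j} = (\dot r_1 + \cdots + \dot r_j) + (\dot r_1 + \cdots + \dot r_{j+1}) + \cdots + (\dot r_1 + \cdots + \dot r_N) , $$
the $s_j$'s will satisfy
$$ s_j - s_{j+1} = \dot r_1 + \cdots + \dot r_j = \dot q_j \tag{6.5} $$
and therefore I can rewrite the kinetic energy as
$$ T = \frac{1}{2} \sum_{j=1}^{N} (s_{j+1} - s_j)^2 . $$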
Since the total potential energy is clearly only a function of the $r_n$'s,
$$ \sum_n \phi(r_n) , $$
this process has defined a new canonically conjugate set of variables. Following Toda [11]
we view this set as describing a new lattice, "dual" to the original. The equations of motion
are
$$ \dot s_j = -\frac{\partial H}{\partial r_j} = -\phi'(r_j) , \qquad \dot r_j = \frac{\partial H}{\partial s_j} = 2 s_j - s_{j+1} - s_{j-1} \tag{6.6} $$
and describe, by construction, the same dynamics as the original equations of motion (6.3).
Figure 6.1: The local compression $r_n$ vs n corresponding to a Toda soliton (6.13). The value of
α is equal to 1. The three curves represent different choices of the phase δ (δ/α = 0, 3.5, 7.25).
Inset: the dependence of $-r_n$ vs n on a logarithmic scale for the case δ = 0.
Now it is possible to eliminate either the $s_n$'s or the $r_n$'s from (6.6). In the first case I
obtain
$$ \ddot r_n = 2 e^{-r_n} - e^{-r_{n+1}} - e^{-r_{n-1}} \tag{6.7} $$
and in the second
$$ 1 + \ddot S_n = e^{-2 S_n + S_{n+1} + S_{n-1}} \tag{6.8} $$
where
$$ S_n = \int_0^t dt' \, s_n(t') . \tag{6.9} $$
Note that owing to (6.6),
$$ q_n = S_n - S_{n+1} . \tag{6.10} $$
6.2.1 A pulse soliton
A special solution of (6.8) is
$$ S_n(t) = \ln \cosh(\alpha n \mp \beta t - \delta) \tag{6.11} $$
where $\alpha > 0$, $\delta$ is an arbitrary constant, and $\beta = \sinh\alpha$. Differentiating with respect to
time, I obtain
$$ s_n(t) = \mp \beta \tanh(\alpha n \mp \beta t - \delta) \tag{6.12} $$
and therefore
$$ e^{-r_n} = 1 + \frac{\beta^2}{\cosh^2(\alpha n \mp \beta t - \delta)} . \tag{6.13} $$
This special solution corresponds to a supersonic, compressional pulse soliton moving with
the velocity
$$ v = \pm \frac{\beta}{\alpha} = \pm \frac{\sinh\alpha}{\alpha} . $$
The form of the pulse is shown in Fig. 6.1.
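That (6.11) solves (6.8) can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names, the finite-difference step `h` and the choice of test points are illustrative assumptions. It evaluates the residual of (6.8), using a central difference for the second time derivative.

```python
import math

def S(n, t, alpha, delta=0.0, sign=-1):
    """S_n(t) of Eq. (6.11); sign=-1 selects the argument alpha*n - beta*t - delta."""
    beta = math.sinh(alpha)
    return math.log(math.cosh(alpha * n + sign * beta * t - delta))

def residual_6_8(n, t, alpha, delta=0.0, sign=-1, h=1e-3):
    """Left-hand side minus right-hand side of Eq. (6.8) at site n and time t."""
    # central finite difference for the second time derivative of S_n
    S_ddot = (S(n, t + h, alpha, delta, sign)
              - 2.0 * S(n, t, alpha, delta, sign)
              + S(n, t - h, alpha, delta, sign)) / h**2
    rhs = math.exp(-2.0 * S(n, t, alpha, delta, sign)
                   + S(n + 1, t, alpha, delta, sign)
                   + S(n - 1, t, alpha, delta, sign))
    return 1.0 + S_ddot - rhs

if __name__ == "__main__":
    worst = max(abs(residual_6_8(n, t, alpha=1.0, delta=0.3))
                for n in range(-5, 6) for t in (0.0, 0.7, 2.0))
    print("largest residual of (6.8):", worst)
```

The residual is limited only by the finite-difference error, which reflects the identity $\cosh(\theta+\alpha)\cosh(\theta-\alpha) = \cosh^2\theta + \sinh^2\alpha$ together with $\beta = \sinh\alpha$.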
Mass of the soliton
The total mass carried by the soliton can be shown, using (6.11), to be
$$ M = \sum_j r_j = \lim_{n \to \infty} (q_n - q_{-n}) = -2\alpha . \tag{6.14} $$
Momentum of the soliton
The total lattice momentum carried by the soliton can be shown, using (6.12), to be
$$ P = \sum_j \dot q_j = \lim_{n \to \infty} (s_n - s_{-n}) = \mp 2\beta = M v . \tag{6.15} $$
Energy of the soliton
The total energy of the soliton is given by
$$ \frac{1}{2} \sum_n (s_{n+1} - s_n)^2 + \sum_n \left( e^{-r_n} - 1 \right) + \sum_n r_n . $$
The sum of the first two terms can be shown to be $\sinh 2\alpha$; the third sum we recognize as
the soliton mass. Thus
$$ E = \sinh 2\alpha - 2\alpha . \tag{6.16} $$
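The mass (6.14) and energy (6.16) can be confirmed by summing the soliton profile directly over a finite window of sites. This is a hedged sketch: the function name, the window size `n_max` and the parameter values are assumptions made for illustration, and the upper sign in (6.11) is used.

```python
import math

def soliton_sums(alpha, delta=0.0, t=0.0, n_max=60):
    """Return (mass, energy) of the Toda soliton summed over sites -n_max..n_max."""
    beta = math.sinh(alpha)

    def theta(n):
        return alpha * n - beta * t - delta

    def s(n):
        # dual momentum, Eq. (6.12), upper sign
        return -beta * math.tanh(theta(n))

    def r(n):
        # relative displacement obtained from Eq. (6.13)
        return -math.log(1.0 + (beta / math.cosh(theta(n))) ** 2)

    sites = range(-n_max, n_max + 1)
    mass = sum(r(n) for n in sites)
    kinetic = 0.5 * sum((s(n + 1) - s(n)) ** 2 for n in sites)
    potential = sum(math.exp(-r(n)) - 1.0 + r(n) for n in sites)
    return mass, kinetic + potential

if __name__ == "__main__":
    alpha = 1.0
    mass, energy = soliton_sums(alpha, delta=0.3)
    print(mass, -2.0 * alpha)
    print(energy, math.sinh(2.0 * alpha) - 2.0 * alpha)
```

Because the pulse is exponentially localized, a window of a few tens of sites around its center is enough for the sums to converge to machine precision.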
6.3 Complete integrability
Define new coordinates in terms of the original positions and momenta
$$ a_n = \frac{1}{2} e^{-\frac{1}{2}(q_n - q_{n-1})} , \qquad b_n = -\frac{1}{2} p_n . \tag{6.17} $$
Using the original equations of motion, I obtain
$$ \dot b_n = -\frac{1}{2} \dot p_n = 2 (a_{n+1}^2 - a_n^2) \tag{6.18} $$
and
$$ \ln(2 a_n) = -\frac{1}{2} (q_n - q_{n-1}) , \qquad \frac{\dot a_n}{a_n} = -\frac{1}{2} (p_n - p_{n-1}) , \qquad \dot a_n = a_n (b_n - b_{n-1}) . \tag{6.19} $$
Note that decaying boundary conditions at (plus or minus) infinity correspond to $a_n \to 1/2$,
$b_n \to 0$. This allows for a constant value of the displacement q (cf. the pulse solution of the
previous section). Now one can directly verify that the set of equations is equivalent to the
condition
$$ i \frac{dL}{dt} = [B, L] , \tag{6.20} $$
where
$$ L = \begin{pmatrix}
\ddots & & & & & \\
 & b_{n-1} & a_n & 0 & 0 & \\
 & a_n & b_n & a_{n+1} & 0 & \\
 & 0 & a_{n+1} & b_{n+1} & a_{n+2} & \\
 & 0 & 0 & a_{n+2} & b_{n+2} & \\
 & & & & & \ddots
\end{pmatrix} \tag{6.21} $$
and
$$ B = i \begin{pmatrix}
\ddots & & & & & \\
 & 0 & a_{n-1} & 0 & 0 & \\
 & -a_{n-1} & 0 & a_n & 0 & \\
 & 0 & -a_n & 0 & a_{n+1} & \\
 & 0 & 0 & -a_{n+1} & 0 & \\
 & & & & & \ddots
\end{pmatrix} \tag{6.22} $$
are tridiagonal matrices which form a Lax pair. In other words, the Toda lattice with
decaying boundary conditions can be completely integrated using the inverse scattering
transform. Details can be found in [11].
This means that there are multisoliton solutions, and that Toda solitons have all the nice
properties of exact solitons which we encountered in the KdV example (e.g. elastic scattering
which only results in phase shifts etc).
6.4 Thermodynamics
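The partition function of the Toda chain
$$ Z = \int \left( \prod_{i=1}^{N} dp_i \, dq_i \right) e^{-\beta H} , $$
where β is the inverse temperature, can be factorized into two contributions, $Z_K$ and $Z_P$,
coming from the kinetic and potential energy respectively. The integration over momentum
variables gives a product of N identical integrals,
$$ Z_K = \left[ \int_{-\infty}^{\infty} dp \, e^{-\beta p^2 / 2} \right]^N = \left( \frac{2\pi}{\beta} \right)^{N/2} $$
whereas the integration over position coordinates gives
$$ \begin{aligned}
Z_P &= \int_{-\infty}^{\infty} \left( \prod_{i=1}^{N} dq_i \right) e^{-\beta \sum_{i=1}^{N} \phi(q_i - q_{i-1})}
     = \int_{-\infty}^{\infty} \left( \prod_{i=1}^{N} dr_i \right) e^{-\beta \sum_{i=1}^{N} \phi(r_i)} \\
    &= \left[ \int_{-\infty}^{\infty} dr \, e^{-\beta \phi(r)} \right]^N
     = \left[ e^{\beta} \int_0^{\infty} dy \, y^{\beta - 1} e^{-\beta y} \right]^N
     = \left[ e^{\beta} \beta^{-\beta} \Gamma(\beta) \right]^N
\end{aligned} \tag{6.23} $$
where the substitution $y = e^{-r}$ has been made. Combining terms I obtain the free energy
per site
$$ f = -\frac{1}{N} \frac{1}{\beta} \ln Z = -1 + \ln\beta - \frac{1}{\beta} \ln\Gamma(\beta) + \frac{1}{2\beta} \ln\frac{\beta}{2\pi} . \tag{6.24} $$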
At low temperatures, $\beta \gg 1$, one can use the Stirling approximation to the gamma function
$$ \Gamma(z) \sim e^{-z} z^{z - 1/2} \sqrt{2\pi} \left( 1 + \frac{1}{12 z} + \cdots \right) \tag{6.25} $$
and obtain
$$ f \sim \frac{1}{\beta} \ln\frac{\beta}{2\pi} - \frac{1}{12 \beta^2} + \cdots \tag{6.26} $$
where the first term is identified as the free energy per site of a harmonic chain, and the
second is the leading term of a systematic asymptotic expansion in powers of the temperature.
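The quality of the low-temperature expansion can be judged by comparing (6.24) with (6.26) numerically. The sketch below is illustrative only; the function names are assumptions, and `math.lgamma` from the Python standard library supplies $\ln\Gamma(\beta)$.

```python
import math

def free_energy_exact(beta):
    """Free energy per site, Eq. (6.24), in units a = b = m = 1."""
    return (-1.0 + math.log(beta) - math.lgamma(beta) / beta
            + math.log(beta / (2.0 * math.pi)) / (2.0 * beta))

def free_energy_low_T(beta):
    """The two leading terms of the low-temperature expansion, Eq. (6.26)."""
    return math.log(beta / (2.0 * math.pi)) / beta - 1.0 / (12.0 * beta**2)

if __name__ == "__main__":
    for beta in (2.0, 5.0, 20.0, 50.0):
        exact, approx = free_energy_exact(beta), free_energy_low_T(beta)
        print(f"beta={beta:5.1f}  exact={exact:.10f}  expansion={approx:.10f}")
```

The difference between the two expressions shrinks rapidly as β grows, as expected for an asymptotic series in the temperature 1/β.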
7 Chaos in low dimensional systems
7.1 Visualization of simple dynamical systems
7.1.1 Two dimensional phase space
Linear stability analysis
Consider the following general dynamical system consisting of two coupled differential
equations,
$$ \dot{\vec x} = \vec F(\vec x) , \tag{7.1} $$
where $F_1(x_1, x_2)$, $F_2(x_1, x_2)$ are arbitrary, in general nonlinear, functions of $x_1$, $x_2$. Note that
(7.1) does not necessarily represent a mechanical system. It could for example represent a
coupled system of prey-predator species with populations $x_1$ and $x_2$ respectively, for which
$$ F_1(x_1, x_2) = r x_1 - k x_1 x_2 , \qquad F_2(x_1, x_2) = -s x_2 + k' x_1 x_2 \tag{7.2} $$
(Lotka-Volterra equation). In the absence of interaction, the prey and predator populations
will, respectively, grow and die off, at exponential rates (Malthus model of population
biology). Interaction creates new possibilities. Note first that for $x_1^* = s/k'$, $x_2^* = r/k$, the
right-hand side of (7.2) vanishes. The two populations may coexist stably at these levels.
Suppose however that you are dealing with fish populations, and some outside agent, without
the power to modify the biological parameters, simply removes a part of one (or both)
populations. If the perturbation is large, we would have to solve the full system (7.1) with
the new set of initial conditions. For small perturbations however, it is possible to make
some general statements about the system's behavior in terms of linear stability analysis.
Consider a state of the system near the fixed point, i.e.
$$ \vec x = \vec x^* + \delta\vec x(t) . \tag{7.3} $$
If $\delta\vec x(t)$ is sufficiently small, we may expand (7.1) around the fixed point, and obtain
$$ \frac{d}{dt} \delta\vec x(t) = M \, \delta\vec x(t) \tag{7.4} $$
where
$$ M_{ij} = \left( \frac{\partial F_i}{\partial x_j} \right)_{\vec x = \vec x^*} . $$
The ansatz $\delta\vec x(t) = \exp(\lambda t) \, \vec f$ leads to the eigenvalue equation
$$ M \vec f = \lambda \vec f . \tag{7.5} $$
A perturbation which has a nonzero component along an eigenvector with positive eigenvalue
will grow exponentially. On the other hand, if both eigenvalues are negative, the system will
be stable in all directions around the fixed point. The various possibilities are summarized
as follows:
• $\lambda_1$, $\lambda_2$ real.
If $\lambda_1 \lambda_2 < 0$ we have a saddle (stable in one direction, unstable in the other); in the
special case $\lambda_1 + \lambda_2 = 0$, the saddle is called a hyperbolic fixed point.
If $\lambda_1 \lambda_2 > 0$ we have a node. A node is stable if both eigenvalues are negative, and
unstable if both eigenvalues are positive.
• If $\lambda_1$, $\lambda_2$ are complex conjugates we have a focus. A focus will be stable or unstable
according to whether the real part of λ is negative or positive, respectively. If $\lambda_1$, $\lambda_2$
are pure imaginary we have an elliptic fixed point.
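The classification above is easy to automate for a 2×2 stability matrix. The following sketch is an illustration rather than anything taken from the notes: the function names, the tolerance `tol` and the numerical parameter values are assumptions. It is applied to the Lotka-Volterra fixed point of (7.2), for which the eigenvalues come out as $\pm i\sqrt{rs}$.

```python
import cmath
import math

def classify_fixed_point(M, tol=1e-12):
    """Classify a fixed point from its 2x2 stability matrix M = [[a, b], [c, d]]."""
    (a, b), (c, d) = M
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    lam1 = (trace + root) / 2.0
    lam2 = (trace - root) / 2.0
    if abs(lam1.imag) <= tol:                 # both eigenvalues real
        product = lam1.real * lam2.real
        if product < 0.0:
            kind = "saddle"
        elif product > 0.0:
            kind = "stable node" if lam1.real < 0.0 else "unstable node"
        else:
            kind = "degenerate"
    elif abs(lam1.real) <= tol:               # pure imaginary pair
        kind = "elliptic"
    else:                                     # complex-conjugate pair
        kind = "stable focus" if lam1.real < 0.0 else "unstable focus"
    return kind, lam1, lam2

def lotka_volterra_matrix(r, s, k, k_prime):
    """Stability matrix of Eq. (7.2) evaluated at x1* = s/k', x2* = r/k."""
    x1, x2 = s / k_prime, r / k
    return [[r - k * x2, -k * x1],
            [k_prime * x2, -s + k_prime * x1]]

if __name__ == "__main__":
    print(classify_fixed_point(lotka_volterra_matrix(1.0, 0.5, 0.5, 0.25)))
    print(classify_fixed_point([[0.0, 1.0], [-1.0, -0.5]]))   # damped oscillator
```

In the terminology used here, the coexistence point of the prey-predator system is therefore elliptic: small perturbations oscillate around it without growing or decaying at linear order.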
The undamped harmonic oscillator
$$ \dot q = p , \qquad \dot p = -\omega_0^2 q \tag{7.6} $$
Elliptic fixed point at p = 0, q = 0. Eigenvalues are $\lambda_{1,2} = \pm i\omega_0$. Because there is a
conserved quantity (Hamiltonian), orbits in phase space are one dimensional (ellipses).
The damped harmonic oscillator
$$ \dot q = p , \qquad \dot p = -\omega_0^2 q - \gamma p \tag{7.7} $$
The fixed point at p = 0, q = 0 is either a stable focus (if $\gamma < 2\omega_0$), or a stable node (if
$\gamma \ge 2\omega_0$). There is no conserved quantity; orbits in phase space have a spiral form.
The pendulum
$$ H(p, q) = \frac{1}{2} p^2 - \omega_0^2 \cos q \tag{7.8} $$
$$ \dot q = p , \qquad \dot p = -\omega_0^2 \sin q \tag{7.9} $$
There are fixed points at p = 0, q = kπ, where $k = 0, \pm 1, \pm 2, \ldots$. The points at even k are
elliptic, the ones at odd k are hyperbolic. Orbits in phase space are again one-dimensional,
due to energy conservation. They are either bounded (near a fixed point), or unbounded.
A special orbit (separatrix) separates the two types of motion. The separatrix connects two
hyperbolic fixed points.
The bistable potential
$$ H(p, q) = \frac{1}{2} p^2 + \frac{1}{2} (1 - q^2)^2 \tag{7.10} $$
$$ \dot q = p , \qquad \dot p = 2 (1 - q^2) q \tag{7.11} $$
There are fixed points at p = 0, q = 0 (hyperbolic), and p = 0, q = ±1 (elliptic). Motion
is bounded but has a different topology according to the value of the energy. The different
types of motion are separated by a particular orbit (separatrix).
7.1.2 4-dimensional phase space
The dynamics of a Hamiltonian system with two degrees of freedom
$$ H = \frac{1}{2} (p_1^2 + p_2^2) + V(q_1, q_2) \tag{7.12} $$
is formulated in terms of a system of 4 coupled differential equations which are first order
in time:
$$ \dot q_1 = p_1 , \qquad \dot q_2 = p_2 , \qquad \dot p_1 = -\frac{\partial V(q_1, q_2)}{\partial q_1} , \qquad \dot p_2 = -\frac{\partial V(q_1, q_2)}{\partial q_2} . \tag{7.13} $$
In a generic Hamiltonian system only the energy is conserved. If for some reason there is a
further constant of motion, orbits will lie on a 2-dimensional torus. In the generic case, phase
space orbits will be on the energy shell, i.e. a 3d hypersurface. It is possible to visualize this
with the help of Poincaré surfaces of section, i.e. by projecting the energy hypersurface on
the $q_1 = 0$ plane. A Poincaré surface of section (briefly, Poincaré cut) consists of points
$(q_2, p_2)$ on a plane, taken at $q_1 = 0$, $p_1 > 0$. A cut of a 2d torus would be a continuous
curve. A cut of a generic 3d hypersurface would fill a portion of the plane. Evidence of
such "filling" in systems which are perturbed away from an integrable limit is interpreted
as "breaking of the torus". This is what Hamiltonian chaos is all about.
The Hénon-Heiles Hamiltonian
$$ H = \frac{1}{2} (p_x^2 + p_y^2) + \frac{1}{2} (x^2 + y^2) + x^2 y - \frac{1}{3} y^3 , \tag{7.14} $$
originally proposed as a model for integrable behavior in galactic motion [12], was a milestone
in the study of Hamiltonian chaos. The equipotential surfaces, shown in Fig. 7.1, suggest
its usefulness as a model for triatomic molecules.
Figure 7.1: Left: equipotential surfaces of the Hénon-Heiles Hamiltonian in the (x, y) plane;
right: details of the region of bounded motion, E = 1/30, 1/15, 1/10, 2/15, 1/6 (outer surface).
Fig. 7.2 shows Poincaré cuts obtained at increasing energies. At E = 1/12 (which is not a
small energy!) the motion is almost entirely regular (note however the seeds of irregularity
in the immediate vicinity of the separatrix). As the energy increases further, the various
tori begin to disappear. Widespread chaos ensues. Note that the scattered points all belong
to the same trajectory in phase space.
Figure 7.2: Poincaré cuts ($p_y$ vs y) for the Hénon-Heiles system. Left, E=1/12; center, E=1/8;
right, E=1/6. The percentage of area covered by such scattered points provides a measure of chaos.
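A Poincaré cut of the kind shown in Fig. 7.2 can be generated with a few lines of code. The sketch below is not the procedure used to produce the figure; it assumes a fixed-step fourth-order Runge-Kutta integrator, a linear interpolation to locate the crossing of the plane x = 0, and hypothetical function names and initial conditions. Points $(y, p_y)$ are recorded whenever the orbit crosses x = 0 with $p_x > 0$.

```python
import math

def deriv(state):
    """Hamilton's equations for the Henon-Heiles Hamiltonian, Eq. (7.14)."""
    x, y, px, py = state
    return (px, py, -x - 2.0 * x * y, -y - x * x + y * y)

def energy(state):
    x, y, px, py = state
    return 0.5 * (px * px + py * py) + 0.5 * (x * x + y * y) + x * x * y - y**3 / 3.0

def rk4_step(state, dt):
    def shifted(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))
    k1 = deriv(state)
    k2 = deriv(shifted(k1, 0.5 * dt))
    k3 = deriv(shifted(k2, 0.5 * dt))
    k4 = deriv(shifted(k3, dt))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def poincare_cut(E, y0, py0, t_max=500.0, dt=0.01):
    """Return the list of (y, p_y) at crossings of x = 0 with p_x > 0, and the final state."""
    px_squared = 2.0 * E - py0 * py0 - y0 * y0 + 2.0 * y0**3 / 3.0
    if px_squared <= 0.0:
        raise ValueError("initial point lies outside the energy shell")
    state = (0.0, y0, math.sqrt(px_squared), py0)
    points = []
    for _ in range(int(t_max / dt)):
        new = rk4_step(state, dt)
        if state[0] < 0.0 <= new[0]:          # x changes sign from negative to positive
            w = -state[0] / (new[0] - state[0])
            points.append((state[1] + w * (new[1] - state[1]),
                           state[3] + w * (new[3] - state[3])))
        state = new
    return points, state

if __name__ == "__main__":
    pts, final = poincare_cut(1.0 / 12.0, y0=0.1, py0=0.0, t_max=200.0)
    print(len(pts), "section points; energy drift", energy(final) - 1.0 / 12.0)
```

Repeating the call for several initial conditions at the same energy and plotting all returned points together gives a picture comparable to one panel of the figure; smooth closed curves indicate surviving tori, scattered points indicate chaotic orbits.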
7.1.3 3-dimensional phase space; nonautonomous systems with one degree of freedom
A nonautonomous Hamiltonian system with one degree of freedom
$$ H(p, q, t) = \frac{1}{2m} p^2 + V(q, t) \tag{7.15} $$
is described by the equations
$$ \dot q = \frac{p}{m} , \qquad \dot p = -\frac{\partial V(q, t)}{\partial q} , \qquad \dot H = \frac{\partial V(q, t)}{\partial t} . \tag{7.16} $$
Phase space is now in general 3-dimensional. There are no conserved quantities to reduce it.
However, if the system is externally driven by a periodic force of period T, one may attempt
to visualize its behavior by using stroboscopic plots, i.e. plotting pairs $p_n$, $q_n$ obtained at
times $t_n = nT$. As an example, consider the plots obtained for the bistable oscillator with
m = 2 in a periodic field
$$ V(q, t) = -2 q^2 + q^4 + \epsilon q \cos\omega t . \tag{7.17} $$
In the absence of a driving field, the trajectories in 2-dimensional phase space are shown in
Fig. 7.3 (left panel). The equations of motion have 3 fixed points, two of them (at q = ±1)
elliptically stable, and one (at q = 0) hyperbolically unstable. If the total energy is low
(near −1), the particle performs low-amplitude oscillations at the bottom of the left or the
right well. The limiting natural frequency of oscillation is $\omega_0 = 2$.
The other two panels of Fig. 7.3 show what happens when a periodically varying field is
turned on. The frequency of the field ω = 1.92 is chosen to lie near $\omega_0$. At low amplitudes
of the driving field and reasonably low energies, a stroboscopic plot of motion is not
fundamentally different from the corresponding plot at the conservative limit ε = 0; the particle
stays confined near the top of the potential well. As the driving amplitude increases, the
particle escapes the well and performs a chaotic motion in the vicinity of the separatrix of
the conservative limit (right panel).
Figure 7.3: Stroboscopic plot (p vs x) of the dynamics of (7.17) for ω = 1.92 and 0 < t < 2000 (after
[13]). The left panel shows the contours of phase space trajectories of the unperturbed,
conservative system; note the separatrix at E = 0, which separates bounded from
unbounded motion. Initial conditions were p = 0 and q = 0.24, corresponding to
E = −0.112, an energy near the top of the potential well. The middle panel, at
ε = 0.01, shows that the particle remains trapped in the well. The right panel, at
ε = 0.1, illustrates the escape from the well, and the "breaking of the torus" which
occurs near the separatrix.
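The stroboscopic sampling described above amounts to integrating (7.16) with the potential (7.17) and recording (q, p) once per driving period. The following is a rough sketch under stated assumptions, not the method of [13]: a fixed-step fourth-order Runge-Kutta scheme is used, and the function names, the number of periods and the step count per period are arbitrary choices.

```python
import math

MASS = 2.0  # m = 2, as in the text

def force(q, t, eps, omega):
    """-dV/dq for V(q, t) = -2 q^2 + q^4 + eps * q * cos(omega * t), Eq. (7.17)."""
    return 4.0 * q - 4.0 * q**3 - eps * math.cos(omega * t)

def rk4_step(q, p, t, dt, eps, omega):
    def rhs(q_, p_, t_):
        return p_ / MASS, force(q_, t_, eps, omega)
    k1q, k1p = rhs(q, p, t)
    k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p, t + 0.5 * dt)
    k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p, t + 0.5 * dt)
    k4q, k4p = rhs(q + dt * k3q, p + dt * k3p, t + dt)
    return (q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0,
            p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0)

def stroboscopic_points(eps, omega=1.92, q0=0.24, p0=0.0,
                        periods=100, steps_per_period=400):
    """Return the list of (q, p) sampled at times t_n = n T, with T = 2 pi / omega."""
    T = 2.0 * math.pi / omega
    dt = T / steps_per_period
    q, p = q0, p0
    points = []
    for n in range(periods):
        for j in range(steps_per_period):
            q, p = rk4_step(q, p, n * T + j * dt, dt, eps, omega)
        points.append((q, p))
    return points

def unperturbed_energy(q, p):
    return p * p / (2.0 * MASS) - 2.0 * q * q + q**4

if __name__ == "__main__":
    for eps in (0.0, 0.01, 0.1):
        pts = stroboscopic_points(eps)
        print(eps, min(q for q, _ in pts), max(q for q, _ in pts))
```

For ε = 0 the sampled points all lie on the single energy contour through the initial condition, which provides a convenient check of the integrator before the drive is switched on.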
7.2 Small denominators revisited: KAM theorem
Recall there was a problem of small denominators; if you start with an integrable Hamiltonian
$H_0(J_1, J_2)$ and functionally independent frequencies $\omega_i = \partial H_0 / \partial J_i$ and perturb it with
a small perturbation $\mu H_1(J_1, J_2, \theta_1, \theta_2)$ then Poincaré showed that there are no analytic
invariants of the perturbed system $H_0 + \mu H_1$. Is chaos inevitable? The answer is more or less
yes. Is chaos imminent and overwhelming, even for a small perturbation? The answer is no,
as we have seen from circumstantial evidence in the Hénon-Heiles Hamiltonian. Kolmogorov,
Arnold and Moser (KAM) showed that, if the Hessian of the unperturbed Hamiltonian is
nondegenerate, i.e.
$$ \det\left( \frac{\partial^2 H_0}{\partial J_i \, \partial J_j} \right) = \det\left( \frac{\partial \omega_i}{\partial J_j} \right) \ne 0 , \tag{7.18} $$
a torus of the $H_0$ Hamiltonian with frequencies $\omega_i$ survives, slightly deformed, in the
perturbed system, provided
$$ |n_1 \omega_1 + n_2 \omega_2| \ge \frac{K(\epsilon)}{(|n_1| + |n_2|)^{\alpha}} \qquad \forall \; n_1, n_2 \tag{7.19} $$
where α > 2 and $K(\epsilon)$ depend on the particulars of the problem. Tori which do not fulfill
this condition may break up. The destroyed tori constitute a dense set. Yet they have a
very small measure. Most tori survive. It is possible to understand this using an analogy
with the length obtained by excluding from the line continuum a small neighborhood, say
$\epsilon / n^3$, around every rational number m/n (recall that the rationals form a dense set). The
measure of the continuum deleted is
$$ \sum_{n=1}^{\infty} \sum_{m=1}^{n} \frac{\epsilon}{n^3} = \epsilon \sum_{n=1}^{\infty} \frac{1}{n^2} = \epsilon \frac{\pi^2}{6} . \tag{7.20} $$
Although the rationals form a dense set, the irrationals make up most of the measure of the
real numbers. In this sense, almost all tori survive the addition of a small (in practice: even a
moderately large) perturbation. Eventually however, as the perturbation grows, chaos ensues.
Note I have used the language of systems with two degrees of freedom just for simplicity.
In fact, the KAM theorem holds for an arbitrary number of degrees of freedom, under the
conditions described above.
7.3 Chaos in area preserving maps
7.3.1 Twist maps
The twist map allows direct visualization of a Hamiltonian system with two degrees of
freedom, moving on a torus. Let $J_1$, $J_2$ be the action coordinates, and $\theta_1$, $\theta_2$ the corresponding
angle coordinates. Make a Poincaré cut each time $\theta_2 = 0 \bmod 2\pi$. This will by definition be
every $\tau = 2\pi / \omega_2$ seconds, where $\omega_2 = \partial H_0 / \partial J_2$. Then plot the coordinates $\rho = \sqrt{2 J_1}$ and
$\phi = \theta_1$ on a plane. The points will lie on a circle. I can express the successive values of the
angle coordinate on the cut by the sequence
$$ \phi_{n+1} = \phi_n + \omega_1 \tau $$
or, more generally, in terms of the winding number $w = \omega_1 / \omega_2$
$$ \rho_{n+1} = \rho_n , \qquad \phi_{n+1} = \phi_n + 2\pi w(\rho_n) \tag{7.21} $$
where I have explicitly allowed all possible $J_1$'s and hence all possible radii. For a given
energy this fixes $J_2$, so that the winding number is only a function of ρ. In shorthand
notation this will be
$$ \begin{pmatrix} \rho_{n+1} \\ \phi_{n+1} \end{pmatrix} = T_0 \begin{pmatrix} \rho_n \\ \phi_n \end{pmatrix} , \tag{7.22} $$
where $T_0$ stands for the unperturbed twist map.
Now if the winding number can be expressed as a rational fraction r/s, the cut will be
composed of s points (s-cycle). If not, we have quasiperiodic motion; the cut fills the circle
densely.
We would like to find out what happens under a perturbation. This is described below
(Poincaré-Birkhoff theorem). For the moment, let me just describe what a perturbed map
will look like, and how to get it. In general,
$$ \rho_{n+1} = \rho_n + f_1(\rho_n, \phi_n) , \qquad \phi_{n+1} = \phi_n + 2\pi w(\rho_n) + f_2(\rho_n, \phi_n) \tag{7.23} $$
where I must choose the functions $f_1$ and $f_2$ such that the map represents a Hamiltonian flow,
i.e. it should be a canonical transformation. This can be achieved by using an appropriate
generating function $F(\phi_n, \phi_{n+1})$ such that
$$ \rho_n = -\left( \frac{\partial F}{\partial \phi_n} \right)_{\phi_{n+1}} , \qquad \rho_{n+1} = \left( \frac{\partial F}{\partial \phi_{n+1}} \right)_{\phi_n} . \tag{7.24} $$
A class of such perturbed maps can be obtained by the generating function
$$ F(\phi_n, \phi_{n+1}) = \frac{1}{2} (\phi_n - \phi_{n+1})^2 + V(\phi_n) . \tag{7.25} $$
The maps have the form
$$ \rho_{n+1} = \rho_n + V'(\phi_n) , \qquad \phi_{n+1} = \phi_n + \rho_{n+1} . \tag{7.26} $$
The above map equations (7.26) can also be derived by demanding that the "action"
$$ W = \sum_{n=0}^{m} F(\phi_n, \phi_{n+1}) \tag{7.27} $$
should be an extremum with respect to any of the m internal coordinates $\phi_1, \ldots, \phi_m$ (i.e.
the end coordinates $\phi_0$, $\phi_{m+1}$ are held fixed). F can thus be interpreted as a discrete
Lagrangian. Later in the course I will show that this has important applications in an entirely
different context: determining energy minima and studying prototypes of amorphous solids;
in other words, spatial rather than temporal chaos.
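The step from the generating function (7.25) to the map (7.26) can be written out explicitly. The derivation below is a sketch under one stated assumption: it uses the usual convention for a generating function of the old and new angles, $\rho_n = -\partial F / \partial \phi_n$ and $\rho_{n+1} = \partial F / \partial \phi_{n+1}$. The last line shows that the extremum condition on the action (7.27) gives the same recursion, once $\rho_n$ is identified with $\phi_n - \phi_{n-1}$.

```latex
\begin{align*}
\rho_{n+1} &= \frac{\partial F}{\partial \phi_{n+1}} = \phi_{n+1} - \phi_n
  && \Rightarrow \quad \phi_{n+1} = \phi_n + \rho_{n+1} , \\
\rho_n &= -\frac{\partial F}{\partial \phi_n}
        = \phi_{n+1} - \phi_n - V'(\phi_n) = \rho_{n+1} - V'(\phi_n)
  && \Rightarrow \quad \rho_{n+1} = \rho_n + V'(\phi_n) , \\
0 &= \frac{\partial W}{\partial \phi_n}
   = \frac{\partial F(\phi_{n-1}, \phi_n)}{\partial \phi_n}
   + \frac{\partial F(\phi_n, \phi_{n+1})}{\partial \phi_n}
   = (\phi_n - \phi_{n-1}) - (\phi_{n+1} - \phi_n) + V'(\phi_n)
  && \Rightarrow \quad \rho_{n+1} = \rho_n + V'(\phi_n) .
\end{align*}
```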
7.3.2 Local stability properties
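The local stability properties of fixed points are governed by the tangent map (cf. continuous
dynamics). Thus if $\vec X^* = (\rho^*, \phi^*)$ is a fixed point of the map T,
$$ \vec X^* = T(\vec X^*) \tag{7.28} $$
the tangent map of T is defined via a linearization procedure around the fixed point:
$$ \vec X_n = \vec X^* + \delta\vec X_n \tag{7.29} $$
$$ \delta\vec X_{n+1} = M(\vec X^*) \, \delta\vec X_n \tag{7.30} $$
where in general
$$ M(\vec X_n) = \begin{pmatrix}
\dfrac{\partial \rho_{n+1}}{\partial \rho_n} & \dfrac{\partial \rho_{n+1}}{\partial \phi_n} \\
\dfrac{\partial \phi_{n+1}}{\partial \rho_n} & \dfrac{\partial \phi_{n+1}}{\partial \phi_n}
\end{pmatrix} . \tag{7.31} $$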
Since the map T is area preserving, the eigenvalues of M will satisfy the relationship
$\lambda_1 \lambda_2 = 1$. There are two distinct cases
• both roots are complex (of unit modulus); they must be of the form
$$ \lambda_{1,2} = e^{\pm i\delta} \tag{7.32} $$
(elliptic fixed point), or
• both roots are real
$$ |\lambda_1| > 1 , \qquad |\lambda_2| < 1 \tag{7.33} $$
(hyperbolic fixed point if both positive, hyperbolic with reflection if both negative).
Periodic motion (s-cycles in the form $\vec X_1^*, \vec X_2^*, \ldots, \vec X_s^*$) is represented by fixed points of
the $T^s$ map,
$$ \vec X_j^* = T^s(\vec X_j^*) , \qquad j = 1, 2, \ldots, s . \tag{7.34} $$
The stability of the s-cycle (7.34) is governed by the eigenvalues of the product matrix
$$ M^{(s)} = M(\vec X_s^*) \, M(\vec X_{s-1}^*) \cdots M(\vec X_1^*) . $$
Note that, since the determinant of each one of the terms in the above product is unity,
$\det M^{(s)} = 1$. The classification of stability properties is therefore exactly the same (elliptic
vs hyperbolic cycles) as in the case of fixed points (cf. above).
7.3.3 Poincaré-Birkhoff theorem
The unperturbed twist map with a rational winding number w = r/s will generate an s-cycle
whose points lie on a circle C. This will happen no matter where one starts on the circle.
In this sense, every point of the circle will be a fixed point of the unperturbed $T_0^s$ map,
$$ T_0^s C = C . \tag{7.35} $$
Note that this differs from the generic situation of an irrational winding number; the circle
with a radius which corresponds to an irrational winding number maps onto itself, but its
points are not fixed points of any finite repeated application of the map. What happens
under the influence of a perturbation? In order to see this, consider two neighboring circles,
$C_+$, with a slightly larger, irrational winding number $w_+$, and $C_-$ with a slightly smaller,
irrational winding number $w_-$. Under application of the same unperturbed twist map, $C_+$
will be slightly twisted with respect to C in the positive (counterclockwise) direction, since
$w_+ > w$; similarly $C_-$ will be slightly twisted in the negative (clockwise) direction, since
$w_- < w$. These relative opposite twists of the circles survive under the perturbed twist
map $T^s$, although their form may be distorted. By a continuity argument it is possible
to construct a "zero twist" curve R. If I now apply the map $T^s$ to R, the resulting curve
will be distorted with respect to R only in the radial direction (zero twist). Because the
map is area preserving, there should, in general, be an even number of intersections, 2ks
(exceptions are possible in cases where the curve $T^s R$ might tangentially touch the curve
R). These intersections are the only fixed points which survive from the original invariant
circle C in the presence of a perturbation. Of the 2ks fixed points, half are elliptically stable
and half hyperbolically unstable; elliptic and hyperbolic fixed points come in pairs and they
alternate. This is the Poincaré-Birkhoff theorem.
7.3.4 Chaos diagnostics
Power spectra
Given a suitably averaged time-dependent quantity f(t), it is possible to define its power
spectrum
$$ I(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dt \, e^{i\omega t} f(t) . \tag{7.36} $$
If the "signal" is periodic in time, i.e. if f(t) = f(t + T), it is possible to express it as a
Fourier series
$$ f(t) = \sum_{n=-\infty}^{\infty} \alpha_n e^{-i n \Omega t} \tag{7.37} $$
where $\Omega = 2\pi / T$. It follows that the spectrum
$$ I(\omega) = \sum_{n=-\infty}^{\infty} \alpha_n \delta(\omega - n\Omega) \tag{7.38} $$
will be composed of a series of δ-peaks situated at the fundamental frequency and its higher
harmonics.
Figure 7.4: Illustration of the Poincaré-Birkhoff theorem. (a) upper left: the unperturbed map:
a circle C with a rational winding number w, along with neighboring circles $C_+$, $C_-$
with irrational winding numbers $w_+$ (positive twist), $w_-$ (negative twist). (b) upper
right: the perturbed map; outer and inner curve represent, respectively, the slightly
deformed versions of $C_+$, $C_-$. The intermediate curve R is a zero-twist curve obtained
by the requirement of continuity. (c) lower right: R (continuous curve) and its $T^s$ map
(dashed curve). In this case s = 2. There is no twist under the action of the map,
just pulling and pushing along the radial direction. There is a total of 4 intersections,
corresponding to a stable and an unstable 2-cycle. Following the arrows, it is possible
to determine which points are elliptic and which are hyperbolic. Note that the small
arrows outside R are all pointing in the outward direction (positive twist), and those
inside R in the negative direction (negative twist). (d) a more abstract view of the
elliptic and hyperbolic 2-cycles.
One can generalize this to the case of a multiply periodic motion, which would be more
apt to describe motion on a torus. In the case of a doubly periodic motion, f(t) is
described by a double Fourier expansion
$$ f(t) = \sum_{n_1=-\infty}^{\infty} \sum_{n_2=-\infty}^{\infty} \alpha_{n_1, n_2} \, e^{-i (n_1 \Omega_1 + n_2 \Omega_2) t} \tag{7.39} $$
and the spectrum
$$ I(\omega) = \sum_{n_1=-\infty}^{\infty} \sum_{n_2=-\infty}^{\infty} \alpha_{n_1, n_2} \, \delta(\omega - n_1 \Omega_1 - n_2 \Omega_2) , \tag{7.40} $$
forms peaks at all sum and difference frequencies. Under ideal conditions (cf. Fig. 7.5) it
should of course be possible to distinguish regularity from chaos by its spectral signatures.
In the former case the spectrum is periodic or quasiperiodic, in the latter case there is
a lot of noise, perhaps accompanied by broad peaks. In practice however, the intrinsic
limitations of obtaining useful power spectra from finite numerical (or experimental) data
render spectral information somewhat limited as a sole criterion for deciding whether a given
process is chaotic or not.
Figure 7.5: Power spectra of $p_y(t)$ (plotted against ω/2π) for quasiperiodic (left and center
panels) and chaotic (right panel) trajectories of the Hénon-Heiles system at energy E = 1/8.
In the case of quasiperiodic motion (left) it is possible to make a detailed identification of
the five peaks in terms of two fundamental torus frequencies at $f_1 = 0.16$ and $f_2 = 0.12$, their
second harmonics, and the difference $f_1 - f_2 = 0.04$. A similar assignment can be
made in the case of the center panel. Chaotic spectra (right panel) are characterized
by broader, noisier features.
Lyapunov exponents
Lyapunov exponents quantify the usual defining property of deterministic chaos, which is
the sensitive dependence on initial conditions. Consider a certain trajectory of the (not
necessarily area-preserving) N-dimensional map T
$$ \vec X_{j+1} = T(\vec X_j) , \qquad j = 0, \ldots, n-1 , \tag{7.41} $$
and a "neighboring" trajectory, which starts at $\vec X_0 + \delta\vec X_0$. The difference between the two
trajectories after the first iteration can be expressed in terms of the tangent map:
$$ \delta\vec X_1 = M(\vec X_0) \, \delta\vec X_0 ; $$
after the second iteration it will be
$$ \delta\vec X_2 = M(\vec X_1) \, \delta\vec X_1 = M(\vec X_1) M(\vec X_0) \, \delta\vec X_0 , $$
and after n iterations
$$ \delta\vec X_n = M(\vec X_{n-1}) M(\vec X_{n-2}) \cdots M(\vec X_0) \, \delta\vec X_0 = \Lambda^n(\vec X_0, \ldots, \vec X_{n-1}) \, \delta\vec X_0 , \tag{7.42} $$
where the N × N matrix Λ is the nth root of the product of all n tangent maps involved in
the trajectory; in general, Λ will have N eigenvalues $\lambda_\alpha(n)$, $\alpha = 1, \ldots, N$, which will depend
on the order of iteration n. The Lyapunov exponents are defined as
$$ \sigma_\alpha = \lim_{n \to \infty} \ln |\lambda_\alpha(n)| , \qquad \alpha = 1, \ldots, N . \tag{7.43} $$
Note that in general there are as many Lyapunov exponents as the dimensionality of the
map. If the map is area preserving, they come in pairs, i.e. for each positive exponent, a
negative exponent with the same magnitude must occur. This corresponds to expanding
and shrinking directions. It is obvious from (7.42) that, if we order Lyapunov exponents in
decreasing order
$$ \sigma_1 > \sigma_2 > \cdots > \sigma_N \tag{7.44} $$
the largest (positive) exponent will eventually dominate the right hand side of (7.42). This
will happen even if there is a vanishingly small component of $\delta\vec X_0$ in the direction of the
eigenvector of Λ which corresponds to $\sigma_1$. The norm $\|\delta\vec X_n\|$ will grow exponentially as
$e^{\sigma_1 n}$. This is exactly the physical content of "sensitive dependence on initial conditions".
Lyapunov exponents provide a measure of just how sensitive this dependence is.
Note: here I have defined Lyapunov exponents in the context of maps. If time permits, I
will present the definitions, and computational procedures, for dynamical systems governed
by differential equations, i.e. Hamiltonian or dissipative dynamics.
7.3.5 The standard map
kicked pendulum, kicked rotator
Consider the nonautonomous Hamiltonian system defined by a kicked pendulum, where
gravity acts in bursts, every τ seconds:
$$ H = \frac{p^2}{2} - \frac{K}{(2\pi)^2} \cos(2\pi q) \sum_{n=-\infty}^{\infty} \delta(t - n\tau) \tag{7.45} $$
where p is the angular momentum and 2πq the angle, referred to the perpendicular direction
(cf. H-atom in electric field).
The equations of motion are
$$ \dot p = -\frac{\partial H}{\partial q} = -\frac{K}{2\pi} \sin(2\pi q) \sum_{n=-\infty}^{\infty} \delta(t - n\tau) , \qquad \dot q = \frac{\partial H}{\partial p} . $$
The first equation implies that p is constant, except at times t = nτ, when it changes by a
discrete step. Defining
$$ p_n = \lim_{\epsilon \to 0} p(n\tau - \epsilon) , $$
I can integrate in the neighborhood of t = nτ, set τ = 1 and obtain what is known as the
standard map
$$ p_{n+1} = p_n - \frac{K}{2\pi} \sin(2\pi q_n) , \qquad q_{n+1} = q_n + p_{n+1} . \tag{7.46} $$
Eqs. (7.46) belong to the general class (7.26) of area preserving twist maps. In the following,
the coordinates $p_n$, $q_n$ will be understood as mod 1, unless stated otherwise.
Fixed points
The map (7.46) has two fixed points:
• $p^* = 0$, $q^* = 0$, which is elliptic, and
• $p^* = 0$, $q^* = 1/2$, which is hyperbolic.
(NB: the published literature has adopted a variety of conventions; one of them has a different
sign in the Hamiltonian; this amounts to a shift of q by 1/2)
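The character of the two fixed points follows from the tangent map (7.31) of (7.46): its determinant is 1 and its trace is $2 - K\cos(2\pi q^*)$, so the eigenvalues are of the form $e^{\pm i\delta}$ whenever the trace lies between −2 and 2. The sketch below illustrates this; the function names and the sample value of K are assumptions, and the ordering (p, q) of the coordinates in the matrix is a choice made here.

```python
import math

def standard_map(q, p, K):
    """One iteration of the standard map, Eq. (7.46), with both coordinates taken mod 1."""
    p_new = (p - K / (2.0 * math.pi) * math.sin(2.0 * math.pi * q)) % 1.0
    q_new = (q + p_new) % 1.0
    return q_new, p_new

def tangent_map(q, K):
    """Matrix of partial derivatives of (p', q') with respect to (p, q) for Eq. (7.46)."""
    c = K * math.cos(2.0 * math.pi * q)
    return [[1.0, -c],
            [1.0, 1.0 - c]]

def classify(q_star, K):
    """Return ('elliptic' or 'hyperbolic', trace, determinant) for a fixed point at q_star."""
    M = tangent_map(q_star, K)
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    trace = M[0][0] + M[1][1]
    kind = "elliptic" if abs(trace) < 2.0 else "hyperbolic"
    return kind, trace, det

if __name__ == "__main__":
    for q_star in (0.0, 0.5):
        print(q_star, classify(q_star, K=0.5))
```

With this convention the point at $q^* = 0$ stays elliptic only for 0 < K < 4, while the point at $q^* = 1/2$ is hyperbolic for every K > 0.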
Summary of results
At small values of K (cf. Fig. 7.6) there is no sign of chaos. We observe the tori which
surround the elliptic fixed point, which extend up to the separatrix which leaves off the
hyperbolic fixed point. Furthermore, we observe a large number of "horizontal" tori, meaning
that they run all the way from left to right; these tori are the slightly deformed versions
of the original irrational tori of the unperturbed system, which have survived the
perturbation. Finally, the structure which emanates from the period 2 cycle, around the center
of the picture, is visible. This resonant torus is broken; according to the Poincaré-Birkhoff
theory, we observe a period 2 island chain, and the hyperbolic fixed points nested between
them.
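The qualitative picture can be made quantitative with the largest Lyapunov exponent defined in (7.43). The following is a minimal sketch, assuming the common procedure of iterating a tangent vector with the tangent map of (7.46) and renormalizing it at every step; the function name, the initial conditions, the values of K and the number of iterations are illustrative choices.

```python
import math

def lyapunov_standard_map(q, p, K, n_iter=20000):
    """Estimate the largest Lyapunov exponent of the standard map (7.46) along one orbit."""
    dq, dp = 1.0, 0.0                  # arbitrary initial tangent vector
    log_growth = 0.0
    for _ in range(n_iter):
        c = K * math.cos(2.0 * math.pi * q)
        # tangent map evaluated at the current point
        dp = dp - c * dq
        dq = dq + dp
        # the map itself, coordinates taken mod 1
        p = (p - K / (2.0 * math.pi) * math.sin(2.0 * math.pi * q)) % 1.0
        q = (q + p) % 1.0
        # renormalize the tangent vector and accumulate its logarithmic growth
        norm = math.hypot(dq, dp)
        log_growth += math.log(norm)
        dq, dp = dq / norm, dp / norm
    return log_growth / n_iter

if __name__ == "__main__":
    print("K = 0.5, orbit near the elliptic point:", lyapunov_standard_map(0.1, 0.0, 0.5))
    print("K = 10,  orbit in the chaotic sea:     ", lyapunov_standard_map(0.3, 0.2, 10.0))
```

For an orbit on a torus the estimate decays towards zero as the number of iterations grows, whereas for a chaotic orbit it settles at a positive value.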
Figure 7.6: Trajectories of the standard map (p vs q) at K = 0.5 (left panel), K = 0.8 (right panel).
Figure 7.7: Trajectories of the standard map (p vs q) at $K = K_c = 0.9716354$ (left panel), K = 1.17
(right panel).
As the perturbation increases, more and more near-resonant horizontal tori break up.
Chaos develops around the separatrices of the leading resonances (hyperbolic fixed point
and in the crossings between period 2 island chain). The survival of a torus depends on
"how irrational" its winding number is. In order to see what this means, look at the
continued fraction representation of an irrational number
$$ w = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}} \equiv \{ a_0 ; a_1, a_2, \ldots \} , \tag{7.47} $$
where the integers $\{a_i\}$ satisfy $a_0 \ge 0$ and $a_i \ge 1 \; \forall i \ge 1$. An nth order approximation
$w \approx r_n / s_n$ can be generated by the sequence
$$ r_n = a_n r_{n-1} + r_{n-2} , \qquad s_n = a_n s_{n-1} + s_{n-2} \tag{7.48} $$
with $r_{-2} = 0$, $r_{-1} = 1$, $s_{-2} = 1$, $s_{-1} = 0$.
Eq. (7.48) implies that
$$ s_{n+1} > a_{n+1} s_n . \tag{7.49} $$
It follows that
$$ \left| w - \frac{r_n}{s_n} \right| < \frac{1}{s_n s_{n+1}} < \frac{1}{a_{n+1} s_n^2} . \tag{7.50} $$
Thus, if $a_{n+1}$ is large, the nth approximation is a good one. An example is
$$ \pi = \{ 3 ; 7, 15, 1, 292, \ldots \} $$
which leads to $\pi = 3.14159265\ldots \approx r_3 / s_3 = 355/113 = 3.14159292\ldots$, good to 7 digits.
Conversely, the representation
$$ \sqrt{2} = \{ 1 ; 2, 2, 2, 2, \ldots \} $$
leads to $\sqrt{2} = 1.414213\ldots \approx r_3 / s_3 = 17/12 = 1.41666\ldots$, which has an error in the 4th
digit. In this sense, the golden mean
$$ \frac{\sqrt{5} + 1}{2} = \{ 1 ; 1, 1, 1, 1, \ldots \} \tag{7.51} $$
and its inverse
$$ \frac{\sqrt{5} - 1}{2} = \{ 0 ; 1, 1, 1, 1, \ldots \} \tag{7.52} $$
can be considered as the "most irrational" numbers. Therefore, the nonresonant torus with
a winding number equal to the inverse golden mean is expected to be the last to break.
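The recurrence (7.48) translates directly into code. The sketch below is illustrative; the function name is an assumption, and exact rational arithmetic from Python's `fractions` module is used so that the convergents quoted above can be reproduced without rounding.

```python
import math
from fractions import Fraction

def convergents(partial_quotients):
    """Yield the approximants r_n / s_n of Eq. (7.48) for the given a_0, a_1, a_2, ..."""
    r_prev2, r_prev1 = 0, 1    # r_{-2}, r_{-1}
    s_prev2, s_prev1 = 1, 0    # s_{-2}, s_{-1}
    for a in partial_quotients:
        r = a * r_prev1 + r_prev2
        s = a * s_prev1 + s_prev2
        yield Fraction(r, s)
        r_prev2, r_prev1 = r_prev1, r
        s_prev2, s_prev1 = s_prev1, s

if __name__ == "__main__":
    print("pi:         ", list(convergents([3, 7, 15, 1])))
    print("sqrt(2):    ", list(convergents([1, 2, 2, 2])))
    print("golden mean:", list(convergents([1] * 8)))
    print("error of 355/113:", abs(math.pi - 355 / 113))
```

For the golden mean the numerators and denominators are successive Fibonacci numbers, the slowest possible growth of $s_n$, which is why its rational approximants converge more slowly than those of any other irrational number.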
The disappearance of the last, "golden mean" torus at $K = K_c = 0.9716354$ (cf. Fig.
7.7, left panel) is a key event in the nonlinear scenario. It signals the transition from local
to widespread chaos. The following aspects deserve special attention:
• breaking of analyticity: As $K$ approaches the critical value $K_c$, the deformation of the torus increases dramatically. The following procedure [14] makes it possible to follow the torus' shape and detailed properties. First observe, following Greene [15], that an instability of a torus with irrational winding number $w$ can be associated with the instability of an $s_n \to \infty$ cycle, where $s_n$ is defined in terms of the sequence $r_n/s_n$ used to approximate $w$. Thus, rather than try to construct a torus directly, it is possible to determine successive cycles and their thresholds of instability.
It is useful to introduce the Moser representation (parametrization) [14]
$$ q_j = t_j + u(t_j) , \tag{7.53} $$
appropriate to any cycle with a rational winding number $w$; here $t_j = jw = jr/s$. The property $q_j = q_{j+s}$ implies
$$ u(t_j) = u(t_j + 1) \mod 1 . \tag{7.54} $$
Note that the periodic function $u(t)$, which can be shown to be odd, is only defined on a rational set $t = t_j = jr/s$, but this set becomes more and more populous as $s$ is increased. Fig. 7.8 shows the dependence of $u(t)$, evaluated for an $s = 4181$ cycle which approximates the torus with a golden mean winding number, as a function of $K$. Note how the function becomes less and less smooth as $K_c$ is approached.
• self similarity: Fig. 7.8 shows the shape of the KAM golden-mean torus at two non-critical $K$'s and at $K = K_c$. Note the detailed view of the non-smooth function. The detailed numerics [14] allows the conjecture of self-similarity; in other words, the valleys and hills of the curve repeat themselves at all possible scales of numerical observation. In this sense, KAM torus disappearance resembles a critical phenomenon. Self-similarity is very well demonstrated in the frequency spectra. The odd function $u(t)$ can be represented in a Fourier series
$$ u(t) = \sum_{f=1}^{\infty} A_f \sin 2\pi f t . $$
The product $f A_f$ is shown in Fig. 7.9 as a function of $f$, for the same values of $K$ as in Fig. 7.8. Note the presence of more and more peaks as the critical value of $K$ is approached. At $K = K_c$ self-similar behavior occurs, with primary peaks occurring at the Fibonacci numbers.
• Arnold diffusion: A picture of the widespread chaos which occurs at higher values of the nonlinear parameter $K > K_c$ is given in Fig. 7.7 (right panel). Unless a point starts in the immediate neighborhood of the elliptic fixed point, or the very few islands, it will typically generate a chaotic orbit which may diffuse over a large two-dimensional area of phase space. This diffusive behavior can be quantitatively characterized as follows.
Suppose we relax the mod 1 condition on the momentum $p$ in (7.46). In other words we allow the phase space to be a cylinder of perimeter 1 and look at the quantity
$$ D = \lim_{n\to\infty} \left\langle \frac{(p_{j+n} - p_j)^2}{2n} \right\rangle , \tag{7.55} $$
[Figure 7.8: three panels. Left and center: p versus q; right: u versus t, with inset.]
Figure 7.8: The torus with a winding number approximately equal to the inverse of the golden mean $W^* = (\sqrt{5} - 1)/2$, at $K = 0.5, 0.9, K_c$. The curves shown are actually sets of discrete points belonging to cycles of order $s = 4181$ with a rational winding number $w = r/s = 2584/4181$ which differs from $W^*$ by less than $3 \times 10^{-8}$. I. Left panel: the torus in the $(p, q)$ plane. II. Center panel: a detailed view of the same torus in the cases $K = 0.9$ (right y-scale) and $K = K_c$ (left y-scale). III. Right panel: the function $u(t)$ which describes the torus of the standard map in parametric form. Again, the curve shown is actually obtained from a 4181-cycle which approximates the irrational winding number $W^*$. Note how the function changes from smooth at $K = 0.5$, to somewhat bumpy at $K = 0.9$, to very bumpy at $K = K_c$. The inset shows a detailed view of the critical curve, which suggests self-similar behavior (after [14]).
[Figure 7.9: three panels, $f A(f)$ versus $f$ on a logarithmic $f$ axis from 2 to 2048.]
Figure 7.9: Fourier coefficients of the function $u(t)$, which describes parametrically the torus with a golden mean winding number, at $K = 0.5, 0.9, K_c$. The quantity plotted is $f A_f$. The curve at $K = K_c$ (right panel), with primary contributions at the Fibonacci numbers, suggests self-similar behavior (after [14]).
which describes the coeﬃcient of diﬀusion in momentum space.
As long as $K < K_c$, the existence of even a single torus presents a topological barrier to diffusion¹. $D$ should vanish.
¹ This is no longer the case in higher dimensions! Arnold diffusion is generic in higher dimensionality because tori can be bypassed.
In the opposite limit $K \gg K_c$, we can estimate the diffusion coefficient as follows. From (7.46)
$$ p_{j+n} = p_j - \frac{K}{2\pi} \sum_{l=j}^{j+n-1} \sin(2\pi q_l) \tag{7.56} $$
and hence
$$ (p_{j+n} - p_j)^2 = \left( \frac{K}{2\pi} \right)^2 \sum_{l,l'=j}^{j+n-1} \sin(2\pi q_l) \sin(2\pi q_{l'}) . \tag{7.57} $$
Now, since $q_{j+1} = q_j + p_{j+1} \mod 1$, if $p_{j+1}$ is large, $q_{j+1}$ is essentially random, i.e. uncorrelated to $q_j$. The only correlations which survive are from terms $l' = l$. On the average, the double sum will therefore be equal to $n/2$ (the 1/2 factor from the average value of $\sin^2$). Therefore
$$ D \approx \left( \frac{K}{4\pi} \right)^2 \quad \text{if } K \gg K_c . \tag{7.58} $$
In the case where $K$ slightly exceeds $K_c$, Chirikov has estimated
$$ D \propto (K - K_c)^{2.56} . \tag{7.59} $$
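The random-phase estimate (7.58) can be compared with a direct simulation. The sketch below is illustrative only. It assumes the map has the form implied by (7.56) and the update rule for $q$ quoted above. The value $K = 11.6$, the ensemble size and the number of steps are arbitrary choices of mine. Since (7.58) neglects residual correlations, agreement is only expected up to a factor of order one.

```python
import math
import random

def momentum_diffusion(K, n_steps=100, n_traj=1000, seed=0):
    """Estimate D = <(p_n - p_0)^2> / (2 n) with p NOT taken mod 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_traj):
        q = rng.random()
        p0 = rng.random()
        p = p0
        for _ in range(n_steps):
            # kick, as implied by (7.56)
            p = p - K / (2.0 * math.pi) * math.sin(2.0 * math.pi * q)
            # q_{j+1} = q_j + p_{j+1} mod 1
            q = (q + p) % 1.0
        total += (p - p0) ** 2
    return total / n_traj / (2.0 * n_steps)

K = 11.6                                # illustrative value, far above K_c
D_num = momentum_diffusion(K)
D_est = (K / (4.0 * math.pi)) ** 2      # estimate (7.58)
print(D_num, D_est, D_num / D_est)
```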
• Cantori: (7.59) makes clear that, even beyond the stochasticity threshold, diffusion does not proceed uninhibited. At values slightly above $K_c$ the diffusion constant is in fact very close to zero. It appears that some barriers to diffusion persist after all KAM tori have been broken.
Resistance to diffusion can be related to a particular class of orbits with irrational winding numbers, which do not fully cover a one-dimensional curve (Fig. 7.10), but leave a countable set of open intervals empty, i.e. they form a Cantor set. Because of this property they were named cantori by Percival. The existence of cantori as isolated regular orbits embedded in a sea of chaos is remarkable. We will deal with them again in Chapter 9, in the context of solid state theory.
An excellent review of the transport properties of Hamiltonian maps has been written by Meiss [16].
7.3.6 The Arnold cat map
The area preserving map
$$ x_{n+1} = x_n + y_n \mod 1 , \qquad y_{n+1} = x_n + 2y_n \mod 1 \tag{7.60} $$
has a tangent map
$$ M = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \tag{7.61} $$
which does not depend on the coordinates. The eigenvalues of the map are
$$ \lambda_{1,2} = \frac{1}{2} \left( 3 \pm \sqrt{5} \right) \tag{7.62} $$
and the Lyapunov exponents $\sigma_1 = \ln\lambda_1 = -\sigma_2$. There is a single, hyperbolic fixed point at $x^* = y^* = 0$. Neighboring trajectories everywhere diverge exponentially. All cycles are unstable. What happens to a cat thus mapped is shown in Fig. 7.11.
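As a quick numerical check (a sketch of my own, not taken from the notes), one can compute the eigenvalues of (7.61) and estimate the largest Lyapunov exponent from the growth of a tangent vector under repeated application of $M$. The helper name `apply` and the choice of 30 iterations are arbitrary.

```python
import math

M = ((1, 1), (1, 2))  # tangent map (7.61)

def apply(M, v):
    """Multiply the 2x2 matrix M with the vector v."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

det = M[0][0] * M[1][1] - M[0][1] * M[1][0]     # = 1: area preservation
trace = M[0][0] + M[1][1]
root = math.sqrt(trace ** 2 - 4 * det)
lam1 = 0.5 * (trace + root)
lam2 = 0.5 * (trace - root)
sigma1, sigma2 = math.log(lam1), math.log(lam2)

# Growth rate of a generic tangent vector (exact integer arithmetic).
n = 30
v = (1, 0)
for _ in range(n):
    v = apply(M, v)
sigma_est = math.log(math.hypot(v[0], v[1])) / n

print(det, lam1, lam2, sigma1, sigma2, sigma_est)
```

The finite-$n$ estimate approaches $\sigma_1$ from below with a correction of order $1/n$, because the initial vector is not aligned with the unstable direction.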
[Figure 7.10: two panels, p versus q; the left panel contains a magnified inset.]
Figure 7.10: Standard map cantori with a golden-mean winding number, obtained at $K - K_c = 0.01$ (left panel) and $K - K_c = 0.3$ (right panel).
[Figure 7.11: three panels, y versus x on the unit square.]
Figure 7.11: The fate of a cat under two iterations of the map (7.60).
7.3.7 The baker map; Bernoulli shifts
The map
$$ \begin{aligned} x_{n+1} &= 2x_n , & y_{n+1} &= \tfrac{1}{2} y_n & &\text{if } 0 \le x_n < \tfrac{1}{2} , \\ x_{n+1} &= 2x_n - 1 , & y_{n+1} &= \tfrac{1}{2} y_n + \tfrac{1}{2} & &\text{if } \tfrac{1}{2} \le x_n < 1 , \end{aligned} \tag{7.63} $$
because of its action, which is to shrink (halve) in the vertical and stretch (double) in the horizontal direction (cf. Fig. 7.12), has been named the “baker's” map.
[Figure 7.12: three panels, y versus x on the unit square.]
Figure 7.12: Evolution of the cat of Fig. 7.11 under three successive iterations of the baker's map (7.63).
• the map is fully reversible.
• just like the cat map, the baker's map has a single, hyperbolic fixed point at $x^* = y^* = 0$.
• the tangent map is the same for any trajectory:
$$ M = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix} \tag{7.64} $$
does not depend on the coordinates. The eigenvalues of the map are $\lambda_1 = 2$; $\lambda_2 = 1/2$ and the corresponding Lyapunov exponents $\sigma_1 = \ln 2 = -\sigma_2$.
• mixing: The map has the mixing property.
• Bernoulli shifts: Let $x_0, y_0$ be represented in binary notation as
$$ x_0 = .a_1 a_2 \cdots a_i \cdots , \qquad y_0 = .b_1 b_2 \cdots b_i \cdots \tag{7.65} $$
where $a_i, b_i = 0$ or $1$. A symbolic “back-to-back” representation of both coordinates can be written as
$$ X_0 = (x_0, y_0) = \cdots b_i \cdots b_2 b_1 . a_1 a_2 \cdots a_i \cdots . \tag{7.66} $$
The first iteration, by doubling the $x$ and halving the $y$, produces
$$ X_1 = (x_1, y_1) = \cdots b_i \cdots b_2 b_1 a_1 . a_2 \cdots a_i \cdots , \tag{7.67} $$
i.e. shifts the decimal point by one position to the right.² This process is called a Bernoulli shift.
Now look at a “coarse-grained” description of the sequence $\{X_n \bmod 1\}$, where the only information retained is the first digit after the decimal point. For typical (i.e. irrational) $x_0, y_0$ this will be an aperiodic sequence of zeros and ones, i.e. a sequence which is essentially equivalent to the tossing of a coin. Note that this totally random behavior has been obtained by coarse-graining of an entirely deterministic, reversible map. I will return to Bernoulli shifts in the next section, because they are a general feature of deterministic chaos.
² Convince yourselves that this is so by looking separately at the cases $a_1 = 0$ and $a_1 = 1$!
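The shift property can be verified with exact rational arithmetic. The following sketch uses helper names of my own (`baker`, `from_bits`) and two arbitrary four-digit binary fractions. It checks both cases $a_1 = 1$ and $a_1 = 0$: after one iteration the leading digit of $x$ has moved to the front of $y$.

```python
from fractions import Fraction

def baker(x, y):
    """One iteration of the baker's map (7.63) on exact rationals."""
    half = Fraction(1, 2)
    if x < half:
        return 2 * x, y / 2
    return 2 * x - 1, y / 2 + half

def from_bits(bits):
    """Value of the binary fraction .b1 b2 b3 ... given as a string of 0/1."""
    return sum(Fraction(int(b), 2 ** (i + 1)) for i, b in enumerate(bits))

b = "0110"                      # y_0 = .0110 (binary)
results = []
for a in ("1011", "0101"):      # x_0 with a_1 = 1 and with a_1 = 0
    x1, y1 = baker(from_bits(a), from_bits(b))
    # Shift (7.67): x_1 = .a2 a3 a4 and y_1 = .a1 b1 b2 b3 b4
    results.append((x1 == from_bits(a[1:]), y1 == from_bits(a[0] + b)))

print(results)
```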
7.3.8 The circle map. Frequency locking
The circle map
$$ \theta_{n+1} = \theta_n + \Omega - K \sin\theta_n \tag{7.68} $$
is a one-dimensional, non-area-preserving map; I introduce it here in order to describe the principle behind frequency locking. In addition, the map exhibits KAM characteristics, breaking of rational tori etc. (Arnold).
I will look at the winding number
$$ R = \frac{1}{2\pi} \lim_{n\to\infty} \left( \frac{\theta_n - \theta_0}{n} \right) . \tag{7.69} $$
In the integrable limit, $K = 0$, $\theta_n - \theta_0 = n\Omega$ and therefore $R = \Omega/2\pi$. For $K \neq 0$
$$ \frac{\partial\theta_{n+1}}{\partial\theta_n} = 1 - K \cos\theta_n \neq 1 \tag{7.70} $$
[Figure 7.13: $\Omega/K$ versus $\theta$, with points marked A, A' and A''.]
Figure 7.13: Tangent bifurcation scheme for the fixed point of Eq. (7.71).
Consider first the fixed point, $\theta_n = \theta^*\ \forall n$, corresponding to $R = 0$. From (7.68), this will happen if
$$ \Omega - K \sin\theta^* = 0 . \tag{7.71} $$
Eq. (7.71) has no solutions if $|\Omega| > K$, two solutions if $|\Omega| < K$ (one corresponding to a stable and the other to an unstable fixed point), and one solution if $|\Omega| = K$. This behavior corresponds to a tangent bifurcation scenario (cf. Fig. 7.13). Note that the winding number $R$ will now be equal to zero not just for $\Omega = 0$, but for any $|\Omega| < K$.
An analogous situation occurs for $R = 1/2$, i.e. for a 2-cycle. In that case, the 2-cycle remains stable within a band $\Omega^-_{1/2} < \Omega < \Omega^+_{1/2}$, where $\Omega^\pm_{1/2} = \pi \pm K^2/4$, i.e. the band width is $\Delta\Omega_{1/2} = K^2/2$. More generally, a rational winding number $R = P/Q$ will “lock in” to that value for any $\Omega$ within a band of bandwidth
$$ \Delta\Omega_{P/Q} \propto K^Q . \tag{7.72} $$
This is the phenomenon of frequency locking. The total length of frequency-locked intervals tends to zero as $K \to 0$. Note the analogy with the breaking of KAM tori (Arnold); irrational winding numbers occupy most of available phase space in the slightly perturbed system. Stable frequency-locked intervals in the $\Omega$-$K$ plane are shown in Fig. 7.14.
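Frequency locking is easy to observe numerically. The sketch below is my own illustration: the function name, the value $K = 0.5$ and the iteration count are arbitrary. It evaluates a finite-$n$ version of (7.69) for three cases: inside the $R = 0$ band, at the centre $\Omega = \pi$ of the $R = 1/2$ band, and in the integrable limit $K = 0$.

```python
import math

def winding_number(omega, K, n=5000, theta0=0.0):
    """Finite-n estimate of R, Eq. (7.69); theta is not reduced mod 2*pi."""
    theta = theta0
    for _ in range(n):
        theta = theta + omega - K * math.sin(theta)   # circle map (7.68)
    return (theta - theta0) / (2.0 * math.pi * n)

K = 0.5
R_locked0 = winding_number(0.3, K)       # |Omega| < K: orbit settles on the stable fixed point
R_half = winding_number(math.pi, K)      # centre of the R = 1/2 band
R_free = winding_number(1.0, 0.0)        # K = 0: uniform rotation, R = Omega / (2 pi)

print(R_locked0, R_half, R_free)
```

Scanning $\Omega$ at fixed $K$ with this function produces the plateaus of the devil's staircase discussed next.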
For values $K < 1$ the winding number $R$ will therefore exhibit frequency-locking steps at various rational values of $R$; between those steps, there will be intervals where $R$ will depend linearly on $\Omega$. This kind of behavior is represented pictorially by the “incomplete devil's staircase”. At $K = 1$ there are steps at all rational numbers and no portions with a finite slope. Frequency-locked intervals now have measure 1. This is the “complete devil's staircase” (Fig. 7.15). For values $K > 1$ chaos occurs as the resonance regions depicted in Fig. 7.14 begin to overlap. More details in [17].
Figure 7.14: Frequency locking in the circle map. For any value of $K > 0$ there is a small band of $\Omega$ values, of width $\Delta\Omega_{P/Q}$, for which the winding number $R$ locks to the rational value $P/Q$. As long as $K < 1$, the total measure of such intervals is less than 1. At $K = 1$, the total measure of locked-in frequency intervals is equal to unity. At values $K > 1$, bands corresponding to different rational ratios begin to overlap (resonance overlap). This is indicated by the dotted lines (from [17]).
Figure 7.15: The complete devil's staircase at $K = 1$. Frequency locking takes place at all rationals. The inset shows that the staircase exhibits self-similarity at all scales (from [17]).
7.4 Topology of chaos: stable and unstable manifolds, homoclinic points
The stable manifold of a cycle³ is the set of points such that, if the forward map is started from one of them, further iterates will approach the cycle. Similarly, in the case of an invertible map, the unstable manifold of a cycle is the set of points such that, if the inverse (backward) map is started from one of them, further iterates will approach the cycle.
³ A fixed point of a map is a cycle of period 1; for systems with continuous dynamics it is straightforward to generalize statements on cycles and apply them to periodic orbits.
Stable and unstable manifolds cannot intersect themselves; however, they can intersect each other. If the manifolds belong to the same hyperbolic fixed point, their intersections are called homoclinic points. If the manifolds belong to different hyperbolic fixed points, the intersections are called heteroclinic points. Fig. 7.16 shows what happens at a homoclinic point $X_0$. Let $X_1$, $X_2$ be two successive iterates of $X_0$ along the stable manifold (s). Consider now the neighboring point $Y_0$ on the unstable manifold (u). Because it is a “predecessor” of $X_0$ (follow the arrows!), its iterate $Y_1$ must find a place on the unstable manifold prior to $X_1$. In order to accommodate this requirement, the unstable manifold must fold. Now, as the hyperbolic fixed point $P$ is approached, the distance between iterates along the stable manifold decreases. The folds of the unstable manifold must lie closer and closer to each other. Because of the area preserving property (the crossed areas in the figure should be equal), this makes the folds larger and larger in amplitude. Thus, if a single homoclinic point exists, an infinite sequence is generated. Fig. 7.16 (right panel) illustrates this in the case of the hyperbolic fixed point of the standard map.
The complexity of the intersecting stable and unstable manifolds near hyperbolic fixed points lies at the heart of chaos in conservative systems. It was aptly recognized by its discoverer, Poincaré, with the fitting response “the complexity of this figure will be striking, and I shall not even try to draw it”.
[Figure 7.16: right panel shows p versus q near the hyperbolic fixed point, labeled K=0.5.]
Figure 7.16: Homoclinic points in the vicinity of a hyperbolic fixed point. Left: a schematic view (cf. text); Right: the stable (red, thicker points) and unstable (black, thinner points) manifolds of the hyperbolic fixed point of the standard map at a low value of the nonlinearity parameter $K = 0.5$.
For more details consult the excellent textbooks available, e.g. by Ott [18] or Tabor [19].
8 Solitons in scalar ﬁeld theories
8.1 Deﬁnitions and notation
8.1.1 Lagrangian, continuum ﬁeld equations
Starting point: classical discrete Lagrangian
$$ L = \sum_{i=1}^{N} \left[ \frac{1}{2} I \dot\phi_i^2 - mgl(1 - \cos\phi_i) - \frac{1}{2} K (\phi_{i+1} - \phi_i)^2 \right] , \tag{8.1} $$
Physical realization, e.g. coupled torsion pendula. Disks of radius l with moment of inertia
I and an extra mass m on the periphery. Terms represent, respectively:
• rotational kinetic energy
• potential energy of mass in gravitational ﬁeld
• potential energy of coupling
Other physical realizations: arrays of Josephson junctions, one-dimensional magnets, ...
Equations of motion
$$ I \ddot\phi_i = K(\phi_{i+1} + \phi_{i-1} - 2\phi_i) - mgl \sin\phi_i \tag{8.2} $$
Continuum approximation
• $$ \phi_{i\pm1} = \phi(x_{i\pm1}) \approx \phi(x_i) \pm a\phi'(x_i) + \frac{1}{2} a^2 \phi''(x_i) $$
where $a$ is the distance between neighboring disks (lattice constant).
• $x_i \to x$ (continuum space variable)
leads to
$$ c_0^2 \frac{\partial^2\phi}{\partial x^2} - \frac{\partial^2\phi}{\partial t^2} = \omega_0^2 \sin\phi \tag{8.3} $$
where $c_0^2 = Ka^2/I$, $\omega_0^2 = mgl/I$.
The Klein-Gordon class
Eq. (8.3) is a member of the wider class of Klein-Gordon (KG) field equations
$$ c_0^2 \phi_{xx} - \phi_{tt} = \omega_0^2 V'(\phi) \tag{8.4} $$
where the on-site potential can be (examples)
• $V(\phi) = \frac{1}{2}\phi^2$
original Klein-Gordon (linear, QM ca 1930)
• $V(\phi) = \frac{1}{8}(1 - \phi^2)^2$
known as $\phi^4$ (displacive phase transitions, continuum version of Ising model)
• $V(\phi) = 1 - \cos\phi$
known as Sine-Gordon (misnomer 1970, rhymes with Klein-Gordon); proposed earlier by Frenkel-Kontorova in context of dislocations, Skyrme in the 60s as a nonlinear field model for nucleons.
Of interest for characterization of on-site potential: vacuum state, defined by
$$ V'(\phi_0) = 0 , \qquad V''(\phi_0) > 0 \tag{8.5} $$
As defined (dimensionless) all examples have $V(\phi_0) = 0$ and $V''(\phi_0) = 1$. Hence for small displacements from $\phi_0$
$$ V(\phi) \approx \frac{1}{2}(\phi - \phi_0)^2 . $$
Note further that the 3 examples defined above have, respectively,
1. a single vacuum
2. two degenerate vacua
3. an infinite number of degenerate vacua.
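The field equations (8.4) can also be directly derived from the continuous Lagrangian
$$ L = A \int dx\, \mathcal{L} $$
where
$$ \mathcal{L}(\phi, \phi_x, \phi_t) = \frac{1}{2} \left( \frac{\partial\phi}{\partial t} \right)^2 - \frac{1}{2} c_0^2 \left( \frac{\partial\phi}{\partial x} \right)^2 - \omega_0^2 V(\phi) $$
is the Lagrangian density, and $A$ defines the energy scale. In the SG example, $A = I/a$.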
Symmetries and Conservation laws
Symmetries and conservation laws have been discussed in Section 1.4. In particular, the
invariance of the Lagrangian density with respect to space and time leads, respectively, to
the conservation of total momentum P and energy E. Furthermore, Lorentz invariance leads
to the conservation of angular momentum, which in 1+1 dimensional systems is simply
EX −Pt .
In the special case of $P = 0$ (localized field configurations with vanishing total momentum), this implies that the center of energy remains fixed. This is the relativistic analog of the center-of-mass theorem of Newtonian dynamics. It turns out to be quite useful in soliton dynamics.
Furthermore, Lorentz invariance implies that if $\phi(x)$ is a solution of (8.4), so is $\phi(\gamma(x - vt))$, where $\gamma = (1 - v^2/c_0^2)^{-1/2}$ and $|v| < c_0$. In other words, any static solution can be “Lorentz-boosted”.
8.2 Static localized solutions (general KG class)
8.2.1 General properties
The vacuum
Note that the vacuum (or vacua) is always a solution of the equations of motion (8.4).
Other solutions
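I look for static, localized solutions, which may then be Lorentz-boosted. The starting point is
$$ c_0^2 \phi_{xx} = \omega_0^2 V'(\phi) $$
or, in dimensionless form,
$$ \frac{d^2\phi}{d\xi^2} = \frac{dV}{d\phi} \tag{8.6} $$
where $d = c_0/\omega_0$ and $\xi = x/d$. Multiplying both sides of (8.6) by $d\phi/d\xi$, I obtain
$$ \frac{1}{2} \frac{d}{d\xi} \left[ \left( \frac{d\phi}{d\xi} \right)^2 \right] = \frac{dV}{d\xi} $$
which has a first integral
$$ \frac{1}{2} \left( \frac{d\phi}{d\xi} \right)^2 = V + \text{const.} $$
For solutions which are localized, i.e.
$$ \lim_{\xi\to\pm\infty} \frac{d\phi}{d\xi} = 0 \tag{8.7} $$
and
$$ \lim_{\xi\to\pm\infty} V = V(\phi_0) = 0 $$
the integration constant vanishes. A second integral can then be formally written as
$$ \xi - \xi_0 = \pm \int_{\phi_0}^{\phi} d\phi' \frac{1}{\sqrt{2V(\phi')}} . \tag{8.8} $$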
8.2.2 Speciﬁc potentials
The linear KG case
In the linear KG case, $V(\phi) = \phi^2/2$, (8.8) becomes
$$ \xi - \xi_0 = \pm \int^{\phi} d\phi' \frac{1}{\phi'} = \pm \ln\phi $$
or
$$ \phi = e^{\pm(\xi - \xi_0)} $$
which, although it formally satisfies the original field equation, is not a localized solution in the sense of (8.7). Therefore it is not a physical solution.
The $\phi^4$ kink
In the $\phi^4$ case (8.8) becomes
$$ \xi - \xi_0 = \pm \int^{\phi} d\phi' \frac{1}{\frac{1}{2}(1 - \phi'^2)} = \pm 2 \,\mathrm{arctanh}\, \phi $$
or
$$ \phi_K(x) = \pm \tanh\left( \frac{x - x_0}{2d} \right) , \tag{8.9} $$
where $x_0 = \xi_0 d$. The upper sign corresponds to a kink, the lower to an antikink. Both solutions interpolate between the two degenerate vacua.
The SG kink
In the SG case (8.8) becomes
$$ \xi - \xi_0 = \pm \int^{\phi} d\phi' \frac{1}{\sqrt{2(1 - \cos\phi')}} = \pm \int^{\phi} d\phi' \frac{1}{2\sin\frac{\phi'}{2}} = \pm \ln\tan\frac{\phi}{4} $$
or
$$ \phi_K(x) = 4 \arctan \exp\left\{ \pm \frac{x - x_0}{d} \right\} . \tag{8.10} $$
The solution with the upper sign interpolates between $\phi_0 = 0$ and $\phi_0 = 2\pi$, the one with the lower sign conversely.
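A direct numerical check of the two kink profiles against the dimensionless static equation (8.6) can be sketched as follows. This is an illustration of mine, with $\xi_0 = 0$, the upper sign, and an arbitrary finite-difference step; the second derivative is approximated by a central difference and compared with $dV/d\phi$ on a grid of $\xi$ values.

```python
import math

def second_derivative(f, xi, h=1e-3):
    """Central-difference approximation of f''(xi)."""
    return (f(xi + h) - 2.0 * f(xi) + f(xi - h)) / (h * h)

def phi4_kink(xi):
    return math.tanh(xi / 2.0)              # (8.9) with xi = x/d, xi_0 = 0

def sg_kink(xi):
    return 4.0 * math.atan(math.exp(xi))    # static SG kink, xi_0 = 0

def dV_phi4(phi):
    return -0.5 * phi * (1.0 - phi * phi)   # derivative of V = (1 - phi^2)^2 / 8

def dV_sg(phi):
    return math.sin(phi)                    # derivative of V = 1 - cos(phi)

xis = [-5.0 + 0.25 * i for i in range(41)]
res_phi4 = max(abs(second_derivative(phi4_kink, x) - dV_phi4(phi4_kink(x))) for x in xis)
res_sg = max(abs(second_derivative(sg_kink, x) - dV_sg(sg_kink(x))) for x in xis)

print(res_phi4, res_sg)   # both residuals are at the level of the discretization error
```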
8.2.3 Intrinsic Properties of kinks
Topological charge
Kinks (and antikinks) interpolate between distinct, degenerate vacua. They are known as topological solitons. The conserved quantity
$$ Q = \int_{-\infty}^{\infty} dx \frac{d\phi}{dx} = \phi(\infty) - \phi(-\infty) \tag{8.11} $$
is known as topological charge. With the normalization (8.11), a $\phi^4$ kink, which runs from $\phi = -1$ to $\phi = +1$, has topological charge 2, an antikink $-2$. A SG kink has topological charge $2\pi$, an antikink $-2\pi$.
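Rest energy of a kink
The total energy can be obtained from the Hamiltonian density
$$ \mathcal{H} = A \left[ \frac{1}{2} \left( \frac{\partial\phi}{\partial t} \right)^2 + \frac{1}{2} c_0^2 \left( \frac{\partial\phi}{\partial x} \right)^2 + \omega_0^2 V(\phi) \right] . \tag{8.12} $$
For a static kink, the first term is zero, and the second and third terms are equal (cf. above). Thus
$$ E_K \equiv M c_0^2 = A c_0^2 \int_{-\infty}^{\infty} dx \left( \frac{\partial\phi}{\partial x} \right)^2 = A c_0^2 \frac{1}{d} \int_{-\infty}^{\infty} d\xi \left( \frac{\partial\phi}{\partial\xi} \right)^2 = A c_0^2 \frac{1}{d} \int_{\phi_1}^{\phi_2} d\phi \frac{\partial\phi}{\partial\xi} $$
or
$$ M = \frac{A}{d} \int_{\phi_1}^{\phi_2} d\phi \sqrt{2V(\phi)} . \tag{8.13} $$
Note that we do not need the explicit form of the kink solution in order to calculate the rest mass. In the case of the $\phi^4$ field, with $V$ normalized as above, $\sqrt{2V} = \frac{1}{2}(1 - \phi^2)$ and this gives $M = 2A/3d$. In the case of the SG field, $M = 8A/d$.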
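The integral in (8.13) can be evaluated by simple quadrature without knowing the kink profile. The sketch below is my own and uses a hand-written Simpson rule. It takes the two potentials exactly as normalized in the list of examples in Section 8.1.1 and returns $M$ in units of $A/d$. With that normalization the quadrature gives $2/3$ for $\phi^4$ (integration from $-1$ to $1$) and $8$ for SG (integration from $0$ to $2\pi$).

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def V_phi4(phi):
    return (1.0 - phi * phi) ** 2 / 8.0

def V_sg(phi):
    return 1.0 - math.cos(phi)

def mass_integral(V, phi1, phi2):
    """Integral of sqrt(2 V) between two neighbouring vacua, cf. (8.13)."""
    # max(..., 0.0) guards against tiny negative round-off under the square root
    return simpson(lambda p: math.sqrt(max(2.0 * V(p), 0.0)), phi1, phi2)

m_phi4 = mass_integral(V_phi4, -1.0, 1.0)          # M in units of A/d
m_sg = mass_integral(V_sg, 0.0, 2.0 * math.pi)     # M in units of A/d

print(m_phi4, m_sg)
```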
Energy and momentum of a moving kink: classical wave-particle duality
The energy of the moving kink
$$ \phi_K(\gamma(x - vt)) $$
where
$$ \gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c_0^2}}} $$
can be directly computed from the full Hamiltonian density. It is
$$ E(v) = M c_0^2 \gamma . \tag{8.14} $$
The momentum can be computed from (1.49) and is equal to
$$ P(v) = M \gamma v . \tag{8.15} $$
The energy-momentum relationship
$$ E^2 = P^2 c_0^2 + M^2 c_0^4 $$
is also satisfied.
The fact that kink and antikink solutions satisfy the usual relativistic kinematic relations which ordinarily hold for mass points suggests that these classical localized fields may, for many practical purposes, be effectively treated like particles. The remarkable property of particle-wave duality at a classical level suggests that soliton-bearing classical Lagrangians might be good candidates for the construction of nonlinear quantum field theories.
8.2.4 Linear stability of kinks
Consider small displacements around a static kink solution of the KG class. The total space- and time-dependent field is written as
$$ \phi(x, t) = \phi_K(x) + \chi(x, t) \tag{8.16} $$
where $\chi$ will be regarded as a small quantity. Keeping only linear terms in $\chi$ leads to
$$ c_0^2 \chi_{xx} - \chi_{tt} = \omega_0^2 V''(\phi_K)\, \chi . $$
Using a separation of variables ansatz
$$ \chi(x, t) = \sum_j \alpha_j e^{i\omega_j t} f_j(x) \tag{8.17} $$
leads to the eigenvalue equation
$$ -\frac{d^2 f_j}{d\xi^2} + V''(\phi_K) f_j(\xi) = \Omega_j^2 f_j(\xi) \tag{8.18} $$
where I have again introduced the dimensionless space variable $\xi = x/d$, and $\Omega_j = \omega_j/\omega_0$.
Eq. (8.18) has the form of a Schrödinger equation. The effective potential is the second derivative of $V$, evaluated at $\phi = \phi_K$. Because $\phi_K$ asymptotically approaches the vacuum field values, the effective potential (Draw!) approaches $V''(\phi_0)$ which with our conventions has the value 1. Possible eigenstates of (8.18) may then be
• localized (bound), with $\Omega^2 < 1$ or
• extended (scattering) states, with $\Omega^2 > 1$.
Linear stability requires that
$$ \Omega_j^2 \ge 0 . \tag{8.19} $$
Bound states. The zero frequency (Goldstone) mode
The function
$$ f_0 = \frac{d\phi_K}{d\xi} \tag{8.20} $$
is always an eigenstate of (8.18), associated with the eigenvalue 0. One can see this by noting that satisfying
$$ -\phi_{K,\xi\xi\xi} + V''(\phi_K)\, \phi_{K,\xi} = 0 $$
is equivalent to satisfying
$$ \frac{d}{d\xi} \left[ -\phi_{K,\xi\xi} + V'(\phi_K) \right] = 0 . $$
But the brackets are identically zero for a kink solution.
The zero-frequency (or translational) mode, named after Goldstone, reflects the invariance of the kink solution under translations. Note in this context that the integration constant $\xi_0$ (cf. above) does not enter the expression for the rest energy. A kink (or antikink) can be translated in space at zero energy cost. A kink solution centered at zero has the same energy as a kink solution centered at $\alpha$. If $\alpha$ is small, the latter can be obtained from the former by Taylor expansion
$$ \phi_K(\xi - \alpha) \approx \phi_K(\xi) - \alpha \frac{d\phi_K}{d\xi} $$
which is why $d\phi_K/d\xi$ must be an eigenstate of (8.18).
The Goldstone mode must be the eigenstate with the lowest $\Omega_j^2$ value. One can see this from the fact that, since kinks are monotonic functions in space, $d\phi_K/d\xi$ has no nodes.
Other bound states may or may not exist, depending on the details of the effective potential. For example, the SG kink has no further bound states. The $\phi^4$ kink has a further localized mode, with an internal oscillation frequency .... and an eigenfunction ....
Scattering states. Phase shifts
In general, scattering states consist of an incident, a transmitted and a reflected wave. The effective potentials which correspond to both the SG and the $\phi^4$ kink have the special property that they are reflectionless. In other words, the eigenfunctions corresponding to extended states with frequencies
$$ \Omega_q^2 = 1 + q^2 \tag{8.21} $$
have the asymptotic form
$$ \lim_{\xi\to\pm\infty} f_q(\xi) \sim e^{iq\xi \pm i\delta(q)/2} . \tag{8.22} $$
The total phase shift $\delta(q)$ describes the net effect of the interaction between an incident phonon plane wave and a static kink.
Note that the above property is asymptotic. The exact form of extended eigenstates may be significantly distorted in the neighborhood of the kink. For example, in the SG case, where $V''(\phi_K) = 1 - 2/\cosh^2\xi$,
$$ f_q(\xi) = (\tanh\xi - iq)\, e^{iq\xi} , $$
from which
$$ \delta(q) = 2 \arctan\frac{1}{q} $$
follows.
8.3 Special properties of the SG ﬁeld
8.3.1 The Sine-Gordon breather
The SG field equations admit a family of special localized solutions with an internal oscillation
$$ \phi_{br}(x, t) = 4 \arctan\left[ \rho\, \frac{\sin\omega(t - t_0)}{\cosh[(x - x_0)/\lambda]} \right] \tag{8.23} $$
where $\omega = \omega_0 \cos\mu$, $\rho = \tan\mu$, $\lambda = d/\sin\mu$ and $0 < \mu < \pi/2$. The constants $x_0$, $t_0$ are arbitrary and can be shown to generate two Goldstone modes (cf. above), related, respectively, to spatial and temporal translations. The solution is known as a “breather” and the form shown is in its rest frame. It can be Lorentz-boosted by applying the Lorentz transformations. The rest energy of a breather is
$$ E^0_{br} = 2 M c_0^2 \sin\mu . \tag{8.24} $$
In the limit of $\mu \ll 1$ the breather reduces to a phonon. In the limit of $\mu \to \pi/2$ the frequency of oscillation approaches zero, the energy approaches $2 M c_0^2$, and the breather looks very much like a bound kink-antikink pair.
The breather is a singular feature of the SG field theory. Continuum field equations with other potentials of the KG class do not exhibit time-periodic, spatially localized solutions.
[Figure 8.1: two panels, $\phi$ versus $x/d$, labeled $\mu = \pi/4$ (left) and $\mu = \pi/2.001$ (right).]
Figure 8.1: Left: multiple snapshots of a SG breather with intermediate amplitude. Right: a very slow breather with $\mu = \pi/2.001$ which looks like a bound kink-antikink pair (of course if you observe the very slow oscillation over an extremely long period of time you will note the periodic motion). The snapshots are taken at times which are very far apart from the point of view of the laboratory observer: $\pm\pi/(2\omega_0\cos\mu)$ and $\pm\pi/(16\omega_0\cos\mu)$.
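That (8.23) really solves the SG equation (8.3) can be checked numerically. The following sketch is an illustration of mine. It works in units $c_0 = \omega_0 = d = 1$ with $x_0 = t_0 = 0$, picks $\mu = \pi/4$ arbitrarily, and evaluates the residual $\phi_{xx} - \phi_{tt} - \sin\phi$ with central differences on a small space-time grid.

```python
import math

def breather(x, t, mu):
    """SG breather (8.23) in units c_0 = omega_0 = d = 1, with x_0 = t_0 = 0."""
    omega, rho, lam = math.cos(mu), math.tan(mu), 1.0 / math.sin(mu)
    return 4.0 * math.atan(rho * math.sin(omega * t) / math.cosh(x / lam))

def sg_residual(x, t, mu, h=1e-3):
    """phi_xx - phi_tt - sin(phi), derivatives from central differences."""
    phi = breather(x, t, mu)
    phi_xx = (breather(x + h, t, mu) - 2.0 * phi + breather(x - h, t, mu)) / h ** 2
    phi_tt = (breather(x, t + h, mu) - 2.0 * phi + breather(x, t - h, mu)) / h ** 2
    return phi_xx - phi_tt - math.sin(phi)

mu = math.pi / 4
worst = max(abs(sg_residual(0.5 * i, 0.3 * j, mu))
            for i in range(-10, 11) for j in range(20))

print(worst)   # small, limited only by the finite-difference step h
```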
8.3.2 Complete Integrability
The SG ﬁeld equation with decaying boundary conditions can be completely integrated using
the inverse scattering transform. Details in ....
9 Atoms on substrates: the Frenkel-Kontorova model
The Frenkel-Kontorova (FK) model [20] is an attempt to describe structures of adsorbed layers which have to reflect two competing periodicities, that of the substrate and that of the adatoms. The total potential energy is assumed to be
$$ \Phi = \frac{C}{2} \sum_n (x_{n+1} - x_n - a)^2 + V_0 \sum_n \left[ 1 - \cos\frac{2\pi x_n}{b} \right] , \tag{9.1} $$
where $x_n$ is the position of the $n$th atom, $a$, $b$ are the natural periodicities of adatoms and substrate, respectively, and $C$, $V_0$ are material constants denoting the strength of the two potentials. The first term in (9.1) describes the harmonic interactions between the adatoms, whereas the second term models the periodic template provided by the substrate.
I will use a dimensionless description of all relevant quantities. Let $\delta = (a - b)/b$ be the “mismatch” between the two competing length scales; let further
$$ x_n = bn + b\phi_n \tag{9.2} $$
denote the breakup of the displacement of the $n$th atom into a part which follows the substrate and a “rest”. The dimensionless potential energy is then given by
$$ \hat\Phi = \frac{\Phi}{Cb^2} = \frac{1}{2} \sum_n (\phi_{n+1} - \phi_n - \delta)^2 + \frac{1}{(2\pi\lambda)^2} \sum_n (1 - \cos 2\pi\phi_n) , \tag{9.3} $$
where $\lambda = \left[ Cb^2/V_0 \right]^{1/2}/(2\pi)$ is the dimensionless coupling constant.
The equilibria of (9.3), defined by
$$ \frac{\partial\hat\Phi}{\partial\phi_n} = 0 \quad \forall n , \tag{9.4} $$
are given in terms of the second-order recurrence equations
$$ \phi_{n+1} + \phi_{n-1} - 2\phi_n = \frac{1}{2\pi\lambda^2} \sin 2\pi\phi_n . \tag{9.5} $$
Eq. (9.5) is equivalent to the two-dimensional standard map discussed in Section 7.3.5. This means that we should in general expect to find ground and metastable states of (9.1) which exhibit all the complexity encountered there (and discussed in the general context of a dynamical system, where the index $n$ stood for a discrete time). In particular, we expect to find phase locking, i.e. adatoms and substrate “locked” into lattice periodicities whose ratios are rational numbers. Furthermore, we expect to find metastable, chaotic configurations at higher values of the nonlinearity (low values of the coupling constant $\lambda$).
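The link between the recurrence (9.5) and the equilibrium condition (9.4) can be made concrete with a few lines of code. The sketch below uses parameter values and function names of my own choosing. It generates a finite configuration by iterating (9.5) like a map, then differentiates the energy (9.3) numerically and confirms that the gradient vanishes at all interior sites. The mismatch $\delta$ drops out of the interior equations and only enters through the chain ends.

```python
import math

def fk_energy(phi, delta, lam):
    """Dimensionless FK energy (9.3) for a finite chain phi[0..N-1]."""
    elastic = 0.5 * sum((phi[n + 1] - phi[n] - delta) ** 2 for n in range(len(phi) - 1))
    substrate = sum(1.0 - math.cos(2.0 * math.pi * p) for p in phi)
    return elastic + substrate / (2.0 * math.pi * lam) ** 2

def iterate_recurrence(phi0, phi1, lam, n_sites):
    """Generate a configuration from (9.5), treating n as a discrete time."""
    phi = [phi0, phi1]
    for _ in range(n_sites - 2):
        kick = math.sin(2.0 * math.pi * phi[-1]) / (2.0 * math.pi * lam ** 2)
        phi.append(2.0 * phi[-1] - phi[-2] + kick)
    return phi

lam, delta = 2.0, 0.1                 # illustrative values
phi = iterate_recurrence(0.01, 0.02, lam, 12)

# Central-difference gradient of the energy at the interior sites.
h = 1e-6
grad = []
for n in range(1, len(phi) - 1):
    up = list(phi)
    down = list(phi)
    up[n] += h
    down[n] -= h
    grad.append((fk_energy(up, delta, lam) - fk_energy(down, delta, lam)) / (2.0 * h))
max_grad = max(abs(g) for g in grad)

print(max_grad)
```

Such a configuration is an extremum of the energy but not necessarily a minimum; deciding which extrema are ground states is the subject of the following sections.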
We begin by looking at the absolute minimum of (9.3) for weak nonlinearities and, more specifically, at the first transition between a commensurate and an incommensurate phase. In order to do this, we will always compare the energy of a local extremum defined by (9.5) with the energy
$$ \hat\Phi_0 = \frac{1}{2} N\delta^2 \tag{9.6} $$
of the reference state
$$ \phi_n = 0 \quad \forall n $$
where $N$ is the total number of substrate atoms.
9.1 The Commensurate-Incommensurate transition
9.1.1 The continuum approximation
If the coupling is strong, $\lambda \gg 1$, it is possible to make a continuum approximation $\phi_n \to \phi(n)$; in this case (9.5) becomes, to leading order,
$$ \frac{d^2\phi}{dn^2} = \frac{1}{2\pi\lambda^2} \sin 2\pi\phi , \tag{9.7} $$
which is known as the Sine-Gordon equation (cf. Chapter 8). Eq. (9.7) has a first integral
$$ \left( \frac{d\phi}{dn} \right)^2 = \frac{1}{2\pi^2\lambda^2} \left( -\cos 2\pi\phi + \text{const} \right) , $$
which can be rewritten, by setting the constant equal to $1 + 2\epsilon$ and taking the square root, as
$$ \frac{dn}{d\phi} = \pm\frac{1}{g(\phi)} , \tag{9.8} $$
where
$$ g(\phi) = \frac{1}{\pi\lambda} \sqrt{\sin^2\pi\phi + \epsilon} . \tag{9.9} $$
Eq. (9.8) can be integrated again, in the form
$$ n - \nu = \pm J(\phi) = \pm \int_0^{\phi} \frac{d\bar\phi}{g(\bar\phi)} \tag{9.10} $$
where $\nu$ is a further constant of integration¹.
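The total energy: an intermediate result
The total energy associated with the solution (9.10) is
$$ \hat\Phi = \int dn \left[ \frac{1}{2} \left( \frac{d\phi}{dn} \right)^2 + \frac{1}{(2\pi\lambda)^2} (1 - \cos 2\pi\phi) \right] - \delta \int dn \frac{d\phi}{dn} + \frac{1}{2} N\delta^2 $$
or, measured from the reference energy (9.6),
$$ \Delta\hat\Phi = \hat\Phi - \hat\Phi_0 = \int_{\phi_1}^{\phi_2} d\phi\, g(\phi) - \frac{\epsilon}{2\pi^2\lambda^2} N - \delta(\phi_2 - \phi_1) , \tag{9.11} $$
where $\phi_{2,1} = \lim_{n\to\pm\infty} \phi(n)$, and I have also used the fact that $\phi(n)$ is a monotonic function of $n$ (cf. below).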
¹ A further substitution $k^2 = (1 + \epsilon)^{-1}$, $\chi = (\phi - 1/2)\pi$ and $u = (n - \nu)/(k\lambda)$ transforms (9.10) to
$$ u = \pm F(k, \chi) $$
where $F(k, \chi)$ is the elliptic integral of the first kind. The latter equation can be formally inverted as
$$ \chi = \mathrm{am}(k, \pm u) $$
where am is the Jacobi elliptic amplitude [21]. In these lectures I will follow [22] and present a “no prerequisites” description of the C-I transition.
9.1.2 The special case $\epsilon = 0$: kinks and antikinks
If $\epsilon = 0$, (9.8) admits soliton solutions of the kink/antikink type,
$$ \phi(n) = \frac{2}{\pi} \arctan e^{\pm(n - \nu)/\lambda} . \tag{9.12} $$
The total energy of a kink (or antikink) is, according to (9.11),
$$ \Delta\hat\Phi_{kink} = \frac{2}{\pi^2\lambda} - \delta , \tag{9.13} $$
where I have used $\epsilon = 0$ and $g(\phi) = (\pi\lambda)^{-1} \sin\pi\phi$. Note that the energy is negative, i.e. the kink is formed spontaneously, if
$$ \delta > \delta_c = \frac{2}{\pi^2\lambda} . \tag{9.14} $$
Let me now look at some general properties of (9.10). In the following, I will choose the
upper sign; the analysis can be inverted for the lower sign. Note ﬁrst that the integrand
is positive, therefore J(φ) is a monotonic, and hence invertible function. This is formally
expressed by
φ(n) = J
−1
(n −ν) . (9.15)
Furthermore, since g(φ) = g(φ + 1), it follows that
J(φ + 1) =
_
φ
0
d
¯
φ
g(
¯
φ)
+
_
φ+1
φ
d
¯
φ
g(
¯
φ)
= J(φ) +
_
1
0
d
¯
φ
g(
¯
φ)
= n −ν +L (9.16)
where L is deﬁned as the value of the deﬁnite integral in the second line
2
L = 2
_
1/2
0
d
¯
φ
g(
¯
φ)
. (9.17)
Consequently,
φ(n) + 1 = J
−1
(n −ν +L) = φ(n +L) (9.18)
i.e., each time the index n is increased by L, the ﬁeld variable φ, which measures the deviation
from the reference phase  the phase perfectly matched to the substrate , increases by one.
In the limit $\epsilon \ll 1$, the dominant contribution to L comes from $\phi$ near zero. It is possible to obtain the leading-order contribution to L by approximating
$$g(\phi) \approx \frac{1}{\lambda}\max(\phi,\phi_c)$$
where $\phi_c = \epsilon^{1/2}/\pi$. This results in
$$L \sim -\lambda\ln\frac{\epsilon}{A} \tag{9.19}$$
to leading order in $\epsilon$. A is a numerical constant of order unity.
2 In the elliptic integral notation of the previous footnote,
$$L = 2\lambda k K$$
where $K = F(k,\pi/2)$ is the complete elliptic integral of the first kind.
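The three expressions for L can be compared numerically. The sketch below is illustrative only. It assumes $g(\phi) = (\pi\lambda)^{-1}\sqrt{\sin^2\pi\phi + \epsilon}$, which is the form that appears in (9.22), evaluates K by the arithmetic-geometric mean, and takes A = 16 in (9.19). That value is an inference from the standard asymptotic form $K \approx \ln(4/k')$ and from the prefactor in (9.25); the text itself only states that A is of order unity.

```python
import math

def g(phi, lam, eps):
    """g(phi) as read off from (9.22)."""
    return math.sqrt(math.sin(math.pi * phi) ** 2 + eps) / (math.pi * lam)

def period_quadrature(lam, eps, steps=200000):
    """L from (9.17), midpoint rule on [0, 1/2]."""
    h = 0.5 / steps
    return 2.0 * h * sum(1.0 / g((i + 0.5) * h, lam, eps) for i in range(steps))

def ellip_k(k):
    """Complete elliptic integral of the first kind via the AGM iteration."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(40):                      # AGM converges quadratically
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def period_elliptic(lam, eps):
    """L = 2 lam k K with k^2 = 1/(1 + eps) (footnotes 1 and 2)."""
    k = 1.0 / math.sqrt(1.0 + eps)
    return 2.0 * lam * k * ellip_k(k)

def period_leading(lam, eps, A=16.0):
    """Leading-order estimate (9.19); A = 16 is an assumption, see lead-in."""
    return -lam * math.log(eps / A)

lam, eps = 6.0, 1e-3                         # illustrative values
print(period_quadrature(lam, eps), period_elliptic(lam, eps), period_leading(lam, eps))
```

The first two numbers should agree to quadrature accuracy, while the third approaches them only as $\epsilon \to 0$.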
A direct consequence of (9.18) is that $\phi$ can be written in the form
$$\phi(n) = \frac{n-\nu}{L} + \psi(n-\nu) \tag{9.20}$$
where the first term denotes the average change in $\phi$, and the second is a periodic function of n,
$$\psi(n+L) = \psi(n)$$
(Fig. 9.1). The type of solution described by (9.20) is known as the soliton lattice. It corresponds to a regular sequence of domains of L sites commensurate with the substrate, interrupted by local discommensurations.
[Plot: $\phi$ and $\psi$ versus n for $0 \le n \le 200$, with $\lambda = 6$ and $\epsilon = e^{-2}$.]
Figure 9.1: The soliton lattice. The continuous curve represents the monotonic function $\phi$, which is a sum of a straight line with slope 1/L and a periodic function with periodicity L (cf. Eq. (9.20)).
Energy of the soliton lattice
The total energy of the soliton lattice, always measured relative to the commensurate reference state, consists, according to (9.11), of three terms. All contributions are of order N. I can use the fact that $\phi_1 = 0$ and $\phi_2 \approx N/L + O(1)$ (cf. (9.20)) to write the energy in the form
$$\Delta\hat\Phi = \frac{N}{L}\int_0^1 d\phi\, g(\phi) - N\frac{\epsilon}{2\pi^2\lambda^2} - \frac{N}{L}\,\delta . \tag{9.21}$$
The soliton lattice energy (9.21), regarded as a function of the (still undetermined) constant $\epsilon$, has an extremum at a value determined by
$$\frac{\partial\hat\Phi}{\partial\epsilon} = \frac{N}{L^2}\frac{\partial L}{\partial\epsilon}\left[\delta - \int_0^1 d\phi\, g(\phi)\right] = 0 .$$
Since the first factor is nonzero at all $\epsilon > 0$ (and in fact diverges as $\epsilon \to 0$), the above condition can only be satisfied if
$$\int_0^1 d\phi\, g(\phi) = \frac{1}{\pi\lambda}\int_0^1 d\phi\,\sqrt{\sin^2\pi\phi + \epsilon} = \delta . \tag{9.22}$$
The above condition can be used to determine the value $\epsilon = \bar\epsilon(\delta)$ which, for a given mismatch, gives an extremum of the soliton lattice energy. In order to determine the nature of the extremum we must look at the second derivative
$$\left.\frac{\partial^2\hat\Phi}{\partial\epsilon^2}\right|_{\epsilon=\bar\epsilon(\delta)} = -\frac{1}{2(\pi\lambda)^2}\,\frac{N}{L}\left.\frac{\partial L}{\partial\epsilon}\right|_{\epsilon=\bar\epsilon(\delta)} .$$
In view of (9.19), $\partial L/\partial\epsilon$ is negative, so the second derivative is positive. The local minimum thus determined will always have a lower energy than the reference state, by an amount
$$\Delta\hat\Phi_{\bar\epsilon} = -N\frac{\bar\epsilon}{2\pi^2\lambda^2} \tag{9.23}$$
(noting that the first and the third terms in (9.21) cancel out).
It should however be noted that (9.22) has a solution only for $\delta > \delta_c$, where the critical value of $\delta$ is the same as that derived in the context of the energetic stability of the single kink. In other words, at mismatches $\delta < \delta_c$, the commensurate state will still be favored. Once this critical value is exceeded, however, not only does the spontaneous creation of a single kink become energetically possible, but the whole structure of a soliton lattice (a new, incommensurate phase) acquires a macroscopic energetic advantage and is formed spontaneously.
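Relationship between $\bar\epsilon$ and $\delta$
In order to derive the explicit relationship between $\bar\epsilon$ and $\delta$, I must go back to (9.22). For notational simplicity let me from now on drop the bar from $\epsilon$. I first note that
$$\frac{d\delta}{d\epsilon} = \frac{1}{2\pi^2\lambda^2}\,L(\epsilon) ;$$
using the general expression
$$\delta - \delta_c = \int_0^{\epsilon} d\epsilon'\,\frac{d\delta}{d\epsilon'}$$
and the leading-order result (9.19), I obtain
$$\delta - \delta_c = -\frac{\epsilon}{2\pi^2\lambda}\ln\frac{\epsilon}{Ae}$$
or, in reduced dimensionless form,
$$\frac{\delta-\delta_c}{\delta_c} = -\frac{\epsilon}{4}\ln\frac{\epsilon}{Ae} . \tag{9.24}$$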
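The relation between mismatch and $\epsilon$ can also be obtained without the leading-order approximation, by solving (9.22) numerically. The following sketch is illustrative: it uses a midpoint rule for the integral and plain bisection for the root, and compares the result with (9.24) under the assumption A = 16 (inferred from the prefactor of (9.25), not stated explicitly in the text). All function names are hypothetical.

```python
import math

def delta_of_eps(eps, lam, steps=4000):
    """Left-hand side of (9.22) as a function of epsilon."""
    h = 1.0 / steps
    s = sum(math.sqrt(math.sin(math.pi * (i + 0.5) * h) ** 2 + eps)
            for i in range(steps))
    return s * h / (math.pi * lam)

def eps_bar(delta, lam, hi=10.0, tol=1e-10):
    """Solve (9.22) for epsilon by bisection.

    Assumes delta_c < delta < delta_of_eps(hi, lam); delta_of_eps is
    monotonically increasing in epsilon, so bisection is safe.
    """
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if delta_of_eps(mid, lam) < delta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def reduced_mismatch_leading(eps, A=16.0):
    """Right-hand side of (9.24); A = 16 is an assumption."""
    return -(eps / 4.0) * math.log(eps / (A * math.e))

lam = 6.0
delta_c = 2.0 / (math.pi ** 2 * lam)
eps = 1e-3
delta = delta_of_eps(eps, lam)
print("recovered epsilon:", eps_bar(delta, lam))
print("reduced mismatch :", (delta - delta_c) / delta_c, reduced_mismatch_leading(eps))
```

For mismatches below $\delta_c$ the bisection has no root to find, which mirrors the statement that the commensurate state is then favored.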
Discommensurations repel each other
Using this relationship, it is possible to express the energy of the incommensurate phase per discommensuration as
$$\begin{aligned}\frac{\Delta\hat\Phi}{N/L} &= \frac{\epsilon}{2\pi^2\lambda}\ln\frac{\epsilon}{A} \\ &= -(\delta-\delta_c) + \frac{\delta_c\,\epsilon}{4} \\ &= -(\delta-\delta_c) + \frac{8}{\pi^2\lambda}\,e^{-L/\lambda} .\end{aligned} \tag{9.25}$$
The first term in the last line is exactly the energy (9.13) of an isolated kink. The second term, which is always positive, expresses the repulsive energy of interaction between neighboring discommensurations.
The mean interatomic spacing
The mean interatomic spacing, defined by
$$\bar a = \lim_{n\to\infty}\frac{x_n - x_0}{n} ,$$
can now be calculated for the incommensurate phase. It is equal to
$$\bar a = b\left(1 + \frac{1}{L}\right) ,$$
which corresponds to a winding number
$$r = \frac{\bar a}{b} \sim 1 - \frac{1}{\lambda\ln\dfrac{\delta-\delta_c}{\delta_c}} \tag{9.26}$$
to leading order. Note that, as the mismatch approaches the critical value from above, the mean spacing approaches b continuously. A singularity at critical mismatch appears in the second-order derivative of the energy with respect to the total length $N\bar a$ (checking this is left as an exercise). The commensurate-incommensurate transition is therefore a "second order" phase transition in the language of statistical mechanics.
Free vs. fixed-end boundary conditions
I have up to now considered free-end boundary conditions. In other words: given the material parameters (coupling constant $\lambda$ and mismatch $\delta$) we look for the energy minimum, which in turn determines the winding number (9.26). It is of course possible to consider fixed-end boundary conditions, in which the positions of the end atoms are held fixed. More precisely, the relevant quantity for a system of N atoms is the difference $\phi_N - \phi_0$. Holding it constant corresponds to fixing the winding number, i.e. the density of discommensurations $1/L = (\phi_N - \phi_0)/N$. The parameter $\epsilon$ is then determined by (9.19). The energy can be directly computed from (9.21) and the end result has exactly the form of the last line in (9.25). The interpretation is also the same: the soliton lattice has an energy which consists of contributions of the individual discommensurations and of an interaction part, arising from the mutual repulsion of neighboring discommensurations. Note however that since we are in effect fixing the soliton lattice, the expression for the energy is valid for any value of the misfit parameter. If $\delta < \delta_c$ this means that extra positive energy must be supplied in order to maintain the fixed-end boundary conditions.
Phasons
The soliton lattice has a further important property which I have not discussed up to now. Its energy, (9.25), is independent of the integration constant $\nu$, defined in (9.10). This means that the whole soliton lattice configuration can be translated by an arbitrary amount without an energy cost. As has been discussed in Section 8.2.4 in the context of single kinks, this translational invariance implies the existence of a zero-frequency (Goldstone) mode in the spectrum of linearized excitations around the exact soliton lattice configuration. Let me examine this in some detail:
Up to now we have only looked at equilibrium properties which are determined by the minima of the total potential energy (9.3). The dynamics of the FK model is governed by the equations of motion
$$\ddot\phi_n = \phi_{n+1} + \phi_{n-1} - 2\phi_n - \frac{1}{2\pi\lambda^2}\sin 2\pi\phi_n , \tag{9.27}$$
or, in the continuum approximation,
$$\frac{\partial^2\phi}{\partial t^2} - \frac{\partial^2\phi}{\partial n^2} = -\frac{1}{2\pi\lambda^2}\sin 2\pi\phi , \tag{9.28}$$
where the time is measured in units of $(m/C)^{1/2}$.
Linearization of (9.28) around the static soliton lattice configuration $\phi_s(n-\nu) = (n-\nu)/L + \psi(n-\nu)$,
$$\phi(n,t) = \phi_s(n-\nu) + \sum_q e^{-i\omega_q t} f_q(n) , \tag{9.29}$$
leads to a Schrödinger-like equation for the $f_q$'s,
$$-\frac{d^2 f_q}{dn^2} + \frac{1}{\lambda^2}\cos(2\pi\phi_s)\, f_q = \omega_q^2 f_q . \tag{9.30}$$
The effective potential of (9.30) has the periodicity L of the soliton lattice. Consequently, the Bloch/Floquet theorem applies to the eigenfunctions:
$$f_q(n) = e^{iqn}F_q(n) \quad\text{where}\quad F_q(n) = F_q(n+L) . \tag{9.31}$$
Now the eigenfunction corresponding to the Goldstone mode is
$$\frac{d\phi_s}{dn} = \frac{1}{L} + \psi'(n-\nu)$$
and is therefore periodic in n with period L. By comparison with (9.31) we conclude that it must correspond to q = 0 and that $F_0(n) = d\phi_s/dn$.
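That $d\phi_s/dn$ is indeed a zero-frequency solution of (9.30) follows in one step from the static limit of (9.28). The short derivation below is a sketch added for completeness; it uses only (9.28) and (9.30).

```latex
% Static soliton lattice: time-independent limit of (9.28)
\frac{d^2\phi_s}{dn^2} = \frac{1}{2\pi\lambda^2}\,\sin 2\pi\phi_s
% Differentiate once more with respect to n (chain rule on the right-hand side):
\frac{d^2}{dn^2}\!\left(\frac{d\phi_s}{dn}\right)
   = \frac{1}{\lambda^2}\,\cos(2\pi\phi_s)\,\frac{d\phi_s}{dn}
% Hence f = d\phi_s/dn satisfies (9.30) with \omega^2 = 0:
-\frac{d^2 f}{dn^2} + \frac{1}{\lambda^2}\cos(2\pi\phi_s)\, f = 0 ,
\qquad f \equiv \frac{d\phi_s}{dn} .
```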
Note the contrast with the situation encountered in the context of localized kinks; in that case the zero-frequency mode was a discrete state in the spectrum. Now it is part of a band. In fact, the spectrum $\omega_q$ consists of (i) a low-q region, starting at zero and reaching out to $\pi/L$, with a linear dependence of $\omega_q$ on q, and (ii) a high-q region, i.e. at wavelengths shorter than the distance L between discommensurations, which is dominated by the short-range properties of the soliton lattice, which are effectively those of the commensurate phase. Region (i) gives rise to the so-called phason branch of excitations, region (ii) to the optical phonon branch of the commensurate phase. The two branches are separated by a frequency gap.
9.2 Breaking of analyticity
The treatment of the FK model up to now has been based on the continuum approximation, which will break down at values of $\lambda \le 1$. It is then necessary to treat the equilibria of the potential energy in terms of the second-order recurrence equation (9.5), i.e. to go back to the standard map of Section 7.3.5. When applying results from that section, it should be noted that the correspondence is $q_n \to \phi_n - 1/2$ and $K \to \lambda^{-2}$. The next section will mostly treat the case of fixed end points. The implications for the physically relevant case of free boundary conditions will be treated somewhat heuristically.
9.2.1 FK ground state as minimizing periodic orbit of the standard map
Rational winding numbers
If the ends are held so that the average interatomic distance is a rational multiple of the substrate periodicity (in the language of the standard map this corresponds to a rational winding number), i.e.
$$\frac{\phi_N - \phi_0}{N} = w = \frac{r}{s} ,$$
where r, s are integers, the energy minimum (ground state) will be an (r, s) orbit of the standard map, i.e. an s-cycle such that $\phi_{s+1} - \phi_1 = r$. This corresponds to a commensurate state and holds for any value of the nonlinearity.
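A commensurate configuration with a given rational winding number can be found numerically by relaxing the chain. The sketch below is an illustration under stated assumptions, not the method used in the text: it performs plain gradient descent using the force on the right-hand side of (9.27), imposes $\phi_{n+s} = \phi_n + r$, and starts from an evenly spaced chain with an arbitrary offset. Gradient descent only guarantees a local minimum, so the result should be regarded as a candidate ground state. All names and parameter values are hypothetical.

```python
import math

def force(phi, n, r, s, lam):
    """Right-hand side of (9.27) at site n, with phi_{n+s} = phi_n + r."""
    right = phi[n + 1] if n + 1 < s else phi[0] + r
    left = phi[n - 1] if n > 0 else phi[s - 1] - r
    pinning = math.sin(2.0 * math.pi * phi[n]) / (2.0 * math.pi * lam ** 2)
    return right + left - 2.0 * phi[n] - pinning

def relax(r, s, lam, steps=20000, eta=0.2, offset=0.123):
    """Gradient descent at fixed winding number r/s (requires s >= 2).

    eta must stay below 2 / (4 + 1/lam^2) for stability; the offset
    breaks the reflection symmetry of the starting configuration.
    """
    phi = [offset + n * r / s for n in range(s)]
    for _ in range(steps):
        f = [force(phi, n, r, s, lam) for n in range(s)]
        phi = [p + eta * fn for p, fn in zip(phi, f)]
    return phi

def max_residual(phi, r, s, lam):
    """Largest remaining force; zero for an exact equilibrium of (9.27)."""
    return max(abs(force(phi, n, r, s, lam)) for n in range(s))

phi = relax(r=2, s=5, lam=1.0)
print(phi, max_residual(phi, 2, 5, 1.0))
```

A vanishing residual means the configuration satisfies the static recurrence, i.e. it is a periodic orbit of the standard map with the prescribed winding number.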
Irrational winding numbers
The nontrivial part is of course what happens for irrational values of the winding number w. In this case, Aubry [23] has proved the following fundamental results:
• for $K < K_c$ (corresponding to $\lambda > \lambda_c = K_c^{-0.5} = 1.014491$), the ground state is quasiperiodic, of the form
$$\phi_n = t_n - \alpha + u(t_n - \alpha) \tag{9.32}$$
where $t_n = wn$ and the hull function is periodic with period 1, $u(t) = u(t+1)$, and analytic in t. The ground state of the FK chain thus corresponds to a torus trajectory of the standard map. This result effectively generalizes (9.20). As K approaches $K_c$, and successive KAM tori break, the hull function becomes more and more bumpy (cf. Fig. 7.8). Aubry describes this behavior as a phase transition due to breaking of analyticity. Indeed,
• for $K > K_c$ (corresponding to $\lambda < \lambda_c = K_c^{-0.5} = 1.014491$), the ground state is still quasiperiodic, given uniquely by (9.32). However, it now corresponds to a cantorus of the standard map (cf. Fig. 7.10). Accordingly, the corresponding hull function develops discontinuities (Fig. 9.2, left and center panels).
Figure 9.2: Left panel: the hull function u(t), as represented by an 89-cycle rational approximation of a cantorus for $K = K_c + 0.3$. Center panel: the function u(t) + t allows, within this finite approximation, a better view of the gaps. Right panel: the phonon spectra $\omega_j$ corresponding to this ground state. Note that the minimal frequency is nonzero (inset).
9.2.2 Small amplitude motion
Small amplitude motion around the ground state is described by linearizing the equations of motion (9.27) around the ground state, i.e.
$$\phi_n(t) = \phi_n^{GS} + \sum_j f_n^{(j)} e^{-i\omega_j t}$$
which leads to the eigenvalue equation
$$\sum_n h_{mn} f_n^{(j)} = \omega_j^2 f_m^{(j)} \tag{9.33}$$
where
$$h_{mn} = \frac{\partial^2\hat\Phi}{\partial\phi_n\,\partial\phi_m} \tag{9.34}$$
and the second derivatives are evaluated at the ground state positions defined by (9.32). The eigenvalues of the Hessian matrix correspond to the squares of the eigenfrequencies of local oscillations (phonon spectra).
We have seen in section 7.3.1 and in the beginning of the present chapter that all trajectories of the standard map correspond to local extrema of the potential energy function $\hat\Phi$. Local extrema can be classified according to the properties of the eigenvalues of (9.33). If at least one eigenvalue is negative, the extremum is unstable (maximum in at least one direction). If all eigenvalues are positive it is a local minimum, i.e. in general a metastable configuration. A zero eigenvalue (Goldstone mode) indicates an invariance of the energy with respect to a particular motion. We saw an example of this in the case of the soliton lattice, which was a stable configuration (a ground state of the FK chain) in the continuum approximation. In fact the spectra of torus-like ground states obtained below the threshold of analyticity breaking always include such a zero-frequency mode, indicating that the total energy is invariant with respect to a change of phase; the whole arrangement can slide freely. Above that threshold, the arrangement of adatoms becomes "pinned". Pinning is reflected in the phonon spectrum, which develops a gap near zero frequency (cf. Fig. 9.2, right panel). Peyrard and Aubry [24] performed extensive numerical studies of the transition by breaking of analyticity and demonstrated that the gap frequency vanishes as the threshold is approached. More specifically, they determined that the minimal frequency scales as a power of $K - K_c$,
$$\omega_{\min} \propto (K - K_c)^{\chi}$$
where $1 < \chi < 1.03$.
9.2.3 Free end boundary conditions
The picture which emerges for the physically relevant case of free-end boundary conditions is the following: The average interparticle distance $\bar a$ as a function of the misfit parameter is characterized by locked-in, flat regions, at rational winding numbers, which correspond to commensurate phases. As long as the nonlinearity is below threshold, these flat regions are interrupted by intervals of continuous variation of $\bar a$ with $\delta$, which correspond to irrational winding numbers (incommensurate phases); the function $\bar a(\delta)$ forms an incomplete devil's staircase. At the threshold of analyticity breakup, there are steps at every rational value of the ratio $\bar a/b$, separated by discontinuities. This is the complete devil's staircase (cf. the similar analysis of phase locking in the case of the circle map in section 7.3.8). Numerical results [25] are summarized in Fig. 9.3.
Figure 9.3: Left panel: Phase diagram (winding number vs. misfit parameter) for the FK model. The numbers represent values of the locked-in winding number. Unlabeled regions contain additional structure. Right panel: winding number as a function of misfit parameter for the FK model at K = 1, showing a devil's staircase structure (from [25]).
9.3 Metastable states: spatial chaos as a model of glassy structure
Chaotic trajectories of the standard map, which proliferate beyond the threshold of analyticity breakup, have a special significance in the context of the FK model. A lot of them correspond to unstable extrema. On the other hand, a great number of them (of order $e^N$) represent local minima, i.e. metastable states. The number and the energy distribution of these metastable states have been recently computed [26]. They were found to bundle in energy bands. The left panel of Fig. 9.4 shows the bands and their populations for K = 2 and K = 5 and a golden mean winding number. The energies lie very close to the ground state. For example, in the case of K = 2, the lowest energy band has an energy per site of $10^{-13}$ (in dimensionless units). The right panel shows the relationship of energy bands to the ground state. Since the ground state is a cantorus, it is characterized by a nonzero Lyapunov exponent, which is itself proportional to the minimal frequency of small oscillations found in the previous section. Small perturbations in the displacements of the boundary sites, while still compatible with the fixed winding number, generate an exponentially large number of extra trajectories branching out of the cantorus with energies exponentially close to that of the cantorus.
Figure 9.4: Left panel: The number of metastable equilibrium states vs. their energy per site (measured from the ground state energy). Bands are shown for K = 5 (6 upper segments) and K = 2 (5 lower segments); horizontal dashed lines show the border between energy bands. Right panel: Band energy spectrum of the same equilibria vs. K (upper scale) and the phonon gap (lower scale); bands are marked by filled areas corresponding to a given K value (from [26]).
The type of energy landscape described above is characteristic of disordered condensed matter systems, such as glasses or globular proteins. In this sense, the FK model provides considerable insight regarding the interplay of nonlinearity and disorder.
10 Solitons in magnetic chains
10.1 Introduction
Anisotropic exchange interactions between localized magnetic moments cause many magnetic materials to assume an effectively one-dimensional character. Within a certain range of temperatures, interactions along chains of magnetic atoms can be far more significant than interactions across chains. It is then possible to describe material properties (statics and dynamics) in terms of a one-dimensional Hamiltonian
$$H = -J\sum_n \vec S_n\cdot\vec S_{n+1} + A\sum_n (S_n^z)^2 - \mu\vec B\cdot\sum_n \vec S_n . \tag{10.1}$$
Here, $\vec S_n$ is the spin which resides at the nth magnetic site, J the exchange interaction (positive for a ferromagnet, negative for an antiferromagnet), A the anisotropy ("easy-plane" if A > 0 and "easy-axis" if A < 0), and $\vec B$ the external magnetic field. The magnetic moment of the nth spin is, following the standard notation, equal to $\mu\vec S_n$.
A typical example of a magnetic chain capable of supporting nonlinear excitations is CsNiF$_3$ in the paramagnetic regime $T > T_{\rm Neel} = 2.65$ K, with parameters S = 1, $J/k_B = 23.6$ K, $A/k_B = 4.5$ K, $\mu = g\mu_B$ with a gyromagnetic ratio g = 2.28 ($\mu_B$ is the Bohr magneton) [27].
For some applications it is possible to neglect the explicit quantum nature of the spin operators. In this case, $\vec S_n$ will be treated as a classical spin vector of length S (rather than the technically correct $[S(S+1)]^{1/2}$).
10.2 Classical spin dynamics
10.2.1 Spin Poisson brackets
Spin is an intrinsically quantum phenomenon. The way to deal with it at a classical level is by associating an appropriate Poisson bracket algebra directly with spin angular momentum vectors $\{\vec I_n\}$,
$$\{f, g\} = \epsilon_{\alpha\beta\gamma}\,\frac{\partial f}{\partial I_n^\alpha}\frac{\partial g}{\partial I_n^\beta}\, I_n^\gamma \tag{10.2}$$
where $\epsilon_{\alpha\beta\gamma}$ is the antisymmetric Levi-Civita tensor (= 1 if $\alpha\beta\gamma$ is a cyclic permutation, −1 if anticyclic, and zero otherwise) and the Einstein summation convention over repeated symbols is implied. A special case of (10.2) is
$$\{I_m^\alpha, I_n^\beta\} = \epsilon_{\alpha\beta\gamma}\,\delta_{mn}\, I_n^\gamma \tag{10.3}$$
where $\delta_{mn} = 1$ if m = n and 0 otherwise.
The Poisson bracket algebra defined above can be used to generate Hamiltonian dynamics
$$\dot I_n^\alpha = \{I_n^\alpha, H\} = -\epsilon_{\alpha\beta\gamma}\, I_n^\beta\,\frac{\partial H}{\partial I_n^\gamma} , \tag{10.4}$$
or, in vector form,
$$\dot{\vec I}_n = -\vec I_n\times\frac{\partial H}{\partial\vec I_n} .$$
I will mostly deal with dimensionless spin vectors defined as $\vec S_n = \vec I_n/\hbar$; for these, the equations of motion take the form
$$\dot{\vec S}_n = -\frac{1}{\hbar}\,\vec S_n\times\frac{\partial H}{\partial\vec S_n} . \tag{10.5}$$
Note that the Poisson brackets (10.3) can be obtained from the standard spin commutation relations by using the correspondence principle. Note further that, independently of the details of the spin Hamiltonian, according to (10.5), the norms of all vectors $\{|\vec S_n|\}$ remain constant in time.
Introducing the one-dimensional Hamiltonian (10.1) into (10.5) results in
$$\dot{\vec S}_n = \frac{J}{\hbar}\,\vec S_n\times(\vec S_{n+1} + \vec S_{n-1}) - 2\frac{A}{\hbar}\,(\vec S_n\cdot\hat z)\,\vec S_n\times\hat z - \gamma\,\vec B\times\vec S_n \tag{10.6}$$
where $\mu = \hbar\gamma$ and $\hat z$ is a unit vector in the z-direction.
We will deal with various special cases of (10.6), which governs the nonlinear dynamics of a broad class of one-dimensional spin chains.
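The conservation of the spin norms, and of the energy (10.1), can be observed directly by integrating (10.6) numerically. The following is a minimal sketch under stated assumptions: units with $\hbar = 1$ (so that $\mu B = \gamma B$), a short chain with periodic boundary conditions, a fourth-order Runge-Kutta step, and arbitrary illustrative parameter values. None of these choices come from the text.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rhs(spins, J, A, gB):
    """Right-hand side of (10.6) with hbar = 1; gB is the vector gamma * B."""
    N = len(spins)
    out = []
    for n, s in enumerate(spins):
        nb = tuple(spins[(n + 1) % N][i] + spins[(n - 1) % N][i] for i in range(3))
        ex = cross(s, nb)                     # S_n x (S_{n+1} + S_{n-1})
        an = cross(s, (0.0, 0.0, 1.0))        # S_n x z
        ze = cross(gB, s)                     # gamma B x S_n
        out.append(tuple(J * ex[i] - 2.0 * A * s[2] * an[i] - ze[i] for i in range(3)))
    return out

def rk4_step(spins, dt, J, A, gB):
    def shifted(k, c):
        return [tuple(s[i] + c * ki[i] for i in range(3)) for s, ki in zip(spins, k)]
    k1 = rhs(spins, J, A, gB)
    k2 = rhs(shifted(k1, dt / 2), J, A, gB)
    k3 = rhs(shifted(k2, dt / 2), J, A, gB)
    k4 = rhs(shifted(k3, dt), J, A, gB)
    return [tuple(s[i] + dt / 6 * (a[i] + 2 * b[i] + 2 * c[i] + d[i]) for i in range(3))
            for s, a, b, c, d in zip(spins, k1, k2, k3, k4)]

def energy(spins, J, A, gB):
    """Hamiltonian (10.1) with mu B = gB (hbar = 1), periodic chain."""
    N = len(spins)
    e = 0.0
    for n, s in enumerate(spins):
        t = spins[(n + 1) % N]
        e += -J * sum(s[i] * t[i] for i in range(3)) + A * s[2] ** 2
        e -= sum(gB[i] * s[i] for i in range(3))
    return e

def initial_state(N):
    """Unit-length spins with a smooth twist (illustrative initial condition)."""
    state = []
    for n in range(N):
        th = 0.3 + 0.2 * math.sin(2 * math.pi * n / N)
        ph = 2 * math.pi * n / N
        state.append((math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th)))
    return state

def evolve(spins, steps, dt, J, A, gB):
    for _ in range(steps):
        spins = rk4_step(spins, dt, J, A, gB)
    return spins

params = dict(J=1.0, A=0.2, gB=(0.1, 0.0, 0.0))
s0 = initial_state(8)
s1 = evolve(s0, 1000, 0.01, **params)
print("energy drift:", energy(s1, **params) - energy(s0, **params))
print("max norm drift:", max(abs(math.sqrt(sum(c * c for c in s)) - 1.0) for s in s1))
```

Both printed numbers should be small and shrink further with the time step; they are not exactly zero because Runge-Kutta integration does not preserve the invariants identically.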
10.2.2 An alternative representation
The polar form of the spin angular momentum vector $\vec I_n$ of length I,
$$I_n^x = I\sin\theta_n\cos\phi_n ,\qquad I_n^y = I\sin\theta_n\sin\phi_n ,\qquad I_n^z = I\cos\theta_n \tag{10.7}$$
can be used to provide an alternative representation of spin dynamics. The transformation
$$P_n = I_n^z = I\cos\theta_n ,\qquad q_n = \arctan\left(\frac{I_n^y}{I_n^x}\right) = \phi_n \tag{10.8}$$
can be shown to be canonical, i.e. it preserves Poisson brackets:
$$\epsilon_{\alpha\beta\gamma}\,\frac{\partial f}{\partial I_n^\alpha}\frac{\partial g}{\partial I_n^\beta}\, I_n^\gamma = \sum_n\left(\frac{\partial f}{\partial q_n}\frac{\partial g}{\partial P_n} - \frac{\partial f}{\partial P_n}\frac{\partial g}{\partial q_n}\right) \tag{10.9}$$
holds for any pair of functions f, g. The P's and q's are canonically conjugate sets of coordinates and momenta, in the sense that
$$\{q_m, P_n\} = \delta_{mn} \tag{10.10}$$
(with either of the two expressions for the Poisson brackets). The two representations are equivalent, and so are the resulting dynamics. In the polar representation the dynamics takes the usual symplectic form
$$\dot q_n = \{q_n, H\} = \frac{\partial H}{\partial P_n} ,\qquad \dot P_n = \{P_n, H\} = -\frac{\partial H}{\partial q_n} . \tag{10.11}$$
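The canonical relation (10.10) can be verified numerically from the spin bracket (10.2). The sketch below is illustrative: it treats a single spin, evaluates the partial derivatives by central differences, and uses the identity $\epsilon_{\alpha\beta\gamma}a_\alpha b_\beta I_\gamma = (\vec a\times\vec b)\cdot\vec I$. The function names and the test vector are arbitrary.

```python
import math

def grad(f, I, h=1e-6):
    """Central-difference gradient of f with respect to (I_x, I_y, I_z)."""
    g = []
    for i in range(3):
        up, dn = list(I), list(I)
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2.0 * h))
    return g

def spin_bracket(f, g, I):
    """Bracket (10.2) for one spin: (grad f x grad g) . I"""
    a, b = grad(f, I), grad(g, I)
    c = (a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0])
    return sum(ci * Ii for ci, Ii in zip(c, I))

def q(I):
    """Azimuthal angle of (10.8); atan2 picks the correct quadrant."""
    return math.atan2(I[1], I[0])

def P(I):
    """Momentum of (10.8): the z component."""
    return I[2]

I = [0.3, -0.5, 0.7]
print("{q, P}     =", spin_bracket(q, P, I))
print("{I_x, I_y} =", spin_bracket(lambda v: v[0], lambda v: v[1], I), "vs I_z =", I[2])
```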
Again, I will use the dimensionless polar variable
$$p_n = P_n/\hbar = S\cos\theta_n ;$$
the equations of motion then take the form
$$\hbar\,\dot q_n = \frac{\partial H}{\partial p_n} ,\qquad \hbar\,\dot p_n = -\frac{\partial H}{\partial q_n} . \tag{10.12}$$
10.3 Solitons in ferromagnetic chains
10.3.1 The continuum approximation
If the exchange constant J in (10.1) is positive, and in the absence of anisotropy and external fields, spins will tend to a parallel ordering. This is exactly true at zero temperature. It defines a ferromagnetic ground state $S_n^z = S$, $S_n^x = S_n^y = 0$ (footnote 1). At reasonably low temperatures, where thermal motion does not prevail, it is plausible to assume that spin orientations do not vary wildly from site to site. Thus, although the spin vector may be far from the "reference" state (0, 0, 1), it will still make sense to write down a continuum approximation. This approximates sites n by a continuous index variable, $n \to x$, and individual spins by a continuum field, $\vec S_n \to \vec S(x)$. Sums over n are approximated by integrals, with the rule
$$\sum_n \to \int\frac{dx}{a}$$
where a is the lattice constant. Spins at neighboring sites can be obtained by Taylor expansion
$$\vec S_{n\pm1} \to \vec S(x\pm a) \approx \vec S(x) \pm a\,\vec S'(x) + \frac12 a^2\,\vec S''(x) + \cdots$$
where the primes denote derivatives with respect to x. According to the above rules,
$$\vec S_n\cdot\vec S_{n+1} \to |\vec S(x)|^2 + a\,\vec S(x)\cdot\vec S'(x) + \frac12 a^2\,\vec S(x)\cdot\vec S''(x) .$$
The first term is the constant norm of the spin vector field. The second term is proportional to the derivative of the constant norm, and therefore vanishes. The third term can be integrated by parts over all space (the contribution from the boundary vanishes identically). The resulting continuum version of the Hamiltonian (10.1) is
$$H = \frac12 Ja\int dx\left(\frac{\partial\vec S}{\partial x}\right)^2 + \frac{A}{a}\int dx\, S_z^2 - \frac{\mu}{a}\,\vec B\cdot\int dx\,\vec S . \tag{10.13}$$
The spin equations of motion (10.6) reduce, in the continuum limit, to
$$\dot{\vec S} = \frac{Ja^2}{\hbar}\,\vec S\times\vec S'' - \frac{2A}{\hbar}\,(\vec S\cdot\hat z)\,(\vec S\times\hat z) - \gamma\,\vec B\times\vec S . \tag{10.14}$$
Comment: (10.14) may also be obtained by taking the continuum limit of (10.4). The correspondence is
$$\frac{\partial}{\partial\vec S_n} \to a\,\frac{\delta}{\delta\vec S(x)}$$
where the right-hand side corresponds to a functional derivative. (10.4) becomes
$$\dot{\vec S}(x) = -\frac{a}{\hbar}\,\vec S(x)\times\frac{\delta H}{\delta\vec S(x)}$$
which, if we insert the functional derivative
$$\frac{\delta H}{\delta\vec S(x)} = -Ja\,\vec S'' + \frac{2A}{a}\,(\vec S\cdot\hat z)\,\hat z - \frac{\mu}{a}\,\vec B ,$$
reproduces (10.14).
1 Note that the choice of the z direction is at this stage (with the full rotational invariance of the exchange interaction) entirely arbitrary.
In what follows, I will also need the continuum limit of the alternative (polar) spin representation,
$$p(x) = S^z(x) ,\qquad q(x) = \arctan\frac{S^y}{S^x} .$$
The Hamiltonian (10.13) can be transformed directly to the polar variables if we note that
$$\left(S^{x\prime}\right)^2 + \left(S^{y\prime}\right)^2 = \frac{(p\,p')^2}{S^2 - p^2} + (S^2 - p^2)\,q'^2 .$$
The result is
$$H = \frac12 Ja\int dx\left[\frac{S^2}{S^2 - p^2}\,p'^2 + (S^2 - p^2)\,q'^2\right] + \frac{A}{a}\int dx\, p^2 - \frac{\mu}{a}\,B\int dx\,\sqrt{S^2 - p^2}\,\cos q , \tag{10.15}$$
where I have chosen to take the x-axis of the spin vector parallel to the magnetic field.
10.3.2 The classical, isotropic, ferromagnetic chain
The isotropic ferromagnetic chain is described by the Hamiltonian (10.1) with A = 0. I will choose the z direction along the magnetic field. The classical spin dynamics in the continuum limit are described by (10.14). In polar canonical variables these transform to
$$\dot p = \frac{Ja^2}{\hbar}\left[(S^2 - p^2)\,q'\right]'$$
$$\dot q = -\frac{Ja^2}{\hbar}\left[\frac{S^2\,p''}{S^2 - p^2} + \frac{S^2\,p\,p'^2}{(S^2 - p^2)^2} + p\,q'^2\right] - \frac{\mu B}{\hbar} .$$
In the following I will use dimensionless units, i.e. measure lengths in units of the lattice constant and times in units of $\hbar/(JS)$. The magnetic field will be denoted by the dimensionless quantity $b = \mu B/(JS)$. Finally, I set $p \to Sp$. The equations of motion in dimensionless form are
$$\dot p = \left[(1 - p^2)\,q'\right]' ,\qquad \dot q = -\frac{p''}{1 - p^2} - \frac{p\,p'^2}{(1 - p^2)^2} - p\,q'^2 - b . \tag{10.16}$$
Soliton solutions
I look for bounded propagating solutions of the type
$$p = p(x - vt) ,\qquad q = \Omega t + \hat q(x - vt) \tag{10.17}$$
where the extra term allows for an overall precession around the z axis. In addition to boundedness, the solution should decay at infinity, and approach the ferromagnetic ground state $p \to 1$.
The equations of motion transform to the following system of coupled ODEs:
$$\begin{aligned} -v\,p' &= (1 - p^2)\,\hat q'' - 2p\,p'\,\hat q' = \left[(1 - p^2)\,\hat q'\right]' \\ \Omega - v\,\hat q' &= -\frac{p''}{1 - p^2} - \frac{p\,p'^2}{(1 - p^2)^2} - p\,(\hat q')^2 - b \end{aligned} \tag{10.18}$$
We note that the upper equation has a first integral, which we write as
$$\hat q' = -\frac{v\,(p - p_0)}{1 - p^2} \tag{10.19}$$
(where $p_0$ is a constant to be chosen later) and use to eliminate $\hat q$ from the lower equation. After rearranging some terms I obtain
$$\frac{p''}{1 - p^2} + \frac{p\,p'^2}{(1 - p^2)^2} = -v^2\,\frac{(p - p_0)(1 - p_0 p)}{(1 - p^2)^2} - \hat\Omega$$
where $\hat\Omega = \Omega + b$. Multiplying this by $2p'$ produces a complete derivative on the left-hand side:
$$\begin{aligned}\left(\frac{p'^2}{1 - p^2}\right)' &= -2v^2\,\frac{(p - p_0)(1 - p_0 p)\,p'}{(1 - p^2)^2} - 2\hat\Omega\,p' \\ &= -2v^2\left(\frac{p_0^2 - 2p_0 p + 1}{2(1 - p^2)}\right)' - 2\hat\Omega\,p'\end{aligned}$$
which can be integrated to give
$$\frac{p'^2}{1 - p^2} = -v^2\,\frac{p_0^2 - 2p_0 p + 1}{1 - p^2} - 2\hat\Omega\,p + p_1$$
where $p_1$ is a new integration constant.
Now the requirement of boundedness, applied to the derivative $\hat q'$ as $p \to 1$, demands (cf. (10.19)) that $p_0 = 1$. As a consequence,
$$(p')^2 = (1 - p)\left[(1 + p)(p_1 - 2\hat\Omega\,p) - v^2(1 - p)\right] . \tag{10.20}$$
An analytically favorable choice of the integration constant $p_1$ can be made by demanding that the brackets vanish at p = 1. This means taking
$$p_1 = 2\hat\Omega$$
and results in
$$\left(\frac{dp}{dx}\right)^2 = (1 - p)^2\left[2\hat\Omega(1 + p) - v^2\right] . \tag{10.21}$$
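Note that, in order for the right-hand side to be positive at least for some values of p, the conditions $\hat\Omega > 0$ and $v^2 < 4\hat\Omega$ must hold. I therefore set
$$v = 2\hat\Omega^{1/2}\cos\frac{\alpha}{2} \tag{10.22}$$
and obtain
$$(2\hat\Omega)^{1/2}\,\frac{dx}{dp} = \pm\frac{1}{(1 - p)(p - \cos\alpha)^{1/2}}$$
which can be formally integrated as
$$(2\hat\Omega)^{1/2}(x - x_0) = \int_{\cos\alpha}^{p} d\bar p\,\frac{1}{(1 - \bar p)(\bar p - \cos\alpha)^{1/2}} = \frac{2^{1/2}}{\sin\frac{\alpha}{2}}\tanh^{-1}\left[\frac{(p - \cos\alpha)^{1/2}}{2^{1/2}\sin\frac{\alpha}{2}}\right] ,$$
or, after some rearrangement,
$$p(x) = 1 - 2\sin^2\frac{\alpha}{2}\,\mathrm{sech}^2\left[\hat\Omega^{1/2}\sin\frac{\alpha}{2}\,(x - vt - x_0)\right] . \tag{10.23}$$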
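Inserting (10.23) in (10.19) gives
$$\hat q' = \frac{v/2}{1 - \sin^2\frac{\alpha}{2}\,\mathrm{sech}^2\left[\hat\Omega^{1/2}\sin\frac{\alpha}{2}\,(x - x_0)\right]}$$
which can be integrated to give
$$q = q_0 + \Omega t + \frac{v}{2}(x - vt - x_0) + \tan^{-1}\left\{\tan\frac{\alpha}{2}\,\tanh\left[\hat\Omega^{1/2}\sin\frac{\alpha}{2}\,(x - vt - x_0)\right]\right\} \tag{10.24}$$
where I have reverted to the original dynamical variable q.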
Eqs. (10.23) and (10.24) describe a soliton with an internal degree of freedom: in addition to its overall translational motion with velocity v, the soliton is characterized by a nonuniform internal precession of each spin with respect to the z-axis (Fig. ). The soliton solution contains two independent parameters, the internal precession frequency and $\alpha$, which completely determine the soliton dynamics. In particular, the translational velocity is given by (10.22), the soliton spatial extent is (in units of the lattice constant a)
$$\Gamma = \frac{1}{\hat\Omega^{1/2}\sin\frac{\alpha}{2}}$$
and the amplitude, defined as the maximum deviation of p from the ferromagnetically ordered state p = 1, is
$$A = 2\sin^2\frac{\alpha}{2} .$$
In addition, the soliton solution contains the arbitrary constants $x_0$, $q_0$ which specify, respectively, the initial position and internal phase.
Soliton magnetization
The total magnetization carried by the soliton, measured in units of $\hbar$ with respect to the ferromagnetically ordered state, is
$$M = \sum_n (S_n^z - S) = S\int_{-\infty}^{\infty} dx\,(p - 1) = -4S\,\frac{\sin\frac{\alpha}{2}}{\hat\Omega^{1/2}} . \tag{10.25}$$
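Both the profile (10.23) and the magnetization (10.25) lend themselves to a quick numerical check. The sketch below is illustrative only: it works in the dimensionless units of (10.16), writes the profile as a function of $\xi = x - vt - x_0$, tests it against the first-order equation (10.21) by central differences, and integrates $p - 1$ with a midpoint rule. The parameter values are arbitrary.

```python
import math

def soliton_p(xi, omega_hat, alpha):
    """Profile (10.23) as a function of xi = x - v t - x0."""
    s = math.sin(alpha / 2.0)
    return 1.0 - 2.0 * s * s / math.cosh(math.sqrt(omega_hat) * s * xi) ** 2

def residual_10_21(xi, omega_hat, alpha, h=1e-5):
    """(dp/dxi)^2 minus the right-hand side of (10.21); should vanish."""
    v = 2.0 * math.sqrt(omega_hat) * math.cos(alpha / 2.0)      # (10.22)
    p = soliton_p(xi, omega_hat, alpha)
    dp = (soliton_p(xi + h, omega_hat, alpha)
          - soliton_p(xi - h, omega_hat, alpha)) / (2.0 * h)
    return dp * dp - (1.0 - p) ** 2 * (2.0 * omega_hat * (1.0 + p) - v * v)

def magnetization_over_S(omega_hat, alpha, half_width=60.0, steps=120000):
    """M / S: integral of (p - 1) over the chain, midpoint rule."""
    h = 2.0 * half_width / steps
    return h * sum(soliton_p(-half_width + (i + 0.5) * h, omega_hat, alpha) - 1.0
                   for i in range(steps))

omega_hat, alpha = 0.5, 1.2                                      # illustrative values
print([residual_10_21(x, omega_hat, alpha) for x in (-3.0, 0.0, 0.7, 5.0)])
print(magnetization_over_S(omega_hat, alpha),
      -4.0 * math.sin(alpha / 2.0) / math.sqrt(omega_hat))
```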
In what follows, it will prove useful to express the soliton's translational velocity in terms of M, i.e.
$$v = \frac{4S}{|M|}\sin\alpha = \frac{4JS^2a}{\hbar|M|}\sin\alpha , \tag{10.26}$$
where in the second expression I have reintroduced the physical units.
Soliton energy
I will restrict myself to the case of vanishing external field B = 0. In this case $\hat\Omega = \Omega$. The energy density is, from (10.15) and (10.17),
$$\frac12 JS^2\left[\frac{p'^2}{1 - p^2} + (1 - p^2)\,\hat q'^2\right] ,$$
or, using (10.21) and (10.19),
$$JS^2\,\Omega\,(1 - p)$$
which integrates to
$$E = 4JS^2\,\Omega^{1/2}\sin\frac{\alpha}{2} = 16JS^3\,\frac{1}{|M|}\sin^2\frac{\alpha}{2} , \tag{10.27}$$
where in the second expression I have eliminated the precession frequency in favor of the magnetization.
It turns out that the set of dynamical variables M and
$$P = 2\hbar S\alpha/a \tag{10.28}$$
are a better choice for describing the dynamics of the soliton. This becomes clear by looking at the derivative
$$\left(\frac{\partial E}{\partial P}\right)_M = \frac{4JS^2a}{\hbar}\,\frac{\sin\alpha}{|M|} = v ,$$
which indicates that P can be interpreted as the canonical momentum conjugate to the position of the soliton.
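The identification of P as the soliton momentum can be checked by differentiating (10.27) numerically at fixed M and comparing with (10.26). The sketch below is a minimal illustration; the function names and the parameter values are arbitrary and carry no physical significance.

```python
import math

def energy(P, M, J=1.0, S=1.0, a=1.0, hbar=1.0):
    """E(P, M) from (10.27), with alpha = P a / (2 hbar S) from (10.28)."""
    alpha = P * a / (2.0 * hbar * S)
    return 16.0 * J * S ** 3 * math.sin(alpha / 2.0) ** 2 / abs(M)

def velocity(P, M, J=1.0, S=1.0, a=1.0, hbar=1.0):
    """Soliton velocity in physical units, second form of (10.26)."""
    alpha = P * a / (2.0 * hbar * S)
    return 4.0 * J * S ** 2 * a * math.sin(alpha) / (hbar * abs(M))

def dE_dP(P, M, h=1e-6, **kw):
    """Central-difference derivative of E with respect to P at fixed M."""
    return (energy(P + h, M, **kw) - energy(P - h, M, **kw)) / (2.0 * h)

kw = dict(J=2.0, S=1.5, a=0.7, hbar=1.1)     # arbitrary test values
print(dE_dP(1.3, -5.0, **kw), velocity(1.3, -5.0, **kw))
```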
Semiclassical quantization
Systems which are classically integrable may be quantized according to the Bohr-Sommerfeld scheme, which demands that the total action along a closed orbit must be a multiple of Planck's constant
$$J = \sum_j\oint p_j\,dq_j = nh \tag{10.29}$$
where $\{p_j, q_j\}$ is a set of canonically conjugate coordinates and momenta.
The canonically conjugate polar spin coordinates defined in subsection 10.2.2 may be used in the above quantization condition after an appropriate correction for their dimensions. Since polar spin coordinates are dimensionless, an extra factor $\hbar$ must be added to the left-hand side of the action (cf. the extra factor $\hbar$ which appears in the equations of motion (10.12)). Furthermore, since in this section we have made the substitution $p \to Sp$, the quantization condition over a motion which is periodic with period T reads
$$S\hbar\sum_j\int_0^T dt\; p_j\,\dot q_j = nh$$
or, going to the continuum limit (with the length measured in units of the lattice constant),
$$\int_{-N/2}^{N/2} dx\int_0^T dt\,(p - 1)\,\dot q = \frac{2\pi n}{S} . \tag{10.30}$$
Note that I have used p − 1 rather than p, since I am interested in the properties of a localized soliton excitation, which approaches $p \to 1$ as $x \to \pm\infty$.
There are two types of periodic motion associated with the soliton:
• The first is related to the translational motion. In other words, the soliton runs around the chain (which is in this case subjected to periodic boundary conditions) with a period T = N/v. This is best viewed in a coordinate system which rotates with angular velocity $\Omega$ around the z-axis. In this case
$$\dot q = -v\,\hat q' = -v^2\,\frac{1}{1 + p}$$
and the left-hand side of (10.30) becomes
$$\begin{aligned} -v^2\int_0^{N/v} dt\int_{-N/2}^{N/2} dx\,\frac{p - 1}{1 + p} &= v^2\int_0^{N/v} dt\;\Omega^{-1/2}\int_{-\infty}^{\infty} d\xi\,\frac{2\sin\frac{\alpha}{2}\,\mathrm{sech}^2\xi}{2 - 2\sin^2\frac{\alpha}{2}\,\mathrm{sech}^2\xi} \\ &= v^2\,\Omega^{-1/2}\sin\frac{\alpha}{2}\;\frac{N}{v}\int_{-\infty}^{\infty} d\rho\,\frac{1}{\cosh\rho + \cos\alpha} \\ &= N\,\Omega^{-1/2}\;2\Omega^{1/2}\cos\frac{\alpha}{2}\,\sin\frac{\alpha}{2}\;\frac{2\alpha}{\sin\alpha} \\ &= 2N\alpha .\end{aligned}$$
In terms of P, the canonical momentum of the soliton given by (10.28), the quantization condition reads
$$P \equiv \hbar K = 2\pi\,\frac{n}{Na}\,\hbar \tag{10.31}$$
which is the usual quantization condition for the momentum of a free particle subject to periodic boundary conditions.
• The second type of periodic motion is related to the precession around the symmetry axis. In this case I consider a coordinate system moving with the soliton translational velocity v. Then
$$\dot q = \Omega ,\qquad p = 1 - 2\sin^2\frac{\alpha}{2}\,\mathrm{sech}^2\left[\Omega^{1/2}\sin\frac{\alpha}{2}\,x\right]$$
and the left-hand side of (10.30) becomes
$$-\Omega\int_0^{2\pi/\Omega} dt\int_{-N/2}^{N/2} dx\;2\sin^2\frac{\alpha}{2}\,\mathrm{sech}^2\left[\Omega^{1/2}\sin\frac{\alpha}{2}\,x\right] = -\Omega\,\frac{2\pi}{\Omega}\;2\sin\frac{\alpha}{2}\,\Omega^{-1/2}\;2 = 2\pi\,\frac{M}{S} \tag{10.32}$$
and the quantization condition simply expresses the fact that
$$M = m .$$
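The only nontrivial integral used in the first bullet above is $\int_{-\infty}^{\infty} d\rho/(\cosh\rho + \cos\alpha) = 2\alpha/\sin\alpha$. The following sketch checks it numerically with a midpoint rule on a truncated interval and then assembles the action $2N\alpha$ of the translational motion. The cutoff, step count, and parameter values are illustrative choices.

```python
import math

def translational_integral(alpha, cutoff=60.0, steps=200000):
    """Midpoint estimate of the integral of 1/(cosh(rho) + cos(alpha))."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps):
        rho = -cutoff + (i + 0.5) * h
        total += 1.0 / (math.cosh(rho) + math.cos(alpha))
    return total * h

def translational_action(alpha, N, omega=1.0):
    """Left-hand side of (10.30) for the translational motion."""
    v = 2.0 * math.sqrt(omega) * math.cos(alpha / 2.0)           # (10.22), b = 0
    return (v * v / math.sqrt(omega)) * math.sin(alpha / 2.0) * (N / v) \
        * translational_integral(alpha)

alpha, N = 1.0, 100
print(translational_integral(alpha), 2.0 * alpha / math.sin(alpha))
print(translational_action(alpha, N), 2.0 * N * alpha)
```

The action does not depend on the precession frequency, which is why the quantization of the translational motion constrains only $\alpha$, i.e. the momentum P.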
In order to complete the semiclassical quantization scheme, I rewrite the relation between energy and canonical momentum, or, as is more usual in condensed matter physics, the wavevector K. Since (10.28) and (10.31) give $\alpha = Ka/(2S)$, equation (10.27) now reads
$$E = 16JS^3\,\frac{1}{|M|}\sin^2\left(\frac{Ka}{4S}\right) . \tag{10.33}$$
In the special case S = 1/2, the above expression coincides with the exact quantum mechanical result found by Bethe, using the Bethe Ansatz, for bound states of M magnons (footnote 2).
10.3.3 The easy-plane ferromagnetic chain in an external field
Weak out-of-plane motion: the Sine-Gordon limit
I will consider the case of strong anisotropy. The spin vectors are then approximately
confined to the xy plane. The z component is small, so we can readily assume p ≪ S and
set p′ ∼ 0 in (10.15). The continuum Hamiltonian becomes
$$H = \frac{1}{2}JaS^2\int dx\,q'^2 + \frac{A}{a}\int dx\,p^2 - \frac{\mu}{a}BS\int dx\,\cos q . \tag{10.34}$$
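Inserting the relevant functional derivatives
$$\frac{\delta H}{\delta p(x)} = 2\frac{A}{a}\,p, \qquad
\frac{\delta H}{\delta q(x)} = -JaS^2 q'' + \frac{\mu BS}{a}\sin q \tag{10.35}$$
into the equations of motion, I obtain
$$\dot q = \frac{a}{\hbar}\frac{\delta H}{\delta p(x)} = \frac{2A}{\hbar}\,p, \qquad
\dot p = -\frac{a}{\hbar}\frac{\delta H}{\delta q(x)} = \frac{Ja^2S^2}{\hbar}\,q'' - \frac{\mu BS}{\hbar}\sin q \tag{10.36}$$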
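from which I can eliminate p and get a differential equation which is of second order in space
and time for q:
$$\frac{\partial^2 q}{\partial t^2} - c_0^2\,\frac{\partial^2 q}{\partial x^2} = -\omega_0^2\sin q \tag{10.37}$$
where
$$c_0 = \sqrt{2AJ}\,\frac{aS}{\hbar} \qquad\text{and}\qquad \omega_0 = \frac{\sqrt{2A\mu BS}}{\hbar} .$$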
² From a more modern perspective, bound states of magnons should be appropriately called quantum
solitons.
We recognize (10.37) as the SG equation. We have derived it under the assumption of strong
anisotropy. Moreover, in order for (10.37) to provide a meaningful approximation to the true
dynamics of discrete spins, the length scale defined by (10.37)
$$d = \frac{c_0}{\omega_0} = \left(\frac{JS}{\mu B}\right)^{1/2} a$$
should be considerably larger than the lattice constant. The inequality
$$B \ll \frac{JS}{\mu}$$
therefore defines the physical range of allowed magnetic fields consistent with continuum SG
dynamics.
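Making use of the first of the equations of motion (10.36) I can write the total energy as
a function of the field q only:
$$H = \epsilon_0\int dx\left[\frac{1}{2}\left(\frac{\partial q}{\partial t}\right)^2
+ \frac{1}{2}c_0^2\left(\frac{\partial q}{\partial x}\right)^2 - \omega_0^2\cos q\right] , \tag{10.38}$$
where $\epsilon_0 = JaS^2/c_0^2$. The sign of the last term follows from the Zeeman term in (10.34). This allows me to effectively ignore the out-of-plane motion and deal
with q as if it were a single scalar field with an effective Lagrangian density
$$\mathcal{L} = \epsilon_0\left[\frac{1}{2}\left(\frac{\partial q}{\partial t}\right)^2
- \frac{1}{2}c_0^2\left(\frac{\partial q}{\partial x}\right)^2 + \omega_0^2\cos q\right]$$
which results in the SG dynamics and the total energy (10.38). Note however that this
should not be misunderstood to imply a vanishing out-of-plane motion.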
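As a quick sanity check of the static limit of (10.37), one can verify numerically that the standard kink profile q = 4 arctan exp(x/d), with d the ratio of the velocity scale to ω₀, satisfies the time-independent equation and reproduces the sech² form of cos q used later in the soliton-gas discussion. This is only a sketch: the parameter values are arbitrary, and the finite-difference helper is an illustrative assumption.

```python
import math

c0, omega0 = 2.0, 0.5      # hypothetical values, arbitrary units
d = c0 / omega0            # length scale of the SG equation

def q(x):
    """Static kink profile interpolating from 0 to 2*pi."""
    return 4.0 * math.atan(math.exp(x / d))

def static_residual(x, h=1e-3):
    """Residual of the static equation c0^2 q'' = omega0^2 sin q,
    with q'' approximated by a central second difference."""
    q_xx = (q(x + h) - 2.0 * q(x) + q(x - h)) / h**2
    return c0**2 * q_xx - omega0**2 * math.sin(q(x))

def cos_q_closed(x):
    """cos q = 1 - 2 sech^2(x/d) for the kink at rest, centred at 0."""
    return 1.0 - 2.0 / math.cosh(x / d)**2

points = [-10.0, -3.0, -0.5, 0.0, 1.0, 4.0, 12.0]
max_residual = max(abs(static_residual(x)) for x in points)
max_cos_error = max(abs(math.cos(q(x)) - cos_q_closed(x)) for x in points)
print(max_residual, max_cos_error)
```

A moving kink follows by a Lorentz boost, which replaces x by γ(x − vt).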
Dynamical structure factor
The quantity
$$I^{\alpha\alpha}(k,t) = \frac{1}{N}\sum_{m,n} e^{ik(m-n)a}\,\langle S_m^\alpha(t)\,S_n^\alpha(0)\rangle ,$$
where the brackets denote a thermodynamic average over a canonical ensemble, measures
the spatial Fourier transform of time-dependent correlations of the α-component of spins.
Its temporal Fourier transform is the dynamical structure factor (DSF)
$$I^{\alpha\alpha}(k,\omega) = \int_{-\infty}^{\infty}\frac{dt}{2\pi}\,e^{-i\omega t}\,I^{\alpha\alpha}(k,t)
= \frac{1}{a}\int\frac{dt}{2\pi}\int dx\;e^{i(kx-\omega t)}\,\langle S^\alpha(x,t)\,S^\alpha(0,0)\rangle , \tag{10.39}$$
where in the second line I have made use of the system's translational invariance and, in
addition, taken the continuum limit. The DSF can be experimentally deduced from inelastic
neutron scattering experiments which detect k and ħω, as the change in the neutron's
momentum and energy, respectively.
In the case of weak out-of-plane motion, the xx DSF can be written in terms of the q-field
as
$$I^{xx}(k,\omega) = \frac{S^2}{a}\int\frac{dt}{2\pi}\int dx\;e^{i(kx-\omega t)}\,\langle\cos q(x,t)\,\cos q(0,0)\rangle .$$
DSF calculation for a dilute gas of solitons
In the limit of weak out-of-plane motion, spin dynamics is effectively governed by the SG
equation. Now the dynamics of the SG field equation, a completely integrable system,
is truly exceptional. For decaying boundary conditions it implies that solitons have an
essentially infinite lifetime. At a finite temperature (therefore finite energy density) the
exact mathematics is somewhat more subtle, but there are good reasons to believe in the
existence of a soliton gas. At low temperatures, such a kink (or antikink)-like soliton gas
would consist of almost noninteracting particles of mass
$$M = \frac{8}{d}\,\epsilon_0 = 8J\left(\frac{S}{c_0}\right)^2\frac{a}{d} ,$$
velocity-dependent energy
$$E(v) = Mc_0^2\,\gamma \equiv Mc_0^2\left(1-\frac{v^2}{c_0^2}\right)^{-1/2} \approx Mc_0^2 + \frac{1}{2}Mv^2 + \cdots$$
(with the second expression valid at low velocities) and displacement ﬁeld (cf. )
$$\cos q(x,t) = 1 - 2\,\mathrm{sech}^2\!\left[\frac{\gamma(x-vt-x_0)}{d}\right]$$
where $x_0$ is a constant specifying the soliton position at time t = 0. The last equation implies
that, far from the soliton position, spins are oriented along the ferromagnetic reference state.
Only within a distance d from the soliton do spins deviate appreciably from that reference
state. Now if the soliton gas is dilute, i.e. the density is much smaller than a/d, then it is
very improbable that two solitons will be at the same place at the same time. We can then
assume that
$$\cos q(x,t) \approx 1 - 2\sum_j \mathrm{sech}^2\!\left[\frac{\gamma_j(x-v_jt-x_j^0)}{d}\right]$$
where the sum runs over all solitons of the gas.
The correlation function
$$\langle\cos q(x,t)\cos q(0,0)\rangle = 1
- 2\sum_j\left\langle \mathrm{sech}^2\!\left[\frac{\gamma_j(x-v_jt-x_j^0)}{d}\right]\right\rangle
- 2\sum_j\left\langle \mathrm{sech}^2\!\left[\frac{\gamma_j x_j^0}{d}\right]\right\rangle$$
$$\qquad + 4\sum_{i,j}\left\langle \mathrm{sech}^2\!\left[\frac{\gamma_j(x-v_jt-x_j^0)}{d}\right]
\mathrm{sech}^2\!\left[\frac{\gamma_i x_i^0}{d}\right]\right\rangle \tag{10.40}$$
consists of four terms. The first three are constant in time and space and therefore generate
contributions to the DSF only at zero momentum and energy transfer. The same holds for
that part of the fourth term which comes from terms i ≠ j and can be factorized into space-
and time-independent averages. The only contribution to the DSF at nonzero k and ω will
come from the i = j part of the last term (incoherent scattering) and, after averaging, is the
same for all solitons. The sum over j generates a factor equal to the total number of solitons
$N_s$.
The averaging operation involves averaging over all initial positions and velocities of the
jth soliton. The initial positions are uniformly distributed (remember that there is no
energy cost involved in moving a SG kink from one position to another). The velocities are
distributed according to the Boltzmann distribution appropriate for a one-dimensional gas.
In the limit of low temperatures, relativistic corrections can be neglected and
$$P(v) = C\,e^{-\beta Mv^2/2} , \tag{10.41}$$
where $\beta = 1/(k_BT)$ and $C = (\beta M/2\pi)^{1/2}$. Symbolically, the averaging operation can be
denoted as
$$\langle\cdots\rangle = \frac{1}{L}\int dx_0\int dv\,P(v)\,\cdots .$$
The resulting DSF
$$\frac{N_s}{L}\int\frac{dt}{2\pi}\,dx_0\,dv\,dx\;e^{i(kx-\omega t)}\,P(v)\;
\mathrm{sech}^2\!\left[\frac{\gamma(x-vt-x_0)}{d}\right]\mathrm{sech}^2\!\left[\frac{\gamma x_0}{d}\right]$$
is a quadruple integral. Setting $\xi = x - vt - x_0$ allows the integration over $x_0$ and ξ to be
readily performed, resulting in
$$n_s\int\frac{dt}{2\pi}\,dv\;e^{i(kvt-\omega t)}\,P(v)\left[f\!\left(\frac{kd}{\gamma}\right)\right]^2$$
where
$$f(\kappa) = \int_{-\infty}^{\infty}dx\;e^{-i\kappa x}\,\mathrm{sech}^2x$$
is the soliton form factor and $n_s = N_s/L$ the density of solitons. The time integral generates
a delta function of v − ω/k which then enables us to perform the integration over velocities.
The resulting DSF is
$$I(k,\omega) = n_s\,\frac{1}{k}\left[f\!\left(\frac{kd}{\gamma}\right)\right]^2 P\!\left(\frac{\omega}{k}\right)$$
where $\gamma = \left(1-(\omega/c_0k)^2\right)^{-1/2}$. At low temperatures $k_BT \ll Mc_0^2$, where the velocity
distribution is well approximated by (10.41), this reduces to the Gaussian
central peak (CP)
$$I(k,\omega) = n_s\,\frac{\pi^{-1/2}}{\Gamma_k}\,[f(kd)]^2\,e^{-\omega^2/\Gamma_k^2}$$
with a width
$$\Gamma_k = \left(\frac{2k_BT}{M}\right)^{1/2} k .$$
The observed CP width in CsNiF$_3$ is consistent with the above predictions [27].
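The form factor and the Gaussian central peak above can be evaluated explicitly. The sketch below assumes the standard closed form f(κ) = πκ / sinh(πκ/2) for the Fourier transform of sech²x (it is not stated in the notes) and checks it against direct numerical integration. The function names and parameter values are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def form_factor_numeric(kappa, cutoff=30.0):
    """f(kappa) by quadrature; the sine part vanishes by symmetry."""
    return simpson(lambda x: math.cos(kappa * x) / math.cosh(x)**2,
                   -cutoff, cutoff)

def form_factor_closed(kappa):
    """Assumed closed form pi*kappa/sinh(pi*kappa/2); equals 2 at kappa = 0."""
    if abs(kappa) < 1e-12:
        return 2.0
    return math.pi * kappa / math.sinh(math.pi * kappa / 2.0)

def central_peak(k, omega, n_s, d, kT, M):
    """Low-temperature Gaussian central peak with width Gamma_k."""
    gamma_k = math.sqrt(2.0 * kT / M) * abs(k)
    return (n_s * form_factor_closed(k * d)**2
            * math.exp(-(omega / gamma_k)**2) / (math.sqrt(math.pi) * gamma_k))

# Hypothetical parameters in arbitrary units.
k, n_s, d, kT, M = 0.7, 0.01, 5.0, 0.2, 3.0
gamma_k = math.sqrt(2.0 * kT / M) * k
integrated = simpson(lambda w: central_peak(k, w, n_s, d, kT, M),
                     -10 * gamma_k, 10 * gamma_k)
print(form_factor_numeric(1.3), form_factor_closed(1.3), integrated)
```

The frequency-integrated intensity equals n_s [f(kd)]², independent of temperature, while the width grows as the square root of T.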
10.4 Solitons in antiferromagnets
10.4.1 Continuum dynamics
The starting point is the spin Hamiltonian (10.1) and the resulting equations of motion
(10.6) with J < 0.
In the following, I will use $A = 2\delta|J|$; the dimensionless measure of the anisotropy, δ, will be
assumed to be small. Consider first the case A = 0, B = 0. If zero-point classical fluctuations
are neglected, the ground state of the isotropic antiferromagnet at zero external field is the
Néel state,
$$\mathbf{S}_n = \pm(-1)^n S\,\hat{\mathbf{n}}$$
where besides the rotational degeneracy (arbitrary direction of the unit vector $\hat{\mathbf{n}}$), there is
an "even-odd" degeneracy. If A denotes "up" and B denotes "down", both ABABAB
and BABABA are possible ground states with the same energy. The existence of two
degenerate "vacua" makes antiferromagnets a priori good candidates as soliton-bearing
systems.
I will define new vector fields which are well suited to describe the situation at low
temperatures, i.e. not too far from the ground state. Note that this will not exclude large amplitude
fluctuations; however, I will make the demand that the various field configurations should
vary smoothly in space. Let
$$\boldsymbol{\phi}_n = \frac{1}{2S}\left(\mathbf{S}_{2n+1} - \mathbf{S}_{2n}\right), \qquad
\mathbf{l}_n = \frac{1}{2}\left(\mathbf{S}_{2n+1} + \mathbf{S}_{2n}\right) . \tag{10.42}$$
The new fields satisfy the properties
$$\boldsymbol{\phi}_n\cdot\mathbf{l}_n = 0 \tag{10.43}$$
and
$$|\boldsymbol{\phi}_n|^2 + \frac{1}{S^2}|\mathbf{l}_n|^2 = 1 . \tag{10.44}$$
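Properties (10.43) and (10.44) hold for any pair of classical spins of length S, not only near the Néel state. The following sketch checks them for randomly oriented spins; the helper names and the value of S are illustrative assumptions.

```python
import math
import random

def random_spin(S, rng):
    """Classical spin of length S, uniformly distributed on the sphere."""
    z = rng.uniform(-1.0, 1.0)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (S * r * math.cos(angle), S * r * math.sin(angle), S * z)

def staggered_and_uniform(s_even, s_odd, S):
    """phi = (S_{2n+1} - S_{2n}) / (2S), l = (S_{2n+1} + S_{2n}) / 2, cf. (10.42)."""
    phi = tuple((o - e) / (2.0 * S) for e, o in zip(s_even, s_odd))
    l = tuple((o + e) / 2.0 for e, o in zip(s_even, s_odd))
    return phi, l

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

rng = random.Random(0)
S = 2.5  # hypothetical spin length
worst_orthogonality = 0.0
worst_normalization = 0.0
for _ in range(1000):
    phi, l = staggered_and_uniform(random_spin(S, rng), random_spin(S, rng), S)
    worst_orthogonality = max(worst_orthogonality, abs(dot(phi, l)))
    worst_normalization = max(worst_normalization,
                              abs(dot(phi, phi) + dot(l, l) / S**2 - 1.0))
print(worst_orthogonality, worst_normalization)
```

Both identities follow from the fact that the two spins in a cell have equal length, so their sum and difference are orthogonal.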
If field configurations vary smoothly in space, it is possible to use a continuum field
approximation $\boldsymbol{\phi}_n \to \boldsymbol{\phi}(x)$ with field values at neighboring sites (note that a neighboring site of the
new field is at a distance 2a apart!)
$$\boldsymbol{\phi}_{n\pm1} \sim \boldsymbol{\phi}(x) \pm 2a\,\boldsymbol{\phi}'(x) + \frac{1}{2}(2a)^2\,\boldsymbol{\phi}''(x) ,$$
and a similar expansion for $\mathbf{l}(x)$. Note however that the two fields do not have the same
status. At the Néel state, it is obvious that $|\mathbf{l}| = 0$, whereas $|\boldsymbol{\phi}_n| = 1$. In fact, a consistent
field expansion treats $\mathbf{l}$ as a small quantity, of the order of $\boldsymbol{\phi}'$. Therefore, terms of second
order in $\mathbf{l}$ will be dropped. Under these conditions, the normalization condition (10.44) is
exhausted by the $\boldsymbol{\phi}$ field, which will henceforth be treated as a vector of unit length.
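It is a tedious but straightforward (and necessary) exercise to use the inverse relations
$$\mathbf{S}_{2n} = \mathbf{l}_n - S\,\boldsymbol{\phi}_n, \qquad \mathbf{S}_{2n+1} = \mathbf{l}_n + S\,\boldsymbol{\phi}_n \tag{10.45}$$
and express the total Hamiltonian in the form
$$H = \int dx\,\mathcal{H}$$
where the Hamiltonian density is given in terms of the new vector fields:
$$\mathcal{H} = \frac{1}{2gc}\left\{c^2\left|\boldsymbol{\phi}' - \frac{2}{aS}\,\mathbf{l}\right|^2 + c^2\left|\boldsymbol{\phi}'\right|^2
- 2\omega_1^2\,\delta\left|\boldsymbol{\phi}_\perp\right|^2 - 4\,\frac{\gamma\omega_1}{S}\,\mathbf{B}\cdot\mathbf{l}\right\} \tag{10.46}$$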
The first two terms come from the isotropic exchange term; the third term comes from the
anisotropy (note that $\boldsymbol{\phi}_\perp = \boldsymbol{\phi} - (\boldsymbol{\phi}\cdot\hat{\mathbf{z}})\hat{\mathbf{z}}$ is the transverse component of the $\boldsymbol{\phi}$-field); the fourth
term comes from the interaction with the magnetic field. The new constants are related to
the old as follows: $g = 2/(\hbar S)$, $\omega_1 = 2|J|S/\hbar$, $c = \omega_1 a$.
The total magnetization can be expressed (in units of ħ) as³
$$\mathbf{M} = \sum_n \mathbf{S}_n = \int\frac{dx}{2a}\;2\,\mathbf{l}(x) . \tag{10.47}$$
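The next step is to obtain the dynamics of the coupled vector fields by differentiating both
sides of (10.42), using the dynamics defined in (10.6), and rewriting the results in terms of
the new fields:
$$\dot{\boldsymbol{\phi}} = c\,\boldsymbol{\phi}\times\left(\boldsymbol{\phi}' - \frac{2}{aS}\,\mathbf{l}\right) - \gamma\,\mathbf{B}\times\boldsymbol{\phi} \tag{10.48}$$
$$\dot{\mathbf{l}} = c\left(\mathbf{l}\times\boldsymbol{\phi}\right)' + caS\,\boldsymbol{\phi}\times\boldsymbol{\phi}''
- \omega_1\delta S\,(\boldsymbol{\phi}\cdot\hat{\mathbf{z}})\,\boldsymbol{\phi}\times\hat{\mathbf{z}} - \gamma\,\mathbf{B}\times\mathbf{l} \tag{10.49}$$
In deriving (10.48), I have further dropped anisotropy terms which are first order in $\mathbf{l}$; if the
anisotropy is small, they are negligible compared to the leading, first term in the right-hand
side of (10.48).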
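The above coupled first-order equations determine in principle the spin dynamics of the
new variables.⁴ It is however possible to perform a further reduction by recognizing that
$\mathbf{l}$ is completely determined by $\boldsymbol{\phi}$ and its derivatives (i.e. it is a slave variable). This can
be easily seen by forming the vector product of both sides of (10.48) with $\boldsymbol{\phi}$. After some
rearrangements,
$$\frac{2c}{aS}\,\mathbf{l} = \boldsymbol{\phi}\times\dot{\boldsymbol{\phi}} + c\,\boldsymbol{\phi}' + \gamma\left[\mathbf{B} - (\mathbf{B}\cdot\boldsymbol{\phi})\,\boldsymbol{\phi}\right] , \tag{10.50}$$
which can be used to eliminate $\mathbf{l}$ from (10.49). The result is
$$\boldsymbol{\phi}\times\left\{\ddot{\boldsymbol{\phi}} - c^2\boldsymbol{\phi}'' + 2\omega_1^2\,\delta\,(\boldsymbol{\phi}\cdot\hat{\mathbf{z}})\,\hat{\mathbf{z}}
+ 2\gamma\,\mathbf{B}\times\dot{\boldsymbol{\phi}} + \gamma^2\,\mathbf{B}\,(\mathbf{B}\cdot\boldsymbol{\phi})\right\} = 0 . \tag{10.51}$$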
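In what follows, it will be useful to exploit (10.50) in order to eliminate the slave variable $\mathbf{l}$
from (10.46). The result is⁵
$$\mathcal{H} = \frac{1}{2gc}\left\{|\dot{\boldsymbol{\phi}}|^2 + c^2|\boldsymbol{\phi}'|^2 - 2\omega_1^2\,\delta\,|\boldsymbol{\phi}_\perp|^2 + \gamma^2(\mathbf{B}\cdot\boldsymbol{\phi})^2\right\} \tag{10.52}$$
I will now consider some special cases.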
10.4.2 The isotropic antiferromagnetic chain
If δ = 0 and B = 0, (10.51) is equivalent to the dynamics of the field theory defined by the
Lagrangian density
$$\mathcal{L}_0 = \frac{1}{2gc}\left\{|\dot{\boldsymbol{\phi}}|^2 - c^2|\boldsymbol{\phi}'|^2\right\} \tag{10.53}$$
subject to the constraint $|\boldsymbol{\phi}|^2 = 1$. This is the relativistically invariant nonlinear sigma
model, which has been employed as a toy model in quantum chromodynamics. Furthermore,
a Wick rotation shows it to correspond to the Hamiltonian of the two-dimensional classical
antiferromagnet.
³ Note that, strictly speaking, this holds for an even number of spins. An odd number of spins will generate
a contribution from the boundary.
⁴ It should be noted that the equations preserve exactly both the normalization $|\boldsymbol{\phi}|^2 = 1$ and the
orthogonality property $\boldsymbol{\phi}\cdot\mathbf{l} = 0$.
⁵ Strictly speaking, the result in the brackets of (10.52) omits an irrelevant constant $-\gamma^2\,\mathbf{B}\cdot\mathbf{B}$ and a total
derivative term $-2\gamma c\,\mathbf{B}\cdot\boldsymbol{\phi}'$, which only generates contributions from the boundaries.
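The Hamiltonian density obtained from the field theory (10.53)
$$\mathcal{H} = \frac{1}{2gc}\left\{|\dot{\boldsymbol{\phi}}|^2 + c^2|\boldsymbol{\phi}'|^2\right\} \tag{10.54}$$
is the same as the sum of the first two terms in (10.52).
Note: it is possible to add to the Lagrangian density (10.53) a term
$$\mathcal{L}^* = \frac{1}{g}\,\frac{\theta}{2\pi S}\;\boldsymbol{\phi}\cdot\left(\dot{\boldsymbol{\phi}}\times\boldsymbol{\phi}'\right) . \tag{10.55}$$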
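This so-called topological term of the Lagrangian does not influence the classical equations
of motion. This is because it generates a contribution to the action which depends only on
general topological properties of the field; in the simplest of cases, one can see, using a polar
representation $p = \phi_z$, $q = \arctan(\phi_y/\phi_x)$, that
$$Q = \frac{1}{4\pi}\int dt\,dx\;\boldsymbol{\phi}\cdot\left(\dot{\boldsymbol{\phi}}\times\boldsymbol{\phi}'\right)
= \frac{1}{4\pi}\int dt\,dx\left(\frac{\partial p}{\partial t}\frac{\partial q}{\partial x} - \frac{\partial q}{\partial t}\frac{\partial p}{\partial x}\right)
= \frac{1}{4\pi}\int dt\,dx\;\frac{\partial(p,q)}{\partial(t,x)}
= \frac{1}{4\pi}\int_{-1}^{1}dp\int_0^{2\pi}dq = 1 ; \tag{10.56}$$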
in general, the Pontryagin index Q of the vector field tells us how many times the vector
sweeps the unit sphere as dx dt sweeps two-dimensional space-time. The resulting contribution
to the action
$$W^* = \int dx\,dt\;\mathcal{L}^* ,$$
a constant, cannot modify the classical equations of motion, which are determined by the
action derived from the Lagrangian density $\mathcal{L}_0$. It may however be relevant for quantum
phenomena. Noting in this context that $2/gS = \hbar$ and $\theta = 2\pi S$,⁶ we obtain
$$\frac{W^*}{\hbar} = 2\pi SQ \tag{10.57}$$
which hints that whether or not the extra term is relevant for quantum mechanics may well
depend on whether S is a half-integer or an integer, respectively.
10.4.3 Easy axis anisotropy
Consider the case of easy axis anisotropy $\delta = -\frac{1}{2}(\omega_0/\omega_1)^2$.
The dynamics of the vector field $\boldsymbol{\phi}$ (10.51) is equivalent to the Lagrangian field theory
$$\mathcal{L} = \frac{1}{2gc}\left\{|\dot{\boldsymbol{\phi}}|^2 - c^2|\boldsymbol{\phi}'|^2 - \omega_0^2\,|\boldsymbol{\phi}_\perp|^2\right\} \tag{10.58}$$
subject to the constraint $|\boldsymbol{\phi}|^2 = 1$. Note that the anisotropy term does not destroy Lorentz
invariance. The simplest way to lift the constraint is to introduce polar coordinates
$$\alpha = \arccos\phi_z, \qquad \beta = \arctan(\phi_y/\phi_x)$$
where 0 ≤ α ≤ π and 0 ≤ β < 2π. The Lagrangian density (10.58) can then be written as
$$\mathcal{L} = \frac{1}{2gc}\left\{\dot\alpha^2 + \sin^2\alpha\,\dot\beta^2 - c^2\left(\alpha'^2 + \sin^2\alpha\,\beta'^2\right) - \omega_0^2\sin^2\alpha\right\} . \tag{10.59}$$
⁶ The choice θ = 2πS is mandated by the requirement that the canonical momentum conjugate to $\boldsymbol{\phi}$,
$\boldsymbol{\pi} = \partial(\mathcal{L}_0 + \mathcal{L}^*)/\partial\dot{\boldsymbol{\phi}}$, should form a "triad" with $\mathbf{l}$ and $\boldsymbol{\phi}$, i.e. $\boldsymbol{\pi} = (\hbar/a)\,\mathbf{l}\times\boldsymbol{\phi}$ and $(\hbar/a)\,\mathbf{l} = \boldsymbol{\phi}\times\boldsymbol{\pi}$; the choice
θ = 2πS in (10.48) with B = 0 satisfies the first of these requirements; the second, which is a natural
feature of a Hamiltonian theory, guaranteeing the right form for $\mathbf{l}$, the generator of infinitesimal rotations,
is then automatically satisfied.
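The corresponding energy density is given by (10.52) as
$$\mathcal{H} = \frac{1}{2gc}\left\{\dot\alpha^2 + \sin^2\alpha\,\dot\beta^2 + c^2\left(\alpha'^2 + \sin^2\alpha\,\beta'^2\right) + \omega_0^2\sin^2\alpha\right\} . \tag{10.60}$$
The resulting equations of motion are
$$\alpha'' - \frac{1}{c^2}\,\ddot\alpha = \frac{1}{2}\,\frac{\omega_0^2 - \dot\beta^2}{c^2}\,\sin 2\alpha$$
$$\frac{1}{c^2}\frac{\partial}{\partial t}\left(\sin^2\alpha\,\dot\beta\right) = \frac{\partial}{\partial x}\left(\sin^2\alpha\,\beta'\right) . \tag{10.61}$$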
The vacua
I first determine the spatially and temporally uniform solutions. These are α = 0, π/2, π and
$\beta = \beta_0$ (arbitrary). By inspection of (10.60) it can be seen that only α = 0, π correspond to
(degenerate) energy minima (vacua), whereas α = π/2 corresponds to an energy maximum.
Kinks and antikinks
I next look for solutions which satisfy $\dot\beta = \omega$ (uniform precession of the $\boldsymbol{\phi}$ vector around the
z-axis) and $\dot\alpha = 0$; the latter restriction can later be lifted because I can always Lorentz-boost
a static solution. The second equation is satisfied identically; the first reduces to
$$\alpha'' = \frac{\omega_0^2 - \omega^2}{2c^2}\,\sin 2\alpha , \tag{10.62}$$
which is a Sine-Gordon equation for the field 2α and a length scale $R = c/\sqrt{\omega_0^2 - \omega^2}$. It has
a first integral
$$R^2(\alpha')^2 = -\frac{1}{2}\cos 2\alpha + \text{const}$$
and can therefore support soliton solutions which interpolate from one vacuum (α = 0) to
the other (α = π). This implies the choice of the constant equal to 1/2, therefore
$$R\,\alpha' = \pm\sin\alpha$$
and
$$\alpha = 2\arctan e^{\pm(x-x_0)/R} ,$$
or,
$$\sin\alpha = \mathrm{sech}\,\frac{x-x_0}{R}, \qquad \cos\alpha = \mp\tanh\frac{x-x_0}{R} , \tag{10.63}$$
where $x_0$ is an arbitrary constant.
The solitons are π-kinks, going from α = 0 at x → −∞ to π at x → ∞ (and the
corresponding antikinks).
The total magnetization of the π-kink
Introducing (10.50) into (10.47), we obtain the following general expression for the
magnetization, valid for $\mathbf{B} = 0$:
$$\mathbf{M} = \frac{S}{2c}\int dx\left(\boldsymbol{\phi}\times\dot{\boldsymbol{\phi}} + c\,\boldsymbol{\phi}'\right)$$
The z-component of the magnetization will then be
$$M_z = \frac{S}{2c}\int dx\left[\left(\phi_x\dot\phi_y - \phi_y\dot\phi_x\right) + c\,\phi_z'\right]
= \frac{S}{2c}\int dx\,\sin^2\alpha\,\dot\beta + \frac{S}{2}\cos\alpha\,\Big|_{-\infty}^{\infty}
= m \mp S \tag{10.64}$$
where the limits of integration have been extended to infinity, since we assume that the spin
configurations approach one of the two Néel states at infinity. The second term will then
be equal to −S for a kink and S for an antikink. This is a contribution which is entirely
independent of the structural details of the kink. The "extra" magnetization will be
$$m = \frac{S}{2c}\,\omega\int_{-\infty}^{\infty}dx\;\mathrm{sech}^2\frac{x-x_0}{R}
= \frac{S}{2c}\,\omega\,2R
= \frac{\omega}{\sqrt{\omega_0^2-\omega^2}}\,S . \tag{10.65}$$
The total energy of the π-kink
Introducing the form of the kink solution into the expression for the energy density (10.60)
gives
$$\mathcal{H} = \frac{1}{2gc}\left\{\sin^2\alpha\;\omega^2 + c^2\,\frac{\sin^2\alpha}{R^2} + \omega_0^2\sin^2\alpha\right\}
= \frac{\omega_0^2}{gc}\;\mathrm{sech}^2\frac{x-x_0}{R}$$
which gives a total kink energy
$$E = \int_{-\infty}^{\infty}dx\,\mathcal{H} = \frac{\omega_0^2}{gc}\,2R = \frac{2}{g}\,\frac{\omega_0^2}{\sqrt{\omega_0^2-\omega^2}} . \tag{10.66}$$
It is possible to express ω in terms of the kink magnetization m using (10.65). This gives
the energy of the kink as a function of magnetization
$$E = \hbar\omega_0\sqrt{S^2 + m^2} \tag{10.67}$$
where I have made use of $g = 2/(\hbar S)$. The last form, along with (10.65), is eminently
useful for developing a semiclassical quantization approach where m would take integer or
half-integer values.
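The chain (10.65), (10.66), (10.67) can be verified numerically by integrating the sech² profile directly. The sketch below uses arbitrary units with ħ = 1 and hypothetical parameter values; the overall sign convention of m does not matter here, because only m² enters the energy.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

hbar, S = 1.0, 2.5                 # hypothetical units and spin length
g = 2.0 / (hbar * S)               # coupling constant g = 2/(hbar S)
c, omega0, omega = 1.0, 1.0, 0.6   # requires |omega| < omega0
R = c / math.sqrt(omega0**2 - omega**2)

def sech2(u):
    return 1.0 / math.cosh(u)**2

half_width = 40.0 * R              # integration window, many kink widths
profile_integral = simpson(lambda x: sech2(x / R), -half_width, half_width)

# "Extra" magnetization of (10.65), up to the overall sign convention.
m = (S / (2.0 * c)) * omega * profile_integral
# Total kink energy from the energy density, cf. (10.66).
E = omega0**2 / (g * c) * profile_integral
# Energy expressed through the magnetization, cf. (10.67).
E_from_m = hbar * omega0 * math.sqrt(S**2 + m**2)
print(m, E, E_from_m)
```

For ω = 0 the kink carries no extra magnetization and the energy reduces to ħω₀S.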
10.4.4 Easy plane anisotropy
A positive value of the anisotropy parameter δ favors spin orientations along the xy-plane.
Setting
$$\delta = \frac{1}{2}\left(\frac{\omega_0}{\omega_1}\right)^2$$
implies a change of sign $\omega_0^2 \to -\omega_0^2$ in the effective Lagrangian density (10.59), which now
reads
$$\mathcal{L} = \frac{1}{2gc}\left\{\dot\alpha^2 + \sin^2\alpha\,\dot\beta^2 - c^2\left(\alpha'^2 + \sin^2\alpha\,\beta'^2\right) + \omega_0^2\sin^2\alpha\right\} , \tag{10.68}$$
the energy density (10.60), and the equations of motion (10.61).
Although the spatially and temporally uniform solutions are the same as in the case of
the easy-axis anisotropy, their role is reversed. There is only one stable energy minimum at
α = π/2 and maxima at α = 0, π. Looking for solutions with $\dot\alpha = 0$ and $\dot\beta = \omega$ leads to
$$\alpha'' = -\frac{1}{2}\,\frac{1}{R^2}\,\sin 2\alpha$$
where now
$$R^2 = \frac{c^2}{\omega^2 + \omega_0^2} .$$
This is a SG equation for the field 2(π/2 − α). We can write down the solution by making
the substitution α → π/2 − α in (10.63):
$$\cos\alpha = \mathrm{sech}\,\frac{x-x_0}{R}, \qquad \sin\alpha = \mp\tanh\frac{x-x_0}{R} , \tag{10.69}$$
where $x_0$ is again an arbitrary constant.
Note that this type of solution does not interpolate between degenerate vacua. It drives
the system out of the only available vacuum at $x \ll x_0$, leads it to the energy maximum at
$x = x_0$ and returns it to the vacuum as $x \gg x_0$. The associated energy density is
$$\mathcal{H} = \frac{1}{2gc}\left\{\sin^2\alpha\;\omega^2 + \left(\frac{c^2}{R^2} + \omega_0^2\right)\cos^2\alpha\right\}$$
and will therefore lead, from the first term, to a total energy proportional to the size of the
system, as long as ω ≠ 0. If we restrict ourselves to finite-energy solutions ω must vanish,
i.e. $\beta = \beta_0$. The characteristic length becomes $R = c/\omega_0$ and the energy density
$$\mathcal{H} = \frac{1}{2gc}\,2\omega_0^2\;\mathrm{sech}^2\!\left[\frac{\omega_0}{c}(x-x_0)\right] ,$$
which integrates to a total kink energy
$$E = \frac{2}{g}\,\omega_0 = \hbar\omega_0 S \tag{10.70}$$
Inspection of (10.64) shows that both the bulk and the boundary contributions to the
magnetization vanish.
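10.4.5 Easy plane anisotropy and symmetry-breaking field
In the case of easy-plane anisotropy
$$\delta = \frac{1}{2}\left(\frac{\omega_0}{\omega_1}\right)^2$$
and nonzero magnetic field, it is possible to satisfy the equations of motion (vanishing
brackets in (10.51)) by constructing a field theory based on the effective Lagrangian density
$$\mathcal{L} = \frac{1}{2gc}\left\{|\dot{\boldsymbol{\phi}}|^2 - c^2|\boldsymbol{\phi}'|^2 + \omega_0^2\,|\boldsymbol{\phi}_\perp|^2\right\} + \mathcal{L}_B , \quad\text{where}$$
$$\mathcal{L}_B = \frac{1}{2gc}\left\{-\gamma^2(\mathbf{B}\cdot\boldsymbol{\phi})^2 + 2\gamma\,\mathbf{B}\cdot\left(\boldsymbol{\phi}\times\dot{\boldsymbol{\phi}}\right)\right\} . \tag{10.71}$$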
Two comments are in order here concerning the second term in the second line. First,
this is the most general scalar that can be constructed from the magnetic field and the
field vector $\boldsymbol{\phi}$ and its derivatives, consistent with the property of time reversal invariance
$\mathbf{B} \to -\mathbf{B}$, $\dot{\boldsymbol{\phi}} \to -\dot{\boldsymbol{\phi}}$. Second, although the term is crucial for the dynamics (it generates
the fourth term in the brackets in (10.51)), it does not contribute to the energy, which is
quadratic in the magnetic field (cf. (10.52)).
I will consider the case in which the magnetic field is in the x-direction, i.e. it serves to
break the easy-plane symmetry.
It is straightforward to derive the complete spin dynamics in polar coordinates. They
correspond to those of the previous subsection, with additional terms which come from the
field dependent part of the Lagrangian
$$\mathcal{L}_B = \frac{1}{2gc}\left\{-\gamma^2B^2\sin^2\alpha\cos^2\beta - 2\gamma B\sin\beta\;\dot\alpha - \gamma B\sin 2\alpha\cos\beta\;\dot\beta\right\} .$$
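The equations of motion in polar coordinates are:
$$\ddot\alpha - c^2\alpha'' - \frac{1}{2}\sin 2\alpha\left(\dot\beta^2 + \omega_0^2\right)
= 2\gamma B\cos\beta\,\sin^2\alpha\;\dot\beta - \frac{1}{2}\gamma^2B^2\sin 2\alpha\cos^2\beta$$
$$\frac{\partial}{\partial t}\left(\sin^2\alpha\,\dot\beta\right) - c^2\frac{\partial}{\partial x}\left(\sin^2\alpha\,\beta'\right)
= -\gamma B\cos\beta\;\dot\alpha + \frac{1}{2}\gamma^2B^2\sin^2\alpha\sin 2\beta \tag{10.72}$$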
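I will also need the total energy density (10.52), which in polar coordinates reads
$$\mathcal{H} = \mathcal{H}_{exch} + \mathcal{H}_{anis} + \mathcal{H}_B , \quad\text{where}$$
$$\mathcal{H}_{exch} = \frac{1}{2gc}\left\{\dot\alpha^2 + \sin^2\alpha\,\dot\beta^2 + c^2\left(\alpha'^2 + \sin^2\alpha\,\beta'^2\right)\right\}$$
$$\mathcal{H}_{anis} = -\frac{\omega_0^2}{2gc}\,\sin^2\alpha$$
$$\mathcal{H}_B = \frac{(\gamma B)^2}{2gc}\,\sin^2\alpha\cos^2\beta . \tag{10.73}$$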
The vacua
Spatially and temporally uniform solutions of (10.72) must satisfy the conditions
$$\sin 2\alpha\left(\omega_0^2 - \gamma^2B^2\cos^2\beta\right) = 0$$
$$\sin^2\alpha\,\sin 2\beta = 0 . \tag{10.74}$$
If we look at the total energy of such a uniform state (for a system of length L) as a function of the field,
$$E = -L\,\frac{1}{2gc}\,\sin^2\alpha\left\{\omega_0^2 - \gamma^2B^2\cos^2\beta\right\}$$
we see that the energy minimum, which is at α = π/2 at zero magnetic field, shifts to
α = 0, π if the field exceeds the critical value $B_c = \omega_0/\gamma$. At moderate fields $B < B_c$, the
minimal energy configuration will be at α = π/2, β = ±π/2.
The kinks
If the anisotropy is nonvanishing, we can assume that the out-of-plane motion will be weak.
As long as α ∼ π/2, the second equation of motion will reduce to
$$\frac{\partial^2\beta}{\partial t^2} - c^2\frac{\partial^2\beta}{\partial x^2} = \frac{1}{2}\gamma^2B^2\sin 2\beta \tag{10.75}$$
which is a Sine-Gordon equation for the angle π − 2β. It admits static (and
Lorentz-boostable) kink and antikink solutions of the form
$$\pi - 2\beta = 4\arctan e^{\pm(x-x_0)/d}$$
where d = c/(γB). Alternatively, we can write
$$\cos\beta = \mathrm{sech}\left(\frac{x-x_0}{d}\right), \qquad \sin\beta = \pm\tanh\left(\frac{x-x_0}{d}\right) , \tag{10.76}$$
which gives an energy density
$$\frac{1}{2gc}\left\{c^2\beta'^2 - \omega_0^2 + \gamma^2B^2\cos^2\beta\right\}$$
and a total kink rest energy (measured from the vacuum)
$$E_0 = 2\,\frac{c}{gd} = \hbar S\gamma B = \mu SB .$$
Out-of-plane corrections
Corrections which arise from the weak out-of-plane motion may be estimated by considering
the first of the equations of motion (10.72) which for α ∼ π/2 (so that the second space and
time derivatives can be neglected) gives
$$-\cos\alpha\left\{\dot\beta^2 + \omega_0^2 - \gamma^2B^2\cos^2\beta\right\} = 2\gamma B\cos\beta\;\dot\beta$$
For a static kink configuration, the first term on the left-hand side vanishes⁷ and the third
is at most (in the vicinity of the kink) of order $\gamma^2B^2$. At moderate fields $B < B_c$ this leaves
the second term as the dominant one; hence, using cos α ≈ π/2 − α,
$$\alpha = \frac{\pi}{2} + \frac{2\gamma B}{\omega_0^2}\,\cos\beta\;\dot\beta$$
which is the analog of the first of eqs (10.36) for the easy-plane ferromagnet and can be used
to estimate e.g. out-of-plane corrections to the kink energy.
⁷ Note that this estimate is even better for moving kinks, since the first term then partially cancels the
third.
The dynamical structure factors
The xx spectra
$$\langle\cos\beta(x,t)\,\cos\beta(0,0)\rangle$$
give rise to a DSF similar to that of the ferromagnet with easy-plane anisotropy and magnetic
field in the x-direction, except for a slightly different form factor due to the sech rather
than sech² function.
The yy spectra can be approximated by
$$\langle\sin\beta(x,t)\,\sin\beta(0,0)\rangle \approx \langle\sigma(x,t)\,\sigma(0,0)\rangle$$
where σ = ±1. The approximation suggests that they act as "registers" of a spin flip every
time a soliton comes by. More precisely, we can approximate
$$\sigma(x,t) = \sigma(x,0)\,(-1)^{N_1(t)} ,$$
where $N_1(t)$ is the number of times a soliton passes through the point x during the time
interval (0, t), and
$$\sigma(x,0) = \sigma(0,0)\,(-1)^{N_2(x)} ,$$
where $N_2(x)$ is the number of solitons which are in the segment (0, x) at any given time.
The dynamical correlation can then be calculated, using the Poisson statistics which
characterize the spin flips, to be
$$\langle\sigma(x,t)\,\sigma(0,0)\rangle = e^{-2[\bar N_1(t) + \bar N_2(x)]} .$$
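The step from Poisson statistics to the exponential decay rests on the identity that the average of (−1)^N over a Poisson distribution with mean N̄ equals exp(−2N̄). The sketch below checks this by summing the series directly; the function names and the truncation n_max are illustrative assumptions.

```python
import math

def poisson_parity_average(mean, n_max=200):
    """Average of (-1)^N for a Poisson-distributed N with the given mean.

    Sums (-1)^n * exp(-mean) * mean^n / n! for n < n_max, building the
    Poisson weights recursively to avoid large factorials.
    """
    weight = math.exp(-mean)   # probability of n = 0
    total = 0.0
    for n in range(n_max):
        total += weight if n % 2 == 0 else -weight
        weight *= mean / (n + 1)
    return total

def spin_flip_correlation(x, t, n_s, u_bar):
    """Correlation exp(-4 n_s |x| - 4 n_s u_bar |t|) for equal densities
    of solitons and antisolitons, as assembled in the text."""
    return math.exp(-4.0 * n_s * abs(x) - 4.0 * n_s * u_bar * abs(t))

mean = 1.7  # hypothetical mean number of flips
print(poisson_parity_average(mean), math.exp(-2.0 * mean))
```

Since the spatial and temporal flip counts are treated as independent, their parity averages multiply, which gives the sum in the exponent.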
The averages are straightforward to estimate:
• $\bar N_2(x)$ is simply equal to $2n_s|x|$, where $n_s$ is the density of kink-type solitons (and
$n_{\bar s} = n_s$ the density of antisolitons).
• The average number of solitons passing through a point during a given time interval of
length t is given by $n_s\bar u|t|$, where $\bar u$ is the average thermal speed of solitons. If solitons
can be thought of as forming an ideal (Boltzmann) gas (cf. the analogous discussion
of the ferromagnetic case in section 10.3.3) then $\bar u = (2k_BT/M)^{1/2}$. This should of
course be multiplied by a factor of 2, to account for the antisolitons.
Putting everything together,
$$\langle\sigma(x,t)\,\sigma(0,0)\rangle = e^{-4n_s|x| - 4n_s\bar u|t|}$$
and hence
$$I(k,\omega) = \frac{\pi}{k^2 + \Gamma_k^2}\;\frac{\pi}{\omega^2 + \Gamma_\omega^2} \tag{10.77}$$
where
$$\Gamma_k = 4n_s$$
and
$$\Gamma_\omega = 4\left(\frac{2k_BT}{M}\right)^{1/2} n_s$$
According to the above, the main temperature and magnetic field dependence of both widths
comes from the soliton density which, in the appropriate temperature range, has a
characteristic exponential dependence
$$n_s \propto e^{-E_0/k_BT} .$$
Since the kink rest energy is proportional to the magnetic field, we expect the leading B
and T dependence of the width to scale with B/T. This is borne out by experiments [28]
on TMMC (cf. Fig. 10.1).
Figure 10.1: Energy and wavevector width of the central peak of TMMC. (from [28]).
11 Solitons in conducting polymers
11.1 Peierls instability
This is not an attempt to cover the general subject of electronic transport properties of
polymers. A good place to start learning about the physical properties, the chemistry and the
technological significance of semiconducting and metallic polymers is Alan Heeger's Nobel
lecture [29]. The theoretical and experimental background on solitons in conducting polymers
has also been reviewed by Heeger, Kivelson, Schrieffer and Su [30]. Here I will only try to
convey the main ideas about the Peierls instability, which turns a one-dimensional metallic
chain into an insulator, and about how the resulting structures, if the proper conditions of
energy degeneracy are met, can support solitons and polarons.
The exemplary substance for the present discussion is trans-polyacetylene, (CH)$_x$. The
schematic structure shows that the backbone consists of carbon atom bonds which are neither
single nor double. For large chains, where the end points are irrelevant, the picture of the
electronic structure is relatively simple: each carbon atom contributes a single $p_z$ electron to
the π band. These electrons have a tendency to delocalize. One can express this in terms
of a tight-binding model Hamiltonian
$$H_0 = -\sum_{ns} t_{n,n+1}\left(c^\dagger_{n+1,s}\,c_{ns} + c^\dagger_{ns}\,c_{n+1,s}\right) , \tag{11.1}$$
where $c^\dagger_{ns}$ and $c_{ns}$ denote creation and annihilation operators for electrons of spin s at the nth
site, and the hopping parameters $t_{n,n+1}$ correspond to the π-electron transfer integrals.
11.1.1 Electrons decoupled from the lattice
In the absence of any coupling to the underlying lattice, the hopping parameters can be
assumed to have a constant value, $t_0$. The Hamiltonian $H_0$ is then diagonal in Fourier
space, i.e.
$$H_0 = -\sum_{qs}\epsilon_q\,c^\dagger_{qs}\,c_{qs} , \tag{11.2}$$
where
$$c_{qs} = \frac{1}{\sqrt{N}}\sum_{n=1}^{N} e^{iqan}\,c_{ns} ; \qquad
q = \frac{2\pi}{a}\,\frac{n}{N} , \quad n = -\frac{N}{2}+1, \ldots, \frac{N}{2}-1, \frac{N}{2} , \tag{11.3}$$
N is the number of sites and a the lattice constant, and
$$\epsilon_q = 2t_0\cos qa \tag{11.4}$$
The resulting band structure is shown in unfolded and folded form in Fig. . The point to
note is that of the total N allowed values of q, N/2 lead to states of negative energy; since
every state can be doubly occupied (spin up/down), and there is a total of N π-electrons, the
negative energy states are all occupied, and the positive energy states are empty. Since there
is no gap between them, the one-dimensional tight-binding electronic system is metallic.
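The statement that uniform hopping is diagonal in Fourier space can be checked on a small ring without any linear-algebra library: applying the single-particle hopping matrix to a plane wave returns the same plane wave multiplied by −2t₀ cos qa, in agreement with (11.2) and (11.4). The helper names and the ring size below are illustrative assumptions.

```python
import cmath
import math

def apply_hopping(psi, t0):
    """Act with the single-particle form of H_0 (uniform hopping t0,
    periodic boundary conditions) on the amplitude vector psi."""
    N = len(psi)
    return [-t0 * (psi[(n + 1) % N] + psi[(n - 1) % N]) for n in range(N)]

def plane_wave(N, j, a=1.0):
    """Normalized plane wave with allowed wavevector q = 2*pi*j/(N*a)."""
    q = 2.0 * math.pi * j / (N * a)
    return q, [cmath.exp(1j * q * a * n) / math.sqrt(N) for n in range(N)]

def band_residual(N, j, t0=1.0, a=1.0):
    """Largest component of H psi + eps_q psi, with eps_q = 2 t0 cos(qa)."""
    q, psi = plane_wave(N, j, a)
    eps_q = 2.0 * t0 * math.cos(q * a)
    h_psi = apply_hopping(psi, t0)
    return max(abs(h + eps_q * p) for h, p in zip(h_psi, psi))

N = 16  # hypothetical ring size
worst = max(band_residual(N, j) for j in range(-N // 2 + 1, N // 2 + 1))
print(worst)
```

The residual vanishes to rounding accuracy for every allowed q, so the single-particle energies are −ε_q.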
11.1.2 Electron-phonon coupling; dimerization
Suppose that the carbon atoms in the backbone are slightly displaced from their reference
positions. Let the displacement of the nth carbon atom be $y_n$. It is reasonable to assume
that if the distance from the nth to the (n+1)st atom decreases, the probability amplitude
for electron hopping should increase; for small relative displacements
$$t_{n,n+1} = t_0 - \alpha\,(y_{n+1} - y_n) \tag{11.5}$$
should hold. Furthermore, the atomic displacements contribute to a lattice deformation
energy
$$H_L = \frac{1}{2}K\sum_n (y_{n+1} - y_n)^2 . \tag{11.6}$$
In principle, the lattice atoms also contribute a kinetic energy term. In the framework of
the Born-Oppenheimer approximation, we will treat the slow motion of atoms as classical;
in essence we want to compare the electronic energies of various lattice configurations. The
kinetic energy of these configurations can be neglected in a first approximation.
The possibility of dimerization
I examine configurations with alternating bond lengths, i.e.
$$y_n = (-1)^n y_0 .$$
I define new creation and annihilation operators appropriate to the folded Brillouin zone,
$$c_{ks} = a_{ks} \quad\text{if } |k| < k_F$$
$$c_{ks} = b_{k-\mathrm{sgn}(k)k_F,\,s} \quad\text{if } |k| > k_F$$
which restricts k-values to be smaller than $k_F = \pi/(2a)$. The tight-binding Hamiltonian,
which now includes the interaction of the electrons with the lattice, can now be written as
$$H_0 = \sum_{qs}\left\{\epsilon_q\left(a^\dagger_{qs}a_{qs} - b^\dagger_{qs}b_{qs}\right) + i\Delta_q\left(b^\dagger_{qs}a_{qs} - a^\dagger_{qs}b_{qs}\right)\right\} \tag{11.7}$$
with $\Delta_q = \Delta_0\sin qa$ and $\Delta_0 = 4\alpha y_0$.
In addition, the lattice deformation contributes
$$H_L = \frac{1}{2}NK\,4y_0^2$$
to the total energy.
Diagonalization of the electronic Hamiltonian
The Hamiltonian (11.7) can be diagonalized via a Bogoliubov transformation
$$a_{ks} = \alpha^*_{ks}A_{ks} - \beta_{ks}B_{ks}$$
$$b_{ks} = \beta^*_{ks}A_{ks} + \alpha_{ks}B_{ks} \tag{11.8}$$
where $\alpha_{ks}$, $\beta_{ks}$ are c-numbers, satisfying the relationship $|\alpha_{ks}|^2 + |\beta_{ks}|^2 = 1$, and chosen such
as to satisfy the anticommutation relations $\{A_{ks}, A^\dagger_{ks}\} = 1$ and $\{B_{ks}, B^\dagger_{ks}\} = 1$; all other
anticommutators must vanish; furthermore, $\alpha_{ks}$ can be chosen to be real and positive. The
procedure leads to
$$\alpha_{ks} = \frac{1}{\sqrt{2}}\sqrt{1 - \frac{\epsilon_q}{E_q}} , \qquad
\beta_{ks} = \frac{i}{\sqrt{2}}\sqrt{1 + \frac{\epsilon_q}{E_q}} ,$$
where
$$E_q = \sqrt{\epsilon_q^2 + \Delta_q^2} = 2t_0\sqrt{1 - (1 - z^2)\sin^2 qa} , \tag{11.9}$$
with $z = \Delta_0/2t_0$, and a two-band diagonal Hamiltonian:
$$H_0 = \sum_{qs} E_q\left(A^\dagger_{qs}A_{qs} - B^\dagger_{qs}B_{qs}\right) . \tag{11.10}$$
The energy bands now form a gap of width $2\Delta_0$ at the Brillouin zone edge (cf. Fig. 11.1).
In order to see whether this instability will materialize spontaneously, it is necessary to
compute the total ground state energy and compare it with that of the undimerized state.
The ground state energy of $H_0$ is
$$-2\sum_q E_q = -2\,\frac{Na}{2\pi}\,2t_0\int_{-\pi/2a}^{\pi/2a}dq\,\sqrt{1 - (1 - z^2)\sin^2 qa}
= -N\,\frac{4}{\pi}\,t_0\int_0^{\pi/2}dx\,\sqrt{1 - (1 - z^2)\sin^2 x}
= -N\,\frac{4}{\pi}\,t_0\,E(z)$$
where E(z) is the complete elliptic integral of the second kind (here with modulus $\sqrt{1 - z^2}$). In addition, the dimerized
configuration includes a contribution
$$H_L = 2Nt_0\,\frac{1}{\pi\lambda}\,z^2 ,$$
from the lattice deformation, where
$$\lambda = \frac{4}{\pi}\,\frac{\alpha^2}{Kt_0}$$
is the dimensionless electron-phonon coupling constant. Collecting terms and using the
small-z expansion of the elliptic integral,
$$E(z) \approx 1 + \frac{1}{2}z^2\left[\ln\frac{4}{|z|} - \frac{1}{2}\right]$$
I obtain a total energy per site
$$E_0(z) = -\frac{4t_0}{\pi}\left\{1 + \frac{1}{2}z^2\left[\ln\frac{4}{|z|} - \frac{1}{2} - \frac{1}{\lambda}\right]\right\} .$$
Subtracting the energy $E_0(0)$ of the undimerized state gives
$$\Delta E_0(z) = -\frac{2t_0}{\pi}\,z^2\left[\ln\frac{4}{|z|} - \frac{1}{2} - \frac{1}{\lambda}\right] \tag{11.11}$$
as the energetic advantage of dimerization. Fig. 11.1 shows that the dimerization energy
has a double well structure, with minima at $z = \pm z_0 = \pm 4e^{-1-1/\lambda}$. This corresponds to a
band gap $2\Delta_0$ at the BZ edge (Peierls gap). Expressed as a fraction of the total bandwidth,
$$\frac{2\Delta_0}{4t_0} = z_0 = 4e^{-1-1/\lambda} \tag{11.12}$$
Figure 11.1: Left panel (z = 0.2; energy versus qa/π): electronic spectra of the undimerized (dotted curve) and dimerized (dashed
curve) cases of the SSH model; the energy is in units of $2t_0$. The Peierls gap is formed
at the edge of the Brillouin zone. Right panel (λ = 0.4; $\Delta E_0/(4t_0)$ versus z): the energy (11.11) of dimerization as a
function of the dimensionless parameter z.
Note the nonanalytic dependence of the energy gap on the electron-phonon coupling constant, which bears a formal similarity to the Cooper pair condensation energy in superconductivity. Of course the effect here is the opposite. Switching on the electron-phonon interaction brings about a spontaneous lattice distortion$^1$ and turns a putative one-dimensional metal into an insulator (Peierls instability).
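The location of the double-well minima can be checked numerically. The following is a minimal sketch, assuming the small-$z$ form (11.11) and units with $t_0 = 1$; the function names and the golden-section search are illustrative choices, not part of the original derivation.

```python
import math

def dimerization_energy(z, lam, t0=1.0):
    """Energy gain per site, Eq. (11.11); valid for small |z|."""
    if z == 0.0:
        return 0.0
    bracket = math.log(4.0 / abs(z)) - 0.5 - 1.0 / lam
    return -(2.0 * t0 / math.pi) * z * z * bracket

def z0_closed_form(lam):
    """Position of the minimum quoted in the text: z0 = 4 exp(-1 - 1/lambda)."""
    return 4.0 * math.exp(-1.0 - 1.0 / lam)

def z0_numeric(lam, lo=1e-6, hi=1.0, iterations=200):
    """Golden-section search for the minimum of Eq. (11.11) on (lo, hi)."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iterations):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if dimerization_energy(c, lam) < dimerization_energy(d, lam):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

if __name__ == "__main__":
    lam = 0.4  # coupling used in the right panel of Fig. 11.1
    print(z0_numeric(lam), z0_closed_form(lam))
```

For $\lambda = 0.4$ both numbers should agree to within the search tolerance, and the energy at the minimum is negative, as expected for an instability.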
The double-minimum structure of the dimerization potential (11.11) makes plausible the existence of kink-like solitons, i.e. nonlinear configurations of the coupled electron-phonon system which “interpolate” between the two degenerate vacua. It turns out that these are not the only nonlinear configurations possible. The full arguments will be presented in the next section.
$^1$ In the (CH)$_x$ case the lattice distortion corresponds to the formation of alternating single and double bonds.
11.2 Solitons and polarons in (CH)$_x$
11.2.1 A continuum approximation
The original theoretical treatment of solitons in polyacetylene was given by Su, Schrieffer and Heeger [31], who wrote down the electron-phonon Hamiltonian of the previous section (SSH Hamiltonian). Here, I will present an alternative formulation, due to Takayama, Lin-Liu and Maki (TLM) [32], which has the advantage of being more tractable analytically.
The objective of the theory is to look for exact, nonlinear configurations of the electron-phonon theory. Assuming that any such configurations are obtainable as smooth spatial variations from the basic dimerization pattern (note the analogy with antiferromagnetic solitons!), it is reasonable to write down the Ansatz
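\[
c_n = e^{ik_Fna}\,\hat u_n - i\,e^{-ik_Fna}\,\hat v_n
\]
for the operator $c_n$; the idea is that we can approximate the operators $\hat u_n$, $\hat v_n$ by continuum field operators, i.e. $\hat u_n \to \sqrt{a}\,\hat u(x)$; this translates
\[
\{c_n, c^\dagger_{n'}\} = \delta_{n,n'} \;\longrightarrow\; \{\hat u(x), \hat u^\dagger(x')\} = \delta(x-x')
\]
and
\[
\hat u_{n+1} \longrightarrow \sqrt{a}\left(\hat u + a\frac{\partial\hat u}{\partial x}\right), \qquad
\hat u^\dagger_{n+1} \longrightarrow \sqrt{a}\left(\hat u^\dagger - a\frac{\partial\hat u^\dagger}{\partial x}\right) .
\]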
The lattice distortion also forms a smooth variation with respect to the dimerization pattern,
\[
y_n = (-1)^n\,\frac{1}{4\alpha}\,\Delta(x) .
\]
Furthermore, it should be clear that only the electrons which are near the Fermi level may contribute to the physics; for $k \approx k_F$ I can approximate the dispersion relation (measuring $k$ from $k_F$) by
\[
-2t_0\cos[(k \pm k_F)a] = \pm 2t_0\sin ka \approx \pm 2t_0\,ka = \pm\hbar v_F k ,
\]
where $\hbar v_F = 2at_0$.
Under these assumptions, the electronic part of the SSH Hamiltonian, including the electron-phonon interactions, transforms to
\[
H_0 = \int dx\,\hat\Psi^\dagger(x)\left[-i\hbar v_F\sigma_3\frac{\partial}{\partial x} + \Delta(x)\sigma_1\right]\hat\Psi(x) \tag{11.13}
\]
where
\[
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\]
are Pauli matrices, and
\[
\hat\Psi(x) = \begin{pmatrix} \hat u(x) \\ \hat v(x) \end{pmatrix} .
\]
The lattice part is
\[
H_L = \frac{\omega_Q^2}{2g^2}\int dx\,\Delta^2(x) \tag{11.14}
\]
where, conforming to standard field theoretic notation, $\omega_Q^2 = 4K/M$ and $g = 4\alpha(a/M)^{1/2}$.
I now look for the ground state of the Hamiltonian, allowing for any smooth deformation $\Delta(x)$ of the lattice. Using standard second-quantized notation for the operators
\[
\hat u(x) = \sum_l u_l(x)A_l , \qquad \hat v(x) = \sum_l v_l(x)B_l ,
\]
I try to find the set of normalized one-electron states
\[
\Psi_l(x) = \begin{pmatrix} u_l(x) \\ v_l(x) \end{pmatrix}
\]
and the deformation field $\Delta(x)$ which minimize the ground state energy
\[
\langle 0|H|0\rangle = \sum_l\int dx\,\Psi_l^*(x)\left[-i\hbar v_F\sigma_3\frac{\partial}{\partial x} + \Delta(x)\sigma_1\right]\Psi_l(x) + \frac{\omega_Q^2}{2g^2}\int dx\,\Delta^2(x) .
\]
This is a variational problem subject to the constraints imposed by the normalization of one-electron states
\[
\int dx\,\Psi_l^*(x)\Psi_l(x) = 1 \quad \forall\, l ;
\]
it can be worked out by unrestricted minimization of
\[
\langle 0|H|0\rangle - \sum_l\epsilon_l\int dx\,\Psi_l^*(x)\Psi_l(x)
\]
and subsequent determination of the Lagrange multipliers $\epsilon_l$ to fit the normalization constraint. The procedure leads to the Bogoliubov-de Gennes equations, first derived in the context of the theory of superconductivity:
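\[
\left[-i\hbar v_F\sigma_3\frac{\partial}{\partial x} + \Delta(x)\sigma_1\right]\Psi_l(x) = \epsilon_l\Psi_l(x)
\]
\[
\sum_l\Psi_l^*(x)\sigma_1\Psi_l(x) + \frac{\omega_Q^2}{g^2}\Delta(x) = 0 \tag{11.15}
\]
or, in component form,
\[
-i\hbar v_F\frac{\partial}{\partial x}u_l + \Delta(x)v_l = \epsilon_lu_l
\]
\[
i\hbar v_F\frac{\partial}{\partial x}v_l + \Delta(x)u_l = \epsilon_lv_l
\]
\[
\sum_l\left[u_l^*v_l + u_lv_l^*\right] + \frac{\omega_Q^2}{g^2}\Delta(x) = 0 . \tag{11.16}
\]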
The first of (11.15) comes from extremization with respect to the electronic wave function $\Psi_l^*(x)$, and the second from extremization with respect to the displacement field $\Delta(x)$. Practically, one treats $\Delta(x)$ as a parameter, describing a class of displacements, e.g. dimerization, soliton-like pattern, etc., and computes the total energy corresponding to the particular displacement class,
\[
E = \sum_l\int dx\,\Psi_l^*(x)\,\epsilon_l\,\Psi_l(x) + \frac{\omega_Q^2}{2g^2}\int dx\,\Delta^2(x)
= \sum_l\epsilon_l + \frac{\omega_Q^2}{2g^2}\int dx\,\Delta^2(x) ; \tag{11.17}
\]
note that the sum runs over all occupied states (factor 2 from spin implicit!).
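It will prove useful to recast the electronic part of the BdG equations by defining $f_l^{(\pm)} = u_l \pm iv_l$, as
\[
-i\hbar v_F\frac{\partial f_l^{(-)}}{\partial x} + i\Delta f_l^{(-)} = \epsilon_lf_l^{(+)}
\]
\[
-i\hbar v_F\frac{\partial f_l^{(+)}}{\partial x} - i\Delta f_l^{(+)} = \epsilon_lf_l^{(-)} . \tag{11.18}
\]
If $\epsilon_l \neq 0$, it is possible to use the first of these equations to express $f^{(+)}$ in terms of $f^{(-)}$; inserting the result in the second equation gives
\[
\left[\hbar^2v_F^2\frac{\partial^2}{\partial x^2} + \epsilon_l^2 - \Delta^2(x) - \hbar v_F\frac{\partial\Delta(x)}{\partial x}\right]f_l^{(-)}(x) = 0 . \tag{11.19}
\]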
11.2.2 Dimerization
The continuum theory describes the Peierls instability. This can be seen by inserting a constant displacement field $\Delta$ in (11.19). The solutions are plane waves
\[
f_q^{(-)}(x) = N_qe^{iqx}
\]
with
\[
\epsilon_q = \pm\sqrt{\Delta^2 + (\hbar v_Fq)^2} ,
\]
and $N_q$ a normalization constant to be determined. The total energy (electronic ground state plus deformation energy) is
\[
2\sum_q\epsilon_q + \frac{\omega_Q^2}{2g^2}L\Delta^2
\]
where the factor 2 comes from the spin states and $L$ is the length of the chain. In order to obtain the excess energy due to dimerization we must subtract the energy of the uniform state. This gives a dimerization energy
\[
E(\Delta) = -2\,\frac{L}{2\pi}\int_{-\Lambda}^{\Lambda}dq\left[\sqrt{\Delta^2 + (\hbar v_Fq)^2} - \hbar v_F|q|\right] + \frac{L\Delta^2}{\pi\lambda\hbar v_F} \tag{11.20}
\]
where, in the first term we have used $\sum_q \to \frac{L}{2\pi}\int dq$ and introduced a cutoff $\Lambda$ to treat ultraviolet divergences, and in the second term we have used the dimensionless coupling constant $\lambda = \frac{2}{\pi v_F\hbar}\frac{g^2}{\omega_Q^2}$. Introducing the dimensionless variable $z = \hbar v_Fq/\Delta$, we obtain
\[
\begin{aligned}
E(\Delta) &= -\frac{2L}{\pi}\frac{\Delta^2}{\hbar v_F}\int_0^{z_m}dz\left[\sqrt{1+z^2} - z\right] + \frac{L\Delta^2}{\pi\lambda\hbar v_F} \\
&= -\frac{2L}{\pi}\frac{\Delta^2}{\hbar v_F}\,\frac{1}{2}\left[z_m\sqrt{1+z_m^2} + \ln\left(z_m + \sqrt{1+z_m^2}\right) - z_m^2\right] + \frac{L\Delta^2}{\pi\lambda\hbar v_F} \\
&\approx -\frac{2L}{\pi}\frac{\Lambda}{W}\Delta^2\left[\frac{1}{2} - \frac{1}{\lambda} + \ln\frac{W}{\Delta}\right]
\end{aligned} \tag{11.21}
\]
where $z_m = \hbar v_F\Lambda/\Delta = W/(2\Delta)$ in terms of the bandwidth $W = 2\hbar v_F\Lambda$ which comes with the finite cutoff. The approximation should hold as long as the gap is small compared to the bandwidth. The dimerization energy (11.21) has a minimum at
\[
\Delta_0 = We^{-1/\lambda} ,
\]
corresponding to a lowering of the total energy by an amount $-L\Delta_0^2/(2\pi\hbar v_F)$. The ground state will therefore be dimerized, just as in the discrete version of the model. The gap which opens at the Fermi level is $2\Delta_0$. This is the energy scale (needed to create an electron-hole pair) with which other, nonlinear elementary excitations should be compared.
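To see how good the logarithmic approximation in (11.21) is, one can compare it with the finite-cutoff expression on the line above it. The sketch below assumes units $\hbar v_F = \Lambda = L = 1$ (so that $W = 2$); the helper names and the golden-section minimizer are illustrative assumptions, not taken from the text.

```python
import math

def energy_exact(delta, lam, cutoff=1.0):
    """Second line of Eq. (11.21) per unit length, in units hbar*v_F = 1."""
    zm = cutoff / delta
    root = math.sqrt(1.0 + zm * zm)
    # zm*root - zm**2 is rewritten as zm/(root + zm) to avoid cancellation.
    bracket = zm / (root + zm) + math.asinh(zm)
    return -(delta * delta / math.pi) * bracket + delta * delta / (math.pi * lam)

def energy_log(delta, lam, cutoff=1.0):
    """Last line of Eq. (11.21): leading terms for delta << W."""
    width = 2.0 * cutoff
    bracket = 0.5 - 1.0 / lam + math.log(width / delta)
    return -(delta * delta / math.pi) * bracket

def minimize(func, lo, hi, iterations=200):
    """Golden-section search for the minimum of a unimodal function."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iterations):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if func(c) < func(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

if __name__ == "__main__":
    lam = 0.4
    gap_formula = 2.0 * math.exp(-1.0 / lam)  # Delta_0 = W exp(-1/lambda)
    gap_exact = minimize(lambda d: energy_exact(d, lam), 1e-4, 1.0)
    print(gap_exact, gap_formula, energy_log(gap_formula, lam))
```

For $\lambda = 0.4$ the two gaps differ by less than one percent, consistent with the statement that the approximation holds when the gap is small compared to the bandwidth.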
Figure 11.2 (energy vs. $q$): Electronic energy bands in the dimerized state of the TLM model ($u$, $v$); also shown are the bands in the undimerized case (dotted straight lines). The electronic energy spectra in the discrete lattice (SSH) case (dashed curves) are shown for comparison; note that the negative wavevectors of Fig. 11.1 have been translated by an amount $2\pi/a$ in order to bring the gap to zero wavevector.
11.2.3 The soliton
The Bogoliubov-de Gennes equations turn out to be exactly solvable for the class of lattice deformations described by
\[
\Delta(x) = \Delta_0\tanh\frac{x}{\xi} .
\]
The tanh Ansatz, introduced in (11.19), gives
\[
\left[\Delta_0^2\xi_0^2\frac{\partial^2}{\partial x^2} + \epsilon_l^2 - \Delta_0^2 + \Delta_0^2\left(1 - \frac{\xi_0}{\xi}\right)\mathrm{sech}^2\frac{x}{\xi}\right]f_l^{(-)}(x) = 0 , \tag{11.22}
\]
where $\xi_0 = \hbar v_F/\Delta_0$ is a length characteristic of the dimerized state. The above equation is analytically solvable in terms of hypergeometric functions. Here, I will only show the solution in the special case where the characteristic width of the kink $\xi$ is equal to $\xi_0$ (it turns out [32] that this produces the soliton with the minimal energy). In this special case, the $\mathrm{sech}^2$ term in (11.22) disappears, and the solutions are plane waves,
\[
f_q^{(-)}(x) = N_qe^{iqx}
\]
with
\[
\epsilon_q = \pm\Delta_0\sqrt{1 + \xi_0^2q^2} ,
\]
i.e. the $f_q^{(-)}$ solutions are identical with those of the exactly dimerized state. However, in the case of the tanh deformation, the $f_q^{(+)}$ solution derived from (11.18) is
\[
f_q^{(+)} = \frac{\hbar v_Fq + i\Delta(x)}{\epsilon_q}\,f_q^{(-)} = \pm\frac{q\xi_0 + i\tanh\frac{x}{\xi_0}}{\sqrt{1 + \xi_0^2q^2}}\,N_qe^{iqx} \tag{11.23}
\]
where the $\pm$ refers to the sign of the energy. At $|x| \gg \xi_0$ this is effectively a plane wave; however, there is a phase difference:
\[
\lim_{x\to-\infty}f_q^{(+)}(x) \propto e^{iqx - i\delta(q)/2}, \qquad
\lim_{x\to+\infty}f_q^{(+)}(x) \propto e^{iqx + i\delta(q)/2} \tag{11.24}
\]
where $\delta(q)/2 = \arctan(1/q\xi_0)$.
In addition, we can now investigate whether the BdG equations (11.18) admit zero-energy solutions (something that could be immediately excluded in the dimerized case). It can be readily seen that this is the case here, with
\[
f^{(-)} = \cosh\frac{x}{\xi_0}, \qquad f^{(+)} = \mathrm{sech}\frac{x}{\xi_0} .
\]
The $f^{(-)}$ state is not normalizable; but the $f^{(+)}$ is a legitimate localized state with energy exactly at midgap. I will return to its interpretation shortly.
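The zero-energy statement is easy to confirm numerically. At $\epsilon = 0$ the second of (11.18) reduces, after division by $-i\Delta_0$, to $\xi_0\,\partial_xf^{(+)} + \tanh(x/\xi_0)\,f^{(+)} = 0$. The sketch below assumes units with $\xi_0 = 1$ and uses a central finite difference; all function names are illustrative.

```python
import math

XI0 = 1.0  # xi_0 = hbar*v_F/Delta_0, used as the unit of length

def zero_mode(x):
    """Candidate midgap state f^(+)(x) = sech(x/xi_0)."""
    return 1.0 / math.cosh(x / XI0)

def residual(x, h=1e-5):
    """xi_0 f' + tanh(x/xi_0) f, which vanishes for a zero-energy solution."""
    derivative = (zero_mode(x + h) - zero_mode(x - h)) / (2.0 * h)
    return XI0 * derivative + math.tanh(x / XI0) * zero_mode(x)

def norm_squared(half_width=30.0, steps=60000):
    """Trapezoidal estimate of the integral of f^(+)(x)^2 over the line."""
    h = 2.0 * half_width / steps
    total = 0.5 * (zero_mode(-half_width) ** 2 + zero_mode(half_width) ** 2)
    for i in range(1, steps):
        total += zero_mode(-half_width + i * h) ** 2
    return total * h

if __name__ == "__main__":
    worst = max(abs(residual(0.1 * k)) for k in range(-50, 51))
    print(worst, norm_squared())
```

The residual stays at the level of the finite-difference error, and the norm integral converges to $2\xi_0$, which is what makes the state normalizable.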
The modifications in the electronic spectrum, i.e. the phase shift $\delta(q)$ of the extended states and the appearance of a localized state at midgap, have important physical consequences. Phase shifts are important because, as I first discussed in the context of scalar field theories, they modify the density of states in $q$-space. Let us recall: if we demand periodic boundary conditions on a chain of length $L$, the phase of the wave function on the left end should differ from the phase on the right end by a multiple of $2\pi$. Thus
\[
q_nL = 2\pi n \quad (n = 0, \pm1, \ldots) \quad \text{if } \Delta = 0 \tag{11.25}
\]
\[
\tilde q_nL + \delta(\tilde q_n) = 2\pi n \quad (n = 0, \pm1, \ldots) \quad \text{if } \Delta = \Delta_0\tanh\frac{x}{\xi_0} \tag{11.26}
\]
For large $L$, this means that the wavevector $q$ of an electronic state is shifted by an amount $\tilde q_n - q_n = -\delta(q_n)/L$. This is true with one exception. Eq. (11.26) has no solutions if $n = 0$. In other words the zero wavevector state does not exist in the presence of the soliton. How does this modify the energy of the coupled electron-phonon system, compared with the energy of the [dimerized] ground state? The energy difference to be calculated is
\[
2\cdot\frac{1}{2}\left[\sum_{n\neq0}\left(\epsilon_{\tilde q_n} - \epsilon_{q_n}\right) + \{0 - \epsilon_0\}\right] + \frac{1}{\pi\lambda\hbar v_F}\int dx\left[\Delta_0^2\tanh^2\frac{x}{\xi_0} - \Delta_0^2\right] .
\]
The first term comes from the shift in electronic states discussed above. Note that all contributions refer to occupied states, i.e. states of negative energy. The factor 2 comes from taking the spin into account. The factor 1/2 is there because electronic states are linear superpositions of $f_q^{(-)}$ and $f_q^{(+)}$ and only the latter are shifted compared to the pure dimerized state. The sum does not include the $n = 0$ state. The term in curly brackets expresses the absence of the zero wavevector state from the deformed state and its presence in the dimerized state. The second term is the change in the elastic energy due to the deformation. Finally, note that the midgap state does not appear in this calculation because it has zero energy.
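Transforming the sum into an integral, using a Taylor expansion $\epsilon_{\tilde q} - \epsilon_q \approx \epsilon'_q(\tilde q - q) = -\epsilon'_q\delta(q)/L$, and noting that $\epsilon'_q$ and $\delta$ are both odd functions of $q$, I obtain a soliton energy
\[
\begin{aligned}
E_s &= \frac{L}{2\pi}\,2\int_{0^+}^{\Lambda}dq\,\epsilon'_q\,\frac{-\delta(q)}{L} - \epsilon_0 - \frac{2}{\pi\lambda}\Delta_0 \\
&= -\frac{1}{\pi}\,\epsilon_q\delta(q)\Big|_{0^+}^{\Lambda} + \frac{1}{\pi}\int_{0^+}^{\Lambda}dq\,\epsilon_q\delta'(q) + \Delta_0 - \frac{2}{\pi\lambda}\Delta_0 \\
&\approx \frac{\Delta_0}{\pi}\left[2\,\frac{\sqrt{1 + \Lambda^2\xi_0^2}}{\Lambda\xi_0} - \pi\right] + \frac{2\Delta_0\xi_0}{\pi}\int_{0^+}^{\Lambda}\frac{dq}{\sqrt{1 + q^2\xi_0^2}} + \Delta_0 - \frac{2}{\pi\lambda}\Delta_0 \\
&\approx \frac{2}{\pi}\Delta_0 + \frac{2}{\pi}\Delta_0\ln(2\Lambda\xi_0) - \frac{2}{\pi\lambda}\Delta_0 = \frac{2}{\pi}\Delta_0 ,
\end{aligned} \tag{11.27}
\]
where the approximation signs mean leading terms in cutoff-dependent quantities. Within the general context of continuum theory, the expression obtained is exact.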
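The cancellation of the cutoff dependence in (11.27) can be illustrated numerically. The sketch below evaluates the second line of (11.27) at finite cutoff, assuming units $\Delta_0 = \xi_0 = 1$, the occupied branch $\epsilon_q = -\sqrt{1+q^2}$, the phase shift $\delta(q) = 2\arctan(1/q)$, and the relation $\ln(2\Lambda\xi_0) = 1/\lambda$ that follows from $\Delta_0 = We^{-1/\lambda}$. The Simpson integrator and function names are illustrative.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0

def soliton_energy(lam):
    """E_s/Delta_0 from the second line of Eq. (11.27) at finite cutoff."""
    cutoff = 0.5 * math.exp(1.0 / lam)            # from ln(2*Lambda*xi_0) = 1/lambda
    eps = lambda q: -math.sqrt(1.0 + q * q)       # occupied (valence) branch
    delta = lambda q: 2.0 * math.atan2(1.0, q)    # phase shift, equals pi at q = 0+
    ddelta = lambda q: -2.0 / (1.0 + q * q)       # derivative of the phase shift
    boundary = -(eps(cutoff) * delta(cutoff) - eps(0.0) * delta(0.0)) / math.pi
    integral = simpson(lambda q: eps(q) * ddelta(q), 0.0, cutoff) / math.pi
    return boundary + integral + 1.0 - 2.0 / (math.pi * lam)

if __name__ == "__main__":
    print(soliton_energy(0.2), 2.0 / math.pi)
```

For weak coupling the cutoff is large and the result approaches $2/\pi$; the residual difference scales as $1/\Lambda^2$.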
Let me summarize what has been derived: A lattice deformation of a tanh type (kink) can exist in the coupled electron-phonon system. It modifies the electronic spectrum, taking “half a state” away from the top of the valence band (the $q = 0$ state corresponding to one of the two branches of solutions) and creating a localized state (localized around the center of the kink) at midgap. What remains in the Fermi sea consists of paired states, i.e. both spin up and spin down states are occupied. However, the localized state at midgap is unpaired. Therefore, the soliton excitation (which should be understood to consist of the lattice deformation, the localized state at midgap, and the small shifts in $q$-values of states in the Fermi sea, the so-called “backflow”) has a spin 1/2 and a charge 0 (owing to overall electrical neutrality). The energy $2\Delta_0/\pi$ needed to create a soliton is less than $\Delta_0$. Or, in terms of what really happens: The energy $4\Delta_0/\pi$ needed to create a kink-antikink pair of solitons is less than the $2\Delta_0$ needed to create an electron-hole pair. This is why solitons are of practical importance in determining the conductivity of polyacetylene.
A further comment: it is in principle possible to feed an extra electron at the midgap state, thus obtaining a charged object with $Q = -|e|$ and spin 0. Or one can remove the electron from the midgap state, creating a soliton with positive charge and zero spin. This is the physical principle behind doping in polyacetylene and the unusual spin/charge relationships observed experimentally.
11.2.4 The polaron
The soliton solution interpolates from the ABAB... to the BABA... dimerization pattern. Is it possible to have a local deformation which starts off at the ABAB... dimerization pattern, makes a possibly large change, perhaps goes off to the BABA... pattern, and returns to the original ABAB... pattern? In other words, can we find deformation patterns of the type
\[
\Delta(x) = \Delta_0 - C\left\{\tanh[\kappa(x + x_0)] - \tanh[\kappa(x - x_0)]\right\}
\]
which will solve the BdG equations? A way to achieve this would be to adjust the parameters so that the effective potential term in (11.19) becomes a pure $\mathrm{sech}^2$ term. Indeed, after some rearrangements, it turns out that the choice $C = \Delta_0\tanh(2\kappa x_0)$ leads to
\[
\begin{aligned}
\Delta^2 + \hbar v_F\Delta' = \Delta_0^2 &- \Delta_0^2\tanh(2\kappa x_0)\left[\tanh(2\kappa x_0) + \kappa\xi_0\right]\mathrm{sech}^2[\kappa(x + x_0)] \\
&- \Delta_0^2\tanh(2\kappa x_0)\left[\tanh(2\kappa x_0) - \kappa\xi_0\right]\mathrm{sech}^2[\kappa(x - x_0)] .
\end{aligned}
\]
It is possible to make either one of the two $\mathrm{sech}^2$ functions disappear; the choice which leads to an attractive effective potential is
\[
\tanh(2\kappa x_0) = \kappa\xi_0 . \tag{11.28}
\]
As long as $\kappa$ does not exceed $1/\xi_0$, this condition will specify an $x_0$ as a function of $\kappa$. Let me therefore denote acceptable parameter values as $\kappa = \xi_0^{-1}\sin\theta$. The effective potential now has a single-well form,
\[
\Delta^2 + \hbar v_F\Delta' = \Delta_0^2\left\{1 - 2\kappa^2\xi_0^2\,\mathrm{sech}^2[\kappa(x + x_0)]\right\} \tag{11.29}
\]
with which I may recast (11.19), using the dimensionless variable $y = \kappa(x + x_0)$ and the dimensionless eigenvalue $r_l = (\epsilon_l^2 - \Delta_0^2)/(\Delta_0\kappa\xi_0)^2$, as
\[
\left[-\frac{\partial^2}{\partial y^2} - 2\,\mathrm{sech}^2y\right]\tilde f_l^{(-)}(y) = r_l\,\tilde f_l^{(-)}(y) . \tag{11.30}
\]
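The reduction of the two-$\mathrm{sech}^2$ potential to the single well (11.29) under condition (11.28) can be verified pointwise. The sketch below assumes units $\Delta_0 = \xi_0 = \hbar v_F = 1$ and differentiates the polaron profile analytically; the function name and argument choices are illustrative.

```python
import math

def polaron_check(kappa, x):
    """Return both sides of Eq. (11.29) at x, in units Delta_0 = xi_0 = hbar*v_F = 1."""
    x0 = math.atanh(kappa) / (2.0 * kappa)   # Eq. (11.28): tanh(2*kappa*x0) = kappa*xi_0
    c = math.tanh(2.0 * kappa * x0)          # amplitude C/Delta_0
    t_plus = math.tanh(kappa * (x + x0))
    t_minus = math.tanh(kappa * (x - x0))
    gap = 1.0 - c * (t_plus - t_minus)
    # The derivative of tanh is sech^2 = 1 - tanh^2.
    dgap = -c * kappa * ((1.0 - t_plus ** 2) - (1.0 - t_minus ** 2))
    lhs = gap * gap + dgap
    rhs = 1.0 - 2.0 * kappa ** 2 * (1.0 - t_plus ** 2)
    return lhs, rhs

if __name__ == "__main__":
    kappa = math.sin(math.pi / 4.0)  # theta = pi/4, the singly charged polaron
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
        print(x, polaron_check(kappa, x))
```

Both sides agree to rounding error for any $0 < \kappa\xi_0 < 1$, which is the range in which (11.28) can be solved for $x_0$.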
Localized eigenstates of the BdG equations
The recasting is useful in order to recognize the prefactor in the potential as $n(n+1)$ with $n = 1$, which gives a single localized eigenfunction at $r_l = -1$, corresponding to
\[
\epsilon_b = \pm\Delta_0\sqrt{1 - \kappa^2\xi_0^2} = \pm\Delta_0\cos\theta .
\]
Note that the bound states (provided they exist, which we still have to establish by finding acceptable values of $\kappa$ or $\theta$) are not at midgap. Returning to the original units, I write the bound state eigenfunction as
\[
f_b^{(-)}(x) = N_b\,\mathrm{sech}[\kappa(x + x_0)] \tag{11.31}
\]
and from the BdG equation ...
\[
f_b^{(+)}(x) = \pm iN_b\,\mathrm{sech}[\kappa(x - x_0)] \tag{11.32}
\]
where the $\pm$ sign matches the sign of the energy. The $u$-$v$ eigenstates corresponding to the two energies $\pm\Delta_0\cos\theta$ are
\[
u_b^{\pm}(x) = \frac{N_b}{2}\left[\mathrm{sech}\,\kappa(x + x_0) \pm i\,\mathrm{sech}\,\kappa(x - x_0)\right]
\]
\[
v_b^{\pm}(x) = \frac{N_b}{2}\left[i\,\mathrm{sech}\,\kappa(x + x_0) \pm \mathrm{sech}\,\kappa(x - x_0)\right] . \tag{11.33}
\]
Their form shows that there is equal probability for the localized electron to be near $x_0$ or $-x_0$.
Extended eigenstates of the BdG equations
The extended states of the BdG equations are
\[
f_q^{(-)}(x) = N_q\left[-iq + \kappa\tanh\kappa(x + x_0)\right]e^{iqx}
\]
\[
f_q^{(+)}(x) = \pm N_q\,\frac{q\xi_0 + i}{\sqrt{1 + q^2\xi_0^2}}\left[-iq + \kappa\tanh\kappa(x - x_0)\right]e^{iqx}
\]
where again the $\pm$ sign matches the sign of the energy. The phase shift, in both cases, is
\[
\delta(q) = 2\,\mathrm{arccot}\frac{q}{\kappa} . \tag{11.34}
\]
It is shown in Fig. . It runs, just like in the soliton case, from zero to $-\pi$, makes a jump at $q = 0$ from $-\pi$ to $\pi$, and then drops off to zero as $q$ approaches infinity. The similarity with the soliton case is deceptive. This phase shift is really the entire physical shift of the eigenfunction, not of half the eigenfunction. Both $f$'s are phase-shifted by this amount (therefore the physical eigenstates $u$ and $v$ as well). As a result, the $q = 0$ state disappears entirely (not by a half!) from the valence band. There is, just like in the soliton case, a backflow in the Fermi sea, which redistributes $q$-vectors as a result of the phase shift.
The total energy
The states which appear in the gap can in principle be occupied singly, doubly, or not at all. But because the energies of the localized states are now nonzero, this will affect the total energy of the object. Let $n_+, n_- = 0, 1, 2$ be, respectively, the populations of the $\pm\Delta_0\cos\theta$ localized energy states. The total energy will again consist of an electronic part
\[
2\sum_{n\neq0}\left(\epsilon_{\tilde q_n} - \epsilon_{q_n}\right) + (n_+ - n_-)\Delta_0\cos\theta - (-2\Delta_0)
\]
and a lattice deformation part. After some calculations analogous to the ones for the soliton, this sums up to
\[
\frac{E_p}{\Delta_0} = \frac{4}{\pi}\sin\theta + \frac{4}{\pi}\left[\frac{\pi}{2} - \theta + \frac{\pi}{4}(n_+ - n_-)\right]\cos\theta
\]
Considered as a function of $\theta$, the total energy has a minimum at
\[
\theta_0 = \frac{\pi}{4}(n_+ - n_- + 2) .
\]
The energy of the polaron
\[
E_p = \frac{4}{\pi}\Delta_0\sin\theta_0 \tag{11.35}
\]
will therefore depend on the occupation of the gap states. The following cases can be distinguished:
• $n_- - n_+ = 2$. We would expect this to be the lowest energy state of the polaron, where the two electrons excluded from the valence band end up, paired, in the lowest localized state ($n_- = 2$, $n_+ = 0$). This state has $\theta_0 = 0$, i.e. $\kappa = 0$, $E_p = 0$. The lowest localized state in reality has returned to the valence band. The deformation corresponds to a constant $\Delta_0$. This “polaron” is really nothing but the pure dimerized state.
• $n_+ = n_-$, $\theta_0 = \pi/2$. The energy is $4\Delta_0/\pi$, the separation $x_0$ becomes infinite. This is really a kink/antikink pair with its components entirely separated.
• $n_- - n_+ = 1$, $\theta_0 = \pi/4$. It follows that $\kappa\xi_0 = 2^{-1/2}$. The resulting separation $x_0$ is finite, $x_0/\xi_0 = \mathrm{arctanh}(2^{-1/2})/2^{1/2}$. The energy is equal to $(2\sqrt{2}/\pi)\Delta_0 \approx 0.9\Delta_0$. There are two ways to form this polaron: either by $n_- = 2$, $n_+ = 1$, i.e. the lower bound state is doubly, and the upper singly occupied, which makes it a charged excitation with an unpaired spin ($Q = -|e|$, $S = 1/2$); or by $n_- = 1$, $n_+ = 0$, i.e. the lower bound state is singly occupied and the upper is empty; this would be a “hole”-polaron, with $Q = |e|$, $S = 1/2$. Note that the spin/charge relationships of the polaron are the same as in ordinary electrons and holes. However, the energy to create a pair of polarons is about 10% lower than that required to create an electron/hole pair.
There are no other possibilities. What the various combinations of the second case imply in terms of donor/acceptor character and spin/charge properties is summarized in Fig. ..
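The three cases above can be reproduced by minimizing $E_p(\theta)$ directly on $0 \le \theta \le \pi/2$. The sketch below is a simple grid search in units of $\Delta_0$; the function names and the grid resolution are illustrative assumptions.

```python
import math

def polaron_energy(theta, dn):
    """E_p/Delta_0 as a function of theta for dn = n_plus - n_minus."""
    bracket = 0.5 * math.pi - theta + 0.25 * math.pi * dn
    return (4.0 / math.pi) * (math.sin(theta) + bracket * math.cos(theta))

def best_theta(dn, points=20001):
    """Grid search for the minimum on 0 <= theta <= pi/2."""
    grid = [0.5 * math.pi * i / (points - 1) for i in range(points)]
    return min(grid, key=lambda theta: polaron_energy(theta, dn))

if __name__ == "__main__":
    for dn in (-2, -1, 0):
        theta0 = best_theta(dn)
        print(dn, theta0, polaron_energy(theta0, dn))
```

The search returns $\theta_0 = 0$, $\pi/4$ and $\pi/2$ for $n_+ - n_- = -2, -1, 0$, with energies $0$, $2\sqrt{2}/\pi$ and $4/\pi$ in units of $\Delta_0$, in agreement with $\theta_0 = \frac{\pi}{4}(n_+ - n_- + 2)$ and (11.35).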
12 Solitons in nonlinear optics
12.1 Background: Interaction of light with matter, Maxwell-Bloch equations
12.1.1 Semiclassical theoretical framework and notation
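The propagation of electromagnetic waves through a material (gaseous) medium is modeled, at a semiclassical level, as follows:
• The medium is considered as an assembly of quantum-mechanical two-level systems (2LS) described by a Hamiltonian
\[
H_0 = \frac{1}{2}\hbar\omega_0\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\]
with basis states
\[
|\uparrow\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |\downarrow\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}
\]
and a general (superposition) state
\[
|\Psi\rangle = \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \alpha|\uparrow\rangle + \beta|\downarrow\rangle .
\]
The density of 2LS is $n_0$. Each 2LS carries an electric dipole moment. The corresponding dipole moment operator is
\[
\hat{\mathbf p} = \mathbf q\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} .
\]
• The electromagnetic field is treated at a classical level. The electric field, propagating along the $x$-direction, satisfies Maxwell's wave equation
\[
\left[\frac{\partial^2}{\partial t^2} - c^2\frac{\partial^2}{\partial x^2}\right]\mathbf E(x,t) = -4\pi\frac{\partial^2}{\partial t^2}\mathbf P(x,t) \tag{12.1}
\]
where the polarization is given by
\[
\mathbf P = n_0\langle\Psi|\hat{\mathbf p}|\Psi\rangle = n_0\,\mathbf q\underbrace{(\alpha^*\beta + \beta^*\alpha)}_{=2\mathrm{Re}(\alpha^*\beta)\,\equiv\,P_+} . \tag{12.2}
\]
• The 2LSs interact with electromagnetic radiation via a dipole interaction
\[
H_1 = -\hat{\mathbf p}\cdot\mathbf E .
\]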
12.1.2 Dynamics
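The quantum mechanical wavefunction evolves in time according to
\[
i\hbar\frac{\partial}{\partial t}|\Psi\rangle = (H_0 + H_1)|\Psi\rangle ,
\]
or, in component form,
\[
i\hbar\dot\alpha = \frac{\hbar\omega_0}{2}\alpha - (\mathbf q\cdot\mathbf E)\beta \tag{12.3}
\]
\[
i\hbar\dot\beta = -\frac{\hbar\omega_0}{2}\beta - (\mathbf q\cdot\mathbf E)\alpha . \tag{12.4}
\]
Let
\[
P_- = -i(\alpha^*\beta - \beta^*\alpha) = 2\,\mathrm{Im}(\alpha^*\beta)
\]
and
\[
Z = P_+ + iP_- = 2\alpha^*\beta
\]
and
\[
\mathcal N = |\alpha|^2 - |\beta|^2 .
\]
Then one can explicitly verify that
\[
i\hbar\dot Z = -\hbar\omega_0Z - 2(\mathbf q\cdot\mathbf E)\mathcal N ,
\]
or, taking real and imaginary parts,
\[
\frac{\partial P_+}{\partial t} = -\omega_0P_- \tag{12.5}
\]
\[
\frac{\partial P_-}{\partial t} = \omega_0P_+ + \frac{2}{\hbar}(\mathbf q\cdot\mathbf E)\mathcal N . \tag{12.6}
\]
Moreover, multiplying (12.3) by $\alpha^*$, (12.4) by $\beta^*$, and taking the imaginary part of the difference of the two equations, we obtain
\[
\frac{\partial\mathcal N}{\partial t} = -\frac{2}{\hbar}(\mathbf q\cdot\mathbf E)P_- \tag{12.7}
\]
The triad of eqs (12.5), (12.6), (12.7) for the real functions $P_-$, $P_+$, $\mathcal N$ is equivalent to the pair of eqs (12.3), (12.4) for the two complex amplitudes $\alpha$, $\beta$. This is because the complex amplitudes have a constant normalization, which we choose as unity:
\[
|\alpha|^2 + |\beta|^2 = 1
\]
The system of equations (12.1), with the right hand side given by
\[
\ddot{\mathbf P} = n_0\mathbf q\ddot P_+ = -n_0\omega_0\mathbf q\dot P_- = -n_0\omega_0^2\,\mathbf q\,P_+ - \frac{2}{\hbar}n_0\omega_0(\mathbf q\cdot\mathbf E)\,\mathbf q\,\mathcal N , \tag{12.8}
\]
together with (12.5), (12.6) and (12.7), are known as the Maxwell-Bloch (MB) equations. They were originally derived in the context of nuclear magnetic resonance (with the correspondence $S_x \leftrightarrow P_+$, $S_y \leftrightarrow P_-$, $S_z \leftrightarrow \mathcal N$).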
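Because the amplitudes are normalized, the three real variables satisfy $P_+^2 + P_-^2 + \mathcal N^2 = |Z|^2 + \mathcal N^2 = 1$, and equations (12.5) to (12.7) preserve this length. The sketch below integrates them with a classical fourth-order Runge-Kutta step for a prescribed field; the drive $\mathbf q\cdot\mathbf E/\hbar = A\cos\omega_0t$, the parameter values and all function names are assumptions made for illustration only.

```python
import math

def bloch_rhs(t, state, omega0, rabi):
    """Right-hand sides of Eqs. (12.5)-(12.7); rabi(t) stands for q.E(t)/hbar."""
    p_plus, p_minus, inversion = state
    drive = rabi(t)
    return (-omega0 * p_minus,
            omega0 * p_plus + 2.0 * drive * inversion,
            -2.0 * drive * p_minus)

def rk4_step(t, state, dt, omega0, rabi):
    """One classical Runge-Kutta step for the three Bloch variables."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = bloch_rhs(t, state, omega0, rabi)
    k2 = bloch_rhs(t + 0.5 * dt, shifted(state, k1, 0.5 * dt), omega0, rabi)
    k3 = bloch_rhs(t + 0.5 * dt, shifted(state, k2, 0.5 * dt), omega0, rabi)
    k4 = bloch_rhs(t + dt, shifted(state, k3, dt), omega0, rabi)
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def evolve(steps=20000, dt=1e-3, omega0=1.0, amplitude=0.2):
    """Evolve a 2LS that starts in its ground state (P+ = P- = 0, N = -1)."""
    rabi = lambda t: amplitude * math.cos(omega0 * t)  # resonant drive (assumed form)
    state, t = (0.0, 0.0, -1.0), 0.0
    for _ in range(steps):
        state = rk4_step(t, state, dt, omega0, rabi)
        t += dt
    return state

if __name__ == "__main__":
    p_plus, p_minus, inversion = evolve()
    print(p_plus ** 2 + p_minus ** 2 + inversion ** 2, inversion)
```

The printed length stays at unity to within the integration error while the inversion moves away from $-1$, which is the Bloch-vector picture behind the NMR correspondence mentioned above.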
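12.2 Propagation at resonance. Self-induced transparency
12.2.1 Slow modulation of the optical wave
At resonance, the carrier (optical) frequency of the electromagnetic wave coincides with the frequency of the 2LS:
\[
\omega = \omega_0
\]
We look for solutions of the MB equations of the form
\[
E = \frac{\hbar}{q}\,\mathcal E\cos(\underbrace{kx - \omega_0t + \phi}_{\Psi}) \tag{12.9}
\]
where $k = \omega_0/c$ and $\mathcal E$, $\phi$ are slowly varying functions of $x$, $t$; furthermore, we are taking the field to be polarized in the $z$-direction and $\mathbf q = q\hat z$. Using the transformation
\[
P_+ = Q\cos\Psi + T\sin\Psi \tag{12.10}
\]
\[
P_- = T\cos\Psi - Q\sin\Psi \tag{12.11}
\]
one can rewrite (12.5), (12.6), respectively, as
\[
\cos\Psi\left[\frac{\partial Q}{\partial t} + T\frac{\partial\phi}{\partial t}\right] + \sin\Psi\left[\frac{\partial T}{\partial t} - Q\frac{\partial\phi}{\partial t}\right] = 0
\]
\[
\cos\Psi\left[\frac{\partial T}{\partial t} - Q\frac{\partial\phi}{\partial t}\right] - \sin\Psi\left[\frac{\partial Q}{\partial t} + T\frac{\partial\phi}{\partial t}\right] = 2\mathcal E\mathcal N\cos\Psi .
\]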
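The above equations can be further simplified if we (i) multiply the first by $\sin\Psi$, the second by $\cos\Psi$ and then add them, or (ii) multiply the first by $\cos\Psi$, the second by $\sin\Psi$ and then subtract them. The result is
\[
\frac{\partial T}{\partial t} - Q\frac{\partial\phi}{\partial t} = 2\mathcal E\mathcal N\cos^2\Psi
\]
\[
\frac{\partial Q}{\partial t} + T\frac{\partial\phi}{\partial t} = -\mathcal E\mathcal N\sin2\Psi
\]
or, averaging over a period $2\pi/\omega_0$ of the (fast) carrier wave,
\[
\frac{\partial T}{\partial t} - Q\frac{\partial\phi}{\partial t} = \mathcal E\mathcal N \tag{12.12}
\]
\[
\frac{\partial Q}{\partial t} + T\frac{\partial\phi}{\partial t} = 0 . \tag{12.13}
\]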
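Similarly, (12.7) can be rewritten as
\[
\frac{\partial\mathcal N}{\partial t} = -2\mathcal E\cos\Psi\,(-Q\sin\Psi + T\cos\Psi)
\]
which averages to
\[
\frac{\partial\mathcal N}{\partial t} = -\mathcal ET . \tag{12.14}
\]
Finally, keeping only the first term in (12.8), a valid approximation as long as $\omega_0 \gg \mathcal E$, transforms the field equation (12.1) to
\[
\left[\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right]\left[\frac{\partial}{\partial t} - c\frac{\partial}{\partial x}\right]\mathcal E\cos\Psi = 2\omega_0c\,\alpha'\,(Q\cos\Psi + T\sin\Psi) , \tag{12.15}
\]
where
\[
\alpha' = \frac{2\pi n_0q^2\omega_0}{\hbar c} .
\]
At resonance, the left hand side of (12.15) simplifies considerably. We recognize
\[
\left[\frac{\partial}{\partial t} - c\frac{\partial}{\partial x}\right]\mathcal E\cos\Psi \approx 2\omega_0\,\mathcal E\sin\Psi
\]
and
\[
\left[\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right]\mathcal E\sin\Psi = \left[\frac{\partial\mathcal E}{\partial t} + c\frac{\partial\mathcal E}{\partial x}\right]\sin\Psi + \mathcal E\cos\Psi\left[\frac{\partial\phi}{\partial t} + c\frac{\partial\phi}{\partial x}\right] .
\]
Combining these, and matching sine and cosine terms in (12.15) results in
\[
\frac{\partial\mathcal E}{\partial t} + c\frac{\partial\mathcal E}{\partial x} = c\,\alpha'\,T \tag{12.16}
\]
\[
\mathcal E\left[\frac{\partial\phi}{\partial t} + c\frac{\partial\phi}{\partial x}\right] = c\,\alpha'\,Q . \tag{12.17}
\]
The set of 5 equations (12.12), (12.13), (12.14), (12.16), (12.17) describes the slow modulation of the coupled wave-medium system variables $T$, $Q$, $\mathcal N$, $\mathcal E$, $\phi$.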
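12.2.2 Further simplifications: Self-induced transparency
In the approximation of vanishing phase
\[
\phi = 0
\]
(12.17) implies $Q = 0$. This leaves a reduced set of three equations
\[
\frac{\partial T}{\partial t} = \mathcal E\mathcal N , \qquad
\frac{\partial\mathcal N}{\partial t} = -\mathcal ET , \qquad
\frac{\partial\mathcal E}{\partial t} + c\frac{\partial\mathcal E}{\partial x} = c\,\alpha'\,T ,
\]
where the first two have an obvious first integral,
\[
\mathcal N^2 + T^2 = \mathrm{const}
\]
This suggests the parametrization
\[
T = \pm\sin\sigma , \qquad \mathcal N = \pm\cos\sigma ,
\]
which in turn implies, from the first two equations,
\[
\mathcal E = \frac{\partial\sigma}{\partial t} ;
\]
the last equation,
\[
\left[\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right]\frac{\partial\sigma}{\partial t} = \pm c\,\alpha'\sin\sigma
\]
can be cast into a more useful form by introducing new space and time coordinates
\[
\xi = \alpha'x , \qquad \tau = t - \frac{x}{c} .
\]
The transformed version
\[
\frac{\partial^2\sigma}{\partial\xi\,\partial\tau} = \pm\sin\sigma \tag{12.18}
\]
is a form of the Sine-Gordon (SG) equation.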
Propagating solutions. Slowing down of light
The SG equation is completely integrable. It is known to admit multisoliton solutions. Here I will restrict myself to some of the properties of single solitons. A property of (12.18) is that it admits solutions of the form $\sigma(z)$, where
\[
z = a\tau - \frac{\xi}{a} = a\left[t - \frac{x}{c} - \frac{\alpha'x}{a^2}\right] \tag{12.19}
\]
and $a$ is an arbitrary constant ($a^{-1}$ will be identified as the pulse duration). Introducing this type of solution Ansatz to (12.18), and noting that $\partial^2/\partial\xi\,\partial\tau = -d^2/dz^2$ when acting on functions of $z$, leads to
\[
\frac{d^2\sigma}{dz^2} = \mp\sin\sigma . \tag{12.20}
\]
Before I discuss the properties of the solution, let me note that (12.19) can be further rewritten in the form
\[
z = a\left[t - \frac{x}{v}\right] \quad\text{with}\quad \frac{1}{v} = \frac{1}{c} + \frac{\alpha'}{a^2} \tag{12.21}
\]
which implies that light will be slowed down.
The 2π pulse
The choice of the lower sign in (12.18) leads to the well-known kink/antikink solutions of the SG equation,
\[
\sigma = 4\arctan e^{\pm(z - z_0)} .
\]
The resulting field is
\[
\mathcal E = \frac{\partial\sigma}{\partial\tau} = a\frac{d\sigma}{dz} = \pm\frac{2a}{\cosh[a(t - \frac{x}{v} - t_0)]}
\]
and satisfies the sum rule
\[
\int_{-\infty}^{\infty}dt\,\mathcal E(x,t) = \int_{-\infty}^{\infty}d\tau\,\frac{\partial\sigma}{\partial\tau} = \sigma(\infty) - \sigma(-\infty) = 2\pi . \tag{12.22}
\]
The 2LS inversion
\[
\mathcal N = -\cos\sigma = -1 + \frac{2}{\cosh^2[a(t - \frac{x}{v} - t_0)]}
\]
starts off at $-1$ as $t \to -\infty$, increases up to 1 as the pulse reaches the 2LS, and then returns to $-1$ as $t \to \infty$. Thus, the electromagnetic wave brings about a temporary inversion of the 2LS; as the pulse travels further however, the 2LS becomes de-excited and gives the energy back to the field. No net absorption of energy occurs. This is the phenomenon of self-induced transparency.
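The sum rule (12.22) and the closed form of the inversion can be checked with a few lines of code. The sketch below works at a fixed position, absorbs $x/v$ into $t_0$, and uses a trapezoidal rule; function names and integration limits are illustrative assumptions.

```python
import math

def sigma(z):
    """Kink profile of the 2*pi pulse, sigma = 4*arctan(exp(z))."""
    return 4.0 * math.atan(math.exp(z))

def field(t, a=1.0, t0=0.0):
    """Envelope 2a*sech[a(t - t0)] seen at a fixed position (x/v absorbed in t0)."""
    return 2.0 * a / math.cosh(a * (t - t0))

def pulse_area(a=1.0, half_width=40.0, steps=80000):
    """Trapezoidal estimate of the time integral of the envelope."""
    h = 2.0 * half_width / steps
    total = 0.5 * (field(-half_width, a) + field(half_width, a))
    for i in range(1, steps):
        total += field(-half_width + i * h, a)
    return total * h

def inversion(z):
    """2LS inversion N = -cos(sigma) along the pulse."""
    return -math.cos(sigma(z))

if __name__ == "__main__":
    print(pulse_area(1.0), pulse_area(2.5), 2.0 * math.pi)
    print(inversion(-10.0), inversion(0.0), inversion(10.0))
```

The area comes out as $2\pi$ independently of the inverse duration $a$, and the inversion goes from $-1$ to $+1$ at the pulse centre and back to $-1$.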
12.3 Self-focusing off-resonance.
12.3.1 Off-resonance limit of the MB equations
Off-resonance propagation occurs when the carrier frequency of the optical wave is much lower than the eigenfrequency of the 2LS:
\[
\omega \ll \omega_0 \tag{12.23}
\]
In this case, the field does not cause inversion, i.e. the MB equations hold with $\mathcal N \approx -1$. The dominant time dependence of the polarization vector is determined by the carrier frequency of the wave, hence
\[
\ddot{\mathbf P} \approx -\omega^2\mathbf P
\]
and
\[
\ddot P_+ \approx -\omega^2P_+ .
\]
On the other hand, (12.5) and (12.6) with $\mathcal N \approx -1$ imply that
\[
\ddot P_+ = -\omega_0^2P_+ + \frac{2\omega_0}{\hbar}\,\mathbf q\cdot\mathbf E
\]
i.e.
\[
\left[1 - \frac{\omega_0^2}{\omega^2}\right]\ddot P_+ = \frac{2\omega_0}{\hbar}\,\mathbf q\cdot\mathbf E
\]
and, making use of the off-resonance condition (12.23),
\[
\ddot P_+ = -\frac{2\omega^2}{\hbar\omega_0}\,\mathbf q\cdot\mathbf E \tag{12.24}
\]
\[
P_+ = \frac{2}{\hbar\omega_0}\,\mathbf q\cdot\mathbf E . \tag{12.25}
\]
Inserting (12.24) in the field equation (12.1) yields
\[
\left[\frac{\partial^2}{\partial t^2} - c^2\frac{\partial^2}{\partial x^2}\right]\mathbf E(x,t) = \frac{8\pi n_0}{\hbar\omega_0}\,\omega^2\,(\mathbf q\cdot\mathbf E)\,\mathbf q .
\]
Up to now, we implicitly assumed that all 2LS carry the same dipole moment. This is of course not quite true. Even if the magnitude of the dipole moment is the same (an assumption which we will make), the random orientation of 2LS in a medium implies a distribution of values for each component. If the field is polarized along the $z$-direction, what really enters the right-hand side of the field equation is an orientational average of $q_z^2$. Therefore,
\[
\left[\frac{\partial^2}{\partial t^2} - c^2\frac{\partial^2}{\partial x^2}\right]E = \frac{8\pi n_0\langle q_z^2\rangle}{\hbar\omega_0}\,\omega^2E . \tag{12.26}
\]
Modulation of the carrier wave
We will again look for solutions of the form
\[
E(x,t) = \mathcal E_c\cos(kx - \omega t) + \mathcal E_s\sin(kx - \omega t) \tag{12.27}
\]
where $\omega = ck$ and $\mathcal E_c$, $\mathcal E_s$ are slowly varying functions of $x$, $t$. In this context, the slowly varying modulation of the field responds only to “averages” over the fast phase. Thus, for example, the short-time average of the square of the field (over a period $2\pi/\omega$) will be
\[
\overline{E^2} = \frac{1}{2}\left(\mathcal E_c^2 + \mathcal E_s^2\right) . \tag{12.28}
\]
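12.3.2 Nonlinear terms
The orientational average of the square of the $z$-component of the dipole moment
\[
\langle q_z^2\rangle = \langle\cos^2\theta\rangle\,q^2
\]
is defined as
\[
\langle\cos^2\theta\rangle = \frac{\int_{-1}^{1}d\cos\theta\,\cos^2\theta\,e^{-\beta H_1}}{\int_{-1}^{1}d\cos\theta\,e^{-\beta H_1}} \tag{12.29}
\]
where
\[
H_1 = -\mathbf p\cdot\mathbf E = -q_zEP_+ = -\frac{2q_z^2}{\hbar\omega_0}E^2 = -\frac{2q^2}{\hbar\omega_0}E^2\cos^2\theta ,
\]
$\beta$ is the inverse temperature, and I have made use of (12.25). Furthermore, since I am interested in slow wave modulation, it is legitimate to substitute the square of the field by its time average over a period of the carrier wave, using (12.28). Defining
\[
\rho = \beta\,\frac{q^2}{\hbar\omega_0}\left(\mathcal E_c^2 + \mathcal E_s^2\right)
\]
I can rewrite
\[
\langle\cos^2\theta\rangle = \frac{\partial}{\partial\rho}\ln I(\rho)
\]
where
\[
I(\rho) = \int_{-1}^{1}dx\,e^{\rho x^2} \approx 2\left[1 + \frac{1}{3}\rho + \frac{1}{2}\,\frac{1}{5}\rho^2 + O(\rho^3)\right] ,
\]
where the expansion is valid in the limit of low fields and/or high temperatures, $\rho \ll 1$. In this limit
\[
\ln I(\rho) \approx \ln2 + \frac{1}{3}\rho + \left[\frac{1}{10} - \frac{1}{18}\right]\rho^2
\]
\[
\langle\cos^2\theta\rangle \approx \frac{1}{3}\left[1 + \frac{4}{15}\rho\right] , \tag{12.30}
\]
and the wave equation (12.26) can be rewritten as
\[
\left[\frac{\partial^2}{\partial t^2} - c^2\frac{\partial^2}{\partial x^2}\right]E = \left[G_0 + G_2\left(\mathcal E_c^2 + \mathcal E_s^2\right)\right]\omega^2E \tag{12.31}
\]
with
\[
G_0 = \frac{8\pi}{3}\,\frac{n_0q^2}{\hbar\omega_0} , \qquad
G_2 = \frac{4}{15}\,\frac{\beta q^2}{\hbar\omega_0}\,G_0 .
\]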
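The weak-field expansion (12.30), which is the origin of the nonlinear coefficient $G_2$, can be compared with a direct evaluation of the ratio of integrals in (12.29). The sketch below uses the Boltzmann weight $e^{\rho x^2}$ with $x = \cos\theta$ and a composite Simpson rule; the function names are illustrative assumptions.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0

def mean_cos2(rho):
    """<cos^2 theta> from Eq. (12.29) with weight exp(rho*x^2), x = cos(theta)."""
    numerator = simpson(lambda x: x * x * math.exp(rho * x * x), -1.0, 1.0)
    denominator = simpson(lambda x: math.exp(rho * x * x), -1.0, 1.0)
    return numerator / denominator

def mean_cos2_weak_field(rho):
    """Expansion (12.30), valid for rho << 1."""
    return (1.0 + 4.0 * rho / 15.0) / 3.0

if __name__ == "__main__":
    for rho in (0.0, 0.05, 0.1, 0.5):
        print(rho, mean_cos2(rho), mean_cos2_weak_field(rho))
```

At $\rho = 0$ the isotropic value $1/3$ is recovered; for $\rho = 0.1$ the two expressions differ only at order $\rho^2$, while at $\rho = 0.5$ the deviation becomes visible, marking the limit of the expansion.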
12.3.3 Space-time dependence of the modulation: the nonlinear Schrödinger equation
Consider the complex modulational field
\[
\phi = \mathcal E_c + i\mathcal E_s .
\]
Then, if
\[
F = \phi\,e^{-i(kx - \omega t)}
\]
the following properties hold:
\[
E = \mathrm{Re}\,F , \qquad |F|^2 = |\phi|^2 = \mathcal E_c^2 + \mathcal E_s^2
\]
Therefore, if $F$ satisfies
\[
\left[\frac{\partial^2}{\partial t^2} - c^2\frac{\partial^2}{\partial x^2}\right]F = \left[G_0 + G_2|F|^2\right]\omega^2F , \tag{12.32}
\]
$\mathrm{Re}\,F$ will satisfy the original field equation (12.31). It is therefore sufficient to look for solutions of (12.32).
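Now examine the left hand side of (12.32). First note that
\[
\left[\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right]F = e^{-i(kx - \omega t)}\left[\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right]\phi
\]
and therefore
\[
\left[\frac{\partial}{\partial t} - c\frac{\partial}{\partial x}\right]\left[\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right]F = e^{-i(kx - \omega t)}\left\{\left[\frac{\partial^2}{\partial t^2} - c^2\frac{\partial^2}{\partial x^2}\right]\phi + 2i\omega\left[\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right]\phi\right\} .
\]
This allows me to rewrite (12.32) as
\[
\left[\frac{\partial^2}{\partial t^2} - c^2\frac{\partial^2}{\partial x^2}\right]\phi + 2i\omega\left[\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right]\phi = \left[G_0 + G_2|\phi|^2\right]\omega^2\phi \tag{12.33}
\]
which involves only the modulating field $\phi$.
Eq. (12.33) is still exact. If we restrict ourselves to slow modulations, the second time derivative should be small and might be dropped. In this case, introducing new dimensionless variables
\[
\xi = kx - \omega t , \qquad \tau = \frac{1}{2}\omega t
\]
transforms (12.33) to
\[
i\phi_\tau = \phi_{\xi\xi} + \left[G_0 + G_2|\phi|^2\right]\phi .
\]
Introducing
\[
\phi = G_2^{-1/2}\exp(-iG_0\tau)\,\hat\phi
\]
eliminates the first term in the parentheses and rescales the rest, leading to
\[
i\hat\phi_\tau = \hat\phi_{\xi\xi} + |\hat\phi|^2\hat\phi , \tag{12.34}
\]
the canonical form of the nonlinear Schrödinger (NLS) equation.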
12.3.4 Soliton solutions
The NLS equation can be integrated exactly, for arbitrary initial conditions (meaning: suitably vanishing at plus and minus infinity), by the inverse scattering transform. This means that it admits multisoliton pulses as exact solutions, a fact of potentially vast technological significance. The interested reader is referred to the specialized literature. Here, I will restrict myself to a heuristic derivation of the single pulse solution.
I look for solutions of (12.34) of the form
\[
\hat\phi = u\,e^{-i\theta}
\]
with $u$, $\theta$ real. Imaginary and real parts lead, respectively, to
\[
u_\tau + 2u_\xi\theta_\xi + u\theta_{\xi\xi} = 0
\]
\[
-u\theta_\tau + u_{\xi\xi} - u\theta_\xi^2 + u^3 = 0 .
\]
I use the “traveling wave cum linear phase” Ansatz
\[
\theta(\xi,\tau) = \mu\tau + \hat\theta(z) , \qquad u(\xi,\tau) = u(z) ,
\]
where $z = \xi - \lambda\tau$, to reduce the PDEs into ODEs:
\[
-\lambda u_z + 2u_z\hat\theta_z + u\hat\theta_{zz} = 0 \tag{12.35}
\]
\[
-\mu u + \lambda u\hat\theta_z + u_{zz} - u\hat\theta_z^2 + u^3 = 0 \tag{12.36}
\]
Multiplying the first equation by $2u$ results in
\[
-\lambda(u^2)_z + 2(u^2)_z\hat\theta_z + 2u^2\hat\theta_{zz} = 0
\]
which has a first integral
\[
u^2(\lambda - 2\hat\theta_z) = \mathrm{constant} . \tag{12.37}
\]
An obvious choice for the value of the constant is zero. Introducing
\[
\hat\theta_z = \frac{\lambda}{2} \tag{12.38}
\]
into the second equation yields
u
zz
=
_
µ −
λ
2
4
_
u −u
3
=
d
du
_
¸
¸
¸
_
1
2
_
µ −
λ
2
4
_
u
2
−
1
4
u
4
. ¸¸ .
V
eff
(u)
_
¸
¸
¸
_
(12.39)
which looks very much like the ODEs found in the context of scalar field theories of the Klein-Gordon class. The difference is that, whereas in the soliton-bearing KG class the effective potential had at least two degenerate stable minima, the effective potential here has either a single minimum (at $u = 0$) and two maxima, if $\mu > \lambda^2/4$, or no minimum at all and a single maximum at $u = 0$, if $\mu \le \lambda^2/4$. Such a potential (Fig. 12.1) would of course be entirely unphysical in the context of field theory, because of the instability at large displacements. Here there is no such physical restriction. A soliton-like solution can occur as long as there is a single, locally stable minimum; it will lead the system from the local minimum out to either one of the maxima and back to the local minimum. Note that this implies that, in the first integral of (12.39),
$$u_z^2 = 2V_{eff}(u) + \text{const}$$
the constant vanishes. This leads to the formal second integral
$$z - z_0 = \pm\int du\,\frac{1}{\sqrt{2V_{eff}(u)}}$$
and the bounded solutions
$$u(z) = \pm\kappa\,\mathrm{sech}\,\frac{\kappa(z - z_0)}{\sqrt{2}} \tag{12.40}$$
where I have used the more appropriate constant
$$\kappa = +\sqrt{2\left(\mu - \frac{\lambda^2}{4}\right)} .$$
Figure 12.1: The effective potential (12.39) in the two cases $\mu > \lambda^2/4$ (upper curve) and $\mu < \lambda^2/4$ (lower curve).
Note that since I have up to now introduced two arbitrary constants (in addition to the arbitrary phase $z_0$), I can take any value of $\kappa > 0$ and $\lambda$; $\mu$ will then have the fixed (and positive) value
$$\mu = \frac{\kappa^2}{2} + \frac{\lambda^2}{4} .$$
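To conclude, I note that from (12.38)
$$\hat{\theta} = \frac{\lambda}{2}z + \theta_0 \tag{12.41}$$
where $\theta_0$ is an arbitrary phase. Collecting terms, and returning to the original space-time variables,
$$u(\xi,\tau) = \pm\kappa\,\mathrm{sech}\,\frac{\kappa(\xi - \lambda\tau - z_0)}{\sqrt{2}} \tag{12.42}$$
$$\theta(\xi,\tau) = \frac{\lambda}{2}\left[\xi - \left(\frac{\lambda}{2} - \frac{\kappa^2}{\lambda}\right)\tau\right] + \theta_0 \tag{12.43}$$
Note that envelope amplitude and phase propagate with different velocities, $v_e = \lambda$ and $v_{ph} = \lambda/2 - \kappa^2/\lambda$, respectively.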
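The pulse (12.40) can be checked numerically against the reduced ODE (12.39), using $\mu - \lambda^2/4 = \kappa^2/2$. The following is a minimal sketch, not part of the original derivation: the function names, the grid and the value $\kappa = 1.3$ are illustrative assumptions, and the second derivative is approximated by central differences.

```python
import numpy as np

def bright_soliton(z, kappa, z0=0.0):
    """Envelope u(z) = kappa * sech(kappa (z - z0) / sqrt(2)), cf. (12.40)."""
    return kappa / np.cosh(kappa * (z - z0) / np.sqrt(2.0))

def ode_residual(z, kappa):
    """Residual of u_zz - (kappa^2 / 2) u + u^3 on the interior grid points."""
    u = bright_soliton(z, kappa)
    h = z[1] - z[0]
    u_zz = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / h**2  # central difference
    return u_zz - 0.5 * kappa**2 * u[1:-1] + u[1:-1] ** 3

# Illustrative grid and amplitude (assumptions, not values from the text).
z = np.linspace(-20.0, 20.0, 4001)
max_residual = np.max(np.abs(ode_residual(z, kappa=1.3)))
```

The residual should be limited only by the finite-difference error, which shrinks quadratically with the grid spacing.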
13 Solitons in Bose-Einstein Condensates
13.1 The Gross-Pitaevskii equation
Starting point: Gross-Pitaevskii equation [33, 34] for the weakly interacting Bose gas (particles of mass $m$):
$$i\hbar\frac{\partial}{\partial t}\Psi_0 = \left[-\frac{\hbar^2}{2m}\nabla^2 + V_{ext}(\mathbf{r}) + g|\Psi_0|^2\right]\Psi_0 \tag{13.1}$$
where
• $\Psi_0(\mathbf{r}, t)$ is the condensate wave function; the condensate density is then $n(\mathbf{r}) = |\Psi_0|^2$.
• $g = 4\pi\hbar^2 a/m$ is the coupling constant, with
• $a$ the s-wave scattering amplitude (low-energy characteristic of the effective potential).
• the external potential, typically of the form
$$V_{ext}(\mathbf{r}) = \alpha(x^2 + y^2) + \lambda z^2 \tag{13.2}$$
describes a cylindrically symmetric magnetic trap.
In the limit $\lambda \ll \alpha$, I look for solutions which depend only on $z$; these must satisfy
$$i\hbar\frac{\partial}{\partial t}\Psi_0 = \left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial z^2} + g|\Psi_0|^2\right]\Psi_0 \tag{13.3}$$
which I recognize as the nonlinear Schrödinger equation. A useful quantity is the characteristic length defined by
$$\xi^2 = \frac{\hbar^2}{2mgn} = \frac{1}{8\pi a n} \tag{13.4}$$
where $n$ is the average condensate density.
13.2 Propagating solutions. Dark solitons
I look for propagating solutions of the form
$$\Psi_0 = \sqrt{n}\,e^{-i\mu t/\hbar}\,\phi(\zeta)$$
where
$$\zeta = \frac{z - vt}{\xi}$$
and $\mu = gn$ is the chemical potential. I will treat the case $g > 0$ (repulsive interaction).
Introducing the above ansatz in (13.3) reduces it to
$$i\sqrt{2}\,\frac{v}{c}\frac{d\phi}{d\zeta} = \frac{d^2\phi}{d\zeta^2} + \left(1 - |\phi|^2\right)\phi \tag{13.5}$$
where $c = \sqrt{gn/m}$ is the sound velocity.
Figure 13.1: Absorption images of BECs with kink-wise structures propagating in the direction of the long condensate axis, for different evolution times in the magnetic trap, $t_{ev}$. The moving dark regions can be interpreted as a pair of gray solitons. (From [35]).
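I first rewrite (13.5) as a system of coupled ODEs for real and imaginary parts of $\phi = \phi_1 + i\phi_2$:
$$\sqrt{2}\,\frac{v}{c}\frac{d\phi_1}{d\zeta} = \frac{d^2\phi_2}{d\zeta^2} + (1 - \phi_1^2 - \phi_2^2)\,\phi_2$$
$$-\sqrt{2}\,\frac{v}{c}\frac{d\phi_2}{d\zeta} = \frac{d^2\phi_1}{d\zeta^2} + (1 - \phi_1^2 - \phi_2^2)\,\phi_1 .$$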
I now look for solutions with a constant imaginary part $\phi_2 = A$. These must satisfy
$$\sqrt{2}\,\frac{v}{c}\frac{d\phi_1}{d\zeta} - A(1 - A^2 - \phi_1^2) = 0$$
$$\frac{d^2\phi_1}{d\zeta^2} + (1 - A^2 - \phi_1^2)\,\phi_1 = 0 . \tag{13.6}$$
Multiplying the first equation by $\phi_1$ and the second by $A$, and adding the two gives
$$A\frac{d^2\phi_1}{d\zeta^2} + \sqrt{2}\,\frac{v}{c}\frac{d\phi_1}{d\zeta}\phi_1 = 0$$
which has a first integral
$$A\frac{d\phi_1}{d\zeta} + \frac{\sqrt{2}}{2}\frac{v}{c}\phi_1^2 = C$$
The latter, first-order ODE must however be identical with the first of (13.6). This mandates $A^2 = v^2/c^2$ and fixes the constant $C$. I then obtain the obvious solution
$$\phi_1(\zeta) = \frac{1}{\gamma}\tanh\frac{\zeta}{\sqrt{2}\,\gamma}$$
where $\gamma = (1 - v^2/c^2)^{-1/2}$. Collecting terms, I obtain
$$\phi(x - vt) = i\frac{v}{c} + \frac{1}{\gamma}\tanh\frac{x - vt}{\sqrt{2}\,\gamma\xi} \tag{13.7}$$
which has unit amplitude as $x \to \pm\infty$, and drops to $v/c$ at $x = vt$ (gray soliton). If $v = 0$ the soliton is dark, i.e. the amplitude vanishes at $x = 0$. Fig. 13.1 shows an experimental observation of a dark soliton in a BE condensate.
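As a cross-check of (13.7), the profile can be inserted into the reduced equation (13.5) numerically. This is a hedged sketch only: the function names, the grid and the value $v/c = 0.6$ are assumptions chosen for illustration, and derivatives are replaced by central differences in the scaled variable $\zeta$.

```python
import numpy as np

def gray_soliton(zeta, v_over_c):
    """phi(zeta) = i v/c + (1/gamma) tanh(zeta / (sqrt(2) gamma)), cf. (13.7)."""
    gamma = 1.0 / np.sqrt(1.0 - v_over_c**2)
    return 1j * v_over_c + np.tanh(zeta / (np.sqrt(2.0) * gamma)) / gamma

def residual(zeta, v_over_c):
    """Left-hand side minus right-hand side of (13.5) on interior points."""
    phi = gray_soliton(zeta, v_over_c)
    h = zeta[1] - zeta[0]
    d1 = (phi[2:] - phi[:-2]) / (2.0 * h)
    d2 = (phi[2:] - 2.0 * phi[1:-1] + phi[:-2]) / h**2
    mid = phi[1:-1]
    return 1j * np.sqrt(2.0) * v_over_c * d1 - d2 - (1.0 - np.abs(mid) ** 2) * mid

zeta = np.linspace(-15.0, 15.0, 3001)     # illustrative grid
v_over_c = 0.6                            # illustrative velocity ratio
max_residual = np.max(np.abs(residual(zeta, v_over_c)))
amplitude = np.abs(gray_soliton(zeta, v_over_c))
```

The array `amplitude` tends to 1 at both ends of the grid and dips to $v/c$ at the centre, which is the gray-soliton profile described above.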
14 Unbinding the double helix
14.1 A nonlinear lattice dynamics approach
14.1.1 Mesoscopic modeling of DNA
Background: thermodynamic phase transitions
Transitions between different states of matter (e.g. the transition from the paramagnetic to the ferromagnetic phase, or the liquid-gas transition) are reflected in singularities of the thermodynamic functions (free energy, entropy etc). The modern theory of critical phenomena developed by Fisher, Kadanoff and Wilson during the 1960’s and 70’s has demonstrated that the essential features of such mathematical singularities depend only on a few “relevant” degrees of freedom of the underlying Hamiltonian. As a consequence, substantial effort has been made by researchers in developing and studying appropriately reduced descriptions, “minimal” models of many complex phenomena related to transformations between different states of matter.
Figure 14.1: Melting of poly(dI)poly(dC) (after [36]).
Experiment suggests DNA denaturation is a sharp phase transition
The thermal denaturation of DNA (also known as DNA melting) consists of the unbinding of the double helix into the two component strands. There is no breaking of covalent bonds along the chain, and the transition is in principle reversible. In the case of DNA chains which are long (of the order of $N \approx 10^4$ base pairs) and homogeneous (i.e. all base pairs are identical), the transition, as observed by the difference in UV absorption spectra, can be very sharp (Fig. 14.1). It is then perfectly reasonable to assume that the underlying phenomenon would be an exact phase transition in the thermodynamic limit $N \to \infty$ and attempt to model it accordingly.
Mesoscopic modeling: 1 degree of freedom per base pair
A reduced, mesoscopic description of DNA consists of assigning a single continuous, “transverse” degree of freedom $y_n$ to the $n$th base pair, corresponding to the distance between the two bases which comprise the pair. The energy related to this degree of freedom has its physical origin in the hydrogen bonds which are responsible for pair binding. Accordingly, it is modeled by a Morse potential
$$V(y) = D\left(e^{-ay} - 1\right)^2$$
where $D$ is an average measure of the binding energy and $1/a$ a length which characterizes the range of the hydrogen bonding. The tendency of successive base pairs to “stack” (“stacking” interaction) can be modeled by assuming that they are bound together by springs. For simplicity, I will assume these springs to be harmonic.
The total Hamiltonian will then be of the form [37]
$$H = \sum_n\left[\frac{p_n^2}{2\mu} + \frac{1}{2}\mu\omega_0^2(y_{n+1} - y_n)^2 + V(y_n)\right] , \tag{14.1}$$
where $\mu$ is the reduced mass corresponding to the base pair, $p_n = \mu\dot{y}_n$ is the canonical momentum conjugate to $y_n$ and $\mu\omega_0^2$ a measure of the strength of the stacking interaction.
Note that this minimal modeling makes no reference to the helical structure of the
molecule. Although generalizations to that eﬀect have been formulated, it should be borne
in mind that this type of modeling makes no attempt to describe structural details of the
DNA molecule. Its scope begins and ends with capturing essential observed macroscopic
features at and very near the denaturation point.
14.1.2 Thermodynamics
The classical thermodynamics of $H$ is described by the canonical partition function
$$Z = \int\prod_{n=1}^{N} dp_n\,dy_n\; e^{-\beta H} , \tag{14.2}$$
which factorizes into a product of Gaussian integrals over the momentum variables,
$$Z_K = (2\pi\mu/\beta)^{N/2} , \tag{14.3}$$
and a nontrivial configurational part
$$Z_P = \int\left(\prod_{n=1}^{N} dy_n\right) T(y_1, y_2)\cdots T(y_{N-1}, y_N)\,T(y_N, y_{N+1}) , \tag{14.4}$$
where
$$T(x, y) = \exp\left\{-\beta\left[\frac{\mu\omega_0^2}{2}(y - x)^2 + V(x)\right]\right\} . \tag{14.5}$$
The transfer integral formalism: deﬁnitions and notation
Consider the eigenvalue problem defined by the asymmetric kernel $T$ (the kernel can be easily symmetrized but need not be so; in fact, working with the asymmetric kernel is technically advantageous in examining the validity of some approximations, cf. below):
$$\int_{-\infty}^{\infty} dy\; T(x, y)\,\Phi^R_\nu(y) = \Lambda_\nu\,\Phi^R_\nu(x) \tag{14.6}$$
$$\int_{-\infty}^{\infty} dy\; T(y, x)\,\Phi^L_\nu(y) = \Lambda_\nu\,\Phi^L_\nu(x) , \tag{14.7}$$
where left and right eigenstates have been assumed to be normalized; note that the normalization integral is $\int dx\,\Phi^L_\nu(x)\Phi^R_\nu(x)$. Orthogonality
$$\int_{-\infty}^{\infty} dx\;\Phi^L_{\nu'}(x)\,\Phi^R_\nu(x) = \delta_{\nu\nu'} \tag{14.8}$$
and completeness
$$\sum_\nu \Phi^L_\nu(x)\,\Phi^R_\nu(y) = \delta(x - y) \tag{14.9}$$
relationships are assumed to hold. I will further use the notation
$$\Lambda_\nu = e^{-\beta\epsilon_\nu} \tag{14.10}$$
(sensible as long as the eigenvalues are nonnegative).
Relationship between Z and the spectrum of T
The integrand of (14.4), as written down, has a problem: it includes a reference to the displacement $y_{N+1}$ of the $(N+1)$st particle, which has not yet been defined. For a large system, this is best remedied by means of periodic boundary conditions (PBC), i.e. by demanding that $y_{N+1} = y_1$. Alternatively, the integration may be extended to one more variable, $dy_{N+1}$, with the simultaneous introduction of a factor $\delta(y_{N+1} - y_1)$ to take care of PBC. This however is the same as the sum in the left-hand side of (14.9). I then obtain
$$Z_P = \sum_\nu\int dy_1\cdots dy_{N+1}\;\underbrace{\Phi^L_\nu(y_1)}\;T(y_1, y_2)\cdots\underbrace{T(y_N, y_{N+1})\,\Phi^R_\nu(y_{N+1})} . \tag{14.11}$$
The braces make clear that I can perform the integral over $dy_{N+1}$ and obtain a factor $\Lambda_\nu\Phi^R_\nu(y_N)$, using the defining property (14.6) of right-hand eigenfunctions. The process can be repeated $N$ times, each time giving a further factor $\Lambda_\nu$ and a right eigenfunction with an argument whose index is smaller by one. At the end, I am left with
$$Z_P = \sum_\nu\int dy_1\;\Phi^L_\nu(y_1)\,\Lambda_\nu^N\,\Phi^R_\nu(y_1) = \sum_\nu\Lambda_\nu^N . \tag{14.12}$$
In the thermodynamic limit, $Z_P$ is dominated by the largest eigenvalue $\Lambda_0$ or, equivalently, the lowest $\epsilon_0$:
$$\lim_{N\to\infty}\frac{1}{N}\ln Z_P = \ln\Lambda_0 = -\beta\epsilon_0 \tag{14.13}$$
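The statements (14.12) and (14.13) can be illustrated numerically by discretizing the kernel (14.5) on a grid, so that $T$ becomes a finite matrix. The sketch below is an illustration under stated assumptions, not the author's procedure: the grid limits, the number of grid points, the parameter values and the abbreviation `k` for $\mu\omega_0^2$ are all arbitrary choices.

```python
import numpy as np

def transfer_matrix(beta, k, D, a, y_min=-2.0, y_max=20.0, n_grid=200):
    """Discretized kernel T(x, y) of (14.5), including the quadrature weight dy.

    k stands for mu * omega_0^2; the grid is an assumption of this sketch.
    """
    y = np.linspace(y_min, y_max, n_grid)
    dy = y[1] - y[0]
    morse = D * (np.exp(-a * y) - 1.0) ** 2
    stacking = 0.5 * k * (y[None, :] - y[:, None]) ** 2
    # Row index = first argument x (carries V(x)), column index = second argument y.
    return np.exp(-beta * (stacking + morse[:, None])) * dy

def log_zp_per_site(eigenvalues, N):
    """(1/N) ln sum_nu Lambda_nu^N, evaluated relative to the largest eigenvalue."""
    lam = np.sort(eigenvalues.real)[::-1]
    lam0 = lam[0]
    return np.log(lam0) + np.log(np.sum((lam / lam0) ** N)) / N

T = transfer_matrix(beta=1.0, k=1.0, D=1.0, a=1.0)
eigenvalues = np.linalg.eigvals(T)

# Periodic chain of 8 sites: trace of T^8 versus the sum of eigenvalue powers.
trace_power = np.trace(np.linalg.matrix_power(T, 8))
sum_power = np.sum(eigenvalues ** 8).real
```

For a chain of a few hundred sites, `log_zp_per_site` differs from the logarithm of the largest eigenvalue by at most $\ln(n_{grid})/N$, which is the finite-matrix version of (14.13).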
The order parameter
$$\langle y_i\rangle = \frac{1}{Z_P}\int dy_1\cdots dy_N\; T(y_1, y_2)\cdots T(y_{i-1}, y_i)\;y_i\;T(y_i, y_{i+1})\cdots T(y_N, y_{N+1})$$
$$\equiv \frac{1}{Z_P}\sum_\nu\int dy_1\cdots dy_{N+1}\;\Phi^L_\nu(y_1)\,\underbrace{T(y_1, y_2)\cdots T(y_{i-1}, y_i)}_{i-1}\;y_i\;\underbrace{T(y_i, y_{i+1})\cdots T(y_N, y_{N+1})}_{N-i+1}\,\Phi^R_\nu(y_{N+1}) , \tag{14.14}$$
after insertion of a complete set of states (cf. above); the braces denote the number of times I can perform an integration and obtain, respectively, a right eigenfunction with an argument smaller by one, or a left eigenfunction with an argument larger by one, as well as a factor $\Lambda_\nu$. The remaining integral must be performed explicitly:
$$\langle y_i\rangle = \frac{1}{Z_P}\sum_\nu\Lambda_\nu^N M_{\nu\nu} \approx M_{00} \tag{14.15}$$
where the last step is exact in the thermodynamic limit, and I have used the abbreviation
$$M_{\nu\mu} = \int_{-\infty}^{\infty} dy\;\Phi^L_\nu(y)\,y\,\Phi^R_\mu(y) . \tag{14.16}$$
The spectrum of T: Gradient-expansion approximation; analogy with quantum mechanics
Suppose that the displacement field does not change appreciably over a lattice constant. Note that this does not exclude large displacements per se. Nonlinearity is explicitly allowed, but the displacement field must be smooth. The assumption is certainly reasonable at low temperatures.
I set $y = x + z$, $\Phi^R \to \phi$ and rewrite (14.6) as
$$e^{-\beta[\epsilon_\nu - V(x)]}\,\phi_\nu(x) = \int_{-\infty}^{+\infty} dz\; e^{-\frac{1}{2}\beta\mu\omega_0^2 z^2}\left[\phi_\nu(x) + z\,\phi'_\nu(x) + \frac{1}{2}z^2\phi''_\nu(x)\right]$$
$$= \left(\frac{2\pi}{\beta\mu\omega_0^2}\right)^{1/2}\left[\phi_\nu(x) + \frac{1}{2\beta\mu\omega_0^2}\,\phi''_\nu(x)\right] \tag{14.17}$$
where higher terms in the gradient expansion have been neglected and the Gaussian integrals have been performed; this is meaningful as long as the width of the Gaussians is smaller than the range of the Morse potential, i.e.
$$\beta\mu\omega_0^2/a^2 > 1 . \tag{14.18}$$
The factor in front of the r.h.s. of (14.17) can be absorbed in the eigenvalue by defining $\tilde{\epsilon}_\nu = \epsilon_\nu + [1/(2\beta)]\ln[2\pi/(\beta\mu\omega_0^2)]$. Now, for many practical purposes, when it comes to calculating matrix elements, the relevant magnitude of $\tilde{\epsilon}_\nu - V(x)$ is $D$, the depth of the Morse well (or some other characteristic energy in the case of another potential). The key to this statement is that one does not need to consider large negative values of $x$, where $V(x)$ is huge, because at such $x$, both the exact eigenfunction $\Phi$ and its approximation $\phi$ can be expected to be negligible. If then $\beta D \le 1$ (footnote 1) it is reasonable to expand the exponential and keep only the first term. Dividing both sides by $\beta$, I obtain a Schrödinger-like equation,
$$-\frac{1}{2\mu(\beta\omega_0)^2}\,\phi''_\nu(x) + \left[V(x) - \tilde{\epsilon}_\nu\right]\phi_\nu(x) = 0 . \tag{14.19}$$
Before continuing the discussion of (14.19) and its properties, I pick up the bits and pieces (cf. (14.2), (14.3), (14.13)) of the thermodynamic free energy (per site)
$$f = -\frac{1}{\beta N}\ln(Z_K Z_P) \equiv -\frac{1}{\beta}\ln\left(\frac{2\pi}{\beta\omega_0}\right) + \tilde{f} , \tag{14.20}$$
where $\tilde{f} = \tilde{\epsilon}_0$. The first term in (14.20) is the free energy of the small oscillations (transverse phonons in this context). It is a term smooth in temperature (constant specific heat!) and therefore irrelevant to any phase transition. Any nontrivial physics is hidden in the second term, which is identical with the smallest eigenvalue of (14.19).
Footnote 1: Note that, in connection with (14.18), this defines a temperature window $D < k_B T < \mu\omega_0^2/a^2$ for the validity of the overall approximation scheme.
A couple of comments are in order. First, (14.19) would be a literal (i.e. quantum mechanical) Schrödinger equation, if I substituted $1/(\beta\omega_0)$ by $\hbar$. I will come back to that point. Second, I can get a dimensionless potential (and eigenvalue) by dividing both sides of (14.19) by $D$. In other words, the relevant dimensionless parameter is
$$\delta^2 = \begin{cases}\dfrac{2\mu}{a^2\hbar^2}\,D & \text{(quantum mechanics)}\\[2ex] \dfrac{2\mu\beta^2\omega_0^2}{a^2}\,D & \text{(statistical mechanics).}\end{cases} \tag{14.21}$$
In terms of $\delta$, the bound state spectrum of (14.19) is given [38] by
$$\frac{\tilde{\epsilon}_n}{D} = 1 - \left(1 - \frac{n + 1/2}{\delta}\right)^2 , \qquad n = 0, 1, \ldots, \mathrm{int}(\delta - 1/2) . \tag{14.22}$$
There is at least one bound state if $\delta > 1/2$. For $3/2 \ge \delta > 1/2$ there is exactly one bound state. And if $\delta$ becomes equal to, or smaller than $1/2$, there is no bound state at all. The value $\delta_c = 1/2$ is “critical”. In quantum mechanical language, if a particle has a mass which is lighter than a critical mass $\mu_c = \hbar^2 a^2/(8D)$, it cannot be confined in the Morse well. Quantum fluctuations will drive it out (footnote 2).
Thermodynamic free energy
In the context of statistical mechanics, $\delta_c$ corresponds, via (14.21), to a critical temperature $T_c = 2(\omega_0/a)\sqrt{2\mu D}$. The free energy is given by
$$\frac{\tilde{f}}{D} = \begin{cases} 1 & T > T_c \\[1ex] 1 - \left(1 - \dfrac{T}{T_c}\right)^2 & T < T_c ,\end{cases} \tag{14.23}$$
where in the upper line I have made use of the fact that the bottom of the continuum part of the spectrum is at $\tilde{\epsilon} = D$. The free energy $f$ is nonanalytic at $T = T_c$, where its second derivative is discontinuous (i.e. there is a jump in the specific heat). This corresponds to a second order transition, according to the Ehrenfest classification scheme (footnote 3).
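The level formula (14.22) and the reduced free energy (14.23) are simple enough to tabulate directly. The following is a small sketch with hypothetical function names; it treats a level as bound only if $n + 1/2 < \delta$ strictly, which is an assumption about the marginal case where a level touches the continuum edge.

```python
def morse_levels(delta):
    """Bound-state eigenvalues epsilon_n / D of (14.22) for a given delta."""
    levels = []
    n = 0
    while n + 0.5 < delta:          # strict inequality: level below the continuum edge
        levels.append(1.0 - (1.0 - (n + 0.5) / delta) ** 2)
        n += 1
    return levels

def reduced_free_energy(t):
    """f~/D of (14.23) as a function of t = T / T_c."""
    if t >= 1.0:
        return 1.0                   # bottom of the continuum
    return 1.0 - (1.0 - t) ** 2      # ground-state level, using 1/(2 delta) = T/T_c

# Example: delta = 1 supports a single level at 3/4 of the well depth.
single_level = morse_levels(1.0)
```

The function `reduced_free_energy` is continuous with a continuous first derivative at `t = 1`; only the second derivative jumps, in line with the Ehrenfest classification quoted above.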
The order parameter. DNA melting as a thermodynamic instability
In order to gain some further insight into the physics involved (footnote 4) it is useful to examine the average displacement (14.15), determined by the ground-state (GS) eigenfunction
$$\phi_0(x) = e^{-\zeta/2}\,\zeta^{\delta - 1/2} \tag{14.24}$$
where $\zeta = 2\delta e^{-ax}$. It is straightforward to see that, as $T$ approaches $T_c$ from below, the eigenfunction extends towards larger and larger positive values of $x$:
$$\phi_0(x) \propto e^{-x/\lambda} \tag{14.25}$$
where
$$\lambda = \frac{1}{\delta - \delta_c} \tag{14.26}$$
is a (transverse) characteristic length, in units of $1/a$, which measures the spatial extent of the GS eigenfunction. As a consequence, we can estimate that $\langle y\rangle$, which is dominated by the large values of the argument, will also behave as
$$\langle y\rangle \sim (\delta - \delta_c)^{-1} \sim \left(1 - \frac{T}{T_c}\right)^{-1} . \tag{14.27}$$
As the critical temperature is approached from below, particles cease to be confined to the minimum of the Morse well. They perform larger and larger excursions to the flatter part of the potential. At $T_c$ the transition is complete; the average transverse displacement is infinite. Particles move, on the average, on the flat top of the Morse potential. Unwinding (“melting”) of the DNA has occurred.
Footnote 2: This is a general property of asymmetric one-dimensional wells; symmetric wells will support a particle in a bound state, no matter how low its mass.
Footnote 3: Note that the term “second order” is meant literally in this case, not just as a metaphor for the absence of a latent heat (for which the term “continuous transition” would be appropriate).
Footnote 4: The mathematical analogy between the behavior of the spectral gap which occurs in a point ($d = 0$) system and the singularity in the free energy of a classical chain ($d = 1$) is an example of a deeper analogy which relates quantum to thermal fluctuations; the formal correspondence $\hbar \leftrightarrow 1/(\beta\omega_0)$ manifests a far-reaching analogy between $d$-dimensional quantum mechanics and $(d+1)$-dimensional classical statistical mechanics. The analogy is most fruitful at $d = 1$, because of the interplay and the richness of exact available results which are based either on the transfer-matrix approach of 2-dimensional classical statistics or on the Bethe-Ansatz developed for 1-d quantum spin systems.
In the language of critical phenomena $\langle y\rangle$ is the order parameter. In ordinary phase transitions, where one goes from an ordered to a disordered phase, the order parameter $m$ vanishes at the transition point, i.e. $m \propto (T_c - T)^\beta$ with a positive critical exponent $\beta$ (see footnote 5). DNA melting is really an instability rather than an “order-disorder” transition. It is therefore not surprising that the corresponding critical exponent $\beta$ extracted from (14.27) is negative ($-1$).
Experimental data on DNA denaturation do not deliver $\langle y\rangle$ directly. The “experimental order parameter” is the helical fraction, i.e. the probability that a given base pair is still bound; technically one uses an (instrumentation-dependent) cutoff $y_0$ and measures $P(y < y_0, T)$. For the model presented here, this function approaches zero smoothly (linearly) as $T \to T_c$, independently of the choice of $y_0$.
14.2 Nonlinear structures (domain walls) and DNA melting
In discussing how adsorbed atoms arrange themselves on a substrate, we examined a number of possibilities: a uniform structure, commensurate with the substrate, and a soliton lattice. We found that the commensurate-incommensurate phase transition occurred when the mismatch between the competing lattice periodicities made the soliton lattice energetically favored. The DNA denaturation, as described within the model Hamiltonian (14.1), is a thermal, not a parametric instability. Nonetheless, it will prove useful to examine the existence and properties [39] of competing nonlinear structures of (14.1).
In this section I will use dimensionless variables $a y_n \to y_n$; the energy will be measured in units of $D$. The total potential energy will then be
$$\Phi = \sum_{n=0}^{N}\left[\frac{1}{2R}(y_{n+1} - y_n)^2 + V(y_n)\right] \tag{14.28}$$
where $R = Da^2/K$ is a dimensionless coupling constant, with $K = \mu\omega_0^2$ the stacking spring constant of (14.1).
Footnote 5: not to be confused with the inverse temperature; this is the standard notation of critical phenomena.
14.2.1 Local equilibria
Definition
Local equilibria are defined by static solutions of the equations of motion, i.e. by extrema
$$\frac{\partial\Phi}{\partial y_n} = 0 \quad \forall n \tag{14.29}$$
of the total potential energy. Their spatial patterns, for a given boundary condition $y_0 = 0$, $y_{N+1} = L$, are described by a second-order difference equation
$$y_{n+1} - 2y_n + y_{n-1} - RV'(y_n) = 0 \quad \forall n = 1, \ldots, N . \tag{14.30}$$
Fixed point
There is only one fixed point
$$y_n = 0 \quad \forall n$$
of the map. Note however that it is compatible only with the boundary condition $L = 0$. Note further that the energy associated with the fixed point configuration is zero.
Stability criteria
The stability of equilibria is governed by the spectra of the corresponding $N \times N$ Hessian matrix
$$h_{ij} = \frac{\partial^2\Phi}{\partial y_i\,\partial y_j} \tag{14.31}$$
where the derivative is evaluated at the extrema defined by (14.30). Let $\Lambda_\nu$, $\nu = 1, \ldots, N$ denote the eigenvalues of the matrix $h$. If, for a given extremum, the eigenvalues are all positive, then the extremum is a local minimum. If they are all negative it is a local maximum. If some are negative and some are positive it is a local saddle point. I will not discuss the interesting marginal case where an eigenvalue vanishes, since it does not arise in the context of this particular problem.
Picturing and classifying equilibria
What do these equilibria look like? A picture can be given by looking at the full set of solutions of (14.30), without fixing the value of $L$, and then choosing the ones that correspond to the boundary condition $y_{N+1} = L$. This can be done by noting that (14.30) for unspecified $L$ is equivalent to all realizations of the two-dimensional map
$$p_{n+1} = p_n + RV'(y_n)$$
$$y_{n+1} = y_n + p_{n+1} , \tag{14.32}$$
where $n = 1, \ldots, N$, $y_1 = p_1 + y_0$, $y_0 = 0$ and $p_1$ is unspecified. The set of all orbits of the map thus derived is shown in Fig. 14.2. Note that there are two kinds of orbits: stable ones (drawn with full points) and saddles (drawn with open points), where all but one of the eigenvalues of the Hessian are positive. It is then possible to isolate those orbits which correspond to a given $L$. They are shown in Fig. 14.3. We note that they start off at a value very close to zero, remain there for a few sites, and then suddenly “take off” with a constant linear slope. The equilibria pictured here represent in some sense “interpolations”, or domain walls (DWs), between the bound ($y \approx 0$) and the unbound ($y \gg 1$) phase.
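The map (14.32) is straightforward to iterate. The sketch below is an illustration only: it assumes the dimensionless Morse potential $V(y) = (e^{-y} - 1)^2$, so that $V'(y) = 2e^{-y}(1 - e^{-y})$, and the function names and the starting slope `p1` are hypothetical choices. It does not attempt to select the orbits that match a prescribed $L$.

```python
import math

def morse_force(y):
    """V'(y) for the dimensionless Morse potential V(y) = (exp(-y) - 1)^2."""
    return 2.0 * math.exp(-y) * (1.0 - math.exp(-y))

def iterate_map(p1, R, N):
    """Return [y_0, ..., y_{N+1}] generated by the map (14.32) with y_0 = 0."""
    p = p1
    y = p1                      # y_1 = y_0 + p_1 with y_0 = 0
    orbit = [0.0, y]
    for _ in range(N):          # n = 1, ..., N
        p = p + R * morse_force(y)
        y = y + p
        orbit.append(y)
    return orbit

# Illustrative orbit: R and N as in Fig. 14.2, with an arbitrary small initial slope.
orbit = iterate_map(p1=1e-6, R=10.1, N=28)
```

For a small positive `p1` the orbit stays close to zero for a few sites and then grows with an essentially constant slope, which is the domain-wall shape described above; `orbit[-1]` plays the role of $L$.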
Figure 14.2: The unstable manifold of the FP of the map (14.32) for $R = 10.1$ and $N = 28$ (axes: $y$ versus $p$). Black squares belong to stable equilibria, red open circles belong to unstable equilibria. The horizontal line at $y = 44$ demonstrates the multivaluedness of the manifold as a function of $y$ (4 stable and 3 unstable equilibria with that value of $y$; details in the inset). The vertical lines are drawn at $p_{min}$ and $p_{max}$, the minimal and maximal asymptotic slopes of DWs (from [39]).
Figure 14.3: The 8 stable equilibria corresponding to $N = 28$, $y_0 = 0$, $y_{N+1} = L = 80$ (axes: $y_n$ versus $n$). Not shown are 7 unstable equilibria enmeshed between the stable ones. Inset: total energies $E$ as a function of $p$ for both stable (black squares) and unstable (red open circles) equilibria. The continuous curve corresponds to a theoretical estimate which does not distinguish between stable and unstable equilibria (cf. text); (from [39]).
Configuration of lowest energy for a given $L \neq 0$.
For a given slope $p$ of the unbound segment, there are $L/p$ unbound sites. The excess energy from them is
$$E(p) = \left(\frac{p^2}{2R} + 1\right)\frac{L}{p}$$
and has a minimum at $p = p^* = (2R)^{1/2}$. Thus, the minimum energy required to create a DW at a given $L$ is
$$E^*(L) = \left(\frac{2}{R}\right)^{1/2} L . \tag{14.33}$$
As long as this energy is not available, the system will, if left to itself, prefer the equilibrium available at the fixed point. In order to maintain the transverse displacement $L$, one must apply an external force
$$f = \frac{dE^*(L)}{dL} = \left(\frac{2}{R}\right)^{1/2} . \tag{14.34}$$
This is exactly what happens in single-molecule experiments which achieve mechanical DNA denaturation or, as commonly called, DNA unzipping.
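For completeness, the minimization behind (14.33) can be spelled out; this is only a worked restatement of the estimate above, in the same symbols:
$$E(p) = L\left(\frac{p}{2R} + \frac{1}{p}\right), \qquad \frac{dE}{dp} = L\left(\frac{1}{2R} - \frac{1}{p^2}\right) = 0 \;\Longrightarrow\; p^* = \sqrt{2R} ,$$
$$E^*(L) = L\left(\frac{\sqrt{2R}}{2R} + \frac{1}{\sqrt{2R}}\right) = \frac{2L}{\sqrt{2R}} = \left(\frac{2}{R}\right)^{1/2} L = \frac{2L}{p^*} .$$
At the optimal slope the elastic and the Morse contributions are equal, each amounting to one unit of $D$ per unbound site.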
Figure 14.4: Eigenvalue spectra $\Lambda_\nu$ (versus index $\nu$) of the Hessians for (i) the fixed point (open squares) and (ii) the DW with minimal energy (open circles) for $L = 100$. In both cases $N = 32$. The DW’s spectrum consists of bands of optical and acoustic phonons, localized respectively in the bound and unbound portions of the chain, and a single local mode in the gap; both bands are well described (to order $O(1/L)$) by the corresponding free phonon dispersion curves (dotted); (from [39]).
14.2.2 Thermodynamics of domain walls
At any finite temperature, we will have to consider the competition of the two possible structures: the one corresponding to the fixed point, and the one corresponding to the DW with the least energy. Strictly speaking, in the latter case we are considering the totality of all possible values of $L$.
Free energy associated with a given minimum
For small displacements around any given local minimum $\{\bar{y}_n\}$, i.e.
$$y_n = \bar{y}_n + u_n$$
the total potential energy will be given, to quadratic order in the displacements, by
$$\Phi(u) \approx E(\bar{y}) + \frac{1}{2}\sum_{i,j} h_{ij}u_i u_j ,$$
where $E(\bar{y})$ is the energy of the local minimum.
The associated configurational part of the partition function will be
$$Z(\bar{y}) = e^{-\beta E(\bar{y})}\int_{-\infty}^{\infty}\left(\prod_{m=1}^{N} du_m\right) e^{-\frac{\beta}{2}\sum_{i,j}h_{ij}u_i u_j} = e^{-\beta E(\bar{y})}\prod_{\nu=1}^{N}\left(\frac{2\pi}{\beta\Lambda_\nu}\right)^{1/2} \tag{14.35}$$
where the product runs over all eigenvalues of the Hessian. Note that the eigenvalues must be strictly positive, not just nonnegative. The free energy associated with any given minimum will then be
$$F(\bar{y}) = -T\ln Z(\bar{y}) = E(\bar{y}) - \frac{T}{2}\sum_{\nu=1}^{N}\ln\left(\frac{2\pi}{\beta\Lambda_\nu}\right) \tag{14.36}$$
Comparison of free energies
The spectra of (i) the fixed point and (ii) the stable DW with the minimal energy at some finite $L$ are shown in Fig. 14.4.
Now consider the difference in free energies between the DW with the minimal energy at some finite $L$ and the fixed point
$$\Delta F(L) = E^*(L) - \frac{T}{2}\sum_{\nu=1}^{N}\ln\left(\frac{\Lambda^0_\nu}{\Lambda^*_\nu}\right) \tag{14.37}$$
where the star in the superscript denotes the DW and the 0 the fixed point.
The second term represents the difference in entropies. Formation of the DW generates a gain in entropy. The quantity
$$\frac{1}{2}\sum_{\nu=1}^{N}\ln\left(\frac{\Lambda^0_\nu}{\Lambda^*_\nu}\right) \equiv \frac{L}{p^*}\,\sigma(R)$$
is generally proportional to the number $L/p^*$ of unbound sites. The extra entropy comes exclusively from the unbound part. It is due to the fact that the acoustic phonons which live in the unbound region have lower frequencies than the optical phonons which live in the bound state defined by the fixed point.
Combining terms, the difference in free energy can be written as
$$\Delta F(L) = \left[2 - T\sigma(R)\right]\frac{L}{p^*} \tag{14.38}$$
which becomes zero at
$$T_c(R) = \frac{2}{\sigma(R)} \tag{14.39}$$
and negative at higher temperatures. Thus, if the temperature is raised beyond $T_c(R)$ a DW of any length can be formed spontaneously, since it lowers the free energy. Denaturation can occur spontaneously.
Alternatively, we may look at the derivative
$$f = \frac{d\,\Delta F(L)}{dL} = \left[2 - T\sigma(R)\right]\frac{1}{p^*} \tag{14.40}$$
which represents the unzipping force at a finite temperature $T$ and reduces to (14.34) at $T = 0$. Spontaneous thermal denaturation is, in this sense, equivalent to the vanishing of the unzipping force.
For not too large values of $R$, the proportionality constant is
$$\sigma(R) = \ln\left[\sqrt{R/2} + \sqrt{1 + R/2}\right] . \tag{14.41}$$
In the continuum limit, $R \ll 1$, this gives $\sigma \approx (R/2)^{1/2}$ and hence $T_c = 2(2/R)^{1/2}$, which is exactly the critical temperature found in Section 14.1.2 (expressed in units of $D$).
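The approach of (14.39), with $\sigma(R)$ from (14.41), to its continuum limit can be made concrete with a few lines of code. This is a minimal sketch with hypothetical function names; temperatures are in units of $D$, as in the rest of this section.

```python
import math

def sigma(R):
    """Entropy gain per unbound site, eq. (14.41)."""
    return math.log(math.sqrt(R / 2.0) + math.sqrt(1.0 + R / 2.0))

def critical_temperature(R):
    """T_c(R) = 2 / sigma(R), eq. (14.39), in units of D."""
    return 2.0 / sigma(R)

def continuum_critical_temperature(R):
    """Continuum-limit estimate T_c = 2 (2/R)^(1/2), valid for small R."""
    return 2.0 * math.sqrt(2.0 / R)

# Ratio of the two estimates for a weakly discrete and a strongly discrete chain.
ratio_small_R = critical_temperature(1e-4) / continuum_critical_temperature(1e-4)
ratio_large_R = critical_temperature(10.1) / continuum_critical_temperature(10.1)
```

Since $\sigma(R) < (R/2)^{1/2}$ for every $R > 0$, the discrete expression always lies above the continuum estimate, and the two agree as $R \to 0$.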
In summary, what I have presented in this lecture is an alternative picture of the DNA
instability, based on the underlying, competing nonlinear equilibrium structures (domain
walls vs. ﬁxed point). The results suggest that the domain wall, via the entropic gain it
generates, can overcome the energetic cost of its production. In other words, spontaneous
formation of a DW at the instability temperature is what really “drives” DNA denaturation.
15 Pulse propagation in nerve cells: the Hodgkin-Huxley model
15.1 Background
The physics of electric pulse propagation in nerve cells [40] has a long and interesting history. Helmholtz measured the signal velocity on a frog’s sciatic nerve in 1850. Bernstein succeeded shortly thereafter (1868) in detecting the complete shape of the pulse, the action potential $V(t)$ as a function of time. The concept of a nerve cell, or neuron, as an independent functional unit was established through extensive anatomical studies by Ramón y Cajal in the beginning of the twentieth century. Typically, a neuron (Fig. 15.1) consists of an input collecting part with a dendritic structure (dendrites), a cell nucleus, and an output fiber (axon) which transports and relays the signals. The membrane of the nerve cell, whose existence was experimentally confirmed by Fricke in the 1920’s, is permeable to K$^+$ and Na$^+$ ions. If the nerve is at rest, the inner and outer surfaces of the membrane carry, respectively, net negative and positive electric charges. The corresponding resting potential (Ruhepotential) $V_{in} - V_{out}$ is of the order of 50 mV.
Figure 15.1: A nerve cell.
15.2 The Hodgkin-Huxley model
A significant breakthrough in our understanding of pulse propagation in nerve cells is due to experimental and theoretical work done by Hodgkin and Huxley (HH) in the 1950’s on the giant axon (footnote 1) of the Atlantic squid (Loligo pealei).
Footnote 1: The diameter of the squid’s axon is of the order of 1.5 mm, which is about 50 times as thick as that of most animals, including humans.
15.2.1 The axon membrane as an array of electrical circuit elements
HH’s schematic view of a cylindrical axon membrane as an electrical circuit element is summarized in Fig. 15.2. The constitutive equations for the ion transport are:
• Ohm’s law for the longitudinal current flow (along the axon):
$$\frac{I}{\pi d^2/4} = -\sigma\frac{\partial V}{\partial x} \tag{15.1}$$
where $d = 0.476$ mm is the diameter of the axon and $\sigma = 2.9$ S/m the axoplasm conductivity.
• Kirchhoff’s law, applied to a membrane element of length $\Delta x$ and diameter $d$ with a capacitance $C = c\pi d\,\Delta x$, can be written as
$$I(x) = \frac{\partial}{\partial t}(CV) + J + I(x + \Delta x)$$
where $J = j\pi d\,\Delta x$ is the total transverse current (across the membrane). In differential form, this can be rewritten in terms of the transverse current density $j$ as
$$\frac{1}{\pi d}\frac{\partial I}{\partial x} = -c\frac{\partial V}{\partial t} - j , \tag{15.2}$$
where $c = 1\ \mu$F/cm$^2$ is the membrane capacitance per unit surface.
Eliminating the current from (15.1) and (15.2) one obtains
$$\frac{\sigma d}{4}\frac{\partial^2 V}{\partial x^2} = c\frac{\partial V}{\partial t} + j \tag{15.3}$$
which, under general conditions, is a driven, nonlinear diffusion equation.
Figure 15.2: Upper part: Schematic view of the axon interior and membrane. Lower part: a membrane element of length $\Delta x$, viewed as a piece of electrical circuit with a capacitance, independent K$^+$ and Na$^+$ ion channels for the flow of a transverse current (across the membrane) with a nonlinear conductance, and a “leak” channel with linear conductance. In the HH experiments, the inner part of the membrane was held at a spatially uniform potential $V$ (voltage clamp). This allowed a detailed analysis of the ion gate properties.
15.2.2 Ion transport via distinct ionic channels
According to HH, the transverse current density consists of distinct ionic components, and a “leakage” current
$$j = j_{Na} + j_K + j_L \tag{15.4}$$
with
$$j_{Na} = G_{Na}\,g_{Na}\,(V - V_{Na})$$
$$j_K = G_K\,g_K\,(V - V_K)$$
$$j_L = G_L\,(V - V_L) , \tag{15.5}$$
where $V_{Na} = 115$ mV, $V_K = -12$ mV are the electrochemical potentials for Na and K ions respectively, and $V_L = 10.6$ mV is adjusted so that the total current $j$ vanishes if $V = 0$. Note that this condition really corresponds to the rest state, i.e. $V = V_{in} - V_{out} + 65$ mV. The electrochemical potentials of the squid axon are fairly typical: the corresponding values for the frog’s sciatic nerve are $V_{Na} = 122$ mV, $V_K \approx 0$ mV. The $G_i$’s are linear conductances,
$$G_{Na} = 120\ \text{mS cm}^{-2}, \quad G_K = 36\ \text{mS cm}^{-2}, \quad G_L = 0.3\ \text{mS cm}^{-2} .$$
Finally, the $g_i$’s are nonlinear dimensionless functions of the potentials, to be discussed below.
15.2.3 Voltage clamping
In order to study the details of nonlinear transport, HH developed an experimental technique, called space clamping. The technique consisted of piercing the axon with a thin metallic electrode, so that the inside of the membrane could be held at a spatially uniform potential. In this case the left-hand side of (15.3) vanishes and, using (15.4) and (15.5), one obtains
$$c\frac{\partial V}{\partial t} = -\left[G_{Na}\,g_{Na}\,(V - V_{Na}) + G_K\,g_K\,(V - V_K) + G_L\,(V - V_L)\right] . \tag{15.6}$$
Voltage clamping was a further technical development which made it possible to control the uniform voltage at any desired level. Clamping was a major conceptual breakthrough in the electrophysics of nerve cells because it facilitated a detailed analysis of experimental data and made possible the extraction of crucial information on the unknown functions $g_i$.
15.2.4 Ionic channels controlled by gates
The experimental findings of HH led them to conclude that the two ionic channels are controlled by gates. Moreover, a phenomenological description of the data could only be achieved by postulating the existence of different types of gates. Thus
$$g_K = n^4 \tag{15.7}$$
$$g_{Na} = m^3 h \tag{15.8}$$
where $m, n, h \in [0, 1]$ are gating functions.
Gating functions
The value of any gating function $p$ corresponds to the probability that the corresponding gate is in its “open” state. The time evolution of any gating function is governed by a first-order ordinary differential equation
$$\frac{dp}{dt} = \alpha(1 - p) - \beta p = -\frac{p - p_0}{\tau} \tag{15.9}$$
where $\alpha\,dt$ is the probability that the closed gate will open within the time $dt$, and $\beta\,dt$ the probability that the open gate will close during the same time interval. Both $\alpha$ and $\beta$ are, in general, nonlinear functions of the membrane potential $V$. The second form of the equation, where the new parameters are defined as
$$\frac{1}{\tau} = \alpha + \beta , \qquad p_0 = \frac{1}{1 + \beta/\alpha} , \tag{15.10}$$
shows clearly that the gating function approaches an equilibrium value $p_0(V)$ for any given membrane potential within a characteristic time $\tau(V)$.
Figure 15.3: The HH gating functions at equilibrium, $n_0$, $m_0$, $h_0$, as functions of $V$ (mV). $n_0$ and $m_0$ are of the activation type, $h_0$ is of the deactivation type. Inset: the inverse relaxation times $\tau_n^{-1}$, $\tau_m^{-1}$, $\tau_h^{-1}$ (in msec$^{-1}$) which correspond to the gating functions. Note that the m-gate is considerably faster than either the n or the h gates.
Gates are classified according to the limiting behavior of their equilibrium value: a gate is of the activation type if
$$ \lim_{V\to\infty} p_0(V) = 1 , $$
or of the deactivation type if
$$ \lim_{V\to\infty} p_0(V) = 0 . $$
HH gate parameters
The gating function n which controls the flow of K$^+$ ions is of the activation type, with
$$ \alpha_n = \frac{0.01\,(10 - V)}{e^{(10-V)/10} - 1} , \qquad \beta_n = 0.125\, e^{-V/80} . \qquad (15.11) $$
Here $\alpha_n$ and $\beta_n$ are measured in msec$^{-1}$, $V$ in mV.
The flow of Na$^+$ is controlled by both an activation-type gate m, with
$$ \alpha_m = \frac{0.1\,(25 - V)}{e^{(25-V)/10} - 1} , \qquad \beta_m = 4\, e^{-V/18} , \qquad (15.12) $$
and a deactivation-type gate h, with
$$ \alpha_h = 0.07\, e^{-V/20} , \qquad \beta_h = \frac{1}{e^{(30-V)/10} + 1} . \qquad (15.13) $$
The corresponding values of the gating functions at equilibrium are shown in Fig. 15.3.
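The rate expressions (15.11)-(15.13), together with the definitions (15.10), can be evaluated directly. The following is a minimal Python sketch; the function names (`rates`, `equilibrium`, `_ratio`) are illustrative choices, and the only numerical care taken is at the removable singularities of $\alpha_n$ (V = 10 mV) and $\alpha_m$ (V = 25 mV).

```python
import math

def _ratio(x):
    """Return x / (exp(x) - 1), filling in the removable singularity at x = 0."""
    if abs(x) < 1e-7:
        return 1.0 - 0.5 * x
    return x / math.expm1(x)

def rates(V):
    """Opening and closing rates (alpha, beta) in msec^-1 for each gate; V in mV."""
    return {
        # alpha_n = 0.01 (10 - V) / (exp((10 - V)/10) - 1), written via _ratio
        "n": (0.1 * _ratio((10.0 - V) / 10.0), 0.125 * math.exp(-V / 80.0)),
        # alpha_m = 0.1 (25 - V) / (exp((25 - V)/10) - 1)
        "m": (1.0 * _ratio((25.0 - V) / 10.0), 4.0 * math.exp(-V / 18.0)),
        "h": (0.07 * math.exp(-V / 20.0),
              1.0 / (math.exp((30.0 - V) / 10.0) + 1.0)),
    }

def equilibrium(V):
    """Equilibrium value p0 and relaxation time tau (msec) of each gate, from (15.10)."""
    return {gate: (a / (a + b), 1.0 / (a + b)) for gate, (a, b) in rates(V).items()}

if __name__ == "__main__":
    for gate, (p0, tau) in equilibrium(0.0).items():
        print(f"{gate}0(0) = {p0:.3f}   tau_{gate}(0) = {tau:.2f} msec")
```

Evaluated at V = 0, the equilibrium values agree to three decimals with the resting values $m_0(0) = 0.053$, $h_0(0) = 0.596$ and $n_0(0) = 0.318$ used in the text.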
15.2.5 Membrane activation is a threshold phenomenon
The solutions of the HH equations under space-clamping conditions for a variety of initial membrane potentials are shown in Fig. 15.4. A spike always forms, provided that the initial stimulus is above a certain threshold of about 6.5 mV. The spike amplitude and width are roughly independent of the initial stimulus.
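The threshold behavior can be explored with a short numerical experiment. The sketch below integrates the space-clamped membrane equation together with the three gate equations (15.9) by forward Euler. Several ingredients are assumptions and not taken from this section: the membrane capacitance `C_M` = 1 µF/cm², the potentials `V_K` = −12 mV and `V_L` = 10.6 mV (the usual squid-axon values), and the sign convention in which the ionic currents discharge the membrane, i.e. $c\,dV/dt = -\sum_i G_i g_i (V - V_i)$. The gates are started at their resting values.

```python
import math

# Conductances as quoted in the text (mS/cm^2); V_NA as quoted for the squid axon.
G_NA, G_K, G_L = 120.0, 36.0, 0.3
V_NA = 115.0
# ASSUMED values (standard squid-axon numbers, not stated in this section).
V_K, V_L, C_M = -12.0, 10.6, 1.0

def _ratio(x):
    """x / (exp(x) - 1) with the removable singularity at x = 0 filled in."""
    return 1.0 - 0.5 * x if abs(x) < 1e-7 else x / math.expm1(x)

def rates(V):
    """Return (alpha_n, beta_n, alpha_m, beta_m, alpha_h, beta_h) from (15.11)-(15.13)."""
    return (0.1 * _ratio((10.0 - V) / 10.0), 0.125 * math.exp(-V / 80.0),
            1.0 * _ratio((25.0 - V) / 10.0), 4.0 * math.exp(-V / 18.0),
            0.07 * math.exp(-V / 20.0), 1.0 / (math.exp((30.0 - V) / 10.0) + 1.0))

def simulate(V0, t_end=12.0, dt=0.005):
    """Forward-Euler integration of the space-clamped membrane; returns the V(t) samples."""
    an, bn, am, bm, ah, bh = rates(0.0)
    n, m, h = an / (an + bn), am / (am + bm), ah / (ah + bh)  # resting gate values
    V, trace = V0, [V0]
    for _ in range(int(round(t_end / dt))):
        an, bn, am, bm, ah, bh = rates(V)
        j_ion = (G_NA * m**3 * h * (V - V_NA)
                 + G_K * n**4 * (V - V_K)
                 + G_L * (V - V_L))
        V_next = V - dt * j_ion / C_M          # ASSUMED sign convention, see lead-in
        n += dt * (an * (1.0 - n) - bn * n)    # gate equations (15.9)
        m += dt * (am * (1.0 - m) - bm * m)
        h += dt * (ah * (1.0 - h) - bh * h)
        V = V_next
        trace.append(V)
    return trace

if __name__ == "__main__":
    for V0 in (3.0, 15.0):
        print(f"V(0) = {V0:4.1f} mV  ->  max V = {max(simulate(V0)):.1f} mV")
```

Under these assumptions a stimulus well above the quoted threshold (here 15 mV) develops into a full spike, while a stimulus well below it (here 3 mV) simply relaxes back to rest.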
Figure 15.4: The membrane potential V (mV) as a function of time t (msec) for a variety of initial stimuli, V(0) = 90, 60, 30, 15, 10, 7, 6, 5 mV. Note that a spike will always form if the initial stimulus is above a certain threshold; amplitude and width of the spike are roughly independent of the strength of the stimulus. The threshold lies between 6 and 7 mV.
15.2.6 A qualitative picture of ion transport during nerve activation
On the basis of the HH model, the following qualitative picture emerges for the temporal evolution of the action potential during nerve activation (cf. Fig. 15.5):
• At zero membrane potential, $m_0(0) = 0.053$, $h_0(0) = 0.596$; the product $m^3 h$ is very small, hence the Na$^+$ current will be very small. Since $n_0(0) = 0.318$, the K$^+$ current will also be small.
• As the membrane voltage is turned on, the m-gate of the Na$^+$ channel is rapidly activated (cf. Fig. 15.3), within a characteristic time 0.2 − 0.4 msec. Sodium ions flow into the axon.
• As V approaches the electrochemical potential of sodium, $V_{Na} = 115$ mV, the influx of sodium ions becomes small.
• At high values of V the h-gate becomes deactivated. The Na$^+$ channel closes within a characteristic relaxation time of 1 − 8 msec.
• While this happens, the slower potassium gate n becomes activated, with a characteristic time of 2 − 5 msec. K$^+$ ions begin to flow out of the axon. The membrane repolarizes.
15.2.7 Pulse propagation
It is now possible to look for propagating solutions of the nonlinear diffusion equation (15.3), i.e. solutions of the form $V(x - vt)$; since electrodes are usually placed at a fixed point in space and follow the temporal evolution of a pulse, it is more convenient to look (equivalently) for solutions of the form $V(t - x/v)$. Moreover, in order to keep everything first-order in time, it is convenient to revert to (15.1) and (15.2) instead of (15.3).
Figure 15.5: The thick curve shows the membrane potential V (mV, left y-scale) as a function of time t (msec), for relaxation under space clamping. The pair of thinner curves (right y-scale, j in mA/cm$^2$) represents the potassium and sodium currents. Inset: the potassium and sodium conductances G (mS/cm$^2$). Note that the sodium conductance peak occurs much earlier in time.
This leads to a system of 5 coupled ODEs ((15.1) and (15.2) plus the three equations of the type (15.9) which determine the time evolution of the gating functions):
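$$
\begin{aligned}
\frac{dV}{dt} &= \frac{4}{\pi\sigma d^2}\, v I \\
\frac{1}{\pi d v}\,\frac{dI}{dt} &= \frac{4c}{\pi\sigma d^2}\, v I + j_{Na}(V) + j_K(V) + j_L(V) \\
\frac{dn}{dt} &= -\frac{n - n_0(V)}{\tau_n(V)} \\
\frac{dm}{dt} &= -\frac{m - m_0(V)}{\tau_m(V)} \\
\frac{dh}{dt} &= -\frac{h - h_0(V)}{\tau_h(V)}
\end{aligned}
\qquad (15.14)
$$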
Figure 15.6: A propagating pulse solution of the HH equations corresponds to a homoclinic trajectory (from [40]).
The above system of coupled ODEs has an equilibrium point $S = (0, 0, n_0(0), m_0(0), h_0(0))$ in the five-dimensional space $(V, I, n, m, h)$. A pulse-like solution vanishes as $t \to \pm\infty$. It corresponds to a homoclinic trajectory in the 5-dimensional space. This is shown schematically in Fig. 15.6, where the three dimensions m, n, h are collapsed onto one. The trajectory starts off at S at $t = -\infty$; for a generic value of v it will eventually wander off to unbounded values of voltage and/or current. If v has the “right” value, it will return to S in the limit $t \to \infty$. With the HH parameters, this occurs for two values of v. The first, v = 18.8 m/sec, corresponds to a stable pulse with an amplitude of approximately 90 mV. The second, v = 5.7 m/sec, corresponds to an unstable pulse with a smaller amplitude (Fig. 15.7). The velocity of the stable pulse is comparable to the measured velocity, 21.2 m/sec, of the squid’s action potential.
Figure 15.7: Propagating pulse solutions of the HH equations (from [40]).
16 Localization and transport of energy
in proteins: The Davydov soliton
16.1 Background. Model Hamiltonian
16.1.1 Energy storage in C=O stretching modes. Excitonic Hamiltonian
Davydov’s proposal [41] was an attempt to deal with localization and transport of energy in alpha-helical proteins. He viewed the three strands of the alpha helix as roughly independent one-dimensional chains, and assumed that energy from ATP hydrolysis (0.42 eV) could be conveniently stored in the C=O (Amide-I) stretching vibration. He then argued that if the energy had to hop from one site to the next, following a linear (excitonic) model Hamiltonian
$$ H_{exc} = \sum_n \left[ E_0\, B_n^\dagger B_n - J \left( B_{n+1}^\dagger B_n + B_n^\dagger B_{n+1} \right) \right] , \qquad (16.1) $$
where the $B_n$’s are boson operators representing the C=O stretching mode at the nth site, and J represents the hopping parameter, it would very soon (in the order of a few picoseconds) be dissipated due to linear dispersion, and thus cease to be available where really needed, e.g. for muscular contraction.
16.1.2 Coupling to lattice vibrations. Analogy to polaron
Davydov then speculated whether the “self-trapping” effect, which had been proposed by Landau in 1933 in the context of electrons coupled to the lattice, and known in solid-state physics as the polaron, could be applicable in the bosonic system (16.1). We have already dealt with a similar situation in conjugated polymers, where the coupling to the lattice vibrations produces stable, propagating excitations (solitons and polarons).
Lattice vibrations are acoustic phonons represented by the Hamiltonian
$$ H_{ph} = \sum_n \frac{p_n^2}{2M} + \frac{1}{2}\, k \sum_n (u_{n+1} - u_n)^2 \qquad (16.2) $$
where $u_n$ is the displacement of the nth site from its equilibrium position, M is the mass associated with each unit of the alpha-helix, and k is a spring constant associated with the longitudinal motion along the chain.
Coupling of the exciton modes to the lattice vibrations may occur because the energy stored at a given site changes with the distance between adjacent sites, i.e. with the length of the hydrogen bond connecting the nth to the (n + 1)st site of the strand:
$$ E_0 \to E_0 + \chi (u_{n+1} - u_n) . $$
The above coupling generates an interaction Hamiltonian of the form
$$ H_{int} = \chi \sum_n (u_{n+1} - u_n)\, B_n^\dagger B_n . \qquad (16.3) $$
The total Hamiltonian is the sum of the three terms
$$ H = H_{exc} + H_{ph} + H_{int} . \qquad (16.4) $$
16.2 Born-Oppenheimer dynamics
The dynamics of the coupled exciton-phonon system described by the Hamiltonian (16.4) can be considerably simplified if we make use of the Born-Oppenheimer approximation. This is not an unreasonable assumption, since acoustic vibrations are slow compared to the excitonic modes. It allows us to treat the lattice displacements as classical variables and simplifies the mathematical computations.
16.2.1 Quantum (excitonic) dynamics
For a given set of lattice displacements $\{u_n\}$ we denote the excitonic wave function by
$$ |\Psi\rangle = \sum_n \alpha_n(t)\, B_n^\dagger\, |0\rangle \qquad (16.5) $$
where $|0\rangle$ is the bosonic vacuum state and the amplitudes $\alpha_n(t)$ depend parametrically on the lattice configuration $\{u_n\}$. Normalization of the quantum state demands that
$$ \sum_n |\alpha_n|^2 = 1 . \qquad (16.6) $$
The time evolution proceeds according to the time-dependent Schrödinger equation
$$ i\hbar \frac{\partial}{\partial t} |\Psi\rangle = \hat H\, |\Psi\rangle \qquad (16.7) $$
where $\hat H = H_{exc} + H_{int}$ is the quantum part of the Hamiltonian (remember, at this stage $H_{ph}$ is just a c-number!); with the Ansatz (16.5) the Schrödinger equation (16.7) can be written as a set of coupled equations for the complex amplitudes
$$ i\hbar \frac{\partial \alpha_n}{\partial t} = E_0 \alpha_n + \chi (u_{n+1} - u_n)\, \alpha_n - J (\alpha_{n+1} + \alpha_{n-1}) . \qquad (16.8) $$
The total energy
The expectation value of the excitonic part of the energy can be expressed in terms of the amplitudes as
$$ \langle\Psi|\hat H|\Psi\rangle = \sum_n \left[ E_0 + \chi (u_{n+1} - u_n) \right] \alpha_n^* \alpha_n - J \sum_n \alpha_n^* (\alpha_{n+1} + \alpha_{n-1}) . \qquad (16.9) $$
The total energy of a given exciton-phonon configuration $\{u_n, \alpha_n\}$ is
$$ \mathcal{E} = \langle\Psi|\hat H|\Psi\rangle + H_{ph}(\{u_n\}) . \qquad (16.10) $$
The limiting case χ = 0
In the limiting case $\chi = 0$, (16.8) reduces to
$$ i\hbar \frac{\partial \alpha_n}{\partial t} = E_0 \alpha_n - J (\alpha_{n+1} + \alpha_{n-1}) , \qquad (16.11) $$
which has plane wave solutions of the form
$$ \alpha_n^{(q)}(t) = \frac{1}{\sqrt{N}}\, e^{\,i (q x - \mathcal{E}_q t/\hbar)} \qquad (16.12) $$
(where $x = na$ is the position of the nth site) with total energy
$$ \mathcal{E}_q = E_0 - 2J \cos qa \equiv \hbar\omega_q \qquad (16.13) $$
where a is the lattice constant (the distance between successive sites along a single strand of the helix). These plane waves are the excitons which, according to Davydov, would exhibit dispersion over a time scale much too short to be relevant for biological energy transport.
The group velocity of excitons is given by
$$ v_g = \frac{\partial \omega_q}{\partial q} = \frac{2Ja}{\hbar} \sin qa . \qquad (16.14) $$
In the long-wavelength limit the exciton energy takes the form
$$ \mathcal{E}_q = E_0 - 2J + \frac{\hbar^2}{4 J a^2}\, v_g^2 . \qquad (16.15) $$
We identify
$$ m^* = \frac{\hbar^2}{2 J a^2} $$
as the exciton’s effective mass.
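The statements of this subsection can be checked numerically on a ring of N sites. The sketch below verifies that a plane wave satisfies the eigenvalue relation implied by (16.11) with the band energy (16.13), and that the long-wavelength form (16.15) with the effective mass $m^* = \hbar^2/(2Ja^2)$ reproduces the band energy for small q. The parameter values ($E_0 = 1$, $J = 0.1$, $\hbar = a = 1$) and the function names are arbitrary illustrative choices.

```python
import cmath
import math

def band_energy(q, E0, J, a=1.0):
    """Exciton band energy, Eq. (16.13)."""
    return E0 - 2.0 * J * math.cos(q * a)

def group_velocity(q, J, a=1.0, hbar=1.0):
    """Exciton group velocity, Eq. (16.14)."""
    return 2.0 * J * a * math.sin(q * a) / hbar

def effective_mass(J, a=1.0, hbar=1.0):
    """Exciton effective mass m* = hbar^2 / (2 J a^2)."""
    return hbar**2 / (2.0 * J * a**2)

def plane_wave_residual(q_index, N, E0, J):
    """Largest deviation from E0*a_n - J*(a_{n+1} + a_{n-1}) = E_q*a_n on a ring (a = 1)."""
    q = 2.0 * math.pi * q_index / N          # wavevectors allowed by periodicity
    amp = [cmath.exp(1j * q * n) / math.sqrt(N) for n in range(N)]
    e_q = band_energy(q, E0, J)
    return max(abs(E0 * amp[n] - J * (amp[(n + 1) % N] + amp[n - 1]) - e_q * amp[n])
               for n in range(N))

if __name__ == "__main__":
    E0, J = 1.0, 0.1                          # illustrative values only
    print("residual:", plane_wave_residual(5, 64, E0, J))
    q = 0.01
    exact = band_energy(q, E0, J)
    approx = E0 - 2.0 * J + 0.5 * effective_mass(J) * group_velocity(q, J)**2
    print("band energy:", exact, " long-wavelength form:", approx)
```

The two energies agree up to corrections of relative order $(qa)^2$, as expected from the expansion of $\cos qa$.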
16.2.2 Lattice motion
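The dynamics of the classical lattice coordinates is described by
$$ M \ddot u_n = -\frac{\partial}{\partial u_n} \left[ H_{ph} + \langle\Psi|\hat H|\Psi\rangle \right] = k (u_{n+1} + u_{n-1} - 2u_n) + \chi \left( |\alpha_n|^2 - |\alpha_{n-1}|^2 \right) . \qquad (16.16) $$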
16.2.3 Coupled exciton-phonon dynamics
It will prove convenient to define a new complex amplitude via
$$ \alpha_n = \phi_n\, e^{-\frac{i}{\hbar}(E_0 - 2J)\, t} . $$
This allows us to rewrite (16.8) and (16.16) as
$$ i\hbar\, \dot\phi_n = -J (\phi_{n+1} + \phi_{n-1} - 2\phi_n) + \chi (u_{n+1} - u_n)\, \phi_n \qquad (16.17) $$
and
$$ M \ddot u_n = k (u_{n+1} + u_{n-1} - 2u_n) + \chi \left( |\phi_n|^2 - |\phi_{n-1}|^2 \right) \qquad (16.18) $$
respectively.
The general problem of coupled phonon-exciton dynamics involves the solution of the above set of coupled ODEs.
16.3 The Davydov soliton
16.3.1 The heavy ion limit. Static Solitons
In the limit $M \to \infty$ we may assume that ions do not move. This allows us to set the left-hand side of (16.18) equal to zero, whereupon
$$ u_{n+1} - u_n = -\frac{\chi}{k}\, |\phi_n|^2 , \qquad (16.19) $$
which transforms (16.17) to
$$ i\hbar\, \dot\phi_n = -J (\phi_{n+1} + \phi_{n-1} - 2\phi_n) - \frac{\chi^2}{k}\, |\phi_n|^2 \phi_n , \qquad (16.20) $$
and the total energy (16.10) to
$$ \mathcal{E} = \sum_n E_0 |\phi_n|^2 - J \sum_n \phi_n^* (\phi_{n+1} + \phi_{n-1}) - \frac{\chi^2}{2k} \sum_n |\phi_n|^4 . \qquad (16.21) $$
The continuum approximation
If the amplitudes vary smoothly from site to site, we can approximate the set of discrete variables $\{\phi_n\}$ by a continuum field $\phi(x)$. The dynamics (16.20) takes the form of the continuum field equation
$$ i\phi_\tau + \phi_{xx} + \frac{1}{\sigma_0}\, |\phi|^2 \phi = 0 , \qquad (16.22) $$
where $\sigma_0 = kJ/\chi^2$ and I have introduced the dimensionless time $\tau = Jt/\hbar$.
We recognize (16.22) as the nonlinear Schrödinger (NLS) equation. In section 12.3.4 we derived a family of one-soliton solutions of the form
$$ |\phi(x)| = \sqrt{\sigma_0}\, \kappa\ \mathrm{sech}\, \frac{\kappa (x - v\tau)}{\sqrt 2} $$
with arbitrary $\kappa$ and v. In the present context, since we assumed that ions do not move, v must vanish. Furthermore, the normalization condition
$$ \int_{-\infty}^{\infty} dx\, |\phi(x)|^2 = 1 $$
fixes the value of $\kappa = 1/(2\sqrt{2}\,\sigma_0)$.
Form of the soliton
The exact form of the static soliton is given by
$$ \phi(x) = \frac{1}{\sqrt{8\sigma_0}}\ \mathrm{sech}\, \frac{x}{4\sigma_0}\ e^{\,i\tau/(16\sigma_0^2)} . $$
Note that the spatial extent of the soliton (in units of the lattice constant) is of the order of $4\sigma_0$. With the standard parameter values [42] $\sigma_0$ is estimated to be between 1.6 and 7.4; this appears to justify the use of the continuum approximation.
Energy considerations
The total energy of the soliton can be calculated from (16.21). The first term in (16.21) contributes $E_0$. In the second term, we can use a Taylor expansion which produces a contribution
$$ -2J + J \int_{-\infty}^{\infty} dx\, |\phi_x|^2 = -2J + \frac{\chi^4}{48 J k^2} . $$
Finally, the third term produces a contribution
$$ -\frac{\chi^2}{2k} \int_{-\infty}^{\infty} dx\, |\phi|^4 = -\frac{\chi^4}{24 J k^2} . $$
Collecting terms, I obtain the total energy of a static soliton
$$ \mathcal{E}(v = 0) = E_0 - 2J - \frac{\chi^4}{48 J k^2} \qquad (16.23) $$
which lies below the excitonic band (16.13). Thus the soliton is expected to be a more stable excitation (cf. the similar argument made with the SSH kink, or the TLM polaron in conjugated polymers).
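The three integrals entering the static soliton energy can be verified by direct quadrature of the profile $|\phi(x)| = (8\sigma_0)^{-1/2}\,\mathrm{sech}(x/4\sigma_0)$. The following sketch uses the trapezoidal rule; the integration window and step count are arbitrary choices, and the test value $\sigma_0 = 2$ is only an example inside the quoted range 1.6 to 7.4. With $\sigma_0 = kJ/\chi^2$, the expected values are $\int|\phi|^2 = 1$, $\int|\phi_x|^2 = 1/(48\sigma_0^2)$ and $\int|\phi|^4 = 1/(12\sigma_0)$.

```python
import math

def soliton(x, sigma0):
    """Static soliton envelope |phi(x)| and its x-derivative (x in lattice units)."""
    w = 4.0 * sigma0                          # soliton width
    phi = 1.0 / (math.sqrt(8.0 * sigma0) * math.cosh(x / w))
    dphi = -phi * math.tanh(x / w) / w
    return phi, dphi

def integrals(sigma0, half_width=15.0, steps=20000):
    """Trapezoidal estimates of int |phi|^2, int |phi_x|^2 and int |phi|^4."""
    L = half_width * 4.0 * sigma0             # integrate over |x| <= 15 widths
    dx = 2.0 * L / steps
    norm = kin = quart = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        phi, dphi = soliton(-L + i * dx, sigma0)
        norm += weight * phi**2 * dx
        kin += weight * dphi**2 * dx
        quart += weight * phi**4 * dx
    return norm, kin, quart

def binding_energy(J, sigma0):
    """Soliton energy measured from the band bottom E0 - 2J, in the units of J."""
    _, kin, quart = integrals(sigma0)
    # chi^2 / (2k) = J / (2 sigma0), since sigma0 = k J / chi^2
    return J * kin - J / (2.0 * sigma0) * quart

if __name__ == "__main__":
    s0 = 2.0
    print(integrals(s0), 1.0 / (48.0 * s0**2), 1.0 / (12.0 * s0))
    print(binding_energy(1.0, s0), -1.0 / (48.0 * s0**2))
```

The last line compares the numerical binding energy with $-J/(48\sigma_0^2) = -\chi^4/(48Jk^2)$, the shift appearing in (16.23).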
16.3.2 Moving solitons
It is straightforward to generalize the analysis of the previous subsection to the case of moving solitons. The system of coupled ODEs (16.17) and (16.18) can be written in the continuum limit as
$$
\begin{aligned}
i\hbar\, \phi_t + J \phi_{xx} - \chi u_x \phi &= 0 \\
M u_{tt} - k u_{xx} - \chi \left( |\phi|^2 \right)_x &= 0 .
\end{aligned}
\qquad (16.24)
$$
If we look for propagating solutions of the type $u(x - vt)$, the second equation has a first integral,
$$ M (v^2 - c^2)\, u_x = \chi |\phi|^2 $$
where $Mc^2 = k$ and I have assumed boundary conditions decaying at infinity to set the integration constant equal to zero. The last equation, when introduced into the first of (16.24), yields
$$ i \frac{\hbar}{J}\, \phi_t + \phi_{xx} + \frac{1}{\sigma}\, |\phi|^2 \phi = 0 \qquad (16.25) $$
which contains only the $\phi$ field. The parameter $\sigma = \sigma_0 (1 - v^2/c^2)$ now depends on the soliton velocity. Again, we recognize (16.25) as the NLS equation, with a family of normalized (cf. above) solutions
$$ \phi(x, t) = \psi(x, t)\, e^{i\theta} $$
where
$$ \psi(x, t) = \frac{1}{\sqrt{8\sigma}}\ \mathrm{sech} \left( \frac{x - vt - x_0}{4\sigma} \right) $$
$$ \theta = \left( \frac{J}{16 \hbar \sigma^2} - \frac{\hbar}{4J}\, v^2 \right) t + \frac{\hbar v}{2J}\, x + \theta_0 . $$
The moving Davydov soliton is a coherent localized excitation which couples the quantum (excitonic) degrees of freedom to the vibrations of the underlying lattice. Note that the above analysis is only valid for positive $\sigma$. This restricts soliton velocities to $v < c$.
Energy of the moving soliton
Again, it is possible to calculate the total energy involved in the coupled exciton-phonon system. The contributions can be read off (16.10) in the continuum limit:
$$
\begin{aligned}
\mathcal{E}(v) &= E_0 - 2J + J \int dx\, |\phi_x|^2 + \chi \int dx\, u_x |\phi|^2 + \frac{k}{2} \int dx\, u_x^2 + \frac{M}{2} \int dx\, u_t^2 \\
&= E_0 - 2J + J \int dx \left( \psi_x^2 + \theta_x^2 \psi^2 \right) - J \int \frac{dx}{\sigma} |\phi|^4 + \frac{1}{2} J \frac{\sigma_0}{\sigma} \int \frac{dx}{\sigma} |\phi|^4 + \frac{1}{2} J \frac{\sigma_0}{\sigma} \frac{v^2}{c^2} \int \frac{dx}{\sigma} |\phi|^4 \\
&\approx E_0 - 2J + J \int dx \left( \psi_x^2 + \theta_x^2 \psi^2 \right) - \frac{J}{2} \int \frac{dx}{\sigma} \psi^4 + J \frac{v^2}{c^2} \int \frac{dx}{\sigma} \psi^4 + O(v^4/c^4) \\
&\approx E_0 - 2J + J \left( \frac{1}{48\sigma^2} + \frac{\hbar^2 v^2}{4J^2} \right) - \frac{J}{24\sigma^2} + \frac{J}{12\sigma^2} \frac{v^2}{c^2} \\
&\approx E_0 - 2J - \frac{\chi^4}{48 J k^2} + \frac{1}{2} m_s^* v^2 + O(v^4/c^4)
\end{aligned}
\qquad (16.26)
$$
where the sum of the first three terms is the energy (16.23) of the static soliton and
$$ m_s^* = m^* \left( 1 + \frac{M \chi^4}{6 \hbar^2 k^3} \right) \qquad (16.27) $$
is the soliton’s effective mass (where I have reintroduced the lattice constant a in order to restore the proper units to $m^*$).
17 Nonlinear localization in
translationally invariant systems:
discrete breathers
The existence of localized states in condensed matter systems has traditionally been associated with the presence of impurities and disorder, i.e. with material behavior which breaks the translational invariance of the perfect crystal. One of the major discoveries of the last two decades in nonlinear science is that localization may also occur as a consequence of nonlinearity in pure, translationally invariant systems of any dimensionality. A key contribution to this field was made by Sievers and Takeno [43] who proposed the existence of “intrinsic localized modes” in anharmonic crystals. I will first present their argument, which is approximate and makes use of the “rotating-wave approximation” (RWA). I will then present further evidence for the existence of nonlinear localized excitations (called “discrete breathers” by some authors) based on numerical calculations by Flach and coworkers [44] and give a plausibility argument which underlies the exact mathematical proof given by MacKay and Aubry [45]. For more details consult the review articles by Flach [46] and Aubry [47].
17.1 The Sievers-Takeno conjecture
The starting point is the one-dimensional Hamiltonian which describes nearest-neighbor atoms coupled via nonlinear springs; the anharmonicity is quartic:
$$ H = \sum_n \left[ \frac{p_n^2}{2M} + \frac{1}{2} K_2 (u_{n+1} - u_n)^2 + \frac{1}{4} K_4 (u_{n+1} - u_n)^4 \right] \qquad (17.1) $$
(the prefactor 1/4 of the quartic term is the one consistent with the equations of motion below). I try to solve the equations of motion
$$ M \ddot u_n = K_2 (u_{n+1} + u_{n-1} - 2u_n) + K_4 \left[ (u_{n+1} - u_n)^3 + (u_{n-1} - u_n)^3 \right] \qquad (17.2) $$
using the “rotating-wave approximation”, known from the theory of nuclear magnetic resonance. The idea is to make the Ansatz
$$ u_n = \alpha \left( \xi_n e^{-i\omega t} + \xi_n^* e^{i\omega t} \right) \qquad (17.3) $$
and neglect fast oscillations which occur from terms of order $e^{-3i\omega t}$. In the spirit of the RWA, such oscillations presumably average to zero over the time scales of interest, defined by the inverse of the frequency $\omega$. I will further assume that the $\xi_n$’s are real, so that I keep track of the $e^{-i\omega t}$ terms only. Note that there are no terms of order $e^{\pm 2i\omega t}$. The Ansatz results in a second order recurrence equation for the amplitudes:
$$ 2\xi_n - \xi_{n+1} - \xi_{n-1} + \lambda \left[ (\xi_n - \xi_{n+1})^3 + (\xi_n - \xi_{n-1})^3 \right] = \frac{\omega^2}{J}\, \xi_n \qquad (17.4) $$
where $J = K_2/M$ and $\lambda = 3 K_4 \alpha^2 / K_2$. The value of J can be set equal to unity without loss of generality.
I now rewrite (17.4) as
$$ [\omega^2 \delta_{mn} - D_{mn}]\, \xi_n = V_n(\xi_{n-1}, \xi_n, \xi_{n+1}) \qquad (17.5) $$
where
$$ V_n(\xi_{n-1}, \xi_n, \xi_{n+1}) = \lambda \left[ (\xi_n - \xi_{n+1})^3 + (\xi_n - \xi_{n-1})^3 \right] $$
and the matrix elements $D_{mn} = 2$ if $m = n$, $D_{mn} = -1$ if $m = n \pm 1$ and $D_{mn} = 0$ otherwise describe the dynamical matrix of the harmonic chain with nearest neighbor springs. We have already used the inverse of the matrix $\omega^2 I - D$ in discussing the problem of a single impurity. The matrix
$$ \left[ \omega^2 - D \right]^{-1}_{mn} = G(|m - n|, \omega^2) = \frac{1}{N} \sum_q \frac{e^{-iq(m-n)}}{\omega^2 - \omega_q^2} , \qquad (17.6) $$
where the sum runs over all eigenvectors of the dynamical matrix, is known as the lattice Green function. In the particular case we are discussing here, $\omega_q^2 = 2(1 - \cos q)$ and G has been shown, for frequencies above the band, i.e. $\omega^2 > 4$, to be of the form
$$ G(n, \omega^2) = (-1)^n\, \frac{1}{\omega^2}\, g(n) , \quad \text{where} $$
$$ g(n) = (1 - y)^{-1/2} \left\{ \frac{2}{y} \left[ 1 - \frac{1}{2} y - (1 - y)^{1/2} \right] \right\}^n \sim \left( \frac{y}{4} \right)^n , \qquad (17.7) $$
where $y = 4/\omega^2$ and the last expression gives the leading order asymptotic expansion for $y \ll 1$.
Use of the lattice Green function allowed Sievers and Takeno to rewrite (17.5) as
$$ \xi_m = \sum_n G(m - n, \omega^2)\, V_n(\xi_{n-1}, \xi_n, \xi_{n+1}) \qquad (17.8) $$
and exploit the rapid convergence properties of the sum in the r.h.s. of (17.8) which results from the exponential decay of the Green functions. Let us see how this works in detail:
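The closed form (17.7) can be compared with a direct evaluation of the mode sum (17.6). The sketch below does this for a ring of N sites with $\omega_q^2 = 2(1 - \cos q)$; the ring size N = 512 and the test frequency $\omega^2 = 6$ are arbitrary choices, and only the real part of the sum is kept since the imaginary parts cancel between q and −q.

```python
import math

def green_sum(n, omega2, N=512):
    """Lattice Green function G(n, omega^2) from the mode sum (17.6)."""
    total = 0.0
    for j in range(N):
        q = 2.0 * math.pi * j / N
        total += math.cos(q * n) / (omega2 - 2.0 * (1.0 - math.cos(q)))
    return total / N

def g_closed(n, omega2):
    """The function g(n) of (17.7); valid above the band, omega^2 > 4."""
    y = 4.0 / omega2
    root = math.sqrt(1.0 - y)
    return ((2.0 / y) * (1.0 - 0.5 * y - root)) ** abs(n) / root

def green_closed(n, omega2):
    """Closed form G(n, omega^2) = (-1)^n g(n) / omega^2."""
    return (-1) ** abs(n) * g_closed(n, omega2) / omega2

if __name__ == "__main__":
    omega2 = 6.0
    for n in range(5):
        print(n, green_sum(n, omega2), green_closed(n, omega2))
```

The sign alternation and the geometric decay with distance are both visible in the printed values; for $y \ll 1$ the decay factor approaches $y/4$.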
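We look for symmetric solutions of the type
$$ \xi_n = (-1)^n \eta_n $$
with $\eta_0 = 1$ (this is permissible since the scale of the amplitude has already been fixed by $\alpha$ in the original Ansatz (17.3)). Because of (17.4), $\eta_0$ and $\eta_1$ are related by the equation
$$ 1 + \eta_1 + \lambda (1 + \eta_1)^3 = \frac{1}{2} \omega^2 = \frac{2}{y} . \qquad (17.9) $$
In terms of the $\eta$’s, (17.8) can be written as
$$ \eta_m = \frac{\lambda}{\omega^2} \sum_{n=-\infty}^{\infty} \left\{ g(m - n) + g(m - n - 1) \right\} (\eta_n + \eta_{n+1})^3 , $$
which can be further symmetrized by making use of the property $g(-n) = g(n)$, to
$$ \eta_m = \left( \frac{2}{y} - 1 - \eta_1 \right) g(m) + \frac{1}{4} \lambda y \sum_{n=1}^{\infty} A_{mn} (\eta_n + \eta_{n+1})^3 , \qquad m \ge 1 , \qquad (17.10) $$
where
$$ A_{mn} = g(m - n) + g(m - n - 1) + g(m + n) + g(m + n + 1) . $$
Here the second argument is written as $m - n - 1$ (equivalently $n - m + 1$, since g is even): this is what the index shift in the second term of $V_n$ produces, and it is the form for which the $n = 0$ contribution reduces to the first term of (17.10) by virtue of (17.9).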
Note that the m = 0 equation (which I did not write down) is not really independent, since it must be equivalent to (17.9). The system of coupled nonlinear equations (17.9) and (17.10) can in principle be solved numerically to yield the amplitudes and the eigenfrequency. The numerical procedure would presumably converge fast, due to the exponential decay of the $A_{mn}$’s. In practice, only a few terms are expected to contribute to the sum.
However, one can already make some statements using the asymptotic properties of the Green functions in the limit $y \ll 1$.
To leading order in y, the m = 1 equation (17.10) gives
$$ \eta_1 \sim \frac{1}{2} $$
which can be used in (17.9) to compute the eigenvalue. The result
$$ y = \frac{4}{\omega^2} = \frac{1}{\frac{3}{4} + \frac{27}{16} \lambda} $$
gives a consistent value of $y \ll 1$ if $\lambda \gg 4/27$. For sufficiently strong nonlinearities, one therefore has
$$ \omega^2 \sim 3 + \frac{27}{4} \lambda . \qquad (17.11) $$
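The arithmetic leading from (17.9) to (17.11) is easy to reproduce. The sketch below evaluates $\omega^2$ from (17.9) for a given $\eta_1$, compares with the closed expression (17.11) for $\eta_1 = 1/2$, and reports the corresponding $y = 4/\omega^2$ so that the consistency condition $y \ll 1$ can be inspected. The function names and the sample values of $\lambda$ are illustrative only.

```python
def omega2_from_eta1(lam, eta1):
    """omega^2 from the n = 0 relation (17.9): 1 + eta1 + lam (1 + eta1)^3 = omega^2 / 2."""
    return 2.0 * ((1.0 + eta1) + lam * (1.0 + eta1) ** 3)

def omega2_strong(lam):
    """Strong-nonlinearity estimate (17.11), i.e. (17.9) evaluated at eta1 = 1/2."""
    return 3.0 + 27.0 * lam / 4.0

def y_of(lam):
    """y = 4 / omega^2 using (17.11); the estimate is self-consistent only if y << 1."""
    return 4.0 / omega2_strong(lam)

if __name__ == "__main__":
    for lam in (0.1, 1.0, 10.0, 100.0):   # sample values, chosen for illustration
        print(f"lambda = {lam:6.1f}   omega^2 = {omega2_strong(lam):8.2f}   y = {y_of(lam):.4f}")
```

For $\lambda$ of order 4/27 the value of y is of order unity, so the estimate is only trustworthy for considerably larger $\lambda$.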
17.2 Numerical evidence of localization
An illuminating picture of nonlinear localization as “local integrability” was obtained through numerical simulations by Flach and coworkers [44]. I summarize some of their findings.
The starting point is the one-dimensional Hamiltonian
$$ H = \sum_n \left[ \frac{p_n^2}{2} + \frac{1}{2} C (u_{n+1} - u_n)^2 + V(u_n) \right] \qquad (17.12) $$
with a moderately weak harmonic interparticle coupling strength C = 0.1 and a nonlinear on-site potential
$$ V(u) = u^2 - u^3 + \frac{1}{4} u^4 . $$
The equations of motion
$$ \ddot u_n = C (u_{n+1} + u_{n-1} - 2u_n) - V'(u_n) , \qquad n = 0, \pm 1, \dots, \pm N/2 \qquad (17.13) $$
were numerically integrated for a lattice of N = 3000 sites, subject to periodic boundary conditions. The initial condition was spatially localized, i.e. all particles started at rest, $\dot u_n = 0\ \forall n$, and all but one of them (at the n = 0 site) were at the equilibrium positions corresponding to the absolute minimum of (17.12).
A measure of the energy density at the lth site is given by
$$ e_l = \frac{1}{2} \dot u_l^2 + V(u_l) + \frac{C}{4} \left[ (u_{l+1} - u_l)^2 + (u_l - u_{l-1})^2 \right] . \qquad (17.14) $$
Note that the sum over all $e_l$’s is by definition a conserved quantity, the total energy.
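A small-scale version of this numerical experiment can be set up in a few lines. The sketch below integrates (17.13) with the velocity-Verlet scheme and evaluates the energy densities (17.14). It is not the procedure of [44]: the lattice size (N = 64 instead of 3000), the initial displacement $u_0 = 0.5$, the time step and the integrator are all assumptions made for illustration. It returns the initial total energy, the final total energy and the energy found on the five central sites.

```python
C = 0.1  # coupling strength quoted in the text

def V(u):
    """On-site potential V(u) = u^2 - u^3 + u^4/4."""
    return u * u - u**3 + 0.25 * u**4

def dV(u):
    """Derivative V'(u) = 2u - 3u^2 + u^3."""
    return 2.0 * u - 3.0 * u * u + u**3

def accel(u):
    """Right-hand side of (17.13) with periodic boundary conditions."""
    N = len(u)
    return [C * (u[(n + 1) % N] + u[n - 1] - 2.0 * u[n]) - dV(u[n]) for n in range(N)]

def energy_density(u, v):
    """Site energies e_l of (17.14); their sum is the total energy."""
    N = len(u)
    return [0.5 * v[l]**2 + V(u[l])
            + 0.25 * C * ((u[(l + 1) % N] - u[l])**2 + (u[l] - u[l - 1])**2)
            for l in range(N)]

def run(N=64, u0=0.5, dt=0.01, steps=2000):
    """Velocity-Verlet integration starting from a single displaced site at rest."""
    u, v = [0.0] * N, [0.0] * N
    u[0] = u0
    e_start = sum(energy_density(u, v))
    a = accel(u)
    for _ in range(steps):
        v = [v[n] + 0.5 * dt * a[n] for n in range(N)]
        u = [u[n] + dt * v[n] for n in range(N)]
        a = accel(u)
        v = [v[n] + 0.5 * dt * a[n] for n in range(N)]
    e = energy_density(u, v)
    central = sum(e[l] for l in (-2, -1, 0, 1, 2))   # sites -2..2 via periodic indexing
    return e_start, sum(e), central

if __name__ == "__main__":
    e_start, e_end, central = run()
    print(f"E(0) = {e_start:.6f}   E(t) = {e_end:.6f}   energy on 5 central sites = {central:.6f}")
```

The conservation of the summed energy densities provides a basic check on the integration; whether the central energy stays localized over long times depends on the chosen initial condition and should be inspected for each case.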
Figure 17.1: The temporal evolution of $e_{(5)}$. The solid line shows the total energy. In the inset, the energy distribution around the central site, measured for 1000 < t < 1150 (from [44]).
17.2.1 Diagnostics of energy localization
An empirical diagnosis of localization can be made in terms of the quantities
$$ e_{(2m+1)} = \sum_{l=-m}^{m} e_l , \qquad (17.15) $$
which provide a measure of the energy residing in the first 2m + 1 central sites. If these can be shown to remain constant over long periods of time, one may reasonably conjecture the presence of a localized oscillation which keeps the energy around the central sites. Fig. 17.1 describes the temporal evolution of $e_{(5)}$. There is some radiation, amounting to less than 1% of the total energy, which occurs during the first few hundred time units. After these transients decay, however, the energy stays remarkably constant.
17.2.2 Internal dynamics
It is possible to obtain additional information about the internal dynamics of the localized oscillation by looking at the Fourier spectra of the few central sites. The numerical results are shown in Fig. 17.2. The spectrum of the central site, l = 0, shows a dominant peak at $\omega_1 = 0.822$. The spectra of the sites $l = \pm 1$ show a dominant peak at $\omega_2 = 1.34$. All other peaks of both spectra can be obtained as sums or differences of these two fundamental frequencies. This remarkable result suggests that we are, in effect, dealing with a system of two degrees of freedom.
This suggestion can be tested in some more detail by looking at the reduced dynamical system with two degrees of freedom
$$ \ddot u_0 = -V'(u_0) + 2C (u_1 - u_0) \qquad (17.16) $$
$$ \ddot u_1 = -V'(u_1) + C (u_0 - 2u_1) \qquad (17.17) $$
which is obtained from the full dynamics (17.13) by assuming all particles with $|l| > 1$ to remain at rest, and exploiting the symmetry $u_{-1} = u_1$.
Figure 17.2: Fourier spectra of $u_0(t > 1000)$. Inset: spectra of $u_1$. All peak frequencies are either sums or differences of the two fundamental frequencies. (From [44].)
Fig. 17.3 shows a Poincaré cut of this reduced dynamical system. Note the presence of regular motion (tori) embedded in a sea of chaos. Flach and coworkers [44] made, and tested, the following remarkable conjecture: If the frequencies of a torus lie outside the phonon band of the linearized version of (17.13), the torus should correspond to a localized oscillation of the full system. Indeed, a choice of initial condition from the islands 1 or 2 generates a localized oscillation in the full system (17.13). A choice from island 3, or from a chaotic trajectory, generates a non-localized pattern. This is shown in detail in Fig. 17.4.
17.3 Towards exact discrete breathers
Consider a Hamiltonian of the type (17.12) which describes a system of N weakly coupled nonlinear oscillators. The spatial dimensionality is not important for the arguments which follow. Let the state of the system be described at any given time by the 2N-dimensional vector
$$ \vec X = \{ x_1, \dots, x_N ;\ p_1, \dots, p_N \} . $$
At zero coupling strength it is possible to excite a single oscillator at the lth site and leave all other particles at rest. The motion of the system will be periodic, with a given period T. I denote this by
$$ \vec X_0(t) = \vec X_0(t + T) . $$
Figure 17.3: Poincaré cut of the reduced dynamical system at E = 0.58. (From [44].)
Figure 17.4: Temporal evolution of $e_{(5)}$ for a variety of initial conditions chosen according to their properties in the reduced system. Localization occurs for 4 initial conditions chosen from fixed points in islands 1 and 2 in Fig. 17.3, the larger torus in island 1, and the torus in island 2 (solid lines). Initial conditions from the torus in island 3 (long-dashed line), or from a chaotic trajectory (dashed-dotted line) lead to decaying $e_{(5)}$. The upper short-dashed line shows the total energy of all simulations. (From [44].)
Now consider what happens when a weak coupling is turned on. The time evolution of the system over a time T, generated by the full Hamiltonian, transforms any initial state vector $\vec X(0)$ (denoted from now on as $\vec X$ for simplicity) to $\vec X(T)$. Let
$$ \vec F[\vec X] = \vec X(T) - \vec X $$
formally denote the operator which performs this time evolution.
Periodic orbits of the interacting system correspond to roots of the 2N coupled equations
$$ \vec F[\vec X] = 0 . \qquad (17.18) $$
Since a weak coupling was assumed, it is not unreasonable to expect the initial condition $\vec X_0$ of the decoupled system (which is known to lead to a periodic orbit there) to lie near the root of (17.18). We could then use this as a starting point for a Newton-like iteration
procedure (see footnote 2), i.e. proceed to construct a next iterate $\vec X_1$ near the original guess $\vec X_0$ by demanding that
$$ \vec F[\vec X_1] = \vec F[\vec X_0] + M_0 \left( \vec X_1 - \vec X_0 \right) = 0 $$
where the elements of the $2N \times 2N$ matrix $M_0$ are given by
$$ (M_0)_{mn} = \left. \frac{\partial F_m}{\partial X_n} \right|_{\vec X = \vec X_0} = \left. \frac{\partial X_m(T)}{\partial X_n} \right|_{\vec X = \vec X_0} - \delta_{mn} . \qquad (17.19) $$
This is equivalent to demanding
$$ \vec X_1 = \vec X_0 - \left[ M_0 \right]^{-1} \vec F[\vec X_0] $$
which can be continued as
$$ \vec X_{j+1} = \vec X_j - \left[ M_j \right]^{-1} \vec F[\vec X_j] \qquad (17.20) $$
until convergence to the “true” discrete breather is achieved. This can also be a practical method to construct exact breather solutions to machine numerical accuracy. It is successful, provided that (a) the matrix M is invertible, and (b) that the breather frequency $\omega_b = 2\pi/T$ has no resonances of the type
$$ n \omega_b = \omega_q \qquad (17.21) $$
with the phonon band. Note that this generally allows breathers with frequencies above the phonon bands without any restrictions, but may impose severe limits to breathers whose frequencies lie below a phonon band. For example, in the case of the Hamiltonian (17.12), breather frequencies must lie outside the frequency ranges
$$ 1 < n \omega_b < (1 + 4C)^{1/2} , \qquad n = 1, 2, 3, \dots $$
(cf. Fig. 17.5).
Figure 17.5: Forbidden frequency bands (in color) for DBs in the case of the Hamiltonian (17.12), plotted as frequency versus wavevector k. The dotted line represents the linear phonon dispersion curve. Note that the bands with n > 5 merge, leaving no frequency region allowed for DBs.
Details of the existence proof for discrete breathers can be found in Ref. [45].
2 The Newton procedure for locating roots of $f(x) = 0$ starts from a “guess” $x_0$ with $f(x_0)$ not too far from zero, and iterates successively,
$$ x_{j+1} = x_j - \left[ \frac{df}{dx} \right]^{-1}_{x = x_j} f(x_j) . $$
Provided that the guess does not lead away from the true root, the procedure is rapidly (quadratically) convergent.
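The scalar iteration described in the footnote is the one-dimensional analogue of (17.20), with the derivative playing the role of the matrix M. A minimal sketch follows; the test function $f(x) = x^2 - 2$ and the starting guess 1.5 are arbitrary illustrative choices and have nothing to do with breathers.

```python
import math

def newton(f, dfdx, x0, tol=1e-12, max_iter=50):
    """Iterate x_{j+1} = x_j - f(x_j) / f'(x_j); return the root and all iterates."""
    iterates = [x0]
    x = x0
    for _ in range(max_iter):
        x = x - f(x) / dfdx(x)
        iterates.append(x)
        if abs(f(x)) < tol:
            break
    return x, iterates

if __name__ == "__main__":
    root, xs = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.5)
    for j, x in enumerate(xs):
        print(j, x, abs(x - math.sqrt(2.0)))
```

The printed errors roughly square at every step until rounding error takes over, which is the quadratic convergence mentioned in the footnote.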
A Impurities, disorder and localization
In the following, I will try to illustrate, with the help of characteristic examples, how disorder can lead to localization of eigenstates. The point is not to present exact criteria for localization; this would be beyond the scope of these lectures. Nonetheless, the simple examples treated should make clear that (i) an isolated eigenstate, i.e. one which is outside the phonon (or electron, depending on the problem) bands, will tend to be localized, and (ii) the introduction of disorder will transform most eigenstates from extended to localized ones. In other words, these examples serve to illustrate the obvious, i.e. that breaking translational invariance will introduce some degree of localization; at the same time, they remind us of the basic condition for the existence of isolated localized states, i.e. that the frequency of oscillation should lie outside any bands.
A.1 Deﬁnitions
A.1.1 Electrons
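Consider a one-dimensional tight-binding electron Hamiltonian
$$ H = \sum_n \left[ \epsilon_n\, c_n^\dagger c_n - t_{n,n+1} \left( c_{n+1}^\dagger c_n + \mathrm{h.c.} \right) \right] \qquad (A.1) $$
where the on-site energies $\epsilon_n$ and the hopping amplitudes $t_{n,n+1}$ may depend on the lattice site.
We look for eigenstates of (A.1)
$$ H |\Psi\rangle = E |\Psi\rangle $$
in the subspace of one-electron states:
$$ |\Psi\rangle = \sum_n \psi_n\, c_n^\dagger\, |0\rangle \qquad (A.2) $$
The amplitudes $\psi_n$ must then satisfy the difference eigenvalue equation
$$ -t_{n,n-1} \psi_{n-1} - t_{n,n+1} \psi_{n+1} + \epsilon_n \psi_n = E \psi_n . \qquad (A.3) $$
Limiting cases are
• the translationally invariant case $t_{n,n+1} = t_0$, $\epsilon_n = \epsilon_0\ \forall n$. The eigenstates are plane waves
$$ \psi_n^{(q)} = \frac{1}{\sqrt N}\, e^{iqn} $$
with the corresponding eigenvalues
$$ E_q = \epsilon_0 - 2t_0 + 2t_0 (1 - \cos q) . $$
• diagonal disorder: $t_{n,n+1} = t_0\ \forall n$.
• off-diagonal disorder: $\epsilon_n = \epsilon_0\ \forall n$.
• a single impurity of either type, e.g. $\epsilon_n = \epsilon_0$ if $n \neq \alpha$, $\epsilon_\alpha = \epsilon'$.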
A.1.2 Phonons
Lattice vibrations in disordered harmonic lattices are described by the same mathematics. Consider the Hamiltonian
$$ H = \sum_n \frac{p_n^2}{2\mu_n} + \frac{1}{2} \sum_n k_{n,n+1} (x_{n+1} - x_n)^2 + \frac{1}{2} \sum_n v_n x_n^2 \qquad (A.4) $$
where, in general, masses and spring constants depend on the lattice site; note that I have added a harmonic on-site term. The coefficients $v_n$ will of course vanish in the usual harmonic chain with nearest-neighbor springs only.
The classical equations of motion corresponding to (A.4) are
$$ \mu_n \ddot x_n = k_{n,n+1} x_{n+1} + k_{n,n-1} x_{n-1} - (k_{n,n+1} + k_{n,n-1} + v_n)\, x_n ; \qquad (A.5) $$
if we look for normal modes of (A.5) with the Ansatz
$$ x_n(t) = \frac{1}{\sqrt{\mu_n}}\, \psi_n\, e^{i\omega t} , \qquad (A.6) $$
the amplitudes must satisfy the difference eigenvalue equation
$$ -t_{n,n-1} \psi_{n-1} - t_{n,n+1} \psi_{n+1} + \epsilon_n \psi_n = \omega^2 \psi_n , \qquad (A.7) $$
where $t_{n,n+1} = (\mu_n \mu_{n+1})^{-1/2}\, k_{n,n+1}$ and $\epsilon_n = (k_{n,n+1} + k_{n,n-1} + v_n)/\mu_n$.
Limiting cases of interest are
• the translationally invariant case: $k_{n,n+1} = k$, $\mu_n = \mu$, $v_n = v\ \forall n$. The eigenstates are plane waves
$$ \psi_n^{(q)} = \frac{1}{\sqrt N}\, e^{iqn} $$
with the corresponding eigenvalues
$$ \omega_q^2 = \frac{1}{\mu} \left[ v + 2k (1 - \cos q) \right] . $$
• mass disorder: $k_{n,n+1} = k\ \forall n$, $\{\mu_n\}$ random.
• spring disorder: $\mu_n = \mu\ \forall n$, $\{k_{n,n+1}\}$ random.
• a single impurity of either type, e.g. $\mu_n = \mu$ if $n \neq \alpha$, $\mu_\alpha = \mu'$ (isotopic mass impurity).
• on-site single impurity (corresponds to diagonal disorder at a single site): $k_{n,n+1} = k$, $\mu_n = \mu\ \forall n$, $v_n = v$ if $n \neq \alpha$, $v_\alpha = v'$.
Note that in the most general case considered here, one has to diagonalize a tridiagonal $N \times N$ real symmetric matrix. This can be done with very efficient numerical techniques [48].
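One simple way to obtain individual eigenvalues of a symmetric tridiagonal matrix is bisection on the Sturm count, sketched below in plain Python. This is offered as an illustration only and is not necessarily the technique of Ref. [48]. The matrix is specified by its diagonal (the $\epsilon_n$ of (A.7)) and its off-diagonal (the $-t_{n,n+1}$); an open chain with fixed ends is assumed, i.e. no corner elements connecting site 1 to site N.

```python
import math

def count_below(diag, off, x):
    """Number of eigenvalues smaller than x (Sturm count via the LDL^T pivots)."""
    count = 0
    d = 1.0
    for i in range(len(diag)):
        coupling = off[i - 1] ** 2 / d if i > 0 else 0.0
        d = diag[i] - x - coupling
        if d == 0.0:
            d = -1e-300              # nudge an exactly vanishing pivot
        if d < 0.0:
            count += 1
    return count

def eigenvalue(diag, off, k, tol=1e-12):
    """k-th smallest eigenvalue (k = 0, 1, ...) by bisection on the Sturm count."""
    # Gershgorin bound: all eigenvalues lie in [-radius, radius].
    radius = max(abs(a) for a in diag) + 2.0 * max(abs(b) for b in off)
    lo, hi = -radius, radius
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > k:
            hi = mid                 # at least k + 1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    N = 10
    diag, off = [2.0] * N, [-1.0] * (N - 1)     # uniform chain, k = mu = 1, v = 0
    for k in range(N):
        exact = 2.0 - 2.0 * math.cos((k + 1) * math.pi / (N + 1))
        print(k, eigenvalue(diag, off, k), exact)
```

For the uniform fixed-end chain used in the example, the eigenvalues are known in closed form, $\omega^2 = 2 - 2\cos[j\pi/(N+1)]$ with $j = 1, \dots, N$, which gives a direct check. Replacing `diag` and `off` by random sequences turns the same routine into a tool for the disordered cases listed above.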
A.2 A single impurity
A.2.1 An exact result
The case of a lattice impurity which modifies only a single diagonal element of the dynamical
matrix has been treated exactly by M. Lax [49] and provides substantial insight. I present
the original derivation and elaborate on the special case of one dimension.
Consider the more general problem where one knows the spectrum of a given dynamical
matrix D,
$$ D_{ij}\, e_j(p) = \lambda(p)\, e_i(p) , \qquad i = 1, \dots, N \tag{A.8} $$
(summation over repeated indices is implied), and is interested in the spectrum of a modified
matrix D + B, where the modification B is of reduced dimensionality, i.e.
$$ B_{rs} \neq 0 \ \text{ only if } \ r, s = 1, \dots, k ; \qquad k \ll N . $$
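Let $\psi$ be an eigenvector of the modified matrix, corresponding to an eigenvalue $\Lambda$:
$$ (D_{ij} + B_{ij})\,\psi_j = \Lambda\,\psi_i . $$
It follows that
$$ (\Lambda\delta_{ij} - D_{ij})\,\psi_j = B_{ij}\,\psi_j , $$
$$ \psi_i = (\Lambda I - D)^{-1}_{ij}\, B_{jk}\,\psi_k = (\Lambda I - D)^{-1}_{ir}\, B_{rs}\,\psi_s = \sum_p \frac{e^*_i(p)\, e_r(p)}{\Lambda - \lambda(p)}\, B_{rs}\,\psi_s \tag{A.9} $$
where in the last line I have used the standard representation of the inverse of $\Lambda I - D$ in terms
of the spectrum of D.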
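The representation used in the last step of (A.9) follows from completeness. A short sketch, assuming that the eigenvectors of the real symmetric matrix D are orthonormal and that $\Lambda$ does not coincide with any $\lambda(p)$:

```latex
\begin{align*}
\sum_p e_i(p)\, e^*_j(p) &= \delta_{ij}
  && \text{(completeness)} \\
(\Lambda\delta_{ij} - D_{ij})\, e_j(p) &= \bigl(\Lambda - \lambda(p)\bigr)\, e_i(p)
  && \text{(from (A.8))} \\
\Rightarrow \quad (\Lambda I - D)^{-1}_{ij}
  &= \sum_p \frac{e_i(p)\, e^*_j(p)}{\Lambda - \lambda(p)} .
\end{align*}
```

Since the inverse of a real symmetric matrix is itself symmetric in $(i, j)$, the same expression with $e^*_i(p)\, e_j(p)$ in the numerator is equivalent.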
Let me now consider the special case where k = 1, i.e.
$$ B_{ij} = b\,\delta_{i\alpha}\,\delta_{j\alpha} $$
(on-site lattice impurity at the site $\alpha$, cf. above). The condition (A.9) becomes
$$ \psi_i = b \sum_p \frac{e^*_i(p)\, e_\alpha(p)}{\Lambda - \lambda(p)}\,\psi_\alpha . \tag{A.10} $$
If the unperturbed matrix describes a translationally invariant system, the eigenvectors will
be plane waves, i.e.
$$ e_j(p) = \frac{1}{\sqrt{N}}\, e^{i \vec{p}\cdot\vec{R}_j} \tag{A.11} $$
where $\vec{p}$ is the wavevector corresponding to the eigenvalue index p. (Note that up to now
there are no restrictions to dimensionality.) Using the plane-wave form of the eigenvectors,
I rewrite (A.10) as
$$ \psi_i = \frac{b}{N} \sum_p \frac{e^{-i \vec{p}\cdot(\vec{R}_i - \vec{R}_\alpha)}}{\Lambda - \lambda(p)}\,\psi_\alpha , \tag{A.12} $$
which, for $i = \alpha$, becomes
$$ \frac{1}{N} \sum_p \frac{1}{\Lambda - \lambda(p)} = \frac{1}{b} . \tag{A.13} $$
Locating states outside the band
Eq. (A.13) is an Nth order algebraic equation. It is straightforward to see that a root
must exist between any pair of successive eigenvalues λ(p) < Λ < λ(p + 1). This provides
N − 1 roots. The Nth root will therefore lie outside the band. In this case however, one
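can readily substitute the sum in (A.13) by an integral. Let us see how this works in the
one-dimensional case with an eigenvalue spectrum
$$ \lambda(p) = v + 2\,(1 - \cos p) , \tag{A.14} $$
where
$$ \sum_p \;\Rightarrow\; \frac{N}{2\pi} \int_{-\pi}^{\pi} dp \;\Rightarrow\; \frac{N}{\pi} \int_0^{\pi} dp . $$
(A.13) can be written as
$$ \int_0^{\pi} \frac{dp}{\pi}\, \frac{1}{\Lambda - v - 2 + 2\cos p} = \frac{1}{b} . \tag{A.15} $$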
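Both the replacement of the lattice sum by an integral and the value of the integral in (A.15) can be checked numerically. The following is a minimal sketch using only the standard library; the function names are illustrative, and the closed form $(a^2 - 4)^{-1/2}$ for $a = \Lambda - v - 2 > 2$ is the standard result for this integral, quoted here as a reference value.

```python
import math

def band_integral(a, n=20000):
    """Midpoint-rule estimate of (1/pi) * int_0^pi dp / (a + 2 cos p), for a > 2."""
    h = math.pi / n
    total = sum(h / (a + 2.0 * math.cos((j + 0.5) * h)) for j in range(n))
    return total / math.pi

def lattice_sum(a, N=64):
    """Left-hand side of (A.13) on a ring of N sites, with a = Lambda - v - 2."""
    return sum(1.0 / (a + 2.0 * math.cos(2.0 * math.pi * j / N)) for j in range(N)) / N

a = 2.5                                    # i.e. Lambda - v = 4.5, above the band
closed_form = 1.0 / math.sqrt(a * a - 4.0)
print(band_integral(a), lattice_sum(a), closed_form)
```

For an eigenvalue outside the band the summand is a smooth periodic function of p, which is why the finite-N sum already agrees with the integral to high accuracy at modest N.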
If b > 0, we look for a state above the band, $\Lambda > v + 4$. The integral has the value
$[(\Lambda - v)(\Lambda - v - 4)]^{-1/2}$, therefore $\Lambda = v + 2 + \sqrt{4 + b^2}$, which reduces to
$\Lambda \approx v + 4 + b^2/4$ for small b. As $b \to 0$, the eigenvalue merges with the
band. Although the correspondence is not exact (because changing a single mass modifies two
off-diagonal as well as a diagonal element of the dynamical matrix), this is very similar to
what happens if one introduces an impurity with a mass lighter than the rest. Fig. A.1b
shows a numerical calculation in that case.
If b < 0, the situation is analogous to what happens with a heavy impurity. The eigenvalue,
$\Lambda = v + 2 - \sqrt{4 + b^2}$, lies below the phonon band, and merges with it as $|b| \to 0$. Fig. A.1a
shows a numerical calculation with a heavy impurity.
States outside the band are localized
I now return to (A.12) to extract information regarding the eigenvectors. In the one-dimensional
case discussed above, (A.12) can be written as
$$ \frac{\psi_n}{\psi_0} = b \int_{-\pi}^{\pi} \frac{dp}{2\pi}\, \frac{e^{ipn}}{\Lambda - \lambda(p)} = b \int_0^{\pi} \frac{dp}{\pi}\, \frac{\cos(pn)}{\Lambda - \lambda(p)} \tag{A.16} $$
where I have assumed that the eigenvalue $\Lambda$ lies outside the band, and that the impurity is
at the site 0. The imaginary part of the integral vanishes because $\lambda(p)$ is an even function
of p.
Now take the case b > 0. I use the result $\Lambda = v + 2 + \sqrt{4 + b^2}$ (cf. above) and obtain
$$ \frac{\psi_n}{\psi_0} = \frac{b}{2} \int_0^{\pi} \frac{dp}{\pi}\, \frac{\cos(pn)}{\sqrt{1 + b^2/4} + \cos p} . $$
The integral can be evaluated exactly, yielding
$$ \frac{\psi_n}{\psi_0} = (-1)^n\, e^{-|n|/\xi} , \tag{A.17} $$
where
$$ \xi = \frac{1}{\operatorname{arccosh}\sqrt{1 + b^2/4}} = \frac{1}{\operatorname{arcsinh}(b/2)} . $$
The value of the eigenvector decreases exponentially with the distance from the impurity.¹
In other words, the state is localized. This is a generic feature of isolated eigenstates which
lie outside bands of extended states.
¹ Note however that the localization length $\xi$ diverges as $b \to 0$, i.e. as the eigenvalue approaches the band.
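The split-off eigenvalue and the exponential profile can be compared with a direct diagonalization. The sketch below is an illustration under stated assumptions: a ring of N = 101 sites with spectrum (A.14), an on-site impurity of strength b = 1 placed mid-chain, and the closed forms $\Lambda = v + 2 + \sqrt{4 + b^2}$ and $\xi = 1/\operatorname{arcsinh}(b/2)$, which follow from evaluating the integrals in (A.15) and (A.16) exactly. The parameter values are arbitrary choices.

```python
import numpy as np

N, v, b = 101, 0.5, 1.0
site0 = N // 2                        # impurity site (assumption: mid-chain)

D = np.zeros((N, N))
for n in range(N):
    m = (n + 1) % N                   # ring, so that lambda(p) = v + 2(1 - cos p)
    D[n, n] = v + 2.0
    D[n, m] = -1.0
    D[m, n] = -1.0
D[site0, site0] += b                  # B_ij = b delta_{i,alpha} delta_{j,alpha}

vals, vecs = np.linalg.eigh(D)        # eigenvalues in ascending order
Lam = vals[-1]                        # state above the band (b > 0)
psi = vecs[:, -1]

Lam_exact = v + 2.0 + np.sqrt(4.0 + b * b)
xi = 1.0 / np.arcsinh(b / 2.0)

n = np.arange(-10, 11)                # distances from the impurity site
ratio_numeric = psi[site0 + n] / psi[site0]
ratio_exact = (-1.0) ** np.abs(n) * np.exp(-np.abs(n) / xi)
print(Lam, Lam_exact, np.max(np.abs(ratio_numeric - ratio_exact)))
```

Finite-size corrections are of the order of $e^{-N/\xi}$ and are negligible for these parameters; they become visible only when $\xi$ approaches the system size, i.e. for small b.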
A.2.2 Numerical results
A number of other simple cases can be worked out analytically. Here I show some numerical
results for the isotopic mass impurity. Note that, in the absence of analytical results such
as (A.17), one needs a handy criterion to determine the degree of localization of a given
eigenvector. One such criterion is the participation ratio, defined as
$$ P = \Bigl[\, N \sum_j |\psi_j|^4 \Bigr]^{-1} . \tag{A.18} $$
Figure A.1: Isotopic mass impurity in a harmonic crystal with on-site interaction: (a) heavy
impurity, (b) light impurity. Inset: the amplitude of the localized state.
[Plot: harmonic chain, N = 64, plus on-site u^2 term; (a) isotopic impurity m'/m = 2, (b) m'/m = 0.5. Axes: frequency and participation ratio vs. eigenvalue index; inset: amplitude vs. site.]
The idea behind this particular criterion is that the squared amplitude of a normalized
extended state is typically 1/N, therefore its P should be of order unity; a state which is
localized over a couple of lattice constants contributes only amplitudes of order unity, and
therefore its P should be of order 1/N. A state which is localized over a significant length
scale, say $\xi$ lattice constants, will have a P of the order of $\xi/N$. Therefore, the participation
ratio provides a measure of the localization length. This is important when we interpret
numerical results: a statement like “eigenstates of disordered one-dimensional systems are
always localized” is not very useful. Perhaps some states have localization lengths which are
comparable to the system size; this is bound to influence macroscopic transport properties.
Therefore, getting detailed information about participation ratios, localization lengths, etc.
is essential in understanding the effects of disorder.
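The three regimes described above can be reproduced with a few lines of code. This is a minimal sketch using only the standard library; the function name and the three test vectors (a plane wave, a single-site state, and an exponential profile with an assumed decay length xi = 4) are illustrative choices.

```python
import cmath
import math

def participation_ratio(psi):
    """P = 1 / (N * sum_j |psi_j|^4), after normalizing sum_j |psi_j|^2 to 1."""
    N = len(psi)
    norm2 = sum(abs(c) ** 2 for c in psi)
    s4 = sum((abs(c) ** 2 / norm2) ** 2 for c in psi)
    return 1.0 / (N * s4)

N = 64
xi = 4.0
plane_wave = [cmath.exp(1j * 0.3 * n) / math.sqrt(N) for n in range(N)]
single_site = [1.0 if n == 10 else 0.0 for n in range(N)]
exponential = [math.exp(-abs(n - N // 2) / xi) for n in range(N)]

P_plane = participation_ratio(plane_wave)     # extended state: P = 1
P_site = participation_ratio(single_site)     # single site: P = 1/N
P_exp = participation_ratio(exponential)      # in between, of order xi/N
print(P_plane, P_site, P_exp)
```

For the exponential profile the result is close to $2\xi/N$, so the numerical prefactor relating P to $\xi/N$ depends on the shape of the state.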
Fig. A.1 shows the vibrational spectrum of a one-dimensional lattice with an on-site
potential ($v \neq 0$) and a single isotopic mass impurity (heavy and light).
Fig. A.2 shows the spectrum of a heavy or light mass impurity in the case where v = 0,
i.e. there is no on-site potential. The difference is that the heavy-impurity state has nowhere to go:
there are no states below the band. In this case, all states remain extended.
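A calculation of this kind can be sketched as follows. The code is an illustration, not the program behind Figs. A.1 and A.2: it assumes a ring of N = 64 unit masses and unit springs with the impurity mid-chain, and the function name `impurity_chain` and the value v = 1 for the on-site case are choices made here.

```python
import numpy as np

def impurity_chain(N, ratio, v):
    """omega^2 and participation ratios (A.18) for a unit-mass, unit-spring ring
    with one isotopic impurity of mass `ratio` (= m'/m) and on-site constant v."""
    mu = np.ones(N)
    mu[N // 2] = ratio
    D = np.zeros((N, N))
    for n in range(N):
        m = (n + 1) % N
        D[n, n] = (2.0 + v) / mu[n]                          # eps_n of (A.7)
        D[n, m] = D[m, n] = -1.0 / np.sqrt(mu[n] * mu[m])    # -t_{n,n+1}
    omega2, psi = np.linalg.eigh(D)          # normalized eigenvectors in columns
    P = 1.0 / (N * np.sum(psi ** 4, axis=0))
    return omega2, P

N = 64
w2_heavy_v, P_heavy_v = impurity_chain(N, 2.0, 1.0)   # heavy, on-site term present
w2_light_0, P_light_0 = impurity_chain(N, 0.5, 0.0)   # light, no on-site term
w2_heavy_0, P_heavy_0 = impurity_chain(N, 5.0, 0.0)   # heavy, no on-site term

print(w2_heavy_v[0], P_heavy_v[0])    # state below the band [v, v + 4]
print(w2_light_0[-1], P_light_0[-1])  # state above the band [0, 4]
print(P_heavy_0.min())                # lowest P when no state can split off
```

With an on-site term the heavy impurity pulls one state below the band; without it, only the light impurity produces a split-off state, and the lowest participation ratio for the heavy impurity stays of order unity.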
Figure A.2: As in the previous figure, but with no on-site term. The heavy mass impurity does not
generate a localized state, because there are no states below the phonon band. Insets: right, the
localized state; left, the “most” localized state (the one with the lowest participation
ratio), which is still an extended state.
[Plot: harmonic chain, N = 64; (a) isotopic impurity m'/m = 5, (b) m'/m = 0.5. Axes: frequency and participation ratio vs. eigenvalue index; insets: amplitude vs. site.]
A.3 Disorder
Here I show numerical results obtained for a selection of random distributions of potential
parameters.
A.3.1 Electrons in disordered one-dimensional media
Figs. A.3 and A.4 show one-electron spectra of the disordered one-dimensional system (A.3).
The disorder is of the diagonal type, i.e. $t_{n,n+1} = 1$ and $\epsilon_n = 2 + W r_n$, where $r_n$ is a
random number between −1/2 and 1/2, and the strength of the disorder increases from
W = 1 to W = 4. Fig. A.5a is a histogram of the density of states for W = 2; Fig. A.5b
is a histogram of participation ratios for W = 1, 2, 4. Note the drastic increase of localized
states which occurs at W = 4.
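A calculation of this type can be sketched as follows. This is an illustration under assumptions not stated in the text: an open chain of N = 128 sites, uniformly distributed $r_n$, one disorder realization per W, and a fixed random seed used only for reproducibility. The function name is hypothetical.

```python
import numpy as np

def anderson_participation(N, W, rng):
    """Energies and participation ratios (A.18) for eq. (A.3)
    with t_{n,n+1} = 1 and eps_n = 2 + W r_n, r_n uniform in [-1/2, 1/2)."""
    eps = 2.0 + W * rng.uniform(-0.5, 0.5, size=N)
    hop = np.ones(N - 1)
    H = np.diag(eps) - np.diag(hop, 1) - np.diag(hop, -1)   # open chain, tridiagonal
    E, psi = np.linalg.eigh(H)               # normalized eigenvectors in columns
    P = 1.0 / (N * np.sum(psi ** 4, axis=0))
    return E, P

rng = np.random.default_rng(0)               # fixed seed (assumption)
mean_P = {}
for W in (1.0, 2.0, 4.0):
    E, P = anderson_participation(128, W, rng)
    mean_P[W] = P.mean()
print(mean_P)
```

The mean participation ratio drops as W grows; histograms of E and of 1/P for each W would correspond to the kind of data summarized in Fig. A.5.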
A.3.2 Vibrational spectra of one-dimensional disordered lattices
Figs. A.6 and A.7 show the effect of mass disorder in the one-dimensional harmonic lattice.
The masses are generated according to $\mu_i = e^{W r_i}$, where $r_i$ is a random number between
−0.5 and 0.5 and the strength of the disorder is W = 4. Note the proliferation of localized
states. Only the very lowest frequencies correspond to extended states.
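The mass-disordered lattice can be treated in the same way. The sketch below is an illustration, not the program behind Figs. A.6 and A.7: the ring geometry, unit springs, the absence of an on-site term, the system size N = 512 and the fixed random seed are all assumptions made here.

```python
import numpy as np

N, W = 512, 4.0
rng = np.random.default_rng(1)                     # fixed seed (assumption)
mu = np.exp(W * rng.uniform(-0.5, 0.5, size=N))    # mu_i = exp(W r_i)

D = np.zeros((N, N))
for n in range(N):
    m = (n + 1) % N                                # ring of unit springs, v_n = 0
    D[n, n] = 2.0 / mu[n]
    D[n, m] = D[m, n] = -1.0 / np.sqrt(mu[n] * mu[m])

omega2, psi = np.linalg.eigh(D)                    # ascending omega^2
omega = np.sqrt(np.clip(omega2, 0.0, None))        # clip rounding noise at the zero mode
P = 1.0 / (N * np.sum(psi ** 4, axis=0))           # participation ratio (A.18)

P_lowest = P[:4].mean()                            # a few lowest-frequency modes
P_upper_half = P[N // 2:].mean()                   # upper half of the spectrum
print(P[0], P_lowest, P_upper_half)
```

The lowest modes have participation ratios of order unity, while the average over the upper half of the spectrum is much smaller, in line with the statement that only the very lowest frequencies correspond to extended states.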
Figure A.3: Spectrum of one-electron states: (a) reference (no disorder), (b) diagonal disorder
W = 1.
[Plot legend: "Schr discr, N=128, plus onsite u^2". Axes: frequency and participation ratio vs. eigenvalue index; insets: amplitude vs. site.]
Figure A.4: Spectrum of one-electron states: (a) as in the previous figure, W = 2, (b) W = 4.
[Plot axes: frequency and participation ratio vs. eigenvalue index; insets: amplitude vs. site.]
Figure A.5: Spectrum of one-electron states: (a) density of states, W = 2, (b) histogram of
participation ratios, W = 1, 2, 4.
[Plot axes: (a) number of states vs. energy; (b) number of states vs. inverse participation ratio, for W = 1, 2, 4.]
Figure A.6: Mass disorder in the harmonic lattice, W = 4: (a) spectrum and degree of localization,
(b) frequency histogram.
[Plot axes: (a) frequency and participation ratio vs. eigenvalue index, inset: amplitude vs. site; (b) number of states vs. frequency.]
Figure A.7: Mass disorder in the harmonic lattice, W = 4: (a) The vast majority of states are
localized. (b) A detailed view of localization vs. frequency: localization lengths can
become significantly large at low frequencies. There is a low-frequency regime where
theory predicts that the localization length grows as the inverse square of the frequency
(dotted line). At the very lowest frequencies this theoretically predicted localization
length is limited by finite-size effects.
[Plot axes: (a) number of states vs. inverse participation ratio; (b) inverse participation ratio vs. frequency.]
Bibliography
[1] P.Chr. Hemmer, L.C. Maximon and H. Wergeland, Phys. Rev. 111, 689 (1958).
[2] E. Fermi, J. Pasta and S. Ulam, Los Alamos report LA-1940 (1955), published in
Collected papers of Enrico Fermi, E. Segrè (Ed.), University of Chicago Press (1965).
[3] C.Y. Lin, S.N. Cho, C.G. Goedde and S. Lichter, Phys. Rev. Lett. 82, 259 (1999).
[4] J. Ford, Phys. Reports 213, 271 (1992).
[5] John Scott Russell, Report on Waves (Report of the fourteenth meeting of the British
Association for the Advancement of Science, York, September 1844 (London 1845), pp.
311-390, Plates XLVII-LVII).
[6] D.J. Korteweg and G. de Vries, Phil. Mag. [5], 39, 422 (1895).
[7] I.M. Gel’fand, B.M. Levitan, Amer. Math. Soc. Transl. 1, 253 (1955).
[8] Marchenko
[9] L.D. Faddeev, J. Math. Phys. 4, 72 (1963).
[10] A.C. Scott, F.V.F. Chu and D. McLaughlin, Proc. IEEE 61, 1473 (1973).
[11] M. Toda, Phys. Repts. (1975); Theory of nonlinear lattices, Springer (1988)
[12] M. Henon and C. Heiles, Astron. J. 69, 73 (1964).
[13] L.E. Reichl and W.M. Zheng, Phys. Rev. A 29, 2186 (1984).
[14] S.J. Shenker and L.P. Kadanoff, J. Stat. Phys. 27, 631 (1982).
[15] J.M. Greene, J. Math. Phys. 20, 1183 (1979).
[16] J.D. Meiss, Rev. Mod. Phys. 64, 795 (1992).
[17] M.H. Jensen, P. Bak and T. Bohr, Phys. Rev. A 30, 1960 (1984).
[18] E. Ott, Chaos in Dynamical Systems, Cambridge (2002).
[19] M. Tabor, Chaos and integrability in nonlinear dynamics, Wiley (1989).
[20] J. Frenkel and T. Kontorova, Phys. Z. Sowjet. 13, 1 (1938).
[21] F.C. Frank and J.H. van der Merwe, Proc. Roy. Soc. Lond. A198, 205 (1949).
[22] P.M. Chaikin and T.C. Lubensky, Principles of condensed matter physics, Cambridge
University Press (1995).
[23] S. Aubry in Solitons and Condensed Matter Physics (Eds. A.R. Bishop and T. Schneider),
p. 264, Springer (1978); S. Aubry and P.Y. Le Daeron, Physica D 8, 381 (1983).
[24] M. Peyrard and S. Aubry, J. Phys. C 16, 1593 (1983).
[25] W. Chou and R.B. Griffiths, Phys. Rev. B 34, 6219 (1986).
[26] O.V. Zhirov, G. Casati and D.L. Shepelyansky, Phys. Rev. E 65, 026220 (2002).
[27] H.J. Mikeska and M. Steiner, Adv. Phys. 40, 191 (1991).
[28] J.P. Boucher, L.P. Regnault, J. Rossat-Mignod, Y. Henry, J. Bouillot, W.G. Stirling
and F. Mezei, Physica B, 120, 141 (1983).
[29] A.J. Heeger, Rev. Mod. Phys. 73, 681 (2001).
[30] A.J. Heeger, S. Kivelson, J.R. Schrieffer and W.P. Su, Rev. Mod. Phys. 60, 781 (1988).
[31] W.P. Su, J.R. Schrieffer and A.J. Heeger, Phys. Rev. Lett. 42, 1698 (1979).
[32] H. Takayama, Y.R. Lin-Liu and K. Maki, Phys. Rev. B 21, 2388 (1980).
[33] L. Pitaevskii and S. Stringari, Bose-Einstein Condensation, Oxford University Press
(2003).
[34] A.J. Leggett, Rev. Mod. Phys. 73, 307 (2001).
[35] S. Burger, K. Bongs, S. Dettmer, W. Ertmer, K. Sengstock, A. Sanpera, G.V. Shlyapnikov
and M. Lewenstein, Phys. Rev. Lett. 83, 5198 (1999).
[36] R.B. Inman and R.L. Baldwin, J. Mol. Biol. 8, 452 (1964).
[37] M. Peyrard and A.R. Bishop, Phys. Rev. Lett. 62, 2755 (1989).
[38] L.D. Landau and E.M. Lifshitz, Nonrelativistic Quantum Mechanics, Pergamon Press
(1977).
[39] N. Theodorakopoulos, M. Peyrard and R.S. MacKay, Phys. Rev. Lett. 93, 258101 (2004).
[40] A.C. Scott, Rev. Mod. Phys. 47, 487 (1975).
[41] A.S. Davydov, J. Theor. Biol. 38, 559 (1973).
[42] A.C. Scott, Phys. Reports 217, 1 (1992).
[43] A. Sievers and S. Takeno, Phys. Rev. Lett. 61, 970 (1988).
[44] S. Flach, C.R. Willis and E. Olbrich, Phys. Rev. E 49, 836 (1994).
[45] R.S. MacKay and S. Aubry, Nonlinearity 7, 1623 (1994).
[46] S. Flach and C.R. Willis, Phys. Reports 295, 181 (1998).
[47] S. Aubry, Physica D 216, 1 (2006).
[48] W.H. Press, B.P. Flannery, S.A. Teukolsky and W.T. Vetterling, Numerical Recipes in
FORTRAN: The Art of Scientific Computing, Cambridge University Press (1992).
[49] M. Lax, Phys. Rev. 94, 1391 (1954).