Structure and Interpretation
of Classical Mechanics
This book was set by the authors using the LaTeX typesetting system and was printed and bound in the United States of America.
This book is dedicated,
in respect and admiration,
to
Contents vii
Preface xiii
Acknowledgments xvii
1 Lagrangian Mechanics 1
1.1 The Principle of Stationary Action 4
1.2 Configuration Spaces 9
1.3 Generalized Coordinates 11
1.4 Computing Actions 16
1.5 The Euler-Lagrange Equations 26
1.5.1 Derivation of the Lagrange Equations 27
1.5.2 Computing Lagrange’s Equations 34
1.6 How to Find Lagrangians 37
1.6.1 Coordinate Transformations 44
1.6.2 Systems with Rigid Constraints 48
1.6.3 Constraints as Coordinate Transformations 60
1.6.4 The Lagrangian is Not Unique 62
1.7 Evolution of Dynamical State 67
1.8 Conserved Quantities 76
1.8.1 Conserved Momenta 76
1.8.2 Energy Conservation 78
1.8.3 Central Forces in Three Dimensions 81
1.8.4 Noether’s Theorem 84
1.9 Abstraction of Path Functions 88
1.10 Constrained Motion 93
1.10.1 Coordinate Constraints 95
1
In his book on mathematical pedagogy [15], Hans Freudenthal argues that the reliance on ambiguous, unstated notational conventions in such expressions as f(x) and df(x)/dx makes mathematics, and especially introductory calculus, extremely confusing for beginning students; and he enjoins mathematics educators to use more formal modern notation.
2
In his beautiful book Calculus on Manifolds (1965), Michael Spivak uses
functional notation. On p.44 he discusses some of the problems with classical
notation. We excerpt a particularly juicy quote:
The mere statement of [the chain rule] in classical notation requires the introduction of irrelevant letters. The usual evaluation for D₁(f ∘ (g, h)) runs as follows:

If f(u, v) is a function and u = g(x, y) and v = h(x, y), then

  ∂f(g(x, y), h(x, y))/∂x = (∂f(u, v)/∂u)(∂u/∂x) + (∂f(u, v)/∂v)(∂v/∂x)

[The symbol ∂u/∂x means ∂/∂x g(x, y), and ∂/∂u f(u, v) means D₁f(u, v) = D₁f(g(x, y), h(x, y)).] This equation is often written simply

  ∂f/∂x = (∂f/∂u)(∂u/∂x) + (∂f/∂v)(∂v/∂x).
Note that f means something different on the two sides of the equation!
3
This is presented here without explanation, to give the flavor of the notation.
The text gives a full explanation.
4
“It is necessary to use the apparatus of partial derivatives, in which even the notation is ambiguous.” From V. I. Arnold, Mathematical Methods of Classical Mechanics (1980), Section 47, p. 258. See also the footnote on that page.
1
A stationary point of a function is a point where the function’s value does not
vary as the input is varied. Local maxima or minima are stationary points.
2
The variational formulation successfully describes all of the Newtonian mechanics of particles and rigid bodies. The variational formulation has also been usefully applied in the description of many other systems such as classical electrodynamics, the dynamics of inviscid fluids, and the design of mechanisms such as four-bar linkages. In addition, modern formulations of quantum mechanics and quantum field theory build on many of the same concepts. However, the variational formulation does not appear to apply to all dynamical systems. For example, there is no simple prescription to apply the variational apparatus to systems with dissipation, though in special cases variational methods still apply.
3
Experience with systems on an atomic scale suggests that at this scale systems
do not travel along well-defined configuration paths. To describe the evolution
of systems on the atomic scale we employ quantum mechanics. Here, we
restrict attention to systems for which the motion is well described by a smooth
configuration path.
4
Extrapolation of the orbit of the Moon backward in time cannot determine
the point at which the Moon was placed on this trajectory. To determine
the origin of the Moon we must supplement dynamical evidence with other
physical evidence such as chemical compositions.
5
We suspect that this argument can be promoted to a precise constraint on
the possible ways of making this path-distinguishing function.
6
Historically, Huygens was the first to use the term “action” in mechanics. He
used the term to refer to “the effect of a motion.” This is an idea that came
from the Greeks. In his manuscript “Dynamica” (1690) Leibnitz enunciated a
“Least Action Principle” using the “harmless action,” which was the product
of mass, velocity, and the distance of the motion. Leibnitz also spoke of a
“violent action” in the case where things collided.
7
The definite integral of a real-valued function f of a real argument is written ∫_a^b f. This can also be written ∫_a^b f(x) dx. The first notation emphasizes that a function is being integrated.
8
Traditionally, square brackets are put around functional arguments. In this
case, the square brackets remind us that the value of S may depend on the
function γ in complicated ways, such as through its derivatives.
9
In the case of a real-valued function the value of the function and its derivatives at some point can be used to construct a power series. For sufficiently nice functions (real analytic) the power series constructed in this way converges in some interval containing the point. Not all functions can be locally represented in this way. For example, the function f(x) = exp(−1/x²), with f(0) = 0, is zero and has all derivatives zero at x = 0, but this infinite number of derivatives is insufficient to determine the function value at any other point.
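This classic counterexample is easy to probe numerically. The following is a small Python sketch (Python rather than the book's Scheme, purely for a standalone check); the function and values are exactly those of the footnote:

```python
import math

def f(x):
    # f(x) = exp(-1/x^2), with f(0) = 0: infinitely differentiable,
    # and every derivative vanishes at x = 0.
    return 0.0 if x == 0 else math.exp(-1.0 / x ** 2)

# The Taylor series of f at 0 is identically zero, so it cannot
# determine f's value anywhere else: f is flat at 0 but not zero.
# Near the origin f is astronomically small (f(0.1) = exp(-100)),
# yet f(1) = exp(-1) is of order one.
```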
10
Here ◦ denotes composition of functions: (f ◦g)(t) = f (g(t)). In our notation
the application of a path-dependent function to its path is of higher precedence
than the composition, so L ◦ T [γ] = L ◦ (T [γ]).
11
The derivative Dγ of a configuration path γ can be defined in terms of ordinary derivatives by specifying how it acts on sufficiently smooth real-valued functions f of configurations. The exact definition is unimportant at this stage. If you are curious see footnote 23.
12
We will later discover that an initial segment of the local tuple will be
sufficient to determine the future evolution of the system. That a configuration
and a finite number of derivatives determines the future means that there is
a way of determining all of the rest of the derivatives of the path from the
initial segment.
13
The classical Lagrangian plays a fundamental role in the path-integral formulation of quantum mechanics (due to Dirac and Feynman), where the complex exponential of the classical action yields the relative probability amplitude for a path. The Lagrangian is the starting point for the Hamiltonian formulation of mechanics (discussed in chapter 3), which is also essential in the Schrödinger and Heisenberg formulations of quantum mechanics and in the Boltzmann–Gibbs approach to statistical mechanics.
14
The principle is often called the “Principle of Least Action” because its
initial formulations spoke in terms of the action being minimized rather than
the more general case of taking on a stationary value. The term “Principle of
Least Action” is also commonly used to refer to a result, due to Maupertuis,
Euler, and Lagrange, which says that free particles move along paths for which
the integral of the kinetic energy is minimized among all paths with the given
endpoints. Correspondingly, the term “action” is sometimes used to refer
specifically to the integral of the kinetic energy. (Actually, Euler and Lagrange
used the vis viva, or twice the kinetic energy.)
15
Other ways of stating the principle of stationary action make it sound teleological and mysterious. For instance, one could imagine that the system considers all possible paths from its initial configuration to its final configuration and then chooses the one with the smallest action. Indeed, the underlying vision of a purposeful, economical, and rational universe played no small part in the philosophical considerations that accompanied the initial development of
mechanics. The earliest action principle that remains part of modern physics is Fermat's Principle, which states that the path traveled by a light ray between two points is the path that takes the least amount of time. Fermat formulated this principle around 1660 and used it to derive the laws of reflection and refraction. Motivated by this, the French mathematician and astronomer Pierre-Louis Moreau de Maupertuis enunciated the Principle of Least Action as a grand unifying principle in physics. In his Essai de cosmologie (1750) Maupertuis appealed to this principle of “economy in nature” as evidence of the existence of God, asserting that it demonstrated “God's intention to regulate physical phenomena by a general principle of the highest perfection.” For a historical perspective of Maupertuis's, Euler's, and Lagrange's roles in the formulation of the principle of least action, see Jourdain [25].
16
For reflection the angle of incidence is equal to the angle of reflection. Refraction is described by Snell's law: when light passes from one medium to another, the ratio of the sines of the angles made to the normal to the interface is the inverse of the ratio of the refractive indices of the media. The refractive index is the ratio of the speed of light in a vacuum to the speed of light in the medium.
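Snell's law, n₁ sin θ₁ = n₂ sin θ₂, is a one-liner to apply. Here is a hedged Python sketch (not the book's code; the numbers are an illustrative example, not from the text):

```python
import math

def snell(theta1, n1, n2):
    # Angle of refraction theta2 from Snell's law:
    # n1 sin(theta1) = n2 sin(theta2)
    return math.asin(n1 * math.sin(theta1) / n2)

# Light entering glass (n2 = 1.5) from vacuum (n1 = 1.0) at 30 degrees
# bends toward the normal, as the law predicts:
theta2 = snell(math.radians(30.0), 1.0, 1.5)
```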
17
We often refer to a point particle with mass but no internal structure as a
point mass.
18
Strictly speaking the dimension of the configuration space and the number of degrees of freedom are not the same. The number of degrees of freedom is the dimension of the space of configurations that are “locally accessible.” For systems with integrable constraints the two are the same. For systems with non-integrable constraints the configuration dimension can be larger than the number of degrees of freedom. For further explanation see the discussion of systems with non-integrable constraints below (section 1.10.3). Apart from that discussion, all of the systems we will consider have integrable constraints (they are “holonomic”). This is why we have chosen to blur the distinction between the number of degrees of freedom and the dimension of the configuration space.
19
A tuple of functions that all have the same domain is itself a function on
that domain: Given a point in the domain the value of the tuple of functions
is a tuple of the values of the component functions at that point.
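The idea is mechanical enough to show in a few lines of Python (a sketch, not the book's Scmutils up-tuples):

```python
import math

def tuple_of_functions(*fs):
    # A tuple of functions with a common domain is itself a function on
    # that domain: its value at a point is the tuple of the values of
    # the component functions at that point.
    return lambda x: tuple(f(x) for f in fs)

# e.g. a coordinate path built from three component functions:
q = tuple_of_functions(math.sin, math.cos, lambda t: t ** 2)
# q(0.0) is the tuple (sin 0, cos 0, 0^2) = (0.0, 1.0, 0.0)
```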
20
The use of superscripts to index the coordinate components is traditional, even though there is potential confusion, say, with exponents. We use zero-based indexing.
21
More precisely, the generalized coordinates identify open subsets of the configuration space with open subsets of Rⁿ. It may require more than one set of generalized coordinates to cover the entire configuration space. For example, if the configuration space is a two-dimensional sphere, we could have one set of coordinates that maps (a little more than) the northern hemisphere to a disk, and another set that maps (a little more than) the southern hemisphere to a disk, with a strip near the equator common to both coordinate systems. A space that can be locally parametrized by smooth coordinate functions is called a differentiable manifold. The theory of differentiable manifolds can be used to formulate a coordinate-free treatment of variational mechanics. An introduction to mechanics from this perspective can be found in [2] or [5].
22
The derivative of a function f is a function. It is denoted Df . Our notational
convention is that D is a high-precedence operator. Thus D operates on the
adjacent function before any other application occurs: Df (x) is the same as
(Df )(x).
23
The formal definition of Dγ is unimportant to the discussion, but if you really want to know, here is one way to do it:
First, we define the derivative Dγ of a configuration path γ in terms of ordinary derivatives by specifying how it acts on sufficiently smooth real-valued functions f of configurations: (Dⁿγ)(t)(f) = Dⁿ(f ∘ γ)(t). Then we define χ̄(a, b, c, d, ...) = (a, χ(b), c(χ), d(χ), ...). With this definition:

  χ̄(t, γ(t), Dγ(t), D²γ(t), ...) = (t, χ(γ(t)), Dγ(t)(χ), D²γ(t)(χ), ...)
                                  = (t, χ ∘ γ(t), D(χ ∘ γ)(t), D²(χ ∘ γ)(t), ...)
                                  = (t, q(t), Dq(t), D²q(t), ...).
The action is

  S[γ](t1, t2) = ∫_t1^t2 L ∘ T[γ].   (1.8)

then²⁵
24
The coordinate function χ is locally invertible, and so is χ̄.
25
L ∘ T[γ] = L ∘ χ̄⁻¹ ∘ χ̄ ∘ T[γ] = Lχ ∘ Γ[χ ∘ γ] = Lχ ∘ Γ[q].
26
Here we are making a function definition. A definition specifies the value
of the function for arbitrarily chosen formal parameters. One may change
the name of a formal parameter, so long as the new name does not conflict
with any other symbol in the definition. For example, the following definition
specifies exactly the same free-particle Lagrangian:
L(a, b, c) = ½ m (c · c).
27
The Lagrangian is formally a function of the local tuple, but any particular
Lagrangian only depends on a finite initial segment of the local tuple. We
define functions of local tuples by explicitly declaring names for the elements
of the initial segment of the local tuple that includes the elements upon which
the function depends.
28
We represent the local tuple as a composite data structure, the components of which are the time, the generalized coordinates, the generalized velocities, and possibly higher derivatives. We do not want to be bothered by the details of packing and unpacking the components into these structures, so we provide utilities for doing this. The constructor ->local takes the time, the coordinates, and the velocities and returns a data structure representing a local tuple. The selectors time, coordinate, and velocity extract the appropriate pieces from the local structure. They are implemented as time = (component 0), coordinate = (component 1), and velocity = (component 2).
29
Be careful. The x in the definition of q is not the same as the x that was used
as a formal parameter in the definition of the free-particle Lagrangian above.
There are only so many letters in the alphabet, so we are forced to reuse them.
We will be careful to indicate where symbols are given new meanings.
30
A tuple of coordinate or velocity components is made with the procedure
up. Component i of the tuple q is (ref q i). All indexing is zero based. The
word up is to remind us that in mathematical notation these components are
indexed by superscripts. There are also down tuples of components that are
indexed by subscripts. See the appendix on notation.
31
In our system, arithmetic operators are generic over symbols and expressions as well as numeric values, so arithmetic procedures can work uniformly with numbers or expressions. For example, if we have the procedure (define (cube x) (* x x x)) we can obtain its value for a number (cube 2) => 8 or for a literal symbol (cube 'a) => (* a a a).
32
Derivatives of functions yield functions. For example, ((D cube) 2) => 12
and ((D cube) ’a) => (* 3 (expt a 2)).
½ m (Dx(t))² + ½ m (Dy(t))² + ½ m (Dz(t))²
33
The display is generated with TeX.
34
For very complicated expressions the prefix notation of Scheme is often better, but simplification is almost always useful. We can separate the functions of simplification and infix display. We will see examples of this later.
35
Scmutils includes a variety of numerical integration procedures. The examples in this section were computed by rational-function extrapolation of Euler–MacLaurin formulas with a relative error tolerance of 10⁻¹⁰.
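Scmutils' actual quadrature extrapolates Euler–MacLaurin formulas with rational functions; as a simpler sketch of the same idea (accelerating a low-order rule by extrapolating in the step size), here is Romberg integration in Python:

```python
def romberg(f, a, b, n=8):
    # Romberg integration: Richardson extrapolation of the trapezoid
    # rule. Scmutils instead extrapolates Euler-MacLaurin formulas with
    # rational functions, but the spirit is the same.
    R = [[0.0] * (n + 1) for _ in range(n + 1)]
    R[0][0] = 0.5 * (b - a) * (f(a) + f(b))
    for i in range(1, n + 1):
        h = (b - a) / 2 ** i
        # refine the trapezoid estimate by adding the new midpoints
        s = sum(f(a + (2 * k - 1) * h) for k in range(1, 2 ** (i - 1) + 1))
        R[i][0] = 0.5 * R[i - 1][0] + h * s
        for j in range(1, i + 1):
            # extrapolate away the h^(2j) error term
            R[i][j] = R[i][j - 1] + (R[i][j - 1] - R[i - 1][j - 1]) / (4 ** j - 1)
    return R[n][n]
```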
m (xb − xa)² / (2 (tb − ta)).
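This closed form is easy to confirm: along the uniform-velocity path the Lagrangian ½mv² is constant, so the action is just that constant times the time interval. A Python sketch (the sample numbers are arbitrary, not from the text):

```python
def free_particle_action(m, xa, xb, ta, tb):
    # Closed form for the uniform-velocity path:
    # S = m (xb - xa)^2 / (2 (tb - ta))
    return m * (xb - xa) ** 2 / (2.0 * (tb - ta))

# Direct integration of L = (1/2) m v^2 along the straight path:
# v is constant, so the integral is just L * (tb - ta).
m, xa, xb, ta, tb = 3.0, 1.0, 7.0, 0.0, 10.0
v = (xb - xa) / (tb - ta)
S_integral = 0.5 * m * v ** 2 * (tb - ta)
# S_integral agrees with the closed form.
```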
36
Surely for a real physical situation we would have to specify units for these
quantities. In this illustration we do not give units.
37
Here we use decimal numerals to specify the parameters. This forces the
representations to be floating point, which is efficient for numerical calculation.
If symbolic algebra is to be done it is essential that the numbers be exact
integers or rational fractions, so that expressions can be reliably reduced to
lowest terms. Such numbers are specified without a decimal point.
38
The squared magnitude of the velocity is v⃗ · v⃗, the vector dot product of the velocity with itself. The square of a structure of components is defined to be the sum of the squares of the individual components, so we write simply v² = v · v.
We can use this to compute the action for a free particle over a path varied from the given path, as a function of ε:⁴⁰
(define ((varied-free-particle-action mass q nu t1 t2) epsilon)
(let ((eta (make-eta nu t1 t2)))
(Lagrangian-action (L-free-particle mass)
(+ q (* epsilon eta))
t1
t2)))
The action for the varied path, with ν(t) = (sin t, cos t, t²), and ε = 0.001 is, as expected, larger than for the test path:
((varied-free-particle-action 3.0 test-path
(up sin cos square)
0.0 10.0)
0.001)
436.29121428571153
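The same experiment can be sketched in Python rather than the book's Scheme: approximate the action by a midpoint-rule quadrature and check that a varied free-particle path has larger action than the uniform-velocity path. The one-dimensional path and the variation sin t (which vanishes at both endpoints) are made up for this illustration:

```python
import math

def action(L, q, dq, t1, t2, n=1000):
    # Midpoint-rule approximation to S[q](t1, t2), the integral of
    # L(t, q(t), Dq(t)) dt; dq is the analytic derivative of q.
    h = (t2 - t1) / n
    s = 0.0
    for k in range(n):
        t = t1 + (k + 0.5) * h
        s += L(t, q(t), dq(t)) * h
    return s

m = 3.0
L_free = lambda t, x, v: 0.5 * m * v * v

# A uniform-velocity path, and the same path varied by eps*sin(t);
# the variation vanishes at the endpoints t = 0 and t = pi.
eps = 0.01
q0, dq0 = (lambda t: 4.0 * t + 7.0), (lambda t: 4.0)
q1 = lambda t: 4.0 * t + 7.0 + eps * math.sin(t)
dq1 = lambda t: 4.0 + eps * math.cos(t)

S0 = action(L_free, q0, dq0, 0.0, math.pi)
S1 = action(L_free, q1, dq1, 0.0, math.pi)
# S1 exceeds S0: varying the path increases the free-particle action.
```

The increase is second order in ε, as the stationarity of the action predicts: it is ½m ε² ∫₀^π cos² t dt.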
39
Note that we are doing arithmetic on functions. We extend the arithmetic operations so that the combination of two functions of the same type (same domains and ranges) is the function on the same domain that combines the values of the argument functions in the range. For example, if f and g are functions of t, then fg is the function t ↦ f(t)g(t). A constant multiple of a function is the function whose value is the constant times the value of the function for each argument: cf is the function t ↦ cf(t).
40
Note that we are adding procedures. Paralleling our extension of arithmetic operations to functions, arithmetic operations are extended to compatible procedures.
41
The arguments to minimize are a procedure implementing the univariate function in question, and the lower and upper bounds of the region to be searched. Scmutils includes a choice of methods for numerical minimization; the one used here is Brent's algorithm, with an error tolerance of 10⁻⁵. The value returned by minimize is a list of 3 numbers: the first is the argument at which the minimum occurred, the second is the minimum obtained, and the third is the number of iterations of the minimization algorithm required to obtain the minimum.
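As a hedged Python sketch of such a bracketing minimizer, here is golden-section search, a simpler relative of Brent's method (Brent adds a parabolic-interpolation speedup); the return value mirrors the three-element list described above:

```python
import math

def minimize(f, a, b, tol=1e-5):
    # Golden-section search: repeatedly shrink the bracket [a, b]
    # around the minimum, keeping the golden-ratio spacing so only
    # one new function evaluation is needed per iteration (here we
    # re-evaluate for clarity). Returns (argmin, minimum, iterations).
    gr = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - gr * (b - a)
    d = a + gr * (b - a)
    n = 0
    while abs(b - a) > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - gr * (b - a)
        else:
            a, c = c, d
            d = a + gr * (b - a)
        n += 1
    x = 0.5 * (a + b)
    return (x, f(x), n)

# minimizing (u - 2)^2 on [0, 5] locates the minimum near u = 2:
x, fx, n = minimize(lambda u: (u - 2.0) ** 2, 0.0, 5.0)
```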
42
Yes, -1.5987211554602254e-14 is zero for the tolerance required of the min-
imizer. And the 435.0000000000237 is arguably the same as 435 obtained
before.
43
There are lots of good ways to make such a parametric set of approximating
trajectories. One could use splines or higher-order interpolating polynomials;
one could use Chebyshev polynomials; one could use Fourier components. The
choice depends upon the kinds of trajectories one wants to approximate.
44
Here is one way to implement make-path:
(define (make-path t0 q0 t1 q1 qs)
(let ((n (length qs)))
(let ((ts (linear-interpolants t0 t1 n)))
(Lagrange-interpolation-function
(append (list q0) qs (list q1))
(append (list t0) ts (list t1))))))
The procedure linear-interpolants produces a list of elements that linearly
interpolate the first two arguments. We use this procedure here to specify ts,
the n evenly spaced intermediate times between t0 and t1 at which the path
will be specified. The parameters being adjusted, qs, are the positions at these
intermediate times. The procedure Lagrange-interpolation-function takes
a list of values and a list of times and produces a procedure that computes
the Lagrange interpolation polynomial that goes through these points.
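The same construction can be sketched in Python (the Scmutils procedures are real; these Python counterparts are illustrative stand-ins with the same shapes):

```python
def lagrange_interpolation_function(ys, xs):
    # Returns the polynomial through the points (xs[i], ys[i]),
    # evaluated with the Lagrange interpolation formula.
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p

def linear_interpolants(t0, t1, n):
    # n evenly spaced times strictly between t0 and t1
    h = (t1 - t0) / (n + 1)
    return [t0 + (k + 1) * h for k in range(n)]

def make_path(t0, q0, t1, q1, qs):
    # A path through (t0, q0) and (t1, q1), with adjustable positions
    # qs at the evenly spaced intermediate times.
    ts = linear_interpolants(t0, t1, len(qs))
    return lagrange_interpolation_function([q0] + qs + [q1],
                                           [t0] + ts + [t1])

# Interior points taken on the line q(t) = t + 1 reproduce that line:
path = make_path(0.0, 1.0, 3.0, 4.0, [2.0, 3.0])
```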
45
The minimizer used here is the Nelder-Mead downhill simplex method. As
usual with numerical procedures, the interface to the nelder-mead procedure
is complex, with lots of optional parameters to allow the user to control errors
effectively. For this presentation we have specialized nelder-mead by wrapping
it in the more palatable multidimensional-minimize. Unfortunately, you will
have to learn to live with complicated numerical procedures someday.
L(t, q, v) = ½mv² − ½kq²,   (1.16)
46
Don’t worry. We know that you don’t yet know why this is the right Lagrangian. We will get to this in section 1.6.
47
By convention, named constants have names that begin with a colon. The constants named :pi and :-pi are what we would expect from their names.
[Plot residue: vertical axis from −0.0002 to +0.0002; horizontal axis from 0 to π/2.]
48
This result was initially discovered by Euler and later rederived by Lagrange.
49
The derivative or partial derivative of a function that takes structured arguments is a new function that takes the same number and type of arguments. The range of this new function is itself a structure with the same number of components as the argument with respect to which the function is differentiated.
50
Lagrange’s equations are traditionally written in the form

  d/dt (∂L/∂q̇) − ∂L/∂q = 0,

or, if we write a separate equation for each component of q, as

  d/dt (∂L/∂q̇ⁱ) − ∂L/∂qⁱ = 0,   i = 0, ..., n − 1.
In this way of writing Lagrange’s equations the notation does not distinguish
between L, which is a real-valued function of three variables (t, q, q̇), and L ◦
Γ[q], which is a real-valued function of one real variable t. If we do not realize
this notational pun, the equations don’t make sense as written—∂L/∂ q̇ is a
function of three variables, so we must regard the arguments q, q̇ as functions
of t before taking d/dt of the expression. Similarly, ∂L/∂q is a function of
three variables, which we must view as a function of t before setting it equal
to d/dt(∂L/∂ q̇). These implicit applications of the chain rule pose no problem
in performing hand computations—once you understand what the equations
represent.
51
The variation operator δη is like the derivative operator in that it acts on
the immediately following function: δη f [q] = (δη f )[q].
δη S[q](t1, t2) = 0.   (1.29)
This follows from the fact that variation commutes with integration.
which follows from equations (1.20) and (1.21), and using the chain rule for variations (1.26) we get⁵²

  δη S[q](t1, t2) = ∫_t1^t2 (DL ∘ Γ[q]) δη Γ[q]
                  = ∫_t1^t2 ((∂₁L ∘ Γ[q]) η + (∂₂L ∘ Γ[q]) Dη).   (1.32)
52
A function of multiple arguments is considered a function of a tuple of its
arguments. Thus, the derivative of a function of multiple arguments is a
tuple of the partial derivatives of that function with respect to each of the
arguments. So in the case of a Lagrangian L
DL(t, q, v) = [∂₀L(t, q, v), ∂₁L(t, q, v), ∂₂L(t, q, v)].
Then
So
53
To make this argument more precise requires careful analysis.
and
54
When we write a definition that names the components of the local tuple, we
indicate that these are grouped into time, position, and velocity components
by separating the groups with semicolons.
55
The derivative with respect to a tuple is a tuple of the partial derivatives
with respect to each component of the tuple (see the appendix on notation).
So
  ∂₁L ∘ Γ[q](t) = [ −μx(t) / ((x(t))² + (y(t))²)^(3/2),  −μy(t) / ((x(t))² + (y(t))²)^(3/2) ],
  ∂₂L ∘ Γ[q](t) = [ mDx(t), mDy(t) ]   (1.46)

and

  D(∂₂L ∘ Γ[q])(t) = [ mD²x(t), mD²y(t) ].   (1.47)
56
The symbol θ̇ is just a mnemonic symbol; the dot over the θ is not intended to indicate differentiation. To define L we could just as well have written: L(a, b, c) = ½ml²c² + mgl cos b. However, we use a dotted symbol to remind us that the argument matching a formal parameter, such as θ̇, is a rate of change of an angle, such as θ.
57
In traditional notation these equations read
  d²/dt² (∂L/∂q̈) − d/dt (∂L/∂q̇) + ∂L/∂q = 0.
58
The Lagrange-equations procedure uses the operations (partial 1) and
(partial 2), which implement the partial derivative operators with respect
to the second and third argument positions (those with indices 1 and 2).
(print-expression
(((Lagrange-equations (L-free-particle ’m))
test-path)
’t))
(down 0 0 0)
That the residuals are zero indicates that the test-path satisfies
the Lagrange equations.59
Instead of checking the equations for an individual path in
three-dimensional space, we can also apply the Lagrange-equations
procedure to an arbitrary function:60
(show-expression
(((Lagrange-equations (L-free-particle ’m))
(literal-function ’x))
’t))
(* (((expt D 2) x) t) m)
59
There is a Lagrange equation for every degree of freedom. The residuals of
all the equations are zero if the path is realizable. The residuals are arranged
in a down tuple because they result from derivatives of the Lagrangian with
respect to argument slots that take up tuples. See the appendix on notation.
60
Observe that the second derivative is indicated as the square of the derivative
operator (expt D 2). Arithmetic operations in Scmutils extend over operators
as well as functions.
mD²x(t)
(show-expression
(((Lagrange-equations (L-harmonic ’m ’k))
proposed-solution)
’t))
  (k − mω²) a cos(ωt + ϕ)
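The displayed residual (k − mω²) a cos(ωt + ϕ) vanishes for all t exactly when ω² = k/m. A quick Python check (not the book's code; the sample parameter values are arbitrary):

```python
import math

def harmonic_residual(m, k, a, omega, phi, t):
    # Euler-Lagrange residual m x''(t) + k x(t) for the proposed
    # solution x(t) = a cos(omega t + phi); analytically this equals
    # (k - m omega^2) a cos(omega t + phi).
    x = a * math.cos(omega * t + phi)
    xdd = -a * omega ** 2 * math.cos(omega * t + phi)
    return m * xdd + k * x

m, k, a, phi, t = 2.0, 8.0, 0.5, 0.3, 1.7
r_good = harmonic_residual(m, k, a, math.sqrt(k / m), phi, t)  # zero
r_bad = harmonic_residual(m, k, a, 1.0, phi, t)   # nonzero otherwise
```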
Exercise 1.11:
Compute Lagrange’s equations for the Lagrangians in exercise 1.9 using
the Lagrange-equations procedure. Additionally, use the computer to
perform each of the steps in the Lagrange-equations procedure and
show the intermediate results. Relate these steps to the ones you showed
in the hand derivation of exercise 1.9.
Exercise 1.12:
a. Write a procedure to compute the Lagrange equations for Lagrangians
that depend upon acceleration, as in exercise 1.10.
b. Use your procedure to compute the Lagrange equations for the Lagrangian
where V(t, x(t)) = V(t; x₀(t), ..., x_{N−1}(t)) and ∂_{1,α}V(t, x(t)) is the tuple of the components of the derivative of V with respect to the coordinates of the particle with index α, evaluated at time t and coordinates x(t). These conditions are satisfied if for every a_α and b_α

  ∂₂L(t; a₀, ..., a_{N−1}; b₀, ..., b_{N−1}) = [m₀b₀, ..., m_{N−1}b_{N−1}]   (1.56)

and

  ∂₁L(t; a₀, ..., a_{N−1}; b₀, ..., b_{N−1}) = [−∂_{1,0}V(t, a), ..., −∂_{1,N−1}V(t, a)],   (1.57)
61
Remember that x and v are just formal parameters of the Lagrangian. This
x is not the path x used earlier in the derivation, though it could be the value
of that path at a particular time.
62
We can always give a function extra arguments that are not used so that it
can be algebraically combined with other functions of the same shape.
63
Hamilton formulated the fundamental variational principle for time-independent systems in 1834–1835. Jacobi gave this principle the name “Hamilton's principle.” For systems subject to generic, nonstationary constraints Hamilton's principle was investigated in 1848 by Ostrogradsky. In the Russian literature Hamilton's principle is often called the Hamilton–Ostrogradsky principle.
William Rowan Hamilton (1805–1865) was a brilliant 19th-century mathematician. His early work on geometric optics (based on Fermat's principle) was so impressive that he was elected to the post of Professor of Astronomy at Trinity College and Royal Astronomer of Ireland while he was still an undergraduate. He produced two monumental works of 19th-century mathematics. His discovery of quaternions revitalized abstract algebra and sparked the development of vector techniques in physics. His 1835 memoir “On a General Method in Dynamics” put variational mechanics on a firm footing, finally giving substance to Maupertuis's vaguely stated Principle of Least Action of 100 years before. Hamilton also wrote poetry and carried on an extensive correspondence with Wordsworth, who advised him to put his energy into writing mathematics rather than poetry.
(show-expression
(((Lagrange-equations
(L-uniform-acceleration ’m ’g))
(up (literal-function ’x)
(literal-function ’y)))
’t))
  [ mD²x(t),  gm + mD²y(t) ]
As a procedure:
(define ((L-central-rectangular m U) local)
(let ((q (coordinate local))
(v (velocity local)))
(- (* 1/2 m (square v))
(U (sqrt (square q))))))
  mD²x(t) + DU(√((x(t))² + (y(t))²)) x(t) / √((x(t))² + (y(t))²)

  mD²y(t) + DU(√((x(t))² + (y(t))²)) y(t) / √((x(t))² + (y(t))²)
  x = r cos ϕ
  y = r sin ϕ.   (1.62)

  Dx̃(t) = Dr̃(t) cos ϕ̃(t) − r̃(t) Dϕ̃(t) sin ϕ̃(t)
  Dỹ(t) = Dr̃(t) sin ϕ̃(t) + r̃(t) Dϕ̃(t) cos ϕ̃(t).   (1.64)
These relations are valid for any configuration path at any moment, so we can abstract them to relations among coordinate representations of an arbitrary velocity. Let vx and vy be the rectangular components of the velocity, and let ṙ and ϕ̇ be the rates of change of r and ϕ. Then
  mD²r(t) − mr(t)(Dϕ(t))² + DU(r(t))
Exercise 1.13:
Check that the Lagrange equations for central force motion in polar
coordinates and the Lagrange equations in rectangular coordinates are
equivalent. Determine the relationship among the second derivatives
by substituting paths into the transformation equations and computing
derivatives, then substitute these relations into the equations of motion.
64
We will talk much more about angular momentum later.
L′ = L ∘ C.   (1.70)

  (t, x, v, ...) = C(t, x′, v′, ...)
                 = (t, F(t, x′), ∂₀F(t, x′) + ∂₁F(t, x′)v′, ...).   (1.74)

L′ = L ∘ C   (1.75)
Exercise 1.14:
Show by direct calculation that the Lagrange equations for L′ are satisfied if the Lagrange equations for L are satisfied.
65
As described in footnote 28 the procedure ->local constructs a local tuple
from an initial segment of time, coordinates, and velocities.
In terms of the polar coordinates and the rates of change of the polar coordinates, the rates of change of the rectangular components are:
(show-expression
(velocity
((F->C p->r)
(->local ’t (up ’r ’phi) (up ’rdot ’phidot)))))
à !
−ϕ̇r sin (ϕ) + ṙ cos (ϕ)
We can use F->C to find the Lagrangian for central force motion in
polar coordinates from the Lagrangian in rectangular components,
using equation (1.70),
(define (L-central-polar m U)
(compose (L-central-rectangular m U) (F->C p->r)))
(show-expression
((L-central-polar ’m (literal-function ’U))
(->local ’t (up ’r ’phi) (up ’rdot ’phidot))))
  ½ m ϕ̇² r² + ½ m ṙ² − U(r)
Lagrangians analytically, then check the results with the computer by generalizing the programs that we have presented.
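One such check can be sketched directly in Python (a stand-in for the book's F->C composition, not its actual code): the polar Lagrangian obtained above must agree with the rectangular Lagrangian composed with the coordinate transformation at every state.

```python
import math

def L_rect(m, U, local):
    # L = (1/2) m (v . v) - U(|q|) in rectangular coordinates;
    # local = (t, (x, y), (vx, vy)), mimicking the book's local tuples.
    t, (x, y), (vx, vy) = local
    return 0.5 * m * (vx ** 2 + vy ** 2) - U(math.hypot(x, y))

def L_polar(m, U, local):
    # (1/2) m phidot^2 r^2 + (1/2) m rdot^2 - U(r), as derived above.
    t, (r, phi), (rdot, phidot) = local
    return 0.5 * m * (rdot ** 2 + r ** 2 * phidot ** 2) - U(r)

def p_to_r(local):
    # The transformation x = r cos(phi), y = r sin(phi) together with
    # its induced velocity transformation (equation 1.64).
    t, (r, phi), (rdot, phidot) = local
    x, y = r * math.cos(phi), r * math.sin(phi)
    vx = rdot * math.cos(phi) - r * phidot * math.sin(phi)
    vy = rdot * math.sin(phi) + r * phidot * math.cos(phi)
    return (t, (x, y), (vx, vy))

# Sample state (arbitrary values): the two Lagrangians agree on
# corresponding states, i.e. L_polar = L_rect composed with p_to_r.
m, U = 2.0, (lambda r: 1.0 / r)
local = (0.0, (1.5, 0.7), (0.2, -0.3))
```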
66
See section 1.6.1.
[Figure: a pendulum of length l and mass m, with angle θ measured from the vertical; the support moves vertically with height ys(t); g is the acceleration of gravity; x and y are the rectangular axes.]
A Lagrangian is L = T − V .
The Lagrangian is expressed as
(define ((T-pend m l g ys) local)
(let ((t (time local))
(theta (coordinate local))
(thetadot (velocity local)))
(let ((vys (D ys)))
(* 1/2 m
(+ (square (* l thetadot))
(square (vys t))
(* 2 l (vys t) thetadot (sin theta)))))))
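The kinetic energy in T-pend can be verified against first principles. A Python sketch (the sign convention y = ys(t) − l cos θ, with the bob hanging below the moving support, is an assumption consistent with the figure):

```python
import math

def T_pend(m, l, ys_dot, local):
    # Kinetic energy of the driven pendulum, as in the T-pend
    # procedure; local = (t, theta, thetadot), ys_dot = D(ys).
    t, theta, thetadot = local
    vys = ys_dot(t)
    return 0.5 * m * ((l * thetadot) ** 2 + vys ** 2
                      + 2.0 * l * vys * thetadot * math.sin(theta))

# Direct check: with x = l sin(theta), y = ys(t) - l cos(theta),
# the rectangular velocities are vx = l thetadot cos(theta) and
# vy = ys'(t) + l thetadot sin(theta), so T = (1/2) m (vx^2 + vy^2).
m, l = 1.3, 0.8
ys_dot = lambda t: 0.5 * math.cos(t)     # arbitrary drive
t, theta, thetadot = 0.4, 0.9, -1.1      # arbitrary state
vx = l * thetadot * math.cos(theta)
vy = ys_dot(t) + l * thetadot * math.sin(theta)
T_direct = 0.5 * m * (vx ** 2 + vy ** 2)
# T_direct agrees with T_pend: the cross terms expand to
# 2 l ys'(t) thetadot sin(theta), exactly the term in the procedure.
```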
Exercise 1.16:
Derive the Lagrangians in exercise 1.9.
67
We hope you appreciate the TeX magic here. A symbol with an underline character is converted by show-expression to a subscript. Symbols with carets, the names of Greek letters, and symbols terminating in the characters “dot” are similarly mistreated.
[Figures: exercise systems with masses m1, m2, m3, lengths l1, l2, an angle θ, a horizontal coordinate x, and gravity g.]
Why it works
In this section we show that L = T − V is in fact a suitable
Lagrangian for rigidly constrained systems. We do this by requir-
{β|β↔α}
68
We will simply accept the Newtonian procedure for systems with rigid constraints and find Lagrangians that are equivalent. Of course, actual bodies are never truly rigid, so we may wonder what detailed approximations have to be made to treat them as truly rigid. For instance, a more satisfying approach would be to replace the rigid distance constraints by very stiff springs. We could then immediately write the Lagrangian as L = T − V, and we should be able to derive the Newtonian procedure for systems with rigid constraints as an approximation. However, this is too complicated to do at this stage, so we accept the Newtonian idealization.
where Fαβ (t) is the scalar magnitude of the tension in the constraint at time t. Note that F⃗αβ = −F⃗βα. In general, the scalar constraint forces change as the system evolves.
Formally, we can reproduce Newton's equations with the Lagrangian69

L(t; x, F; ẋ, Ḟ) = Σα ½ mα ẋα² − V (t, x)
        − Σ{α,β | α<β, α↔β} [Fαβ /(2lαβ)] [(xβ − xα)² − lαβ²].   (1.89)
69
This Lagrangian is purely formal and does not represent a model of the
constraint forces. In particular, note that the constraint terms do not look
like a potential of constraint with a minimum when the constraint is exactly
satisfied. Rather, the constraint terms in the Lagrangian are zero when the
constraint is satisfied, and can be either positive or negative depending on
whether the distance between the particles is larger or smaller than the con-
straint distance.
we find
70
Typically the number of components of x is equal to the sum of the number of components of q and c; adding a strut removes a degree of freedom and adds a distance constraint. However, there are singular cases in which the addition of a single strut can remove more than a single degree of freedom. We do not consider the singular cases here.
71
Consider a function g of, say, three arguments, and let g0 be a function of two
arguments satisfying g0 (x, y) = g(x, y, 0). Then (∂0 g0 )(x, y) = (∂0 g)(x, y, 0).
The substitution of a value in an argument commutes with the taking of
the partial derivative with respect to a different argument. In deriving the
Lagrange equations for q we can set c = l and ċ = 0 in the Lagrangian, but we
cannot do this in deriving the Lagrange equations associated with c, because
we have to take derivatives with respect to those arguments.
D(mα Dxα )(t) = −∂1,α V (t, x(t)) + λ(t)∂1,α ϕ(t, x(t)). (1.100)
[Figure: two particles, m0 at (x0, y0) and m1 at (x1, y1), connected by a rigid rod of length l oriented at angle θ.]
such that Lagrange’s equations will yield the Newton’s equations that
you derived in part a.
c. Make a change of coordinates to a coordinate system with center of
mass coordinates xcm , ycm , angle θ, distance between the particles c, and
tension force F . Write the Lagrangian in these coordinates, and write
the Lagrange equations.
d. You may deduce from one of these equations that c(t) = l. From this fact we get that Dc = 0 and D²c = 0. Substitute these into the Lagrange equations you just computed to get the equations of motion for xcm, ycm, and θ.
e. Make a Lagrangian (= T − V ) for the system described with the irre-
dundant generalized coordinates xcm , ycm , θ and compute the Lagrange
equations from this Lagrangian. They should be the same equations as
you derived for the same coordinates from part d.
Lf (t; x0, . . . , xN−1; v0, . . . , vN−1)
        = Σα ½ mα vα² − V (t; x0, . . . , xN−1; v0, . . . , vN−1)   (1.104)
L = Lf ◦ C. (1.106)
The Lagrangian is
(show-expression
 ((L-pend ’m ’l ’g (literal-function ’ys))
  (->local ’t ’theta ’thetadot)))

glm cos (θ) − gm ys (t) + ½ l²m θ̇² + lm θ̇ Dys (t) sin (θ) + ½ m (Dys (t))²
Dt F (t, q, v, a, . . .) = ∂0 F (t, q, v, a, . . .)
+ ∂1 F (t, q, v, a, . . .) v
+ ∂2 F (t, q, v, a, . . .) a + · · · , (1.114)
72
Components of a tuple structure, such as the value of Γ[q](t), can be selected with selector functions: Ii gets the element with index i from the tuple.
L′ = L + Dt F. (1.115)
Dt F = ∂0 F + ∂1 F Q̇. (1.118)
Show explicitly that the Lagrange equations for Dt F are identically zero,
and thus that the addition of Dt F to a Lagrangian does not affect the
Lagrange equations.
The difference ∆L = L′ − L is
∂0 G1 = ∂0 ∂1 F
∂1 G0 = ∂1 ∂0 F. (1.123)
∂ 0 G1 = ∂ 1 G0 . (1.124)
Furthermore, G1 = ∂1 F , so
∂1 G1 = ∂1 ∂1 F. (1.125)
Note that we have not shown that these conditions are sufficient
for determining that a function is a total time derivative, only that
they are necessary.
73
For example, the Lipschitz condition is that the rate of change of the derivative is bounded by a constant in an open set around each point of the trajectory. See [22] for a good treatment of the Lipschitz condition.
D2 q = A ◦ Γ[q], (1.126)
74
If the coordinates are redundant we cannot, in general, solve for the highest-order derivative. However, since we can transform to irredundant coordinates, solve the initial-value problem in the irredundant coordinates, and construct the redundant coordinates from the irredundant coordinates, we can in general solve the initial-value problem for redundant coordinates. The only hitch is that we may not specify arbitrary initial conditions: the initial conditions must be consistent with the constraints.
1.7 Evolution of Dynamical State 69
∂1 L ◦ Γ[q]
    = ∂0 ∂2 L ◦ Γ[q] + (∂1 ∂2 L ◦ Γ[q]) Dq + (∂2 ∂2 L ◦ Γ[q]) D²q.

Solving for the highest-order derivative, we find

D²q = [∂2 ∂2 L ◦ Γ[q]]⁻¹ [∂1 L ◦ Γ[q] − (∂1 ∂2 L ◦ Γ[q]) Dq − ∂0 ∂2 L ◦ Γ[q]].
75
In Scmutils division by a matrix is interpreted as multiplication on the left
by the inverse matrix.
the state (t, q(t), Dq(t)) at the moment t the derivative of the state
is (1, Dq(t), D2 q(t)) = (1, Dq(t), A(t, q(t), Dq(t))). The procedure
Lagrangian->state-derivative takes a Lagrangian and returns
a procedure that takes a state and returns the derivative of the
state:
(define (Lagrangian->state-derivative L)
(let ((acceleration (Lagrangian->acceleration L)))
(lambda (state)
(up 1
(velocity state)
(acceleration state)))))
(print-expression
 ((harmonic-state-derivative ’m ’k)
  (up ’t (up ’x ’y) (up ’v_x ’v_y))))
(up 1 (up v_x v_y) (up (/ (* -1 k x) m) (/ (* -1 k y) m)))
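In plain Python the same state derivative is a short function over nested tuples; this sketch (not Scmutils; names ours) mirrors harmonic-state-derivative for the two-dimensional harmonic oscillator:

```python
def harmonic_state_derivative(m, k):
    # state = (t, (x, y), (vx, vy)); the derivative is (1, velocities, accelerations)
    def dstate(state):
        t, (x, y), (vx, vy) = state
        return (1.0, (vx, vy), (-k * x / m, -k * y / m))
    return dstate

d = harmonic_state_derivative(2.0, 3.0)
assert d((0.0, (1.0, 0.5), (2.0, -1.0))) == (1.0, (2.0, -1.0), (-1.5, -0.75))
```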
(define ((qv->state-path q v) t)
(up t (q t) (v t)))
(show-expression
 (((Lagrange-equations-first-order (L-harmonic ’m ’k))
   (up (literal-function ’x)
       (literal-function ’y))
   (up (literal-function ’v_x)
       (literal-function ’v_y)))
  ’t))

⎛ 0 ⎞
⎜ Dx (t) − vx (t) ⎟
⎜ Dy (t) − vy (t) ⎟
⎜ Dvx (t) + k x (t)/m ⎟
⎝ Dvy (t) + k y (t)/m ⎠
The zero in the first element of the structure of the Lagrange-equation residuals is just the tautology that time advances uniformly: the time function is just the identity, so its derivative is 1
and the residual is zero. The equations in the second element
constrain the velocity path to be the derivative of the coordinate
path. The equations in the third element give the rate of change
of the velocity in terms of the applied forces.
Numerical integration
A set of first-order ordinary differential equations that give the
state derivative in terms of the state can be integrated to find the
state path that emanates from a given initial state. Numerical
integrators find approximate solutions of such differential equa-
tions by a process illustrated in figure 1.6. The state derivative
produced by Lagrangian->state-derivative can be used by a
package that numerically integrates systems of first-order ordinary
differential equations.
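As a stand-in for such an integrator, here is a classical fourth-order Runge-Kutta step in plain Python (not the Scmutils interface; all names ours), advancing the flat state (t, x, v) of a one-dimensional harmonic oscillator through one full period:

```python
import math

def rk4_step(f, y, dt):
    # one classical Runge-Kutta step for y' = f(y), y a flat list of floats
    k1 = f(y)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt * (a + 2 * b + 2 * c + d) / 6
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def harmonic(m, k):
    # flat state (t, x, v) for a one-dimensional harmonic oscillator
    return lambda s: [1.0, s[2], -k * s[1] / m]

f = harmonic(1.0, 1.0)
n = 6283
dt = 2.0 * math.pi / n                 # exactly one period in n steps
state = [0.0, 1.0, 0.0]
for _ in range(n):
    state = rk4_step(f, state, dt)
# after one period the oscillator returns to its initial state
assert abs(state[1] - 1.0) < 1e-9 and abs(state[2]) < 1e-9
```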
The procedure state-advancer can be used to find the state of
a system at a specified time, given an initial state, which includes
the initial time, and a parametric state-derivative procedure.76
76
The Scmutils system provides a stable of numerical integration routines
that can be accessed through this interface. These include quality-controlled
Runge-Kutta (QCRK4) and Bulirsch-Stoer. The default integration method
is Bulirsch-Stoer.
Figure 1.6 The input to the system derivative is the state. The func-
tion A gives the acceleration as a function of the components that de-
termine the state. The output of the system derivative is the derivative
of the state. The integrator takes the derivative of the state as its in-
put and produces the integrated state, starting at the initial conditions.
Notice how the second-order system is put into first-order form by the
routing of the Dq(t) components in the system derivative.
77
The procedure state-advancer automatically compiles state-derivative procedures the first time they are encountered. The first time a new state derivative is used there is a delay while compilation occurs.
(show-expression
((pend-state-derivative ’m ’l ’g ’a ’omega)
(up ’t ’theta ’thetadot)))
⎛ 1 ⎞
⎜ θ̇ ⎟
⎝ (aω² cos (ωt) sin (θ))/l − (g sin (θ))/l ⎠
((evolve pend-state-derivative
1.0 ;m=1kg
1.0 ;l=1m
9.8 ;g=9.8m/s2
0.1 ;a=1/10 m
(* 2.0 (sqrt 9.8)) ) ;omega
(up 0.0 ;t0 =0
1. ;theta0 =1 radian
0.) ;thetadot0 =0 radians/s
(monitor-theta plot-win)
0.01 ;step between plotted points
100.0 ;final time
1.0e-13) ;local error tolerance
Figure 1.7 shows the angle θ versus time for a couple of orbits for
the driven pendulum. The initial conditions for the two runs are
the same except that in one the bob is given a tiny velocity equal to
10−10 m/s, about one atom width per second. The initial segments
78
The results are plotted in a plot-window that is created by the procedure frame with arguments xmin, xmax, ymin, ymax, which specify the limits of the plotting area. Points are added to the plot with the procedure plot-point, which takes a plot-window and the abscissa and ordinate of the point to be plotted.
The procedure principal-value is used to reduce an angle to a standard
interval. The argument to principal-value is the point at which the circle is
to be cut. Thus (principal-value :pi) is a procedure that reduces an angle
θ to the interval −π ≤ θ < π.
[Figure 1.7: two plots of θ (from −π to +π) versus time (0 to 100) for the two driven-pendulum runs.]
79
In the older literature conserved quantities are sometimes called first inte-
grals.
1.8.1 Conserved Momenta 77
So we see that p = P ◦ Γ[q], with components

pi = Pi ◦ Γ[q]. (1.135)
The momentum path is well defined for any path q. If the path is
realizable and the Lagrangian does not depend on qⁱ, then pi is a
constant function
Dpi = 0. (1.136)
80
The derivative of a component is equal to the component of the derivative.
81
Observe that we indicate a component of the generalized momentum with
a subscript, and indicate a component of the generalized coordinates with a
superscript. These conventions are consistent with the ones that are commonly
used in tensor algebra, which is sometimes helpful in working out complex
problems.
82
In general, conserved quantities in a physical system are associated with
continuous symmetries, whether or not one can find a coordinate system in
which the symmetry is apparent. This powerful notion was formalized and a
theorem linking conservation laws with symmetries was proved by E. Noether
early in the 20th century. See section 1.8.4 on Noether’s theorem.
1.8.2 Energy Conservation 79
…constant of the motion, the energy, if the Lagrangian L(t, q, q̇) does
not depend explicitly on the time: ∂0 L = 0.
Consider the time derivative of the Lagrangian along a solution
path q:
Isolating ∂0 L and combining the first two terms on the right side
E = P Q̇ − L, (1.140)
83
The sign of the energy state function is a matter of convention.
E = P Q̇ − L = P Q̇ − T + V. (1.144)
E = 2T − T + V = T + V. (1.145)
84
Euler’s theorem says that if f is a function of x = (x0, x1, . . .) that is homogeneous of degree n in the xi, then

Σi (∂f/∂xi)(x) xi = n f (x).
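The identity E = T + V can be checked numerically without doing any algebra. This plain-Python sketch (names and the sample potential U(r) = r² are ours) forms E = Σi q̇i ∂L/∂q̇i − L for the polar central-force Lagrangian, taking the velocity partials by central differences, and compares with T + V:

```python
def L(q, qdot, m=1.3):
    # polar central-force Lagrangian with the sample potential U(r) = r^2
    (r, phi), (rdot, phidot) = q, qdot
    return 0.5 * m * (rdot**2 + (r * phidot)**2) - r**2

def energy(q, qdot, eps=1e-6):
    # E = sum_i qdot_i dL/dqdot_i - L, velocity partials by central differences
    E = -L(q, qdot)
    for i in range(2):
        up = list(qdot); dn = list(qdot)
        up[i] += eps; dn[i] -= eps
        E += qdot[i] * (L(q, tuple(up)) - L(q, tuple(dn))) / (2 * eps)
    return E

q, qdot = (1.7, 0.4), (0.2, 0.9)
T = 0.5 * 1.3 * (0.2**2 + (1.7 * 0.9)**2)   # kinetic energy at the sample point
V = 1.7**2                                   # potential energy U(r) = r^2
assert abs(energy(q, qdot) - (T + V)) < 1e-6
```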
1.8.3 Central Forces in Three Dimensions 81
Exercise 1.28:
An analogous result holds when the fα do depend explicitly on time.
a. Show that in this case the kinetic energy contains terms that are
linear in the generalized velocities.
b. Show that, by adding a total time derivative, the Lagrangian can
be written in the form L = A − B, where A is a homogeneous quadratic
form in the generalized velocities, and B is velocity independent.
c. Show, using Euler’s theorem, that the energy function is E = A + B.
An example where terms that were linear in the velocity were removed
from the Lagrangian by adding a total time derivative has already been
given: the driven pendulum.
Exercise 1.29:
A particle of mass m slides off a horizontal cylinder of radius R in a
uniform gravitational field with acceleration g. If the particle starts
close to the top with zero initial speed, with what angular velocity does
the particle leave the cylinder?
As a procedure:
(define ((T3-spherical m) state)
(let ((t (time state))
(q (coordinate state))
(qdot (velocity state)))
(let ((r (ref q 0))
(theta (ref q 1))
(phi (ref q 2))
(rdot (ref qdot 0))
(thetadot (ref qdot 1))
(phidot (ref qdot 2)))
(* 1/2 m
(+ (square rdot)
(square (* r thetadot))
(square (* r (sin theta) phidot)))))))
Let’s first look at the generalized forces (the derivatives of the La-
grangian with respect to the generalized coordinates). We com-
pute these with a partial derivative with respect to the coordinate
argument of the Lagrangian:
(show-expression
(((partial 1) (L3-central ’m (literal-function ’V)))
(up ’t
(up ’r ’theta ’phi)
(up ’rdot ’thetadot ’phidot))))
mϕ̇2 r (sin (θ))2 + mrθ̇2 − DV (r)
mϕ̇2 r2 cos (θ) sin (θ)
0
mṙ
mr2 θ̇
mr2 ϕ̇ (sin (θ))2
(show-expression
((compose (ang-mom-z ’m) (F->C s->r))
(up ’t
(up ’r ’theta ’phi)
(up ’rdot ’thetadot ’phidot))))
½ mϕ̇² r² (sin (θ))² + ½ mr² θ̇² + ½ mṙ² + V (r)
x′ = F̃(s)(t, x). (1.146)
85
Noether’s theorem is more general than we state and prove it here. We assume the transformations F̃(s) have no dependence on the generalized velocities. Properly, we should also consider velocity-dependent symmetries.
1.8.4 Noether’s Theorem 85
For s = 0 the paths q and q′ are the same, so Γ[q] = Γ[q′], and
this equation becomes
86
The total time derivative is like a derivative with respect to a real-number argument in that it does not generate structure, so it can commute with derivatives that generate structure. Be careful, though: it may not commute with some derivatives for other reasons. For example, Dt ∂1(F̃(s)) is the same as ∂1 Dt(F̃(s)), but Dt ∂2(F̃(s)) is not the same as ∂2 Dt(F̃(s)). The reason is that F̃(s) does not depend on the velocity, but Dt(F̃(s)) does.
L(t; x, y, z; vx, vy, vz)
        = ½ m (vx² + vy² + vz²) − U (√(x² + y² + z²)), (1.155)
x² + y² + z² = (x′)² + (y′)² + (z′)². (1.157)
and

DF̃(0)(t; x, y, z) = DRz(0)(x, y, z) = (y, −x, 0). (1.161)
87
The definition of the procedure Rx is
(define ((Rx angle) q)
(let ((ca (cos angle)) (sa (sin angle)))
(let ((x (ref q 0)) (y (ref q 1)) (z (ref q 2)))
(up x
(- (* ca y) (* sa z))
(+ (* sa y) (* ca z))))))
The definitions of Ry and Rz are similar.
(define Noether-integral
(let ((L (L-central-rectangular
’m (literal-function ’U))))
(* ((partial 2) L) ((D F-tilde) 0 0 0))))
(print-expression
(Noether-integral
(up ’t
(up ’x ’y ’z)
(up ’vx ’vy ’vz))))
(down (+ (* m vy z) (* -1 m vz y))
(+ (* m vz x) (* -1 m vx z))
(+ (* m vx y) (* -1 m vy x)))
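That such Noether integrals are conserved can also be seen numerically: along an orbit integrated in a central potential, the component m(x vy − y vx) stays constant. A plain-Python sketch (names ours), using a velocity-Verlet step, which preserves angular momentum for central forces, and the hypothetical potential U(r) = −1/r:

```python
def accel(x, y):
    # acceleration for the hypothetical attractive potential U(r) = -1/r
    r3 = (x * x + y * y) ** 1.5
    return -x / r3, -y / r3

m = 1.0
x, y, vx, vy = 1.0, 0.0, 0.0, 1.2            # start on a bound orbit
L0 = m * (x * vy - y * vx)                   # the conserved quantity
dt = 1e-3
ax, ay = accel(x, y)
for _ in range(5000):
    # velocity-Verlet step
    vx += 0.5 * dt * ax; vy += 0.5 * dt * ay
    x += dt * vx; y += dt * vy
    ax, ay = accel(x, y)
    vx += 0.5 * dt * ax; vy += 0.5 * dt * ay
assert abs(m * (x * vy - y * vx) - L0) < 1e-9
```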
that generates an osculating path with the given local tuple com-
ponents. So O(t, q, v, . . .)(t) = q, D(O(t, q, v, . . .))(t) = v, and in
general
f = Γ̄(f̄). (1.166)
(show-expression
((F->C p->r)
(->local ’t (up ’r ’theta) (up ’rdot ’thetadot))))
⎛ t ⎞
⎜ (r cos (θ), r sin (θ)) ⎟
⎝ (−rθ̇ sin (θ) + ṙ cos (θ), rθ̇ cos (θ) + ṙ sin (θ)) ⎠
Ē[L][q] = 0. (1.170)
E[L](t, q, v, . . .) = Γ̄(Ē[L])(t, q, v, . . .)
= D(∂2 L ◦ Γ[O(t, q, v, . . .)])
− ∂1 L ◦ Γ[O(t, q, v, . . .)]
E[L] = Dt ∂2 L − ∂1 L. (1.174)
88
Notice that Gamma has one more argument than it usually has. This argument
gives the length of the initial segment of the local tuple needed. The default
length is 3, giving components of the local tuple up to and including the
velocities.
1.10 Constrained Motion 93
δη (ϕ̄) = 0. (1.178)
89
Given any acceptable variation we may make another acceptable variation by
multiplying the given one by a bump function that emphasizes any particular
time interval.
1.10.1 Coordinate Constraints 95
Note that these are functions of time; the variation at a given time
is tangent to the constraint at that time.
∂2 ϕ ≡ 0.
(∂1 ϕ ◦ Γ) η = 0. (1.180)
That the two vectors are parallel everywhere along the path does
not guarantee that the proportionality factor is the same at each
moment along the path, so the proportionality factor λ is some
function of time, which may depend on the path under consider-
ation. These equations, with the constraint equation ϕ ◦ Γ[q] = 0,
are the governing equations. These equations are sufficient to de-
termine the path q and to eliminate the unknown function λ.
Now watch this
Suppose we form an augmented Lagrangian treating λ as one of
the coordinates
90
We take two tuple-valued functions of time to be orthogonal if at each instant
the dot product of the tuples is zero. Similarly, tuple-valued functions are
considered parallel if at each moment one of the tuples is a scalar multiple of
the other. The scalar multiplier is in general a function of time.
91
Recall that the Euler-Lagrange operator E has the property
E [F G] = F E[G] + E[F ] G + Dt F ∂2 G + ∂2 F Dt G.
[Figure 1.8: the pendulum as a constrained particle; rod of length l at angle θ from the vertical.]
The Lagrange equations are the same as those derived from the
augmented Lagrangian L0 . The difference is that now we see that
λ = Λ ◦ Γ[q] is determined by the unaugmented state. This is the
same as saying that λ can be eliminated.
Considering only the formal validity of the Lagrange equations
for the augmented Lagrangian, we could not deduce that λ could
be written as the composition of a state-dependent function Λ with
Γ[q]. The explicit Lagrange equations derived from the augmented
Lagrangian depend on the accelerations D2 q as well as λ so we
may not deduce separately that either is the composition of a
state-dependent function and Γ[q]. However, now we see that λ is
such a composition. This allows us to deduce that D2 q is also a
state-dependent function composed with the path. The evolution
of the system is determined from the dynamical state.
The pendulum using constraints
The pendulum can be formulated as the motion of a massive par-
ticle in a vertical plane subject to the constraint that the distance
to the pivot is constant (see figure 1.8).
In this formulation, the kinetic and potential energies in the
Lagrangian are those of an unconstrained particle in a uniform
x2 + y 2 − l2 = 0. (1.188)
These equations are sufficient to solve for the motion of the pen-
dulum.
It should not be surprising that these equations simplify if we
switch to “polar” coordinates
Multiplying the first by cos θ and the second by sin θ and adding,
we find
92
This constraint has the same form as the constraints used in the demonstra-
tion that L = T − V can be used for rigid systems. Here it is a particular
example of a more general set of constraints.
93
Indeed, if we had scaled the constraint equations as we did in the discussion of Newtonian constraint forces, we could have identified λ with the magnitude of the constraint force F. However, though λ will in general be related to the constraint forces, it will not be one of them. We chose to leave the scaling as it naturally appeared rather than make things turn out artificially pretty.
[Figure: a chain of two spring–mass subsystems with spring constants k1, k2, masses m1, m2, and coordinates X1, x1, X2, x2; ξ marks where the second subsystem attaches to the first.]
Let’s see how this works. The Lagrangian for the subsystem
attached to the wall is
m1 D2 x1 = −k1 x1 − λ (1.201)
m2 (D2 ξ + D2 x2 ) = −k2 x2 (1.202)
m2 (D2 ξ + D2 x2 ) = λ (1.203)
0 = ξ − (X1 + x1 ) (1.204)
m1 D2 x1 + m2 (D2 x1 + D2 x2 ) + k1 x1 = 0 (1.205)
m2 (D2 x1 + D2 x2 ) + k2 x2 = 0 (1.206)
ψ = Dt ϕ = ∂0 ϕ + ∂1 ϕQ̇. (1.208)
∂1 ϕ = ∂2 ψ, (1.211)
L′ = L + λ′ψ. (1.213)
[Figure 1.10: a hoop rolling without slipping down an inclined plane; θ is the rotation of the hoop and x its progress down the plane.]
Exercise 1.37:
Show that the augmented Lagrangian (1.213) does lead to the Lagrange
equations (1.214), taking into account the fact that ψ is a total time
derivative of ϕ.
Goldstein’s hoop
Here we consider a problem for which the constraint can be rep-
resented as a time derivative of a coordinate constraint: a hoop
of mass M rolling, without slipping, down a (one-dimensional)
inclined plane (see figure 1.10).94
We will formulate this problem in terms of the two coordinates
θ, the rotation of an arbitrary point on the hoop from an arbitrary
reference direction, and x, the linear progress down the inclined
plane. The constraint is that the hoop does not slip. Thus a
change in θ is exactly reflected in a change in x; the constraint
function is:
94
This example appears in [18], pages 49–51.
1.10.2 Derivative Constraints 105
The kinetic energy has two parts, the energy of rotation of the
hoop and the energy of the motion of its center of mass.95 The
potential energy of the hoop decreases as the height decreases.
Thus we may write the augmented Lagrangian:
M D2 x − Dλ = M g sin ϕ (1.217)
M R2 D2 θ + R Dλ = 0 (1.218)
R Dθ − Dx = 0. (1.219)
D2 x = RD2 θ. (1.220)
D²x = ½ g sin ϕ (1.221)
is just half of what it would have been if the mass had just slid
down a frictionless plane without rotating. Note that for this hoop
D2 x is independent of both M and R. We see from the Lagrange
equations that Dλ can be interpreted as the friction force involved
in enforcing the constraint. The frictional force of constraint is
Dλ = ½ M g sin ϕ. (1.222)
95
We will see in chapter 2 how to compute the kinetic energy of rotation, but for now the answer is ½ M R² θ̇².
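The elimination that leads to (1.221) can be verified mechanically. Treating D²x, D²θ, and Dλ as three unknowns in the linear system (1.217)–(1.219), the values D²x = ½ g sin ϕ, D²θ = D²x/R, and Dλ = −M D²x satisfy the three equations as printed; the sign attached to Dλ depends on the sign convention chosen for the constraint. A plain-Python check (names ours):

```python
import math

M, R, g = 2.0, 0.5, 9.8
s = math.sin(0.3)                  # sin(phi) for an arbitrary incline angle

ax = 0.5 * g * s                   # claimed D^2 x
ath = ax / R                       # D^2 theta from the constraint (1.220)
F = -M * ax                        # D(lambda) consistent with (1.218) and (1.220)

# residuals of (1.217), (1.218), (1.219 differentiated), as printed above
r1 = M * ax - F - M * g * s
r2 = M * R * R * ath + R * F
r3 = ax - R * ath
assert abs(r1) < 1e-12 and abs(r2) < 1e-12 and abs(r3) < 1e-12
```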
96
For some treatments of non-holonomic systems see, for example, Whit-
taker [43], Goldstein [18], Gantmakher [17], or Arnold et al. [6].
1.10.3 Non-Holonomic Systems 107
and
0 = E[L] ◦ Γ[q]
+ Dλ(∂2 ψ) ◦ Γ[q] + λD((∂2 ψ) ◦ Γ[q]) − λ(∂1 ψ) ◦ Γ[q]. (1.234)
ψ ◦ Γ[q] = 0. (1.235)
97
Arnold et al. [6] call the variational mechanics with the constraints added to the Lagrangian vakonomic mechanics.
1.11 Summary
1.12 Projects
Exercise 1.38: A numerical investigation
Consider a pendulum: a mass m supported on a massless rod of length
l, in a uniform gravitational field. A Lagrangian for the pendulum is:
L(t, θ, θ̇) = ½ m (lθ̇)² + mgl cos θ
For the pendulum, the period of the motion depends on the amplitude.
We wish to find trajectories of the pendulum with a given frequency.
Three methods of doing this present themselves: (1) solution by the
principle of least action, (2) numerical integration of Lagrange’s equa-
tion, and (3) analytic solution (which requires some exposure to elliptic
functions). We will carry out all three, and compare the solution trajec-
tories.
To be specific, consider the parameters m = 1 kg, l = 1 m, g = 9.8 m s⁻². The frequency of small-amplitude oscillations is ω0 = √(g/l). Let's find the non-trivial solution that has the frequency ω1 = (4/5)ω0.
a. The angle is periodic in time, so a Fourier series representation is
appropriate. We can choose the origin of time so that a zero crossing
of the angle is at time zero. Since the potential is even in the angle,
the angle is an odd function of time. Thus we need only a sine series.
Since the angle returns to zero after one-half period the angle is an odd
function of time about the midpoint. Thus only odd terms of the series
are present:
θ(t) = Σ_{n≥1} An sin((2n − 1)ω1 t).

The amplitude of the trajectory is A = θmax = Σ_{n≥1} (−1)ⁿ⁺¹ An.
Find approximations to the first few coefficients An by minimizing
the action. You will have to write a program similar to the find-path
procedure in section 1.4. Watch out: there is more than one trajectory
that minimizes the action.
b. Write a program to numerically integrate Lagrange’s equations for
the trajectories of the pendulum. The trouble with using numerical
integration to solve this problem is that we do not know how the fre-
quency of the motion depends on the initial conditions. So we have to
guess, and then gradually improve our guess. Define a function Ω(θ̇)
that numerically computes the frequency of the motion as a function of
the initial angular velocity (with θ = 0). Find the trajectory by solving
Ω(θ̇) = ω, for the initial angular velocity of the desired trajectory. Meth-
ods of solving this equation include successive bisection, minimizing the
squared residual, etc.—choose one.
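One way to implement Ω, sketched in plain Python rather than Scmutils (names and step size ours): integrate the pendulum equation from θ = 0 with a Runge-Kutta step and detect the first return of θ to zero, which occurs at half a period.

```python
import math

g, l = 9.8, 1.0

def pend_step(th, w, dt):
    # one classical Runge-Kutta step for theta'' = -(g/l) sin(theta)
    f = lambda th, w: (w, -(g / l) * math.sin(th))
    k1 = f(th, w)
    k2 = f(th + 0.5 * dt * k1[0], w + 0.5 * dt * k1[1])
    k3 = f(th + 0.5 * dt * k2[0], w + 0.5 * dt * k2[1])
    k4 = f(th + dt * k3[0], w + dt * k3[1])
    return (th + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            w + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def Omega(thetadot0, dt=1e-4):
    # frequency of the orbit with theta(0) = 0, Dtheta(0) = thetadot0
    th, w, t = 0.0, thetadot0, 0.0
    while True:
        th2, w2 = pend_step(th, w, dt)
        t += dt
        if th > 0.0 and th2 <= 0.0:
            # interpolate the zero crossing; it occurs at half a period
            tc = t + dt * th2 / (th - th2)
            return math.pi / tc
        th, w = th2, w2

# small amplitude: Omega approaches omega_0 = sqrt(g/l)
assert abs(Omega(0.01) - math.sqrt(g / l)) < 1e-3
```

Solving Ω(θ̇) = ω1 for the initial angular velocity can then be done by bisection on this function.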
Figure 1.11 The double pendulum is pinned in two joints so that its
members are free to move in a plane.
c. Now let’s formulate the analytic solution for the frequency as a func-
tion of amplitude. The period of the motion is simply
T = 4 ∫₀^(T/4) dt = 4 ∫₀^A (1/θ̇) dθ.
Using the energy, solve for θ̇ in terms of the amplitude A and θ to write
the required integral explicitly. This integral can be written in terms
of elliptic functions, but in a sense this does not solve the problem—we
still have to compute the elliptic functions. Let’s avoid this excursion
into elliptic functions and just do the integral numerically using the
procedure definite-integral. We still have the problem that we can
specify the amplitude A and get the frequency; to solve our problem
we need the inverse, but that can be done as in part b.
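For reference, the elliptic-function route is itself only a few lines if the complete elliptic integral is evaluated with the arithmetic-geometric mean: a standard form of the pendulum period is T = 4√(l/g) K(sin(A/2)), with K(k) = π/(2 agm(1, √(1 − k²))). A plain-Python sketch (names ours):

```python
import math

def agm(a, b):
    # arithmetic-geometric mean
    while abs(a - b) > 1e-15:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def period(A, l=1.0, g=9.8):
    # T = 4 sqrt(l/g) K(sin(A/2)), with K(k) = pi / (2 agm(1, sqrt(1 - k^2)))
    k = math.sin(0.5 * A)
    K = math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))
    return 4.0 * math.sqrt(l / g) * K

# as A -> 0 the period approaches the small-amplitude value 2 pi sqrt(l/g)
assert abs(period(1e-4) - 2.0 * math.pi * math.sqrt(1.0 / 9.8)) < 1e-8
```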
l2 = 0.9 m
m1 = 1.0 kg
m2 = 3.0 kg
1
We put a rubber band around the book so that it does not open.
114 Chapter 2 Rigid Bodies
It turns out that the kinetic energy of a rigid body can be sepa-
rated into two pieces: a kinetic energy of translation and a kinetic
energy of rotation. Let’s see how this comes about.
The configuration of a rigid body is fully specified given the
location of any point in the body and the orientation of the body.
This suggests that it would be useful to decompose the position
vectors for the constituent particles as the sum of the vector X⃗ to some reference position in the body and the vector ξ⃗α from the reference position to the particular constituent element with index α:

x⃗α = X⃗ + ξ⃗α, (2.2)

with velocities

ẋ⃗α = Ẋ⃗ + ξ̇⃗α. (2.3)
The kinetic energy is the sum of the kinetic energy of the motion
of the total mass at the center of mass
½ M Ẋ⃗ · Ẋ⃗, (2.9)
2
For an elementary geometric proof of Euler’s theorem see Whittaker [43].
with
Iij = Σα mα (êi × ξ⃗α) · (êj × ξ⃗α). (2.14)
The quantities Iij are the components of the inertia tensor with
respect to the chosen coordinate system. Note what a remarkable
form the kinetic energy has taken. All we have done is interchange
the order of summations, but now the kinetic energy is written as
a sum of products of components of the angular velocity vector,
which completely specify how the orientation of the body is chang-
ing, and the quantity Iij , which depends solely on the distribution
of mass in the body relative to the chosen coordinate system.
We will deduce a number of properties of the inertia tensor.
First, we find a somewhat simpler expression for it. The components of the vector ξ⃗α are (ξα, ηα, ζα).3 Rewriting ξ⃗α as a sum
over its components, and simplifying the elementary vector prod-
ucts of basis vectors, the components of the inertia tensor can be
arranged in the inertia matrix I, which looks like:
⎡ Σα mα (ηα² + ζα²)     −Σα mα ξα ηα          −Σα mα ξα ζα       ⎤
⎢ −Σα mα ηα ξα          Σα mα (ξα² + ζα²)     −Σα mα ηα ζα       ⎥   (2.15)
⎣ −Σα mα ζα ξα          −Σα mα ζα ηα          Σα mα (ξα² + ηα²)  ⎦
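Definition (2.14) and the matrix (2.15) can be checked against each other numerically. A plain-Python sketch (names ours) computes Iij from the cross-product definition for a pair of point masses and compares two representative entries with the closed forms in the matrix:

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

E3 = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

def inertia(ms, xis):
    # I_ij = sum_alpha m_alpha (e_i x xi_alpha) . (e_j x xi_alpha)   -- (2.14)
    return [[sum(m * dot(cross(E3[i], xi), cross(E3[j], xi))
                 for m, xi in zip(ms, xis))
             for j in range(3)] for i in range(3)]

ms = [1.0, 2.0]
xis = [(0.3, -0.2, 0.5), (-0.1, 0.4, 0.2)]   # (xi, eta, zeta) for each mass
I = inertia(ms, xis)
# closed forms from the matrix (2.15)
I00 = sum(m * (p[1]**2 + p[2]**2) for m, p in zip(ms, xis))
I01 = -sum(m * p[0] * p[1] for m, p in zip(ms, xis))
assert abs(I[0][0] - I00) < 1e-12 and abs(I[0][1] - I01) < 1e-12
```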
where ξα⊥ is the perpendicular distance from the line to the con-
stituent with index α. The diagonal components of the inertia
tensor Iii are recognized as the moments of inertia about the lines
coinciding with the coordinate axes êi . The off-diagonal compo-
nents of the inertia tensor are called products of inertia.
The rotational kinetic energy of a body depends on the distri-
bution of mass of the body solely through the inertia tensor. Re-
markably, the inertia tensor involves only second order moments
of the mass distribution with respect to the center of mass. We
might have expected the kinetic energy to depend in a complicated
way on all the moments of the mass distribution, interwoven in
some complicated way with the components of the angular ve-
locity vector, but this is not the case. This fact has a remarkable
consequence: for the motion of a free rigid body the detailed shape
of the body does not matter. If a book and a banana have the
same inertia tensor, that is, the same second order mass moments,
then if they are thrown in the same way the subsequent motion
will be the same, however complicated that motion is. The fact
that the book has corners and the banana has a stem does not affect
the motion except for their contributions to the inertia tensor. In
general, the potential energy of an extended body is not so simple
3
Here we avoid the more consistent notation (ξα⁰, ξα¹, ξα²) for the components of ξ⃗α because it is awkward to write expressions involving powers of the components written this way.
where M is the mass and R is the radius of Jupiter. Find the moment
of inertia of Jupiter in terms of M and R.
4
An orthogonal matrix R satisfies Rᵀ = R⁻¹; for a rotation we also have det R = 1.
5
The last equality follows from the fact that rotation preserves the dot product of two vectors: x⃗ · y⃗ = (Rx⃗) · (Ry⃗), or (R⁻¹x⃗) · y⃗ = x⃗ · (Ry⃗).
vector with respect to the rotated basis vectors ê′i are x′ = R⁻¹x, or equivalently x = Rx′. A rotation that actively rotates the basis vectors, leaving other vectors unchanged, is called a passive rotation. For a passive rotation the components of a fixed vector change as if the vector were actively rotated by the inverse rotation.
With respect to the rectangular basis êi the rotational kinetic
energy is written
½ Σij ωⁱ ωʲ Iij. (2.19)
ω = Rω′. (2.21)

However, if we had started with the basis ê′i, we would have written the kinetic energy directly as

½ (ω′)ᵀ I′ ω′, (2.23)

where the components are taken with respect to the ê′i basis. Comparing the two expressions, we see that

I′ = Rᵀ I R. (2.24)
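The transformation rule (2.24) is exactly what makes the kinetic energy basis-independent, and this can be seen numerically. A plain-Python sketch (names ours), using a rotation about ẑ and an arbitrary symmetric sample inertia matrix:

```python
import math

def Rz(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def mv(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def T_rot(I, w):
    # (1/2) w^T I w
    return 0.5 * sum(w[i] * I[i][j] * w[j] for i in range(3) for j in range(3))

I = [[2.0, 0.1, 0.0], [0.1, 3.0, 0.2], [0.0, 0.2, 4.0]]  # sample inertia matrix
R = Rz(0.8)
wp = [0.5, -1.0, 2.0]                        # omega' in the rotated basis
w = mv(R, wp)                                # omega = R omega'      (2.21)
Ip = matmul(transpose(R), matmul(I, R))      # I' = R^T I R          (2.24)
assert abs(T_rot(I, w) - T_rot(Ip, wp)) < 1e-12
```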
6
We take a 1-by-1 matrix as a number.
7
That the inertia tensor transforms in this manner could have been deduced
from its definition (2.14). However, it seems that this argument, based on the
coordinate-system independence of the kinetic energy, provides insight.
2.5 Principal Moments of Inertia 123
for i ≠ j. Let's assume that I′ is diagonal and solve for the rotation matrix R that does the job. Multiplying both sides of (2.24) on the left by R, we have

x′ = Rᵀ x. (2.29)
8
If two eigenvalues are not distinct then linear combinations of the associ-
ated eigenvectors are eigenvectors. This gives us the freedom to find linear
combinations of the eigenvectors that are orthonormal.
2.6 Representation of the Angular Velocity Vector 125
Now let’s rewrite the kinetic energy in terms of the principal mo-
ments of inertia. If we choose our rectangular coordinate system
so that it coincides with the principal axes then the calculation
is simple. Let the components of the angular velocity vector on
the principal axes be (ω a , ω b , ω c ). Then, keeping in mind that the
inertia tensor is diagonal with respect to the principal axis basis,
the kinetic energy is just
TR = ½ [A(ωᵃ)² + B(ωᵇ)² + C(ωᶜ)²]. (2.30)
Figure 2.1 The rotation M(q(t)) rotates the body from a reference
orientation in which the principal axes are aligned with the basis êi
(labeled by x, y, and z here) to the orientation specified by q(t).
Recall that the velocity results from a rotation, and that the velocities are (see equation 2.11)

Dξ⃗α(t) = ω⃗(t) × ξ⃗α(t). (2.35)
We can get the procedures of local state that give the angu-
lar velocity components by abstracting these procedures along ar-
bitrary paths that have given coordinates and velocities. The
abstraction of a procedure of a path to a procedure of state is
accomplished by Gamma-bar (see section 1.6.1):
(define (M->omega M-of-q)
(Gamma-bar
(M-of-q->omega-of-t M-of-q)))
the rotation that takes the body from some reference orientation
and rotates it to the orientation specified by the generalized coor-
dinates. Here we take the reference orientation so that principal-
axis unit vectors â, b̂, ĉ are coincident with the basis vectors êi
labeled here by x̂, ŷ, ẑ.
We define the Euler angles in terms of simple rotations about
the coordinate axes. Let Rx (ψ) be a right-handed rotation about
the x̂ axis by the angle ψ, and let Rz (ψ) be a right-handed rotation
about the ẑ axis by the angle ψ. The function M for Euler angles
is written as a composition of three of these simple coordinate axis
rotations:
Dϕ (t) sin (θ (t)) sin (ψ (t)) + cos (ψ (t)) Dθ (t)
Dϕ (t) sin (θ (t)) cos (ψ (t)) − sin (ψ (t)) Dθ (t)
cos (θ (t)) Dϕ (t) + Dψ (t)
ϕ̇ sin (ψ) sin (θ) + θ̇ cos (ψ)
ϕ̇ sin (θ) cos (ψ) − θ̇ sin (ψ)
ϕ̇ cos (θ) + ψ̇
where ~xα , ~x˙ α , and mα are the positions, velocities, and masses
of the constituent particles. It turns out that the vector angular
momentum decomposes into the sum of the angular momentum
of the center of mass and the rotational angular momentum about
the center of mass, just as the kinetic energy separates into the
kinetic energy of the center of mass and the kinetic energy of
rotation. As in the kinetic energy demonstration, decompose the
position into the vector to the center of mass X ~ and the vectors
from the center of mass to the constituent mass elements ξ~α :
x⃗α = X⃗ + ξ⃗α, (2.45)
with velocities
ẋ⃗α = Ẋ⃗ + ξ̇⃗α. (2.46)
X⃗ × (M Ẋ⃗), (2.49)
where Ijk are the components of the inertia tensor (2.14). The
angular momentum and the kinetic energy are expressed in terms
of the same inertia tensor.
With respect to the principal axis basis, the angular momentum
components have a particularly simple form:
Lᵃ = A ωᵃ (2.53)
Lᵇ = B ωᵇ (2.54)
Lᶜ = C ωᶜ. (2.55)
Exercise 2.9:
Verify that the expression (2.52) for the components of the rotational
angular momentum (2.51) in terms of the inertia tensor is correct.
134 Chapter 2 Rigid Bodies
(show-expression
(ref (((partial 2) (T-rigid-body ’A ’B ’C)) Euler-state)
1))
A ϕ̇ sin²(θ) sin²(ψ) + A θ̇ cos(ψ) sin(θ) sin(ψ)
+ B ϕ̇ cos²(ψ) sin²(θ) − B θ̇ cos(ψ) sin(θ) sin(ψ)
+ C ϕ̇ cos²(θ) + C ψ̇ cos(θ)
is the same for any choice of coordinate system. Thus the situa-
tion meets the requirements of Noether’s theorem, which tells us
that there is a conserved quantity. In particular, the family of
rotations around each coordinate axis gives us conservation of the
angular momentum component on that axis. We construct the
vector angular momentum by combining these contributions.
The following program monitors the errors in the energy and the
components of the angular momentum:
(define ((monitor-errors win A B C L0 E0) state)
(let ((t (time state))
(L ((Euler-state->L-space A B C) state))
(E ((T-rigid-body A B C) state)))
(plot-point win t (relative-error (ref L 0) (ref L0 0)))
(plot-point win t (relative-error (ref L 1) (ref L0 1)))
(plot-point win t (relative-error (ref L 2) (ref L0 2)))
(plot-point win t (relative-error E E0))))
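The Scheme monitor plots relative errors against reference values; the error computation itself is the only nontrivial part. A minimal Python analogue, assuming the conventional definition (value − reference)/reference (the plotting is omitted):

```python
# Relative error of a monitored quantity against its initial (reference)
# value, as used by monitor-errors; assumed definition, not scmutils source.

def relative_error(value, reference):
    if reference == 0:
        raise ValueError("reference must be nonzero")
    return (value - reference) / reference

# e.g. an energy that has drifted by 1 percent:
assert abs(relative_error(1.01, 1.0) - 0.01) < 1e-12
```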
Figure 2.2 The relative error in energy and in the three spatial com-
ponents of the angular momentum versus time. It is interesting to note
that the energy error is one of the three falling curves.
9. We expect that each constant of the motion reduces by one the dimension of the region of the state space explored by a trajectory, because a constant of the motion can be used to locally solve for one of the state variables in terms of the others.
is conserved.
Using the expressions (2.53 - 2.55) for the angular momentum
in terms of the components of the angular velocity vector on the
2.9.2 Qualitative Features 141
tum to stay there. These points are equilibrium points for the body
components of the angular momentum. However, these points are
not equilibrium points for the system as a whole. At these points
the body is still rotating even though the body components of the
angular momentum are not changing. This kind of equilibrium is
called a relative equilibrium. We can also see that if the angular
momentum is initially slightly displaced from one of these relative
equilibria then the angular momentum is constrained to stay near
it on one of the intersection curves. The angular momentum vec-
tor is fixed in space, so the principal axis of the equilibrium point
of the body rotates stably about the angular momentum vector.
At the principal axis with intermediate moment of inertia, the b̂
axis, the intersection curves cross. As we observed, the dynamics
of the components of the angular momentum on the principal axes
form a self-contained dynamical system. Trajectories of a dynam-
ical system cannot cross,10 so the most that can happen is that
if the equations of motion carry the system along the intersec-
tion curve then the system can only asymptotically approach the
crossing point. So without solving any equations we can deduce
that the point of crossing is another relative equilibrium. If the
angular momentum is initially aligned with the intermediate axis,
then it stays aligned. If the system is slightly displaced from the
intermediate axis, then the evolution along the intersection curve
will take the system far from the relative equilibrium. So rotation
about the axis of intermediate moment of inertia is unstable—
initial displacements of the angular momentum, however small
initially, become large. Again, the angular momentum vector is
fixed in space, but now the principal axis with the intermediate
principal moment does not stay close to the angular momentum,
so the body executes a complicated tumbling motion.
This gives some insight into the mystery of the thrown book
mentioned at the beginning of the chapter. If one throws a book
so that it is initially rotating about either the axis with the largest
or the smallest moment of inertia (the smallest and largest physi-
cal axes, respectively), the book rotates regularly about that axis.
However, if the book is thrown so that it is initially rotating about
the axis of intermediate moment of inertia (the intermediate phys-
ical axis), then the book tumbles, however carefully the book is
10. Systems of ODEs that satisfy a Lipschitz condition have unique solutions.
thrown. You can try it with this book (but put a rubber band
around it first).
Before moving on, we can make some further physical deduc-
tions. Suppose a freely rotating body is subject to some sort of
internal friction that dissipates energy, but conserves the angular
momentum. For example, real bodies flex as they spin. If the
spin axis moves with respect to the body then the flexing changes
with time, and this changing distortion converts kinetic energy
of rotation into heat. Internal processes do not change the total
angular momentum of the system. If we hold the magnitude of
the angular momentum fixed but gradually decrease the energy
then the curve of intersection on which the system moves gradu-
ally deforms. For a given angular momentum there is a lower limit
on the energy; the energy cannot be so low that there are no in-
tersections. For this lowest energy the intersection of the angular
momentum sphere and the energy ellipsoid is a pair of points on
the axis of maximum moment of inertia. With energy dissipation,
a freely rotating physical body eventually ends up with the lowest
energy consistent with the given angular momentum, which is ro-
tation about the principal axis with the largest moment of inertia
(typically the shortest physical axis).
Thus, we expect that given enough time all freely rotating phys-
ical bodies will end up rotating about the axis of largest moment of
inertia. You can demonstrate this to your satisfaction by twirling
a small bottle containing some viscous fluid, such as correction
fluid. What you will find is that, whatever spin you try to put
on the bottle, it will reorient itself so that the axis of the largest
moment of inertia is aligned with the spin axis. Remarkably, this
is very nearly true of almost every body in the solar system for
which there is enough information to decide. The deviations from
principal-axis rotation for the Earth are tiny: the angle between the angular momentum vector and the ĉ axis for the Earth is less than one arc-second.¹¹ In fact, the evidence is that all of the planets, the Moon and all of the other natural satellites, and almost
all of the asteroids rotate very nearly about the largest moment
of inertia. We have deduced that this is to be expected using
an elementary argument. There are exceptions. Comets typically
do not rotate about the largest moment. As they are heated by
11. The deviation of the angular momentum from the principal axis may be due to a number of effects: earthquakes, atmospheric tides, ... .
the sun, material spews out from localized jets, and the back reac-
tion from these jets changes the rotation state. Among the natural
satellites, the only known exception is Saturn’s satellite Hyperion,
which is tumbling chaotically. Hyperion is especially out-of-round
and subject to strong gravitational torques from Saturn.
We have all played with a top at one time or another. For the
purposes of analysis we will consider an idealized top that does
not wander around. Thus, an ideal top is a rotating rigid body,
one point of which is fixed in space. Furthermore, the center of
mass of the top is not at the fixed point, which is the center of
rotation, and there is a uniform gravitational acceleration.
For our top we can take the Lagrangian to be the difference
of the kinetic energy and the potential energy. We already know
how to write the kinetic energy—what is new here is that we must
express the potential energy in terms of the configuration. In the
case of a body in a uniform gravitational field this is easy. The
potential energy is the sum of “mgh” for all the constituent particles:
Σα mα g hα, (2.59)
where the last sum is zero because the center of mass is the origin
of ξ~α . So the potential energy of a body in a gravitational field
with uniform acceleration is very simple: it is just M gh, where M
is the total mass, and h = X⃗ · ẑ is the height of the center of mass.
2.10 Axisymmetric Tops 145
½ A ϕ̇² sin²(θ) + cos(θ) [½ C ϕ̇² cos(θ) + C ϕ̇ ψ̇] + ½ A θ̇² + ½ C ψ̇²
12. That the axisymmetric top can be solved in Euler angles is, no doubt, the reason for the traditional choice of the definition of the Euler angles. For other problems, the Euler angles may offer no particular advantage.
13. Here we do not require that C be larger than A = B, because the moments of inertia are not measured with respect to the center of mass.
where R is the distance of the center of mass from the pivot. The
Lagrangian is L = T − V . We see that the Lagrangian is indeed
independent of ψ and ϕ, as expected.
There is no particular reason to look at the Lagrange equations.
We can assign that job to the computer when needed. However, we
have already seen that it may be useful to examine the conserved
quantities associated with the symmetries.
The energy is conserved, because the Lagrangian has no ex-
plicit time dependence. Also, the energy is the sum of the kinetic
and potential energy E = T + V , because the kinetic energy is
a homogeneous quadratic form in the generalized velocities. The
energy is
E = ½ A (θ̇² + ϕ̇² sin²θ) + ½ C (ψ̇ + ϕ̇ cos θ)² + MgR cos θ. (2.63)
14. Traditionally, evaluating a definite integral is known as performing a quadrature.
Figure 2.5 The tilt angle π − θ of the top versus time. The tilt of the
top varies periodically.
Figure 2.6 The precession angle ϕ of the top versus time. The top
precesses nonuniformly—the rate of precession varies as the tilt varies.
Figure 2.7 The rate of rotation ψ̇ of the top versus time. The rate of
rotation of the top changes periodically, as the tilt of the top varies.
Figure 2.8 An idea of the actual motion of the top is obtained by
plotting the tilt angle π − θ versus the precession angle ϕ. This is a
“latitude-longitude” map showing the path of the center of mass of the
top. We see that though the top has a net precession it executes a
looping motion as it precesses.
Figure: The geometry for the potential energy of a body interacting with a distant point mass M′. The body's center of mass is at X⃗ and the point mass at x⃗, separated by distance R; a constituent mα sits at ξ⃗α from the center of mass, at distance rα from the point mass.
x⃗, and the center of mass has position X⃗. The vector from the center of mass to the constituent with index α is ξ⃗α, and has magnitude ξα. The distance rα is then given by the law of cosines
rα² = R² + ξα² − 2 ξα R cos θα,
where θα is the angle between x⃗ − X⃗ and ξ⃗α. The potential energy is then
−GM′ Σα [ mα / (R² + ξα² − 2 ξα R cos θα)^(1/2) ]. (2.68)
15. The Legendre polynomials Pₗ may be obtained by expanding (1 + y² − 2yx)^(−1/2) as a power series in y. The coefficient of yˡ is Pₗ(x). The first few Legendre polynomials are P₀(x) = 1, P₁(x) = x, P₂(x) = (3/2)x² − 1/2, and so on. The rest satisfy the recurrence relation
l Pₗ(x) = (2l − 1) x Pₗ₋₁(x) − (l − 1) Pₗ₋₂(x).
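The recurrence can be spot-checked numerically against the generating function. A plain Python sketch (not the book's Scheme; the evaluation point and the truncation order are arbitrary choices):

```python
# Verify that the Legendre recurrence l P_l = (2l-1) x P_{l-1} - (l-1) P_{l-2}
# generates the coefficients of the expansion of (1 + y^2 - 2yx)^(-1/2) in y.

def legendre(l, x):
    if l == 0:
        return 1.0
    p0, p1 = 1.0, x
    for k in range(2, l + 1):
        p0, p1 = p1, ((2*k - 1) * x * p1 - (k - 1) * p0) / k
    return p1

x, y = 0.37, 0.01        # sample point; y small so the series converges fast
series = sum(legendre(l, x) * y**l for l in range(20))
closed = (1 + y*y - 2*y*x) ** -0.5
assert abs(series - closed) < 1e-12
assert abs(legendre(2, x) - (1.5*x*x - 0.5)) < 1e-12
```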
Exercise 2.14:
a. Fill in the details that show that the sum over constituents in equation (2.72) can be expressed as written in terms of moments of inertia. In particular, show that
Σα mα ξα cos θα = 0,
Σα mα ξα² = ½ (A + B + C),
and that
Σα mα ξα² (sin θα)² = I.
16. This approximate representation of the potential energy is sometimes called MacCullagh's formula.
17. Watch out: we just reused α. It was also used as the constituent index.
I = α²A + β²B + γ²C.
Figure 2.10 The spin-orbit model problem in which the spin axis is
constrained to be perpendicular to the orbit plane has a single degree
of freedom, the orientation of the body in the orbit plane. Here the
orientation is specified by the generalized coordinate θ.
We are assuming that the orbit does not change or precess. The
orbit is an ellipse with the point mass at a focus of the ellipse. The
angle f (see figure 2.10) measures the position of the rigid body
in its orbit relative to the point in the orbit at which the two
bodies are closest.18 We assume the orbit is a fixed ellipse, so the
angle f and the distance R are periodic functions of time, with
period equal to the orbit period. With the spin axis constrained
to be perpendicular to the orbit plane, the orientation of the rigid
body is specified by a single degree of freedom: the orientation of
the body about the spin axis. We specify this orientation by the
generalized coordinate θ that measures the angle to the â principal
axis from the same line as we measure f , the line through the point
of closest approach.
Having specified the coordinate system, we can work out the
details of the kinetic and potential energies, and thus find the
Lagrangian. The kinetic energy is
T = ½ C θ̇²,
where C is the moment of inertia about the spin axis, and the angular velocity of the body about the ĉ axis is θ̇. There is no component of angular velocity on the other principal axes.
To get an explicit expression for the potential energy we must
write the direction cosines in terms of θ and f : α = cos θa =
− cos(θ − f ), β = cos θb = sin(θ − f ), and γ = cos θc = 0 because
the ĉ axis is perpendicular to the orbit plane. The potential energy
is then
−GMM′/R − ½ (GM′/R³) [(1 − 3 cos²(θ − f)) A + (1 − 3 sin²(θ − f)) B + C].
Since we are assuming that the orbit is given, we only need to
keep terms that depend on θ. Expanding the squares of the cosine
and the sine in terms of the double angles, and dropping all the
18. Traditionally, the point in the orbit at which the two bodies are closest is called the pericenter, and the angle f is called the true anomaly.
2.11.2 Rotation of the Moon and Hyperion 159
L(t, θ, θ̇) = ½ C θ̇² + (n²ε²C/4) (a³/R³(t)) cos 2(θ − f(t)). (2.79)
This is a problem with one degree of freedom with terms that vary
periodically with time.
The Lagrange equations are derived in the usual manner. The
equations are
C D²θ(t) = −(n²ε²C/2) (a³/R³(t)) sin 2(θ(t) − f(t)). (2.80)
The equation of motion is very similar to that of the periodically
driven pendulum. The main difference here is that not only is the
strength of the acceleration changing periodically, but in the spin-
orbit problem the center of attraction is also varying periodically.
We can give a physical interpretation of this equation of motion.
It states that the rate of change of the angular momentum is equal
to the applied torque. The torque on the body arises because the
19. The given potential energy differs from the actual potential energy in that non-constant terms that do not depend on θ, and consequently do not affect the evolution of θ, have been dropped.
Figure 2.11 The angle θ − f versus time for 50 orbit periods. The
ordinate scale is ±1 radian. The Moon has been kicked so that the initial
rotational angular velocity is 1.01 times the orbital frequency. The trace
with fewer wiggles was computed with zero lunar orbital eccentricity;
the other trace was computed with lunar orbital eccentricity of 0.05.
The rapid oscillations have a period equal to the lunar orbit period and are due mostly to the nonuniform motion of f.
D²ϕ = −(n²ε²/2) sin 2ϕ. (2.82)
For small deviations from synchronous rotation (small ϕ) this is
D²ϕ = −n²ε² ϕ, (2.83)
Figure 2.12 The angle θ − f versus time for 50 orbit periods. The ordinate scale is ±π radians. The out-of-roundness parameter is large, ε = 0.89, with an orbital eccentricity of e = 0.1. The system is strongly driven. The rotation is apparently chaotic.
For a free rigid body we have seen that the components of the
angular momentum on the principal axes comprise a self-contained
dynamical system: the variation of the principal axis components
depends only on the principal axis components. Here we derive
equations that govern the evolution of these components.
The starting point for the derivation is the conservation of the
vector angular momentum. The components of the angular mo-
mentum on the principal axes are
L′ = I′ ω′ (2.84)
L = M L′, (2.86)
0 = DL = DM L′ + M DL′. (2.87)
Solving, we find
DL′ = −Mᵀ DM L′. (2.88)
In terms of ω′ this is
I′ Dω′ = −Mᵀ DM I′ ω′
= −Mᵀ A(Mω′) M I′ ω′, (2.89)
20. Rotating the cross product of two vectors gives the same vector as is obtained by taking the cross product of the two rotated vectors: R(u⃗ × v⃗) = (Ru⃗) × (Rv⃗).
2.12 Euler’s Equations 165
for any vector with components v and any rotation with matrix
representation R. Using this property of A we find Euler’s equa-
tions:
I′ Dω′ = −A(ω′) I′ ω′. (2.91)
A Dωᵃ = (B − C) ωᵇ ωᶜ
B Dωᵇ = (C − A) ωᶜ ωᵃ
C Dωᶜ = (A − B) ωᵃ ωᵇ. (2.92)
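Euler's equations (2.92) are easy to integrate numerically. The sketch below uses plain Python with a hand-rolled RK4 step rather than the book's scmutils tools, and made-up moments of inertia; it starts the body spinning nearly about the intermediate axis and checks that the kinetic energy and the squared angular momentum, which Euler's equations must conserve, are indeed conserved by the numerical flow.

```python
# Free rigid body: integrate A Dw_a = (B-C) w_b w_c (and cyclic) with RK4,
# then check conservation of kinetic energy and |L|^2. Sample moments.

A, B, C = 1.0, 2.0, 3.0

def deriv(w):
    wa, wb, wc = w
    return ((B - C) * wb * wc / A,
            (C - A) * wc * wa / B,
            (A - B) * wa * wb / C)

def rk4(w, dt):
    def add(u, v, s): return tuple(ui + s * vi for ui, vi in zip(u, v))
    k1 = deriv(w)
    k2 = deriv(add(w, k1, dt / 2))
    k3 = deriv(add(w, k2, dt / 2))
    k4 = deriv(add(w, k3, dt))
    return tuple(w[i] + dt / 6 * (k1[i] + 2*k2[i] + 2*k3[i] + k4[i])
                 for i in range(3))

def energy(w): return 0.5 * (A*w[0]**2 + B*w[1]**2 + C*w[2]**2)
def L2(w):     return (A*w[0])**2 + (B*w[1])**2 + (C*w[2])**2

w = (0.1, 1.0, 0.1)          # mostly about the intermediate axis: tumbles
E0, L0 = energy(w), L2(w)
for _ in range(10000):       # integrate to t = 10
    w = rk4(w, 0.001)
assert abs(energy(w) - E0) < 1e-9
assert abs(L2(w) - L0) < 1e-9
```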
DM = M A(ω′). (2.94)
Exercise 2.15:
Fill in the details of the derivation of equation (2.96). You may want to
use the computer to help with the algebra.
21. In this equation we have a partial derivative with respect to a component of the coordinate argument of the potential energy function. The first subscript on the ∂ symbol indicates the coordinate argument. The second one selects the ϕ component.
DL = T = DM L′ + M DL′. (2.102)
T′ = M⁻¹ T. (2.104)
In terms of ω′ this is
I′ Dω′ + A(ω′) I′ ω′ = T′. (2.105)
In components,
A Dωᵃ − (B − C) ωᵇ ωᶜ = Tᵃ (2.106)
B Dωᵇ − (C − A) ωᶜ ωᵃ = Tᵇ (2.107)
C Dωᶜ − (A − B) ωᵃ ωᵇ = Tᶜ. (2.108)
Note that the torque entered only the equations for the body
angular momentum or alternately for the body angular velocity
vector. The equations that relate the derivative of the orientation
to the angular velocity vector are not modified by the torque. In a
sense, Euler’s equations contain the dynamics, and the equations
governing the orientation are kinematic. Of course, Lagrange’s
equations must be modified by the potential that gives rise to the
torques; in this sense Lagrange’s equations contain both dynamics
and kinematics.
u⃗ = M u⃗′. (2.109)
AA = S − I (2.114)
SS = S (2.115)
SA = 0 (2.116)
AS = 0. (2.117)
ω′ = W Do. (2.123)
Solving, we find
Do = W⁻¹ ω′. (2.124)
N = aI + bA + cS (2.125)
that we wish to invert. Let's guess that the inverse matrix has a similar form:
N⁻¹ = a′ I + b′ A + c′ S. (2.126)
with solution
a′ = a / (a² + b²) (2.130)
b′ = −b / (a² + b²) (2.131)
c′ = (b² − ac) / (a³ + a²c + ab² + b²c). (2.132)
We can now invert the matrix W using its representation in terms of primitive matrices to find
W⁻¹ = ½ (o sin o / (1 − cos o)) I + (o/2) A + ½ (2 − o sin o / (1 − cos o)) S. (2.133)
Note that all terms have finite limits as o → 0. There is however
a new singularity. As o → 2π two of the denominators become
singular, but there the zeros in the numerators are not strong
enough to kill the singularity. This is the expected singularity that
corresponds to the fact that at radius 2π the orientation vector
corresponds to no rotation, but nevertheless specifies a rotation
axis. This singularity is easy to avoid. Whenever the orientation
vector develops a magnitude larger than π, simply replace it by the equivalent orientation vector o⃗ − 2πô.
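The suggested fix can be sketched directly. A plain Python illustration (the threshold π and the replacement o⃗ − 2πô come from the text; the sample vector is arbitrary):

```python
from math import sqrt, pi

# When the orientation vector's magnitude exceeds pi, replace it by the
# equivalent rotation o - 2*pi*o_hat: same axis, angle reduced by 2*pi.

def wrap_orientation(o):
    mag = sqrt(sum(c * c for c in o))
    if mag > pi:
        scale = (mag - 2 * pi) / mag
        return tuple(scale * c for c in o)
    return o

o = (0.0, 0.0, 3.5)              # rotation by 3.5 > pi about the z axis
w = wrap_orientation(o)
# the wrapped angle is 3.5 - 2*pi (negative: same axis, opposite sense)
assert abs(w[2] - (3.5 - 2 * pi)) < 1e-12
```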
We can write the equations governing the evolution of the orientation as a vector equation in terms of ω⃗′ = M⁻¹ ω⃗:
Do⃗ = f(o) ω⃗′ + ½ o⃗ × ω⃗′ + g(o) o⃗ (o⃗ · ω⃗′) (2.134)
with two auxiliary functions
f(x) = ½ x sin x / (1 − cos x) (2.135)
g(x) = (1 − f(x)) / x². (2.136)
2.13 Nonsingular Generalized Coordinates 173
lim_{x→0} f(x) = 1
lim_{x→0} g(x) = 1/12. (2.137)
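These limits are easy to confirm numerically. With the definitions (2.135)-(2.136), the small-angle expansion gives f(x) = 1 − x²/12 + ..., so g(x) tends to 1/12. A plain Python check at a small sample point:

```python
from math import sin, cos

# f(x) = (1/2) x sin x / (1 - cos x) and g(x) = (1 - f(x))/x^2;
# near zero, f -> 1 and g -> 1/12.

def f(x): return 0.5 * x * sin(x) / (1 - cos(x))
def g(x): return (1 - f(x)) / (x * x)

x = 0.01
assert abs(f(x) - 1.0) < 1e-4
assert abs(g(x) - 1/12) < 1e-5
```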
22. This notation has the potential for great confusion: q is not the magnitude of the vector q⃗. Watch out!
2.14 Summary
2.15 Projects
Exercise 2.18: Free rigid body
Write and demonstrate a program that reproduces diagrams like fig-
ure 2.3. Can you find trajectories that are asymptotic to the unstable
relative equilibrium on the intermediate principal axis?
Mercury a bit to see how far off the rotation rate can be and still be
trapped in this spin-orbit resonance. If the mismatch in angular velocity
is too great, Mercury’s rotation is no longer resonantly locked to its orbit.
Set ² = 0.026 and e = 0.2.
a. Write a program for the spin-orbit problem so that this resonance dynamics can be investigated numerically. You will need to know (or, better,
show!) that f satisfies the equation
Df = n (1 − e²)^(1/2) (a/r)², (2.152)
with
a/r = (1 + e cos f) / (1 − e²). (2.153)
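Equations (2.152)-(2.153) can be checked by integrating Df over one orbit period T = 2π/n: the true anomaly should advance by exactly 2π. A plain Python sketch (sample values for n and e; the book's program would use scmutils instead):

```python
from math import cos, pi

# Integrate Df = n (1-e^2)^(1/2) (a/r)^2 with a/r = (1 + e cos f)/(1 - e^2)
# over one period T = 2*pi/n using RK4; f should advance by 2*pi.
n, e = 1.0, 0.2

def Df(f):
    return n * (1 + e * cos(f)) ** 2 / (1 - e * e) ** 1.5

f, steps = 0.0, 20000
dt = (2 * pi / n) / steps
for _ in range(steps):
    k1 = Df(f); k2 = Df(f + dt/2 * k1)
    k3 = Df(f + dt/2 * k2); k4 = Df(f + dt * k3)
    f += dt / 6 * (k1 + 2*k2 + 2*k3 + k4)
assert abs(f - 2 * pi) < 1e-8
```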
where
1. Here we restrict our attention to Lagrangians that depend only on the time, the coordinates, and the velocities.
182 Chapter 3 Hamiltonian Mechanics
then V satisfies2
Equations (3.5) and (3.6) give the rate of change of q and p along
realizable paths as functions of t, q, and p along the paths.
Though fulfilling our goal of expressing the equations of motion
entirely in terms of coordinates and momenta, we can find a more
convenient representation. Define the function
L̃(t, q, p) = L(t, q, V(t, q, p)), (3.7)
where we used the chain rule in the first step and the inverse
property of V in the second step. Introducing the momentum
selector3 P (t, q, p) = p, and using the property ∂1 P = 0, we have
∂₁L(t, q, V(t, q, p)) = ∂₁L̃(t, q, p) − P(t, q, p) ∂₁V(t, q, p)
2. The following properties hold: d = V(b, c, ∂₂L(b, c, d)) and a = ∂₂L(b, c, V(b, c, a)).
3. P = I₂
3.1 Hamilton’s Equations 183
= ∂₁(L̃ − P V)(t, q, p)
= −∂₁H(t, q, p), (3.9)
4. The overall minus sign in the definition of the Hamiltonian is traditional.
The Hamiltonian has the same value as the energy function E (see
equation 1.140), except that the velocities are expressed in terms
of time, coordinates, and momenta by V:
Illustration
Let’s try something simple: the motion of a particle of mass m
with potential energy V (x, y). A Lagrangian is
H(t; x, y; px, py) = (px² + py²) / (2m) + V(x, y). (3.20)
5. In traditional notation Hamilton's equations are written
dq/dt = ∂H/∂p and dp/dt = −∂H/∂q,
or as separate equations for each component:
dqⁱ/dt = ∂H/∂pᵢ and dpᵢ/dt = −∂H/∂qⁱ.
6. Traditionally, the Hamiltonian is written
H = pq̇ − L.
This way of writing the Hamiltonian confuses the values of functions with the functions that generate them: both q̇ and L have to be reexpressed as functions of the time, coordinates, and momenta.
Dx = px /m
Dy = py /m. (3.21)
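For a concrete check of Hamilton's equations, the sketch below picks a specific potential, V(x, y) = ½k(x² + y²) (chosen for illustration because the harmonic case has a closed-form solution, not because the text requires it), and integrates Dx = px/m, Dy = py/m, Dpx = −kx, Dpy = −ky in plain Python:

```python
from math import cos, sqrt

# Hamilton's equations for a mass m in V(x,y) = (1/2) k (x^2 + y^2):
# the motion is simple harmonic, so the integrated x can be compared
# with the closed form x(t) = x0 cos(w t) for px(0) = 0.
m, k = 2.0, 3.0
w = sqrt(k / m)

def deriv(s):
    x, y, px, py = s
    return (px / m, py / m, -k * x, -k * y)

s = (1.0, 0.0, 0.0, 0.5)         # initial (x, y, px, py)
t, dt = 0.0, 1e-4
for _ in range(10000):           # RK4 to t = 1
    k1 = deriv(s)
    k2 = deriv(tuple(si + dt/2 * ki for si, ki in zip(s, k1)))
    k3 = deriv(tuple(si + dt/2 * ki for si, ki in zip(s, k2)))
    k4 = deriv(tuple(si + dt * ki for si, ki in zip(s, k3)))
    s = tuple(si + dt/6 * (a + 2*b + 2*c + d)
              for si, a, b, c, d in zip(s, k1, k2, k3, k4))
    t += dt
x_exact = 1.0 * cos(w * t)
assert abs(s[0] - x_exact) < 1e-8
```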
Hamiltonian state
Given a coordinate path q, and a Lagrangian L, the corresponding
momentum path p is given by equation (3.2). Equation (3.15) ex-
presses the same relationship in terms of the corresponding Hamil-
tonian H. That these relations are valid for any path, whether
or not it is a realizable path, allows us to abstract to arbitrary
velocity and momentum at a moment. At a moment, the mo-
mentum p for the state tuple (t, q, v) is p = ∂2 L(t, q, v). We also
have v = ∂2 H(t, q, p). In the Lagrangian formulation the state
7. In the construction of the Lagrangian state derivative from the Lagrange equations we must solve for the highest-order derivative. The solution process requires the inversion of the matrix ∂₂∂₂L. In the construction of Hamilton's equations, the construction of V from the momentum state function ∂₂L requires the inversion of the same matrix. If the Lagrangian formulation has singularities, they cannot be avoided by going to the Hamiltonian formulation.
ΠL [q](t) = (t, q(t), ∂2 L(t, q(t), Dq(t))) = (t, q(t), p(t)) . (3.24)
8. The term phase space was introduced by Josiah Willard Gibbs in his formulation of statistical mechanics. The Hamiltonian plays a fundamental role in the Boltzmann-Gibbs formulation of statistical mechanics and in both the Heisenberg and Schrödinger approaches to quantum mechanics.
The momentum p can be viewed as the coordinate representation of a linear form on the tangent space. Thus pq̇ is a scalar quantity, which is invariant under time-independent coordinate transformations of the configuration space. The momentum forms comprise an n-dimensional vector space at each point of configuration space, called the cotangent space. The collection of all cotangent spaces of a configuration space forms a space called the cotangent bundle of the configuration manifold.
9. By default, literal functions map reals to reals; the default type for a literal function is (-> Real Real). Here the potential energy V takes two real arguments and returns a real.
3.1.1 The Legendre Transformation 189
(show-expression
  (((Hamilton-equations
      (H-rectangular
        'm
        (literal-function 'V (-> (X Real Real) Real))))
    (up (literal-function 'x) (literal-function 'y))
    (down (literal-function 'p_x) (literal-function 'p_y)))
   't))
0
Dx(t) − px(t)/m
Dy(t) − py(t)/m
Dpx(t) + ∂₀V(x(t), y(t))
Dpy(t) + ∂₁V(x(t), y(t))
or
G = IV − F̃, (3.28)
then we have
V = DG. (3.29)
10. The Legendre transformation is more general than its use in mechanics in that it captures the relationship between conjugate variables in systems as diverse as thermodynamics, circuits, and field theory.
11. This can be done so long as the derivative is not zero.
v = ∂1 G(x, w) (3.34)
giving
then define
G = W V − F̃. (3.38)
∂₁G = ∂₁(W V − F̃)
= V + W ∂₁V − ∂₁F̃, (3.39)
but, since ∂₁F(x, V(x, w)) = W(x, w), the chain rule gives
∂₁F̃ = W ∂₁V. (3.41)
So
∂₁G = V, (3.42)
∂0 (W V) = W ∂0 V (3.43)
w = ∂1 F (x, v)
wv = F (x, v) + G(x, w)
v = ∂1 G(x, w)
0 = ∂0 F (x, v) + ∂0 G(x, w). (3.46)
argument. Show that the Legendre transform relations hold for your
solution, including the relations among passive arguments, if any.
a. F(x) = a sin x + b cos x; there are no passive arguments.
b. F(x, y) = a sin x cos y, with x active.
c. F(x, y, ẋ, ẏ) = x ẋ² + 3 ẋ ẏ + y ẏ², with ẋ and ẏ active.
and
This relation is purely algebraic and is valid for any path. The
passive equation (3.51) gives
but the left-hand side can be rewritten using the Lagrange equa-
tions, so
Exercise 3.5:
Using Hamilton’s equations, show directly that the Hamiltonian is a
conserved quantity if the Hamiltonian has no explicit time dependence.
w = DF(v) = vM + b. (3.61)
v = V(w) = M⁻¹ (w − b) (3.62)
Computing Hamiltonians
We implement the Legendre transform for quadratic functions by
the procedure:13
(define (Legendre-transform F)
(let ((w-of-v (D F)))
(define (G w)
(let ((z (dual-zero w)))
(let ((M ((D w-of-v) z))
(b (w-of-v z)))
(let ((v (/ (- w b) M)))
(- (* w v) (F v))))))
G))
12. Let M be the matrix representation of M; then M = Mᵀ.
13. The division operation, denoted by / in the Legendre-transform procedure, is generic over mathematical objects. We interpret the division in the matrix representation: if a vector y is divided by a matrix M, this is interpreted as a request to solve the linear system Mx = y, where x is the unknown vector.
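The Legendre-transform procedure relies on generic arithmetic; a scalar-only Python sketch of the same idea (illustrative, with the derivative supplied by hand rather than by automatic differentiation) makes the affine structure w = Mv + b explicit:

```python
# Scalar Legendre transform for a quadratic F: w-of-v is affine, so we
# recover M and b from two evaluations, invert, and form G(w) = w v - F(v).
# (The Scheme version handles tuples via generic division.)

def legendre_transform(F, dF):
    M = dF(1.0) - dF(0.0)       # slope of the affine map w-of-v
    b = dF(0.0)                 # its value at v = 0
    def G(w):
        v = (w - b) / M
        return w * v - F(v)
    return G

m = 3.0
F = lambda v: 0.5 * m * v * v   # kinetic energy of a free particle
dF = lambda v: m * v            # its derivative: the momentum
G = legendre_transform(F, dF)
assert abs(G(6.0) - 6.0**2 / (2 * m)) < 1e-12   # G(p) = p^2/(2m)
```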
(show-expression
  ((Lagrangian->Hamiltonian
     (L-rectangular
       'm
       (literal-function 'V (-> (X Real Real) Real))))
   (up 't (up 'x 'y) (down 'p_x 'p_y))))
V(x, y) + ½ px²/m + ½ py²/m
Figure 3.2
a. What are the degrees of freedom of this system? Pick and describe
a convenient set of generalized coordinates for this problem. Write a
Lagrangian to describe the dynamical behavior. It may help to know
that the moment of inertia of the cylinder around its axis is ½ M R². You
3.1.2 Hamiltonian Action Principle 199
which is one of Hamilton’s equations, the one that does not depend
on the path being a realizable path. Using
ΠL [q](t) = (t, q(t), ∂2 L(t, q(t), Dq(t))) = (t, q(t), p(t)) , (3.67)
the integrand is
δS[q](t1 , t2 )
Z t2
= δ(pDq − H ◦ ΠL [q])
t1
Z t2
= (δp Dq + p δDq − (DH ◦ ΠL [q])δΠL [q])
t1
Z t2
= {δp Dq + p Dδq
t1
−(∂1 H ◦ ΠL [q])δq − (∂2 H ◦ ΠL [q])δp} , (3.69)
δS[q](t1, t2) = p δq |_{t1}^{t2}
Z t2
+ {δp Dq − Dp δq
t1
−(∂1 H ◦ ΠL [q])δq − (∂2 H ◦ ΠL [q])δp} . (3.70)
δS[q](t1 , t2 ) (3.71)
Z t2
= ((Dq − ∂2 H ◦ ΠL [q]) δp − (Dp + ∂1 H ◦ ΠL [q]) δq) .
t1
14. The variation of the momentum δp does not need to be further expanded in this argument because it turns out that the factor multiplying it is zero. However, it is handy to see how it is related to the variations in the coordinate path δq:
δp(t) = ∂₁∂₂L(t, q(t), Dq(t)) δq(t) + ∂₂∂₂L(t, q(t), Dq(t)) Dδq(t).
3.1.3 A Wiring Diagram 201
15. It is sometimes asserted that the momenta have a different status in the Lagrangian and Hamiltonian formulations; that in the Hamiltonian framework the momenta are "independent" of the coordinates. From this it is argued that the variations δq and δp are arbitrary and independent, therefore implying that the factor multiplying each of them in the action integral (3.72) must independently be zero, apparently deriving both of Hamilton's equations. The argument is fallacious: we can write δp in terms of δq (see footnote 14).
Figure: A wiring diagram connecting the Lagrangian (inputs t, q, q̇; outputs L and its partials ∂₀L, ∂₁L, ∂₂L) and the Hamiltonian (inputs t, q, p; outputs H and its partials ∂₀H, ∂₁H, ∂₂H), with integrators producing q and p from the initial values t₀, q₀, p₀.
{F, H} = ∂1 F ∂2 H − ∂2 F ∂1 H. (3.76)
Note that the Poisson bracket of two functions on the phase state
space is also a function on the phase state space.
The coordinate selector Q = I1 is an example of a function on
phase state space: Q(t, q, p) = q. According to equation (3.75)
D(Q ◦ σ) = {Q, H} ◦ σ
D(P ◦ σ) = {P, H} ◦ σ. (3.81)
16. In traditional notation the Poisson bracket is written
{F, H} = Σᵢ (∂F/∂qⁱ ∂H/∂pᵢ − ∂F/∂pᵢ ∂H/∂qⁱ).
3.2 Poisson Brackets 205
where all but the last can be immediately verified from the def-
inition. Jacobi’s identity requires a little more effort to verify.
We can use the computer to avoid this work. Define some literal
phase-space functions of Hamiltonian type:
(define F
(literal-function ’F
(-> (UP Real (UP Real Real) (DOWN Real Real)) Real)))
(define G
(literal-function ’G
(-> (UP Real (UP Real Real) (DOWN Real Real)) Real)))
(define H
(literal-function ’H
(-> (UP Real (UP Real Real) (DOWN Real Real)) Real)))
The residual is zero, so the Jacobi identity is satisfied for any three
phase space functions for two degrees of freedom.
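For an independent check in Python (the book performs this computation symbolically in Scheme; here we spot-check numerically at a point, for one degree of freedom and sample polynomial phase-space functions):

```python
# Numerical spot check of the Jacobi identity
#   {F,{G,H}} + {G,{H,F}} + {H,{F,G}} = 0
# using a finite-difference Poisson bracket. For the polynomial functions
# below, central differences are exact up to roundoff.

def poisson(F, G, h=1e-3):
    def bracket(t, q, p):
        dFdq = (F(t, q + h, p) - F(t, q - h, p)) / (2 * h)
        dFdp = (F(t, q, p + h) - F(t, q, p - h)) / (2 * h)
        dGdq = (G(t, q + h, p) - G(t, q - h, p)) / (2 * h)
        dGdp = (G(t, q, p + h) - G(t, q, p - h)) / (2 * h)
        return dFdq * dGdp - dFdp * dGdq
    return bracket

F = lambda t, q, p: q * q * p
G = lambda t, q, p: p * p + q
H = lambda t, q, p: q * p

point = (0.0, 0.7, -0.4)        # an arbitrary phase-space point
residual = (poisson(F, poisson(G, H))(*point)
            + poisson(G, poisson(H, F))(*point)
            + poisson(H, poisson(F, G))(*point))
assert abs(residual) < 1e-3
```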
0 = D(F ◦ σ) = {F, H} ◦ σ
0 = D(G ◦ σ) = {G, H} ◦ σ. (3.89)
and thus
D({F, G} ◦ σ) = 0, (3.91)
17
Separatrices is the plural of separatrix.
Figure 3.4 The phase plane of the pendulum has three regions dis-
playing two distinct kinds of behavior. In this figure there are a number
of different trajectories. Trajectories lie on the contours of the Hamilto-
nian. Trajectories may oscillate, making ovoid curves around the equi-
librium point, or they may circulate, producing wavy tracks outside the
eye-shaped region. The eye-shaped region is delimited by the separatrix.
This pendulum has length 1 m, a bob of mass 1 kg, and the acceleration
of gravity is 9.8 m s⁻².
18
The pendulum has only one unstable equilibrium. Remember that the co-
ordinate is an angle.
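Whether a state oscillates or circulates is determined by its energy relative to that of the separatrix, which is the energy of the unstable equilibrium. A Python sketch (an independent check, assuming the pendulum Hamiltonian H = p²/(2ml²) − mgl cos θ and the figure's parameters):

```python
from math import cos, pi

# Book's pendulum parameters: m = 1 kg, l = 1 m, g = 9.8 m/s**2.
m, l, g = 1.0, 1.0, 9.8

def energy(theta, p):
    # H = p**2/(2*m*l**2) - m*g*l*cos(theta)
    return p * p / (2 * m * l * l) - m * g * l * cos(theta)

# Separatrix energy = energy of the unstable equilibrium (theta = pi, p = 0).
E_sep = energy(pi, 0.0)          # = m*g*l = 9.8 J

def kind(theta, p):
    """States inside the eye-shaped region oscillate; outside, they circulate."""
    return "oscillate" if energy(theta, p) < E_sep else "circulate"

assert kind(0.1, 0.0) == "oscillate"
assert kind(0.0, 20.0) == "circulate"   # cf. the ±20 momentum scale of fig. 3.4
```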
3.4 Phase Space Reduction 209
19
If a Lagrangian does not depend on a particular coordinate then neither does
the corresponding Hamiltonian, because the coordinate is a passive variable
in the Legendre transform. Such a Hamiltonian is said to be cyclic in that
coordinate.
20
Traditionally, when a problem has been reduced to the evaluation of a def-
inite integral it is said to be reduced to a “quadrature.” Thus, the determi-
nation of the evolution of a cyclic coordinate q i is reduced to a problem of
quadrature.
Dzⁱ = Fⁱ(z¹, z², . . . , zᵐ)   (3.92)
C(z¹, z², . . . , zᵐ) = 0.   (3.93)
The momenta are pr = mṙ and pϕ = mr²ϕ̇. The kinetic energy is
a homogeneous quadratic form in the velocities, so the Hamiltonian
is T + V with the velocities rewritten in terms of the momenta:

H(t; r, ϕ; pr, pϕ) = pr²/(2m) + pϕ²/(2mr²) + V(r).   (3.95)
Hamilton’s equations are:
Dr = pr/m
Dϕ = pϕ/(mr²)
Dpr = pϕ²/(mr³) − DV(r)
Dpϕ = 0.   (3.96)
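These equations are ready to integrate. A Python sketch (not from the book) with an RK4 step, using the sample potential V(r) = −1/r (an assumption; the text leaves V general), confirms that pϕ is conserved exactly and the Hamiltonian to integration accuracy:

```python
# Integrate Hamilton's equations (3.96) for a central force with RK4.
m = 1.0
V  = lambda r: -1.0 / r          # hypothetical sample potential
DV = lambda r: 1.0 / r**2

def field(s):
    r, phi, pr, pphi = s
    return (pr / m,                         # Dr
            pphi / (m * r**2),              # Dphi
            pphi**2 / (m * r**3) - DV(r),   # Dpr
            0.0)                            # Dpphi = 0

def rk4(s, dt):
    add = lambda a, b, c: tuple(x + c * y for x, y in zip(a, b))
    k1 = field(s)
    k2 = field(add(s, k1, dt / 2))
    k3 = field(add(s, k2, dt / 2))
    k4 = field(add(s, k3, dt))
    return tuple(x + dt / 6 * (a + 2*b + 2*c + d)
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))

def H(s):
    r, phi, pr, pphi = s
    return pr**2 / (2*m) + pphi**2 / (2*m*r**2) + V(r)

s = (1.0, 0.0, 0.0, 1.2)   # r, phi, p_r, p_phi (a bound orbit, E < 0)
E0 = H(s)
for _ in range(2000):
    s = rk4(s, 0.001)
assert s[3] == 1.2              # p_phi exactly constant: Dp_phi = 0
assert abs(H(s) - E0) < 1e-8    # energy conserved to integration accuracy
```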
21
It is not always possible to choose a set of generalized coordinates in which
all symmetries are simultaneously manifest. For these systems, the reduction
of the phase space is more complicated. We have already encountered such
a problem: the motion of a free rigid body. The system is invariant under
a rotation about any axis, yet no single coordinate system can reflect this
symmetry. Nevertheless, we have already found that the dynamics is described
by a system of lower dimension than the full phase space: the Euler equations.
V(r) + pϕ²/(2mr²) + pr²/(2m)
Axisymmetric top
We reconsider the axisymmetric top (see section 2.10) from the
Hamiltonian point of view. Recall that a top is a rotating rigid
body, one point of which is fixed in space. The center of mass is not
at the fixed point, and there is a uniform gravitational field. An
axisymmetric top is a top with an axis of symmetry. We consider
here an axisymmetric top with the fixed point on the symmetry
axis.
The axisymmetric top has two continuous symmetries that we
would like to exploit. It has the symmetry that neither the ki-
netic nor potential energy are sensitive to the orientation of the
top about the symmetry axis. The kinetic and potential energy
are also insensitive to a rotation of the physical system about the
vertical axis, because the gravitational field is uniform. We take
advantage of these symmetries by choosing coordinates that nat-
urally express them. We already have an appropriate coordinate
system that does the job—the Euler angles. We choose the refer-
ence orientation of the top so that the symmetry axis is vertical.
The first Euler angle ψ expresses a rotation about the symmetry
axis. The next Euler angle θ is the tilt of the symmetry axis of
the top from the vertical. The third Euler angle ϕ expresses a
rotation of the top about the fixed z axis. The symmetries of the
problem imply that the first and third Euler angles do not appear
in the Hamiltonian. As a consequence the momenta conjugate to
these angles are conserved quantities. The problem of determining
the motion of the axisymmetric top is reduced to the problem of
determining the evolution of θ and pθ . Let’s work out the details.
In terms of Euler angles a Lagrangian for the axisymmetric top
is (see section 2.10):
(define ((L-axisymmetric-top A C gMR) local)
(let ((q (coordinate local))
(qdot (velocity local)))
(let ((theta (ref q 0))
(thetadot (ref qdot 0))
(phidot (ref qdot 1))
(psidot (ref qdot 2)))
(+ (* 1/2 A
(+ (square thetadot)
(square (* phidot (sin theta)))))
(* 1/2 C
(square (+ psidot (* phidot (cos theta)))))
(* -1 gMR (cos theta))))))
A Legendre transformation of this Lagrangian gives the Hamiltonian

H(t; θ, ϕ, ψ; pθ, pϕ, pψ) = pθ²/(2A) + pψ²/(2C) + (pϕ − pψ cos θ)²/(2A sin²θ) + gMR cos θ.

For the case pϕ = pψ = p this simplifies, using the identity
(1 − cos θ)/sin θ = tan(θ/2), to

H = pθ²/(2A) + p²/(2C) + (p²/(2A)) tan²(θ/2) + gMR cos θ.   (3.97)
Defining the effective potential energy

Ueff(θ) = p²/(2C) + (p²/(2A)) tan²(θ/2) + gMR cos θ,   (3.98)

which depends parametrically on p, A, C, and gMR, the Hamiltonian is

H = pθ²/(2A) + Ueff(θ).   (3.99)
For ω > ωc the top can stand vertically; for ω < ωc the top
falls if slightly displaced from the vertical. The top which stands
vertically is called the “sleeping” top. For a more realistic top
friction gradually slows the rotation, and the rotation rate of the
top eventually falls below the critical rotation rate and the top
“wakes up.”
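The critical rotation rate can be obtained by expanding Ueff about the vertical. A sketch of the calculation (assuming the identification p = Cω for the spin of the sleeping top, consistent with the figure captions):

```latex
% Near \theta = 0: \tan^2(\theta/2) \approx \theta^2/4 and
% \cos\theta \approx 1 - \theta^2/2, so
U_{\mathrm{eff}}(\theta) \approx \frac{p^2}{2C} + gMR
    + \left( \frac{p^2}{8A} - \frac{gMR}{2} \right) \theta^2 .
% The vertical is a minimum of U_{\mathrm{eff}} (the top sleeps) when the
% coefficient of \theta^2 is positive:
p^2 > 4A\,gMR , \qquad p = C\omega
    \quad\Longrightarrow\quad
\omega_c = \frac{2\sqrt{A\,gMR}}{C} .
```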
Figure 3.6 The θ, pθ phase plane for the axisymmetric top with
pϕ = pψ and ω = 130 rad/s. The parameters are A = 0.0000328 kg m²,
C = 0.000066 kg m², gMR = 0.0456 kg m² s⁻². For these parameters the
critical frequency ωc is about 117.2 rad/s.
Figure 3.7 The θ, pθ phase plane for the axisymmetric top with
pϕ = pψ and ω = 90 rad/sec. The other parameters are as before.
Figure 3.8 The θ, pθ phase plane for the axisymmetric top with pϕ >
pψ. Most of the parameters are the same as before, but here pϕ =
0.00726 kg m² s⁻¹ and pψ = 0.00594 kg m² s⁻¹.
We get additional insight into the sleeping top and the awake
top by looking at the trajectories in the θ, pθ phase plane. The
trajectories in this plane are simply contours of the Hamiltonian,
because the Hamiltonian is conserved. Figure 3.6 shows a phase
portrait for ω > ωc . All of the trajectories are loops around the
vertical (θ = 0). Displacing the top slightly from the vertical
simply places the top on a nearby loop, so the top stays nearly
vertical. Figure 3.7 shows the phase portrait for ω < ωc . Here
the vertical position is an unstable equilibrium. The trajectories
that approach the vertical are asymptotic—they take an infinite
amount of time to reach it, just as a pendulum with just the right
initial conditions can approach the vertical but never reach it. If
the top is displaced slightly from the vertical then the trajectories
loop around another center with nonzero θ. A top started at the
center point of the loop stays there, and one started near this
equilibrium point loops stably around it. Thus we see that when
the top “wakes up” the vertical is unstable, but the top does not
fall to the ground. Rather, it oscillates around a new equilibrium.
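The new equilibrium is the minimum of Ueff away from θ = 0. A Python sketch with made-up parameters (A = C = gMR = 1, chosen for convenience rather than taken from the figures, giving ωc = 2) locates it:

```python
from math import cos, tan, pi

# Hypothetical parameters, not the figures' values: A = C = gMR = 1,
# so the critical frequency is omega_c = 2*sqrt(A*gMR)/C = 2.
A = C = gMR = 1.0

def U_eff(theta, p):
    # Equation (3.98) with p = C*omega for the spinning top.
    return p*p/(2*C) + p*p/(2*A) * tan(theta/2)**2 + gMR * cos(theta)

def argmin_theta(p):
    thetas = [i * 1e-3 for i in range(1, 3000)]   # grid search on (0, 3) rad
    return min(thetas, key=lambda th: U_eff(th, p))

# omega = 1 < omega_c: the minimum sits at a tilted angle (here exactly pi/2).
assert abs(argmin_theta(1.0) - pi/2) < 1e-2
# omega = 3 > omega_c: U_eff is increasing, the minimum is pushed to theta -> 0.
assert argmin_theta(3.0) == 1e-3
```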
It is also interesting to consider the axisymmetric top when
pϕ ≠ pψ. Consider the case pϕ > pψ. Some trajectories in the θ,
pθ plane are shown in figure 3.8. Note that in this case trajectories
do not go through θ = 0. The phase portrait for pϕ < pψ is similar
and will not be shown.
We have reduced the motion of the axisymmetric top to quadra-
tures by choosing coordinates that express the symmetries. It
turns out that the resulting integrals can be expressed in terms of
elliptic functions. Thus, the axisymmetric top can be analytically
solved. We do not dwell on this solution because it is not very il-
luminating. In fact, most problems cannot be solved analytically,
so there is not much profit in dwelling on the analytic solution of
one of the rare problems which is analytically solvable. Rather,
our discussion has focused on the geometry of the solutions in the
phase space, and the use of integrals to reduce the dimension of
the problem. With the phase space portrait we have found some
interesting qualitative features of the motion of the top.
where

H(t, θ, pθ) = pθ²/(2l²m) + (aω pθ sin θ sin ωt)/l
    − (1/2) a²mω² cos²θ sin²ωt + agm cos ωt − glm cos θ.
22
The surface of section technique was introduced by Poincaré in his Méthodes
Nouvelles de la Mécanique Céleste. Poincaré proved remarkable results about
dynamical systems using the surface of section technique, and we shall return
to those later. The surface of section technique is a key tool in the modern
study of dynamical systems, for both analytical and numerical investigations.
3.6.1 Poincaré Sections for Periodically-Driven Systems 225
23
The surface of section technique was put to spectacular use in the 1964
landmark paper [19] by astronomers Michel Hénon and Carl Heiles. In their
numerical investigations they found that some trajectories are chaotic, and
show exponential divergence with time, while other trajectories are regular,
showing linear divergence with time. They found that these two types of
trajectories are typically clustered in the phase space into regions of chaotic
behavior and regions of regular behavior.
24
That solutions of ordinary differential equations can show exponential
sensitivity to initial conditions was independently discovered by Edward
Lorenz ([28]) in the context of a simplified model of convection in the Earth’s
atmosphere. Lorenz coined the picturesque term the “butterfly effect” to de-
scribe this sensitivity. The weather system model of Lorenz is so sensitive to
initial conditions that “the flapping of a butterfly’s wings in Brazil can change
the course of a typhoon in Japan.”
[Schematic: the phase-space trajectory (q, p) of a driven system is sampled stroboscopically, once per drive period; the state (q(t + T), p(t + T)) is recorded one drive period T after (q(t), p(t)).]
Figure 3.12 Surface of section for the driven pendulum. The angle is
plotted on the abscissa; the momentum conjugate to this angle is plotted
on the ordinate. For this section the parameters are: m = 1 kg, l = 1 m,
g = 9.8 m/s², A = 0.05 m, ω = 4.2ω0, with ω0 = √(g/l).
25
Regular trajectories are also called quasiperiodic trajectories.
The trajectories that appear to fill areas are called chaotic tra-
jectories. For these points the distance in phase space between ini-
tially nearby points grows, on average, exponentially with time.26
In contrast, for the regular trajectories, the distance in phase space
between initially nearby points grows, on average, linearly with
time.
The phase space seems to be grossly clumped into different re-
gions. Initial conditions in some regions seem to predominately
yield regular trajectories, and other regions seem to predominately
yield chaotic trajectories. This gross division of the phase space
into qualitatively different types of trajectories is called the di-
vided phase space. We will see later that there is much more
structure here than is apparent at this scale, and that upon mag-
nification there is a complicated interweaving of chaotic and reg-
ular regions on finer and finer scales. Indeed, we shall see that
many trajectories which appear to generate curves on the surface
of section are, upon magnification, actually chaotic and fill a tiny
area. We shall also find that there are trajectories which lie on
one-dimensional curves on the surface of section, but which only
explore a subset of this curve formed by cutting out an infinite
number of holes.27
The features seen on the surface of section of the driven pen-
dulum are quite general. The same phenomena are seen in most
dynamical systems. In general, there are both regular and chaotic
trajectories, and there is the clumping characteristic of the divided
phase space. The specific details depend upon the system, but the
basic phenomena are generic. Of course we are interested in both
aspects: the phenomena which are generic to all systems, and the
specific details for particular systems of interest.
The surface of section for the periodically driven pendulum has
specific features that give us qualitative information about how
this system behaves. The central island in figure 3.12 is the rem-
nant of the oscillation region for the unforced pendulum (see fig-
ure 3.4). There is a sizable region of regular trajectories here that
are, in a sense, similar to the trajectories of the unforced pendu-
26
We saw an example of this extreme sensitivity to initial conditions in fig-
ure 1.7.
27
One-dimensional invariant sets with an infinite number of holes are some-
times called cantori, by analogy to the Cantor sets, but it really doesn’t
Mather.
lum. In this region, the pendulum oscillates back and forth, much
as the undriven pendulum does, but the drive makes it wiggle as
it does so. The section points are all collected at the same phase
of the drive so we do not see these wiggles on the section.
The central island is surrounded by a large chaotic zone. Thus
the region of phase space with regular trajectories similar to the
unforced trajectories has finite extent. On the section, the bound-
ary of this “stable” region is apparently rather well defined—there
is a sudden transition from smooth regular invariant curves to
chaotic motion that can take the system far from this region of
regular motion.
There are two other sizeable regions of regular behavior. The
trajectories in these regions are resonant with the drive, on av-
erage executing one full rotation per cycle of the drive. The two
islands differ in the direction of the rotation. In these regions
the pendulum is making complete rotations, but the rotation is
locked to the drive so that points on the section appear only in the
islands with finite angular extent. The fact that points for partic-
ular trajectories loop around the islands means that the pendulum
sometimes completes a cycle faster than the drive and sometimes
slower than the drive, but never loses lock.
Each regular region has finite extent. So from the surface of
section we can see directly the range of initial conditions which
remain in resonance with the drive. Outside of the regular region
initial conditions lead to chaotic trajectories which evolve far from
the resonant regions.
Various higher order resonance islands are also visible, as are
non-resonant regular circulating orbits. So, the surface of section
has provided us with an overview of the main types of motion that
are possible and their relationship.
If we change the parameters we can see other interesting phe-
nomena. Figure 3.13 shows the surface of section when the drive
frequency is twice the natural small amplitude oscillation fre-
quency of the undriven pendulum. The section has a large chaotic
zone, with an interesting set of islands. The central equilibrium
has undergone an instability and instead of a central island we
find two off-center islands. These islands are alternately visited
one after the other. As the support goes up and down the pendu-
lum alternately tips to one side and then the other. It takes two
periods of the drive before the pendulum visits the same island.
Thus, the system has “period doubled.” An island has been replaced by a pair of islands that the orbit visits alternately.
Figure 3.13 Another surface of section for the driven pendulum, il-
lustrating a period-doubled central island. For this section the frequency
of the drive is resonant with the frequency of small amplitude oscilla-
tions of the undriven pendulum. The angle is plotted on the abscissa
(scale −π to π); the momentum conjugate to this angle is plotted on the
ordinate (scale −10 to 10 kg m²/s). For this section the parameters are:
m = 1 kg, l = 1 m, g = 9.8 m/s², A = 0.1 m, ω = 2ω0.
28
In the particular case of the driven pendulum there is no reason to call fail.
This contingency is reserved for systems where orbits escape or cease to satisfy
some constraint.
3.6.3 Poincaré Sections for Autonomous Systems 233
tum p~.29 Integrating this density over some finite volume of phase
space gives the probability of finding a star in that phase-space
volume (in that region of space within a specified region of mo-
menta). We assume the probability density is normalized so that
the integral over all of phase space gives unit probability; the star
is somewhere and has some momentum with certainty. In terms
of f the statistical average of any dynamical quantity w over some
volume of phase space V is just
⟨w⟩_V = ∫_V f w.   (3.123)
29
We will see that it is convenient to look at distribution functions in the phase-
space coordinates because the consequences of conserved momenta are more
apparent, but also because volume in phase space is conserved by evolution
(see section 3.8).
There was good reason to believe that this might be correct. First,
it is clear that the distribution function surely depends at least on
E and pθ . The problem is “Given an energy E and angular mo-
mentum pθ what motion is allowed?” The integrals clearly confine
the evolution. Does the evolution carry the system everywhere
in the phase space subject to these known constraints? In the
early part of the 20th century this appeared plausible. Statistical
mechanics was successful, and statistical mechanics made exactly
this assumption. Perhaps there are other integrals of the mo-
tion which exist, but we have not yet discovered them? Poincaré
proved an important theorem with regard to integrals of the mo-
tion. Poincaré proved that most integrals of a dynamical system
typically do not persist upon perturbation of the system. That
is, if a small perturbation is added to a problem, then most of
the integrals of the original problem do not have analogs in the
perturbed problem. The integrals are destroyed. Of course, in-
tegrals which result from symmetries of the problem continue to
be preserved if the perturbed system has the same symmetries.
Thus angular momentum continues to be preserved upon appli-
cation of any axisymmetric perturbation. Poincaré’s theorem is
correct, but what came next was not. As a corollary to Poincaré’s
theorem, in 1920 Fermi published a proof of an ergodic theorem,
σz = σr . (3.128)
σr ≈ 2σz . (3.129)
30
A system is ergodic if time averages along trajectories are the same as phase
space averages over the region explored by the trajectories.
31
As before, upon close examination we may find that trajectories that appear
to be confined to a curve on the section are chaotic trajectories that explore
themselves as far apart as they can get. Once this happens the
distance no longer grows. The estimate of the rate of divergence
of trajectories is limited by this “saturation.”
We can improve on this method by studying a variational sys-
tem of equations. Let
32
In strongly chaotic systems w may become so large that the computer can
no longer represent it. To prevent this we can replace w by w/c whenever the
size of w becomes uncomfortably large. The equation governing w is linear
so, except for the scale change, the evolution is unchanged. Of course we have
to keep track of these scale changes when computing the average growth rate.
This process is called “renormalization” to make it sound impressive.
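A Python sketch of this renormalization bookkeeping, applied not to the book's variational equations but to the tangent map of the Chirikov standard map (an assumption; any system with a computable linearization works the same way):

```python
from math import sin, cos, log, hypot

# Estimate a Lyapunov exponent by evolving a tangent vector w with the
# linearized map, dividing out its size whenever it gets large, and
# accumulating the log of the scale factors (the "renormalization").
# Map (an assumption, not from the book): y' = y + K sin x, x' = x + y'.
K = 5.0
x, y = 0.3, 0.2          # a point in the chaotic sea for K = 5
w = (1.0, 0.0)
log_growth = 0.0
steps = 10000
for _ in range(steps):
    # tangent map at the current point: dy' = dy + K cos(x) dx; dx' = dx + dy'
    dy = w[1] + K * cos(x) * w[0]
    dx = w[0] + dy
    w = (dx, dy)
    y = y + K * sin(x)
    x = x + y
    size = hypot(*w)
    if size > 1e6:       # renormalize before w overflows; log the scale change
        w = (w[0] / size, w[1] / size)
        log_growth += log(size)
log_growth += log(hypot(*w))
lyapunov = log_growth / steps
assert lyapunov > 0.3    # chaotic: average exponential divergence
```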
H(t, θ, pθ) = pθ²/(2l²m) − glm cos θ.   (3.141)
In figure 3.25 we see the evolution of an elliptic region around a
point on the θ-axis, in the oscillation region of the pendulum.
Three later positions of the region are shown. The region is
stretched and sheared by the flow, but the area is preserved. After
many cycles, the starting region will be stretched to be a thin layer
distributed in the phase angle of the pendulum. Figure 3.26 shows
a similar evolution (for smaller time intervals) of a region strad-
dling the separatrix33 near the unstable equilibrium point. The
phase-space region rapidly stretches along the separatrix, while
preserving the area. The initial conditions that start in the oscil-
lation region (inside of the separatrix) will continue to spread into
a thin ring-shaped region, while the initial conditions that start
outside of the separatrix will spread into a thin region of rotation
on the outside of the separatrix.
Proof of Liouville’s theorem
Consider a set of ordinary differential equations of the form
33
The separatrix is the curve that separates the oscillating motion from the
circulating motion. It is made up of several trajectories that are asymptotic
to the unstable equilibrium.
3.8 Liouville’s Theorem 251
Figure 3.26 The pendulum here is the same as in the previous figure,
but now the swarm of initial points surrounds the unstable equilibrium
point for the pendulum in phase space, where θ = π and pθ = 0. The
swarm is stretched out along the separatrix. The time interval between
successively plotted contours is 0.3 seconds.
The volume V(t) of a region R(t) is ∫_{R(t)} 1. The volume of the
evolved region R(t + ∆t) is

V(t + ∆t) = ∫_{R(t+∆t)} 1 = ∫_{g_{t,∆t}(R(t))} 1 = ∫_{R(t)} Jac(g_{t,∆t}),   (3.145)
Expanding the Jacobian determinant for small ∆t gives
Jac(g_{t,∆t}) = 1 + ∆t Gt + o(∆t²), where Gt is the trace of the matrix of
partial derivatives of the phase-flow vector field. Thus

V(t + ∆t) = ∫_{R(t)} [1 + ∆t Gt + o(∆t²)]
          = V(t) + ∆t ∫_{R(t)} Gt + o(∆t²).   (3.151)
see that the trace, which is the sum of these diagonal components,
is zero. Thus the integral of Gt over the region R(t) is zero, so the
derivative of the volume at time t is zero. Because t is arbitrary,
the volume does not change. This proves Liouville’s theorem: the
phase-space flow conserves phase-space volume.
Notice that the proof of Liouville’s theorem does not depend
upon whether the Hamiltonian has explicit time dependence. Li-
ouville’s theorem holds for systems with time-dependent Hamil-
tonians.
We may think of the ensemble of all possible states as a fluid
flowing around under the control of the dynamics. Liouville’s theo-
rem says that this fluid is incompressible for Hamiltonian systems.
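The vanishing trace at the heart of the proof is easy to check numerically. A Python sketch for the pendulum (assuming H = p²/(2ml²) − mgl cos q, with m = l = 1 and g = 9.8) confirms that the Hamiltonian vector field has zero divergence:

```python
from math import cos

# Pendulum Hamiltonian and its flow field (Dq, Dp) = (∂2H, -∂1H),
# with the partial derivatives taken by central differences.
m, l, g = 1.0, 1.0, 9.8
H = lambda q, p: p*p/(2*m*l*l) - m*g*l*cos(q)

h = 1e-5
def flow(q, p):
    dHdq = (H(q + h, p) - H(q - h, p)) / (2*h)
    dHdp = (H(q, p + h) - H(q, p - h)) / (2*h)
    return dHdp, -dHdq

def divergence(q, p):
    # trace of the matrix of partial derivatives of the flow field
    dq = (flow(q + h, p)[0] - flow(q - h, p)[0]) / (2*h)
    dp = (flow(q, p + h)[1] - flow(q, p - h)[1]) / (2*h)
    return dq + dp

assert abs(divergence(0.7, 1.3)) < 1e-4   # incompressible flow
```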
34
It is reported that when Boltzmann was confronted with this problem he
responded, “You should wait that long!”
Since the exponential is never zero this equation has the same
trajectories as equation (3.155) above.
The momentum conjugate to x is
p = mẋ e^{(α/m)t},   (3.158)

Dx(t) = (p(t)/m) e^{−(α/m)t}
Dp(t) = −k x(t) e^{(α/m)t}.   (3.160)
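A Python sketch (not from the book) integrating these equations with RK4 and comparing x(t) against the underdamped solution of mẍ + αẋ + kx = 0, for sample values m = 1, α = 0.2, k = 1:

```python
from math import exp, cos, sin, sqrt

m, alpha, k = 1.0, 0.2, 1.0     # sample values, an assumption

def field(t, s):
    # (Dx, Dp) from equations (3.160) with the exotic momentum p
    x, p = s
    return ((p / m) * exp(-alpha * t / m),
            -k * x * exp(alpha * t / m))

def rk4(t, s, dt):
    k1 = field(t, s)
    k2 = field(t + dt/2, (s[0] + dt/2*k1[0], s[1] + dt/2*k1[1]))
    k3 = field(t + dt/2, (s[0] + dt/2*k2[0], s[1] + dt/2*k2[1]))
    k4 = field(t + dt, (s[0] + dt*k3[0], s[1] + dt*k3[1]))
    return (s[0] + dt/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            s[1] + dt/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

# underdamped analytic solution of m x'' + alpha x' + k x = 0
# with x(0) = 1, x'(0) = 0
gamma = alpha / (2*m)
omega = sqrt(k/m - gamma**2)
x_exact = lambda t: exp(-gamma*t) * (cos(omega*t) + gamma/omega * sin(omega*t))

t, s, dt = 0.0, (1.0, 0.0), 0.001    # p(0) = m*x'(0)*e**0 = 0
for _ in range(5000):
    s = rk4(t, s, dt)
    t += dt
assert abs(s[0] - x_exact(5.0)) < 1e-6   # same trajectory as the damped oscillator
```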
35
This is just the product of the Lagrangian for the undamped harmonic
oscillator with an increasing exponential of time.
Thus we can form the transformation from the initial state to the
final state: resolving the initial state (x(0), p(0)) into the two
eigendirections and multiplying each component by its decaying
exponential, e^{−t/10} or e^{−t/2}, gives (x(t), p(t)) (equation 3.163).
Distribution functions
We only know the state of a system approximately. It is reasonable
to model our state of knowledge by a probability density function
on the set of possible states. Given such incomplete knowledge,
what are the probable consequences? As the system evolves, the
density function also evolves. Liouville’s theorem gives us a handle
on this kind of problem.
Let f (t, q, p) be a probability density function on the phase
space at time t. For this to be a good probability density function
we require that the integral of f over all coordinates and momenta
is 1—that the system is somewhere is certain.
3.9 Standard Map 259
∂0 f ◦ σ + {f, H} ◦ σ = 0. (3.165)
36
This question was also addressed in the remarkable paper by Hénon and
Heiles, but with a different map than we use here.
37
The standard map has been extensively studied. Early investigations were
by Chirikov [11] and by Taylor [41]. So the map is sometimes called the
Chirikov-Taylor map. Chirikov coined the term “standard map,” which we
adopt.
Figure 3.27 Surface of section for the standard map for K = 0.6. The
section shows mostly regular trajectories, with a few dominant islands,
but also shows a number of small chaotic zones.
x′ = x cos α − (y − x²) sin α
y′ = x sin α + (y − x²) cos α
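This quadratic map is area preserving: its Jacobian determinant is identically 1. A quick numerical check in Python (α = 1.2 is just a sample value):

```python
from math import cos, sin

alpha = 1.2   # sample rotation angle, an assumption

def T(x, y):
    # the quadratic (Hénon-style) map from the text
    return (x * cos(alpha) - (y - x*x) * sin(alpha),
            x * sin(alpha) + (y - x*x) * cos(alpha))

h = 1e-6
def jacobian_det(x, y):
    # central-difference Jacobian of T; exact here since T is quadratic
    dxdx = (T(x + h, y)[0] - T(x - h, y)[0]) / (2*h)
    dxdy = (T(x, y + h)[0] - T(x, y - h)[0]) / (2*h)
    dydx = (T(x + h, y)[1] - T(x - h, y)[1]) / (2*h)
    dydy = (T(x, y + h)[1] - T(x, y - h)[1]) / (2*h)
    return dxdx * dydy - dxdy * dydx

assert abs(jacobian_det(0.4, -0.3) - 1.0) < 1e-6   # area preserved
```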
Figure 3.28 Surface of section for the standard map for K = 1.4.
The dominant feature is a large chaotic zone. There are also some large
islands of regular behavior. In this case there are also some interesting
secondary islands: islands around islands.
3.10 Summary
3.11 Projects
Exercise 3.14: Periodically driven pendulum
Explore the dynamics of the driven pendulum, using the surface of sec-
tion method. We are interested in exploring the regions of parameter
space over which various phenomena occur. Consider a pendulum of
length 9.8 m, mass 1 kg, and acceleration of gravity g = 9.8 m s⁻², giving
ω0 = 1 rad/s. Explore the parameter plane of the amplitude A and
frequency ω of the periodic drive.
Examples of the phenomena to be investigated:
a. Inverted equilibrium. Show the region of parameter space (A, ω) in
which the inverted equilibrium is stable. If the inverted equilibrium is
stable there is some range of stability, i.e. there is a maximum angle
Figure 4.1 The phase plane of the pendulum has three regions dis-
playing two distinct kinds of behavior. Trajectories lie on the contours
of the Hamiltonian. Trajectories may oscillate, making ovoid curves
around the equilibrium point, or they may circulate, producing wavy
tracks outside the eye-shaped region. The eye-shaped region is delimited
by the separatrix. This pendulum has length 1 m, and the acceleration
of gravity is 9.8 m s⁻².
Figure 4.2 A surface of section for the driven pendulum, with zero-
amplitude drive. The effect is to sample the trajectories of the undriven
pendulum, which lie on the contours of the Hamiltonian. Only a small
number of points are plotted for each trajectory to illustrate the fact that
for zero-amplitude drive the surface of section samples the continuous
trajectories of the undriven pendulum.
Figure 4.3 A surface of section for the driven pendulum, with non-
zero drive amplitude A = 0.001m and drive frequency 4.2ω0 . Many tra-
jectories apparently generate invariant curves, as in the zero-amplitude
drive case. Here, in addition, some orbits belong to island chains and
others are chaotic. The most apparent chaotic orbit is near the separa-
trix of the undriven pendulum.
point.1 For these orbits the pendulum rotates on average once per
drive, but the phase of the pendulum is sometimes ahead of the
drive and sometimes behind it.
There are other islands that appear with non-zero amplitude
drive. In the central oscillation region there is a six-fold chain
of secondary islands. For this orbit the pendulum is oscillating,
and the period of the oscillation is commensurate with the drive.
The six islands are all generated by a single orbit. In fact, the
islands are visited successively in a clockwise direction. After six
cycles of the drive the section point returns to the same island
but falls at a different point on the island curve, accumulating the
island curve after many iterations. The motion of the pendulum
is not periodic, but is locked in a resonance so that on average it
oscillates once for every six cycles of the drive.
1
Keep in mind that the abscissa is an angle.
4.2 Linear Stability of Fixed Points 271
with components
0 = F (t, ze ). (4.3)
That this is zero at all moments for the equilibrium solution im-
plies ∂0 F (t, ze ) = 0.
Next consider a state path z 0 which passes near the equilibrium
point. The path displacement ζ is defined so that at time t
We have
M α = λα, (4.12)
2
Actually, all we need is ∂0 ∂1 F (t, ze ) = 0.
274 Chapter 4 Phase Space Structure
ζc(t) = (u + iv) e^{(a+ib)t}
     = (u + iv) e^{at} (cos bt + i sin bt)
     = e^{at} (u cos bt − v sin bt) + i e^{at} (u sin bt + v cos bt).   (4.14)
3
If the eigenvalues are not unique then the form of the solution is modified.
4.2.2 Fixed Points of Maps 275
x0 = T (x0 ). (4.17)
4
The map T is being used as an operator: multiplication is interpreted as
composition.
but x0 = T(x0), so
ξ(n) = ρⁿα,   (4.22)
or
ξ(n) = ρⁿα.   (4.26)
5
We assume the eigenvalues are distinct for now.
4.2.3 Relations Among Exponents 277
done for the equilibrium solutions. Let ρ = exp(A + iB) with real
A and B, and ξ = u + iv. A calculation similar to that for the
equilibrium case shows that there are two real solutions
MJMᵀ = J,   (4.29)
Mᵀα0 = ρα0.   (4.31)
Figure 4.5 If there is more than one degree of freedom the eigenvalues
for fixed points of a Hamiltonian map may lie in a quartet, with two
complex-conjugate pairs. The magnitudes of the pairs must be inverses.
This enforces the constraint that the expansion produced by the roots
with magnitude greater than one is counterbalanced by the contraction
produced by the roots with magnitude smaller than one.
So
MJα0 = J(Mᵀ)⁻¹α0 = (1/ρ) Jα0,   (4.34)

and we can conclude that 1/ρ is an eigenvalue of M with the
eigenvector Jα0. From the fact that for every eigenvalue its inverse
is also an eigenvalue we deduce that the determinant of the
transformation M, which is the product of the eigenvalues, is one.
The constraints that the eigenvalues must be associated with
inverses and complex conjugates yields exactly one new pattern of
eigenvalues in higher dimensions. Figure 4.5 shows the only new
pattern that is possible.
We have seen that the Lyapunov exponents for fixed points
are related to the characteristic multipliers for the fixed points,
so the Hamiltonian constraints on the multipliers correspond to
Hamiltonian constraints for Lyapunov exponents at fixed points.
Figure 4.6 The neighborhood of the unstable fixed point of the pen-
dulum shows the stable and unstable manifolds of the nonlinear pen-
dulum and of the linearized variational system around the fixed point.
The axes are centered at the fixed point (±π, 0). The linear stable and
unstable manifolds are labeled by Vs and Vu ; the nonlinear stable and
unstable manifolds are labeled by Ws and Wu .
the unstable fixed point. So in this case the stable and unstable
manifolds coincide.
If the drive amplitude is non-zero then there are still one-
dimensional sets of points that are asymptotic to the unstable
fixed point forward and backward in time: there are still stable
and unstable manifolds. Why? The behavior near the fixed point
is described by the linearized variational system. For the linear
variational system, points in the space spanned by the unstable
eigenvector, when mapped backwards in time, are asymptotic to
the fixed point. Points slightly off this curve may initially ap-
proach the unstable equilibrium, but eventually will fall away to
one side or the other. For the driven system with small drive,
there must still be a curve which separates the points that fall
away to one side from the points that fall away to the other side.
Points on the dividing curve must be asymptotic to the unstable
equilibrium. The dividing set cannot have positive area because
the map is area preserving.
For the zero-amplitude drive case the stable and unstable man-
ifolds are contours of the conserved Hamiltonian. For non-zero
amplitude the Hamiltonian is no longer conserved. For non-zero
drive the stable manifolds and unstable manifolds no longer coin-
cide. This is generally true for non-integrable systems: stable and
unstable manifolds do not coincide.
If the stable and unstable manifolds no longer coincide where
do they go? In general, the stable and unstable manifolds must
cross one another. The only other possibilities are that they run
off to infinity or spiral around. Area preservation can be used to
exclude the spiraling case. We will see that in general there are
barriers to running away. So the only possibility is that the stable
and unstable manifolds cross. This is illustrated in figure 4.7. The
point of crossing of a stable and unstable manifold is called a ho-
moclinic intersection if the stable and unstable manifolds belong
to the same unstable fixed point. It is called a heteroclinic in-
tersection if the stable and unstable manifolds belong to different
fixed points.
If the stable and unstable manifolds cross once then there are an
infinite number of other crossings. The intersection point belongs
to both the stable and unstable manifolds. That it is on the
unstable manifold means that all images forward and backward in
time also belong to the unstable manifold, and likewise for points
on the stable manifold. Thus all images of the intersection belong
Figure 4.7 For non-zero drive the stable and unstable manifolds no
longer coincide and in general cross. The dashed circle indicates the
central intersection. Forward and backward images of this intersection
are themselves intersections. Because the orbits are asymptotic to the
fixed point there are an infinity of such intersections.
That would be ok, but what happens as the loop gets close to
the fixed point? There would still have to be loops, but then the
stable and unstable manifolds would not have the right behavior:
the stable and unstable manifolds of the linearized map do not
have loops. Therefore, the stable and unstable manifolds cannot
cross themselves.6
We are not done yet! The lobes that are defined by successive
crossings of the stable and unstable manifolds enclose a certain
area. The map is area preserving so all images of these lobes must
have the same area. So there are an infinite number of images of
these lobes, all with the same area. Furthermore, the boundaries
of these images cannot cross. As the lobes approach the fixed
point we get an infinite number of lobes with a base with an
exponentially shrinking length. In order to pack these together
on the plane, without the boundaries crossing each other, the
lobes must stretch out to preserve area. We see that the length of
the lobe must grow roughly exponentially (It may not be uniform
in width so it need not be exactly exponential.) This exponential
lengthening of the lobes no doubt bears some responsibility for the
exponential divergence of nearby trajectories of chaotic orbits, but
6
Sometimes it is argued that the stable and unstable manifolds cannot cross
themselves on the basis of the uniqueness of solutions of differential equations.
This is an incorrect argument. The stable and unstable manifolds are not
themselves solutions of a differential equation; they are sets of points whose
evolutions are asymptotic to the unstable fixed points.
4.3.1 Computation of Stable and Unstable Manifolds
The near? argument is a test for whether two points are within
a given distance of each other in the graph. Because some co-
ordinates are angle variables, this may involve a principal value
comparison. For example, for the driven pendulum section, the
horizontal axis is an angle but the vertical axis is not, so the pic-
ture is on a cylinder:
(define (cylinder-near? eps)
(let ((eps2 (square eps)))
(lambda (x y)
(< (+ (square ((principal-value pi)
(- (car x) (car y))))
(square (- (cdr x) (cdr y))))
eps2))))
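For instance, assuming the scmutils principal-value procedure reduces the angle difference to an interval of width 2π, two representations of the same point on the cylinder test as near (a sketch in the book's Scheme dialect; it is not runnable outside that environment):

```scheme
;; The angle coordinates 0 and 2pi name the same point on the
;; cylinder, so the principal-value comparison makes them near;
;; the momentum coordinate is compared directly.
((cylinder-near? 1e-10)
 (cons 0. 2.)
 (cons (* 2 pi) 2.))
```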
Figure 4.9 The computed homoclinic tangle for the driven pendulum
exhibits the features described in the text. Notice how the excursions
of the stable and unstable manifolds become longer and thinner as they
approach the unstable fixed point. A surface of section with the same
parameters is also shown.
4.4 Integrable Systems
J(t) = J(t0 )
θ(t) = ω(J(t0 ))(t − t0 ) + θ(t0 ). (4.37)
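Solutions (4.37) are straightforward to express as a program; here is a sketch (the helper evolve-action-angle and the frequency function omega are hypothetical, not from the book):

```scheme
(define ((evolve-action-angle omega J0 theta0 t0) t)
  ;; The action is conserved; the angle advances at the
  ;; constant rate omega(J0).
  (up (+ theta0 (* (omega J0) (- t t0)))
      J0))
```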
Figure 4.10 The solid and dotted lines show two periodic trajectories
on the configuration coordinate plane. For commensurate frequencies
the configuration motion is periodic, independent of the initial angles.
In this illustration the frequencies satisfy 3ω 0 (J(t0 )) = 2ω 1 (J(t0 )). The
orbit closes after 3 cycles of θ0 and 2 cycles of θ1 , for any initial θ0 and
θ1 .
If the frequencies ω^i(J(t0)) satisfy an integer-coefficient relation
Σ_i n_i ω^i(J(t0)) = 0 we say that the frequencies
satisfy a commensurability. If there is no commensurability
for any non-zero integer coefficients we say that the frequencies
are linearly independent (with respect to the integers) and the so-
lution is quasiperiodic. One can prove that for n incommensurate
frequencies all solutions come arbitrarily close to every point in
the configuration space.7
For a system with two degrees of freedom the solutions in a
region described by a particular set of action-angle variables are
either equilibrium solutions, periodic solutions, or quasiperiodic
solutions.8 For systems with more than two degrees of freedom
there are trajectories that are neither periodic nor
quasiperiodic with n frequencies. These are quasiperiodic with
fewer frequencies and dense over a corresponding lower dimen-
sional torus.
Surfaces of section for integrable systems
As we have seen, in action-angle coordinates the angles move
with constant angular frequencies, and the momenta are constant.
Thus surfaces of section in action-angle coordinates are particu-
larly simple. We can make surfaces of section for time-independent
two degree of freedom systems or one degree of freedom systems
with periodic drive. In the latter case, one of the angles in the
action-angle system is the phase of the drive. We make surfaces
of section by accumulating points in one pair of canonical coordi-
nates as the other coordinate goes through some particular value,
such as zero. If we plot the section points with the angle coordi-
nate on the abscissa and the conjugate momentum on the ordinate
then the section points for all trajectories lie on horizontal lines,
as illustrated in figure 4.11.
For definiteness, let the plane of the surface of section be the
(θ0 , J0 ) plane, and the section condition be θ1 = 0. The other
7
Motion with n incommensurate frequencies is dense on the n-torus. Further-
more, such motion is ergodic on the n-torus. This means that time averages of
time independent phase space functions computed along trajectories are equal
to the phase space average of the same function over the torus.
8
For time-independent systems with two degrees of freedom the boundary
between regions described by different action-angle coordinates has asymptotic
solutions and unstable periodic orbits or equilibrium points. The solutions on
the boundary are not described by the action-angle Hamiltonian.
[Figure 4.11: Section points for several trajectories of an integrable
system, with the angle θ0 on the abscissa and the action J0 on the
ordinate. The section points of each trajectory fall on a horizontal line.]
θ̂(i) = θ0(i∆t + t0)
Ĵ(i) = J0(i∆t + t0), (4.38)
9
The coordinate θ̂(i) is an angle. It can be brought to a standard interval such
as 0 to 2π.
10
Actually, to be a twist map we require |Dν(J)| > K > 0 over some interval
of J.
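In this notation, the section map of an integrable system acts on the cylinder as a twist map (a standard definition, stated here for reference; ν is the rotation number as a function of the action):

```latex
T(\theta, J) = (\theta + 2\pi\,\nu(J),\; J),
\qquad |D\nu(J)| > K > 0 .
```

Each invariant circle J = const is rotated rigidly, and the twist condition says the rotation rate varies monotonically from circle to circle over the interval of J considered.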
generated is filled densely. Again, this is the case for any initial
coordinates, because the frequencies depend only on the momenta.
There are infinitely many such orbits which are distinct for a given
set of frequencies.11
H = H0 + εH1. (4.40)
11
The section points for any particular orbit are countable and dense, but they
have zero measure on the line.
4.5 Poincaré-Birkhoff Theorem
Figure 4.12 The map T^k has a line of fixed points if the rotation
number is the rational j/k. Points above this line map to larger θ0;
points below this line map to smaller θ0.
Figure 4.13 The map T_ε^k is slightly different from T^k, but above the
central region points still map to larger θ0 and below the central region
they map to smaller θ0. By continuity there are points between for
which θ0 does not change.
Figure 4.14 The curve C0 of points that map to the same θ0 under
T_ε^k is indicated by the solid line. The image of this curve under T_ε^k
is the dotted curve C1. Area preservation implies these curves cross.
12
If Ĵ+ were not periodic in θ0 then it would have to spiral. Suppose it
spirals. The region enclosed by two successive turns of the spiral is mapped
to a region between successive turns further down the spiral.
The map preserves area, so the spiral cannot approach a limiting circle but
must progress infinitely down the cylinder. This is impossible because of the
twist condition: sufficiently far down the cylinder the rotation number is too
different to allow the angle to be the same under T_ε^k. So Ĵ+ does not spiral.
Figure 4.15 The fixed point on the left is linearly unstable. The one
on the right is linearly stable.
Figure 4.16 The curves C0 (solid) and C1 (dotted) for the 1:1 com-
mensurability.
Figure 4.17 A surface of section displaying the 1:1 commensurability.
4.6 Invariant Curves
Figure 4.18 The curves C0 (solid) and C1 (dotted) for the 1:3 com-
mensurability. The angle runs from −π to π. The momentum runs from
3.5 to 4.5 in appropriate units.
Figure 4.19 A surface of section displaying the 1:3 commensurability.
The angle runs from −π to π. The momentum runs from 3.5 to 4.5 in
appropriate units.
13
This depends on the assumptions that Jmin and Jmax bracket the actual mo-
mentum, and that the rotation number is sufficiently continuous in momentum
in that region.
4.6.1 Finding Invariant Curves
The maps are evolved and built into a stream by a simple recursive
procedure. The maps are represented in the same way that they
appeared in section 3.6.
(define (orbit-stream the-map x y)
(cons-stream (list x y)
(the-map x y
(lambda (nx ny)
(orbit-stream the-map nx ny))
(lambda () ’fail))))
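For example, the first few section points of a standard-map orbit can be collected with stream-head (a sketch for the book's scmutils environment, where standard-map is as in section 3.6):

```scheme
;; A list of the first five (x y) points of the orbit
;; of (1.0, 1.0) under the standard map with parameter 0.95:
(stream-head (orbit-stream (standard-map 0.95) 1.0 1.0) 5)
```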
14
The insert procedure is ugly:
(define (insert! x set cont)
(cond ((null? set)
(cont (list x) 1))
((< x (car set))
(cont (cons x set) 0))
(else
(let lp ((i 1) (lst set))
(cond ((null? (cdr lst))
(set-cdr! lst (cons x (cdr lst)))
(cont set i))
((< x (cadr lst))
(set-cdr! lst (cons x (cdr lst)))
(cont set i))
(else
(lp (+ i 1) (cdr lst))))))))
15
The principal-range procedure is implemented as follows:
(define ((principal-range period) index)
(let ((t (- index (* period (floor (/ index period))))))
(if (< t (/ period 2.))
t
(- t period))))
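As a worked check (assuming pi is bound, as in scmutils): with period 2π, an index of 7.0 exceeds one period, so it is first reduced to 7 − 2π ≈ 0.717; since this is less than π, half the period, it is returned unchanged:

```scheme
((principal-range (* 2 pi)) 7.0)
;; about .717
```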
Once we have created this mess we can use it to find the initial
momentum (for a given initial angle) for an invariant curve with
a given rotation number. We search the standard map for an
invariant curve with a golden rotation number:16
(find-invariant-curve (standard-map 0.95)
(- 1 (/ 1 golden-mean))
0.0
2.0
2.2
1e-5)
;Value: 2.114462280273437
16
There is no invariant curve in the standard map with rotation number φ =
1.618.... However 1 − 1/φ has the same continued-fraction tail as φ and there
are rotation numbers of this size in the standard map.
interest then we check to see if the angle of the other map is in the
corresponding interval. If so, the intervals for the uniform circle
map and the other map are narrowed and the iteration proceeds. If
the angle is not in the required interval, a discrepancy is noted and
the sign of the discrepancy is reported. For this process to make
sense the differences between the angles for successive iterations
of both maps must be less than π.
(define (which-way? rotation-number x0 y0 the-map)
(let ((pv (principal-value (+ x0 pi))))
(let lp ((z x0) (zmin (- x0 :2pi)) (zmax (+ x0 :2pi))
(x x0) (xmin (- x0 :2pi)) (xmax (+ x0 :2pi))
(y y0))
(let ((nz (pv (+ z (* :2pi rotation-number)))))
(the-map x y
(lambda (nx ny)
(let ((nx (pv nx)))
(cond ((< x0 z zmax)
(if (< x0 x xmax)
(lp nz zmin z nx xmin x ny)
(if (> x xmax) 1 -1)))
((< zmin z x0)
(if (< xmin x x0)
(lp nz z zmax nx x xmax ny)
(if (< x xmin) -1 1)))
(else
(lp nz zmin zmax nx xmin xmax ny)))))
(lambda ()
(error "Map failed" x y)))))))
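Given which-way?, the find-invariant-curve used earlier is presumably a straightforward search on the initial momentum: the sign reported by which-way? tells which way to move the bracket. A sketch (the sign convention and argument order are assumptions; any root-bracketing search will do):

```scheme
(define (find-invariant-curve the-map rn theta0 Jmin Jmax eps)
  ;; Narrow [Jmin, Jmax] until the bracket is smaller than eps,
  ;; keeping the invariant curve with rotation number rn inside.
  (let loop ((Jmin Jmin) (Jmax Jmax))
    (let ((J (/ (+ Jmin Jmax) 2)))
      (if (< (- Jmax Jmin) eps)
          J
          (if (negative? (which-way? rn theta0 J the-map))
              (loop J Jmax)
              (loop Jmin J))))))
```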
Figure 4.21 Here is a small portion of the same invariant curve shown
in figure 4.20. The curve is magnified by 2π × 107 . We see that even at
this magnification the points appear to lie on a line. We also see that
the visitation frequency of points is highly nonuniform.
tation number will not exist. Indeed, if the invariant set persists
with the given rotation number it will have an infinite number of
holes (because it has an irrational winding number). Such a set is
sometimes called a cantorus.
5.1 Point Transformations
p′ = ∂2L′(t, q′, v′)
p = ∂2L(t, q, v)
= ∂2L(t, F(t, q′), ∂0F(t, q′) + ∂1F(t, q′)v′). (5.4)
(t, q, p) = C(t, q′, p′)
= (t, F(t, q′), p′ (∂1F(t, q′))⁻¹). (5.6)
H′(t, q′, p′) = p′v′ − L′(t, q′, v′)
= (p ∂1F(t, q′)) ((∂1F(t, q′))⁻¹ (v − ∂0F(t, q′)))
− L(t, q, v)
= pv − L(t, q, v) − p ∂0F(t, q′)
= H(t, q, p) − p ∂0F(t, q′), (5.7)
using relations (5.1) and (5.5) in the second step. Fully expressed
in terms of the transformed coordinates and momenta the trans-
formed Hamiltonian is
1
Solving for p in terms of p0 involves multiplying equation (5.3) on the right by
(∂1 F (t, q 0 ))−1 . This inverse is the structure that when multiplying ∂1 F (t, q 0 )
on the right gives an identity structure. Structures representing linear trans-
formations may be represented in terms of matrices. In this case, the matrix
representation of the inverse structure is the inverse matrix of the matrix
representing the given structure.
2
In chapter 1 the transformation C takes a local tuple in one coordinate system
and gives a local tuple in another coordinate system. In this chapter C is a
phase-space transformation.
v = ∂1 F (t, q 0 )v 0 . (5.9)
H(t, q, p) = pv − L(t, q, v)
= p0 v 0 − L0 (t, q 0 , v 0 )
= H 0 (t, q 0 , p0 ). (5.11)
3
The velocities and the momenta are dual geometric objects with respect to
time-independent point transformations. The velocities comprise a vector field
on the configuration manifold, and the momenta comprise a covector field on
the configuration manifold. The invariance of the inner product pv under point
transformations provides the motivation for the use of superscripts for velocity
components and subscripts for momentum components in our notation.
V(r) + p_r²/(2m) + p_φ²/(2mr²).
There are three terms. There is the potential energy, which de-
pends on the radius, there is the kinetic energy due to radial mo-
tion, and there is the kinetic energy due to tangential motion. As
expected, the angle φ does not appear and thus the angular mo-
mentum is a conserved quantity. By going to polar coordinates we
have decoupled one of the two degrees of freedom in the problem.
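This polar-coordinate Hamiltonian can be produced mechanically by composing a rectangular central-field Hamiltonian with the lifted point transformation (a sketch in the book's dialect; an H-central constructor for p²/(2m) + V(|q|) is assumed here):

```scheme
(print-expression
 ((compose (H-central 'm (literal-function 'V))
           (F->CT p->r))
  (up 't (up 'r 'phi) (down 'p_r 'p_phi))))
```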
σ = C ◦ σ0. (5.12)
Dσ = Ds H ◦ σ, (5.15)
Using σ = C ◦ σ 0 we find
Ds H ◦ C = DC · (Ds H 0 ). (5.19)
[Diagram: the phase-space transformation C carries the primed state path
to the unprimed one; the Hamiltonians H and H′ map states (q, p) and
(q′, p′) to ℝ, and the corresponding Hamiltonian state derivatives Ds H
and Ds H′ are related through DC.]
Ds H ◦ C = DC Ds (H ◦ C). (5.20)
The value of T̃ does not depend on its arguments, and for time-
independent transformations T̃ = DC · T̃, so the canonical condi-
tion becomes
Φ(A)(v) = A · v. (5.28)
Φ∗ (A)(p) = p · A. (5.29)
where
x = √(2I/α) sin θ (5.33)
p_x = √(2αI) cos θ. (5.34)
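As a program, the transformation (5.33–5.34) might be written as follows (a sketch consistent with the polar-canonical procedure used below; the book's actual definition may differ in detail):

```scheme
(define ((polar-canonical alpha) state)
  (let ((t (time state))
        (theta (coordinate state))
        (I (momentum state)))
    (let ((x (* (sqrt (/ (* 2 I) alpha)) (sin theta)))
          (p_x (* (sqrt (* 2 alpha I)) (cos theta))))
      (up t x p_x))))
```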
4
It is in principle impossible to determine, in general, whether two functions
are the same, but in this case, since Φ(DC(s)) is linear, this test is valid.
5
The shape of DH(s) is compatible with the shape of s: if they are
multiplied the result is a real number. The procedure compatible-shape takes
any structure and produces another structure that is guaranteed to multiply
with the given structure to produce a real number. The structure produced
is filled with unique real literals, so if the residual is zero then the functions
are the same.
(print-expression
((time-independent-canonical? (polar-canonical ’alpha))
(up ’t ’theta ’I)))
(up 0 0 0)
(print-expression
((time-independent-canonical? a-non-canonical-transform)
(up ’t ’theta ’p)))
H(t, x, p_x) = p_x²/(2m) + (1/2) k x². (5.35)
6
Actually, for I = 0 the transform is not well defined and so it is not composi-
tional canonical for that value. This transformation is “locally compositional
canonical” in that it is compositional canonical for nonzero values of I. We
will ignore this essentially topological problem.
7
The mysterious symbols such as x8102 are unique real literals introduced to
test functional equalities. That they appeared in a residual demonstrates that
the equality is invalid.
5.2.1 Time-Independent Canonical Transformations
Dx = px /m
Dpx = −kx, (5.36)
mD2 x + kx = 0. (5.37)
The solution is
x(t) = A sin(ωt + φ), (5.38)
where
ω = √(k/m) (5.39)
So
θ(t) = ωt + φ. (5.44)
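That the angle rate is the constant ω can be checked directly: substituting (5.33–5.34) into (5.35) with the choice α = √(km) = mω gives

```latex
H = \frac{2\alpha I\cos^2\theta}{2m} + \frac{k}{2}\,\frac{2I}{\alpha}\sin^2\theta
  = \omega I\cos^2\theta + \omega I\sin^2\theta
  = \omega I ,
```

so Hamilton's equations give DI = 0 and Dθ = ∂H/∂I = ω, consistent with (5.44).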
8
The derivative of a linear transformation is a constant function, independent
of the argument.
9
The procedure s->m takes three arguments: (s->m s* A s). The s* and s
specify the shapes of objects that multiply A on the left and right to give a
numerical value; these specify the basis.
(print-expression
((symplectic? (F->CT p->r))
(up ’t
(up ’r ’varphi)
(down 'p_r 'p_varphi))))
(matrix-by-rows (list 0 0 0 0 0)
(list 0 0 0 0 0)
(list 0 0 0 0 0)
(list 0 0 0 0 0)
(list 0 0 0 0 0))
J_nᵀ = J_n⁻¹ = −J_n. (5.48)
J_n = A J_n Aᵀ (5.49)
10
The qp submatrix of a (2n+1)-dimensional square matrix is the
2n-dimensional matrix obtained by deleting the first row and the first column
of the given matrix. This can be computed by:
(define (qp-submatrix m)
(m:submatrix m 1 (m:num-rows m) 1 (m:num-cols m)))
5.2.3 Time-Dependent Transformations
(define ((symplectic-transform? C) s)
(symplectic-matrix?
(qp-submatrix
(s->m (compatible-shape s)
((D C) s)
s))))
(define (symplectic-matrix? M)
(let ((2n (m:dimension M)))
(let ((J (symplectic-unit (quotient 2n 2))))
(- J (* M J (m:transpose M))))))
H 0 = H ◦ C + K, (5.50)
and
(define ((canonical-K? C K) s)
(let ((s* (compatible-shape s)))
(- (T-func s*)
(+ (* ((D C) s) (J-func ((D K) s)))
(((partial 0) C) s)))))
Rotating coordinates
Consider a time-dependent point transformation to uniformly ro-
tating coordinates:
q = R(Ω)(t, q 0 ), (5.55)
with components
x = x0 cos(Ωt) − y 0 sin(Ωt)
y = x0 sin(Ωt) + y 0 cos(Ωt). (5.56)
As a program this is
(define ((rotating n) state)
(let ((t (time state))
(q (coordinate state)))
(let ((x (ref q 0))
(y (ref q 1))
(z (ref q 2)))
(up (+ (* (cos (* n t)) x) (* (sin (* n t)) y))
(- (* (cos (* n t)) y) (* (sin (* n t)) x))
z))))
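The transformation C-rotating tested below is presumably the lift of this point transformation to phase space:

```scheme
(define (C-rotating Omega)
  (F->CT (rotating Omega)))
```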
(pe
((symplectic-transform? (C-rotating ’Omega))
(up ’t
(coordinate-tuple ’x ’y ’z)
(momentum-tuple ’px ’py ’pz))))
(matrix-by-rows (list 0 0 0 0 0 0)
(list 0 0 0 0 0 0)
(list 0 0 0 0 0 0)
(list 0 0 0 0 0 0)
(list 0 0 0 0 0 0)
(list 0 0 0 0 0 0))
The Poisson bracket can be written in terms of J̃:
q = A(t, q 0 , p0 ) (5.59)
p = B(t, q 0 , p0 ). (5.60)
δ^i_j = {A^i, B_j}
0 = {A^i, A^j}
0 = {B_i, B_j} (5.61)
where δ^i_j is one if i = j and zero otherwise. These are called the
fundamental Poisson brackets. If a transformation satisfies these
fundamental Poisson bracket relations then it is symplectic.
We have found that a time-dependent transformation is canon-
ical if its position-momentum part is symplectic and we modify
the Hamiltonian by the addition of a suitable K. We can rewrite
these conditions in terms of Poisson brackets. If the Hamiltonian
is
Exercise 5.8:
Fill in the details to show that the symplectic condition (5.31) is equiv-
alent to the fundamental Poisson brackets (5.61) and that the condition
on K (5.53) is equivalent to the Poisson bracket condition on K (5.63).
and so Dx is
Dx(t) = √(2I(t)/α) Dθ(t) cos θ(t) + DI(t) (1/√(2I(t)α)) sin θ(t). (5.65)
p_x(t)Dx(t) − I(t)Dθ(t)
= I(t)Dθ(t)(2 cos²θ(t) − 1) + DI(t) sin θ(t) cos θ(t). (5.66)
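The right-hand side of (5.66) is a total time derivative: since D(I sin θ cos θ) = DI sin θ cos θ + I Dθ (2cos²θ − 1), equation (5.66) can be summarized as

```latex
p_x(t)\,Dx(t) - I(t)\,D\theta(t) = D\bigl( I \sin\theta \cos\theta \bigr)(t) ,
```

and, using x² = (2I/α) sin²θ, the quantity I sin θ cos θ is just (α/2) x² cot θ, the generating function that appears in (5.164).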
{f ∘ C, g ∘ C}(s)
= (D(f ∘ C))(s) · J̃((D(g ∘ C))(s))
= ((Df ∘ C)(s) · DC(s)) · J̃(((Dg ∘ C)(s)) · DC(s))
= ((Df ∘ C)(s)) · J̃(((Dg ∘ C)(s)))
= ({f, g} ∘ C)(s), (5.67)
{f ◦ C, g ◦ C} = {f, g} ◦ C. (5.68)
Volume preservation
Consider a canonical transformation C. Let Ĉt be a function with
parameter t such that (q, p) = Ĉt (q 0 , p0 ) if (t, q, p) = C(t, q 0 , p0 ).
The function Ĉt maps phase space coordinates to alternate phase
space coordinates at a given time. Consider regions R in (q, p)
and R0 in (q 0 , p0 ) such that R = Ĉt (R0 ). The volume of region R0
is
V(R) = ∫_R 1 = ∫_{R′} det(DĈt). (5.69)
as can be seen by writing out the components. We use the fact that
Poisson brackets are invariant under canonical transformations
11
The ω form can also be written as a sum over degrees of freedom:
ω(ζ1, ζ2) = Σ_i (P_i(ζ2)Q^i(ζ1) − P_i(ζ1)Q^i(ζ2)).
Notice that the contributions for each i do not mix components from different
degrees of freedom.
5.3 Invariants of Canonical Transformations
where we have used the useful relation (5.74). The right-hand side
of equation (5.76) is
Now the left-hand side must equal the right-hand side for any f
and g, so the equation must also be true for arbitrary ζi0 of the
form:
The ζi0 are arbitrary incremental states with zero time components.
So we have proven that
for canonical C and incremental states ζi0 with zero time compo-
nents. Using equation (5.72) we have
(define zeta2
(up 0
(typical-object (coordinate a-polar-state))
(typical-object (momentum a-polar-state))))
Note that the time components of zeta1 and zeta2 are zero. We
evaluate the residual:
(print-expression
(let ((DCs ((D (F->CT p->r)) a-polar-state)))
(- (omega zeta1 zeta2)
(omega (* DCs zeta1) (* DCs zeta2)))))
0
That is, the sum of the projected areas on the canonical planes is
preserved by canonical transformations. Another way to say this
is
Σ_i ∫_{R(q^i,p_i)} dq^i dp_i = Σ_i ∫_{R′(q′^i,p′_i)} dq′^i dp′_i. (5.83)

[Figure: a phase-space region R and its preimage R′ (with R = Ĉt(R′)),
together with their projections R1, R2 and R′1, R′2 onto the canonical
planes (q^1, p_1) and (q^2, p_2).]
To see why this is true we first consider how the area of an incre-
mental parallelogram in phase space transforms under canonical
transformation. Let (∆q, ∆p) and (δq, δp) represent small incre-
ments in phase space, originating at (q, p). Consider the incre-
mental parallelogram with vertex at (q, p) with these two phase
space increments as edges. The sum of the areas of the canonical
projections of this incremental parallelogram can be written
Σ_i ∆A_i = Σ_i (∆q^i δp_i − ∆p_i δq^i). (5.84)
The right hand side is the sum of the areas on the canonical planes;
for each i we see the area of a parallelogram computed from the
components of the vectors defining its adjacent sides. Let ζ1 =
(0, ∆q, ∆p) and ζ2 = (0, δq, δp), then the sum of the areas of the
incremental parallelograms is just
Σ_i ∆A_i = ω(ζ1, ζ2), (5.85)
The canonical planes are disjoint except at the origin, so the pro-
jected areas only intersect in at most one point. Thus we may
5.4 Extended Phase Space
with
We have
The Lagrange equations for qe are satisfied for exactly the same
trajectories that satisfy the original Lagrange equations for q.
The extended system is subject to a constraint that relates the
time to the new independent variable. We assume the constraint
is of the form φ(τ ; qe , qt ; ve , vt ) = qt − f (τ ) = 0. The constraint is
a holonomic constraint involving the coordinates and time, so we
can incorporate this constraint by augmenting the Lagrangian:12
Pe (τ ; qe , qt , λ; ve , vt , vλ ) = ∂2,0 L0e (τ ; qe , qt , λ; ve , vt , vλ )
= ∂2 L(qt , qe , ve /vt )
= P(qt , qe , ve /vt ) (5.93)
Pt (τ ; qe , qt , λ; ve , vt , vλ ) = ∂2,1 L0e (τ ; qe , qt , λ; ve , vt , vλ )
12
We augment the Lagrangian with the total time derivative of the constraint
so that the Legendre transform will be well defined.
∂2 Le (τ ; qe , qt ; ve , vt ) · (ve , vt ) = Le (τ ; qe , qt ; ve , vt ), (5.96)
∂2 L0e (τ ; qe , qt , λ; ve , vt , vλ ) · (ve , vt , vλ )
= ∂2 Le (τ ; qe , qt ; ve , vt ) · (ve , vt ) + vλ vt + (vt − Df (τ ))vλ
= Le (τ ; qe , qt ; ve , vt ) + vλ vt + (vt − Df (τ ))vλ . (5.97)
He0 (τ ; qe , qt , λ; pe , pt , pλ ) = vλ vt
= (pt + H(qt , qe , pe ))(pλ + Df (τ )). (5.98)
We have used the fact that at corresponding states the momenta
have the same values, so on paths pe = p ∘ t, and
13
Once we have made this reduction, taking pλ to be zero, we can no longer
perform a Legendre transform back to the extended Lagrangian system; we
cannot solve for pt in terms of vt . However, the Legendre transform in the
extended system from He0 to L0e , with associated state variables, is well defined.
14
If f is strictly increasing then Df is never zero.
r0 = r (5.109)
θ0 = θ − Ωt (5.110)
p0r = pr (5.111)
p0θ = pθ (5.112)
with
He (τ ; r, θ, t; pr , pθ , pt ) = H(t; r, θ; pr , pθ ) + pt
= f (r, θ − Ωt, pr , pθ ) + pt (5.114)
t0 = t (5.115)
pt = −Ωp0θ + p0t . (5.116)
15
Actually, the traditional Jacobi constant is C = −2H 0 .
and
∮_{∂R} (Σ_{i=0}^{n−1} p_i dq^i − E dt) = ∮_{∂R′} (Σ_{i=0}^{n−1} p′_i dq′^i − E′ dt′). (5.122)
Relations (5.121) and (5.122) are two formulations of the Poincaré-
Cartan integral invariant.
q_r^i = q^i ∘ τ
p_{r,i} = p_i ∘ τ,
and thus
Note that in the reduced phase space we will have indices for the
structured variables in the range 0 . . . n − 1 whereas in the original
phase space the indices are in the range 0 . . . n. We will show that
Hr is an appropriate Hamiltonian for the given dynamical system
in the reduced phase space. To compute Hamilton’s equations we
must expand the implicit definition of Hr . We define an auxiliary
function
∂0 g = (∂0 f )n − (∂1 f )n ∂0 Hr = 0
(∂1 g)i = (∂0 f )i − (∂1 f )n (∂1 Hr )i = 0
(∂2 g)i = (∂1 f )i − (∂1 f )n (∂2 Hr )i = 0, (5.130)
Dq_r^i(x) = Dq^i(τ(x)) / Dq^n(τ(x))
= (∂2H(τ(x), q(τ(x)), p(τ(x))))^i / (∂2H(τ(x), q(τ(x)), p(τ(x))))_n
= (∂2H_r(x, q_r(x), p_r(x)))^i (5.133)
Dp_{r,i}(x) = Dp_i(τ(x)) / Dq^n(τ(x))
= −(∂1H(τ(x), q(τ(x)), p(τ(x))))_i / (∂2H(τ(x), q(τ(x)), p(τ(x))))_n
= −(∂1H_r(x, q_r(x), p_r(x)))_i. (5.134)
H(t; r, φ; p_r, p_φ) = p_r²/(2m) + p_φ²/(2mr²) + V(r) (5.135)
There are two degrees of freedom and the Hamiltonian is time-
independent. Thus the energy, the value of the Hamiltonian,
is conserved on realizable paths. Let’s forget about time and
reparametrize this system in terms of the orbital radius r.16 To
do this we solve
H(t; r, φ; pr , pφ ) = E (5.136)
for pr , obtaining
H′(r; φ; p_φ) = −p_r = −(2m(E − V(r)) − p_φ²/r²)^(1/2) (5.137)
which is the Hamiltonian in the reduced phase space.
Hamilton’s equations are now quite simple:
dφ/dr = ∂H′/∂p_φ = (p_φ/r²) (2m(E − V(r)) − p_φ²/r²)^(−1/2) (5.138)
dp_φ/dr = −∂H′/∂φ = 0. (5.139)
We see that pφ is independent of r (as it was with t), so for any
particular orbit we may define a constant angular momentum L.
Thus our problem ends up as a simple quadrature:
φ(r) = ∫^r (L/r²) (2m(E − V(r)) − L²/r²)^(−1/2) dr + φ0. (5.140)
16
We could have chosen to reparametrize in terms of φ, but then both pr
and r would occur in the resulting time-independent Hamiltonian. The path
we have chosen takes advantage of the fact that φ does not appear in our
Hamiltonian, so pφ is a constant of the motion. This structure suggests that
to solve this kind of problem we need to look ahead, as in playing chess.
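As a concrete instance (a standard result, not worked in the surrounding text): for the Kepler potential V(r) = −μ/r, the substitution u = 1/r reduces the quadrature (5.140) to an inverse cosine, and the orbit is a conic section

```latex
r(\phi) = \frac{p}{1 + e\cos(\phi - \phi_0)},
\qquad p = \frac{L^2}{m\mu},
```

where for an ellipse the semi-latus rectum p is the p = a(1 − e²) of (5.146).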
p = a(1 − e²) (5.146)
p = ∂1 F1 (t, q, q 0 ) (5.148)
p0 = −∂2 F1 (t, q, q 0 ) (5.149)
H 0 (t, q 0 , p0 ) − H(t, q, p) = ∂0 F1 (t, q, q 0 ). (5.150)
then
q 0 = A(t, q, p) (5.152)
p0 = −∂2 F1 (t, q, A(t, q, p)). (5.153)
q = B(t, q 0 , p0 ) (5.155)
p = ∂1 F1 (t, B(t, q 0 , p0 ), q 0 ). (5.156)
px = ∂1 F1 (t, x, θ) (5.159)
I = −∂2 F1 (t, x, θ). (5.160)
Using the relations (5.157) and (5.158), which specify the canoni-
cal transformation, the first equation (5.159) can be rewritten
I = −∂2F1(t, x, θ) = (α/2) (x²/sin²θ) − ∂1φ(t, θ), (5.163)
but we see that if we set φ = 0 the desired relations are recovered.
So the generating function
F1(t, x, θ) = (α/2) x² cot θ (5.164)
generates the polar-canonical transformation. This shows that
this transformation is canonical.
px = ∂1 F1 (t, x, y)
py = −∂2 F1 (t, x, y) (5.165)
−(∂2 H 0 )j (∂1 (∂2 F1 )j )i = (∂1 H)i + (∂2 H)j (∂1 (∂1 F1 )j )i + (∂1 ∂0 F1 )i
(∂1 H 0 )i − (∂2 H 0 )j (∂2 (∂2 F1 )j )i = (∂2 H)j (∂2 (∂1 F1 )j )i + (∂2 ∂0 F1 )i
(5.168)
17
Here we use indices to select particular components of structured objects.
If an index symbol appears both as a superscript and as a subscript in an
expression, the value of the expression is the sum over all possible values of the
index symbol of the designated components (Einstein summation convention).
Thus, for example, if q̇ and p are of dimension n then the indicated product
p_i q̇^i is to be interpreted as Σ_{i=0}^{n−1} p_i q̇^i.
18
A structure is non-singular if the determinant of the matrix representation
of the structure is non-zero.
and let γ10 and γ20 be two paths with the same endpoints. Then
G_t(γ′2) − G_t(γ′1) = ∮_{∂R} Σ_i p_i dq^i − ∮_{∂R′} Σ_i p′_i dq′^i
= 0. (5.177)
p = fp (t, q, q 0 )
p0 = fp0 (t, q, q 0 ) (5.181)
The function F1 has the same value as F but has different argu-
ments. We will show that this F1 is in fact the generating function
for canonical transformations introduced in section 5.6. Let’s be
explicit about the definition of F1 in terms of a line integral
The two line integrals can be combined into this one because they
are both expressed as integrals along a curve in (q, q 0 ).
We can use the path independence of F1 to compute the par-
tial derivatives of F1 with respect to particular components and
19
Point transformations are not in this class: we cannot solve for the momenta
in terms of the positions for point transformations, because for a point trans-
formation the primed and unprimed coordinates can be deduced from each
other, so there is not enough information in the coordinates to deduce the
momenta.
and
These are just the configuration and momentum parts of the gen-
erating function relations for canonical transformation. So start-
ing with a canonical transformation, we can find a generating
function that gives the coordinate-momentum part of the trans-
formation through its derivatives.
Starting from a general canonical transformation, we have con-
structed an F1 generating function from which the canonical trans-
20
Let F be defined as the path-independent line integral
F(x) = ∫_{x0}^{x} Σ_i f_i(x) dx^i + F(x0);
then
∂_i F(x) = f_i(x).
The partial derivatives of F do not depend on the constant point x0 or the
path from x0 to x, so we can choose a path that is convenient for evaluating
each partial derivative. Let
H(x)(∆x^i) = F(x^0, . . . , x^i + ∆x^i, . . . , x^{n−1}) − F(x^0, . . . , x^i, . . . , x^{n−1}).
The partial derivative of F with respect to the ith component of x is
∂_i F(x) = D(H(x))(0).
The function H is defined by the line integral
H(x)(∆x^i) = ∫_{(x^0,...,x^i,...,x^{n−1})}^{(x^0,...,x^i+∆x^i,...,x^{n−1})} Σ_j f_j(x) dx^j
= ∫ f_i(x) dx^i,
where the second equality follows because the line integral is along the coordinate
direction x^i. This is now an ordinary integral, so
∂_i F(x) = f_i(x).
The minus sign arises because by flipping the axes we are travers-
ing the area in the opposite sense. Repeating the argument just
given, we can define a function
F(t, q′, p′) − F(t, q′0, p′0) = ∫_{γ=C(t,γ′)} Σ_i p_i dq^i + ∫_{γ′} Σ_i q′_i dp′_i,  (5.187)

q′ = f_{q′}(t, q, p′)
p = f_p(t, q, p′)  (5.188)
and define
and
21
There may be some singular cases and topological problems that prevent
this from being rigorously true.
as are F1 and F2
p = ∂1F2(t, q, p′).  (5.200)

p = ∂1F1(t, q, q′)  (5.202)
p′ = −∂2F1(t, q, q′)  (5.203)
H′(t, q′, p′) − H(t, q, p) = ∂0F1(t, q, q′)  (5.204)

p = ∂1F2(t, q, p′)  (5.205)
q′ = ∂2F2(t, q, p′)  (5.206)
H′(t, q′, p′) − H(t, q, p) = ∂0F2(t, q, p′)  (5.207)
and
22
The various generating functions are traditionally known by the names: F1 ,
F2 , F3 , and F4 . Please don’t blame us.
5.6.3 Classes of Generating Functions 369
The relations between the coordinates and the momenta are the
same as before. We also have
with
so
F1(t, q, q′) = F2(t, q, p′) − p′q′
            = p′S(t, q) − p′q′
            = 0.  (5.227)

x = r cos θ  (5.228)
y = r sin θ.

(x, y) = ∂2F2(t; r, θ; px, py) = (r cos θ, r sin θ)
[pr, pθ] = ∂1F2(t; r, θ; px, py) = [px cos θ + py sin θ, −px r sin θ + py r cos θ].  (5.230)
r′ = r
θ′ = θ − Ωt,  (5.232)

r′ = r
θ′ = θ − Ωt
pr = p′r
pθ = p′θ,  (5.234)
which show that the momenta are the same in both coordinate
systems. However, here the Hamiltonian is not a simple composi-
tion:
Two-body problem
In this example we illustrate how canonical transformations can
be used to eliminate some of the degrees of freedom, leaving an
essential problem with fewer degrees of freedom.
Suppose only certain combinations of the coordinates appear in
the Hamiltonian. We make a canonical transformation to a new
set of phase-space coordinates such that these combinations of
the old phase space coordinates are some of the new phase space
coordinates. We choose other independent combinations of the
coordinates to complete the set. The advantage is that these other
independent coordinates do not appear in the new Hamiltonian,
so the momenta conjugate to them are conserved quantities.
Let’s see how this idea lets us reduce the problem of two gravi-
tating bodies to the simpler problem of the relative motion of the
two bodies, and in the process discover that the momentum of the
center of mass is conserved.
Consider the motion of two masses m1 and m2 , subject only to
a mutual gravitational attraction described by the potential V (r).
This problem has six degrees of freedom. The rectangular coor-
dinates of the particles are x1 and x2 , with conjugate momenta
p1 and p2 . Each of these is a structure of the three rectangular
components. The distance between the particles is r = kx1 − x2 k.
The Hamiltonian for the two-body problem is:
H(t; x1, x2; p1, p2) = p1²/(2m1) + p2²/(2m2) + V(r).  (5.236)
We do not need to further specify V at this point.
x = x2 − x1 (5.237)
and
1/M = a²/m1 + b²/m2.  (5.245)
We recognize µ as the usual “reduced mass.”
Notice that if the term proportional to pP were not present
then the x and X degrees of freedom would not be coupled at all,
and furthermore, the X part of the Hamiltonian would be just
the Hamiltonian of a free particle which is trivial to solve. The
condition that the “cross terms” disappear is
b/m2 − a/m1 = 0,  (5.246)
which is satisfied by
a = cm1 (5.247)
b = cm2 (5.248)
with
Hx(t, x, p) = p²/(2µ) + V(r)  (5.250)

and

HX(t, X, P) = P²/(2M).  (5.251)

The reduced mass is the same as before, and now

M = 1/(c²(m1 + m2)).  (5.252)
Notice that without further specifying c the problem has been
separated into the problem of determining the relative motion of
the two masses, and the problem of the other degrees of freedom.
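A quick numeric check (a Python sketch, not the book's Scheme) that with a = cm1 and b = cm2 the kinetic energy separates as claimed. The momentum relations p1 = −p + aP and p2 = p + bP are assumed here, consistent with a generating function of the form F2 = p(x2 − x1) + P(a x1 + b x2):

```python
import random

# With a = c*m1, b = c*m2, the kinetic energy p1^2/2m1 + p2^2/2m2 should
# equal p^2/2mu + P^2/2M, where p1 = -p + a*P and p2 = p + b*P (assumed
# momentum relations), mu is the reduced mass, and M is given by (5.252).
m1, m2, c = 3.0, 5.0, 0.7
a, b = c * m1, c * m2
mu = 1.0 / (1.0 / m1 + 1.0 / m2)      # reduced mass
M = 1.0 / (c * c * (m1 + m2))         # equation (5.252)

random.seed(1)
for _ in range(10):
    p, P = random.uniform(-1, 1), random.uniform(-1, 1)
    p1, p2 = -p + a * P, p + b * P
    ke_old = p1 * p1 / (2 * m1) + p2 * p2 / (2 * m2)
    ke_new = p * p / (2 * mu) + P * P / (2 * M)
    assert abs(ke_old - ke_new) < 1e-12   # cross terms cancel
```

The cross terms cancel exactly because b/m2 − a/m1 = c − c = 0, as in equation (5.246).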
We did not need to have a priori knowledge that the center of
and
V(t; x0, x1, ..., xn−1; p0, p1, ..., pn−1) = Σ_{i<j} fij(‖xi − xj‖),  (5.255)
for some constants m0i , and that the potential V can be written solely
in terms of the Jacobi coordinates x0i with indices i > 0.
c. Are there any other canonical transformations that isolate the center
of mass and leave the kinetic energy as a sum of squares of momenta?
Epicyclic motion
It is often useful to compose a sequence of canonical transforma-
tions to make up the transformation we need for any particular
mechanical problem. The transformations we have supplied are
especially useful as components in these computations.
We will illustrate the use of canonical transformations to learn
about planar motion in a central field. The strategy will be to
consider perturbations of circular motion in the central field. The
analysis will proceed by transforming to a rotating coordinate sys-
tem that rides on a circular reference orbit, and then to make ap-
proximations that restrict the analysis to orbits that differ from
the circular orbit only slightly.
Recall that in rectangular coordinates we could easily write a
Hamiltonian for the motion of a particle of mass m in a field
defined by a potential energy that is only a function of the distance
from the origin as follows:
H(t; x, y; px, py) = (px² + py²)/(2m) + V(√(x² + y²)).  (5.262)
x = r cos φ  (5.267)
y = r sin φ  (5.268)

H′(t; r, φ; pr, pφ) = pr²/(2m) + pφ²/(2mr²) + V(r).  (5.269)
We can now write Hamilton’s equations in these new coordinates,
and they are much more illuminating than the equations expressed
in rectangular coordinates:
Dr = pr/m  (5.270)
Dφ = pφ/(mr²)  (5.271)
Dpr = pφ²/(mr³) − DV(r)  (5.272)
Dpφ = 0.  (5.273)
r′ = r  (5.275)
φ′ = φ − Ωt  (5.276)
p′r = pr  (5.277)
p′φ = pφ.  (5.278)
Using the formulas developed in the last section we can now write
the new Hamiltonian directly:
H″(t; r′, φ′; p′r, p′φ) = p′r²/(2m) + p′φ²/(2mr′²) + V(r′) − p′φΩ.  (5.279)
We see that H 00 is not time dependent, and therefore it is con-
served, but it is not energy. Energy is not conserved in the moving
coordinate system, but what is conserved here is a new quantity
which combines the energy with the product of the angular mo-
mentum of the particle in the new frame and the angular velocity
of the frame. We will want to keep track of this term.
Next, we return to rectangular coordinates, but they are rotat-
ing with the reference circular orbit:
x′ = r′ cos φ′  (5.280)
y′ = r′ sin φ′  (5.281)
p′x = p′r cos φ′ − (p′φ/r′) sin φ′  (5.282)
p′y = p′r sin φ′ + (p′φ/r′) cos φ′.  (5.283)
The Hamiltonian is
ξ = x′ − R0  (5.285)
η = y′  (5.286)
pξ = p′x  (5.287)
pη = p′y.  (5.288)

H′′′′(t; ξ, η; pξ, pη) = (pξ² + pη²)/(2m) + Ω(ηpξ − (ξ + R0)pη)
                      + V(√((ξ + R0)² + η²)),  (5.289)

V(√((ξ + R0)² + η²)) = V(R0) + DV(R0)(ξ + η²/(2R0))
                     + D²V(R0) ξ²/2 + ···.  (5.292)
So the (negated) generalized forces are:
D³ξ + ω²Dξ = 0,  (5.301)
where
ω² = 3Ω² + D²V(R0)/m.  (5.302)
Thus we have a simple harmonic oscillator with frequency ω as
one of the components of the solution. The general solution has
three parts:

[ ξ(t) ]        [ 0 ]        [   1   ]        [     sin(ωt + ϕ0)     ]
[ η(t) ] = η0 [ 1 ] + ξ0 [ −2At ] + C0 [ (2Ω/ω) cos(ωt + ϕ0) ]   (5.303–5.305)

where

A = (Ω²m − D²V(R0)) / (4Ωm).  (5.306)
The constants η0 , ξ0 , C0 , and ϕ0 are determined by the initial
conditions. If C0 = 0 the particle of interest is on a circular trajec-
tory, but not necessarily the same one as the reference trajectory.
If C0 = 0 and ξ0 = 0 we have a “fellow traveler”, a particle in
the same circular orbit as the reference orbit, but with different
phase. If C0 = 0 and η0 = 0 we have a particle in a circular orbit
that is interior or exterior to the reference orbit and shearing away
from the reference orbit. The shearing is due to the fact that the
angular velocity for a circular orbit varies with the radius. The
constant A gives the rate of shearing at each radius. If both η0 = 0
and ξ0 = 0 but C0 ≠ 0 then we have "epicyclic motion". A particle
in a nearly circular orbit may be seen to move in an ellipse around
the circular reference orbit. The ellipse will be elongated in the
direction of circular motion by the factor 2Ω/ω and it will rotate
in the direction opposite the direction of the circular motion. The
initial phase of the epicycle is ϕ0 . Of course, any combination of
these solutions may exist.
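The geometry of the pure epicyclic component can be checked directly: sampling the C0 term of the solution shows the point staying on an ellipse whose along-track semi-axis is larger by the factor 2Ω/ω. A Python sketch with made-up parameter values (not from the text):

```python
import math

# Pure epicyclic component (eta0 = xi0 = 0, C0 != 0) of the solution:
# xi = C0 sin(omega t + phi0), eta = C0 (2 Omega/omega) cos(omega t + phi0).
# The point should stay on an ellipse with semi-axes C0 and 2*Omega*C0/omega.
Omega, omega, C0, phi0 = 1.0, 1.5, 0.1, 0.3
for k in range(100):
    t = 0.1 * k
    xi = C0 * math.sin(omega * t + phi0)
    eta = C0 * (2 * Omega / omega) * math.cos(omega * t + phi0)
    assert abs((xi / C0)**2 + (eta * omega / (2 * Omega * C0))**2 - 1.0) < 1e-12
```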
The epicyclic frequency ω and the shearing rate A are deter-
mined by the force law (the radial derivative of the potential en-
ergy). For a force law proportional to a power of the radius
F ∝ r^(1−n)  (5.307)

n:     0    1     2     3    4    5
A/Ω:   0   1/4   1/2   3/4   1   5/4
ω/Ω:   2   √3    √2    1    0   ±i
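The table follows from equations (5.302) and (5.306) once we note that a circular orbit with F ∝ r^(1−n) has DV(R0) = mΩ²R0, so D²V(R0) = (1 − n)mΩ². A small Python check (not from the text):

```python
import math

# For F ∝ r^(1-n): D^2 V(R0) = (1-n) m Omega^2 on a circular orbit.
# Substituting into (5.302) and (5.306) gives A/Omega = n/4 and
# omega/Omega = sqrt(4 - n).  (n = 5 gives imaginary omega; loop n = 0..4.)
m, Omega = 1.0, 1.0
for n in range(5):
    d2V = (1 - n) * m * Omega**2
    omega = math.sqrt(3 * Omega**2 + d2V / m)     # equation (5.302)
    A = (Omega**2 * m - d2V) / (4 * Omega * m)    # equation (5.306)
    assert abs(A / Omega - n / 4.0) < 1e-12
    assert abs(omega / Omega - math.sqrt(4.0 - n)) < 1e-12
```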
We can get some insight into the kinds of orbits that are pro-
duced by the epicyclic approximation by examining a few exam-
ples. For some force laws we have integer ratios of epicyclic fre-
quency to orbital frequency. In those cases we have closed orbits.
For an inverse-square force law (n = 3) we get elliptical orbits
with the center of the field at a focus of the ellipse. Figure 5.3
shows how an approximation to such an orbit can be constructed
by superposition of the motion on an elliptical epicycle with the
motion of the same frequency on a circle. If the force is propor-
tional to the radius (n = 0) we get a two-dimensional harmonic
oscillator. Here the epicyclic frequency is twice the orbital fre-
quency. Figure 5.4 shows how this yields elliptical orbits that are
centered on the source of the central force. An orbit is closed
when ω/Ω is a rational fraction. If the force is proportional to the
−3/4 power of the radius the epicyclic frequency is 3/2 the or-
bital frequency. This yields a 3-lobed pattern that can be seen
in figure 5.5. For other force laws the orbits predicted by this
analysis are multi-lobed patterns produced by precessing approx-
imate ellipses. Most of the cases have incommensurate epicyclic
and orbital frequencies, leading to orbits that do not close in finite
time.
The epicyclic approximation gives a very good idea of what ac-
tual orbits look like. Figure 5.6, drawn by numerical integration
of the orbit produced by integrating the original rectangular equa-
tions of motion for a particle in the field, shows the rosette-type
picture characteristic of incommensurate epicyclic and orbital fre-
quencies for an F = −r^(−2.3) force law.
We can directly compare a numerically integrated system with
one of our epicyclic approximations. For example the result of
numerically integrating our F ∝ r^(−3/4) system is very similar to
then the Lagrange equations of motion are the same. The gener-
alized coordinates used in the two Lagrangians are the same, but
the momenta conjugate to the coordinates are different. In the
usual way, define
and
So we have
q′ = ∂2F2(t, q, p′) = q  (5.316)
p = ∂1F2(t, q, p′) = p′ − ∂1G(t, q)  (5.317)
H′(t, q′, p′) = H(t, q, p) + ∂0F2(t, q, p′)
             = H(t, q, p) − ∂0G(t, q).  (5.318)
H(t, x, p) = p²/(2m) + V(x),  (5.320)

then the transformed Hamiltonian is

H′(t, x′, p′) = (p′ − ∂1G(t, x′))²/(2m) + V(x′) − ∂0G(t, x′).  (5.321)
We see that this transformation may be used to modify terms in
the Hamiltonian that are linear in the momenta. Starting from H
the transformation introduces linear momentum terms; starting
from H 0 the transformation eliminates the linear terms.
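One way to see that the momentum-shift transformation (5.316–5.317) is harmless is that it is a shear in phase space with unit Jacobian determinant, so it preserves phase-space area. A numeric Python sketch (not from the text) with a hypothetical G(t, q) = tq³:

```python
# The coordinate part of the transformation is the identity and the momentum
# is shifted by the q-derivative of G; such a shear has Jacobian determinant 1.
def d1G(t, q):
    return 3 * t * q * q        # partial of the made-up G(t, q) = t q^3

def transform(t, q, p):
    return q, p + d1G(t, q)     # q' = q, p' = p + d1G(t, q)

t, q, p, h = 0.5, 1.2, -0.3, 1e-6
# Jacobian of (q, p) -> (q', p') by central finite differences
dqp_dq = [(a - b) / (2 * h) for a, b in
          zip(transform(t, q + h, p), transform(t, q - h, p))]
dqp_dp = [(a - b) / (2 * h) for a, b in
          zip(transform(t, q, p + h), transform(t, q, p - h))]
det = dqp_dq[0] * dqp_dp[1] - dqp_dq[1] * dqp_dp[0]
assert abs(det - 1.0) < 1e-6
```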
We illustrate the use of this transformation with the driven
pendulum. The Hamiltonian for the driven pendulum was derived
automatically in section 3.1.1. We repeat the result here (cleaned
up a bit)
H(t, θ, pθ) = pθ²/(2ml²) − glm cos θ
            + gm ys(t) − (pθ/l) sin θ Dys(t) − (m/2)(cos θ)²(Dys(t))²,  (5.322)
where ys is the drive function. The Hamiltonian is rather messy,
and includes a term that is linear in the angular momentum with
a coefficient that depends on both the angular coordinate and the
time. Let’s see what happens if we apply our transformation to
the problem to eliminate the linear term. We can identify the
transformation function G by requiring that the linear term in
momentum is killed:
H′(t, θ, p′θ) = (p′θ)²/(2ml²) − ml(g + D²ys) cos θ.  (5.326)
So we have found, by a straightforward canonical transformation,
a Hamiltonian for the driven pendulum with the rather simple
form of a pendulum with gravitational acceleration that is mod-
ified by the acceleration of the pivot. It is, in fact, the Hamilto-
nian that corresponds to the alternate form of the Lagrangian for
the driven pendulum we found earlier by inspection (see equation
1.120). Here the derivation is by a simple canonical transforma-
tion, motivated by a desire to eliminate unwanted terms that are
linear in the momentum.
Show that these transformations are just the point transformations, and
that the corresponding F1 is zero.
b. Other linear canonical transformations can be generated by
F1(t; x1, x2; x′1, x′2) = x′1 a x1 + x′1 b x2 + x′2 c x1 + x′2 d x2.

F1(t; x1, x2; x′1, x′2) = x′1 a x1 + x′1 b x2 + x′2 c x1 + x′2 d x2,
and the general parallelogram, with a vertex at the origin and with
adjacent sides starting at the origin and extending to the phase-space
points (x1a , x2a , p1a , p2a ) and (x1b , x2b , p1b , p2b ).
a. Find the area of the given parallelogram, and find the area of the
target parallelogram under the canonical transformation. Notice that
the area of the parallelogram is not preserved.
b. Find the areas of the projections of the given parallelogram, and the
areas of the projections of the target under canonical transformation.
Show that the sum of the areas of the projections on the action-like
planes is preserved.
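The invariance asserted in part b can be checked numerically for a simple linear symplectic map. The Python sketch below (not from the text) uses a point transformation q′ = Sq, p′ = (S⁻¹)ᵀp, for which the sum of oriented projected areas on the canonical planes is exactly preserved:

```python
# omega(z1, z2) = sum_i (q1_i p2_i - p1_i q2_i) is the sum of oriented
# projected areas on the (q_i, p_i) planes; a symplectic map leaves it fixed.
s11, s12, s21, s22 = 2.0, 1.0, 0.5, 1.5
det = s11 * s22 - s12 * s21
# entries of the inverse-transpose of S
it11, it12, it21, it22 = s22 / det, -s21 / det, -s12 / det, s11 / det

def apply(z):
    q1, q2, p1, p2 = z
    return (s11 * q1 + s12 * q2, s21 * q1 + s22 * q2,
            it11 * p1 + it12 * p2, it21 * p1 + it22 * p2)

def omega(z1, z2):
    return (z1[0] * z2[2] - z1[2] * z2[0]) + (z1[1] * z2[3] - z1[3] * z2[1])

z1 = (0.3, -1.0, 0.7, 0.2)
z2 = (1.1, 0.4, -0.5, 0.9)
assert abs(omega(apply(z1), apply(z2)) - omega(z1, z2)) < 1e-12
```

Note that the individual projected areas change under the map; only their sum is invariant.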
that is obtained in the following way. Let σ(t) = (t, q̄(t), p̄(t)) be a
solution of Hamilton’s equations. The transformation C∆ satisfies
23
Many texts further muddy the matter by introducing an unjustified independence
argument here: they argue that because q̇ and q̇′ are independent the
relations (5.148–5.150) must hold. This is silly, because p and p′ are functions
of q̇ and q̇′, respectively, so there are implied dependencies of the velocities
in many places, so it is unjustified to separately set pieces of this equation to
zero. However, notwithstanding this problem, the derivation of the fact that
the transformation is canonical is fallacious.
or, equivalently,
q′ = q̄(t′)
p′ = p̄(t′).  (5.334)

The value (t, q, p) of C∆(t′, q′, p′) is then (t′ + ∆, q̄(t′ + ∆), p̄(t′ + ∆)).
Time evolution is canonical if the transformation C∆ is symplec-
tic and if the Hamiltonian transforms in an appropriate manner.
The transformation C∆ is symplectic if the bilinear antisymmet-
ric form ω is invariant (see equation 5.73) for a general pair of
linearized state variations with zero time component.
Let ζ′ be an increment with zero time component of the state
(t′, q′, p′). The linearized increment in the value of C∆(t′, q′, p′) is
24
Our theorems about which transformations are canonical are still valid, be-
cause they only required that the derivative of the independent variable be 1.
5.7 Time Evolution is Canonical 393
DA(t) = 0. (5.341)
25
Partial derivatives of structured arguments do not generally commute, so
this deduction is not as simple as it may appear. It is helpful to introduce
component indices and consider the equation componentwise.
C′∆ = C∆ ∘ S−∆,  (5.342)
26
The transformation S∆ is an identity on the qp components, so it is symplec-
tic. Although it adjusts the time, it is not a time-dependent transformation
in that the qp components do not depend upon the time. Thus, if we adjust
the Hamiltonian by composition with S∆ we have a canonical transformation.
or

H′∆ = H ∘ S∆.  (5.348)

Notice that if H is time independent then H′∆ = H.
Let us assume we have a procedure ((C delta-t) state) that
implements a time-evolution transformation of the state state
with time interval delta-t.
We can get a procedure ((Cp delta-t) state) that implements C′∆
from the ((C delta-t) state) that implements C∆ using the procedure
(define ((C->Cp C) delta-t)
(compose (C delta-t) (shift-t (- delta-t))))
For C∆ the Hamiltonian is unchanged. For C′∆ the Hamiltonian is
time-shifted.
The general solution for a given initial state (t0, q0, p0) evolved for a time
∆ is

[ q(t0 + ∆)    ]   [  cos ω0∆   sin ω0∆ ] [ q0 − α′ cos ωt0           ]
[ p(t0 + ∆)/ω0 ] = [ −sin ω0∆   cos ω0∆ ] [ (1/ω0)(p0 + α′ω sin ωt0) ]

                     [  α′ cos ω(t0 + ∆)        ]
                   + [ −α′(ω/ω0) sin ω(t0 + ∆)  ]

where α′ = α/(ω0² − ω²).
a. Fill in the details of the procedure
(define (((C alpha omega omega0) delta-t) state)
... )
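A Python sketch of such a procedure (not the book's Scheme, and assuming the driven-oscillator Hamiltonian H = p²/2 + ω0²q²/2 − αq cos ωt, which the exercise does not restate here), checked against Hamilton's equations by finite differences:

```python
import math

# Closed-form flow of the driven oscillator, built from the general solution:
# rotate the deviation from the particular solution alpha' cos(omega t).
def C(alpha, omega, omega0):
    alphap = alpha / (omega0**2 - omega**2)   # alpha'
    def flow(dt):
        def step(state):
            t0, q0, p0 = state
            a = q0 - alphap * math.cos(omega * t0)
            b = (p0 + alphap * omega * math.sin(omega * t0)) / omega0
            cs, sn = math.cos(omega0 * dt), math.sin(omega0 * dt)
            q = cs * a + sn * b + alphap * math.cos(omega * (t0 + dt))
            p = omega0 * (-sn * a + cs * b) \
                - alphap * omega * math.sin(omega * (t0 + dt))
            return (t0 + dt, q, p)
        return step
    return flow

# check Dq = p and Dp = -omega0^2 q + alpha cos(omega t) at the initial state
alpha, omega, omega0 = 0.5, 2.0, 3.0
t0, q0, p0 = 0.4, 1.0, -0.2
h = 1e-6
_, qp, pp = C(alpha, omega, omega0)(h)((t0, q0, p0))
_, qm, pm = C(alpha, omega, omega0)(-h)((t0, q0, p0))
assert abs((qp - qm) / (2 * h) - p0) < 1e-6
assert abs((pp - pm) / (2 * h)
           - (-omega0**2 * q0 + alpha * math.cos(omega * t0))) < 1e-6
```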
D(F ◦ σ) = ∂0 F ◦ σ + {F, H} ◦ σ
Show, by writing a short program to test it, that this is true of the
function implemented by (C delta) for the driven oscillator. Why is
this interesting?
d. Verify that both C and Cp are symplectic using symplectic?.
e. Use the procedure canonical? to verify that both C and Cp are canon-
ical with the appropriate transformed Hamiltonian.
sum of
the oriented projected areas for R′. We will show that
Σ_i Ai = Σ_i A′i, and thus the Poincaré integral invariant is pre-
served by time evolution. By showing that the Poincaré integral
invariant is preserved we will have shown that the qp part of the
transformation generated by time evolution is symplectic. From
this we can construct canonical transformations from time evolu-
tion as before.
In the extended phase space we see that the evolution sweeps
out a cylindrical volume with endcaps the regions R′ and R, each
at a fixed time. Let R″ be the two-dimensional region swept out
by the trajectories that map the boundary of region R′ to the
27
By Stokes’ theorem we may compute the area of a region by a line integral
around the boundary of the region. We define the positive sense of the area
to be the area enclosed by a curve that is traversed in a counterclockwise
direction, when drawn on a plane with the coordinate on the abscissa and the
momentum on the ordinate.
[Figure: time evolution in the extended phase space sweeps out a cylinder
along the time axis, with endcap R′ at (t′, q′, p′), endcap R at (t, q, p),
and lateral surface R″ swept out by the boundary trajectories.]
28
We can see this is the following way. Let γ be any closed curve in the
boundary. This curve divides the boundary into two regions. By Stokes’
theorem the integral invariant over both of these pieces can be written as a
line integral along this boundary, but they have opposite signs, because γ is
traversed in opposite directions to keep the surface on the left. So we conclude
that the integral invariant over the entire surface is zero.
5.7.1 Another View of Time Evolution 399
The ω form applied to these incremental states that form the edges
of this parallelogram gives the area of the parallelogram:
ω(ζ1, ζ2) = Q(ζ1)P(ζ2) − P(ζ1)Q(ζ2)
  = (∆q, 0) · (−∂1H(t, q, p)∆t, −∂0H(t, q, p)∆t)
  − (∆p, −∂1H(t, q, p)∆q − ∂2H(t, q, p)∆p) · (∂2H(t, q, p)∆t, ∆t)
  = 0.  (5.356)
the section this may take a different amount of time. Compute the
sum of the areas again for the mapped region. Again, all points
of the mapped region have the same q2 so the area on the (q2 , p2 )
plane is zero, and they continue to have the same energy so the
area on the (t, T ) plane is zero. So the area of the mapped re-
gion is again just the area on the surface of section, the (q1 , p1 )
plane. Time evolution preserves the sum of areas, so the area on
the surface of section is the same as the mapped area.
So surfaces of section preserve area provided that the section
points are entirely on a canonical plane. For example, for the
Hénon-Heiles surfaces of section we plotted py versus y when x = 0
with px ≥ 0. So for all section points the x coordinate has the
fixed value 0, the trajectories all have the same energy, and the
points accumulated are entirely in the (py , y) canonical plane. So
the Hénon-Heiles surfaces of section preserve area.
Recall that p and η are structures, and the product implies a sum
of products of components.
be the value of the action from t1 to t2 for path q̃(s). The deriva-
tive of the action along this parametric family of paths is 29
DS̃(s) = δ_{η̃(s)} S[q̃(s)]
      = (∂2L ∘ Γ[q̃(s)]) η̃(s) |_{t1}^{t2} − ∫_{t1}^{t2} (E[L] ∘ Γ[q̃(s)]) η̃(s),  (5.361)
where
29
Let f be a path dependent function, η̃(s) = Dq̃(s), and g(s) = f [q̃(s)]. The
variation of f at q̃(s) in the direction η̃(s) is δη̃(s) f [q̃(s)] = Dg(s).
5.8 Hamilton-Jacobi Equation 403
For a loop family of paths (such that q̃(s2 ) = q̃(s1 )), the differ-
ence of actions at the endpoints vanishes, so we deduce
∮_{γ2} Σ_i pi dq^i = ∮_{γ1} Σ_i pi dq^i,  (5.366)
where R_j^i are the regions in the ith canonical plane. We have found
that the time evolution preserves the integral invariants, thus time
evolution generates a canonical transformation.
q′ = ∂2F2(t, q, p′)  (5.368)
p = ∂1F2(t, q, p′)  (5.369)
H′(t, q′, p′) = H(t, q, p) + ∂0F2(t, q, p′).  (5.370)
and are able to solve for W then the problem is essentially solved.
In this case, the primed momenta are all constant, and the primed
positions are linear in time. This is an alternate form of the
Hamilton-Jacobi equation.
These forms are related. Suppose that we have a W that sat-
isfies the second form of the Hamilton-Jacobi equation (5.372).
Then the F2 constructed from W
so the primed momenta are the same in the two formulations. But
q′ = ∂2F2(t, q, p′)
   = ∂2W(t, q, p′) − DE(p′)t
   = q″ − DE(p′)t,  (5.375)
H(t, x, p) = p²/(2m) + kx²/2.  (5.377)
We form the Hamilton-Jacobi equation for this problem
x′ = ∂2W(t, x, p′)
   = ∫^x m DE(p′) / √(2m(E(p′) − kz²/2)) dz  (5.383)
with solution
for initial conditions x′0 and p′0. If we plug these expressions for
x′(t) and p′(t) into equation (5.385) we find

x(t) = √(2E(p′)/k) sin[ (1/DE(p′)) √(k/m) (DE(p′)t + x′0 − C(p′)) ]
     = √(2E(p′)/k) sin[ √(k/m) (t − t0) ]
     = A sin(ωt + φ),  (5.389)
5.8.1 Harmonic Oscillator 407
where the angular frequency is ω = √(k/m), the amplitude is
A = √(2E(p′)/k), and the phase is φ = −ωt0 = ω(x′0 − C(p′))/DE(p′).
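As a sanity check, the recovered solution indeed satisfies the oscillator equation of motion. A small numeric Python verification (not part of the text):

```python
import math

# Check that x(t) = A sin(omega t + phi), omega = sqrt(k/m), satisfies
# m D^2 x = -k x, using a central second difference.
m, k, A, phi = 2.0, 3.0, 0.7, 0.4
omega = math.sqrt(k / m)
x = lambda t: A * math.sin(omega * t + phi)
t, h = 1.1, 1e-4
d2x = (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)
assert abs(m * d2x + k * x(t)) < 1e-5
```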
We can also use F2 = W − Et as the generating function. The
new Hamiltonian is zero, so both x0 and p0 are constant, but the
relationship between the old and new variables is
x′ = ∂2F2(t, x, p′)
   = ∂2W(t, x, p′) − DE(p′)t
   = ∫^x m DE(p′) / √(2m(E(p′) − kz²/2)) dz − DE(p′)t
   = DE(p′) √(m/k) sin⁻¹( √(k/(2E(p′))) x ) + C(p′) − DE(p′)t.  (5.390)
L′(t, x′, ẋ′) = ẋ′²/(2ω).  (5.400)
Of course, there may be additional properties that make one choice
more useful than others for particular applications.
H(t; x, y, z; px, py, pz) = p²/(2m) − µ/r,  (5.401)

where r² = x² + y² + z² and p² = px² + py² + pz². The Kepler problem
describes the relative motion of two bodies; it is also encountered
in the formulation of other problems involving orbital motion such
as the n-body problem.
We try a generating function of the form W (t; x, y, z; p0x , p0y , p0z ).
The Hamilton-Jacobi equation is then30
E(p′) = (1/2m) [ (∂1,0W(t; x, y, z; p′x, p′y, p′z))²
               + (∂1,1W(t; x, y, z; p′x, p′y, p′z))²
               + (∂1,2W(t; x, y, z; p′x, p′y, p′z))² ] − µ/r.  (5.402)
30
Remember that ∂1,0 means the derivative with respect to the first coordinate
position.
W(t; r, θ, φ; p′1, p′2, p′3) = f(r, θ, p′1, p′2, p′3) + p′3 φ,  (5.405)

then ∂1,2W(t; r, θ, φ; p′1, p′2, p′3) = p′3, and then φ does not appear
in the remaining equation for f :
Any function of the p0i could have been used as the coefficient of
φ in the generating function. This particular choice has the nice
feature that p03 is the z component of the angular momentum.
We can eliminate the θ dependence if we choose
f (r, θ, p01 , p02 , p03 ) = R(r, p01 , p02 , p03 ) + Θ(θ, p01 , p02 , p03 ) (5.407)
5.8.2 Kepler Problem 411
(∂0Θ(θ, p′1, p′2, p′3))² + (p′3)²/sin²θ = (p′2)².  (5.408)
We are free to choose the right-hand side to be any function of
the new momenta. This choice reflects the fact that the left-hand
side is non-negative. It turns out that p02 is the total angular
momentum. This equation for Θ can be solved by quadrature.
The remaining equation that determines R is
E(p′1, p′2, p′3) = (1/2m) [ (∂1,0R(r, p′1, p′2, p′3))² + (1/r²)(p′2)² ] − µ/r,  (5.409)
which also can be solved by quadrature.
Altogether the solution of the Hamilton-Jacobi equation reads
W(r, θ, φ, p′1, p′2, p′3) = ∫^r ( 2mE(p′1, p′2, p′3) + 2mµ/r − (p′2)²/r² )^(1/2) dr
                          + ∫^θ ( (p′2)² − (p′3)²/sin²θ )^(1/2) dθ
                          + p′3 φ.  (5.410)
E(p′1, p′2, p′3) = −mµ²/(2(p′1)²).  (5.415)

H′(t; q′1, q′2, q′3; p′1, p′2, p′3) = E(p′1, p′2, p′3) = −mµ²/(2(p′1)²).  (5.416)
Thus

q′1 = nt + q′10  (5.417)
q′2 = q′20  (5.418)
q′3 = q′30,  (5.419)
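The mean motion n = DE(p′1) appearing here can be checked against Kepler's third law: with p′1 = √(mµa) one finds n²a³ = µ/m. A numeric Python sketch (not from the text):

```python
import math

# E(p1') = -m mu^2 / (2 p1'^2); at p1' = sqrt(m mu a) the mean motion
# n = DE(p1') satisfies n^2 a^3 = mu/m (Kepler's third law in this notation).
m, mu, a = 2.0, 5.0, 3.0
p1 = math.sqrt(m * mu * a)
E = lambda p: -m * mu**2 / (2 * p * p)
h = 1e-6
n = (E(p1 + h) - E(p1 - h)) / (2 * h)   # numerical DE(p1')
assert abs(n * n * a**3 - mu / m) < 1e-6
```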
31
The canonical phase space coordinates can be written in terms of the pa-
rameters that specify an orbit. We will just summarize the results. For further
explanation see [33] or [35].
Assume we have a bound orbit, with semimajor axis a, eccentricity e,
inclination i, longitude of ascending node Ω, argument of pericenter ω,
and mean anomaly M. The three canonical momenta are p′1 = √(mµa),
p′2 = √(mµa(1 − e²)), and p′3 = √(mµa(1 − e²)) cos i. The first momentum
is related to the energy, the second momentum is the total angular momen-
tum, and the third momentum is the component of the angular momentum
5.8.3 F2 and the Lagrangian 413
DF̃2(t) = p(t)Dq(t) − H(t, q(t), p(t)) + ∂2F2(t, q(t), p′(t))Dp′(t)
        = L(t, q(t), Dq(t)) + ∂2F2(t, q(t), p′(t))Dp′(t).  (5.421)
For variations η that are not necessarily zero at the end times
and for realizable paths q the variation of the action is
δηS[q](t1, t2) = ∂2L ∘ Γ[q] η |_{t1}^{t2}
              = p(t2)η(t2) − p(t1)η(t1).  (5.424)
Comparing equations (5.424) and (5.425), and using the fact that
the variation η is arbitrary, we find
∂1 F̄ (t1 , q1 , t2 , q2 ) = −p1
∂3 F̄ (t1 , q1 , t2 , q2 ) = p2 . (5.427)
Therefore
∂0 F̄ (t1 , q1 , t2 , q2 ) = H(t1 , q1 , p1 )
= H(t1 , q1 , −∂1 F̄ (t1 , q1 , t2 , q2 )). (5.429)
And similarly
∂2 F̄ (t1 , q1 , t2 , q2 ) = −H(t2 , q2 , p2 )
= −H(t2 , q2 , ∂3 F̄ (t1 , q1 , t2 , q2 )). (5.430)
[Diagram: a commuting square. The time evolutions C∆,H and C∆,H′ carry
(t0, q0, p0) to (t, q, p) and (t0, q′0, p′0) to (t, q′, p′); the canonical
transformation C′ε,W connects the unprimed and primed states at each time.]

H′ = H ∘ C′ε,W.  (5.435)
We will only work with Lie transforms with generators that are
independent of the independent variable.
Lie transforms of functions
The value of a phase-space function F changes if its arguments
change. We define the function E′ε,W of a function F of phase-
space coordinates (t, q, p) by

E′ε,W F = F ∘ C′ε,W.  (5.436)

We say that E′ε,W F is the Lie transform of the function F.
In particular, the Lie transform advances the coordinate and
momentum selector functions Q = I1 and P = I2:

(E′ε,W Q)(t, q′, p′) = (Q ∘ C′ε,W)(t, q′, p′) = Q(t, q, p) = q
(E′ε,W P)(t, q′, p′) = (P ∘ C′ε,W)(t, q′, p′) = P(t, q, p) = p.  (5.437)
32
In general, the generator W could depend on its independent variable. If
so, it would be necessary to specify a rule that gives the initial value of the
independent variable for the W evolution. This rule may or may not depend
upon the time. If the specification of the independent variable for the W evo-
lution does not depend on time then the resulting canonical transformation
C′ε,W is time independent and the Hamiltonians transform by composition. If
the generator W depends on its independent variable and the rule for speci-
fying its initial value depends on time, then the transformation C′ε,W is time
dependent. In this case there may need to be an adjustment to the relation
between the Hamiltonians H and H′. In the extended phase space all these
complications disappear. There is only one case. We can assume all generators
W are independent of the independent variable.
5.9 Lie Transforms 419
In terms of E′ε,W we have the canonical transformation:

q = (E′ε,W Q)(t, q′, p′)
p = (E′ε,W P)(t, q′, p′)
H′ = E′ε,W H.  (5.440)

The identity I is

I = E′0,W.  (5.443)
W (τ ; r, θ; pr , pθ ) = pθ (5.446)
33
The set of transformations E′ε,W with the operation composition and with
parameter ε is a one-parameter Lie group.
Dr = 0
Dθ = 1
Dpr = 0
Dpθ = 0  (5.447)
r = r′
θ = θ′ + ε
pr = p′r
pθ = p′θ  (5.448)
Dx = −y
Dy = x
Dz = 0
Dpx = −py
Dpy = px
Dpz = 0  (5.450)
x = x′ cos ε − y′ sin ε
y = x′ sin ε + y′ cos ε
z = z′  (5.451)

px = p′x cos ε − p′y sin ε
py = p′x sin ε + p′y cos ε
pz = p′z  (5.452)
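Numerically integrating the flow (5.450) over a finite ε should reproduce the finite rotation (5.451–5.452). A small RK4 sketch in Python (not the book's Scheme):

```python
import math

# Flow generated by the z angular momentum: rotate (x, y) and (px, py).
def deriv(s):
    x, y, z, px, py, pz = s
    return (-y, x, 0.0, -py, px, 0.0)

def rk4(s, eps, steps=1000):
    h = eps / steps
    for _ in range(steps):
        k1 = deriv(s)
        k2 = deriv(tuple(a + 0.5 * h * b for a, b in zip(s, k1)))
        k3 = deriv(tuple(a + 0.5 * h * b for a, b in zip(s, k2)))
        k4 = deriv(tuple(a + h * b for a, b in zip(s, k3)))
        s = tuple(a + h / 6 * (b + 2 * c + 2 * d + e)
                  for a, b, c, d, e in zip(s, k1, k2, k3, k4))
    return s

eps = 0.8
s0 = (1.0, 0.5, 0.3, -0.2, 0.7, 0.1)
x, y, z, px, py, pz = rk4(s0, eps)
assert abs(x - (s0[0] * math.cos(eps) - s0[1] * math.sin(eps))) < 1e-9
assert abs(py - (s0[3] * math.sin(eps) + s0[4] * math.cos(eps))) < 1e-9
assert abs(z - 0.3) < 1e-12 and abs(pz - 0.1) < 1e-12
```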
Dx = px
Dy = py
Dpx = −a(x − y) − b(x + y)
Dpy = a(x − y) − b(x + y). (5.454)
34
We are playing fast-and-loose with differential operators here. In a formal
treatment it is essential to prove that these games are mathematically well-
defined and have appropriate convergence properties.
5.10 Lie Series 423
(f t)
(* ((D f) t) epsilon)
(* 1/2 (((expt D 2) f) t) (expt epsilon 2))
(* 1/6 (((expt D 3) f) t) (expt epsilon 3))
(* 1/24 (((expt D 4) f) t) (expt epsilon 4))
(* 1/120 (((expt D 5) f) t) (expt epsilon 5))
...
(series:for-each print-expression
(((exp (* ’epsilon D)) sin) 0)
6)
0
epsilon
0
(* -1/6 (expt epsilon 3))
0
(* 1/120 (expt epsilon 5))
...
It is often instructive to expand functions we usually don't
remember, such as f(x) = √(1 + x).
(series:for-each print-expression
(((exp (* ’epsilon D))
(lambda (x) (sqrt (+ x 1))))
0)
6)
1
(* 1/2 epsilon)
(* -1/8 (expt epsilon 2))
(* 1/16 (expt epsilon 3))
(* -5/128 (expt epsilon 4))
(* 7/256 (expt epsilon 5))
...
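The printed coefficients are just the binomial series for (1 + x)^(1/2). A Python check using exact rational arithmetic (not from the text):

```python
from fractions import Fraction

# Binomial series for (1 + x)^(1/2): c_k = (1/2 choose k), built by the
# recurrence c_{k+1} = c_k * (1/2 - k) / (k + 1).
coeffs = []
c = Fraction(1)
for k in range(6):
    coeffs.append(c)
    c = c * (Fraction(1, 2) - k) / (k + 1)
assert coeffs == [Fraction(1), Fraction(1, 2), Fraction(-1, 8),
                  Fraction(1, 16), Fraction(-5, 128), Fraction(7, 256)]
```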
Dynamics
Now to play this game with dynamical functions we want to pro-
vide a derivative-like operator that we can exponentiate, which
will give us the advance operator. The key idea is to write the
derivative of the function in terms of the Poisson bracket. Equa-
tion (3.75) shows how to do this in general:
so
Dⁿ(F ∘ σ) = D_Hⁿ F ∘ σ  (5.466)

L_H F = {F, H}  (5.468)
D_H = ∂0 + L_H  (5.469)
D(F ∘ σ) = L_H F ∘ σ  (5.470)

Df = (L_H F) ∘ σ  (5.472)
D²f = (L_H² F) ∘ σ  (5.473)
···  (5.474)
35
Our LH is a special case of what is referred to as a Lie derivative in differ-
ential geometry. The more general idea is that a vector field defines a flow.
The Lie derivative of an object with respect to a vector field gives the rate of
change of the object as it is dragged along with the flow. In our case the flow
is the evolution generated by Hamilton’s equations, with Hamiltonian H.
Let’s start by examining the beginning of the Lie series for the
position of a simple harmonic oscillator of mass m and spring
constant k. Note that we make up the Lie transform (series)
operator by passing it an appropriate Hamiltonian function and
an interval to evolve for. The resulting operator is then given the
position selector procedure. The Lie transform operator returns
the new position selector procedure, that when given the phase-
space coordinates x0 and p0 returns the position selected from the
result of advancing those coordinates by the interval dt.
36
Actually, we define the Lie derivative slightly differently, as follows:
(define ((Lie-derivative-procedure H) F)
(Poisson-bracket F H))
(define Lie-derivative
(make-operator Lie-derivative-procedure ’Lie-derivative))
The reason is that we want Lie-derivative to be an operator, which is just like
a function except that the product of operators is interpreted as composition
while the product of functions is the function computing the product of their
values.
37
The Lie-transform procedure here is also defined to be an operator, just
like Lie-derivative, but in this case the operator declaration is purely formal
because the exp procedure will produce a series, and we do not currently have
a way of iterating that process.
(series:for-each print-expression
(((Lie-transform (H-harmonic ’m ’k) ’dt)
coordinate)
(up 0 ’x0 ’p0))
6)
x0
(/ (* dt p0) m)
(/ (* -1/2 (expt dt 2) k x0) m)
(/ (* -1/6 (expt dt 3) k p0) (expt m 2))
(/ (* 1/24 (expt dt 4) (expt k 2) x0) (expt m 2))
(/ (* 1/120 (expt dt 5) (expt k 2) p0) (expt m 3))
...
We should recognize the terms of this series. We start with the
initial position x0. The first-order correction (p0/m) dt is due to
the initial velocity. Next we find an acceleration term (−k x0/2m) dt²
due to the restoring force of the spring at the initial position.
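These are just the Taylor coefficients of the familiar closed-form solution x(dt) = x0 cos(ω dt) + (p0/(mω)) sin(ω dt), where ω = √(k/m). A quick numerical check of this correspondence (a Python sketch, not the book's Scheme; the parameter values are arbitrary):

```python
import math

# Arbitrary (illustrative) oscillator parameters and interval.
m, k, x0, p0, dt = 2.0, 3.0, 0.7, -0.4, 0.01
w = math.sqrt(k / m)

# The six series terms printed above, transcribed directly.
terms = [x0,
         dt * p0 / m,
         -dt**2 * k * x0 / (2 * m),
         -dt**3 * k * p0 / (6 * m**2),
         dt**4 * k**2 * x0 / (24 * m**2),
         dt**5 * k**2 * p0 / (120 * m**3)]

# Closed-form solution with x(0) = x0, p(0) = p0; the truncated
# Lie series should agree with it to high order in dt.
closed_form = x0 * math.cos(w * dt) + p0 / (m * w) * math.sin(w * dt)
assert abs(sum(terms) - closed_form) < 1e-12
```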
The Lie transform is just as appropriate for showing us how the
momentum evolves over the interval:
(series:for-each print-expression
  (((Lie-transform (H-harmonic 'm 'k) 'dt)
    momentum)
   (up 0 'x0 'p0))
  6)
p0
(* -1 dt k x0)
(/ (* -1/2 (expt dt 2) k p0) m)
(/ (* 1/6 (expt dt 3) (expt k 2) x0) m)
(/ (* 1/24 (expt dt 4) (expt k 2) p0) (expt m 2))
(/ (* -1/120 (expt dt 5) (expt k 3) x0) (expt m 2))
...
(series:for-each print-expression
  (((Lie-transform (H-harmonic 'm 'k) 'dt)
    (H-harmonic 'm 'k))
   (up 0 'x0 'p0))
  6)

The leading term of the resulting series is the initial energy,
p0²/(2m) + (1/2) k x0², and every higher-order term vanishes: the
Hamiltonian generates its own evolution, so it is conserved along
the flow.

The Lie transform applies just as well to other systems. For a
particle moving in a central potential U, with polar coordinates
(r, φ) and conjugate momenta (p_r, p_φ), the first terms of the Lie
series for the coordinates are:

(up r_0 phi_0)
(up (/ (* dt p_r_0) m)
    (/ (* dt p_phi_0) (* m (expt r_0 2))))
(up
 (+ (/ (* -1/2 ((D U) r_0) (expt dt 2)) m)
    (/ (* 1/2 (expt dt 2) (expt p_phi_0 2))
       (* (expt m 2) (expt r_0 3))))
 (/ (* -1 (expt dt 2) p_phi_0 p_r_0)
    (* (expt m 2) (expt r_0 3))))
(up
 (+ (/ (* -1/6 (((expt D 2) U) r_0) (expt dt 3) p_r_0)
       (expt m 2))
    (/ (* -1/2 (expt dt 3) (expt p_phi_0 2) p_r_0)
       (* (expt m 3) (expt r_0 4))))
 (+ (/ (* 1/3 ((D U) r_0) (expt dt 3) p_phi_0)
       (* (expt m 2) (expt r_0 3)))
    (/ (* -1/3 (expt dt 3) (expt p_phi_0 3))
       (* (expt m 3) (expt r_0 6)))
    (/ (* (expt dt 3) p_phi_0 (expt p_r_0 2))
       (* (expt m 3) (expt r_0 4)))))
...
e^A e^B ≠ e^{A+B}.  (5.477)

[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0,  (5.479)

An important identity is

e^C A e^{−C} = e^{Δ_C} A
             = A + [C, A] + (1/2)[C, [C, A]] + · · · .  (5.482)

We can check this term by term.
We see that

e^C A² e^{−C} = (e^C A e^{−C})(e^C A e^{−C}) = (e^C A e^{−C})²,  (5.483)

e^C e^A e^{−C} = e^{e^C A e^{−C}}.  (5.486)

e^{Δ_C} e^A = e^{e^{Δ_C} A}.  (5.487)
b. Consider the phase-space state functions that give the components
of the angular momentum in terms of rectangular canonical coordinates:

J_x(t; x, y, z; p_x, p_y, p_z) = y p_z − z p_y
J_y(t; x, y, z; p_x, p_y, p_z) = z p_x − x p_z
J_z(t; x, y, z; p_x, p_y, p_z) = x p_y − y p_x

Show that

[L_{J_x}, L_{J_y}] + L_{J_z} = 0.  (5.489)

c. Relate the Jacobi identity for operators to the Poisson-bracket Jacobi
identity.
5.12 Summary
Canonical transformations can be used to reformulate a problem
in coordinates that are easier to understand or that expose some
symmetry of the problem.
In this chapter we have investigated different representations
of a dynamical system. We have found that different representations
are equivalent if the coordinate-momentum part of the
transformation has a symplectic derivative and if the Hamiltonian
transforms in a specified way. If the phase-space transformation
is time-independent, then the Hamiltonian transforms by composition
with the phase-space transformation. The symplectic condition
can be equivalently expressed in terms of the fundamental
Poisson brackets. The Poisson bracket and the ω function are
invariant under canonical transformations. The invariance of ω
implies that the sum of the areas of the projections onto the fundamental
coordinate-momentum planes is preserved (the Poincaré integral
invariant) by canonical transformations.
We can formulate an extended phase space in which time is
treated as another coordinate. Time-dependent transformations
are simple in the extended phase space. In the extended phase
space the Poincaré integral invariant is the Poincaré-Cartan integral
invariant. We can also reformulate a time-independent problem
as a time-dependent problem with fewer degrees of freedom,
with one of the original coordinates taking on the role of time;
this is the reduced phase space.
A generating function is a real-valued function of the phase-space
coordinates and time that represents a canonical transformation
through its partial derivatives. We found that all canonical
transformations can be represented in this way.
H = H₀ + εH₁  (6.1)

∂₁H₀ = 0.  (6.3)

H′ = E_{ε,W} H = e^{εL_W} H

q = (E_{ε,W} Q)(t, q′, p′) = (e^{εL_W} Q)(t, q′, p′)
p = (E_{ε,W} P)(t, q′, p′) = (e^{εL_W} P)(t, q′, p′)
(t, q, p) = (E_{ε,W} I)(t, q′, p′) = (e^{εL_W} I)(t, q′, p′),  (6.4)

H′ = e^{εL_W} H
   = H₀ + εL_W H₀ + (1/2)ε²L_W²H₀ + · · ·
     + εH₁ + ε²L_W H₁ + · · ·
   = H₀ + ε(L_W H₀ + H₁) + ε²((1/2)L_W²H₀ + L_W H₁) + · · · .  (6.6)

The first-order term in ε is zero if W satisfies the condition

L_W H₀ + H₁ = 0,  (6.7)
H(t, θ, p) = p²/(2α) − εβ cos(θ),  (6.10)

with coordinate θ and conjugate angular momentum p, and where
α = ml² and β = mgl. The parameter ε allows us to scale the perturbation;
it is 1 for the actual pendulum. We divide the Hamiltonian
into the free-rotor Hamiltonian and the perturbation from
gravity:

H = H₀ + εH₁,  (6.11)
where

H₀(t, θ, p) = p²/(2α)
εH₁(t, θ, p) = −εβ cos θ.  (6.12)

{H₀, W} + H₁ = 0,  (6.13)

or

−(p/α) ∂₁W(t, θ, p) − β cos θ = 0.  (6.14)

So

W(t, θ, p) = −(αβ sin θ)/p,  (6.15)

where the arbitrary integration constant is ignored.
The transformed Hamiltonian is H′ = H₀ + O(ε²). If we can
ignore the ε² contributions, then the transformed Hamiltonian is
simply

H′(t, θ′, p′) = (p′)²/(2α),  (6.16)

with solutions

θ′ = θ′₀ + (p′₀/α)(t − t₀)
p′ = p′₀.  (6.17)

θ = (e^{εL_W} Q)(t, θ′, p′)
  = θ′ + ε{Q, W}(t, θ′, p′) + · · ·
  = θ′ + ε ∂₂W(t, θ′, p′) + · · ·
  = θ′ + ε (αβ sin θ′)/(p′)² + · · · .  (6.18)
Similarly,

p = p′ + ε (αβ cos θ′)/p′ + · · · .  (6.19)
Note that if the Lie series is truncated, it is not exactly a canonical
transformation; only the infinite series is canonical.
The initial values θ′₀ and p′₀ are determined from the initial
values of θ and p by the inverse Lie transformation:
θ′ = (e^{−εL_W} Q)(t, θ, p)
   = θ − ε (αβ sin θ)/p² + · · · ,  (6.20)

and

p′ = p − ε (αβ cos θ)/p + · · · .  (6.21)
Note that if we truncate the coordinate transformations after the
first-order terms in ε (or any finite order), then the inverse transformation
is not exactly the inverse of the transformation.
The approximate solution for given initial conditions (t₀, θ₀, p₀)
is obtained by finding the corresponding (t₀, θ′₀, p′₀) using the
transformations (6.20) and (6.21). Then the system is evolved
using the solutions (6.17). The phase-space coordinates of the
evolved point are transformed back to the original variables using
the transformations (6.18) and (6.19).
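The three steps can be sketched as follows (a Python sketch, not the book's Scheme; the parameter values and helper names are illustrative assumptions):

```python
import math

alpha, beta, eps = 1.0, 9.8, 0.1   # illustrative values

def to_primed(theta, p):
    """Inverse transformation (6.20), (6.21), truncated at first order."""
    return (theta - eps*alpha*beta*math.sin(theta)/p**2,
            p - eps*alpha*beta*math.cos(theta)/p)

def evolve(theta0p, p0p, t, t0=0.0):
    """Free-rotor solution (6.17) in the primed variables."""
    return (theta0p + p0p*(t - t0)/alpha, p0p)

def from_primed(thetap, pp):
    """Forward transformation (6.18), (6.19), truncated at first order."""
    return (thetap + eps*alpha*beta*math.sin(thetap)/pp**2,
            pp + eps*alpha*beta*math.cos(thetap)/pp)

def approx_solution(t, theta0, p0):
    th_p, p_p = to_primed(theta0, p0)
    return from_primed(*evolve(th_p, p_p, t))

# At t = 0 the round trip is the identity only up to O(eps^2),
# illustrating that the truncated transformations are not exact inverses.
th, p = approx_solution(0.0, 1.0, 5.0)
assert abs(th - 1.0) < 1e-2 and abs(p - 5.0) < 1e-2
assert (th, p) != (1.0, 5.0)
```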
We define the two parts of the pendulum Hamiltonian:

(define ((H0 alpha) state)
  (let ((ptheta (momentum state)))
    (/ (square ptheta) (* 2 alpha))))

(define ((H1 beta) state)
  (let ((theta (coordinate state)))
    (* -1 beta (cos theta))))
(1/2) p_θ²/α + (1/2) αβ²ε² sin²(θ)/p_θ²
Indeed, the order-ε term has been removed, and an order-ε² term
has been introduced.
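This can be spot-checked numerically. The sketch below (Python, not the book's code; the parameters and sample point are arbitrary) evaluates the Poisson brackets by central differences and confirms that the order-ε term {H₀, W} + H₁ vanishes while the order-ε² term (1/2){{H₀, W}, W} + {H₁, W} equals (1/2)αβ² sin²(θ)/p_θ²:

```python
import math

alpha, beta = 1.3, 2.7   # illustrative values

def H0(th, p): return p*p/(2*alpha)
def H1(th, p): return -beta*math.cos(th)
def W(th, p):  return -alpha*beta*math.sin(th)/p

def bracket(F, G, h=1e-5):
    """Poisson bracket {F, G} = dF/dth dG/dp - dF/dp dG/dth,
    computed by central differences."""
    def FG(th, p):
        dF_th = (F(th+h, p) - F(th-h, p))/(2*h)
        dF_p  = (F(th, p+h) - F(th, p-h))/(2*h)
        dG_th = (G(th+h, p) - G(th-h, p))/(2*h)
        dG_p  = (G(th, p+h) - G(th, p-h))/(2*h)
        return dF_th*dG_p - dF_p*dG_th
    return FG

th, p = 0.8, 1.7   # an arbitrary phase-space point
order1 = bracket(H0, W)(th, p) + H1(th, p)
order2 = 0.5*bracket(bracket(H0, W), W)(th, p) + bracket(H1, W)(th, p)
expected = 0.5*alpha*beta**2*math.sin(th)**2/p**2
assert abs(order1) < 1e-6
assert abs(order2 - expected) < 1e-3
```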
Ignoring the ε² terms in the new Hamiltonian, the solution is
(define (((solution0 alpha beta) t) state0)
(let ((t0 (time state0))
(theta0 (coordinate state0))
(ptheta0 (momentum state0)))
(up t
(+ theta0 (/ (* (- t t0) ptheta0) alpha))
ptheta0)))
We use the typical pendulum state
(define a-state (up 't 'theta 'p_theta))
t
θ + αβε sin(θ)/p_θ² − α²β²ε² cos(θ) sin(θ)/(2 p_θ⁴)
p_θ + αβε cos(θ)/p_θ − α²β²ε²/(2 p_θ³)
Figure 6.1 The perturbative solution in the phase plane, including
terms of first, second, third, and fourth order in the phase-space coordi-
nate transformation. The solutions appear to converge.
Figure 6.3 The perturbative solution does not converge in the os-
cillation region. As we include more terms in the Lie series for the
phase-space transformation the resulting trajectory develops loops near
the hyperbolic fixed point that increase in size with the order.
This sets the scale for the validity of the perturbative solution.
We can compare this scale to the size of the oscillation region
(see figure 6.4). We can calculate the extent of the region of
oscillation of the pendulum by considering the separatrix. The
value of the Hamiltonian on the separatrix is the same as the
value at the unstable equilibrium: H(t, θ = π, p_θ = 0) = βε. The
separatrix has maximum momentum p_θ^sep at θ = 0:

H(t, 0, p_θ^sep) = H(t, π, 0).  (6.23)
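Spelling out this condition (a step made explicit here, using the pendulum Hamiltonian above):

```latex
\frac{(p^{\mathrm{sep}}_\theta)^2}{2\alpha} - \epsilon\beta = \epsilon\beta ,
\qquad\text{so}\qquad
p^{\mathrm{sep}}_\theta = 2\sqrt{\alpha\beta\epsilon},
```

which is the maximum half-width of the oscillation region in momentum.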
[Figure 6.4: the oscillation region of the pendulum in the (θ, p) phase
plane, for θ between −π and +π; the separatrix reaches maximum
momentum 2(αβε)^{1/2} at θ = 0.]
H′(t, θ′, p′) = (p′)²/(2α) + ε² αβ² (sin θ′)²/(2(p′)²) + · · ·
             = (p′)²/(2α) + ε² αβ² (1 − cos(2θ′))/(4(p′)²) + · · ·
             = H₀(p′) + ε²H₂(t, θ′, p′) + · · · .  (6.25)
H″ = e^{ε²L_{W′}} H′
   = H₀ + ε²(L_{W′}H₀ + H₂) + · · · .  (6.26)

L_{W′}H₀ + H₂ = 0.  (6.27)
This is

−(p′/α) ∂₁W′(t, θ′, p′) + (αβ²/(4(p′)²))(1 − cos(2θ′)) = 0.  (6.28)
A generator that satisfies this condition is

W′(t, θ′, p′) = (α²β²/(4(p′)³)) θ′ − (α²β²/(8(p′)³)) sin(2θ′).  (6.29)
There are two contributions to this generator, one proportional to
θ′ and the other involving a trigonometric function of θ′.
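As a consistency check (an illustration, not from the book), the condition can be verified numerically; note the relative minus sign on the sine term, which is forced by (6.28):

```python
import math

alpha, beta = 1.3, 2.7   # illustrative values

def Wp(th, p):
    """Generator W'(t, theta', p') of (6.29)."""
    c = alpha**2 * beta**2
    return c/(4*p**3)*th - c/(8*p**3)*math.sin(2*th)

th, p, h = 0.8, 1.7, 1e-6   # arbitrary sample point
dWp_dth = (Wp(th+h, p) - Wp(th-h, p))/(2*h)   # central difference
residual = -(p/alpha)*dWp_dth + alpha*beta**2/(4*p**2)*(1 - math.cos(2*th))
assert abs(residual) < 1e-8   # condition (6.28) holds
```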
The phase-space coordinate transformation resulting from this
Lie transform is found as before. For given initial conditions, we
first carry out the inverse transformation corresponding to W ,
then that for W 0 , solve for the evolution of the system using H0 ,
then transform back using W 0 and then W . The approximate
solution is
(t, θ, p) = (E_{ε,W} E_{ε²,W′} E_{(t−t₀),H₀} E_{−ε²,W′} E_{−ε,W} I)(t₀, θ₀, p₀)
W″(t, θ′, p′) = −(α²β²/(8(p′)³)) sin(2θ′).  (6.32)
After performing a Lie transformation with this generator the new
Hamiltonian is

H″(t, θ″, p″) = (p″)²/(2α) + ε² αβ²/(4(p″)²) + · · · .  (6.33)
Figure 6.5 The solution using a second perturbation step, eliminating
ε² terms from the Hamiltonian, is compared to the actual solution. The
initial agreement is especially good, but the error increases with time.
Figure 6.6 The two-step perturbative solution is shown over longer
time. The actual solution is a closed curve in the phase plane; this
perturbative solution wanders all over the place and gets worse with
time.
H = H₀ + εH₁,  (6.37)

H′ = e^{εL_W} H
   = H₀ + ε(L_W H₀ + H₁) + · · · ,  (6.38)

{H₀, W} + H₁ = 0,  (6.39)

Substituting these into the condition that order-ε terms are eliminated,
we find

Σ_k B_k(p) (ω₀(p) · k) cos(k · θ) = Σ_k A_k(p) cos(k · θ).  (6.44)
2. In general, we need to include sine terms as well, but the cosine expansion is
enough for this illustration.
H′ = H₀ + εA₀ + · · · ,  (6.47)

and

W(t, θ, p) = Σ_{k≠0} (A_k(p)/(k · ω₀(p))) sin(k · θ).  (6.48)
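The step from (6.44) to (6.48), made explicit here: matching the coefficients of each cos(k · θ) gives

```latex
B_k(p) = \frac{A_k(p)}{k \cdot \omega_0(p)},
```

so the generator has the denominators k · ω₀(p). These are the small denominators: wherever k · ω₀(p) ≈ 0 for some k with A_k(p) ≠ 0, the corresponding term in W blows up and the perturbative solution cannot be trusted there.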
H(τ; θ, t; p, T)
 = T + p²/(2ml²) − ml(g − Aω² cos(ωt)) cos θ
 = T + p²/(2α) − β cos(θ) + γ cos(θ − ωt) + γ cos(θ + ωt),  (6.53)

with the constants α = ml², β = mlg, and γ = (1/2) mlAω².
Notice that the perturbation H₁ is particularly simple: it has only
three terms in its Poisson series, and the coefficients are constants.
So in the first perturbation step there will be only three regions
excluded from the domain of applicability.
The Lie series generator that eliminates the terms in H₁ to first
order in ε, satisfying

{H₀, W} + H₁ = 0,  (6.55)

is

W(τ; θ, t; p, T) = −(β/ω_r(p)) sin θ
                 + (γ/(ω_r(p) + ω)) sin(θ + ωt)
                 + (γ/(ω_r(p) − ω)) sin(θ − ωt),  (6.56)
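One can check by direct differentiation that this W satisfies (6.55). Here is a numerical spot-check (a Python sketch with illustrative parameter values, not the book's code), using {H₀, W} = −(p/α) ∂W/∂θ − ∂W/∂t for H₀ = T + p²/(2α) in the extended phase space:

```python
import math

alpha, beta, gamma, omega = 1.1, 2.3, 0.7, 5.0   # illustrative values

def W(th, t, p):
    """Generator (6.56), with omega_r(p) = p/alpha."""
    wr = p/alpha
    return (-beta/wr*math.sin(th)
            + gamma/(wr+omega)*math.sin(th+omega*t)
            + gamma/(wr-omega)*math.sin(th-omega*t))

def H1(th, t, p):
    """Perturbation from (6.53)."""
    return (-beta*math.cos(th) + gamma*math.cos(th-omega*t)
            + gamma*math.cos(th+omega*t))

th, t, p, h = 0.8, 0.3, 1.7, 1e-6   # arbitrary sample point
dW_dth = (W(th+h, t, p) - W(th-h, t, p))/(2*h)
dW_dt  = (W(th, t+h, p) - W(th, t-h, p))/(2*h)
residual = -(p/alpha)*dW_dth - dW_dt + H1(th, t, p)
assert abs(residual) < 1e-7   # condition (6.55) holds
```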
H = H₀ + εH₁,  (6.57)

H′_n(t, θ, p) = Ĥ₀(p) + εA₀(p) + εA_n(p) cos(n · θ) + · · · ,  (6.60)
The transformation is

p₁ = n₁Σ
p₂ = n₂Σ + Θ′
σ = n₁θ₁ + n₂θ₂
θ′ = θ₂.  (6.63)
3. Any linearly independent combination will be acceptable here.
and

H″_{n,1}(t; σ, θ′; Σ, Θ′) = A_n(n₁Σ, n₂Σ + Θ′) cos(σ).  (6.67)

H″_n = H″_{n,0} + εH″_{n,1}.  (6.68)
Now expand both parts of the resonance Hamiltonian about the
resonance center:

H″_{n,0}(t; σ, θ′; Σ, Θ′) = H″_{n,0}(t; σ, θ′; Σ_n, Θ′)
  + ∂_{2,0}H″_{n,0}(t; σ, θ′; Σ_n, Θ′) (Σ − Σ_n)
  + (1/2) ∂²_{2,0}H″_{n,0}(t; σ, θ′; Σ_n, Θ′) (Σ − Σ_n)² + · · · ,  (6.70)
and

H″_{n,1}(t; σ, θ′; Σ, Θ′) = H″_{n,1}(t; σ, θ′; Σ_n, Θ′) + · · · .  (6.71)
W₊ = W₀ + W₋.  (6.74)
H₊(τ; θ, t; p, T) = T + p²/(2α) + γ cos(θ − ωt) + · · · .  (6.75)
Excluding the higher order terms, this Hamiltonian has only
a single combination of coordinates, and so can be transformed
into a Hamiltonian that is cyclic in all but one degree of freedom.
Define the transformation through the mixed variable generating
function
F₂(τ; t, θ; Σ, T′) = (θ − ωt)Σ + tT′,  (6.76)

σ = θ − ωt
t′ = t
p = Σ
T = T′ − ωΣ.  (6.77)
H′₊(τ; σ, t′; Σ, T′) = T′ − ωΣ + Σ²/(2α) + γ cos σ
 = (Σ − αω)²/(2α) + γ cos σ + T′ − (1/2)αω².  (6.78)
This Hamiltonian is cyclic in t′, so the solutions are level curves
of H′₊ in (σ, Σ). Actually more can be said here, because H′₊
is already of the form of a pendulum shifted in the Σ direction
by αω, and shifted by π in phase. The shift by π comes about
because the sign of the cosine term is positive rather than negative,
as in the usual pendulum. A sketch of the level curves is given in
figure 6.8.
Figure 6.8 Contours of the resonance Hamiltonian H′₊ give the motion
in the (σ, Σ) plane, for σ between −π and +π. In this case the resonance
Hamiltonian is a generalized pendulum shifted in momentum and phase:
the resonance zone is centered at Σ = αω, and its half-width is 2(αγε)^{1/2}.
show the chaotic zone near the separatrix apparent in the surface
of section for the actual driven pendulum.
We see, from the comparisons of the sections of the first-order
perturbative solutions for the various resonance regions, that the
section for the actual driven pendulum can be approximately constructed
by combining the approximations developed for each resonance.
The shapes of the resonance regions are distorted by
resulting pieces fit together consistently. The predicted width of
each resonance region agrees with the actual width: it was not sub-
stantially changed by the distortion of the region introduced by
the elimination of the other resonance terms. Not all the features
of the actual section are reproduced in this composite of first-order
approximations: there are chaotic zones and islands that are not
accounted for in this collage of first-order approximations.
For larger drives the approximations derived by first-order per-
turbations are worse. In figure 6.11, with a factor of five larger
drive we lose the invariant curves that separate the resonance re-
gions. The main resonance islands persist, but the chaotic zones
near the separatrices have merged into one large chaotic sea.
The first-order perturbative solution for the more strongly driven
pendulum in figure 6.11 still approximates the centers of the main
resonance islands reasonably well, but it fails as we move out and
encounter the secondary islands that are visible in the resonance
region for ωr (p) = ω. Here the approximations for the two regions
do not fit together so well. The chaotic sea is found in the region
where the perturbative solutions do not match.
Figure 6.12 Resonance overlap occurs when the sum of the half-widths
of adjacent resonances is larger than the spacing between them. Here
the resonances centered at Σ = 0 and Σ = αω have half-widths
2(αβε)^{1/2} and 2(αγε)^{1/2}, respectively.
H₂:₁(τ; θ, t; p, T)
 = p²/(2α) + T + (αβγ/(4p²)) ((α²ω² + 2αωp + 2p²)/((αω + p)²)) cos(2θ + ωt)  (6.81)

This is solvable because there is only a single combination of coordinates.
We can get an analytic solution by making the pendulum approximation.
The Hamiltonian is already quadratic in the momentum p,
so all we need to do is evaluate the coefficient of the
potential term at the resonance center p₂:₁ = −αω/2. The resonance
Hamiltonian, in the pendulum approximation, is

H′₂:₁(τ; θ, t; p, T) = p²/(2α) + (2βγ/(αω²)) cos(2θ + ωt).  (6.82)
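The evaluation can be checked numerically (an illustration, not from the book). With the cos(2θ + ωt) argument, the resonance condition 2ω_r(p) + ω = 0 puts the resonance center at p₂:₁ = −αω/2; at that value the coefficient in (6.81) reduces to the 2βγ/(αω²) of (6.82):

```python
alpha, beta, gamma, omega = 1.1, 2.3, 0.7, 5.0   # illustrative values

def coeff(p):
    """Coefficient of cos(2 theta + omega t) in (6.81)."""
    return (alpha*beta*gamma
            * (alpha**2*omega**2 + 2*alpha*omega*p + 2*p**2)
            / (4*p**2*(alpha*omega + p)**2))

p_res = -alpha*omega/2   # resonance center for 2 omega_r + omega = 0
assert abs(coeff(p_res) - 2*beta*gamma/(alpha*omega**2)) < 1e-12
```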
Carrying out the transformation to the resonance variable σ =
2θ + ωt reduces this to a pendulum Hamiltonian with a single
degree of freedom. Combining the analytic solution of this pendulum
Hamiltonian with the transformations generated by the
full W, we get an approximate perturbative solution

(τ; θ, t; p, T) = (E_{ε,W} E_{τ−τ₀,H′₂:₁} E_{−ε,W} I)(τ₀; θ₀, t₀; p₀, T₀).  (6.83)
H′_V(τ; θ, t; p, T)
 = p²/(2α) − βε cos θ + (αγ²ε²(α²ω² + p²)/(2(α²ω² − p²)²)) cos(2θ) + · · · .  (6.84)

H″_V(τ; θ, t; p, T) = p²/(2α) − βε cos θ + (γ²ε²/(2αω²)) cos(2θ) + · · · .  (6.85)
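The relation between the two cos(2θ) coefficients can be checked numerically (an illustration, not from the book): evaluating the coefficient in (6.84) at p = 0, the neighborhood of the vertical equilibria, gives the coefficient in (6.85):

```python
alpha, gamma, eps, omega = 1.1, 0.7, 0.1, 5.0   # illustrative values

def coeff(p):
    """Coefficient of cos(2 theta) in (6.84)."""
    return (alpha*gamma**2*eps**2*(alpha**2*omega**2 + p**2)
            / (2*(alpha**2*omega**2 - p**2)**2))

assert abs(coeff(0.0) - gamma**2*eps**2/(2*alpha*omega**2)) < 1e-15
```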
Linear stability analysis of the inverted vertical equilibrium indi-
cates stability for
[Figures: surfaces of section for the driven pendulum near the inverted
equilibrium for several drive parameters, and a summary of the stability
of the inverted equilibrium in the (log₁₀(ω/ωs), log₁₀(A/l)) plane.]
6.5 Projects
Exercise 6.4: Periodically driven pendulum
a. Work out the details of the perturbation theory for the primary driven
pendulum resonances, as displayed in figure 6.10.
b. Work out the details of the perturbation theory for the stability of
the inverted vertical equilibrium. Derive the resonance Hamiltonian,
and plot its contours. Compare these contours to surfaces of section for
a variety of parameters.
c. Carry out the linear stability analysis leading to equation (6.87).
What is happening in the upper part of figure fig:dpend-inverted-summary?
Why is the system unstable when criterion (6.87) predicts stability? Use
surfaces of section to investigate this parameter regime.
(h 2)
.7518269446689928
(g 2)
7.274379414605454
Symbolic values
As in usual mathematical notation, arithmetic is extended to al-
low the use of symbols that represent unknown or incompletely
specified mathematical objects. These symbols are manipulated
as if they had values of a known type. By default, a Scheme
symbol is assumed to represent a real number. So the expression
'a is a literal Scheme symbol that represents an unspecified real
number.
(print-expression
 ((compose cube sin) 'a))
(expt (sin a) 3)
(print-expression
 ((compose (literal-function 'f) (literal-function 'g)) 'x))
(f (g x))
(print-expression (g 'x 'y))
(g x y)
p = [p0 , p1 , p2 ] . (7.5)
(print-expression v)
(up v^0 v^1 v^2)

(print-expression p)
(down p_0 p_1 p_2)
I(s) = s
I₀(s) = t
I₁(s) = (x, y)
I₂(s) = [p_x, p_y]
I₁,₀(s) = x
...
I₂,₁(s) = p_y.  (7.7)
pv = p₀v⁰ + p₁v¹ + p₂v².  (7.8)
(print-expression
 (* p v))
(+ (* p_0 v^0) (* p_1 v^1) (* p_2 v^2))
1. The arrangement of the components of a tuple structure is not significant,
as it is in matrix notation: we might just as well have written this tuple as
[(cos θ, sin θ), (− sin θ, cos θ)].
and

(D + 1)(D − 1) = D² − 1,
Dg(x, y)(Δx, Δy) = [∂₀g(x, y), ∂₁g(x, y)] · (Δx, Δy)
                 = ∂₀g(x, y)Δx + ∂₁g(x, y)Δy.  (7.18)
∂0 g = I0 ◦ Dg (7.19)
∂1 g = I1 ◦ Dg. (7.20)
Concretely, if
g(x, y) = x³y⁵  (7.21)

then

Dg(x, y) = [3x²y⁵, 5x³y⁴]  (7.22)
(print-expression
 (h (up 'x 'y)))
(g x y)
DH(s) = [∂₀H(s), [∂₁,₀H(s), ∂₁,₁H(s)], (∂₂,₀H(s), ∂₂,₁H(s))],
where ∂1,0 indicates the partial derivative with respect to the first
component (index 0) of the second argument (index 1) of the func-
tion, and so on. Indeed ∂z F = Iz ◦ DF , for any function F and
access chain z. So, if we let ∆s be an incremental phase-space
state tuple,
then
DH(s)∆s = ∂0 H(s)∆t
+ ∂1,0 H(s)∆x + ∂1,1 H(s)∆y
+ ∂2,0 H(s)∆px + ∂2,1 H(s)∆py . (7.27)
(print-expression
(H s))
(H (up t (up x y) (down p x p y)))
(print-expression
((D H) s))
(down
(((partial 0) H) (up t (up x y) (down p x p y)))
(down (((partial 1 0) H) (up t (up x y) (down p x p y)))
(((partial 1 1) H) (up t (up x y) (down p x p y))))
(up (((partial 2 0) H) (up t (up x y) (down p x p y)))
(((partial 2 1) H) (up t (up x y) (down p x p y)))))
Structured results
Some functions produce structured outputs. A function whose
output is a tuple is equivalent to a tuple of component functions
each of which produces one component of the output tuple.
For example, a function that takes one numerical argument and
produces a structure of outputs may be used to describe a curve
through space. A helix can be defined by

(define (helix t)
  (up (cos t) (sin t) t))

or just
(define helix (up cos sin identity))
In Scheme:
(define (g x y)
(up (square (+ x y)) (cube (- y x)) (exp (+ x y))))
(define (g x y)
(up (f x y) y))
(define (h x y)
(f (f x y) y))
(define (f v)
(let ((x (ref v 0))
(y (ref v 1)))
(* (square x) (cube y))))
(define (g v)
(let ((x (ref v 0))
(y (ref v 1)))
(up (f v) y)))
1. Many of the statements here are only valid assuming that there are no
assignments.
(+ 1 2.14)
3.14
(+ 1 (* 2 1.07))
3.14
2. In examples we show the value that would be printed by the Scheme system
using an italic face following the input expression.
3. In Scheme every parenthesis is essential: you cannot add extra parentheses
or remove any.
4. The logician Alonzo Church [12] invented λ notation to allow the specification
of an anonymous function of a named parameter: λx[expression in x]. This
is read "That function of one argument that is obtained by substituting the
argument for x in the indicated expression."
we can then use the symbols pi and square wherever the numeral
or the λ-expression could appear. For example, the area of the
surface of a sphere of radius 5 meters is:
(* 4 pi (square 5))
314.1592653589793
5. The examples are indented to help with readability. Scheme does not care
about extra whitespace, so we may add as much as we please to make things
easier to read.
(define compose
(lambda (f g)
(lambda (x)
(f (g x)))))
Using the syntactic sugar shown above we can write the defini-
tion more conveniently. The following are both equivalent to the
definition above:
(define (compose f g)
(lambda (x)
(f (g x))))
(define ((compose f g) x)
(f (g x)))
Conditionals
Conditional expressions may be used to choose among several ex-
pressions to produce a value. For example, a procedure that im-
plements the absolute value function may be written:
(define (abs x)
(cond ((< x 0) (- x))
((= x 0) x)
((> x 0) x)))
For example, a recursive procedure for computing factorials may be
written:

(define (factorial n)
  (if (= n 0)
      1
      (* n (factorial (- n 1)))))

(factorial 6)
720

(factorial 40)
815915283247897734345611269596115894272000000000
Local names
The let expression is used to give names to objects in a local
context. For example,
(define (f radius)
(let ((area (* 4 pi (square radius)))
(volume (* 4/3 pi (cube radius))))
(/ volume area)))
(f 3)
1
The value of the let expression is the value of the body expression
in the context where the variables variable-i have the values of
the expressions expression-i. The expressions expression-i may
not refer to the variables variable-i.
A slight variant of the let expression provides a convenient
way to express looping constructs. We can write a procedure that
implements an alternative algorithm for computing factorials as
follows:
(define (factorial n)
(let clp ((count 1) (answer 1))
(if (> count n)
answer
(clp (+ count 1) (* count answer)))))
(factorial 6)
720
Here, the symbol following the let (in this case clp) is locally de-
fined to be a procedure that has the variables count and answer
as its formal parameters. It is called the first time with the ex-
pressions 1 and 1, initializing the loop. Whenever the procedure
named clp is called later, these variables get new values, which are
the values of the operand expressions (+ count 1) and (* count
answer).
Compound data—lists and vectors
Data can be glued together to form compound data structures.
A list is a data structure in which the elements are linked sequentially.
A Scheme vector is a data structure in which the elements
are packed in a linear array. New elements can be added
to lists, but accessing an element of a list takes computing time
proportional to the element's position in the list. Scheme vectors
can be accessed in constant time, but
a Scheme vector is of fixed length. All data structures in this
book are implemented as combinations of lists and Scheme vec-
tors. Compound data objects are constructed from components by
procedures called constructors and the components are accessed
by selectors.
A list may be constructed with the list procedure:

(define a-list (list 6 946 8 356 12 620))

a-list
(6 946 8 356 12 620)
(list-ref a-list 3)
356
(list-ref a-list 0)
6
Lists are built from pairs. A pair is made using the constructor
cons. The selectors for the two components of the pair are car
and cdr.6 A list is a chain of pairs, such that the car of each pair
is the list element and the cdr of each pair is the next pair, except
for the last cdr, which is a distinguishable value called the empty
list and which is written (). Thus,
(car a-list)
6
(cdr a-list)
(946 8 356 12 620)
(define another-list
(cons 32 (cdr a-list)))
another-list
(32 946 8 356 12 620)
Both a-list and another-list share the same tail (their cdr).
6. These names are accidents of history. They stand for "the Contents of the
Address Register" and "the Contents of the Decrement Register" of the IBM 704
computer, which was used for the first implementation of Lisp in the late
1950s.
A Scheme vector may be constructed with the vector procedure:

(define a-vector (vector 37 63 49 21 88 56))

a-vector
#(37 63 49 21 88 56)
(vector-ref a-vector 3)
21
(vector-ref a-vector 0)
37
[1] Harold Abelson and Gerald Jay Sussman with Julie Sussman, Struc-
ture and Interpretation of Computer Programs, 2nd edition, MIT
Press and McGraw-Hill, 1996.
[2] Ralph H. Abraham and Jerrold E. Marsden, Foundations of Me-
chanics, 2nd edition, Addison-Wesley, 1978.
[3] Ralph H. Abraham, Jerrold E. Marsden, and Tudor Raţiu, Mani-
folds, Tensor Analysis, and Applications, 2nd edition, Springer Ver-
lag, 1993.
[4] V. I. Arnold, “Small Denominators and Problems of Stability of
Motion in Classical and Celestial Mechanics,” in Russian Math. Sur-
veys, 18, 6 (1963).
[5] V. I. Arnold, Mathematical Methods of Classical Mechanics,
Springer Verlag, 1980.
[6] V. I. Arnold, V. V. Kozlov, and A. I. Neishtadt, “Mathematical
Aspects of Classical and Celestial Mechanics,” in Dynamical Systems
III, Springer Verlag, 1988.
[7] Max Born, Vorlesungen über Atommechanik, J. Springer, Berlin,
1925-30.
[8] Constantin Carathéodory, Calculus of variations and partial differ-
ential equations of the first order. Translated by Robert B. Dean and
Julius J. Brandstatter, Holden-Day, San Francisco, 1965-67.
[9] Constantin Carathéodory, Geometrische Optik. Series title: Ergeb-
nisse der Mathematik und ihrer Grenzgebiete, 4. Bd., J. Springer,
Berlin, 1937.
[10] Élie Cartan, Leçons sur les invariants intégraux, Hermann, Paris,
1922; reprinted in 1971.
[11] Boris V. Chirikov, "A Universal Instability of Many-Dimensional
Oscillator Systems," in Physics Reports 52, 5, pp. 263–379 (1979).
[12] Alonzo Church, The Calculi of Lambda-Conversion, Princeton Uni-
versity Press, 1941.
[13] Richard Courant and David Hilbert, Methods of Mathematical
Physics, 2 volumes., Wiley-Interscience, 1957.
[14] Jean Dieudonné, Treatise on Analysis, Academic Press, 1969.
[15] Hans Freudenthal, Didactical Phenomenology of Mathematical
Structures, Kluwer Publishing Co., Boston, 1983.
List of Exercises (exercise, page):
2.1 120 2.5 125 2.9 133 2.13 151 2.17 176
2.2 120 2.6 125 2.10 136 2.14 155 2.18 177
2.3 120 2.7 125 2.11 151 2.15 166 2.19 177
2.4 120 2.8 129 2.12 151 2.16 176 2.20 178
3.1 185 3.4 193 3.7 199 3.10 232 3.13 261
3.2 185 3.5 195 3.8 218 3.11 254 3.14 263
3.3 189 3.6 198 3.9 220 3.12 258 3.15 264
4.1 275 4.3 277 4.5 287 4.7 303 4.9 313
4.2 277 4.4 282 4.6 289 4.8 311 4.10 313
5.1 321 5.8 337 5.15 376 5.22 396 5.29 425
5.2 324 5.9 345 5.16 386 5.23 396 5.30 425
5.3 330 5.10 349 5.17 389 5.24 405 5.31 431
5.4 330 5.11 349 5.18 389 5.25 409 5.32 432
5.5 330 5.12 349 5.19 390 5.26 414
5.6 333 5.13 352 5.20 390 5.27 416
5.7 333 5.14 373 5.21 390 5.28 424
6.1 446 6.2 460 6.3 461 6.4 472 6.5 473