
Lectures on Classical Mechanics

by John C. Baez

notes by Derek K. Wise, Department of Mathematics, University of California, Riverside

LaTeXed by Blair M. Smith

Department of Physics and Astronomy, Louisiana State University, 2005

© 2005 John C. Baez & Derek K. Wise


Preface
These are notes for a mathematics graduate course on classical mechanics. I've taught this course twice recently. The first time I focused on the Hamiltonian approach. This time I started with the Lagrangian approach, with a heavy emphasis on action principles, and derived the Hamiltonian approach from that. Derek Wise took notes.

The chapters in this LaTeX version are in the same order as the weekly lectures, but I've merged weeks together, and sometimes split them over chapters, to obtain a more textbook feel to these notes. In some chapters a familiarity with modern differential geometry (differential forms, bundles, homology) is helpful, but the effort to learn the relevant mathematics is not overly demanding. For reference, the weekly lectures are outlined here.

Week 1 (Mar. 28, 30, Apr. 1): The Lagrangian approach to classical mechanics: deriving F = ma from the requirement that the particle's path be a critical point of the action. The prehistory of the Lagrangian approach: D'Alembert's "principle of least energy" in statics, Fermat's "principle of least time" in optics, and how D'Alembert generalized his principle from statics to dynamics using the concept of "inertia force".

Week 2 (Apr. 4, 6, 8): Deriving the Euler-Lagrange equations for a particle on an arbitrary manifold. Generalized momentum and force. Noether's theorem on conserved quantities coming from symmetries. Examples of conserved quantities: energy, momentum and angular momentum.

Week 3 (Apr. 11, 13, 15): Example problems: (1) The Atwood machine. (2) A frictionless mass on a table attached to a string threaded through a hole in the table, with a mass hanging on the string. (3) A special-relativistic free particle: two Lagrangians, one with reparametrization invariance as a gauge symmetry. (4) A special-relativistic charged particle in an electromagnetic field.

Week 4 (Apr. 18, 20, 22): More example problems: (5) A special-relativistic charged particle in an electromagnetic field, continued. (6) A general-relativistic free particle.

Week 5 (Apr. 25, 27, 29): How Jacobi unified Fermat's principle of least time and Lagrange's principle of least action by seeing the classical mechanics of a particle in a potential as a special case of optics with a position-dependent index of refraction. The ubiquity of geodesic motion. Kaluza-Klein theory. From Lagrangians to Hamiltonians.

Week 6 (May 2, 4, 6): From Lagrangians to Hamiltonians, continued. Regular and strongly regular Lagrangians. The cotangent bundle as phase space. Hamilton's equations. Getting Hamilton's equations directly from a least action principle.

Week 7 (May 9, 11, 13): Waves versus particles: the Hamilton-Jacobi equation. Hamilton's principal function and extended phase space. How the Hamilton-Jacobi equation foreshadows quantum mechanics.


Week 8 (May 16, 18, 20): Towards symplectic geometry. The canonical 1-form and the symplectic 2-form on the cotangent bundle. Hamilton's equations on a symplectic manifold. Darboux's theorem.

Week 9 (May 23, 25, 27): Poisson brackets. The Schrödinger picture versus the Heisenberg picture in classical mechanics. The Hamiltonian version of Noether's theorem. Poisson algebras and Poisson manifolds. A Poisson manifold that is not symplectic. Liouville's theorem. Weil's formula.

Week 10 (June 1, 3, 5): A taste of geometric quantization. Kähler manifolds.

If you find errors in these notes, please email me!

Contents
1 From Newton's Laws to Lagrange's Equations
  1.1 Lagrangian and Newtonian Approaches
    1.1.1 Lagrangian versus Hamiltonian Approaches
  1.2 Prehistory of the Lagrangian Approach
    1.2.1 The Principle of Minimum Energy
    1.2.2 D'Alembert's Principle and Lagrange's Equations
    1.2.3 The Principle of Least Time
    1.2.4 How D'Alembert and Others Got to the Truth

2 Equations of Motion
  2.1 The Euler-Lagrange Equations
    2.1.1 Comments
    2.1.2 Lagrangian Dynamics
  2.2 Interpretation of Terms

3 Lagrangians and Noether's Theorem
  3.1 Time Translation
    3.1.1 Canonical and Generalized Coordinates
  3.2 Symmetry and Noether's Theorem
    3.2.1 Noether's Theorem
  3.3 Conserved Quantities from Symmetries
    3.3.1 Time Translation Symmetry
    3.3.2 Space Translation Symmetry
    3.3.3 Rotational Symmetry
  3.4 Example Problems
    3.4.1 The Atwood Machine
    3.4.2 Disk Pulled by Falling Mass
    3.4.3 Free Particle in Special Relativity
  3.5 Electrodynamics and Relativistic Lagrangians
    3.5.1 Gauge Symmetry and Relativistic Hamiltonian
    3.5.2 Relativistic Hamiltonian
  3.6 Relativistic Particle in an Electromagnetic Field
  3.7 Alternative Lagrangians
    3.7.1 Lagrangian for a String
    3.7.2 Alternate Lagrangian for Relativistic Electrodynamics
  3.8 The General Relativistic Particle
    3.8.1 Free Particle Lagrangian in GR
    3.8.2 Charged particle in EM Field in GR
  3.9 The Principle of Least Action and Geodesics
    3.9.1 Jacobi and Least Time vs Least Action
    3.9.2 The Ubiquity of Geodesic Motion

4 From Lagrangians to Hamiltonians
  4.1 The Hamiltonian Approach
  4.2 Regular and Strongly Regular Lagrangians
    4.2.1 Example: A Particle in a Riemannian Manifold with Potential V(q)
    4.2.2 Example: General Relativistic Particle in an E-M Potential
    4.2.3 Example: Free General Relativistic Particle with Reparameterization Invariance
    4.2.4 Example: A Regular but not Strongly Regular Lagrangian
  4.3 Hamilton's Equations
    4.3.1 Hamilton and Euler-Lagrange
    4.3.2 Hamilton's Equations from the Principle of Least Action
  4.4 Waves versus Particles: The Hamilton-Jacobi Equations
    4.4.1 Wave Equations
    4.4.2 The Hamilton-Jacobi Equations

5 Symplectic Geometry
  5.1 Towards Symplectic Geometry
  5.2 The Canonical Forms on the Cotangent Bundle
    5.2.1 The Canonical 1-form on the Cotangent Bundle
    5.2.2 The Symplectic 2-form on the Cotangent Bundle
  5.3 Hamilton's Equations on a Symplectic Manifold
  5.4 Darboux's Theorem

6 Poisson Brackets
  6.1 The Schrödinger Picture and the Heisenberg Picture in Classical Mechanics
  6.2 The Hamiltonian Version of Noether's Theorem
  6.3 Poisson Algebras and Poisson Manifolds
    6.3.1 A Poisson Manifold that is Not Symplectic
  6.4 Liouville's Theorem
    6.4.1 Phase Space Volumes
    6.4.2 Weil's formula

7 Introduction to Geometric Quantization
  7.1 A Taste of Geometric Quantization
  7.2 Kähler Manifolds
  7.3 General Geometric Quantization

Chapter 1 From Newton's Laws to Lagrange's Equations


(Week 1, March 28, 30, April 1.) Classical mechanics is a very peculiar branch of physics. It used to be considered the sum total of our theoretical knowledge of the physical universe (Laplace's daemon, the Newtonian clockwork), but now it is known as an idealization, a toy model if you will. The astounding thing is that probably all professional applied physicists still use classical mechanics. So it is still an indispensable part of any physicist's or engineer's education. It is so useful because the more accurate theories that we know of (general relativity and quantum mechanics) make corrections to classical mechanics generally only in extreme situations (black holes, neutron stars, atomic structure, superconductivity, and so forth). Given that GR and QM are much harder theories to use and apply, it is no wonder that scientists will revert to classical mechanics whenever possible. So, what is classical mechanics?

1.1 Lagrangian and Newtonian Approaches

We begin by comparing the Newtonian approach to mechanics to the subtler approach of Lagrangian mechanics. Recall Newton's law:
\[ F = ma \tag{1.1} \]
wherein we consider a particle moving in $\mathbb{R}^n$. Its position, say $q$, depends on time $t \in \mathbb{R}$, so it defines a function,
\[ q : \mathbb{R} \to \mathbb{R}^n. \]
From this function we can define velocity,
\[ v = \dot q : \mathbb{R} \to \mathbb{R}^n \]
where $\dot q = \frac{dq}{dt}$, and also acceleration,
\[ a = \ddot q : \mathbb{R} \to \mathbb{R}^n. \]
Now let $m > 0$ be the mass of the particle, and let $F$ be a vector field on $\mathbb{R}^n$ called the force. Newton claimed that the particle satisfies $F = ma$. That is:
\[ m\, a(t) = F(q(t)). \tag{1.2} \]

This is a second-order differential equation for $q : \mathbb{R} \to \mathbb{R}^n$ which will have a unique solution given some $q(t_0)$ and $\dot q(t_0)$, provided the vector field $F$ is "nice", by which we technically mean smooth and bounded (i.e., $|F(x)| < B$ for some $B > 0$, for all $x \in \mathbb{R}^n$). We can then define a quantity called kinetic energy:
\[ K(t) := \frac{1}{2}\, m\, v(t) \cdot v(t) \tag{1.3} \]
This quantity is interesting because
\[ \frac{d}{dt} K(t) = m\, v(t) \cdot a(t) = F(q(t)) \cdot v(t). \]
So, kinetic energy goes up when you push an object in the direction of its velocity, and goes down when you push it in the opposite direction. Moreover,
\[ K(t_1) - K(t_0) = \int_{t_0}^{t_1} F(q(t)) \cdot v(t)\, dt = \int_{t_0}^{t_1} F(q(t)) \cdot \dot q(t)\, dt. \]
So, the change of kinetic energy is equal to the work done by the force, that is, the integral of $F$ along the curve $q : [t_0, t_1] \to \mathbb{R}^n$. This implies (by Stokes's theorem relating line integrals to surface integrals of the curl) that the change in kinetic energy $K(t_1) - K(t_0)$ is independent of the curve going from $q(t_0) = a$ to $q(t_1) = b$ iff
\[ \nabla \times F = 0. \]
This in turn is true iff
\[ F = -\nabla V \tag{1.4} \]
for some function $V : \mathbb{R}^n \to \mathbb{R}$. This function is then unique up to an additive constant; we call it the potential. A force with this property is called conservative. Why? Because in this case we can define the total energy of the particle by
\[ E(t) := K(t) + V(q(t)) \tag{1.5} \]


where $V(t) := V(q(t))$ is called the potential energy of the particle, and then we can show that $E$ is conserved: that is, constant as a function of time. To see this, note that $F = ma$ implies
\[ \frac{d}{dt}\left[ K(t) + V(q(t)) \right] = F(q(t)) \cdot v(t) + \nabla V(q(t)) \cdot v(t) = 0 \]
(because $F = -\nabla V$). Conservative forces let us apply a whole bunch of cool techniques. In the Lagrangian approach we define a quantity
\[ L := K(t) - V(q(t)) \tag{1.6} \]
called the Lagrangian, and for any curve $q : [t_0, t_1] \to \mathbb{R}^n$ with $q(t_0) = a$, $q(t_1) = b$, we define the action to be
\[ S(q) := \int_{t_0}^{t_1} L(t)\, dt \tag{1.7} \]
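To make the conservation of $E$ and the definition (1.7) of the action concrete, here is a minimal numerical sketch. It assumes a one-dimensional particle in a harmonic potential $V(q) = kq^2/2$ (an illustrative choice, not taken from the lectures) and integrates $F = ma$ with the velocity Verlet scheme; the function name `simulate` and all parameter values are hypothetical.

```python
import math

def simulate(m=1.0, k=4.0, q0=1.0, v0=0.0, t1=2.0, n=20000):
    """Integrate m*a = F(q) = -k*q with velocity Verlet (assumed 1D example).

    Returns (energies, action): the total energy E = K + V at every step and
    a left-endpoint Riemann sum for the action S = integral of (K - V) dt.
    """
    V = lambda q: 0.5 * k * q * q      # potential energy (assumed harmonic)
    F = lambda q: -k * q               # force F = -dV/dq
    dt = t1 / n
    q, v = q0, v0
    energies, action = [], 0.0
    for _ in range(n):
        K = 0.5 * m * v * v
        energies.append(K + V(q))
        action += (K - V(q)) * dt      # Lagrangian L = K - V times dt
        # one velocity Verlet step
        a = F(q) / m
        q_new = q + v * dt + 0.5 * a * dt * dt
        a_new = F(q_new) / m
        v = v + 0.5 * (a + a_new) * dt
        q = q_new
    return energies, action

energies, action = simulate()
print("energy drift:", max(energies) - min(energies))
print("action:", action)
```

For these parameter values the exact solution is $q(t) = \cos 2t$, so the printed action can be compared with $\int_0^2 -2\cos 4t \, dt = -\tfrac{1}{2}\sin 8$, and the energy drift should be tiny compared with $E = 2$.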

From here one can go in two directions. One is to claim that nature causes particles to follow paths of least action, and derive Newton's equations from that principle. The other is to start with Newton's principles and find out what conditions, if any, on $S(q)$ follow from this. We will use the shortcut of hindsight, bypass the philosophy, and simply use the mathematics of variational calculus to show that particles follow paths that are critical points of the action $S(q)$ if and only if Newton's law $F = ma$ holds.

Figure 1.1: A particle can sniff out the path of least action. [Figure: a path $q$ from time $t_0$ to time $t_1$ and a varied path $q_s = q + s\,\delta q$.]

To do this, let us look for curves (like the solid line in Fig. 1.1) that are critical points of $S$, namely:
\[ \frac{d}{ds} S(q_s)\Big|_{s=0} = 0 \tag{1.8} \]


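where $q_s = q + s\,\delta q$, for all $\delta q : [t_0, t_1] \to \mathbb{R}^n$ with
\[ \delta q(t_0) = \delta q(t_1) = 0. \]
To show that
\[ F = ma \iff \frac{d}{ds} S(q_s)\Big|_{s=0} = 0 \quad \text{for all } \delta q : [t_0, t_1] \to \mathbb{R}^n \text{ with } \delta q(t_0) = \delta q(t_1) = 0 \tag{1.9} \]
we start by using integration by parts on the definition of the action, first noting that $dq_s/ds = \delta q(t)$ is the variation in the path:
\begin{align*}
\frac{d}{ds} S(q_s)\Big|_{s=0}
 &= \frac{d}{ds} \int_{t_0}^{t_1} \left[ \frac{1}{2} m\, \dot q_s(t) \cdot \dot q_s(t) - V(q_s(t)) \right] dt \,\Big|_{s=0} \\
 &= \int_{t_0}^{t_1} \frac{d}{ds} \left[ \frac{1}{2} m\, \dot q_s^2(t) - V(q_s(t)) \right] dt \,\Big|_{s=0} \\
 &= \int_{t_0}^{t_1} \left[ m\, \dot q_s \cdot \frac{d}{ds} \dot q_s(t) - \nabla V(q_s(t)) \cdot \frac{d}{ds} q_s(t) \right] dt \,\Big|_{s=0} \\
 &= \int_{t_0}^{t_1} \left[ m\, \dot q_s \cdot \frac{d}{dt} \frac{d}{ds} q_s(t) - \nabla V(q_s(t)) \cdot \delta q(t) \right] dt \,\Big|_{s=0}
\end{align*}
so after integrating by parts and noting the boundary terms vanish (because $\delta q = 0$ at $t_1$ and $t_0$),
\begin{align*}
\frac{d}{ds} S(q_s)\Big|_{s=0}
 &= \int_{t_0}^{t_1} \left[ -m\, \ddot q_s(t) - \nabla V(q_s(t)) \right] \cdot \frac{d}{ds} q_s(t)\, dt \,\Big|_{s=0} \\
 &= \int_{t_0}^{t_1} \left[ -m\, \ddot q(t) - \nabla V(q(t)) \right] \cdot \delta q(t)\, dt.
\end{align*}
The variation in the action is then clearly zero for all variations $\delta q$ iff the term in brackets is identically zero, that is,
\[ -m\, \ddot q(t) - \nabla V(q(t)) = 0. \]
So, the path $q$ is a critical point of the action $S$ iff $F = ma$.

The above result applies only for conservative forces, i.e., forces that can be written as minus the gradient of some potential. However, this seems to be true of the most fundamental forces that we know of in our universe. It is a doctrine of classical mechanics that has withstood the test of time and experiment.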
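The statement that $q$ is a critical point of $S$ iff $F = ma$ can be checked numerically. The following is only a sketch under stated assumptions: a one-dimensional harmonic potential $V(q) = kq^2/2$, a midpoint-rule discretization of the action, and a central difference in $s$. The names `action`, `dS_ds`, `newtonian`, `straight` and `bump` are hypothetical.

```python
import math

M, K_SPRING = 1.0, 4.0          # mass and spring constant (assumed values)
T0, T1, N = 0.0, 1.0, 2000      # time interval and number of sub-intervals

def action(path):
    """Midpoint-rule estimate of S = integral of (m/2) q'^2 - (k/2) q^2 dt."""
    dt = (T1 - T0) / N
    total = 0.0
    for i in range(N):
        qa = path(T0 + i * dt)
        qb = path(T0 + (i + 1) * dt)
        velocity = (qb - qa) / dt
        q_mid = 0.5 * (qa + qb)
        total += (0.5 * M * velocity**2 - 0.5 * K_SPRING * q_mid**2) * dt
    return total

def dS_ds(q, dq, eps=1e-4):
    """Central-difference estimate of d/ds S(q + s*dq) at s = 0."""
    plus = action(lambda t: q(t) + eps * dq(t))
    minus = action(lambda t: q(t) - eps * dq(t))
    return (plus - minus) / (2 * eps)

omega = math.sqrt(K_SPRING / M)
newtonian = lambda t: math.cos(omega * t)     # solves m q'' = -k q
# A straight line with the same endpoints, which does not satisfy F = ma:
straight = lambda t: 1.0 + (math.cos(omega * T1) - 1.0) * (t - T0) / (T1 - T0)
# A variation that vanishes at t0 and t1, as the derivation requires:
bump = lambda t: math.sin(math.pi * (t - T0) / (T1 - T0))

print("dS/ds on the Newtonian path:", dS_ds(newtonian, bump))
print("dS/ds on the straight line :", dS_ds(straight, bump))
```

The first number should vanish up to discretization error, while the second should not, since the straight line has the right endpoints but the wrong acceleration.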


1.1.1 Lagrangian versus Hamiltonian Approaches

I am not sure where to mention this, but before launching into the history of the Lagrangian approach may be as good a time as any. In later chapters we will describe another approach to classical mechanics: the Hamiltonian approach. Why do we need two approaches, Lagrangian and Hamiltonian? They both have their own advantages. In the simplest terms, the Hamiltonian approach focuses on position and momentum, while the Lagrangian approach focuses on position and velocity. The Hamiltonian approach focuses on energy, which is a function of position and momentum; indeed, "Hamiltonian" is just a fancy word for energy. The Lagrangian approach focuses on the Lagrangian, which is a function of position and velocity. Our first task in understanding Lagrangian mechanics is to get a gut feeling for what the Lagrangian means. The key is to understand the integral of the Lagrangian over time: the "action", $S$. We shall see that this describes the "total amount that happened" from one moment to another as a particle traces out a path. And, peeking ahead to quantum mechanics, the quantity $\exp(iS/\hbar)$, where $\hbar$ is Planck's constant, will describe the "change in phase" of a quantum system as it traces out this path. In short, while the Lagrangian approach takes a while to get used to, it provides invaluable insights into classical mechanics and its relation to quantum mechanics. We shall see this in more detail soon.

1.2 Prehistory of the Lagrangian Approach

We've seen that a particle going from point $a$ at time $t_0$ to a point $b$ at time $t_1$ follows a path that is a critical point of the action,
\[ S = \int_{t_0}^{t_1} (K - V)\, dt \]
so that slight changes in its path do not change the action (to first order). Often, though not always, the action is minimized, so this is called the Principle of Least Action.

Suppose we did not have the hindsight afforded by the Newtonian picture. Then we might ask, "Why does nature like to minimize the action? And why this action $\int (K - V)\, dt$? Why not some other action?"

"Why" questions are always tough. Indeed, some people say that scientists should never ask "why". This seems too extreme: a more reasonable attitude is that we should only ask a "why" question if we expect to learn something scientifically interesting in our attempt to answer it.

There are certainly some interesting things to learn from the question "why is action minimized?" First, note that total energy is conserved, so energy can slosh back and forth between kinetic and potential forms. The Lagrangian $L = K - V$ is big when most of the energy is in kinetic form, and small when most of the energy is in potential form. Kinetic energy measures how much is "happening", that is, how much our system is moving around. Potential energy measures how much could happen, but isn't yet; that's what the word "potential" means. (Imagine a big rock sitting on top of a cliff, with the potential to fall down.) So, the Lagrangian measures something we could vaguely refer to as the "activity" or "liveliness" of a system: the higher the kinetic energy the more lively the system, the higher the potential energy the less lively. So, we're being told that nature likes to minimize the total of "liveliness" over time: that is, the total action. In other words, nature is as lazy as possible!

For example, consider the path of a thrown rock in the Earth's gravitational field, as in Fig. 1.2.

Figure 1.2: A particle's "lazy" motion minimizes the action. [Figure labels: near the top of the trajectory, "$K - V$ small here... good! Spend as much time as possible here"; near the ends, "$K - V$ big... bad! Get this over with quick!"]

The rock traces out a parabola, and we can think of it as doing this in order to minimize its action. On the one hand, it wants to spend as much time as possible near the top of its trajectory, since this is where the kinetic energy is least and the potential energy is greatest. On the other hand, if it spends too much time near the top of its trajectory, it will need to really rush to get up there and get back down, and this will take a lot of action. The perfect compromise is a parabolic path!

Here we are anthropomorphizing the rock by saying that it "wants" to minimize its action. This is okay if we don't take it too seriously. Indeed, one of the virtues of the Principle of Least Action is that it lets us put ourselves in the position of some physical system and imagine what we would do to minimize the action.

There is another way to make progress on understanding "why" action is minimized: history. Historically there were two principles that were fairly easy to deduce from observations of nature: (i) the principle of minimum energy, used in statics, and (ii) the principle of least time, used in optics. By putting these together, we can guess the principle of least action. So, let us recall these earlier minimum principles.
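The "lazy rock" picture can be tried out numerically. The sketch below is an illustration under stated assumptions, not part of the lectures: it takes uniform gravity with $V = mgh$, a rock that leaves height 0 and returns to it after time $T$, and compares the action of the Newtonian parabola with that of deformed paths sharing the same endpoints. The names `parabola`, `action` and `deformed`, and the numerical values, are hypothetical.

```python
import math

M, G, T = 1.0, 9.8, 2.0    # mass, gravitational acceleration, flight time (assumed)
N = 2000                   # number of sub-intervals for the integral

def parabola(t):
    """Height of the rock under Newton's law, starting and ending at height 0."""
    return 0.5 * G * t * (T - t)

def action(path):
    """Midpoint-rule estimate of the integral of K - V, with V = m*g*height."""
    dt = T / N
    total = 0.0
    for i in range(N):
        ha = path(i * dt)
        hb = path((i + 1) * dt)
        velocity = (hb - ha) / dt
        total += (0.5 * M * velocity**2 - M * G * 0.5 * (ha + hb)) * dt
    return total

def deformed(s):
    """The parabola plus s times a bump that vanishes at both endpoints."""
    return lambda t: parabola(t) + s * math.sin(math.pi * t / T)

actions = {s: action(deformed(s)) for s in (-1.0, -0.5, 0.0, 0.5, 1.0)}
for s, value in actions.items():
    print(f"s = {s:+.1f}   S = {value:.4f}")
```

Among these paths the undeformed parabola ($s = 0$) should have the smallest action. Because $V$ is linear in the height here, the action of the deformed path exceeds that of the parabola by $s^2 m\pi^2/(4T)$, so in this example the critical point really is a minimum.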

1.2.1 The Principle of Minimum Energy

Before physicists really got going in studying dynamical systems they used to study statics. Statics is the study of objects at rest, or in equilibrium. Archimedes studied the laws of a see-saw or lever (Fig. 1.3), and he found that this would be in equilibrium if
\[ m_1 L_1 = m_2 L_2. \]

Figure 1.3: A principle of energy minimization determines a lever's balance. [Figure: a lever with mass $m_1$ at distance $L_1$ from the pivot and mass $m_2$ at distance $L_2$.]

Later D'Alembert understood this using his "principle of virtual work". He considered moving the lever slightly, i.e., infinitesimally.

Figure 1.4: A principle of energy minimization determines a lever's balance. [Figure: the lever rotated by an angle $d\theta$, displacing the masses by $dq_1$ and $dq_2$.]

He claimed that in equilibrium the infinitesimal work done by this motion is zero! He also claimed that the work done on the $i$th body is,
\[ dW_i = F_i \cdot dq_i \]
and gravity pulls down with a force $m_i g$ so,
\[ dW_1 = (0, 0, -m_1 g) \cdot (0, 0, -L_1\, d\theta) = m_1 g L_1\, d\theta \]
and similarly,
\[ dW_2 = -m_2 g L_2\, d\theta. \]
Now D'Alembert's principle says that equilibrium occurs when the "virtual work" $dW = dW_1 + dW_2$ vanishes for all $d\theta$ (that is, for all possible infinitesimal motions). This happens when
\[ m_1 L_1 - m_2 L_2 = 0 \]
which is just as Archimedes wrote.
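The lever computation is short enough to write out as code. This is just a sketch of the arithmetic above; the function names `virtual_work` and `is_balanced` and the value of `G` are assumptions made for illustration.

```python
G = 9.8  # gravitational acceleration in m/s^2 (assumed value)

def virtual_work(m1, L1, m2, L2, dtheta):
    """Virtual work dW = dW1 + dW2 for a small rotation dtheta of the lever.

    Mass 1 moves down by L1*dtheta and mass 2 moves up by L2*dtheta, so
    dW1 = m1*g*L1*dtheta and dW2 = -m2*g*L2*dtheta.
    """
    dW1 = (-m1 * G) * (-L1 * dtheta)
    dW2 = (-m2 * G) * (L2 * dtheta)
    return dW1 + dW2

def is_balanced(m1, L1, m2, L2, tol=1e-9):
    """Equilibrium test: dW is linear in dtheta, so checking dtheta = 1 suffices."""
    return abs(virtual_work(m1, L1, m2, L2, dtheta=1.0)) < tol

print(is_balanced(2.0, 3.0, 6.0, 1.0))   # m1*L1 equals m2*L2
print(is_balanced(2.0, 3.0, 5.0, 1.0))   # m1*L1 differs from m2*L2
```

Since $dW = g\,(m_1 L_1 - m_2 L_2)\, d\theta$, the test reduces to Archimedes' condition $m_1 L_1 = m_2 L_2$.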

1.2.2 D'Alembert's Principle and Lagrange's Equations

Let's go over the above analysis in more detail. I'll try to make it clear what we mean by virtual work.


The forces and constraints on a system may be time dependent. So equal infinitesimal displacements of the system might result in the forces $F_i$ acting on the system doing different amounts of work at different times. To displace a system by $\delta r_i$ for each position coordinate, and yet remain consistent with all the constraints and forces at a given instant of time $t$, without any time interval passing, is called a virtual displacement. It's called "virtual" because it cannot be realized: any actual displacement would occur over a finite time interval, during which the forces and constraints might change. Now call the work done by this hypothetical virtual displacement, $F_i \cdot \delta r_i$, the virtual work. Consider a system in the special state of being in equilibrium, i.e., when $F_i = 0$. Then because by definition the virtual displacements do not change the forces, we must deduce that the virtual work vanishes for a system in equilibrium,
\[ \sum_i F_i \cdot \delta r_i = 0 \quad \text{(when in equilibrium)} \tag{1.10} \]

Note that in the above example we have two particles in $\mathbb{R}^3$ subject to a constraint (they are pinned to the lever arm). However, a number $n$ of particles in $\mathbb{R}^3$ can be treated as a single quasi-particle in $\mathbb{R}^{3n}$, and if there are constraints it can move in some submanifold of $\mathbb{R}^{3n}$. So ultimately we need to study a particle on an arbitrary manifold. But, we'll postpone such sophistication for a while.

For a particle in $\mathbb{R}^n$, D'Alembert's principle simply says,
\begin{align*}
 & q(t) = q_0 \text{ satisfies } F = ma && \text{(it's in equilibrium)} \\
 \iff\; & dW = F \cdot dq \text{ vanishes for all } dq \in \mathbb{R}^n && \text{(virtual work is zero for } \delta q \to 0\text{)} \\
 \iff\; & F = 0 && \text{(no force on it!)}
\end{align*}
If the force is conservative ($F = -\nabla V$) then this is also equivalent to,
\[ \nabla V(q_0) = 0 \]
that is, we have equilibrium at a critical point of the potential. The equilibrium will be stable if $q_0$ is a local minimum of the potential $V$.

We can summarize all the above by proclaiming that we have a "principle of least energy" governing stable equilibria. We also have an analogy between statics and dynamics:

\begin{tabular}{ll}
Statics & Dynamics \\
\hline
equilibrium, $a = 0$ & $F = ma$ \\
potential, $V$ & action, $S = \int_{t_0}^{t_1} (K - V)\, dt$ \\
critical points of $V$ & critical points of $S$
\end{tabular}
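The statics side of this analogy is easy to explore numerically: equilibria are the critical points of $V$, and the stable ones are the local minima. The sketch below assumes a one-dimensional double-well potential $V(q) = q^4 - 2q^2$, chosen purely for illustration, and uses finite differences and bisection; all function names are hypothetical.

```python
def V(q):
    return q**4 - 2.0 * q**2       # illustrative double-well potential (assumed)

def dV(q, h=1e-5):
    """Central-difference estimate of V'(q)."""
    return (V(q + h) - V(q - h)) / (2 * h)

def d2V(q, h=1e-4):
    """Central-difference estimate of V''(q)."""
    return (V(q + h) - 2 * V(q) + V(q - h)) / h**2

def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def equilibria(lo=-2.0, hi=2.1, steps=50):
    """Scan [lo, hi] for sign changes of V' and classify each root by V''."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    found = []
    for a, b in zip(grid, grid[1:]):
        if (dV(a) < 0) != (dV(b) < 0):
            q0 = bisect(dV, a, b)
            kind = "stable" if d2V(q0) > 0 else "unstable"
            found.append((q0, kind))
    return found

result = equilibria()
for q0, kind in result:
    print(f"equilibrium near q = {q0:+.6f}: {kind}")
```

For this potential $V'(q) = 4q(q^2 - 1)$, so the scan should report stable equilibria near $q = \pm 1$ and an unstable one near $q = 0$.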


Figure 1.5: A principle of energy minimization determines a lever's balance. [Figure labels: unstable equilibrium, unstable equilibrium, stable equilibrium.]

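\[ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot q_i} \right) - \frac{\partial T}{\partial q_i} = Q_i. \tag{1.11} \]
\[ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q_i} \right) - \frac{\partial L}{\partial q_i} = 0. \tag{1.12} \]

D'Alembert's principle is an expression for Newton's second law under conditions where the virtual work done by the forces of constraint is zero.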

1.2.3 The Principle of Least Time

Time now to look at the second piece of history surrounding the principles of Lagrangian mechanics. As well as hints from statics, there were also hints from the behavior of light, hints again that nature likes to minimize effort. In a vacuum light moves in straight lines, which in Euclidean space is the minimum distance. But more interesting than straight lines are piecewise straight paths and curves. Consider reflection of light from a mirror.

[Figure: light travels from $A$ to the point $B$ on the mirror and on to $C$, with angle of incidence $\theta_1$ and angle of reflection $\theta_2$; $C'$ is the mirror image of $C$.]

What path does the light take? The empirical answer was known since antiquity: it chooses $B$ such that $\theta_1 = \theta_2$, so the angle of incidence equals the angle of reflection. But this is precisely the path that minimizes the distance of the trajectory subject to the condition that it must hit the mirror (at least at one point). In fact light traveling from $A$ to $C$ takes both the straight paths $ABC$ and $AC$. Why is $ABC$ the shortest path hitting the mirror? This follows from some basic Euclidean geometry:
\begin{align*}
 & B \text{ minimizes } AB + BC \\
 \iff\; & B \text{ minimizes } AB + BC' \\
 \iff\; & A, B, C' \text{ lie on a line} \\
 \iff\; & \theta_1 = \theta_2
\end{align*}
Note the introduction of the fictitious image $C'$ "behind" the mirror. This is a trick often used in solving electrostatic problems (a conducting surface can be replaced by fictitious mirror image charges to satisfy the boundary conditions). It is also used in geophysics when one has a geological fault, and in hydrodynamics when there is a boundary between two media (using mirror image sources and sinks).

The big clue leading to D'Alembert's principle however came from refraction of light. Snell (and predecessors) noted that each medium has some number $n$ associated with it, called the index of refraction, such that,
\[ n_1 \sin\theta_1 = n_2 \sin\theta_2 \]
(having normalized $n$ so that for a vacuum $n = 1$).

[Figure: a ray refracted at the boundary between medium 1 and medium 2, making angles $\theta_1$ and $\theta_2$ with the normal.]

Someone guessed the explanation, realizing that if the speed of light in a medium is proportional to $1/n$, then light will satisfy Snell's law if the light minimizes the time it takes to get from $A$ to $C$. In the case of refraction it is the time that is important, not just the path distance. But the same is true for the law of reflection, since in that case the path of minimum length gives the same results as the path of minimum time. So, not only is light the fastest thing around, it's also always taking the quickest path from here to there!
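The claim that least time implies Snell's law can be tested with a few lines of code. The following sketch makes several assumptions for illustration: the boundary between the media is the line $y = 0$, the endpoints are $A = (0, 1)$ and $C = (1, -1)$, the indices are $n_1 = 1$ and $n_2 = 1.33$, and the minimum is found by a simple ternary search. The names `travel_time` and `minimize` are hypothetical.

```python
import math

def travel_time(x, n1, n2, a=(0.0, 1.0), c=(1.0, -1.0)):
    """Travel time (in units where the vacuum speed of light is 1) from A to C
    along two straight legs meeting the boundary y = 0 at the point (x, 0).

    Medium 1 (index n1) fills y > 0 and medium 2 (index n2) fills y < 0;
    the speed of light in each medium is taken proportional to 1/n.
    """
    leg1 = math.hypot(x - a[0], a[1])
    leg2 = math.hypot(c[0] - x, c[1])
    return n1 * leg1 + n2 * leg2

def minimize(f, lo, hi, tol=1e-10):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

n1, n2 = 1.0, 1.33     # illustrative indices of refraction (assumed)
x = minimize(lambda u: travel_time(u, n1, n2), 0.0, 1.0)
sin1 = x / math.hypot(x, 1.0)                  # sine of the angle of incidence
sin2 = (1.0 - x) / math.hypot(1.0 - x, 1.0)    # sine of the angle of refraction
print("n1 sin(theta1) =", n1 * sin1)
print("n2 sin(theta2) =", n2 * sin2)
```

The two printed numbers should agree to within the accuracy of the search, which is Snell's law for the path of least time.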

1.2.4 How D'Alembert and Others Got to the Truth

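Sometimes laws of physics are just guessed using a bit of intuition and a gut feeling that nature must be beautiful or elegantly simple (though occasionally awesomely complex in beauty). One way to make good guesses is to generalize.

D'Alembert's principle of virtual work for statics says that equilibrium occurs when
\[ F(q_0) \cdot \delta q = 0, \quad \forall\, \delta q \in \mathbb{R}^n. \]
D'Alembert generalized this to dynamics by inventing what he called the "inertia force" $= -ma$, and he postulated that in dynamics equilibrium occurs when the total force $= F + \text{inertia force}$ vanishes. Or symbolically when,
\[ \left[ F(q(t)) - m\, a(t) \right] \cdot \delta q(t) = 0. \tag{1.13} \]
We then take a variational path parameterized by $s$,
\[ q_s(t) = q(t) + s\, \delta q(t) \]
where
\[ \delta q(t_0) = \delta q(t_1) = 0 \]
and with these paths, for any function $f$ on the space of paths we can define the variational derivative,
\[ \delta f := \frac{d}{ds} f(q_s)\Big|_{s=0}. \tag{1.14} \]
Then D'Alembert's principle of virtual work implies
\[ \int_{t_0}^{t_1} \left[ F(q) - m\, \ddot q \right] \cdot \delta q\, dt = 0 \]
for all $\delta q$, so if $F = -\nabla V$, we get
\begin{align*}
0 &= \int_{t_0}^{t_1} \left[ -\nabla V(q) - m\, \ddot q \right] \cdot \delta q\, dt \\
  &= \int_{t_0}^{t_1} \left[ -\nabla V(q) \cdot \delta q + m\, \dot q \cdot \delta \dot q \right] dt
\end{align*}
where the second term was integrated by parts and the boundary terms vanish because $\delta q(t_0) = \delta q(t_1) = 0$. Using
\[ \frac{d}{ds} V(q_s(t))\Big|_{s=0} = \frac{dV}{dq} \cdot \frac{dq_s(t)}{ds}\Big|_{s=0} \]
and
\[ \delta(\dot q^2) = \frac{d\, \dot q_s(t)^2}{ds}\Big|_{s=0} \]
we then have
\begin{align*}
0 &= \int_{t_0}^{t_1} \left[ -\frac{dV}{dq} \cdot \frac{dq_s}{ds} + \frac{m}{2} \frac{d\, \dot q_s^2(t)}{ds} \right] dt \,\Big|_{s=0} \\
  &= \frac{d}{ds} \int_{t_0}^{t_1} \left[ -V(q_s(t)) + \frac{m}{2}\, \dot q_s^2(t) \right] dt \,\Big|_{s=0} \\
  &= \delta \int_{t_0}^{t_1} \left[ -V(q_s(t)) + \frac{m}{2}\, \dot q_s^2(t) \right] dt
\end{align*}
therefore
\[ \delta \int_{t_0}^{t_1} \left( -V(q) + K \right) dt = 0 \]
so the path taken by the particle is a critical point of the action,
\[ S(q) = \int (K - V)\, dt. \tag{1.15} \]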

We've described how D'Alembert might have arrived at the principle of least action by generalizing previously known energy minimization and least time principles. Still, there's something unsatisfying about the treatment so far. We do not really understand why one must introduce the "inertia force". We only see that it's necessary to obtain agreement with Newtonian mechanics (which is manifest in Eq. (1.13)). We conclude with a few more words about this mystery. Recall from undergraduate physics that in an accelerating coordinate system there is a fictional force $= -ma$, which in a rotating frame is called the centrifugal force. We use it, for example, to analyze simple physics in a rotating reference frame. If you are inside the rotating system and you throw a ball straight ahead it will appear to curve away from your target, and if you did not know that you were rotating relative to the rest of the universe then you'd think there was a force on the ball equal to the centrifugal force. If you are inside a big rapidly rotating drum then you'll also feel pinned to the walls. This is an example of an inertia force which comes from using a funny coordinate system. In general relativity, one sees that in a certain sense gravity is an inertia force!

Chapter 2 Equations of Motion


(Week 2, April 4, 6, 8.) In this chapter we'll start to look at the Lagrangian equations of motion in more depth. We'll look at some specific examples of problem solving using the Euler-Lagrange equations. First we'll show how the equations are derived.

2.1 The Euler-Lagrange Equations

We are going to start thinking of a general classical system as a set of points in an abstract configuration space or phase space¹. So consider an arbitrary classical system as living in a space of points in some manifold $Q$. For example, the space for a spherical double pendulum would look like Fig. 2.1, where $Q = S^2 \times S^2$.

Figure 2.1: Double pendulum configuration space. [Figure: $Q = S^2 \times S^2$.]

[Footnote 1: The tangent bundle $TQ$ will be referred to as configuration space; later on, when we get to the chapter on Hamiltonian mechanics, we'll find a use for the cotangent bundle $T^*Q$, and normally we call this the phase space.]

So our system is "a particle in $Q$", which means you have to disabuse yourself of the notion that we're dealing with real particles. We are not; we are dealing with a single quasi-particle in an abstract higher dimensional space. The single quasi-particle represents two real particles if we are talking about the classical system in Fig. 2.1. Sometimes to make this clear we'll talk about "the system taking a path", instead of "the particle taking a path". It is then clear that when we say "the system follows a path $q(t)$" we're referring to the point $q$ in configuration space $Q$ that represents all of the particles in the real system.

So as time passes "the system" traces out a path
\[ q : [t_0, t_1] \to Q \]
and we define its velocity,
\[ \dot q(t) \in T_{q(t)} Q \]
to be the tangent vector at $q(t)$ given by the equivalence class $[\sigma]$ of curves through $q(t)$ with derivatives $\dot\sigma(t) = dq(s)/ds|_{s=t}$. We'll just write it as $\dot q(t)$.

Let $\Gamma$ be the space of smooth paths from $a \in Q$ to $b \in Q$,
\[ \Gamma = \{ q : [t_0, t_1] \to Q \mid q(t_0) = a,\ q(t_1) = b \} \]
($\Gamma$ is an infinite dimensional manifold, but we won't go into that for now.) Let the Lagrangian $= L$ for the system be any smooth function of position and velocity (not explicitly of time, for simplicity),
\[ L : TQ \to \mathbb{R} \]
and define the action, $S$:
\[ S : \Gamma \to \mathbb{R} \]
by
\[ S(q) := \int_{t_0}^{t_1} L(q, \dot q)\, dt \tag{2.1} \]

The path that the quasi particle will actually take is a critical point of $S$, in accord with D'Alembert's principle of least action. In other words, a path $q \in \Gamma$ such that for any smooth 1-parameter family of paths $q_s \in \Gamma$ with $q_0 = q$, we have
\[ \frac{d}{ds} S(q_s)\Big|_{s=0} = 0. \tag{2.2} \]
We write
\[ \frac{d}{ds}\Big|_{s=0} \quad \text{as} \quad \delta \]
so Eq. (2.2) can be rewritten
\[ \delta S = 0. \tag{2.3} \]


2.1.1 Comments

What is a "1-parameter family of paths"? Well, a path is a curve, or a 1D manifold. So the 1-parameter family is nothing more nor less than a set of well-defined paths $\{q_s\}$, each one labeled by a parameter $s$. So a smooth 1-parameter family of paths will have $q(s)$ everywhere infinitesimally close to $q(s + \epsilon)$ for an infinitesimal hyperreal $\epsilon$. So in Fig. 2.2 we can go from $q_0$ to $q_s$ by smoothly varying $s$ from $s = 0$ to $s = s$.

Figure 2.2: Schematic of a 1-parameter family of curves. [Figure: the curves $q_0$ and $q_s$.]

What does the condition $\delta S = 0$ imply? Patience, we are just getting to that. We will now start to explore what $\delta S = 0$ means for our Lagrangian.

2.1.2 Lagrangian Dynamics

We were given that $Q$ is a manifold, so it admits a covering of coordinate charts. For now, let's pick coordinates in a neighborhood $U$ of some point $q(t) \in Q$. Next, consider only variations $q_s$ such that $q_s = q$ outside $U$. A cartoon of this looks like Fig. 2.3.

Figure 2.3: Local path variation (a path from $a$ to $b$, varied only inside $U$).

Then we restrict attention to a subinterval $[t_0', t_1'] \subseteq [t_0, t_1]$ such that $q_s(t) \in U$ for $t_0' \le t \le t_1'$. Let's just go ahead and rename $t_0'$ and $t_1'$ as $t_0$ and $t_1$ to drop the primes. We can use the coordinate charts on $U$,
\[ \varphi : U \to \mathbb{R}^n, \qquad x \mapsto \varphi(x) = (x^1, x^2, \dots, x^n) \]
and we also have coordinates for the tangent vectors,
\[ d\varphi : TU \to T\mathbb{R}^n \cong \mathbb{R}^n \times \mathbb{R}^n, \qquad (x, y) \mapsto d\varphi(x, y) = (x^1, \dots, x^n, y^1, \dots, y^n) \]
where $y \in T_xQ$. We restrict $L : TM \to \mathbb{R}$ to $TU \subseteq TM$ (in our case the manifold is $M = Q$), and then we can describe $L$ using the coordinates $x^i, y^i$ on $TU$. The $x^i$ are generalized position coordinates, and the $y^i$ are the associated generalized velocity coordinates. (Velocity and position are in the abstract configuration space $Q$.) Using these coordinates we get,
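\[ \delta S = \delta \int_{t_0}^{t_1} L(q(t), \dot q(t))\, dt = \int_{t_0}^{t_1} \delta L(q, \dot q)\, dt = \int_{t_0}^{t_1} \left( \frac{\partial L}{\partial x^i}\, \delta q^i(t) + \frac{\partial L}{\partial y^i}\, \delta \dot q^i(t) \right) dt \]
where we've used the given smoothness of $L$ and the Einstein summation convention for repeated indices $i$. Note that we can write $\delta L$ as above using a local coordinate patch because the path variations $\delta q$ are entirely trivial outside the patch $U$. Continuing, using the Leibniz rule
\[ \frac{d}{dt}\left( \frac{\partial L}{\partial y}\, \delta q \right) = \left( \frac{d}{dt} \frac{\partial L}{\partial y} \right) \delta q + \frac{\partial L}{\partial y}\, \delta \dot q \]
we have,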
\[ \delta S = \int_{t_0}^{t_1} \left( \frac{\partial L}{\partial x^i} - \frac{d}{dt} \frac{\partial L}{\partial y^i} \right) \delta q^i(t)\, dt = 0. \]
If this integral is to vanish as demanded by $\delta S = 0$, then it must vanish for all path variations $\delta q$. Further, the boundary terms vanish because we deliberately chose $\delta q$ that vanish at the endpoints $t_0$ and $t_1$ inside $U$. That means the term in brackets must be identically zero, or
\[ \frac{d}{dt} \frac{\partial L}{\partial y^i} - \frac{\partial L}{\partial x^i} = 0 \tag{2.4} \]
This is necessary to get $\delta S = 0$ for all $\delta q$, but in fact it's also sufficient. Physicists always give the coordinates $x^i, y^i$ on $TU$ the symbols "$q^i$" and "$\dot q^i$", despite the fact that these also have another meaning, namely the $x^i$ and $y^i$ coordinates of the quantity $(q(t), \dot q(t)) \in TU$. So in any case, physicists write,
\[ \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} = \frac{\partial L}{\partial q^i} \]
and they call these the Euler-Lagrange equations.
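As a quick sanity check of the Euler-Lagrange equations in the physicists' notation, here is a minimal sketch using the SymPy library (an assumption of this example; the notes themselves do not use it). The harmonic-oscillator Lagrangian $L = \tfrac12 m \dot q^2 - \tfrac12 k q^2$ and the names `m`, `k`, `residual` are chosen purely for illustration. SymPy's `euler_equations` returns the equation $\partial L/\partial q - \frac{d}{dt}\partial L/\partial \dot q = 0$, which should be equivalent to $m\ddot q = -kq$.

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.symbols("t")
m, k = sp.symbols("m k", positive=True)   # illustrative mass and spring constant
q = sp.Function("q")(t)                   # the path q(t)

# Assumed example Lagrangian: L = K - V for a 1D harmonic oscillator
L = sp.Rational(1, 2) * m * sp.diff(q, t) ** 2 - sp.Rational(1, 2) * k * q ** 2

# euler_equations returns a list with one Eq per unknown function
(el_eq,) = euler_equations(L, [q], [t])

# el_eq reads  -k*q - m*q'' = 0 ; adding m*q'' + k*q should leave nothing
residual = sp.simplify(el_eq.lhs - el_eq.rhs + m * sp.diff(q, t, 2) + k * q)
print(el_eq)
print(residual)
```

The vanishing residual confirms that, for this particular $L$, Eq. (2.4) is just Newton's second law with the force $-kq$.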

2.2 Interpretation of Terms

The derivation of the Euler-Lagrange equations above was fairly abstract: the terms "position" and "velocity" were used but were not assumed to be the usual kinematic notions that we are used to in physics; indeed the only reason we used those terms was for their analogical appeal. Now we'll try to illuminate the E-L equations a bit by casting them into the usual position and velocity terms. So consider,
\[ Q = \mathbb{R}^n, \qquad L(q, \dot q) = \tfrac12 m\, \dot q \cdot \dot q - V(q) = \tfrac12 m\, \dot q^i \dot q_i - V(q) \]

\begin{tabular}{lll}
TERMS & MEANING (in this example) & MEANING (in general) \\
$\partial L / \partial \dot q^i$ & $m \dot q_i$ & the momentum $p_i$ \\
$\partial L / \partial q^i$ & $-(\nabla V)_i$ & the force $F_i$
\end{tabular}

When we write $\partial V/\partial q^i = \nabla V$ we're assuming the $q^i$ are Cartesian coordinates on $Q$.

So translating our example into general terms, if we conjure up some abstract Lagrangian then we can think of the independent variables as generalized positions and velocities, and then the Euler-Lagrange equations can be interpreted as equations relating generalized concepts of momentum and force, and they say that
\[ \dot p = F \tag{2.5} \]

So there's no surprise that in the mundane case of a single particle moving in $\mathbb{R}^3$ over time $t$ this just recovers Newton II. Of course we can do all of our classical mechanics with Newton's laws; it's just a pain in the neck to deal with the redundancies in $F = ma$ when we could use symmetry principles to vastly simplify many examples. It turns out that the Euler-Lagrange equations are one of the reformulations of Newtonian physics that make it highly convenient to introduce symmetries and the consequent simplifications. Simplifications generally mean quicker, shorter solutions and more transparent analysis, or at least more chance at insight into the characteristics of the system. The main thing is that when we use symmetry to simplify the equations we are reducing the number of independent variables, so we get closer to the fundamental degrees of freedom of the system and cut out a lot of the chaff that comes with the full redundant Newton equations. One can of course introduce simplifications when solving Newton's equations; it's just that it's easier to do this when working with the Euler-Lagrange equations. Another good reason to learn Lagrangian mechanics is that it translates better into quantum mechanics.

Chapter 3 Lagrangians and Noether's Theorem

If the form of a system of dynamical equations does not change under spatial translations then the momentum is a conserved quantity. When the form of the equations is similarly invariant under time translations then the total energy is a conserved quantity (a constant of the equations of motion). Time and space translations are examples of 1-parameter groups of transformations. Invariance under a group of transformations is precisely what we mean by a symmetry in group theory. So symmetries of a dynamical system give conserved quantities or conservation laws. The rigorous statement of all this is the content of Noether's theorem.

3.1 Time Translation

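To handle time translations we need to replace our paths $q : [t_0, t_1] \to Q$ by paths $q : \mathbb{R} \to Q$, and then define a new space of paths,
\[ \Gamma = \{ q : \mathbb{R} \to Q \}. \]
The bad news is that the action
\[ S(q) = \int_{-\infty}^{\infty} L(q(t), \dot q(t))\, dt \]
typically will not converge, so $S$ is then no longer a function on the space of paths. Nevertheless, if $\delta q = 0$ outside of some finite interval, then the functional variation,
\[ \delta S := \int_{-\infty}^{\infty} \frac{d}{ds} L(q_s(t), \dot q_s(t)) \Big|_{s=0}\, dt \]
will converge, since the integrand is smooth and vanishes outside this interval. Moreover, demanding that this $\delta S$ vanishes for all such variations $\delta q$ is enough to imply the Euler-Lagrange equations:
\[ \begin{aligned}
\delta S &= \int_{-\infty}^{\infty} \frac{d}{ds} L(q_s(t), \dot q_s(t)) \Big|_{s=0}\, dt \\
&= \int_{-\infty}^{\infty} \left( \frac{\partial L}{\partial q^i}\, \delta q^i + \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i \right) dt \\
&= \int_{-\infty}^{\infty} \left( \frac{\partial L}{\partial q^i} - \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} \right) \delta q^i\, dt
\end{aligned} \]
where again the boundary terms have vanished since $\delta q = 0$ near $t = \pm\infty$. To be explicit, the first term in
\[ \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q^i}\, \delta q^i \right) - \left( \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} \right) \delta q^i \]
vanishes when we integrate. Then the whole thing vanishes for all compactly supported smooth $\delta q$ if and only if
\[ \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} = \frac{\partial L}{\partial q^i}. \]
Recall that
\[ \frac{\partial L}{\partial \dot q^i} = p_i \quad \text{is the generalized momentum, by definition,} \]
\[ \frac{\partial L}{\partial q^i} = \dot p_i \quad \text{is the force, by the E-L eqns.} \]
Note the similarity to Hamilton's equations: if you change $L$ to $H$ you need to stick in a minus sign, and change variables from $\dot q$ to $p_i$ and eliminate $\dot p_i$.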

3.1.1 Canonical and Generalized Coordinates

In light of this noted similarity with the Hamilton equations of motion, let's spend a few moments clearing up some terminology (I hate using jargon, but sometimes it's unavoidable, and sometimes it can be efficient, provided everyone is clued in).

Generalized Coordinates

For Lagrangian mechanics we have been using generalized coordinates; these are the $\{q^i, \dot q^i\}$. The $q^i$ are generalized positions, and the $\dot q^i$ are generalized velocities. The full set of independent generalized coordinates represents the degrees of freedom of a particle, or system of particles. So if we have $N$ particles then we'd typically have $6N$ generalized coordinates (the "6" is for 3 space dimensions, with a position and a velocity component in each). These can be in any reference frame or system of axes, so for example, in a Cartesian frame, with two particles, in 3D space we'd have the $2 \times 3 = 6$ position coordinates and $2 \times 3 = 6$ velocities,
\[ \{x_1, y_1, z_1, x_2, y_2, z_2\}, \qquad \{u_1, v_1, w_1, u_2, v_2, w_2\} \]
where, say, $u = v_x$, $v = v_y$, $w = v_z$ are the Cartesian velocity components. This makes $12 = 6 \times 2 = 6N$ coordinates, matching the total degrees of freedom as claimed. If we constrain the particles to move in a plane (say place them on a table in a gravitational field) then we get $2N$ fewer degrees of freedom, and so $4N$ d.o.f. overall. By judicious choice of coordinate frame we can eliminate one velocity component and one position component for each particle. It is also handy to respect other symmetries of a system; maybe the particles move on a sphere, for example. One can then define new positions and velocities, with a consequent reduction in the number of generalized coordinates needed to describe the system.

Canonical Coordinates

In Hamiltonian mechanics (which we have not yet fully introduced) we will find it more useful to transform from generalized coordinates to canonical coordinates. The canonical coordinates are a special set of coordinates on the cotangent bundle of the configuration space manifold $Q$. They are usually written as a set of $(q^i, p_j)$ or $(x^i, p_j)$, with the $x$'s or $q$'s denoting the coordinates on the underlying manifold and the $p$'s denoting the conjugate momenta, which are 1-forms in the cotangent bundle at the point $q$ in the manifold. It turns out that the $q^i$ together with the $p_j$ form a coordinate system on the cotangent bundle $T^*Q$ of the configuration space $Q$, hence these coordinates are called the canonical coordinates. We will not discuss this here, but if you care to know, later on we'll see that the relation between the generalized coordinates and the canonical coordinates is given by the Legendre transformation, with $p_i = \partial L/\partial \dot q^i$.

3.2 Symmetry and Noether's Theorem

First, let's give a useful definition that will make it easy to refer to a type of dynamical system symmetry. We want to refer to symmetry transformations (of the Lagrangian) governed by a single parameter.

Definition 3.1 (one-parameter family of symmetries). A 1-parameter family of symmetries of a Lagrangian system $L : TQ \to \mathbb{R}$ is a smooth map,
\[ F : \mathbb{R} \times \Gamma \to \Gamma, \qquad (s, q) \mapsto q_s, \qquad \text{with } q_0 = q \]
such that there exists a function $\ell : TQ \to \mathbb{R}$ for which
\[ \delta L = \frac{d\ell}{dt}, \]
that is,
\[ \frac{d}{ds} L(q_s(t), \dot q_s(t)) \Big|_{s=0} = \frac{d}{dt}\, \ell(q(t), \dot q(t)) \]
for all paths $q$.

Remark: The simplest case is $\delta L = 0$, in which case we really have a way of moving paths around ($q \mapsto q_s$) that doesn't change the Lagrangian, i.e., a symmetry of $L$ in the most obvious way. But $\delta L = \frac{d}{dt}\ell$ is a sneaky generalization whose usefulness will become clear.

3.2.1 Noether's Theorem

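Here's a statement of the theorem. Note that $\ell$ in this theorem is the function associated with $F$ in Definition 3.1.

Theorem 3.1 (Noether's Theorem). Suppose $F$ is a one-parameter family of symmetries of the Lagrangian system $L : TQ \to \mathbb{R}$. Then
\[ p_i\, \delta q^i - \ell \]
is conserved, that is, its time derivative is zero for any path $q \in \Gamma$ satisfying the Euler-Lagrange equations. In other words, in boring detail,
\[ \frac{d}{dt} \left[ \frac{\partial L}{\partial y^i}(q(t), \dot q(t))\, \frac{d}{ds} q_s^i(t) \Big|_{s=0} - \ell(q(t), \dot q(t)) \right] = 0 \]

Proof.
\[ \begin{aligned}
\frac{d}{dt}\left( p_i\, \delta q^i - \ell \right) &= \dot p_i\, \delta q^i + p_i\, \delta \dot q^i - \frac{d\ell}{dt} \\
&= \frac{\partial L}{\partial q^i}\, \delta q^i + \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i - \delta L \\
&= \delta L - \delta L = 0.
\end{aligned} \]

"OK, big deal," you might say. Before this can be of any use we'd need to find a symmetry $F$. Then we'd need to find out what this $p_i\, \delta q^i - \ell$ business is that is conserved. So let's look at some examples.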


Example 1. Conservation of Energy. (A most important example!) All of our Lagrangian systems will have time translation invariance (because the laws of physics do not change with time, at least not to any extent that we can tell). So we have a one-parameter family of symmetries
\[ q_s(t) = q(t + s) \]
This indeed gives
\[ \delta L = \dot L \]
for
\[ \frac{d}{ds} L(q_s) \Big|_{s=0} = \frac{d}{dt} L = \dot L \]
so here we simply take $\ell = L$! We then get the conserved quantity
\[ p_i\, \delta q^i - \ell = p_i\, \dot q^i - L \]
which we normally call the energy. For example, if $Q = \mathbb{R}^n$, and if
\[ L = \tfrac12 m \dot q^2 - V(q) \]
then this quantity is
\[ m\, \dot q \cdot \dot q - \left( \tfrac12 m\, \dot q \cdot \dot q - V \right) = \tfrac12 m \dot q^2 + V(q) \]
The term in parentheses is $K - V$, and the result on the right-hand side is $K + V$.

Let's repeat this example, this time with a specific Lagrangian. It doesn't matter what the Lagrangian is: if it has 1-parameter families of symmetries then it'll have conserved quantities, guaranteed. The trick in physics is to write down a correct Lagrangian in the first place! (Something that will accurately describe the system of interest.)

3.3 Conserved Quantities from Symmetries

We've seen that any 1-parameter family
\[ F_s : \Gamma \to \Gamma, \qquad q \mapsto q_s \]
which satisfies
\[ \delta L = \dot\ell \]
for some function $\ell = \ell(q, \dot q)$ gives a conserved quantity
\[ p_i\, \delta q^i - \ell. \]
As usual we've defined
\[ \delta L := \frac{d}{ds} L(q_s(t), \dot q_s(t)) \Big|_{s=0}. \]
Let's see how we arrive at a conserved quantity from a symmetry.

3.3.1 Time Translation Symmetry

For any Lagrangian system $L : TQ \to \mathbb{R}$ (with no explicit time dependence), we have a 1-parameter family of symmetries
\[ q_s(t) = q(t + s) \]
because
\[ \delta L = \dot L \]
so we get a conserved quantity called the total energy or Hamiltonian,
\[ H = p_i\, \dot q^i - L \tag{3.1} \]
(You might prefer "Hamiltonian" to "total energy" because in general we are not in the same configuration space as Newtonian mechanics; if you are doing Newtonian mechanics then "total energy" is appropriate.)

For example: a particle on $\mathbb{R}^n$ in a potential $V$ has $Q = \mathbb{R}^n$ and $L(q, \dot q) = \tfrac12 m \dot q^2 - V(q)$. This system has
\[ p_i\, \dot q^i = \frac{\partial L}{\partial \dot q^i}\, \dot q^i = m \dot q^2 = 2K \]
so
\[ H = p_i\, \dot q^i - L = 2K - (K - V) = K + V \]
as you'd have hoped.
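The claim that $H = p_i \dot q^i - L$ is conserved along solutions can be checked symbolically. The following is a minimal sketch with SymPy (an assumption of this example), for a single coordinate and an illustrative quartic potential $V = \tfrac14 k q^4$ that does not appear in the notes. It builds $H$ from $L$, takes $dH/dt$, and then imposes the Euler-Lagrange equation $m\ddot q = -V'(q)$.

```python
import sympy as sp

t = sp.symbols("t")
m, k = sp.symbols("m k", positive=True)
q = sp.Function("q")(t)
qdot = sp.diff(q, t)

V = k * q**4 / 4                 # assumed example potential, for illustration only
L = m * qdot**2 / 2 - V          # L = K - V

p = sp.diff(L, qdot)             # generalized momentum p = dL/d(qdot)
H = sp.expand(p * qdot - L)      # Noether charge for time translation

# Impose the Euler-Lagrange equation  m*qddot = -dV/dq  ("on shell")
qddot_on_shell = -sp.diff(V, q) / m
dH_dt = sp.simplify(sp.diff(H, t).subs(sp.diff(q, t, 2), qddot_on_shell))

print(H)       # K + V for this example
print(dH_dt)   # vanishes on solutions
```

Off shell, $dH/dt$ does not vanish in general; it is the substitution of the equation of motion that kills it, exactly as in the proof of Noether's theorem.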

3.3.2 Space Translation Symmetry

For a free particle in $\mathbb{R}^n$, we have $Q = \mathbb{R}^n$ and $L = K = \tfrac12 m \dot q^2$. This has spatial translation symmetries, so that for any $v \in \mathbb{R}^n$ we have the symmetry
\[ q_s(t) = q(t) + s\, v \]
with
\[ \delta L = 0 \]
because $\delta \dot q = 0$ and $L$ depends only on $\dot q$, not on $q$, in this particular case. (Since $L$ does not depend upon $q^i$ we'll call $q^i$ an ignorable coordinate; as above, these ignorables always give symmetries, hence conserved quantities. It is often useful, therefore, to change coordinates so as to make some of them ignorable if possible!)

In this example we get a conserved quantity called momentum in the $v$ direction:
\[ p_i\, \delta q^i = m \dot q_i v^i = m\, \dot q \cdot v \]
Aside: Note the subtle difference between two uses of the term "momentum"; here it is a conserved quantity derived from space translation invariance, but earlier it was a different thing, namely the momentum $\partial L/\partial \dot q^i = p_i$ conjugate to $q^i$. These two different "momentums" happen to be the same in this example!

Since this is conserved for all $v$ we say that $m \dot q \in \mathbb{R}^n$ is conserved. (In fact the whole Lie group $G = \mathbb{R}^n$ is acting as a translation symmetry group, and we're getting a $\mathfrak{g}$ ($= \mathbb{R}^n$)-valued conserved quantity!)

3.3.3 Rotational Symmetry

The free particle in $\mathbb{R}^n$ also has rotation symmetry. Consider any $X \in \mathfrak{so}(n)$ (that is, a skew-symmetric $n \times n$ matrix); then for all $s \in \mathbb{R}$ the matrix $e^{sX}$ is in $SO(n)$, that is, it describes a rotation. This gives a 1-parameter family of symmetries
\[ q_s(t) = e^{sX} q(t) \]
which has
\[ \delta L = \frac{\partial L}{\partial q^i}\, \delta q^i + \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i = m \dot q_i\, \delta \dot q^i \]
Now $q^i$ is ignorable and so $\partial L/\partial q^i = 0$, and $\partial L/\partial \dot q^i = p_i$, and
\[ \begin{aligned}
\delta \dot q^i &= \frac{d}{ds}\, \dot q_s^i \Big|_{s=0} \\
&= \frac{d}{ds} \frac{d}{dt} \left( e^{sX} q \right) \Big|_{s=0} \\
&= \frac{d}{dt}\, X q \\
&= X \dot q
\end{aligned} \]
So,
\[ \delta L = m \dot q_i X^i{}_j\, \dot q^j = m\, \dot q \cdot (X \dot q) = 0 \]
since $X$ is skew-symmetric as stated previously ($X \in \mathfrak{so}(n)$). So we get a conserved quantity, the angular momentum in the $X$ direction. (Note: this whole bunch of maths above for $\delta L$ just says that the kinetic energy doesn't change when the velocity is rotated, in other words when its direction changes but not its magnitude.) We write,
\[ p_i\, \delta q^i = m \dot q_i (X q)^i \]
($\delta q^i = X q$ just as $\delta \dot q^i = X \dot q$ in our previous calculation), or if $X$ has zero entries except in the $ij$ and $ji$ positions, where it's $\pm 1$, then we get
\[ m (\dot q_i q^j - \dot q_j q^i) \]
the "$ij$ component of angular momentum". If $n = 3$ we write these as
\[ m\, q \times \dot q \]
Note that above we have assumed one can construct a basis for $\mathfrak{so}(n)$ using matrices of the form assumed for $X$, i.e., skew-symmetric with $\pm 1$ in the respective $ij$ and $ji$ elements and zero otherwise.

I mentioned earlier that we can do mechanics with any Lagrangian, but if we want to be useful we'd better pick a Lagrangian that actually describes a real system. But how do we do that? All this theory is fine but is useless unless you know how to apply it. The above examples were for a particularly simple system, a free particle, for which the Lagrangian is just the kinetic energy, since there is no potential energy variation for a free particle. We'd like to know how to solve more complicated dynamics.

The general idea is to guess the kinetic energy and potential energy of the particle (as functions of your generalized positions and velocities) and then let
\[ L = K - V \]
So we are not using Lagrangians directly to tell us what the fundamental physical laws should be; instead we plug in some assumed physics and use the Lagrangian approach to solve the system of equations. If we like, we can then compare our answers with experiments, which indirectly tells us something about the physical laws, but only provided the Lagrangian formulation of mechanics is itself a valid procedure in the first place.
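Returning for a moment to the rotational symmetry example for $n = 3$, the two facts used there can be checked by brute force. The sketch below uses SymPy (an assumption of this example) and hypothetical names: a general skew-symmetric matrix `X` built from a vector `omega = (a, b, c)`, and `v` standing in for $\dot q$. It checks that $\delta L = m\, \dot q \cdot (X \dot q)$ vanishes, and that the Noether charge $m\, \dot q \cdot (X q)$ is the component of $m\, q \times \dot q$ along `omega`.

```python
import sympy as sp

m = sp.symbols("m", positive=True)
a, b, c = sp.symbols("a b c", real=True)
q = sp.Matrix(list(sp.symbols("q1 q2 q3", real=True)))   # position
v = sp.Matrix(list(sp.symbols("v1 v2 v3", real=True)))   # stands for q-dot

omega = sp.Matrix([a, b, c])
# General element of so(3); with this layout X*q equals omega x q
X = sp.Matrix([[0, -c, b],
               [c, 0, -a],
               [-b, a, 0]])

delta_L = sp.expand(m * v.dot(X * v))          # m qdot.(X qdot)
noether_charge = m * v.dot(X * q)              # p_i delta q^i with delta q = X q
angular_momentum = m * q.cross(v)              # m q x qdot
difference = sp.expand(noether_charge - omega.dot(angular_momentum))

print(delta_L)      # vanishes because X is skew-symmetric
print(difference)   # vanishes: the charge is omega . (m q x qdot)
```

Choosing `omega` along a coordinate axis picks out a single $ij$ component, matching the expression $m(\dot q_i q^j - \dot q_j q^i)$ above.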


3.4 Example Problems

(Week 3, Apr. 11, 13, 15.) To see how the formalisms in this chapter function in practice, let's do some problems. The Lagrangian approach is vastly superior to the simplistic $F = ma$ formulation of mechanics: it allows the configuration space to be any manifold, and allows us to easily use any coordinates we wish.

3.4.1 The Atwood Machine

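A frictionless pulley with two masses, $m_1$ and $m_2$, hanging from it.

(Figure: the Atwood machine, with $m_1$ hanging a length $x$ and $m_2$ a length $\ell - x$ below the pulley.)

We have
\[ \begin{aligned}
K &= \tfrac12 (m_1 + m_2) \left( \frac{d}{dt}(\ell - x) \right)^2 = \tfrac12 (m_1 + m_2)\, \dot x^2 \\
V &= -m_1 g x - m_2 g (\ell - x)
\end{aligned} \]
so
\[ L = K - V = \tfrac12 (m_1 + m_2)\, \dot x^2 + m_1 g x + m_2 g (\ell - x) \]
The configuration space is $Q = (0, \ell)$, and $x \in (0, \ell)$ (we could use the "owns" symbol $\ni$ here and write $Q = (0, \ell) \ni x$). Moreover $TQ = (0, \ell) \times \mathbb{R} \ni (x, \dot x)$. As usual $L : TQ \to \mathbb{R}$. Note that solutions of the Euler-Lagrange equations will only be defined for some times $t \in \mathbb{R}$, as eventually the solution reaches the "edge" of $Q$.

The momentum is
\[ p = \frac{\partial L}{\partial \dot x} = (m_1 + m_2)\, \dot x \]
and the force is
\[ F = \frac{\partial L}{\partial x} = (m_1 - m_2)\, g \]
The Euler-Lagrange equations say
\[ \begin{aligned}
\dot p &= F \\
(m_1 + m_2)\, \ddot x &= (m_1 - m_2)\, g \\
\ddot x &= \frac{m_1 - m_2}{m_1 + m_2}\, g
\end{aligned} \]
So this is like a falling object in a downwards gravitational acceleration $a = \frac{m_1 - m_2}{m_1 + m_2}\, g$. It is trivial to integrate the expression for $\ddot x$ twice (feeding in some appropriate initial conditions) to obtain the complete solution to the motion, $x(t)$ and $\dot x(t)$. Note that $\ddot x = 0$ when $m_1 = m_2$, and $\ddot x = g$ if $m_2 = 0$.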
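The Atwood machine calculation is short enough to reproduce mechanically. The following is a minimal sketch with SymPy (an assumption of this example; `ell` is a hypothetical name for the string length $\ell$): it builds $L = K - V$ as above, asks `euler_equations` for the equation of motion, and solves it for $\ddot x$.

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.symbols("t")
m1, m2, g, ell = sp.symbols("m1 m2 g ell", positive=True)
x = sp.Function("x")(t)          # length of string on the m1 side

K = sp.Rational(1, 2) * (m1 + m2) * sp.diff(x, t) ** 2
V = -m1 * g * x - m2 * g * (ell - x)
L = K - V

# One Euler-Lagrange equation for the single coordinate x
(el_eq,) = euler_equations(L, [x], [t])

# Solve it for the acceleration x''
xddot = sp.solve(el_eq, sp.diff(x, t, 2))[0]
print(sp.simplify(xddot))        # (m1 - m2) g / (m1 + m2)
```

Setting `m1 = m2` or `m2 = 0` in `xddot` reproduces the two limiting cases noted above.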

3.4.2 Disk Pulled by Falling Mass

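Consider next a disk pulled across a table by a falling mass. The disk is free to move on a frictionless surface, and it can thus whirl around the hole through which it is tethered to the mass below.

(Figure: disk $m_1$ on the table at distance $r$ from the hole; mass $m_2$ hanging a length $\ell - r$ below the table; no swinging allowed!)

Here
\[ \begin{aligned}
Q &= \text{open disk of radius } \ell \text{, minus its center} = (0, \ell) \times S^1 \ni (r, \theta) \\
TQ &= (0, \ell) \times S^1 \times \mathbb{R} \times \mathbb{R} \ni (r, \theta, \dot r, \dot\theta) \\
K &= \tfrac12 m_1 (\dot r^2 + r^2 \dot\theta^2) + \tfrac12 m_2 \left( \frac{d}{dt}(\ell - r) \right)^2 \\
V &= g m_2 (r - \ell) \\
L &= \tfrac12 m_1 (\dot r^2 + r^2 \dot\theta^2) + \tfrac12 m_2\, \dot r^2 + g m_2 (\ell - r)
\end{aligned} \]
having noted that $\ell$ is constant so $d/dt(\ell - r) = -\dot r$. For the momenta we get,
\[ \begin{aligned}
p_r &= \frac{\partial L}{\partial \dot r} = (m_1 + m_2)\, \dot r \\
p_\theta &= \frac{\partial L}{\partial \dot\theta} = m_1 r^2 \dot\theta.
\end{aligned} \]
Note that $\theta$ is an "ignorable coordinate" (it doesn't appear in $L$), so there's a symmetry, rotational symmetry, and $p_\theta$, the conjugate momentum, is conserved.

The forces are,
\[ \begin{aligned}
F_r &= \frac{\partial L}{\partial r} = m_1 r \dot\theta^2 - g m_2 \\
F_\theta &= \frac{\partial L}{\partial \theta} = 0 \quad (\theta \text{ is ignorable})
\end{aligned} \]
Note: in $F_r$ the term $m_1 r \dot\theta^2$ is recognizable as a centrifugal force, pushing $m_1$ radially out, while the term $-g m_2$ is gravity pulling $m_2$ down and thus pulling $m_1$ radially in.

So, the Euler-Lagrange equations give,
\[ \begin{aligned}
\dot p_r &= F_r, & (m_1 + m_2)\, \ddot r &= m_1 r \dot\theta^2 - m_2 g \\
\dot p_\theta &= 0, & p_\theta = m_1 r^2 \dot\theta &= J = \text{a constant.}
\end{aligned} \]
Let's use our conservation law here to eliminate $\dot\theta$ from the first equation:
\[ \dot\theta = \frac{J}{m_1 r^2} \]
so
\[ (m_1 + m_2)\, \ddot r = \frac{J^2}{m_1 r^3} - m_2 g \]
Thus effectively we have a particle on $(0, \ell)$ of mass $m = m_1 + m_2$ feeling a force
\[ F_r = \frac{J^2}{m_1 r^3} - m_2 g \]
which could come from an "effective potential" $V(r)$ such that $dV/dr = -F_r$. So integrate $-F_r$ to find $V(r)$:
\[ V(r) = \frac{J^2}{2 m_1 r^2} + m_2 g r \]
This is a sum of two terms that look like Fig. 3.1.

If $\dot\theta(t = 0) = 0$ then there is no centrifugal force and the disk will be pulled into the hole until it gets stuck. At that time the disk reaches the hole, which is topologically the center of the disk that has been removed from $Q$, so then we've hit the boundary of $Q$ and our solution is broken.

At $r = r_0$, the minimum of $V(r)$, our disc mass $m_1$ will be in a stable circular orbit of radius $r_0$ (which depends upon $J$). Otherwise we get orbits like Fig. 3.2.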

Figure 3.1: Potential function for disk pulled by gravitating mass. (Curves shown: $V(r)$, the attractive gravitational potential, and the repulsive centrifugal potential; the minimum of $V(r)$ is at $r_0$.)

Figure 3.2: Orbits for the disc and gravitating mass system.
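The radius $r_0$ of the circular orbit is not written out above, but it follows from setting $dV/dr = 0$, which gives $r_0^3 = J^2/(m_1 m_2 g)$. The sketch below, using SymPy (an assumption of this example), checks that this candidate $r_0$ is a zero of the effective force and that the effective potential curves upward there, so the orbit is stable. The name `V_eff` is hypothetical.

```python
import sympy as sp

r, J, m1, m2, g = sp.symbols("r J m1 m2 g", positive=True)

# Effective potential from the text
V_eff = J**2 / (2 * m1 * r**2) + m2 * g * r
F_r = -sp.diff(V_eff, r)                      # effective radial force

# Candidate circular-orbit radius, from dV/dr = 0
r0 = (J**2 / (m1 * m2 * g)) ** sp.Rational(1, 3)

force_at_r0 = sp.simplify(F_r.subs(r, r0))
curvature_at_r0 = sp.diff(V_eff, r, 2).subs(r, r0)

print(force_at_r0)                  # no net radial force at r0
print(curvature_at_r0.is_positive)  # V'' > 0, so r0 is a minimum
```

Larger angular momentum $J$ pushes $r_0$ outward, while a heavier hanging mass $m_2$ pulls it in, as the formula for $r_0$ makes explicit.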

3.4.3 Free Particle in Special Relativity

In relativistic dynamics the parameter that parametrizes the particle's path in Minkowski spacetime need not be the "time coordinate"; indeed in special relativity there are many allowed time coordinates. Minkowski spacetime is
\[ \mathbb{R}^{n+1} \ni (x^0, x^1, \dots, x^n) \]
if space is $n$-dimensional. We normally take $x^0$ as "time" and $(x^1, \dots, x^n)$ as "space", but of course this is all relative to one's reference frame. Someone else travelling at some high velocity relative to us will have to make a Lorentz transformation to translate from our coordinates to theirs. Minkowski spacetime has a Lorentzian metric
\[ g(v, w) = v^0 w^0 - v^1 w^1 - \dots - v^n w^n = \eta_{\mu\nu} v^\mu w^\nu \]
where
\[ \eta_{\mu\nu} = \begin{pmatrix}
1 & 0 & 0 & \dots & 0 \\
0 & -1 & 0 & \dots & 0 \\
0 & 0 & -1 & & 0 \\
\vdots & \vdots & & \ddots & \vdots \\
0 & 0 & 0 & \dots & -1
\end{pmatrix} \]
In special relativity we take spacetime to be the configuration space of a single point particle, so we let $Q$ be Minkowski spacetime, i.e., $\mathbb{R}^{n+1} \ni (x^0, \dots, x^n)$ with the metric $\eta_{\mu\nu}$ defined above. Then the path of the particle is
\[ q : \mathbb{R}\, (\ni t) \to Q \]
where $t$ is a completely arbitrary parameter for the path, not necessarily $x^0$, and not necessarily proper time either. We want some Lagrangian $L : TQ \to \mathbb{R}$, i.e., $L(q^i, \dot q^i)$, such that the Euler-Lagrange equations will dictate how our free particle moves at a constant velocity. Many Lagrangians do this, but the best one(s) give an action that is independent of the parameterization of the path, since the parameterization is "unphysical" (it can't be measured). So the action
\[ S(q) = \int_{t_0}^{t_1} L(q^i(t), \dot q^i(t))\, dt \]
for $q : [t_0, t_1] \to Q$, should be independent of $t$. The obvious candidate for $S$ is mass times arclength,
\[ S = m \int_{t_0}^{t_1} \sqrt{\eta_{ij}\, \dot q^i(t)\, \dot q^j(t)}\; dt \]
or rather the Minkowski analogue of arclength, called proper time, at least when $\dot q$ is a timelike vector, i.e., $\eta_{ij} \dot q^i \dot q^j > 0$, which says $\dot q$ points into the future (or past) lightcone and makes $S$ real; in fact it's then the time ticked off by a clock moving along the path $q : [t_0, t_1] \to Q$.

(Figure: the light cone, with timelike, lightlike and spacelike directions labeled.)

By "obvious candidate" we are appealing somewhat to physical intuition and generalization. In Euclidean space, free particles follow straight paths, so the arclength is an extremum, and we expect the same behavior in Minkowski spacetime. Also, the arclength does not depend upon the parameterization, and lastly, the mass $m$ merely provides the correct units for "action". So let's take
\[ L = m \sqrt{\eta_{ij}\, \dot q^i \dot q^j} \tag{3.2} \]
and work out the Euler-Lagrange equations. We have
\[ \begin{aligned}
p_i = \frac{\partial L}{\partial \dot q^i} &= m \frac{\partial}{\partial \dot q^i} \sqrt{\eta_{ij}\, \dot q^i \dot q^j} \\
&= m\, \frac{2 \eta_{ij}\, \dot q^j}{2 \sqrt{\eta_{ij}\, \dot q^i \dot q^j}} \\
&= m\, \frac{\eta_{ij}\, \dot q^j}{\sqrt{\eta_{ij}\, \dot q^i \dot q^j}} = \frac{m \dot q_i}{\|\dot q\|}
\end{aligned} \]
(Note the numerator is "mass times 4-velocity", at least when $n = 3$ for a real single particle system, but we're actually in a more general $(n+1)$-dimensional spacetime, so it's more like the "mass times $(n+1)$-velocity".) Now note that this $p_i$ doesn't change when we change the parameter to accomplish $\dot q \mapsto \alpha \dot q$ with $\alpha > 0$. The Euler-Lagrange equations say,
\[ \dot p_i = F_i = \frac{\partial L}{\partial q^i} = 0 \]

The meaning of this becomes clearer if we use "proper time" as our parameter (like parameterizing a curve by its arclength) so that
\[ \int_{t_0}^{t_1} \|\dot q\|\, dt = t_1 - t_0, \qquad \forall\, t_0, t_1 \]
which fixes the parametrization up to an additive constant. This implies $\|\dot q\| = 1$, so that
\[ p_i = \frac{m \dot q_i}{\|\dot q\|} = m \dot q_i \]
and the Euler-Lagrange equations say
\[ \dot p_i = 0 \;\Rightarrow\; m \ddot q_i = 0 \]
so our (free) particle moves unaccelerated along a straight line, which is as we desired (expected).

Comments

This Lagrangian from Eq. (3.2) has lots of symmetries coming from reparameterizing the path, so Noether's theorem yields lots of conserved quantities for the relativistic free particle. This is in fact called "the problem of time" in general relativity; here we see it starting to show up in special relativity.

These reparameterization symmetries work as follows. Consider any (smooth) 1-parameter family of reparameterizations, i.e., diffeomorphisms
\[ f_s : \mathbb{R} \to \mathbb{R} \]
with $f_0 = 1_{\mathbb{R}}$. These act on the space of paths $\Gamma = \{ q : \mathbb{R} \to Q \}$ as follows: given any $q \in \Gamma$ we get
\[ q_s(t) = q(f_s(t)) \]
where we should note that $q_s$ is physically indistinguishable from $q$. Let's show that, when the E-L eqns. hold,
\[ \delta L = \dot\ell \]

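so that Noether's theorem gives a conserved quantity
\[ p_i\, \delta q^i - \ell. \]
Here we go then.
\[ \begin{aligned}
\delta L &= \frac{\partial L}{\partial q^i}\, \delta q^i + \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i \\
&= p_i\, \delta \dot q^i \\
&= \frac{m \dot q_i}{\|\dot q\|}\, \frac{d}{ds} \frac{d}{dt}\, q^i(f_s(t)) \Big|_{s=0} \\
&= \frac{m \dot q_i}{\|\dot q\|}\, \frac{d}{dt} \frac{d}{ds}\, q^i(f_s(t)) \Big|_{s=0} \\
&= \frac{m \dot q_i}{\|\dot q\|}\, \frac{d}{dt} \left( \dot q^i\, \frac{d f_s(t)}{ds} \Big|_{s=0} \right) \\
&= \frac{m \dot q_i}{\|\dot q\|}\, \frac{d}{dt} \left( \dot q^i\, \delta f \right) \\
&= \frac{d}{dt} \left( p_i\, \dot q^i\, \delta f \right)
\end{aligned} \]
where in the last step we used the E-L eqns., i.e. $\frac{d}{dt} p_i = 0$, so $\delta L = \dot\ell$ with $\ell = p_i\, \dot q^i\, \delta f$.

So to recap a little: we saw the free relativistic particle has
\[ L = m \|\dot q\| = m \sqrt{\eta_{ij}\, \dot q^i \dot q^j} \]
and we've considered reparameterization symmetries
\[ q_s(t) = q(f_s(t)), \qquad f_s : \mathbb{R} \to \mathbb{R} \]
We've used the fact that
\[ \delta q^i := \frac{d}{ds}\, q^i(f_s(t)) \Big|_{s=0} = \dot q^i\, \delta f \]
so (repeating a bit of the above)
\[ \begin{aligned}
\delta L &= \frac{\partial L}{\partial q^i}\, \delta q^i + \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i \\
&= p_i\, \delta \dot q^i \qquad (\text{since } \partial L/\partial q^i = 0 \text{ and } \partial L/\partial \dot q^i = p_i) \\
&= p_i\, \frac{d}{dt}\, \delta q^i \\
&= p_i\, \frac{d}{dt} \left( \dot q^i\, \delta f \right) \\
&= \frac{d}{dt} \left( p_i\, \dot q^i\, \delta f \right), \qquad \text{and set } p_i\, \dot q^i\, \delta f = \ell
\end{aligned} \]
so Noether's theorem gives a conserved quantity
\[ p_i\, \delta q^i - \ell = p_i\, \dot q^i\, \delta f - p_i\, \dot q^i\, \delta f = 0 \]
So these conserved quantities vanish! In short, we're seeing an example of what physicists call gauge symmetries. This is a good topic for starting a new section.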
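The vanishing of the Noether charge can be seen very concretely in the special case $\delta f = 1$, where the charge is $p_i \dot q^i - L$. The sketch below uses SymPy (an assumption of this example) in $1+1$ dimensions, with hypothetical symbols `v0`, `v1` standing for the components of $\dot q$. Because $L = m\sqrt{\eta_{ij} \dot q^i \dot q^j}$ is homogeneous of degree one in $\dot q$, the combination $p_i \dot q^i - L$ cancels identically; the code multiplies through by the square root before expanding, so that the cancellation is purely polynomial.

```python
import sympy as sp

m = sp.symbols("m", positive=True)
v0, v1 = sp.symbols("v0 v1", real=True)   # components of q-dot (timelike: v0**2 > v1**2)

norm_sq = v0**2 - v1**2                   # eta_ij qdot^i qdot^j in 1+1 dimensions
L = m * sp.sqrt(norm_sq)                  # reparametrization-invariant Lagrangian

p0 = sp.diff(L, v0)                       # p_i = dL/d(qdot^i)
p1 = sp.diff(L, v1)

H = p0 * v0 + p1 * v1 - L                 # p_i qdot^i - L

# Clear the square root in the denominators, then expand
H_cleared = sp.expand(H * sp.sqrt(norm_sq))
print(H_cleared)
```

Since $\sqrt{\eta_{ij} \dot q^i \dot q^j}$ is nonzero for timelike $\dot q$, the vanishing of `H_cleared` means the charge itself vanishes for every timelike velocity, with no use of the equations of motion.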

3.5 Electrodynamics and Relativistic Lagrangians

We will continue the story of symmetry and Noether's theorem from the last section with a few more examples. We use principles of least action to conjure up Lagrangians for our systems, realizing that a given system may not have a unique Lagrangian but will often have an obvious natural Lagrangian. Given a Lagrangian we derive equations of motion from the Euler-Lagrange equations. Symmetries of $L$ guide us in finding conserved quantities, in particular Hamiltonians from time translation invariance, via Noether's theorem. This section also introduces gauge symmetry, and this is where we begin.

3.5.1 Gauge Symmetry and Relativistic Hamiltonian

What are gauge symmetries?

1. These are symmetries that permute different mathematical descriptions of the same physical situation, in this case reparameterizations of a path.

2. These symmetries make it impossible to compute $q(t)$ given $q(0)$ and $\dot q(0)$: if $q(t)$ is a solution then so is $q(f(t))$ for any reparameterization $f : \mathbb{R} \to \mathbb{R}$. We have a high degree of non-uniqueness of solutions to the Euler-Lagrange equations.

3. These symmetries give conserved quantities that work out to equal zero!

Note that (1) is a subjective criterion, (2) and (3) are objective, and (3) is easy to test, so we often use (3) to distinguish gauge symmetries from physical symmetries.

3.5.2 Relativistic Hamiltonian

What then is the Hamiltonian for special relativity theory? We're continuing here with the example problem of 3.4.3. Well, the Hamiltonian comes from Noether's theorem from time translation symmetry,
\[ q_s(t) = q(t + s) \]
and this is an example of a reparametrization (with $\delta f = 1$), so we see from the previous results that the Hamiltonian is zero!
\[ H = 0. \]
Explicitly, $H = p_i\, \delta q^i - \ell$ where under $q(t) \to q(t + s)$ we have $\delta q^i = \dot q^i\, \delta f$, and so $\delta L = d\ell/dt$, which implies $\ell = p_i\, \delta q^i$. The result $H = 0$ follows.

Now you know why people talk about "the problem of time" in general relativity theory; its glimmerings are seen in the flat Minkowski spacetime of special relativity. You may think it's nice and simple to have $H = 0$, but in fact it means that there is no temporal evolution possible! So we can't establish a dynamical theory on this footing! That's bad news. (Because it means you might have to solve the static equations for the 4D universe as a whole, and that's impossible!)

But there is another conserved quantity deserving the title of "energy" which is not zero, and it comes from the symmetry,
\[ q_s(t) = q(t) + s\, w \]
where $w \in \mathbb{R}^{n+1}$ and $w$ points in some timelike direction.

(Figure: the path $q$ and its translate $q_s$, displaced along the vector $w$.)

In fact any vector $w$ gives a conserved quantity,
\[ \begin{aligned}
\delta L &= \frac{\partial L}{\partial q^i}\, \delta q^i + \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i \\
&= p_i\, \delta \dot q^i \qquad (\text{since } \partial L/\partial q^i = 0 \text{ and } \partial L/\partial \dot q^i = p_i) \\
&= p_i \cdot 0 = 0
\end{aligned} \]
since $\delta q^i = w^i$ and $\delta \dot q^i = \dot w^i = 0$. So here $\delta L = \dot\ell$ holds with $\ell = 0$, and Noether's theorem says that we get a conserved quantity
\[ p_i\, \delta q^i - \ell = p_i\, w^i \]
namely, the momentum in the $w$ direction. We know $\dot p = 0$ from the Euler-Lagrange equations for our free particle, but here we see it coming from spacetime translation symmetry;
\[ p = (p_0, p_1, \dots, p_n) \]
where $p_0$ is energy and $(p_1, \dots, p_n)$ is spatial momentum.

We've just about exhausted all the basic stuff that we can learn from the free particle. So next we'll add some external force via an electromagnetic field.

3.6 Relativistic Particle in an Electromagnetic Field

The electromagnetic field is described by a 1-form $A$ on spacetime, the vector potential, such that
\[ dA = F \tag{3.3} \]
is a 2-form containing the electric and magnetic fields,
\[ F_{ij} = \frac{\partial A_j}{\partial x^i} - \frac{\partial A_i}{\partial x^j} \tag{3.4} \]
We'd write (for $Q$ having local charts to $\mathbb{R}^{n+1}$),
\[ A = A_0\, dx^0 + A_1\, dx^1 + \dots + A_n\, dx^n \]
and then because $d^2 = 0$,
\[ dA = dA_0 \wedge dx^0 + dA_1 \wedge dx^1 + \dots + dA_n \wedge dx^n \]
and since the $A_i$ are just functions,
\[ dA_i = \partial_\mu A_i\, dx^\mu \]
using the summation convention and $\partial_\mu := \partial/\partial x^\mu$. The student can easily check that the components of $F = F_{01}\, dx^0 \wedge dx^1 + F_{02}\, dx^0 \wedge dx^2 + \dots$ agree with the matrix expression below (at least for 4D). So, for example, in 4D spacetime (with one common choice of sign conventions)
\[ F = \begin{pmatrix}
0 & E_1 & E_2 & E_3 \\
-E_1 & 0 & B_3 & -B_2 \\
-E_2 & -B_3 & 0 & B_1 \\
-E_3 & B_2 & -B_1 & 0
\end{pmatrix} \]
where $E$ is the electric field and $B$ is the magnetic field. The action for a particle of charge $e$ is
\[ S = m \int_{t_0}^{t_1} \|\dot q\|\, dt + e \int_q A \]
Here
\[ \int_{t_0}^{t_1} \|\dot q\|\, dt = \text{proper time}, \qquad \int_q A = \text{integral of } A \text{ along the path } q. \]
Note that since $A$ is a 1-form it can be integrated along a path (it is a linear combination of some basis 1-forms like the $\{dx^i\}$).

(Week 4, April 18, 20, 22.) Note that since $A$ is a 1-form we can integrate it over an oriented 1-dimensional manifold, but one can also write the path integral using time $t$ as a parameter, with $A_i\, \dot q^i\, dt$ the differential, after $dq^i = \dot q^i\, dt$. The Lagrangian in the above action, for a charge $e$ with mass $m$ in an electromagnetic potential $A$, is
\[ L(q, \dot q) = m \|\dot q\| + e A_i\, \dot q^i \tag{3.5} \]
so we can work out the Euler-Lagrange equations:
\[ p_i = \frac{\partial L}{\partial \dot q^i} = m\, \frac{\dot q_i}{\|\dot q\|} + e A_i = m v_i + e A_i \]

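where $v \in \mathbb{R}^{n+1}$ is the velocity, normalized so that $\|v\| = 1$. Note that now momentum is no longer mass times velocity! That's because of the extra term $e A_i$ contributed by the vector potential. Continuing the analysis, we find the force
\[ F_i = \frac{\partial L}{\partial q^i} = \frac{\partial}{\partial q^i} \left( e A_j\, \dot q^j \right) = e\, \frac{\partial A_j}{\partial q^i}\, \dot q^j \]
So the Euler-Lagrange equations say (noting that $A_i = A_i(q(t))$):
\[ \begin{aligned}
\dot p &= F \\
\frac{d}{dt} \left( m v_i + e A_i \right) &= e\, \frac{\partial A_j}{\partial q^i}\, \dot q^j \\
m \frac{dv_i}{dt} &= e\, \frac{\partial A_j}{\partial q^i}\, \dot q^j - e\, \frac{dA_i}{dt} \\
m \frac{dv_i}{dt} &= e\, \frac{\partial A_j}{\partial q^i}\, \dot q^j - e\, \frac{\partial A_i}{\partial q^j}\, \dot q^j \\
&= e \left( \frac{\partial A_j}{\partial q^i} - \frac{\partial A_i}{\partial q^j} \right) \dot q^j
\end{aligned} \]
The term in parentheses is $F_{ij}$, the electromagnetic field, $F = dA$. So we get the following equations of motion
\[ m \frac{dv_i}{dt} = e F_{ij}\, \dot q^j \tag{3.6} \]
This is usually called the Lorentz force law.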

3.7 Alternative Lagrangians

We'll soon discuss a charged particle Lagrangian that is free of the reparameterization symmetry. First a paragraph on objects other than point particles!

3.7.1 Lagrangian for a String

So we've looked at a point particle and tried
\[ S = m \cdot (\text{arclength}) + \int A \]
or with "proper time" instead of "arclength", where the 1-form $A$ can be integrated over a 1-dimensional path. A generalization (or specialization, depending on how you look at it) would be to consider a Lagrangian for an extended object.

In string theory we boost the dimension by $+1$ and consider a string tracing out a 2D surface as time passes (Fig. 3.3).

Figure 3.3: Worldtube of a closed string. (The worldline of a point particle becomes the worldtube of a string.)

Can you infer an appropriate action for this system? Remember, the physical or physico-philosophical principle we've been using is that the path followed by physical objects minimizes the "activity" or "aliveness" of the system. Given that we presumably cannot tamper with the length of the closed string, the worldtube quantity analogous to arclength or proper time would be the area of the worldtube (or worldsheet for an open string). If the string is also assumed to be a source of electromagnetic field then we need a 2-form to integrate over the 2D worldtube, analogous to the 1-form integrated over the path of the point particle. In string theory this is usually the "Kalb-Ramond field", call it $B$. To recover electrodynamic interactions it should play the role that $A$ plays for the point particle, but its tensor components will have two antisymmetric indices since it's a 2-form. The string action can then be written
\[ S = \alpha \cdot (\text{area}) + e \int B \tag{3.7} \]
We've also replaced the point particle mass by the string tension $\alpha$ [mass $\cdot$ length$^{-1}$] to obtain the correct units for the action (since replacing arclength by area meant we had to compensate for the extra length dimension in the first term of the above string action).

This may still seem like we've pulled a rabbit out of a hat. But we haven't checked that this action yields sensible dynamics yet! But supposing it does, then would it justify our guesswork and intuition in arriving at Eq. (3.7)? Well, by now you've probably realized that one can have more than one form of action or Lagrangian that yields the same dynamics. So provided we supply reasonable, physically realistic heuristics, whatever Lagrangian or action we come up with will stand a good chance of describing some system with a healthy measure of physical verisimilitude.


That's enough about strings for now. The point was to illustrate the type of reasoning that one can use in conjuring up a Lagrangian. It's particularly useful when Newtonian theory cannot give us a head start, i.e., in relativistic dynamics and in the physics of extended particles.

3.7.2

Alternate Lagrangian for Relativistic Electrodynamics

In Section 3.6, Eq. (3.5), we saw an example of a Lagrangian for relativistic electrodynamics that had awkward reparametrization symmetries, meaning that $H = 0$ and that there were non-unique solutions to the Euler-Lagrange equations arising from applying gauge transformations. This freedom to change the gauge can be avoided. Recall Eq. (3.5), which was a Lagrangian for a charged particle with reparametrization symmetry,
\[ L = m\|\dot q\| + eA_i \dot q^i , \]
just as for an uncharged relativistic particle. But there's another Lagrangian we can use that doesn't have this gauge symmetry:
\[ L = \tfrac12 m\, \dot q \cdot \dot q + eA_i \dot q^i . \tag{3.8} \]
This one even has some nice features. It looks formally like $\tfrac12 mv^2$, familiar from nonrelativistic mechanics. There's no ugly square root, so it's everywhere differentiable, and there's no trouble with paths being timelike or spacelike in direction: they are handled the same.

What Euler-Lagrange equations does this Lagrangian yield?
\[ p_i = \frac{\partial L}{\partial \dot q^i} = m\dot q_i + eA_i , \qquad F_i = \frac{\partial L}{\partial q^i} = e\,\frac{\partial A_j}{\partial q^i}\,\dot q^j . \]
Very similar to before! The Euler-Lagrange equations then say
\[ \frac{d}{dt}\bigl(m\dot q_i + eA_i\bigr) = e\,\frac{\partial A_j}{\partial q^i}\,\dot q^j , \qquad\text{i.e.}\qquad m\ddot q_i = e\mathcal{F}_{ij}\dot q^j , \]
almost as before. (I've taken to using $\mathcal{F}$ here for the electromagnetic field tensor, $\mathcal{F}_{ij} = \partial_i A_j - \partial_j A_i$, to avoid clashing with $F$ for the generalized force.) The only difference is that we have $m\ddot q_i$ instead of $m\dot v_i$ where $v_i = \dot q_i/\|\dot q\|$. So the old Euler-Lagrange equations of motion reduce to the new ones if we pick a parametrization with $\|\dot q\| = 1$, which would be a parametrization by proper time for example.

Let's work out the Hamiltonian for this
\[ L = \tfrac12 m\, \dot q \cdot \dot q + eA_i \dot q^i \]
for the relativistic charged particle in an electromagnetic field. Recall that for our reparametrization-invariant Lagrangian
\[ L = m\sqrt{\dot q_i \dot q^i} + eA_i \dot q^i \]
we got $H = 0$: time translation was a gauge symmetry. With the new Lagrangian it's not! Indeed $H = p_i \dot q^i - L$, and now
\[ p_i = \frac{\partial L}{\partial \dot q^i} = m\dot q_i + eA_i , \]
so
\[ H = (m\dot q_i + eA_i)\dot q^i - \bigl(\tfrac12 m \dot q_i \dot q^i + eA_i \dot q^i\bigr) = \tfrac12 m \dot q_i \dot q^i . \]

Comments. This is vaguely like how a nonrelativistic particle in a potential $V$ has
\[ H = p_i \dot q^i - L = 2K - (K - V) = K + V , \]
but now the potential $V = -eA_i \dot q^i$ is linear in velocity, so now
\[ H = p_i \dot q^i - L = (2K - V) - (K - V) = K . \]
As claimed, $H$ is not zero, and the fact that it's conserved says $\|\dot q(t)\|$ is constant as a function of $t$, so the particle's path is parameterized by proper time up to rescaling of $t$. That is, we're getting conservation of speed rather than some more familiar conservation of energy. The reason is that this Hamiltonian comes from the symmetry
\[ q_s(t) = q(t+s) \]
instead of spacetime translation symmetry
\[ q_s(t) = q(t) + sw , \qquad w \in \mathbb{R}^{n+1} ; \]
the difference is illustrated schematically in Fig. 3.4. Our Lagrangian
\[ L(q, \dot q) = \tfrac12 m \|\dot q\|^2 + eA_i(q)\dot q^i \]
has time translation symmetry iff $A$ is translation invariant (but it's highly unlikely that a given system of interest will have $A(q) = A(q + sw)$). In general, then, there's no conserved energy for our particle corresponding to translations in time.


Figure 3.4: Proper time rescaling vs spacetime translation. (Panels: $q_s(t) = q(t+s)$ and $q_s(t) = q(t) + sw$.)
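As a quick numerical sanity check of the claim that $H = \tfrac12 m \dot q_i \dot q^i$ whatever the potential $A$ is, here is a minimal Python sketch. It assumes a flat Minkowski metric with signature $(+,-,-,-)$, and the values of `m`, `e`, `qdot` and `A` are made up purely for illustration; none of these names come from the lectures.

```python
# Assumption for this sketch: flat Minkowski metric, signature (+, -, -, -).
ETA = [1.0, -1.0, -1.0, -1.0]

def lower(v):
    """Lower an index with the diagonal Minkowski metric."""
    return [ETA[i] * v[i] for i in range(4)]

def lagrangian(qdot, A, m, e):
    """L = (1/2) m qdot.qdot + e A_i qdot^i, with A given with lower indices."""
    kinetic = 0.5 * m * sum(a * b for a, b in zip(lower(qdot), qdot))
    coupling = e * sum(a * b for a, b in zip(A, qdot))
    return kinetic + coupling

def momentum(qdot, A, m, e):
    """p_i = m qdot_i + e A_i."""
    return [m * v + e * a for v, a in zip(lower(qdot), A)]

def hamiltonian(qdot, A, m, e):
    """H = p_i qdot^i - L."""
    p = momentum(qdot, A, m, e)
    return sum(pi * vi for pi, vi in zip(p, qdot)) - lagrangian(qdot, A, m, e)

# Hypothetical sample values at a single point of spacetime.
m, e = 2.0, 0.7
qdot = [1.5, 0.3, -0.2, 0.4]
A = [0.9, -1.1, 0.5, 2.0]

H = hamiltonian(qdot, A, m, e)
K = 0.5 * m * sum(a * b for a, b in zip(lower(qdot), qdot))
H_other_A = hamiltonian(qdot, [0.0, 3.0, -4.0, 1.0], m, e)
```

The terms linear in velocity cancel between $p_i \dot q^i$ and $L$, so `H`, `K` and `H_other_A` agree up to rounding.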

3.8

The General Relativistic Particle

In GR, spacetime $Q$ is an $(n+1)$-dimensional Lorentzian manifold, namely a smooth $(n+1)$-dimensional manifold with a Lorentzian metric $g$. We define the metric as follows.

1. For each $x \in Q$, we have a bilinear map
\[ g(x) : T_xQ \times T_xQ \to \mathbb{R}, \qquad (v, w) \mapsto g(x)(v, w), \]
or we could write $g(v, w)$ for short.

2. With respect to some basis of $T_xQ$ we have $g(v, w) = g_{ij} v^i w^j$ with
\[ g_{ij} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & -1 & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & -1 \end{pmatrix} . \]
Of course we can write $g(v, w) = g_{ij} v^i w^j$ in any basis, but for different bases $g_{ij}$ will have a different form.

3. $g(x)$ varies smoothly with $x$.


3.8.1

Free Particle Lagrangian in GR


The Lagrangian for a free point particle in the spacetime $Q$ is
\[ L(q, \dot q) = m\sqrt{g(q)(\dot q, \dot q)} = m\sqrt{g_{ij}\dot q^i \dot q^j} , \]
just like in special relativity but with $\eta_{ij}$ replaced by $g_{ij}$. Alternatively we could just as well use
\[ L(q, \dot q) = \tfrac12 m\, g(q)(\dot q, \dot q) = \tfrac12 m g_{ij}\dot q^i \dot q^j . \]

The big difference between these two Lagrangians and their special-relativistic counterparts is that now spacetime translation symmetry (and rotation, and boost symmetry) is gone! So there is no conserved energy-momentum (nor angular momentum, nor velocity of center of energy) anymore!

Let's find the equations of motion. Suppose then that $Q$ is a Lorentzian manifold with metric $g$ and $L : TQ \to \mathbb{R}$ is the Lagrangian of a free particle,
\[ L(q, \dot q) = \tfrac12 m g_{ij}\dot q^i \dot q^j . \]
We find equations of motion from the Euler-Lagrange equations, which in this case start from
\[ p_i = \frac{\partial L}{\partial \dot q^i} = m g_{ij}\dot q^j . \]
The velocity $\dot q$ here is a tangent vector, the momentum $p$ is a cotangent vector, and we need the metric to relate them, via
\[ g : T_qQ \times T_qQ \to \mathbb{R}, \qquad (v, w) \mapsto g(v, w), \]
which gives
\[ T_qQ \to T_q^*Q, \qquad v \mapsto g(v, -) . \]
In coordinates this says that the tangent vector $v^i$ gets mapped to the cotangent vector $g_{ij}v^j$. This is lurking behind the passage from $\dot q^i$ to the momentum $m g_{ij}\dot q^j$. Getting back to the Euler-Lagrange equations,
\[ p_i = \frac{\partial L}{\partial \dot q^i} = m g_{ij}\dot q^j , \qquad F_i = \frac{\partial L}{\partial q^i} = \frac{\partial}{\partial q^i}\Bigl(\tfrac12 m g_{jk}(q)\dot q^j \dot q^k\Bigr) = \tfrac12 m\, \partial_i g_{jk}\, \dot q^j \dot q^k , \]
where $\partial_i = \partial/\partial q^i$.


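So the Euler-Lagrange equations say
\[ \frac{d}{dt}\bigl(m g_{ij}\dot q^j\bigr) = \tfrac12 m\, \partial_i g_{jk}\, \dot q^j \dot q^k . \]
The mass factors away, so the motion is independent of the mass! Essentially we have a geodesic equation. We can rewrite this geodesic equation as follows:
\[ \frac{d}{dt}\bigl(g_{ij}\dot q^j\bigr) = \tfrac12 \partial_i g_{jk}\, \dot q^j \dot q^k \]
\[ \partial_k g_{ij}\, \dot q^k \dot q^j + g_{ij}\ddot q^j = \tfrac12 \partial_i g_{jk}\, \dot q^j \dot q^k \]
\[ g_{ij}\ddot q^j = \bigl(\tfrac12 \partial_i g_{jk} - \partial_k g_{ij}\bigr)\dot q^j \dot q^k = \tfrac12 \bigl(\partial_i g_{jk} - \partial_k g_{ij} - \partial_j g_{ki}\bigr)\dot q^j \dot q^k , \]
where the last step follows by symmetrizing $\partial_k g_{ij}\, \dot q^j \dot q^k$ in $j$ and $k$ and using the symmetry of the metric, $g_{ik} = g_{ki}$. Now let
\[ \Gamma_{ijk} = -\tfrac12 \bigl(\partial_i g_{jk} - \partial_k g_{ij} - \partial_j g_{ki}\bigr) , \]
the minus sign being just a convention (so that we agree with everyone else). This defines what we call the Christoffel symbols $\Gamma^i_{jk}$. Then
\[ \ddot q_i = g_{ij}\ddot q^j = -\Gamma_{ijk}\dot q^j \dot q^k \qquad\text{so}\qquad \ddot q^i = -\Gamma^i_{jk}\dot q^j \dot q^k . \]
So we see that $\ddot q$ can be computed in terms of $\dot q$ and the Christoffel symbols $\Gamma^i_{jk}$, which really describe a particular type of connection that a Lorentzian manifold has (the Levi-Civita connection). A connection is just the rule for parallel transporting tangent vectors around the manifold, and parallel transport is just the simplest way to compare vectors at different points in the manifold. This allows us to define, among other things, a covariant derivative.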

3.8.2

Charged particle in EM Field in GR

We can now apply what we've learned to a charged particle, of charge $e$, in an electromagnetic field with potential $A$, in our Lorentzian manifold. The Lagrangian would be
\[ L = \tfrac12 m g_{jk}\dot q^j \dot q^k + eA_i \dot q^i , \]
which again was conjured up by replacing the flat space metric $\eta_{ij}$ by the metric for GR, $g_{ij}$. Not surprisingly, the Euler-Lagrange equations then yield the following equations of motion,
\[ m\ddot q_i = -m\Gamma_{ijk}\dot q^j \dot q^k + e\mathcal{F}_{ij}\dot q^j . \]
If you want to know more about Lagrangians for general relativity we recommend the paper by Peldan [Pel94], and also the black book of Misner, Thorne & Wheeler [WTM71].


3.9

The Principle of Least Action and Geodesics

(Week 4, April 18, 20, 22.)

3.9.1

Jacobi and Least Time vs Least Action

We've mentioned that Fermat's principle of least time in optics is analogous to the principle of least action in particle mechanics. This analogy is strange, since in the principle of least action we fix the time interval, $q : [0, 1] \to Q$. Also, if one imagines a force on a particle resulting from a potential gradient at an interface as analogous to light refraction, then you also get a screw-up in the analogy (Fig. 3.5).

Figure 3.5: Least time versus least action. (Labels: light, faster or slower, crossing between regions of high and low $n$; particle, slower or faster, crossing between regions of high and low $V$.)

Nevertheless, Jacobi was able to reinterpret the mechanics of a particle as an optics problem and hence unify the two minimization principles. First, let's consider light in a medium with a varying index of refraction $n$ (recall $1/n \propto$ speed of light). Suppose it's in $\mathbb{R}^n$ with its usual Euclidean metric. If the light is trying to minimize the time, it's trying to minimize the arclength of its path in the metric
\[ g_{ij} = n^2 \delta_{ij} , \]
that is, the square of the index of refraction $n : \mathbb{R}^n \to (0, \infty)$ times the usual Euclidean metric
\[ \delta_{ij} = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix} . \]


This is just like the free particle in general relativity (minimizing its proper time) except that now $g_{ij}$ is a Riemannian metric,
\[ g(v, w) = g_{ij} v^i w^j \qquad\text{where } g(v, v) \ge 0 . \]
So we'll use the same Lagrangian,
\[ L(q, \dot q) = \sqrt{g_{ij}(q)\dot q^i \dot q^j} , \]
and get the same Euler-Lagrange equations,
\[ \frac{d^2 q^i}{dt^2} + \Gamma^i_{jk}\dot q^j \dot q^k = 0 \tag{3.9} \]
if $q$ is parameterized by arclength or, more generally, if
\[ \|\dot q\| = \sqrt{g_{ij}(q)\dot q^i \dot q^j} = \text{constant} . \]

As before, the Christoffel symbols $\Gamma$ are built from the derivatives of the metric $g$. Now, what Jacobi did is show how the motion of a particle in a potential could be viewed as a special case of this. Consider a particle of mass $m$ in Euclidean $\mathbb{R}^n$ with potential $V : \mathbb{R}^n \to \mathbb{R}$. It satisfies $F = ma$, i.e.,
\[ m\frac{d^2 q^i}{dt^2} = -\partial_i V . \tag{3.10} \]

How did Jacobi see (3.10) as a special case of (3.9)? He considered a particle of energy $E$ and he chose the index of refraction to be
\[ n(q) = \sqrt{\frac{2}{m}\bigl(E - V(q)\bigr)} , \]
which is just the speed of a particle of energy $E$ when the potential energy is $V(q)$, since
\[ \sqrt{\frac{2}{m}(E - V)} = \sqrt{\frac{2}{m}\cdot\frac12 m\|\dot q\|^2} = \|\dot q\| . \]

Note: this is precisely backwards compared to optics, where $n(q)$ is proportional to the reciprocal of the speed of light!! But let's see that it works.
\[ L = \sqrt{g_{ij}(q)\dot q^i \dot q^j} = \sqrt{n^2(q)\,\delta_{ij}\dot q^i \dot q^j} = \sqrt{\frac{2}{m}\bigl(E - V(q)\bigr)\dot q^2} \]


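where $\dot q^2 = \dot q \cdot \dot q$ is just the usual Euclidean dot product, $v \cdot w = \delta_{ij} v^i w^j$. We get the Euler-Lagrange equations from
\[ p_i = \frac{\partial L}{\partial \dot q^i} = \sqrt{\frac{2}{m}(E - V)}\;\frac{\dot q_i}{\|\dot q\|} , \]
\[ F_i = \frac{\partial L}{\partial q^i} = \partial_i \sqrt{\frac{2}{m}\bigl(E - V(q)\bigr)}\;\|\dot q\| = \frac12\,\frac{-\frac{2}{m}\partial_i V}{\sqrt{\frac{2}{m}(E - V(q))}}\;\|\dot q\| = -\frac{1}{m}\,\frac{\partial_i V\,\|\dot q\|}{\sqrt{\frac{2}{m}(E - V(q))}} . \]

Then $\dot p = F$ says
\[ \frac{d}{dt}\left[\sqrt{\frac{2}{m}\bigl(E - V(q)\bigr)}\;\frac{\dot q_i}{\|\dot q\|}\right] = -\frac{1}{m}\,\partial_i V\,\frac{\|\dot q\|}{\sqrt{\frac{2}{m}(E - V)}} . \]

Jacobi noticed that this is just $F = ma$, or $m\ddot q_i = -\partial_i V$, provided we reparameterize $q$ so that
\[ \|\dot q\| = \sqrt{\frac{2}{m}\bigl(E - V(q)\bigr)} . \]
Recall that our Lagrangian gives reparameterization-invariant Euler-Lagrange equations! This is the unification between least time (from optics) and least action (from mechanics) that we sought.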
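Jacobi's choice of $n(q)$ can be checked on a familiar system. The following Python sketch, a hedged illustration rather than anything from the lectures, takes a one-dimensional harmonic oscillator (an assumed example, with made-up values for `m`, `omega` and `amplitude`) and verifies along its exact trajectory that the speed $|\dot q|$ equals $n(q) = \sqrt{(2/m)(E - V(q))}$.

```python
import math

# Illustrative parameters for a 1D harmonic oscillator (assumptions of this sketch).
m, omega, amplitude = 1.5, 2.0, 0.8
E = 0.5 * m * omega**2 * amplitude**2   # total energy of the oscillator

def V(q):
    """Potential energy V(q) = (1/2) m omega^2 q^2."""
    return 0.5 * m * omega**2 * q**2

def n(q):
    """Jacobi's 'index of refraction' n(q) = sqrt((2/m)(E - V(q)))."""
    return math.sqrt(max(0.0, 2.0 / m * (E - V(q))))

def trajectory(t):
    """Exact solution q(t) = a cos(omega t) together with its velocity."""
    q = amplitude * math.cos(omega * t)
    qdot = -amplitude * omega * math.sin(omega * t)
    return q, qdot

samples = [0.1 * k for k in range(1, 15)]
max_error = max(abs(abs(trajectory(t)[1]) - n(trajectory(t)[0])) for t in samples)
```

Here `max_error` is at the level of floating-point rounding, which is just energy conservation $E = \tfrac12 m \dot q^2 + V(q)$ rewritten as a statement about speed.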

3.9.2

The Ubiquity of Geodesic Motion

We've seen that many classical systems trace out paths that are geodesics, i.e., paths $q : [t_0, t_1] \to Q$ that are critical points of
\[ S(q) = \int_{t_0}^{t_1} \sqrt{g_{ij}\dot q^i \dot q^j}\; dt , \]
which is proper time when $(Q, g)$ is a Lorentzian manifold, or arclength when $(Q, g)$ is a Riemannian manifold. We have

1. The metric at $q \in Q$ is $g(q) : T_qQ \times T_qQ \to \mathbb{R}$, $(v, w) \mapsto g(v, w)$, and it is bilinear.

2. W.r.t. a suitable basis of $T_qQ$, $g(v, w) = \delta_{ij} v^i w^j$ in the Riemannian case (and $\eta_{ij} v^i w^j$ in the Lorentzian case).

3. $g(q)$ varies smoothly with $q \in Q$.

An important distinction to keep in mind is that Lorentzian manifolds represent spacetimes, whereas Riemannian manifolds represent what we'd normally consider as just space. We've seen at least three important things.

(1) In the geometric optics approximation, light in $Q = \mathbb{R}^n$ acts like particles tracing out geodesics in the metric
\[ g_{ij} = n(q)^2 \delta_{ij} , \]
where $n : Q \to (0, \infty)$ is the index of refraction function.

(2) Jacobi saw that a particle in $Q = \mathbb{R}^n$ in some potential $V : Q \to \mathbb{R}$ traces out geodesics in the metric
\[ g_{ij} = \frac{2}{m}(E - V)\delta_{ij} \]
if the particle has energy $E$ (where $V < E$; see footnote 1).

(3) A free particle in general relativity traces out a geodesic on a Lorentzian manifold $(Q, g)$.

In fact all three of these results can be generalized to cover every problem that we've discussed!

(1') Light on any Riemannian manifold $(Q, g)$ with index of refraction $n : Q \to (0, \infty)$ traces out geodesics in the metric $h = n^2 g$.

(2') A particle on a Riemannian manifold $(Q, g)$ with potential $V : Q \to \mathbb{R}$ traces out geodesics w.r.t. the metric
\[ h = \frac{2}{m}(E - V)\, g \]
if it has energy $E$. Lots of physical systems can be described this way, e.g., the Atwood machine, a rigid rotating body ($Q = SO(3)$), spinning tops, and others. All of these systems have a Lagrangian whose kinetic term is a quadratic function of velocity, so they all fit into this framework.

(3') Kaluza-Klein Theory. A particle with charge $e$ on a Lorentzian manifold $(Q, g)$ in an electromagnetic vector potential follows a path with
\[ \ddot q_i = -\Gamma_{ijk}\dot q^j \dot q^k + \frac{e}{m}\mathcal{F}_{ij}\dot q^j , \qquad\text{where } \mathcal{F}_{ij} = \partial_i A_j - \partial_j A_i , \]
but this is actually geodesic motion on the manifold $Q \times U(1)$, where $U(1) = \{e^{i\theta} : \theta \in \mathbb{R}\}$ is a circle.

Footnote 1: Regions where $V > E$, if they exist, would be classically forbidden regions.


Let's examine this last result a bit further. To get the desired equations for motion on $Q \times U(1)$ we need to give $Q \times U(1)$ a cleverly designed metric built from $g$ and $A$, where the amount of spiralling (the velocity in the $U(1)$ direction) is $e/m$.

[Figure: a geodesic in $Q \times U(1)$ and its projection to $Q$, the apparent path.]

The metric $h$ on $Q \times U(1)$ is built from $g$ and $A$ in a very simple way. Let's pick coordinates $x^i$ on $Q$ where $i \in \{0, \dots, n\}$, since we're in $(n+1)$-dimensional spacetime, and let $\theta$ be our local coordinate on $S^1$. The components of $h$ are
\[ h_{ij} = g_{ij} + A_i A_j , \qquad h_{\theta i} = h_{i\theta} = A_i , \qquad h_{\theta\theta} = 1 . \]
Working out the equations for a geodesic in this metric we get
\[ \ddot q_i = -\Gamma_{ijk}\dot q^j \dot q^k + \frac{e}{m}\mathcal{F}_{ij}\dot q^j , \qquad \ddot q_\theta = 0 , \]
if $\dot q_\theta = e/m$, since $\mathcal{F}_{ij}$ is part of the Christoffel symbols for $h$. To summarize this section on least time versus least action, we can say that every problem that we've discussed in classical mechanics can be regarded as geodesic motion!

Chapter 4 From Lagrangians to Hamiltonians


In the Lagrangian approach we focus on the position and velocity of a particle, and compute what the particle does starting from the Lagrangian $L(q, \dot q)$, which is a function
\[ L : TQ \to \mathbb{R} , \]
where the tangent bundle $TQ$ is the space of position-velocity pairs. But we're led to consider momentum
\[ p_i = \frac{\partial L}{\partial \dot q^i} \]
since the equations of motion tell us how it changes:
\[ \frac{dp_i}{dt} = \frac{\partial L}{\partial q^i} . \]

4.1

The Hamiltonian Approach

In the Hamiltonian approach we focus on position and momentum, and compute what the particle does starting from the energy
\[ H = p_i \dot q^i - L(q, \dot q) \]
reinterpreted as a function of position and momentum, called the Hamiltonian,
\[ H : T^*Q \to \mathbb{R} , \]
where the cotangent bundle $T^*Q$ is the space of position-momentum pairs. In this approach, position and momentum will satisfy Hamilton's equations:
\[ \frac{dq^i}{dt} = \frac{\partial H}{\partial p_i} , \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q^i} , \]
where the latter is the Euler-Lagrange equation
\[ \frac{dp_i}{dt} = \frac{\partial L}{\partial q^i} \]
in disguise (it has a minus sign since $H = p\dot q - L$). To obtain this Hamiltonian description of mechanics rigorously we need to study the map
\[ \lambda : TQ \to T^*Q , \qquad (q, \dot q) \mapsto (q, p) , \]
where $q \in Q$, $\dot q$ is any tangent vector in $T_qQ$ (not the time derivative of something), and $p$ is a cotangent vector in $T_q^*Q := (T_qQ)^*$, given by
\[ \dot q \mapsto p_i = \frac{\partial L}{\partial \dot q^i} . \]
So $\lambda$ is defined using $L : TQ \to \mathbb{R}$. Despite appearances, $\lambda$ can be defined in a coordinate-free way, as follows (referring to Fig. 4.1).

Figure 4.1: (diagram of $TQ$ over $Q$, showing the fiber $T_qQ$ over $q$ and the vertical space $T_{(q,\dot q)}T_qQ$).

We want to define $\partial L/\partial \dot q^i$ in a coordinate-free way; it's the differential of $L$ in the vertical direction, i.e., the $\dot q^i$ directions. We have the projection
\[ \pi : TQ \to Q , \qquad (q, \dot q) \mapsto q , \]
and its differential
\[ d\pi : T(TQ) \to TQ \]
has kernel (see footnote 1) consisting of vertical vectors:
\[ VTQ = \ker d\pi \subseteq TTQ . \]
The differential of $L$ at some point $(q, \dot q) \in TQ$ is a map from $TTQ$ to $\mathbb{R}$, so we have
\[ (dL)_{(q,\dot q)} \in T^*_{(q,\dot q)}TQ , \]
that is,
\[ dL_{(q,\dot q)} : T_{(q,\dot q)}TQ \to \mathbb{R} . \]
We can restrict this to $VTQ \subseteq TTQ$, getting
\[ f : V_{(q,\dot q)}TQ \to \mathbb{R} . \]
But note
\[ V_{(q,\dot q)}TQ = T_{(q,\dot q)}(T_qQ) , \]
and since $T_qQ$ is a vector space,
\[ T_{(q,\dot q)}T_qQ \cong T_qQ \]
in a canonical way (see footnote 2). So $f$ gives a linear map
\[ p : T_qQ \to \mathbb{R} , \]
that is, $p \in T_q^*Q$: this is the momentum!

(Week 6, May 2, 4, 6.)

Given $L : TQ \to \mathbb{R}$, we now know a coordinate-free way of describing the map
\[ \lambda : TQ \to T^*Q , \qquad (q, \dot q) \mapsto (q, p) , \]
given in local coordinates by
\[ p_i = \frac{\partial L}{\partial \dot q^i} . \]

Footnote 1: The kernel of a map is the set of all elements in the domain that map to the null element of the range, so $\ker d\pi = \{v \in TTQ : d\pi(v) = 0 \in TQ\}$.

Footnote 2: The fiber $T_vV$ at $v \in V$ of a vector manifold $V$ has the same dimension as $V$.


We say $L$ is regular if $\lambda$ is a diffeomorphism from $TQ$ to some open subset $X \subseteq T^*Q$. In this case we can describe what our system is doing equally well by specifying position and velocity,
\[ (q, \dot q) \in TQ , \]
or position and momentum,
\[ (q, p) = \lambda(q, \dot q) \in X . \]
We call $X$ the phase space of the system. In practice often $X = T^*Q$; then $L$ is said to be strongly regular.

4.2

Regular and Strongly Regular Lagrangians

This section discusses some examples of the above theory.

4.2.1

Example: A Particle in a Riemannian Manifold with Potential V (q)

A particle in a Riemannian manifold $(Q, g)$ in a potential $V : Q \to \mathbb{R}$ has Lagrangian
\[ L(q, \dot q) = \tfrac12 m g_{ij}\dot q^i \dot q^j - V(q) . \]
Here
\[ p_i = \frac{\partial L}{\partial \dot q^i} = m g_{ij}\dot q^j , \]
so
\[ \lambda(q, \dot q) = \bigl(q,\; m g(\dot q, -)\bigr) \]
(see footnote 3), so $L$ is strongly regular in this case because
\[ T_qQ \to T_q^*Q , \qquad v \mapsto g(v, -) \]
is 1-1 and onto, i.e., the metric is nondegenerate. Thus $\lambda$ is a diffeomorphism, which in this case extends to all of $T^*Q$.

Footnote 3: The missing object in the blank slot is of course any tangent vector, not inserted since $m g(\dot q, -)$ is itself an operator on tangent vectors, not the result of the operation.


4.2.2

Example: General Relativistic Particle in an E-M Potential

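For a general relativistic particle with charge $e$ in an electromagnetic vector potential $A$ the Lagrangian is
\[ L(q, \dot q) = \tfrac12 m g_{ij}\dot q^i \dot q^j + eA_i \dot q^i \]
and thus
\[ p_i = \frac{\partial L}{\partial \dot q^i} = m g_{ij}\dot q^j + eA_i . \]
This $L$ is still strongly regular, but now each map
\[ \lambda|_{T_qQ} : T_qQ \to T_q^*Q , \qquad \dot q \mapsto m\, g(\dot q, -) + eA(q) \]
is affine rather than linear (see footnote 4).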

4.2.3

Example: Free General Relativistic Particle with Reparameterization Invariance

The free general relativistic particle with reparameterization-invariant Lagrangian has
\[ L(q, \dot q) = m\sqrt{g_{ij}\dot q^i \dot q^j} . \]
This is terrible from the perspective of regularity properties: it's not differentiable when $g_{ij}\dot q^i \dot q^j$ vanishes, and undefined when the same is negative. Where it is defined (where $\dot q$ is timelike),
\[ p_i = \frac{\partial L}{\partial \dot q^i} = \frac{m g_{ij}\dot q^j}{\|\dot q\|} , \]
and we can ask about regularity. Alas, the map $\lambda$ is not 1-1 where defined, since multiplying $\dot q$ by some positive number has no effect on $p$! (This is related to the reparameterization invariance; this always happens with reparameterization-invariant Lagrangians.)

4.2.4

Example: A Regular but not Strongly Regular Lagrangian

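Here's a Lagrangian that's regular but not strongly regular. Let $Q = \mathbb{R}$ and
\[ L(q, \dot q) = f(\dot q) \qquad\text{so that}\qquad p = \frac{\partial L}{\partial \dot q} = f'(\dot q) . \]
This will be regular, but not strongly regular, if $f' : \mathbb{R} \to \mathbb{R}$ is a diffeomorphism from $\mathbb{R}$ to some proper subset $U \subset \mathbb{R}$. For example, take $f(\dot q) = e^{\dot q}$, so $f' : \mathbb{R} \to (0, \infty) \subset \mathbb{R}$. So a graph with positive slope,
\[ L(q, \dot q) = e^{\dot q} , \]
or one with slope between $-1$ and $1$,
\[ L(q, \dot q) = \sqrt{1 + \dot q^2} , \]
and so forth.

Footnote 4: All linear transformations are affine, but affine transformations include translations, which are nonlinear. In affine geometry there is no defined origin. In the example of the charged particle the translation is the $+eA(q)$ part.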
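The second example above can be made concrete with a few lines of Python. This is only a sketch under the assumption $L = \sqrt{1 + \dot q^2}$ on $Q = \mathbb{R}$; the function names are hypothetical. It shows that the map $\dot q \mapsto p = \dot q/\sqrt{1 + \dot q^2}$ is invertible onto $(-1, 1)$ but never reaches momenta with $|p| \ge 1$, which is exactly what "regular but not strongly regular" means here.

```python
import math

def legendre_map(qdot):
    """p = dL/dqdot for L = sqrt(1 + qdot^2)."""
    return qdot / math.sqrt(1.0 + qdot * qdot)

def inverse_legendre_map(p):
    """Recover qdot from p; only defined for -1 < p < 1."""
    if not -1.0 < p < 1.0:
        raise ValueError("p lies outside the image (-1, 1) of the Legendre map")
    return p / math.sqrt(1.0 - p * p)

velocities = [-50.0, -2.0, 0.0, 0.5, 3.0, 100.0]
momenta = [legendre_map(v) for v in velocities]
recovered = [inverse_legendre_map(p) for p in momenta]

# Momenta with |p| >= 1 are not hit by any velocity.
try:
    inverse_legendre_map(1.0)
    outside_is_rejected = False
except ValueError:
    outside_is_rejected = True
```

Every velocity is recovered from its momentum, so the phase space is $X = \mathbb{R} \times (-1, 1)$, a proper open subset of $T^*Q = \mathbb{R} \times \mathbb{R}$.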

4.3

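Hamilton's Equations

Now let's assume $L$ is regular, so
\[ \lambda : TQ \to X \subseteq T^*Q , \qquad (q, \dot q) \mapsto (q, p) \]
is a diffeomorphism. This lets us have the best of both worlds: we can identify $TQ$ with $X$ using $\lambda$. This lets us treat $q^i$, $p_i$, $L$, $H$, etc., all as functions on $X$ (or $TQ$), thus writing $\dot q^i$ (a function on $X$) for the function $\dot q^i \circ \lambda^{-1}$, where the latter $\dot q^i$ is a function on $TQ$. In particular
\[ \dot p_i := \frac{\partial L}{\partial q^i} \qquad\text{(Euler-Lagrange eqn.)} , \]
which is really a function on $TQ$, will be treated as a function on $X$. Now let's calculate:
\[ dL = \frac{\partial L}{\partial q^i}\, dq^i + \frac{\partial L}{\partial \dot q^i}\, d\dot q^i = \dot p_i\, dq^i + p_i\, d\dot q^i \]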


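while
\[ dH = d(p_i \dot q^i - L) = \dot q^i\, dp_i + p_i\, d\dot q^i - dL = \dot q^i\, dp_i + p_i\, d\dot q^i - (\dot p_i\, dq^i + p_i\, d\dot q^i) = \dot q^i\, dp_i - \dot p_i\, dq^i , \]
so
\[ dH = \dot q^i\, dp_i - \dot p_i\, dq^i . \]
Assume the Lagrangian $L : TQ \to \mathbb{R}$ is regular, so
\[ \lambda : TQ \to X \subseteq T^*Q , \qquad (q, \dot q) \mapsto (q, p) \]
is a diffeomorphism. This lets us regard both $L$ and the Hamiltonian $H = p_i \dot q^i - L$ as functions on the phase space $X$, and use $(q^i, \dot q^i)$ as local coordinates on $X$. As we've seen, this gives us
\[ dL = \dot p_i\, dq^i + p_i\, d\dot q^i , \qquad dH = \dot q^i\, dp_i - \dot p_i\, dq^i . \]
But we can also work out $dH$ directly, this time using local coordinates $(q^i, p_i)$, to get
\[ dH = \frac{\partial H}{\partial p_i}\, dp_i + \frac{\partial H}{\partial q^i}\, dq^i . \]

Since $dp_i$, $dq^i$ form a basis of 1-forms, we conclude:
\[ \dot q^i = \frac{\partial H}{\partial p_i} , \qquad \dot p_i = -\frac{\partial H}{\partial q^i} . \]
These are Hamilton's equations.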

4.3.1

Hamilton and Euler-Lagrange

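Though $\dot q^i$ and $\dot p_i$ are just functions on $X$, when the Euler-Lagrange equations hold for some path $q : [t_0, t_1] \to Q$, they will be the time derivatives of $q^i$ and $p_i$. So when the Euler-Lagrange equations hold, Hamilton's equations describe the motion of a point $x(t) = (q(t), p(t)) \in X$. In fact, in this context, Hamilton's equations are just the Euler-Lagrange equations in disguise. The equation
\[ \dot q^i = \frac{\partial H}{\partial p_i} \]
really just lets us recover the velocity $\dot q$ as a function of $q$ and $p$, inverting the formula
\[ p_i = \frac{\partial L}{\partial \dot q^i} \]
which gave $p$ as a function of $q$ and $\dot q$. So we get a formula for the map
\[ \lambda^{-1} : X \to TQ , \qquad (q, p) \mapsto (q, \dot q) . \]
Given this, the other Hamilton equation
\[ \dot p_i = -\frac{\partial H}{\partial q^i} \]
is secretly the Euler-Lagrange equation
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot q^i} = \frac{\partial L}{\partial q^i} , \qquad\text{or}\qquad \dot p_i = \frac{\partial L}{\partial q^i} . \]
These are the same because
\[ \frac{\partial H}{\partial q^i} = \frac{\partial}{\partial q^i}\bigl(p_i \dot q^i - L\bigr) = -\frac{\partial L}{\partial q^i} . \]

Example: Particle in a Potential $V(q)$

For a particle in $Q = \mathbb{R}^n$ in a potential $V : \mathbb{R}^n \to \mathbb{R}$ the system has Lagrangian
\[ L(q, \dot q) = \frac{m}{2}\|\dot q\|^2 - V(q) , \]
which gives
\[ p = m\dot q , \qquad \dot q = \frac{p}{m} \quad \Bigl(\text{though really that's } \dot q^i = \frac{g^{ij}p_j}{m}\Bigr) , \]
and Hamiltonian
\[ H(q, p) = p_i \dot q^i - L = \frac{1}{m}\|p\|^2 - \Bigl(\frac{1}{2m}\|p\|^2 - V\Bigr) = \frac{1}{2m}\|p\|^2 + V(q) . \]
So Hamilton's equations say
\[ \dot q^i = \frac{\partial H}{\partial p_i} \;\Rightarrow\; \dot q = \frac{p}{m} , \qquad \dot p_i = -\frac{\partial H}{\partial q^i} \;\Rightarrow\; \dot p = -\nabla V . \]
The first just recovers $\dot q$ as a function of $p$; the second is $F = ma$.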
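Hamilton's equations for this example are easy to integrate numerically. The sketch below is an illustration only: it assumes the specific potential $V(q) = \tfrac12 k q^2$ in one dimension and uses the leapfrog scheme, a standard integrator that is not discussed in the lectures. All names and parameter values are made up for the example.

```python
import math

def dH_dq(q, k):
    """dH/dq = dV/dq for the assumed potential V(q) = (1/2) k q^2."""
    return k * q

def dH_dp(p, m):
    """dH/dp = p / m."""
    return p / m

def leapfrog(q, p, m, k, dt, steps):
    """Integrate qdot = dH/dp, pdot = -dH/dq with the leapfrog scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q, k)   # half step in momentum
        q += dt * dH_dp(p, m)         # full step in position
        p -= 0.5 * dt * dH_dq(q, k)   # half step in momentum
    return q, p

def hamiltonian(q, p, m, k):
    """H(q, p) = p^2 / (2m) + V(q)."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

m, k = 1.0, 1.0
steps = 10000
dt = 2.0 * math.pi / steps            # one full period when m = k = 1
q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, m, k, dt, steps)
energy_drift = abs(hamiltonian(q1, p1, m, k) - hamiltonian(q0, p0, m, k))
```

After one period the point $(q, p)$ returns very nearly to where it started and `energy_drift` stays tiny, as expected since $H$ is conserved along solutions of Hamilton's equations.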


Note on Symplectic Structure

Hamilton's equations push us toward the viewpoint where $p$ and $q$ have equal status as coordinates on the phase space $X$. Soon, we'll drop the requirement that $X \subseteq T^*Q$ where $Q$ is a configuration space. $X$ will just be a manifold equipped with enough structure to write down Hamilton's equations starting from any $H : X \to \mathbb{R}$. The coordinate-free description of this structure is the major 20th century contribution to mechanics: a symplectic structure. This is important. You might have some particles moving on a manifold like $S^3$, which is not symplectic. So the Hamiltonian mechanics point of view says that the abstract manifold that you are really interested in is something different: it must be a symplectic manifold. That's the phase space $X$. We'll introduce symplectic geometry more completely in later chapters.

4.3.2

Hamilton's Equations from the Principle of Least Action

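Before, we obtained the Euler-Lagrange equations by associating an action $S$ with any $q : [t_0, t_1] \to Q$ and setting $\delta S = 0$. Now let's get Hamilton's equations directly by assigning an action $S$ to any path $x : [t_0, t_1] \to X$ and setting $\delta S = 0$. Note: we don't impose any relation between $p$ and $q$, $\dot q$ at all! The relation will follow from $\delta S = 0$. Let $P$ be the space of paths in the phase space $X$ and define the action $S : P \to \mathbb{R}$ by
\[ S(x) = \int_{t_0}^{t_1} (p_i \dot q^i - H)\, dt , \]
where $p_i \dot q^i - H = L$. More precisely, write our path $x$ as $x(t) = (q(t), p(t))$ and let
\[ S(x) = \int_{t_0}^{t_1} \Bigl[ p_i(t)\, \frac{d}{dt} q^i(t) - H\bigl(q(t), p(t)\bigr) \Bigr]\, dt ; \]
we write $\frac{d}{dt} q^i$ instead of $\dot q^i$ to emphasize that we mean the time derivative rather than a coordinate in phase space. Let's show that $\delta S = 0 \Leftrightarrow$ Hamilton's equations.
\[ \delta S = \delta \int (p_i \dot q^i - H)\, dt = \int \bigl( \delta p_i\, \dot q^i + p_i\, \delta \dot q^i - \delta H \bigr)\, dt , \]
then integrating by parts (the boundary term vanishes since $\delta q = 0$ at the endpoints),
\[ \delta S = \int \bigl( \delta p_i\, \dot q^i - \dot p_i\, \delta q^i - \delta H \bigr)\, dt \]
\[ = \int \Bigl( \delta p_i\, \dot q^i - \dot p_i\, \delta q^i - \frac{\partial H}{\partial q^i}\delta q^i - \frac{\partial H}{\partial p_i}\delta p_i \Bigr)\, dt \]
\[ = \int \Bigl[ \delta p_i \Bigl( \dot q^i - \frac{\partial H}{\partial p_i} \Bigr) + \delta q^i \Bigl( -\dot p_i - \frac{\partial H}{\partial q^i} \Bigr) \Bigr]\, dt . \]
This vanishes for all $\delta x = (\delta q, \delta p)$ if and only if Hamilton's equations
\[ \dot q^i = \frac{\partial H}{\partial p_i} , \qquad \dot p_i = -\frac{\partial H}{\partial q^i} \]
hold. Just as we hoped. We've now seen two principles of least action:

1. For paths in configuration space $Q$, $\delta S = 0 \Leftrightarrow$ Euler-Lagrange equations.

2. For paths in phase space $X$, $\delta S = 0 \Leftrightarrow$ Hamilton's equations.

Additionally, since $X \subseteq T^*Q$, we might consider a third version based on paths in position-velocity space $TQ$. But when our Lagrangian is regular we have a diffeomorphism $\lambda : TQ \to X$, so this third principle of least action is just a reformulation of principle 2. However, the really interesting principle of least action involves paths in the extended phase space, $X \times \mathbb{R}$, where we have an additional coordinate for time. Recall the action
\[ S(x) = \int (p_i \dot q^i - H)\, dt = \int \Bigl( p_i \frac{dq^i}{dt}\, dt - H\, dt \Bigr) = \int p_i\, dq^i - H\, dt . \]
We can interpret the integrand as a 1-form
\[ \beta = p_i\, dq^i - H\, dt \]
on $X \times \mathbb{R}$, which has coordinates $\{p_i, q^i, t\}$. So any path
\[ x : [t_0, t_1] \to X \]
gives a path
\[ \sigma : [t_0, t_1] \to X \times \mathbb{R} , \qquad t \mapsto (x(t), t) , \]
and the action becomes the integral of a 1-form over a curve:
\[ S(x) = \int p_i\, dq^i - H\, dt = \int_\sigma \beta . \]

Why is this more interesting, you may ask? The answer is: just read onwards.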
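The identity between the time integral of $L$ and the line integral of $p_i\, dq^i - H\, dt$ can be checked numerically. The following Python sketch assumes the one-dimensional Hamiltonian $H = \tfrac12(p^2 + q^2)$ and its exact solution $q = \cos t$, $p = -\sin t$; this example system and the function names are illustrative choices, not taken from the lectures.

```python
import math

def path(t):
    """Phase-space path (q(t), p(t)) solving Hamilton's equations for H = (p^2 + q^2)/2."""
    return math.cos(t), -math.sin(t)

def H(q, p):
    """Assumed Hamiltonian of the unit harmonic oscillator."""
    return 0.5 * (p * p + q * q)

def action_one_form(t0, t1, steps=20000):
    """Approximate the integral of p dq - H dt along t -> (x(t), t) by a midpoint rule."""
    dt = (t1 - t0) / steps
    total = 0.0
    for n in range(steps):
        ta = t0 + n * dt
        tb = ta + dt
        qa, _ = path(ta)
        qb, _ = path(tb)
        qm, pm = path(0.5 * (ta + tb))
        total += pm * (qb - qa) - H(qm, pm) * dt
    return total

T = 1.3
S_numeric = action_one_form(0.0, T)
# Along this path L = p*qdot - H = sin(t)^2 - 1/2 = -cos(2t)/2, whose integral is -sin(2T)/4.
S_exact = -0.25 * math.sin(2.0 * T)
```

The two numbers agree to within the discretization error, illustrating that the action depends only on the curve in extended phase space and the 1-form integrated along it.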

4.4

Waves versus Particles: The Hamilton-Jacobi Equations

(Week 7, May 9, 11, 13.) In quantum mechanics we discover that every particle (electrons, photons, neutrinos, etc.) is a wave, and vice versa. Interestingly, Newton already had a particle theory of light (his corpuscles) and various physicists argued against it by pointing out that diffraction is best explained by a wave theory. We've talked about geometrized optics, an approximation in which light consists of particles moving along geodesics. Here we start with a Riemannian manifold $(Q, g)$ as space, but we use the new metric
\[ h_{ij} = n^2 g_{ij} , \]
where $n : Q \to (0, \infty)$ is the index of refraction throughout space (generally not a constant).

4.4.1

Wave Equations

Huygens considered this same setup (in simpler language) and considered the motion of a wavefront, and saw that the wavefront is the envelope of a bunch of little wavelets centered at points along the big wavefront.

[Figure: balls of radius $\epsilon$ centered at points of the old wavefront.]

In short, the wavefront moves at unit speed in the normal direction with respect to the optical metric $h$. We can think about the distance function
\[ d : Q \times Q \to [0, \infty) \]
on the Riemannian manifold $(Q, h)$, where
\[ d(q_0, q_1) = \inf_{\Upsilon} (\text{arclength}) \]
and $\Upsilon = \{\text{paths from } q_0 \text{ to } q_1\}$. (Secretly this $d(q_0, q_1)$ is the least action: the infimum of action over all paths from $q_0$ to $q_1$.) Using this we get the wavefronts centered at $q_0 \in Q$ as the level sets
\[ \{q : d(q_0, q) = c\} , \]
or rather this is so for small $c > 0$, as depicted in Fig. 4.2.

Figure 4.2: Wavefronts around $q_0$.

For larger $c$ the level sets can cease to be smooth (we say a catastrophe occurs) and then the wavefronts are no longer the level sets. This sort of situation can happen for topological reasons (Fig. 4.3) or it can happen for geometrical reasons (Fig. 4.4).

Figure 4.3: A level set of $d(q_0, \cdot)$.

Figure 4.4: A catastrophe arising for geometrical reasons.

Assuming no such catastrophes occur, we can approximate the waves of light by a wavefunction:
\[ \psi(q) = A(q)\, e^{ik\, d(q, q_0)} , \]
where $k$ is the wavenumber of the light (i.e., its color) and $A : Q \to \mathbb{R}$ describes the amplitude of the wave, which drops off far from $q_0$. This becomes the eikonal approximation in optics (see footnote 5) once we figure out what $A$ should be. Hamilton and Jacobi focused on distance $d : Q \times Q \to [0, \infty)$ as a function of two variables and called it $W$, Hamilton's principle function. They noticed that
\[ \frac{\partial}{\partial q_1^i} W(q_0, q_1) = (p_1)_i , \]
where $p_1$ is a cotangent vector pointing normal to the wavefronts.

[Figure: a path from $q_0$ to $q_1$, with $p_1$ at $q_1$.]

Footnote 5: Eikonal comes from the Greek word for image or likeness; in optics the eikonal approximation is the basis for ray tracing methods.


4.4.2

The Hamilton-Jacobi Equations

We've seen that in optics, particles of light move along geodesics, but wavefronts are level sets of the distance function, at least while the level sets remain smooth. In the eikonal approximation, light is described by waves
\[ \psi : Q \to \mathbb{C} , \qquad \psi(q_1) = A(q_1)\, e^{ik\, W(q_0, q_1)} , \]
where $(Q, h)$ is a Riemannian manifold, $h$ is the optical metric, $q_0 \in Q$ is the light source, $k$ is the frequency and
\[ W : Q \times Q \to [0, \infty) \]
is the distance function on $Q$, or Hamilton's principle function:
\[ W(q_0, q_1) = \inf_{q \in \Upsilon} S(q) , \]
where $\Upsilon$ is the space of paths from $q_0$ to $q_1$ and $S(q)$ is the action of the path $q$, i.e., its arclength. This is begging to be generalized to other Lagrangian systems! (At least retrospectively, with the advantage of our historical perspective.) We also saw that
\[ \frac{\partial}{\partial q_1^i} W(q_0, q_1) = (p_1)_i \]
points normal to the wavefront; really the tangent vector
\[ p_1^i = h^{ij}(p_1)_j \]
points in this direction. In fact $k p_{1i}$ is the momentum of the light passing through $q_1$. This foreshadows quantum mechanics! (Note: in QM, the momentum is a derivative operator: we get $p$ by differentiating the wavefunction!)

Jacobi generalized this to the motion of point particles in a potential $V : Q \to \mathbb{R}$, using the fact that a particle of energy $E$ traces out geodesics in the metric
\[ h_{ij} = \frac{2(E - V)}{m}\, g_{ij} . \]


We've seen this reduces point particle mechanics to optics, but only for particles of fixed energy $E$. Hamilton went further, and we now can go further still. Suppose $Q$ is any manifold and $L : TQ \to \mathbb{R}$ is any function (Lagrangian). Define Hamilton's principle function
\[ W : (Q \times \mathbb{R})^2 \to \mathbb{R} \]
by
\[ W(q_0, t_0; q_1, t_1) = \inf_{q \in \Upsilon} S(q) , \]
where
\[ \Upsilon = \bigl\{ q : [t_0, t_1] \to Q ,\; q(t_0) = q_0 \;\&\; q(t_1) = q_1 \bigr\} \]
and
\[ S(q) = \int_{t_0}^{t_1} L\bigl(q(t), \dot q(t)\bigr)\, dt . \]

Now $W$ is just the least action for a path from $(q_0, t_0)$ to $(q_1, t_1)$; it'll be smooth if $(q_0, t_0)$ and $(q_1, t_1)$ are close enough, so let's assume that is true. In fact, we have
\[ \frac{\partial}{\partial q_1^i} W = (p_1)_i , \]
where $p_1$ is the momentum of the particle going from $q_0$ to $q_1$, at time $t_1$, and
\[ \frac{\partial W}{\partial q_0^i} = -(p_0)_i \quad (= -\text{momentum at time } t_0) , \]
\[ \frac{\partial W}{\partial t_1} = -H_1 \quad (= -\text{energy at time } t_1) , \]
\[ \frac{\partial W}{\partial t_0} = H_0 \quad (= +\text{energy at time } t_0) \]
($H_1 = H_0$ as energy is conserved). These last four equations are the Hamilton-Jacobi equations. The mysterious minus sign in front of energy was seen before in the 1-form
\[ \beta = p_i\, dq^i - H\, dt \]
on the extended phase space $X \times \mathbb{R}$. Maybe the best way to get the Hamilton-Jacobi equations is from this extended phase space formulation. But for now let's see how Hamilton's principle function $W$ and variational principles involving least action also yield the Hamilton-Jacobi equations. Given $(q_0, t_0)$, $(q_1, t_1)$, let
\[ q : [t_0, t_1] \to Q \]


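be the action-minimizing path from $q_0$ to $q_1$. Then
\[ W(q_0, t_0; q_1, t_1) = S(q) . \]
Now consider varying $q_0$ and $q_1$ a bit,

[Figure: the path $q$ between times $t_0$ and $t_1$, with its endpoints varied.]

and thus vary the action-minimizing path, getting a variation $\delta q$ which does not vanish at $t_0$ and $t_1$. We get
\[ \delta W = \delta S = \delta \int_{t_0}^{t_1} L(q, \dot q)\, dt = \int_{t_0}^{t_1} \Bigl( \frac{\partial L}{\partial q^i}\delta q^i + \frac{\partial L}{\partial \dot q^i}\delta \dot q^i \Bigr)\, dt \]
\[ = \int_{t_0}^{t_1} \Bigl( \frac{\partial L}{\partial q^i}\delta q^i - \dot p_i\, \delta q^i \Bigr)\, dt + p_i\, \delta q^i \Big|_{t_0}^{t_1} = \int_{t_0}^{t_1} \Bigl( \frac{\partial L}{\partial q^i} - \dot p_i \Bigr)\delta q^i\, dt + p_i\, \delta q^i \Big|_{t_0}^{t_1} ; \]
the term in parentheses is zero because $q$ minimizes the action and the Euler-Lagrange equations hold. So we have
\[ \delta W = p_{1i}\, \delta q_1^i - p_{0i}\, \delta q_0^i \]
and so
\[ \frac{\partial W}{\partial q_1^i} = p_{1i} , \qquad \frac{\partial W}{\partial q_0^i} = -p_{0i} . \]
These are two of the four Hamilton-Jacobi equations! To get the other two, we need to vary $t_0$ and $t_1$:

[Figure: the endpoints shifted in time, $t_0 \to t_0 + \Delta t_0$ and $t_1 \to t_1 + \Delta t_1$; now the change in $W$ will involve $\Delta t_0$ and $\Delta t_1$.]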


(You can imagine $\Delta t_0 < 0$ in that figure if you like.) We want to derive the Hamilton-Jacobi equations describing the derivatives of Hamilton's principle function
\[ W(q_0, t_0; q_1, t_1) = \inf_{q \in \Upsilon} S(q) , \]
where $\Upsilon$ is the space of paths $q : [t_0, t_1] \to Q$ with $q(t_0) = q_0$, $q(t_1) = q_1$ and
\[ S(q) = \int_{t_0}^{t_1} L(q, \dot q)\, dt , \]
where the Lagrangian $L : TQ \to \mathbb{R}$ will now be assumed regular, so that
\[ \lambda : TQ \to X \subseteq T^*Q , \qquad (q, \dot q) \mapsto (q, p) \]
is a diffeomorphism. We need to ensure that $(q_0, t_0)$ is close enough to $(q_1, t_1)$ that there is a unique $q \in \Upsilon$ that minimizes the action $S$, and assume that this $q$ depends smoothly on $u = (q_0, t_0; q_1, t_1) \in (Q \times \mathbb{R})^2$. We'll think of $q$ as a function of $u$:
\[ (Q \times \mathbb{R})^2 \to \Upsilon , \qquad u \mapsto q , \]
defined only when $(q_0, t_0)$ and $(q_1, t_1)$ are sufficiently close.

[Figure: the path $q$ from $(t_0, q_0)$ to $(t_1, q_1)$.]

Then Hamilton's principle function is
\[ W(u) := W(q_0, t_0; q_1, t_1) = S(q) = \int_{t_0}^{t_1} L(q, \dot q)\, dt = \int_{t_0}^{t_1} \bigl( p\dot q - H(q, p) \bigr)\, dt = \int_{t_0}^{t_1} p\, dq - H\, dt = \int_C \beta , \]
where $\beta = p\, dq - H(q, p)\, dt$ is a 1-form on the extended phase space $X \times \mathbb{R}$, and $C$ is a curve in the extended phase space:
\[ C(t) = \bigl( q(t), p(t), t \bigr) \in X \times \mathbb{R} . \]


Note that $C$ depends on the curve $q \in \Upsilon$, which in turn depends upon $u = (q_0, t_0; q_1, t_1) \in (Q \times \mathbb{R})^2$. We are after the derivatives of $W$ that appear in the Hamilton-Jacobi relations, so let's differentiate
\[ W(u) = \int_C \beta \]
with respect to $u$ and get the Hamilton-Jacobi equations from $\beta$. Let $u_s$ be a 1-parameter family of points in $(Q \times \mathbb{R})^2$ and work out
\[ \frac{d}{ds} W(u_s) = \frac{d}{ds} \int_{C_s} \beta , \]
where $C_s$ depends on $u_s$ as above.

[Figure: the curves $C_0$ and $C_s$, with $A_s$ joining their initial points and $B_s$ joining their final points.]

Let's compare
\[ \int_{C_0} \beta \qquad\text{and}\qquad \int_{A_s + C_s - B_s} \beta = \int_{A_s} \beta + \int_{C_s} \beta - \int_{B_s} \beta . \]
Since $C_0$ minimizes the action among paths with the given end-points, and the curve $A_s + C_s - B_s$ has the same end-points, we get
\[ \frac{d}{ds} \int_{A_s + C_s - B_s} \beta = 0 \]
(although $A_s + C_s - B_s$ is not smooth, we can approximate it by a path that is smooth). So
\[ \frac{d}{ds} \int_{C_s} \beta = \frac{d}{ds} \int_{B_s} \beta - \frac{d}{ds} \int_{A_s} \beta \qquad\text{at } s = 0 . \]
Note
\[ \frac{d}{ds} \int_{A_s} \beta = \frac{d}{ds} \int_0^s \beta(A'_r)\, dr = \beta(A'_0) , \]
where $A'_0 = v$ is the tangent vector of $A_s$ at $s = 0$. Similarly,
\[ \frac{d}{ds} \int_{B_s} \beta = \beta(w) \]

4.4 Waves versus ParticlesThe Hamilton-Jacobi Equations

69

where w = B0 . So,

d W (us ) = (w) (v) ds where w keeps track of the change of (q1 , p1 , t1 ) as we move Cs and v keeps track of (q0 , p0 , t0 ). Now since = pi dqi Hdt, we get W = pi , 1 i q1 W = H t1 and similarly W = pi , 0 i q0 W = H. t0

So, if we dene a wavefunction: (q0 , t0 ; q1 , t1 ) = eiW (q0 ,t0 ;q1 ,t1 )/ then we get i = H1 , t1 i = p1 i q1 which at the time of Hamilton and Jacobis research was interesting enough, but nowadays it is thoroughly familiar from quantum mechanics!
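The four Hamilton-Jacobi relations can be checked numerically in the simplest case. The following is a minimal sketch, assuming a free particle of mass $m$ on the line (an example not worked in the lectures), for which the action-minimizing path is a straight line and $W = m(q_1 - q_0)^2 / 2(t_1 - t_0)$. The helper name `partial` and the sample endpoint values are illustrative choices only.

```python
# Free particle of mass m on the line (an assumed example): the minimizing
# path is a straight line, so W = m (q1 - q0)^2 / (2 (t1 - t0)).

def W(q0, t0, q1, t1, m=1.0):
    return 0.5 * m * (q1 - q0) ** 2 / (t1 - t0)

def partial(f, args, k, h=1e-6):
    """Central-difference partial derivative of f with respect to args[k]."""
    up = list(args)
    dn = list(args)
    up[k] += h
    dn[k] -= h
    return (f(*up) - f(*dn)) / (2 * h)

m = 1.0
u = (0.3, 0.0, 2.1, 1.5)                 # u = (q0, t0, q1, t1)
p = m * (u[2] - u[0]) / (u[3] - u[1])    # constant momentum along the path
H = p ** 2 / (2 * m)                     # conserved energy

dW_dq0 = partial(W, u, 0)
dW_dt0 = partial(W, u, 1)
dW_dq1 = partial(W, u, 2)
dW_dt1 = partial(W, u, 3)

print(dW_dq1, p)     # dW/dq1 should match  p
print(dW_dq0, -p)    # dW/dq0 should match -p
print(dW_dt1, -H)    # dW/dt1 should match -H
print(dW_dt0, H)     # dW/dt0 should match  H
```

Each printed pair should agree up to finite-difference error, which illustrates the signs in the four relations: momentum at the final endpoint, minus momentum at the initial one, and energy with the opposite pattern.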

Chapter 5 Symplectic Geometry

(Week 8, May 16, 18, 20.)

5.1 Towards Symplectic Geometry

Last time we saw that the 1-form
\[ \beta = p_i\,dq^i - H\,dt \]
on the extended phase space $X \times \mathbb{R} \ni (q, p, t)$ can be integrated to get the action, with least action giving Hamilton's principle function $W(q_0, t_0; q_1, t_1)$. Differentiating this we get momentum and energy: the Hamilton-Jacobi equations. These foreshadow quantum mechanics, since we can define a wavefunction
\[ \psi(q_0, t_0; q_1, t_1) = e^{i W(q_0, t_0; q_1, t_1)/\hbar} \]
which (approximately) gives the amplitude for a quantum particle to get from $(q_0, t_0)$ to $(q_1, t_1)$.

[Figure: the least action path from $(q_0, t_0)$ to $(q_1, t_1)$, drawn together with the wavefunction.]


In full-fledged quantum mechanics we instead calculate $\psi$ exactly by
\[ \psi(q_0, t_0; q_1, t_1) = \int_\Upsilon e^{i S(q)/\hbar}\, Dq \]
where $\Upsilon = \{ q : [t_0, t_1] \to Q \mid q(t_0) = q_0 \ \&\ q(t_1) = q_1 \}$ and $Dq$ makes no sense (not yet at least, except in certain special cases, like a particle in a potential on a Riemannian manifold $Q$). The case when $\psi = e^{iW/\hbar}$ is a good approximation to the full-fledged path integral is precisely the case when classical mechanics is approximately right, though really $e^{iW/\hbar}$ multiplied by a slowly varying amplitude works better (the eikonal or WKB approximation).

But in classical mechanics, people focus less on the extended phase space $X \times \mathbb{R}$ than on the phase space $X$. The 1-form $\beta$ doesn't live on $X$, but something else does. If $x : [t_0, t_1] \to X$ is a path, then we get $C : [t_0, t_1] \to X \times \mathbb{R}$ with $C(t) = (x(t), t)$, and then
\[ S = \int_C \beta = \int_C p_i\,dq^i - H\,dt \]
and if $x$ satisfies Hamilton's equations, so that $H$ is conserved, then
\[ S = -E(t_1 - t_0) + \int_x p_i\,dq^i \]
(doing the integral over $dt$ leaves us just with a 1-form to integrate on $X$), where $E$ is the energy. So people focus on $\alpha = p_i\,dq^i$ on $X$. If the system executes periodic motion, we can take $x$ to be a loop in $X$,

[Figure: a loop $x$ in the phase space $X$.]

and the action is
\[ S = -E \cdot \text{period} + \oint_x \alpha = -E \cdot \text{period} + \int_\sigma d\alpha \]
if $\sigma$ is any surface with $\partial\sigma = x$, by Stokes' theorem (this works provided $x$ is a contractible loop). Note $d\alpha = dp_i \wedge dq^i$ is a 2-form on $X$; a lot of theorists focus on this: the symplectic structure. In the Bohr-Sommerfeld "old quantum mechanics", energy eigenstates correspond to periodic orbits for which
\[ \int_\sigma dp_i \wedge dq^i = n\, 2\pi\hbar = nh \]
so that
\[ e^{\frac{i}{\hbar} \int_\sigma dp_i \wedge dq^i} = 1. \]

In the modern symplectic geometry approach, people focus on the 1-form $\alpha$ on $T^*Q$ and especially on the 2-form $\omega := d\alpha$. In these lectures we will follow suit and (re)develop classical mechanics starting with these objects. The point of all the extra work is to gain deeper insight through generalizations and geometrization of physics.
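The Bohr-Sommerfeld condition can be made concrete with a short numerical check. This is only a sketch under stated assumptions: it takes the harmonic oscillator $H = \frac{1}{2}(p^2 + q^2)$ with unit mass and frequency, sets $\hbar = 1$, and uses a hypothetical helper `loop_integral_p_dq` to approximate $\oint p\,dq$ around an orbit of energy $E$. For this system the loop integral equals $2\pi E$, so the condition $\oint p\,dq = 2\pi n\hbar$ selects $E = n\hbar$.

```python
import math

def loop_integral_p_dq(E, steps=20000):
    """Approximate the loop integral of p dq around the orbit of energy E
    for H = (p^2 + q^2)/2, traversed in the direction of the flow."""
    r = math.sqrt(2 * E)
    total = 0.0
    for k in range(steps):
        th0 = 2 * math.pi * k / steps
        th1 = 2 * math.pi * (k + 1) / steps
        # Orbit: q = r cos(theta), p = -r sin(theta), i.e. clockwise motion.
        q0, q1 = r * math.cos(th0), r * math.cos(th1)
        p_mid = -r * math.sin(0.5 * (th0 + th1))
        total += p_mid * (q1 - q0)
    return total

hbar = 1.0   # assumed units
for n in range(4):
    E = n * hbar                      # candidate Bohr-Sommerfeld energy
    action = loop_integral_p_dq(E)
    print(n, action, 2 * math.pi * n * hbar)   # the last two columns should agree
```

The integral is the area enclosed by the orbit in the $(q, p)$ plane, which is the two-dimensional case of $\int_\sigma dp_i \wedge dq^i$.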

5.2 The Canonical Forms on the Cotangent Bundle

These forms will be the natural objects of study in classical mechanics in the Hamiltonian approach. Why? Because one can integrate forms on a manifold and, as we have seen, the Hamiltonian picture involves a symplectic manifold, the cotangent bundle $T^*Q$ of the configuration space $Q$. But we will see that symplectic geometry is a powerful tool because with it we can extend the study of classical mechanics to more general phase spaces $X$ that are not subsets of $T^*Q$. We'll get to this later.

5.2.1 The Canonical 1-form on the Cotangent Bundle

First let's describe the 1-form
\[ \alpha = p_i\,dq^i \]
in a coordinate-free way (which as usual means looking for unique mappings between appropriate spaces). Any tangent vector at $(q, p) \in T^*Q$ is of the form
\[ v = a^i \frac{\partial}{\partial q^i} + b_i \frac{\partial}{\partial p_i} \]
using coordinates $(q^i, p_i)$. Note that this means $v \in T_{(q,p)} T^*Q$. So
\[ \alpha(v) = p_i a^i. \]

Alternatively, let
\[ \pi : T^*Q \to Q, \qquad (q, p) \mapsto q \]
and we claim that
\[ \alpha(v) = p\bigl( d\pi(v) \bigr), \qquad p \in T_q^*Q, \quad d\pi(v) \in T_q Q. \]
So for any manifold $Q$ (which physically is the configuration space in our classical mechanics exposition) the cotangent bundle $T^*Q$ has a 1-form $\alpha$ on it:
\[ \alpha = p_i\,dq^i \]
in local coordinates $(q^i, p_i)$ on $T^*Q$ coming from local coordinates on $Q$. Given any tangent vector $v \in T_{(q,p)} T^*Q$, we can write
\[ v = a^i \frac{\partial}{\partial q^i} + b_i \frac{\partial}{\partial p_i} \]
where $(q, p)$ is a point of the cotangent bundle $T^*Q$ and $v$ is a tangent vector to $T^*Q$ at that point. Then we declare
\[ \alpha(v) = p_i a^i. \]
This is called the canonical 1-form on $T^*Q$ because it's coordinate independent. We have
\[ \pi : T^*Q \to Q, \qquad (q, p) \mapsto q \]
and thus
\[ d\pi : T_{(q,p)} T^*Q \to T_q Q \]
and in fact
\[ \alpha(v) = p\bigl( d\pi(v) \bigr), \qquad v \in T_{(q,p)} T^*Q; \]
this is the coordinate-free description of $\alpha$. Let's check that this is true, i.e., let's show that $\alpha(v) = p(d\pi(v)) = p_i a^i$. Well, if
\[ v = a^i \frac{\partial}{\partial q^i} + b_i \frac{\partial}{\partial p_i} \]
then
\[ d\pi(v) = a^i \frac{\partial}{\partial q^i} \in T_q Q \]
where now $\{ \partial/\partial q^i \}$ is a basis of tangent vectors on $Q$. We can then write out
\[ p\bigl( d\pi(v) \bigr) = (p_j\,dq^j)\Bigl( a^i \frac{\partial}{\partial q^i} \Bigr) = p_i a^i, \qquad \text{since } dq^j\Bigl( \frac{\partial}{\partial q^i} \Bigr) = \delta^j_i, \]
which is exactly as required.

5.2.2 The Symplectic 2-form on the Cotangent Bundle

Symplectic geometry focuses not on $\alpha$ but on the 2-form
\[ \omega = d\alpha. \]
We'll write down Hamilton's equations using $\omega$ and show that $\omega$ (unlike $\alpha$) is invariant under the resulting time evolution maps $T^*Q \to T^*Q$ (or $X \to X$).

If we isolate the key properties of $\omega$ that make this work, we can generalize classical mechanics to systems where the phase space $X$ is not $T^*Q$ or some open subset of $T^*Q$. For example, a classical spinning point particle has $X = S^2$:

[Figure: the sphere $S^2$.]

Such phase spaces look locally but not globally like some cotangent bundle, and not in any canonical way. Here are the key properties of $\omega$:

1. $\omega$ is a 2-form on $X$ (in this case $X = T^*Q$)
2. $\omega$ is closed: $d\omega = 0$ (in this case exact, since $\omega = d\alpha$)
3. $\omega$ is nondegenerate:
\[ \omega : T_x X \to T_x^* X, \qquad v \mapsto \omega(v, -) \]
is an isomorphism (it's 1-1, hence onto).

In our example if $v = a^i \frac{\partial}{\partial q^i} + b_i \frac{\partial}{\partial p_i}$, then
\[ \omega(v) = \omega(v, -) = dp_j \wedge dq^j \Bigl( a^i \frac{\partial}{\partial q^i} + b_i \frac{\partial}{\partial p_i},\, - \Bigr) = b_j\,dq^j - a^j\,dp_j \]
which is nonzero if $v$ is nonzero:
\[ \omega : \frac{\partial}{\partial q^i} \mapsto -dp_i, \qquad \omega : \frac{\partial}{\partial p_i} \mapsto dq^i \]
which should remind you of Hamilton's equations! To summarize, we define a symplectic structure on a manifold $X$ to be a nondegenerate closed 2-form $\omega$ on $X$, and we then call $(X, \omega)$ a symplectic manifold. Now let $X$ be any symplectic manifold (e.g., $T^*Q$). Given any function $H : X \to \mathbb{R}$ (like our Hamiltonian) we can define a vector field $v_H$ on $X$ by
\[ v_H = \omega^{-1}(dH) \]
and then Hamilton's equations say that a state $x(t) \in X$ evolves as follows:
\[ \frac{d}{dt} x(t) = v_H\bigl( x(t) \bigr). \]

[Figure: the vector field $v_H$ and the 1-form $dH$.]

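5.3 Hamilton's Equations on a Symplectic Manifold

Let $(X, \omega)$ be a symplectic manifold: $\omega$ is a nondegenerate closed 2-form, so we get
\[ \omega : T_x X \to T_x^* X, \qquad v \mapsto \omega(v, -) \]
an isomorphism. For any $H \in C^\infty(X)$ we thus get the Hamiltonian vector field
\[ v_H = \omega^{-1}(dH). \]

[Figure: the vector field $v_H$, tangent to the level sets of $H$.]

If $X$ is the space of states of a classical system, we describe how a state $x \in X$ evolves in time using the curve $x(t) \in X$ satisfying Hamilton's equations:
\[ \frac{d}{dt} x(t) = v_H\bigl( x(t) \bigr) \]
with $x(0) = x$ as its initial conditions. (Note that $x$ is being used both for the initial state $x \in X$ and the curve $x : (a, b) \to X$ with $x(0) = x$.) It's easy to find nonintegrable examples:
\[ X = T^*\mathbb{R}, \qquad H = \frac{p^2}{2m} - q^4. \]
Here the particle shoots off to infinity in finite time. There are even solutions in the 5-body Newtonian gravity problem where bodies shoot off to infinity in finite time [Xia92].

Let's see what these Hamilton's equations look like when $X = T^*Q$ (not always true, but it's nice to see what this special type of system yields). Then $\omega = dp_i \wedge dq^i$ and we have
\[ \omega : T_x X \to T_x^* X, \qquad \frac{\partial}{\partial q^i} \mapsto -dp_i, \qquad \frac{\partial}{\partial p_i} \mapsto dq^i \]
so
\[ \omega^{-1} : T_x^* X \to T_x X, \qquad dq^i \mapsto \frac{\partial}{\partial p_i}, \qquad dp_i \mapsto -\frac{\partial}{\partial q^i} \]
so given $H : T^*Q \to \mathbb{R}$ we get
\begin{align*}
v_H &= \omega^{-1}(dH) \\
&= \omega^{-1}\Bigl( \frac{\partial H}{\partial q^i}\,dq^i + \frac{\partial H}{\partial p_i}\,dp_i \Bigr) \\
&= \frac{\partial H}{\partial q^i} \frac{\partial}{\partial p_i} - \frac{\partial H}{\partial p_i} \frac{\partial}{\partial q^i}
\end{align*}
and thus Hamilton's equations say
\[ \frac{d}{dt} x(t) = v_H\bigl( x(t) \bigr) \]
or
\[ \frac{d}{dt} q^i\bigl( x(t) \bigr) = v_H(q^i) = -\frac{\partial H}{\partial p_i} \qquad \text{and} \qquad \frac{d}{dt} p_i\bigl( x(t) \bigr) = v_H(p_i) = \frac{\partial H}{\partial q^i}. \]

Alas, these have a minus sign as compared with the usual Hamilton's equations; to cure this we should define $\omega = -d\alpha = dq^i \wedge dp_i$ and so get
\[ \frac{dq^i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q^i}. \]

We've just discovered that any symplectic manifold $(M, \omega)$ has a "dual" or "time-reversed" version $(M, -\omega)$.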
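To see the final form of Hamilton's equations in action, here is a minimal sketch that evaluates the components of $v_H$ for one degree of freedom by finite differences, using the corrected convention $\dot q = \partial H/\partial p$, $\dot p = -\partial H/\partial q$. The function name `hamiltonian_vector_field` and the pendulum Hamiltonian are illustrative assumptions, not part of the lectures.

```python
import math

def hamiltonian_vector_field(H, q, p, h=1e-6):
    """Return (dq/dt, dp/dt) = (dH/dp, -dH/dq) at the point (q, p),
    using central differences for the partial derivatives of H."""
    dH_dq = (H(q + h, p) - H(q - h, p)) / (2 * h)
    dH_dp = (H(q, p + h) - H(q, p - h)) / (2 * h)
    return dH_dp, -dH_dq

def pendulum(q, p):
    # Hypothetical example Hamiltonian (unit mass, length and gravity).
    return 0.5 * p ** 2 - math.cos(q)

q, p = 0.7, -0.2
qdot, pdot = hamiltonian_vector_field(pendulum, q, p)
print(qdot, p)               # dq/dt should be close to p
print(pdot, -math.sin(q))    # dp/dt should be close to -sin(q)
```

Replacing the returned pair by `(-dH_dp, dH_dq)` would give the vector field for the opposite sign of $\omega$, that is, the time-reversed dynamics mentioned above.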

5.4 Darboux's Theorem

This condition seems special to the case of a cotangent bundle $X = T^*Q$, but in fact we have:

Theorem 5.1 (Darboux's theorem). If $(X, \omega)$ is any symplectic manifold and $x \in X$, then you can find symplectic coordinates $p_i, q^i$ ($i = 1, \dots, n$) in a neighborhood of $x$ such that
\[ \omega = dq^i \wedge dp_i. \]

A short way of stating Darboux's theorem is to use some jargon and say "every symplectic manifold of fixed dimension is locally symplectomorphic". That is, every $2n$-dimensional symplectic manifold can be made to look locally like the linear symplectic space $\mathbb{C}^n$ with its canonical symplectic form. To make sense of this we need the definition of a symplectomorphism: Let $(M_1, \omega_1)$ and $(M_2, \omega_2)$ be symplectic manifolds. A map $\phi : M_1 \to M_2$ is a symplectomorphism if it is a diffeomorphism and the pullback of $\omega_2$ under $\phi$ is equal to $\omega_1$:
\[ \phi^* \omega_2 = \omega_1. \]

The Wikipedia entry contrasts symplectic and Riemannian manifolds: This result implies that there are no local invariants in symplectic geometry: a Darboux basis can always be taken, valid near any given point. This is in marked contrast to the situation in Riemannian geometry where the curvature is a local invariant, an obstruction to the metric being locally a sum of squares. It should be emphasized that the difference is that Darboux's theorem states that $\omega$ can be made to take the standard form in an entire neighborhood around a point $p \in M$. In Riemannian geometry, the metric can always be made to take the standard form at any given point, but not always in a neighborhood around that point.

Summarizing so far:

1. all symplectic manifolds are even-dimensional
2. all symplectic manifolds of the same dimension are locally alike, unlike Riemannian manifolds.

We can think of any $F \in C^\infty(X)$ as an observable: it assigns to any state $x \in X$ a number $F(x)$, namely the result of measuring something about $x$. We know how states evolve in time:
\[ \frac{d}{dt} x(t) = v_H\bigl( x(t) \bigr). \]
How do observables evolve in time? Given an observable $F \in C^\infty(X)$ and a time $t$ we get a new observable $F_t \in C^\infty(X)$ by
\[ F_t(x) = F\bigl( x(t) \bigr). \]
We then have
\[ \frac{d}{dt} F\bigl( x(t) \bigr) = dF\bigl( x'(t) \bigr) = dF\bigl( v_H(x(t)) \bigr) = (v_H F)\bigl( x(t) \bigr) \]
where we've used the definition of the 1-form $df$ derived from any smooth function $f$ (a 0-form) on a manifold $M$, that acts on vector fields $v$ on $M$ by $df(v) = vf$ (the directional derivative of $f$ in the direction of $v$). So (applying the above chain rule and Hamilton's equations),
\[ \frac{d}{dt} F_t(x) = \frac{d}{dt} F\bigl( x(t) \bigr) = (v_H F)\bigl( x(t) \bigr) = (v_H F_t)(x). \]
Abstracting this result:
\[ \frac{d}{dt} F_t = v_H F_t \]
for the time evolution of an observable $F$.

Chapter 6 Poisson Brackets

(Week 9, May 23, 25, 27.)

You can view a physical system as a point at some initial position in the phase space or configuration space and describe its (the whole system!) time evolution by giving the motion of the point as a function of time. That's like the Schrödinger picture in quantum mechanics. Alternatively, you can view the same system as a bunch of objects that have certain properties (like mass, charge, position, velocity) and consider the time evolution as a change in those observable properties, which is like the Heisenberg picture in quantum mechanics. In this chapter we'll begin with the analogous pictures for classical mechanics, then quickly move on to Poisson brackets and Noether's theorem using Poisson brackets, and then take a look at Poisson manifolds. All this will be in aid of proving Liouville's theorem for symplectic manifolds. This will justify all the mathematics because the usual way that physicists derive Liouville's theorem ("the distribution function is constant along any trajectory in phase space") uses physical assumptions like conservation of particles, whereas Liouville's theorem, in the context of symplectic geometry, is a deeper result that generalizes the invariance of the volume measure on phase space to a theorem about the symplectic 2-form.

6.1 The Schrödinger Picture and the Heisenberg Picture in Classical Mechanics

In the Schrödinger picture observables are fixed while states evolve in time; in the Heisenberg picture states are fixed while observables evolve in time:

Schrödinger picture: $x \in X$ is a state, $x \mapsto x(t)$, where
\[ \frac{d}{dt} x(t) = v_H\bigl( x(t) \bigr). \]

Heisenberg picture: $F \in C^\infty(X)$ is an observable, $F \mapsto F_t$, where
\[ \frac{d}{dt} F_t = v_H F_t. \]

In both pictures $F(x(t)) = F_t(x)$ is the result of measuring $F$ in a state $x$ after waiting a time $t$. We define the Poisson bracket of $F, G \in C^\infty(X)$ by
\[ \{F, G\} = v_F G \in C^\infty(X). \]
In this language, Hamilton's equations become
\[ \frac{d}{dt} F_t = \{H, F_t\}. \]

Example: If $X = T^*Q$ then $\omega = dq^i \wedge dp_i$ (notice the difference with the initial development in 5.3 where we used $dp_i \wedge dq^i$; we've switched to $dq^i \wedge dp_i$ in order to obtain agreement with the Euler-Lagrange equations). So we have
\[ \omega : v \mapsto \omega(v, -), \qquad \frac{\partial}{\partial q^i} \mapsto dp_i, \qquad \frac{\partial}{\partial p_i} \mapsto -dq^i \]
so
\[ v_F = \omega^{-1}(dF) = \frac{\partial F}{\partial p_i} \frac{\partial}{\partial q^i} - \frac{\partial F}{\partial q^i} \frac{\partial}{\partial p_i} \]
so lastly
\[ \{F, G\} = v_F G = \frac{\partial F}{\partial p_i} \frac{\partial G}{\partial q^i} - \frac{\partial F}{\partial q^i} \frac{\partial G}{\partial p_i}. \]
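The coordinate formula for the bracket is easy to experiment with. The sketch below assumes one degree of freedom and uses central differences; the function name `bracket` and the sample observables are illustrative. It uses the sign convention of this section, $\{F, G\} = \frac{\partial F}{\partial p}\frac{\partial G}{\partial q} - \frac{\partial F}{\partial q}\frac{\partial G}{\partial p}$, which gives $\{p, q\} = 1$ and differs by a sign from the convention found in many textbooks.

```python
def bracket(F, G, q, p, h=1e-5):
    """{F, G} = dF/dp * dG/dq - dF/dq * dG/dp at (q, p), the sign convention
    of this section, with partial derivatives taken by central differences."""
    dF_dq = (F(q + h, p) - F(q - h, p)) / (2 * h)
    dF_dp = (F(q, p + h) - F(q, p - h)) / (2 * h)
    dG_dq = (G(q + h, p) - G(q - h, p)) / (2 * h)
    dG_dp = (G(q, p + h) - G(q, p - h)) / (2 * h)
    return dF_dp * dG_dq - dF_dq * dG_dp

H = lambda q, p: 0.5 * (q * q + p * p)   # harmonic oscillator
F = lambda q, p: q * p                   # a sample observable
Q = lambda q, p: q
P = lambda q, p: p

q0, p0 = 0.4, 1.3
print(bracket(P, Q, q0, p0))                          # close to 1
print(bracket(H, F, q0, p0), p0 ** 2 - q0 ** 2)       # {H, F} = d(qp)/dt = p^2 - q^2
print(bracket(F, H, q0, p0) + bracket(H, F, q0, p0))  # antisymmetry: close to 0
```

The middle line illustrates $\frac{d}{dt} F_t = \{H, F_t\}$ at $t = 0$: along the oscillator flow $\dot q = p$, $\dot p = -q$, the product $qp$ changes at the rate $p^2 - q^2$.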


6.2 The Hamiltonian Version of Noether's Theorem

Suppose $(X, \omega)$ is a symplectic manifold and $H \in C^\infty(X)$ has $v_H$ integrable. Then
\[ \{H, F\} = 0 \iff F \text{ is a conserved quantity} \]
where we say $F$ is conserved (for the time evolution generated by $H$) if $F_t = F$ for all $t$. Suppose also that $v_F$ is integrable and let $x \mapsto x(s)$ be the flow generated by $v_F$:
\[ \frac{d}{ds} x(s) = v_F\bigl( x(s) \bigr). \]
We say $F$ generates symmetries of $H$ if
\[ H(x) = H\bigl( x(s) \bigr), \qquad \forall s. \]
Then we have a theorem.

Theorem 6.1. $F$ is a conserved quantity $\iff$ $F$ generates symmetries of $H$.

Proof.
\begin{align*}
F \text{ is a conserved quantity} &\iff F = F_t, \quad \forall t \\
&\iff \frac{d}{dt} F_t = 0, \quad \forall t \\
&\iff \{H, F_t\} = 0, \quad \forall t \\
&\implies \{H, F\} = 0
\end{align*}
and ($\Leftarrow$) is true too: if $\{H, F\} = 0$ then $F_t$ and $F$ both solve $\frac{d}{dt}(\cdot) = \{H, (\cdot)\}$, so by uniqueness of solutions of 1st order ODE we get $F = F_t$. Next,
\begin{align*}
F \text{ generates symmetries of } H &\iff H(x) = H\bigl( x(s) \bigr), \quad \forall s \\
&\iff \frac{d}{ds} H\bigl( x(s) \bigr) = 0 \\
&\iff v_F(H) = 0 \\
&\iff \{F, H\} = 0
\end{align*}
by exactly the same argument as before. So to finish the proof we just need
\[ \{F, G\} = -\{G, F\} \]
which is true in the previous example, but also in general, as we will now show:
\begin{align*}
v_F &= \omega^{-1}(dF) \\
\omega(v_F) &= dF \\
\omega(v_F, -) &= dF(-) \\
\omega(v_F, v_G) &= dF(v_G) = v_G(F) = \{G, F\}
\end{align*}
so
\[ \{F, G\} = v_F G = \omega(v_G, v_F) = -\omega(v_F, v_G) = -\{G, F\}. \]

In short, it's the antisymmetry of $\omega$ that says
\[ F \text{ generates symmetries of } H \iff H \text{ generates symmetries of } F. \]
In the next few exercises we'll start to see how the Poisson bracket defines an algebra. After a few lemmas presented as examples we'll prove a theorem and define a Poisson algebra.

Example: Taking $F = H$, we see that energy is conserved!
\[ \frac{d}{dt} H_t = 0 \qquad \text{since } \{H, H\} = 0 \]
by antisymmetry of $\{\cdot, \cdot\}$.

Example: If $F$ and $G$ are conserved quantities, then $aF + bG$ (for $a, b \in \mathbb{R}$), $FG$, and $\{F, G\}$ are all also conserved quantities. First, $aF + bG$ is a conserved quantity because
\begin{align*}
\{H, aF + bG\} &= v_H(aF + bG) \\
&= a\,v_H F + b\,v_H G \\
&= a\{H, F\} + b\{H, G\} = 0.
\end{align*}

Next, $FG$ is a conserved quantity because
\begin{align*}
\{H, FG\} &= v_H(FG) \\
&= (v_H F) G + F\, v_H G \\
&= \{H, F\} G + F \{H, G\} = 0.
\end{align*}
Finally, $\{F, G\}$ is conserved since we have the Jacobi identity:
\[ \{H, \{F, G\}\} = \{\{H, F\}, G\} + \{F, \{H, G\}\} \]
but this is trickier! This last result uses the fact that $\omega$ is closed, and is equivalent to this fact. To prove the Jacobi identity we use the fact that $\omega$ is closed, and this formula for
\[ d : \Omega^p(X) \to \Omega^{p+1}(X) \]
which is defined by (for $v_i \in \mathrm{Vect}(X)$, $\mu \in \Omega^p(X)$),
\[ d\mu(v_0, \dots, v_p) = \sum_{0 \le i \le p} (-1)^i\, v_i\, \mu(v_0, \dots, \hat{v}_i, \dots, v_p) + \sum_{0 \le i < j \le p} (-1)^{i+j}\, \mu([v_i, v_j], v_0, \dots, \hat{v}_i, \dots, \hat{v}_j, \dots, v_p) \]
(the second sum will vanish for coordinate vector fields; a hat means the argument is omitted). These imply
\begin{align*}
0 = d\omega(v_F, v_G, v_H) ={}& v_F\, \omega(v_G, v_H) - v_G\, \omega(v_F, v_H) + v_H\, \omega(v_F, v_G) \\
& - \omega([v_F, v_G], v_H) + \omega([v_F, v_H], v_G) - \omega([v_G, v_H], v_F)
\end{align*}
and note that
\[ v_F\, \omega(v_G, v_H) = v_F \{H, G\} = \{F, \{H, G\}\} \]
and
\[ \omega(v_G, -) = dG(-) \implies \omega(v_G, v_H) = dG(v_H) = v_H G = \{H, G\} \]
and so
\begin{align*}
\omega([v_F, v_G], v_H) &= -\omega(v_H, [v_F, v_G]) \\
&= -dH([v_F, v_G]) \\
&= -[v_F, v_G](H) \\
&= -v_F v_G H + v_G v_F H \\
&= -\{F, \{G, H\}\} + \{G, \{F, H\}\}.
\end{align*}
Using the antisymmetry of $\{\cdot, \cdot\}$, the nine terms we get from above reduce to three and we obtain the Jacobi identity:
\[ \{H, \{F, G\}\} = \{\{H, F\}, G\} + \{F, \{H, G\}\}. \]
Now we can prove the theorem relating symplectic manifolds to a Poisson algebra.

6.3 Poisson Algebras and Poisson Manifolds

First a definition: a Poisson algebra, denoted $(C^\infty(X), \{\cdot, \cdot\})$, has the following properties:

1. $C^\infty(X)$ is a commutative associative algebra (with usual operations $+$, $\cdot$, etc.)
2. $(C^\infty(X), \{\cdot, \cdot\})$ is a Lie algebra:
   (a) $\{F, aG + bH\} = a\{F, G\} + b\{F, H\}$
   (b) $\{F, G\} = -\{G, F\}$
   (c) $\{F, \{G, H\}\} = \{\{F, G\}, H\} + \{G, \{F, H\}\}$ (Jacobi identity).
3. For any $F \in C^\infty(X)$, $\{F, \cdot\}$ is a derivation of $C^\infty(X)$, meaning it's linear and satisfies a Leibniz rule: $\{F, GH\} = \{F, G\} H + G \{F, H\}$.

Theorem 6.2. If $(X, \omega)$ is a symplectic manifold then $(X, \{\cdot, \cdot\})$ is a Poisson manifold, i.e., $(C^\infty(X), \{\cdot, \cdot\})$ is a Poisson algebra.

(Proof omitted. But we've already done most of the work for the proof.)

If $(X, \{\cdot, \cdot\})$ is a Poisson manifold and $F \in C^\infty(X)$, then $\{F, \cdot\}$ is a derivation, so (by a nice theorem) there's a unique vector field $v_F \in \mathrm{Vect}(X)$ such that $\{F, G\} = v_F G$.

Theorem 6.3. If $(X, \{\cdot, \cdot\})$ is a Poisson manifold then
\[ v : C^\infty(X) \to \mathrm{Vect}(X), \qquad F \mapsto v_F \]
is a Lie algebra homomorphism: meaning that it's linear and
\[ v_{\{F, G\}} = [v_F, v_G]. \]

Proof. $\{F, \cdot\}$ and thus also $v_F$ are linear in $F$. Also,
\begin{align*}
v_{\{F, G\}} H &= \{\{F, G\}, H\} \\
&= \{F, \{G, H\}\} + \{\{F, H\}, G\} \\
&= \{F, \{G, H\}\} - \{G, \{F, H\}\} \\
&= v_F v_G H - v_G v_F H \\
&= [v_F, v_G] H.
\end{align*}

6.3.1 A Poisson Manifold that is Not Symplectic

Here's an example of a Poisson manifold that is not a symplectic manifold: $\mathbb{R}^3$ is odd-dimensional, hence not symplectic, but we can define $\{\cdot, \cdot\}$ on $\mathbb{R}^3$ by
\[ \{x, y\} = z, \qquad \{y, z\} = x, \qquad \{z, x\} = y \]
where $x, y, z$ are the coordinate functions on $\mathbb{R}^3$. The brackets are the Gibbs vector cross products in disguise. Using the fact that $\{F, \cdot\}$ is a derivation, one can calculate $\{F, G\}$ for all polynomials. Then by some approximation argument one can define $\{F, G\}$ for all $F, G \in C^\infty(X)$. This Poisson manifold has spheres as symplectic leaves.

[Figure: concentric spheres in $\mathbb{R}^3$, the symplectic leaves.]

In fact, any Poisson manifold has a foliation by symplectic manifolds.
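Extending the three coordinate brackets by the Leibniz rule gives the closed formula $\{F, G\}(\mathbf{x}) = \mathbf{x} \cdot (\nabla F \times \nabla G)$, which is the cross product in disguise. The sketch below assumes this formula and takes gradients as user-supplied functions; the names `bracket`, `grad_x`, `grad_r2` and the sample point are illustrative only.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def bracket(gradF, gradG, pt):
    """{F, G}(pt) = pt . (grad F x grad G), with gradients supplied as functions."""
    return dot(pt, cross(gradF(pt), gradG(pt)))

grad_x = lambda pt: (1.0, 0.0, 0.0)
grad_y = lambda pt: (0.0, 1.0, 0.0)
grad_z = lambda pt: (0.0, 0.0, 1.0)
grad_r2 = lambda pt: (2 * pt[0], 2 * pt[1], 2 * pt[2])   # gradient of x^2 + y^2 + z^2

pt = (0.3, -1.2, 2.0)
print(bracket(grad_x, grad_y, pt), pt[2])   # {x, y} = z
print(bracket(grad_y, grad_z, pt), pt[0])   # {y, z} = x
print(bracket(grad_z, grad_x, pt), pt[1])   # {z, x} = y
print(bracket(grad_r2, grad_x, pt))         # {x^2 + y^2 + z^2, x} = 0
```

The last line is the reason spheres appear as leaves: $x^2 + y^2 + z^2$ has vanishing bracket with every function, so every Hamiltonian flow stays on a sphere centered at the origin.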

6.4 Liouville's Theorem

Not to bore you with history, but we should at least recall that Liouville's theorem is the fundamental result that forms the bedrock of both equilibrium and non-equilibrium statistical mechanics. In equilibrium statistical mechanics the stationary solution of Liouville's equation for the density of states is the Maxwell-Boltzmann distribution function. In non-equilibrium statistical mechanics one obtains the Vlasov equation for a collisionless system (a rarefied gas), which is incredibly useful in astrophysics and for describing collisionless plasmas. Generalized to collisional gases one obtains the Boltzmann transport equation (of fundamental importance to the physics of gases, fluids, and for mass, heat & radiation transport). That's the cursory history. Now for the modern treatment, though we'll be stopping well short of applying the theory.

6.4.1 Phase Space Volumes

Let our configuration space be $Q = \mathbb{R}$ and our phase space be $X = T^*Q \cong \mathbb{R}^2 \ni (q, p)$, and let
\[ H = \frac{1}{2}(q^2 + p^2) \]
be the harmonic oscillator Hamiltonian. Then Hamilton's equations say
\[ \frac{dq}{dt} = \frac{\partial H}{\partial p} = p, \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q} = -q. \]
The drawing here depicts a phase space region at different times (moving harmonically around the outer circle path), as well as the state at a point $(q, p) \in X$:

[Figure: the $(q, p)$ plane with a region carried around a circle by the flow, and the vector $v_H$ at the point $(q, p)$.]

So the time evolution $x \mapsto x(t)$ is rotation by $t$ clockwise, and the phase space volume (in 2D it'll be an area) around a point doesn't change with time evolution. As a region $R \subset X$ evolves in time its area is preserved, or conserved!

In fact this is completely general. For any symplectic manifold and any (integrable) Hamiltonian, the flow $x \mapsto x(t)$ given by Hamilton's equations
\[ \frac{d}{dt} x(t) = v_H\bigl( x(t) \bigr) \]
preserves the symplectic structure $\omega$. In our example $\omega = dq \wedge dp$ is the area element:
\[ \mathrm{Area}(R) = \int_R \omega. \]
We've just examined a 1D example that had a 2D symplectic phase space. For a general $2n$-dimensional symplectic manifold,
\[ \omega = \sum_{i=1}^n dq^i \wedge dp_i \]
in symplectic coordinates, and
\[ \omega^n = \omega \wedge \dots \wedge \omega \quad (n \text{ copies of } \omega \text{ get wedged here}) \quad = n!\, dq^1 \wedge dp_1 \wedge \dots \wedge dq^n \wedge dp_n \]
is a $2n$-form which measures volume, and $\omega^n / n!$ is called the Liouville form. We can now state the modern version of Liouville's theorem.

Theorem 6.4 (Liouville's theorem). Let $(X, \omega)$ be any symplectic manifold and let $H \in C^\infty(X)$ be such that $v_H$ is integrable, so that Hamilton's equations determine the time evolution:
\[ \phi_t : X \to X, \qquad x \mapsto x(t). \]
Then $\omega$ is preserved by $\phi_t$:
\[ \phi_t^* \omega = \omega \qquad \text{and thus} \qquad \phi_t^* \frac{\omega^n}{n!} = \frac{\omega^n}{n!}. \]

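Proof (sketch). In general if $v$ is any integrable vector field on any manifold $X$, we get a flow
\[ \phi_t : X \to X, \qquad x \mapsto x(t), \qquad \frac{d}{dt} x(t) = v\bigl( x(t) \bigr) \]

[Figure: the flow of $v$ carrying a point $x$ to $x(t)$.]

and for any differential form $\mu \in \Omega^p(X)$ we have
\[ \frac{d}{dt} \phi_t^* \mu \Big|_{t=0} = L_v \mu \]
where $L_v : \Omega^p(X) \to \Omega^p(X)$ is called the Lie derivative, which can be computed by Weil's formula:
\[ L_v \mu = i_v\, d\mu + d\, i_v \mu \]
where $d : \Omega^p(X) \to \Omega^{p+1}(X)$ is the exterior derivative, and $i_v : \Omega^p(X) \to \Omega^{p-1}(X)$ is the interior product, given by
\[ (i_v \mu)(v_1, \dots, v_{p-1}) := \mu(v, v_1, \dots, v_{p-1}). \]
(Weil's formula isn't hard to prove, but we won't do it now.) In our situation we note
\[ dH(-) = \omega(v_H, -) = i_{v_H} \omega(-) \]
so
\[ \frac{d}{dt} \phi_t^* \omega \Big|_{t=0} = L_{v_H} \omega = i_{v_H}\, d\omega + d\, i_{v_H} \omega, \]
but $d\omega = 0$ ($\omega$ is closed), so
\[ \frac{d}{dt} \phi_t^* \omega \Big|_{t=0} = d\, i_{v_H} \omega = d\,dH = 0. \]
One can similarly show
\[ \frac{d}{dt} \phi_t^* \omega = 0 \quad \text{for any time } t, \qquad \text{so} \qquad \phi_t^* \omega = \omega. \]

Great, so $\omega$ is conserved for integrable $v_H$. But we need to justify the use of Weil's formula.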
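Before turning to Weil's formula, the harmonic oscillator example of Section 6.4.1 gives a quick sanity check of the theorem. This is a minimal sketch, assuming the exact flow $q(t) = q\cos t + p\sin t$, $p(t) = -q\sin t + p\cos t$ and a polygonal region whose area is computed with the shoelace formula; the function names and the sample region are illustrative.

```python
import math

def flow(q, p, t):
    """Exact solution of dq/dt = p, dp/dt = -q: clockwise rotation by t."""
    return (q * math.cos(t) + p * math.sin(t),
            -q * math.sin(t) + p * math.cos(t))

def polygon_area(pts):
    """Signed area of a polygon (shoelace formula), i.e. the integral of dq ^ dp."""
    s = 0.0
    n = len(pts)
    for k in range(n):
        q0, p0 = pts[k]
        q1, p1 = pts[(k + 1) % n]
        s += q0 * p1 - q1 * p0
    return 0.5 * s

region = [(1.0, 0.0), (2.0, 0.0), (2.0, 0.5), (1.0, 0.5)]   # a sample rectangle R
for t in (0.0, 0.5, 2.0):
    moved = [flow(q, p, t) for q, p in region]
    print(t, polygon_area(moved))    # the area should stay at 0.5
```

Because this particular flow is linear, the image of a polygon is again a polygon, so moving the vertices is enough. For a nonlinear Hamiltonian flow one would need to sample the boundary much more finely to estimate the area of the evolved region.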

6.4.2 Weil's formula

Why is Weil's formula true? Well, in fact we have the following grade changing operators:

d : $\Omega^p \to \Omega^{p+1}$, grade 1
$L_v$ : $\Omega^p \to \Omega^p$, grade 0
$i_v$ : $\Omega^p \to \Omega^{p-1}$, grade -1

(OK, so $L_v$ changes the grade by zero.) Now all of these are superderivations, or graded derivations. A superderivation of grade $k$ is a linear operator $D : \Omega^p \to \Omega^{p+k}$ such that
\[ D(\mu \wedge \nu) = D\mu \wedge \nu + (-1)^{pk}\, \mu \wedge D\nu \]
(for $\mu \in \Omega^p$ and any form $\nu$). If $D, D'$ are superderivations of grades $k, k'$, then their supercommutator is again a superderivation:
\[ [D, D'] = D D' - (-1)^{k k'} D' D. \]
In particular, Weil's formula says
\[ L_v = [d, i_v] = d\, i_v + i_v\, d, \]
and we can reduce proving it to proving that $L_v$ and $[d, i_v]$ agree on 0-forms and 1-forms, by the product rule.

Before moving on to the next chapter let's pause to consider what we've accomplished above. We've used almost purely geometric considerations to prove Liouville's theorem, a theorem normally thought of by engineers and physicists as something very physical about conservation of particles. Yet the only physics we've touched upon is the symplectic structure of the phase space for classical mechanics systems. There's no need above to mention that we have a physical system; we could be talking about something totally different, something of pure mathematical interest. This is not unusual in mathematical physics. Far from telling us that we are getting too far divorced from physics, it tells us quite the opposite: physics is in some sense simpler (or deeper) than we might at first imagine, in the sense that a lot of physics can be done by considering the geometric structure of things. In fancy philosophical lingo (which we make no real pretenses to understand!) one might even say that we've been deconstructing 19th and 20th century physics, taking out a lot of the turgid assumptions and traditional lore, and examining the remaining bare bones. In the process we've reconstructed classical mechanics more along the lines of Einstein's model of general relativity, which was deeply about the geometry of spacetime.

Chapter 7 Introduction to Geometric Quantization

(Week 10, June 1, 3, 5.)

In this chapter, especially in the final section 7.3, we have condensed a lot of information for which glossary entries are provided as additional help to the reader, and hopefully as a spur to further independent study. Some recommendations for further study are given at the end of the chapter.

7.1 A Taste of Geometric Quantization

In Schrödinger's approach to quantum mechanics, you take the configuration space $Q$ of a classical system, give it a Riemannian metric, and then form a Hilbert space $L^2(Q, \mathrm{vol})$ whose unit vectors are states in the corresponding quantum system, and where vol is the measure coming from the Riemannian metric. In geometric quantization we instead construct the Hilbert space from the phase space $X$, which is a symplectic manifold.

De Broglie and then Bohr and Sommerfeld noticed a quantization condition which picked out the allowed quantum orbits among the classical ones in, say, the harmonic oscillator or hydrogen atom models:

[Figure: closed orbits in the $(q, p)$ plane.]

If the orbit is some loop $\gamma : S^1 \to X$, it's an allowed orbit if the phase
\[ e^{i \oint_\gamma \alpha / \hbar} = 1 \]
where $\alpha$ is the canonical 1-form on $X$ (if $X = T^*Q$) and $\hbar$ is Planck's constant ($\hbar = h/2\pi$). Here $\oint_\gamma \alpha$ is a term in the action for the path $\gamma : [0, 1] \to X \times \mathbb{R}$ in the extended phase space, and $e^{iS/\hbar}$ is what appears in the wave approach to classical mechanics: the eikonal approximation relating particles to waves. So we're demanding that we get complete constructive interference, that is, that our particle's wavefunction comes back into phase when the particle traverses the loop $\gamma$.

When $X = T^*Q$ we have
\[ \oint_\gamma \alpha = \int_D d\alpha = \int_D \omega \]
where $D$ is any disk with boundary $\partial D = \gamma$. If $Q = \mathbb{R}$, $\omega = dp \wedge dq$ so
\[ \int_D \omega = \text{area of } D. \]
The Bohr-Sommerfeld quantization condition says
\[ e^{i \int_D \omega / \hbar} = 1 \qquad \text{or} \qquad \int_D \omega = 2\pi n \hbar = nh \]
which almost agrees with the modern formula $(n + \frac{1}{2}) h$.

In modern terms, $A = i\alpha/\hbar$ is a connection on some $U(1)$ bundle over $T^*Q$,
\[ e^{i \oint_\gamma \alpha / \hbar} \in U(1) \]
is the holonomy of this connection around the loop $\gamma$, and
\[ F = \frac{i\omega}{\hbar} \]
is the curvature of $A$. These results are specific to the abelian $U(1)$ bundle. So, in geometric quantization, we begin to quantize a system with phase space $(X, \omega)$ by finding a $U(1)$ bundle $P \to X$ with a connection $A$ such that the curvature $F$ of $A$ is $i\omega/\hbar$. We can do this if and only if $\omega/2\pi\hbar$ is an integral closed 2-form, i.e., $[\omega/2\pi\hbar] \in H^2_{\mathrm{deRham}}(X)$ is in the image of
\[ i : H^2(X, \mathbb{Z}) \to H^2(X, \mathbb{R}) \cong H^2_{\mathrm{deRham}}(X). \]
Given any $U(1)$ bundle $P$, its first Chern class $c_1(P) \in H^2(X, \mathbb{Z})$ is a complete invariant of $U(1)$ bundles, and
\[ i\bigl( c_1(P) \bigr) \in H^2_{DR}(X) \quad \text{equals} \quad \Bigl[ \frac{F}{2\pi i} \Bigr] \]
where $F$ is the curvature of any connection on $P$. The square brackets $[\cdot]$ denote the de Rham cohomology class, which is the same for the curvature of every connection on $P$. If you need to review some of these concepts then the glossary at the end of this book may help; there are some entries for cohomology and Chern class.

Example: $\mathbb{R}^3$ is a Poisson manifold with
\[ \{x, y\} = z, \qquad \{y, z\} = x, \qquad \{z, x\} = y. \]
This is the phase space of a spinning point particle, i.e., $J \in \mathbb{R}^3$ represents angular momentum. This is not symplectic, but it's foliated by symplectic leaves: spheres centered at the origin.

[Figure: the sphere $\{ |J| = j \} = X_j$ in $\mathbb{R}^3$.]

The sphere of radius $j$, $X_j$, is the phase space for a particle of total angular momentum $j$. In fact $X_j$ has an integral symplectic structure when
\[ j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \dots \]
The phase space for a spinning point particle of total angular momentum $j \in [0, \infty)$ is $X_j = \{ J \in \mathbb{R}^3 : |J| = j \}$, a sphere with Poisson structure such that
\[ \{x, y\} = z, \qquad \{y, z\} = x, \qquad \{z, x\} = y. \]

One can check that this comes from a symplectic structure $\omega$ on $X_j$:
\[ \{f, g\} = \omega(v_g, v_f). \]
The 2-form $\omega$ is a multiple of the area 2-form on the sphere $X_j$, normalized so that
\[ \int_{X_j} \omega = 4\pi j \qquad (\text{note: not } 4\pi j^2!) \]

We'd like to quantize the system if possible, so the question is, when is $\omega$ integral? There's a theorem that says a closed 2-form $\omega$ on some manifold $X$ is integral if for any 2D surface $S$ mapped into $X$,
\[ \int_S \omega \in \mathbb{Z}. \]
We're using the symplectic leaves, take note (an odd dimensional manifold like $\mathbb{R}^3$ is not itself symplectic). Here what we really want is
\[ \int_{X_j} \frac{\omega}{2\pi} \in \mathbb{Z} \qquad (\text{taking } \hbar = 1). \]
This happens when
\[ \frac{1}{2\pi} \int_{X_j} \omega = \frac{4\pi j}{2\pi} = 2j \in \mathbb{Z} \iff j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \dots \]
the usual quantization condition for angular momentum! When this condition holds, there exists a $U(1)$ bundle $P \to X_j$ with a connection $A$ whose curvature
\[ F \in \Omega^2(X_j, \mathrm{Ad}P) = \Omega^2\bigl( X_j, \mathfrak{u}(1) \bigr) = \Omega^2(X_j, i\mathbb{R}) \]
is
\[ F = i\omega, \]
that is, we use $\mathfrak{u}(1) \cong i\mathbb{R}$ and $\hbar = 1$.

We then build the Hilbert space $\mathcal{H}_j$ of states of a quantum spin-$j$ particle using $P$ as follows. We form an associated vector bundle
\[ L = P \times_{U(1)} \mathbb{C} \]
using the God-given action of $U(1)$ on $\mathbb{C}$. This sort of bundle, with $\mathbb{C}$ as fibres, is called a complex line bundle, or usually a "line bundle".

[Figure: the line bundle $L$ over $X_j$, with fibre $L_x \cong \mathbb{C}$ over each point and a section $\psi$ taking the value $\psi(x)$ over $x$.]

A section $\psi$ of $L$ looks locally like a complex function on the phase space. The Hilbert space $\mathcal{H}_j$ will consist of certain sections of $L$. We shouldn't use all $L^2$ sections, since we want $\mathcal{H}_j$ to be finite dimensional! To get the right answer, we think of $X_j$ as the Riemann sphere $\mathbb{C}P^1 = \mathbb{C} \cup \{\infty\}$. This lets us do complex analysis on $X_j$, and so define holomorphic sections of $L$. Then we let $\mathcal{H}_j$ be the space of these holomorphic sections. Then
\[ \dim \mathcal{H}_j = 2j + 1 \]
as desired (for $\omega$ integral for geometric quantization). We make $\mathcal{H}_j$ into a Hilbert space as follows: $U(1)$ acts on $\mathbb{C}$ in a way that preserves the inner product on $\mathbb{C}$, so the fibres $L_x$ of $L$ get an inner product. This lets us define
\[ \langle \psi, \phi \rangle = \int_{X_j} \langle \psi(x), \phi(x) \rangle. \]

So much for quantized spinning particles. We now want to know how to generalize the above.
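The integrality condition and the resulting dimensions can be tabulated with exact rational arithmetic. This is a small sketch, taking $\hbar = 1$ and the normalization $\int_{X_j} \omega = 4\pi j$ used above; the function names are hypothetical and the dimension formula $2j + 1$ is simply the one quoted in the text, not something the code derives.

```python
from fractions import Fraction

def is_quantizable(j):
    """The sphere X_j has total symplectic area 4*pi*j, so (1/2pi) * area = 2j;
    the integrality condition (with hbar = 1) is that 2j is an integer."""
    return (2 * Fraction(j)).denominator == 1

def hilbert_space_dimension(j):
    """Dimension 2j + 1 of the space of holomorphic sections, as quoted in the text."""
    if not is_quantizable(j):
        raise ValueError("omega is not integral for this value of j")
    return int(2 * Fraction(j)) + 1

for j in (Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3, 2), Fraction(1, 3)):
    dim = hilbert_space_dimension(j) if is_quantizable(j) else None
    print(j, is_quantizable(j), dim)
```

Using `Fraction` rather than floating point keeps the test "is $2j$ an integer?" exact, so a value such as $j = 1/3$ is rejected cleanly.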

7.2 Kähler Manifolds

So what's going on in general? Our phase space $X$ starts out as an integral symplectic manifold, which then acquires a line bundle $L \to X$. To define holomorphic sections of $L$ we need to equip $X$ with extra structure, namely a Kähler structure.

Definition 7.1. An almost Kähler structure on a manifold $X$ is

1. a linear operator $J_x : T_x X \to T_x X$ depending smoothly on $x$, such that
\[ J_x^2 = -1. \]
This makes $T_x X$ into a complex vector space.

2. a complex inner product
\[ \langle \cdot, \cdot \rangle_x : T_x X \times T_x X \to \mathbb{C} \]
depending smoothly on $x$. This makes $T_x X$ into a (finite dimensional) Hilbert space.

So a Kähler manifold involves three ingredients: it is complex, and it carries a Riemannian metric and a symplectic form on its underlying real manifold. Given this, we have
\[ \langle u, v \rangle_x = \mathrm{Re} \langle u, v \rangle_x + i\, \mathrm{Im} \langle u, v \rangle_x = g(u, v) + i\, \omega(u, v) \]
where $g$ is a Riemannian metric and $\omega$ is a nondegenerate 2-form.

Definition 7.2. An almost Kähler manifold is Kähler if $\omega$ is closed, i.e., a symplectic structure.

If you have this then you can do geometric quantization. To recap: geometric quantization is a mathematical approach to obtain a quantum theory from a given classical theory, in the spirit of Bohr's original correspondence principle but in modern form and in reverse! The idea is to carry out quantization while preserving certain analogies between the classical theory and the quantum theory (such as the time evolution given by the Hamiltonian, i.e., the analogy between Hamiltonian mechanics and the Heisenberg picture in QM).

7.3 General Geometric Quantization

With the above remarks in mind we can now give a terse summary of geometric quantization. Here is an eight-step synopsis (not exactly a mystical eight-fold way, but if you have not followed these lectures carefully it will seem mystical to you!).

1. We start with a classical phase space: mathematically, this is a manifold $X$ with a symplectic structure $\omega$.

2. Then we do prequantization: this gives us a Hermitian line bundle $L$ over $X$, equipped with a $U(1)$ connection $D$, whose curvature equals $i\omega$. $L$ is called the prequantum line bundle. Warning: we can only do this step if $\omega$ satisfies the Bohr-Sommerfeld condition, which says that $\omega/2\pi$ defines an integral cohomology class. If this condition holds, $L$ and $D$ are determined up to isomorphism, but not canonically.

3. The Hilbert space $\mathcal{H}_0$ of square-integrable sections of $L$ is called the prequantum Hilbert space. This is not yet the Hilbert space of our quantized theory: it's too big. But it's a good step in the right direction. In particular, we can prequantize classical observables: there's a map sending any smooth function on $X$ to an operator on $\mathcal{H}_0$. This map takes Poisson brackets to commutators, just as one would hope. The formula for this map involves the connection $D$.

4. To cut down the prequantum Hilbert space, we need to choose a polarization, say $P$. What's this? Well, for each point $x$ in $X$, a polarization picks out a certain subspace $P_x$ of the complexified tangent space at $x$. We define the quantum Hilbert space, $\mathcal{H}$, to be the space of all square-integrable sections of $L$ that give zero when we take their covariant derivative at any point $x$ in the direction of any vector in $P_x$. The quantum Hilbert space is a subspace of the prequantum Hilbert space. Warning: for $P$ to be a polarization, there are some crucial technical conditions we impose on the subspaces $P_x$. First, they must be isotropic: the complexified symplectic form $\omega$ must vanish on them. Second, they must be Lagrangian: they must be maximal isotropic subspaces. Third, they must vary smoothly with $x$. And fourth, they must be integrable.

5. The easiest sort of polarization to understand is a real polarization. This is where the subspaces $P_x$ come from subspaces of the tangent space by complexification. It boils down to this: a real polarization is an integrable distribution $P$ on the classical phase space where each space $P_x$ is a Lagrangian subspace of the tangent space $T_x X$.

6. To understand this rigmarole, one must study examples! First, it's good to understand how good old Schrödinger quantization fits into this framework. Remember, in Schrödinger quantization we take our classical phase space $X$ to be the cotangent bundle $T^*M$ of a manifold $M$ called the classical configuration space. We then let our quantum Hilbert space be the space of all square-integrable functions on $M$. Modulo some technical trickery, we get this example when we run the above machinery and use a certain god-given real polarization on $X = T^*M$, namely the one given by the vertical vectors.

7. It's also good to study the Bargmann-Segal representation, which we get by taking $X = \mathbb{C}^n$ with its god-given symplectic structure (the imaginary part of the inner product) and using the god-given Kähler polarization. When we do this, our quantum Hilbert space consists of analytic functions on $\mathbb{C}^n$ which are square-integrable with respect to a Gaussian measure centered at the origin.

8. The next step is to quantize classical observables, turning them into linear operators on the quantum Hilbert space $\mathcal{H}$. Unfortunately, we can't quantize all such observables while still sending Poisson brackets to commutators, as we did at the prequantum level. So at this point things get trickier and my brief outline will stop. Ultimately, the reason for this problem is that quantization is not a functor from the category of symplectic manifolds to the category of Hilbert spaces, but for that one needs to learn a bit about category theory and that's for another book.

If one is interested in the interplay between geometry and quantization then the online Quantum Gravity Seminar notes (http://math.ucr.edu/home/baez/QG.html) might be a good place to start. There are few introductory level books dealing with the correspondences between classical mechanics and quantum mechanics, but a good way to dig deeper into the geometry would be [BM94]; for more on gauge theory try [OR97]; for a sweeping overview of the geometry of physics try [Pen05] for starters. All of these books cite numerous reference works that should keep you busy, at least until you've developed enough skills to undertake original research. For an in-depth exposition of classical mechanics theory try Arnold's text [Arn78]: its Kung Fu is potent and stylistically unique. The traditional classical mechanics text by Goldstein [GPS02] provides a good background with plenty of example problems in case you want to develop specific mental muscles for mastering technical problem solving in classical mechanics without resorting to numerical methods. A final word or two for students of physics is in order.
At the end of the previous chapter we discussed the importance of reducing physics (in this case classical mechanics) to the simplest or deepest structure. We can probably never say that weve found the deepest description of natures physics, but the intention is still worthwhile. Its worthwhile framing classical mechanics in the mathematical garb of this lecture note volume precisely because it makes it easier to see connections with other branches of physics, such as general relativity and quantum mechanics. If you havent had fun on this journey then perhaps at least weve shown that the mental pain was worth it! When we peel back the layers that most high school and undergraduate textbooks lather the subject with then we can more lucidly see that physics does seem to be a little more comprehensible as a whole than the disparate textbook treatments at rst seem to convey (most notably the initially baing disparity between the formalisms of classical physics and quantum mechanics). We hope that some of this disparity has been removed and that the diligent student will be motivated to weave the same geometric themes of this small book into their own future research and study.

Glossary
adjoint
Suppose $H$ is a Hilbert space with inner product $\langle \cdot, \cdot \rangle$. Then for a continuous linear operator $A : H \to H$ (this is the same as a bounded operator), one can show using the Riesz representation theorem that there exists a unique continuous linear operator $A^* : H \to H$ with the following property: $\langle A(x), y \rangle = \langle x, A^* y \rangle$ for all $x, y \in H$. This operator $A^*$ is the adjoint of $A$. More generally, one can define an adjoint $T^* : W \to V$ for any linear operator $T : V \to W$ between vector spaces with an inner product by $\langle T v, w \rangle = \langle v, T^* w \rangle$. The operation of taking the adjoint has some key properties: (1) $A^{**} = A$; (2) if $A$ is invertible, so is $A^*$; (3) $(A + B)^* = A^* + B^*$; (4) $(\lambda A)^* = \bar{\lambda} A^*$, where $\bar{\lambda}$ denotes the complex conjugate of the complex number $\lambda$; and finally (5) $(AB)^* = B^* A^*$, which is often used. (Source: Wikipedia)

Bargmann-Segal representation
Also called the coherent state representation or phase space representation. Everyone bumps into the harmonic oscillator in quantum mechanics, but rather few get to see its full beauty. There are three main representations to learn: (1) the Schrödinger representation, (2) the Heisenberg (or Fock) representation, (3) the Bargmann-Segal representation. The first should really be called the "wave" representation. In this representation we diagonalize the position operator, thinking of the state as a function on position space. Similarly, the second should be called the "particle" representation. In this we diagonalize the energy, thinking of the state as a linear combination of states $|n\rangle$ having $n$ quanta of energy in them. The equivalence of these first two representations is the basis of "wave-particle duality" in quantum field theory. The third representation could be called the "complex wave representation". At least that's what it's called in the book Introduction to Algebraic and Constructive Quantum Field Theory, where the first representation is called the "real wave representation". In some rough sense, this representation diagonalizes the creation operators. Of course, not being self-adjoint, the creation operators can't be diagonalized in the usual sense. There is a nice substitute, however. In the complex wave representation, we think of phase space as a complex vector space using the trick $Z = q + ip$, and then think of states as analytic functions on phase space. Then the creation operator becomes multiplication by $Z$. Everyone who studies quantum mechanics learns about the first two representations. The third, while in many ways the most beautiful, is somewhat less widely known. It's lurking in the background whenever you relate quantum mechanics to the classical phase space and drag in the $Z = q + ip$ trick. If you master these three basic viewpoints on the harmonic oscillator, it's a snap to generalize to quantum field theory, at least for free quantum fields, which are just big bunches of harmonic oscillators. (Source: Wikipedia)

category theory
You don't know category theory yet? If sets provide a foundation for mathematics based on formalising the idea of ownership or membership, then categories can provide an alternative foundation formalising the notion of relationship or map (arrow). A category $(C, \mathrm{Hom})$ consists of the following three mathematical entities:
(1) A class $\mathrm{Ob}(C)$ of objects.
(2) A class $\mathrm{Hom}(C)$ of morphisms. Each morphism $f$ has a unique source object $X$ and target object $Y$. We write $f : X \to Y$, and we say $f$ is a morphism from $X$ to $Y$. We write $\mathrm{Hom}(X, Y)$ to denote the hom-class of all morphisms from $X$ to $Y$. (Some authors write $\mathrm{Mor}(X, Y)$ or $C(X, Y)$.)
(3) A binary operation $\circ$, called composition of morphisms, such that for any three objects $X$, $Y$, and $Z$, we have $\mathrm{Hom}(Y, Z) \times \mathrm{Hom}(X, Y) \to \mathrm{Hom}(X, Z)$. The composition of $f : X \to Y$ and $g : Y \to Z$ is written as $g \circ f$ or $gf$. (Some authors write $fg$.)
All of this stuff is governed by two axioms:
(Axiom 1) Associativity: if $f : W \to X$, $g : X \to Y$ and $h : Y \to Z$, then $h \circ (g \circ f) = (h \circ g) \circ f$.
(Axiom 2) Identity: for every object $X$, there exists a morphism $1_X : X \to X$ called the identity morphism for $X$, such that for every morphism $f : X \to Y$, we have $1_Y \circ f = f = f \circ 1_X$.
From these axioms, it can be proved that there is exactly one identity morphism for every object. Some authors deviate from the definition just given by identifying each object with its identity morphism. Relations among morphisms (such as $fg = h$) are often depicted using commutative diagrams, with points (corners) representing objects and arrows representing morphisms. The influence of commutative diagrams has been such that "arrow" and "morphism" are now synonymous. (Source: Wikipedia)

Chern class
The $k$th Chern class, denoted $c_k(E)$, of the vector bundle $E$ over a manifold $M$ is the cohomology class of $\mathrm{tr}(F^k)$, the trace of the $k$-fold wedge product $F^k$, also known as the $k$th Chern form, where $F$ is the curvature of any connection on $E$. When properly normalized, the integrals of the Chern classes over any compact oriented manifold mapped into $M$ are integers, in which case they are integral cohomology classes. This is important in Yang-Mills theory and geometric quantization. (Source: [BM94], p. 281)

cohomology
In mathematics, specifically in algebraic topology, cohomology is a general term for a sequence of abelian groups defined from a cochain complex. That is, cohomology is defined as the abstract study of cochains, cocycles, and coboundaries. Cohomology can be viewed as a method of assigning algebraic invariants to a topological space that has a more refined algebraic structure than does homology. Cohomology arises from the algebraic dualization of the construction of homology. In less abstract language, cochains in the fundamental sense should assign "quantities" to the chains of homology theory. A particular type of cohomology theory (the deRham cohomology) follows from studying closed $p$-forms modulo the exact $p$-forms on a manifold. (Source: Wikipedia)

cohomology class (integral)
Any closed $p$-form on a manifold $M$ defines an element of the $p$th deRham cohomology of $M$. This is a finite-dimensional vector space, and it contains a lattice called the $p$th integral cohomology group of $M$. We say a cohomology class is integral if it lies in this lattice. Most notably, if you take any U(1) connection on any Hermitian line bundle over $M$, its curvature 2-form will define an integral cohomology class once you divide it by $2\pi i$. This cohomology class is called the first Chern class, and it serves to determine the line bundle up to isomorphism. (Source: Wikipedia)

cohomology, deRham
A particular type of cohomology theory, deRham cohomology, arises when we want to study when closed forms are exact. All exact forms are closed, but not vice versa. The $p$th deRham cohomology of a manifold $M$ is a vector space, written $H^p(M)$, whose dimension is the number of "$p$-holes" in $M$. To define this vector space, first write $Z^p(M)$ for the set of closed $p$-forms on $M$. This is a vector space, since the sum of two closed forms is again closed. Similarly, write $B^p(M)$ for the vector space of exact $p$-forms. The exact $p$-forms are a subspace of the closed $p$-forms, $B^p(M) \subseteq Z^p(M)$, so the natural way to see how many closed forms there are that are not exact is to take the quotient space $H^p(M) = Z^p(M)/B^p(M)$, which is called the $p$th deRham cohomology group (actually more than a group, since it's a vector space). The concept arises, for example, in geometric quantization, and also in physics whenever one wants to integrate differential forms on a space that is not simply connected or modeled as such (as with wormholes, and with the Aharonov-Bohm effect). (Source: [BM94])
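
The standard example of a closed form that is not exact can be checked numerically. The sketch below is an illustration added here, not part of the original notes: it integrates the 1-form (-y dx + x dy)/(x^2 + y^2) on the punctured plane once around a circle. If the form were exact, its integral around any closed loop would vanish; instead the result is close to 2*pi, which is why the first deRham cohomology of the punctured plane is nonzero. The function name, the midpoint rule and the step count are choices made for this sketch, and the code takes closedness of the form for granted rather than verifying it.

```python
import math


def winding_form_integral(radius=1.0, steps=10_000):
    """Integrate (-y dx + x dy) / (x^2 + y^2) once around a circle.

    The circle is centred at the origin and traversed counterclockwise,
    parametrized as (radius*cos t, radius*sin t) for t in [0, 2*pi).
    """
    total = 0.0
    dt = 2.0 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt  # midpoint of the k-th parameter interval
        x = radius * math.cos(t)
        y = radius * math.sin(t)
        # Displacement along the curve for a parameter step dt.
        dx = -radius * math.sin(t) * dt
        dy = radius * math.cos(t) * dt
        total += (-y * dx + x * dy) / (x * x + y * y)
    return total


if __name__ == "__main__":
    for r in (0.5, 1.0, 3.0):
        print(r, winding_form_integral(radius=r))
```

The value does not depend on the radius, as one expects for the integral of a closed form over homologous loops.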

complexification
We can tensor a real vector space with the complex numbers and get a complex vector space; this process is called complexification. For example, we can complexify the tangent space at some point of a manifold, which amounts to forming the space of complex linear combinations of tangent vectors at that point. (Source: http://math.ucr.edu/home/baez/quantization.html)

distribution
The word "distribution" means many different things in mathematics, but here's one: a distribution $V$ on a manifold $X$ is a choice of a subspace $V_x$ of each tangent space $T_x X$, where the choice depends smoothly on $x$. (Source: http://math.ucr.edu/home/baez/quantization.html)

Hamiltonian vector field
Given a manifold $X$ with a symplectic structure $\omega$, any smooth function $f : X \to \mathbb{R}$ can be thought of as a "Hamiltonian", meaning physically that we think of it as the energy function and let it give rise to a flow on $X$ describing the time evolution of states. Mathematically speaking, this flow is generated by a vector field $v(f)$ called the Hamiltonian vector field associated to $f$. It is the unique vector field such that $\omega(\cdot, v(f)) = df$. In other words, for any vector field $u$ on $X$ we have $\omega(u, v(f)) = df(u) = uf$. The vector field $v(f)$ is guaranteed to exist by the fact that $\omega$ is nondegenerate. (Source: http://math.ucr.edu/home/baez/quantization.html)

Hermitian
Synonym for self-adjoint. See the entries for Hermitian function and Hermitian operator; the latter also defines a Hermitian matrix.

Hermitian function
A Hermitian function is a complex function with the property that its complex conjugate, $\overline{f(x)}$, is equal to the original function with the variable changed in sign: $\overline{f(x)} = f(-x)$. From this definition it follows immediately that if $f$ is a Hermitian function, then the real part of $f$ is an even function and the imaginary part of $f$ is an odd function. (Source: Wikipedia)
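
To make the Hermitian function entry concrete, here is a small numerical check, added as an illustration and not taken from the original notes. It uses f(x) = exp(ix) = cos x + i sin x, tests the defining property at a few sample points, and confirms that the real part is even and the imaginary part is odd. The helper names, the sample points and the tolerance are assumptions of this sketch, and checking finitely many points is of course evidence rather than proof.

```python
import cmath


def f(x):
    """Example Hermitian function: f(x) = exp(i x) = cos x + i sin x."""
    return cmath.exp(1j * x)


def is_hermitian(func, samples, tol=1e-12):
    """Check f(-x) == conjugate(f(x)) at each sample point."""
    return all(abs(func(-x) - func(x).conjugate()) <= tol for x in samples)


def is_even(g, samples, tol=1e-12):
    """Check g(-x) == g(x) at each sample point."""
    return all(abs(g(-x) - g(x)) <= tol for x in samples)


def is_odd(g, samples, tol=1e-12):
    """Check g(-x) == -g(x) at each sample point."""
    return all(abs(g(-x) + g(x)) <= tol for x in samples)


SAMPLES = [0.0, 0.3, 1.0, 2.5, -4.0]

if __name__ == "__main__":
    print(is_hermitian(f, SAMPLES))
    print(is_even(lambda x: f(x).real, SAMPLES))
    print(is_odd(lambda x: f(x).imag, SAMPLES))
    # A function that fails the test: g(x) = i x^2 has an even imaginary part.
    print(is_hermitian(lambda x: 1j * x * x, SAMPLES))
```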

Hermitian operator
On a finite-dimensional inner product space, a self-adjoint operator is one that is its own adjoint, or, equivalently, one whose matrix is Hermitian, where a Hermitian matrix is one which is equal to its own conjugate transpose. By the finite-dimensional spectral theorem, such operators have an orthonormal basis in which the operator can be represented as a diagonal matrix with entries in the real numbers. These can be generalized to operators on Hilbert spaces of arbitrary dimension. Self-adjoint operators are used in functional analysis and quantum mechanics. In quantum mechanics their importance lies in the fact that in the Dirac-von Neumann formulation of quantum mechanics, physical observables such as position, momentum, angular momentum and spin are represented by self-adjoint operators on a Hilbert space. Of particular significance is the Hamiltonian $H = \frac{p^2}{2m} + V$, which as an observable corresponds to the total energy of a particle of mass $m$ in a potential field $V$. Differential operators are an important class of unbounded operators. The structure of self-adjoint operators on infinite-dimensional Hilbert spaces essentially resembles the finite-dimensional case; that is to say, operators are self-adjoint if and only if they are unitarily equivalent to real-valued multiplication operators. With suitable modifications, this result can be extended to possibly unbounded operators on infinite-dimensional spaces. Since an everywhere-defined self-adjoint operator is necessarily bounded, one needs to be more attentive to the domain issue in the unbounded case. (Source: Wikipedia)

integrable distribution
A distribution $V$ on a manifold $X$ is integrable if, at least locally, there is a foliation of $X$ by submanifolds such that $V_x$ is the tangent space of the submanifold containing the point $x$. (Source: http://math.ucr.edu/home/baez/quantization.html)

line bundle
In algebraic topology and differential topology a line bundle is defined as a vector bundle of rank 1. A line bundle expresses the concept of a line that varies from point to point of a space. For example, a curve in the plane having a tangent line at each point determines a varying line: the tangent bundle is a way of organising these. (Source: Wikipedia)

square-integrable sections
We can define an inner product on the sections of a Hermitian line bundle over a manifold $X$ with a symplectic structure. The symplectic structure defines a volume form which lets us do the necessary integral. A section whose inner product with itself is finite is said to be square-integrable. Such sections form a Hilbert space $H_0$ called the prequantum Hilbert space. It is a kind of preliminary version of the Hilbert space we get when we quantize the classical system whose phase space is $X$. (Source: http://math.ucr.edu/home/baez/quantization.html)

symplectic structure
A symplectic structure on a manifold $M$ is a closed 2-form $\omega$ which is nondegenerate in the sense that for any nonzero tangent vector $u$ at any point of $M$, there is a tangent vector $v$ at that point for which $\omega(u, v)$ is nonzero. (Source: http://math.ucr.edu/home/baez/quantization.html)

U(1) connection
The group U(1) is the group of unit complex numbers. Given a complex line bundle $L$ with an inner product on each fiber $L_x$, a U(1) connection on $L$ is a connection such that parallel translation preserves the inner product. (Source: http://math.ucr.edu/home/baez/quantization.html)

vector bundle
A vector bundle is a geometrical construct where to every point of a topological space (or manifold, or algebraic variety) we attach a vector space in a compatible way, so that all those vector spaces, glued together, form another topological space (or manifold or variety). A typical example is the tangent bundle of a differentiable manifold: to every point of the manifold we attach the tangent space of the manifold at that point. Or consider a smooth curve in $\mathbb{R}^2$, and attach to every point of the curve the line normal to the curve at that point; this yields the normal bundle of the curve. (Source: Wikipedia)

vertical vectors
Given a bundle $E$ over a manifold $M$, we say a tangent vector to some point of $E$ is vertical if it projects to zero down on $M$. (Source: http://math.ucr.edu/home/baez/quantization.html)
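
The entries for symplectic structure and vertical vectors can be tied together in the simplest case, the cotangent bundle of R^n, identified with R^(2n) in coordinates (q_1, ..., q_n, p_1, ..., p_n). The sketch below is an illustration added here, not part of the original notes. It assumes the sign convention omega = sum_i dp_i ^ dq_i (the opposite sign would not change any of the checks), and all function names are hypothetical. It shows that every nonzero vector u has a partner v with omega(u, v) nonzero, which is nondegeneracy, and that omega vanishes on pairs of vertical vectors. Since the vertical vectors span half the dimension, they form a Lagrangian subspace, as a real polarization requires.

```python
def omega(u, v):
    """Canonical symplectic form on R^(2n), coordinates (q_1..q_n, p_1..p_n).

    Sign convention assumed for this sketch: omega = sum_i dp_i ^ dq_i.
    """
    n = len(u) // 2
    return sum(u[n + i] * v[i] - u[i] * v[n + i] for i in range(n))


def partner(u):
    """Return v with omega(u, v) = |u|^2, which is nonzero whenever u is."""
    n = len(u) // 2
    return list(u[n:]) + [-x for x in u[:n]]


def is_vertical(u):
    """A tangent vector is vertical if its q-components vanish.

    The bundle projection (q, p) -> q then sends it to zero on the base R^n.
    """
    n = len(u) // 2
    return all(x == 0 for x in u[:n])


def vertical_basis(n):
    """Basis of the vertical subspace: unit vectors in the p-directions."""
    return [[0] * n + [1 if j == i else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    u = [1, -2, 0, 3]
    print(omega(u, partner(u)))  # equals 1 + 4 + 0 + 9
    basis = vertical_basis(2)
    print(all(omega(a, b) == 0 for a in basis for b in basis))
```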

Bibliography
[Arn78] V. I. Arnold. Mathematical Methods of Classical Mechanics. Springer, New York, 1978. (Translated from Russian.)
[BM94] J. C. Baez and J. P. Muniain. Gauge Fields, Knots and Gravity. World Scientific, Singapore, 1994.
[GPS02] H. Goldstein, C. Poole, and J. Safko. Classical Mechanics. Addison-Wesley, Reading, Massachusetts, third edition, 2002.
[OR97] L. O'Raifeartaigh. The Dawning of Gauge Theory. Princeton University Press, Princeton, New Jersey, 1997.
[Pel94] Peter Peldan. Actions for gravity, with generalizations: A review. Classical and Quantum Gravity, 11:1087, 1994.
[Pen05] R. Penrose. The Road to Reality: A Complete Guide to the Laws of the Universe. Knopf, New York, 2005.
[WTM71] J. A. Wheeler, K. S. Thorne, and C. W. Misner. Gravitation. W. H. Freeman, New York, 1971.
[Xia92] Zhihong Xia. The existence of noncollision singularities in Newtonian systems. Annals of Mathematics, 135(3):411-468, 1992.


