Professor Jean-Louis Basdevant Auth. Variational Principles in Physics

Variational Principles in Physics
Jean-Louis Basdevant
Springer
Professor Jean-Louis Basdevant
Physics Department
Ecole Polytechnique
91128 Palaiseau
France
jean-louis.basdevant@polytechnique.edu
Library of Congress Control Number: 2006931784
ISBN 0-387-37747-6 ISBN 0-387-37748-4 (eBook)

ISBN 978-0-387-37747-6 ISBN 978-0-387-37748-3 (eBook)
Printed on acid-free paper.
© 2007 Springer Science+Business Media, LLC

All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street,
New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly
analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereafter
developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they
are not identified as such, is not to be taken as an expression of opinion as to whether or not
they are subject to proprietary rights.
9 8 7 6 5 4 3 2 1
springer.com
Preface
Optimization under constraints is part of our daily lives. To live as comfortably
as possible given that there exist conflicts such as the chores of everyday life
or the desires of each individual in a family or group is a simple example.
With the development of computer science, optimization has acquired a major
role in the modern world. In the future, it is plausible that optimization will
become one of the very first concepts to be taught in an elementary course in
mathematics.
It is an amazing observation that laws of nature appear to follow such rules.
These are expressed mathematically as variational principles. These princi-
ples possess two characteristics. First, they appear to be universal. Second,
they express physical laws as the results of optimal equilibrium conditions
between conflicting causes. In other words, they present natural phenomena
as problems of optimization under constraints. The founding idea in modern
physics is due to Fermat and his least time principle in optics. This was fur-
ther developed in the framework of the calculus of variations of Euler and
Lagrange. In 1744, Maupertuis found, with the help of Euler, the least action
principle in mechanics.
The philosophical impact of the discovery of such principles of natural
economy was considerable in the 18th century. However, if the metaphysical
enthusiasm did not last long, it is not because of any lack of intellectual beauty
or richness. It is because variational principles have constantly produced more
and more profound physical results, many of which underlie contemporary
theoretical physics. The ambition of this book is to describe some of their
physical applications.
After presenting and analyzing some examples, the core of this book is
devoted to the analytical mechanics of Lagrange and Hamilton, which is a
must in the culture of any physicist of our time. The tools that we will develop
will also be used to present the principles of Lagrangian field theory. We then
study the motion of a particle in a curved space. This allows us to have a
simple but rich taste of general relativity and its first applications. These
have had a spectacular revival of interest in recent years, for instance in the
development of gravitational optics which allows us to probe the universe
at very large distances. Another unexpected spinoff lies in the accuracy of the
global positioning system.
In the last chapter, we present the theory of Feynman path integrals in
quantum mechanics. This allows us to discover general structures common to
different domains of physics that may seem, a priori, quite far apart.
This book resulted from the last course I delivered in the Ecole Poly-
technique, for three years starting in 2001. I was struck by the interest that
students found in this aspect of physics. They discovered a cultural compo-
nent of science that they did not expect. For that reason, teaching this was a
very rewarding piece of work.
I have deliberately chosen to develop as few mathematical techniques as
possible in order to concentrate on the physical aspects. Mathematical devel-
opments can be found in the bibliography.
I am indebted to André Rougé for all his useful comments and suggestions.
I profited considerably from his great culture.
I want to pay a tribute to the memory of Gilbert Grynberg. He should
have been in charge of teaching this course at the Ecole Polytechnique. His
tremendous fight against a brain tumor prevented him from doing so. I admire
his courage, his human qualities, and his intellectual elevation.
I am very grateful to James Rich, who was able to extract me from the
traditional French academism and make me share his creative enthusiasm for
physics. I hope he doesn't mind some of my mathematically minded remarks.
Part of Chapter 6 was directly inspired by his work in a different context.
I thank my friends Adel Bilal, François Jacquet, Christoph Kopper, David
Langlois and Jean-François Roussel for all their comments and suggestions
when we were teaching this matter and having fun together.
Finally, I want to thank my students, in particular Claire Biot, Amelie
Deslandes, Juan Luis Astray Riveiro, Clarice Aiello Demarchi, Joime Barral,
Zoe Fournier, Celine Vallot, and Julien Boudet, for their questions and their
kind comments. They have provided this book with a flavor and a spirit of
youth that would have been absent without them.
Jean-Louis Basdevant
Paris, January 2006
Contents
Preface
1 Introduction
1.1 Esthetics and Physics
1.2 Metaphysics and Science
1.3 Numbers, Music, and Quantum Physics
1.4 The Age of Enlightenment and the Principle of the Best
1.5 The Fermat Principle and Its Consequences
1.6 Variational Principles
1.7 The Modern Era, from Lagrange to Einstein and Feynman
2 Variational Principles
2.1 The Fermat Principle and Variational Calculus
2.1.1 Least Time Principle
2.1.2 Variational Calculus of Euler and Lagrange
2.1.3 Mirages and Curved Rays
2.2 Examples of the Principle of Natural Economy
2.2.1 Maupertuis Principle
2.2.2 Shape of a Massive String
2.2.3 Kirchhoff's Laws
2.2.4 Electrostatic Potential
2.2.5 Soap Bubbles
2.3 Thermodynamic Equilibrium: Principle of Maximal Disorder
2.3.1 Principle of Equal Probability of States
2.3.2 Most Probable Distribution and Equilibrium
2.3.3 Lagrange Multipliers
2.3.4 Boltzmann Factor
2.3.5 Equalization of Temperatures
2.3.6 The Ideal Gas
2.3.7 Boltzmann's Entropy
2.3.8 Heat and Work
2.4 Problems
3 The Analytical Mechanics of Lagrange
3.1 Lagrangian Formalism and the Least Action Principle
3.1.1 Least Action Principle
3.1.2 Lagrange-Euler Equations
3.1.3 Operation of the Optimization Principle
3.2 Invariances and Conservation Laws
3.2.1 Conjugate Momenta and Generalized Momenta
3.2.2 Cyclic Variables
3.2.3 Energy and Translations in Time
3.2.4 Momentum and Translations in Space
3.2.5 Angular Momentum and Rotations
3.2.6 Dynamical Symmetries
3.3 Velocity-Dependent Forces
3.3.1 Dissipative Systems
3.3.2 Lorentz Force
3.3.3 Gauge Invariance
3.3.4 Momentum
3.4 Lagrangian of a Relativistic Particle
3.4.1 Free Particle
3.4.2 Energy and Momentum
3.4.3 Interaction with an Electromagnetic Field
3.5 Problems
4 Hamilton's Canonical Formalism
4.1 Hamilton's Canonical Formalism
4.1.1 Canonical Equations
4.2 Dynamical Systems
4.2.1 Poincaré and Chaos in the Solar System
4.2.2 The Butterfly Effect and the Lorenz Attractor
4.3 Poisson Brackets and Phase Space
4.3.1 Time Evolution and Constants of the Motion
4.3.2 Canonical Transformations
4.3.3 Phase Space; Liouville's Theorem
4.3.4 Analytical Mechanics and Quantum Mechanics
4.4 Charged Particle in an Electromagnetic Field
4.4.1 Hamiltonian
4.4.2 Gauge Invariance
4.5 The Action and the Hamilton-Jacobi Equation
4.5.1 The Action as a Function of the Coordinates and Time
4.5.2 The Hamilton-Jacobi Equation and Jacobi Theorem
4.5.3 Conservative Systems, the Reduced Action, and the Maupertuis Principle
4.6 Analytical Mechanics and Optics
4.6.1 Geometric Limit of Wave Optics
4.6.2 Semiclassical Approximation in Quantum Mechanics
4.7 Problems
5 Lagrangian Field Theory
5.1 Vibrating String
5.2 Field Equations
5.2.1 Generalized Lagrange-Euler Equations
5.2.2 Hamiltonian Formalism
5.3 Scalar Field
5.4 Electromagnetic Field
5.5 Equations of First Order in Time
5.5.1 Diffusion Equation
5.5.2 Schrödinger Equation
5.6 Problems
6 Motion in a Curved Space
6.1 Curved Spaces
6.1.1 Generalities
6.1.2 Metric Tensor
6.1.3 Examples
6.2 Free Motion in a Curved Space
6.2.1 Lagrangian
6.2.2 Equations of Motion
6.2.3 Simple Examples
6.2.4 Conjugate Momenta and the Hamiltonian
6.3 Geodesic Lines
6.3.1 Definition
6.3.2 Equation of the Geodesics
6.3.3 Examples
6.3.4 Maupertuis Principle and Geodesics
6.4 Gravitation and the Curvature of Space-Time
6.4.1 Newtonian Gravitation and Relativity
6.4.2 The Schwarzschild Metric
6.4.3 Gravitation and Time Flow
6.4.4 Precession of Mercury's Perihelion
6.4.5 Gravitational Deflection of Light Rays
6.5 Gravitational Optics and Mirages
6.5.1 Gravitational Lensing
6.5.2 Gravitational Mirages
6.5.3 Baryonic Dark Matter
6.6 Problems
7 Feynman's Principle in Quantum Mechanics
7.1 Feynman's Principle
7.1.1 Recollections of Analytical Mechanics
7.1.2 Quantum Amplitudes
7.1.3 Superposition Principle and Feynman's Principle
7.1.4 Path Integrals
7.1.5 Amplitude of Successive Events
7.2 Free Particle
7.2.1 Propagator of a Free Particle
7.2.2 Evolution Equation of the Free Propagator
7.2.3 Normalization and Interpretation of the Propagator
7.2.4 Fourier and Schrödinger Equations
7.2.5 Energy and Momentum
7.2.6 Interference and Diffraction
7.3 Wave Function and the Schrödinger Equation
7.3.1 Free Particle
7.3.2 Particle in a Potential
7.4 Concluding Remarks
7.4.1 Classical Limit
7.4.2 Energy and Momentum
7.4.3 Optics and Analytical Mechanics
7.4.4 The Essence of the Phase
7.5 Problems
Solutions
References
Index

1 Introduction
Since mysteries are beyond us,
let us make believe we organized them.
Jean Cocteau
Art cannot be dissociated from metaphysics and philosophy. In his Aesthetics:
Lectures on Fine Art, in order to answer the question "Why does man have
the need to produce works of art?" Georg Wilhelm Friedrich Hegel says that
"The general need for art is ... the rational need which drives man to become
aware of the inside and outside worlds and to make an object in which he can
recognize himself."
1.1 Esthetics and Physics
The same need explains why physics is deeply filled with esthetic consider-
ations. In fact, the beauty of a theory has very often been considered as a
decisive argument in its favor. Albert Einstein's general relativity is a famous
example. It was formulated in 1916 but only got its true experimental
verifications 70 years later.¹ Nevertheless, nobody seriously thought that the theory
could really be disproved.² As Lev Davidovich Landau says ([1], Section 82),
"[It] is probably the most beautiful of existing theories. It is remarkable that
Einstein constructed it purely by deductive arguments and that it is only
afterwards that it was confirmed by astronomical observations."
¹ One usually makes a distinction between the verifications of the equivalence
principle, such as the deviation of light rays by a gravitational field, the variation of
the pace of a clock in a gravitational potential, or the general relativistic
corrections to celestial mechanics, and the true predictions of Einstein's equations, such
as the radiation of gravitational waves.
² This does not mean that one should give up finding experimental proofs.
The ingredients of esthetics have many origins. Of course, the beauty of
an idea in itself is difficult if not impossible to define in general. However,
two factors are easier to identify. These are the simplicity of a theory and
its unifying nature. Below, we will mention the archetype of such an
intellectual achievement, the Pythagorean musical scale. There are numerous other
examples.
After extensive work, both observational³ and calculational,⁴ Johannes
Kepler discovered his famous laws on the motion of planets in the solar system.
The discovery that, from a Copernican viewpoint, the orbits are the pure and
legendary ellipses of the geometry of Apollonius, Euclid and Archimedes has
a beauty and a simplicity that Kepler could not resist. He naturally conceived
the universe as being constructed in a mathematical esthetic that exhibits
both purity and unity. He expressed his emotion in his celebrated phrase
"Nature likes simplicity."⁵ It was both a triumph and a wonder when in the
framework of his Principia Isaac Newton was able to deduce Kepler's laws.
The same thing happened in the unification of electricity and magnetism
by André-Marie Ampère, followed by that of electromagnetism and light by
James Clerk Maxwell. This amazing adventure of the 19th century lasted
for a long time. The mathematical structure of Maxwell's equations revealed
relativity. The unification of electro-weak interactions by Sheldon Glashow,
Steven Weinberg and Abdus Salam in the 1960s was the next stage of this
fascinating endeavor. It led to the prospect of unifying all fundamental
interactions, including gravitation. At each step, simplicity, unity, and esthetics
are dominant features.
Simplicity does not mean that things can become understandable by the
layman. It is quite the contrary. The simplicity appears within the mathemat-
ical language. Galileo was the first to realize that:
Philosophy is written in the immense book which is constantly open
in front of us (I mean the universe), however, one cannot understand
it without first learning the language and the characters in which it is
written. It is written in the mathematical language and its characters
are triangles, circles and other geometrical figures in the absence of
which it is not possible for a human being to understand a single word
of it.
It is tempting to recall the words of Leonardo da Vinci in his Treatise on
Painting: "Non mi legga chi non è matematico, nelli mia principi."⁶ Simplicity
lies in the mysterious possibility of representing natural phenomena by
more and more general mathematical structures. If one can say that one of
the most fundamental mathematical structures of quantum mechanics lies in
the simplest of the four basic operations (i.e. addition), it is a consequence
of an enormous amount of effort toward simplicity and synthesis that was
made by physicists and mathematicians in the mid 1920s. The superposition
principle, which is the first and most important concept for anyone starting
to learn quantum mechanics, goes completely against common sense
and first physical intuition. However, although the mathematical expression of the
superposition principle is very simple, that simplicity can only be appreciated
after a long and unavoidable mathematical journey.
³ Galileo's telescope was invented in 1609, ten years later than Kepler's works.
⁴ Kepler dedicated his memoir Mysterium cosmographicum to John Napier, who
invented logarithms. Kepler said that without logarithms he never would have
been able to perform his accurate and difficult calculations.
⁵ Natura simplicitatem amat.
⁶ Those who are not mathematicians should not read my principles.
1.2 Metaphysics and Science
Philosophical thinking is frequently related to scientific progress, which is
understandable. It is interesting, however, to note that truly metaphysical
considerations have frequently followed the same paths as physics. When Ed-
mund Halley studied the trajectory of the comet that now bears his name
and had been known since at least 240 B.C., he showed that the orbit was an
ellipse and, by applying Newton's laws of motion, he was able to predict that
the comet would reappear in 1758. Celestial motion was deeply interlinked
with the notion of time, that most mysterious physical concept⁷ that had
been preying on people's minds since they started observing the sky. With
the use of Newton's laws, people had become capable of predicting the state
of the sky with great accuracy. Newton was amazed by that accuracy, and
he found there an argument for the existence of God. Since the system was
so finely tuned, since one could predict the future state of the sky, and since,
by solving equations, one could recover the state of planets at any previous
moment, one had to admit that the solar system, as well as all the cosmos,
had been conceived by some superior intelligence.
At the peak of a physical theory, it is not unusual to find that scientists
have invoked some "higher power." This can be transformed into a theological
argument, as in the case of Newton. It is frequently expressed as a question
addressed to the structured "organism" of which natural phenomena appear
to consist. Kepler and the planetary orbits are one example. Einstein's famous
sayings such as "Subtle is the Lord, but malicious He is not" or "God does
not play dice" are in the minds of everyone. However, beyond these questions
or assertions, one constantly finds, together with the progress of physics, a
metaphysical quest for the causes of the world and the principles of knowl-
edge. This quest is often transformed into seeking a genuine "meta-theory."
By itself, the name of "M-Theory," which was born in 1995 with the proof of
the equivalence between different superstring theories and led to a spectac-
ular revival of interest in the theory of fundamental interactions, is perhaps
revealing in this respect.
⁷ "What really is time? If no one asks me, I know. But if someone asks the question
and I must explain it, I no longer know." Saint Augustine, Confessions XI,
XIV, 17.
1.3 Numbers, Music, and Quantum Physics
The birth of modern physics is commonly placed in the 17th century with
Galileo. In fact, he laid its two founding stones: the experimental method and
the formulation of the theory in a mathematical language.
However, the starting point of experimental and theoretical physics lies
2200 years before that. In fact, the Pythagorean theorem overshadows what is, in
precisely Galileo's terms, the first modern discovery in physics: the theory of
sounds and the musical scale. It is modern in the sense that this discovery
possesses the two properties of having an experimental foundation and of
being expressed in a mathematical manner.
Music is the first abstract art. It is fascinating because it reaches the
subconscious directly. It escapes any attempt to be verbalized. Apart from
technical discussions between experts, one cannot put music into words. Musical writing is a
permanent source of amazement, as one can see in Figure 1.1.
Fig. 1.1. Sylvano Bussotti, "Piano pieces for David Tudor #4," excerpt from
"Pièces de Chair II" (Pieces of Flesh). (Courtesy of Casa Ricordi-BMG Ricordi
Milan; all rights reserved.)
It is difficult to put a date on the birth of this art, but for sure, very
quickly, humans in their songs understood the existence of harmony. The
octave, which is the simplest example, consists in the amazing discovery that
the same sound can be reproduced at a high pitch as well as at a low pitch.
The legend is that Pythagoras discovered the explanation of the musical
pitch by noticing that the pitch was directly related to the length of the
object that produced the sound. He used to pass daily in front of a blacksmith's
workshop in Samos, his native island.⁸ He observed that rods of iron of
different length gave different sounds under the blacksmith's hammer. As Arthur
Koestler says ([2], Chapters V and VII), "The ear-splitting crashes and bangs
in the workshop which, since the Bronze Age had yielded to the Iron Age,
had been regarded by ordinary mortals as a mere nuisance, were suddenly
lifted out of their habitual context: the 'bangs' became 'clangs' of music. In
the technical language of the communication engineer, Pythagoras had turned
'noise' into 'information'."
Back home, Pythagoras proceeded to an experimental verification of his
ideas on musical objects, in particular the vibrating strings of a lyre. He
understood that if he divided a string by integers belonging to the tetraktys,
the set of integers 1, 2, 3, 4 whose sum is the "perfect" number 10, he obtained
what had for a long time been named the "harmony": the octave, the fifth,
and the fourth.
As Denis Diderot wrote in the entry Pythagorism of his
"Encyclopédie":
Music is a concert of several discordant sounds. One must not
restrict one's ideas to sounds only. The purpose of harmony is more
general. Harmony has its invariant rules .... The octave, the fifth and
the fourth form the basis of harmonic arithmetics.
The way Pythagoras discovered the ratios of these intervals shows
he was a man of great genius. ... There are songs for each kind of
passion, whether one wants to temper them, or to excite them. The
flute is dull. The philosopher will take the lyre: he will play it in the
morning and in the evening.
After studying all subtleties he could find on the harmonics of a sound, and
the way he could reduce them to the interval of one octave by dividing them
by powers of 2, Pythagoras ended up with musical scales, in particular the one
that bears his name, which is shown in Table 1.1. The numbers indicate ratios
of frequencies. (The Greek modes were expressed in a decreasing sequence
according to the length of the string.)
Table 1.1. Frequency ratios in the musical scale of Pythagoras.
note                  C    D     E      F     G     A      B        C
ratio of frequencies  1    9/8   81/64  4/3   3/2   27/16  243/128  2
⁸ Actually, it is unimportant whether the anecdote is true or not, or whether it
is Pythagoras himself who made the discovery. In fact, the important points lie
in the profoundness of the idea, in the experimental observation that it entailed,
and in the resulting integer number theory that reached us.
In this scale, the intervals between two consecutive notes can take only
two values: the tone (ratio 9/8) and the half-tone (ratio 256/243). Pythagoras
considered it particularly important that the numerators and the denomina-
tors of these fractions were powers of the elements of the tetraktys (in that
case 2 and 3). For him, this scale has a much greater beauty than all others.
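The ratios of Table 1.1 can be recovered with a few lines of exact rational arithmetic. The sketch below assumes the usual construction by stacking fifths (ratio 3/2) and bringing each note back into a single octave by dividing or multiplying by powers of 2; the function names are illustrative and are not taken from the text.

```python
from fractions import Fraction


def reduce_to_octave(ratio):
    """Bring a frequency ratio into the interval [1, 2) using powers of 2."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


def pythagorean_scale():
    """Return the eight ratios C, D, E, F, G, A, B, C of the scale.

    Assumption: F lies one fifth below C, and G, D, A, E, B lie
    one to five fifths above C.
    """
    fifth = Fraction(3, 2)
    notes = [reduce_to_octave(fifth ** k) for k in range(-1, 6)]
    return sorted(notes) + [Fraction(2)]


def consecutive_intervals(scale):
    """Ratios between neighbouring notes of the scale."""
    return [high / low for low, high in zip(scale, scale[1:])]


if __name__ == "__main__":
    scale = pythagorean_scale()
    for name, ratio in zip("CDEFGABC", scale):
        print(name, ratio)
    print("intervals:", sorted(set(consecutive_intervals(scale))))
```

Only two distinct intervals appear between consecutive notes, the tone 9/8 and the half-tone 256/243, and every numerator and denominator is a power of 2 or of 3.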
In fact, at this point, we must give a further observation of Diderot, in his
last sentence of the entry:
The motion of celestial orbits, which carries the seven planets, forms
a perfect concert.
As Aristoxenus put it, one merit of Pythagoras was that he "elevated
arithmetic above the needs of merchants." He transformed a set of empirical and
utilitarian rules, in particular for trade, into a genuine deductive science. How-
ever, starting from his analysis of musical harmony, which can be reduced to
integer numbers, he could not resist believing that numbers are the principle,
the source, and the root of all things. Therefore, we are back to metaphysics.
According to this principle, the Pythagoreans elaborated a mystical "arith-
mology" by assigning qualitative properties to numbers. For instance, they
arrived at the idea that they could conceive and describe the cosmos and
its origin by the harmony of spheres. The principle of harmony invaded all
the philosophy of the Pythagoreans. They believed that all the universe is
determined by integers and the resulting harmony.
Pythagoras himself is one of the most mysterious personalities in Greek
antiquity. We do not possess any written text of his. For a long time, his
thought was known through oral tradition. Aristotle avoided mentioning his
name and only spoke about the Pythagoreans. Pythagoras was born in the 6th
century B.C. in Samos, in Asia Minor. Around the age of 40, he emigrated to
Crotone, in Italy. He founded a community, which was both religious and
political. The community was massacred during a revolt of the population. He
elevated integer numbers to the rank of foundations of the world. One legend
is that he committed suicide on the day he discovered he had proven that √2
was irrational; i.e. that he could not express the diagonal of a square in terms
of its side as the ratio of two integers.
Numerology played a considerable role in the development of science in
the 19th century. John Dalton's law of definite proportions enabled chemical
reactions to be reduced to an interplay of integers and secured atomic theory.
The classification of species in zoology as well as in botany rested on the
counting of various elements such as the number of petals in botany and the
number of teeth or nails in zoology.
The phenomenological analysis of atomic spectra involved rational frac-
tions. This adventure led to one of the most amazing breakthroughs, Johann
Balmer's integer number formula and its role in the birth of quantum me-
chanics.
It was by chance that, in 1885, Balmer, who was a high school teacher
in Basel and passionate about numerology, became aware of the first four
lines of the visible part of the spectrum of atomic hydrogen. He noticed that
the wavelengths $\lambda$ of these lines could be fitted to one part in a thousand by a
formula involving integers: $1/\lambda \propto (n^2 - 4)/n^2$, $n \geq 3$. Although he was not a
physicist, this result struck him with its simplicity and its beauty. In his 1885
paper, he wrote: "It appears to me that hydrogen ... more than any other
substance, will open new roads in the knowledge of matter, of its structure
and of its properties."
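The quality of Balmer's fit is easy to check numerically. The following sketch writes the formula as λ = B n²/(n² − 4). Both the constant B = 364.56 nm and the four measured wavelengths are approximate reference values assumed here for illustration; neither is quoted in the text.

```python
# Balmer's empirical constant in nanometres (assumed value, not from the text).
BALMER_CONSTANT_NM = 364.56

# Approximate measured wavelengths of the first four visible hydrogen lines,
# in nanometres, indexed by the integer n (assumed reference values).
MEASURED_NM = {3: 656.28, 4: 486.13, 5: 434.05, 6: 410.17}


def balmer_wavelength_nm(n):
    """Wavelength from 1/lambda proportional to (n^2 - 4)/n^2, for n >= 3."""
    if n < 3:
        raise ValueError("the Balmer formula requires n >= 3")
    return BALMER_CONSTANT_NM * n**2 / (n**2 - 4)


def relative_errors():
    """Relative deviation between the formula and the measured lines."""
    return {
        n: abs(balmer_wavelength_nm(n) - measured) / measured
        for n, measured in MEASURED_NM.items()
    }


if __name__ == "__main__":
    for n, error in relative_errors().items():
        print(f"n = {n}: {balmer_wavelength_nm(n):7.2f} nm, "
              f"relative error {error:.1e}")
```

With these assumed inputs, each of the four lines is reproduced to better than one part in a thousand.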
In fact, in 1912, when the 27-year-old Niels Bohr was working at Ernest
Rutherford's laboratory on a model of an atom, he was completely unaware of
Balmer's formula and of the analogous formulas of Johannes Rydberg concern-
ing alkali atoms. When, by accident, he was informed of the Balmer formula,
it took him a few weeks to construct his celebrated model of the hydrogen
atom, one of the turning points of quantum physics.
An amusing enigma remains to be settled, namely the empirical law that
Titius found in 1766 and that was also published by Bode in 1772. This law
is a relation between the distances a of planets from the sun (more exactly
the major axis of the ellipses), expressed in astronomical units (1 A.U. = 150
million km), and their ranks n, assuming the rank of Mercury is n = -()() and
the rank of Venus n = 1. The original form of the law was a = (n + 4)/10
with n = 0, 3, 6, 12, 24, 48, ... ; the present form is
$a = 0.4 + 0.3 \times 2^{n-1}$
where a is the distance between the planet and the sun. For Mercury, $n = -\infty$
and a = 0.4; n = 1 for Venus; n = 2 for the Earth; n = 3 for Mars; and n = 5
for Jupiter. The "gap" observed for n = 4 led to the discovery of the belt of
asteroids when astronomers tried to observe a planet at a distance of 2.8 A.U.
The Titius-Bode law, which is accurate up to Uranus, fails at larger
distances (it gives a = 77.2 A.U. for Pluto, whose actual distance from the sun
is 39.2 A.U.; Pluto has been classified as a "dwarf planet" since the August 24,
2006, General Assembly of the International Astronomical Union). There is current
speculation as to whether it holds for the extrasolar planetary systems
discovered in recent years. No dynamical calculation has ever been able to
recover this formula from the theory of celestial mechanics.
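The modern form of the law is easy to tabulate. The short sketch below does so under one stated assumption: the ranks up to Jupiter follow the text, whereas the ranks 6 to 9 assigned to Saturn, Uranus, Neptune, and Pluto simply continue the sequence (rank 9 reproduces the 77.2 A.U. quoted above for Pluto).

```python
import math

# Modern form of the Titius-Bode law: a = 0.4 + 0.3 * 2**(n - 1), in A.U.
# Ranks up to Jupiter follow the text.
# ASSUMPTION: ranks 6 to 9 for Saturn, Uranus, Neptune, and Pluto continue
# the sequence; rank 9 reproduces the 77.2 A.U. quoted for Pluto.
RANKS = {
    "Mercury": -math.inf,
    "Venus": 1,
    "Earth": 2,
    "Mars": 3,
    "asteroid belt": 4,
    "Jupiter": 5,
    "Saturn": 6,
    "Uranus": 7,
    "Neptune": 8,
    "Pluto": 9,
}


def titius_bode(n: float) -> float:
    """Distance from the sun (A.U.) predicted for a body of rank n."""
    return 0.4 + 0.3 * 2.0 ** (n - 1)


if __name__ == "__main__":
    for body, rank in RANKS.items():
        print(f"{body:14s} n = {rank!s:>4}  a = {titius_bode(rank):5.1f} A.U.")
```

The rank of Mercury is represented by floating-point minus infinity, for which the power of 2 evaluates to zero, so that the formula returns 0.4 as in the text.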
1.4 The Age of Enlightenment and the Principle of the Best
Philosophers of the 18th century were fond of the idea of balance and equilibrium.
Let us mention, for its relevance today, the following assertion of Charles de
Secondat de Montesquieu in The Spirit of the Laws: "In any public office, one
must compensate the might of the power by the brevity of its duration."
With the philosophy of Gottfried Wilhelm Leibniz (1646-1716), there ap-
pears an acknowledgment that optimal conditions appear in Nature. Coming
back to Diderot and the entry "Leibnizianism" in the Encyclopédie, one can
read the following:
He had on general physics a particular idea, namely that God built
in the most economic manner what was the most perfect and the
best. He is the founder of optimism, or of that system that seems to
transform God into an automaton in his decisions and in his actions,
and to recover in a spiritual form the fatum of antiquity, or also the
necessity for things to be such as they are.
However, there is an infinite number of combinations and of pos-
sible worlds in the ideas of God, but only one of them can exist,
therefore there must be some sufficient reason for that choice. That
reason can only lie in the various degrees of perfection, therefore it
follows that the existing world is the most perfect. God has chosen it
in his wisdom, known it in his goodness, produced it in the fullness of
his power.
In his New Essays on Human Understanding, Leibniz wrote: "My system
takes the best from all sides." For him, God is conceived as a mathematician.
We are back to metaphysics.
Naturally, one must temper this impression of enthusiasm. There was by
no means a unanimous agreement on such ideas, be it of Leibniz or of any
other thinker. In Candide, Voltaire mocked the ideas of Leibniz in particular:
"Everything is the best in the best of possible worlds."
It is proven, he said, that things cannot be different from what they
are, since everything exists for a given purpose, everything is neces-
sary for the best purpose. Notice that noses have been made in an
appropriate shape in order to wear glasses.
1.5 The Fermat Principle and Its Consequences
The scientific thunderbolt, the mathematical formalization of the ideas above,
is primarily due to Pierre de Fermat (1601-1665), as we shall see in Chapter 2.
The founding idea is the principle of geometrical optics that bears his name,
which is a principle of least time.
Actually, everything started with a harsh criticism that Fermat made in
1637 about René Descartes's work and about the notion of proof. Fermat was
annoyed by the chapter of Descartes's Discours de la Méthode on geometrical
optics, "Dioptrique." Fermat, who was a judge in Toulouse, was a known
mathematician but not a physicist. He was, however, interested in the structure
of physical laws. 9 The lack of rigor of Descartes's "pseudo proof" irritated
him. He was convinced that things could be done properly. As Fermat said,
"It seems to me that a little geometry can help us solve the problem."
9 He was engaged in a correspondence with Étienne Pascal, the father of Blaise,
and with Gilles de Roberval on mechanical equilibrium.
When he managed to formulate the law of refraction, $n_1 \sin i_1 = n_2 \sin i_2$, in
a geometrical manner, Fermat was fascinated: "The fruits of my work were the
most unexpected and the most extraordinary that ever were. In fact ... I have
found that my principle gives exactly and precisely the same proportion of
refractions that Monsieur Descartes established." At the end of 1661, Fermat
wrote his principle of least time, which started everything. He called it the
principle of natural economy.
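The link between least time and the law of refraction can be checked numerically. The following is a minimal sketch, not Fermat's own geometrical argument: a ray goes from a point A in a medium of index n1 to a point B in a medium of index n2, and one searches for the crossing point on the interface that minimizes the travel time. The geometry, the function names, and the choice of a ternary search are assumptions made for illustration.

```python
import math


def optical_path(x, n1, n2, h1, h2, d):
    """Optical length (c times the travel time) of the ray A -> (x, 0) -> B.

    A = (0, h1) lies in the medium of index n1, B = (d, -h2) in the medium
    of index n2; the interface is the line y = 0.
    """
    return n1 * math.hypot(x, h1) + n2 * math.hypot(d - x, h2)


def least_time_crossing(n1, n2, h1, h2, d, iterations=100):
    """Ternary search for the crossing abscissa that minimizes the travel time.

    The optical path is a convex function of x, so shrinking the bracket
    [0, d] toward the smaller of two trial values converges to the minimum.
    """
    lo, hi = 0.0, d
    for _ in range(iterations):
        x1 = lo + (hi - lo) / 3.0
        x2 = hi - (hi - lo) / 3.0
        if optical_path(x1, n1, n2, h1, h2, d) < optical_path(x2, n1, n2, h1, h2, d):
            hi = x2
        else:
            lo = x1
    return 0.5 * (lo + hi)


def snell_products(n1, n2, h1, h2, d):
    """Return (n1 sin i1, n2 sin i2) for the least-time ray."""
    x = least_time_crossing(n1, n2, h1, h2, d)
    sin_i1 = x / math.hypot(x, h1)
    sin_i2 = (d - x) / math.hypot(d - x, h2)
    return n1 * sin_i1, n2 * sin_i2


if __name__ == "__main__":
    left, right = snell_products(n1=1.0, n2=1.33, h1=1.0, h2=1.0, d=1.0)
    print(f"n1 sin i1 = {left:.6f}, n2 sin i2 = {right:.6f}")
```

The search works on the travel time itself and never uses the law of refraction; the two products nevertheless agree at the minimum, within the numerical accuracy of the search.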
In 1744, Pierre-Louis Moreau de Maupertuis (1698-1759), who in 1730 had
introduced the ideas of Newton into France and continental Europe, stated
for the first time the principle of the least amount of action in mechanics.
Even though the initial form and justification given by Maupertuis are
obscure, it is a historical landmark in the evolution of ideas in physics and
likewise, at the time, in philosophy.
In the same line as Fermat, Maupertuis understood that, in some well-
defined conditions, Newton's equations were equivalent to the fact that a
quantity, which he called the action, was minimal. His statement is the fol-
lowing:
The Action is proportional to the product of the mass by the velocity
and by space. Now, here is this principle, so wise, so worthy of the
Supreme Being: when some change occurs in Nature, the amount of
Action used for this change is always the smallest possible.
(Notice the presence of the Supreme Being.)

For a particle of mass m, of velocity v, Maupertuis's action is the product
of three factors: the mass, the velocity, and the length of the trajectory (i.e.
the integral of the linear momentum along the trajectory: $A = \int m v \, dl$). The
formulation of the Maupertuis principle and its proof were given soon after
by his friend Leonhard Euler.
These principles had a great impact in the 18th century. The fact that
the laws of nature could be deduced by optimization principles, a balance
between various conflicting sources, struck the minds of people in the Age of
Enlightenment. The principle of natural economy was fascinating. It appeared
as a natural balance between various laws of physics that seemed to lead
toward opposite directions, if not simply inconsistent with one another. This
principle was naturally related to the "principle of the best" of Leibniz.
1.6 Variational Principles
As Philip M. Morse and Herman Feshbach say [10], variational principles are
the mathematical formulation of the superlative. This formulation of physical
laws consists in imposing that some typical physical quantity of the system
under consideration is optimal for the actual performance of the system com-
pared with the value it would take if one were to imagine a different perfor-
mance. In a certain sense, owing to their universality, variational principles
can appear as a general "metatheory" of physics and perhaps, one day, of other
branches of science such as biology, psychology, and social phenomena. They
play a central role in economics. The first formulation of a physical theory consists
in explaining a phenomenon by a local law. This is the case for Newton's
laws of dynamics, for the Snell-Descartes laws, and for the differential laws
of electromagnetism or thermodynamics. After this first formulation has been
performed and exploited, one always seeks the underlying basic principles and
their relations with other theoretical schemes. "Variational principles" express
physical laws in a global manner. The corresponding formulation can restore
the local laws; however, one discovers that it is richer and more powerful. It
allows us to bring out the fundamental principles of the laws under consider-
ation. This provides a more fruitful view both of fundamental principles and
their applications.
This way of considering physical processes and structures can be traced
back to Greek mathematicians and philosophers. The Greeks characterized a
straight line as the shortest path between its endpoints. In the first century
A.D., Hero of Alexandria discovered and proved the remarkable fact that
the equality of the angles of incidence and reflection in geometrical optics boils
down to the fact that the length of the path between the source and the eye
of the observer is the shortest possible. In the same spirit, the Aristotelians
thought they could "justify" that celestial orbits are circular by the fact that,
for a given value of the perimeter of closed planar curves, the circle is the
one that surrounds the largest area (this is called the isoperimetric problem in
mathematics). 10 Considering a straight line as the shortest path between two
points or a circle as the shortest line around a given area are simple ways to
define these geometric objects.
Similarly, in physics, saying that electric current is distributed in a network
in such a way that the energy loss by Joule heating is as small as possible is a
simple and direct description of the flow of electric current that encompasses
a variety of particular cases without using any complicated mathematics. Of
course, calculations reappear as soon as one applies these principles to specific
cases. The assertion that a physical system acts or evolves in such a way that
some function related to it is minimum or maximum is very often the starting
point of theoretical investigations, and it enables one to uncover the ultimate
relations between physical facts.
Therefore, variational principles present natural phenomena as problems
of optimization under constraints. They are present in all sectors of physics (a
remarkable discussion of this fact is given in the Feynman Lectures, Chapters
I.26 and II.19 [3], and in the book by Yourgrau and Mandelstam [4]).
10 The legend says that, when she founded Carthage, Dido had to satisfy the condition
that the city should be contained within a bull's skin. She cut the skin in
narrow strips in order to make an enormous circle with it.
In mechanics, which is the first physical science if one includes the acous-
tics of Pythagoras, it is generally acknowledged that the great philosophical
and physical breakthrough that led to our present views came when peo-
ple attacked Aristotle's ideas on motion. In order to explain motion and its
changes, Aristotle imagined that space was filled with "movers" that impart
the motion. John Philoponus (490-566), a Christian philosopher, scientist, and
theologian, was the first to make a critique of Aristotelian concepts of motion
in his Physics commentary. Through a series of fascinating remarks, he de-
parted from Aristotle's dynamics and opened the way to modern conceptions
by introducing the theory of impetus, which contains the notion of inertia.
Two of his amazingly modern objections are the following. When two bod-
ies in motion collide, their trajectories sharply deviate; however, if they skim
past each other, their trajectories are unaffected. How can the "movers" act
in such a discontinuous and unpredictable manner? Why is it easier to throw
a light object higher than a heavy one? Philoponus preferred to think that
some momentum was given to the object by the thrower.
This was followed 800 years later by the works of Jean de Buridan (1300-
1358), who followed the ideas of Peter of Spain and William of Ockham.
Buridan was the rector of the University of Paris between 1328 and 1340.
He published several leading works of the Middle Ages on logic, metaphysics,
natural philosophy, and ethics. There is a story, reported by François Villon,
that Buridan died when the Queen of France had him thrown into the Seine
River in a sack after making love with him. Another story is that he founded
the University of Vienna after being expelled from Paris for his nominalist
teachings and that he hit the (future) Pope Clement VI over the head with a
shoe while competing for the affections of the wife of a German shoemaker (the
blow was apparently the cause of the prodigious memory for which Clement
became known).
Like Philoponus, Buridan conceived of motion as resulting from a balance
between conflicting causes and the notion of impetus. This is the first modern
conception of mechanics. The nature of motion was, for Buridan, the result of
the interplay and conflicts of various sources of impetus. The laws of motion
followed from an optimization of this set of conflicts. 11
In the motion of the projectile, one made the distinction between three
phases, represented in Figure 1.2. In the first phase, called violent motion, the
trajectory is a straight line and the motion develops under the impetus given
by the cannon. In the third phase, called natural motion, the trajectory is
also straight; it is due to the impetus of gravity, also called natural impetus,
and the cannonball falls down. 12
11 In the same line of thought, he gave a famous argument in favor of free choice.
The argument is known as "Buridan's Ass," where two piles of hay are set at
equal distances from a starving donkey. Nobody, not even God, can predict which
pile the poor beast will choose. In order to express such ideas at that time in the
Sorbonne, some amount of courage, authority, and skill was necessary.
[Figure 1.2: trajectory of a cannonball through points B and C, with the phases labeled "violent motion" and "media quies (resting state)."]
Fig. 1.2. Successive phases of the motion of a cannonball in the theory of impetus.
The intermediate phase corresponds to the weakening of the violent impetus
under the action of the natural impetus. This results in some sort of
resting phase (vertically speaking) called the media quies. This phase was con-
ceived as a transition, or a compromise, between two contradictory states of
motion where the projectile has roughly a horizontal uniform motion.
In the 16th century, the impetus concept was quite fashionable. Gunners
used to calculate the motion of cannonballs by using Buridan's impetus con-
cept as shown in Figure 1.3. Leonardo da Vinci qualitatively explained the
motion of a spinning top as a conflict between two axial impetuses.
1.7 The Modern Era, from Lagrange to Einstein and Feynman
The metaphysical enthusiasm did not last very long. This was not due to any
lack of intellectual richness or aesthetic appeal, but to the fact that variational
principles have never since stopped producing important physical results. Our ambition
in this book is to describe a few of them.
Leonhard Euler (1707-1783) and Joseph-Louis Lagrange (1736-1813),
whose works were pursued by William R. Hamilton (1805-1865), set the mathematical
foundations of the subject. They laid a cornerstone of
present-day theoretical physics.
The consequences of this conception of physics can be found in Einstein's
general relativity as well as in gauge theories of fundamental interactions.
The central mathematical tool is the variational calculus (also called calcu-
lus of variations). This is the work of Euler, who understood the mathematical
foundation, and Lagrange, who made a decisive contribution in 1766. 13 Variational
calculus is an amazing chapter of mathematics, both in its unifying
aspect and in the number of questions for which it has given an answer.
12 The fall is steeper than the rise of the violent impetus. Fortunately, air friction
does produce this effect!
13 Euler, who had been partially blind since the age of 28, became completely blind
in that same year of 1766. The 18-year-old Lagrange visited him in 1754 and told
him about his work. Euler was filled with wonder over the talent of this young
man, and he hid his own results for some time so that the full credit would go
to Lagrange. This is an almost unique example, nonexistent nowadays, of human
courtesy and passion for science.
Fig. 1.3. Diagram from the Polish 16th century artillery handbook Ars Magne Artilleriae
pars prima: DellAqua Praxis: examples of shots. (One can imagine that 20th
century colliding beam facilities were already in the minds of people.) Archives of
Casimir Siemienowicz, General of the artillery of the Polish and Lithuanian Crown.
(Courtesy of Richard J. Orli.)
In 1744, Euler published his treatise Methodus inveniendi lineas curvas
maximi minimive proprietate gaudentes, which founded variational calculus.
This was along the lines of the works of Jacob and Johann Bernoulli, which
had a considerable impact on Lagrange. It is in that work that Euler gave an
a posteriori proof of the least action principle of his friend Maupertuis.
Lagrange belonged to a family who lived in Torino. He was particularly
gifted and precocious. The positive reaction of Euler encouraged him and, in
1756, he applied his techniques to the least action principle, which founded
modern mechanics.
A major contribution of Lagrange is his Analytical Mechanics, where he
wrote the synthesis of all the methods that he had developed before, in statics
as well as in dynamics. The work was finished in 1782 but did not appear
in Paris until 1788. Lagrange's Mechanics is as important in the history of
physics, mechanics, and mathematics as Newton's celestial mechanics. His
work was the starting point of the work of Hamilton, who called it a "scientific
poem written by the Shakespeare of Mathematics."
Hamilton was born in Dublin. Like Lagrange, he was also a precocious
child. At the age of 19, he wrote a remarkable paper on optics. At the age of
23, he became Professor of Astronomy at Dublin and Royal Astronomer at the
Dunsink Observatory. He spent all his life in Dublin and in his observatory.
Hamilton's interest in optics came from the instruments in his laboratory.
His memoir On caustics, written in 1824, is a milestone of optics. Soon after
that, he developed and amplified the analytical mechanics of Lagrange, and
he gave it its modern form.
Hamilton was fascinated by variational principles and, in particular, by the
similarity between Maupertuis's principle in mechanics and Fermat's principle
in optics. In 1830, he made the remarkable observation that the formalisms
of optics and mechanics could be unified and that Newtonian mechanics cor-
responds to the same limit or approximation as geometrical optics compared
with wave optics.
His contemporaries paid no attention to that remark, and the great math-
ematician Felix Klein said in 1890 that it was a shame. Of course, in 1830,
there was no experimental evidence for Planck's constant. Nevertheless, to
a large extent, Hamilton's work can be considered a precursor of quantum
mechanics.
Our main purpose here is to give an instructive account of the analytical
mechanics of Lagrange and Hamilton. These are inescapable chapters in the
culture of physicists. We shall also show the many spinoffs in other sectors.
We shall, in particular, show the relation of analytical mechanics with optics
and with quantum mechanics.
In Chapter 2, we recall Fermat's principle that was given in 1661 as a
least time principle. Fermat poses the problem of the propagation of light by
asking what path a light ray actually follows to go from one point
to another. This will bring us in a natural way to the mathematical core of
our purpose, the variational calculus of Euler and Lagrange. It is a very rich
chapter of mathematics. Here, we only wish to obtain physical results in a
simple and straightforward manner.
We will investigate some simple examples in order to get acquainted with
the matter. These will be the Maupertuis least action principle and other
more unexpected examples, such as Kirchhoff's laws or Poisson's equation in
electrostatics.
Finally, we shall turn to a case that is quite analogous in its spirit but is
fascinating because of the number and power of its results compared with the
simplicity of its starting point, the foundations of statistical thermodynam-
ics. Introducing the technique of Lagrange multipliers and the principle of
equiprobability of configurations, we will obtain a very remarkable definition
of temperature, together with its first physical property, that temperatures
of systems in thermal contact equalize. Next, we will give the statistical absolute
definition of entropy due to Ludwig Boltzmann. This will lead us to an
incredibly simple principle:
Thermal equilibrium corresponds to a situation that maximizes the entropy
for given constraints; in other words, a situation where disorder is maximal
for given constraints.
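A toy computation may help to see how equiprobability of configurations leads to equal temperatures. The sketch below is only an illustration under assumptions that are not in the text: the two systems are modeled as collections of identical oscillators sharing a fixed number of energy quanta (a so-called Einstein solid), and the quantity called beta is a finite-difference estimate of the derivative of ln Ω with respect to the energy, which plays the role of the inverse temperature.

```python
import math


def ln_multiplicity(n_osc: int, q: int) -> float:
    """ln of the number of ways to share q energy quanta among n_osc oscillators."""
    return math.log(math.comb(q + n_osc - 1, q))


def most_probable_split(n_a: int, n_b: int, q_total: int) -> int:
    """Energy q_a of system A that maximizes the total number of configurations.

    All configurations of the combined system are taken as equally probable,
    so the most probable split maximizes ln(Omega_A) + ln(Omega_B).
    """
    return max(
        range(q_total + 1),
        key=lambda q_a: ln_multiplicity(n_a, q_a)
        + ln_multiplicity(n_b, q_total - q_a),
    )


def beta(n_osc: int, q: int) -> float:
    """Finite-difference estimate of d(ln Omega)/dq (inverse temperature)."""
    return ln_multiplicity(n_osc, q + 1) - ln_multiplicity(n_osc, q)


if __name__ == "__main__":
    # ASSUMPTION: illustrative sizes, 300 and 200 oscillators, 100 quanta.
    n_a, n_b, q_total = 300, 200, 100
    q_a = most_probable_split(n_a, n_b, q_total)
    print("most probable energy of A:", q_a)
    print("beta_A:", beta(n_a, q_a), "beta_B:", beta(n_b, q_total - q_a))
```

At the most probable sharing of the energy, the two estimates of beta nearly coincide; the small residual difference comes from the discreteness of the quanta and shrinks as the systems grow.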
The range of application of such a principle goes far beyond thermodynamics,
which is understandable. In particular, it is a cornerstone in the
construction of economic models.
Chapter 3 is devoted to the analytical mechanics of Lagrange. The end of
the 17th century was marked by the triumph of Newton's great synthesis, the
Philosophiae Naturalis Principia Mathematica, in 1687. In addition, Newton
formulated the universal law of gravitation which enabled him to explain
Kepler's laws and the motion of celestial bodies. Humans had been concerned
with celestial motion, which was completely entangled with the notion of time,
ever since they started to observe the sky. They were now able to predict the
state of the sky with incredible accuracy.
But this was by no means the end of the story. Following the Newtonian
synthesis, an amazing adventure unfolded in the 18th and 19th centuries.
This started with Jean Le Rond d'Alembert, Maupertuis, and the Bernoulli
brothers, and was followed by Euler, Lagrange, and, later on, Hamilton. The
true structure of mechanics was discovered. It was a geometric structure. A
large category of mechanical problems could be reduced to purely geometrical
problems.
D'Alembert, who was the first to understand the concept of mass through
the notion of linear momentum and its conservation, attacked the abstract
concept of force introduced by Newton. For d'Alembert, the only observable
phenomenon is motion, whereas the "cause of motion" is an abstraction; hence
the idea of studying the global set of motions that a theory predicts rather
than its particular trajectory.
The crowning achievement of these ideas came with Lagrange in 1788, one
century after the Principia. Lagrange published, in his Analytical Mechanics
(Méchanique Analitique), a new formulation of mechanics where the global
and geometric structure of the theory was emphasized. Lagrange proposed
a new way of considering mechanical problems. Instead of determining the
position r(t) and velocity v(t) of a particle at time t, given its initial state
{r(0), v(0)}, Lagrange wanted to determine the actual trajectory followed
by the particle if it starts at $r_1$ at time $t_1$ and arrives at $r_2$ at time $t_2$. This
is in exactly the same spirit as Fermat's for light rays.
The Lagrangian formalism is particularly well suited for discussing invari-
ance laws of physical phenomena and the resulting conservation laws. This
is a fundamental question, since symmetry properties and invariance laws
are what is known a priori of the physics of a problem. In the course of the
discussion, we will introduce the fundamental notion of Lagrange conjugate
momenta or generalized momenta, which plays a central role in all that will
follow.
Finally, we will extend these considerations to the case of a relativistic
particle. Our starting point will precisely be relativistic invariance. The least
action principle can only be meaningful if it determines the motion of a particle
in the same way, whatever the relative state of motion of the observer. This
will enable us to construct the Lagrangian of a relativistic particle. We shall
see how the energy and momentum of a free particle are related to its mass
and velocity. We will prove that the set {E / c, p} is a four-vector of space-time
in relativity.
Chapter 4 leads us to the next stage, in the 1830s, and to the so-called
canonical formulation of analytical mechanics due to Hamilton. The canonical
formalism was elaborated in 1834. It is more convenient for a series of problems
such as the dynamics of point-like particles. But it is impressive, above all, in
the number of its developments, both in physics and in mathematics. In the
present book, we are mainly concerned with applications to mechanics, but
we shall describe several other spinoffs of Hamilton's work. We will establish
the canonical formalism that consists in describing the state of a system by
conjugate variables (i.e., positions {x} and Lagrange conjugate momenta {p})
and not by positions and velocities. In other words, a system is described by a
point in phase space, and it is characterized by a Hamiltonian that is obtained
from the Lagrangian by a Legendre transformation.
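The Legendre transformation mentioned here can be made concrete on a simple case. The sketch below is an illustration under stated assumptions, not the derivation given later in the book: the Lagrangian is that of a one-dimensional harmonic oscillator (an arbitrary choice), and the Hamiltonian is computed numerically as the maximum over the velocity v of p v - L(x, v), which for this Lagrangian is reached where p = ∂L/∂v.

```python
def lagrangian(x, v, m=1.0, k=1.0):
    """L = kinetic minus potential energy of a harmonic oscillator.

    ASSUMPTION: this particular Lagrangian is chosen only for illustration.
    """
    return 0.5 * m * v * v - 0.5 * k * x * x


def hamiltonian_by_legendre(x, p, m=1.0, k=1.0, v_max=100.0, iterations=200):
    """Return (H, v_star), with H(x, p) = max over v of [p*v - L(x, v)].

    p*v - L is concave in v for this Lagrangian, so a ternary search on
    [-v_max, v_max] finds the unique maximum, located at v_star = p / m.
    """
    def objective(v):
        return p * v - lagrangian(x, v, m, k)

    lo, hi = -v_max, v_max
    for _ in range(iterations):
        v1 = lo + (hi - lo) / 3.0
        v2 = hi - (hi - lo) / 3.0
        if objective(v1) < objective(v2):
            lo = v1  # the maximum lies to the right of v1
        else:
            hi = v2  # the maximum lies to the left of v2
    v_star = 0.5 * (lo + hi)
    return objective(v_star), v_star


if __name__ == "__main__":
    x, p = 0.5, 2.0
    h_numeric, v_star = hamiltonian_by_legendre(x, p)
    h_expected = p * p / 2.0 + 0.5 * x * x  # p^2/(2m) + k x^2/2 with m = k = 1
    print(h_numeric, h_expected, v_star)
```

For this example the numerical transform reproduces the familiar expression $p^2/2m + kx^2/2$, and the maximizing velocity satisfies $p = mv$, which is the definition of the conjugate momentum.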
After finding Hamilton's canonical equations, which are first order coupled
differential equations for the evolution of the state variables, we shall present
some aspects of dynamical systems. In fact, this type of physical problem has
been an amazing source of discoveries, both in mathematics and in physics.
Henri Poincare founded this field of research in 1885 when he studied the three-
body problem. This leads to fascinating developments, such as the behavior for
$t = \infty$, attractors and strange attractors, bifurcations, chaos, etc. The most
famous strange attractor is the Lorenz attractor, named after its inventor,
Edward N. Lorenz, who discovered it in 1963 in a mathematical model for the
evolution of the atmosphere. Lorenz generated a new and spectacular source
of interest in chaos with his "butterfly" effect in meteorology.
Next, we will introduce the Poisson brackets, which bear a mathemati-
cal structure of great interest and whose applications are closer to what we
are concerned with here. Jacobi considered that to be Poisson's greatest dis-
covery. In fact, Poisson brackets are the starting point of the theory of Lie
groups. We shall use them to define canonical transformations, which have
many applications and show that there is a complete equivalence between
the two types of state variables: positions {x} and momenta {p}. From the
mathematical point of view, phase space is the space that is appropriate to
describe the evolution of a set of points, as opposed to the "empirical" space
of positions and velocities. We will then be able to understand in a natural
way the amazing property discovered by Dirac in 1925. There is a remark-
able similarity between analytical mechanics and quantum mechanics if one
replaces the classical Poisson brackets by the commutators (divided by $i\hbar$)
of quantum physical quantities. We shall extend these considerations to the
case of a charged particle in a magnetic field, where precisely the conjugate
momentum and the linear momentum differ radically.
The last part of this chapter is devoted to the Hamilton-Jacobi equation,
where one chooses to work directly with the action, as a function of the vari-
ables (x,p) and no longer with the Lagrangian or the Hamiltonian. After we
have established the major properties and the Hamilton-Jacobi equation, we
will discover an impressive series of results. We shall see how, for conserva-
tive systems, the flow of trajectories is orthogonal to the surfaces of constant
action. From that point of view, we will see that the Maupertuis principle
can be cast in a completely geometric form. At that point, we will be able to
understand how geometrical optics appears as the limit of wave optics, as was
discovered by Hamilton. The proof involves what is called the eikonal, which
is the optical analog of the action (divided by the wavelength). In the ap-
proximation of small wavelengths, called the eikonal approximation, the wave
propagates with a wave vector that is locally perpendicular to the surfaces of
constant eikonal. The surfaces are the geometric wave fronts. We will see that
the eikonal equation corresponds exactly to the Fermat principle. The geometric
interpretation is nothing but the Huygens-Fresnel principle. Finally,
we will show that the same methodology can be applied to the Schrödinger
equation in wave mechanics. This constitutes the famous semi-classical WKB
approximation.
In Chapter 5, we expound the Lagrangian formulation of field theory. In
itself, field theory is a vast domain. In fact, the Lagrangian formalism ex-
hibits its real power when one deals with systems possessing a large, possibly
infinite, number of degrees of freedom. Here, we will examine how this for-
malism deals with field theory. In this chapter, which is deliberately rather
short, we want to explain the principles of Lagrangian field theory and its
application to the electromagnetic field. The classical theory of gravitation
is beyond the scope of this book. We will first understand the principle of
the Lagrangian formulation of field theory starting with the case of a vibrat-
ing string. Then the extension to three space dimensions, as well as several
degrees of freedom, is discussed. One can easily guess the extension of the
method to four dimensional space-time and relativistic fields. We shall de-
scribe the electromagnetic field and Maxwell's equations. We shall say a few
words about field equations that are of first order in time. The first example is
the Fourier diffusion equation, which corresponds to a nonreversible problem
(i.e., a dissipative problem). The interest in this example comes from the similarity
between the Fourier equation and the Schrödinger equation. We shall
see that a Lagrangian approach can be constructed for the latter but that it
essentially leads nowhere in nonrelativistic quantum mechanics.
In Chapter 6, we give the formulation of the motion of a free particle
in a curved space. Einstein's masterpiece, general relativity, stems from the
amazing observation that two physical quantities that a priori have nothing
in common are equal or strictly proportional. These quantities are the two
concepts of mass. One is the inertial mass, or the coefficient of inertia, and
the other is the coupling coefficient to the gravitational field, or the gravita-
tional mass. There is no a priori argument that can explain why this equality
occurs. In a gravitational field, this equality eliminates the mass from the
equations of motion. Two bodies placed with the same initial conditions have
the same motion whatever their masses. It took some time to realize how deep
this observation is. The historical experiment of Eötvös in 1890 14 has been
systematically redone and improved since then. It is still performed with more
and more sophisticated techniques.
The underlying idea of general relativity is that the equality becomes nat-
ural if what we call the "gravitational" motion is actually a free motion in a
curved space-time. Einstein used to say that in 1907, when he was working
on how to incorporate Newtonian gravitation in relativity (the incorporation
of electromagnetism was by construction automatic), he had the "happiest
thought of his life." He was thinking of what a carpenter falling from the roof
would feel. For such an "observer" (and of course as long as he does not en-
counter any obstacle), there is no gravitational field. If this observer lets any
object "fall" from his pocket, this object stands still or has a uniform linear
motion with respect to him, whatever its nature or physical and chemical
composition. (The resistance of the atmosphere is of course neglected in this
example.)
The ambition of this chapter is to show how the notion of motion in a
curved space can lead to a theory where the equality of the "two masses"
emerges naturally. We start by studying the free motion of a particle in a
curved space and the notion of the metric of the space. We will then write the
motion of a free particle in such a space. This will lead us to a fundamental
result: The physical trajectories are the geodesics of the space, the curves of
minimal (or extremal) length. As we shall see, this is how the motion of a
particle of constant energy E in a Euclidean space-time can be transformed
into the free motion of the same particle in a curved space, which is equivalent
to the Maupertuis principle.
This will allow us to understand the reasoning of Einstein when he con-
structed general relativity and some consequences of this theory. We will dis-
play three historical examples: The variation of the beat of a clock due to the
gravitational field, the corrections to Newton's celestial mechanics, and the
deviation of light rays by a gravitational field. These examples are histori-
cal. They are also very important in present-day astrophysics and cosmology.
The deviation of light by a gravitational field plays an important role via the
gravitational lensing effect that it induces. One application is the search for
a baryonic component in the "missing mass" of the universe. Another is that
the mass distribution in the universe, be it the visible mass or the missing
mass, can act as a natural telescope that can enable us to see faraway objects
and therefore much younger objects. Through this natural cosmic telescope
(or microscope), the universe appears as an endless gallery of gravitational
mirages.
14 Roland Eötvös, "Über die Anziehung der Erde auf verschiedene Substanzen,"
Math. nat. Ber. Ungarn, 8, 65 (1890).
Finally, Chapter 7 is devoted to Feynman's variational formulation of
quantum mechanics. Richard P. Feynman was probably the greatest theo-
retical physicist of the second half of the 20th century. In his thesis work in
1942 at Princeton, Feynman attempted to solve the problem of the self-mass
of the electron, which is infinite in second-order perturbation theory in quan-
tum electrodynamics. Feynman discovered a "least action principle," which
enabled him to solve the problem by using both retarded and advanced po-
tentials. In order to do this, he introduced the mathematical concept of path
integrals, which has been a field of extensive interest since then. The first tri-
umph of this method came when it led to the correct calculation of the Lamb
shift in the hydrogen atom without introducing any arbitrary cutoff parame-
ters. The infinities were dealt with in a systematic and well-defined manner in
terms of basic physical parameters. Since then, the renormalization group has
acquired a depth that places it at the forefront of present theoretical physics.
It was only a few years later that Feynman understood that he could apply
his ideas to a variational formulation of nonrelativistic quantum mechanics. In
an article published in 1948,¹⁵ followed a few years later by the book Quantum
Mechanics and Path Integrals by Feynman and Hibbs [20], which corresponds
to the course Feynman gave on quantum mechanics at Caltech for a few years,
one can find the essence and the beauty of his ideas and results.
The two pillars of this approach are the following. First, Feynman is not
interested in states of a system but rather in amplitudes of processes. This is a
more realistic attitude in the sense that any phenomenon, any measurement,
consists in a process. Second, Feynman addresses the problem of quantum
mechanics in space-time.
Feynman's approach relies on the superposition principle. To any physical
process there correspond a number of complex amplitudes that add up. The
probability of observing an event is the modulus squared of the sum of ampli-
tudes that can lead to that event. The Feynman principle consists in assuming
that the phase of the amplitude for a given process is given by the classical
action along the path under consideration divided by Planck's constant $\hbar$.
The sum of all amplitudes that contribute to the process under consideration
is a mathematically complicated object called a path integral.
Feynman shows how one recovers the Einstein and de Broglie relations,
together with the Schrödinger equation, observables, and all usual quantum
mechanics in this framework. If one considers systems and processes where
the classical action $S(b, a)$ is macroscopic (i.e., much larger than Planck's
constant $\hbar$), the contributions of paths that may seem very close to each
other classically but are such that the difference of the classical action along
these paths is much larger than $\hbar$ will be destructive with probability one.
The total contribution of the sum of such paths will therefore vanish in the
global action.
15 R.P. Feynman, "Space-Time Approach to Non-Relativistic Quantum Mechanics,"
Rev. Mod. Phys., 20, 367 (1948).
However, in the vicinity of the classical trajectory $x_{cl}(t)$, the action
$S_{cl}(b, a)$ is stationary. Therefore, the only paths that contribute appreciably
are those for which the action $S(b, a)$ is sufficiently close to the classical
action $S_{cl}(b, a)$, the difference being small compared with $\hbar$. In other words,
under these considerations, it is only an infinitesimal vicinity of the classical
trajectory, impossible to resolve experimentally, that occurs. The "probability"
of the classical trajectory is therefore equal to one. In this way, classical
mechanics appears as the limit of quantum mechanics for macroscopic values
of the action. In addition, as we shall see, the amplitude satisfies identically
a modern version of the Huygens-Fresnel principle in optics.
Therefore, Feynman's principle contains an amazing unifying esthetics af-
ter the five previous chapters of this book. It consists in taking into account,
in the calculation of an amplitude, the "largest number" of possible paths con-
strained by the fact that paths that are too far apart interfere destructively.
One can also visualize this as the fact that an amplitude increases when the
"volume" of the space of alternative paths that contribute in a coherent man-
ner is larger. From that point of view, the phase of an amplitude acquires a
physical role and an essence that is perhaps not fully appreciated.
2
Variational Principles
Nature always acts by the shortest paths.

Pierre de Fermat
The remarkable aspects of variational principles are twofold. First, they re-
veal that natural structures and processes result from principles of optimal
conditions. Second, they are universal. All physical laws can be expressed in
such a global form. This form leads to the local expression of physical laws,
but it is richer and more powerful. In particular, it reveals the fundamental
principles that govern physical laws.
Variational principles possess the common feature of presenting natural
phenomena as a result of optimization under constraints. The founding idea
in modern physics and its first formalization are due to Fermat and the prin-
ciple he proposed in geometrical optics. This was followed by the variational
calculus developed by Euler and Lagrange in the 18th century.
In this chapter, we review a number of examples and introduce the neces-
sary mathematical tools. In Section 2.1, we turn back to the Fermat principle,
in particular Fermat's proof of the laws of refraction. Fermat did not know
the velocity of light and the existence of an index of refraction. He assumed
that the time it takes light to travel a certain distance in a medium is propor-
tional to the "resistance" of that medium to the propagation of light. Fermat
stated his "least time principle" at the end of 1661. He called it the "prin-
ciple of natural economy." We know that this principle explains curved light
rays and mirages, which the Snell-Descartes laws cannot account for. This
will directly lead us to the central underlying mathematical foundation of the
problem under consideration: the variational calculus of Euler and Lagrange.
It is an amazing chapter of mathematics, both in its unifying aspect and in
the number of problems that it allows one to solve. Deliberately, we shall not
go into any mathematical details. Such details can be found in the literature,
and we shall focus on physical applications and results.
In Section 2.2, we will give a series of examples. First is the "least action
principle," as first stated by Maupertuis for mechanics in 1744. This was a
landmark in the evolution of ideas in physics as well as philosophy at the
time. Then we shall display simple, but more original, applications such as
Kirchhoff's laws in electricity or Poisson's equation in electrostatics.
In Section 2.3, we consider a physical problem that is very similar in its
spirit but is fascinating in the number and importance of its consequences
compared with the simplicity of the starting point. This concerns the foun-
dations of statistical thermodynamics. We shall introduce the technique of
Lagrange multipliers and the principle of equiprobability of configurations.
From this, a very simple definition of the notion of temperature will emerge,
together with its first physical property: The temperatures of two systems in
thermal contact at equilibrium are equal. Then, we will obtain the statistical
and absolute definition of entropy, due to Boltzmann. This will lead us to the
simple but striking principle that thermodynamic equilibrium corresponds
to a situation where the entropy is maximum, given the constraints imposed
on the system; in other words, a situation where disorder is maximum, given
the constraints.
2.1 The Fermat Principle and Variational Calculus

2.1.1 Least Time Principle
As we have already mentioned, everything started with a quarrel between

Descartes and Fermat in 1637 on the notion of proof after the publication of
"Dioptrique" in Descartes's Discours de la Methode.
The Snell-Descartes laws predict which path a given initial light ray will
follow. Fermat takes a more general point of view. He wants to determine
what path a light ray actually follows when it goes from A to B. We know
that this point of view allows us to explain curved light rays and mirages,
whereas the Snell-Descartes laws are unable to do so. Fermat understands, as
did Hero of Alexandria, that the law of reflection is a geometric property of
the optical length of the light rays. The proof is sketched in Figure 2.1.
Consider an emitter A and an observer B. We assume that light emitted by
A is reflected by a plane mirror before it reaches B. Let B' be the point symmetric to
B with respect to the plane of the mirror, and O the intersection of the mirror
and the straight line AB' (Figure 2.1). The length of AOB' is the same as that
of AOB. The shortest distance between A and B' is quite obviously a straight
line. A path AFB where F ≠ O is such that by the triangle inequality (or by
the definition of a straight line) AF + FB' > AB', whatever F. Elementary
geometry then shows that the angles of incidence $i$ and reflection $r$ are equal
for the path AOB.
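The geometric argument can be checked numerically. The following is a minimal sketch, under the assumption that the mirror lies along the line $y = 0$, with A at height `a` above the origin and B at height `b` above the point at horizontal distance `span`; these names and the numerical values are illustrative and do not come from the text. It searches for the reflection point F that minimizes AF + FB and compares the two angles measured from the normal.

```python
import math


def path_length(x, a, b, span):
    """Length AF + FB for a mirror along y = 0, with F = (x, 0)."""
    return math.hypot(x, a) + math.hypot(span - x, b)


def shortest_reflection_point(a, b, span, tol=1e-9):
    """Ternary search for the x that minimizes the (convex) path length."""
    lo, hi = 0.0, span
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if path_length(m1, a, b, span) < path_length(m2, a, b, span):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


a, b, span = 1.0, 2.0, 3.0               # illustrative heights and separation
x_min = shortest_reflection_point(a, b, span)
angle_i = math.atan2(x_min, a)           # angle of incidence, from the normal
angle_r = math.atan2(span - x_min, b)    # angle of reflection, from the normal
print(x_min, angle_i, angle_r)
```

The minimizing point coincides, within the search tolerance, with the intersection O of the mirror and the line AB', and the two angles agree.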
Fig. 2.1. Possible light rays between an emitter A and an observer B when there
is a reflection on a plane. Since B' is symmetric to B with respect to the plane of
the mirror, the length of AOB' is equal to that of AOB. The shortest path between
A and B' is a straight line. A path AFB is longer whenever F ≠ O.
Refraction
Concerning the laws of refraction, Descartes had assumed that the velocity
of light in matter (a dense medium) is greater than in a vacuum (or in a
diluted medium).¹ That fact, together with the lack of rigor of Descartes's
"proof," had made Fermat angry. He was convinced that things could be
done properly. As he said, "It seems to me that some geometry can help us
solve this problem."
Fermat solved the problem of refraction only much later, in 1661, annoyed
by the critiques of Descartes's supporters. The key point of his proof lies in
the assumption that the velocity of light is, on the contrary, smaller in a dense
medium than in a dilute one.
Let $(x, y)$ be the plane separating the two media of indices $n_1$ and $n_2$.²
The source is at point A and the observer is at point B, as represented in
Figure 2.2.
Let H and H' be the projections of A and B on the $(x, y)$ plane. We denote
by $h$ the distance of A to the surface and $h'$ that of B. The distance HH' is
$l$. We consider a path AOB and we denote by $x$ the distance HO. We want
to minimize the optical path $n_1\,AO + n_2\,OB$.
By the Pythagorean theorem, we have
$$AO = \sqrt{h^2 + x^2}, \qquad OB = \sqrt{h'^2 + (l - x)^2}.$$
The time $T$ it takes light to follow this path is
$$T = (n_1\,AO + n_2\,OB)/c. \tag{2.1}$$

1 This idea probably comes from the fact that many interfaces under consideration
were horizontal liquid surfaces, perpendicular to the direction of gravity. Since
the refracted light ray appears to be closer to the vertical when it passes from air
to water, for instance, it seems intuitive that it "falls" more rapidly.
2 Fermat knew neither the velocity of light nor the index of refraction; he only
spoke about the "resistance" of a medium to the propagation of light. The time
it takes light to travel along a distance L in the medium is proportional to this
resistance, which seemed more sensible than the opposite.
Fig. 2.2. Possible light ray between an emitter A and an observer B when there is
refraction across a plane surface between two media of indices $n_1$ and $n_2$. H and H'
are the projections of A and B on the surface. $h$ is the distance between A and the
surface, and $h'$ that between B and the surface. The distance HH' is $l$.
Here, we give an analytic proof, at present simpler to understand than

the beautiful purely geometric argument of Fermat. (Fermat did not fully
know differential calculus, which was developed later by Newton and Leibniz,
although he had correct ideas on the subject.)
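We seek $x$ such that (2.1) is minimal. By taking the derivative of this
expression with respect to $x$, and writing that the derivative $dT/dx$ vanishes,
we obtain
$$\frac{n_1\,x}{\sqrt{h^2 + x^2}} = \frac{n_2\,(l - x)}{\sqrt{h'^2 + (l - x)^2}}. \tag{2.2}$$
We note that
$$\sin i_1 = \cos\theta_1 = \frac{x}{\sqrt{h^2 + x^2}} \quad\text{and}\quad \sin i_2 = \cos\theta_2 = \frac{l - x}{\sqrt{h'^2 + (l - x)^2}}, \tag{2.3}$$
where the angles $\theta_1$ and $\theta_2$ are indicated in the figure, and $i_1$ and $i_2$ are the
angles of incidence and refraction.
Therefore, we obtain the Snell-Descartes law
$$n_1 \sin i_1 = n_2 \sin i_2. \tag{2.4}$$
Furthermore, this extremum is indeed a minimum ($d^2T/dx^2 > 0$).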
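The minimization can also be carried out numerically. The sketch below is only an illustration: the function names, the unit choice $c = 1$, and the numerical values of the heights and indices are assumptions, not taken from the text. It locates the zero of $dT/dx$ by bisection and compares $n_1 \sin i_1$ with $n_2 \sin i_2$ at that point.

```python
import math


def travel_time(x, h, h_prime, l, n1, n2, c=1.0):
    """Travel time (2.1) along A -> O -> B, with x the distance HO."""
    ao = math.hypot(h, x)
    ob = math.hypot(h_prime, l - x)
    return (n1 * ao + n2 * ob) / c


def dT_dx(x, h, h_prime, l, n1, n2, c=1.0):
    """Derivative of the travel time; its zero is condition (2.2)."""
    return (n1 * x / math.hypot(h, x)
            - n2 * (l - x) / math.hypot(h_prime, l - x)) / c


def crossing_point(h, h_prime, l, n1, n2, tol=1e-12):
    """Bisection on dT/dx, which increases from negative to positive on [0, l]."""
    lo, hi = 0.0, l
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dT_dx(mid, h, h_prime, l, n1, n2) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


h, h_prime, l = 1.0, 1.0, 2.0     # illustrative geometry
n1, n2 = 1.0, 1.33                # roughly air and water
x_star = crossing_point(h, h_prime, l, n1, n2)
sin_i1 = x_star / math.hypot(h, x_star)
sin_i2 = (l - x_star) / math.hypot(h_prime, l - x_star)
print(n1 * sin_i1, n2 * sin_i2)
```

Bisection is used here because $d^2T/dx^2 > 0$ guarantees that $dT/dx$ has a single zero on the interval.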

This result fascinated Fermat: "The outcome of my work was the most
extraordinary, the most unexpected and the happiest that ever was. Indeed
... I found that my principle gave exactly and precisely the same proportion
of refractions as Mr. Descartes has found." At the end of 1661, Fermat wrote
his principle of least time, which was the first formulation of everything that
concerns us. Fermat called it the "principle of natural economy," and he added
the remark that "Nature always acts by the shortest paths." As we said, this
principle had a great impact in the 18th century. It was used by Maupertuis
in mechanics.
Rescuing a swimmer
This result can be transposed into many other situations. One example is the
optimal path that a rescuer must follow on a beach and in the water in order
to rescue a bather in difficulty. The velocities of the rescuer on the beach, $v_1$,
and in the water, $v_2$, are not the same. The optimal trajectory, which can be
sketched as in Figure 2.2, obeys the law³
$$\frac{\sin i_1}{v_1} = \frac{\sin i_2}{v_2}.$$
Curved Rays
Consider a two-dimensional problem (x, z) such as the propagation of light in

a fixed atmosphere whose density varies so that the index of refraction varies
continuously from one point to another. The situation is represented in Figure
2.3.
Fig. 2.3. Light ray between an emitter A and an observer B in a medium whose
index of refraction varies with the altitude z. The variable x is the horizontal dis-
tance. We assume that the problem is translation invariant in the perpendicular y
direction. The apparent direction of point A as seen by B is the tangent to the light
ray reaching B.
The light rays propagate along curved paths and not straight lines, and
the optical angular position of an object differs from its geometrical direction.
From the mathematical point of view, we need to find the path $z = Z(x)$
of a light ray propagating in a medium of index $n(z, x)$ (or simply $n(z)$ if the
system is translation invariant along the $x$ direction) and going from a point
A at $(z_0, x_0)$ to an observer B at $(z_1, x_1)$. The time $dT$ that it takes light to
go from $[x, z]$ to $[x + dx, z + dz]$ is
$$dT = \frac{n(z)\,dl}{c} = \frac{n(z)\sqrt{dz^2 + dx^2}}{c}.$$
By definition, along the path $Z(x)$ that we wish to determine, we have
$dz = (dZ(x)/dx)\,dx = \dot{Z}(x)\,dx$.
Therefore, we must find the function $Z(x)$ that minimizes the time along
the path; in other words, the integral
$$T = \frac{1}{c}\int_A^B n(z)\sqrt{1 + \dot{z}(x)^2}\,dx, \tag{2.5}$$
given the constraints $Z(x = x_0) = z_0$ and $Z(x = x_1) = z_1$.
This leads us to the mathematical core of this subject.
3 André Martin, private communication.
2.1.2 Variational Calculus of Euler and Lagrange
The problem under consideration consists in finding a function, or a family of

functions, that minimizes some integral. It is called the variational calculus.
This part of mathematics was developed by Euler, who understood how it
functions, and Lagrange, who made decisive contributions.
The variational calculus is an amazing chapter in mathematics. It bears
many unifying features and it gives the answers to a large number of ques-
tions. It is fully treated in the literature, and we do not wish to enter into
mathematical details in this book; all details can be found in the works listed
in reference [5].
The elementary problem is the following. Find the function $z(x)$ of a real
variable $x$ that minimizes (or maximizes) the integral
$$I = \int_A^B \mathcal{L}(z(x), \dot{z}(x), x)\,dx, \tag{2.6}$$
where the endpoints A and B are fixed, where $\dot{z}(x) = dz(x)/dx$, and where
$\mathcal{L}$ is a known function, called the Lagrange function. Needless to emphasize, it
is exactly the problem of equation (2.5).
Let us assume there exists a solution, which we denote as $z = Z(x)$. We
want any infinitesimal variation $\delta z(x)$ of $Z(x)$ to produce only
a second-order (or higher) variation of the integral $I$. In the transformation
$Z \to Z + \delta z(x)$, $\dot{Z} \to \dot{Z} + \delta\dot{z}(x)$, where it is assumed that the endpoints of
the integration do not change, $\delta z(A) = \delta z(B) = 0$, the variation $\delta I$ of the
integral is
$$\delta I = \int_A^B \left[\frac{\partial\mathcal{L}}{\partial z}\,\delta z(x) + \frac{\partial\mathcal{L}}{\partial\dot{z}}\,\delta\dot{z}(x)\right] dx.$$
The second term can be integrated by parts since by definition $\delta\dot{z} = (d/dx)\,\delta z$.
Since $\delta z(A) = \delta z(B) = 0$, the variation $\delta I$ is
$$\delta I = \int_A^B \left[\frac{\partial\mathcal{L}}{\partial z} - \frac{d}{dx}\left(\frac{\partial\mathcal{L}}{\partial\dot{z}}\right)\right] \delta z(x)\,dx. \tag{2.7}$$
We want this integral to vanish for any infinitesimal variation $\delta z$. The
integrand must vanish identically. Therefore, the solution $z = Z(x)$ must
satisfy the second-order differential equation
$$\frac{\partial\mathcal{L}}{\partial z} = \frac{d}{dx}\left(\frac{\partial\mathcal{L}}{\partial\dot{z}}\right), \tag{2.8}$$
called a Lagrange-Euler equation.

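This procedure can be extended to a function $\mathcal{L}(\{z_i(x), \dot{z}_i(x)\}, x)$ of several
variables $z_i(x)$, $i = 1, \ldots, N$. This leads to $N$ Lagrange-Euler equations
$$\frac{\partial\mathcal{L}}{\partial z_i} = \frac{d}{dx}\left(\frac{\partial\mathcal{L}}{\partial\dot{z}_i}\right). \tag{2.9}$$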
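The statement that a solution of the Lagrange-Euler equation changes the integral only at second order can be illustrated numerically. The sketch below assumes the simplest Lagrange function, $\mathcal{L} = \sqrt{1 + \dot{z}^2}$ (the length of a curve), whose extremal between $(0, 0)$ and $(1, 1)$ is the straight line $Z(x) = x$. The perturbation $\sin(\pi x)$, which vanishes at both endpoints, and the Simpson quadrature are illustrative choices that do not come from the text.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def functional(eps):
    """I[Z + eps*eta] for L = sqrt(1 + zdot^2), Z(x) = x, eta(x) = sin(pi*x)."""
    def integrand(x):
        zdot = 1.0 + eps * math.pi * math.cos(math.pi * x)
        return math.sqrt(1.0 + zdot * zdot)
    return simpson(integrand, 0.0, 1.0)


I0 = functional(0.0)              # length of the straight line
d1 = functional(0.01) - I0        # change for a perturbation of size eps
d2 = functional(0.02) - I0        # change for a perturbation of size 2*eps
ratio = d2 / d1                   # close to 4 when the change is second order
print(I0, d1, d2, ratio)
```

Doubling the size of the perturbation multiplies the change of the integral by about four rather than two, which is the signature of a vanishing first-order variation.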
2.1.3 Mirages and Curved Rays
Let us come back to the case of curved rays considered above. Consider the
integral (2.5) and let us assume for definiteness that the index of refraction
varies with the altitude as $n(z) = 1 + \nu z$, with $\nu$ constant.
(This formula is only valid for a finite range in $z$; we could use $n(z) = n_0 + \nu z$
for negative $\nu$.) We also assume that the endpoints are at the same height,
$z(x = 0) = h$ and $z(x = l) = h$. The Lagrange function is
$$\mathcal{L} = \frac{1}{c}(1 + \nu z)\sqrt{1 + \dot{z}(x)^2},$$
from which we deduce the Lagrange-Euler equation
$$\nu(1 + \dot{z}^2) = (1 + \nu z)\,\ddot{z}. \tag{2.10}$$
The solution of this equation is simple. We shift to the function $u = z + 1/\nu$
and insert this into (2.10). We obtain
$$1 + \dot{u}^2 = u\,\ddot{u}, \tag{2.11}$$
whose general solution is
$$u = d\cosh((x - b)/d), \tag{2.12}$$
where $b$ and $d$ are constants.
One way to obtain this result consists of taking the derivative of (2.11).
This yields $\dot{u}/u = \dddot{u}/\ddot{u}$, whose "solution" is $\ddot{u} = u/d^2$, where $d$ is an
arbitrary constant. The solution of this latter equation is $u = a\cosh((x - b)/d)$,
where, if we use (2.11), $a = d$.
One can obtain the result (2.12) in a more elegant fashion by using the
conservation laws associated with the Lagrange-Euler equations, as we shall
see in Chapter 3.
If we impose the boundary conditions (i.e., constraints) $z(0) = h$ and
$z(l) = h$, we obtain the result
$$z = d\cosh\big((x - l/2)/d\big) - 1/\nu \quad\text{with}\quad d\cosh(l/2d) - 1/\nu = h. \tag{2.13}$$
In this simplistic model, the trajectory of the curved ray is a cosh function
whose minimum (or maximum) altitude is attained at $x = l/2$ (by the symmetry
of the problem).
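The condition on $d$ in (2.13) is transcendental, so in practice it is solved numerically. The following is a minimal sketch under stated assumptions: the values of $h$, $l$, and $\nu$ are toy numbers chosen for illustration, and when the condition has two roots the code selects the larger one (the flatter ray), which is a choice made here and not discussed in the text. It then checks by finite differences that the resulting curve satisfies equation (2.10).

```python
import math


def solve_d(h, l, nu, tol=1e-12):
    """Larger root of d*cosh(l/(2d)) = h + 1/nu, found by bisection."""
    target = h + 1.0 / nu
    g = lambda d: d * math.cosh(l / (2.0 * d)) - target
    lo = 0.5 * l / 1.19967864   # d*cosh(l/(2d)) is minimal where u*tanh(u) = 1
    hi = target                 # g(target) > 0 because cosh > 1
    if g(lo) > 0.0:
        raise ValueError("no solution for these boundary conditions")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def z(x, d, l, nu):
    """Ray trajectory (2.13)."""
    return d * math.cosh((x - 0.5 * l) / d) - 1.0 / nu


h, l, nu = 1.0, 2.0, 0.5        # toy values, not realistic atmospheric numbers
d = solve_d(h, l, nu)


def residual(x, step=1e-4):
    """nu*(1 + zdot^2) - (1 + nu*z)*zddot, using central differences."""
    zm, z0, zp = (z(x + s, d, l, nu) for s in (-step, 0.0, step))
    zdot = (zp - zm) / (2.0 * step)
    zddot = (zp - 2.0 * z0 + zm) / step**2
    return nu * (1.0 + zdot**2) - (1.0 + nu * z0) * zddot


max_residual = max(abs(residual(0.1 * k)) for k in range(1, 20))
print(d, z(0.5 * l, d, l, nu), max_residual)
```

With $\nu > 0$ the ray dips below the common height $h$ of its endpoints, as expected for an index that increases with altitude.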
This situation is encountered in mirages. Perhaps the most common is
highway shimmer. Parts of a hot highway can appear as "lakes." This sort
of mirage is sketched in Figure 2.4. The index of refraction is smaller near
the highway, where the temperature is high and the air less dense, whereas it
increases with the altitude, where the temperature is lower. The "lake" is a
reflection of the sky. Such a case is called an inferior mirage. The apparent
image is below the actual direction of the object. This is depicted in Figure
2.4.
Fig. 2.4. Diagrams of an inferior mirage (left) and a superior mirage (right).
As one can understand from this simple example, a more complex variation
of the index of refraction n( z) will lead to a variety of phenomena. The reverse
happens if the index is smaller at high altitudes than at lower altitudes. This
type of situation happens when light rays propagate near a hot hill. These are
called superior mirages. One can then see an object that should be hidden
geometrically by the hill, such as the famous mirages in the desert.
At sunset, one can see the sun for quite a long time after it has gone below
the geometrical horizon. As shown in Figure 2.5, when the sun is close to
the horizon, light rays cross an atmosphere whose index of refraction varies
considerably with the altitude. At sunset, the angle between the apparent
direction of the sun and its actual direction is roughly half a degree. The sun
is far below the horizon (see [3] for other examples).
Mirages happen frequently in the Arctic and Antarctic. For a long time, the
line of sight crosses a large thickness of the atmosphere. Over that distance, the
density and chemical composition of the atmosphere can vary considerably.
This results in spectacular effects.
Fig. 2.5. Actual and apparent directions of the sun near the horizon. They differ
by $\sim 0.5$ degrees. [Figure labels: apparent direction of sun, path of light, Sun, Earth.]
Figure 2.6 is a picture taken during a German expedition led by the ship
Germania in the Arctic in 1888. It is particularly rich, since for both ships
there are two superior mirages, inverted with respect to one another. Between
the ships, one can see an iceberg. This picture is reminiscent of the legend
of the Flying Dutchman⁴ at the Cape of Good Hope (in the Southern Hemisphere).
Fig. 2.6. Double superior mirages observed by sailors of the Germania expedition
in the Arctic in 1888. (Courtesy of Roger Lapthorn.)
Figure 2.7 shows two mirages: The superior mirage of an iceberg in the
Arctic and a remarkable double sunset mirage, where the lower, inferior, image
of the sun in the forefront is caused by the strong density variations inside a
layer of clouds visible on the picture (see http://www.atoptics.co.uk/).
The variations of the index of refraction of the atmosphere generate a series
of effects, in particular lensing effects. It is possible to observe islands, ships,
and coasts that are several hundred kilometers away. The variation of the index
allows one to see and take pictures of the famous "Green ray" at sunset (see
Pekka Parviainen at http://virtual.finland.fi/finfo/english/mirage2.html).
4 The "Flying Dutchman" was a famous sailor. He claimed he could sail around
the Cape whatever the weather conditions. Years after he disappeared in a huge
storm, many sailors claimed they had seen his ship, in particular in the sky, which
was proof that storms were unable to beat him.
Fig. 2.7. Above: superior mirage of icebergs in the Arctic. (Courtesy of Pekka
Parviainen.) Below: remarkable double sunset mirage observed at Paranal Observatory
in the Atacama Desert, Chile, by Luc Arnold in 2002 at the site of the European
Southern Observatory Very Large Telescope. (Courtesy of Luc Arnold.)
2.2 Examples of the Principle of Natural Economy
2.2.1 Maupertuis Principle
In 1744, Maupertuis stated for the first time his principle of the least quantity
of action in mechanics. Even though the initial version and justification of
Maupertuis are confused, it is a historical landmark in the evolution of ideas,
both in physics and, at the time, in philosophy.
Consider a particle of mass $m$ and velocity $v$. The action of Maupertuis
is the product of three terms: the mass, the velocity, and the distance covered.
Actually, it is the integral of the linear momentum projected along the
trajectory: $A = \int m v\,dl$.
The correct formulation and the proof of Maupertuis's principle were given
a little later by Euler. In present terminology, consider a point-like particle
of mass $m$ in a potential $V(\mathbf{r})$. We denote by $\mathbf{v}$ the velocity and $v$ its norm.
Assuming (this is essential) that the energy $E$ is a constant of the motion, we
have
$$E = \frac{1}{2} m v^2 + V(\mathbf{r}).$$
The action of Maupertuis is
$$A_{a,b} = \int_a^b m v\,dl \equiv \int_a^b \sqrt{2m(E - V)}\,dl, \tag{2.14}$$
where $dl$ is the length element along the trajectory. The principle of Maupertuis
is that the physical trajectory that the particle follows to go from $a$ to $b$
with a fixed energy $E$ is the path that makes (2.14) minimum.
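There are many proofs. We parameterize the state variables $\{\mathbf{r}, \dot{\mathbf{r}}\}$, where
$\mathbf{r} = (x, y, z)$, by the time $t$ on the physical trajectory (i.e., we work with
$\{\mathbf{r}(t), \dot{\mathbf{r}}(t)\}$). The times of departure $t_a$ and arrival $t_b$ are therefore well defined.
We have $dl = v\,dt = \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\,dt$, and the action (2.14) is
$$A_{a,b} = \int_{t_a}^{t_b} \sqrt{2m(E - V(\mathbf{r}))}\,\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\,dt. \tag{2.15}$$
We want this quantity to be stationary under infinitesimal variations
$\mathbf{r} \to \mathbf{r} + \delta\mathbf{r}$, $\dot{\mathbf{r}} \to \dot{\mathbf{r}} + \delta\dot{\mathbf{r}}$.
The Lagrange-Euler equations (2.9) lead to
$$\begin{aligned}
-m v\,\frac{\partial V}{\partial x}\,\frac{1}{\sqrt{2m(E - V(\mathbf{r}))}} &= \frac{d}{dt}\left(\frac{\dot{x}}{v}\sqrt{2m(E - V(\mathbf{r}))}\right),\\
-m v\,\frac{\partial V}{\partial y}\,\frac{1}{\sqrt{2m(E - V(\mathbf{r}))}} &= \frac{d}{dt}\left(\frac{\dot{y}}{v}\sqrt{2m(E - V(\mathbf{r}))}\right),\\
-m v\,\frac{\partial V}{\partial z}\,\frac{1}{\sqrt{2m(E - V(\mathbf{r}))}} &= \frac{d}{dt}\left(\frac{\dot{z}}{v}\sqrt{2m(E - V(\mathbf{r}))}\right).
\end{aligned} \tag{2.16}$$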
By definition, along the trajectory we are looking for, we have
$$\sqrt{2m(E - V(\mathbf{r}))} = m v.$$
Therefore, equations (2.16) boil down to
$$-\nabla V = m\,\frac{d\mathbf{v}}{dt} \quad\text{QED.} \tag{2.17}$$
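The stationarity of the Maupertuis action can be checked on a simple case. The sketch below is an illustration under several assumptions that are not in the text: a projectile in a uniform gravitational field, $V = m g z$, a path parameterized by the horizontal coordinate $x$ instead of the time, and a trial deformation $\varepsilon \sin(\pi x / X)$ that vanishes at both endpoints. It compares the odd and even parts of the change of the action (2.14) when the physical parabola is deformed.

```python
import math

# Illustrative values (assumptions, not from the text).
M, G = 1.0, 9.81
V0, THETA = 10.0, math.radians(30.0)   # launch speed and angle at point a
X_END = 6.0                            # horizontal position of point b
E = 0.5 * M * V0**2                    # conserved energy, with V = M*G*z and z = 0 at a
VX = V0 * math.cos(THETA)              # constant horizontal velocity


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def action(eps):
    """Action (2.14) along the physical parabola shifted by eps*sin(pi*x/X_END)."""
    def integrand(x):
        z = (x * math.tan(THETA) - G * x**2 / (2.0 * VX**2)
             + eps * math.sin(math.pi * x / X_END))
        dz = (math.tan(THETA) - G * x / VX**2
              + eps * math.pi / X_END * math.cos(math.pi * x / X_END))
        p = math.sqrt(2.0 * M * (E - M * G * z))   # sqrt(2m(E - V)), i.e. m*v
        return p * math.sqrt(1.0 + dz * dz)        # p dl with dl = sqrt(1 + z'^2) dx
    return simpson(integrand, 0.0, X_END)


eps = 1e-2
a0, a_plus, a_minus = action(0.0), action(eps), action(-eps)
odd_part = 0.5 * (a_plus - a_minus)          # first-order (and higher odd) response
even_part = 0.5 * (a_plus + a_minus) - a0    # second-order response
print(a0, odd_part, even_part)
```

For this short arc the odd part is much smaller than the even part, and the even part is positive: the physical trajectory is a stationary point of (2.14), here a minimum with respect to this deformation.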
2.2.2 Shape of a Massive String
Consider a massive string of constant linear mass density $\mu$ and length $L$ whose
endpoints are fixed at A $(x = 0, z = z_0)$ and B $(x = a, z = z_1)$. The string
lies in the vertical plane $(x, z)$, and it is in the gravitational field, oriented
along the vertical $z$ axis. We want to determine the shape of the string at
equilibrium. (Of course, we assume that $(z_1 - z_0)^2 + a^2 \le L^2$.)
Equilibrium corresponds to the configuration where the gravitational potential
energy of the string is minimal. Consider an arbitrary shape of the
string $z(x)$. An element of the string in the interval $[x, x + dx]$ has a length $dl$ given by
$dl^2 = dx^2 + dz^2 = (1 + \dot{z}(x)^2)\,dx^2$, and its potential energy is $dV = \mu g z\,dl$ ($g$
is the acceleration of gravity). We must therefore minimize the integral
$$V = \mu g\int_0^a z(x)\sqrt{1 + \dot{z}(x)^2}\,dx. \tag{2.18}$$
The Lagrange-Euler equation yields
$$1 + \dot{z}^2 = z\,\ddot{z}, \tag{2.19}$$
which we have already considered in (2.11). (This equation is frequently encountered
in this kind of problem because it is one of the few cases where an
analytic solution is available.) As we already know, the solution is
$$z = c\cosh((x - x_0)/c),$$
where the parameters $c$ and $x_0$ are determined by the constraints $z(0) = z_0$,
$z(a) = z_1$, and the length of the string $L = \int_0^a \sqrt{1 + \dot{z}(x)^2}\,dx$.
The minimum is located in the interval $x \in [0, a]$ according to the relative
positions of the endpoints.
In Problem 2.2 one can see that by using the technique of Lagrange mul-
tipliers, which we will define in Section 2.3.3, the problem can be cast as a
translation-invariant problem along the z axis since the length L is an intrinsic
quantity of the string.
2.2.3 Kirchhoff's Laws
We want to determine the relative intensities $I_1$ and $I_2$ of the electric current in
the two legs of the simple electric circuit shown in Figure 2.8, whose resistances
are $R_1$ and $R_2$. The incoming current has an intensity $I$. The well-known result
is easily obtained with the Ohm-Kirchhoff laws.
Fig. 2.8. Simple element of an electrical network with one bifurcation.
The variational principle here consists in imposing that the energy losses
by Joule heating are as small as possible. In other words, we want to find the
minimum value of
$$W = R_1 I_1^2 + R_2 I_2^2 \quad\text{with the constraint}\quad I_1 + I_2 = I.$$
We find the zero of the derivative of $W = R_1 I_1^2 + R_2 (I - I_1)^2$ with respect
to $I_1$, and this results in $R_1 I_1 = R_2 I_2$, which is of course the same as if we
impose that the potential difference $V$ between the two nodes is given. Notice
that we do not need the notion of electric potential. We have replaced the
local notion of potential difference by a global energetic condition and a very
simple principle.
Considering an arbitrary circuit, the principle is that the global heating
loss $\sum_k R_k I_k^2$ is minimal. Of course, one recovers the Kirchhoff laws. For a
relatively simple network, the two approaches are equivalent. In practice, they
may be very different if we consider a large network of electricity transportation,
with, for instance, 10 million elements. Inverting a $10^7 \times 10^7$ matrix in
real time is not realistic, whereas mathematical optimization procedures are
extremely efficient and easy to handle.
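For the two-branch circuit, the minimization of the Joule losses takes only a few lines. The sketch below is illustrative: the resistance and current values are arbitrary, and the bisection on the derivative is just one simple way to locate the minimum. It recovers the current divider $R_1 I_1 = R_2 I_2$ without any reference to the potential.

```python
def joule_loss(i1, total, r1, r2):
    """W = R1*I1^2 + R2*I2^2 with I2 = I - I1 imposed by the constraint."""
    return r1 * i1**2 + r2 * (total - i1)**2


def minimizing_current(total, r1, r2, tol=1e-12):
    """Bisection on dW/dI1 = 2*R1*I1 - 2*R2*(I - I1), which increases with I1."""
    lo, hi = 0.0, total
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if r1 * mid - r2 * (total - mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


r1, r2, total = 2.0, 3.0, 5.0     # illustrative values (assumptions)
i1 = minimizing_current(total, r1, r2)
i2 = total - i1
print(i1, i2, r1 * i1, r2 * i2)
```

The two products $R_1 I_1$ and $R_2 I_2$ printed at the end coincide, which is the usual statement that both branches see the same potential difference.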
2.2.4 Electrostatic Potential
Consider now a slightly more complicated problem. We want to determine the
electrostatic potential $\Phi(\mathbf{r})$ created by a given distribution of charges $\rho(\mathbf{r})$. We
know that the answer is Poisson's law,
$$\Delta\Phi = -\frac{\rho}{\varepsilon_0}. \tag{2.20}$$
This result can be obtained by the following variational principle (which is
a particular case of a more general principle concerning Maxwell's equations,
as we shall see in Chapter 5). The electrostatic field is expressed in terms of
the potential by $\mathbf{E} = -\nabla\Phi$, and the field energy is $E_E = (\varepsilon_0/2)\int \mathbf{E}^2\,d^3r$. The
electrostatic potential energy of a charge distribution $\rho(\mathbf{r})$ in the potential
$\Phi(\mathbf{r})$ is $E_p = \int \rho(\mathbf{r})\,\Phi(\mathbf{r})\,d^3r$.
The variational principle here is that the physical potential $\Phi(\mathbf{r})$ minimizes
the difference between these two energies (or maximizes it if one takes the
opposite expression). Consider the integral
$$U[\Phi] = \frac{\varepsilon_0}{2}\int (\nabla\Phi)^2\,d^3r - \int \rho(\mathbf{r})\,\Phi(\mathbf{r})\,d^3r. \tag{2.21}$$
The problem under consideration is to find the potential $\Phi(\mathbf{r})$ that minimizes
this expression.
We remark on the following points:
1. As usual, we assume there are no charges at infinity, so that $\Phi$ can be
chosen to vanish at infinity. The integrals run over all three-dimensional
space.
2. Since the first term is positive, if there exists a minimum of this expression
for a function $\Phi(\mathbf{r})$, this minimum corresponds to an equilibrium situation.
In this respect, it is similar to the case of the massive string in Section
2.2.2. There is an equilibrium between two contributions to the total energy
that compete with one another. Any excess of one form of energy
corresponds to an unstable situation.
3. In comparison with the mirage (2.5) or the massive string (2.18), it is the
potential $\Phi$ and its gradient $\nabla\Phi$ that play the role of the previous single
variable $z$ and its derivative $\dot{z}$. The variable $x$ of the previous simple
examples is now a point $\mathbf{r}$ of three-dimensional space (i.e., $\mathbf{r} \in \mathbb{R}^3$).
Let $\Phi$ be the solution and $\eta(\mathbf{r})$ an infinitesimal variation of this potential.
In the variation $\Phi \to \Phi + \eta$, we have, to first order,
$$(\nabla\Phi)^2 \to (\nabla\Phi)^2 + 2\,\nabla\Phi\cdot\nabla\eta.$$
Therefore, the variation of (2.21) is
$$\delta U = \varepsilon_0\int \nabla\Phi\cdot\nabla\eta\,d^3r - \int \rho\,\eta\,d^3r. \tag{2.22}$$
Integrating the first term by parts, and taking into account the fact that $\Phi$
vanishes at infinity, we obtain
$$\int \nabla\Phi\cdot\nabla\eta\,d^3r = -\int \eta\,\Delta\Phi\,d^3r.$$
Therefore,
$$\delta U = -\int \left(\varepsilon_0\,\Delta\Phi + \rho\right)\eta\,d^3r. \tag{2.23}$$
The fact that $\delta U = 0$ for any infinitesimal $\eta(\mathbf{r})$ yields the Poisson equation (2.20),
$$\Delta\Phi = -\frac{\rho}{\varepsilon_0}.$$
A particular case is when the charge density vanishes. By that, we mean
that there are a certain number of charged conductors, each of which is at
a given potential $V_1, V_2, \ldots, V_n$. There is a surface charge density, but the
volume density $\rho$ vanishes everywhere. Let $\Sigma_1, \Sigma_2, \ldots, \Sigma_n$ be the surfaces of
the conductors. Then equation (2.23) boils down to
$$\Delta\Phi = 0,$$
with the $n$ constraints $\Phi = V_i$ on $\Sigma_i$.
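The link between minimizing the energy functional and solving the Poisson equation can be seen on a discretized toy model. The sketch below makes several simplifying assumptions that are not in the text: a one-dimensional interval $[0, 1]$ instead of all space, a potential held at zero at both ends, a uniform charge density, and units in which $\varepsilon_0 = 1$. Each update minimizes the discrete energy with respect to a single grid value, and the result is compared with the exact solution of $\Phi'' = -\rho/\varepsilon_0$.

```python
EPS0 = 1.0            # units chosen so that epsilon_0 = 1 (assumption)
RHO = 2.0             # uniform charge density (assumption)
N = 20                # number of interior grid points
H = 1.0 / (N + 1)     # grid spacing on [0, 1]


def energy(phi):
    """Discrete analogue of the energy functional, with phi = 0 at both ends."""
    full = [0.0] + list(phi) + [0.0]
    field = sum((full[k + 1] - full[k]) ** 2 / H for k in range(N + 1))
    source = sum(RHO * p * H for p in phi)
    return 0.5 * EPS0 * field - source


def relax(sweeps=5000):
    """Coordinate descent: each update minimizes the energy in one phi[k]."""
    phi = [0.0] * N
    for _ in range(sweeps):
        for k in range(N):
            left = phi[k - 1] if k > 0 else 0.0
            right = phi[k + 1] if k < N - 1 else 0.0
            phi[k] = 0.5 * (left + right) + H * H * RHO / (2.0 * EPS0)
    return phi


phi = relax()
# Exact solution of phi'' = -RHO/EPS0 with phi(0) = phi(1) = 0.
exact = [RHO * x * (1.0 - x) / (2.0 * EPS0)
         for x in ((k + 1) * H for k in range(N))]
max_error = max(abs(p - e) for p, e in zip(phi, exact))
print(max_error, energy(phi))
```

Setting the derivative of the discrete energy with respect to one grid value to zero gives exactly the finite-difference form of the Poisson equation, which is why the energy minimizer and the solution of the differential equation coincide.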
2.2.5 Soap Bubbles
The potential energy of a soap bubble of total area $A$ is $V = \sigma A$, where $\sigma$ is
the surface tension constant of the soap. We consider a soap bubble between
two circles of the same axis and same radius $R$, as in Figure 2.9. The $z$ axis
is the common axis perpendicular to the two circles, which are centered at
$z = -h$ and $z = h$, respectively. The problem consists in finding the surface of
minimum area attached to the two circles, which are separated by a distance
$d = 2h$.
Fig. 2.9. Soap bubble between two symmetric circles.
Consider the interval $\{z, z + dz\}$ and let $r(z)$ be the radius of a transverse
section of the surface. We want to minimize the energy, i.e., the area
$$A = \int_{-h}^{h} 2\pi r\sqrt{1 + \dot r^2}\,dz$$
with the boundary conditions (constraints) $r(-h) = r(h) = R$. The problem
is strictly the same as in the case of the string (2.18). The solution is
$$r = a\cosh(z/a), \quad\text{with}\quad R = a\cosh(h/a).$$
This surface, which is rotation invariant around the z axis, bears the sweet
name of a catenoid.
One can attempt to determine shapes of bubbles attached to more com-
plicated structures. (Needless to emphasize again, the present problem has an
analytic solution.)
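The condition $R = a\cosh(h/a)$ fixes the parameter $a$ only implicitly. The following is a minimal numerical sketch, not part of the original text: the function names `bisect` and `catenoid_neck_radius` are hypothetical, and the choice of the wider-neck root (the one usually identified with the stable film) is an assumption of this example. It also illustrates that no solution exists when the rings are too far apart.

```python
import math

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign between lo and hi."""
    f_lo = func(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # same sign as at lo: root is above mid
        else:
            hi = mid                # sign change: root is below mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def catenoid_neck_radius(R, h):
    """Return a such that R = a*cosh(h/a), or None if no catenoid exists."""
    # a*cosh(h/a) is smallest where u*tanh(u) = 1, with u = h/a.
    u_star = bisect(lambda u: u * math.tanh(u) - 1.0, 1.0, 2.0)
    a_min = h / u_star
    if a_min * math.cosh(u_star) > R:
        return None                 # rings too far apart for this radius
    # Of the two roots, keep the one with the larger a (wider neck).
    return bisect(lambda a: a * math.cosh(h / a) - R, a_min, R)

a = catenoid_neck_radius(R=1.0, h=0.5)
print("a =", a, " check R =", a * math.cosh(0.5 / a))
print("h = 0.7:", catenoid_neck_radius(R=1.0, h=0.7))
```

The parameter $a$ is also the radius of the neck at $z = 0$, since $r(0) = a$.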
2.3 Thermodynamic Equilibrium: Principle of Maximal Disorder
2.3.1 Principle of Equal Probability of States
Let us turn to a case that is similar in its motivation but that has fascinating
consequences compared with the simplicity of the starting point.
As Schrödinger wrote [6], there is, basically, only one problem in statistical
thermodynamics: the distribution of a given amount of energy $E$ over $N$
identical systems.
We only consider here classical statistical thermodynamics. Quantum
statistics is outside the scope of this book. The only "quantum" feature lies
in the fact that we assume there are discrete energy levels.
We consider an isolated assembly of $N$ identical systems $\{S_1, S_2, \ldots, S_N\}$,
each of which can occupy one of the energy levels $\varepsilon_k$ (for instance, the energy
levels in a box where we place the atoms of a monatomic gas).
We assume that the pairwise interactions of these systems are weak in the
sense that they do not affect their energy levels. The energy of the assembly
is therefore the sum of the energies of the N systems.
Let us call the state or configuration of the assembly the fact that
system $S_1$ has energy $e_1$,
system $S_2$ has energy $e_2$,
system $S_3$ has energy $e_3$, etc.
The $e_i$ belong to the set $\{\varepsilon_k\}$ and, of course, the sum $\sum_i e_i$ is equal to the (given)
total energy $E$.
We call distribution of the $N$ systems the fact that
$n_1$ systems are in the energy level $\varepsilon_1$,
$n_2$ systems are in the energy level $\varepsilon_2$,
$n_3$ systems are in the energy level $\varepsilon_3$,
etc.
with the conditions or constraints
$$\sum_i n_i = N, \qquad \sum_i n_i\varepsilon_i = E.$$
Of course, to a given distribution there correspond several states or
configurations. Their number $W$ is
$$W = \frac{N!}{n_1!\,n_2!\cdots n_k!\cdots}. \qquad (2.24)$$
The fundamental premise of statistical physics is extremely simple. It is
called the principle of equal probability of states or configurations.
All states, or configurations, of an isolated assembly of systems in weak
mutual interaction with total energy $E$ are equally probable.
In other words, if it were possible to take pictures, at given times, of the
assembly, or its state, one would observe that the probability of finding the
assembly in any one of its possible states is the same.
The consequence of this assumption is that the probability of finding
the assembly in a given distribution $\{n_1, n_2, \ldots, n_k, \ldots\}$ is proportional to $W$
(2.24).
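The distinction between states and distributions can be checked by brute force on a tiny assembly. The sketch below is illustrative only: the three equally spaced levels and the values $N = 4$, $E = 4$ are arbitrary assumptions, and the helper names are hypothetical. It enumerates every configuration with the given total energy, groups the configurations by distribution, and compares the count with the multinomial number $N!/\prod_i n_i!$.

```python
from collections import Counter
from itertools import product
from math import factorial

def count_states(levels, N, E):
    """Group all configurations of N systems with total energy E by distribution."""
    counts = Counter()
    for config in product(range(len(levels)), repeat=N):
        if sum(levels[k] for k in config) == E:
            # distribution = how many systems sit in each level
            dist = tuple(config.count(k) for k in range(len(levels)))
            counts[dist] += 1
    return counts

def multinomial(dist):
    """Number of states W = N! / (n_1! n_2! ...) for a distribution."""
    w = factorial(sum(dist))
    for n in dist:
        w //= factorial(n)
    return w

levels = [0, 1, 2]                      # assumed energy levels (arbitrary units)
counts = count_states(levels, N=4, E=4)
for dist, w in sorted(counts.items()):
    print(dist, "enumerated:", w, "formula:", multinomial(dist))
```

Because all states are equally probable, the ratio of two entries of `counts` is the ratio of the probabilities of the corresponding distributions.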
2.3.2 Most Probable Distribution and Equilibrium
In a macroscopic assembly, the number of systems is extremely large. Among
all the possible distributions $\{n_i\}$, there is one in the vicinity of which the
number W is maximum. Furthermore, and this can be proven, W has a sharp
maximum in the vicinity of that distribution. This particular vicinity is much
more probable than any other distribution. In other words, if one were to
inspect the state of the assembly, one would most of the time find a state in
the vicinity of the most probable distribution.
This distribution (more correctly, this vicinity) corresponds to the ther-
modynamic equilibrium of the assembly.
We therefore want to determine the distribution that maximizes $W$. Actually,
$W$ is a very large number. It is convenient to maximize its logarithm
rather than $W$ itself.
Since the numbers $\{n_i\}$ are very large, we can use Stirling's formula
$N! \sim N^N e^{-N} (2\pi N)^{1/2}$ (where the last factor doesn't play any significant role),
which leads to
$$\ln W = N\ln N - N - \sum_i n_i\ln n_i + \sum_i n_i = \sum_i n_i\ln(N/n_i) = -N\sum_i p_i\ln(p_i), \qquad (2.25)$$
where we have introduced the probabilities $p_i = n_i/N$.
We want to find the distribution $\{n_i\}$ that maximizes this expression under
the constraints
$$\sum_i n_i = N, \qquad \sum_i n_i\varepsilon_i = E. \qquad (2.26)$$
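How good the Stirling form (2.25) is can be checked numerically. The sketch below is an illustration under assumed occupation numbers (a fixed 50/30/20 split scaled up); it compares the exact $\ln W$, computed with the log-gamma function, with $-N\sum_i p_i\ln p_i$.

```python
import math

def ln_W_exact(ns):
    """ln( N! / prod n_i! ) using lgamma(n + 1) = ln(n!)."""
    N = sum(ns)
    return math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in ns)

def ln_W_stirling(ns):
    """Leading Stirling estimate -N * sum p_i ln p_i of equation (2.25)."""
    N = sum(ns)
    return -N * sum((n / N) * math.log(n / N) for n in ns if n > 0)

for scale in (10, 1000, 100000):
    ns = [5 * scale, 3 * scale, 2 * scale]   # assumed distribution
    exact, approx = ln_W_exact(ns), ln_W_stirling(ns)
    print(sum(ns), exact, approx, (approx - exact) / exact)
```

The relative error shrinks as $N$ grows, which is why the factor $(2\pi N)^{1/2}$ can be dropped for macroscopic assemblies.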
2.3.3 Lagrange Multipliers
In order to solve this problem, we use the technique of Lagrange multipliers,
which has many applications.
The problem under consideration is to find the maximum of a function
$f(x, y)$ with the constraint that $(x, y)$ lie on a path $y = y_0(x)$ (for instance,
find the highest point not of a mountain but of a road on that mountain).
Of course, one can think of injecting the equation of the path in $f$ and
calculating $x$ such that
$$\frac{d}{dx} f(x, y_0(x)) = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,\frac{d}{dx}\bigl(y_0(x)\bigr) = 0. \qquad (2.27)$$
The method of Lagrange has several practical advantages. It consists of
introducing an auxiliary function $g(x, y) = y_0(x) - y$ and a new variable $\lambda$,
called a Lagrange multiplier. We search for the extremum of the function of
three variables $(x, y, \lambda)$,
$$f(x, y) + \lambda g(x, y) \quad\text{with}\quad g(x, y) = y_0(x) - y. \qquad (2.28)$$
We must therefore solve three equations:
$$\frac{\partial f}{\partial x} + \lambda\frac{\partial g}{\partial x} = 0 \ (1), \qquad \frac{\partial f}{\partial y} + \lambda\frac{\partial g}{\partial y} = 0 \ (2), \qquad g = 0 \ (3). \qquad (2.29)$$
Since $g(x, y) = y_0(x) - y = 0$, we obtain for (1) and (2)
$$\frac{\partial f}{\partial x} + \lambda\frac{\partial y_0}{\partial x} = 0 \ (1), \qquad \frac{\partial f}{\partial y} - \lambda = 0 \ (2). \qquad (2.30)$$
Eliminating $\lambda$ between (1) and (2) obviously amounts to solving the initial
equation (2.27).
This method applies in the case of a function $f(\{x_i\})$ of any number of
variables $x_i$, $i = 1, \ldots, n$, related by any number $p$ of constraints $g_k(\{x_i\}) = 0$,
$k = 1, \ldots, p$ (with $p < n$).
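The road-on-a-mountain picture can be tried on a toy case. In the sketch below, the height $f(x, y) = 1 - x^2 - y^2$ and the road $y_0(x) = 1 - x$ are hypothetical choices made for illustration; derivatives are taken by finite differences. The code solves equations (2.30) by using (2) to obtain $\lambda$ and then finding the root of (1), and compares the result with a direct scan along the road.

```python
def f(x, y):
    """Hypothetical mountain height."""
    return 1.0 - x**2 - y**2

def y0(x):
    """Hypothetical road on the mountain."""
    return 1.0 - x

def derivative(func, t, eps=1e-6):
    """Central finite-difference derivative of a one-variable function."""
    return (func(t + eps) - func(t - eps)) / (2.0 * eps)

def multiplier(x):
    """Equation (2.30)(2): lambda = df/dy evaluated on the road."""
    y = y0(x)
    return derivative(lambda yy: f(x, yy), y)

def residual(x):
    """Left-hand side of (2.30)(1) with lambda taken from (2.30)(2)."""
    y = y0(x)
    df_dx = derivative(lambda xx: f(xx, y), x)
    return df_dx + multiplier(x) * derivative(y0, x)

# Bisection on residual(x) = 0 over an interval assumed to bracket the root.
lo, hi = -2.0, 2.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if residual(lo) * residual(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
x_star = 0.5 * (lo + hi)
y_star = y0(x_star)
lam = multiplier(x_star)

# Direct check: scan the height along the road, as in equation (2.27).
grid = [-2.0 + 4.0 * k / 4000 for k in range(4001)]
x_best = max(grid, key=lambda x: f(x, y0(x)))
print(x_star, y_star, lam, x_best)
```

For this toy case the two approaches agree; the multiplier approach becomes more convenient when the constraint cannot easily be solved for $y$.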
2.3.4 Boltzmann Factor
It is simpler to work not with the occupation numbers $n_i$ but with the probabilities
$p_i = n_i/N$, which can be considered continuous quantities since $N$ is
very large.
In terms of these probabilities, the two constraints are
$$\sum_i p_i = 1, \qquad N\sum_i p_i\varepsilon_i = E.$$
The function we want to maximize is
$$\ln W = \sum_i n_i\ln(N/n_i) = -N\sum_i p_i\ln(p_i).$$
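We must therefore introduce two Lagrange multipliers, $\alpha$ and $\beta$, and the
probability law $\{p_i\}$, which maximizes this expression under the constraints
above, is the function for which the variation of the quantity
$$\delta\ln W - \alpha\,\delta N - \beta\,\delta E \qquad (2.31)$$
vanishes. In other words, whatever the $\delta p_i$ (with $\sum_i \delta p_i = 0$), the following
variation must vanish:
$$-N\sum_i \delta p_i\,(\ln p_i + \alpha + \beta\varepsilon_i). \qquad (2.32)$$
We therefore obtain $-\ln p_i - \alpha - \beta\varepsilon_i = 0$; i.e., $p_i = e^{-\alpha-\beta\varepsilon_i}$. The constants
$\alpha$ and $\beta$ are determined by the constraints
$$\sum_i e^{-\alpha-\beta\varepsilon_i} = 1, \qquad N\sum_i \varepsilon_i\,e^{-\alpha-\beta\varepsilon_i} = E.$$
The condition $\sum_i p_i = 1$ implies
$$e^{-\alpha} = \frac{1}{\sum_i e^{-\beta\varepsilon_i}}. \qquad (2.33)$$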
Therefore, the equilibrium distribution is characterized by the fact that there
exists a number $\beta$ related to the total energy $E$ by
$$E = N\,\frac{\sum_i \varepsilon_i\,e^{-\beta\varepsilon_i}}{\sum_i e^{-\beta\varepsilon_i}}. \qquad (2.34)$$
We will see that this number $\beta$ defines the temperature $T$ of the assembly
by
$$\beta = 1/kT, \qquad (2.35)$$
where $k$ is Boltzmann's constant. It is, in particular, a quantity that equalizes
when two assemblies are put in thermal contact (which is the first property
of temperature).
Therefore, the probability $p_i$ of finding a system in the energy level $\varepsilon_i$ at
equilibrium is given by Boltzmann's factor
$$p_i = \frac{e^{-\beta\varepsilon_i}}{Z}, \qquad (2.36)$$
where the function
$$Z = \sum_i e^{-\beta\varepsilon_i} \qquad (2.37)$$
is called the partition function of the system (from the German Zustandssumme,
sum over states). This function plays an important role in statistical
physics; $-k\ln Z$ is the free energy divided by $T$. The form (2.36) is called the
Boltzmann-Gibbs distribution.
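Relation (2.34) determines $\beta$ implicitly from the energy per system $E/N$. The sketch below is an illustration, not the book's procedure: the four equally spaced levels, the target mean energy, and the comparison distribution `q` are assumptions. It solves (2.34) for $\beta$ by bisection and checks that the resulting Boltzmann distribution has a larger $-\sum_i p_i\ln p_i$ than another distribution satisfying the same two constraints.

```python
import math

def boltzmann(levels, beta):
    """Return the probabilities (2.36) and the partition function (2.37)."""
    weights = [math.exp(-beta * e) for e in levels]
    Z = sum(weights)
    return [w / Z for w in weights], Z

def mean_energy(levels, beta):
    """Energy per system E/N from equation (2.34)."""
    p, _ = boltzmann(levels, beta)
    return sum(pi * e for pi, e in zip(p, levels))

def solve_beta(levels, e_mean, lo=-50.0, hi=50.0):
    """Bisection on beta; the mean energy decreases as beta increases."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(levels, mid) > e_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def entropy_per_system(p):
    """-sum p ln p, i.e. ln W / N in the Stirling approximation (2.25)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

levels = [0.0, 1.0, 2.0, 3.0]          # assumed levels, arbitrary units
beta = solve_beta(levels, e_mean=1.0)  # assumed E/N = 1
p, Z = boltzmann(levels, beta)
q = [0.4, 0.3, 0.2, 0.1]               # same normalization and mean energy
print(beta, p, entropy_per_system(p), entropy_per_system(q))
```

Any distribution other than the Boltzmann one with the same constraints gives a smaller value of $\ln W$, which is the content of the maximization above.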
2.3.5 Equalization of Temperatures
Consider two assemblies $\mathcal{E}$ and $\mathcal{E}'$, which may be of different natures, formed
respectively of $N$ and $N'$ systems $S$ and $S'$. The energy levels of $S$ are $\varepsilon_i$
and those of $S'$ are $\varepsilon'_j$. These two assemblies are in "thermal contact," which
means that they can exchange energy but that their interaction is sufficiently
weak that it does not change the individual energy levels $\varepsilon_i$ and $\varepsilon'_j$ of the two
systems considered separately.
Furthermore, these two systems are isolated. We denote by $E$ the total
energy.
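1. The number $W$ of states in a distribution $(\{n_i\}, \{n'_j\})$ of the systems $S$
and $S'$ is
$$W = \frac{N!\,N'!}{\prod_i (n_i!)\,\prod_j (n'_j!)}.$$
2. There are now three constraints on the distributions $(\{n_i\}, \{n'_j\})$:
$$\sum_i n_i = N, \qquad \sum_j n'_j = N', \qquad \sum_i n_i\varepsilon_i + \sum_j n'_j\varepsilon'_j = E.$$
We want to find the distribution that maximizes $\ln W$ under these constraints.
We introduce three Lagrange multipliers, $\alpha$, $\alpha'$, $\beta$, and the probabilities
$p_i = n_i/N$, $p'_j = n'_j/N'$. This leads to finding the zero of
$$\delta\ln W - \alpha\,\delta N - \alpha'\,\delta N' - \beta\,\delta E,$$
or equivalently
$$-N\sum_i \delta p_i\ln(p_i) - N'\sum_j \delta p'_j\ln(p'_j) - \alpha N\sum_i \delta p_i - \alpha' N'\sum_j \delta p'_j - \beta\Bigl(N\sum_i \varepsilon_i\,\delta p_i + N'\sum_j \varepsilon'_j\,\delta p'_j\Bigr). \qquad (2.38)$$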
3. This expression must vanish for all infinitesimal $\delta p_i$ and $\delta p'_j$. Therefore,
one obtains
$$-\ln p_i - \alpha - \beta\varepsilon_i = 0, \qquad -\ln p'_j - \alpha' - \beta\varepsilon'_j = 0,$$
or
$$p_i = e^{-\alpha-\beta\varepsilon_i}, \qquad p'_j = e^{-\alpha'-\beta\varepsilon'_j}. \qquad (2.39)$$
We notice that it is the same Lagrange multiplier $\beta$ that appears in both
expressions. This is due to the fact that it is the total energy that is a
given quantity. The constants $\alpha$, $\alpha'$, and $\beta$ are fixed by the constraints as
above. Therefore, the two temperatures are equal if $\beta = 1/kT$.
2.3.6 The Ideal Gas
Consider a monatomic ideal gas of temperature $T$. We assume it is confined
in a cubic box of side $a$. The energy levels are therefore
$$\varepsilon_{n,l,m} = \frac{\pi^2\hbar^2}{2ma^2}\,(n^2 + l^2 + m^2).$$
We assume that the spacing of these levels is very small compared with the
mean thermal energy $kT$. We go to the continuum limit using the density of
states in phase space$^5$
i.e.,
where $V$ is the volume of the cubic container. Inserting this in (2.36), we
obtain the probability $dP$ of finding an atom of the gas in an element $d^3v$
around the velocity $\mathbf{v} = \mathbf{p}/m$,
(2.40)
5 See, for instance [7], Chapter 4.

If we identify this with Maxwell's distribution,
$dP = \left(\frac{m}{2\pi kT}\right)^{3/2} e^{-mv^2/(2kT)}\,d^3v$, we
obtain the fundamental relation
$$\beta = \frac{1}{kT}, \qquad (2.41)$$
where $T$ is the absolute temperature.
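As a sketch of the identification just made, assuming the standard free-particle density of states $V\,d^3p/(2\pi\hbar)^3$ (an assumption of this example; the text refers to [7] for it) and $\varepsilon = p^2/2m = mv^2/2$, the constant prefactors cancel between the Boltzmann factor and the partition function, leaving a normalized Gaussian in the velocity:

```latex
\begin{align*}
dP &= \frac{e^{-\beta m v^2/2}\,d^3v}{\int e^{-\beta m v^2/2}\,d^3v}
    = \left(\frac{\beta m}{2\pi}\right)^{3/2} e^{-\beta m v^2/2}\,d^3v, \\
\text{Maxwell:}\quad dP &= \left(\frac{m}{2\pi kT}\right)^{3/2} e^{-m v^2/(2kT)}\,d^3v
    \;\Longrightarrow\; \beta = \frac{1}{kT}.
\end{align*}
```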

We now come back to the equalization of the factors $\beta$ of two assemblies
in thermal contact. We see that, since one of these assemblies can be an ideal
gas, the energy-absolute-temperature relation holds for any assembly of $N$
identical systems of individual energy levels $\varepsilon_i$,
$$E = N\,\frac{\sum_i \varepsilon_i\,e^{-\varepsilon_i/kT}}{\sum_i e^{-\varepsilon_i/kT}} \qquad (2.42)$$
(all appropriate care is assumed in the counting of degenerate states and in
taking the continuum limit).
Finally, this method allows us to define a thermostat by considering the
limit where one of the assemblies is much larger than the other. Establishing
thermal contact with the second, small assembly does not change the tem-
perature of the first one. We therefore recover the usual treatment of ther-
modynamics of assemblies in thermal contact with a thermostat at a given
temperature.
2.3.7 Boltzmann's Entropy
If we let an assembly evolve freely, it will eventually reach a state of equilibrium.
This evolution is not arbitrary: The assembly evolves and reaches its
most probable situation where the number of states is maximum, as we have
just seen.
It is perhaps the greatest discovery of Boltzmann that the quantity
$$S = k\ln W \qquad (2.43)$$
is nothing but the entropy of the assembly, which is therefore defined in an
absolute manner ($k$ is Boltzmann's constant). This provides a measure of the
state of disorder of the assembly. The greater its disorder, the more stable it
is.
We therefore reach a fundamental result of great simplicity:
The thermodynamic equilibrium corresponds to a situation that maximizes
the entropy for a given set of constraints. In other words, it maximizes
the disorder given the constraints.
This principle applies to a large variety of daily life situations. What is
the state of a child's room that optimizes the satisfaction of all the family?
Its range of application goes far beyond physics. This notion is one of the
building blocks of economic models.
2.3.8 Heat and Work
The notion of heat, which is very intuitive and has been known since very
ancient times, was viewed for a long time as emanating from some fluid that
could flow from one body to another. The first principle of thermodynamics
tells us that it is a particular form of energy. Statistical thermodynamics allows
us to understand this in a very natural manner.
Indeed, consider an assembly at equilibrium whose total energy is $E =
N\sum_i p_i\varepsilon_i$ and whose temperature is $T$. In any infinitesimal evolution of this
assembly through a contact with the outside, two things can happen. One
is the variation of the energy levels $\varepsilon_i$ if the total volume changes, or if an
electric field is applied, etc. Another is the reorganization of the populations
of the various energy levels $n_i = Np_i$. The corresponding variation $dE$ of the
total energy of the assembly is
$$dE = \sum_i n_i\,d\varepsilon_i + \sum_i \varepsilon_i\,dn_i. \qquad (2.44)$$
The first term is obvious. It corresponds simply to the work of the external
forces
(2.45)
(we avoid the traditional dW in order to avoid confusion). Under the external
action, the energy levels vary, resulting in a variation (2.45) of the total energy
of the system.
The second term is less obvious. It comes from the fact that, even if the
energy levels $\varepsilon_i$ do not change (in the absence of external work), the total
energy can be modified by a rearrangement of the populations $n_i$ of the levels.
This variation of the (internal) energy without any intervention of external
forces is what we call "heat." We obtain the statistical definition of heat as
(2.46)
In order to relate Boltzmann's entropy equation (2.43) and the usual for-
mula of thermodynamics, consider a variation dS = kd(ln W). We obtain
(2.47)
(of course, $\sum_i dn_i = 0$). Suppose the evolution is sufficiently slow that at any
time thermodynamical equilibrium is achieved. (The temperature may evolve
during the process.) This is called a reversible transformation in macroscopic
thermodynamics. If this is the case, the $n_i$ are proportional to $\exp(-\varepsilon_i/kT)$,
which yields
(2.48)
If we go back to the definition of heat (2.46), we obtain
$$dS_{\text{rev}} = \left(\frac{\delta Q}{T}\right)_{\text{rev}}, \qquad (2.49)$$
which is exactly the definition of entropy given by Clausius.
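The chain of relations in this subsection can be sketched as follows. The symbols $\delta\mathcal{W}$ for the work and $\delta Q$ for the heat are notational assumptions made here for illustration (the text only states that it avoids the traditional $dW$); the relations themselves follow from $E = \sum_i n_i\varepsilon_i$, from $\ln W \simeq N\ln N - \sum_i n_i\ln n_i$ at fixed $N$, and from $n_i \propto e^{-\varepsilon_i/kT}$.

```latex
\begin{align*}
dE &= \sum_i n_i\,d\varepsilon_i + \sum_i \varepsilon_i\,dn_i, \\
\delta\mathcal{W} &= \sum_i n_i\,d\varepsilon_i, \qquad
\delta Q = \sum_i \varepsilon_i\,dn_i, \\
dS &= k\,d(\ln W) = -k\sum_i (\ln n_i + 1)\,dn_i = -k\sum_i \ln n_i\,dn_i
      \quad\text{since } \sum_i dn_i = 0, \\
\ln n_i &= \text{const} - \frac{\varepsilon_i}{kT}
      \;\Longrightarrow\; dS = \frac{1}{T}\sum_i \varepsilon_i\,dn_i = \frac{\delta Q}{T}.
\end{align*}
```

The constant in the last line is the same for all levels, so its contribution vanishes because $\sum_i dn_i = 0$.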

We note the following aspects of entropy:
1. Classically, entropy is only defined for systems at equilibrium. The gener-
alization to systems out of equilibrium is completely natural in statistical
thermodynamics.
2. The statistical definition expresses the entropy in an absolute way. It is
not limited to entropy variations.
2.4 Problems
2.1. Conserved quantities

In the calculation of bent rays (Section 2.1.3) or the massive string (Section
2.2.2), show that the quantity r(z) = r/ JI + r(z)2 is constant along the
path (or string). From that observation, deduce the solution of the problem.
2.2. Lagrange Multipliers

Reconsider the massive string exercise (Section 2.2.2) using Lagrange multi-
pliers in order to express the constraints; i.e., the length of the string and the
positions of the endpoints.
2.3. Brachistochrone
A popular problem for mathematicians is the brachistochrone curve. Consider
two points A and B in a vertical plane, joined by a curve C. In A, a massive
particle is dropped with zero initial velocity, and it slides without friction
along the curve under the effect of gravity. We want to determine the curve C
such that the time for the particle to go from A to B is minimum. We note z
the altitude and x the abscissa of a point on the curve. The endpoints A and
$B$ correspond respectively to $(x = a, z = \alpha)$ and $(x = b, z = \beta)$.
2.4. Win a Slalom

A problem similar to the previous one is encountered by a skier sliding down a
snowy plane slope. The plane makes an angle $\alpha$ with respect to the horizontal
direction. The skier is in the vertical field of gravity, of acceleration $g$. The
skier starts with a zero velocity from some point $O$ and wants to reach a given
point $A$, downhill, in the shortest time. What is the optimal trajectory?
We choose in the plane a reference frame with origin at $O$, with horizontal
axis $Oy$, and whose $x$ axis is along the line of greatest slope, as shown in
Figure 2.10. We choose the origin of the potential energy at point $O$ so that
the initial energy $E$ of the skier is zero.
Fig. 2.10. Definition of coordinates.
We neglect friction of air and the track, as well as the efforts of the skier to
maintain his trajectory. Therefore, the total energy of the skier is a constant
of the motion.
1. Check that with this definition of the variable $x$, the potential energy of
the skier at point $(x, y)$ is $V = -mgx\sin\alpha$.
2. Write the expression of the skier's total energy at a given time. We denote
$\dot x \equiv dx/dt$, $\dot y \equiv dy/dt$. What is the relation between the potential energy
and the kinetic energy owing to energy conservation?
3. Use the previous expression to express the square of the time interval $dt$
between two positions, $(x, y)$ and $(x + dx, y + dy)$, of the skier, in terms
of $dx^2$, $dy^2$, $x$, $y$, $g$, and $\alpha$.
4. Calculate the time it takes to go from $O$ to $A$ if the skier follows a trajectory
defined by a function $y(x)$ (note $y' \equiv dy/dx$).
5. What is the equation of the optimal trajectory?
6. Show that along the optimal trajectory the quantity $C = y'/\sqrt{x(1 + (y')^2)}$
is a constant. Deduce from this that along the trajectory the quantity
$f(t) = \dot y/x$ is a constant $K$, and express its value in terms of $C$, $g$, and $\alpha$.
7. Check that the parametric form $x(\theta) = (1 - \cos 2\theta)/(2C^2)$ and $y(\theta) =
(2\theta - \sin 2\theta)/(2C^2)$ is a solution. Use the result of the previous question
to calculate the function $\theta(t)$.
8. What kind of curve is it? Draw the trajectory qualitatively in the case
y'(A) 1.
9. Explain the result physically. (It is not necessary to do all previous calcu-
lations in order to answer this question.)
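For question 7, a quick numerical sanity check (not a substitute for the analytic verification) is to evaluate the quantity $C = y'/\sqrt{x(1 + (y')^2)}$ of question 6 along the parametric curve. In the sketch below, the value $C = 1.5$, the sample angles, and the function names are arbitrary assumptions; $y'$ is estimated by a finite difference in $\theta$.

```python
import math

def cycloid(theta, C):
    """Parametric curve of question 7."""
    x = (1.0 - math.cos(2.0 * theta)) / (2.0 * C**2)
    y = (2.0 * theta - math.sin(2.0 * theta)) / (2.0 * C**2)
    return x, y

def conserved(theta, C, eps=1e-6):
    """Estimate y' / sqrt(x (1 + y'^2)) at the point labeled by theta."""
    x1, y1 = cycloid(theta - eps, C)
    x2, y2 = cycloid(theta + eps, C)
    y_prime = (y2 - y1) / (x2 - x1)     # dy/dx along the curve
    x, _ = cycloid(theta, C)
    return y_prime / math.sqrt(x * (1.0 + y_prime**2))

C = 1.5                                  # assumed value of the constant
for theta in (0.2, 0.5, 1.0, 1.4):
    print(theta, conserved(theta, C))
```

Each printed value should reproduce the chosen $C$ to within the finite-difference error.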
2.5. Strategy of a Regatta
A sailboat has velocity $v(\theta)$, which is a function of the angle $\theta$ between the
direction of the wind and the direction of the boat and also of the norm $w$
of the velocity of the wind. We assume that the velocity of the boat $v$ is
proportional to the velocity of the wind $w$ and that it depends on the angle
$\theta$ chosen by the skipper. For convenience, in what follows, we shall write this
velocity in the form
$$v(\theta) = \frac{w}{\cos(\theta)\,h(\tan\theta)}. \qquad (2.50)$$
We are interested in the strategy where the sailboat tacks to the wind (i.e.,
$\theta \le \pi/2$), as shown in Figure 2.11. We assume that the $x$ component $v_x$ of
the velocity of the boat is opposite to that of the wind and that the position
of the sailboat along the $x$ axis always increases with time. We assume the
coast is linear (land = half-plane $z < 0$, sea = half-plane $z > 0$).
We assume the wind is parallel to the coast, of direction opposite to the
$x$ axis, and that the norm of its velocity $w(z)$ depends only on the distance $z$
to the coast.
Fig. 2.11. Diagram of the direction of the sailboat compared with that of the wind
(axes: $x$ along the shore, $z$ away from the shore; arrival point at $x = L$, $z = z_1$).
Here, we assume that the velocity of the wind has the form
$$w(z) = w_0 - w_1\,\frac{z_0}{z + z_0}, \qquad (2.51)$$
where $w_0$ is the velocity far from the coast, which is larger than the velocity
$(w_0 - w_1) \ge 0$ on the coast $z = 0$.
1. We denote
$$\dot x = \frac{dx}{dt}, \qquad \dot z = \frac{dz}{dt}, \qquad z' = \frac{dz}{dx}.$$
Show that $z' = \tan\theta$.
2. We first assume the wind is uniform ($w$ = constant, $w_1 = 0$). Write the
expression of the velocity of the boat along the axis of the wind $v_x = \dot x$
in terms of $w$ and $h(\tan\theta)$. For what values of $\theta$ and $z'$ is this velocity
maximum? What is its value?
3. We now assume $w_1 \neq 0$. The boat sails from the origin $(x = 0, z = 0)$
to a given point $(x = L, z = z_1)$. We assume that $z' \ge 0$ for all $t$ (i.e.,
the boat never changes tack). We want to determine the fastest trajectory
$z(x)$. Write the expression of the time $dt$ to go, on this trajectory, from $x$
to $x + dx$ in terms of the functions $w$ and $h$. Give the value of the total
time $T$ between the starting point and the arrival.
4. Deduce from (3) the equation that determines the optimal trajectory
(which minimizes $T$).
5. Show that the translation invariance of the problem along the $x$ direction
yields
$$\frac{h'(z')\,z' - h(z')}{w(z)} = A,$$
where $A$ is a constant.
6. Use the previous result to calculate the trajectory in the form of a function
$x(z)$ (and not a function $z(x)$). Fix the value of the constant $A$.
7. Calculate the value of $z' = dz/dx$ as a function of $z$. We assume that
Zl Land Zl zoo Do you think the result corresponds to the best
strategy? If not, what modifications must the skipper make?
3
The Analytical Mechanics of Lagrange
In the beginning there was the Action.

Johann Wolfgang Goethe
The fundamental concepts and principles of mechanics, or dynamics, were
established in the 17th century. Copernicus gave the notion of reference system
in 1543, and Galileo stated the principle of inertia in 1638 in his important
work Discorsi e dimostrazioni matematiche intorno a due nuove scienze alla
meccanica ed i movimenti locali.$^1$ A particle on which no force is exerted has
a constant velocity. Linear uniform motion is a state relative to the observer,
and not a process. It is the variation of the velocity that is a process resulting
from an external action. Many scientists participated in this evolution, such
as Tycho Brahe, Kepler, Descartes, and Christiaan Huygens, to name a few.
The great achievement was the synthesis of Newton in 1687, the Philoso-
phiae Naturalis Principia Mathematica. Newton stated his four laws: the prin-
ciple of inertia, the composition of forces, the proportionality of the acceler-
ation and the force, and the principle of action and reaction. In addition,
Newton formulated the universal law of gravitation, which enabled him to
explain Kepler's laws and the motion of celestial bodies.
Humans had been concerned with celestial motion, which was completely
entangled with the notion of time, ever since they first observed the sky. They
were now able to predict the state of the sky with incredible accuracy.
But this is by no means the end of the story. Following the Newtonian
synthesis, an amazing adventure happened in the 18th and 19th centuries.
This started with d'Alembert, Maupertuis, and the Bernoulli brothers (in
particular, Daniel), and was followed by Euler, Lagrange, and later on by
Hamilton. The true structure of mechanics was discovered to be a geometric
structure. A large category of mechanical problems could be reduced to purely
geometrical problems.
D'Alembert, who was the first to understand the concept of mass through
the notion of linear momentum and its conservation, attacked the abstract
1 Discursive reasoning and proofs concerning two new sciences ....

concept of force introduced by Newton. For d'Alembert, the only observable
phenomenon is motion, whereas the "cause of motion" is an abstraction; hence
the idea of studying not a particular trajectory of the theory but the global
set of motions that it predicts (this is a very modern conception of forces or
interactions).
The crowning achievement of these ideas came with Lagrange in 1788,
one century after the Principia. Lagrange published, in his Analytical Mechanics
(Mechanique Analitique), a new formulation of mechanics where the
global and geometric structure of the theory was emphasized.$^2$ Lagrange wrote
(nevertheless) at that time,
One will not find Figures in this work. The methods I present do not
require any construction or geometric arguments, but only algebraic
operations, subject to a uniform and continuous methodology. Those
who appreciate Analysis will discover with pleasure how Mechanics
becomes a new branch of it, and they will be grateful to me for
extending its realm.
This explains the name "Analytical Mechanics."
The first section of this chapter describes the principles of the analytical
mechanics of Lagrange. It is based on the least action principle. Lagrange pro-
posed a new way of considering mechanical problems. Instead of determining
the position r(t) and velocity v(t) of a particle at time t, given its initial state
{r(O), v(O)}, Lagrange wants to determine the actual trajectory followed by
the particle in order to start at $\mathbf{r}_1$ at time $t_1$ and to arrive at $\mathbf{r}_2$ at time $t_2$.
This is exactly the same approach that Fermat used for light rays. We will
write the Lagrange-Euler equations that elicit the geometrical aspect of the
mechanical problem.
Section 3.2 deals with invariance properties of physical phenomena and the
resulting conservation laws. Invariance laws are fundamental in the sense that
they are what we know a priori about the physics of a problem. We will see
that energy conservation is associated with homogeneity of time, momentum
conservation with homogeneity of space, and angular momentum conservation
with the isotropy of space. In the discussion, we will introduce the notion
of Lagrange conjugate momenta or generalized momenta, which will play a
central role in all that follows.
Section 3.3 is devoted to velocity-dependent forces that are not gradients
of potentials. We shall say a few words on dissipative systems, but our main
point of interest will be the fundamental Lorentz force. The magnetic part
of the Lorentz force exerted on a charged particle does not work. We shall
see that in that case the generalized momentum does not coincide with the
linear momentum (i.e., the product of mass times velocity). This fact, which
2 There are many books on analytic mechanics, or dynamics. One can refer to the
classics of Landau and Lifshitz, [8] and [1], and the book Classical Mechanics [9]
by Herbert Goldstein, which is clear and complete.
is intimately related to gauge invariance, has considerable consequences in

quantum mechanics and, more generally, in all modern theories of fundamental
interactions.
Finally, in Section 3.4, we shall extend such considerations to the case of a
relativistic particle. We shall restrict ourselves to the case of a massive particle
that is free or placed in an electromagnetic field. The basic assumption will be
Lorentz invariance. In order for the least action principle to have any physical
meaning, it must determine the motion of the particle independently of the
state of motion of the observer. This will allow us to construct the Lagrangian
of a relativistic particle. We will see how the energy and momentum of a free
particle are related to its mass and velocity. Two points must be emphasized.
First, the Lagrangian formalism allows us to prove that the set {E / c, p} is a
four-vector of space-time, whereas we have no idea a priori of the values of
energy and momentum in terms of the velocity. The second point is that these
properties come from the assumption of relativistic invariance of physical laws.
3.1 Lagrangian Formalism and the Least Action Principle
In his Analytic Mechanics, Lagrange proposes a new way of considering me-
chanical problems. Instead of determining the position r(t) and velocity v(t)
of a particle at time t, given its initial state {r(O), v(O)}, Lagrange asks the
following question. What is the actual trajectory followed by the particle if it
starts at $\mathbf{r}_1$ at time $t_1$ and arrives at $\mathbf{r}_2$ at time $t_2$?
Fig. 3.1. Examples of trajectories starting from $x_1$ at time $t_1$ and arriving at $x_2$
at time $t_2$ (curves labeled "actual trajectory $X(t)$" and "possible trajectory $x(t)$").
Among all such trajectories, the physical trajectory actually followed by
the particle renders the action $S$ minimal (extremal).
3.1.1 Least Action Principle
In order to make things simple, let us consider first the case of only one space
dimension. Among the infinite class of possible trajectories (see Figure 3.1),
what is the law that determines the physical one? Lagrange knows that the
answer to this question lies in the "principle of natural economy" of Fermat,
further developed by Maupertuis, as we said in Chapter 2.
The variational principle we present here is not the original one used by
Lagrange; it was formulated by Hamilton in 1834 and is simpler in this dis-
cussion. In order not to complicate things, we reverse chronology.
One assumes the following:
1. Any mechanical system is characterized by a Lagrange function, or Lagrangian
$\mathcal{L}(x, \dot x, t)$, which depends on the position $x$, on its time derivative
$\dot x = dx/dt$, and possibly on time. The quantities $x$ and $\dot x$ are called the
state variables of the particle. For a particle in a potential $V(x, t)$, we have
for instance
$$\mathcal{L} = \frac{1}{2}m\dot x^2 - V(x, t). \qquad (3.1)$$
2. For any trajectory $x(t)$, one can define the action $S$ by the integral
$$S = \int_{t_1}^{t_2} \mathcal{L}(x, \dot x, t)\,dt. \qquad (3.2)$$
The Least Action Principle states that the physical trajectory $X(t)$ followed
by the particle is such that $S$ is minimum, or, more generally, has an
extremum.
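The principle can be illustrated numerically on a simple case. The sketch below is an assumption-laden example, not taken from the text: it uses a particle in a uniform field, $V = mgx$, for which the physical path between $x(0) = 0$ and $x(T) = 0$ is the parabola $X(t) = \tfrac12 g t (T - t)$, discretizes the action (3.2) with a simple midpoint rule, and compares the physical path with neighboring paths $X(t) + \epsilon\sin(\pi t/T)$ that share the same endpoints.

```python
import math

G = 9.81   # assumed uniform field strength
M = 1.0    # assumed mass

def action(path, dt):
    """Discretized action: sum of (kinetic - potential) * dt over the steps."""
    S = 0.0
    for x0, x1 in zip(path, path[1:]):
        v = (x1 - x0) / dt                    # velocity on the step
        x_mid = 0.5 * (x0 + x1)               # position at mid-step
        S += (0.5 * M * v * v - M * G * x_mid) * dt
    return S

T, n = 1.0, 200
dt = T / n
times = [k * dt for k in range(n + 1)]
physical = [0.5 * G * t * (T - t) for t in times]   # solution of x'' = -g

def perturbed(eps):
    """Neighboring path with the same endpoints (to rounding accuracy)."""
    return [x + eps * math.sin(math.pi * t / T) for x, t in zip(physical, times)]

S0 = action(physical, dt)
for eps in (-0.1, -0.01, 0.01, 0.1):
    print(eps, action(perturbed(eps), dt) - S0)
```

The printed differences are positive and scale like $\epsilon^2$: the first-order variation vanishes on the physical path, which is the statement $\delta S = 0$.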
3.1.2 Lagrange-Euler Equations
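We call $X(t)$ the physical trajectory, and we proceed as in Section 2.1.2, except
that the variable is now the time $t$. Consider a trajectory $x(t)$ infinitely close
to $X(t)$, which also starts from $x_1$ at $t_1$ and reaches $x_2$ at $t_2$,
$$x(t) = X(t) + \delta x(t), \qquad \dot x(t) = \dot X(t) + \delta\dot x(t), \qquad \delta\dot x(t) = \frac{d}{dt}\,\delta x(t), \qquad (3.3)$$
where by assumption
$$\delta x(t_1) = \delta x(t_2) = 0. \qquad (3.4)$$
To first order in $\delta x$, the variation of $S$ is
$$\delta S = \int_{t_1}^{t_2} \left(\frac{\partial\mathcal{L}}{\partial x}\,\delta x(t) + \frac{\partial\mathcal{L}}{\partial\dot x}\,\delta\dot x(t)\right) dt. \qquad (3.5)$$
We integrate the second term by parts and take into account (3.4), so that
the integrated term vanishes. This leads to
$$\delta S = \int_{t_1}^{t_2} \left(\frac{\partial\mathcal{L}}{\partial x} - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot x}\right)\right)\delta x(t)\,dt. \qquad (3.6)$$
The least action principle states that $\delta S$ must vanish whatever the infinitesimal
variation $\delta x(t)$. Therefore, the equation of motion (i.e., the equation that
determines the physical trajectory), is the Lagrange-Euler equation
$$\frac{\partial\mathcal{L}}{\partial x} = \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot x}\right). \qquad (3.7)$$
It is simple to check in (3.1) that this leads to the usual equation
$$m\ddot x = -\frac{\partial V}{\partial x} = f,$$
where $f$ is the force.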
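As a worked illustration of this check, consider a harmonic potential $V = \tfrac12\kappa x^2$ (the spring constant $\kappa$ is a hypothetical parameter introduced only for this example). Applying the Lagrange-Euler equation term by term gives:

```latex
\begin{align*}
\mathcal{L} &= \tfrac12 m\dot x^2 - \tfrac12 \kappa x^2, \\
\frac{\partial\mathcal{L}}{\partial x} &= -\kappa x, \qquad
\frac{\partial\mathcal{L}}{\partial\dot x} = m\dot x, \\
\frac{\partial\mathcal{L}}{\partial x}
  = \frac{d}{dt}\!\left(\frac{\partial\mathcal{L}}{\partial\dot x}\right)
&\;\Longrightarrow\; m\ddot x = -\kappa x .
\end{align*}
```

The right-hand side is indeed the force $f = -\partial V/\partial x$ for this potential.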
Generalization
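The generalization to $s$ degrees of freedom $(x_i, \dot x_i)$, $i = 1, \ldots, s$ is
straightforward. The Lagrangian is a function $\mathcal{L}(\{x_i\}, \{\dot x_i\}, t)$ of the variables $\{x_i\}$
and $\{\dot x_i\}$, and the equations of motion are given by the set of Lagrange-Euler
equations
$$\frac{\partial\mathcal{L}}{\partial x_i} = \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot x_i}\right), \qquad i = 1, \ldots, s. \qquad (3.8)$$
In the case of $N$ particles in the usual three-dimensional space, we have
$s = 3N$ and can use the notation $(x_a^i, \dot x_a^i)$, $a = 1, \ldots, N$, $i = 1, 2, 3$, which
leads to
$$\frac{\partial\mathcal{L}}{\partial x_a^i} = \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot x_a^i}\right), \qquad i = 1, 2, 3; \quad a = 1, \ldots, N. \qquad (3.9)$$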
Remarks
1. Nonuniqueness of the Lagrangian

The Lagrangian of a given system is by no means unique. It is easy to see
that if we add a total derivative with respect to time, i.e.,
$$\mathcal{L}' = \mathcal{L} + \frac{d}{dt} f(\{x_i\}, t),$$
the equations of motion are unchanged.
2. Form of the Lagrangian
It is mainly invariance considerations that dictate the form of the Lagrangian,
in particular translation or rotation invariance. We shall come
back to this point. The kinetic term $mv^2/2$ comes from the principle of
inertia, or equivalently from invariance under Galilean transformations.
Consider the simple case of a free particle in space.
a) Time has no privileged origin, and therefore $\partial\mathcal{L}/\partial t = 0$.
b) Space has no privileged origin, and therefore $\partial\mathcal{L}/\partial x_i = 0$.
c) Rotation invariance implies that $\mathcal{L}$ can only depend on the square of
the velocity, $\mathcal{L}(v^2)$.
d) Galilean invariance entails that a given law is the same in all Galilean
frames. In a reference frame of relative infinitesimal velocity $\boldsymbol{\epsilon}$, $\mathbf{v}' =
\mathbf{v} + \boldsymbol{\epsilon}$. The Lagrangian becomes, to first order in $\boldsymbol{\epsilon}$,
$$\mathcal{L}' = \mathcal{L} + 2\,\mathbf{v}\cdot\boldsymbol{\epsilon}\,\frac{\partial\mathcal{L}}{\partial v^2}.$$
The second term on the right-hand side is a total derivative with
respect to time if and only if $\partial\mathcal{L}/\partial v^2$ = constant. Therefore, the Lagrangian
of a free particle is of the form $\mathcal{L} = Kv^2$, where $K$ is a constant.
We choose this constant to be $m/2$ since this entails momentum
conservation for an isolated system, as we will see. Therefore, for a free
particle,
$$\mathcal{L} = \frac{1}{2}mv^2. \qquad (3.10)$$
e) In a reference frame with constant velocity $\mathbf{V}$ with respect to the
previous one, the Lagrangian becomes
$$\mathcal{L}' = \frac{1}{2}m(\mathbf{v} + \mathbf{V})^2 = \mathcal{L} + \frac{d}{dt}\left(m\,\mathbf{r}\cdot\mathbf{V} + \frac{1}{2}mV^2 t\right),$$
and the equations of motion are the same in both reference frames.
f) If the particle is in a field of force, the potential energy term in (3.1)
is merely a definition of the force. We wish to recover Newton's law,
and this choice guarantees it for forces that derive from potentials.
3. Generalization
The Lagrangian of a set of $N$ particles in a potential $V(\mathbf{r}_1, \ldots, \mathbf{r}_N; t)$
(which includes the mutual interactions of the particles) is
$$\mathcal{L} = \frac{1}{2}\sum_{i=1}^{N} m_i(\dot{\mathbf{r}}_i)^2 - V(\mathbf{r}_1, \ldots, \mathbf{r}_N; t). \qquad (3.11)$$
4. Change of System of Coordinates
The Lagrange-Euler equations keep the same form in all systems of
coordinates (for instance $(x, y, z) \to (r, \theta, \varphi)$). This feature is particularly
useful in order to perform changes of variables. One calls a system of
coordinates $\{q_i\}$ generalized coordinates.
3.1.3 Operation of the Optimization Principle
It is remarkable that the laws of mechanics can be derived from a variational
principle. The physical trajectory is that for which the action is minimum or
optimal.
This optimization appears as a "compromise" between various causes in
"conflict." Indeed, in the absence of forces (V = constant in (3.1)), S is
minimum for $\dot{x}$ = constant: the motion is linear and uniform. In the absence
of inertia, on the contrary, the particle would go to the maximum of the
potential at the initial point and come back at the final point. The presence
of the potential can be considered as a property of space that curves the
trajectory. Inertia and force can be viewed as conflicting effects. The particle
follows a path of minimal "length," this length being measured by the action
S.
We see here how the mechanical problem can be transformed into a geometric
problem. As we shall see later on, the motion of a particle in a flat
Euclidean space can be transformed into the free motion in a curved space,
where it moves along geodesics. We will come back extensively to this point
in chapter 6. Einstein had this idea in mind in 1908 when he was construct-
ing general relativity. It took him seven years to elaborate the mathematical
details of the final theory.
3.2 Invariances and Conservation Laws

Invariance laws of physical phenomena are fundamental. They form the set of
what is known a priori about a physical problem. They imply corresponding
conservation laws, which play a crucial role. In more elaborate problems than
those we have seen here, they constitute the guiding line in order to construct
the Lagrangian of a system (we have seen a simple example above in discussing
the form of the free-particle Lagrangian).
A system with $s$ degrees of freedom possesses a priori $2s$ conserved quantities.
Indeed, the evolution of the system is completely determined by the
knowledge of the $2s$ initial conditions $\{x_i(0), \dot{x}_i(0)\}$. Therefore, there are in
principle $2s$ relations between the variables $\{x_i(t), \dot{x}_i(t)\}$, which allow one to
calculate $\{x_i(0), \dot{x}_i(0)\}$ at any time. In practice, only a subset of such relations
are useful.
3.2.1 Conjugate Momenta and Generalized Momenta
In order to discuss conservation laws, we introduce the notion of Lagrange
conjugate momenta. The quantity
$$p_i = \frac{\partial\mathcal{L}}{\partial\dot{x}_i} \tag{3.12}$$
is called the conjugate momentum of the variable $x_i$, or its generalized momentum.
In the simple case (3.11), this coincides with the linear momentum
$p_i = m\dot{x}_i$, but this is no longer true in non-Cartesian coordinate systems, or,
as we shall see, when the forces depend on the velocity. We remark that, from
(3.7), the time evolution of the conjugate momentum $p_i$ is given by
$$\dot{p}_i = \frac{\partial\mathcal{L}}{\partial x_i}, \tag{3.13}$$
which can be considered the generalized form of Newton's law.
3.2.2 Cyclic Variables
In the Lagrangian formalism, one can make any change of variables
$$q_i = q_i(\{x_j\}) \quad \text{or} \quad x_i = x_i(\{q_j\}).$$
In this change of variables, the Lagrange-Euler equations keep the same form.
We can define the conjugate momentum $p_i$ of the generalized variable $q_i$ by
the relation
$$p_i = \frac{\partial\mathcal{L}}{\partial\dot{q}_i}. \tag{3.14}$$
This quantity satisfies the same equation as (3.13); i.e., $\dot{p}_i = \partial\mathcal{L}/\partial q_i$.
A cyclic variable is a variable $q_i$ that does not appear explicitly in the
Lagrangian $\mathcal{L}$. This means that
$$\frac{\partial\mathcal{L}}{\partial q_i} = 0.$$
In that case, the conjugate momentum $p_i = \partial\mathcal{L}/\partial\dot{q}_i$ is conserved; we have
$p_i$ = constant.
It is useful to find cyclic variables owing to the resulting conservation laws.
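As a minimal numerical sketch of this statement, consider a particle moving in a plane with $\mathcal{L} = m(\dot{r}^2 + r^2\dot{\theta}^2)/2 - V(r)$. The angle $\theta$ is cyclic, so $p_\theta = mr^2\dot{\theta}$ should remain constant along a trajectory. The potential, the parameter values, the Runge-Kutta integrator, and all function names below are illustrative assumptions, not taken from the text.

```python
M = 1.0  # mass (illustrative value)

def dV(r):
    """Derivative of an assumed central potential V(r) = -1/r + 0.1 r^2."""
    return 1.0 / r**2 + 0.2 * r

def rhs(s):
    """Lagrange equations in polar coordinates as a first-order system."""
    r, theta, vr, vtheta = s
    ar = r * vtheta**2 - dV(r) / M       # radial equation
    atheta = -2.0 * vr * vtheta / r      # expanded form of d/dt(m r^2 theta') = 0
    return (vr, vtheta, ar, atheta)

def rk4_step(s, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(s)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def p_theta(s):
    """Conjugate momentum of the cyclic variable theta."""
    r, _, _, vtheta = s
    return M * r**2 * vtheta

def max_drift(s=(1.0, 0.0, 0.3, 0.9), dt=1e-3, steps=5000):
    """Largest deviation of p_theta from its initial value along the run."""
    p0 = p_theta(s)
    worst = 0.0
    for _ in range(steps):
        s = rk4_step(s, dt)
        worst = max(worst, abs(p_theta(s) - p0))
    return worst
```

Calling `max_drift()` returns the largest deviation of $p_\theta$ from its initial value; it should be at the level of the integration error only.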
3.2.3 Energy and Translations in Time
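Assume the system is isolated (i.e., $\partial\mathcal{L}/\partial t = 0$). Another way to describe this
assumption is to say that the system is invariant under translations in time
or that time is homogeneous.
We evaluate the evolution of $\mathcal{L}(x, \dot{x})$ along the physical trajectory $x(t)$,
$$\frac{d}{dt}\mathcal{L}(x, \dot{x}) = \dot{x}(t)\frac{\partial\mathcal{L}}{\partial x} + \ddot{x}(t)\frac{\partial\mathcal{L}}{\partial\dot{x}} = \frac{d}{dt}\left(\dot{x}(t)\frac{\partial\mathcal{L}}{\partial\dot{x}}\right), \tag{3.15}$$
where we have transformed the first term by taking into account the Lagrange
equation (3.7). We deduce that
$$\frac{d}{dt}\left(\dot{x}(t)\frac{\partial\mathcal{L}}{\partial\dot{x}} - \mathcal{L}\right) = 0. \tag{3.16}$$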
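Consequently, for an isolated system, or when there is invariance under
translation in time, the quantity
$$E = \dot{x}(t)\frac{\partial\mathcal{L}}{\partial\dot{x}} - \mathcal{L} \tag{3.17}$$
is conserved. It is a constant of the motion, the energy of the system.
In the general case (3.11), the energy is indeed the sum of the kinetic and
potential energies
$$E = \frac{1}{2}\sum_{i=1}^{N} m_i\dot{\mathbf{r}}_i^{\,2} + V(\mathbf{r}_1, \ldots, \mathbf{r}_N). \tag{3.18}$$
If we use the Lagrange conjugate momenta, the expression of the energy
becomes
$$E = \sum_i p_i\dot{x}_i - \mathcal{L}. \tag{3.19}$$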
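The conservation of $E = \dot{x}\,\partial\mathcal{L}/\partial\dot{x} - \mathcal{L}$ can be checked numerically. The sketch below assumes a time-independent Lagrangian with a quartic potential, evaluates $\partial\mathcal{L}/\partial\dot{x}$ by a finite difference, and follows the trajectory with a Runge-Kutta integrator. The potential, the step sizes, and the function names are illustrative choices, not part of the text.

```python
M = 1.0  # mass (illustrative value)

def lagrangian(x, v):
    """Assumed time-independent Lagrangian L = m v^2/2 - x^4/4."""
    return 0.5 * M * v**2 - 0.25 * x**4

def energy(x, v, h=1e-5):
    """E = v dL/dv - L, with dL/dv from a central finite difference."""
    dL_dv = (lagrangian(x, v + h) - lagrangian(x, v - h)) / (2 * h)
    return v * dL_dv - lagrangian(x, v)

def accel(x):
    """Lagrange equation m x'' = -dV/dx for the Lagrangian above."""
    return -x**3 / M

def rk4_step(x, v, dt):
    """One classical fourth-order Runge-Kutta step for (x, v)."""
    k1x, k1v = v, accel(x)
    k2x, k2v = v + dt / 2 * k1v, accel(x + dt / 2 * k1x)
    k3x, k3v = v + dt / 2 * k2v, accel(x + dt / 2 * k2x)
    k4x, k4v = v + dt * k3v, accel(x + dt * k3x)
    return (x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            v + dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v))

def energy_drift(x=1.0, v=0.5, dt=1e-3, steps=10000):
    """Largest deviation of E from its initial value along the run."""
    e0 = energy(x, v)
    worst = 0.0
    for _ in range(steps):
        x, v = rk4_step(x, v, dt)
        worst = max(worst, abs(energy(x, v) - e0))
    return worst
```

For this Lagrangian the value returned by `energy` coincides with the sum of kinetic and potential energies, as in (3.18).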
Examples
Consider the massive string of Chapter 2 and equation (2.18). The Lagrangian
is (up to factors) $\mathcal{L} \propto z(x)\sqrt{1 + \dot{z}(x)^2}$ (here the variable is $x$). This Lagrangian
does not depend on the variable $x$. Therefore, the quantity $p\dot{z} - \mathcal{L}$, where $p$ is
the conjugate momentum of $z$, is constant along the curve (it is a "constant
of the motion" in the language of the present chapter). One obtains with no
difficulty $p = z\dot{z}/\sqrt{1 + \dot{z}(x)^2}$ and $p\dot{z} - \mathcal{L} = -z/\sqrt{1 + \dot{z}(x)^2} = -c$, where $c$ is
a constant.
We deduce from this that
$$z(x) = c\sqrt{1 + \dot{z}(x)^2}.$$
If, by the definition of $\phi(x)$, we set $\dot{z}(x) = \sinh(\phi(x))$, we obtain, by inserting
this into the equation,
$$z(x) = c\cosh(\phi(x)); \quad \text{i.e.,} \quad \dot{z} = c\,\dot{\phi}(x)\sinh(\phi(x)).$$
We conclude that $c\,\dot{\phi}(x) = 1$, and the solution given in (2.12) follows: $z(x) =
c\cosh((x - x_0)/c)$. Using this conserved quantity simplifies the resolution of
the problem.
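One can check this first integral directly on the solution. The short sketch below evaluates $z/\sqrt{1 + \dot{z}^2}$ along the curve $z(x) = c\cosh((x - x_0)/c)$, estimating $\dot{z}$ by a finite difference so that the derivative is not assumed in advance. The function names and the sample values of $c$ and $x_0$ are illustrative assumptions.

```python
import math

def z(x, c, x0):
    """Catenary z(x) = c cosh((x - x0)/c)."""
    return c * math.cosh((x - x0) / c)

def first_integral(x, c, x0, h=1e-5):
    """z / sqrt(1 + z'^2), with z' estimated by a central finite difference."""
    dz = (z(x + h, c, x0) - z(x - h, c, x0)) / (2 * h)
    return z(x, c, x0) / math.sqrt(1.0 + dz * dz)

def max_deviation(c=2.0, x0=0.5, points=(-3.0, -1.0, 0.0, 0.5, 2.0, 4.0)):
    """Largest deviation of the first integral from the constant c."""
    return max(abs(first_integral(x, c, x0) - c) for x in points)
```

The deviation returned by `max_deviation()` reflects only the finite-difference error.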
In general, if we consider a Lagrangian of the form
$$\mathcal{L} = f(z(x))\sqrt{1 + \dot{z}(x)^2}, \tag{3.20}$$
the conjugate momentum of $z$ is
$$p = \frac{f(z)\,\dot{z}}{\sqrt{1 + \dot{z}(x)^2}}. \tag{3.21}$$
Since the Lagrangian does not depend explicitly on the variable $x$, the quantity
$$A = p\dot{z} - \mathcal{L} = -\frac{f(z)}{\sqrt{1 + \dot{z}(x)^2}}, \tag{3.22}$$
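whose value is fixed by the initial conditions, is conserved. We therefore obtain
$$1 + \dot{z}^2 = \frac{f(z)^2}{A^2}; \quad \text{i.e.,} \quad \frac{dz}{dx} = \pm\sqrt{\frac{f(z)^2}{A^2} - 1}. \tag{3.23}$$
The general solution amounts to a simple quadrature
$$dx = \pm\frac{dz}{\sqrt{f(z)^2/A^2 - 1}}; \quad \text{i.e.,} \quad x - x_0 = \pm\int\frac{dz}{\sqrt{f(z)^2/A^2 - 1}}. \tag{3.24}$$
This is the generalization of the usual method of integration of the equation
of motion when there is energy conservation.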
3.2.4 Momentum and Translations in Space
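Suppose the problem is invariant under translations in space. This is the case
for a free particle, and it is also the case for a system of particles whose
interactions depend only on the relative coordinates: $V(\{\mathbf{r}_i - \mathbf{r}_j\})$.
In this case, for any infinitesimal transformation $\mathbf{r}_i \to \mathbf{r}_i + \boldsymbol{\epsilon}$, the Lagrangian
is invariant:
$$\delta\mathcal{L} = \sum_i \frac{\partial\mathcal{L}}{\partial\mathbf{r}_i}\cdot\boldsymbol{\epsilon} = 0 \quad \forall\,\boldsymbol{\epsilon}; \quad \text{i.e.,} \quad \sum_i \frac{\partial\mathcal{L}}{\partial\mathbf{r}_i} = 0. \tag{3.25}$$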
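For convenience, we use the notation
$$\frac{\partial\mathcal{L}}{\partial\mathbf{r}_i} \equiv \nabla_{\mathbf{r}_i}\mathcal{L}, \tag{3.26}$$
where the gradient is taken with respect to the vector variable $\mathbf{r}_i$.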
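If the Lagrangian of the system is of the form (3.11), this relation is simply
the principle of action and reaction of Newton. Indeed, if we consider a system
of two particles interacting via a potential $V(\mathbf{r}_1 - \mathbf{r}_2)$, we obtain
$$\nabla_{\mathbf{r}_1}V + \nabla_{\mathbf{r}_2}V = 0; \quad \text{i.e.,} \quad \mathbf{f}_1 = -\mathbf{f}_2. \tag{3.27}$$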
However, there is another interpretation of the result (3.25). Using the
definitions (3.12) and (3.13) of the momenta and their time derivatives, this
relation can be written as
$$\sum_{i=1}^{N}\dot{\mathbf{p}}_i = \frac{d\mathbf{P}}{dt} = 0, \tag{3.28}$$
where $\mathbf{P}$ is the total momentum $\mathbf{P} = \sum_{i=1}^{N}\mathbf{p}_i$.
Translation invariance in space implies conservation of the total momentum
of a system of particles.
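A small numerical illustration, under stated assumptions: two particles on a line interact through a potential that depends only on $x_1 - x_2$, so the forces are opposite and the total momentum should not change. The masses, the form of the potential, the integrator, and the function names are hypothetical choices made for this sketch.

```python
M1, M2 = 1.0, 3.0  # illustrative masses

def dV(d):
    """Derivative of an assumed interaction V(d) = d^2 + d^4/8, with d = x1 - x2."""
    return 2.0 * d + 0.5 * d**3

def rhs(s):
    """Equations of motion as a first-order system for (x1, x2, v1, v2)."""
    x1, x2, v1, v2 = s
    f1 = -dV(x1 - x2)              # force on particle 1; particle 2 feels -f1
    return (v1, v2, f1 / M1, -f1 / M2)

def rk4_step(s, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(s)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def total_momentum(s):
    return M1 * s[2] + M2 * s[3]

def momentum_drift(s=(0.0, 1.0, 0.5, -0.2), dt=1e-3, steps=5000):
    """Largest deviation of the total momentum from its initial value."""
    p0 = total_momentum(s)
    worst = 0.0
    for _ in range(steps):
        s = rk4_step(s, dt)
        worst = max(worst, abs(total_momentum(s) - p0))
    return worst
```

Each individual momentum oscillates during the run, while their sum stays fixed up to rounding errors.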
3.2.5 Angular Momentum and Rotations
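Consider rotations. An infinitesimal rotation of an angle $\delta\phi$ around an axis
along the unit vector $\mathbf{u}$ transforms positions and velocities as
$$\delta\mathbf{r}_i = \delta\phi\,\mathbf{u}\times\mathbf{r}_i, \qquad \delta\dot{\mathbf{r}}_i = \delta\phi\,\mathbf{u}\times\dot{\mathbf{r}}_i.$$
In this transformation, the variation of the Lagrangian is
$$\delta\mathcal{L} = \sum_i\left(\frac{\partial\mathcal{L}}{\partial\mathbf{r}_i}\cdot\delta\mathbf{r}_i + \frac{\partial\mathcal{L}}{\partial\dot{\mathbf{r}}_i}\cdot\delta\dot{\mathbf{r}}_i\right). \tag{3.29}$$
If there is rotation invariance, $\delta\mathcal{L} = 0$ for all $\delta\phi\,\mathbf{u}$. Coming back to the
definition of conjugate momenta and their derivatives, we obtain
$$\sum_i(\dot{\mathbf{r}}_i\times\mathbf{p}_i + \mathbf{r}_i\times\dot{\mathbf{p}}_i) = 0.$$
In other words,
$$\frac{d}{dt}\left(\sum_i\mathbf{r}_i\times\mathbf{p}_i\right) = \frac{d}{dt}\left(\sum_i\mathbf{L}_i\right) = \frac{d\mathbf{L}}{dt} = 0, \tag{3.30}$$
where the angular momentum $\mathbf{L}_i$ of each particle and the total angular momentum
$\mathbf{L}$ are defined by
$$\mathbf{L}_i = \mathbf{r}_i\times\mathbf{p}_i, \qquad \mathbf{L} = \sum_i\mathbf{L}_i. \tag{3.31}$$
To rotation invariance there corresponds the conservation of the total angular
momentum.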
3.2.6 Dynamical Symmetries
A problem may have symmetries of dynamical origin, which can be more or
less hidden. We will examine in Chapter 4 some of the many symmetries of
the harmonic oscillator.
The Kepler problem $V(r) = -g^2/r$ and $\mathcal{L} = mv^2/2 + g^2/r$ has a well-known
symmetry that comes from the conservation of the Lenz vector. This
vector is
$$\mathbf{A} = \frac{\mathbf{p}\times\mathbf{L}}{m} - g^2\,\frac{\mathbf{r}}{r}, \tag{3.32}$$
where $\mathbf{p}$ is the momentum and $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ the angular momentum of the
particle.
In Kepler's problem, we must determine six quantities as a function of
time, $\mathbf{r}(t)$ and $\dot{\mathbf{r}}(t)$. Conservation of angular momentum and energy fixes four of
them. The conservation of the Lenz vector, which is perpendicular to the
angular momentum and therefore lies in the plane of the trajectory, fixes the
two others. Therefore, the solution of the problem does not necessitate any
quadrature. One consequence is that in the case of bound states, the trajectory
is closed, which is exceptional: Only the harmonic potential ($\propto r^2$) and the
Newtonian potential ($\propto 1/r$) lead to this property.
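The conservation of the Lenz vector (3.32) can be observed numerically. The following sketch integrates a bound planar Kepler orbit and monitors the two in-plane components of $\mathbf{A}$. The units ($m = g^2 = 1$), the initial conditions, the Runge-Kutta integrator, and the function names are assumptions made for illustration.

```python
import math

M, G2 = 1.0, 1.0  # mass and coupling g^2 (illustrative values)

def rhs(s):
    """Newton's equations for V(r) = -g^2/r in the plane of the orbit."""
    x, y, vx, vy = s
    r3 = (x * x + y * y) ** 1.5
    return (vx, vy, -G2 * x / (M * r3), -G2 * y / (M * r3))

def rk4_step(s, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(s)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def lenz(s):
    """Components (A_x, A_y) of A = p x L / m - g^2 r / r for planar motion."""
    x, y, vx, vy = s
    px, py = M * vx, M * vy
    lz = x * py - y * px           # L = r x p points along the z axis
    r = math.hypot(x, y)
    return (py * lz / M - G2 * x / r, -px * lz / M - G2 * y / r)

def lenz_drift(s=(1.0, 0.0, 0.0, 0.8), dt=1e-3, steps=5000):
    """Largest deviation of either component of A from its initial value."""
    a0 = lenz(s)
    worst = 0.0
    for _ in range(steps):
        s = rk4_step(s, dt)
        a = lenz(s)
        worst = max(worst, abs(a[0] - a0[0]), abs(a[1] - a0[1]))
    return worst
```

With these initial conditions the run covers slightly more than one revolution, and $|\mathbf{A}|/g^2$ is the eccentricity of the ellipse.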
The invariance law of the Lagrangian that corresponds to this conservation
law is, in a sophisticated mathematical language, an $O(4)$ symmetry. We
shall not deal with that here. This necessitates some Lie-group theoretical
considerations that are beyond the scope of this book.
3.3 Velocity-Dependent Forces

3.3.1 Dissipative Systems
One can convince oneself that the formalisms of Lagrange and Newton coin-
cide in the case of conservative forces, which derive from potentials. However,
the Lagrangian formalism does not easily accommodate dissipative forces that
depend on the velocity, such as friction. Dissipative forces belong to the me-
chanics of continuous media, and we are not much concerned with that here. 3
We can nevertheless give, as a concrete example, a Lagrangian method
that can deal with simple dissipative systems by a trick. Consider a system
that loses energy by friction, Joule heating, or any other process. The trick
consists in coupling the system appropriately with a fictitious mirror system
that formally absorbs the energy in such a way that the total energy of the two
systems remains constant. Naturally, one only attributes a physical meaning
to quantities or results that possess one.
Consider, for definiteness, a damped harmonic oscillator in one dimension,
of coordinate $x$, whose equation of motion is
$$m\ddot{x} + R\dot{x} + kx = 0. \tag{3.33}$$
In order to obtain this result in the Lagrangian formalism, we introduce a
"mirror" oscillator of coordinate $x^*$ and the formal Lagrangian for the set of
the two coupled systems
$$\mathcal{L} = m\dot{x}\dot{x}^* - \frac{1}{2}R(x^*\dot{x} - x\dot{x}^*) - kxx^*. \tag{3.34}$$
The conjugate momenta are
$$p = m\dot{x}^* - Rx^*/2 \quad \text{and} \quad p^* = m\dot{x} + Rx/2. \tag{3.35}$$
They have nothing to do with the linear momentum of the damped oscillator
(3.33).
3 For a general treatment of dissipative forces, we refer to Chapter 3, Section 2, of
Morse and Feshbach [10].
Applying the Lagrange-Euler equations to the two variables $\{x, x^*\}$, one
obtains the two equations
$$m\ddot{x} + R\dot{x} + kx = 0 \quad \text{and} \quad m\ddot{x}^* - R\dot{x}^* + kx^* = 0. \tag{3.36}$$
The second equation represents an oscillator that "absorbs" the energy lost
by the damping of the first one.
The energy of the set of two systems
$$E = p\dot{x} + p^*\dot{x}^* - \mathcal{L} = m\dot{x}\dot{x}^* + kxx^* \tag{3.37}$$
is a constant of the motion. The amplitude of the variable $x^*$ increases as fast
as that of $x$ decreases.
This example will be useful in Chapter 5.
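The constancy of the formal energy $m\dot{x}\dot{x}^* + kxx^*$ of the pair of oscillators can be verified numerically. The sketch below integrates the damped oscillator and its mirror together; the parameter values, the initial conditions, the integrator, and the function names are illustrative assumptions.

```python
M, R, K = 1.0, 0.2, 1.0  # mass, friction coefficient, spring constant (illustrative)

def rhs(s):
    """Damped oscillator x and mirror oscillator xs as a first-order system."""
    x, xs, v, vs = s
    return (v, vs, (-R * v - K * x) / M, (R * vs - K * xs) / M)

def rk4_step(s, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(s)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def pair_energy(s):
    """Formal energy m x' xs' + k x xs of the two coupled systems."""
    x, xs, v, vs = s
    return M * v * vs + K * x * xs

def own_energy(x, v):
    """Usual mechanical energy of a single oscillator."""
    return 0.5 * M * v * v + 0.5 * K * x * x

def run(s=(1.0, 1.0, 0.0, 0.0), dt=1e-3, steps=10000):
    """Return the worst drift of the pair energy and the final state."""
    e0 = pair_energy(s)
    worst = 0.0
    for _ in range(steps):
        s = rk4_step(s, dt)
        worst = max(worst, abs(pair_energy(s) - e0))
    return worst, s
```

In the same run, the mechanical energy of the damped oscillator alone decreases while that of the mirror oscillator grows, which is the sense in which the mirror system "absorbs" the dissipated energy.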
3.3.2 Lorentz Force

On the other hand, fundamental interactions provide us with a velocity-
dependent conservative force, the Lorentz force, whose magnetic part does
not work.
Linear Term in the Velocity

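In order to prepare the argument, consider, in three-dimensional space, the
case of a Lagrangian whose potential part is linear in the velocity
$$\mathcal{L} = \tfrac{1}{2}m\dot{\mathbf{r}}^2 + \dot{\mathbf{r}}\cdot\mathbf{A}(\mathbf{r}, t), \tag{3.38}$$
where $\mathbf{A}(\mathbf{r}, t)$ is a given vector field.
The Lagrange equations give, for the $x$ component for instance,
$$\dot{\mathbf{r}}\cdot\frac{\partial\mathbf{A}(\mathbf{r}, t)}{\partial x} = m\ddot{x} + \frac{d}{dt}A_x(\mathbf{r}, t). \tag{3.39}$$
Taking into account
$$\frac{d}{dt}A_x(\mathbf{r}, t) = \dot{x}\frac{\partial A_x}{\partial x} + \dot{y}\frac{\partial A_x}{\partial y} + \dot{z}\frac{\partial A_x}{\partial z} + \frac{\partial A_x}{\partial t}, \tag{3.40}$$
we obtain
$$m\ddot{x} = \dot{y}\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) - \dot{z}\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) - \frac{\partial A_x}{\partial t}. \tag{3.41}$$
Therefore, if we introduce the vector field
$$\mathbf{B}(\mathbf{r}, t) = \nabla\times\mathbf{A}(\mathbf{r}, t), \tag{3.42}$$
we obtain the vector expression
$$m\ddot{\mathbf{r}} = \dot{\mathbf{r}}\times\mathbf{B}(\mathbf{r}, t) - \frac{\partial\mathbf{A}(\mathbf{r}, t)}{\partial t}, \tag{3.43}$$
whose form is of obvious interest.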
Maxwell's Equations and the Lorentz Force
Classically, a particle of charge $q$ placed in an electromagnetic field undergoes
the Lorentz force
$$\mathbf{f} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}).$$
This force is velocity dependent and does not derive from a potential. The
magnetic part $q\mathbf{v}\times\mathbf{B}$ does not work. Let $\Phi$ be the electric potential. The
potential energy boils down to its electric part $V = q\Phi$, and the total energy
is $E = mv^2/2 + q\Phi$.
The Lagrangian cannot be $\mathcal{L} = mv^2/2 - q\Phi$ since one would lose track of
the magnetic field.
The result (3.43) shows how a linear dependence of the Lagrangian on the
velocity can solve this problem, owing to the properties of the electromagnetic
field.
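Maxwell's homogeneous equations
$$\nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \tag{3.44}$$
allow us to express the fields $\mathbf{E}$ and $\mathbf{B}$ in terms of the scalar and the vector
potentials $\Phi$ and $\mathbf{A}$,
$$\mathbf{B} = \nabla\times\mathbf{A}, \qquad \mathbf{E} = -\nabla\Phi - \frac{\partial\mathbf{A}}{\partial t}. \tag{3.45}$$
Consider a particle of mass $m$ and charge $q$ placed in this electromagnetic
field. We denote as usual by $\mathbf{r}$ and $\dot{\mathbf{r}} = \mathbf{v}$ the position and velocity of the particle.
A possible Lagrangian for this particle is expressed in terms of the potentials
$\mathbf{A}$ and $\Phi$,
$$\mathcal{L} = \tfrac{1}{2}m\dot{\mathbf{r}}^2 + q\,\dot{\mathbf{r}}\cdot\mathbf{A}(\mathbf{r}, t) - q\Phi(\mathbf{r}, t). \tag{3.46}$$
As in (3.43), one can verify, by using the Lagrange equations and the expressions
of the fields in terms of the potentials,
that we obtain the required equation of motion $m\ddot{\mathbf{r}} = q(\mathbf{E} + \dot{\mathbf{r}}\times\mathbf{B})$.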
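To illustrate that the magnetic part of the Lorentz force does not work, the sketch below integrates $m\ddot{\mathbf{r}} = q(\mathbf{E} + \dot{\mathbf{r}}\times\mathbf{B})$ in uniform crossed fields and checks that $mv^2/2 + q\Phi$ stays constant. The field configuration ($\mathbf{E}$ along $x$, $\mathbf{B}$ along $z$, hence $\Phi = -E_0x$), the numerical values, the integrator, and the function names are assumptions chosen for this example.

```python
M, Q = 1.0, 1.0      # mass and charge (illustrative values)
E0, B0 = 0.3, 2.0    # assumed uniform fields E = (E0, 0, 0), B = (0, 0, B0)

def rhs(s):
    """Planar motion under the Lorentz force q(E + v x B)."""
    x, y, vx, vy = s
    ax = Q * (E0 + vy * B0) / M    # (v x B)_x = vy * B0
    ay = Q * (-vx * B0) / M        # (v x B)_y = -vx * B0
    return (vx, vy, ax, ay)

def rk4_step(s, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(s)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def total_energy(s):
    """m v^2/2 + q Phi, with Phi = -E0 x so that E = -grad(Phi)."""
    x, y, vx, vy = s
    return 0.5 * M * (vx * vx + vy * vy) + Q * (-E0 * x)

def energy_drift(s=(0.0, 0.0, 1.0, 0.0), dt=1e-3, steps=5000):
    """Largest deviation of the total energy from its initial value."""
    e0 = total_energy(s)
    worst = 0.0
    for _ in range(steps):
        s = rk4_step(s, dt)
        worst = max(worst, abs(total_energy(s) - e0))
    return worst
```

The magnetic field bends the trajectory but contributes nothing to the energy balance; only the term $q\Phi$ exchanges energy with the kinetic part.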

3.3.3 Gauge Invariance
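One thing may, however, seem surprising. We have expressed the Lagrangian
in terms of the potentials $\Phi$ and $\mathbf{A}$. However, these are not unique. The fields
$\mathbf{E}$ and $\mathbf{B}$ are invariant under gauge transformations,
$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\chi(\mathbf{r}, t), \qquad \Phi \to \Phi' = \Phi - \frac{\partial\chi}{\partial t}, \tag{3.47}$$
where $\chi(\mathbf{r}, t)$ is an arbitrary function.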
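If we insert this transformation in (3.46), we obtain
$$\mathcal{L}' = \mathcal{L} + q\left(\dot{\mathbf{r}}\cdot\nabla\chi(\mathbf{r}, t) + \frac{\partial\chi}{\partial t}\right). \tag{3.48}$$
The difference is a total time derivative
$$\mathcal{L}' = \mathcal{L} + q\frac{d}{dt}\chi(\mathbf{r}, t). \tag{3.49}$$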
Therefore, a gauge transformation does not affect the physics of the problem.
This is of course obvious in the equations of motion. It becomes less obvious
when one transposes the result in quantum mechanics. 4 Gauge invariance is
a dynamical symmetry that one can visualize as defining field theories. This
is the starting point of modern theories of fundamental interactions.
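The step from (3.48) to (3.49) is the chain rule, and it can be checked numerically along any path. In the sketch below, the gauge function $\chi$, the path $\mathbf{r}(t)$, and the function names are arbitrary illustrative choices; all derivatives are estimated by central finite differences.

```python
import math

def chi(x, y, z, t):
    """An arbitrary smooth gauge function (illustrative choice)."""
    return math.sin(x) * y + z * t**2

def path(t):
    """An arbitrary path r(t), not necessarily a physical trajectory."""
    return (math.cos(t), t**2, math.exp(-t))

def velocity(t):
    """Time derivative of the path above."""
    return (-math.sin(t), 2.0 * t, -math.exp(-t))

def extra_term(t, q=1.0, h=1e-6):
    """q (r' . grad chi + d chi/dt), the bracket of (3.48)."""
    x, y, z = path(t)
    vx, vy, vz = velocity(t)
    dx = (chi(x + h, y, z, t) - chi(x - h, y, z, t)) / (2 * h)
    dy = (chi(x, y + h, z, t) - chi(x, y - h, z, t)) / (2 * h)
    dz = (chi(x, y, z + h, t) - chi(x, y, z - h, t)) / (2 * h)
    dt = (chi(x, y, z, t + h) - chi(x, y, z, t - h)) / (2 * h)
    return q * (vx * dx + vy * dy + vz * dz + dt)

def total_derivative(t, q=1.0, h=1e-6):
    """q d/dt chi(r(t), t) along the path, as in (3.49)."""
    return q * (chi(*path(t + h), t + h) - chi(*path(t - h), t - h)) / (2 * h)
```

The two functions agree at every time up to the finite-difference error, whatever the path, which is why the added term cannot change the Lagrange-Euler equations.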
3.3.4 Momentum
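Consider now the conjugate momentum $\mathbf{p}$. From the definition (3.12), we
obtain
$$\mathbf{p} = m\dot{\mathbf{r}} + q\mathbf{A}(\mathbf{r}, t). \tag{3.50}$$
In other words, in a magnetic field, the momentum $\mathbf{p}$ does not coincide with
the linear momentum $m\dot{\mathbf{r}}$!
Similarly, the angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ does not coincide with $\mathbf{r}\times m\mathbf{v}$.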
3.4 Lagrangian of a Relativistic Particle
We can extend the Lagrangian formalism to the case of a relativistic particle.
We only consider here a particle that is free or placed in an electromagnetic
field.
The argument is based on Lorentz invariance. The least action principle
can only make sense if it yields the same equation of motion whatever the
relative state of free motion of the observer. We proceed as in Section 3.1.
We want to determine the path followed to go from $A(\mathbf{r}_1, t_1)$ to $B(\mathbf{r}_2, t_2)$ by
minimizing the action
$$S = \int_{t_1}^{t_2}\mathcal{L}\,dt. \tag{3.51}$$
3.4.1 Free Particle
Consider first a free particle of mass m. We know the result: The motion is
linear and uniform.
4 See, for instance [7], Chapter 15, Section 5.3.

Consider the time to go from A to B when the particle follows various
paths, each of which is characterized by some values of the acceleration. We
refer to books on special relativity, for instance [1], Section 4.3, and to the
twin paradox that contains all we need for our purpose. Among all possible
paths, the free motion corresponds to the largest proper time.
Let $dt$ be the time interval measured by an observer with a relative
velocity $v$ with respect to the particle. The proper time of the particle is
$d\tau = dt\sqrt{1 - v^2/c^2}$. Therefore, free motion maximizes the quantity
$$\int_{t_1}^{t_2}\sqrt{1 - \frac{v^2}{c^2}}\,dt, \tag{3.52}$$
which is, by construction, Lorentz invariant.
In order to recover a minimization principle and to obtain the non-relativistic
limit, which we already know, we choose the Lagrangian
$$\mathcal{L} = -mc^2\sqrt{1 - \frac{v^2}{c^2}}. \tag{3.53}$$
The action is
$$S = -mc^2\int_{t_1}^{t_2}\sqrt{1 - \frac{v^2}{c^2}}\,dt. \tag{3.54}$$
This action is Lorentz invariant, whereas the Lagrangian (3.53) is not. This
comes from the fact that in the present approach, time, over which we integrate,
plays a special role. One can get rid of this problem, but we shall not
do it here.
We remark that in the limit of small velocities, we recover the non-relativistic
Lagrangian up to a constant: $\mathcal{L} \simeq -mc^2 + mv^2/2$.
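The quality of this small-velocity approximation can be made concrete with a few lines of code. The sketch below compares the relativistic Lagrangian with its non-relativistic form; it works in units where $c = 1$ by default, which is an assumption made to keep the arithmetic well conditioned, and the function names are illustrative.

```python
import math

def lagrangian_rel(m, v, c=1.0):
    """Relativistic free-particle Lagrangian -m c^2 sqrt(1 - v^2/c^2)."""
    return -m * c**2 * math.sqrt(1.0 - (v / c) ** 2)

def lagrangian_nr(m, v, c=1.0):
    """Non-relativistic form -m c^2 + m v^2/2, including the constant."""
    return -m * c**2 + 0.5 * m * v**2

def gap(m, v, c=1.0):
    """Difference between the two expressions at velocity v."""
    return lagrangian_rel(m, v, c) - lagrangian_nr(m, v, c)
```

Expanding the square root shows that the leading correction is $mv^4/(8c^2)$, so the gap is negligible for $v \ll c$ and becomes noticeable only when $v$ is a sizable fraction of $c$.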
3.4.2 Energy and Momentum
We deduce the expression of the energy and momentum by following the
same method as in Section 3.2. These quantities are of interest since they are
conserved if space-time is homogeneous. This holds in any reference system.
The conjugate momentum is
$$\mathbf{p} = \frac{\partial\mathcal{L}}{\partial\mathbf{v}} = \frac{m\mathbf{v}}{\sqrt{1 - v^2/c^2}}. \tag{3.55}$$
The energy is
$$E = \mathbf{p}\cdot\mathbf{v} - \mathcal{L} = \frac{mc^2}{\sqrt{1 - v^2/c^2}}. \tag{3.56}$$
We see that the set of four quantities $\{E/c, \mathbf{p}\}$ is a four-vector of space-time.
Energy and momentum satisfy the relation
$$E^2 = p^2c^2 + m^2c^4. \tag{3.57}$$
Two points must be emphasized. First, the Lagrangian formalism allows us
to prove that the set $\{E/c, \mathbf{p}\}$ is a four-vector of space-time, whereas neither
energy nor momentum are defined a priori, and we have only worked with
positions and velocities. Second, this property follows, of course, from our
starting assumption (3.54) based on the relativistic invariance of physical laws.
The observed velocity of the particle is related to its momentum and to
its energy by
$$\mathbf{v} = \frac{\mathbf{p}c^2}{E}. \tag{3.58}$$
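These relations lend themselves to a quick numerical check in one dimension. The sketch below obtains the momentum from the Lagrangian by a finite difference, forms the energy as $pv - \mathcal{L}$, and compares both with the closed forms. Units with $c = 1$ and all function names are illustrative assumptions.

```python
import math

C = 1.0  # units with c = 1 (an assumption for readability)

def lagrangian(m, v):
    """Free relativistic Lagrangian -m c^2 sqrt(1 - v^2/c^2)."""
    return -m * C**2 * math.sqrt(1.0 - v * v / C**2)

def momentum(m, v):
    """Closed form m v / sqrt(1 - v^2/c^2)."""
    return m * v / math.sqrt(1.0 - v * v / C**2)

def energy(m, v):
    """Closed form m c^2 / sqrt(1 - v^2/c^2)."""
    return m * C**2 / math.sqrt(1.0 - v * v / C**2)

def momentum_numeric(m, v, h=1e-6):
    """dL/dv estimated by a central finite difference."""
    return (lagrangian(m, v + h) - lagrangian(m, v - h)) / (2 * h)

def energy_from_legendre(m, v):
    """E = p v - L, using the closed-form momentum."""
    return momentum(m, v) * v - lagrangian(m, v)
```

For any $|v| < c$, the quantities returned by `momentum` and `energy` satisfy $E^2 = p^2c^2 + m^2c^4$ and $v = pc^2/E$ up to rounding.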
3.4.3 Interaction with an Electromagnetic Field
Consider now a particle of electric charge $q$ and mass $m$, in an electromagnetic
field.
We want to determine the form of the interaction Lagrangian $\mathcal{L}_I$ of the
particle and the field. If we insert the sum $(\mathcal{L} + \mathcal{L}_I)$ in (3.51), the Lagrange-Euler
equations must give us the equation of motion $\dot{\mathbf{p}} = \mathbf{f}$, where $\mathbf{f}$ is the
Lorentz force.
Actually, we already know the answer because the interaction part in (3.46)
is a relativistic formula!
One can recover this form using relativistic invariance considerations.
We want $\mathcal{L}_I\,dt$ to be Lorentz invariant, as is $\mathcal{L}\,dt$ above.
Consider the potential four-vector $A^\mu = (\Phi/c, \mathbf{A})$ of the electromagnetic
field and the velocity four-vector of the particle $u^\mu = (\gamma c, \gamma\mathbf{v})$, where
$\gamma = 1/\sqrt{1 - v^2/c^2}$. The scalar product
$$u^\mu A_\mu = \gamma(\Phi - \mathbf{v}\cdot\mathbf{A}) \tag{3.59}$$
is a Lorentz invariant quantity.
Therefore, the quantity
$$\mathcal{L}_I = -\frac{q}{\gamma}\,u^\mu A_\mu = q(\mathbf{v}\cdot\mathbf{A} - \Phi) \tag{3.60}$$
has the properties wanted. This expression is identical to the interaction
term of (3.46). It is called the "minimal interaction" of a charged particle
with an electromagnetic field.
We therefore obtain the expression of the relativistic Lagrangian of a particle
of mass $m$ and charge $q$ in an electromagnetic field that derives from the
potentials $\Phi$ and $\mathbf{A}$,
$$\mathcal{L} = -mc^2\sqrt{1 - \frac{v^2}{c^2}} + q(\mathbf{v}\cdot\mathbf{A} - \Phi). \tag{3.61}$$
The equation of motion follows from the standard procedure.

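1. Conjugate Momentum
Let $\mathbf{p}$ be the momentum in the absence of the field, as defined by (3.55):
$$\mathbf{p} = \frac{m\mathbf{v}}{\sqrt{1 - v^2/c^2}}. \tag{3.62}$$
The conjugate momentum $\mathbf{P} = \partial\mathcal{L}/\partial\mathbf{v}$ is related to this momentum $\mathbf{p}$ by
$$\mathbf{P} = \frac{\partial\mathcal{L}}{\partial\mathbf{v}} = \mathbf{p} + q\mathbf{A}. \tag{3.63}$$
2. Lagrange-Euler Equations
The equation of motion follows from the Lagrange-Euler equations
$$\frac{d\mathbf{P}}{dt} = \frac{\partial\mathcal{L}}{\partial\mathbf{r}}. \tag{3.64}$$
We have
$$\frac{\partial\mathcal{L}}{\partial\mathbf{r}} = q(\nabla(\mathbf{v}\cdot\mathbf{A}) - \nabla\Phi), \tag{3.65}$$
which yields
$$\frac{d\mathbf{P}}{dt} = \frac{d(\mathbf{p} + q\mathbf{A})}{dt} = q(\nabla(\mathbf{v}\cdot\mathbf{A}) - \nabla\Phi). \tag{3.66}$$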
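3. Equations of Motion
We use the relations
$$\frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + \left(\dot{x}\frac{\partial\mathbf{A}}{\partial x} + \dot{y}\frac{\partial\mathbf{A}}{\partial y} + \dot{z}\frac{\partial\mathbf{A}}{\partial z}\right) = \frac{\partial\mathbf{A}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{A} \tag{3.67}$$
and
$$\nabla(\mathbf{v}\cdot\mathbf{A}) = (\mathbf{v}\cdot\nabla)\mathbf{A} + \mathbf{v}\times(\nabla\times\mathbf{A}). \tag{3.68}$$
This leads to the equation of motion
$$\frac{d\mathbf{p}}{dt} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}), \tag{3.69}$$
where the momentum $\mathbf{p}$ and the velocity $\mathbf{v}$ are related by (3.62).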

4. We must take care of the relation (3.62). If we define the kinetic energy
$\mathcal{E}_{\rm kin}$ by
$$\mathcal{E}_{\rm kin} = \sqrt{p^2c^2 + m^2c^4}, \tag{3.70}$$
by taking the derivative of this equation with respect to time, and taking
into account the definition (3.62), we obtain
$$\frac{d\mathcal{E}_{\rm kin}}{dt} = \mathbf{v}\cdot\frac{d\mathbf{p}}{dt}. \tag{3.71}$$
Inserting this in equation (3.69) and taking into account that $\mathbf{v}\cdot(\mathbf{v}\times\mathbf{B}) = 0$,
we obtain the anticipated result
$$\frac{d\mathcal{E}_{\rm kin}}{dt} = q\,\mathbf{v}\cdot\mathbf{E}, \tag{3.72}$$
where $\mathbf{E}$ is the electric field. Only the electric field works and modifies the
kinetic energy and the norm of the velocity.
3.5 Problems
3.1. Sliding Pendulum

Consider a pendulum of length $l$ and mass $m_2$ hanging on a point of mass $m_1$
that moves horizontally without friction on a rail. We denote by $x$ the abscissa of
$m_1$ and by $\phi$ the angle with the vertical direction. Write the Lagrangian of this
system.
3.2. Properties of the Action

1. We consider a free particle of Lagrangian $\mathcal{L} = m\dot{x}^2/2$. Calculate the action
along the physical trajectory in terms of the positions and times of
departure $(x_1, t_1)$ and arrival $(x_2, t_2)$.
2. Consider a harmonic oscillator $\mathcal{L} = (m/2)(\dot{x}^2 - \omega^2x^2)$. Calculate the action,
setting $T = t_2 - t_1$.
3. Calculate the action for a constant force $\mathcal{L} = m\dot{x}^2/2 - Fx$.
4. Show that the momentum at the point of arrival $x_2$ is given by $p_2 = \partial S/\partial x_2$.
5. Show that the energy $E = p\dot{x} - \mathcal{L}$ at the point of arrival $x_2$ is given by $E = -\partial S/\partial t_2$.
3.3. Conjugate Momenta in Spherical Coordinates

We consider a non-relativistic particle of mass $m$ in a central potential $V(r)$,
where $r = \sqrt{x^2 + y^2 + z^2}$. We denote the velocity $\mathbf{v} \equiv \dot{\mathbf{r}}$ and $v^2$ its square.
We study the problem in spherical coordinates $(r, \theta, \phi)$ defined by
$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta. \tag{3.73}$$
The square of the velocity is
$$v^2 = \dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2. \tag{3.74}$$
1. Write the Lagrangian of the particle in spherical coordinates.
2. Calculate the conjugate momenta $p_r$, $p_\theta$, and $p_\phi$.
3. Show that the momentum $p_\phi$ is equal to the $z$ component of the angular
momentum $L_z$ whose expression in Cartesian coordinates is $L_z = xp_y - yp_x$.
4. To what invariance law does the conservation of $L_z$ correspond?
5. If the particle is charged and placed in a magnetic field $\mathbf{B}$ parallel to the
$z$ axis, is the component $L_z$ conserved?
4
Hamilton's Canonical Formalism
It is in the silence of laws that great actions are born.

Donatien-Alphonse-François, Marquis de Sade
The work of Lagrange was followed by the monumental five-volume Traité de
Mécanique Céleste (Treatise of Celestial Mechanics) of Pierre-Simon Laplace,
published between 1799 and 1825. This work had crucial importance both in
astronomy and in the evolution of philosophical ideas.
This leads us to the 1830s and to the so-called canonical formulation of
analytical mechanics of Hamilton. 1
Hamilton's canonical formalism was elaborated in 1834. It is more con-
venient for a series of problems such as the dynamics of point-like particles.
But it is impressive, above all, in the number of its developments, both in
physics and in mathematics. In the present book, we are mainly concerned
with applications to mechanics, but we shall describe several other spin-offs
of Hamilton's work. In Section 4.1, we give this canonical formalism, which
consists in describing the state of a system by the conjugate variables posi-
tions {x} and Lagrange conjugate momenta {p}, and not by positions and
velocities. In other words, a system is described by a point in phase space; it
is characterized by a Hamiltonian that is obtained from the Lagrangian by a
Legendre transformation.
After finding Hamilton's canonical equations, which are first-order coupled
differential equations for the evolution of state variables, we shall present, in
Section 4.2, some aspects of dynamical systems. In fact, this type of physical
problem was an amazing source of discoveries, both in mathematics and in
physics. Henri Poincaré founded this field of research in 1885 when he studied
the three-body problem. This led to fascinating developments, such as the
behavior for $t = \infty$, attractors and strange attractors, bifurcations, and chaos,
for example. The most famous strange attractor is the Lorenz attractor, named
after Edward N. Lorenz, who discovered it in 1963 in a mathematical model
1 As in the previous chapter, one can refer to Landau and Lifshitz, [8] and [1], and
to Herbert Goldstein [9], for any development missing here.

for the evolution of the atmosphere. Lorenz created a new and spectacular
source of interest in chaos with his "butterfly" effect in meteorology.
In Section 4.3, we introduce the Poisson brackets, which bear a mathe-
matical structure of great interest and whose applications are closer to what
we are concerned with here. Jacobi considered them as Poisson's greatest dis-
covery. In fact, Poisson brackets bear the starting point of the theory of Lie
groups. We shall use them to define canonical transformations, which have
many applications and show that there is a complete equivalence between
the two types of state variables: positions {x} and momenta {p}. From the
mathematical point of view, phase space is the space that is appropriate to
describe the evolution of a set of points, as opposed to the "empirical" space
of positions and velocities. We shall establish the Liouville theorem, which is
a remarkable geometric property of the evolution of a system in phase space.
We will then be able to see in a natural way the amazing discovery of Dirac
in 1925. There is a remarkable similarity between analytical mechanics and
quantum mechanics if one replaces the classical Poisson brackets by the commutators
(divided by $i\hbar$) of quantum observables. In Section 4.4, we shall
extend these considerations to the case of a charged particle in a magnetic
field, where precisely the conjugate momentum and the linear momentum
differ radically.
Section 4.5 is devoted to the Hamilton-Jacobi equation, where one chooses
to work directly with the action as a function of the variables (x,p) and no
longer with the Lagrangian or the Hamiltonian. After we have established the
major properties and the Hamilton-Jacobi equation, we will discover an im-
pressive series of results. We shall see how, for conservative systems, the flow
of trajectories is orthogonal to the surfaces of constant action. From that point
of view, we will see that the Maupertuis principle can be cast in a completely
geometric form. At that point, we will be able to understand how geometrical
optics appears as the limit of wave optics, as was discovered by Hamilton.
The proof involves what is called the eikonal, which is the optical analog of
the action (divided by the wavelength). In the approximation of small wave-
lengths, called the eikonal approximation, the wave propagates with a wave
vector that is locally perpendicular to the surfaces of constant eikonal. The
surfaces are the geometric wave fronts. We will see that the eikonal equation
corresponds exactly to the Fermat principle. The geometric interpretation is
nothing but the Huygens-Fresnel principle. Finally, we will show that the
same methodology can be applied to the Schrödinger equation in wave mechanics.
This constitutes the famous semiclassical approximation of Gregor
Wentzel, Hendrik Anthony Kramers and Léon Brillouin.
4.1 Hamilton's Canonical Formalism
Actually, the formulation (3.2) of the least action principle is not due to
Lagrange (who used a more complicated form). It was formulated by Hamilton
in 1834. Hamilton, one of the greatest figures of science, was fascinated by

Lagrange and by his analytical mechanics which he called a "scientific poem
written by the Shakespeare of mathematics."
Hamilton's canonical formalism was formulated in 1834. It is more convenient
for some categories of problems, but it goes far beyond that. It contains a
particularly fruitful mathematical structure that led to Lie groups, dynamical
systems, and many other developments.
Hamilton's starting point is to describe the state of a system by the variables
$x_i$, positions, and $p_i$, the conjugate momenta, instead of $x_i$ and $\dot{x}_i$.
4.1.1 Canonical Equations
Suppose that we invert equations (3.12) and that we can calculate the $\{\dot{x}_i\}$
in terms of the $\{x_i\}$ and $\{p_i\}$, which are our new state variables. 2
The problem is to obtain the equations of motion of the $\{x_i\}$ and $\{p_i\}$ in
terms of these same variables by eliminating the $\{\dot{x}_i\}$.
The solution consists in performing what is called a Legendre transformation.
Let us introduce the Hamilton function, or Hamiltonian,
$$H = \sum_i p_i\dot{x}_i - \mathcal{L}. \tag{4.1}$$
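Consider, for simplicity, a one-dimensional problem, and let us write the total
differential of $H$,
$$dH = p\,d\dot{x} + \dot{x}\,dp - \frac{\partial\mathcal{L}}{\partial x}\,dx - \frac{\partial\mathcal{L}}{\partial\dot{x}}\,d\dot{x} - \frac{\partial\mathcal{L}}{\partial t}\,dt.$$
Taking into account (3.12) and (3.13), the first and fourth terms cancel, and
the third one is $-\dot{p}\,dx$. Therefore, we have
$$dH = \dot{x}\,dp - \dot{p}\,dx - \frac{\partial\mathcal{L}}{\partial t}\,dt, \tag{4.2}$$
which provides us with the equations of motion
$$\dot{x} = \frac{\partial H}{\partial p}, \qquad \dot{p} = -\frac{\partial H}{\partial x}. \tag{4.3}$$
For a system with any number of degrees of freedom, we have
$$\dot{x}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial x_i}, \tag{4.4}$$
which are called the canonical equations of Hamilton.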
2 Conjugate momenta always exist since the Lagrangian contains a quadratic term
in $\dot{x}_i$.
Legendre transformations are often used for performing changes of variables. One
chooses the most convenient set of variables according to the nature of the phys-
ical problem under consideration. A simple example is that of thermodynamic
potentials. Starting from the energy U = W +Q, which is convenient if one works
with the volume and the entropy, dU = -PdV + TdS, one goes to the enthalpy
H = U + PV if one works with the pressure and the entropy dH = V dP + TdS,
to the free energy F = U - T S if one works with the volume and the temperature
dF = -PdV - SdT, and the free enthalpy, the Gibbs function G = F + PV, if
one works with the temperature and the pressure dG = -SdT + VdP.
Hamilton's equations (4.4) form a set of first-order coupled differential

equations in the time variable, which is a major advantage. They are sym-
metric in x and p (up to the minus sign, to which we shall return). They
possess the major technical advantage of presenting directly the time evolu-
tion of the state variables in terms of these same variables.
The value of the Hamilton function is, of course, the energy (3.18). If the
Lagrangian does not depend explicitly on time, $\partial\mathcal{L}/\partial t = 0$, then $\partial H/\partial t = 0$
and the energy is conserved,
$$\frac{dH}{dt} = 0. \tag{4.5}$$
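As a minimal sketch of how the canonical equations are used in practice, the code below integrates $\dot{x} = \partial H/\partial p$, $\dot{p} = -\partial H/\partial x$ for a one-dimensional Hamiltonian and monitors the value of $H$. The pendulum-like potential, the finite-difference estimate of the partial derivatives, the Runge-Kutta integrator, and the function names are all illustrative assumptions.

```python
import math

M = 1.0  # mass (illustrative value)

def hamiltonian(x, p):
    """H = p^2/(2m) + V(x) with an assumed potential V(x) = 1 - cos x."""
    return p * p / (2.0 * M) + (1.0 - math.cos(x))

def flow(x, p, h=1e-6):
    """Right-hand side of Hamilton's equations (dH/dp, -dH/dx),
    with the partial derivatives estimated by central finite differences."""
    dH_dp = (hamiltonian(x, p + h) - hamiltonian(x, p - h)) / (2 * h)
    dH_dx = (hamiltonian(x + h, p) - hamiltonian(x - h, p)) / (2 * h)
    return dH_dp, -dH_dx

def rk4_step(x, p, dt):
    """One classical fourth-order Runge-Kutta step in phase space."""
    k1x, k1p = flow(x, p)
    k2x, k2p = flow(x + dt / 2 * k1x, p + dt / 2 * k1p)
    k3x, k3p = flow(x + dt / 2 * k2x, p + dt / 2 * k2p)
    k4x, k4p = flow(x + dt * k3x, p + dt * k3p)
    return (x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            p + dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p))

def energy_drift(x=1.0, p=0.0, dt=1e-3, steps=10000):
    """Largest deviation of H from its initial value along the run."""
    h0 = hamiltonian(x, p)
    worst = 0.0
    for _ in range(steps):
        x, p = rk4_step(x, p, dt)
        worst = max(worst, abs(hamiltonian(x, p) - h0))
    return worst
```

Only the function `hamiltonian` needs to be changed to study another system; the state is a point $(x, p)$ of phase space and the equations are first order in time, as stated above.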
4.2 Dynamical Systems
More generally, if we denote by $X(t) = (\mathbf{r}_i(t), \mathbf{p}_i(t))$ the position of the system
at time $t$ in phase space, Hamilton's equations are of the form $\dot{X} = F(X)$;
i.e., a first-order differential equation for the evolution of the $2N$-component
vector $X(t)$. This is called a dynamical system.
This type of problem has been an amazing source of discoveries both in
mathematics and in physics; one can refer to the book by I. Percival and
D. Richards [11]. This field of investigations was founded by Henri Poincaré
in 1885, in particular when he made his celebrated analysis of the three-body
problem (the really difficult problem of mechanics). A large number of
famous mathematicians have studied this type of problem, which is still at
the forefront of research in mathematics. J.-C. Yoccoz was awarded the Fields
Medal in 1994 for his results on this subject, which he studied together with
Michael Herman.
One studies the whole set of possible motions, which is called the flow of
these vectors. This leads to fascinating problems such as limiting problems
at $t = \infty$, attractors, and strange attractors; bifurcations, which are sudden
changes in the nature of these flows for certain values of the parameters enter-
ing the function F(X); and chaos and the "butterfly effect" in meteorology,
for example.
4.2.1 Poincaré and Chaos in the Solar System
Poincaré considered a gravitating system involving more than two bodies, say
planets around the sun. Considering two sets of initial conditions as close to
one another as one wishes, Poincaré proved that there is a time when two
of the planets can be as far away from each other (and from their starting
point) as one wishes. 3 This effect is called chaos. It is encountered in many
other physical problems. According to the system under consideration, the
characteristic time for chaos to show up varies considerably.
A very simple example of a chaotic system is the throw of dice. In classical
mechanics, if one could determine the conditions of the problem extremely
accurately (the initial conditions, the way the dice are thrown, the geometry
of the dice, etc.), one could in principle predict the result of a throw, and
the phenomenon would lose its probabilistic character. However, it is
intuitively clear that the outcome is highly sensitive to the initial conditions
and that the prediction would require an enormous amount of information. It is
therefore much more efficient in practice to use a probabilistic description of
the problem, in which one imposes some ignorance of the initial conditions,
which are said to be chosen "at random." The same phenomenon is encountered in
celestial mechanics, and in many other problems, when initial conditions are
close but not "infinitesimally" close, provided the time of evolution is long
enough.
The case of three planets of unequal masses orbiting around a "sun,"
taking into account their mutual interactions, is shown in Figure 4.1. At the
beginning, everything evolves rather smoothly. However, after some time, the
lightest planet is simply ejected from the system; this is of course compatible
with energy conservation, which would not be the case for a two-body system.
By letting the computer run for a longer time, the two other planets, which
have a smooth motion at first, also reach unexpected configurations.
4.2.2 The Butterfly Effect and the Lorenz Attractor
As noted earlier, the most famous strange attractor is probably the Lorenz
attractor, and it generated a spectacular amount of new interest in chaos.
Consider the evolution of a rectangular slice of the atmosphere that is
heated from below and cooled from above. There are three variables: x, the
convective flow of the atmosphere; y, the horizontal temperature distribution;
and z, the vertical temperature distribution.
³ In the 19th century, Laplace and others had extensively developed perturbation
theory, which provided extremely accurate predictions for celestial mechanics.
Actually, in the course of his work, Poincaré showed that the perturbation
expansion never converges; it is only an asymptotic expansion.
Fig. 4.1. Evolution of three planets around a star, taking into account their mutual
interactions. The time sequence of the pictures must be read from left to right
and from bottom to top. The time interval between two pictures is the same. One
sees that at the eleventh stage, the third planet, lighter and initially close to the
second one, is expelled from the system. Pictures are from Jean-François Colonna,
colonna@cmap.polytechnique.fr, http://www.lactamme.polytechnique.fr; all rights
reserved.
The details of the physics involved are of little interest here. In the Lorenz
model, the evolution of these variables is given by the (Hamiltonian) nonlinear
differential system
\[
\frac{dx}{dt} = \sigma (y - x), \qquad
\frac{dy}{dt} = \rho x - y - x z, \qquad
\frac{dz}{dt} = x y - \beta z, \tag{4.6}
\]
where $\sigma$ is the ratio of the viscosity to the thermal conductivity, $\rho$ is the
temperature difference between the top and the bottom of the slice, and $\beta$ is
the ratio of the width to the height of the slice.
Lorenz used to solve this problem numerically, using hours of computer
time at night,⁴ by standard successive iteration techniques
$(x_i, y_i, z_i) \to (x_{i+1}, y_{i+1}, z_{i+1})$. At that time, this generated kilograms
of paper called computer listings. One day, Lorenz had the idea of redoing a
calculation whose solution he had found the day before, using as a starting
point not the last point obtained the day before but some intermediate value
$(x_i, y_i, z_i)$ obtained in the calculation. To his great surprise, after a
relatively small number of iterations, the following values appeared completely
different from those obtained previously. Lorenz had rediscovered chaos, due,
in that case, to round-off errors of the numbers he used.
⁴ As an order of magnitude of the performance of computers in the early 1960s, his
first computer, called Royal McBee, could perform 60 multiplications per second.
The sensitivity of the results to the initial conditions induces the same
type of difference between two solutions initially close to one another. Lorenz
called that the "butterfly effect." Actually, the title of one of his talks was:
Can the beat of a butterfly's wing in Brazil cause a tornado in Texas? Whether
or not it is a coincidence, the "Lorenz attractor" has the shape of the wings
of a butterfly.
In Figures 4.2 and 4.3, one can see the result of an iteration of the equations
(4.6). We notice that the time evolution of the point (x, y, z) has a perfectly
quiet behavior (the point turns around on a wing of the attractor) but that
unexpectedly it "jumps" from one wing to the other at certain times. This
occurs unexpectedly in space as well as in time, in the sense that the trajec-
tories of two points that are initially very close (in position and velocity) can
become completely different at a later time. In particular, the two positions
can be on different wings of the attractor.
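The sensitivity to initial conditions described above can be reproduced with a few lines of code. The following is a minimal sketch, not Lorenz's original procedure: it integrates (4.6) with a fourth-order Runge-Kutta scheme and follows two trajectories whose initial values of x differ by $10^{-8}$. The parameter values $\sigma = 10$, $\rho = 28$, $\beta = 8/3$, the time step, and the starting point are assumptions made for the illustration; they are not given in the text.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system (4.6); parameter values are assumed."""
    x, y, z = state
    return (sigma * (y - x), rho * x - y - x * z, x * y - beta * z)


def rk4_step(f, state, dt):
    """One fourth-order Runge-Kutta step for dX/dt = f(X)."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation_history(start, delta=1e-8, dt=0.01, steps=4000):
    """Distance between two trajectories whose initial x values differ by delta."""
    a = start
    b = (start[0] + delta, start[1], start[2])
    history = []
    for _ in range(steps):
        dist = sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
        history.append(dist)
        a = rk4_step(lorenz, a, dt)
        b = rk4_step(lorenz, b, dt)
    return history


history = separation_history((1.0, 1.0, 1.0))
print("initial separation:", history[0])
print("largest separation:", max(history))
```

With these assumed values, the separation grows from $10^{-8}$ to the size of the attractor itself within the simulated time span, which is the numerical signature of the "butterfly effect."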
Fig. 4.2. Lorenz attractor viewed from two different sides. The points correspond
to a discrete numerical iteration of (4.6). One can follow the points and observe the
sudden and unexpected transition from one wing of the attractor to the other, which
was not possible to predict half a semiperiod before. (Courtesy of Jean-François
Colonna.)
4.3 Poisson Brackets and Phase Space
Consider two physical quantities f and g, which are functions of the state
variables $(x_i, p_i)$, $i = 1, \ldots, N$, and possibly of time. One calls the Poisson
bracket of f and g the quantity
\[
\{f, g\} = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \frac{\partial g}{\partial p_i}
 - \frac{\partial f}{\partial p_i} \frac{\partial g}{\partial x_i} \right). \tag{4.7}
\]
Fig. 4.3. Projection of the Lorenz attractor on the (x, z) plane. (Courtesy of
Jean-François Colonna.)
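Poisson brackets have the following properties, which are straightforward
to establish:
\[
\{f, g\} = -\{g, f\}, \qquad \{f_1 + f_2, g\} = \{f_1, g\} + \{f_2, g\}, \tag{4.8}
\]
\[
\{f_1 f_2, g\} = f_1 \{f_2, g\} + \{f_1, g\} f_2. \tag{4.9}
\]
For the state variables $(x_i, p_i)$, we have the important relations
\[
\{x_i, x_j\} = 0, \qquad \{p_i, p_j\} = 0, \qquad \{x_i, p_j\} = \delta_{ij}, \tag{4.10}
\]
and
\[
\{x_i, f\} = \frac{\partial f}{\partial p_i}, \qquad
\{p_i, f\} = -\frac{\partial f}{\partial x_i}. \tag{4.11}
\]
One obtains with no difficulty the Jacobi identity
\[
\{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0. \tag{4.12}
\]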
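As a short worked check, the relations (4.10) and (4.11) follow directly from the definition (4.7), because the state variables are independent: $\partial x_i/\partial x_j = \partial p_i/\partial p_j = \delta_{ij}$ and $\partial x_i/\partial p_j = \partial p_i/\partial x_j = 0$. The summation index j below is introduced only for this illustration.

```latex
\begin{align*}
\{x_i, f\} &= \sum_{j=1}^{N}\left(
   \frac{\partial x_i}{\partial x_j}\frac{\partial f}{\partial p_j}
 - \frac{\partial x_i}{\partial p_j}\frac{\partial f}{\partial x_j}\right)
 = \frac{\partial f}{\partial p_i}, \\
\{p_i, f\} &= \sum_{j=1}^{N}\left(
   \frac{\partial p_i}{\partial x_j}\frac{\partial f}{\partial p_j}
 - \frac{\partial p_i}{\partial p_j}\frac{\partial f}{\partial x_j}\right)
 = -\frac{\partial f}{\partial x_i}.
\end{align*}
```

Choosing $f = x_j$ or $f = p_j$ in these two lines reproduces the three relations (4.10).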
4.3.1 Time Evolution and Constants of the Motion
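Consider a physical quantity $f(x_1, p_1, \ldots, x_N, p_N; t) \equiv f([x_i, p_i]; t)$, where we
denote by $[x_i, p_i]$ the set of variables $(x_1, p_1, \ldots, x_N, p_N)$ in order to avoid any
confusion with Poisson brackets. Its time evolution is given by
\[
\dot{f} = \frac{df}{dt} = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \dot{x}_i
 + \frac{\partial f}{\partial p_i} \dot{p}_i \right) + \frac{\partial f}{\partial t}. \tag{4.13}
\]
Using Hamilton's equations (4.4), we obtain
\[
\dot{f} = \{f, H\} + \frac{\partial f}{\partial t}. \tag{4.14}
\]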
In particular, the canonical equations (4.4) are now written in a symmetric
way:
\[
\dot{x}_i = \{x_i, H\}, \qquad \dot{p}_i = \{p_i, H\}. \tag{4.15}
\]
In the canonical formalism, the Hamiltonian governs the time evolution of
the system. If a physical quantity f does not depend explicitly on time (i.e.,
$\partial f/\partial t = 0$, which amounts to saying that the system is isolated), then its time
evolution is obtained by taking the Poisson bracket of f and the Hamiltonian:
\[
\dot{f} = \{f, H\}. \tag{4.16}
\]
Therefore, if the Poisson bracket of a quantity f and the Hamiltonian
vanishes, f is a constant of the motion.
The Poisson brackets, which were invented in 1809, are far from being sim-
ple technical tools. Jacobi considered them to be Poisson's greatest invention.
In fact, the Poisson brackets contain the germ of the theory of Lie groups.
Poisson Theorem
Theorem 1. If f and g are two constants of the motion, then their Poisson
bracket is also a constant of the motion.
This theorem of Poisson can be derived from the Jacobi identity (4.12),
\[
\{H, \{f, g\}\} + \{f, \{g, H\}\} + \{g, \{H, f\}\} = 0. \tag{4.17}
\]
We assume that f and g are constants of the motion; i.e., $\{g, H\} = 0$ and
$\{H, f\} = 0$. Therefore,
\[
\{H, \{f, g\}\} = 0
\]
and $\{f, g\}$ is a constant of the motion. In certain cases, this allows one to find
new constants of the motion.
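A familiar illustration of the Poisson theorem is angular momentum: if $L_x$ and $L_y$ are conserved, so is $\{L_x, L_y\} = L_z$. The sketch below checks this numerically at one phase-space point. It is an illustration only: the function `poisson_bracket` is a hypothetical helper that evaluates (4.7) with central finite differences, and the central potential $-1/r$, the unit mass, and the sample point are assumed values.

```python
def poisson_bracket(f, g, x, p, h=1e-5):
    """Approximate {f, g} of (4.7) at the phase-space point (x, p).

    f and g take two lists (coordinates, momenta); the partial derivatives
    are estimated with central finite differences of step h.
    """
    def partial(func, which, i):
        x_plus, p_plus = list(x), list(p)
        x_minus, p_minus = list(x), list(p)
        if which == "x":
            x_plus[i] += h
            x_minus[i] -= h
        else:
            p_plus[i] += h
            p_minus[i] -= h
        return (func(x_plus, p_plus) - func(x_minus, p_minus)) / (2.0 * h)

    return sum(
        partial(f, "x", i) * partial(g, "p", i)
        - partial(f, "p", i) * partial(g, "x", i)
        for i in range(len(x))
    )


# Components of the angular momentum L = r x p.
def Lx(x, p):
    return x[1] * p[2] - x[2] * p[1]


def Ly(x, p):
    return x[2] * p[0] - x[0] * p[2]


def Lz(x, p):
    return x[0] * p[1] - x[1] * p[0]


def H(x, p, m=1.0):
    """Particle in a central potential, H = p^2/(2m) - 1/r (assumed example)."""
    r = (x[0] ** 2 + x[1] ** 2 + x[2] ** 2) ** 0.5
    return (p[0] ** 2 + p[1] ** 2 + p[2] ** 2) / (2.0 * m) - 1.0 / r


x0, p0 = [0.7, -0.4, 1.1], [0.3, 0.9, -0.5]  # arbitrary sample point
print("{Lx, Ly} - Lz =", poisson_bracket(Lx, Ly, x0, p0) - Lz(x0, p0))
print("{Lx, H}      =", poisson_bracket(Lx, H, x0, p0))
print("{Lz, H}      =", poisson_bracket(Lz, H, x0, p0))
```

All three printed numbers should vanish up to finite-difference and rounding errors: $L_x$ is conserved, and the bracket $\{L_x, L_y\}$, which equals $L_z$, is conserved as well.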
4.3.2 Canonical Transformations
In the Lagrangian formalism, the Lagrange-Euler equations keep the same
form in any change of coordinates $x_i \to X_i(x_1, \ldots, x_n)$ (for instance, changing
from Cartesian coordinates (x, y, z) to polar coordinates $(r, \theta, \phi)$). These
changes of coordinates in configuration space are called point transformations.
In the Hamiltonian formalism, there exists a much larger class of
transformations under which the equations of motion are invariant. One can indeed
mix the state variables (i.e., the positions $\{x_i\}$ and the conjugate momenta
$\{p_i\}$) and perform a transformation in phase space, which we will see below.
A canonical transformation is a coordinate transformation
\[
X_i = X_i(x_1, \ldots, x_N, p_1, \ldots, p_N; t), \qquad
P_i = P_i(x_1, \ldots, x_N, p_1, \ldots, p_N; t), \tag{4.18}
\]
such that Hamilton's equations keep the same form in the new variables. Let
$H'(X_1, \ldots, X_N, P_1, \ldots, P_N; t)$ be Hamilton's function expressed in terms of
the new variables $[X_i, P_i]$. Then, in a canonical transformation, by definition
one has
\[
\dot{X}_i = \frac{\partial H'}{\partial P_i}, \qquad
\dot{P}_i = -\frac{\partial H'}{\partial X_i}. \tag{4.19}
\]
The following theorem is of great practical importance.
Theorem 2. A transformation $[x_i, p_i] \to [X_i, P_i]$ that preserves the Poisson
brackets is a canonical transformation. It preserves the equations of motion.
This amounts to requiring that the Poisson brackets expressed in terms of
the new variables be the same as those expressed in the initial variables; i.e.,
\[
\{f, g\}_{[X_i, P_i]} = \{f, g\}_{[x_i, p_i]}. \tag{4.20}
\]
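We can give a direct proof. In order to simplify things, consider a
time-independent transformation and a couple of variables $(x, p) \to (X(x, p), P(x, p))$
such that $\{X, P\} = 1$. We denote by $H(x, p)$ and $H'(X, P)$ the expression of
Hamilton's function in these two systems of variables.
In the variables (x, p), the time evolution of X and P is $\dot{X} = \{X, H\}$ and
$\dot{P} = \{P, H\}$, which can be written, for instance, as
\[
\dot{X} = \frac{\partial X}{\partial x} \frac{\partial H}{\partial p}
 - \frac{\partial X}{\partial p} \frac{\partial H}{\partial x}. \tag{4.21}
\]
The Hamilton function in the new variables is expressed as
\[
H'(X, P) = H(x(X, P), p(X, P)) \tag{4.22}
\]
and its inverse as
\[
H(x, p) = H'(X(x, p), P(x, p)). \tag{4.23}
\]
Differentiating H with respect to x and p in the previous expression, we
obtain
\[
\frac{\partial H}{\partial x} = \frac{\partial H'}{\partial X} \frac{\partial X}{\partial x}
 + \frac{\partial H'}{\partial P} \frac{\partial P}{\partial x}, \qquad
\frac{\partial H}{\partial p} = \frac{\partial H'}{\partial X} \frac{\partial X}{\partial p}
 + \frac{\partial H'}{\partial P} \frac{\partial P}{\partial p}. \tag{4.24}
\]
Using this in (4.21), and in the analogous expression for $\dot{P}$, we obtain
\[
\dot{X} = \left( \frac{\partial X}{\partial x} \frac{\partial P}{\partial p}
 - \frac{\partial X}{\partial p} \frac{\partial P}{\partial x} \right) \frac{\partial H'}{\partial P}
 = \{X, P\} \frac{\partial H'}{\partial P},
\]
\[
\dot{P} = \left( \frac{\partial P}{\partial x} \frac{\partial X}{\partial p}
 - \frac{\partial P}{\partial p} \frac{\partial X}{\partial x} \right) \frac{\partial H'}{\partial X}
 = -\{X, P\} \frac{\partial H'}{\partial X}.
\]
Since, by assumption, $\{X, P\} = 1$, we indeed recover the canonical equations
\[
\dot{X} = \frac{\partial H'}{\partial P}, \qquad
\dot{P} = -\frac{\partial H'}{\partial X}. \qquad \text{QED} \tag{4.25}
\]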
Comments
1. The extension to an arbitrary number N of variables, with, by assumption,
$\{X_i, X_j\} = 0$, $\{P_i, P_j\} = 0$, $\{X_i, P_j\} = \delta_{ij}$, causes no difficulty.
2. We see that since canonical transformations mix coordinates and
momenta, there is no fundamental difference between these two types of state
variables. In the Hamiltonian formalism, the notions of space coordinates
and momenta (more or less associated with linear momenta) lose their
intuitive meaning.
For this reason, one usually calls these variables canonical conjugate
variables, which are denoted by $(q_i, p_i)$ with the relations $\{q_i, p_j\} = \delta_{ij}$,
$\{q_i, q_j\} = \{p_i, p_j\} = 0$. The very simple canonical transformation (X = p,
P = -x) shows that, in that sense, these two variables can be "exchanged."
The canonical conjugate variables characterize the state of the
system by a point in phase space (see below).
3. The Hamiltonian motion therefore appears at each time t as performing
a canonical transformation of the state variables.
4. In general, one calls canonical conjugate variables two physical quantities
q and p such that $\{q, p\} = 1$. One example, in spherical coordinates, is
the azimuthal angle $\varphi$ and the z component of the angular momentum $L_z$
(see Problem 3.3 of Chapter 3).
Example: The One-Dimensional Harmonic Oscillator and Angle-Action Variables
Consider a one-dimensional harmonic oscillator of Hamiltonian
$H = p^2/(2m) + m\omega^2 x^2/2$, where x and p are canonical conjugate variables. The
transformation $x = X/\sqrt{m\omega}$ and $p = P\sqrt{m\omega}$ is a canonical transformation,
$\{X, P\} = 1$, and, with these variables, the Hamiltonian is written as
$H = \omega(P^2 + X^2)/2$.
The rotation in phase space
\[
\Xi = X \cos\theta + P \sin\theta, \qquad \Pi = P \cos\theta - X \sin\theta, \tag{4.26}
\]
where $\theta$ is an arbitrary fixed angle, is a canonical transformation. The
expression of Hamilton's function is of the same form: $H = \omega(\Pi^2 + \Xi^2)/2$ and
$\{\Xi, \Pi\} = \{X, P\} = 1$.
This is a simple but important example of a dynamical symmetry of the
system. In the present case, it is one example of the numerous symmetry
properties of the harmonic oscillator. This argument can be extended to N
degrees of freedom.
Dirac's method of creation and annihilation operators in quantum
mechanics⁵ relies directly on this symmetry.
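For completeness, the two statements made about the rotation (4.26) can be verified in a few lines. The following worked check uses only the definition (4.7) of the Poisson bracket, taken with respect to the variables (X, P).

```latex
\begin{align*}
\{\Xi, \Pi\}
 &= \frac{\partial \Xi}{\partial X}\frac{\partial \Pi}{\partial P}
  - \frac{\partial \Xi}{\partial P}\frac{\partial \Pi}{\partial X}
  = \cos\theta\,\cos\theta - \sin\theta\,(-\sin\theta) = 1, \\
\Xi^2 + \Pi^2
 &= (X\cos\theta + P\sin\theta)^2 + (P\cos\theta - X\sin\theta)^2
  = X^2 + P^2 .
\end{align*}
```

The first line shows that the rotation is canonical, and the second that the Hamiltonian keeps the same form in the new variables.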
Cyclic Variable
The above dynamical symmetry can be exploited further. In phase space,
which is two-dimensional, (X, P), in this example, we can use polar
coordinates by introducing the variables $(A, \varphi)$ defined by
\[
X = \sqrt{2A} \cos\varphi, \qquad P = \sqrt{2A} \sin\varphi, \tag{4.27}
\]
which amounts to
\[
A = \frac{X^2 + P^2}{2}, \qquad \varphi = \arctan\left(\frac{P}{X}\right). \tag{4.28}
\]
The variables $(A, \varphi)$ are canonical conjugate variables, as one can check with
no difficulty. In these variables, the Hamiltonian reduces to the simple
expression $H = \omega A$. Hence we have the equations of motion
\[
H = \omega A, \quad \{A, \varphi\} = 1 \;\Rightarrow\; \dot{A} = 0, \quad \dot{\varphi} = \omega, \tag{4.29}
\]
whose solution is obvious:
\[
A = E/\omega = \text{constant}, \qquad \varphi = \omega t + \varphi_0. \tag{4.30}
\]
Here, E is the energy of the oscillator, a constant of the motion. The
interesting point about this operation is that we have reduced the problem to a
single time-dependent variable, the angle $\varphi$. Since the energy, which is
proportional to the action A, is conserved, only the angular variable $\varphi$ evolves.
The variable $\varphi$ is a cyclic variable. It does not appear explicitly in the
Hamiltonian, and this results in the properties (4.29) and (4.30).
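The claim that A and $\varphi$ are canonical conjugate variables can be checked explicitly. The short computation below evaluates the bracket with respect to (X, P), using (4.7) and the expressions (4.28).

```latex
\begin{align*}
\frac{\partial A}{\partial X} &= X, &
\frac{\partial A}{\partial P} &= P, &
\frac{\partial \varphi}{\partial X} &= -\frac{P}{X^2 + P^2}, &
\frac{\partial \varphi}{\partial P} &= \frac{X}{X^2 + P^2},
\end{align*}
\begin{equation*}
\{A, \varphi\}
 = \frac{\partial A}{\partial X}\frac{\partial \varphi}{\partial P}
 - \frac{\partial A}{\partial P}\frac{\partial \varphi}{\partial X}
 = \frac{X^2}{X^2 + P^2} + \frac{P^2}{X^2 + P^2} = 1 .
\end{equation*}
```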
The geometric interpretation in the (X, P) space, which here is equivalent
to phase space, is simple. Since $X^2 + P^2 = 2A$, the motion occurs on a circle of
radius $\sqrt{2A} = \sqrt{2E/\omega}$, which depends on the time-independent value of the
energy E. On this circle, the motion of the point (X, P) is uniform and of
angular velocity $\omega$: $\varphi = \omega t + \varphi_0$.
We already mentioned cyclic variables in Section 3.2.2. This is a simple
example of the role played by such variables, in particular in the investigation
of integrable systems.
4.3.3 Phase Space; Liouville's Theorem
The evolution in phase space $[x_i, p_i]$ is a geometrical representation of
particular interest in mechanics. A point in phase space corresponds to a state
of the system. When the system evolves, this point moves in phase space. A
volume element of phase space is defined by
\[
d\mu = dx_1 \cdots dx_N \, dp_1 \cdots dp_N. \tag{4.31}
\]
⁵ See, for instance [7], Chapter 7, Section 5.
Consider an arbitrary volume $\mu$ of phase space, $\mu = \int d\mu$. We claim that
this volume is invariant under canonical transformations,
\[
\int dx_1 \cdots dx_N \, dp_1 \cdots dp_N = \int dX_1 \cdots dX_N \, dP_1 \cdots dP_N. \tag{4.32}
\]
Indeed, in this change of variables, we have
\[
dX_1 \cdots dX_N \, dP_1 \cdots dP_N = J \, dx_1 \cdots dx_N \, dp_1 \cdots dp_N,
\]
where J is the Jacobian of the transformation. Now, the Jacobian of a
canonical transformation is equal to one. This is obvious in the simple example
(4.26) above.
If we consider the simple case of only one couple of conjugate variables
$(x, p) \to (X, P)$ as in (4.21), the proof is simple. Indeed, the Jacobian is
simply the Poisson bracket $\{X, P\}$,
\[
J = \frac{\partial X}{\partial x} \frac{\partial P}{\partial p}
 - \frac{\partial X}{\partial p} \frac{\partial P}{\partial x} = \{X, P\} = 1, \tag{4.33}
\]
which is equal to one by definition. The extension to N conjugate variables
$x_1 \ldots x_N, p_1 \ldots p_N$ is more lengthy but proceeds from the same observation.
Consider now a volume $\mu$ of phase space. Each point of this volume evolves
according to Hamilton's equations. Consequently, at any time t, the variables
$(x_i(t), p_i(t))$ are canonical variables. Therefore, the motion can be seen as
performing at each time interval a canonical transformation of the state
variables in phase space $(x_i(t), p_i(t)) \to (x_i(t'), p_i(t'))$. We therefore obtain the
Liouville theorem, which is of great importance in statistical physics:
Theorem 3. A volume of phase space remains unchanged during the Hamil-
tonian evolution of the system.
This remarkable geometric property derives from the structure of Hamil-
ton's equations. It is independent of the specific form of the Hamiltonian
itself.
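Liouville's theorem can be observed numerically: the Jacobian determinant of the map $(x(0), p(0)) \to (x(t), p(t))$ generated by Hamilton's equations should be equal to one. The sketch below estimates this determinant by finite differences. The pendulum Hamiltonian $H = p^2/2 - \cos x$, the Runge-Kutta integrator, the final time, and the starting point are all assumptions chosen for the illustration; any other Hamiltonian could be substituted.

```python
import math


def pendulum_rhs(x, p):
    """Hamilton's equations for H = p^2/2 - cos(x) (an assumed example)."""
    return p, -math.sin(x)


def flow(x, p, t_final=5.0, dt=0.01):
    """Integrate Hamilton's equations with RK4 and return (x(t_final), p(t_final))."""
    steps = int(round(t_final / dt))
    for _ in range(steps):
        k1x, k1p = pendulum_rhs(x, p)
        k2x, k2p = pendulum_rhs(x + 0.5 * dt * k1x, p + 0.5 * dt * k1p)
        k3x, k3p = pendulum_rhs(x + 0.5 * dt * k2x, p + 0.5 * dt * k2p)
        k4x, k4p = pendulum_rhs(x + dt * k3x, p + dt * k3p)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        p += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return x, p


def flow_jacobian_det(x0, p0, eps=1e-5):
    """Determinant of d(x(t), p(t)) / d(x0, p0), by central differences."""
    xa, pa = flow(x0 + eps, p0)
    xb, pb = flow(x0 - eps, p0)
    xc, pc = flow(x0, p0 + eps)
    xd, pd = flow(x0, p0 - eps)
    dx_dx0, dp_dx0 = (xa - xb) / (2.0 * eps), (pa - pb) / (2.0 * eps)
    dx_dp0, dp_dp0 = (xc - xd) / (2.0 * eps), (pc - pd) / (2.0 * eps)
    return dx_dx0 * dp_dp0 - dx_dp0 * dp_dx0


det = flow_jacobian_det(1.0, 0.5)
print("Jacobian determinant of the flow:", det)
```

The determinant stays at one, within the accuracy of the integrator and of the finite differences, even though the individual entries of the Jacobian matrix change with time: an area element is deformed but its area is preserved.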
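Another interesting geometric property in phase space is the following.
Hamilton's function H(x, p) is defined in phase space. In this space, consider
a vector field whose components are $(\dot{x}, \dot{p})$; i.e.,
\[
\dot{x} = \frac{\partial H}{\partial p}, \qquad \dot{p} = -\frac{\partial H}{\partial x}.
\]
One calls the flow of this vector field the set of curves whose tangents at each
point are collinear with the vector at this point. We notice that the flow of
$(\dot{x}, \dot{p})$, also called the Hamiltonian flow, is orthogonal at each point to the
gradient of the Hamiltonian at this point,
\[
\nabla H = \left( \frac{\partial H}{\partial x}, \frac{\partial H}{\partial p} \right).
\]
Indeed, $(\dot{x}, \dot{p}) \cdot \nabla H = (\partial H/\partial p)(\partial H/\partial x) - (\partial H/\partial x)(\partial H/\partial p) = 0$.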
In the example (4.26) above, the result is very simple. The trajectories
in the (X, P) plane are circles centered at the origin, and the gradient of
$H = (P^2 + X^2)/2$ lies along straight lines going through the origin. This can
be stated in the converse way: The gradient of $H = (P^2 + X^2)/2$ lies along
straight lines going through the origin, and therefore the trajectories are
circles centered at the origin. This result can be generalized to any number of
variables. One can express the conservation laws of energy, momentum, and
angular momentum with geometrical considerations of that kind (using the
corresponding invariance properties).
4.3.4 Analytical Mechanics and Quantum Mechanics
The formulas above reveal an amazing fact. There is a strong analogy, if not
more, between the structures of analytical mechanics and quantum mechanics.
In quantum mechanics, one proves quite easily what is called the Ehrenfest
theorem:⁶ The time derivative of the expectation value $\langle a \rangle$ of a physical
quantity A is related to the commutator of the observable $\hat{A}$ and the Hamiltonian
$\hat{H}$ by the relation
\[
\frac{d}{dt} \langle a \rangle = \frac{1}{i\hbar} \langle [\hat{A}, \hat{H}] \rangle
 + \left\langle \frac{\partial \hat{A}}{\partial t} \right\rangle. \tag{4.34}
\]
If, by definition, we introduce an operator $\dot{\hat{A}}$ such that, whatever the state
vector $|\psi\rangle$, we have
\[
\langle \psi | \dot{\hat{A}} | \psi \rangle := \frac{d}{dt} \langle a \rangle,
\]
then we have the equality of observables
\[
\dot{\hat{A}} = \frac{1}{i\hbar} [\hat{A}, \hat{H}] + \frac{\partial \hat{A}}{\partial t}, \tag{4.35}
\]
which has the same structure as (4.14) if one replaces the Poisson brackets by
the commutators of the quantum observables, divided by $i\hbar$.
The same remark applies to the canonical commutation relations of the
conjugate variables of position x and momentum p,
\[
[\hat{x}_j, \hat{p}_k] = i\hbar \, \delta_{jk}, \tag{4.36}
\]
which can be compared with (4.10).

This similarity, if not identity, between the structures of the two mechanics
was one of the first major discoveries of Paul Adrien Maurice Dirac during the
summer of 1925 (he was 23). Dirac, after finding that the noncommutativity
of quantum observables was actually the foundation of Werner Heisenberg's
matrix mechanics, had decided to construct a new formulation of mechanics
that would incorporate this noncommutativity in a well-defined way. One day,
he remembered the structure of Poisson brackets and saw that they played,
formally, a role similar to that of quantum commutators, divided by $i\hbar$. In
September 1925, he was able to construct what is called "quantum mechanics"
in its present form. Of course, the mathematical nature and the physical
interpretation of the quantities are different in the two cases, but the equations
that relate them are the same if one postulates the correspondence between
Poisson brackets in analytical mechanics and the quantum commutators,
divided by $i\hbar$, in quantum mechanics.
More generally, in complex problems (large numbers of degrees of freedom,
constraints between variables, etc.), the systematic method of obtaining the
commutation relations of quantum observables consists in referring to the
classical Poisson brackets and replacing them by the quantum commutators
(divided by $i\hbar$).
4.4 Charged Particle in an Electromagnetic Field

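Consider the form (3.46) of the Lagrangian of a charged particle placed in an
electromagnetic field,
\[
\mathcal{L} = \frac{1}{2} m \dot{\mathbf{r}}^2 + q \, \dot{\mathbf{r}} \cdot \mathbf{A}(\mathbf{r}, t) - q \Phi(\mathbf{r}, t). \tag{4.37}
\]
The conjugate momentum is
\[
\mathbf{p} = m \dot{\mathbf{r}} + q \mathbf{A}(\mathbf{r}, t). \tag{4.38}
\]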
4.4.1 Hamiltonian
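Equation (4.38) is easily inverted, $\dot{\mathbf{r}} = (\mathbf{p} - q\mathbf{A}(\mathbf{r}, t))/m$, and hence we have
the Hamiltonian
\[
H = \frac{1}{2m} (\mathbf{p} - q \mathbf{A}(\mathbf{r}, t))^2 + q \Phi(\mathbf{r}, t), \tag{4.39}
\]
which is expressed in terms of the potentials $\mathbf{A}$ and $\Phi$, and not the fields $\mathbf{E}$
and $\mathbf{B}$.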
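The intermediate step leading to (4.39) is the usual Legendre transformation $H = \mathbf{p} \cdot \dot{\mathbf{r}} - \mathcal{L}$. Written out with (4.37) and (4.38), it reads as follows.

```latex
\begin{align*}
H &= \mathbf{p}\cdot\dot{\mathbf{r}} - \mathcal{L}
   = (m\dot{\mathbf{r}} + q\mathbf{A})\cdot\dot{\mathbf{r}}
     - \tfrac{1}{2} m \dot{\mathbf{r}}^2 - q\,\dot{\mathbf{r}}\cdot\mathbf{A} + q\Phi \\
  &= \tfrac{1}{2} m \dot{\mathbf{r}}^2 + q\Phi
   = \frac{1}{2m}\,(\mathbf{p} - q\mathbf{A})^2 + q\Phi .
\end{align*}
```

The terms linear in $\mathbf{A}$ cancel, so the vector potential enters the Hamiltonian only through the combination $\mathbf{p} - q\mathbf{A}$.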
As an exercise, one can write the Hamiltonian in the relativistic case (3.61),
and the result is
\[
H = \sqrt{m^2 c^4 + c^2 (\mathbf{p} - q \mathbf{A})^2} + q \Phi, \tag{4.40}
\]
where one discovers the "prescription" to introduce the electromagnetic field.
One must substitute $\mathbf{p} - q\mathbf{A}$ for the momentum $\mathbf{p}$ and $E - q\Phi$ for the energy
E in the expression of the energy-momentum relation for a free particle.
It is that prescription that Schrödinger applied to the free-wave equation
for de Broglie waves in order to calculate the energy levels of the hydrogen
atom. After some unexpected mismatches, he finally ended up with his
celebrated equation.
4.4.2 Gauge Invariance
As in Section 3.3.3, we see that the Hamiltonian is expressed in terms of

the potentials and not in terms of the fields themselves. Contrary to what
happens in Section 3.3.3, it is not obvious a priori that the result will be
gauge invariant. One can check this directly on the equations of motion, of
course.
In the expression (4.39) of the Hamiltonian, the quantity $(\mathbf{p} - q\mathbf{A})/m$ is the
velocity of the particle. It is a measurable quantity (in m s⁻¹), independent
of the gauge. Similarly, the energy is $E = mv^2/2 + q\Phi$, as it should be. However,
the conjugate momentum $\mathbf{p}$ depends on the gauge.
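The gauge dependence of the conjugate momentum can be made explicit. In the sketch below, $\chi(\mathbf{r}, t)$ denotes an arbitrary gauge function (a notation introduced here for the illustration), and the usual gauge transformation of the potentials is applied to (4.37) and (4.38).

```latex
\begin{align*}
\mathbf{A}' &= \mathbf{A} + \nabla\chi, \qquad
\Phi' = \Phi - \frac{\partial \chi}{\partial t}, \\
\mathcal{L}' &= \mathcal{L}
  + q\left(\dot{\mathbf{r}}\cdot\nabla\chi + \frac{\partial \chi}{\partial t}\right)
  = \mathcal{L} + q\,\frac{d\chi}{dt}, \\
\mathbf{p}' &= \frac{\partial \mathcal{L}'}{\partial \dot{\mathbf{r}}}
  = \mathbf{p} + q\nabla\chi
  \quad\Longrightarrow\quad
  \mathbf{p}' - q\mathbf{A}' = \mathbf{p} - q\mathbf{A} .
\end{align*}
```

The Lagrangian changes by a total time derivative, which leaves the equations of motion unchanged; the momentum $\mathbf{p}$ is shifted by $q\nabla\chi$, but the velocity $(\mathbf{p} - q\mathbf{A})/m$ is not.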
One can see⁷ how quantum mechanics deals with the subtle problems of
gauge invariance, which leads directly to the basic principles that dictate the
form of fundamental interactions. One can also devise experimental setups
that show directly that the Hamiltonian is expressed as a function of the
potentials and not the fields. An example can be found in the Aharonov and
Bohm experiment.⁸
4.5 The Action and the Hamilton-Jacobi Equation
The least action principle consists in finding the equations of motion by min-
imizing the action, which is itself defined in terms of the Lagrangian and the
endpoints of the trajectory by (3.2).
It is useful, however, to work directly with the action itself as a physical
quantity. In order to do this, we will first express the action in terms of the
coordinates and time; i.e., $S(x_1, x_2, \ldots, x_n; t)$. In the case of one degree of
freedom, this amounts to calculating the values of S along the set of physical
trajectories as a function of the time and position of arrival (x, t), the
starting time and position being fixed. Equivalently, we want to characterize the
various trajectories starting at $(x_1, t_1)$ and arriving at (x, t) with the value of
the action $S(x, t; x_1, t_1)$.
⁸ A. Tonomura, N. Osakabe, T. Matsuda, T. Kawasaki, J. Endo, S. Yano and H.
Yamada, Phys. Rev. Lett. 56, 792 (1986).
The action is defined by
\[
S = \int_{t_1}^{t} \mathcal{L}(x, \dot{x}, t') \, dt'. \tag{4.41}
\]
In this expression, the variables $(x(t), \dot{x}(t))$ are assumed to take their physical
values that satisfy the Lagrange-Euler equations.
4.5.1 The Action as a Function of the Coordinates and Time
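We start from the variation of the action written in (3.5),
\[
\delta S = \int_{t_1}^{t} \left( \frac{\partial \mathcal{L}}{\partial x} \delta x(t')
 + \frac{\partial \mathcal{L}}{\partial \dot{x}} \delta \dot{x}(t') \right) dt'. \tag{4.42}
\]
We integrate the second term by parts, but we do not impose that we end
up at the same point x(t) but rather in its vicinity $x(t) + \delta x(t)$ (we maintain
$\delta x(t_1) = 0$). The integrated term does not cancel out and we obtain
\[
\delta S = \frac{\partial \mathcal{L}}{\partial \dot{x}} \delta x(t)
 + \int_{t_1}^{t} \left( \frac{\partial \mathcal{L}}{\partial x}
 - \frac{d}{dt'} \frac{\partial \mathcal{L}}{\partial \dot{x}} \right) \delta x(t') \, dt'. \tag{4.43}
\]
By assumption, the trajectory is physical, so the right-hand side integral
vanishes. We obtain the variation of the action as
\[
\delta S = \frac{\partial \mathcal{L}}{\partial \dot{x}} \delta x(t) = p \, \delta x(t), \tag{4.44}
\]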
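or, more generally,
\[
\delta S = \sum_{i=1}^{N} p_i \, \delta x_i. \tag{4.45}
\]
Therefore, the partial derivatives of the action with respect to the
coordinates are simply the conjugate momenta
\[
\frac{\partial S}{\partial x_i} = p_i, \quad \text{or in general} \quad
\frac{\partial S}{\partial q_i} = p_i, \tag{4.46}
\]
if one works with an arbitrary set of canonical conjugate variables $[q_i, p_i]$.
Similarly, we can calculate the variation of the action if we vary the time of
arrival t. From the definition (4.41), we obviously obtain
\[
\frac{dS}{dt} = \mathcal{L}. \tag{4.47}
\]
Now, if we consider the action as a function of coordinates and time, we have
\[
\frac{dS}{dt} = \frac{\partial S}{\partial t} + \sum_{i=1}^{N} \frac{\partial S}{\partial x_i} \dot{x}_i
 = \frac{\partial S}{\partial t} + \sum_{i=1}^{N} p_i \dot{x}_i. \tag{4.48}
\]
Putting together the two equalities, we see that the (partial) derivative of the
action with respect to time is, up to a sign, the Hamiltonian
\[
\frac{\partial S}{\partial t} = \mathcal{L} - \sum_{i=1}^{N} p_i \dot{x}_i = -H, \tag{4.49}
\]
and the total differential of the action can therefore be written in terms of the
coordinates and time as
\[
dS = \sum_{i=1}^{N} p_i \, dx_i - H \, dt. \tag{4.50}
\]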
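We notice that no reference is made to the initial position and time. The
formal expression of the action is therefore
\[
S = \int \left( \sum_{i=1}^{N} p_i \, dx_i - H \, dt \right). \tag{4.51}
\]
Hamilton's least action principle amounts to writing $\delta S = 0$. Indeed, (4.51)
gives directly
\[
\delta \int \left( \sum_{i=1}^{N} p_i \dot{x}_i - H \right) dt = \delta \int \mathcal{L} \, dt = 0, \tag{4.52}
\]
which is the form (3.2) we used as a starting point in the previous chapter.
However, we work here with the conjugate variables (x, p) and not the
variables $(x, \dot{x})$ of Chapter 3. The canonical equations of Hamilton follow
directly from the expression of the action (4.51). Indeed, consider the variables
x and p as independent variables, and consider for simplicity a single degree
of freedom. The action is then
\[
S = \int_{(1)}^{(2)} (p \, dx - H \, dt). \tag{4.53}
\]
We vary x by $\delta x$ and p by $\delta p$, imposing that $\delta x(2) = \delta x(1) = 0$. The variation
of S is
\[
\delta S = \int_{(1)}^{(2)} \left( \delta p \, dx + p \, d(\delta x)
 - \frac{\partial H}{\partial x} \delta x \, dt - \frac{\partial H}{\partial p} \delta p \, dt \right). \tag{4.54}
\]
The second term in the integral can be integrated by parts. The integrated
term $(p \, \delta x)$ disappears since we assume $\delta x(2) = \delta x(1) = 0$. We therefore
obtain
\[
\delta S = \int_{(1)}^{(2)} \left[ \delta p \left( dx - \frac{\partial H}{\partial p} dt \right)
 - \delta x \left( dp + \frac{\partial H}{\partial x} dt \right) \right], \tag{4.55}
\]
which vanishes for any variation $(\delta x, \delta p)$ if and only if the integrands vanish
identically, i.e.,
\[
dx - \frac{\partial H}{\partial p} dt = 0, \qquad dp + \frac{\partial H}{\partial x} dt = 0,
\]
where we recognize Hamilton's canonical equations.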
4.5.2 The Hamilton-Jacobi Equation and Jacobi Theorem
The Hamilton-Jacobi equation can be read off from (4.49) and (4.46). In the
Hamilton function, we can replace the momenta $p_i$ by the partial derivatives
of the action. This leads to
\[
\frac{\partial S}{\partial t} + H\left( x_1, \ldots, x_N, \frac{\partial S}{\partial x_1}, \ldots,
 \frac{\partial S}{\partial x_N}; t \right) = 0. \tag{4.56}
\]
The Hamilton-Jacobi equation is a nonlinear first-order partial differential
equation. In the same way as for the Lagrange-Euler equations or the
canonical equations, it can be used to calculate the motion. Deciding which of these
formalisms should be used is a matter of convenience, in particular according
to the mathematical structure of the problem.
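As a concrete check of (4.56), one can take a case where the action is known in closed form. The sketch below uses the standard action of a one-dimensional harmonic oscillator going from $(x_1, 0)$ to $(x, t)$, namely $S = m\omega[(x^2 + x_1^2)\cos\omega t - 2 x x_1]/(2 \sin\omega t)$, and verifies numerically that $\partial S/\partial t + H(x, \partial S/\partial x) = 0$. The values of m, $\omega$, $x_1$, and the test point are assumptions made for the illustration, and the derivatives are estimated by central finite differences.

```python
import math

M, OMEGA = 1.0, 2.0  # illustrative mass and angular frequency (assumed values)


def action(x, t, x1=0.3):
    """Classical action of the oscillator from (x1, 0) to (x, t)."""
    s, c = math.sin(OMEGA * t), math.cos(OMEGA * t)
    return M * OMEGA * ((x * x + x1 * x1) * c - 2.0 * x * x1) / (2.0 * s)


def hamilton_jacobi_residual(x, t, h=1e-5):
    """Left-hand side of (4.56): dS/dt + H(x, dS/dx), by central differences."""
    dS_dt = (action(x, t + h) - action(x, t - h)) / (2.0 * h)
    dS_dx = (action(x + h, t) - action(x - h, t)) / (2.0 * h)
    hamiltonian = dS_dx ** 2 / (2.0 * M) + 0.5 * M * OMEGA ** 2 * x ** 2
    return dS_dt + hamiltonian


residual = hamilton_jacobi_residual(x=0.8, t=0.4)
print("Hamilton-Jacobi residual:", residual)
```

The residual vanishes up to the finite-difference error. The same function also gives the momentum at arrival through (4.46), $p = \partial S/\partial x$.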
One can refer to Section 48 of the book by Landau and Lifshitz [8] for
a discussion of this point, which involves a more complete formulation of
canonical transformations than what we have done in Section 4.3.2.
It is perhaps more instructive to see the method in a specific example. We
will in particular see that the Jacobi theorem has great practical importance.
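We consider a problem, of which Kepler's problem is a particular case, in
spherical coordinates $(r, \theta, \phi)$. Consider the Hamiltonian
\[
H = \frac{1}{2m} \left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2 \sin^2\theta} \right)
 + V(r, \theta, \phi). \tag{4.57}
\]
One can separate the variables if the potential is of the form
\[
V = V_0(r) + \frac{f(\theta)}{r^2} \tag{4.58}
\]
(in full generality, one can add a term of the form $g(\phi)/(r^2 \sin^2\theta)$). The
Hamilton-Jacobi equation is
\[
\frac{1}{2m} \left( \frac{\partial S_0}{\partial r} \right)^2 + V_0(r)
 + \frac{1}{2mr^2} \left[ \left( \frac{\partial S_0}{\partial \theta} \right)^2 + 2m f(\theta) \right]
 + \frac{1}{2mr^2 \sin^2\theta} \left( \frac{\partial S_0}{\partial \phi} \right)^2 = E, \tag{4.59}
\]
where E is the constant value of the energy.
The $\phi$ variable is cyclic. We denote by $\ell = L_z$ the constant value of $p_\phi$. In
other words,
\[
\left( \frac{\partial S_0}{\partial \phi} \right)^2 = \ell^2. \tag{4.60}
\]
Inserting this into (4.59), we reduce the problem to
\[
\frac{1}{2m} \left( \frac{\partial S_0}{\partial r} \right)^2 + V_0(r)
 + \frac{1}{2mr^2} \left[ \left( \frac{\partial S_0}{\partial \theta} \right)^2 + 2m f(\theta)
 + \frac{\ell^2}{\sin^2\theta} \right] = E. \tag{4.61}
\]
Multiplying by $2mr^2$, we notice that this equation can be separated into
two terms. One involves the variable $\theta$, and the other the variable r.
We therefore seek a solution of the form
\[
S_0 = \ell \phi + S_1(\theta) + S_2(r). \tag{4.62}
\]
We obtain
\[
\left( \frac{dS_1}{d\theta} \right)^2 + 2m f(\theta) + \frac{\ell^2}{\sin^2\theta} = a, \tag{4.63}
\]
\[
\frac{1}{2m} \left( \frac{dS_2}{dr} \right)^2 + V_0(r) + \frac{a}{2mr^2} = E, \tag{4.64}
\]
where a is, like E and $\ell$, a constant of the motion, determined by the initial
conditions.
Integrating these equations yields
\[
S = -Et + \ell \phi + \int \sqrt{2m(E - V_0(r)) - \frac{a}{r^2}} \; dr
 + \int \sqrt{a - 2m f(\theta) - \frac{\ell^2}{\sin^2\theta}} \; d\theta. \tag{4.65}
\]
Here, $(E, \ell, a)$ are arbitrary integration constants.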
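In order to obtain the equations of motion, we use the Jacobi theorem,
which we now explain for the simplest case of a single variable q:
Theorem 4. Let a be an integration constant, and suppose we know the action
S(q, a, t). Then $\beta = \partial S/\partial a$ is a constant of the motion.
Proof: We have by definition
\[
\beta = \frac{\partial S}{\partial a}, \quad \text{therefore} \quad
\frac{d\beta}{dt} = \frac{\partial^2 S}{\partial t \, \partial a} + \dot{q} \, \frac{\partial^2 S}{\partial q \, \partial a}. \tag{4.66}
\]
Since $\dot{q}$ is, by definition, the derivative of q along the physical trajectory,
we have
\[
\dot{q} = \frac{\partial H}{\partial p} \quad \text{and} \quad
\frac{d\beta}{dt} = \frac{\partial}{\partial a} \frac{\partial S}{\partial t}
 + \frac{\partial H}{\partial p} \frac{\partial^2 S}{\partial q \, \partial a}.
\]
On the other hand, we have
\[
\frac{\partial}{\partial a} H\left( q, \frac{\partial S(q, a, t)}{\partial q} \right)
 = \frac{\partial H}{\partial p} \frac{\partial^2 S}{\partial a \, \partial q}. \tag{4.67}
\]
Inserting this into (4.66), and taking into account the Hamilton-Jacobi
equation (4.56), we obtain
\[
\frac{d\beta}{dt} = \frac{\partial}{\partial a} \left( \frac{\partial S}{\partial t}
 + H\left( q, \frac{\partial S(q, a, t)}{\partial q} \right) \right) = 0. \qquad \text{QED} \tag{4.68}
\]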
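The simplest illustration of the Jacobi theorem is the free particle, $H = p^2/(2m)$. The worked example below is a sketch in which the integration constant a is chosen to be the (constant) momentum; this choice is an assumption made for the illustration.

```latex
\begin{align*}
S(q, a, t) &= a\,q - \frac{a^2}{2m}\,t
 \quad\Longrightarrow\quad
 \frac{\partial S}{\partial t} + \frac{1}{2m}\left(\frac{\partial S}{\partial q}\right)^2
 = -\frac{a^2}{2m} + \frac{a^2}{2m} = 0, \\
\beta &= \frac{\partial S}{\partial a} = q - \frac{a}{m}\,t = \text{constant}
 \quad\Longrightarrow\quad
 q(t) = \beta + \frac{a}{m}\,t .
\end{align*}
```

The first line shows that S solves the Hamilton-Jacobi equation (4.56); the second shows that setting $\partial S/\partial a$ equal to a constant directly yields the uniform motion, with $\beta$ playing the role of the initial position.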
Going back to the result (4.65), we consider the three constants of the
motion $(E, \ell, a)$. From the expression (4.65) of the action, we define the three
constants $(\beta_E, \beta_\ell, \beta_a)$ by
\[
\beta_E = \frac{\partial S}{\partial E}, \qquad \beta_\ell = \frac{\partial S}{\partial \ell}, \qquad
\beta_a = \frac{\partial S}{\partial a}.
\]
The values of these constants are fixed by the initial conditions of the problem.
We therefore obtain the trajectory and the equation of motion from the three
equations by taking the derivatives of (4.65) with respect to E, $\ell$, and a.
4.5.3 Conservative Systems, the Reduced Action, and the Maupertuis Principle
Reduced action
Suppose the Hamiltonian H does not depend explicitly on time. Then, the
energy is conserved. Let E be its value in the problem under consideration.
Then equation (4.49) yields
\[
\frac{\partial S}{\partial t} = -E, \tag{4.69}
\]
i.e.,
\[
S = -Et + S_0(x_1, \ldots, x_N). \tag{4.70}
\]
The quantity $S_0$ is called the reduced action. It satisfies the equation
\[
H\left( x_1, \ldots, x_N, \frac{\partial S_0}{\partial x_1}, \ldots, \frac{\partial S_0}{\partial x_N} \right) = E. \tag{4.71}
\]
More generally, if we refer to (4.51), one defines the reduced action $S_0$ by
\[
S_0 = \int \sum_{i=1}^{N} p_i \, dx_i. \tag{4.72}
\]
For a conservative system, we see that the variational principle concerns this
quantity: $\delta S_0 = 0$.
Geometric Interpretation
The relation (4.46) can also be written in terms of the reduced action
\[
\frac{\partial S_0}{\partial x_i} = p_i. \tag{4.73}
\]
This formulation reveals a simple geometric property that will be useful
in order to see the analogy with optics. Let us use Cartesian coordinates for
clarity and consider the simple case where the momenta coincide with the
linear momenta $p_i = m \dot{x}_i$. In coordinate space, $(x_1, x_2, \ldots, x_N)$, consider the
surfaces on which the reduced action is constant, $S_0 = \text{constant}$. The relation
(4.73) implies that the vector $\mathbf{p} \equiv (p_1, p_2, \ldots, p_N)$ is everywhere orthogonal
to these surfaces.
Considering the simple case of a particle in three-dimensional space, we see
that the trajectory is, at each point, orthogonal to the surface $S_0 = \text{constant}$.
At a given time, this property is also true for the action S. If we denote by
$d\mathbf{r}$ an elementary vector tangent to the surface $S_0 = \text{constant}$ at point $\mathbf{r}$, we
have by definition $\nabla S_0 \cdot d\mathbf{r} = 0$ or
\[
\nabla S_0 \cdot d\mathbf{r} = \mathbf{p} \cdot d\mathbf{r} = 0. \tag{4.74}
\]
In other words, the flow of the trajectories is orthogonal to the surfaces
$S_0 = \text{constant}$.
Maupertuis Principle
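For a particle of mass m in a potential $V(\mathbf{r})$, equation (4.71) is
\[
\frac{1}{2m} (\nabla S_0)^2 + V(\mathbf{r}) = E, \quad \text{or also} \quad
(\nabla S_0)^2 = 2m(E - V(\mathbf{r})). \tag{4.75}
\]
In this problem, the momentum is simply $\mathbf{p} = m \dot{\mathbf{r}}$. The reduced action (4.72)
is therefore
\[
S_0 = \int \mathbf{p} \cdot d\mathbf{r} = m \int \dot{\mathbf{r}} \cdot d\mathbf{r}. \tag{4.76}
\]
In this expression, we want to express everything in terms of the position
variable. Calling $\ell$ the curvilinear abscissa along the trajectory $\mathbf{r}(t)$, we have
obviously
\[
\dot{\mathbf{r}} \cdot d\mathbf{r} = \dot{\ell} \, d\ell. \tag{4.77}
\]
Since the kinetic energy is $T = m \dot{\mathbf{r}}^2/2 = m \dot{\ell}^2/2$, and since $T = E - V$, we
obtain
\[
\dot{\ell} = \sqrt{\frac{2(E - V)}{m}}, \tag{4.78}
\]
and, inserting this into (4.77) and (4.76),
\[
S_0 = \int \sqrt{2m(E - V)} \; d\ell. \tag{4.79}
\]
Hence we have the simple form of the Maupertuis principle given in Section
2.2.1
\[
\delta \int \sqrt{2m(E - V)} \; d\ell = 0. \tag{4.80}
\]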
4.6 Analytical Mechanics and Optics
4.6.1 Geometric Limit of Wave Optics
The previous consideration will allow us to show how geometrical optics ap-
pears as the limit of wave optics in the small-wavelength limit.
Scalar wave
Consider the equation of propagation of a scalar wave $\Phi$ in a medium of
variable index of refraction $n(\mathbf{r})$. We assume that the medium is inhomogeneous
but isotropic: the index of refraction n depends on the point under
consideration but not on the direction of propagation.
The general case of the propagation of electromagnetic waves in a
nonconducting medium of electric and magnetic susceptibilities $\varepsilon$ and $\mu$, and
taking into account possible discontinuities between two media and
polarization, is treated at length in Chapter III and Appendix I of the book Principles
of Optics by Max Born and Emil Wolf [12]. For our purpose, it suffices to
consider a nonmagnetic isotropic medium.
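The propagation equation of a scalar wave $\Phi(\mathbf{r}, t)$ is
$$\nabla^2 \Phi - \frac{n^2}{c^2}\frac{\partial^2 \Phi}{\partial t^2} = 0. \qquad (4.81)$$
We study the behavior of a periodic wave of frequency $\omega$, of the form $\Phi(\mathbf{r}, t) = \varphi(\mathbf{r})\,e^{-i\omega t}$. Inserting this form into the previous equation, we end up with
$$\nabla^2 \varphi + \frac{n^2\omega^2}{c^2}\,\varphi = 0. \qquad (4.82)$$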
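We seek a solution of this equation of the form
$$\varphi(\mathbf{r}) = \varphi_0(\mathbf{r})\, e^{i k_0 S(\mathbf{r})}, \qquad (4.83)$$
where
$$k_0 = \frac{\omega}{c} = \frac{2\pi}{\lambda} \qquad (4.84)$$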
is the modulus of the wave vector at point $\mathbf{r}$. The quantity $S$ in (4.83) is called
the eikonal (from the Greek word εἰκών, image or picture). Inserting (4.83)
into (4.82), we obtain, after simplifying by $e^{i k_0 S(\mathbf{r})}$ and dividing by $k_0^2$,
$$\varphi_0 (\nabla S)^2 - \frac{i}{k_0}\left(2\nabla\varphi_0\cdot\nabla S + \varphi_0 \nabla^2 S\right) - \frac{1}{k_0^2}\nabla^2\varphi_0 = n^2\varphi_0. \qquad (4.85)$$
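The expansion behind this step can be sketched as follows, under the assumption (implied by the surrounding derivation rather than stated here) that (4.82) is the Helmholtz-type equation $\nabla^2\varphi + n^2 k_0^2 \varphi = 0$ and that $\varphi = \varphi_0 e^{ik_0 S}$:

```latex
\nabla^2\!\left(\varphi_0\, e^{i k_0 S}\right)
  = e^{i k_0 S}\left[\nabla^2\varphi_0
  + i k_0\left(2\nabla\varphi_0\cdot\nabla S + \varphi_0\nabla^2 S\right)
  - k_0^2\,\varphi_0\,(\nabla S)^2\right]
```

Adding $n^2 k_0^2 \varphi_0 e^{ik_0 S}$ and dividing by $-k_0^2 e^{ik_0 S}$ leaves real terms of order $1$ and $1/k_0^2$ and an imaginary term of order $1/k_0$, which is the structure of (4.85).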
In this equation, the imaginary term proportional to $1/k_0$, after multiplication by $\varphi_0$,
can be written as
$$\nabla\cdot\left(\varphi_0^2\, \nabla S\right) = 0. \qquad (4.86)$$
This is a conservation equation; it expresses energy conservation. The wave
propagates in the direction of $\nabla S$, and the energy density is proportional to
$\varphi_0^2$. The complete interpretation in terms of the Poynting vector can be found
in Born and Wolf [12].
Consider now the real part. Suppose that the wavelength is very small,
namely that the index $n$ does not vary appreciably over a distance of one
wavelength and that the size of optical instruments (for instance, diaphragms)
is much larger than $\lambda$ defined in equation (4.84). This assumption can also be
expressed as taking the limit $\lambda \to 0$ and therefore $k_0 \to \infty$. It is called the
eikonal approximation. We therefore neglect the term in $1/k_0^2$, and this leads
to the eikonal equation
$$(\nabla S)^2 = n^2, \qquad (4.87)$$
which is the fundamental equation of geometrical optics.
In this approximation, the wave
$$\Phi(\mathbf{r}, t) = \varphi_0(\mathbf{r})\, e^{i(k_0 S(\mathbf{r}) - \omega t)} \qquad (4.88)$$
propagates with a wave vector that is locally perpendicular to the surfaces
$S(\mathbf{r}) = \text{constant}$. These surfaces are the geometric wave fronts.
Geometrical Optics and Classical Mechanics
Of course, one notices the great similarity between the eikonal equation (4.87)
and the Hamilton-Jacobi equation (4.75) for a massive point-like particle. The
reduced action $S_0$ of the particle and the eikonal $S$ for a light wave obey the
same law if one makes the correspondence
$$n(\mathbf{r}) \Longleftrightarrow \sqrt{2m(E - V(\mathbf{r}))}. \qquad (4.89)$$
We only need to follow backwards the path leading to (4.75), in particular
(4.52), to see that the eikonal approximation corresponds exactly to Fermat's
principle
$$\delta \int n\, d\ell = 0. \qquad (4.90)$$
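A small numerical sketch can make Fermat's principle concrete. It minimizes the optical path $\int n\,d\ell$ between two points on either side of a flat interface and checks that the minimizing ray obeys Snell's law. The geometry, the indices `n1` and `n2`, and the helper names are illustrative assumptions, not taken from the text.

```python
import math

def optical_path(x, n1, n2, a, b, d):
    """Optical length from (0, a) above the interface y = 0 to (d, -b) below it,
    for a ray crossing the interface at abscissa x."""
    return n1 * math.hypot(x, a) + n2 * math.hypot(d - x, b)

def minimize(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
    while hi - lo > tol:
        if f(x1) < f(x2):
            hi, x2 = x2, x1
            x1 = hi - g * (hi - lo)
        else:
            lo, x1 = x1, x2
            x2 = lo + g * (hi - lo)
    return 0.5 * (lo + hi)

# Assumed configuration: source 1 unit above, target 1 unit below, offset d = 1.
n1, n2, a, b, d = 1.0, 1.5, 1.0, 1.0, 1.0
x_star = minimize(lambda x: optical_path(x, n1, n2, a, b, d), 0.0, d)

sin_in = x_star / math.hypot(x_star, a)             # sine of the incidence angle
sin_out = (d - x_star) / math.hypot(d - x_star, b)  # sine of the refraction angle
print(n1 * sin_in, n2 * sin_out)
```

The two printed numbers should coincide to within the accuracy of the search, which is Snell's law $n_1 \sin\theta_1 = n_2 \sin\theta_2$ recovered from the stationarity of the optical path.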
The principles of Fermat and Maupertuis (4.80) have an obvious similarity if

one makes the correspondence (4.89). Hamilton discovered this in 1834. He
had understood in 1830 how and in which limit geometrical optics was an
approximation of wave optics. Hamilton was fascinated by variational prin-
ciples and, in particular, by the similarity between Maupertuis's principle in
mechanics and Fermat's principle in optics. In 1834, he made the very remark-
able comment that the formalisms of optics and mechanics could be unified
and that Newtonian mechanics appeared to correspond to the same limit or
approximation as geometrical optics as compared with wave optics! It is true,
however, that in the 1830s no experimental evidence whatsoever revealed the
existence of the Planck constant.
Finally, we remark that the geometric interpretation (4.74), which in this

case boils down to (4.88), is nothing but the Huygens principle. This prin-
ciple, which was the first wave theory of light, consists of saying that light
propagates as a wave front. At each time t, each point of the wave front can
be considered as a point-like source. At time $t + \delta t$, the new wave front is the
envelope of the spheres of radii $\delta r = (c/n)\,\delta t$ centered on each point of the
previous wave front. This principle is equivalent to the Fermat principle in the
limit of the eikonal approximation. The ideas of Huygens were strongly op-
posed by Newton, for whom the corpuscular conception of light was the only
acceptable one. However, Huygens was the first to obtain, with this principle,
an explanation of the double-refraction phenomenon of anisotropic crystals.
4.6.2 Semiclassical Approximation in Quantum Mechanics
The same type of argument can be used in wave mechanics and the Schrödinger
equation. This is called the semiclassical approximation and is due to Wentzel,
Kramers, and Brillouin (WKB). One can for instance refer to Volume 1, Chap-
ter VI of Albert Messiah's book Quantum Mechanics [13] for any details,
particularly practical applications of the method.
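Consider the Schrödinger equation
$$i\hbar\frac{\partial}{\partial t}\psi(\mathbf{r}, t) = -\frac{\hbar^2}{2m}\Delta\psi(\mathbf{r}, t) + V(\mathbf{r})\,\psi(\mathbf{r}, t). \qquad (4.91)$$
We separate the modulus and the phase of the wave function as
$$\psi(\mathbf{r}, t) = A(\mathbf{r}, t)\exp\left(\frac{i}{\hbar}S(\mathbf{r}, t)\right). \qquad (4.92)$$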
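Substituting in (4.91) and considering separately the real and imaginary parts,
we obtain
$$\frac{\partial S}{\partial t} + \frac{1}{2m}(\nabla S)^2 + V = \frac{\hbar^2}{2m}\frac{\nabla^2 A}{A}, \qquad (4.93)$$
$$m\frac{\partial A}{\partial t} + \nabla A\cdot\nabla S + \frac{1}{2}A\,\nabla^2 S = 0. \qquad (4.94)$$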
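The substitution can be sketched explicitly. With $\psi = A\,e^{iS/\hbar}$ and $A$, $S$ real (the notation of (4.92)), the two derivatives entering the Schrödinger equation are:

```latex
\frac{\partial\psi}{\partial t}
  = \left(\frac{\partial A}{\partial t} + \frac{i}{\hbar}A\frac{\partial S}{\partial t}\right)e^{iS/\hbar},
\qquad
\Delta\psi
  = \left[\nabla^2 A + \frac{i}{\hbar}\left(2\nabla A\cdot\nabla S + A\,\nabla^2 S\right)
  - \frac{1}{\hbar^2}A\,(\nabla S)^2\right]e^{iS/\hbar}
```

Inserting these into (4.91) and dividing by $e^{iS/\hbar}$, the real part divided by $-A$ gives the first equation, and the imaginary part multiplied by $m/\hbar$ gives the second.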
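The second equation expresses the conservation of probability. If we introduce
the probability density $\rho$ and the probability current $\mathbf{J}$ as
$$\rho(\mathbf{r}, t) = \psi^*(\mathbf{r}, t)\,\psi(\mathbf{r}, t), \qquad \mathbf{J}(\mathbf{r}, t) = \frac{\hbar}{2im}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right), \qquad (4.95)$$
the conservation of probability has the local form
$$\frac{\partial}{\partial t}\rho(\mathbf{r}, t) + \nabla\cdot\mathbf{J}(\mathbf{r}, t) = 0. \qquad (4.96)$$
Using (4.92) and multiplying (4.94) by $2A$, this amounts to
$$\frac{\partial A^2}{\partial t} + \nabla\cdot\left(A^2\,\frac{\nabla S}{m}\right) = 0. \qquad (4.97)$$
This is similar to equation (4.86).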

The classical approximation consists in taking the limit $\hbar \to 0$ in equation (4.93),
$$\frac{\partial S}{\partial t} + \frac{1}{2m}(\nabla S)^2 + V = 0, \qquad (4.98)$$
which is nothing but the classical Hamilton-Jacobi equation.

Therefore, in the classical approximation, the wave function can be con-
sidered as describing a fluid made up of classical particles with no mutual
interactions and moving in the potential V. The density and current density
of these particles are at each time equal to the quantum probability density
p and current density J.
4.7 Problems
4.1. Coupled Oscillators

Consider two coupled harmonic oscillators and the Hamiltonian
$$H = \frac{p_1^2}{2m} + \frac{p_2^2}{2m} + \frac{m\omega^2 x_1^2}{2} + \frac{m\omega^2 x_2^2}{2} + \frac{m\Omega^2 (x_1 - x_2)^2}{4}.$$
1. Show that the transformation
$$X = \frac{x_1 + x_2}{\sqrt{2}}, \qquad P = \frac{p_1 + p_2}{\sqrt{2}}, \qquad Y = \frac{x_1 - x_2}{\sqrt{2}},$$
is a canonical transformation and express the Hamiltonian with these new
variables.
2. Find the eigenfrequencies of the system.
3. Write the general form of the motion $(x_1(t), x_2(t))$.
4.2. Three Coupled Oscillators

Consider three coupled oscillators
(4.99)
1. Show that the transformation
$$X_1 = \frac{x_1 - x_2}{\sqrt{2}}, \qquad P_1 = \frac{p_1 - p_2}{\sqrt{2}}, \qquad P_3 = \frac{p_1 + p_2 + p_3}{\sqrt{3}},$$
is canonical.
2. Write the Hamiltonian with these new variables. Deduce the eigenfrequen-
cies and the general form of the motion.
4.3. Forced Oscillations

Consider a one dimensional harmonic oscillator of Hamiltonian
(4.100)
where $x$ and $p$ are Lagrange conjugate variables.
1. We set $x = X/\sqrt{m\omega}$ and $p = P\sqrt{m\omega}$.
Write the expression of the Hamiltonian (4.100) in terms of $X$ and $P$, and
calculate the Poisson bracket $\{X, P\}$.
2. We introduce the functions $a$ and $a^*$, the complex conjugate of $a$, defined
by
$$a = \frac{X + iP}{\sqrt{2}}, \qquad a^* = \frac{X - iP}{\sqrt{2}}.$$
Write the Hamiltonian in terms of $a$, $a^*$, and $\omega$.
3. Calculate the Poisson bracket $\{a, a^*\}$.
4. (a) Write the time evolution equation of $a$ and give its general solution.
(b) Write the energy $E$ of the oscillator in terms of the parameters of this
solution and $\omega$.
5. We assume the energy of the oscillator is zero for $t \le 0$, $E(t \le 0) = 0$.
Between $t = 0$ and $t = T$ one applies to the oscillator a force which derives
from the potential energy $H_{\text{pot}} = b\sqrt{2}\,X \sin(\Omega t)$ ($H_{\text{pot}} = 0$ if $t \le 0$ or
$t > T$), where $b$ is a parameter. Calculate the energy $E'$ of the oscillator
for $t > T$.
6. Discuss the variation of $E'$ as a function of the exciting frequency $\Omega$.
4.4. Closed Chain of Coupled Oscillators

We recall that for $1 \le n \le N$ and $1 \le n' \le N$
$$\frac{1}{N}\sum_{k=1}^{N}\exp\left(\frac{2ik(n - n')\pi}{N}\right) = \delta_{nn'} \quad (\text{Kronecker } \delta).$$
We consider a closed chain of $N$ particles of equal mass $m$ placed on a
plane circle (see Figure 4.4). Each of these particles has a one-dimensional
motion along the direction $(x)$ perpendicular to the plane. We denote by $x_n$,
$n = 1, \ldots, N$, the abscissa of particle $n$ along this axis.
Fig. 4.4. Chain of coupled oscillators.
These particles form a set of harmonic oscillators coupled to their nearest
neighbors. The Hamiltonian is
(4.101)
where $p_n$ is the conjugate momentum to $x_n$ and where we use the cyclic
convention $x_{N+1} \equiv x_1$.
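1. Define the complex variables
$$y_k = \frac{1}{\sqrt{N}}\sum_{n=1}^{N} e^{2ikn\pi/N}\, x_n, \qquad q_k = \frac{1}{\sqrt{N}}\sum_{n=1}^{N} e^{-2ikn\pi/N}\, p_n, \qquad (4.102)$$
whose inverse relations are
$$x_n = \frac{1}{\sqrt{N}}\sum_{k=1}^{N} e^{-2ikn\pi/N}\, y_k, \qquad p_n = \frac{1}{\sqrt{N}}\sum_{k=1}^{N} e^{2ikn\pi/N}\, q_k. \qquad (4.103)$$
a) Show that
b) Show that
$$\sum_{k=1}^{N} y_k y_k^* = \sum_{n=1}^{N} x_n^2 \quad\text{and}\quad \sum_{k=1}^{N} q_k q_k^* = \sum_{n=1}^{N} p_n^2. \qquad (4.104)$$
c) Show that
(4.105)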
2. Equations of motion and their solutions

a) Write the Hamiltonian (4.101) in terms of the variables $\{y_k, y_k^*, q_k, q_k^*\}$.
b) Calculate the Poisson brackets
(4.106)
c) Write the differential equations satisfied by $\{y_k, y_k^*, q_k, q_k^*\}$.
d) Write the general expression of $\{y_k(t)\}$; deduce from it the expression
of $\{x_n(t)\}$.
3. We assume that at time $t = 0$ we have $y_N(0) = 1$, $\dot{y}_N(0) = 0$, and
$\{y_n(0) = 0,\ \dot{y}_n(0) = 0,\ \forall n \neq N\}$. Calculate $\{x_n(t)\}$ and interpret the
result.
4. Propagation of waves
We now assume, for simplicity, that $\omega = 0$. We also assume that $N \gg 1$,
so that $\sin(k\pi/N) \simeq k\pi/N$ for $k \ll N$. We assume that for $t = 0$ we
have $y_{N-1} = 1$, $y_1 = 1$, $y_n = 0$ if $n \neq 1$ and $n \neq N - 1$, and $\dot{y}_n = 0\ \forall n$.
a) Calculate $x_n(t)$ and $x_{N-n}(t)$.
b) Interpret the result physically.
c) We assume that the distance between two neighboring oscillators is a.
Setting $x_n(t)$ as the value of a function $f(t, y)$ for $y = na$, write the
wave equation (second-order partial differential equation) satisfied by
the function f.
4.5. Virial Theorem.

We consider, in three dimensions, a particle of mass m placed in a potential
V(r), whose Hamiltonian is H = p2/2m + V(r). We assume that the particle
is in a bound state with a given energy E.
1. Consider the physical quantity $A = \mathbf{r}\cdot\mathbf{p} \equiv x p_x + y p_y + z p_z$. Calculate
the Poisson bracket $\{A, H\}$. Deduce from that the time evolution of $A$ in
terms of the variables $\mathbf{r}$ and $\mathbf{p}$.
2. We assume that the motion of the particle is periodic and of period $T$.
Let $f(\mathbf{r}, \mathbf{p})$ be a physical quantity. We define its mean value $\langle f \rangle$ by
$$\langle f \rangle = \frac{1}{T}\int_0^T f(t)\, dt. \qquad (4.107)$$
Considering the mean value of $\dot{A} \equiv dA/dt$, show that we have
(4.108)
3. What does this equality become if the potential $V$ is a central power law
function $V = g\,r^n$ with $r = |\mathbf{r}|$?
4. In the case above, what is the relation between the total energy $E$, the
mean kinetic energy $\langle E_k \rangle$, and the mean potential energy $\langle V \rangle$ for
a) a harmonic oscillator $n = 2$, and
b) a Newtonian (or Coulomb) potential $n = -1$?
5. In general, for an arbitrary potential, the orbits of bound states are not
closed curves, but they nevertheless remain confined in space. At all times,
$|\mathbf{r}| \le r_0$ and $|\mathbf{p}| \le p_0$, where $r_0$ and $p_0$ are fixed. Give a generalization of
the definition (4.107) such that the result (4.108) remains true.
4.6. Calculate the Poisson brackets of the three components of the angular
momentum L = r x p.
4.7. We consider Kepler's problem $H = p^2/2m - e^2/r$. Calculate the Poisson
brackets of the components of the Lenz vector
$$\mathbf{A} = \frac{\mathbf{p}\times\mathbf{L}}{m} - e^2\,\frac{\mathbf{r}}{r}$$
between each other, with the components of the angular momentum, and with
the Hamiltonian. What can one conclude?
4.8. Verify with the Hamiltonian (4.39) that Hamilton's equations give the
expected equation of motion.
5
Lagrangian Field Theory
Dreams pass into the reality of action.

From the reality of action stems the dream again.
Anaïs Nin
The Lagrangian formalism acquires its real power when one deals with systems
that possess a large, possibly infinite, number of degrees of freedom. That is
the case in mechanics of continuous media. We will now examine how this
formalism deals with field theory.
In itself, field theory is a vast domain that acquires its completeness when
one considers the quantization of fields and the theory of fundamental inter-
actions. In the present chapter, which is deliberately rather short, we want
to explain the principles of Lagrangian field theory and its application to the
electromagnetic field. The classical theory of gravitation is beyond the scope of
this book. It is thoroughly treated in the literature, and we refer the interested
reader to Landau and Lifshitz [1], for instance.
In Section 5.1, we will study the principle of the Lagrangian formulation
of field theory, starting with the case of a vibrating string. Actually, the pro-
cedure is rather simple. One starts by considering a discrete problem with
finite elements of the string. One then takes the continuum limit such that
a Lagrangian space density appears. It is in this limiting procedure that one
appreciates how well the Lagrangian formalism is adapted to this type of
problem.
The extension to three space dimensions, as well as several degrees of
freedom, is dealt with in Section 5.2. One can easily guess the extension of
the method to four dimensional space-time and relativistic fields. In Section
5.3, we will consider a scalar field, and in Section 5.4 the electromagnetic
field and the Maxwell equations. In Section 5.5, we shall say a few words
about field equations that are of first order in time. The first example is the
Fourier diffusion equation, which corresponds to a nonreversible problem; i.e.,
a dissipative problem. This example is interesting because of the similarity
between the Fourier equation and the Schrödinger equation. We shall see that
a Lagrangian approach can be constructed for the latter but that essentially
it leads nowhere in nonrelativistic quantum mechanics.
5.1 Vibrating String

The vibrating string is the prototype of a system with an infinite number of
degrees of freedom.
Consider an elastic string of length l, fixed horizontally between the end-
points x = 0 and x = l (we do not take gravity into account). Its linear mass
density p is assumed to be uniform.
We only consider deformations of the string in the transverse plane (transverse
waves). We denote by $\psi(x, t)$ the transverse (vertical) displacement at
point x with respect to its position at rest. For simplicity, we assume that this
displacement occurs in a single direction (the vertical axis).
One can consider the string to be the set of a large number of elements
of length dx, each of which obeys the usual laws of dynamics. In the limiting
procedure, this will result in an infinite number of degrees of freedom.
Consider an element of length $dx$. Its kinetic energy is
$$dE_k = \frac{1}{2}(\rho\, dx)\left(\frac{\partial\psi}{\partial t}\right)^2. \qquad (5.1)$$
Let $\tau$ be the elasticity constant of the string. If the displacement of two
successive elements located at $x$ and $x + dx$ varies compared with its value at
rest, the corresponding potential energy $V$ varies by
$$dV = \frac{\tau}{2}\left(\frac{\partial\psi}{\partial x}\right)^2 dx,$$
where obviously $(\partial\psi/\partial x)^2 \ll 1$. The variation $V$ of the potential energy of
the string when it is deformed is therefore
$$V = \frac{1}{2}\tau\int_0^l \left(\frac{\partial\psi}{\partial x}\right)^2 dx. \qquad (5.2)$$
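The Lagrangian of the string is the sum of the elementary Lagrangians:
$$\mathcal{L} = E_k - V = \int_0^l \frac{1}{2}\left[\rho\left(\frac{\partial\psi}{\partial t}\right)^2 - \tau\left(\frac{\partial\psi}{\partial x}\right)^2\right] dx. \qquad (5.3)$$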
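If we consider the string as an assembly of material elements of length $dx$,
each of these has an elementary Lagrangian
$$d\mathcal{L} = L\left(\psi, \frac{\partial\psi}{\partial t}, \frac{\partial\psi}{\partial x}\right) dx = \frac{1}{2}\left[\rho\left(\frac{\partial\psi}{\partial t}\right)^2 - \tau\left(\frac{\partial\psi}{\partial x}\right)^2\right] dx. \qquad (5.4)$$
The quantity $L$ that appears in this expression is called the Lagrangian
density of the string. In fact, the action of the string is
$$S = \int L\, dx\, dt = \int \frac{1}{2}\left[\rho\left(\frac{\partial\psi}{\partial t}\right)^2 - \tau\left(\frac{\partial\psi}{\partial x}\right)^2\right] dx\, dt. \qquad (5.5)$$
(The integral over $x$ runs over the interval $[0, l]$.)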
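We see that this is now a two-dimensional problem $(x, t)$ for the dynamical
quantity $\psi(x, t)$. We must minimize the integral (5.5). The corresponding
Lagrange-Euler equation is
$$\frac{\partial L}{\partial\psi} = \frac{\partial}{\partial t}\left(\frac{\partial L}{\partial(\partial\psi/\partial t)}\right) + \frac{\partial}{\partial x}\left(\frac{\partial L}{\partial(\partial\psi/\partial x)}\right). \qquad (5.6)$$
In the case under consideration, $\partial L/\partial\psi = 0$ so that if we define the propagation
velocity $c$ by
$$c^2 = \frac{\tau}{\rho}, \qquad (5.7)$$
we obtain the propagation equation of vibrations along the string
$$\frac{\partial^2\psi}{\partial t^2} - c^2\frac{\partial^2\psi}{\partial x^2} = 0. \qquad (5.8)$$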
We therefore see how the problem of a wave propagation can be deduced

from a variational principle. Here, the difference between the total kinetic
energy of the string and its potential energy must be as small as possible.
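The discrete picture used above (a chain of elements of length $dx$, each obeying ordinary dynamics) can be tried out numerically. The sketch below integrates the discretized equation of motion for the fundamental mode of a string with fixed ends and checks that the shape returns to itself after one period $2l/c$. All numerical values, and the choice of a velocity-Verlet integrator, are illustrative assumptions rather than anything prescribed by the text.

```python
import math

# Assumed parameters (illustrative only): unit length and unit wave speed.
length, c = 1.0, 1.0
n_seg = 50                       # number of string elements
dx = length / n_seg
dt = 0.25 * dx / c               # time step well inside the stability limit
period = 2.0 * length / c        # period of the fundamental mode
n_steps = round(period / dt)

# Fundamental mode as initial shape, released from rest; the end points stay fixed.
psi0 = [math.sin(math.pi * i * dx / length) for i in range(n_seg + 1)]
psi = psi0[:]
vel = [0.0] * (n_seg + 1)

def accel(u):
    """Discretized equation of motion: c^2 times the second difference in x."""
    a = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        a[i] = c * c * (u[i + 1] - 2.0 * u[i] + u[i - 1]) / dx**2
    return a

acc = accel(psi)
for _ in range(n_steps):         # velocity-Verlet time stepping
    vel = [v + 0.5 * dt * a for v, a in zip(vel, acc)]
    psi = [u + dt * v for u, v in zip(psi, vel)]
    acc = accel(psi)
    vel = [v + 0.5 * dt * a for v, a in zip(vel, acc)]

# Largest deviation from the initial shape after one period.
max_dev = max(abs(u - u0) for u, u0 in zip(psi, psi0))
print(max_dev)
```

A small value of `max_dev` indicates that the discrete chain already reproduces the period expected from the continuum wave equation, with a residual that shrinks as `n_seg` grows.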
5.2 Field Equations

5.2.1 Generalized Lagrange-Euler Equations
The previous case is slightly more complex than the equations we saw in (2.8)
and (2.9). Indeed, for a field, the dynamical variable $\psi$ depends on several
variables. In the example (5.8), the field $\psi$ depends on two variables, $t$ and $x$.
More generally, consider $n$ dynamical variables $\psi_k$, $k = 1, \ldots, n$, that
depend on $m$ variables $x_s$, $s = 1, \ldots, m$ (including time); i.e., $\psi_k(x_s)$, $s = 1, \ldots, m$.
We define
(5.9)
and we denote by $[\psi_k]$ the set of partial derivatives of $\psi_k(x_1, \ldots, x_m)$. The
Lagrangian density is of the form
and the action is

It is a bit tedious but not difficult to convince oneself that the determination of
the extremum of the action $S$ under the set of all infinitesimal transformations
$\psi_k \to \psi_k + \delta\psi_k$, $k = 1, \ldots, n$, which vanish on the edge of the integration
volume once one has performed all integrations by parts, leads to the generalized
Lagrange-Euler equations
(5.10)
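The argument can be sketched in a few lines. The shorthand $\partial_s \equiv \partial/\partial x_s$ and the explicit form of the variation below are our notation for this sketch, not necessarily the author's:

```latex
\delta S = \int d^m x \sum_{k=1}^{n}\left[\frac{\partial L}{\partial\psi_k}\,\delta\psi_k
  + \sum_{s=1}^{m}\frac{\partial L}{\partial(\partial_s\psi_k)}\,\partial_s(\delta\psi_k)\right]
  = \int d^m x \sum_{k=1}^{n}\left[\frac{\partial L}{\partial\psi_k}
  - \sum_{s=1}^{m}\partial_s\!\left(\frac{\partial L}{\partial(\partial_s\psi_k)}\right)\right]\delta\psi_k
```

The second equality uses one integration by parts per variable $x_s$, the boundary terms vanishing because $\delta\psi_k = 0$ on the edge of the integration volume. Requiring $\delta S = 0$ for arbitrary $\delta\psi_k$ then gives one equation per field, $\partial L/\partial\psi_k = \sum_s \partial_s\left(\partial L/\partial(\partial_s\psi_k)\right)$, which reduces to (5.6) for a single field depending on $t$ and $x$.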
In relativistic field theory, it is natural to incorporate time $t$ among the
variables $(x, y, z, t)$ on which the fields $\psi_k$ depend. In many problems, for
instance in statistical field theory, it is useful to maintain the special role of
the time variable. If we define
we obtain
$$\frac{\partial}{\partial t}\left(\frac{\partial L}{\partial\dot{\psi}_k}\right) = \frac{\partial L}{\partial\psi_k} - \sum_{s=1}^{m-1}\frac{\partial}{\partial x_s}\left(\frac{\partial L}{\partial\psi'_k}\right), \qquad (5.11)$$
of which (5.6) is a particular case.
5.2.2 Hamiltonian Formalism
Consider again the vibrating string, adding for more generality a linear term
in $\psi$ (which can come from an external force $F(x)$ that we apply at each
point). For simplicity, we define
$$\dot{\psi} = \frac{\partial\psi}{\partial t} \quad\text{and}\quad \psi' = \frac{\partial\psi}{\partial x}, \qquad (5.12)$$
and we consider the Lagrangian density
(5.13)
which leads to the equation of motion
(5.14)
where G = F/ p.
Since we are interested in the time evolution of the system, we define the
density of conjugate momentum p by
i.e., here (5.15)
For a vibrating string, this is the linear density of momentum.

The Hamiltonian density is
(5.16)
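This density depends on $\psi$ and $p$, but also on $\psi'$, and the form of the canonical
equations must be modified. Inserting (5.16) (i.e., $L = p\dot{\psi} - H$) in the least
action principle, and integrating by parts in the two variables $x$ and $t$, we
obtain
$$0 = \delta\iint dt\, dx\,\left(p\dot{\psi} - H(p, \psi, \psi')\right)$$
$$= \iint dt\, dx\,\left[\dot{\psi}\,\delta p + p\,\delta\dot{\psi} - \frac{\partial H}{\partial p}\delta p - \frac{\partial H}{\partial\psi}\delta\psi - \frac{\partial H}{\partial\psi'}\delta\psi'\right]$$
$$= \iint dt\, dx\,\left[\left(\dot{\psi} - \frac{\partial H}{\partial p}\right)\delta p - \left(\dot{p} + \frac{\partial H}{\partial\psi} - \frac{\partial}{\partial x}\frac{\partial H}{\partial\psi'}\right)\delta\psi\right]. \qquad (5.17)$$
Therefore, Hamilton's equations are
$$\frac{\partial\psi}{\partial t} = \frac{\partial H}{\partial p}; \qquad \frac{\partial p}{\partial t} = \frac{\partial}{\partial x}\left(\frac{\partial H}{\partial\psi'}\right) - \frac{\partial H}{\partial\psi}. \qquad (5.18)$$
One can check that they yield the propagation equation (5.14).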
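As an illustration of this check, suppose the Hamiltonian density of the forced string has the form written below. This explicit form, built from a linear mass density $\rho$, an elasticity constant $\tau$, and an external force $F(x)$, is an assumption made for this sketch; the text's own expressions (5.13) and (5.16) may be arranged differently.

```latex
H = \frac{p^2}{2\rho} + \frac{\tau}{2}\,\psi'^2 - F\,\psi
\quad\Longrightarrow\quad
\frac{\partial\psi}{\partial t} = \frac{\partial H}{\partial p} = \frac{p}{\rho},
\qquad
\frac{\partial p}{\partial t} = \frac{\partial}{\partial x}\left(\tau\,\psi'\right) + F
```

Eliminating $p$ gives $\ddot{\psi} - c^2\psi'' = F/\rho$ with $c^2 = \tau/\rho$, i.e., a wave equation with a source term $G = F/\rho$, as expected for the forced string.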
5.3 Scalar Field

The previous results allow us to understand the form of the Lagrangian of
a scalar field in three-dimensional space, for instance sound waves in a compressible
nonviscous fluid. Calling $\psi(\mathbf{r}, t)$ the compression of the fluid, and $c$
the sound velocity in the fluid, the Lagrangian density has the form
(5.19)
Notice that, compared with the vibrating string, space and time derivatives are
interchanged. The kinetic term (local velocity) comes from a vector quantity,
whereas the potential (the pressure) is a scalar.
With the Lagrangian density (5.19), one obtains the propagation equation
(5.20)
5.4 Electromagnetic Field
The case of the electromagnetic field is more complex and deeper. In fact, one
must take into account the fact that it involves two vector fields, and above
all, we must take care of relativistic invariance, which is the fundamental
property of Maxwell's equations. This problem is treated thoroughly in the
book by Landau and Lifshitz [1], for instance. Here we want to point out the
major features.
Physically, the electromagnetic field cannot be separated from its sources,
the charges, on which it furthermore acts. For a system of charged particles
in an electromagnetic field, the action is written in full generality as
$$S = S_{\text{field}} + S_{\text{part}} + S_{\text{int}}, \qquad (5.21)$$
where $S_{\text{field}}$ is the action of free fields, $S_{\text{part}}$ is the action of the free particles in
the absence of fields, and $S_{\text{int}}$ corresponds to the interaction of these particles
and the field, which we know already from Section 3.3.2. We recall that in an
electromagnetic field derived from the potentials $\mathbf{A}$ and $\Phi$, the interaction Lagrangian of
a particle of charge $q$ and mass $m$ is expressed in terms of the potentials $\mathbf{A}$
and $\Phi$ as
$$L_{\text{int}} = q\,\dot{\mathbf{r}}\cdot\mathbf{A}(\mathbf{r}, t) - q\,\Phi(\mathbf{r}, t). \qquad (5.22)$$
This form transforms as wanted in a Lorentz transformation. If we introduce
the current four-vector
$$\{j^\mu\} = (c\rho, \mathbf{j}), \qquad (5.23)$$
where $\rho$ and $\mathbf{j}$ are respectively the charge density and current, and the potential
four-vector
$$\{A^\mu\} = (\Phi/c, \mathbf{A}), \qquad (5.24)$$
the Lagrangian density that corresponds to (5.22) is
(5.25)
which is manifestly invariant. (We keep the same symbol L for the Lagrangian
density; the integration runs along space and time.) The action is invariant
since d 3 r dt is a relativistic invariant.
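The fields are expressed in terms of the potentials $\Phi$ and $\mathbf{A}$ by
$$\mathbf{E} = -\nabla\Phi - \frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A}. \qquad (5.26)$$
Using the notation $\partial^\mu = \partial/\partial x_\mu$, one expresses the electromagnetic tensor field
as
$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu; \qquad (5.27)$$
i.e., the antisymmetric tensor
The couple of homogeneous Maxwell equations follows from the structure
of the tensor $F^{\mu\nu}$, and the four equations (or identities)
(5.28)
which lead to
$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\cdot\mathbf{B} = 0. \qquad (5.29)$$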
The inhomogeneous Maxwell equations relate the fields to the charge
densities and currents. Suppose there is a given charge density and current
$\{j^\mu\} = (c\rho, \mathbf{j})$. Then the Lagrangian density for the electromagnetic field in
the presence of these sources is
(5.30)
If we return to an expression that is not manifestly covariant,
(5.31 )
the action S is defined as the integral over all space and time,
(5.32)
We have
(5.33)
One can check that the equations of motion of the electromagnetic field are,
in a covariant form,
(5.34)
where we have restored the coefficient $\varepsilon_0$, which we previously took equal to
one for convenience. This boils down to
$$\nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}, \qquad c^2\,\nabla\times\mathbf{B} = \frac{\mathbf{j}}{\varepsilon_0} + \frac{\partial\mathbf{E}}{\partial t}. \qquad (5.35)$$
We see from (5.31) that the physical electromagnetic field in the vacuum,
away from charges, minimizes the difference $(\mathbf{E}^2 - c^2\mathbf{B}^2)$ given the constraints
imposed by the presence of the sources. This was implicit in the example of
the simple electrostatic field in (2.2.4).
5.5 Equations of First Order in Time
In order to deal with equations of first order in time, such as the Fourier
diffusion equation or the Schrödinger equation, we use the technique described
in Section 3.3.1 for dissipative systems.
5.5.1 Diffusion Equation
Diffusion, be it of heat or of a substance in a medium, is nonreversible. In

that sense, it can be thought of as a dissipative system. A quantity of heat
placed at some point in a material diffuses in the material and tends to make
its distribution as uniform as possible. It never "reconcentrates" at its initial
position.
The technique developed in Section 3.3.1 allows us to formulate this in a
Lagrangian form.
Let $\psi(\mathbf{r}, t)$ be the density of heat (or of a diffusing substance) and $a^2$ the
diffusion constant. We introduce a fictitious mirror system whose density $\psi^*$
"concentrates" instead of diffusing.
Consider the Lagrangian density
$$L = -\nabla\psi\cdot\nabla\psi^* - \frac{a^2}{2}\left(\psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t}\right); \qquad (5.36)$$
i.e., the action is
$$S(t_0, t_1) = \int_{t_0}^{t_1} dt \int L\, d^3r.$$
The Lagrange-Euler equations give
(5.37)
The equation satisfied by $\psi$ is the usual diffusion equation. That written for
$\psi^*$ would represent a diffusion reversed in time, or a "concentration".
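The computation can be sketched by treating $\psi$ and $\psi^*$ as independent fields and taking (5.36) to read $L = -\nabla\psi\cdot\nabla\psi^* - \frac{a^2}{2}\left(\psi^*\partial_t\psi - \psi\,\partial_t\psi^*\right)$. The shorthand $\partial_t \equiv \partial/\partial t$ is ours. Varying $\psi^*$ and then $\psi$:

```latex
\frac{\partial L}{\partial\psi^*} = -\frac{a^2}{2}\,\partial_t\psi,\quad
\frac{\partial L}{\partial(\partial_t\psi^*)} = \frac{a^2}{2}\,\psi,\quad
\frac{\partial L}{\partial(\nabla\psi^*)} = -\nabla\psi
\;\Longrightarrow\; \nabla^2\psi = a^2\,\frac{\partial\psi}{\partial t},
\qquad\text{and likewise}\qquad
\nabla^2\psi^* = -a^2\,\frac{\partial\psi^*}{\partial t}
```

The first equation is the diffusion equation; the second differs only by the sign of the time derivative, which is what "reversed in time" means here.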
It is necessary to use similar techniques in order to write in a Lagrangian
form the flow of a viscous fluid (see [10], Chapter 3, Section 3).
5.5.2 Schrödinger Equation
The Schrödinger equation is not a dissipative system since there is conservation
of the norm and wave propagation. Nevertheless, the formal similarity
between its structure and that of the Fourier equation¹ allows us to write a
Lagrangian formulation similar to what we developed above.
¹ One says that the Schrödinger equation is a Fourier equation with an imaginary
time.
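We consider the simple case of a particle of mass $m$ placed in a potential
$V$. Here, the wave function $\psi$ is complex. Therefore, one can simply use its
complex conjugate $\psi^*$ as the "mirror" dynamical variable. This amounts to
considering the real and imaginary parts of the wave function as independent
dynamical variables. In direct analogy with (5.36), the Lagrangian density is
$$L = -\frac{\hbar^2}{2m}\nabla\psi\cdot\nabla\psi^* - \frac{\hbar}{2i}\left(\psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t}\right) - \psi^* V\psi. \qquad (5.38)$$
The Lagrange-Euler equations give
$$-\frac{\hbar^2}{2m}\Delta\psi + \frac{\hbar}{i}\frac{\partial\psi}{\partial t} = -V\psi, \qquad -\frac{\hbar^2}{2m}\Delta\psi^* - \frac{\hbar}{i}\frac{\partial\psi^*}{\partial t} = -V\psi^*, \qquad (5.39)$$
as one can easily check.
The densities of conjugate momenta are
$$p = -\frac{\hbar\,\psi^*}{2i}, \qquad p^* = \frac{\hbar\,\psi}{2i}, \qquad (5.40)$$
and the Hamiltonian density is
$$H = \frac{\hbar^2}{2m}\nabla\psi\cdot\nabla\psi^* + \psi^* V\psi. \qquad (5.41)$$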
This form is appealing since its integral over space is simply the expectation
value of the quantum energy
(5.42)
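A quick numerical check of this statement can be made in one dimension. The sketch below integrates the Hamiltonian density (5.41) for the ground state of a harmonic oscillator and compares the result with $\hbar\omega/2$. The choice of system, the units $\hbar = m = \omega = 1$, and the function names are assumptions made for this example only.

```python
import math

# Units with hbar = m = omega = 1 (an assumption made for this check only).
hbar = m = omega = 1.0

def psi(x):
    """Normalized ground state of the 1D harmonic oscillator (real-valued)."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)

def dpsi(x):
    """Derivative of psi with respect to x."""
    return -x * psi(x)

def h_density(x):
    """One-dimensional analogue of (5.41) for a real wave function."""
    V = 0.5 * m * omega**2 * x * x
    return hbar**2 / (2.0 * m) * dpsi(x) ** 2 + V * psi(x) ** 2

# Midpoint rule on [-10, 10]; the integrand is negligible outside this range.
n, a, b = 20_000, -10.0, 10.0
h = (b - a) / n
energy = h * sum(h_density(a + (i + 0.5) * h) for i in range(n))
print(energy)   # should be close to hbar * omega / 2
```

For this state the kinetic and potential contributions are equal, so the integral of the density reproduces the ground-state energy without ever applying the Hamiltonian operator to $\psi$.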
It is therefore tempting to look for a variational principle analogous to what

we saw in Section 2.2.4 for the electrostatic potential.
Unfortunately, this cannot be done. The Hamiltonian formulation of this
problem is even more involved than in Section 5.2.2, and it does not bring
anything new, compared with the Lagrangian formulation (5.38).
In fact, the problem lies in the form of the conjugate momenta (5.40).
As one notices immediately, p and p* are not independent variables. These
momenta are proportional to the dynamical variables 'ljJ and 'ljJ*. This property,
however, is useful in quantum field theory, in particular in what one calls
second quantization. This falls outside the scope of this book.
We will see the variational formulation of quantum mechanics via path
integrals in Chapter 7.
5.6 Problems
5.1. The Telegraph Equation
The equation for neutron transport in matter has the form
$$a^2\frac{\partial\rho}{\partial t} + \frac{3}{v^2}\frac{\partial^2\rho}{\partial t^2} - \Delta\rho = 0, \qquad (5.43)$$
called the telegraph equation (see, for instance, Appendix D of [14]) which
shows a propagation term of the neutron density, of individual velocities v
which we assume to be the same and constant here. In the diffusive regime,
in reactor cores, this term is negligible. There exist situations (for instance,
neutrino transport in supernovae) where all terms must be kept owing to the
discontinuities of the diffusive medium.
Proceeding as in (3.34), write the form of a Lagrangian from which this
equation is derived.
6
Motion in a Curved Space
You will never make a crab walk straight.

Aristophanes
Einstein's masterpiece, general relativity, stems from the amazing observation

that two physical quantities that a priori have nothing in common are equal
or strictly proportional. These quantities are the two concepts of mass. One is
the inertial mass, or the coefficient of inertia, and the other is the gravitational
mass, or the coupling coefficient to the gravitational field. There is no a priori
argument that can explain why this equality occurs. In a gravitational field,
this equality eliminates the mass from the equations of motion. Two bodies
placed with the same initial conditions in the same field have the same motion
whatever their masses.
It took some time to realize how deep this observation is. The historical
experiment of Eötvös in 1890¹ has been systematically redone and improved
since then. It is still performed with more and more sophisticated techniques.
The underlying idea of general relativity is that the equality becomes
natural if what we call the "gravitational" motion is actually a free motion in
a curved space-time.
Einstein used to say² that in 1907, when he was working on how to incorporate
Newtonian gravitation in relativity (the incorporation of electromagnetism
was by construction automatic), he had the "happiest thought of
his life" (the original version is "glücklichster Gedanke meines Lebens"). He
was thinking of what a carpenter falling from the roof would feel. For such
an "observer" (and of course as long as he does not encounter any obstacle)
there is no gravitational field (the italics are from Einstein). If this observer
lets any object "fall" from his pocket, this object stands still or has a uniform
linear motion with respect to him, whatever its nature and its physical and
chemical composition (the resistance of the atmosphere is neglected).
¹ Roland Eötvös, "Über die Anziehung der Erde auf verschiedene Substanzen,"
Math. nat. Ber. Ungarn, 8, 65 (1890).
² See, for instance, A. Pais, Subtle Is the Lord, Chapter 9, Oxford University Press,
New York, 1982. The original letter of Einstein to R. W. Lawson in January 1920
has been found. The published article, A. Einstein, Nature, 106, 782 (1921), is
not as light in spirit.
The "equivalence principle" and its consequences can be found in many
books, for instance one by Hans Stefani [15]. The ambition of this chapter is
to show how the notion of motion in a curved space can lead to a theory such
that the equality of the "two masses" emerges naturally.
The equivalence principle can be stated in the following way. For a short
time, the laws of physics in a small laboratory in free fall are the same as
in the same laboratory in an inertial reference frame in the absence of gravitation.
One usually makes a distinction between this principle, which only
concerns the motion, and the theory of general relativity itself (i.e., the Einstein
equations that relate the curvature tensor of space-time to the energy-momentum
tensor of matter). In this book, we shall not describe Einstein's
equations and their consequences.
We will start by studying the free motion of a particle in a curved space.
In Section 6.1, we define what one calls a curved space and introduce the
fundamental notion of the metric of the space. In Section 6.2, we will write
the motion of a free particle in such a space. This will lead us, in Section
6.3, to a fundamental result: The physical trajectories are the geodesics of the
space; i.e., the curves of minimal (or extremal) length. As we shall see, this is
how the motion of a particle of constant energy E in a Euclidean space-time,
can be transformed into the free motion of the same particle in a curved space,
which is equivalent to the Maupertuis principle.
This will allow us to understand the reasoning of Einstein when he con-
structed general relativity, and some consequences of this theory. We will
display three historical examples: the variation of the beat of a clock due to
the gravitational field, the corrections to Newton's celestial mechanics, and
the deviation of light rays by a gravitational field.
These examples are historical. They are also very important in present-
day astrophysics and cosmology. The deviation of light by a gravitational field
plays an important role via the gravitational lensing effect that it induces. One
application is the search for a baryonic component in the "missing mass" of
the universe. Another is that the mass distribution in the universe, be it the
visible mass or the missing mass, acts as a natural telescope that can enable
us to see faraway objects, and therefore much younger objects. Through this
natural cosmic telescope (or microscope), the universe appears as an endless
gallery of gravitational mirages.
6.1 Curved Spaces

6.1.1 Generalities
It is the work of mathematicians on Euclid's fifth axiom (the postulate of
parallel lines) that led to the developments on the existence and properties of
non-Euclidean spaces. Legendre (1752-1833) had shown that Euclid's axiom
is equivalent to the assumption that the sum of the angles of a triangle is
equal to $\pi$. As early as 1816, Carl Friedrich Gauss (1777-1855) had convinced
himself that this statement could not be proven (roughly 15 years before the
celebrated work of Nikolay Ivanovitch Lobatchevsky).
Gauss then addressed the question of measuring whether the three-
dimensional space in which we live is "flat" (i.e. Euclidean) or not. Gauss
tried to verify that the sum of the angles of a triangle is equal to 180 degrees.
He performed this measurement between three peaks of the Harz, in cen-
tral Germany: the Inselberg, the Brocken and the Hoher Hagen. As "straight
lines," he used light rays reflected between these three points, and he had to
conclude that (unfortunately) the sum of the angles is equal to 180 degrees.
Some remarks are in order on the idea and on the result.
1. If the accuracy of the measurement had been good enough (owing to
atmospheric density fluctuations, accurate measurements could not be
performed with present laser technology), Gauss would have detected a
slight deviation (rv 10- 1), and therefore a curvature of space, since light
rays are deviated by the Earth's gravitational field.
2. A measurement, the spirit of which is close to Gauss's, was performed in
1964 by Shapiro,³ who measured the delay of a radar echo between the
Earth and space probes placed on the planets Mercury, Venus, and Mars
(the Viking program). When the planet crosses the direction of the sun
opposite to the earth, the delay is longer than what one would expect
from Euclidean geometry. (Celestial mechanics allows us to calculate the
orbits, and therefore the Euclidean distances, with great accuracy.)
3. Gauss's idea was similar to that of Eratosthenes when he measured the
radius of the Earth by comparing the shadows of two vertical sticks on
the same meridian, in Syene and in Alexandria, on the summer solstice.
Eratosthenes had read in a document the observation that on the day of
the summer solstice, and only on that day, the wells in Syene (Aswan),
on the tropic of Cancer, had no shadow inside at noon. At any other
moment, a shadow appeared somewhere on the sides and on the bottom.
Eratosthenes concluded that the sun was, at that moment, at the vertical
of Syene. He measured the shadow of a vertical bar in Alexandria at noon
on the same day on the same meridian.
The measurement gave him an angle of 7 degrees and 12 minutes. He
figured out the distance between Syene and Alexandria (probably the most
difficult task in the experiment) and found a value of the circumference
of the order of 40,350 km (compared with the actual value of 40,074 km).
There is part luck in the accuracy of the value (Eratosthenes did not know
how far each city was from the same meridian). However, intellectually, the
experiment is fascinating since it provides a means to probe the structure
of the space in which we live and to measure its radius.
3 I. I. Shapiro, Phys. Rev. Lett., 13, 789 (1964).

4. The necessary tool for this type of measurement is to have straight lines
(i.e., geodesics) of the space. It appears that always, whether it was Thales
measuring the height of the Great Pyramid, Eratosthenes, or Gauss, it was
implicit in the minds of people that light rays are physical entities that
possess the "perfect" mathematical property of propagating along straight
lines.
In his celebrated memoir on the theory of surfaces, Gauss understood that
the geometry of a surface is an intrinsic property of the surface, independent
of whether this surface is embedded in a Euclidean space or not. Gauss's ideas
were the starting points for the developments performed by Riemann.
In order to see whether or not a space is Euclidean, one can check whether
the Pythagorean theorem, the triangle inequality, and the angle formula above
are satisfied or not. Analyzing this further shows that everything boils down
to measuring distances and comparing sets of them. Hence the importance of
what is called the metric tensor or simply the metric of the space, which we
shall introduce below.
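A famous example, due to Einstein, illustrates this fact. Consider four
points in a space, which we denote 1, 2, 3, 4, and let us denote $d_{ij}$ the distance
between points $i$ and $j$. In a flat, Euclidean space, the following relation is
always satisfied
$$
\begin{aligned}
& d_{12}^4 d_{34}^2 + d_{12}^2 d_{34}^4 + d_{13}^4 d_{24}^2 + d_{13}^2 d_{24}^4 + d_{14}^4 d_{23}^2 + d_{14}^2 d_{23}^4 \\
& + d_{12}^2 d_{23}^2 d_{31}^2 + d_{12}^2 d_{24}^2 d_{41}^2 + d_{13}^2 d_{34}^2 d_{41}^2 + d_{23}^2 d_{34}^2 d_{42}^2 \\
& - d_{12}^2 d_{23}^2 d_{34}^2 - d_{13}^2 d_{32}^2 d_{24}^2 - d_{12}^2 d_{24}^2 d_{43}^2 - d_{14}^2 d_{42}^2 d_{23}^2 \\
& - d_{13}^2 d_{34}^2 d_{42}^2 - d_{14}^2 d_{43}^2 d_{32}^2 - d_{23}^2 d_{31}^2 d_{14}^2 - d_{21}^2 d_{13}^2 d_{34}^2 \\
& - d_{24}^2 d_{41}^2 d_{13}^2 - d_{21}^2 d_{14}^2 d_{43}^2 - d_{31}^2 d_{12}^2 d_{24}^2 - d_{32}^2 d_{21}^2 d_{14}^2 = 0.
\end{aligned}
$$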
One can use an airline schedule (and some courage) to verify that this equality
is not satisfied by Paris, New York, Johannesburg, and Shanghai (or any
other set of four airports), provided one uses the actual distances covered by
airplanes going as "straight" as possible from one place to the other.
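The four-point relation can be checked numerically. The following Python sketch is an illustration, not part of the original text. It evaluates the left-hand side as the six opposite-edge terms, plus the four triangle terms, minus the twelve open-path terms, first for four points chosen arbitrarily in a Euclidean plane and then for four points on a unit sphere with distances measured along great circles. The point coordinates and the helper names are assumptions made for the example.

```python
import math
from itertools import combinations, permutations

def four_point_relation(dist):
    """Left-hand side of the four-point relation.

    dist(i, j) returns the distance between points i and j (labels 1..4).
    """
    def d2(i, j):
        return dist(i, j) ** 2

    total = 0.0
    # Six terms d^4 d^2 built from the three pairs of opposite edges.
    for (i, j), (k, l) in (((1, 2), (3, 4)), ((1, 3), (2, 4)), ((1, 4), (2, 3))):
        a, b = d2(i, j), d2(k, l)
        total += a * a * b + a * b * b
    # Four triangle terms, one per triple of points.
    for i, j, k in combinations((1, 2, 3, 4), 3):
        total += d2(i, j) * d2(j, k) * d2(k, i)
    # Twelve open-path terms i-j-k-l (a path and its reverse count once).
    for i, j, k, l in permutations((1, 2, 3, 4)):
        if i < l:
            total -= d2(i, j) * d2(j, k) * d2(k, l)
    return total

# Four arbitrary points in a Euclidean plane (assumed coordinates).
plane = {1: (0.0, 0.0), 2: (3.0, 0.0), 3: (1.0, 2.0), 4: (-1.0, 1.5)}

def flat_distance(i, j):
    return math.dist(plane[i], plane[j])

# Four points on a unit sphere, as (latitude, longitude) in degrees.
sphere = {1: (90.0, 0.0), 2: (0.0, 0.0), 3: (0.0, 90.0), 4: (0.0, 180.0)}

def great_circle_distance(i, j):
    lat1, lon1 = (math.radians(x) for x in sphere[i])
    lat2, lon2 = (math.radians(x) for x in sphere[j])
    c = (math.sin(lat1) * math.sin(lat2)
         + math.cos(lat1) * math.cos(lat2) * math.cos(lon1 - lon2))
    return math.acos(max(-1.0, min(1.0, c)))

flat_value = four_point_relation(flat_distance)
curved_value = four_point_relation(great_circle_distance)
print("plane :", flat_value)
print("sphere:", curved_value)
```

The planar value vanishes up to rounding errors, whereas the great-circle distances on the sphere give a value clearly different from zero.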
6.1.2 Metric Tensor
We characterize a point of the space⁴ by a set of coordinates $\{x^\alpha\}$. The distance
$ds$ between two infinitesimally separated points $\{x^\alpha\}$ and $\{x^\alpha + dx^\alpha\}$
is given, by the definition of the metric tensor $g_{\alpha\beta}$ of the space, as
$$ds^2 = \sum_{\alpha,\beta} g_{\alpha\beta}\,dx^\alpha dx^\beta \equiv g_{\alpha\beta}\,dx^\alpha dx^\beta. \tag{6.1}$$
In the second form, we make use of Einstein's convention of summation over
repeated indices.
4 Or the manifold; we use both terms.
The inverse $g^{\alpha\beta}$ is defined by
$$g^{\alpha\beta}\,g_{\beta\gamma} = \delta^\alpha_\gamma. \tag{6.2}$$
In Euclidean space $\mathbb{R}^3$, in Cartesian coordinates, we have
$ds^2 = dx^2 + dy^2 + dz^2$, i.e., $g_{xx} = g_{yy} = g_{zz} = 1$, and $g_{ij} = 0$ otherwise.
In spherical coordinates $(r, \theta, \phi)$, the metric tensor is also diagonal, but its
elements are no longer constants:
$$g_{rr} = 1, \quad g_{\theta\theta} = r^2, \quad g_{\phi\phi} = r^2\sin^2\theta.$$
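The spherical components of the metric follow from the Cartesian ones through $g_{ab} = \sum_k (\partial x_k/\partial q_a)(\partial x_k/\partial q_b)$. As a hedged numerical illustration (the sample point, the step size, and the function names are assumptions of this sketch), one can build this sum with finite differences and compare it with $\mathrm{diag}(1, r^2, r^2\sin^2\theta)$:

```python
import math

def cartesian(r, theta, phi):
    """Cartesian coordinates (x, y, z) of the point with spherical coordinates (r, theta, phi)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def induced_metric(q, h=1e-6):
    """g_ab = sum_k (dx_k/dq_a)(dx_k/dq_b), with derivatives by central differences."""
    n = len(q)
    jac = []
    for a in range(n):
        plus, minus = list(q), list(q)
        plus[a] += h
        minus[a] -= h
        xp, xm = cartesian(*plus), cartesian(*minus)
        jac.append([(p - m) / (2.0 * h) for p, m in zip(xp, xm)])
    return [[sum(jac[a][k] * jac[b][k] for k in range(3)) for b in range(n)]
            for a in range(n)]

r, theta, phi = 2.0, 0.7, 1.3          # arbitrary sample point
g = induced_metric((r, theta, phi))
expected = [[1.0, 0.0, 0.0],
            [0.0, r**2, 0.0],
            [0.0, 0.0, (r * math.sin(theta))**2]]
for row_num, row_exp in zip(g, expected):
    print([round(x, 6) for x in row_num], row_exp)
```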
6.1.3 Examples
Sphere $S^2$ in $\mathbb{R}^3$
Let $(x, y, z)$ be the coordinates of a point of three-dimensional Euclidean space
$\mathbb{R}^3$. A sphere of radius $R$ centered at the origin corresponds to points such
that $x^2 + y^2 + z^2 = R^2$. The square of the distance between two (infinitesimally
distant) points in $\mathbb{R}^3$ is $ds^2 = dx^2 + dy^2 + dz^2$. On the sphere, we have of course
$z\,dz = -(x\,dx + y\,dy)$ so that, putting it all together, we have in Cartesian
coordinates
$$ds^2 = dx^2 + dy^2 + \frac{(x\,dx + y\,dy)^2}{R^2 - x^2 - y^2},$$
i.e., the metric tensor
$$g_{xx} = 1 + \frac{x^2}{R^2 - x^2 - y^2}, \quad g_{yy} = 1 + \frac{y^2}{R^2 - x^2 - y^2}, \quad g_{xy} = g_{yx} = \frac{xy}{R^2 - x^2 - y^2}.$$
Of course, the expression is much simpler in spherical coordinates $(\theta, \phi)$:
$$ds^2 = R^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right).$$
Three-Dimensional Spaces Embedded in Four-Dimensional Euclidean Space $\mathbb{R}^4$
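Similarly, we can construct isotropic three-dimensional curved spaces embedded
in $\mathbb{R}^4$. In addition to the usual metric of a flat space, there is a curvature
term, which is particularly simple to express in spherical coordinates $(\rho, \theta, \phi)$.
Let $(x, y, z, w)$ be the Cartesian coordinates in $\mathbb{R}^4$, with $\rho^2 = x^2 + y^2 + z^2$.
We then obtain the following:
1. "Spherical" space, $S^3$ sphere:
$$x^2 + y^2 + z^2 + w^2 = R^2; \quad\text{i.e.,}\quad dw^2 = \frac{\rho^2\,d\rho^2}{R^2 - \rho^2}. \tag{6.3}$$
2. Hyperbolic spaces:
Two-sheet hyperboloid:
$$w^2 - (x^2 + y^2 + z^2) = R^2; \quad\text{i.e.,}\quad dw^2 = \frac{\rho^2\,d\rho^2}{\rho^2 + R^2}. \tag{6.4}$$
One-sheet hyperboloid:
$$x^2 + y^2 + z^2 - w^2 = R^2; \quad\text{i.e.,}\quad dw^2 = \frac{\rho^2\,d\rho^2}{\rho^2 - R^2}. \tag{6.5}$$
3. Parabolic space:
$$w = \frac{\rho^2}{2a}; \quad\text{i.e.,}\quad dw^2 = \frac{\rho^2\,d\rho^2}{a^2}. \tag{6.6}$$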
In all these cases, the metric tensor is expressed as
$$ds^2 = d\rho^2 + \rho^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) + dw^2. \tag{6.7}$$
General Case
One could, of course, continue playing the same type of game as in these
examples by imposing any constraint of the type $p(x, y, z, w) = 0$ in the space
$\mathbb{R}^4$. Actually, one would be far from discovering all three-dimensional curved
spaces.
The definition of a curved space consists of choosing the metric $\{g_{\alpha\beta}\}$; the
simple examples above are only illustrations.
Historically, the most famous example was given by Felix Klein in 1890.
It was a concrete example of the geometries of Gauss, János Bolyai, and
Lobatchevsky. Klein's model consists of an analytical geometry where each
point is represented by two real numbers, $x_1$ and $x_2$, such that $x_1^2 + x_2^2 < 1$
and where the distance $d(x, y)$ between two points is defined to be
(6.8)
where a is a dimensional scale parameter.

Note that this two-dimensional space, whose curvature is negative (as opposed
to a sphere, which has a positive curvature) cannot be embedded in
three-dimensional Euclidean space $\mathbb{R}^3$. It can only be embedded in spaces of
dimension greater than three.
6.2 Free Motion in a Curved Space
We now study the free motion of a particle of mass m in a curved space.

6.2.1 Lagrangian
Since the particle is free, the Lagrangian boils down to its kinetic part, $E_{\rm kin} =
mv^2/2$; i.e.,
$$\mathcal{L} = \frac{1}{2}m\left(\frac{ds}{dt}\right)^2 = \frac{1}{2}m\,g_{\alpha\beta}\,\frac{dx^\alpha}{dt}\frac{dx^\beta}{dt}. \tag{6.9}$$
Note that although the space variables do not seem to appear explicitly in this
Lagrangian, they are present in the metric $g_{\alpha\beta}$.
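The conjugate momenta are obtained with no difficulty. Assuming the
metric is symmetric, $g_{\alpha\beta} = g_{\beta\alpha}$, which does not restrict the generality, one
obtains
$$p_\alpha = \frac{\partial\mathcal{L}}{\partial\dot x^\alpha} = m\,g_{\alpha\beta}\,\dot x^\beta. \tag{6.10}$$
The Hamiltonian is
$$H = \frac{1}{2m}\,g^{\alpha\beta}\,p_\alpha p_\beta. \tag{6.11}$$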
The value of the Hamiltonian is the same as the value of the Lagrangian
as it should be since we consider a free particle. (Of course, the Lagrangian
and Hamiltonian functions are not expressed with the same variables.) We
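deduce a consequence which is both obvious and important. Because of energy
conservation, the square of the velocity is a constant of the motion along the
trajectory:
$$v^2 = \left(\frac{ds}{dt}\right)^2 = g_{\alpha\beta}\,\dot x^\alpha\dot x^\beta = \frac{2E}{m} = \text{constant}. \tag{6.12}$$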
This property is the curved-space extension of the principle of inertia. In

the particular case of a Euclidean space, the velocity vector of a particle is
constant.
6.2.2 Equations of Motion
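The equations of motion are obtained in the usual way
$$\frac{d}{dt}\left(m\,g_{\nu\beta}\,\dot x^\beta\right) = \frac{m}{2}\,\frac{\partial g_{\alpha\beta}}{\partial x^\nu}\,\dot x^\alpha\dot x^\beta. \tag{6.13}$$
We shall use this in the next section. The expanded form of this equation is
$$m\,g_{\nu\beta}\,\ddot x^\beta + m\,\frac{\partial g_{\nu\beta}}{\partial x^\alpha}\,\dot x^\alpha\dot x^\beta - \frac{m}{2}\,\frac{\partial g_{\alpha\beta}}{\partial x^\nu}\,\dot x^\alpha\dot x^\beta = 0. \tag{6.14}$$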
We notice, and it is not surprising, that the mass cancels off identically:
The free motion of a particle in a curved space is independent of the mass

of the particle. The trajectory only depends on the initial conditions.
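One obtains the equation for the $\ddot x^\mu$ by multiplying (6.14) by $g^{\mu\nu}$ (6.2),
$$\ddot x^\mu + \Gamma^\mu_{\alpha\beta}\,\dot x^\alpha\dot x^\beta = 0, \tag{6.15}$$
where we follow the usual conventions of general relativity by introducing the
Christoffel symbols $\Gamma^\mu_{\alpha\beta}$ defined by
$$\Gamma^\mu_{\alpha\beta} = \frac{1}{2}\,g^{\mu\nu}\left(\frac{\partial g_{\alpha\nu}}{\partial x^\beta} + \frac{\partial g_{\beta\nu}}{\partial x^\alpha} - \frac{\partial g_{\alpha\beta}}{\partial x^\nu}\right). \tag{6.16}$$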
We shall make no further use of these symbols in this book, but it is a good
example to show that the formal complexity of general relativity is a matter
of writing; what is subtle is the physics.
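To illustrate that the Christoffel symbols are a matter of bookkeeping, here is a minimal sketch that evaluates $\Gamma^\mu_{\alpha\beta} = \frac{1}{2} g^{\mu\nu}(\partial_\beta g_{\alpha\nu} + \partial_\alpha g_{\beta\nu} - \partial_\nu g_{\alpha\beta})$ by finite differences for the two-dimensional sphere, with metric $\mathrm{diag}(R^2, R^2\sin^2\theta)$ in the coordinates $(\theta, \phi)$. The function names, the step size, and the sample point are assumptions of the example. The result is compared with the standard values $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$.

```python
import math

def metric_s2(x, R=1.0):
    """Metric of the sphere S2 in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[R**2, 0.0], [0.0, (R * math.sin(theta))**2]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def christoffel(metric, x, h=1e-5):
    """Gamma[mu][a][b] for a two-dimensional metric, derivatives by central differences."""
    n = len(x)
    # dg[c][a][b] = d g_ab / d x^c
    dg = []
    for c in range(n):
        xp, xm = list(x), list(x)
        xp[c] += h
        xm[c] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[a][b] - gm[a][b]) / (2.0 * h) for b in range(n)]
                   for a in range(n)])
    ginv = inverse_2x2(metric(x))
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for mu in range(n):
        for a in range(n):
            for b in range(n):
                gamma[mu][a][b] = 0.5 * sum(
                    ginv[mu][nu] * (dg[b][a][nu] + dg[a][b][nu] - dg[nu][a][b])
                    for nu in range(n))
    return gamma

THETA0 = 0.9                                   # arbitrary sample colatitude
gamma_s2 = christoffel(metric_s2, [THETA0, 0.3])
print(gamma_s2[0][1][1], -math.sin(THETA0) * math.cos(THETA0))
print(gamma_s2[1][0][1], math.cos(THETA0) / math.sin(THETA0))
```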
6.2.3 Simple Examples
1. Motion on S2
One can, as an exercise, recover that the motion on a usual sphere S2 is
a uniform motion on a great circle.
2. Motion on S3
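Consider now the case of the three-dimensional "spherical" space of (6.3);
i.e., the free motion on the sphere $S^3$.
Obviously, the volume of this space is finite since $\rho^2 = x^2 + y^2 + z^2 \le R^2$.
In spherical coordinates, the Lagrangian of the problem is
$$\mathcal{L} = \frac{m}{2}\left[\frac{R^2\dot\rho^2}{R^2 - \rho^2} + \rho^2\left(\dot\theta^2 + \sin^2\theta\,\dot\phi^2\right)\right]. \tag{6.17}$$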
The conservation laws of the problem bring the following simplifications.

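a) There is rotational invariance. The angular momentum is conserved,
and the motion occurs on a plane.
b) We can choose the direction of the angular momentum as polar axis;
i.e., $\theta = \pi/2$ and $\dot\theta = 0$.
c) The Lagrangian of the planar motion therefore reduces to
$$\mathcal{L} = \frac{m}{2}\left[\frac{R^2\dot\rho^2}{R^2 - \rho^2} + \rho^2\dot\phi^2\right]. \tag{6.18}$$
d) The conservation of angular momentum results in
$$\rho^2\dot\phi = A, \tag{6.19}$$
where $A$ is a constant, fixed by the initial conditions.
e) The energy, which is a constant of the motion, is therefore
$$E = \frac{m}{2}\left[\frac{R^2\dot\rho^2}{R^2 - \rho^2} + \frac{A^2}{\rho^2}\right]. \tag{6.20}$$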
We notice that the two constants of the motion $E$ and $A$ satisfy the
inequality
$$A^2 \le \frac{2R^2E}{m}, \tag{6.21}$$
which is a direct consequence of the fact that the energy is greater
than the rotational energy $mA^2/2\rho^2$. This is a consequence of (6.20);
i.e., $E \ge mA^2/2\rho^2 \ge mA^2/2R^2$.
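The equations (6.19) and (6.20) are first-order differential equations that
determine the motion in terms of the constants of the motion $E$ and $A$.
The solution is simple. We define parameters $\omega$ and $\gamma$ by
$$\omega^2 = \frac{2E}{mR^2} \quad\text{and}\quad \gamma^2 = \frac{mA^2}{2ER^2}. \tag{6.22}$$
From (6.21), we have the inequality
$$\gamma^2 \le 1. \tag{6.23}$$
We set
$$\rho = R\cos(\omega\psi); \quad\text{i.e.,}\quad \dot\rho = -\omega\dot\psi\,R\sin(\omega\psi). \tag{6.24}$$
If we insert this in equation (6.20), we obtain
$$\omega^2 = \omega^2\dot\psi^2 + \frac{\omega^2\gamma^2}{\cos^2(\omega\psi)}; \tag{6.25}$$
i.e.,
$$\omega^2\dot\psi^2\cos^2(\omega\psi) = \omega^2\left(\cos^2(\omega\psi) - \gamma^2\right). \tag{6.26}$$
We now make the change of functions
$$\sin(\omega\psi(t)) = \sqrt{1 - \gamma^2}\;u(t); \quad\text{therefore}\quad \cos^2(\omega\psi) = 1 - (1 - \gamma^2)u^2. \tag{6.27}$$
The choice
$$u(t) = \sin(\omega\zeta(t)) \tag{6.28}$$
leads with no difficulty to
$$\dot\zeta^2 = 1, \quad\text{namely}\quad u = \sin(\omega(t - t_0)), \tag{6.29}$$
and to the result
$$\rho^2 = R^2\left(\cos^2\omega(t - t_0) + \gamma^2\sin^2\omega(t - t_0)\right), \tag{6.30}$$
which is periodic and of frequency $\omega$. The calculation of the time evolution
of the azimuthal angle $\phi(t)$ is obtained by this expression and equation
(6.19),
$$\dot\phi = \frac{A}{R^2\left(\cos^2\omega(t - t_0) + \gamma^2\sin^2\omega(t - t_0)\right)}; \tag{6.31}$$
i.e.,
$$\tan(\phi(t) - \phi_0) = \gamma\tan\omega(t - t_0), \tag{6.32}$$
which is also periodic and of frequency $\omega$.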
We conclude the following.
a) Consider the Euclidean plane of the motion (i.e., $x = \rho\cos\phi$, $y = \rho\sin\phi$).
For simplicity, we choose the initial parameters as $t_0 = 0$, $\phi_0 = 0$,
and we have
$$x = R\cos\omega t, \qquad y = \gamma R\sin\omega t.$$
The trajectory is an ellipse of equation $x^2 + y^2/\gamma^2 = R^2$.

b) The point $\rho = R$ (i.e., the boundary of the space) is always reached,
whatever the initial conditions on the energy and the angular momentum.
If $A = 0$, the angular momentum vanishes and the motion
is linear and sinusoidal. If $\gamma = 1$, the motion is uniform on a circle of
radius $R$.
c) In this Euclidean plane, the energy of the particle is
where the "effective potential" V is energy dependent:
Therefore, the motion also appears as a two-dimensional harmonic
motion whose frequency $\Omega$ depends on the total energy $E$,
$$\Omega^2 = \frac{2E}{mR^2}.$$
d) Of course, if the square of the velocity is a constant in the curved

four-dimensional space, this is not the case if one visualizes the phe-
nomenon in a Euclidean plane, as above.
e) The simplicity of the result is intuitive. Quite obviously, as one can see
in the definition (6.3), the symmetry of the problem is much larger
than the sole rotation in $\mathbb{R}^3$. There is a rotation invariance in $\mathbb{R}^4$.
The solutions of maximal symmetry correspond to a uniform motion
on a circle of radius $R$ in a plane whose orientation is arbitrary in $\mathbb{R}^4$.
The whole set of solutions is obtained by projecting these particular
solutions on planes of $\mathbb{R}^3$, which leads to the elliptic trajectories we
have found.
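The elliptic solution can be checked against the two conservation laws. The sketch below is an illustration with assumed numerical values ($R = m = 1$, $E = 0.5$, $\gamma = 0.5$) and assumed function names. It takes $\rho^2 = R^2(\cos^2\omega t + \gamma^2\sin^2\omega t)$ and $\tan\phi = \gamma\tan\omega t$, differentiates them numerically, and evaluates $A = \rho^2\dot\phi$ and $E = \frac{m}{2}\left[R^2\dot\rho^2/(R^2 - \rho^2) + \rho^2\dot\phi^2\right]$ at several times, avoiding the instants where $\rho = R$ and the first term becomes an indeterminate ratio.

```python
import math

R, m = 1.0, 1.0                 # illustrative values
E, gamma = 0.5, 0.5
omega = math.sqrt(2.0 * E / (m * R**2))
A = gamma * omega * R**2        # from gamma^2 = m A^2 / (2 E R^2)

def rho(t):
    return R * math.sqrt(math.cos(omega * t)**2 + gamma**2 * math.sin(omega * t)**2)

def phi(t):
    # Branch of tan(phi) = gamma * tan(omega t), continuous for 0 <= omega t < pi.
    return math.atan2(gamma * math.sin(omega * t), math.cos(omega * t))

def constants_of_motion(t, h=1e-5):
    """Return (energy, angular-momentum constant) evaluated numerically at time t."""
    rho_dot = (rho(t + h) - rho(t - h)) / (2.0 * h)
    phi_dot = (phi(t + h) - phi(t - h)) / (2.0 * h)
    r = rho(t)
    energy = 0.5 * m * (R**2 * rho_dot**2 / (R**2 - r**2) + r**2 * phi_dot**2)
    return energy, r**2 * phi_dot

samples = [constants_of_motion(t) for t in (0.3, 0.8, 1.2, 2.0, 2.9)]
for energy, ang in samples:
    print(round(energy, 8), round(ang, 8))
```

Both quantities come out constant along the trajectory, equal to the chosen $E$ and to $A = \gamma\omega R^2$.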
6.2.4 Conjugate Momenta and the Hamiltonian
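The calculation of the conjugate momenta and the Hamiltonian is straightforward
in the example above. We obtain
$$p_\rho = m\dot\rho\,\frac{R^2}{R^2 - \rho^2}, \quad p_\theta = m\rho^2\dot\theta, \quad p_\phi = m\rho^2\sin^2\theta\,\dot\phi, \tag{6.33}$$
and the Hamiltonian
$$H = \frac{1}{2m}\left(p_\rho^2\,\frac{R^2 - \rho^2}{R^2} + \frac{p_\theta^2}{\rho^2} + \frac{p_\phi^2}{\rho^2\sin^2\theta}\right). \tag{6.34}$$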
6.3 Geodesic Lines

The following property is fundamental.
Theorem 5. The trajectories of a free particle in a curved space are the

geodesic lines of this space.
6.3.1 Definition
In the case of a positive metric, a possible definition of a geodesic line, hereafter
called a geodesic for simplicity, going through two points $A$ and $B$ is that
it is the curve of minimal (extremal) length between these two points. In
differential geometry, there are other equivalent definitions.⁵
Therefore, a geodesic is the path that minimizes the length
$$s_{AB} = \int_A^B ds = \int_A^B\sqrt{g_{\alpha\beta}\,dx^\alpha dx^\beta}; \tag{6.35}$$
i.e., the path $\{x^\alpha\}$ such that $\delta s_{AB} = 0$ for any infinitesimal variation
$\{\delta x^\alpha, \delta\dot x^\alpha\}$. Considering an arbitrary parameterization $\{x^\alpha(\lambda)\}$ of the path,
we must find the path that minimizes the integral
$$s_{AB} = \int_A^B\sqrt{g_{\alpha\beta}\,\frac{dx^\alpha}{d\lambda}\frac{dx^\beta}{d\lambda}}\;d\lambda. \tag{6.36}$$
The assertion above is that these paths are the same as those along which
the action
$$S = \int_A^B\mathcal{L}\,dt = \frac{1}{2}m\int_A^B g_{\alpha\beta}\,\frac{dx^\alpha}{dt}\frac{dx^\beta}{dt}\,dt \tag{6.37}$$
is stationary.
5 In Minkowski space, light rays follow trajectories of vanishing "length." The no-
tion of parallel transport allows us to overcome this apparent difficulty; see [5],
[16], or [18].
6.3.2 Equation of the Geodesics
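The variational problem posed in equation (6.36) is similar in every way to
those considered in Chapter 2. Consider a variation
$$x^\nu \to x^\nu + \epsilon^\nu \quad\text{and}\quad \dot x^\nu \to \dot x^\nu + \dot\epsilon^\nu, \tag{6.38}$$
where
$$\dot\epsilon^\nu = \frac{d\epsilon^\nu}{d\lambda} \quad\text{and}\quad \epsilon^\nu(A) = \epsilon^\nu(B) = 0. \tag{6.39}$$
To first order, the variation of $s_{AB}$ is
$$\delta s_{AB} = \int_A^B\frac{1}{2\sqrt{g_{\alpha\beta}\,\dot x^\alpha\dot x^\beta}}\left(\frac{\partial g_{\alpha\beta}}{\partial x^\nu}\,\dot x^\alpha\dot x^\beta\,\epsilon^\nu + 2g_{\nu\beta}\,\dot x^\beta\,\dot\epsilon^\nu\right)d\lambda, \tag{6.40}$$
where we have set $\dot x^\alpha \equiv (d/d\lambda)x^\alpha$. We now integrate the second term by parts.
Consider the quantity
$$F = \sqrt{g_{\alpha\beta}\,\dot x^\alpha\dot x^\beta}.$$
We have
$$\delta s_{AB} = \int_A^B\left[\frac{1}{2F}\left(\frac{\partial g_{\alpha\beta}}{\partial x^\nu}\,\dot x^\alpha\dot x^\beta - \frac{d}{d\lambda}\left(2g_{\nu\beta}\,\dot x^\beta\right)\right) - \left(2g_{\nu\beta}\,\dot x^\beta\right)\frac{d}{d\lambda}\left(\frac{1}{2F}\right)\right]\epsilon^\nu\,d\lambda = 0. \tag{6.41}$$
This variation must vanish for any $\{\epsilon^\nu\}$, and we obtain the equations
$$\frac{1}{2F}\left(\frac{\partial g_{\alpha\beta}}{\partial x^\nu}\,\dot x^\alpha\dot x^\beta - \frac{d}{d\lambda}\left(2g_{\nu\beta}\,\dot x^\beta\right)\right) - \left(2g_{\nu\beta}\,\dot x^\beta\right)\frac{d}{d\lambda}\left(\frac{1}{2F}\right) = 0. \tag{6.42}$$
These equations are simplified if one makes an appropriate choice of the parameter
$\lambda$. Consider the choice $\lambda = s$; i.e., $\lambda$ is the length along the geodesic
and $d\lambda = ds$.⁶ Then, by definition, inserting this in equation (6.36), we have
along the geodesic
$$F = 1, \quad\text{and}\quad \frac{dF}{d\lambda} = 0.$$
Consequently, the equation of the geodesic becomes
$$\frac{d}{ds}\left(g_{\nu\beta}\,\frac{dx^\beta}{ds}\right) = \frac{1}{2}\,\frac{\partial g_{\alpha\beta}}{\partial x^\nu}\,\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds}. \tag{6.43}$$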
Not only does this equation have the same form as the equation of motion
(6.13), but it is equivalent to it. Indeed, we can choose $\lambda = t$. In the case of a
free motion, we have seen that $v = ds/dt$ is a constant along the trajectory.
Therefore, $ds = v\,dt$ and the factor $1/v^2$ cancels off identically in (6.43).
We have proven our assertion. The trajectories of a free particle in a curved
space are the geodesics of the space. In other words, the trajectory followed
by a free particle to go from $A$ to $B$ in a curved space is the path of shortest
length.⁷ Galileo's principle of inertia appears as a particular case of this
property in flat space.
6 Actually, it suffices that $\lambda$ be an affine function of $s$.
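As a small numerical illustration of the shortest-length property, one can integrate $ds = \sqrt{d\theta^2 + \sin^2\theta\,d\phi^2}$ on the unit sphere along two paths joining the same pair of points: an arc of parallel (constant $\theta$) and the great-circle arc. This is a sketch under assumed values; the end points (colatitude 30 degrees, longitudes 0 and 90 degrees), the number of segments, and the function names are choices made for the example.

```python
import math

def path_length(path, n=20000):
    """Length of a curve on the unit sphere, from ds^2 = dtheta^2 + sin^2(theta) dphi^2.

    path(lam) returns (theta, phi) for 0 <= lam <= 1; the integral over lam
    is approximated by summing n short segments.
    """
    total = 0.0
    theta0, phi0 = path(0.0)
    for k in range(1, n + 1):
        theta1, phi1 = path(k / n)
        theta_mid = 0.5 * (theta0 + theta1)
        total += math.hypot(theta1 - theta0, math.sin(theta_mid) * (phi1 - phi0))
        theta0, phi0 = theta1, phi1
    return total

# End points A and B: same colatitude, longitudes 0 and 90 degrees.
THETA = math.radians(30.0)
PHI_A, PHI_B = 0.0, math.radians(90.0)

def unit_vector(theta, phi):
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

VEC_A, VEC_B = unit_vector(THETA, PHI_A), unit_vector(THETA, PHI_B)
ANGLE_AB = math.acos(sum(a * b for a, b in zip(VEC_A, VEC_B)))

def along_parallel(lam):
    """Path at constant colatitude (not a geodesic)."""
    return THETA, PHI_A + lam * (PHI_B - PHI_A)

def along_great_circle(lam):
    """Great-circle arc from A to B, by spherical linear interpolation."""
    wa = math.sin((1.0 - lam) * ANGLE_AB) / math.sin(ANGLE_AB)
    wb = math.sin(lam * ANGLE_AB) / math.sin(ANGLE_AB)
    x, y, z = (wa * a + wb * b for a, b in zip(VEC_A, VEC_B))
    return math.acos(max(-1.0, min(1.0, z))), math.atan2(y, x)

length_parallel = path_length(along_parallel)
length_great_circle = path_length(along_great_circle)
print(length_parallel, length_great_circle)
```

The great-circle arc is the shorter of the two, and its computed length agrees with the angle between the two end points seen from the center of the sphere.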
6.3.3 Examples
If we keep in mind how we have treated example 2 of the free motion on S3,
we can use the constants of the motion in order to determine the geodesics in
simple but non trivial cases that are not totally academic.
1. Isotropic spaces
Metrics of the form (6.7), or more generally
$$ds^2 = g(\rho)\,d\rho^2 + \rho^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right), \tag{6.44}$$
are isotropic and time independent. Therefore, as we have seen in example 2,
the conservation laws of energy and angular momentum simplify
considerably the determination of the geodesics.
Starting from (6.44) and the conservation of angular momentum, which
if we choose the polar axis perpendicular to the trajectory is expressed as
$\dot\phi = A/\rho^2$, one obtains
$$\frac{2E}{m} = g(\rho)\,\dot\rho^2 + \frac{A^2}{\rho^2} \tag{6.45}$$
and, by a simple quadrature, the relation between $t$ and $\rho$,
$$\int_{\rho_0}^{\rho}\sqrt{\frac{g(a)}{2E/m - A^2/a^2}}\;da = t - t_0. \tag{6.46}$$
The only difficulty lies in the inversion of this formula in order to obtain
the dependence $\rho(t)$.
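The quadrature lends itself to direct numerical evaluation. The sketch below is an illustration, with an assumed midpoint rule, assumed function names, and assumed numerical values. It evaluates $t - t_0 = \int_{\rho_0}^{\rho}\sqrt{g(a)/(2E/m - A^2/a^2)}\,da$ for a given radial function $g$, and tests it on flat space, $g = 1$, where the motion is a straight line at speed $v = \sqrt{2E/m}$ with distance of closest approach $b = A/v$, so that $t - t_0 = \left(\sqrt{\rho^2 - b^2} - \sqrt{\rho_0^2 - b^2}\right)/v$.

```python
import math

def travel_time(g, rho0, rho1, E, A, m=1.0, n=20000):
    """Midpoint-rule evaluation of t - t0 for an isotropic metric with radial function g.

    Assumes rho0 and rho1 lie on the same side of the turning point, so that
    2E/m - A^2/a^2 stays positive on the whole interval.
    """
    h = (rho1 - rho0) / n
    total = 0.0
    for k in range(n):
        a = rho0 + (k + 0.5) * h
        total += math.sqrt(g(a) / (2.0 * E / m - A**2 / a**2))
    return total * h

# Flat space, g = 1: v = 1 and closest approach b = 0.6 (illustrative values).
E, A, m = 0.5, 0.6, 1.0
rho0, rho1 = 1.0, 3.0
numeric = travel_time(lambda a: 1.0, rho0, rho1, E, A, m)

v = math.sqrt(2.0 * E / m)
b = A / v
exact = (math.sqrt(rho1**2 - b**2) - math.sqrt(rho0**2 - b**2)) / v
print(numeric, exact)
```

The same function can be called with another $g$, for instance `lambda a: R**2 / (R**2 - a**2)` for the spherical space, as long as the integration interval avoids the points where the integrand diverges.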
2. Hyperbolic geodesics
Consider the metric
(6.47)
where R is a characteristic length.

Notice that one considers this metric as deriving from a "Lorentzian"
metric
$$ds^2 = dx^2 + dy^2 + dz^2 - dw^2, \tag{6.48}$$
by the three-dimensional reduction
7 As already mentioned, in full generality this statement is wrong. In Minkowski

space there exist nontrivial zero-length paths; i.e., photon trajectories. The as-
sertion becomes correct and general if one uses the notion of parallel transport.
(6.49)
The calculation is very similar to that of example 2.

In spherical coordinates, the Lagrangian of the problem is
(6.50)
The conservation laws are the same as before.

a) Rotation invariance yields the conservation of angular momentum.
The motion is planar.
b) Choosing the direction of the angular momentum as the polar axis,
we have $\theta = \pi/2$ and $\dot\theta = 0$.
c) The Lagrangian of the planar motion reduces to
(6.51)
d) The conservation of angular momentum leads to
(6.52)
where A is a constant of the motion.

e) The energy, which is a constant of the motion, is
(6.53)
The solution of the problem is obtained rather easily. One defines the
parameters $\omega$ and $\gamma$ as before,
$$\omega^2 = \frac{2E}{mR^2} \quad\text{and}\quad \gamma^2 = \frac{mA^2}{2ER^2}. \tag{6.54}$$
We obtain in a way similar to example 2, except that hyperbolic functions
replace trigonometric functions,
(6.55)
and
$$\tan(\phi(t) - \phi_0) = \gamma\coth\omega(t - t_0). \tag{6.56}$$
We notice that the distance to the origin increases exponentially when
$|t| \to \infty$. The geodesics of the metric (6.47) are hyperbolas
(6.57)
6.3.4 Maupertuis Principle and Geodesics
Consider again a conservative Lagrangian problem (i.e., a problem where the

Lagrangian does not depend on time and the energy E is conserved, as in
Chapter 4 Section 4.5.3). We want to show that the motion of a particle in
a field of forces in Euclidean space can be reduced to the free motion of this
particle in a curved space.
The Maupertuis principle can be extended to an arbitrary number $N$ of
degrees of freedom with $N = 3k$ for $k$ particles in three-dimensional space.
We denote the coordinates by $q_i$, $i = 1, \ldots, N$ and their time derivatives by
$\dot q_i$. For simplicity, we call $q$ the set $q_i$, $i = 1, \ldots, N$.
Consider in three-dimensional Euclidean space a Lagrangian of the form
$$\mathcal{L} = \frac{1}{2}\sum_{i,j=1}^{N} m_{i,j}(q)\,\dot q_i\dot q_j - V(q). \tag{6.58}$$
Here, we denote by $m_{i,j}(q)$ the coefficients of the quadratic form that constitutes
the kinetic energy. In Cartesian coordinates, $m_{i,j}(q)$ is diagonal and
does not depend on the coordinates. This is no longer true in general.
The conjugate momenta are
$$p_i = \frac{\partial\mathcal{L}}{\partial\dot q_i} = \sum_{j=1}^{N} m_{i,j}(q)\,\dot q_j. \tag{6.59}$$
The energy, which is a constant of the motion, is
$$E = \frac{1}{2}\sum_{i,j=1}^{N} m_{i,j}(q)\,\dot q_i\dot q_j + V(q). \tag{6.60}$$
Consider the time interval $dt$ to go from point $q_i$, $i = 1, \ldots, N$ to point
$q_i + dq_i$, $i = 1, \ldots, N$. This interval $dt$ is obtained easily with the previous
expression (6.60) as
$$dt = \sqrt{\frac{\sum_{i,j} m_{i,j}(q)\,dq_i\,dq_j}{2(E - V(q))}}. \tag{6.61}$$
If we insert this into the expression (4.72) of the reduced action
$$S_0 = \int\sum_i p_i\,dq_i, \tag{6.62}$$
the reduced action takes the form
$$S_0 = \int\sqrt{2(E - V(q))\sum_{i,j} m_{i,j}(q)\,dq_i\,dq_j}. \tag{6.63}$$
In this form, we see that the trajectory that minimizes the reduced action $S_0$
is a geodesic of a curved space whose metric, which depends on $E$, is given by
$$ds^2 = 2(E - V(q))\sum_{i,j=1}^{N} m_{i,j}(q)\,dq_i\,dq_j. \tag{6.64}$$
In this form, the Maupertuis principle appears as a purely geometric statement.
Once this geodesic is determined, the motion is determined by integrating
(6.61); i.e.,
$$t - t_0 = \int_{q_0}^{q}\sqrt{\frac{\sum_{i,j} m_{i,j}(q')\,dq'_i\,dq'_j}{2(E - V(q'))}}. \tag{6.65}$$
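The geometric form of the Maupertuis principle can be tested on a case where the trajectory is known: a projectile of mass $m$ in a uniform field, $V = mgy$, for which $m_{i,j} = m\,\delta_{ij}$. The sketch below is an illustration under assumed values ($g = E = m = 1$, launch angle 30 degrees, a sine-shaped perturbation chosen for the example). It evaluates the length $\int\sqrt{2m(E - V)}\,\sqrt{dx^2 + dy^2}$ along the true parabola and along neighbouring paths with the same end points and the same energy.

```python
import math

g, E, m = 1.0, 1.0, 1.0                  # uniform gravity V = m g y; illustrative values
v0 = math.sqrt(2.0 * E / m)              # launch speed from y = 0, where V = 0
angle = math.radians(30.0)
X = v0**2 * math.sin(2.0 * angle) / g    # horizontal range of the parabola

def true_height(x):
    """Height of the Newtonian trajectory launched from the origin."""
    return x * math.tan(angle) - g * x**2 / (2.0 * v0**2 * math.cos(angle)**2)

def reduced_action(eps, n=4000):
    """Length in the energy-dependent metric along y(x) = true path + eps * sin(pi x / X)."""
    def y(x):
        return true_height(x) + eps * math.sin(math.pi * x / X)

    h = X / n
    total = 0.0
    for k in range(n):
        x0, x1 = k * h, (k + 1) * h
        y0, y1 = y(x0), y(x1)
        dl = math.hypot(x1 - x0, y1 - y0)       # Euclidean length of the segment
        V = m * g * 0.5 * (y0 + y1)             # potential at the segment midpoint
        total += math.sqrt(2.0 * m * (E - V)) * dl
    return total

s_true = reduced_action(0.0)
s_above = reduced_action(0.05)
s_below = reduced_action(-0.05)
print(s_true, s_above, s_below)
```

With these values the unperturbed parabola gives the smallest of the three lengths, as expected for a geodesic of the metric on an arc short enough to contain no conjugate point.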
6.4 Gravitation and the Curvature of Space-Time
The scheme we have studied up to now is appealing since the equality between
the inertial and the gravitational masses follows automatically. However, the
theory has an embarrassing by-product in that the norm of the velocity is a
constant in time.
In order to get rid of this defect, we must introduce the time variable in
the problem and extend it to space-time and not only space.
6.4.1 Newtonian Gravitation and Relativity
The "flat" space of special relativity corresponds to the (non-Euclidean) metric⁸
$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2; \quad\text{i.e.,}\quad g_{00} = 1, \; g_{11} = g_{22} = g_{33} = -1. \tag{6.66}$$
Our purpose here is not to enter the domain of relativistic gravitation and
general relativity as a whole (see, for instance, [1], [15], [16], or [17]). We only
want to introduce a curvature of space-time that, at least to lowest order in
v 2 / c2 , v being a characteristic velocity of the problem, allows us to recover
Newton's usual equations while maintaining the nice properties encountered
above, in particular the fact that the mass drops out from the equations of
motion.
What metric of space-time can we choose in order to achieve this program?
We have seen in Chapter 3 that the Lagrangian of a free relativistic particle
is
$$\mathcal{L} = -mc^2\sqrt{1 - \frac{v^2}{c^2}}. \tag{6.67}$$
8 Unfortunately, we follow the particle-physics tradition of having a metric with

negative space components instead of positive ones as we have used above.
The Lorentz-invariant action is
$$S = -mc^2\int_{t_1}^{t_2}\sqrt{1 - \frac{v^2}{c^2}}\;dt. \tag{6.68}$$
In the nonrelativistic limit of small velocities, the Lagrangian (6.67) is expanded
up to first order as
$$\mathcal{L} \simeq -mc^2 + \frac{1}{2}mv^2. \tag{6.69}$$
In nonrelativistic physics, the Lagrangian of a particle in a gravitational potential
$\Phi$ has the form
$$\mathcal{L} = \frac{1}{2}mv^2 - m\Phi, \tag{6.70}$$
and the most "natural" choice to extend equation (6.69) would be
$$\mathcal{L} = -mc^2 + \frac{1}{2}mv^2 - m\Phi, \tag{6.71}$$
and an action
$$S = \int\mathcal{L}\,dt = -mc\int\left(c - \frac{v^2}{2c} + \frac{\Phi}{c}\right)dt. \tag{6.72}$$
Comparing this expression with
$$S = -mc\int ds, \tag{6.73}$$
we end up quite naturally, to lowest order in $v^2/c^2$ and in $\Phi/c^2$, with the
expression of the invariant element
$$ds^2 = \left(1 + \frac{2\Phi}{c^2}\right)c^2dt^2 - dx^2 - dy^2 - dz^2. \tag{6.74}$$
This is the simplest, or most naive, extension of the metric (6.66) which accounts
for the phenomena that interest us
$$g_{00} = 1 + \frac{2\Phi}{c^2}, \quad g_{11} = g_{22} = g_{33} = -1. \tag{6.75}$$
The equation of the geodesics obviously gives us Newton's law
$$\ddot{\mathbf{r}} = -\nabla\Phi, \tag{6.76}$$
which involves neither the velocity of light, since we are in non-relativistic
mechanics, nor the mass, of course.
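The agreement between the geometric Lagrangian and the Newtonian one can be checked numerically. Assuming the invariant element $ds^2 = (1 + 2\Phi/c^2)c^2dt^2 - d\mathbf{r}^2$, the sketch below compares $-mc\,ds/dt$ with $-mc^2 + \frac{1}{2}mv^2 - m\Phi$ for decreasing values of $v^2/c^2$ and $|\Phi|/c^2$. Units with $c = m = 1$, the sample values, and the function names are assumptions of the example.

```python
import math

def lagrangian_from_metric(v, Phi, m=1.0, c=1.0):
    """L = -m c ds/dt with ds^2 = (1 + 2 Phi / c^2) c^2 dt^2 - dr^2."""
    return -m * c * math.sqrt((1.0 + 2.0 * Phi / c**2) * c**2 - v**2)

def lagrangian_newton(v, Phi, m=1.0, c=1.0):
    """Rest energy term plus the nonrelativistic Lagrangian in the potential Phi."""
    return -m * c**2 + 0.5 * m * v**2 - m * Phi

def discrepancy(scale):
    """Difference of the two Lagrangians when v^2/c^2 = |Phi|/c^2 = scale."""
    v, Phi = math.sqrt(scale), -scale
    return abs(lagrangian_from_metric(v, Phi) - lagrangian_newton(v, Phi))

for scale in (1e-2, 1e-3, 1e-4):
    print(scale, discrepancy(scale))
```

The difference decreases as the square of the small parameter, which is the sense in which the two expressions agree "to lowest order".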
6.4.2 The Schwarzschild Metric
In the theory of general relativity, the metric is related to the mass distribution
(actually to the energy-momentum tensor) by Einstein's equations.
An exact solution of these equations was given in 1916 by Karl Schwarzschild.
It is the metric generated by the static gravitational field of an isotropic mass
distribution of total mass M. This metric leads to
$$ds^2 = \left(1 - \frac{r_0}{\rho}\right)c^2dt^2 - \frac{d\rho^2}{1 - \dfrac{r_0}{\rho}} - \rho^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right), \tag{6.77}$$
where the Schwarzschild radius $r_0$ is given by
$$r_0 = \frac{2GM}{c^2}. \tag{6.78}$$
One can write this formula in the more general way
$$ds^2 = \left(1 + \frac{2\Phi(\rho)}{c^2}\right)c^2dt^2 - \frac{d\rho^2}{1 + \dfrac{2\Phi(\rho)}{c^2}} - \rho^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right), \tag{6.79}$$
$\Phi(\rho)$ being the central Newtonian potential of the mass distribution.
In this expression, it is important to define the coordinates in a precise way.
The variables $(\theta, \phi)$ are the usual angular coordinates in the reference system
centered at the origin of the mass distribution. A problem remains, however,
in the definition of the radial variable $\rho$ in the presence of a gravitational field.
In the expression (6.79), the physical meaning of the coordinate $\rho$ is that the
circumference of a circle centered at the origin is equal to $2\pi\rho$. The distance
between two points $\rho_1$ and $\rho_2$ in the same direction $(\theta, \phi)$ is
$$d_{12} = \int_{\rho_1}^{\rho_2}\frac{d\rho}{\sqrt{1 - \dfrac{r_0}{\rho}}} > \rho_2 - \rho_1. \tag{6.80}$$
Some arbitrariness remains as far as the couple of variables (p, t) are con-
cerned. Here, these variables are chosen so that there is no off-diagonal term
dp dt in the metric.
The proof of this formula is, of course, beyond the scope of this book.
One can refer to Landau and Lifshitz [1], Section 97, and to Misner, Thorne,
and Wheeler [18], Chapter 25. One can find the complete description of black
holes (i.e. physics inside the Schwarzschild radius) in [18].
The "naive" metric (6.74) is the approximation of (6.77) to lowest order
in $v^2/c^2$ and $\Phi/c^2$.
We remark on the form (6.77) that its spatial part is not locally Euclidean.
There is no local rotation invariance, which is intuitive since the radial variable
plays a special role. When fields are weak (i.e., $r_0/\rho \ll 1$), or at large
distances, one can use locally Euclidean space variables $(x, y, z)$, and, to a
good approximation, the Schwarzschild metric (6.77) is of the form
$$
\begin{aligned}
ds^2 &= \left(1 - \frac{r_0}{r}\right)c^2dt^2 - \left(1 + \frac{r_0}{r}\right)\left(dx^2 + dy^2 + dz^2\right) \\
&= \left(1 - \frac{r_0}{r}\right)c^2dt^2 - \left(1 + \frac{r_0}{r}\right)\left(dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right),
\end{aligned} \tag{6.81}
$$
where $(r, \theta, \phi)$ are the usual spherical coordinates. (The proof of this result
can be found in [1], Section 97).
6.4.3 Gravitation and Time Flow
We notice that if the metric (6.75) gives us the classical Newton equation, it
"curves" time at each point in space. In that respect, it is in full agreement
with the general solution of Schwarzschild, which predicts a dilation of the
proper time in an algebraically increasing gravitational potential
(6.82)
This effect, as well as the "twin effect" of special relativity, has been measured
with great accuracy by R.F.C. Vessot and collaborators.⁹ A hydrogen
maser was sent to an altitude of 10,000 km by a Scout rocket, and the variation
in time of its frequency was measured as the gravitational potential increased
(algebraically). There are many corrections, in particular due to the Doppler
effect of the spacecraft and to the Earth's rotation. It was possible to test the
predictions of general relativity on the variation of the pace of a clock as a
function of the gravitational field with a relative accuracy of $7 \times 10^{-5}$. This
was done by comparison with atomic clocks, or masers, on Earth. Up to now,
it has been one of the best verifications of general relativity. The recording of
the beats between the embarked maser and a test maser on Earth is shown in
Figure 6.1. (These are actually beats between signals, which are first recorded
and then treated in order to take into account all physical corrections.)
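The order of magnitude of the gravitational effect is easy to estimate. Assuming the weak-field relation $d\tau \simeq (1 + \Phi/c^2)\,dt$ that follows from $g_{00} = 1 + 2\Phi/c^2$, two clocks at rest at different altitudes differ in rate by $\Delta\Phi/c^2$. The sketch below evaluates this for a clock at an altitude of 10,000 km relative to one on the ground. The constants are standard rounded values that do not come from the text, and the velocity (Doppler) effect and the Earth's rotation mentioned above are deliberately ignored.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded standard value)
C = 2.998e8          # speed of light, m/s
M_EARTH = 5.972e24   # mass of the Earth, kg
R_EARTH = 6.371e6    # mean radius of the Earth, m

def fractional_rate_shift(altitude):
    """Delta(Phi) / c^2 between a static clock at the given altitude (m) and one on the ground."""
    delta_phi = G * M_EARTH * (1.0 / R_EARTH - 1.0 / (R_EARTH + altitude))
    return delta_phi / C**2

shift = fractional_rate_shift(1.0e7)   # altitude of 10,000 km
print(shift)
```

The result is a few parts in $10^{10}$, which gives an idea of the clock stability required by such an experiment.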
6.4.4 Precession of Mercury's Perihelion
To next order, the Schwarzschild metric curves space. This causes a variety
of observable phenomena in celestial mechanics. Among these is the famous
precession of the perihelion of planets and comets.
Here we choose to work with the form (6.81). In fact, the value of
Schwarzschild's radius is $r_s = 2GM/c^2$, $r_s \approx 3$ km for the sun and $r_s \approx
0.89$ cm for the Earth. It is very small compared to the orders of magnitude
of celestial mechanics in the solar system (1 A.U. = $150 \times 10^6$ km). The effects
are small corrections to the Newtonian terms.
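These orders of magnitude can be reproduced directly from $r_s = 2GM/c^2$. The constants below are rounded standard values assumed for the example, not figures taken from the text.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded standard value)
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # kg
M_EARTH = 5.972e24   # kg
AU = 1.496e11        # astronomical unit, m

def schwarzschild_radius(mass):
    """r_s = 2 G M / c^2, in meters."""
    return 2.0 * G * mass / C**2

r_sun = schwarzschild_radius(M_SUN)
r_earth = schwarzschild_radius(M_EARTH)
print("sun  :", r_sun, "m")
print("earth:", r_earth, "m")
print("r_s(sun) / 1 A.U.:", r_sun / AU)
```

The Schwarzschild radius of the sun comes out close to 3 km and that of the Earth just under 1 cm, so the ratio $r_s/r$ is of order $10^{-8}$ at the distance of the Earth's orbit.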
9 R. F. C. Vessot, M. W. Levine, E. M. Mattison, E. L. Blomberg, T. E. Hoffman,
G. U. Nystrom, B. F. Farrel, R. Decher, P. B. Eby, C. R. Baugher, J. W. Watts, D.
L. Teuber, and F. D. Wills, "Test of Relativistic Gravitation with a Space-Borne
Hydrogen Maser", Phys. Rev. Lett. 45, 2081, (1980).
[Figure 6.1: beat recordings in five panels, (a) to (e), each labeled with its GMT time.]
Fig. 6.1. Beats between a maser onboard the spacecraft launched by a Scout rocket
and a maser on Earth at various instants in GMT. (a) Signal of the dipole antenna;
the pointer shows the delicate moment when the spacecraft separated from the rocket
(it was important that the maser onboard had not been damaged by vibrations
during takeoff). During this first phase, the special relativity effect due to the velocity
is dominant. (b) Time interval of "zero beat" during ascent when the velocity effect
and the gravitational effect, of opposite signs, cancel each other. (c) Beat at the
apogee, entirely due to the gravitational effect of general relativity. Its frequency is
0.9 Hz. (d) Zero beat at descent. (e) End of the experiment. The spacecraft enters
the atmosphere and the maser onboard ceases to work. (Courtesy of R.F.C. Vessot.)
The length element is given by
$$ds^2 = (1 - 2a)\,c^2dt^2 - (1 + 2a)\left(dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right), \tag{6.83}$$
with $a = GM/(rc^2) \ll 1$.
For non-relativistic velocities, one has to good approximation
(6.84)
and the Lagrangian
$$\mathcal{L} = -mc\,\frac{ds}{dt} = -mc^2 + \frac{GMm}{r} + \frac{m}{2}\left(1 + \frac{3GM}{rc^2}\right)\left[\dot r^2 + r^2\left(\dot\theta^2 + \sin^2\theta\,\dot\phi^2\right)\right]. \tag{6.85}$$
The first and most famous application is the calculation of the precession
of Mercury's perihelion.
Classical Calculation
It is convenient to recall the treatment of Kepler's problem that corresponds
to $a = 0$. Let $M$ be the mass of the sun (which we assume to be fixed
for simplicity) and $m$ the mass of the orbiting planet. We choose spherical
coordinates $(r, \theta, \phi)$, and we assume that the trajectory is in the plane $\theta = \pi/2$
(conservation of angular momentum). The Lagrangian of the problem is
$$\mathcal{L} = \frac{m}{2}\left(\dot r^2 + r^2\dot\phi^2\right) + \frac{GMm}{r}.$$
Let $\mathcal{E}$ be the energy of the planet and $A$ the norm of its angular momentum.
For convenience, we define
$$E = \frac{\mathcal{E}}{m} \quad\text{and}\quad L = \frac{A}{m}. \tag{6.86}$$
Conservation of angular momentum yields
$$r^2\dot\phi = L \quad\text{constant of the motion}, \tag{6.87}$$
and the (constant) energy of the system is
$$E = \frac{1}{2}\left(\dot r^2 + r^2\dot\phi^2\right) - \frac{GM}{r}. \tag{6.88}$$
We study the trajectory $r(\phi)$, from which the time dependence follows by
using (6.87). We define $r' \equiv dr/d\phi$, therefore $\dot r = r'\,d\phi/dt = r'L/r^2$. By
introducing the variable $u(\phi) = 1/r(\phi)$, we obtain the first-order equation
$$\frac{2E}{L^2} = u'^2 + u^2 - \frac{2GMu}{L^2}. \tag{6.89}$$
The trajectory is obtained by a simple quadrature (one can take the derivative
of (6.89) with respect to $\phi$, which leads to a linear equation whose general
solution is inserted into (6.89) in order to fix the constants):
$$u = \frac{1 + e\cos\phi}{p} \quad\text{with}\quad p = \frac{L^2}{GM} \quad\text{and}\quad e = \sqrt{1 + \frac{2EL^2}{G^2M^2}}. \tag{6.90}$$
This also amounts to
$$r = \frac{p}{1 + e\cos\phi}, \tag{6.91}$$
where one recognizes, for negative energies ($E < 0$), an ellipse of parameter $p$
and eccentricity $e$.
Relativistic Correction
With the curvature of space-time, the motion remains planar. One chooses as
above $\theta = \pi/2$, and the Lagrangian is given by (6.85); i.e.,
$$\mathcal{L} = \frac{1}{2}m\left(1 + \frac{3GM}{rc^2}\right)\left(\dot r^2 + r^2\dot\phi^2\right) + \frac{GMm}{r}. \tag{6.92}$$
There is conservation of the angular momentum $A = mr^2\dot\phi\,(1 + 3GM/(rc^2))$
and of the energy $\mathcal{E}$. As above, we set
$$E \equiv \frac{\mathcal{E}}{m}, \quad L \equiv \frac{A}{m}, \quad u \equiv \frac{1}{r}. \tag{6.93}$$
We make use of the parameter $p$, the eccentricity $e$ of the Newtonian ellipse,
and the parameter $\lambda$ defined by
$$e = \sqrt{1 + \frac{2EL^2}{G^2M^2}} \quad\text{and}\quad \lambda = \frac{3GM}{pc^2} = \frac{3G^2M^2}{L^2c^2}. \tag{6.94}$$
The energy is calculated as in Section 3.3. Its value is
$$E = \frac{1}{2}\left(1 + \frac{3GM}{c^2r}\right)\left(\dot r^2 + r^2\dot\phi^2\right) - \frac{GM}{r}. \tag{6.95}$$
We still define $r' \equiv dr/d\phi$ and $\dot r = r'\dot\phi$ so that the energy is expressed, in
terms of the variables and parameters defined in (6.93) and (6.94), as
$$\frac{2E}{L^2} = \frac{u'^2 + u^2}{1 + \dfrac{3GMu}{c^2}} - \frac{2}{p}\,u. \tag{6.96}$$
Under the change of function
$$v(\phi) = p\,u(\phi), \tag{6.97}$$
and multiplying by $p^2$, we obtain, to first order in $\lambda$,
$$v'^2 + v^2 - 2v - \lambda v\left(v^2 + v'^2\right) = e^2 - 1. \tag{6.98}$$
Of course, we notice that in the absence of the relativistic correction ($\lambda = 0$),
the solution is
$$v_0 = 1 + e\cos\phi. \tag{6.99}$$
In order to calculate the relativistic correction, we start by taking the derivative
of (6.98) with respect to $\phi$. We obtain
$$2v'v'' + 2v'v - 2v' - \lambda v'\left(3v^2 + v'^2 + 2vv''\right) = 0;$$
i.e.,
$$2v'' + 2v - 2 - \lambda\left(3v^2 + v'^2 + 2vv''\right) = 0. \tag{6.100}$$
This is a necessary condition (the complete solution is obtained by inserting
this into (6.98)).
First-Order Perturbation
The solutions of equation (6.98) can be expressed in terms of elliptic functions. However, a first-order perturbative calculation suffices since the effect is very weak. Since we have $\lambda = 3GM/(pc^2) \ll 1$, we expand the solution as
$$v = v_0 + \lambda v_1, \tag{6.101}$$
where $v_0$ is the Kepler solution $v_0 = 1 + e\cos\phi$ and $v_1$ is the correction that interests us. Inserting this into (6.100) and retaining only the first-order terms in $\lambda$, we obtain the equation
$$v_1'' + v_1 = \frac{3 + e^2}{2} + 2e\cos\phi, \tag{6.102}$$
whose solution is
$$v_1 = \frac{3 + e^2}{2} + e\,\phi\sin\phi + \left(\alpha + \frac{e}{2}\right)\sin\phi, \tag{6.103}$$
$\alpha$ being an arbitrary constant that we choose to be equal to zero. We notice that to first order in $\lambda$ the initial equation (6.98) is satisfied.
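The secular term $e\,\phi\sin\phi$ is what produces the precession. As a hedged numerical illustration, the sketch below integrates the second-order equation $2v'' + 2v - 2 - \lambda(3v^2 + v'^2 + 2vv'') = 0$ with a classical Runge-Kutta scheme and measures the angle between two successive maxima of $v$ (two successive perihelia). The function name, step size, and sample values of $\lambda$ and $e$ are assumptions chosen for illustration; first-order perturbation theory predicts an excess over $2\pi$ close to $2\pi\lambda$.

```python
import math

def precession_per_orbit(lam, ecc, step=1e-3, max_steps=10_000_000):
    """Angle between two successive maxima of v(phi), minus 2*pi.

    Integrates 2v'' + 2v - 2 - lam*(3v^2 + v'^2 + 2 v v'') = 0 with RK4,
    starting at a maximum of v (v = 1 + ecc, v' = 0).
    """
    def accel(v, dv):
        # The equation solved for v'': v'' = (1 - v + lam*(3v^2 + v'^2)/2) / (1 - lam*v)
        return (1.0 - v + 0.5 * lam * (3.0 * v * v + dv * dv)) / (1.0 - lam * v)

    phi, v, dv = 0.0, 1.0 + ecc, 0.0
    for _ in range(max_steps):
        k1v, k1a = dv, accel(v, dv)
        k2v = dv + 0.5 * step * k1a
        k2a = accel(v + 0.5 * step * k1v, dv + 0.5 * step * k1a)
        k3v = dv + 0.5 * step * k2a
        k3a = accel(v + 0.5 * step * k2v, dv + 0.5 * step * k2a)
        k4v = dv + step * k3a
        k4a = accel(v + step * k3v, dv + step * k3a)
        v_new = v + step * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        dv_new = dv + step * (k1a + 2.0 * k2a + 2.0 * k3a + k4a) / 6.0
        if dv > 0.0 and dv_new <= 0.0:
            # v' changes sign from + to -: next maximum of v, located by
            # linear interpolation inside the current step.
            frac = dv / (dv - dv_new)
            return phi + frac * step - 2.0 * math.pi
        phi, v, dv = phi + step, v_new, dv_new
    raise RuntimeError("no second maximum found")

lam, ecc = 1e-3, 0.2   # illustrative values, much larger than for Mercury
shift = precession_per_orbit(lam, ecc)
print(f"numerical shift = {shift:.5e}, first-order estimate 2*pi*lam = {2 * math.pi * lam:.5e}")
```

The two numbers should agree up to corrections of relative order $\lambda$, which is the accuracy expected from a first-order treatment.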
The complete solution of the problem in first-order perturbation theory, taking into account that $\cos((1 - \epsilon)\phi) \simeq \cos\phi + \epsilon\,\phi\sin\phi$, is therefore
$$\frac{1}{r} = \frac{GM}{L^2}\left[1 + e\cos\left(\left(1 - \frac{3G^2M^2}{c^2L^2}\right)\phi\right) + \frac{3G^2M^2}{c^2L^2}\left(\frac{3 + e^2}{2} + \frac{e}{2}\sin\phi\right)\right]. \tag{6.104}$$
This is the equation of a deformed ellipse that precesses. The precession of the major axis in one period ($\phi = 2\pi$) corresponds to an angle
$$\Delta\omega \simeq \frac{6\pi G^2M^2}{c^2L^2} = \frac{6\pi GM}{c^2 a(1 - e^2)}, \tag{6.105}$$
where $a$ is half of the major axis of the ellipse and $e$ its eccentricity.
The parameters of the planet Mercury are $a = 57.9 \times 10^6$ km, $\nu = 1/T = 415$ revolutions per century, and its eccentricity is $e = 0.2056$ (the mass of the sun is $M_\odot = 2 \times 10^{30}$ kg). The calculated value is
$\Delta\omega = 43.03$ seconds of arc per century
compared with the observed $43.11 \pm 0.45''$ per century. Einstein said that this result was the strongest emotional experience of his scientific life.¹⁰
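The order of magnitude is easy to reproduce. The sketch below evaluates $\Delta\omega = 6\pi GM/(c^2 a(1 - e^2))$ per orbit and multiplies by the number of orbits per century. The constants are rounded standard values and the semi-major axis $57.91 \times 10^6$ km is the commonly tabulated figure; all of these, like the function name, are assumptions made for this illustration.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
M_SUN = 1.989e30     # solar mass, kg (rounded)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def precession_arcsec_per_century(semi_major_axis_m, eccentricity,
                                  orbits_per_century, mass_kg=M_SUN):
    """Perihelion advance 6*pi*G*M / (c^2 a (1 - e^2)) per orbit, accumulated over a century."""
    per_orbit = (6.0 * math.pi * G * mass_kg
                 / (C**2 * semi_major_axis_m * (1.0 - eccentricity**2)))
    return per_orbit * orbits_per_century * ARCSEC_PER_RAD

# Mercury: a = 57.91e6 km (assumed tabulated value), e = 0.2056, 415 orbits per century.
mercury = precession_arcsec_per_century(57.91e9, 0.2056, 415)
print(f"Mercury: {mercury:.1f} arcsec per century")
```

With these inputs the result comes out at about 43 seconds of arc per century; the last digits depend on the precision of the constants used.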
6.4.5 Gravitational Deflection of Light Rays
Another effect of the metric (6.77) and the corresponding geodesics is the deviation of light rays by a gravitational field. This effect, which was one of the first verifications of general relativity, in 1919, has regained considerable interest in recent years because of its astrophysical and cosmological consequences through the gravitational lensing effect that it induces.¹¹ We use the weak-field approximation
$$c^2 d\tau^2 = \left(1 + \frac{2\Phi(\mathbf{r})}{c^2}\right)c^2 dt^2 - \left(1 - \frac{2\Phi(\mathbf{r})}{c^2}\right)d\mathbf{r}^2. \tag{6.106}$$
The most important astrophysical use of this effect is the gravitational lensing effect it produces on remote objects. This effect comes from the gravitational curvature of photon trajectories, as shown in Figure 6.2.
In order to calculate the trajectory of a photon, we can use the fact that the proper time $d\tau$ of a photon is zero; i.e., inserting this in (6.106), we have
$$\left(1 + \frac{2\Phi(\mathbf{r})}{c^2}\right)c^2 dt^2 = \left(1 - \frac{2\Phi(\mathbf{r})}{c^2}\right)d\mathbf{r}^2, \tag{6.107}$$
where $(\mathbf{r}, t)$ are the space-time coordinates of the photon as seen by an observer. From this equation, we can calculate the velocity $v$ of a photon in a gravitational potential,
$$v = c\,\sqrt{\frac{1 + 2\Phi(\mathbf{r})/c^2}{1 - 2\Phi(\mathbf{r})/c^2}} \simeq c\left[1 + \frac{2\Phi(\mathbf{r})}{c^2}\right]. \tag{6.108}$$
10 See A. Pais, Subtle is the Lord, Chapter 14, page 253.

11 See, for instance, Rich [19].
Fig. 6.2. Deviation of a photon trajectory in a gravitational potential $\Phi(r)$. This potential is assumed to be spherically symmetric. The photon position at a given time is parameterized by $r$ and $\theta$. The straight line represents the trajectory of a photon in the absence of a gravitational field. In the presence of this potential, the photon "falls" in the gravitational field (full curve).
With this expression, we can calculate the photon trajectory by the Fermat principle exactly as for curved rays in equation (2.5); i.e., by minimizing the integral
$$T = \int_A^B \frac{d\ell}{v}, \tag{6.109}$$
where $A$ and $B$ are the endpoints of the photon trajectory and $d\ell$ is the length element along the trajectory.
We assume that the potential $\Phi(r)$ is spherically symmetric and centered at the origin. We consider the motion in the plane $(AOB)$ and we use polar coordinates $(r, \theta)$ as shown in Figure 6.2. We consider a situation where $A$ and $B$ are symmetric to each other, so that $\theta = 0$ corresponds to the point of shortest distance to the origin. It is convenient to determine the function $r(\theta)$ that minimizes the time $T$. Under these conditions, equation (6.109) can be written as
$$T = \frac{1}{c}\int_A^B \frac{\sqrt{1 + r^2\dot\theta^2}}{1 + 2\Phi(r)/c^2}\,dr, \tag{6.110}$$
where $\dot\theta = d\theta/dr$.
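We consider the potential created by a total mass $M$, and we assume the photon path is outside the mass distribution so that we can set
$$\frac{2\Phi(r)}{c^2} = -\frac{2GM}{c^2 r} \equiv \frac{\lambda}{r}, \tag{6.111}$$
which defines the constant $\lambda$.
We notice in (6.110) that the variable $\theta$ is not present in the Lagrangian
$$\mathcal{L} = \frac{\sqrt{1 + r^2\dot\theta^2}}{1 + \lambda/r}. \tag{6.112}$$
Therefore, the momentum $\partial\mathcal{L}/\partial\dot\theta$ is conserved; i.e.,
$$\frac{r^2\dot\theta}{(1 + \lambda/r)\sqrt{1 + r^2\dot\theta^2}} = R, \tag{6.113}$$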
where $R$ is a length that is a constant of the motion. Solving this first-order differential equation causes no special difficulty. We change to dimensionless variables
$$r = xR, \qquad \theta' = d\theta/dx \equiv R\dot\theta, \qquad \mu = \lambda/R, \tag{6.114}$$
and, by squaring and taking into account that the gravitational term is small, $\mu/x \ll 1$, we obtain
$$\theta'^2 = \frac{1 + 2\mu/x}{x^2\left(x^2 - 1 - 2\mu/x\right)}. \tag{6.115}$$
In the absence of a gravitational field, $\mu = 0$, we obtain the equation
$$\theta' = \frac{1}{x\sqrt{x^2 - 1}}; \quad\text{i.e.,}\quad \theta = \arccos\frac{R}{r} \quad\text{or}\quad r\cos\theta = R,$$
which corresponds to a straight line at a distance $R$ from the origin.

If we now switch on the gravitational potential, we obtain, to first order in $\mu/x$, the equation
$$\theta' = \frac{1}{x\sqrt{x^2 - 1}} + \frac{\mu}{\left(\sqrt{x^2 - 1}\right)^3}, \tag{6.116}$$
whose solution is
$$\theta = \arccos\frac{R}{r} - \frac{\lambda r}{R\sqrt{r^2 - R^2}} \tag{6.117}$$
(we have come back to the variable $r$).
The value of the constant $R$ is obtained from the closest distance $r_0$ of the photon to the origin, which corresponds to $\theta = 0$. We obtain
$$R = r_0(1 - \epsilon) \quad\text{with}\quad \epsilon = \frac{GM}{r_0c^2}.$$
One can check that, to the same order, $R$ is nothing but the impact parameter of the photon (i.e., the distance between its trajectory, which is linear at long distances, and the parallel line going through the center $r = 0$).
What is more interesting is the angular deflection compared with a straight line. In the absence of the gravitational field, the photon follows a straight line, so that the difference between the direction of arrival and the direction of departure is $\Delta\theta_\infty^{0} = \pi$. This direction of departure is also the (Euclidean) direction of observation of the source that emits the photon.
In solution (6.117), in the presence of the gravitational potential, this same difference is twice the difference $\theta(r = \infty) - \theta(r = r_0)$. By definition, $\theta(r = r_0) = 0$. For $r \to \infty$, equation (6.117) gives $\theta(r = \infty) = \pm(\pi/2 - \lambda/R)$, according to whether it is the initial or final direction of the photon. The difference between the direction of reception of the photon and the geometrical direction of its source is $\Delta\theta_\infty^{GM} = \pi - 2\lambda/R$. In other words, one observes a deflection of the light rays compared with a straight line, due to the gravitational potential, of
$$\Delta\theta = \Delta\theta_\infty^{GM} - \Delta\theta_\infty^{0} = \frac{4GM}{Rc^2}, \tag{6.118}$$
where $R$ is the impact parameter or, to good approximation, the closest distance between the photon and the center of the potential.¹²
For a light ray coming from a star and grazing the edge of the sun, the calculated deflection is 1.75″. In the case of Jupiter, it is 0.02″.
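These two numbers follow directly from $\Delta\theta = 4GM/(Rc^2)$ with $R$ taken equal to the radius of the body. The sketch below is a minimal illustration; the masses and radii are rounded reference values and, like the function name, are assumptions not given in the text.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def deflection_arcsec(mass_kg, impact_parameter_m):
    """Light deflection 4GM/(R c^2), converted to seconds of arc."""
    return 4.0 * G * mass_kg / (impact_parameter_m * C**2) * ARCSEC_PER_RAD

# Grazing rays: the impact parameter is taken equal to the radius of the body.
sun = deflection_arcsec(1.989e30, 6.957e8)       # assumed solar mass and radius
jupiter = deflection_arcsec(1.898e27, 7.149e7)   # assumed Jovian mass and equatorial radius
print(f"Sun: {sun:.2f} arcsec, Jupiter: {jupiter:.3f} arcsec")
```

The solar value comes out close to 1.75″, and the Jovian one is a little below 0.02″.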
The first measurement of this effect was performed by teams led by Sir Arthur Stanley Eddington.¹³ It was done at Sobral in Brazil and on the island of Príncipe in the Gulf of Guinea on May 29, 1919. The experiment consisted in observing the apparent motion of stars (seven at Sobral and five at Príncipe) during a total eclipse of the sun. The results, 1.98 ± 0.16″ at Sobral and 1.61 ± 0.31″ at Príncipe, were in agreement with Einstein's prediction. It is most probably this experiment that generated the public's interest in relativity and Einstein himself.
The most precise measurement at present comes from interferometric radioastronomical observation of radio waves coming from the source 3C 279.¹⁴ It gives the result 1.77 ± 0.20″.
6.5 Gravitational Optics and Mirages

The effect we have just described, the action of a gravitational field on the
propagation of light, shows that a vast field has now been opened in what one
could call gravitational optics.
6.5.1 Gravitational Lensing
As we shall see, the most important cosmological use of this effect is through
gravitational lensing of light on remote objects in the universe. This effect is
due to the fact that mass (not only mass of galaxies but also of "dark matter")
in the universe acts as an optical instrument that can enable one to observe
faraway objects and therefore very "young" objects.
Two potentials are of particular interest. The first is that of a point-like
mass M:
12 It is amusing that this is exactly twice as much as the deflection that a Newtonian
argument would give using the Rutherford scattering formula.
13 F.W. Dyson, A.S. Eddington, and C. Davidson, Philos. Trans. R. Soc. London, Ser. A, 220, 291 (1920); Mem. R. Astron. Soc., 62, 291 (1920).
14 G.A. Seielstad, R.A. Sramek, and K.W. Weiler, Phys. Rev. Lett., 24, 1373 (1970).
$$\frac{\partial\Phi}{\partial r} = \frac{GM}{r^2} \quad\Rightarrow\quad \alpha = \frac{4GM}{Rc^2}. \tag{6.119}$$
The angle of deflection $\alpha$ is of the order of the gravitational potential $\Phi(R)$ (in units of $c^2$) at the shortest distance of approach or, equivalently, the square of the velocity $v_c^2/c^2$ of an object orbiting on a circle at the shortest distance of approach.
The second potential of interest gives a constant rotation velocity and is a good approximation for extended objects such as galactic halos or clusters of galaxies:
$$\frac{\partial\Phi}{\partial r} = \frac{v_c^2}{r} \quad\Rightarrow\quad \alpha = 2\pi\,\frac{v_c^2}{c^2}. \tag{6.120}$$
Here, $v_c$ is the (constant) circular velocity of objects orbiting in the galaxy or in the cluster of galaxies.
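Both deflection angles can be recovered numerically. The sketch below assumes the standard weak-field expression $\alpha = (2/c^2)\int_{-\infty}^{\infty} (\partial\Phi/\partial y)\,dz$, evaluated along the undeflected straight path at impact parameter $b$; this formula, the units ($c = 1$, and $GM = 1$ or $v_c = 1$), and the function names are assumptions made for the illustration rather than a transcription of the derivation in Section 6.4.5.

```python
import math

def deflection_angle(dphi_dr, b, n=20000):
    """Weak-field deflection 2 * integral of dPhi/dy over z, in units where c = 1.

    The straight path is z -> (y = b, z); the substitution z = b*tan(t)
    maps the infinite line onto (-pi/2, pi/2), integrated by the midpoint rule.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        t = -0.5 * math.pi + (i + 0.5) * h
        r = b / math.cos(t)                      # distance to the center
        transverse_gradient = dphi_dr(r) * b / r  # dPhi/dy for a central potential
        total += transverse_gradient * b / math.cos(t)**2 * h   # dz = b dt / cos^2 t
    return 2.0 * total

b = 2.0
point_mass = deflection_angle(lambda r: 1.0 / r**2, b)   # dPhi/dr = GM/r^2 with GM = 1
flat_curve = deflection_angle(lambda r: 1.0 / r, b)      # dPhi/dr = v_c^2/r with v_c = 1
print(point_mass, 4.0 / b)          # compare with 4GM/(b c^2)
print(flat_curve, 2.0 * math.pi)    # compare with 2*pi*v_c^2/c^2
```

The point-mass angle falls off as $1/b$, whereas the constant-velocity potential gives an angle independent of the impact parameter, which is the property used below for clusters of galaxies.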
6.5.2 Gravitational Mirages
As shown in Figure 6.3, the gravitational deflection can yield two images of a source. The two images have impact parameters $b_1$ and $b_2$. The potential created by a point-like mass always gives two images because the angle of deflection diverges for small values of the impact parameter. We will see later that one of the images is in general much more luminous than the other.
In the case of an extended mass distribution, such as a cluster of galaxies (6.120), one can only observe two images separated by an angle $\alpha$ if the undeflected impact parameter satisfies $b_0 < b_{\max} = L\alpha/2$, where $L$ is the distance between the source and the lens (which we take here to be equal to the distance between the observer and the lens for simplicity). The reason is that if $b_0$ were larger than $L\alpha/2$, the two images would be on the same side, which is impossible.
The large clusters correspond to $\alpha \sim v_c^2/c^2 \sim 10^{-5}$, and the two images can be separated by terrestrial telescopes of resolution $\delta\theta \sim 3 \times 10^{-6}$.
The "cross-section" necessary for a double image to occur is $\sigma \sim \pi b_{\max}^2 \sim \pi L^2 v_c^4/c^4$. This cross-section increases with $L$ because the necessary angle of deflection decreases with $L$.
The probability for a given object to have two images because of the lensing effect due to a cluster of galaxies is simply equal to the probability $P$ that this object hides behind a cluster. This probability is proportional to the cross-section, to the number density $n$ of clusters, and to the total length of the path $\sim L$:
$$P \sim n\,\sigma\,L \sim \pi n L^3\,\frac{v_c^4}{c^4}. \tag{6.121}$$
The fact that this probability increases rapidly with $L$ makes the number of double-image quasars sensitive to the value of the undeflected impact parameter $b_0$.
A second practical application of deflection by clusters is that the time of flight is not the same for both images. Quasars have an intrinsic variability, and by comparing the light curves (light flux as a function of time) one can determine the difference $\Delta t$ of the two times of flight.
Fig. 6.3. Creation of two images of a source S by a gravitational potential symmetric around the origin. The undeflected impact parameter is $b_0$, whereas the physical photon trajectories have impact parameters $b_1$ and $b_2$ and deflection angles $\alpha_1 = 2\theta_1$ and $\alpha_2 = 2\theta_2$. For simplicity, the source and the observer O are assumed to be at equal distances from the origin. For clarity, the angles have been grossly exaggerated. (Courtesy of James Rich.)
The time it takes light to go from one point to another can be deduced with no difficulty from the calculations performed in Section 6.4.5. We have
$$\frac{dr}{c\,dt} = \left(1 - \frac{2GM}{rc^2}\right)\left[1 - \frac{1 - 2GM/(rc^2)}{1 - 2GM/(Rc^2)}\left(\frac{R}{r}\right)^2\right]^{1/2}. \tag{6.122}$$
To first order in $2GM/(rc^2)$ and $2GM/(Rc^2)$, the time interval to get from $r$ to $R$ is
$$c\,t(r, R) = \sqrt{r^2 - R^2} + \frac{2GM}{c^2}\ln\left(\frac{r + \sqrt{r^2 - R^2}}{R}\right) + \frac{GM}{c^2}\sqrt{\frac{r - R}{r + R}}. \tag{6.123}$$
The first term is the obvious term in the absence of a gravitational effect.
If we consider a mirage, the time delay is the difference between the integrals calculated along each path. The first-order term vanishes obviously. This leaves a "gravitational" term, which is proportional to the potential difference, and a "geometric term."
For an angle of deflection independent of the point of impact, which is
approximately the case for clusters of galaxies (6.120), the geometric term
vanishes, leaving
(6.124)
where $y_1(z)$ and $y_2(z)$ are the photon trajectories in the two images. Consider the nearly symmetric case $|y_1| \sim |y_2|$. Going back to Figure 6.3, we see that in the case where $\theta_1 \sim \theta_2 = \theta$, we have $b_1 - b_0 = b_2 + b_0$. The integral will be dominated by the nearby region of the cluster, and we can make the approximation
$$|y_1(z)| - |y_2(z)| \sim b_1 - b_2 \sim 2b_0. \tag{6.125}$$
Substituting this in (6.124), we obtain
$$\Delta t \sim 4b_0\int_{-\infty}^{\infty} dz\,\frac{\partial\Phi}{\partial y}. \tag{6.126}$$
The integral is simply the deflection angle $2\theta$ given by (6.118), and the time delay is
$$\Delta t = 4L\,\frac{b_0}{L}\,2\theta. \tag{6.127}$$
The factor $b_0/L$ is the angular separation between the center of the cluster and the average position of the two images.
In order to estimate the length of the delay, we can take $b_0/L \sim \theta \sim 10^{-5}$ and $L \sim d_{\rm hub}$ (where $d_{\rm hub}$ is the Hubble distance, $d_{\rm hub} = c/H_0 \simeq 4300$ Mpc, $H_0$ being the Hubble constant), which gives $\Delta t \sim 1$ year.
The first historical observation of this effect was the observation in 1979 of the "double quasar" Q0957+561, whose two images are produced by a gravitational lens.¹⁵ The original image is shown in Figure 6.4.
Fig. 6.4. Top left: first picture of the double quasar Q0957+561. Top right: com-
parison of the spectra of the two objects with a time delay of 417 days. Bottom
picture: the galaxy that acts as a gravitational lens after subtracting the pictures of
the quasars. (Picture from P. Magain, Liège University.)
15 D. Walsh, R.F. Carswell, and R.J. Weymann, Nature, 279, 381 (1979).
The two quasars have exactly the same spectrum. However, the time vari-
ation of the signals emitted is the same except for a delay of 417 days. Once
the two images are subtracted from each other, taking this delay into account,
the galaxy that acts as a gravitational lens appears clearly.
One can observe pictures with a multiplicity greater than two. The most
spectacular example is perhaps the Einstein cross shown in Figure 6.5. Four
images of the quasar Q2237+0305 appear, together with the spiral galaxy that
causes the mirage.
Fig. 6.5. The Einstein cross. The four different images of the same quasar at a
redshift of 1.7 are due to the central galaxy, which is at a redshift of 0.04 and
therefore much closer. One can wonder about the probability of finding such an
alignment, but the vastness of the universe and the perseverance of astronomers are
such that events of small probability are observed in appreciable amounts. (Credit
NASA and ESA.)
In the case of a straight alignment, one can observe an "Einstein ring," as

shown in Figure 6.6.
This is of course a very exceptional situation. Einstein did not believe
that such a phenomenon could ever be observed. Nevertheless, he did the
calculation to please his friend Mandl. In Figure 6.6 the galaxy B1938+666 is
"hidden" behind a nearby galaxy. This latter galaxy does not act as a screen
but, on the contrary, as a gigantic gravitational lens. It amplifies considerably
the luminosity of the first galaxy, in the shape of a nearly circular ring caused
by the fact that the sun and the two galaxies are nearly exactly aligned.
In (6.127), the deflection angle $2\theta$ can be determined from the angular separation of the two images. The angle $b_0/L$ is more difficult to determine because it depends on the mass distribution in the cluster.
Once the two angles $\theta$ and $b_0/L$ are determined, the distance $L$ of the cluster can be determined by measuring the value of $\Delta t$. This determination of the distance is a very useful tool for the evaluation of Hubble's constant (see [19]).
Fig. 6.6. Einstein ring caused by the lensing by a close galaxy of the light emitted
by the galaxy B1938+666 located behind it. The actual size of the visible object is
several tens of thousands of light-years. The picture comes from the Hubble Space
Telescope. (Credit L. J. King (U. Manchester), NICMOS, HST, NASA.)
Fig. 6.7. Gravitational lensing effect by a spherically symmetric potential centered

at the origin on the light emitted by an extended object S in the background of
the sky. In this example, two images are seen by the observer O. The right-hand
panel shows a projection on the x, y plane (at the z value of the lens) of the two
images. The image one would observe in the absence of the lens is also shown. Owing
to the cylindrical symmetry, the motion of photons is planar and the two images
are therefore extended in the tangent direction. (If the object were exactly behind
the lens, the image would be a circle around the origin.) Owing to this distortion,
galaxies of the background of the sky can appear as arcs, as one can see in Figure
6.8. (Figure courtesy of James Rich.)
The last effect of gravitational lensing comes from the distortion of the
image of an extended object. The distortion along the radial and tangent
directions is illustrated in Figure 6.7. This distortion causes the arcs that can
be seen in Figure 6.8.
This effect can be used to determine the mass of the cluster. The masses
determined by this effect can be compared with the visible masses and with the
masses one estimates with the virial theorem and the dispersion in velocities.
It is a method to evaluate the amount of dark matter in the cluster.
In this respect, the universe appears as an endless gallery of mirages.
Fig. 6.8. In this picture, obtained by the Hubble Space Telescope, practically all
luminous objects are galaxies of the cluster Abell 2218. This cluster is massive
enough that its gravitational field focuses the light of galaxies located behind it.
This results in images that are stretched along arcs similar to what one can see at
night through a white wine glass. The Abell 2218 cluster is 3 billion light-years from
us in the constellation of Draco. The spectrum of the arcs is strongly blueshifted
compared with that of the stars in the cluster (this cannot be seen on the black
and white picture) because the light that is focused comes from very young and
hot stars at the beginning of their evolution. (Credit: W. Couch (University of New
South Wales), R. Ellis (Cambridge University), NASA.)
6.5.3 Baryonic Dark Matter
The theory of nucleosynthesis indicates that the baryon density in the universe
is 0.04 times the critical density. This leads to the idea that baryons cannot
account for all the dark matter. It is nevertheless possible that baryons are
a component of the galactic dark matter if they are in a form that does not
emit light in appreciable amounts. The simplest way this can happen is if the
baryons are hidden in objects that either do not burn (for instance, brown
dwarfs) or have ceased to burn (for instance, white dwarfs, neutron stars, or
black holes). Brown dwarfs have a mass $< 0.07\,M_\odot$, which is not sufficient to
create high enough temperatures for the burning of hydrogen. Initially, they
were the preferred candidates because they completely avoid the problems
associated with background light emission or the pollution of the interstellar
medium by heavy elements caused by mass loss or supernova explosions.
The dark objects located in galactic halos are called "machos," for "massive compact halo objects."
Paczynski¹⁶ has suggested that machos could be detected through the gravitational lensing effect they induce on individual stars of the Large Magellanic Cloud (LMC) (Figure 6.9). This small galaxy is 50 kpc away from the Earth.
16 B. Paczynski, Astrophys. J. 304, 1 (1986).
Fig. 6.9. Sketch of a gravitational lensing effect in the Large Magellanic Cloud (LMC) by an invisible object in the galactic halo. The two images cannot be resolved, but the combined luminosity of the two images gives rise to a time-dependent amplification of the light of the star when the invisible object crosses the line of sight. The corresponding light curve is shown in Figure 6.10.
The theory of gravitational lensing was developed above. For a point-like lens, it is simple to show that the two amplifications¹⁷ depend on the reduced impact parameter $u = b_0/R_E$,
$$A_\pm = \frac{u^2 + 2 \pm u\sqrt{u^2 + 4}}{2u\sqrt{u^2 + 4}}, \tag{6.128}$$
where the "Einstein radius" is given by
$$R_E^2 = \frac{4GMLx(1 - x)}{c^2}, \tag{6.129}$$
where $Lx$ is the distance between the observer and the lens, $L$ being the distance between the observer and the source. We see that for $b_0 \gg R_E$, $A_+ = 1$ and $A_- = 0$, as expected. For $b_0 \to 0$, the amplifications become infinite formally, and this actually results from the fact that a point-like source is deformed into a ring. The divergence ceases if one takes into account the finite extension of the source, which gives an effective extension at $b_0$.
17 Professionals prefer the term "magnification."
In the case of "lensing" by stellar objects of the galactic halo, the angle between the two images is very small (< 1 milliarcsec). This type of effect is therefore called "microlensing." Terrestrial telescopes cannot resolve the two images because atmospheric turbulence blurs the images and stellar objects have an angular dimension of the order of 1 arcsec. Therefore, the only observable effect is a temporary amplification of the total intensity when the macho comes close to the line of sight, and then recedes from it. The amplification is
$$A = \frac{u^2 + 2}{u\sqrt{u^2 + 4}}, \tag{6.130}$$
where $u$ is the closest distance to the (undeflected) line of sight that the deflector reaches in units of "Einstein's radius," $R_E = \sqrt{4GMLx(1 - x)/c^2}$, where $L$ is the distance between the source and the observer, $Lx$ is the distance between the observer and the deflector, and $M$ is the mass of the macho.
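The amplification formulas are simple enough to tabulate. The sketch below implements the total amplification $A(u) = (u^2 + 2)/(u\sqrt{u^2 + 4})$ and the two image amplifications $A_\pm$, and builds a light curve under the additional assumption (not spelled out in the text) that the lens moves uniformly on a straight line, so that $u(t) = \sqrt{u_{\min}^2 + ((t - t_0)/\Delta t)^2}$. All function names are illustrative.

```python
import math

def amplification(u):
    """Total amplification of a point source at reduced impact parameter u = b0/R_E."""
    return (u * u + 2.0) / (u * math.sqrt(u * u + 4.0))

def image_amplifications(u):
    """Amplifications (A_plus, A_minus) of the two unresolved images."""
    root = u * math.sqrt(u * u + 4.0)
    return (u * u + 2.0 + root) / (2.0 * root), (u * u + 2.0 - root) / (2.0 * root)

def light_curve(u_min, tau):
    """Amplification at reduced time tau = (t - t0)/Delta t.

    Assumes uniform straight-line motion of the deflector, with closest
    approach u_min (in Einstein radii) reached at tau = 0.
    """
    return amplification(math.hypot(u_min, tau))

for u_min in (0.5, 0.7, 1.0, 1.5):
    a_plus, a_minus = image_amplifications(u_min)
    print(f"u_min = {u_min}: peak A = {light_curve(u_min, 0.0):.3f} "
          f"(A+ = {a_plus:.3f}, A- = {a_minus:.3f})")
```

Two checks are worth noting: $A_+ + A_-$ reproduces the total amplification, $A_+ - A_- = 1$ for every $u$, and $A(1) = 3/\sqrt{5} \simeq 1.34$, which is the threshold used in the probability estimate.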
The amplification is larger than 1.34 when the distance to the line of sight is less than $R_E$. This amplification corresponds to an acceptable observational threshold since photometry can be performed quite easily to better than 10% accuracy. At a given moment, the probability $P$ for a star to be amplified by a factor 1.34 is equal to the probability that its undeflected light path passes within one Einstein radius of the macho,
$$P \sim n_{\rm macho}\,L\,\pi R_E^2, \tag{6.131}$$
where $n_{\rm macho}$ is the average number density of machos lying between the LMC and us, and $L$ is the distance of the LMC. The macho density is $n_{\rm macho} \sim M_{\rm halo}/(ML^3)$, where $M_{\rm halo}$ is the total mass of the halo up to the position of the LMC. Using the expression of the Einstein radius, one finds that $P$ does not depend on the mass $M$ but is determined only by the velocity of the LMC:
$$P \sim \frac{GM_{\rm halo}}{Lc^2} \sim \frac{v_{\rm LMC}^2}{c^2}. \tag{6.132}$$
The LMC is believed to orbit around the Milky Way with a velocity of $v_{\rm LMC} \sim 200$ km/s. (This corresponds to a flat rotation curve up to the LMC.) In that case, $P$ is of the order of $10^{-6}$. More refined calculations give $P = 0.5 \times 10^{-6}$ [19].
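The cancellation of the macho mass can be made explicit. The sketch below combines $P \sim n_{\rm macho} L \pi R_E^2$, $n_{\rm macho} \sim M_{\rm halo}/(ML^3)$, and $R_E^2 = 4GMLx(1 - x)/c^2$, and assumes in addition $M_{\rm halo} = v^2L/G$ (a flat rotation curve out to the LMC). The constants, the choice $x = 0.5$, and the function name are assumptions made for this order-of-magnitude illustration.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
KPC = 3.0857e19      # one kiloparsec in meters (rounded)
M_SUN = 1.989e30     # solar mass, kg (rounded)

def lensing_probability(macho_mass_kg, v_circ=2.0e5, distance_m=50.0 * KPC, x=0.5):
    """Order-of-magnitude probability P ~ n_macho * L * pi * R_E^2."""
    halo_mass = v_circ**2 * distance_m / G               # assumption: flat rotation curve
    n_macho = halo_mass / (macho_mass_kg * distance_m**3)  # number density of machos
    einstein_radius_sq = 4.0 * G * macho_mass_kg * distance_m * x * (1.0 - x) / C**2
    return n_macho * distance_m * math.pi * einstein_radius_sq

for mass in (1e-3 * M_SUN, 0.1 * M_SUN, M_SUN):
    print(f"M = {mass / M_SUN:6.3f} M_sun -> P = {lensing_probability(mass):.2e}")
```

Every mass gives the same probability, of the order of $10^{-6}$: lighter machos are more numerous but each covers a proportionally smaller Einstein disk.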
Since the observer, the star, and the deflector are in relative motion with respect to one another, a noticeable amplification lasts only as long as the non-deflected ray remains within one Einstein radius. The resulting light curve, which is achromatic and symmetric, is represented in Figure 6.10 for a variety of values of the impact parameter. The timescale of the amplification is the time $\Delta t$ that it takes the object to cross one Einstein radius between the observer and the source. For the lensing of stars of the LMC and deflectors of our halo, the relative velocities are of the order of 200 km s$^{-1}$ and the position of the deflector is roughly halfway between the observer and the source ($x \sim 0.5$). The average $\Delta t$ is then
$$\Delta t \sim \frac{R_E}{200\ \text{km/s}} \sim 75\ \text{days}\,\sqrt{\frac{M}{M_\odot}}. \tag{6.133}$$
The duration distribution can therefore be used to estimate the mass of machos, assuming they are in the galactic halo (and not in the LMC).
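The timescale can be estimated directly from the Einstein radius $R_E = \sqrt{4GMLx(1 - x)/c^2}$. The sketch below uses rounded constants, $L = 50$ kpc, $x = 0.5$, and a relative velocity of 200 km/s; these inputs and the function name are assumptions, so the result should only be read as an order of magnitude.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
KPC = 3.0857e19      # one kiloparsec in meters (rounded)
M_SUN = 1.989e30     # solar mass, kg (rounded)

def crossing_time_days(macho_mass_kg, distance_m=50.0 * KPC, x=0.5, velocity=2.0e5):
    """Time needed to cross one Einstein radius at the given relative velocity."""
    einstein_radius = math.sqrt(4.0 * G * macho_mass_kg * distance_m * x * (1.0 - x)) / C
    return einstein_radius / velocity / 86400.0

for mass_ratio in (1e-6, 1e-2, 1.0):
    days = crossing_time_days(mass_ratio * M_SUN)
    print(f"M = {mass_ratio:g} M_sun -> Delta t ~ {days:.3g} days")
```

For one solar mass this gives a duration between two and three months, the same order as the 75 days quoted above, and the duration scales as $\sqrt{M}$, so that events lasting a day or less would point to objects far lighter than brown dwarfs.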
Fig. 6.10. Microlensing curves for a point-like source (amplification as a function of $(t - t_0)/\Delta t$). The curves correspond to four values of the closest approach distance (0.5, 0.7, 1.0, and 1.5 times the Einstein radius). The duration timescale $\Delta t$, which depends on the mass, is normalized according to (6.133). (Courtesy of James Rich.)
Two experimental groups, the MACHO collaboration and the EROS collaboration, have published results of searches for events in the directions of the LMC and the SMC (the nearby Small Magellanic Cloud). The absence of events lasting less than 15 days allowed both groups to exclude objects of masses in the interval $10^{-7} M_\odot < M < 10^{-1} M_\odot$ as the main component of the halo.¹⁸ These limits exclude brown dwarfs of masses $\sim 0.07\,M_\odot$ as the major components of the halo.
One important aspect of the phenomenon is that the amplification should be symmetric and achromatic. Figure 6.11 shows an event recorded by the EROS experiment that possesses both properties.
The MACHO collaboration, however, observed events of a duration of $\sim 50$ days.¹⁹ If these are interpreted as originating from dark lenses of the galactic halo, this rate corresponds to a fraction $f = 0.2$ of machos contributing to the total mass of the halo. The timescale corresponds to objects of mass $\sim 0.4\,M_\odot$.
EROS has only published upper bounds on the fraction of the halo components made of machos.²⁰
The results of the two experiments show that it is unlikely that the halo is made up predominantly of objects of the order of stellar masses. The present challenge is to prove that the events observed by the MACHO collaboration are caused by lensing by objects of the halo and not, for instance, by lensing objects in the clouds themselves. If that is the case, the mass estimation implies that they correspond to very old white dwarfs, perhaps the oldest stars.
18 C. Alcock, R.A. Allsman, D. Alves, et al., Astrophys. J. Lett. 499, L9 (1998).
19 C. Alcock, R.A. Allsman, D.R. Alves, et al., Astrophys. J. 542, 281 (2000).
20 T. Lasserre et al., EROS Collaboration, Astron. Astrophys. 355, L39 (2000).
Fig. 6.11. Gravitational microlensing event in the EROS experiment (light curves as a function of date). The upper picture is in the blue part of the optical spectrum and the lower picture is in the red part. The phenomenon is symmetric and achromatic, as expected. (Courtesy of James Rich.)
The information on the localization of the lenses (in the galactic halo or in the Magellanic Clouds) is difficult to obtain. The simplest case is that of events with a very large amplification, in particular the events due to binary lenses. In such events, the light curve is modified in a way that depends on the relative distance of the lens and the source stars.²¹ It is also possible to obtain information on the distance of lenses in events of very long duration when the motion of the Earth around the sun modifies the light curve.²² In the future, it will also be possible to resolve the two microlensing images with interferometric space telescopes. Such measurements will give enough information to determine the distances of the lensing objects and to draw a definite conclusion.
The search for dark objects by microlensing is under way in the Andromeda Nebula, the spiral galaxy M31, which is close to our galaxy.²³
21 C. Afonso, C. Alard, J.N. Albert, et al., Astrophys. J. 532, 340 (2000).
22 C. Alcock, R.A. Allsman, D. Alves, et al., Astrophys. J. Lett. 454, L125 (1995).
23 A.P.S. Crotts and A.B. Tomaney, Astrophys. J. Lett. 473, L87 (1995).
Experimental results on these problems and on general relativity can be

found in the book by James Rich [19].
6.6 Problems
6.1. Hyperspherical Coordinates

Consider the problem of the motion on $S^3$ (Section 6.2.3) in hyperspherical coordinates:
$$w = R\cos\psi, \quad z = R\sin\psi\cos\theta, \quad x = R\sin\psi\sin\theta\cos\phi, \quad y = R\sin\psi\sin\theta\sin\phi.$$
6.2. Geodesics
Consider the metric
(6.134)
which can be considered to be derived from a "Lorentzian" metric
(6.135)
by the three-dimensional restriction
(6.136)
Find a parametric expression for the geodesics in terms of the time t.

7
Feynman's Principle in Quantum Mechanics
You can never solve a problem
on the level on which it was created.
Albert Einstein
Richard P. Feynman was probably the greatest theoretical physicist of the

second half of the 20th century. In his thesis work in 1942 at Princeton,
Feynman attempted to solve the problem of the self mass of the electron, which
is infinite in second-order perturbation theory in quantum electrodynamics.¹
Feynman discovered a "least action principle" that enabled him to solve the
problem by using both retarded and advanced potentials. In order to do this,
he introduced the mathematical concept of path integrals, which has been
an extensive field of interest since then. The first triumph of this method
came when it led to the correct calculation of the Lamb shift in the hydrogen
atom without introducing any arbitrary cutoff parameters. The infinities were
dealt with in a systematic and well-defined manner in terms of basic physical
parameters. Since then, the renormalization group has acquired a depth that
places it at the forefront of present theoretical physics.
It was only a few years later that Feynman understood that he could apply
his ideas to a variational formulation of nonrelativistic quantum mechanics. In
an article published in 1948,² followed a few years later by the book Quantum
Mechanics and Path Integrals by Feynman and Hibbs [20], which corresponds
to the course Feynman gave on quantum mechanics at Caltech for a few
1 In fact, this amounts to calculating the electrostatic energy of an electron. In traditional quantum field theory, the electron is considered a point-like particle whose extension in space $r_0$ vanishes. The corresponding electrostatic energy $e^2/(4\pi\varepsilon_0 r_0)$ is therefore infinite. Renormalization theory consists in canceling this infinite quantity by redefining the "bare" mass of the electron (in the absence of a field) as an unphysical infinite parameter, the two infinite quantities being such that their sum is the finite physical mass.
2 R.P. Feynman, "Space-Time approach to Non-Relativistic Quantum Mechanics",
Rev. Mod. Phys., 20, 367 (1948).

years, one can find the essence and the beauty of his ideas and results. Feyn-
man showed that quantum transition amplitudes can be calculated with path
integrals and that this method is more efficient than working with the notion
of a wave function, in particular when one considers systems made of several
interacting particles. Statistical physics also profited from path integral tech-
niques. Many results have been obtained, and this is a central tool in modern
quantum field theory.
As far as teaching quantum mechanics is concerned, Feynman acknowl-
edged that the traditional approach is more efficient. However, it is useful and
instructive, as we shall see, to provide in this last chapter Feynman's ideas
and the path integral technique. We will, in particular, discover its unifying
aspects after having worked through the previous five chapters.
The book by Feynman and Hibbs is remarkably well written. It proceeds
by considering specific examples at each step. This allows the reader to get
acquainted gradually with the basic ideas of the theory. Here, we wish to
remain at an elementary level and to elicit the structure of the theory and
its relation with analytical mechanics. This is why we will treat very few
examples; in particular, we will not enter into the application to perturbation
theory, which is extremely powerful and elegant.
One reason for the success of the method in field theory stems from the fact
that Feynman directly casts the problem of quantum mechanics in space-time.
7.1 Feynman's Principle
7.1.1 Recollections of Analytical Mechanics
It is useful to recall a few results obtained in analytical mechanics in the

previous chapters. In general, we consider one space dimension for simplicity.
1. A mechanical system is characterized by a Lagrangian $\mathcal{L}(x, \dot{x}, t)$,
which depends on the state variables (i.e., the position $x$ and its time
derivative $\dot{x} = dx/dt$), and possibly on time.
2. The Lagrangian of a particle in a potential $V(x, t)$ is
$\mathcal{L} = m\dot{x}^2/2 - V(x, t)$.
3. For any trajectory $x(t)$ one can imagine, the action $S$ is defined by the
integral
$$S = \int_{t_1}^{t_2} \mathcal{L}(x, \dot{x}, t)\, dt. \tag{7.1}$$
4. The least action principle states that the actual physical trajectory $X(t)$
renders $S$ minimal (extremal).
5. The equation of motion that determines the actual trajectory is the
Lagrange-Euler equation
$$\frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right) = \frac{\partial \mathcal{L}}{\partial x}. \tag{7.2}$$
6. For a free particle, $\mathcal{L} = m\dot{x}^2/2$, the classical action between $(x_1, t_1)$ and
$(x_2, t_2)$ is
$$S = \frac{m}{2}\,\frac{(x_2 - x_1)^2}{t_2 - t_1}. \tag{7.3}$$
7. If we express the action in terms of the coordinates, the conjugate
momenta $p_i$ and the Hamiltonian $H$ are
$$p_i = \frac{\partial S}{\partial x_i}, \qquad H = -\frac{\partial S}{\partial t}. \tag{7.4}$$
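The recollections above can be illustrated numerically. The following is a minimal sketch, not taken from the book: it assumes a broken-line discretization of the path, evaluates the Lagrangian at segment midpoints, and uses an arbitrary sinusoidal deformation that vanishes at both endpoints. The function name `action` and all numerical values are illustrative choices.

```python
import numpy as np

def action(x, t, m=1.0, V=lambda x, t: 0.0 * x):
    """Discretized action: sum over segments of (segment duration) * Lagrangian,
    with the Lagrangian evaluated at the midpoint of each straight segment."""
    dt = np.diff(t)
    v = np.diff(x) / dt                      # constant velocity on each segment
    x_mid = 0.5 * (x[1:] + x[:-1])
    t_mid = 0.5 * (t[1:] + t[:-1])
    lagrangian = 0.5 * m * v**2 - V(x_mid, t_mid)
    return float(np.sum(lagrangian * dt))

m, x1, x2, t1, t2 = 1.0, 0.0, 3.0, 0.0, 2.0
t = np.linspace(t1, t2, 201)
x_classical = x1 + (x2 - x1) * (t - t1) / (t2 - t1)    # uniform motion
bump = np.sin(np.pi * (t - t1) / (t2 - t1))            # vanishes at both endpoints

S_classical = action(x_classical, t, m)
S_closed_form = 0.5 * m * (x2 - x1) ** 2 / (t2 - t1)   # free-particle action (7.3)
S_perturbed = [action(x_classical + a * bump, t, m) for a in (-0.5, -0.1, 0.1, 0.5)]

print(S_classical, S_closed_form, S_perturbed)
```

For the free particle the straight path reproduces the closed-form action, and every deformed path in this one-parameter family has a larger action, in line with items 4 and 6.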
7.1.2 Quantum Amplitudes
The basic concept on which Feynman relies is that of the amplitude of a

process. The concept of the quantum state of a system (i.e., the description
in itself of the state of a system) only comes afterward. This point of view is
more realistic in the sense that any experiment consists of a series of processes.
Feynman wants to obtain the laws for quantum processes. Therefore, Feyn-
man's principle concerns the dynamics and, to a lesser extent, the physical
quantities.
Feynman's approach relies on the superposition principle. To each physical
process there correspond complex amplitudes, which we denote by $\phi_k$, that
add up. The probability of observing an event coming from several interfering
alternatives for a process is given by the modulus squared of the sum of
amplitudes that lead to this event.
In the Young slit interference experiment, the interfering alternatives cor-
respond to the passage through each slit. To each of them there corresponds
an amplitude (i.e., $\phi_1$ and $\phi_2$), and the probability $P$ of observing the
outgoing particle at a given point of the detector is the modulus squared of the
sum, $P = |\phi_1 + \phi_2|^2$.
We can generalize the experiment by placing a series of screens one after
the other, each of which bears a set of slits. To each possible path of the
particle, between the source and the detector there corresponds an amplitude.
The sum of these amplitudes gives the total amplitude on the detector, and
its modulus squared is the probability.
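The rule "add the amplitudes, then square" can be made concrete with a toy two-slit model. This sketch is an assumption-laden illustration, not the book's calculation: each slit is given a unit-modulus amplitude whose phase is $2\pi r/\lambda$, with $r$ the straight-line distance from the slit to the detection point, and the geometry (`slit_separation`, `screen_distance`) is arbitrary.

```python
import numpy as np

def two_slit_probability(y, wavelength=1.0, slit_separation=5.0, screen_distance=100.0):
    """Return (|phi1 + phi2|^2, |phi1|^2 + |phi2|^2) at detector positions y.

    Toy model: each alternative contributes a unit-modulus amplitude whose
    phase is k times the slit-to-detector distance."""
    k = 2 * np.pi / wavelength
    r1 = np.hypot(screen_distance, y - slit_separation / 2)
    r2 = np.hypot(screen_distance, y + slit_separation / 2)
    phi1 = np.exp(1j * k * r1)
    phi2 = np.exp(1j * k * r2)
    coherent = np.abs(phi1 + phi2) ** 2                   # interfering alternatives
    incoherent = np.abs(phi1) ** 2 + np.abs(phi2) ** 2    # no interference
    return coherent, incoherent

y = np.linspace(-40.0, 40.0, 801)
coherent, incoherent = two_slit_probability(y)
print(coherent.min(), coherent.max(), incoherent[0])
```

The coherent sum oscillates between 0 and 4 across the detector, whereas the sum of the individual probabilities stays equal to 2 everywhere.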
7.1.3 Superposition Principle and Feynman's Principle
Consider a simple process where a particle moves from a point $(x_a, t_a)$ to
another point $(x_b, t_b)$ (we work in space-time, and we include the time variable
in the definition of the position of the particle). As in classical physics, one
can imagine a variety of paths by which this process can happen. Of course,
the classical trajectory is well defined, and it corresponds to an extremum of
the action S(b, a). In quantum mechanics, all paths x(t) coming from a and
ending at b contribute to the amplitude of the process, as one can visualize
from the successive interferences represented in Figure 7.1.
Fig. 7.1. Successive Young interferences across a series of screens between a and b
(only a subset of possible paths is represented).
The modulus of all these amplitudes is roughly the same,³ but the phase
differs appreciably from one path to another. The amplitude $K(b, a)$ is the
sum of individual amplitudes
$$K(b, a) = \sum_k \phi_k(b, a), \tag{7.5}$$
where $\phi_k(b, a)$ is the amplitude corresponding to path $k$.

Feynman writes this sum in the equivalent form
$$K(b, a) = \sum_{\text{all paths } a \to b} \phi(x(t)), \tag{7.6}$$
where x(t) defines a path between a and b. Of course, the specific structure
of the setup in Figure 7.1 does not matter.
The Feynman principle consists in stating that, in full generality, in any
experimental setup, the phase of the amplitude $\phi(x(t))$ corresponding to a
given path is the classical action along this path, calculated according to
equation (7.1), divided by $\hbar$:
$$\phi(x(t)) = \frac{1}{C}\, e^{\frac{i}{\hbar} S(x(t))}. \tag{7.7}$$
We shall see later on how one fixes the normalization constant C (which is
essential). We stress the fact that the quantity S(x(t)) in this expression is the
value of the action (7.1) along the path x(t). It does not necessarily correspond
to an extremum of the classical action.
7.1.4 Path Integrals
This leads us to a central point, which is the evaluation of the sum (7.6) with
the definition (7.7). In fact, the family of possible trajectories $x(t)$ between $a$
and $b$ is a complicated set. The result does not correspond to a simple limit
of the discrete set, which we could calculate in the case of Figure 7.1, to a
continuous set.
³ Of course, it is only after we have understood the physical and mathematical
structure of the problem that this claim appears justified in good approximation.
In order to define the sum over all paths, we first divide the time interval
$t_b - t_a$ into $N$ successive equal intervals delimited by the instants $t_i$,
$i = 0, \ldots, N$:
$$t_b - t_a = N\varepsilon, \qquad \varepsilon = t_{i+1} - t_i, \qquad t_0 = t_a, \qquad t_N = t_b. \tag{7.8}$$
For each value $t_i$, we choose a value $x_i$ of the variable $x$. This gives a set
of $N - 1$ values since the endpoints $x_0 = x_a$ and $x_N = x_b$ are fixed.
By joining the successive $x_i$'s by straight lines (we shall come back to this
point), we define a trajectory in the form of a broken line that joins the
points $a$ and $b$. Each set $\{x_i\}$ defines a different possible trajectory.
If we integrate over the values of each $x_i$ from $-\infty$ to $+\infty$, we sum over
all "trajectories" corresponding to this particular discretization of the time
variable. This procedure is illustrated in Figure 7.2.
Fig. 7.2. Example of a trajectory $x(t)$ in a discretization of the time variable.
Finding the path integral consists of integrating over all values $x_i(t_i)$ and then taking
the limit $\varepsilon \to 0$.
For a given value of $\varepsilon$, let $C(\varepsilon)$ be the normalization constant of (7.7). The
amplitude $K_\varepsilon(b, a)$ is given by
$$K_\varepsilon(b, a) = \frac{1}{C(\varepsilon)} \int\!\!\int \cdots \int e^{\frac{i}{\hbar} S(b, a)}\, \frac{dx_1}{C(\varepsilon)}\, \frac{dx_2}{C(\varepsilon)} \cdots \frac{dx_{N-1}}{C(\varepsilon)}, \tag{7.9}$$
where, for each value of the set $\{x_i(t_i)\}$, $S(b, a)$ is the action calculated on
the trajectory defined by this set, as represented in Figure 7.2.
The end of the calculation consists in taking the limit $\varepsilon \to 0$. This is where
the normalization factor $C(\varepsilon)$ enters, as well as the number of such factors.
Indeed, the limit must exist and only involve physical quantities. Assuming
this is achieved, the amplitude $K(b, a)$ is given by
$$K(b, a) = \lim_{\varepsilon \to 0} K_\varepsilon(b, a). \tag{7.10}$$
At this point, the following remarks are in order.

1. Instead of choosing straight lines to join two successive points $x_i(t_i)$ and
$x_{i+1}(t_{i+1})$, we can perfectly well make the more elegant choice of portions
of classical trajectories, which correspond to stationary values of the action
$S(i + 1, i)$. For a free particle, there is no difference. In the presence of
forces, the limit $\varepsilon \to 0$, which is taken at the end of the calculation, is
such that this makes no difference in the final result.
2. In this discretization, the value of $x_i$ is well defined, as is its derivative
$\dot{x}_i = (x_{i+1} - x_i)/\varepsilon$. We see, however, that this latter expression is discontinuous
and that the second derivative is not defined at the instants $t_i$. In the
case that concerns us, this has no importance since the Lagrangian does
not involve $\ddot{x}(t)$. In other cases that one can imagine and that are not
too pathological, the prescription $\ddot{x} = (x_{i+1} - 2x_i + x_{i-1})/\varepsilon^2$ leads to
acceptable results.
3. From the mathematical viewpoint, a satisfactory definition of the path
integral requires a formalism and some concepts more subtle than this
discretization of time. However, for what concerns us here, the only im-
portant points are that the summation exists and that the prescriptions
(7.9) and (7.10) lead to correct results.
Most of the time in what follows, we will avoid writing the sum over paths
as the limits (7.9) and (7.10). We will write this sum as
$$K(b, a) = \int_a^b e^{\frac{i}{\hbar} S(b, a)}\, \mathcal{D}x(t), \tag{7.11}$$
where the symbol $\mathcal{D}x(t)$ characterizes the mathematical nature of this expression.
The form (7.11) is called a path integral. In this expression, $S(b, a)$ is a
number whose value depends on the function $x(t)$. The "integration" over this
function $x(t)$, which is represented by $\mathcal{D}x(t)$, is called a functional integral.
7.1.5 Amplitude of Successive Events
One justification of the form (7.9) is obtained if we consider the combination
law for amplitudes of successive events.
Consider a process $(a \to b)$ and some intermediate time $t_c$ such that
$t_a \le t_c \le t_b$. The action $S(b, a)$ is therefore the sum
$$S(b, a) = S(b, c) + S(c, a). \tag{7.12}$$
Indeed, the action is a time integral, and we work with Lagrangians $\mathcal{L}(x, \dot{x}, t)$
that do not depend on higher-order derivatives such as $\ddot{x}$. The integral (7.11)
is written as
$$K(b, a) = \int_a^b \exp\left[\frac{i}{\hbar}\bigl(S(b, c) + S(c, a)\bigr)\right] \mathcal{D}x(t),$$
and the previous expression can be written as
$$K(b, a) = \int dx_c \int_c^b e^{\frac{i}{\hbar} S(b, c)}\, \mathcal{D}x(t) \int_a^c e^{\frac{i}{\hbar} S(c, a)}\, \mathcal{D}x(t), \tag{7.13}$$
where, of course, the integral over $x_c$ is a usual integral. We have factorized
the integrand.
This expression is a usual integral,
$$K(b, a) = \int dx_c\, K(b, c)\, K(c, a). \tag{7.14}$$
In other words, the amplitudes for two successive events going through the
same given intermediate point $c$, $(a \to c)$ and $(c \to b)$, are multiplied. The
amplitude $K(b, a)$ is the sum of these products over all possible values of the
intermediate point. This is simply the superposition principle.
This argument can be extended to any number $N$ of intervals, with intermediate
points $x_i$, $i = 1, \ldots, N - 1$, which leads to
$$K(b, a) = \int K(b, N-1)\, K(N-1, N-2) \cdots K(i+1, i) \cdots K(1, a)\, dx_1\, dx_2 \cdots dx_{N-1}. \tag{7.15}$$
Assuming these intervals are infinitesimal and of equal length $\varepsilon$, the
corresponding expression resembles equation (7.9). It is not identical, however,
since the latter form is a limit, whereas (7.15) is an equality. This, however,
enables us to obtain an infinitesimal form of the amplitude $K$ between two
points separated by an infinitesimal time interval $\varepsilon$. In fact, when $t_2 - t_1 = \varepsilon$
is infinitesimal, the action (7.1) is, to first order in $\varepsilon$,
$$S(2, 1) = \varepsilon\, \mathcal{L}\left(\frac{x_2 + x_1}{2}, \frac{x_2 - x_1}{\varepsilon}, \frac{t_2 + t_1}{2}\right), \tag{7.16}$$
or
$$K(2, 1) = \frac{1}{C(t_2 - t_1 = \varepsilon)} \exp\left(\frac{i\varepsilon}{\hbar}\, \mathcal{L}\left(\frac{x_2 + x_1}{2}, \frac{x_2 - x_1}{\varepsilon}, \frac{t_2 + t_1}{2}\right)\right). \tag{7.17}$$
Inserting this result into the formula (7.15), and assuming we can exchange
the order of the integration and the limit $\varepsilon \to 0$, we indeed obtain an equality
between the two expressions. This justifies the method (7.9) and (7.10) in all
cases where the expressions converge sufficiently well.
7.2 Free Particle

We first apply what we have done to the case of a particle propagating freely
in space. One calls a propagator the amplitude K(b, a) of the propagation of
a particle (free or not) from point a to point b.
A free classical particle propagates according to a uniform linear motion.
The corresponding action is
$$S = \frac{m}{2}\,\frac{(x_2 - x_1)^2}{t_2 - t_1}. \tag{7.18}$$
This result is obvious. In this problem, the velocity $\dot{x}$ is a constant and the
Lagrangian is $\mathcal{L} = m\dot{x}^2/2 = m[(x_2 - x_1)/(t_2 - t_1)]^2/2$, which leads directly to
(7.18) since $S = \int \mathcal{L}\, dt$.
7.2.1 Propagator of a Free Particle
In order to calculate the propagator of a free particle, we could use the limiting
form (7.9). However, in this case, it is completely equivalent to use the
expression (7.15) because the result is independent of the value of $\varepsilon = t_{i+1} - t_i$.
We will also obtain the value of the normalization coefficient $C(\varepsilon)$ in (7.7).
The final result is that the propagator of a free particle between points $a$
and $b$ is
$$K(b, a) = \left(\frac{2\pi i\hbar(t_b - t_a)}{m}\right)^{-1/2} \exp\left(\frac{im(x_b - x_a)^2}{2\hbar(t_b - t_a)}\right). \tag{7.19}$$
This gives the value of the normalization factor. For a free particle and a time
interval $(t_b - t_a)$, we read in the formula above
$$C(t_b - t_a) = \left(\frac{2\pi i\hbar(t_b - t_a)}{m}\right)^{1/2}. \tag{7.20}$$
The normalization constant $C$ depends only on the variable $(t_b - t_a)$.
For an infinitesimal time interval $(t_{i+1} - t_i) = \varepsilon$, which is the case in the
definition (7.9) of the path integral, we have
$$C(\varepsilon) = \left(\frac{2\pi i\hbar\varepsilon}{m}\right)^{1/2}. \tag{7.21}$$
This prescription ensures that the path integral (7.10) exists.
The proof of this is the following.
Gaussian Integrals
In what follows, we shall frequently make use of the integrals
(7.22)
(7.23)
(7.24)
The second expression is obtained using the Cauchy theorem in the complex
plane.
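As a numerical illustration of this kind of integral, the sketch below checks the standard identity $\int_{-\infty}^{+\infty} e^{-ax^2}\,dx = \sqrt{\pi/a}$ for a complex $a$ with positive real part. This identity is an assumption about the intended content, since the integrals (7.22) to (7.24) are not legible in this copy; the purely imaginary case used for propagators is its limit $\mathrm{Re}\,a \to 0^+$. The function name and grid parameters are illustrative.

```python
import numpy as np

def gaussian_integral(a, half_width=12.0, n=200_001):
    """Trapezoidal estimate of the integral of exp(-a x^2) over [-half_width, half_width].

    Requires Re(a) > 0 so that the integrand is negligible at the grid edges."""
    x = np.linspace(-half_width, half_width, n)
    f = np.exp(-a * x**2)
    dx = x[1] - x[0]
    return dx * (f.sum() - 0.5 * (f[0] + f[-1]))

a = 1.0 - 2.0j                       # damped and oscillating integrand
numeric = gaussian_integral(a)
closed_form = np.sqrt(np.pi / a)     # principal branch of the square root
print(numeric, closed_form)
```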
In expression (7.15) we set $x_0 \equiv x_a$. We first calculate the integral over
$x_1$. Using (7.17), we obtain
$$K(2, 0) = \left(\frac{1}{C(\varepsilon)}\right)^2 \int dx_1 \exp\left(\frac{im}{2\hbar\varepsilon}\left[(x_2 - x_1)^2 + (x_1 - x_0)^2\right]\right). \tag{7.25}$$
With the identity
$$(x_2 - x_1)^2 + (x_1 - x_0)^2 = \frac{1}{2}\left[(x_2 - x_0)^2 + 4\left(x_1 - \frac{x_2 + x_0}{2}\right)^2\right],$$
this reduces to a simple Gaussian integral on $x_1$, which gives
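$$K(2, 0) = \left(\frac{1}{C(\varepsilon)}\right)^2 \sqrt{\frac{i\pi\hbar(2\varepsilon)}{2m}}\, \exp\left(\frac{im}{4\hbar\varepsilon}(x_2 - x_0)^2\right). \tag{7.26}$$
Of course, we have made use of the fact that $t_2 - t_1 = t_1 - t_0 = \varepsilon$, but we
have not taken any limit.
The expression (7.26) can be written in terms of $(t_2 - t_0) = 2\varepsilon$ as
$$K(2, 0) = \left(\frac{1}{C(\varepsilon)}\right)^2 \sqrt{\frac{i\pi\hbar(t_2 - t_0)}{2m}}\, \exp\left(\frac{im}{2\hbar(t_2 - t_0)}(x_2 - x_0)^2\right). \tag{7.27}$$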
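To first order in $\varepsilon$, this expression must be the same as that of $K(1, a)$ if
$t_1 - t_a = \eta = 2\varepsilon$ and $x_1 = x_2$, and therefore
$$K(1, a)\big|_{t_1 - t_a = 2\varepsilon} = \frac{1}{C(2\varepsilon)} \exp\left(\frac{im}{2\hbar(t_1 - t_a)}(x_2 - x_0)^2\right). \tag{7.28}$$
Consequently, the equality of (7.28) and (7.27) for infinitesimal time intervals
$t_b - t_a = \varepsilon$ imposes the choice
$$C(\varepsilon) = \left(\frac{2i\pi\hbar\varepsilon}{m}\right)^{1/2} \equiv \left(\frac{2i\pi\hbar(t_b - t_a)}{m}\right)^{1/2}. \tag{7.29}$$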
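The consistency condition behind this choice can be checked with a few lines of complex arithmetic. The sketch below assumes the form $C(\varepsilon) = (2i\pi\hbar\varepsilon/m)^{1/2}$ and compares the prefactor produced by two steps of length $\varepsilon$ (two factors $1/C(\varepsilon)$ times the Gaussian integral over the intermediate point) with the prefactor $1/C(2\varepsilon)$ of a single step of length $2\varepsilon$. The function names are illustrative.

```python
import cmath

def C(eps, m=1.0, hbar=1.0):
    """Normalization factor, assumed to be (2 i pi hbar eps / m)^(1/2)."""
    return cmath.sqrt(2j * cmath.pi * hbar * eps / m)

def two_step_prefactor(eps, m=1.0, hbar=1.0):
    """Two factors 1/C(eps) times the Gaussian integral over the intermediate
    point, i.e. the integral of exp(i m u^2 / (hbar eps)) over u."""
    gaussian = cmath.sqrt(1j * cmath.pi * hbar * eps / m)
    return gaussian / C(eps, m, hbar) ** 2

eps = 0.01
lhs = two_step_prefactor(eps)
rhs = 1 / C(2 * eps)
print(lhs, rhs)
```

The two numbers agree to rounding error, which is the statement that the prefactor depends only on the total elapsed time.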
The proof of (7.19) can be obtained by recursion. Assuming this result
is exact, we insert it in (7.14) by considering an intermediate point $(x, t)$, and we
obtain
$$K(b, a) = \left(\frac{2\pi i\hbar(t_b - t)}{m}\,\frac{2\pi i\hbar(t - t_a)}{m}\right)^{-1/2} \int \exp\left(\frac{im}{2\hbar}\left(\frac{(x_b - x)^2}{t_b - t} + \frac{(x - x_a)^2}{t - t_a}\right)\right) dx. \tag{7.30}$$
It is straightforward to obtain
$$\frac{(x_b - x)^2}{t_b - t} + \frac{(x - x_a)^2}{t - t_a} = \frac{(x_b - x_a)^2}{t_b - t_a} + \frac{t_b - t_a}{(t_b - t)(t - t_a)}\left(x - \frac{x_b(t - t_a) + x_a(t_b - t)}{t_b - t_a}\right)^2. \tag{7.31}$$
The first term, which is independent of $x$, factorizes in the integral, which
boils down to a Gaussian integral. The value $I$ of this integral (without the
prefactors of (7.30)) is therefore
$$I = \left(\frac{m(t_b - t_a)}{2i\pi\hbar(t_b - t)(t - t_a)}\right)^{-1/2} \exp\left(\frac{im}{2\hbar}\,\frac{(x_b - x_a)^2}{t_b - t_a}\right). \tag{7.32}$$
Inserting this in (7.30), we obtain the expected result
$$K(b, a) = \left(\frac{2\pi i\hbar(t_b - t_a)}{m}\right)^{-1/2} \exp\left(\frac{im}{2\hbar}\,\frac{(x_b - x_a)^2}{t_b - t_a}\right).$$
The proof is completed by noticing that the previous result does not require
any condition on $a$, $b$ and the intermediate point $(x, t)$. Therefore, the method
can be extended to any partition of the interval $[a, b]$. The formula therefore
coincides with the "definition" (7.26) in the infinitesimal limit $(t_b - t_a) = N\varepsilon$
if it is legitimate to interchange the order of the integration and the limit
$\varepsilon \to 0$.
7.2.2 Evolution Equation of the Free Propagator

For simplicity, let us fix the origin of time and position at $a$. We call
$b \equiv (x, t)$ the point of arrival, and we examine the properties of the propagator
$K(x, t; 0, 0)$ as a function of the endpoint $(x, t)$. If we set $K(x, t) \equiv K(x, t; 0, 0)$,
we have
$$K(x, t) = \sqrt{\frac{m}{2\pi i\hbar t}}\, \exp\left(\frac{imx^2}{2\hbar t}\right). \tag{7.33}$$
One can check with no difficulty that the free propagator obeys the partial
differential equation
$$i\hbar\,\frac{\partial K}{\partial t} = -\frac{\hbar^2}{2m}\,\frac{\partial^2 K}{\partial x^2} \tag{7.34}$$
for $t > 0$ (or $t_b > t_a$).
Equation (7.34) has the same form as the Schrödinger equation for a free
particle. We must, however, be careful since we do not yet know the physical
nature of the amplitude K and how it is related to a physical probability
amplitude.
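The claim that the free propagator obeys the free-particle evolution equation can be checked numerically. The sketch below assumes the closed form $K(x,t) = \sqrt{m/(2\pi i\hbar t)}\,\exp(imx^2/2\hbar t)$ and evaluates the residual $i\hbar\,\partial_t K + (\hbar^2/2m)\,\partial_x^2 K$ with central finite differences; the step `h` and the sample points are arbitrary choices.

```python
import numpy as np

def free_propagator(x, t, m=1.0, hbar=1.0):
    """Free propagator from the origin to (x, t), for t > 0."""
    return np.sqrt(m / (2j * np.pi * hbar * t)) * np.exp(1j * m * x**2 / (2 * hbar * t))

def schrodinger_residual(x, t, m=1.0, hbar=1.0, h=1e-3):
    """i hbar dK/dt + (hbar^2 / 2m) d2K/dx2, estimated by central differences.
    The exact value is zero; what remains is discretization error."""
    dK_dt = (free_propagator(x, t + h, m, hbar) - free_propagator(x, t - h, m, hbar)) / (2 * h)
    d2K_dx2 = (free_propagator(x + h, t, m, hbar) - 2 * free_propagator(x, t, m, hbar)
               + free_propagator(x - h, t, m, hbar)) / h**2
    return 1j * hbar * dK_dt + hbar**2 / (2 * m) * d2K_dx2

for x, t in [(0.0, 1.0), (0.7, 1.3), (-1.5, 2.0)]:
    print(x, t, abs(schrodinger_residual(x, t)))
```

The residual is small compared with the individual terms (which are of order 0.1 here) and shrinks as `h` is reduced, as expected for a second-order scheme.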
7.2.3 Normalization and Interpretation of the Propagator
Indeed, if we calculate the integral over $x$ of $K(x, t)$, we obtain
$$\int_{-\infty}^{+\infty} K(x, t)\, dx = \sqrt{\frac{m}{2\pi i\hbar t}} \int_{-\infty}^{+\infty} \exp\left(\frac{imx^2}{2\hbar t}\right) dx = 1 \qquad \forall t > 0. \tag{7.35}$$
In particular, in the limit $t = \varepsilon \to 0$, we have
$$\lim_{t \to 0} K(x, t) = \delta(x), \tag{7.36}$$
where $\delta$ is the Dirac distribution.

Therefore, $K$ is not strictly speaking a quantum mechanical probability
amplitude. However, this is a minor problem. In fact, the propagator $K(b, a)$
is the amplitude for a particle going from point $a$ to point $b$. It is unphysical
in the sense that it is not possible physically to prepare or to measure a particle
whose position is strictly defined to be the point $x = x_a$. The position
"eigenstates" $|x\rangle$ have "wave functions" $\psi_{x_a}(x) \propto \delta(x - x_a)$ in the same way
that the momentum eigenstates have wave functions $\psi_{p_0}(x) \propto \exp(ip_0 x/\hbar)$.
These are not physical since they are not normalizable. They are
"eigendistributions" that do not belong to the Hilbert space. Nevertheless, we know that
any physical state can be written as a linear superposition of such nonphysical
states. It is useful to work with the nonphysical states $|x\rangle$ and $|p\rangle$ in all
intermediate calculations.
The same thing happens here. It is useful to work with the propagator
K(b, a) and to call it a probability amplitude, even though we are aware that
a true probability amplitude is obtained after a summation of $K(b, a)$ over
vicinities of $b$ and $a$.
One can check that if we forget this precaution, the "probability" of observing
the particle in a vicinity $dx$ of point $x_b$, knowing that it originates
from $a$, would be
$$P(b)\, dx = \frac{m}{2\pi\hbar(t_b - t_a)}\, dx,$$
whose integral over all space is infinite. This is exactly the same problem as
encountered in quantum mechanics to shift from de Broglie plane waves to
wave packets.
7.2.4 Fourier and Schrödinger Equations
Some authors stress the fact that the free Schrödinger equation can be
considered as a Fourier diffusion equation,
$$\frac{\partial\rho}{\partial t} = D\nabla^2\rho,$$
for a purely imaginary time $t = i\tau$. This remark is interesting in that the same
mathematical techniques apply to both and that the solutions have obvious
formal similarities.
Two points are in order. First, the function $K$ that we use here becomes a
density $\rho$ (of heat or matter), which is positive. The solution is then real and
positive, or zero. The result (7.35) expresses the conservation of energy, and
the limit (7.36) represents an initial condition where some quantity of heat has
been deposited on a given point, which avoids any problem of interpretation.
Second, and this is perhaps more interesting, this is an example of the fact
that path integral techniques are useful in a large category of problems.
One can refer to the remarkable book Techniques and Applications of Path
Integration by Lawrence S. Schulman [21]. In the present case, the solution of
a partial differential equation of first order in time can be cast quite directly
into the form of a path integral. This is the case for the Fourier equation as
well as the Schrödinger equation.
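In imaginary time the kernel is a real Gaussian, which makes the composition of successive short-time kernels easy to test numerically. The sketch below is an illustration under stated assumptions: it continues the free propagator to $t = -i\tau$ with $\tau > 0$ (the sign is chosen here so that the Gaussian decays; it corresponds to a diffusion constant $D = \hbar/2m$), puts the kernel on a finite grid, and compares $N$ composed steps of duration `eps` with a single kernel of duration `N * eps`. Grid size and step values are arbitrary.

```python
import numpy as np

def diffusion_kernel(x, y, tau, m=1.0, hbar=1.0):
    """Free propagator continued to t = -i*tau: a real, normalized Gaussian."""
    return np.sqrt(m / (2 * np.pi * hbar * tau)) * np.exp(-m * (x - y) ** 2 / (2 * hbar * tau))

x = np.linspace(-10.0, 10.0, 401)
dx = x[1] - x[0]
eps, N = 0.25, 4

# One time slice as a matrix; the factor dx is the integration weight of the
# intermediate point, so a matrix product is a discretized integral over it.
step = diffusion_kernel(x[:, None], x[None, :], eps) * dx
composed = np.linalg.matrix_power(step, N) / dx
direct = diffusion_kernel(x[:, None], x[None, :], N * eps)

center = slice(150, 251)   # endpoints in [-2.5, 2.5], far from the grid edges
max_error = np.max(np.abs(composed[center, center] - direct[center, center]))
print(max_error)
```

Away from the edges of the grid, the composed kernel agrees with the single long-time kernel to rounding error, whatever the number of intermediate slices.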
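The propagator $K(x, t)$ oscillates in $x$ and in $t$ with a wavelength $\lambda$ and a
frequency $\omega$ that vary with $x$. Locally, for large values of $x$ and $t$, in a region
$(\delta x \ll x, \delta t \ll t)$, one can approximate this wave by a monochromatic plane
wave
$$K(x, t) = \sqrt{\frac{m}{2\pi i\hbar t}}\, \exp\bigl(i\phi(x, t)\bigr) \propto e^{i(kx - \omega t)}, \tag{7.37}$$
where $k$ is the wave vector and $\omega$ the frequency. These are locally related to the phase by
$$k = \frac{\partial\phi}{\partial x}, \qquad \omega = -\frac{\partial\phi}{\partial t}. \tag{7.38}$$
Here, the value of the phase is
$$\phi(x, t) = \frac{mx^2}{2\hbar t}.$$
Therefore, we obtain
$$k = \frac{m}{\hbar}\,\frac{x}{t}, \quad \text{i.e.,} \quad \lambda = \frac{2\pi}{k} = \frac{h}{m(x/t)}, \tag{7.39}$$
and
$$\omega = \frac{m}{2\hbar}\,\frac{x^2}{t^2}. \tag{7.40}$$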
If we place ourselves far from the origin, $x \gg \lambda$, the propagator oscillates in $x$
and $t$ with a wavelength $\lambda$ and a frequency $\omega$, which are both nearly constant.
If the particle, emitted at the origin at $t = 0$, is detected at point $x$ at time $t$, its
velocity is $v = x/t$, its momentum is $p = mv = m(x/t)$, and its kinetic energy
is $E_k = m(x/t)^2/2$.⁴ We therefore obtain the de Broglie relation between the
wavelength of the propagator and the momentum of the free particle
$$\lambda = \frac{h}{p}. \tag{7.41}$$
⁴ See, for instance, [7], Chapter 2, Section 6, for a discussion of this point.
Similarly, the kinetic energy $E_k = mv^2/2$ of the free particle is related to the
frequency $\omega$ of the wave by the relation $\omega = (m/2\hbar)(x^2/t^2)$; i.e.,
$$E_k = \hbar\omega. \tag{7.42}$$
These de Broglie-Einstein relations (7.41) and (7.42) will appear later
as being in agreement with the definition of energy and momentum in the
classical limit.
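These relations can be verified numerically from the phase of the free propagator alone. The sketch below assumes the phase $\phi(x,t) = mx^2/(2\hbar t)$ and the local definitions $k = \partial\phi/\partial x$ and $\omega = -\partial\phi/\partial t$, estimated by central differences; the numerical values of `m`, `hbar`, `x`, and `t` are arbitrary.

```python
import numpy as np

def phase(x, t, m=1.0, hbar=1.0):
    """Phase of the free propagator (the constant from the prefactor is dropped)."""
    return m * x**2 / (2 * hbar * t)

def local_wave(x, t, m=1.0, hbar=1.0, h=1e-5):
    """Local wave number k = d(phase)/dx and frequency omega = -d(phase)/dt."""
    k = (phase(x + h, t, m, hbar) - phase(x - h, t, m, hbar)) / (2 * h)
    omega = -(phase(x, t + h, m, hbar) - phase(x, t - h, m, hbar)) / (2 * h)
    return k, omega

m, hbar, x, t = 2.0, 0.5, 30.0, 4.0
k, omega = local_wave(x, t, m, hbar)
p = m * x / t                    # classical momentum for velocity x/t
E = 0.5 * m * (x / t) ** 2       # classical kinetic energy
print(hbar * k, p, hbar * omega, E)
```

Within finite-difference error, $\hbar k$ reproduces the classical momentum and $\hbar\omega$ the classical kinetic energy.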
7.2.6 Interference and Diffraction
The calculation of interference and diffraction phenomena follows immediately
from equation (7.14),
$$K(b, a) = \int dx_c\, K(b, c)\, K(c, a). \tag{7.43}$$
In three dimensions, this result is transposed as
$$K(b, a) = \int d^3r_c\, K(b, c)\, K(c, a). \tag{7.44}$$
We can define a function $G(\mathbf{r}_c)$ as $G = 1$ in some domain $D$ (collection of
slits, aperture of arbitrary shape) and $G = 0$ elsewhere. $G$ represents a screen
that lets the particles go through freely in $D$ and stops them elsewhere. The
amplitude emitted in $a$, diffracted by $G$, and measured on each point $b$ of a
screen or detector is simply the usual sum
$$K(b, a) = \int d^3r_c\, K(b, c)\, G(\mathbf{r}_c)\, K(c, a). \tag{7.45}$$
In the book by Feynman and Hibbs, there are several examples and cal-
culations of this type. We shall not elaborate further on this aspect.
7.3 Wave Function and the Schrödinger Equation
Up to now, we have considered the amplitude for a particle to travel from a

to b.
It is quite possible, and legitimate, to address the question of the total
amplitude of a particle to reach an arbitrary point $b$, independently of what
happened to it previously (up to now, we considered the problem when "the
particle was emitted at $a$"). Of course, this amplitude is the wave function
$\psi(x, t)$⁵ (we have replaced the name $b$ by variables $(x, t)$). This probability
amplitude $\psi(x, t)$ obviously satisfies all the conditions we have found previously.
By definition, it is square integrable, which avoids by construction the
limiting procedures seen in Section 7.2.3. Apart from these problems, the
amplitude $K$ of (7.33) is a particular wave function for which we know that the
particle started at $a \equiv (0, 0)$.
⁵ See, for instance, [7], Chapter 2, Section 1.
The wave function is a probability amplitude. Therefore, it satisfies the
law of composition of successive amplitudes (7.14); i.e., the integral equation
$$\psi(x, t) = \int_{-\infty}^{+\infty} K(x, t; x', t')\, \psi(x', t')\, dx'. \tag{7.46}$$
The physical content of this formula is important. The amplitude $\psi(x, t)$ for
the particle to arrive at $(x, t)$ is the sum over all possible values of an
intermediate point $x'$ of the product of the total amplitude $\psi(x', t')$ and the
amplitude $K(x, t; x', t')$ to go from $(x', t')$ to $(x, t)$.
In other words (we intentionally keep the enthusiastic presentation of Feynman),
the effect of all the past history of a particle is contained in a single
function $\psi(x, t)$. One can forget everything one knows about the past history
of a particle. If one knows its wave function at a given time $t$, one can calculate
and "read" in it all that can happen to the particle in the future.⁶
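The statement that the wave function at one time determines it at later times can be tried out numerically for a free particle. The sketch below is an illustration under stated assumptions: it uses the closed-form free propagator, a Gaussian initial wave function of width `sigma0` (an arbitrary choice), and a plain Riemann sum for the integral over the intermediate point. It then compares the spread of $|\psi|^2$ with the standard free-packet result $\sigma_0^2 + (\hbar t/2m\sigma_0)^2$.

```python
import numpy as np

def free_propagator(x, y, t, m=1.0, hbar=1.0):
    """Free propagator from (y, 0) to (x, t), for t > 0."""
    return np.sqrt(m / (2j * np.pi * hbar * t)) * np.exp(1j * m * (x - y) ** 2 / (2 * hbar * t))

def propagate(psi0, y, x, t, m=1.0, hbar=1.0):
    """psi(x, t) = integral of K(x, t; y, 0) psi0(y) dy, as a Riemann sum on the grid y."""
    dy = y[1] - y[0]
    K = free_propagator(x[:, None], y[None, :], t, m, hbar)
    return (K @ psi0) * dy

sigma0 = 1.0
y = np.linspace(-10.0, 10.0, 4001)     # fine grid: the kernel oscillates rapidly
x = np.linspace(-8.0, 8.0, 321)
psi0 = (2 * np.pi * sigma0**2) ** -0.25 * np.exp(-y**2 / (4 * sigma0**2))

psi = propagate(psi0, y, x, t=1.0)
dx = x[1] - x[0]
density = np.abs(psi) ** 2
norm = density.sum() * dx
variance = (x**2 * density).sum() * dx / norm
print(norm, variance)
```

With $m = \hbar = 1$ and $t = 1$, the norm stays equal to 1 and the variance of the position grows from 1 to 1.25, the known spreading of a free Gaussian packet.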
In fact, equation (7.46) is nothing else than the modern expression of the
Huygens-Fresnel principle in optics (see, for instance, Born and Wolf [12],
Chapter VIII), which founded wave optics. The Huygens principle, given in
1690, was that "Each infinitesimal element of a wave front can be considered as
a secondary perturbation which radiates spherical wavelets. The wave front
at a later time is the envelope of these wavelets". Fresnel completed this
principle later, in 1818, by postulating that the secondary wavelets "are in
mutual interference". The fundamental principles of wave optics were stated.
7.3.1 Free Particle
We have abundantly treated the case of a free particle above. The propagator
can be calculated with no difficulty:
$$K(x_2, t_2; x_1, t_1) = \sqrt{\frac{m}{2\pi i\hbar(t_2 - t_1)}}\, \exp\left(\frac{im}{2\hbar}\,\frac{(x_2 - x_1)^2}{t_2 - t_1}\right). \tag{7.47}$$
The wave function $\psi(x, t)$ satisfies the free Schrödinger equation
$$i\hbar\,\frac{\partial\psi(x, t)}{\partial t} = -\frac{\hbar^2}{2m}\,\frac{\partial^2\psi(x, t)}{\partial x^2}. \tag{7.48}$$
⁶ Feynman added, with his legendary sense of humor, "The effect of the entire
History on the future of the universe could be obtained from a single gigantic
wave function."
As in the case of a particle in a potential, which we examine in the next
paragraph, we will not further pursue the analysis, which is completely analogous
to the usual analysis of Schrödinger theory. One can refer to the book
by Feynman and Hibbs [20] for all details.
7.3.2 Particle in a Potential
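Consider now the case of a particle of mass $m$ in a potential $V(x)$.
The integral equation satisfied by the wave function is still
$$\psi(x, t) = \int_{-\infty}^{+\infty} K(x, t; x', t')\, \psi(x', t')\, dx', \tag{7.49}$$
but there is no analytic expression for the propagator
$$K(b, a) = \int_a^b e^{\frac{i}{\hbar} S(b, a)}\, \mathcal{D}x(t). \tag{7.50}$$
The action is
$$S = \int_{t'}^{t} \mathcal{L}(x, \dot{x}, t)\, dt, \tag{7.51}$$
and the Lagrangian
$$\mathcal{L} = \frac{1}{2} m\dot{x}^2 - V(x). \tag{7.52}$$
Consider an infinitesimal time interval $t - t' = \varepsilon$. To order $\varepsilon$, the action is
$$S = \frac{m(x - x')^2}{2\varepsilon} - \varepsilon\, V\!\left(\frac{x + x'}{2}\right). \tag{7.53}$$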
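We insert this into (7.49), recalling that although the phase of the propagator $K$,
$\exp(iS/\hbar)$, is well defined, its normalization $C$ is not yet fixed, and we
obtain
$$\psi(x, t + \varepsilon) = \int_{-\infty}^{+\infty} \frac{1}{C} \exp\left[\frac{i}{\hbar}\,\frac{m(x - y)^2}{2\varepsilon} - \frac{i\varepsilon}{\hbar}\, V\!\left(\frac{x + y}{2}\right)\right] \psi(y, t)\, dy. \tag{7.54}$$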
In the argument of the exponential, the first term, $(i/\hbar)(m(x - y)^2/2\varepsilon)$,
becomes large as soon as $y$ becomes appreciably different from $x$. Within this
assumption, the phase of the integrand in (7.54) varies rapidly and this integrand
oscillates very quickly. On average, these various contributions to the
integral cancel each other. In other words, it is only the sufficiently small values
of $|x - y|$ that will give appreciable contributions to the integral. We can
rewrite (7.54), setting $y - x = \eta$ and keeping in mind that only small values
of $|\eta|$ will contribute, as
$$\psi(x, t + \varepsilon) = \int_{-\infty}^{+\infty} \frac{1}{C} \exp\left[\frac{im\eta^2}{2\hbar\varepsilon}\right] \exp\left[-\frac{i\varepsilon}{\hbar}\, V(x + \eta/2)\right] \psi(x + \eta, t)\, d\eta. \tag{7.55}$$
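We now perform the power series expansion of $\psi$ in $\varepsilon$. In order to ensure
the consistency of the calculation, we must keep the terms in $\eta^2$. The term
$\varepsilon V(x + \eta/2)$ can be replaced by $\varepsilon V(x)$, since the difference is of higher order
than $\varepsilon$. Expanding to first order in $\varepsilon$ and to second order in $\eta$, we obtain
$$\psi(x, t) + \varepsilon\,\frac{\partial\psi}{\partial t} = \int_{-\infty}^{+\infty} \frac{1}{C}\, \exp\left[\frac{im\eta^2}{2\hbar\varepsilon}\right] \left[1 - \frac{i\varepsilon}{\hbar} V(x)\right] \left[\psi(x, t) + \eta\,\frac{\partial\psi}{\partial x} + \frac{\eta^2}{2}\,\frac{\partial^2\psi}{\partial x^2}\right] d\eta. \tag{7.56}$$
The identification in each order of $\varepsilon$ gives the following result, using the Gaussian
integrals (7.22) and (7.23).
1. Order 0 in $\varepsilon$
The coefficients of $\psi$ on both sides are concerned. We obtain
$$C = \left(\frac{2\pi i\hbar\varepsilon}{m}\right)^{1/2}, \tag{7.57}$$
which is identical to what we obtained in (7.21). Notice that it is the value
of $C$ for an infinitesimal time interval, so that it is not surprising to find
the same value. Of course, the integrated value is different from the case
of a free particle.
2. Order 1 in $\eta$
The first-order term in $\eta$ on the right-hand side vanishes identically.
3. Order 1 in $\varepsilon$
To first order in $\varepsilon$, we obtain
$$i\hbar\,\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\,\frac{\partial^2\psi}{\partial x^2} + V(x)\,\psi(x, t), \tag{7.58}$$
which is nothing but the Schrödinger equation.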
As we mentioned above, we shall not pursue any further the analysis of
Schrödinger theory or quantum mechanics in general. One can refer to the
book by Feynman and Hibbs [20]. We note the following points:
The theory of observables and their algebraic properties, in particular un-
certainty relations, is obtained with no great difficulty.
There are important technical and conceptual simplifications in the way
to treat perturbation theory.
The resulting formalism can be extended much more easily to many par-
ticle systems and, in particular, to quantum field theory.
To end this section, we remark that it is a good exercise to repeat the
calculation above in three dimensions, in particular to recover the Schrödinger
equation
$$i\hbar\,\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\,\nabla^2\psi + V(\mathbf{r})\,\psi. \tag{7.59}$$
7.4 Concluding Remarks
Both from the conceptual and the technical points of view, the method of
Feynman path integrals has an undeniable elegance and richness. We have
mentioned that it extends to many other physical problems such as quantum
field theory, Brownian motion, polarons, spin physics, statistical mechanics,
and critical phenomena, as one can see in the book by Schulman [21]. This
book contains, in particular, a very pleasant discussion of quantum mechanics
in curved spaces. We end this chapter with a series of remarks that the present
results have induced after going through the previous five chapters of this
book.
There is no hierarchical relationship between the depth of the various
approaches and different chapters of physics, neither do we wish to discuss
any axiomatics of physics. It is a personal matter of taste to prefer such and
such a line of thought. What is interesting here is to see the unifying character
of what we have discussed, from the Fermat principle up to the Feynman path
integrals.
7.4.1 Classical Limit
Consider again the path integral (7.11)
$$K(b, a) = \int_a^b e^{\frac{i}{\hbar} S(b, a)}\, \mathcal{D}x(t), \tag{7.60}$$
and suppose the classical action $S(b, a)$ is macroscopic (i.e., it is much larger
than the Planck constant $\hbar$). Consider the contribution of several paths that
can perfectly well be close to each other in the classical sense but whose
difference in action is much larger than $\hbar$. The contributions of these paths to the phase
will be completely different (and very difficult to determine with an accuracy
better than, say, $\pi$). With great probability, they will interfere destructively. If
one considers the set of all those paths, their total contribution to the integral
will vanish.
However, in the vicinity of the classical trajectory $x_{cl}(t)$, the action
$S_{cl}(b, a)$ is stationary. Therefore, paths that are sufficiently close to the classical
trajectory will give contributions that will interfere in a constructive way.
Only those paths along which the action $S(b, a)$ is sufficiently close to the classical
action $S_{cl}(b, a)$ will contribute, the difference being noticeably smaller
than the unit of action $\hbar$. Notice that for all processes involving macroscopic
values of the action, this quantity will be larger than, say, $10^{25}$ to $10^{30}\,\hbar$.
In other words, under these conditions, the only appreciable contribution
will come from an infinitesimal vicinity of the classical trajectory that cannot
be resolved experimentally. Consequently, the "probability" of the classical
trajectory is equal to one. The probability for any trajectory that can be
distinguished from the classical one vanishes.
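The cancellation argument can be illustrated on a one-parameter family of paths. The sketch below is a toy model, not the full path integral: it assumes free-particle paths of the form $x(t) = x_{cl}(t) + a\sin(\pi t/T)$, for which the action exceeds the classical value by $m\pi^2 a^2/(4T)$, and compares the contribution of amplitudes $a$ near zero with that of a window far from the classical path. The value of `hbar`, the windows, and the grid density are arbitrary.

```python
import numpy as np

def window_contribution(a_min, a_max, hbar, m=1.0, T=1.0, points_per_unit=200_000):
    """Integral of exp(i (S - S_cl) / hbar) over the deformation amplitude a
    in [a_min, a_max], with S - S_cl = m pi^2 a^2 / (4 T) (trapezoidal rule)."""
    n = int((a_max - a_min) * points_per_unit) + 1
    a = np.linspace(a_min, a_max, n)
    delta_S = m * np.pi**2 * a**2 / (4 * T)
    f = np.exp(1j * delta_S / hbar)
    da = a[1] - a[0]
    return da * (f.sum() - 0.5 * (f[0] + f[-1]))

hbar = 1e-3                                    # action scale much larger than hbar
near = window_contribution(-0.5, 0.5, hbar)    # paths close to the classical one
far = window_contribution(1.0, 2.0, hbar)      # paths far from it, wider window
print(abs(near), abs(far))
```

Even though the far window is as wide as the near one, its contribution is smaller by roughly two orders of magnitude, and the ratio decreases further as `hbar` is reduced.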
Therefore, classical mechanics appears here as the limit of quantum me-

chanics for macroscopic actions. Of course, one may wonder about the fact
that Feynman's starting point involves the classical action in (7.60), which
means that some care should be taken with the previous assertion. However,
from the very beginning, Feynman operates in space-time (x, t). Therefore, all
quantities defined in Section 7.1.1 (i.e., $(x, \dot{x}, t)$, the Lagrangian $\mathcal{L}$, and the
action S) are perfectly well-defined quantities, even though they do not have
to possess any intuitive meaning.
Consider the variation of the phase factor $e^{iS_{cl}/\hbar}$ in an infinitesimal variation
of the endpoint $b$, $S_{cl} \to S_{cl}(b + db, a)$. The word "infinitesimal" can have
a meaning in a macroscopic sense for the action $S$; it may happen that the
variation of the phase $\phi = S_{cl}/\hbar$ is large. The variation of the propagator may
not be slow, but in any case the arguments above show that its form is close
to
$$K(b, a) \simeq F(b, a)\, e^{\frac{i}{\hbar} S_{cl}(b, a)}, \tag{7.61}$$
where the function $F$ is slowly varying.
One can prove that the approximation (7.61) becomes an equality if the
Lagrangian is a quadratic form, in particular for a harmonic oscillator (see
Problem 7.1).
Going back to the consideration of Section 7.2.5, if we perform in (7.61) a
variation of the endpoint $x_b \to x_b + dx_b$, the variation of the phase $\phi = S_{cl}/\hbar$ is
$$d\phi = \frac{1}{\hbar}\,\frac{\partial S_{cl}}{\partial x_b}\, dx_b = k\, dx_b, \tag{7.62}$$
where $k$ is the local wave vector of (7.61). But we know (see (7.4)) that the
classical momentum of a particle at $x_b$ is related to the action by $p = \partial S/\partial x_b$.
We therefore obtain the de Broglie relation
$$p = \hbar k,$$
as announced in (7.41).
In the same way, by varying the time of arrival $t_b$, we obtain (referring
to the definition (7.37))
$$d\phi = \frac{1}{\hbar}\,\frac{\partial S_{cl}}{\partial t_b}\, dt_b = -\omega\, dt_b. \tag{7.63}$$
Since the energy of a particle is related to the action by (7.4), $E = -\partial S/\partial t$,
we therefore obtain the Einstein-de Broglie relation seen in (7.42):
$$E = \hbar\omega.$$
7.4.3 Optics and Analytical Mechanics

The previous considerations shed light on the relationship between analytical
mechanics and optics. We have shown, in Section 4.6.1, Hamilton's remarkable
statement in 1830 that the formalisms of optics and mechanics could be unified
and that Newtonian mechanics corresponds to the same limit or approxima-
tion as geometrical optics as compared with wave optics.
In the same way as geometrical optics appears as the short wavelength
limit of wave optics, classical mechanics appears as the limit of quantum
mechanics for "small" values of $\hbar$, or rather in situations such that the value
of the action is large compared with Planck's constant.
We have stressed in Section 7.3 the remarkable similarity between the
Feynman principle in quantum mechanics and the Huygens-Fresnel principle
that was the foundation of wave optics.
In Section 4.6.1, we studied the eikonal and how the transition between
wave optics and geometrical optics is achieved, as well as the "semiclassical"
approximation method of Wentzel, Kramers, and Brillouin. We shall not come
back to this except by saying that several chapters of physics such as all optics
and all mechanics therefore emerge as originating from the same common
mold. (One thing remains to be done, and that is to extend this to field
theory and its quantization, which is beyond the scope of this book.)
7.4.4 The Essence of the Phase

We are aware that the present considerations may seem far away, if not dis-
jointed, from variational principles as we expounded them in previous chap-
ters. However, we recover everywhere results obtained with variational prin-
ciples in Sections 7.1 to 7.3.
Actually, in order to "see" how Feynman's principle emerges as a genuine
variational principle, it would be necessary to treat the question of observables
which can be found in the book by Feynman and Hibbs [20], Chapter 5, and
more importantly, to explain the variational principle of Schwinger, 7 which
can be found in [22].
Everything lies in the phase. One can make the following naive and heuristic
remark. Consider a variation of the coordinates in (7.9), $x_i(t) \to x_i(t + \delta t)$,
which leads to the variation $\delta S$ of the action. Formally (i.e., without discussing
precisely the mathematical nature of various expressions), we have a variation
of the propagator
$$\delta K = \int_a^b \left(\frac{i}{\hbar}\,\delta\!\int_{t_a}^{t_b} L\,dt\right) e^{\frac{i}{\hbar}S(b,a)}\,\mathcal{D}x(t)$$
(one can also vary the Lagrangian itself). The fact that the amplitude corresponding
to the propagator is stationary to first order in the variation of the
$x_i(t)$ implies the classical equation
$$\delta\int_{t_a}^{t_b} L\,dt = 0,$$
i.e., the classical definition of the trajectory.

7 Julian Schwinger, Phys. Rev. 82, 914 (1951).

This can, of course, be made much more rigorous. The variational principle
of Schwinger is perhaps less elegant than Feynman's, but it had undoubtedly
better mathematical foundations when it was first formulated. One can refer
to [20], [4], [21], and [22].
Feynman's principle can also be compared with what is called the stationary
phase method in analysis. Consider the limit for $\mu \to \infty$ of the integral
(7.64)
For large values of $\mu$, the phase of $e^{i\mu f(t)}$ varies very rapidly unless $f'(t) = 0$.
Therefore, the dominant contributions to the integral will come from values of
$t$ for which $f'(t)$ vanishes. If $f'(t)$ vanishes at a single point $t_0$, we can expand
$f$ as a power series in the vicinity of $t_0$; i.e.,
$$f(t) = f(t_0) + \frac{1}{2}(t - t_0)^2 f''(t_0) + \cdots \qquad (7.65)$$
If one neglects higher-order terms in the expansion, one obtains the result
(7.66)
to be compared with (7.19).
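The stationary phase argument can be checked numerically on a simple case. The sketch below is an illustration under stated assumptions, not a reproduction of (7.64) to (7.66): it takes the integral $I(\mu) = \int g(t)\,e^{i\mu f(t)}\,dt$ with the hypothetical choices $g(t) = e^{-t^2}$ and $f(t) = t^2/2$ (so that $t_0 = 0$ and $f''(t_0) = 1$), and compares it with the standard leading-order estimate $g(t_0)\,e^{i\mu f(t_0)}\sqrt{2\pi i/(\mu f''(t_0))}$.

```python
import cmath
import math

# Assumed test case: g(t) = exp(-t^2), f(t) = t^2 / 2, stationary point t0 = 0.

def oscillatory_integral(mu, half_width=8.0, n=8000):
    """Trapezoid rule for I(mu) = integral of exp(-t^2) * exp(i*mu*t^2/2) over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0j
    for j in range(n + 1):
        t = -half_width + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(-t * t + 0.5j * mu * t * t)
    return total * h

def exact(mu):
    """Closed form of the same Gaussian integral, sqrt(pi / (1 - i*mu/2))."""
    return cmath.sqrt(math.pi / (1.0 - 0.5j * mu))

def stationary_phase(mu):
    """Leading-order estimate with g(t0) = 1, f(t0) = 0, f''(t0) = 1."""
    return cmath.sqrt(2j * math.pi / mu)

def relative_error(mu):
    """Relative deviation of the stationary phase estimate from the numerical integral."""
    return abs(oscillatory_integral(mu) / stationary_phase(mu) - 1.0)

for mu in (50.0, 200.0):
    print(mu, relative_error(mu))
```

In this example the relative error decreases roughly like $1/\mu$, which is the sense in which only the neighborhood of the stationary point contributes when the phase varies rapidly.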

Feynman's principle consists of taking into account, in the calculation of
an amplitude, the largest "number" of possible paths, with the constraint that
paths too far from each other will lead to destructive interference. One can
also visualize this as the fact that an amplitude increases when the "volume"
of the space of the alternative paths that contribute in a coherent manner is
larger. From that point of view, the phase of an amplitude acquires a physical
role and an essence that perhaps is not fully appreciated.
7.5 Problems
7.1. Propagator of a Harmonic Oscillator

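The Lagrangian of a one-dimensional harmonic oscillator is
$$L = \frac{1}{2}m\dot{x}^2 - \frac{1}{2}m\omega^2x^2.$$
Show that the corresponding propagator is
$$K = F(T)\exp\left(\frac{im\omega}{2\hbar\sin\omega T}\left[(x_a^2 + x_b^2)\cos\omega T - 2x_ax_b\right]\right), \qquad (7.67)$$
where $T = t_b - t_a$ and where
$$F(T) = \left(\frac{m\omega}{2\pi i\hbar\sin\omega T}\right)^{1/2}.$$
Check that one obtains the formula (7.61).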
Solutions
Problems for Chapter 2
2.1 We obtain directly
$$\frac{d}{dz}\left(\frac{r}{\sqrt{1 + \dot r^2}}\right) = \frac{\dot r\,(1 + \dot r^2 - r\ddot r)}{(1 + \dot r^2)^{3/2}}.$$
The equation of the curve is $1 + \dot r^2 - r\ddot r = 0$, from which the result follows.
Therefore
$$r(z) = a\sqrt{1 + \dot r(z)^2}.$$
Setting $\dot r(z) = \sinh(\phi(z))$, we obtain
$$r(z) = a\cosh(\phi(z)); \quad \text{i.e.,} \quad \dot r = a\dot\phi(z)\sinh(\phi(z)),$$
and therefore $a\dot\phi(z) = 1$ and the solution $r(z) = a\cosh((z - z_0)/a)$. This is a
particular case of the use of conserved quantities discussed in Chapter 3.
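2.2 Lagrange Multipliers
We must minimize
$$V = \int_0^a \mu g z\sqrt{1 + \dot z(x)^2}\,dx \qquad (7.68)$$
with the constraints
$$z(0) = z_0, \quad z(a) = z_1, \quad \text{and} \quad \int_0^a \sqrt{1 + \dot z(x)^2}\,dx = L.$$
One can transform the problem into
$$\min V = \int_0^a (\mu g z + \lambda)\sqrt{1 + \dot z(x)^2}\,dx, \qquad (7.69)$$
with $z(0) = z_0$, $z(a) = z_1$.

The conserved quantity
$$\frac{\mu g z + \lambda}{\sqrt{1 + \dot z(x)^2}} = C \qquad (7.70)$$
yields $\dot z = \sinh\phi(x)$, i.e., $\mu g z + \lambda = C\cosh\phi$ with $C\dot\phi = \mu g$. The solution is
$$z = -\frac{\lambda}{\mu g} + \frac{C}{\mu g}\cosh\left(\frac{\mu g}{C}(x - x_0)\right). \qquad (7.71)$$
The constants $x_0$, $C$, and $\lambda$ are fixed by the conditions $z(0) = z_0$, $z(a) = z_1$,
and $\int_0^a \sqrt{1 + \dot z(x)^2}\,dx = L$.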
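2.3 Brachistochrone
Energy conservation gives
$$\frac{1}{2}\left(\frac{ds}{dt}\right)^2 + g(z - \alpha) = 0. \qquad (7.72)$$
We want to minimize
$$T = \int_a^b \sqrt{\frac{1 + \dot z^2}{2g(\alpha - z)}}\,dx \qquad (7.73)$$
with the constraints $z(a) = \alpha$, $z(b) = \beta$.

The Lagrange function $\mathcal{L} = \sqrt{(1 + \dot z^2)/(2g(\alpha - z))}$ does not depend on $x$, and
therefore there is conservation of
$$\mathcal{L} - \dot z\frac{\partial\mathcal{L}}{\partial\dot z} = \frac{1}{\sqrt{2g(\alpha - z)(1 + \dot z^2)}} = \frac{1}{\sqrt{2gR}}, \qquad (7.74)$$
where we introduce a positive constant $R$. Setting $\dot z = \tan(\phi/2)$, we obtain
the parametric form
$$z - z_0 = -\frac{R\cos\phi}{2}, \quad x - x_0 = \frac{R(\phi + \sin\phi)}{2}, \qquad (7.75)$$
which is the equation of a cycloid.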
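To see the minimizing property at work, one can compare the descent time along a cycloid with the time along the straight line joining the same endpoints. The sketch below is only an illustration: it uses the common parametrization $x = R(\phi - \sin\phi)$, $y = R(1 - \cos\phi)$ with $y$ measured downward and the particle released at rest at the origin (a shifted version of the form above), and the values of $g$ and $R$ are arbitrary assumptions.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def cycloid_time(R, phi_end, n=20_000):
    """Midpoint-rule integral of ds / v along x = R(phi - sin phi), y = R(1 - cos phi)."""
    dphi = phi_end / n
    total = 0.0
    for j in range(n):
        phi = (j + 0.5) * dphi
        ds = 2.0 * R * math.sin(phi / 2.0) * dphi            # arc-length element
        v = math.sqrt(2.0 * G * R * (1.0 - math.cos(phi)))   # speed from energy conservation
        total += ds / v
    return total

def straight_line_time(x_end, y_end):
    """Time to slide from rest down a straight incline from (0, 0) to (x_end, y_end)."""
    length = math.hypot(x_end, y_end)
    return length * math.sqrt(2.0 / (G * y_end))

R = 1.0
phi_end = math.pi
x_end = R * (phi_end - math.sin(phi_end))
y_end = R * (1.0 - math.cos(phi_end))

t_cycloid = cycloid_time(R, phi_end)
t_line = straight_line_time(x_end, y_end)
print(t_cycloid, t_line)
```

Along the cycloid the integrand $ds/v$ is constant, so the time is simply $\phi_{end}\sqrt{R/g}$; the straight line between the same two points takes longer.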
2.4 Win a Slalom
1. With this definition of the variable $x$, we have $(z - z_0) = -(x - x_0)\sin\alpha$
and the potential energy is $V = mg(z - z_0) = -mgx\sin\alpha$.
2. The total energy is $E = \frac{1}{2}m(\dot x^2 + \dot y^2) - mgx\sin\alpha$. Since energy is conserved,
and since it is taken to be zero initially, we have $\dot x^2 + \dot y^2 = 2gx\sin\alpha$.
3. Therefore $dt^2 = (dx^2 + dy^2)/(2gx\sin\alpha)$.
4. The total time to get from $O$ to $A$ is therefore
$$T = \int_O^A dt = \frac{1}{\sqrt{2g\sin\alpha}}\int_O^A \sqrt{\frac{1 + (y')^2}{x}}\,dx.$$
5. Using the Lagrange-Euler equation, we obtain
$$0 = \frac{d}{dx}\,\frac{y'}{\sqrt{x(1 + (y')^2)}}.$$
6. We deduce
$$\frac{y'}{\sqrt{x(1 + (y')^2)}} = C,$$
where $C$ is a constant. However,
$$\frac{y'}{\sqrt{x(1 + (y')^2)}} = \frac{dy}{\sqrt{x(dx^2 + dy^2)}} = \frac{\dot y}{x\sqrt{2g\sin\alpha}}, \qquad (7.76)$$
and therefore $\dot y = Kx$ with $K = C\sqrt{2g\sin\alpha}$.
7. The parametric form $x(\theta) = (1 - \cos 2\theta)/2C^2 = \sin^2\theta/C^2$, $y(\theta) = (2\theta - \sin 2\theta)/2C^2$
satisfies the equation $(y')^2 = C^2x/(1 - C^2x)$; i.e., $(dy/d\theta)^2 = (dx/d\theta)^2\tan^2\theta$.
From $\dot y/x = K$, we obtain $(dy/d\theta)(d\theta/dt)/x = K$; i.e.,
$d\theta/dt = K/2$ and $\theta = Kt/2$ since, for $t = 0$, $\theta = 0$.
8. The curve is a portion of a cycloid. We have $dy/dx = \tan\theta$ and therefore
$y' \gg 1$ for $\theta \sim \pi/2$. The trajectory starts vertically ($dy/dx = 0$ for $\theta = 0$)
and becomes horizontal if $y(A) \gg x(A)$, as shown in Figure 7.1.
Fig. 7.1. Optimal trajectory from O to A.
9. Since point A is fixed, the velocity VA at A is fixed by energy conserva-

tion. It is the maximum velocity of the skier. Therefore, the time to get
horizontally from y(A) to y(O) is larger than the time (y(A) - y(O))/VA it
would take to cover this distance at the maximum velocity. On the other
hand, one must start vertically in order to acquire the maximum velocity
as quickly as possible. The ideal trajectory comes from an optimization
between these two effects.
2.5 Strategy of a Regatta
1. We have by definition $\dot x = v_x = v\cos\theta$, $\dot z = v_z = v\sin\theta$, and therefore
$z' = dz/dx = \tan\theta$.
2. We have $v_x = v\cos\theta = w/h$. This velocity is maximum when $h(z')$ is
minimum; i.e., for $z' = 1$, namely $\theta = \pi/4$. We then have $v_x = w/2$. In
fact, it is sufficient to multiply $h$ by a constant to be in the appropriate
situation for a given sailboat for which $v_{x,\max} = \lambda w$.
3. We have $dt = dx/v_x = h(z')\,dx/w(z)$, and therefore
$$T = \int_0^L \frac{h(z')}{w(z)}\,dx. \qquad (7.77)$$
4. Setting $\Phi = h(z')/w(z)$, the Lagrange-Euler equation that optimizes the
total time $T$ is
$$\frac{\partial\Phi}{\partial z} = \frac{d}{dx}\left(\frac{\partial\Phi}{\partial z'}\right).$$
5. The function $\Phi$ does not depend explicitly on $x$. Therefore, we have
$$\frac{d\Phi}{dx} = z'\frac{\partial\Phi}{\partial z} + z''\frac{\partial\Phi}{\partial z'}.$$
Consequently,
$$\frac{d}{dx}\left(\Phi - z'\frac{\partial\Phi}{\partial z'}\right) = 0,$$
which gives $(h'(z')z' - h(z'))/w(z) = $ constant.
6. We have $z'h' - h = -2/z'$. We therefore obtain the first-order differential
equation for the function $x(z)$, $(-2/A)\,dx/dz = w(z)$, and hence the result
$$x = L\,\frac{w_0z - w_1z_0\ln(1 + z/z_0)}{w_0z_1 - w_1z_0\ln(1 + z_1/z_0)}, \qquad (7.78)$$
where we have incorporated the conditions $(x = 0, z = 0)$ and $(x = L, z = z_1)$.
7. We obtain
$$z' = \frac{dz}{dx} = \frac{w_0z_1 - w_1z_0\ln(1 + z_1/z_0)}{w_0L - w_1Lz_0/(z + z_0)}.$$
If $z_1 \ll L$ and $z_1 \ll z_0$, the velocity of the wind does not vary appreciably
over the whole path, and one has $z' \sim z_1/L \ll 1$.
In the second question, we have seen that the optimal velocity for a
constant wind velocity is attained for $z' = 1$. The present configuration
certainly does not correspond to the best strategy. One must tack at some
point $(x_1, Z)$ with $0 < x_1 < L$ and $Z \gg z_1$, as represented in Figure 7.2,
in order to benefit fully from the power of the wind (this possibility was
excluded in the text).
Fig. 7.2. Path of the boat with a tacking at x = L/2.

The trajectory drawn with an angle of $\theta = 45$ degrees ($|z'| = 1$) and a
tacking $\theta \to -\theta$ at $x = L/2$ has a total length $L\sqrt{2}$ and a velocity greater
than $(w_0 - w_1)/2$. The time along this path, $T_v = 2L\sqrt{2}/(w_0 - w_1)$,
is obviously shorter than the time along the path with no tacking, $T \sim
2L(z_1/L)/(w_0 - w_1) = 2z_1/(w_0 - w_1)$.
In realistic cases, for instance the America's Cup, one can see how
subtle the regatta problem is. Skippers must make quick decisive choices
between very different options.
3.1 Moving Pendulum
3.2 Properties of the Action

1. Free particle
$$S = \frac{m}{2}\,\frac{(x_2 - x_1)^2}{t_2 - t_1}.$$
2. Harmonic oscillator
3. Constant force
with $v_0 = (x_2 - x_1)/(t_2 - t_1) - (1/2)(F/m)(t_2 - t_1)$.

4. One varies the endpoint of arrival in the integration by parts of
5. One varies $t_2$, taking into account that the variation of the time of arrival
yields a variation of the trajectory.
3.3 Conjugate Momenta in Spherical Coordinates
1. The Lagrangian is $\mathcal{L} = \frac{1}{2}m(\dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2) - V(r)$.

2. The conjugate momenta are
$$p_r = \frac{\partial\mathcal{L}}{\partial\dot r} = m\dot r, \quad p_\theta = \frac{\partial\mathcal{L}}{\partial\dot\theta} = mr^2\dot\theta, \quad p_\phi = \frac{\partial\mathcal{L}}{\partial\dot\phi} = mr^2\sin^2\theta\,\dot\phi.$$
3. Taking the derivative of (3.73) with respect to time, and taking into account
that in Cartesian coordinates $\mathbf{p} = m\mathbf{v}$, one obtains directly the
result $L_z = mr^2\sin^2\theta\,\dot\phi = p_\phi$.
4. The conservation of $p_\phi$, or $L_z$, corresponds to the invariance under translation
in $\phi$; i.e., rotation invariance around the $z$ axis.
5. If a charged particle is in a magnetic field $\mathbf{B}$ parallel to $Oz$, there is
rotational invariance around the $z$ axis and the component $L_z$ is conserved.

4.1 Coupled Oscillators
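1. One obtains directly
$$\{X,P\} = 1, \quad \{X,Q\} = 0, \quad \{Y,P\} = 0, \quad \{Y,Q\} = 1,$$
$$H = \frac{P^2}{2m} + \frac{m\omega^2X^2}{2} + \frac{Q^2}{2m} + \frac{m(\omega^2 + \Omega^2)Y^2}{2}.$$
2. The eigenfrequencies of the system are therefore $\omega_1 = \omega$ and $\omega_2 = \sqrt{\omega^2 + \Omega^2}$.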
3. The general form of the motion follows from
4.2 Three Coupled Oscillators

We obtain with no difficulty
$$H = \frac{1}{2m}(P_1^2 + P_2^2 + P_3^2) + \frac{m\omega^2}{2}(X_1^2 + X_2^2 + X_3^2) + \frac{3m\Omega^2}{2}(X_1^2 + X_2^2).$$
4.3 Forced Oscillations
1. We obtain with no difficulty
$$\{X,P\} = 1.$$
2. In these variables, which are the same as those used by Dirac in the
quantum harmonic oscillator,
$$H = \omega(a^*a).$$
3. We obtain $\{a, a^*\} = -i$.

4. The evolution equation in time of $a$ is
$$\dot a = \{a,H\} = -i\omega a,$$
which is a first-order differential equation. The general solution is
$$a(t) = a_0\exp(-i\omega t),$$
where $a_0$ is a complex constant. The energy of the oscillator is $E = \omega|a_0|^2$.

5. For $t \le 0$, we have $a_0 = 0$. In the presence of $H_{\text{pot}}$, the Hamiltonian
becomes
$$H = \omega(a^*a) + b(a + a^*)\sin\Omega t.$$
Therefore, we have
$$\dot a = \{a,H\} = -i\omega a - ib\sin\Omega t.$$
This is solved by standard techniques. With the condition $E(t < 0) = 0$,
one obtains
$$E(t > T) = \omega b^2\left|\frac{e^{-i(\Omega - \omega)T} - 1}{2i(\Omega - \omega)} + \frac{e^{-i(\Omega + \omega)T} - 1}{2i(\Omega + \omega)}\right|^2.$$
6. This is a resonance phenomenon at $\Omega = \omega$ (or at $\Omega = -\omega$, which is
equivalent). In the vicinity of $\Omega = \omega$, the energy acquired by the oscillator
is of the form
$$E(t > T) = \omega b^2\,\frac{\sin^2((\Omega - \omega)T/2)}{(\Omega - \omega)^2},$$
which has a peak of height $\omega b^2T^2/4$ at $\Omega = \omega$.
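The height of the resonance peak can be checked by integrating the evolution equation $\dot a = -i\omega a - ib\sin\Omega t$ directly. The sketch below uses a fourth-order Runge-Kutta scheme with $a(0) = 0$ and a forcing applied on $0 < t < T$; the parameter values are arbitrary assumptions, and $T$ is chosen so that $\omega T$ is a multiple of $\pi$, in which case the non-resonant term vanishes at $\Omega = \omega$ and the value $\omega b^2T^2/4$ is expected to hold without approximation.

```python
import math

def forced_oscillator_energy(omega, b, Omega, T, steps=20_000):
    """Integrate da/dt = -i*omega*a - i*b*sin(Omega*t) from a(0) = 0 with RK4; return omega*|a(T)|^2."""
    def rhs(t, a):
        return -1j * omega * a - 1j * b * math.sin(Omega * t)

    dt = T / steps
    a = 0j
    for j in range(steps):
        t = j * dt
        k1 = rhs(t, a)
        k2 = rhs(t + 0.5 * dt, a + 0.5 * dt * k1)
        k3 = rhs(t + 0.5 * dt, a + 0.5 * dt * k2)
        k4 = rhs(t + dt, a + dt * k3)
        a += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return omega * abs(a) ** 2

omega, b = 1.0, 0.1          # illustrative values
T = 10.0 * math.pi / omega   # omega * T is a multiple of pi

E_numeric = forced_oscillator_energy(omega, b, omega, T)  # forcing exactly at resonance
E_peak = omega * b ** 2 * T ** 2 / 4.0
print(E_numeric, E_peak)
```

Calling `forced_oscillator_energy` with `Omega` slightly away from `omega` shows the energy falling off on a scale $|\Omega - \omega| \sim 1/T$, as suggested by the $\sin^2$ profile above.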
4.4 Closed Chain of Coupled Oscillators.
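1. a) In the definition, we see that
$$y_k = y_{N-k}^*.$$
b) We have
The summation over $k$ gives $\delta_{nn'}$ and the result
$$\sum_{k=1}^N q_k^*q_k = \sum_{n=1}^N p_n^2. \qquad (7.80)$$
Similarly
$$\sum_{k=1}^N q_k^*q_k = \sum_{k=1}^N\left(\frac{1}{\sqrt N}\sum_{n=1}^N e^{-2ikn\pi/N}p_n\right)\left(\frac{1}{\sqrt N}\sum_{n'=1}^N e^{2ikn'\pi/N}p_{n'}\right). \qquad (7.81)$$
The summation over $k$ gives $\delta_{nn'}$, and hence the result.
c) On the other hand, we have
$$\sum_n (x_n - x_{n+1})^2 = \frac{1}{N}\sum_n\left(\sum_k e^{-2ikn\pi/N}(1 - e^{-2ik\pi/N})\,y_k\right)\left(\sum_{k'} e^{2ik'n\pi/N}(1 - e^{2ik'\pi/N})\,y_{k'}^*\right). \qquad (7.82)$$
The summation over $n$ gives $\delta_{kk'}$ and the result.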
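2. Equations of motion and their solution.
a) We have
with
b) We have
$$\{y_j, q_k\} = \delta_{jk}, \quad \{y_j^*, q_k^*\} = \delta_{jk}, \quad \{y_j, q_{N-k}^*\} = \delta_{jk}, \quad \{y_j^*, q_{N-k}\} = \delta_{jk}. \qquad (7.83)$$
c) We obtain
$$\dot y_k = \{y_k, H\} = \frac{1}{2m}(q_k^* + q_{N-k}) = \frac{q_k^*}{m},$$
$$\dot y_k^* = \{y_k^*, H\} = \frac{1}{2m}(q_k + q_{N-k}^*) = \frac{q_k}{m},$$
$$\dot q_k = \{q_k, H\} = -\frac{m\Omega_k^2}{2}(y_k^* + y_{N-k}) = -m\Omega_k^2\,y_k^*,$$
$$\dot q_k^* = \{q_k^*, H\} = -\frac{m\Omega_k^2}{2}(y_k + y_{N-k}^*) = -m\Omega_k^2\,y_k.$$
d) We therefore have $\{y_k(t)\} = a_k\cos(\Omega_kt + \phi_k)$, and hence $\{x_n(t)\}$.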
3. If, at time $t = 0$, we have $y_N(0) = 1$, $\dot y_N(0) = 0$ and $\{y_n(0) = 0, \dot y_n(0) = 0\}$,
$\forall n \ne N$, then $y_N(t) = \cos(\omega t)$ and $y_n(t) = 0$, $\forall n \ne N$. Therefore
$x_n(t) = (1/\sqrt N)\cos(\omega t)$. Oscillators of the same amplitude at a given
time are always in phase, and only the global motion with respect to the
plane $x = 0$ with frequency $\omega$ appears.
4. Wave propagation.
If $\omega = 0$, the eigenfrequencies are $\Omega_k = 2\Omega\sin(k\pi/N) \sim 2\Omega(k\pi/N)$
for $k \ll N$. The boundary conditions give $y_1 = \cos(2\Omega\pi t/N)$, $y_{N-1} =
\cos(2\Omega\pi t/N)$, and $y_n = 0$ otherwise.
a) Therefore, we obtain
$$x_n = x_{N-n} = \frac{2}{\sqrt N}\cos\left(\frac{2\Omega\pi t}{N}\right)\cos\left(\frac{2n\pi}{N}\right) \qquad (7.84)$$
$$= \frac{1}{\sqrt N}\left[\cos\left(\frac{2\Omega\pi t + 2n\pi}{N}\right) + \cos\left(\frac{2\Omega\pi t - 2n\pi}{N}\right)\right]. \qquad (7.85)$$
b) We observe a propagation phenomenon in both directions since
in the notation above. The point $x_{n+m}$ has the same amplitude at
time $t + m/\Omega$ as the point $x_n$ at time $t$.
c) If we write $x_n(t) = f(t, y = na)$, the function $f$ is
$$f(t, y) = \frac{1}{\sqrt N}\left[\cos\left(\frac{2\Omega\pi t + 2y\pi/a}{N}\right) + \cos\left(\frac{2\Omega\pi t - 2y\pi/a}{N}\right)\right]$$
and satisfies the wave equation
$$\frac{1}{\Omega^2a^2}\frac{\partial^2f}{\partial t^2} - \frac{\partial^2f}{\partial y^2} = 0.$$
In this chain of coupled oscillators, a progressive wave of velocity $\Omega a$
propagates.
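The dispersion relation and the wave velocity can be verified with a few lines of code. The sketch below assumes, as an illustration, a closed chain whose equations of motion read $\ddot x_n = -\omega^2x_n - \Omega^2(2x_n - x_{n+1} - x_{n-1})$ (nearest-neighbor coupling with cyclic indices), and checks that the plane waves $e^{-2ikn\pi/N}/\sqrt N$ are normal modes with $\Omega_k^2 = \omega^2 + 4\Omega^2\sin^2(k\pi/N)$. The function names and parameter values are hypothetical.

```python
import cmath
import math

def apply_chain(x, omega, Omega):
    """Return the vector with entries omega^2 x_n + Omega^2 (2 x_n - x_{n+1} - x_{n-1}), cyclic in n."""
    N = len(x)
    return [omega ** 2 * x[n] + Omega ** 2 * (2 * x[n] - x[(n + 1) % N] - x[(n - 1) % N])
            for n in range(N)]

def normal_mode(k, N):
    """Plane-wave mode with components exp(-2 i pi k n / N) / sqrt(N)."""
    return [cmath.exp(-2j * math.pi * k * n / N) / math.sqrt(N) for n in range(N)]

def mode_frequency_squared(k, N, omega, Omega):
    """Expected squared eigenfrequency omega^2 + 4 Omega^2 sin^2(k pi / N)."""
    return omega ** 2 + 4 * Omega ** 2 * math.sin(math.pi * k / N) ** 2

N, omega, Omega = 12, 0.5, 1.3  # illustrative values
max_residual = 0.0
for k in range(1, N + 1):
    mode = normal_mode(k, N)
    image = apply_chain(mode, omega, Omega)
    lam = mode_frequency_squared(k, N, omega, Omega)
    # Residual of the eigenvalue equation for this mode
    max_residual = max(max_residual, max(abs(image[n] - lam * mode[n]) for n in range(N)))

# Long-wavelength limit with omega = 0: phase velocity Omega_k / q, with q = 2 pi k / (N a)
N_big, a, k = 1000, 1.0, 1
Omega_k = math.sqrt(mode_frequency_squared(k, N_big, 0.0, Omega))
phase_velocity = Omega_k / (2 * math.pi * k / (N_big * a))
print(max_residual, phase_velocity, Omega * a)
```

The residual is at the level of rounding errors, and for $k \ll N$ the phase velocity approaches $\Omega a$, in agreement with the wave equation above.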
4.5 Virial Theorem
1. One obtains
$$\{A, H\} = \frac{p^2}{m} - \mathbf{r}\cdot\nabla V.$$
The time evolution of $A$ is simply
$$\frac{dA}{dt} = \{A, H\} = \frac{p^2}{m} - \mathbf{r}\cdot\nabla V.$$
2. We have $\langle\dot A\rangle = (A(T) - A(0))/T = 0$. Therefore, inserting this in the
result above, we obtain
$$2\left\langle\frac{p^2}{2m}\right\rangle = \langle\mathbf{r}\cdot\nabla V\rangle.$$
3. If $V = gr^n$, we have
$$\mathbf{r}\cdot\nabla V = r\frac{\partial V}{\partial r} = nV.$$
We therefore obtain $2\langle E_c\rangle = n\langle V\rangle$.

4. The total energy is $E = E_c + V$. We therefore obtain
a) For a harmonic oscillator, $E = 2\langle E_c\rangle = 2\langle V\rangle$.
b) For a Newtonian potential, $E = -\langle E_c\rangle = (1/2)\langle V\rangle$, which is obvious
on a circular trajectory, but holds for any elliptic trajectory.
5. In general, for an arbitrary potential, the orbits of bound states are not
closed. However, they remain confined in a given region of space at any
time. The generalization of the averaging (4.107) is
$$\langle f\rangle = \lim_{T\to\infty}\frac{1}{T}\int_0^T f(t)\,dt.$$
With this definition, we have
$$\langle\dot A\rangle = \lim_{T\to\infty}\frac{A(T) - A(0)}{T} = 0$$
since $A(t)$ is bounded for any $t$. With this definition, the result remains
true.
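The relation $2\langle E_c\rangle = n\langle V\rangle$ lends itself to a simple numerical check with long-time averages. The sketch below is a one-dimensional illustration with $V = g|x|^n$, integrated with a leapfrog scheme; the integrator, the parameter values, and the function name are assumptions of this example, and the agreement is only expected up to a term of order $(A(T) - A(0))/T$.

```python
def virial_averages(n, g=1.0, m=1.0, x0=1.0, dt=2e-3, steps=50_000):
    """Leapfrog integration of a 1D particle in V = g |x|^n, released at rest at x0.

    Returns the time averages (<Ec>, <V>) over the simulated interval.
    """
    def force(x):
        # -dV/dx for V = g |x|^n
        sign = 1.0 if x >= 0 else -1.0
        return -g * n * abs(x) ** (n - 1) * sign

    x, p = x0, 0.0
    sum_kin = 0.0
    sum_pot = 0.0
    for _ in range(steps):
        p += 0.5 * dt * force(x)   # half kick
        x += dt * p / m            # drift
        p += 0.5 * dt * force(x)   # half kick
        sum_kin += p * p / (2.0 * m)
        sum_pot += g * abs(x) ** n
    return sum_kin / steps, sum_pot / steps

ec_quartic, v_quartic = virial_averages(n=4)
ec_harmonic, v_harmonic = virial_averages(n=2)
print(2 * ec_quartic, 4 * v_quartic)
print(2 * ec_harmonic, 2 * v_harmonic)
```

For the quartic potential the kinetic energy carries two thirds of the total energy on average, and for the harmonic case the two averages are equal, as in item 4 a).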
4.6
$\{L_x, L_y\} = L_z$
4.7 We obtain
and cyclic permutations.
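Poisson brackets of this kind are easy to check numerically. The sketch below evaluates $\{f, g\} = \sum_i (\partial f/\partial q_i\,\partial g/\partial p_i - \partial f/\partial p_i\,\partial g/\partial q_i)$ by central finite differences at an arbitrary phase-space point and compares $\{L_x, L_y\}$ with $L_z$, together with the cyclic permutations. The sign convention is the one for which $\{X, P\} = 1$; the helper names and the sample point are assumptions of this example.

```python
def partial(fun, q, p, which, i, h=1e-5):
    """Central finite difference of fun(q, p) with respect to q[i] or p[i]."""
    q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
    if which == "q":
        q_plus[i] += h
        q_minus[i] -= h
    else:
        p_plus[i] += h
        p_minus[i] -= h
    return (fun(q_plus, p_plus) - fun(q_minus, p_minus)) / (2.0 * h)

def poisson_bracket(f, g, q, p):
    """Numerical Poisson bracket {f, g} at the phase-space point (q, p)."""
    total = 0.0
    for i in range(len(q)):
        total += (partial(f, q, p, "q", i) * partial(g, q, p, "p", i)
                  - partial(f, q, p, "p", i) * partial(g, q, p, "q", i))
    return total

# Components of the angular momentum L = r x p
def Lx(q, p): return q[1] * p[2] - q[2] * p[1]
def Ly(q, p): return q[2] * p[0] - q[0] * p[2]
def Lz(q, p): return q[0] * p[1] - q[1] * p[0]

q0 = [0.3, -1.2, 0.7]   # arbitrary position
p0 = [1.1, 0.4, -0.5]   # arbitrary momentum

print(poisson_bracket(Lx, Ly, q0, p0), Lz(q0, p0))
print(poisson_bracket(Ly, Lz, q0, p0), Lx(q0, p0))
print(poisson_bracket(Lz, Lx, q0, p0), Ly(q0, p0))
```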

5.1 Telegraph Equation
The Lagrangian density is
(7.86)
where $\psi^*$ is the "mirror" density which concentrates instead of diffusing. This
leads to the propagation equation
$$\frac{1}{v^2}\frac{\partial^2\psi}{\partial t^2} - \Delta\psi + a^2\frac{\partial\psi}{\partial t} = 0. \qquad (7.87)$$
This equation can be solved by Fourier transformation if the coefficients $v$
and $a$ are constants. (This is not the case if the medium is inhomogeneous or
discontinuous.)
Problems of Chapter 6
6.2 Geodesics
Solutions exist only for $\rho \ge R$ (which is explained by equation (6.136)).
The energy is
(7.88)
The calculation is similar to previous cases such as (2). We define the
parameters $\omega$ and $\gamma$ as before:
$$\omega^2 = \frac{2E}{mR^2}, \qquad (7.89)$$
We obtain
(7.90)
and
$$\tanh(\phi(t) - \phi_0) = \gamma\tanh\omega(t - t_0). \qquad (7.91)$$
7.1 Propagator of a Harmonic Oscillator

The classical action for a harmonic one-dimensional oscillator is
$$S_{cl} = \frac{m\omega}{2\sin\omega T}\left[(x_a^2 + x_b^2)\cos\omega T - 2x_ax_b\right].$$
The calculation of the propagator involves only Gaussian integrals, and the
result follows directly. One recovers (7.61).
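The closed form of the classical action that appears in the phase of (7.67) can be checked by integrating the Lagrangian along the classical path. The sketch below is an illustration under assumed parameter values: it takes the path $x(t) = [x_a\sin\omega(T - t) + x_b\sin\omega t]/\sin\omega T$ on $0 \le t \le T$, integrates $L = \frac{1}{2}m\dot x^2 - \frac{1}{2}m\omega^2x^2$ with a midpoint rule, and compares the result with the closed form.

```python
import math

def classical_action_closed_form(m, w, xa, xb, T):
    """S_cl = m w / (2 sin wT) * [(xa^2 + xb^2) cos wT - 2 xa xb]."""
    return m * w / (2.0 * math.sin(w * T)) * ((xa ** 2 + xb ** 2) * math.cos(w * T) - 2.0 * xa * xb)

def classical_action_numeric(m, w, xa, xb, T, n=20_000):
    """Midpoint-rule integral of L = m v^2 / 2 - m w^2 x^2 / 2 along the classical path."""
    s = math.sin(w * T)
    dt = T / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * dt
        x = (xa * math.sin(w * (T - t)) + xb * math.sin(w * t)) / s
        v = w * (-xa * math.cos(w * (T - t)) + xb * math.cos(w * t)) / s
        total += (0.5 * m * v * v - 0.5 * m * w * w * x * x) * dt
    return total

m, w, xa, xb, T = 1.0, 2.0, 0.3, -0.5, 0.7  # illustrative values with w*T < pi

S_closed = classical_action_closed_form(m, w, xa, xb, T)
S_numeric = classical_action_numeric(m, w, xa, xb, T)

# Momentum at arrival from p = dS/dx_b, compared with m * v(T) on the classical path
h = 1e-6
p_from_action = (classical_action_closed_form(m, w, xa, xb + h, T)
                 - classical_action_closed_form(m, w, xa, xb - h, T)) / (2.0 * h)
p_on_path = m * w * (-xa + xb * math.cos(w * T)) / math.sin(w * T)
print(S_closed, S_numeric, p_from_action, p_on_path)
```

Since the Lagrangian is quadratic, the phase of the propagator is this action divided by $\hbar$, with the slowly varying prefactor $F(T)$ as the only other ingredient.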
References
1. L. Landau and E. Lifshitz, The Classical Theory of Fields, Pergamon Press, Oxford (1965).
2. Arthur Koestler, The Act of Creation, Hutchinson & Co., London (1964).
3. R.P. Feynman, R.B. Leighton, and M. Sands, The Feynman Lectures on Physics, Addison-Wesley, Reading MA (1964).
4. Wolfgang Yourgrau and Stanley Mandelstam, Variational Principles in Dynamics and Quantum Theory, Dover Publications, New York (1979).
5. Izrail Moiseevich Gelfand and Sergei Vasilevich Fomin, Calculus of Variations, Rev. English ed., Prentice-Hall, Englewood Cliffs, NJ (1963). Andrew Russell Forsyth, Calculus of Variations, Dover, New York (1960). Jean-Pierre Bourguignon, Calcul Variationnel, Ecole Polytechnique, Palaiseau (1990).
6. Erwin Schrödinger, Statistical Thermodynamics, Dover Publications, New York (1989).
7. J.-L. Basdevant and Jean Dalibard, Quantum Mechanics, Springer-Verlag, Heidelberg (2005).
8. L. Landau and E. Lifshitz, Mechanics, Pergamon Press, Oxford (1965).
9. Herbert Goldstein, Charles Poole, and John Safko, Classical Mechanics, Addison-Wesley, Boston (2002).
10. Philip M. Morse and Herman Feshbach, Methods of Theoretical Physics, McGraw-Hill, New York (1953).
11. Ian Percival and Derek Richards, Introduction to Dynamics, Cambridge University Press, Cambridge (1982).
12. Max Born and Emil Wolf, Principles of Optics, Pergamon Press, Oxford (1964).
13. Albert Messiah, Quantum Mechanics, North-Holland, Amsterdam (1962).
14. J.-L. Basdevant, J. Rich, and M. Spiro, Fundamentals in Nuclear Physics, Springer, New York (2005).
15. Hans Stephani, General Relativity, Cambridge University Press, Cambridge (1982).
16. Steven Weinberg, Gravitation and Cosmology, John Wiley & Sons, New York (1972).
17. P. A. M. Dirac, General Theory of Relativity, John Wiley & Sons, New York (1975).
18. Charles W. Misner, Kip S. Thorne, and John Archibald Wheeler, Gravitation, W.H. Freeman and Company, New York (1973).
19. James Rich, Fundamentals of Cosmology, Springer-Verlag, Heidelberg (2001).
20. R.P. Feynman and A.R. Hibbs, Quantum Mechanics and Path Integrals, McGraw-Hill, New York (1965).
21. Lawrence S. Schulman, Techniques and Applications of Path Integration, John Wiley & Sons, New York (1981).
22. Julian Schwinger, Selected Papers on Quantum Electrodynamics, Dover, New York (1958).