1. Introduction
Example 1.1 (Shortest Path Problem). Let A and B be two fixed points in a plane. We want to find the shortest path between these two points. We can set the problem up diagrammatically as below.

[Figure: a curve Y = Y(x) joining A at x = a to B at x = b; an arc element ds has components dx and dY.]

(1.1)  ds² = dx² + dY² = (1 + (Y′)²) dx².

The second equality is achieved by noting Y′ = dY/dx. Now, to find the length of a path between the points A and B we integrate ds from A to B, i.e. ∫_A^B ds. We replace ds using equation (1.1) above and hence get the expression for the length of our curve

J(Y) = ∫_a^b √(1 + (Y′)²) dx.

To find the shortest path, i.e. to minimise J, we need to find the extremal function.
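The length functional J(Y) is easy to evaluate numerically. As a quick sketch (the grid size and the perturbed test curve below are illustrative choices, not from the notes):

```python
import numpy as np

def arc_length(Y, a=0.0, b=1.0, n=20001):
    # J(Y) = integral over [a, b] of sqrt(1 + (Y')^2) dx, by finite differences
    x = np.linspace(a, b, n)
    dy = np.diff(Y(x)) / np.diff(x)
    return np.sum(np.sqrt(1.0 + dy**2) * np.diff(x))

straight = arc_length(lambda x: x)                       # chord from (0,0) to (1,1)
bent = arc_length(lambda x: x + 0.2*np.sin(np.pi*x))     # same endpoints, perturbed

print(straight, bent)  # straight ~ sqrt(2); the perturbed curve is longer
```

Any curve with the same endpoints gives a larger value, which is exactly the minimisation problem the chapter goes on to solve.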
Example 1.2 (Minimal Surface of Revolution - Euler). This problem is very similar to the above, but instead of the shortest distance we are trying to find the surface of revolution of smallest area. We again display the problem diagrammatically.
2 PROF. ARNOLD ARTHURS
[Figure: a curve Y = Y(x) joining A at x = a to B at x = b, rotated about the x-axis; an arc element ds sits at height Y.]

To find the surface area of our shape we integrate 2πY ds between the points A and B, i.e. ∫_A^B 2πY ds. Substituting for ds as above with equation (1.1), we obtain the expression for the surface area

J(Y) = ∫_a^b 2πY √(1 + (Y′)²) dx.

To find the minimal surface we need to find the extremal function.
Example 1.3 (Brachistochrone). This problem comes from physics. If we release a bead from O and let it slide down a frictionless curve to a point B, accelerated only by gravity, what shape of curve allows the bead to complete the journey in the shortest possible time? We can set this up diagrammatically below.

[Figure: axes with origin O; a curve Y = Y(x) descends to B, with arc element ds, speed v and gravity g acting downwards.]

In this problem we want to minimise time. So we construct our integral accordingly and consider the total time taken T as a functional of the curve Y.
CALCULUS OF VARIATIONS 3
T(Y) = ∫_{x=0}^{b} dt;

now using v = ds/dt and rearranging we obtain

T(Y) = ∫_{x=0}^{b} ds/v.
Example 1.4. As a warm-up from ordinary calculus, consider a rectangle with sides x and y: we seek the rectangle of greatest area A for a given perimeter p, where

(1.2)  A = xy,
(1.3)  p = 2x + 2y.

By rearranging equation (1.3) in terms of y and substituting into (1.2) we obtain

A = x(p − 2x)/2,
dA/dx = p/2 − 2x.

We require that dA/dx = 0 and thus

p/2 − 2x = 0  ⇒  x = p/4.
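The same stationary point can be confirmed by brute force; a minimal sketch (the perimeter value is an arbitrary illustrative choice):

```python
import numpy as np

p = 10.0                                 # fixed perimeter (arbitrary value)
x = np.linspace(0.01, p/2 - 0.01, 100001)
A = x * (p - 2*x) / 2                    # area with y eliminated via p = 2x + 2y
x_best = x[np.argmax(A)]

print(x_best)  # maximiser sits at x = p/4, so y = p/4: a square
```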
Example 1.5 (Chord and Arc Problem). Here we are seeking the curve of a given length that encloses the largest area above the x-axis. So, we seek the curve Y = y (y is reserved for the solution and Y is used for the general case). We describe this diagrammatically below.

[Figure: a curve Y = Y(x) from the origin to (b, 0), enclosing an area above the x-axis.]

We maximise the area

J(Y) = ∫_0^b Y dx

subject to the length constraint

K(Y) = ∫_0^b √(1 + (Y′)²) dx = c.
[Figure: a family of curves Y = Y(x) joining A(a, yₐ) and B(b, y_b).]
Suppose A(a, ya ) and B(b, yb ) are two fixed points and consider a set of curves
(2.1) Y = Y (x)
joining A and B. Then we seek a member Y = y(x) of this set which minimises the integral
(2.2)  J(Y) = ∫_a^b F(x, Y, Y′) dx
where Y (a) = ya , Y (b) = yb . We note that examples 1.1 to 1.3 correspond to a specification
of the above general case with the integrand
F = √(1 + (Y′)²),
F = 2πY √(1 + (Y′)²),
F = √{ (1 + (Y′)²) / (2gY) }.

An extra example is

J(Y) = ∫_a^b { ½ p(x)(Y′)² + ½ q(x)Y² + f(x)Y } dx.
Now, the curves Y in (2.2) may be continuous, differentiable, or neither, and this affects the problem for J(Y). We shall suppose that the functions Y = Y(x) are continuous and have continuous derivatives a suitable number of times. Thus the functions (2.1) belong to a set 𝒜 of admissible functions, which we define precisely as
(2.3)  𝒜 = { Y : Y continuous and dᵏY/dxᵏ continuous, k = 1, 2, …, n }.
Example 2.1. We can extend the problem by considering more derivatives of Y . So the
integrand becomes
F = F(x, Y, Y′, Y″),

i.e. F depends on Y″ as well as x, Y, Y′.
Example 2.2. We can consider more than one function of x. Then the integrand becomes
F = F(x, Y₁, Y₂, Y₁′, Y₂′),
so F depends on two (or more) functions Yk of x.
Example 2.3. Finally we can consider functions of more than one independent variable.
So, the integrand would become
F = F(x, y, φ, φₓ, φ_y),

where subscripts denote partial derivatives. So F depends on functions φ(x, y) of two independent variables x, y. This would mean that to calculate J(φ) we would have to integrate more than once, for example

J(φ) = ∬ F dx dy.
Definition: Let ℝ be the real numbers and 𝒜 a set of functions. Then a function J : 𝒜 → ℝ is called a functional.
Then we can say that the calculus of variations is concerned with maxima and minima
(extremum) of functionals.
[Figure: a function f(u) with a minimum at u = a.]

(i) If f(u) > f(a) for all u near a, on both sides of u = a, then there is a minimum at u = a. The consequences of this are often seen in an expansion. Let us assume that there is a minimum at u = a and that a Taylor expansion exists about u = a, such that

(3.1)  f(a + h) = f(a) + h f′(a) + ½h² f″(a) + O(h³)  (h ≠ 0).

Note that we define df(a, h) := h f′(a) to be the first differential. As there is a minimum at u = a, we must have f′(a) = 0, i.e. df = 0.
(ii) For a function of two variables, f = f(u, v), the condition f(u, v) ≥ f(a, b) for all (u, v) near (a, b) gives a minimum at (a, b) (and f(u, v) ≤ f(a, b) a maximum). The corresponding Taylor expansion is

(3.5)  f(a + h, b + k) = f(a, b) + h f_u(a, b) + k f_v(a, b) + O₂.

We note that in this case the first differential is df(a, b, h, k) := h f_u(a, b) + k f_v(a, b). For a minimum (or a maximum) at (a, b) it follows, as in the previous case, that a necessary condition is

(3.6)  df = 0  ⇒  ∂f/∂u = 0 and ∂f/∂v = 0 at (a, b).
(iii) Now considering functions of multiple (say n) variables, i.e. f = f(u₁, u₂, …, uₙ) = f(u), we have the Taylor expansion

(3.7)  f(a + h) = f(a) + h·∇f(a) + O₂,

and the corresponding necessary condition

(3.8)  df = 0  ⇒  ∇f(a) = 0.
We now apply these ideas to the functional

(3.9)  J(Y) = ∫_a^b F(x, Y, Y′) dx.
Write

(3.11)  Y = y + εη,  i.e.  Y(x) = y(x) + εη(x),

where y is the extremal sought, ε is a small parameter and η is an admissible variation with η(a) = η(b) = 0.

[Figure: the extremal Y = y(x) joining A(a, yₐ) and B(b, y_b), together with a varied curve Y = y(x) + εη(x).]

Then

(3.14)  J(y + εη) = ∫_a^b F(x, y + εη, y′ + εη′) dx.
Now to deal with this expansion we take a fixed x in (a, b) and treat y and y′ as independent variables (recall that Y and Y′ are treated as independent), and recall the two-variable Taylor expansion (3.5) from above:

(3.15)  f(u + h, v + k) = f(u, v) + h ∂f/∂u + k ∂f/∂v + O₂.

Now take u = y, h = εη, v = y′, k = εη′, f = F. Then (3.14) implies

J(Y) = J(y + εη) = ∫_a^b { F(x, y, y′) + εη ∂F/∂y + εη′ ∂F/∂y′ + O(ε²) } dx
     = J(y) + δJ + O₂.
We note that δJ is calculus-of-variations notation for dJ, where we have
(3.16)  δJ = ε ∫_a^b { η ∂F/∂y + η′ ∂F/∂y′ } dx
          = linear terms in ε = first variation of J,

and we also have that

(3.17)  ∂F/∂y = ∂F(x, Y, Y′)/∂Y |_{Y=y, Y′=y′},  and similarly for ∂F/∂y′.
Now δJ is analogous to the linear terms in (3.1), (3.5), (3.7). We now prove that if J(Y) has a minimum at Y = y, then

(3.18)  δJ = 0,

the first necessary condition for a minimum.

Proof. Suppose δJ ≠ 0. Then J(Y) − J(y) = δJ + O₂, and for ε small enough the sign of J(Y) − J(y) is that of δJ. Replacing η by −η changes the sign of δJ, so J(Y) − J(y) can be made negative, contradicting the minimum at y. Hence δJ = 0. ∎

Integrating the η′ term in (3.16) by parts, using η(a) = η(b) = 0, gives

(3.21)  δJ = ε ∫_a^b η { ∂F/∂y − d/dx ∂F/∂y′ } dx.
Note. For notational purposes we write
(3.22)  ⟨f, g⟩ = ∫_a^b f(x) g(x) dx,
which is an inner (or scalar) product. Also, for our J, write
(3.23)  J′(y) = ∂F/∂y − d/dx ∂F/∂y′

as the derivative of J. Then we can express δJ as

(3.24)  δJ = ε⟨η, J′(y)⟩.

This gives us the Taylor expansion

(3.25)  J(y + εη) = J(y) + ε⟨η, J′(y)⟩ + O₂,

in which the middle term is δJ.
We compare this with the previous cases of Taylor expansion (3.1), (3.5) and (3.7). Now
collecting our results together we get
Theorem 3.1. A necessary condition for J(Y ) to have an extremum (maximum or mini-
mum) at Y = y is
(3.26)  δJ = ε⟨η, J′(y)⟩ = 0

for all admissible η.
4. Stationary Condition (δJ = 0)
Our next step is to see what can be deduced from the condition
(4.1)  δJ = 0.
In detail this is
(4.2)  δJ = ε⟨η, J′(y)⟩ = ε ∫_a^b η J′(y) dx = 0.
For our case J(Y) = ∫_a^b F(x, Y, Y′) dx, we have
(4.3)  J′(y) = ∂F/∂y − d/dx ∂F/∂y′.
To deal with equations (4.2) and (4.3) we use the Euler-Lagrange Lemma. Using this Lemma we can state
Theorem 4.1. A necessary condition for

(4.4)  J(Y) = ∫_a^b F(x, Y, Y′) dx

to have an extremum at Y = y is

(4.5)  J′(y) = ∂F/∂y − d/dx ∂F/∂y′ = 0

with a < x < b and F = F(x, y, y′). This is known as the Euler-Lagrange equation.
The above theorem is the Euler-Lagrange variational principle and it is a stationary prin-
ciple.
Example 4.1 (Shortest Path Problem). We now revisit Example 1.1 with the machinery that we have just set up. We had F(x, Y, Y′) = √(1 + (Y′)²); on the extremal we write F(x, y, y′) = √(1 + (y′)²). Now

∂F/∂y = 0,  ∂F/∂y′ = 2y′ / { 2√(1 + (y′)²) } = y′/√(1 + (y′)²).

The Euler-Lagrange equation for this example is

0 − d/dx { y′/√(1 + (y′)²) } = 0,  a < x < b
⇒  y′/√(1 + (y′)²) = const.
⇒  y′ = α
⇒  y = αx + β,

where α, β are arbitrary constants. So, y = αx + β defines the critical curves. We require more information to establish a minimum.
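That the straight line minimises the length can be illustrated by restricting J to a one-parameter family of curves; a small sketch (the sine-bump family is our own choice, not from the notes):

```python
import numpy as np

def length(t, n=4001):
    # J for the family Y_t(x) = x + t*sin(pi x), all joining (0,0) and (1,1)
    x = np.linspace(0.0, 1.0, n)
    dy = np.diff(x + t*np.sin(np.pi*x)) / np.diff(x)
    return np.sum(np.sqrt(1.0 + dy**2) * np.diff(x))

ts = np.linspace(-0.5, 0.5, 201)
J = np.array([length(t) for t in ts])
t_best = ts[np.argmin(J)]

print(t_best)  # minimum at t = 0: the straight line y = x
```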
Example 4.2 (Minimum Surface Problem). Now revisiting Example 1.2, we had F(x, Y, Y′) = 2πY√(1 + (Y′)²), but again we write F(x, y, y′) = 2πy√(1 + (y′)²). We drop the factor 2π to give us
∂F/∂y = √(1 + (y′)²),  ∂F/∂y′ = y y′ / √(1 + (y′)²).

So, we are left with the Euler-Lagrange equation

√(1 + (y′)²) − d/dx { y y′ / √(1 + (y′)²) } = 0.

Carrying out the differentiation and multiplying through by (1 + (y′)²)^{3/2}, this reduces to

(1 + (y′)²) { 1 + (y′)² − y y″ } = 0,

and since 1 + (y′)² ≠ 0 for finite y′ in (a, b), we therefore have

(4.6)  1 + (y′)² = y y″.

To solve this, regard y′ as a function of y:

y′ = dy/dx,
(4.7)  y″ = dy′/dx = (dy′/dy)(dy/dx) = y′ dy′/dy = ½ d(y′)²/dy.

So, substituting (4.7) into (4.6), the Euler-Lagrange equation implies that

1 + (y′)² = ½ y d(y′)²/dy.

Separating variables, with z = (y′)²,

dy/y = ½ dz/(1 + z),
ln y = ½ ln{1 + (y′)²} + C,
y = C√(1 + (y′)²),

which is a first integral. Now integrating again gives us

y = C cosh{ (x + C₁)/C },

where C₁ and C are arbitrary constants. We note that this is a catenary curve.
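The catenary can be checked against (4.6) directly; a small numerical sketch (the constants C and C₁ are arbitrary choices):

```python
import numpy as np

# y = C*cosh((x + C1)/C) should satisfy the Euler-Lagrange equation (4.6):
# 1 + (y')^2 = y*y''. Here y' = sinh(u), y'' = cosh(u)/C with u = (x + C1)/C.
C, C1 = 1.7, -0.3
x = np.linspace(-1.0, 1.0, 11)
u = (x + C1) / C
y, yp, ypp = C*np.cosh(u), np.sinh(u), np.cosh(u)/C

residual = np.max(np.abs(1 + yp**2 - y*ypp))
print(residual)  # ~0 (machine precision)
```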
(1) Suppose ∂F/∂y = 0. The Euler-Lagrange equation

∂F/∂y − d/dx ∂F/∂y′ = 0

reduces to

d/dx ∂F/∂y′ = 0,

so that

∂F/∂y′ = c

on orbits (extremals or critical curves): a first integral. If this can be solved for y′, say

y′ = f(x, c),

then y(x) = ∫ f(x, c) dx + c₁.
(2) Suppose ∂F/∂x = 0. In general,

dF/dx = ∂F/∂x + y′ ∂F/∂y + y″ ∂F/∂y′,

and on orbits, using the Euler-Lagrange equation ∂F/∂y = (d/dx) ∂F/∂y′,

dF/dx = ∂F/∂x + y′ (d/dx) ∂F/∂y′ + y″ ∂F/∂y′
      = ∂F/∂x + (d/dx){ y′ ∂F/∂y′ }.

Hence

(d/dx){ F − y′ ∂F/∂y′ } = 0 on orbits,

so that

F − y′ F_{y′} = c,

a first integral. For example, with F = 2πy√(1 + (y′)²) we have ∂F/∂x = 0 and

F_{y′} = 2πy y′ / √(1 + (y′)²);

then the first integral is

F − y′F_{y′} = 2π{ y√(1 + (y′)²) − y(y′)²/√(1 + (y′)²) }
            = 2πy/√(1 + (y′)²)
            = const. = 2πc,

i.e. y = c√(1 + (y′)²): a first integral. This arose in Example 4.2 as a result of integrating the Euler-Lagrange equation once.
(3) ∂F/∂y′ = 0, and hence F = F(x, y), i.e. y′ is missing. The Euler-Lagrange equation is

0 = ∂F/∂y − d/dx ∂F/∂y′ = ∂F/∂y = 0,

but this isn't a differential equation in y: it determines y directly. For instance, an F of this type might give

∂F/∂y = 0  ⇒  ln y − (1 + x) = 0  ⇒  ln y = 1 + x  ⇒  y = e^{1+x}.
6. Change of Variables
In the shortest path problem above, we note that J(Y) does not depend on the coordinate system chosen. Suppose we require the extremal for

(6.1)  J(r) = ∫_{θ₀}^{θ₁} √(r² + (r′)²) dθ,

where r = r(θ) and r′ = dr/dθ. The Euler-Lagrange equation is

(6.2)  ∂F/∂r − d/dθ ∂F/∂r′ = 0  ⇒  r/√(r² + (r′)²) − d/dθ { r′/√(r² + (r′)²) } = 0.

To simplify this we can

(a) change variables in (6.2), or
(b) change variables in (6.1).

For example, in (6.1) take x = r cos θ, y = r sin θ. Then

dx² + dy² = dr² + r² dθ²,
(1 + (y′)²) dx² = { (dr/dθ)² + r² } dθ²,
√(1 + (y′)²) dx = √(r² + (r′)²) dθ.

This is just the shortest path problem, hidden in polar coordinates. We know the solution to this is a straight line, with

J(r) = J(y) = ∫ √(1 + (y′)²) dx,

so the extremals y = αx + β become

r sin θ = α r cos θ + β  ⇒  r = β/(sin θ − α cos θ).
(7.1)  J(Y₁, Y₂, …, Yₙ) = ∫_a^b F(x, Y₁, …, Yₙ, Y₁′, …, Yₙ′) dx.

With variations Yₖ = yₖ + εηₖ, the first variation is

(7.4)  δJ = ε Σₖ₌₁ⁿ ⟨ηₖ, Jₖ′(y₁, …, yₙ)⟩

with

(7.5)  Jₖ′(y₁, …, yₙ) = ∂F/∂yₖ − d/dx ∂F/∂yₖ′

for k = 1, …, n. The stationary condition δJ = 0 gives

(7.6)  Σₖ₌₁ⁿ ⟨ηₖ, Jₖ′⟩ = 0,

(7.7)  ⟨ηₖ, Jₖ′⟩ = 0 by linear independence,

for all k = 1, …, n. Thus, by the Euler-Lagrange Lemma, this implies

(7.8)  Jₖ′ = 0

for k = 1, …, n, i.e.

(7.9)  ∂F/∂yₖ − d/dx ∂F/∂yₖ′ = 0

for k = 1, …, n. (7.9) is a system of n Euler-Lagrange equations. These are solved subject to (7.3).
In general,

dF/dx = ∂F/∂x + Σₖ₌₁ⁿ { yₖ′ ∂F/∂yₖ + yₖ″ ∂F/∂yₖ′ }

on orbits, and using the Euler-Lagrange equations (7.9),

dF/dx = ∂F/∂x + Σₖ₌₁ⁿ d/dx { yₖ′ ∂F/∂yₖ′ }.

Defining G = F − Σₖ yₖ′ ∂F/∂yₖ′, it follows that

dG/dx = ∂F/∂x.

If ∂F/∂x = 0, i.e. F does not depend explicitly on x, we have

dG/dx = 0

on orbits, i.e.

F − Σₖ₌₁ⁿ yₖ′ ∂F/∂yₖ′ = c

on orbits, where c is a constant. This is a first integral, i.e. G is constant on orbits. This is the Jacobi integral.
8. Double Integrals
Here we look at functionals of functions φ(x, y) of two variables, i.e. surfaces,

J(φ) = ∬_R F(x, y, φ, φₓ, φ_y) dx dy,

with φ prescribed on the boundary ∂R, and the stationary condition

(8.5)  δJ = 0.

To use this we rewrite (8.4):

(8.6)  δJ = ε ∬_R { η ∂F/∂φ + ηₓ ∂F/∂φₓ + η_y ∂F/∂φ_y } dx dy.

Now by Green's Theorem we have

(8.7)  ∬_R ∂P/∂x dx dy = ∮_{∂R} P dy  with P = η ∂F/∂φₓ,
       ∬_R ∂Q/∂y dx dy = −∮_{∂R} Q dx  with Q = η ∂F/∂φ_y.
So,
(8.8)  δJ = ε ∬_R η { ∂F/∂φ − ∂/∂x ∂F/∂φₓ − ∂/∂y ∂F/∂φ_y } dx dy + ε ∮_{∂R} η { ∂F/∂φₓ dy − ∂F/∂φ_y dx }.

If we choose all functions, including the varied φ + εη, to satisfy

φ = B on ∂R,

then φ + εη = B on ∂R, so that

η = 0 on ∂R.

Hence (8.8) simplifies to

δJ = ε ∬_R η(x, y) { ∂F/∂φ − ∂/∂x ∂F/∂φₓ − ∂/∂y ∂F/∂φ_y } dx dy
   = ε⟨η, J′(φ)⟩.

By a simple extension of the Euler-Lagrange lemma, since η is arbitrary in R, we have δJ = 0 ⇒ φ(x, y) is a solution of

(8.9)  J′(φ) = ∂F/∂φ − ∂/∂x ∂F/∂φₓ − ∂/∂y ∂F/∂φ_y = 0.

This is the Euler-Lagrange equation, now a partial differential equation. We seek the solution which takes the given values B on ∂R.
Example 8.1. F = F(x, y, φ, φₓ, φ_y) = ½φₓ² + ½φ_y² + φf, where f = f(x, y) is given. Here

∂F/∂φ = f,  ∂F/∂φₓ = φₓ,  ∂F/∂φ_y = φ_y.

So,

∂F/∂φ − ∂/∂x ∂F/∂φₓ − ∂/∂y ∂F/∂φ_y = 0  ⇒  f − ∂φₓ/∂x − ∂φ_y/∂y = 0,

i.e. we are left with

∂²φ/∂x² + ∂²φ/∂y² = f,

which is Poisson's Equation.
Similarly, taking, say, F = ½φₓ² − (1/2c²)φₜ², the Euler-Lagrange equation

∂F/∂φ − ∂/∂x ∂F/∂φₓ − ∂/∂t ∂F/∂φₜ = 0

gives

0 − ∂φₓ/∂x + (1/c²) ∂φₜ/∂t = 0,

i.e.

∂²φ/∂x² = (1/c²) ∂²φ/∂t².

This is the classical wave equation.
We have the Euler-Lagrange equations

(9.1)  ∂F/∂yₖ − d/dx ∂F/∂yₖ′ = 0,

which give the critical curves y₁, …, yₙ of

(9.2)  J(Y₁, …, Yₙ) = ∫ F(x, Y₁, …, Yₙ, Y₁′, …, Yₙ′) dx.

Equations (9.1) form a system of n second-order differential equations. We shall now rewrite this as a system of 2n first-order differential equations. First we introduce the new variables

(9.3)  pᵢ = ∂F/∂yᵢ′,  i = 1, …, n;

pᵢ is said to be the variable conjugate to yᵢ. We suppose that equations (9.3) can be solved to give yᵢ′ as a function φᵢ of x, yⱼ, pⱼ (j = 1, …, n). Then it is possible to define a new function H by the equation
(9.4)  H(x, y₁, …, yₙ, p₁, …, pₙ) = Σᵢ₌₁ⁿ pᵢyᵢ′ − F(x, y₁, …, yₙ, y₁′, …, yₙ′).

Taking the differential,

dH = Σᵢ₌₁ⁿ { pᵢ dyᵢ′ + yᵢ′ dpᵢ } − ∂F/∂x dx − Σᵢ₌₁ⁿ { ∂F/∂yᵢ dyᵢ + ∂F/∂yᵢ′ dyᵢ′ }

(9.5)  = −∂F/∂x dx + Σᵢ₌₁ⁿ { yᵢ′ dpᵢ − ∂F/∂yᵢ dyᵢ }
using (9.3). For H = H(x, y₁, …, yₙ, p₁, …, pₙ) we also have

dH = ∂H/∂x dx + Σᵢ₌₁ⁿ { ∂H/∂yᵢ dyᵢ + ∂H/∂pᵢ dpᵢ }.

Comparing coefficients with (9.5),

yᵢ′ = ∂H/∂pᵢ,  ∂H/∂yᵢ = −∂F/∂yᵢ,

and since the Euler-Lagrange equations (9.1) with (9.3) give dpᵢ/dx = ∂F/∂yᵢ, we obtain

(9.6)  dyᵢ/dx = ∂H/∂pᵢ,  dpᵢ/dx = −∂H/∂yᵢ

for i = 1, …, n. Equations (9.6) are the canonical Euler-Lagrange equations associated with the integral (9.2).
Example 9.1. Take

(9.7)  J(Y) = ∫_a^b { α(Y′)² + βY² } dx,

where α, β are given functions of x. For this F(x, y, y′) = α(y′)² + βy², so

p = ∂F/∂y′ = 2αy′  ⇒  y′ = p/(2α).

The Hamiltonian H is, by (9.4),

H = py′ − F
  = py′ − α(y′)² − βy²  with y′ = p/(2α)
  = p²/(2α) − p²/(4α) − βy²
  = p²/(4α) − βy²

in the correct variables x, y, p. The canonical equations are then

(9.8)  dy/dx = ∂H/∂p = p/(2α),  dp/dx = −∂H/∂y = 2βy.
Consider

I(P, Y) = ∫_a^b { P dY/dx − H(x, Y, P) } dx,

and set Y = y + εη, P = p + εζ. Then we obtain

I(p + εζ, y + εη) = ∫_a^b { (p + εζ) d/dx(y + εη) − H(x, y + εη, p + εζ) } dx
= ∫_a^b { p dy/dx + ε( ζ dy/dx + p dη/dx ) + ε²ζ dη/dx − H(x, y, p) − εη ∂H/∂y − εζ ∂H/∂p − O₂ } dx
= I(p, y) + δI + O₂,

where we have

δI = ε ∫_a^b { ζ dy/dx + p dη/dx − η ∂H/∂y − ζ ∂H/∂p } dx
   = ε ∫_a^b { ζ( dy/dx − ∂H/∂p ) − η( dp/dx + ∂H/∂y ) } dx + ε[pη]_a^b  (= 0),

the boundary term vanishing when Y satisfies the boundary conditions

Y = y_B = { yₐ at x = a,  y_b at x = b },

so that η = 0 at x = a, b.

Note. If Y = y_B is imposed, no boundary conditions are required on p.
9.3. An extension. Modify I(P, Y) to be

I_mod(P, Y) = ∫_a^b { P dY/dx − H(x, P, Y) } dx − [ P(Y − y_B) ]_a^b.

Here P and Y are any admissible functions. I_mod is stationary at P = p, Y = y. Set Y = y + εη, P = p + εζ and then make δI_mod = 0. You should find that (y, p) solve

dy/dx = ∂H/∂p,  dp/dx = −∂H/∂y,

with y = y_B at x = a and x = b.
We now look for first integrals of the canonical system

(10.1)  dyᵢ/dx = ∂H/∂pᵢ,  dpᵢ/dx = −∂H/∂yᵢ  (i = 1, …, n),

and hence of the system

(10.2)  dyᵢ/dx = yᵢ′,  ∂F/∂yᵢ − d/dx ∂F/∂yᵢ′ = 0  (i = 1, …, n),

which is equivalent to (10.1). Take the case where

∂F/∂x = 0,

i.e. F = F(y₁, …, yₙ, y₁′, …, yₙ′). Then

H = Σᵢ₌₁ⁿ pᵢyᵢ′ − F

is such that ∂H/∂x = 0, and hence
dH/dx = ∂H/∂x (= 0) + Σᵢ₌₁ⁿ { ∂H/∂yᵢ dyᵢ/dx + ∂H/∂pᵢ dpᵢ/dx }
      = Σᵢ₌₁ⁿ { ∂H/∂yᵢ ∂H/∂pᵢ − ∂H/∂pᵢ ∂H/∂yᵢ } = 0

on orbits, so H is constant on orbits. More generally, for W = W(x, y₁, …, yₙ, p₁, …, pₙ),

dW/dx = ∂W/∂x + Σᵢ₌₁ⁿ { ∂W/∂yᵢ dyᵢ/dx + ∂W/∂pᵢ dpᵢ/dx }
      = ∂W/∂x + Σᵢ₌₁ⁿ { ∂W/∂yᵢ ∂H/∂pᵢ − ∂W/∂pᵢ ∂H/∂yᵢ }

on orbits. Define the Poisson bracket

[X, Y] = Σᵢ₌₁ⁿ { ∂X/∂yᵢ ∂Y/∂pᵢ − ∂X/∂pᵢ ∂Y/∂yᵢ }.

Then we have

dW/dx = ∂W/∂x + [W, H]

on orbits. In the case when ∂W/∂x = 0, we have

dW/dx = [W, H].

So,

dW/dx = 0  ⟺  [W, H] = 0.
[Figure 9: four panels (i)-(iv) showing varied curves Y about the extremal y on (a, b).]

Figure 9. The four possible cases of varying end points in the direction of η.

(i) In this case η(a) = 0 = η(b), and the boundary term in the first variation vanishes automatically.
(ii) In this case we have the complete opposite situation, where η ≠ 0 at both x = a and x = b. This gives us that

δJ = 0  ⇒  [ η ∂F/∂y′ ]_a^b = 0,

so ∂F/∂y′ = 0 at both end points.

(iii) In this case only one end point gives η = 0, namely η(a) = 0, but η ≠ 0 at x = b. This gives us the criterion that

∂F/∂y′ = 0 at x = b.

(iv) Our final case is where η(b) = 0 but η(a) ≠ 0. This gives us the condition

∂F/∂y′ = 0 at x = a.
Hence δJ = 0 ⇒ y satisfies the Euler-Lagrange equation

∂F/∂y − d/dx ∂F/∂y′ = 0  (a < x < b)

with

∂F/∂y′ = 0 at x = a,
∂F/∂y′ = 0 at x = b.

If Y is not prescribed at an end point, e.g. x = b, then we require ∂F/∂y′ = 0 at x = b. Such a condition is called a natural boundary condition.
Example 11.1 (The simplest problem, revisited). We go back to the simplest problem, where F = (Y′)² and thus

J(Y) = ∫_0^1 (Y′)² dx,

but Y is not prescribed at x = 0, 1. The Euler-Lagrange equation is y″ = 0, which implies y = αx + β. The natural boundary conditions at x = 0, 1 give

∂F/∂y′ = 2y′ = 0  ⇒  α = 0,

and hence y = β, a constant.
We now allow the end points themselves to vary.

[Figure: an extremal y joining A(a, yₐ) to B(b, y_b), and a varied curve y + εη joining the displaced end points (a + δa, yₐ + δyₐ) and (b + δb, y_b + δy_b).]
Let our J to be minimised be

(12.1)  J(y) = ∫_a^b F(x, y, y′) dx.
ΔJ = J(y + εη) − J(y)
   = ∫_{a+δa}^{b+δb} F(x, y + εη, y′ + εη′) dx − ∫_a^b F(x, y, y′) dx,

which we rewrite as

(12.5)  ΔJ = ∫_a^b { F(x, y + εη, y′ + εη′) − F(x, y, y′) } dx
           + ∫_b^{b+δb} F(x, y + εη, y′ + εη′) dx
           − ∫_a^{a+δa} F(x, y + εη, y′ + εη′) dx.

We have to assume here that y and η are defined on the interval (a, b + δb), i.e. extensions of y and y + εη are needed, and we suppose that this is done (by Taylor expansion; see later). Then, expanding F(x, y + εη, y′ + εη′) about (x, y, y′),

(12.6)  ΔJ = ∫_a^b { εη ∂F/∂y + εη′ ∂F/∂y′ + O(ε²) } dx
           + ∫_b^{b+δb} { F(x, y, y′) + O(ε) } dx
           − ∫_a^{a+δa} { F(x, y, y′) + O(ε) } dx,

and the linear terms in this case are
δJ = ∫_a^b ε { η ∂F/∂y + η′ ∂F/∂y′ } dx + F(x, y, y′)|_{x=b} δb − F(x, y, y′)|_{x=a} δa

   = ∫_a^b ε { η ∂F/∂y + η′ ∂F/∂y′ } dx + [ F(x, y, y′) δx ]_{x=a}^{x=b}

(12.7)  = ∫_a^b εη { ∂F/∂y − d/dx ∂F/∂y′ } dx + [ εη ∂F/∂y′ + F(x, y, y′) δx ]_a^b.
The final step is to find εη in terms of δy, δa, δb at x = a and x = b. Use the varied end point relation (12.4),

yₐ + δyₐ = y(a + δa) + εη(a + δa)
         = y(a) + δa y′(a) + ⋯ + εη(a) + δa εη′(a) + ⋯.

First-order terms are then

δyₐ = δa y′(a) + εη(a),
(12.8)  εη(a) = δyₐ − δa y′(a),
(12.9)  εη(b) = δy_b − δb y′(b).

We can now write (12.7) as

(12.10)  δJ = ∫_a^b εη { ∂F/∂y − d/dx ∂F/∂y′ } dx + [ ∂F/∂y′ δy + { F − y′ ∂F/∂y′ } δx ]_a^b.
This is the General First Variation of the integral J. If we introduce the canonical variable p conjugate to y, defined by

(12.11)  p = ∂F/∂y′,

and the Hamiltonian H given by

(12.12)  H = py′ − F,

we can rewrite (12.10) as

(12.13)  δJ = ∫_a^b εη { ∂F/∂y − d/dx ∂F/∂y′ } dx + [ p δy − H δx ]_a^b.

If J has an extremum for Y = y, then δJ = 0 ⇒ y is a solution of

(12.14)  ∂F/∂y − d/dx ∂F/∂y′ = 0  (a < x < b)
with

(12.15)  [ p δy − H δx ]_a^b = 0.

13. The Hamilton-Jacobi Equation

Let

(13.1)  J(y) = ∫_a^b F(x, y, y′) dx,

so that

(13.2)  δJ = ∫_a^b εη { F_y − (d/dx)F_{y′} } dx + [ p δy − H δx ]_a^b,

where p = ∂F/∂y′ and H = py′ − F.
[Figure: A(a, yₐ) fixed, with extremal curves C₁ to B₁(b, y_b) and C₂ to B₂(b + δb, y_b + δy_b).]

We apply (13.2) to the case in the figure, where A is fixed at (a, yₐ) and C₁, C₂ are both extremal curves, to B₁(b, y_b) and B₂(b + δb, y_b + δy_b) respectively. Then we have
32 PROF. ARNOLD ARTHURS
ΔS = S(b + δb, y_b + δy_b) − S(b, y_b),

where S(b, y_b) denotes the value of J along the extremal from A to B(b, y_b). By (13.2), the first-order terms of ΔS are

δS = S_y δy + Sₓ δx,

which gives

(13.5)  S_y = p,  Sₓ = −H.

Hence,

(13.6)  Sₓ + H(x, y, S_y) = 0,

which is the Hamilton-Jacobi equation, where p(= ∂F/∂y′) = S_y, y′ denoting the derivative dy/dx calculated at B(x, y) for the extremal going from A to B.
Example 13.1 (Hamilton-Jacobi). Let our integral be

J(y) = ∫_a^b (y′)² dx,

and thus F(x, y, y′) = (y′)². Then

p = ∂F/∂y′ = 2y′  ⇒  y′ = ½p,

and

H = py′ − F = p(½p) − (½p)² = ¼p².
The Hamilton-Jacobi equation Sₓ + H(x, y, p = S_y) = 0 is then

Sₓ + ¼(S_y)² = 0.

Two questions arise:
(1) How do we solve this for S = S(x, y)?
(2) Then, how do we find the extremals?
13.3. Structure of S. For S = S(x, y) we have

dS/dx = ∂S/∂x + ∂S/∂y dy/dx

by the chain rule. Now, on a critical curve C₀ we have

∂S/∂x = −H,  ∂S/∂y = p.

So,

dS/dx = −H + p dy/dx on C₀,
i.e.  dS = −H dx + p dy on C₀.

Integrating, we obtain

S(x, y) = ∫_{C₀} ( p dy − H dx )
        = ∫_{C₀} F dx

up to an additive constant. So Hamilton's principal function S is equal to the value of J(y) evaluated along a critical curve.

(1) Case ∂H/∂x = 0, i.e. H = H(y, p). In this case, by the result in §10, H = const. = α, say, on C₀. This gives p = f(α, y), say. Then
S = ∫_{C₀} ( p dy − H dx )
  = ∫_{C₀} { f(α, y) dy − α dx }
  = W(y) − αx.

Put this in the Hamilton-Jacobi equation to find W(y) and hence S. Note: the separation is additive rather than multiplicative (S = X(x)Y(y)).

(2) Case ∂H/∂y = 0, hence H = H(x, p). On C₀,

dp/dx = −∂H/∂y = 0  ⇒  p = const.

on C₀, say c. Then

S = ∫_{C₀} ( p dy − H dx )
  = ∫_{C₀} ( c dy − H(x, p = c) dx )
  = cy − V(x).

Again, this separates the x and y variables for the Hamilton-Jacobi equation. In this case, find V(x) and hence S by evaluating

V(x) = ∫ H(x, p = c) dx.
13.4. Examples.

Example 13.2. Let our integral be

J(y) = ∫_a^b (y′)² dx,

so that F = (y′)². Thus we have

p = F_{y′} = 2y′  ⇒  y′ = ½p,
H = py′ − F  ⇒  H = ¼p².

The Hamilton-Jacobi equation is
∂S/∂x + H( x, y, p = ∂S/∂y ) = 0,

(∗)  ∂S/∂x + ¼(∂S/∂y)² = 0.

(1) Now ∂H/∂x = 0 here. So

H = const. = α,

say, on C₀. Hence, by the result of §13.3, we can write

S(x, y) = W(y) − αx.

Put this in (∗); then

−α + ¼(dW/dy)² = 0,
dW/dy = 2√α  ⇒  W(y) = 2√α y.

So we have

S = 2√α y − αx.

The extremals follow from

∂S/∂y = p  ⇒  p = 2√α,
∂S/∂α = const. = β  ⇒  y/√α − x = β,

i.e. y = √α(x + β): straight lines.
(2) This is also an example with ∂H/∂y = 0: on C₀,

dp/dx = −∂H/∂y = 0  ⇒  p = const. = c,

say. Then

S(x, y) = ∫_{C₀} ( p dy − H dx )
        = ∫_{C₀} ( c dy − ¼c² dx )
        = cy − ¼c²x.

Extremals are given by

p = ∂S/∂y = c,  β = ∂S/∂c = const.

So β = ∂S/∂c = y − ½cx, i.e. y = ½cx + β, which are straight lines (c, β constants).
Example 13.3. We have F(x, y, y′) = y√(1 + (y′)²), and we note ∂F/∂x = 0. Now,

p = ∂F/∂y′ = y y′/√(1 + (y′)²)  ⇒  y′ = p/√(y² − p²).

Also,

H = py′ − F(x, y, y′)  with y′ = p/√(y² − p²)
  = −√(y² − p²).
The Hamilton-Jacobi equation H + ∂S/∂x = 0 is in this case

−{ y² − (∂S/∂y)² }^{1/2} + ∂S/∂x = 0,

i.e. we have

(∂S/∂x)² + (∂S/∂y)² = y².

Since ∂H/∂x = 0, H is constant on C₀; write H = −α. Then ∂S/∂x = −H = α, so we can write S(x, y) = W(y) + αx with

(dW/dy)² = y² − α²,
W(y) = ∫^y (v² − α²)^{1/2} dv.
So we have

S(x, y) = ∫^y (v² − α²)^{1/2} dv + αx.

To find the extremals we use the Hamilton-Jacobi theorem:

∂S/∂y = p,  ∂S/∂α = const. = β,

say. Then

β = ∂S/∂α = −∫^y α(v² − α²)^{−1/2} dv + x = x − α cosh⁻¹(y/α),

so that

y = α cosh{ (x − β)/α },

a catenary, in agreement with Example 4.2.
ΔJ = J(Y) − J(y).

For quadratic problems, e.g. F = ay² + byy′ + c(y′)², the O₃ terms are zero and, at an extremal,

ΔJ = J(Y) − J(y) = δ²J.
Consider

(15.1)  J(Y) = ∫_a^b { ½v(Y′)² + ½wY² − fY } dx,

with, at an extremal y (where δJ = 0),

(15.5)  ΔJ = J(y + εη) − J(y) = ½ε² ∫_a^b { v(η′)² + w(η)² } dx.

To see this, expand:

J(y + εη) = ∫_a^b { ½v(y′ + εη′)² + ½w(y + εη)² − f(y + εη) } dx
          = ∫_a^b { ½v(y′)² + ½wy² − fy } dx + ε ∫_a^b { vy′η′ + wyη − fη } dx
            + ε² ∫_a^b { ½v(η′)² + ½wη² } dx
          = J(y) + δJ + δ²J,

and we have
δJ = ε ∫_a^b { vy′η′ + wyη − fη } dx
   = ε ∫_a^b { −d/dx(vy′) + wy − f } η dx + ε[vy′η]_a^b  (= 0)
   = ε ∫_a^b { −d/dx(vy′) + wy − f } η dx.
At the extremal δJ = 0, so we consider

ΔJ = J(y + εη) − J(y) = δ²J

(15.6)  = ½ε² ∫_a^b { v(η′)² + w(η)² } dx.

Has this a definite sign?
Special Case. Take v and w to be constants and let v > 0 (if v < 0, change the sign of J). Then

(15.7)  δ²J = ½ε²v ∫_a^b { (η′)² + (w/v)η² } dx.

The sign of δ²J depends only on the sign of

K(η) = ∫_a^b { (η′)² + (w/v)η² } dx,

where η(a) = 0 and η(b) = 0. Clearly K > 0 if w/v > 0. To go further, integrate the first term by parts. Then
(15.8)  K(η) = ∫_a^b η { −d²/dx² + w/v } η dx + [ηη′]_a^b  (= 0).
Expanding η in a series of orthogonal functions φₙ on (a, b),

(15.9)  K(η) = ∫_a^b Σₘ₌₁^∞ aₘφₘ(x) { −d²/dx² + w/v } Σₙ₌₁^∞ aₙφₙ(x) dx,

where ∫_a^b φₙφₘ dx = 0 for n ≠ m. With φₙ(x) = sin{ nπ(x − a)/(b − a) },

{ −d²/dx² + w/v } φₙ = { n²π²/(b − a)² + w/v } φₙ,

so

(15.10-15.11)  K(η) = Σₘ Σₙ aₘaₙ { n²π²/(b − a)² + w/v } ∫_a^b φₙφₘ dx  (= 0 for m ≠ n)

(15.12)  = Σₙ₌₁^∞ aₙ² { n²π²/(b − a)² + w/v } ∫_a^b φₙ² dx.
Hence, in (15.12), if

(15.13)  π²/(b − a)² + w/v > 0

(the n = 1 term is the most negative), then K(η) > 0, i.e.

(15.16)  w > −vπ²/(b − a)².

So, for some negative w we shall still get a MIN principle for J(Y). (15.13) gives us that if

w/v > −π²/(b − a)²

then (15.7) is positive:

∫_a^b { (η′)² + (w/v)η² } dx > 0,

i.e.
∫_a^b { (η′)² − π²η²/(b − a)² } dx > 0

with η(a) = 0 = η(b).
Theorem 15.1 (Wirtinger's Inequality).

(15.17)  ∫_a^b (u′)² dx ≥ { π²/(b − a)² } ∫_a^b u² dx

for all u such that u(a) = 0 = u(b).
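Wirtinger's inequality is easy to check numerically; a sketch on (a, b) = (0, 1) with two test functions (the grid and test functions are our own choices):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200001)
dx = np.diff(x)

def rayleigh_ratio(u):
    # int (u')^2 dx / int u^2 dx, via the midpoint rule
    up = np.diff(u) / dx
    u_mid = 0.5 * (u[1:] + u[:-1])
    return np.sum(up**2 * dx) / np.sum(u_mid**2 * dx)

r_sin = rayleigh_ratio(np.sin(np.pi * x))   # equality case: ratio = pi^2
r_poly = rayleigh_ratio(x * (1 - x))        # ratio = 10 > pi^2

print(r_sin, r_poly)
```

The ratio is smallest, and equal to π², exactly for the first sine mode.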
16. Constrained Problems

Example 16.1. Maximise z = f(x, y) = xy subject to the constraint g(x, y) = x + y − 1 = 0. Along the constraint z = z(x), with y = y(x), and the stationary condition dz/dx = 0 reads

(16.1)  ∂f/∂x + ∂f/∂y dy/dx = 0,  i.e.  y + x dy/dx = 0.

Differentiating the constraint gives

1 + dy/dx = 0  ⇒  dy/dx = −1.

Thus y + x(−1) = 0, and with y = 1 − x this gives 1 − x − x = 0, i.e. 1 − 2x = 0. This implies

x = ½,  y = 1 − x = ½.

Thus z = ¼, a MAX, at (½, ½).
To handle this systematically, introduce a Lagrange multiplier λ and form V(x, y, λ) = f(x, y) + λg(x, y). Then

(a)  ∂V/∂x = ∂f/∂x + λ ∂g/∂x = 0,
(b)  ∂V/∂y = ∂f/∂y + λ ∂g/∂y = 0,
(c)  ∂V/∂λ = g(x, y) = 0.

Combining (a) and (b) to eliminate λ, we obtain

∂f/∂x + (−1) { (∂f/∂y)/(∂g/∂y) } ∂g/∂x = 0.

As before in (16.1), (c) is the constraint. We solve (a), (b) and (c) for x, y and λ.
Example 16.2. Returning to our example from before, we want to establish that f has a maximum at (½, ½). Take the points

x = ½ + h,  y = ½ + k.

Then we obtain

f(½ + h, ½ + k) = (½ + h)(½ + k)
               = ¼ + ½(h + k) + hk,
but this has linear terms at our stationary point. Don't forget the constraint! The point (½ + h, ½ + k) must satisfy

g = x + y − 1 = 0:
½ + h + ½ + k − 1 = 0  ⇒  h + k = 0.

Hence, with k = −h,

f(½ + h, ½ + k) = ¼ − h²
               ≤ ¼ = f(½, ½),

so f has a MAX at (½, ½).
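The constrained maximum can also be confirmed by parametrising the constraint; a minimal sketch:

```python
import numpy as np

# Maximise f = x*y on the line g = x + y - 1 = 0: take x = t, y = 1 - t.
t = np.linspace(0.0, 1.0, 100001)
f = t * (1 - t)
t_best = t[np.argmax(f)]

print(t_best, f.max())  # stationary point at x = 1/2, with f = 1/4
```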
Problem with 2 Constraints. Find an extremum of f(x, y, z) subject to g₁(x, y, z) = 0 and g₂(x, y, z) = 0. This could be quite complicated, but Lagrange multipliers give a simple extension. Form

V(x, y, z, λ₁, λ₂) = f + λ₁g₁ + λ₂g₂.

We make V stationary, so

Vₓ = 0,  V_y = 0,  V_z = 0,  V_{λ₁} = 0,  V_{λ₂} = 0.
Now consider the variational problem: extremise J(Y) subject to K(Y) = k. Take Y = y + ε₁η₁ + ε₂η₂ and set

f(ε₁, ε₂) = J(Y),  g(ε₁, ε₂) = K(Y) − k = 0;

ε₁, ε₂ are not independent. Now use the method of Lagrange multipliers to form

E(ε₁, ε₂, λ) = f + λg.

Make E stationary at ε₁ = 0, ε₂ = 0 (i.e. Y = y). Then

∂f/∂ε₁ + λ ∂g/∂ε₁ = 0,
∂f/∂ε₂ + λ ∂g/∂ε₂ = 0,

at ε₁ = ε₂ = 0. Now

f(ε₁, ε₂) = J(y + ε₁η₁ + ε₂η₂) = J(y) + ⟨ε₁η₁ + ε₂η₂, J′(y)⟩ + O₂,

and, since K(y) = k,

g(ε₁, ε₂) = K(y + ε₁η₁ + ε₂η₂) − k = ⟨ε₁η₁ + ε₂η₂, K′(y)⟩ + O₂.

Taking partial derivatives we obtain, at ε₁ = ε₂ = 0,

∂f/∂ε₁ = ⟨η₁, J′(y)⟩,  ∂g/∂ε₁ = ⟨η₁, K′(y)⟩,
∂f/∂ε₂ = ⟨η₂, J′(y)⟩,  ∂g/∂ε₂ = ⟨η₂, K′(y)⟩.

Hence

⟨η₁, J′(y) + λK′(y)⟩ = 0,
⟨η₂, J′(y) + λK′(y)⟩ = 0.

Since η₁ and η₂ are arbitrary and independent functions on (a, b), y is a solution of

J′(y) + λK′(y) = 0,

i.e., writing K(Y) = ∫ G dx,

∂F/∂y − d/dx ∂F/∂y′ + λ{ ∂G/∂y − d/dx ∂G/∂y′ } = 0.

Let L = F + λG; then we have

∂L/∂y − d/dx ∂L/∂y′ = 0.

This is the first necessary condition.
Euler's Rule. Make J + λK stationary and satisfy K(Y) = k. Then check for

min: J(y) ≤ J(Y),  max: J(y) ≥ J(Y).

Here we assume K′(y) ≠ 0, i.e. y is not an extremal of K.
Example 16.3. We want to maximise

J(y) = ∫_{−a}^{a} y dx

subject to the constraints y(−a) = y(a) = 0 and

K(y) = ∫_{−a}^{a} √(1 + (y′)²) dx = L.

Now use Euler's Rule. Let F = y, G = √(1 + (y′)²) and form

I(y) = ∫_{−a}^{a} (F + λG) dx.

Make I(y) stationary, i.e. solve

∂(F + λG)/∂y − d/dx ∂(F + λG)/∂y′ = 0:
1 − λ d/dx { y′/√(1 + (y′)²) } = 0,
d/dx { y′/√(1 + (y′)²) } = 1/λ,
λy′/√(1 + (y′)²) = x + c,
λ²(y′)²/(1 + (y′)²) = (x + c)²,
y′ = (x + c)/√{ λ² − (x + c)² },
y = −{ λ² − (x + c)² }^{1/2} + c₁,
(y − c₁)² = λ² − (x + c)²,
which is (part of) a circle, centred at (−c, c₁) with radius r = |λ|. Alternatively, since the combined integrand F + λG = y + λ√(1 + (y′)²) has no explicit x-dependence, there is a first integral

(F + λG) − y′ (F + λG)_{y′} = const.

The equation above can be written

(x + c)² + (y − c₁)² = λ².

This must satisfy the conditions y(−a) = 0 = y(a) and K(y) = L, which fix c, c₁ and λ.
Example 16.4. Find the extremum of

J(y) = ∫_0^1 y² dx

subject to the constraints

K(y) = ∫_0^1 (y′)² dx = const. = k²,  y(0) = 0 = y(1).

We use Euler's Rule and form

I(y) = J(y) + λK(y) = ∫_0^1 L dx,

where L = y² + λ(y′)². Make I(y) stationary, i.e. solve

∂L/∂y − d/dx ∂L/∂y′ = 0,
2y − d/dx (2λy′) = 0,
y″ − (1/λ)y = 0  (λ ≠ 0).

There are three cases:

(1) 1/λ > 0: write 1/λ = μ² (μ ≠ 0); then y = A cosh μx + B sinh μx.
(2) 1/λ = 0: then y = ax + b.
(3) 1/λ < 0: write 1/λ = −ω² (ω ≠ 0); then y = C cos ωx + D sin ωx.
The boundary conditions give

y(0) = 0  ⇒  A = 0, b = 0, C = 0,
y(1) = 0  ⇒  B = 0, a = 0, D sin ω = 0.

For a non-trivial solution take sin ω = 0, so

ω = nπ

with n = 1, 2, …. So y = D sin nπx with n = 1, 2, 3, …. Take yₙ = sin nπx. To include all these functions we take

y = Σₙ₌₁^∞ cₙyₙ,  yₙ = sin nπx.
Now calculate J(y) and K(y) for this y:

J(y) = ∫_0^1 y² dx
     = ∫_0^1 ( Σₙ₌₁^∞ cₙ sin nπx )( Σₘ₌₁^∞ cₘ sin mπx ) dx
     = Σₙ Σₘ cₙcₘ ∫_0^1 sin(nπx) sin(mπx) dx  (= ½δₙₘ)
     = Σₙ₌₁^∞ ½cₙ².

Also,

k² = K(y) = ∫_0^1 (y′)² dx
   = ∫_0^1 ( Σₙ cₙ nπ cos nπx )( Σₘ cₘ mπ cos mπx ) dx
   = Σₙ Σₘ cₙcₘ nmπ² ∫_0^1 cos(nπx) cos(mπx) dx  (= ½δₙₘ)
   = Σₙ₌₁^∞ ½cₙ²n²π².
Now,
J(y) = Σₙ₌₁^∞ ½cₙ²,
K(y) = Σₙ₌₁^∞ ½cₙ²n²π² = k²,

so that

½c₁² = k²/π² − Σₙ₌₂^∞ ½cₙ²n².

Write

J(y) = ½c₁² + Σₙ₌₂^∞ ½cₙ²
     = k²/π² − Σₙ₌₂^∞ ½cₙ²n² + Σₙ₌₂^∞ ½cₙ²
     = k²/π² − Σₙ₌₂^∞ ½cₙ²(n² − 1)
     ≤ k²/π²,

with equality when cₙ = 0 (n = 2, 3, 4, …), leaving c₁² = 2k²/π². Then the maximiser is

y = c₁ sin πx,  with c₁ = √2 k/π.
Note. We had Wirtinger's Inequality (Theorem 15.1):

∫_a^b (u′)² dx ≥ { π²/(b − a)² } ∫_a^b u² dx,

where u(a) = 0 and u(b) = 0. Relating this to our example above, with a = 0 and b = 1:

∫_0^1 (y′)² dx ≥ π² ∫_0^1 y² dx,
k² ≥ π² J(y),

or J(y) ≤ k²/π², as found above.
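The coefficient computation above is easy to replay numerically; a sketch with a handful of random Fourier coefficients (the seed and the truncation at six modes are arbitrary choices):

```python
import numpy as np

# For y = sum c_n sin(n pi x): J(y) = sum c_n^2/2, K(y) = sum c_n^2 n^2 pi^2/2,
# and J <= K/pi^2, with equality iff only c_1 is nonzero.
rng = np.random.default_rng(0)
c = rng.normal(size=6)                 # arbitrary coefficients c_1..c_6
n = np.arange(1, 7)

J = np.sum(c**2) / 2
K = np.sum(c**2 * n**2 * np.pi**2) / 2

c1 = np.zeros(6); c1[0] = 1.0          # pure first mode
J1 = np.sum(c1**2) / 2
K1 = np.sum(c1**2 * n**2 * np.pi**2) / 2

print(J <= K/np.pi**2, np.isclose(J1, K1/np.pi**2))
```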
Suppose the problem is to solve

J′(y) = 0,

and suppose we are unable to solve this equation for y. The idea is to approach the problem of finding J(y) directly. In this case we probably have to make do with an approximate estimate of J(y). To illustrate, consider the case

J(Y) = ∫_0^1 { ½(Y′)² + ½Y² − qY } dx  (the braced integrand being F)

with Y(0) = 0 and Y(1) = 0. The Euler-Lagrange equation for y is

∂F/∂y − d/dx ∂F/∂y′ = 0  ⇒  y − q − y″ = 0,

i.e.

−y″ + y = q,  0 < x < 1,

with y(0) = 0 = y(1). For Y = y + εη,

J(Y) = J(y) + ½ε² ∫_0^1 { (η′)² + η² } dx ≥ J(y),

a MIN principle. Suppose we cannot solve the Euler-Lagrange equation. Then we try to minimise J(Y) by any route. To take a very simple approach we choose a trial function

Y₁ = αφ,  φ = x(1 − x),

such that Y₁(0) = 0 = Y₁(1). We then calculate

J(Y₁) = Ĵ(α)
      = ½α² ∫_0^1 { (φ′)² + φ² } dx − α ∫_0^1 qφ dx
      = ½α²A − αB,

with, say,
A = ∫_0^1 { (φ′)² + φ² } dx,  B = ∫_0^1 qφ dx.

We have that

dĴ/dα = αA − B = 0  ⇒  α_opt = B/A.

Then

Ĵ(α_opt) = ½(B²/A²)A − B²/A = −½B²/A.

Now for q = 1 we have A = 11/30, B = 1/6, α_opt = B/A = 5/11, and thus Ĵ(α_opt) = −5/132. We note for later that this has decimal expansion

Ĵ(α_opt) = −5/132 = −0.0378787….
We can continue by introducing more than one parameter, for example

Y₂ = α₁φ₁ + α₂φ₂

with, say, φ₁ = x(1 − x) and φ₂ = x²(1 − x). For comparison, with q = 1 the Euler-Lagrange equation −y″ + y = 1 can in fact be solved exactly: a particular solution is y₁ = 1 and the complementary function is y₂ = ae^x + be^{−x}. This gives the general solution

y = y₁ + y₂ = 1 + ae^x + be^{−x}.

Now we use the boundary conditions to solve for a and b:

y(0) = 0  ⇒  1 + a + b = 0  ⇒  b = −1 − a,
y(1) = 0  ⇒  1 + ae + be^{−1} = 0,

so 1 + ae − (1 + a)e^{−1} = 0, giving

a = −1/(1 + e).

Then b = −1 − a = −e/(1 + e). Hence the solution becomes
y = 1 − e^x/(1 + e) − e·e^{−x}/(1 + e)
  = 1 − (e^x + e^{1−x})/(1 + e).

Now calculate J(y):

J(y) = ∫_0^1 { ½(y′)² + ½y² − y } dx  (since q = 1)
     = ∫_0^1 { −½yy″ + ½y² − y } dx + ½[yy′]_0^1  (= 0)
     = ∫_0^1 { ½y(−y″ + y) − y } dx,

but −y″ + y = 1 by the Euler-Lagrange equation, and hence

J(y) = −½ ∫_0^1 y dx.

So, putting in our solution for y, we obtain

J(y) = −½ ∫_0^1 { 1 − (e^x + e^{1−x})/(1 + e) } dx
     = −½ + (e − 1)/(e + 1)
     = −0.0378828…,

which is very close to our approximation Ĵ(α_opt) = −5/132 = −0.0378787…. In fact the difference between the two values is about 0.0000041.
Hence we have

(18.2)  F = ½(Y′)² + ½wY² − qY,
(18.3)  P = ∂F/∂Y′ = Y′,
(18.4)  H(x, P, Y) = PY′ − F = ½P² − ½wY² + qY.

Now replace J(Y) by

(18.5.i)  I(P, Y) = ∫_a^b { P dY/dx − H(x, P, Y) } dx − [PY]_a^b

(18.5.ii)          = ∫_a^b { −P′Y − H } dx,

integrating by parts. Then we know from section 9 that I(P, Y) is stationary at (p, y), where

(18.6)  dy/dx = ∂H/∂p  (a < x < b), with y = 0 at x = a, b,
(18.7)  dp/dx = −∂H/∂y  (a < x < b).

Now we define dual integrals J(y₁) and G(p₂). Let

(18.8)  (p₁, y₁) satisfy dy₁/dx = ∂H/∂p = p₁  (a < x < b), with y₁ = 0 at x = a, b,
(18.9)  (p₂, y₂) satisfy dp₂/dx = −∂H/∂y = wy₂ − q  (a < x < b),

and set J(y₁) = I(p₁, y₁), G(p₂) = I(p₂, y₂). Here

(18.13)  H(x, P, Y) = ½P² − ½wY² + qY.
With p₁ = dy₁/dx, (18.5.i) gives

J(y₁) = I(p₁, y₁) = ∫_a^b { (y₁′)² − ½(y₁′)² + ½wy₁² − qy₁ } dx,

(18.16)  J(y₁) = ∫_a^b { ½(y₁′)² + ½wy₁² − qy₁ } dx,

the original Euler-Lagrange integral J. Next, G(p₂) = I(p₂, y₂) with y₂ = (p₂′ + q)/w:

G(p₂) = I(p₂, y₂)
      = ∫_a^b { −p₂′y₂ − ½p₂² + ½wy₂² − qy₂ } dx  (this uses (18.5.ii))
      = ∫_a^b { −½p₂² + ½wy₂² − y₂(p₂′ + q) } dx  (p₂′ + q = wy₂)
      = ∫_a^b { −½p₂² − ½wy₂² } dx

with y₂ = (p₂′ + q)/w:

(18.17)  G(p₂) = −½ ∫_a^b { p₂² + (1/w)(p₂′ + q)² } dx.
If we write

(18.18)  y₁ = y + εη,  p₂ = p + εζ

in (18.16) and (18.17), expand about y and p, and use the stationary properties

δJ(y, η) = 0,  δG(p, ζ) = 0,

we have
(18.19)  ΔJ = J(y₁) − J(y) = ½ε² ∫_a^b { (η′)² + wη² } dx,

(18.20)  ΔG = G(p₂) − G(p) = −½ε² ∫_a^b { ζ² + (1/w)(ζ′)² } dx.

Hence, if w > 0, ΔJ ≥ 0 and ΔG ≤ 0, so that

G(p₂) ≤ G(p) = I(p, y) = J(y) ≤ J(y₁),

a two-sided bound. For the earlier example (w = 1, q = 1),

G(p₂) = −½ ∫_0^1 { p₂² + (p₂′ + 1)² } dx.
Here p₂ is any admissible function. To make a good choice for p₂, we note that the exact p and y are linked through

p = dy/dx,  y(0) = 0 = y(1).

So, we follow this and choose

p₂ = β dφ/dx,  φ = x(1 − x).

Thus we obtain

G(p₂) = Ĝ(β)
      = −½ ∫_0^1 { β²(φ′)² + (βφ″ + 1)² } dx
      = −½Aβ² + Bβ − C,
where we define

A = ∫_0^1 { (φ′)² + (φ″)² } dx = 13/3,
B = −∫_0^1 φ″ dx = 2,
C = ½.

Now compute dĜ/dβ = −Aβ + B = 0, so β_opt = B/A = 6/13. So,

MAX Ĝ(β) = Ĝ(β_opt) = B²/(2A) − C = −1/26 = −0.038461….

From the previous calculation of J(y₁) we had in section 17 that

MIN Ĵ(α) = −5/132 = −0.0378787….

Hence we have bounded I(p, y) = J(y):

−0.038461… ≤ J(y) ≤ −0.0378787….
19. Eigenvalues

Consider eigenvalue problems of the form

(19.1)  Ly = λy.

For example,

(19.2)  −d²y/dx² = λy,  0 < x < 1,
(19.3)  y(0) = 0,  y(1) = 0.

Equation (19.2) has general solution

(19.4)  y = A sin √λ x + B cos √λ x,

(19.5)  y(0) = 0 ⇒ B = 0;  y(1) = 0 ⇒ A sin √λ = 0.

Take sin √λ = 0; then we have

(19.6)  √λ = nπ, n = 1, 2, 3, …,  or  λ = n²π², n = 1, 2, 3, ….

Then we have that
yₙ = sin nπx,  λₙ = n²π²,  n = 1, 2, 3, ….

In general, eigenvalue problems (19.1) cannot be solved exactly, so we need methods for estimating eigenvalues. One method is due to Rayleigh: it provides an upper bound for λ₁, the lowest eigenvalue of the problem. Consider

(19.7)  A(Y) = ∫_0^1 Y { −d²Y/dx² − λ₁Y } dx

for all Y in {C² functions such that Y(0) = 0 = Y(1)}. We write

Y = Σₙ₌₁^∞ aₙyₙ,

where {yₙ} = the complete set of eigenfunctions of −d²/dx² = {sin nπx}. Then (19.7) becomes

A(Y) = ∫_0^1 Σₙ aₙyₙ { −d²/dx² − λ₁ } Σₘ aₘyₘ dx
     = Σₙ Σₘ aₙaₘ(λₘ − λ₁) ∫_0^1 yₙyₘ dx  (= 0 for n ≠ m)
     = Σₙ aₙ²(λₙ − λ₁) ∫_0^1 yₙ² dx
     ≥ 0,

where λₙ = n²π², and hence 0 < λ₁ < λ₂ < λ₃ < ⋯. So we have A(Y) ≥ 0. Rearranging (19.7) we obtain

∫_0^1 Y(−d²Y/dx²) dx ≥ λ₁ ∫_0^1 Y² dx,

(19.8)  λ₁ ≤ { ∫_0^1 Y(−d²Y/dx²) dx } / { ∫_0^1 Y² dx } = Λ(Y).

For the trial function Y = x(1 − x),

Λ(Y) = (1/3)/(1/30) = 10.

The exact λ₁ = π² ≈ 9.87.
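The Rayleigh bound Λ(Y) = 10 can be reproduced by quadrature; a sketch (using that −Y″ = 2 exactly for this trial function):

```python
import numpy as np

# Rayleigh quotient Lambda(Y) = int Y(-Y'') dx / int Y^2 dx for Y = x(1-x),
# an upper bound for the lowest eigenvalue of -d^2/dx^2 with Y(0) = Y(1) = 0.
x = np.linspace(0.0, 1.0, 200001)
dx = np.diff(x)
Y_mid = 0.5 * ((x*(1 - x))[1:] + (x*(1 - x))[:-1])

num = np.sum(Y_mid * 2.0 * dx)    # int Y * (-Y'') dx, with -Y'' = 2
den = np.sum(Y_mid**2 * dx)       # int Y^2 dx
lam = num / den

print(lam, np.pi**2)  # 10 >= pi^2, as the theory requires
```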
Example 19.1. From example 2 of section 16 we have

(∗)  ⟨u′, u′⟩ ≥ λ₀⟨u, u⟩,

where u(0) = 0, u′(1) = 0 and

⟨u, u⟩ = ∫_0^1 u² dx.

We know that λ₀ = π²/4. We can use (∗) to obtain an upper bound for λ₀:

λ₀ ≤ ⟨u′, u′⟩/⟨u, u⟩.

Take u = 2x − x², so that u(0) = 0, u′(1) = 0. Then ⟨u′, u′⟩ = ∫_0^1 (2 − 2x)² dx = 4/3 and ⟨u, u⟩ = ∫_0^1 (2x − x²)² dx = 8/15, so

λ₀ ≤ (4/3)/(8/15) = 5/2 = 2.5.

Note λ₀ = π²/4 ≈ 9.87/4 ≈ 2.47.
We have met the Euler-Lagrange equation

∂F/∂y − d/dx ∂F/∂y′ = 0

and the canonical equations

dy/dx = ∂H/∂p,  dp/dx = −∂H/∂y,

in which the simplest pair of derivatives, d/dx and −d/dx, appear. We now attempt to generalise. We take an inner product space S of functions u(x) and suppose ⟨u, v⟩ denotes the inner product. A very useful case is

⟨v, u⟩ = ∫ uv w(x) dx,

with

⟨u, v⟩ = ⟨v, u⟩,  ⟨u, u⟩ ≥ 0.

Write T for a linear differential operator, e.g. T = d/dx, and define its adjoint T* by

⟨u, Tv⟩ = ⟨T*u, v⟩,

where we suppose boundary terms vanish.
Example 20.1. T = d/dx and ⟨u, v⟩ = ∫ uv w(x) dx. Then

⟨u, Tv⟩ = ∫ u (dv/dx) w(x) dx
        = (−1) ∫ { d/dx (w(x)u) } v dx + [wuv]  (= 0)
        = ∫ { −(1/w) d/dx (wu) } v w(x) dx
        = ⟨T*u, v⟩.

So,

T*u = −(1/w) d/dx { w(x)u },  i.e.  T* = −(1/w) d/dx (w ·).
From T and T* we can construct L = T*T, and hence here

L = T*T = −(1/w) d/dx { w d/dx }.

Now look at

⟨u, Lu⟩ = ∫ u(T*Tu) w(x) dx
        = ⟨u, T*Tu⟩
        = ⟨Tu, Tu⟩
        = ∫ (Tu)(Tu) w(x) dx
        ≥ 0.

So, L is a positive operator (e.g. w = 1, T = d/dx, T* = −d/dx, L = −d²/dx²). Now consider the eigenvalue problem
Lu = λu

(i.e. T*Tu = λu); λ is an eigenvalue. Suppose there is a complete set of eigenfunctions {uₙ}, corresponding to eigenvalues {λₙ}. Orthogonality follows in the usual way:

0 = (λₙ − λₘ)⟨uₘ, uₙ⟩  (n ≠ m).

For example,

w = 1,  T = d/dx,  T* = −d/dx,  L = −d²/dx²,  uₙ = sin nπx,  λₙ = n²π².

Another example is

w = x,  T = d/dx,  T* = −(1/x) d/dx (x ·),  L = −(1/x) d/dx { x d/dx },  uₙ = J₀(aₙx),  λₙ = aₙ².

Expanding an arbitrary u in eigenfunctions,

u(x) = Σₙ₌₁^∞ cₙuₙ(x),  Luₙ = λₙuₙ,

we then have

⟨uₖ, u⟩ = Σₙ₌₁^∞ cₙ⟨uₖ, uₙ⟩ = cₖ⟨uₖ, uₖ⟩.

Consider A(u) = ⟨Tu, Tu⟩, for example ⟨u′, u′⟩ = ∫ (u′)² w(x) dx. We have that

A(u) = ⟨ Σₙ cₙuₙ, L Σₘ cₘuₘ ⟩
     = Σₙ Σₘ cₙcₘ ⟨uₙ, Luₘ⟩.