
CALCULUS OF VARIATIONS

PROF. ARNOLD ARTHURS

1. Introduction
Example 1.1 (Shortest Path Problem). Let A and B be two fixed points in the plane. We want to find the shortest path between these two points. We can construct the problem diagrammatically as below.

Figure 1. A simple curve Y = Y(x) joining A (at x = a) and B (at x = b), with arc element ds and increments dx, dY.

From basic geometry (i.e. Pythagoras' theorem) we know that

(1.1)  ds² = dx² + dY² = (1 + (Y')²) dx².

The second equality is achieved by noting Y' = dY/dx. Now, to find the length of the path between the points A and B we integrate ds between A and B, i.e. ∫_A^B ds. Replacing ds using equation (1.1) above, we get the expression for the length of our curve:

J(Y) = ∫_a^b √(1 + (Y')²) dx.

To find the shortest path, i.e. to minimise J, we need to find the extremal function.
Example 1.2 (Minimal Surface of Revolution - Euler). This problem is very similar to the one above, but instead of seeking the shortest distance we seek the curve whose surface of revolution has the smallest area. We again display the problem diagrammatically.

Figure 2. A surface of revolution, generated by rotating the curve Y(x) (joining A and B over a ≤ x ≤ b) about the line y = 0.

To find the surface area of our shape we integrate 2πY ds between the points A and B, i.e. ∫_A^B 2πY ds. Substituting for ds as above with equation (1.1), we obtain the expression for the surface area

J(Y) = ∫_a^b 2πY √(1 + (Y')²) dx.

To find the minimal surface we need to find the extremal function.
Example 1.3 (Brachistochrone). This problem comes from physics. If we release a bead from O and let it slip down a frictionless curve to a point B, accelerated only by gravity, what shape of curve allows the bead to complete the journey in the shortest possible time? We construct this diagrammatically below.

b
O
x
v
Y = Y (x)
g
ds
B

Figure 3. A bead slipping down a frictionless curve from point O to B.

In this problem we want to minimise the time taken. So we construct our integral accordingly and consider the total time T as a functional of the curve Y:

T(Y) = ∫ dt = ∫_{x=0}^{b} ds/v,

using v = ds/dt and rearranging. Finally, using the energy relation v² = 2gY, we obtain

T(Y) = ∫_0^b √( (1 + (Y')²)/(2gY) ) dx.

Thus to find the smallest possible time taken we need to find the extremal function.
Example 1.4 (Isoperimetric problems). These are problems with constraints. A simple example is to find the rectangle of fixed perimeter p that encloses the largest area.

Figure 4. A rectangle with sides of length x and y, enclosing area A.

We can see clearly that the governing equations are

(1.2)  A = xy,
(1.3)  p = 2x + 2y.

By rearranging equation (1.3) in terms of y and substituting into (1.2) we obtain

A = (x/2)(p − 2x),
dA/dx = p/2 − 2x.

We require dA/dx = 0 and thus

p/2 − 2x = 0  ⟹  x = p/4,

and finally substituting back into equation (1.3) gives us

y = (1/2)(p − p/2) = p/4.

Thus a square is the shape that maximises the area.

Example 1.5 (Chord and Arc Problem). Here we are seeking the curve of a given length
that encloses the largest area above the x-axis. So, we seek the curve Y = y (y is reserved
for the solution and Y is used for the general case). We describe this diagrammatically
below.

Figure 5. A curve Y(x) above the x-axis between x = 0 and x = b, enclosing an area.

The area under the curve is

J(Y) = ∫_0^b Y dx,

where J(Y) is maximised subject to the length of the curve

K(Y) = ∫_0^b √(1 + (Y')²) dx = c,

where c is a given constant.

2. The Simplest / Fundamental Problem


Examples 1.1, 1.2 and 1.3 are all special cases of the simplest/fundamental problem.
Figure 6. The simplest/fundamental problem: curves Y(x) joining A(a, y_a) and B(b, y_b).

Suppose A(a, y_a) and B(b, y_b) are two fixed points and consider a set of curves

(2.1)  Y = Y(x)

joining A and B. Then we seek a member Y = y(x) of this set which minimises the integral

(2.2)  J(Y) = ∫_a^b F(x, Y, Y') dx,

where Y(a) = y_a, Y(b) = y_b. We note that Examples 1.1 to 1.3 correspond to specifications of the above general case with the integrands

F = √(1 + (Y')²),
F = 2πY √(1 + (Y')²),
F = √( (1 + (Y')²)/(2gY) ).

An extra example is

J(Y) = ∫_a^b ( (1/2)p(x)(Y')² + (1/2)q(x)Y² + f(x)Y ) dx.

Now, the curves Y in (2.2) may be continuous, differentiable or neither, and this affects the problem for J(Y). We shall suppose that the functions Y = Y(x) are continuous and have continuous derivatives a suitable number of times. Thus the functions (2.1) belong to a set 𝒜 of admissible functions, which we define precisely as

(2.3)  𝒜 = { Y : Y continuous and d^k Y/dx^k continuous, k = 1, 2, …, n }.

So the problem is to minimise J(Y) in (2.2) over the functions Y in 𝒜 with

Y(a) = y_a and Y(b) = y_b.

This basic problem can be extended to much more complicated problems.

Example 2.1. We can extend the problem by considering more derivatives of Y. The integrand becomes

F = F(x, Y, Y', Y''),

i.e. F depends on Y'' as well as x, Y, Y'.

Example 2.2. We can consider more than one function of x. Then the integrand becomes

F = F(x, Y_1, Y_2, Y_1', Y_2'),

so F depends on two (or more) functions Y_k of x.

Example 2.3. Finally we can consider functions of more than one independent variable. The integrand becomes

F = F(x, y, φ, φ_x, φ_y),

where subscripts denote partial derivatives. So F depends on functions φ(x, y) of two independent variables x, y. This means that to calculate J(φ) we have to integrate more than once, for example

J(φ) = ∬ F dx dy.

Note. The integral J(Y) is a numerical-valued function of the function Y; it is an example of a functional.

Definition: Let ℝ be the real numbers and 𝒜 a set of functions. Then a function J : 𝒜 → ℝ is called a functional.

Then we can say that the calculus of variations is concerned with maxima and minima (extrema) of functionals.

3. Maxima and Minima


3.1. The First Necessary Condition
(i) We use ideas from elementary calculus of functions f (u).

Figure 7. Plot of a function f(u) with a minimum at u = a.

If f(u) > f(a) for all u near a, on both sides of u = a, this means that there is a minimum at u = a. The consequences of this are often seen in an expansion. Let us assume that there is a minimum at f(a) and that a Taylor expansion exists about u = a, so that

(3.1)  f(a + h) = f(a) + h f'(a) + (h²/2) f''(a) + O(h³)  (h ≠ 0).

Note that we define df(a, h) := h f'(a) to be the first differential. As there is a minimum at u = a we must have

(3.2)  f(a + h) > f(a) for h ∈ (−δ, δ), h ≠ 0,

by the above comment. Now, if f'(a) ≠ 0 and h is sufficiently small, then

(3.3)  sign{ Δf = f(a + h) − f(a) } = sign{ df = h f'(a) }  (≠ 0).

In equation (3.3) the left-hand side is > 0 for both signs of h, because f has a minimum at a and hence (3.2) holds; but the right-hand side changes sign as h changes sign. This is a contradiction. Hence df = 0, which gives f'(a) = 0.
(ii) For functions f(u, v) of two variables, similar ideas hold. Thus if (a, b) is a minimum then f(u, v) > f(a, b) for all u near a and v near b, i.e. for some intervals

(3.4)  a − δ_1 ≤ u ≤ a + δ_1,  b − δ_2 ≤ v ≤ b + δ_2.

The corresponding Taylor expansion is

(3.5)  f(a + h, b + k) = f(a, b) + h f_u(a, b) + k f_v(a, b) + O_2.

We note that in this case the first differential is df(a, b, h, k) := h f_u(a, b) + k f_v(a, b). For a minimum (or a maximum) at (a, b) it follows, as in the previous case, that a necessary condition is

(3.6)  df = 0  ⟺  ∂f/∂u = ∂f/∂v = 0 at (a, b).
u v
(iii) Now considering functions of multiple (say n) variables, i.e. f = f(u_1, u_2, …, u_n) = f(u), we have the Taylor expansion

(3.7)  f(a + h) = f(a) + h · ∇f(a) + O_2.

Thus the statement from the previous case (3.6) becomes

(3.8)  df = 0  ⟺  ∇f(a) = 0.

3.2. Calculus of Variations

Now we consider the integral

(3.9)  J(Y) = ∫_a^b F(x, Y, Y') dx.

Suppose J(Y) has a minimum for the curve Y = y. Then

(3.10)  J(Y) ≥ J(y)

for Y ∈ 𝒜 = { Y | Y ∈ C², Y(a) = y_a, Y(b) = y_b }. To obtain information from (3.10) we expand J(Y) about the curve Y = y by taking the so-called varied curves

(3.11)  Y = y + εη,

like u = a + h in (3.2). We can represent the consequences of this expansion diagrammatically.
Figure 8. Plot of the extremal y(x) and a varied curve Y = y(x) + εη(x), both joining A(a, y_a) and B(b, y_b).

Since all curves Y, including y, go through A and B, it follows that

(3.12)  η(a) = 0 and η(b) = 0.

Now substituting equation (3.11) into (3.10) gives us

(3.13)  J(Y) = J(y + εη) ≥ J(y)

for all y + εη, and substituting (3.11) into (3.9) gives us

(3.14)  J(y + εη) = ∫_a^b F(x, y + εη, y' + εη') dx.

Now, to deal with this expansion we take a fixed x in (a, b) and treat y and y' as independent variables. Recall that Y and Y' are independent, and recall the Taylor expansion in two variables (equation (3.5)) from above:

(3.15)  f(u + h, v + k) = f(u, v) + h ∂f/∂u + k ∂f/∂v + O_2.

Now we take u = y, h = εη, v = y', k = εη', f = F. Then (3.14) implies

J(Y) = J(y + εη) = ∫_a^b ( F(x, y, y') + εη ∂F/∂y + εη' ∂F/∂y' + O(ε²) ) dx
     = J(y) + δJ + O_2.

We note that δJ is calculus-of-variations notation for dJ, where we have

(3.16)  δJ = ε ∫_a^b ( η ∂F/∂y + η' ∂F/∂y' ) dx
          = linear terms in ε = first variation of J,

and we also have that

(3.17)  ∂F/∂y = [ ∂F(x, Y, Y')/∂Y ] evaluated at Y = y, Y' = y'.

Now δJ is analogous to the linear terms in (3.1), (3.5), (3.7). We now prove that if J(Y) has a minimum at Y = y, then

(3.18)  δJ = 0,

the first necessary condition for a minimum.

Proof. Suppose δJ ≠ 0. Then J(Y) − J(y) = δJ + O_2, and for ε small enough

sign{ J(Y) − J(y) } = sign{ δJ }.

Since δJ is linear in ε, replacing ε by −ε changes its sign, so δJ < 0 for some varied curve. However, there is a minimum of J at Y = y, so the left-hand side is ≥ 0 for every varied curve. This is a contradiction; hence δJ = 0. ∎
For our J, we have by (3.16)

(3.19)  δJ = ε ∫_a^b ( η ∂F/∂y + η' ∂F/∂y' ) dx.

If we integrate the second term by parts we obtain

(3.20)  δJ = ε ∫_a^b η ( ∂F/∂y − d/dx ∂F/∂y' ) dx + ε [ η ∂F/∂y' ]_a^b   (the boundary term = 0, by (3.12)),

and so we have

(3.21)  δJ = ε ∫_a^b η ( ∂F/∂y − d/dx ∂F/∂y' ) dx.
Note. For notational purposes we write

(3.22)  ⟨f, g⟩ = ∫_a^b f(x) g(x) dx,

which is an inner (or scalar) product. Also, for our J, write

(3.23)  J'(y) = ∂F/∂y − d/dx ∂F/∂y'

as the derivative of J. Then we can express δJ as

(3.24)  δJ = ε ⟨η, J'(y)⟩.

This gives us the Taylor expansion

(3.25)  J(y + εη) = J(y) + ε⟨η, J'(y)⟩ + O_2,

where the middle term is δJ.

We compare this with the previous cases of Taylor expansion (3.1), (3.5) and (3.7). Now, collecting our results together, we get

Theorem 3.1. A necessary condition for J(Y) to have an extremum (maximum or minimum) at Y = y is

(3.26)  δJ = ε⟨η, J'(y)⟩ = 0

for all admissible η, i.e.

J(Y) has an extremum at Y = y  ⟹  δJ(y, η) = 0.

Definition: y is a critical curve (or an extremal), i.e. y is a solution of δJ = 0; J(y) is a stationary value of J; and (3.26) is a stationary condition.

To establish an extremum, we need to examine the sign of

ΔJ = J(y + εη) − J(y) = total variation of J.

We consider this later, in Part II of this course.

4. Stationary Condition (δJ = 0)

Our next step is to see what can be deduced from the condition

(4.1)  δJ = 0.

In detail this is

(4.2)  ⟨η, J'(y)⟩ = ∫_a^b η J'(y) dx = 0.

For our case J(Y) = ∫_a^b F(x, Y, Y') dx, we have

(4.3)  J'(y) = ∂F/∂y − d/dx ∂F/∂y'.
To deal with equations (4.2) and (4.3) we use the Euler-Lagrange Lemma: if ∫_a^b η g dx = 0 for every admissible η with η(a) = η(b) = 0 and g is continuous, then g ≡ 0 on (a, b). Using this lemma we can state

Theorem 4.1. A necessary condition for

(4.4)  J(Y) = ∫_a^b F(x, Y, Y') dx,

with Y(a) = y_a and Y(b) = y_b, to have an extremum at Y = y is that y is a solution of

(4.5)  J'(y) = ∂F/∂y − d/dx ∂F/∂y' = 0,

with a < x < b and F = F(x, y, y'). This is known as the Euler-Lagrange equation.

The above theorem is the Euler-Lagrange variational principle, and it is a stationary principle.
Example 4.1 (Shortest Path Problem). We now revisit Example 1.1 with the machinery that we have just set up. We had F(x, Y, Y') = √(1 + (Y')²); on the extremal we write F(x, y, y') = √(1 + (y')²). Now

∂F/∂y = 0,  ∂F/∂y' = 2y' / (2√(1 + (y')²)) = y' / √(1 + (y')²).

The Euler-Lagrange equation for this example is

0 − d/dx ( y' / √(1 + (y')²) ) = 0,  a < x < b
⟹  y' / √(1 + (y')²) = const.
⟹  y' = α
⟹  y = αx + β,

where α, β are arbitrary constants. So y = αx + β defines the critical curves. We require more information to establish a minimum.

Example 4.2 (Minimal Surface Problem). Now revisiting Example 1.2, we had F(x, Y, Y') = 2πY√(1 + (Y')²); again we write F(x, y, y') = 2πy√(1 + (y')²), and we drop the factor 2π to give us

∂F/∂y = √(1 + (y')²),  ∂F/∂y' = y y' / √(1 + (y')²).

So we are left with the Euler-Lagrange equation

√(1 + (y')²) − d/dx ( y y' / √(1 + (y')²) ) = 0.

Carrying out the differentiation and multiplying through by (1 + (y')²)^{3/2} leaves us with

(1 + (y')²) − y y'' = 0,

for finite y' in (a, b). We therefore have

(4.6)  1 + (y')² = y y''.

Start by rewriting y'' in terms of y and y':

y' = dy/dx,
y'' = dy'/dx = (dy'/dy)(dy/dx) = y' dy'/dy,
(4.7)  y'' = (1/2) d(y'²)/dy.

So, substituting (4.7) into (4.6), the Euler-Lagrange equation implies that

1 + (y')² = (y/2) d(y'²)/dy
⟹  dy/y = (1/2) d(y'²) / (1 + (y')²) = (1/2) dz / (1 + z)   (z = (y')²)
⟹  ln y = (1/2) ln(1 + (y')²) + const.
⟹  y = C √(1 + (y')²),

which is a first integral. Now integrating again gives us

y = C cosh( (x + C_1)/C ),

where C_1 and C are arbitrary constants. We note that this is a catenary curve.
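The catenary can be checked directly. The following is an illustrative sketch (not part of the original notes) using Python with SymPy: it confirms that y = C cosh((x + C_1)/C) satisfies both the reduced Euler-Lagrange equation (4.6) and the first integral y = C√(1 + (y')²).

```python
import sympy as sp

x, C, C1 = sp.symbols('x C C_1', positive=True)
y = C * sp.cosh((x + C1) / C)      # candidate extremal (catenary)
yp = sp.diff(y, x)                 # y'
ypp = sp.diff(y, x, 2)             # y''

# Reduced Euler-Lagrange equation (4.6): 1 + (y')^2 - y*y'' should vanish
print(sp.simplify(1 + yp**2 - y * ypp))           # -> 0

# First integral: y - C*sqrt(1 + (y')^2) should vanish
print(sp.simplify(y - C * sp.sqrt(1 + yp**2)))    # -> 0
```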
5. Special Cases of the Euler-Lagrange Equation for the Simplest Problem

For F(x, y, y') the Euler-Lagrange equation is a second-order differential equation in general. There are special cases worth noting.

(1) ∂F/∂y = 0, so F = F(x, y') and y is missing. The Euler-Lagrange equation is

∂F/∂y − d/dx ∂F/∂y' = 0, with ∂F/∂y = 0
⟹  d/dx ∂F/∂y' = 0
⟹  ∂F/∂y' = c

on orbits (extremals, or critical curves): a first integral. If this can be solved for y', say

y' = f(x, c),

then y(x) = ∫ f(x, c) dx + c_1.

Example 5.1. F = x² + x²(y')². Hence ∂F/∂y = 0, so the Euler-Lagrange equation has a first integral

∂F/∂y' = 2x²y' = c  ⟹  y' = c/(2x²)  ⟹  y = −c/(2x) + c_1,

which is the solution.
(2) ∂F/∂x = 0, so F = F(y, y') and x is missing. For any differentiable F(x, y, y') we have, by the chain rule,

dF/dx = ∂F/∂x + y' ∂F/∂y + y'' ∂F/∂y'
      = ∂F/∂x + y' d/dx ∂F/∂y' + y'' ∂F/∂y'   (using the Euler-Lagrange equation)
      = ∂F/∂x + d/dx ( y' ∂F/∂y' )

on orbits. Now we have

d/dx ( F − y' ∂F/∂y' ) = ∂F/∂x

on orbits. When ∂F/∂x = 0, this gives

d/dx ( F − y' ∂F/∂y' ) = 0 on orbits
⟹  F − y' F_{y'} = c

on orbits: a first integral.

Note. G := F − y' F_{y'} is the Jacobi function, and G = G(x, y, y').


Example 5.2. F = 2πy√(1 + (y')²), an example of case (2). So we have

F_{y'} = 2πy y' / √(1 + (y')²),

and the first integral is

F − y' F_{y'} = 2πy { √(1 + (y')²) − (y')²/√(1 + (y')²) }
             = 2πy / √(1 + (y')²)
             = const. = 2πc,

i.e.

y = c √(1 + (y')²): a first integral.

This arose in Example 4.2 as a result of integrating the Euler-Lagrange equation once.
(3) ∂F/∂y' = 0, so F = F(x, y), i.e. y' is missing. The Euler-Lagrange equation is

∂F/∂y − d/dx ∂F/∂y' = 0  ⟹  ∂F/∂y = 0,

but this isn't a differential equation in y.

Example 5.3. F(x, y, y') = y ln y + xy, and then the Euler-Lagrange equation gives

∂F/∂y = 0  ⟹  ln y + 1 + x = 0  ⟹  ln y = −(1 + x)  ⟹  y = e^{−(1+x)}.
6. Change of Variables

In the shortest path problem above, we note that J(Y) does not depend on the coordinate system chosen. Suppose we require the extremal for

(6.1)  J(r) = ∫_{θ_0}^{θ_1} √( r² + (r')² ) dθ,

where r = r(θ) and r' = dr/dθ. The Euler-Lagrange equation is

(6.2)  ∂F/∂r − d/dθ ∂F/∂r' = 0  ⟹  r/√(r² + (r')²) − d/dθ ( r'/√(r² + (r')²) ) = 0.

To simplify this we can
(a) change variables in (6.2), or
(b) change variables in (6.1).

For example, in (6.1) take

x = r cos θ,  dx = dr cos θ − r sin θ dθ,
y = r sin θ,  dy = dr sin θ + r cos θ dθ.

So we obtain

dx² + dy² = dr² + r² dθ²
(1 + (y')²) dx² = { (dr/dθ)² + r² } dθ²
√(1 + (y')²) dx = √( r² + (r')² ) dθ.

This is just the shortest path problem, hidden in polar coordinates. We know the solution:

J(r) = J(y) = ∫ √(1 + (y')²) dx.

Now the Euler-Lagrange equation is y'' = 0, so y = αx + β. In polars,

r sin θ = α r cos θ + β  ⟹  r = β / (sin θ − α cos θ).
7. Several Independent Functions of One Variable

Consider a generalisation of the simplest problem:

(7.1)  J(Y_1, Y_2, …, Y_n) = ∫_a^b F(x, Y_1, …, Y_n, Y_1', …, Y_n') dx.

Here each curve Y_k goes through given end points

(7.2)  Y_k(a) = y_{ka},  Y_k(b) = y_{kb},  for k = 1, …, n.

Find the critical curves y_k, k = 1, …, n. Take

(7.3)  Y_k = y_k(x) + ε_k η_k(x),

where η_k(a) = 0 = η_k(b). By taking a Taylor expansion around the point (x, y_1, …, y_n, y_1', …, y_n') we find that the first variation of J is

(7.4)  δJ = Σ_{k=1}^n ε_k ⟨η_k, J_k'(y_1, …, y_n)⟩,

with

(7.5)  J_k'(y_1, …, y_n) = ∂F/∂y_k − d/dx ∂F/∂y_k'

for k = 1, …, n. The stationary condition δJ = 0 gives

(7.6)  Σ_{k=1}^n ε_k ⟨η_k, J_k'⟩ = 0
(7.7)  ⟨η_k, J_k'⟩ = 0 for each k, since the η_k are independent.

Thus, by the Euler-Lagrange Lemma, this implies

(7.8)  J_k' = 0

for k = 1, …, n, i.e.

(7.9)  ∂F/∂y_k − d/dx ∂F/∂y_k' = 0

for k = 1, …, n. (7.9) is a system of n Euler-Lagrange equations. These are solved subject to the boundary conditions (7.2).
Example 7.1. F = y_1(y_2')² + y_1²y_2' + y_1'y_2', and so we obtain the pair of equations

J_1' = 0  ⟹  (y_2')² + 2y_1y_2' − d/dx (y_2') = 0,
J_2' = 0  ⟹  −d/dx ( 2y_1y_2' + y_1² + y_1' ) = 0.

Consider

dF/dx = ∂F/∂x + Σ_{k=1}^n ( y_k' ∂F/∂y_k + y_k'' ∂F/∂y_k' ),

the general chain rule. Then

dF/dx = ∂F/∂x + Σ_{k=1}^n ( y_k' d/dx ∂F/∂y_k' + y_k'' ∂F/∂y_k' )

on orbits,

= ∂F/∂x + Σ_{k=1}^n d/dx ( y_k' ∂F/∂y_k' ),

and this is again on orbits. Now we obtain

d/dx { F − Σ_{k=1}^n y_k' ∂F/∂y_k' } = ∂F/∂x

on orbits. We define G = F − Σ_k y_k' ∂F/∂y_k'. Then

dG/dx = ∂F/∂x.

If ∂F/∂x = 0, i.e. F does not depend explicitly on x, we have

dG/dx = 0

on orbits, i.e.

F − Σ_{k=1}^n y_k' ∂F/∂y_k' = c

on orbits, where c is a constant. This is a first integral, i.e. G is constant on orbits. This is the Jacobi integral.
8. Double Integrals

Here we look at functions of two variables, i.e. surfaces,

(8.1)  φ = φ(x, y).

We then take the integral

(8.2)  J(φ) = ∬_R F(x, y, φ, φ_x, φ_y) dx dy,

with φ = φ_B given on the boundary ∂R of R. Here R is some closed region in the xy-plane and φ_x = ∂φ/∂x, φ_y = ∂φ/∂y. Assume F has continuous first and second derivatives with respect to x, y, φ, φ_x, φ_y. Suppose J(φ) has a minimum for the surface φ, and consider

J(φ + εη) = ∬_R F(x, y, φ + εη, φ_x + εη_x, φ_y + εη_y) dx dy

and expand in a Taylor series:

= ∬_R ( F(x, y, φ, φ_x, φ_y) + εη ∂F/∂φ + εη_x ∂F/∂φ_x + εη_y ∂F/∂φ_y + O_2 ) dx dy
(8.3)  = J(φ) + δJ + O_2,

where

(8.4)  δJ = ε ∬_R ( η ∂F/∂φ + η_x ∂F/∂φ_x + η_y ∂F/∂φ_y ) dx dy

is the first variation. For an extremum at φ, it is necessary that

(8.5)  δJ = 0.

To use this we rewrite (8.4):

(8.6)  δJ = ε ∬_R [ η { ∂F/∂φ − ∂/∂x (∂F/∂φ_x) − ∂/∂y (∂F/∂φ_y) } + ∂/∂x ( η ∂F/∂φ_x ) + ∂/∂y ( η ∂F/∂φ_y ) ] dx dy.

Now by Green's Theorem we have

∬_R ∂P/∂x dx dy = ∮_{∂R} P dy,   with P = η ∂F/∂φ_x,
(8.7)  ∬_R ∂Q/∂y dx dy = −∮_{∂R} Q dx,   with Q = η ∂F/∂φ_y.

So,
(8.8)  δJ = ε ∬_R η ( ∂F/∂φ − ∂/∂x ∂F/∂φ_x − ∂/∂y ∂F/∂φ_y ) dx dy + ε ∮_{∂R} η ( ∂F/∂φ_x dy − ∂F/∂φ_y dx ).

If we choose all functions, including φ + εη, to satisfy

φ = φ_B on ∂R, then φ + εη = φ_B on ∂R, so η = 0 on ∂R.

Hence (8.8) simplifies to

δJ = ε ∬_R η(x, y) ( ∂F/∂φ − ∂/∂x ∂F/∂φ_x − ∂/∂y ∂F/∂φ_y ) dx dy = ε ⟨η, J'(φ)⟩.

By a simple extension of the Euler-Lagrange lemma, since η is arbitrary in R, we have δJ = 0 ⟹ φ(x, y) is a solution of

(8.9)  J'(φ) = ∂F/∂φ − ∂/∂x ∂F/∂φ_x − ∂/∂y ∂F/∂φ_y = 0.

This is the Euler-Lagrange equation, now a partial differential equation. We seek the solution which takes the given values φ_B on ∂R.
Example 8.1. F = F(x, y, φ, φ_x, φ_y) = (1/2)φ_x² + (1/2)φ_y² + fφ, where f = f(x, y) is given. Here

∂F/∂φ = f,  ∂F/∂φ_x = φ_x,  ∂F/∂φ_y = φ_y.

So the Euler-Lagrange equation

∂F/∂φ − ∂/∂x ∂F/∂φ_x − ∂/∂y ∂F/∂φ_y = 0  ⟹  f − φ_xx − φ_yy = 0,

i.e. we are left with

∂²φ/∂x² + ∂²φ/∂y² = f,

which is Poisson's equation.
Example 8.2. F = F(x, t, φ, φ_x, φ_t) = (1/2)φ_x² − (1/(2c²))φ_t². The Euler-Lagrange equation is

∂F/∂φ − ∂/∂x ∂F/∂φ_x − ∂/∂t ∂F/∂φ_t = 0
⟹  0 − ∂/∂x (φ_x) − ∂/∂t ( −(1/c²)φ_t ) = 0
⟹  ∂²φ/∂x² = (1/c²) ∂²φ/∂t².

This is the classical wave equation.

9. Canonical Euler Equations (Euler-Hamilton)

9.1. The Hamiltonian

We have the Euler-Lagrange equations

(9.1)  ∂F/∂y_k − d/dx ∂F/∂y_k' = 0,

which give the critical curves y_1, …, y_n of

(9.2)  J(Y_1, …, Y_n) = ∫ F(x, Y_1, …, Y_n, Y_1', …, Y_n') dx.

Equations (9.1) form a system of n second-order differential equations. We shall now rewrite this as a system of 2n first-order differential equations. First we introduce new variables

(9.3)  p_i = ∂F/∂y_i',  i = 1, …, n;

p_i is said to be the variable conjugate to y_i. We suppose that equations (9.3) can be solved to give y_i' as a function ψ_i of x, y_j, p_j (j = 1, …, n). Then it is possible to define a new function H by the equation

(9.4)  H(x, y_1, …, y_n, p_1, …, p_n) = Σ_{i=1}^n p_i y_i' − F(x, y_1, …, y_n, y_1', …, y_n'),

where y_i' = ψ_i(x, y_j, p_j). The function H is called the Hamiltonian corresponding to (9.2). Now look at the differential of H, which by (9.4) is
dH = Σ_{i=1}^n ( p_i dy_i' + y_i' dp_i ) − ∂F/∂x dx − Σ_{i=1}^n ( ∂F/∂y_i dy_i + ∂F/∂y_i' dy_i' )

(9.5)  = −∂F/∂x dx + Σ_{i=1}^n ( y_i' dp_i − ∂F/∂y_i dy_i ),

using (9.3), since the dy_i' terms cancel. For H = H(x, y_1, …, y_n, p_1, …, p_n) we have

dH = ∂H/∂x dx + Σ_{i=1}^n ( ∂H/∂y_i dy_i + ∂H/∂p_i dp_i ).

Comparison with (9.5) gives

y_i' = ∂H/∂p_i,  ∂H/∂y_i = −∂F/∂y_i = −d/dx ∂F/∂y_i' = −dp_i/dx

(using (9.1) and (9.3) in the last two steps), so that

(9.6)  dy_i/dx = ∂H/∂p_i,  dp_i/dx = −∂H/∂y_i

for i = 1, …, n. Equations (9.6) are the canonical Euler-Lagrange equations associated with the integral (9.2).
Example 9.1. Take

(9.7)  J(Y) = ∫_a^b ( α(Y')² + βY² ) dx,

where α, β are given functions of x. For this F(x, y, y') = α(y')² + βy², so

p = ∂F/∂y' = 2αy'  ⟹  y' = p/(2α).

The Hamiltonian H is, by (9.4),

H = py' − F
  = py' − α(y')² − βy²   with y' = p/(2α)
  = p²/(2α) − p²/(4α) − βy²
  = p²/(4α) − βy²

in the correct variables x, y, p. The canonical equations are then

(9.8)  dy/dx = ∂H/∂p = p/(2α),  dp/dx = −∂H/∂y = 2βy.
The ordinary Euler-Lagrange equation for J(Y) is

∂F/∂y − d/dx ∂F/∂y' = 0  ⟹  2βy − d/dx (2αy') = 0,

which is equivalent to (9.8).
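As a quick check of (9.4) and (9.8), here is an illustrative SymPy sketch (not part of the original notes; it treats α and β as constants for simplicity, although in the text they may depend on x). It performs the Legendre transform for F = α(y')² + βy² and recovers the Hamiltonian and canonical equations above.

```python
import sympy as sp

x, y, yp, p, alpha, beta = sp.symbols('x y yp p alpha beta', positive=True)

F = alpha * yp**2 + beta * y**2              # integrand F(x, y, y'), yp stands for y'
p_def = sp.diff(F, yp)                       # conjugate variable p = dF/dy' = 2*alpha*y'
yp_of_p = sp.solve(sp.Eq(p, p_def), yp)[0]   # invert (9.3): y' = p/(2*alpha)

H = sp.simplify((p * yp - F).subs(yp, yp_of_p))   # Hamiltonian H = p*y' - F
print(H)                                     # -> p**2/(4*alpha) - beta*y**2

# Canonical equations (9.6): dy/dx = dH/dp, dp/dx = -dH/dy
print(sp.diff(H, p), -sp.diff(H, y))         # -> p/(2*alpha)   2*beta*y
```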

9.2. The Euler-Hamilton (Canonical) Variational Principle

Let

I(P, Y) = ∫_a^b ( P dY/dx − H(x, Y, P) ) dx

be defined for any admissible independent functions P and Y with Y(a) = y_a, Y(b) = y_b, i.e.

Y = y_B := { y_a at x = a;  y_b at x = b }.

Suppose I(P, Y) is stationary at Y = y, P = p. Take varied curves

Y = y + εη,  P = p + εζ.

Then we obtain

I(p + εζ, y + εη) = ∫_a^b { (p + εζ) d/dx (y + εη) − H(x, y + εη, p + εζ) } dx
 = ∫_a^b { p dy/dx + εζ dy/dx + εp dη/dx + ε²ζ dη/dx − H(x, y, p) − εη ∂H/∂y − εζ ∂H/∂p + O(ε²) } dx
 = I(p, y) + δI + O_2,

where we have

δI = ε ∫_a^b ( ζ dy/dx + p dη/dx − η ∂H/∂y − ζ ∂H/∂p ) dx
   = ε ∫_a^b { ζ ( dy/dx − ∂H/∂p ) − η ( dp/dx + ∂H/∂y ) } dx + ε [pη]_a^b   (= 0),

after integrating p dη/dx by parts. If all curves Y, including y, go through (a, y_a) and (b, y_b), then η = 0 at x = a and x = b, so the boundary term vanishes. Then δI = 0 ⟺ (p, y) are solutions of

dy/dx = ∂H/∂p,  dp/dx = −∂H/∂y  (a < x < b)

with

y = y_B = { y_a at x = a;  y_b at x = b }.

Note. If Y = y_B is imposed, no boundary conditions are required on p.

9.3. An Extension

Modify I(P, Y) to

I_mod(P, Y) = ∫_a^b ( P dY/dx − H(x, P, Y) ) dx − [ P(Y − y_B) ]_a^b.

Here P and Y are any admissible functions; I_mod is stationary at P = p, Y = y. Take Y = y + εη, P = p + εζ and set δI_mod = 0. You should find that (y, p) solve

dy/dx = ∂H/∂p,  dp/dx = −∂H/∂y

with y = y_B at x = a and x = b.

10. First Integrals of the Canonical Equations

A first integral of a system of differential equations is a function which has constant value along each solution of the differential equations. We now look for first integrals of the canonical system

(10.1)  dy_i/dx = ∂H/∂p_i,  dp_i/dx = −∂H/∂y_i  (i = 1, …, n),

and hence of the system

(10.2)  ∂F/∂y_i − d/dx ∂F/∂y_i' = 0  (i = 1, …, n),

which is equivalent to (10.1). Take the case where

∂F/∂x = 0,

i.e. F = F(y_1, …, y_n, y_1', …, y_n'). Then

H = Σ_{i=1}^n p_i y_i' − F

is such that ∂H/∂x = 0, and hence

dH/dx = ∂H/∂x (= 0) + Σ_{i=1}^n ( ∂H/∂y_i dy_i/dx + ∂H/∂p_i dp_i/dx ).

On critical curves (orbits) this gives

dH/dx = Σ_{i=1}^n ( (∂H/∂y_i)(∂H/∂p_i) + (∂H/∂p_i)(−1)(∂H/∂y_i) ) = 0,

which implies H is constant on orbits. Consider now an arbitrary differentiable function W = W(x, y_1, …, y_n, p_1, …, p_n). Then

dW/dx = ∂W/∂x + Σ_{i=1}^n ( ∂W/∂y_i dy_i/dx + ∂W/∂p_i dp_i/dx )
      = ∂W/∂x + Σ_{i=1}^n ( ∂W/∂y_i ∂H/∂p_i − ∂W/∂p_i ∂H/∂y_i )

on orbits.

Definition: We define the Poisson bracket of X and Y to be

[X, Y] = Σ_{i=1}^n ( ∂X/∂y_i ∂Y/∂p_i − ∂X/∂p_i ∂Y/∂y_i ),

with X = X(y_i, p_i) and Y = Y(y_i, p_i).

Then we have

dW/dx = ∂W/∂x + [W, H]

on orbits. In the case when ∂W/∂x = 0, we have

dW/dx = [W, H].

So,

dW/dx = 0  ⟺  [W, H] = 0.
11. Variable End Points (in the y Direction)

We let the end points be variable in the y direction:

J(Y) = ∫_a^b F(x, Y, Y') dx,

where Y is not necessarily prescribed at the end points. Suppose J is stationary for Y = y. Take Y = y + εη. Then

J(y + εη) = ∫_a^b F(x, y + εη, y' + εη') dx
          = ∫_a^b ( F(x, y, y') + εη ∂F/∂y + εη' ∂F/∂y' + O(ε²) ) dx.

Thus

δJ = ε ∫_a^b ( η ∂F/∂y + η' ∂F/∂y' ) dx
   = ε ∫_a^b η ( ∂F/∂y − d/dx ∂F/∂y' ) dx + ε [ η ∂F/∂y' ]_a^b.

If J(Y) has a minimum for Y = y, we need δJ = 0. There are four possible cases in which we require this to happen, shown diagrammatically in Figure 9.

Figure 9. The four possible cases of varying end points in the y direction: (i) η vanishes at both ends; (ii) at neither end; (iii) only at x = a; (iv) only at x = b.

(i) In this case we have η = 0 at x = a and x = b. This gives us

δJ = ε ∫_a^b η ( ∂F/∂y − d/dx ∂F/∂y' ) dx = 0
⟹  ∂F/∂y − d/dx ∂F/∂y' = 0  (a < x < b),

which is just our standard Euler-Lagrange equation.
(ii) In this case we have the opposite situation: η ≠ 0 is allowed at both x = a and x = b. In addition to the Euler-Lagrange equation (η is still arbitrary in the interior) this gives us

[ η ∂F/∂y' ]_a^b = 0  ⟹  ∂F/∂y' = 0 at x = a and at x = b.

(iii) In this case only one end point gives η = 0, namely η(a) = 0, but η ≠ 0 is allowed at x = b. This gives us the criterion

∂F/∂y' = 0 at x = b.

(iv) Our final case is where η(b) = 0 but η(a) ≠ 0 is allowed. This gives us the condition

∂F/∂y' = 0 at x = a.

Hence δJ = 0 ⟺ y satisfies the Euler-Lagrange equation

∂F/∂y − d/dx ∂F/∂y' = 0  (a < x < b)

together with, at each free end,

∂F/∂y' = 0 at x = a and/or ∂F/∂y' = 0 at x = b.

If Y is not prescribed at an end point, e.g. x = b, then we require ∂F/∂y' = 0 at x = b. Such a condition is called a natural boundary condition.
Example 11.1 (The simplest problem, revisited). We go back to the simplest problem, with F = (Y')² and thus

J(Y) = ∫_0^1 (Y')² dx

with Y(0) = 0 and Y(1) = 1. The Euler-Lagrange equation is y'' = 0, which implies y = ax + b. Using the boundary conditions we get

y(0) = 0 ⟹ b = 0,  y(1) = 1 ⟹ a = 1,

so y = x is our critical curve.
Example 11.2. Let us re-examine this problem in the light of our new theory. We have again

J(Y) = ∫_0^1 (Y')² dx,

but now Y is not prescribed at x = 0, 1. The Euler-Lagrange equation is y'' = 0, which implies y = αx + β. The natural boundary conditions at x = 0, 1 give

∂F/∂y' = 2y' = 0  ⟹  α = 0,

and hence y = β, a constant.

12. Variable End Points: Variable in the x and y Directions

Figure 10. Variable end points in both directions: the extremal y from (a, y_a) to (b, y_b), and a varied curve y + εη from (a + δa, y_a + δy_a) to (b + δb, y_b + δy_b).

Recall from section 11 that on (a, b) the Euler-Lagrange equation holds, with the natural conditions

∂F/∂y' = 0 at x = a,  ∂F/∂y' = 0 at x = b

at any end point where Y is free. Now let the end points vary in both directions. Let our J to be minimised be

(12.1)  J(y) = ∫_a^b F(x, y, y') dx.
Here y is an extremal, and we write

(12.2)  y(a) = y_a,  y(b) = y_b.

Suppose that the varied curve Y = y + εη is defined over (a + δa, b + δb), so that

(12.3)  J(y + εη) = ∫_{a+δa}^{b+δb} F(x, y + εη, y' + εη') dx,

and at the end points of this curve we write

(12.4)  y + εη = y_a + δy_a at x = a + δa;  y + εη = y_b + δy_b at x = b + δb.

To find the first variation δJ we must obtain the terms in J(y + εη) which are linear in the first-order quantities εη, δa, δb, δy_a, δy_b. The total variation (difference) is

ΔJ = J(y + εη) − J(y)
   = ∫_{a+δa}^{b+δb} F(x, y + εη, y' + εη') dx − ∫_a^b F(x, y, y') dx,

which we rewrite as

(12.5)  ΔJ = ∫_a^b { F(x, y + εη, y' + εη') − F(x, y, y') } dx
           + ∫_b^{b+δb} F(x, y + εη, y' + εη') dx
           − ∫_a^{a+δa} F(x, y + εη, y' + εη') dx.

We have to assume here that y and η are defined on the interval (a, b + δb), i.e. extensions of y and y + εη are needed, and we suppose that this is done (by Taylor expansion - see later). Then, expanding F(x, y + εη, y' + εη') about (x, y, y'),

(12.6)  ΔJ = ∫_a^b ( εη ∂F/∂y + εη' ∂F/∂y' + O(ε²) ) dx
           + ∫_b^{b+δb} { F(x, y, y') + O(ε) } dx
           − ∫_a^{a+δa} { F(x, y, y') + O(ε) } dx,

and the linear terms in this case are
δJ = ε ∫_a^b ( η ∂F/∂y + η' ∂F/∂y' ) dx + F(x, y, y')|_{x=b} δb − F(x, y, y')|_{x=a} δa
   = ε ∫_a^b ( η ∂F/∂y + η' ∂F/∂y' ) dx + [ F(x, y, y') δx ]_{x=a}^{x=b}

(12.7)  = ∫_a^b εη ( ∂F/∂y − d/dx ∂F/∂y' ) dx + [ εη ∂F/∂y' + F(x, y, y') δx ]_a^b.

The final step is to find εη in terms of δy, δa, δb at x = a and x = b. Use (12.4) to obtain

y_a + δy_a = y(a + δa) + εη(a + δa)
           = y(a) + δa y'(a) + ⋯ + εη(a) + δa εη'(a) + ⋯ .

The first-order terms are then δy_a = δa y'(a) + εη(a), so

(12.8)  εη(a) = δy_a − δa y'(a),
(12.9)  εη(b) = δy_b − δb y'(b).

We can now write (12.7) as

(12.10)  δJ = ∫_a^b εη ( ∂F/∂y − d/dx ∂F/∂y' ) dx + [ ∂F/∂y' δy − ( y' ∂F/∂y' − F ) δx ]_a^b.

This is the General First Variation of the integral J. If we introduce the canonical variable p conjugate to y, defined by

(12.11)  p = ∂F/∂y',

and the Hamiltonian H given by

(12.12)  H = py' − F,

we can rewrite (12.10) as

(12.13)  δJ = ∫_a^b εη ( ∂F/∂y − d/dx ∂F/∂y' ) dx + [ p δy − H δx ]_a^b.

If J has an extremum for Y = y, then δJ = 0 ⟹ y is a solution of

(12.14)  ∂F/∂y − d/dx ∂F/∂y' = 0  (a < x < b)

with

(12.15)  [ p δy − H δx ]_a^b = 0.

13. Hamilton-Jacobi Theory

13.1. The Hamilton-Jacobi Equation

For our standard integral

(13.1)  J(y) = ∫_a^b F(x, y, y') dx,

the general first variation is

(13.2)  δJ = ∫_a^b εη ( F_y − d/dx F_{y'} ) dx + [ p δy − H δx ]_a^b,

where p = ∂F/∂y' and H = py' − F.

Figure 11. Extremal curves C_1 from A(a, y_a) to B_1(b, y_b) and C_2 from A to B_2(b + δb, y_b + δy_b).

We apply (13.2) to the case in the above figure, where A is fixed at (a, y_a) and C_1, C_2 are both extremal curves, to B_1(b, y_b) and B_2(b + δb, y_b + δy_b) respectively. Then we have
J(C_1) = function of B_1 = S(b, y_b),
J(C_2) = function of B_2 = S(b + δb, y_b + δy_b).

Consider the total variation

ΔS = S(b + δb, y_b + δy_b) − S(b, y_b).

By (13.2), applied along extremals (so that the integral term vanishes), the first-order terms of ΔS are

(13.3)  δS = p δy_b − H δb.

Now, B_1(b, y_b) may be any end point B(x, y), say, and by (13.3)

(13.4)  δS(x, y) = p δy − H δx.

Compare with, for a given function S(x, y),

δS = S_y δy + S_x δx,

which gives

(13.5)  S_y = p,  S_x = −H.

Hence

(13.6)  S_x + H(x, y, S_y) = 0,

which is the Hamilton-Jacobi equation, where p (= ∂F/∂y') = S_y, with y' denoting the derivative dy/dx calculated at B(x, y) for the extremal going from A to B.
Example 13.1 (Hamilton-Jacobi). Let our integral be

J(y) = ∫_a^b (y')² dx,

and thus F(x, y, y') = (y')². Thus we have

p = ∂F/∂y' = 2y'  ⟹  y' = p/2,

H = py' − F = p(p/2) − (p/2)² = p²/4.

Then the Hamilton-Jacobi equation is

S_x + H(x, y, p = S_y) = 0,
S_x + (1/4)(S_y)² = 0.

(1) How do we solve this for S = S(x, y)?
(2) Then, how do we find the extremals?

13.2. Hamilton-Jacobi Theorem

See handout.

13.3. Structure of S

For S = S(x, y, α) we have, by the chain rule,

dS/dx = ∂S/∂x + ∂S/∂y dy/dx.

Now, on a critical curve C_0 we have

∂S/∂x = −H,  ∂S/∂y = p.

So,

dS/dx = −H + p dy/dx  on C_0,
dS = −H dx + p dy  on C_0.

Integrating, we obtain

S(x, y, α) = ∫_{C_0} ( p dy − H dx ) = ∫_{C_0} F dx

up to an additive constant, since (py' − H) dx = F dx. So Hamilton's principal function S is equal to J(y) evaluated along a critical curve.

(1) Case ∂H/∂x = 0, i.e. H = H(y, p). In this case, by the result in section 10, H = const. = α, say, on C_0. This gives p = f(α, y), say. Then
S = ∫_{C_0} ( p dy − H dx ) = ∫_{C_0} { f(α, y) dy − α dx } = W(y) − αx.

Put this in the Hamilton-Jacobi equation to find W(y) and hence S. Note that the separation is additive, rather than multiplicative (S = X(x)Y(y)).

(2) Case ∂H/∂y = 0, hence H = H(x, p). On C_0,

dp/dx = −∂H/∂y = 0  ⟹  p = const.

on C_0, say c. Then

S = ∫_{C_0} ( p dy − H dx ) = ∫_{C_0} ( c dy − H(x, p = c) dx ) = cy − V(x).

Again, this separates the x and y variables in the Hamilton-Jacobi equation. In this case, find V(x), and hence S, by evaluating

V(x) = ∫ H(x, p = c) dx.

13.4. Examples

Example 13.2. Let our integral be

J(y) = ∫_a^b (y')² dx.

Then we have F = (y')². Thus

p = F_{y'} = 2y'  ⟹  y' = p/2,
H = py' − F  ⟹  H = p²/4.

The Hamilton-Jacobi equation is

∂S/∂x + H( x, y, p = ∂S/∂y ) = 0,

(∗)  ∂S/∂x + (1/4)(∂S/∂y)² = 0.

(1) Now, ∂H/∂x = 0 here. So

H = const. = α,

say, on C_0. Hence, by the result of 13.3, we can write

S(x, y) = W(y) − αx.

Put this in (∗); then

−α + (1/4)(dW/dy)² = 0
⟹  dW/dy = 2√α  ⟹  W(y) = 2√α y.

So we have

S = 2√α y − αx.

The extremals are

∂S/∂y = p  ⟹  p = 2√α,
∂S/∂α = const. = β  ⟹  y/√α − x = β  ⟹  y = √α (x + β),

which are straight lines.

(2) This is also an example of ∂H/∂y = 0: on C_0,

dp/dx = −∂H/∂y = 0  ⟹  p = const. = c,

say. Then

S(x, y) = ∫_{C_0} ( p dy − H dx )
        = ∫_{C_0} ( c dy − (1/4)c² dx )
        = cy − (1/4)c² x.

The extremals are

p = ∂S/∂y = c,  β = ∂S/∂c = const.
⟹  β = y − (1/2)cx  ⟹  y = (1/2)cx + β,

which are straight lines (c, β constants).
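Both forms of S found above can be checked against the Hamilton-Jacobi equation (∗). The following is an illustrative SymPy sketch (not part of the original notes) verifying that S = 2√α y − αx and S = cy − (1/4)c²x each make the residual S_x + (1/4)S_y² vanish.

```python
import sympy as sp

x, y, alpha, c = sp.symbols('x y alpha c', positive=True)

def hj_residual(S):
    # Residual of the Hamilton-Jacobi equation: S_x + (1/4)*S_y**2
    return sp.simplify(sp.diff(S, x) + sp.Rational(1, 4) * sp.diff(S, y)**2)

S1 = 2 * sp.sqrt(alpha) * y - alpha * x    # separation with H = const = alpha
S2 = c * y - sp.Rational(1, 4) * c**2 * x  # separation with p = const = c
print(hj_residual(S1), hj_residual(S2))    # -> 0 0
```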
p
Example 13.3. We have F(x, y, y') = y√(1 + (y')²), and we note ∂F/∂x = 0. Now

p = ∂F/∂y' = y y' / √(1 + (y')²)  ⟹  y' = p / √(y² − p²).

Also,

H = py' − F(x, y, y'), with y' = p/√(y² − p²),
  = −√(y² − p²).

The Hamilton-Jacobi equation H + ∂S/∂x = 0 is in this case

∂S/∂x − { y² − (∂S/∂y)² }^{1/2} = 0,

i.e. we have

(∂S/∂x)² + (∂S/∂y)² = y².

Hence ∂H/∂x = 0, and so H = const. = −α, say, on C_0. So we can write S(x, y) = W(y) + αx, with

(dW/dy)² = y² − α²
⟹  W(y) = ∫^y (v² − α²)^{1/2} dv.

So we have

S(x, y) = ∫^y (v² − α²)^{1/2} dv + αx.

To find the extremals we use the Hamilton-Jacobi theorem:

∂S/∂α = const. = β,

say. Then

β = ∂S/∂α = −∫^y α/√(v² − α²) dv + x = x − α cosh^{−1}(y/α),

giving

y = α cosh( (x − β)/α ),

a catenary.

Part II - Extremum Principles

14. The Second Variation - Introduction to Minimum Problems

So far we have considered the stationary aspect of variational problems (i.e. δJ = 0). Now we shall look at the maximum or minimum aspect. That is, we look for the function y (if there is one) which makes J(Y) take a maximum or a minimum value in a certain class of functions. So

J(y) ≤ J(Y) for all Y:  a MINIMUM;
J(Y) ≤ J(y) for all Y:  a MAXIMUM.

In what follows we shall discuss the MIN case, since the MAX case is equivalent to a MIN case for −J(Y). For the integral

J(Y) = ∫_a^b F(x, Y, Y') dx,

which is stationary for Y = y (δJ = 0), we have

J(y + εη) = J(y) + (ε²/2) ∫_a^b { η² F_{yy} + 2ηη' F_{yy'} + (η')² F_{y'y'} } dx + O(ε³)
          = J(y) + δ²J + O_3.

So, provided ε is small enough, the second variation δ²J determines the sign of

ΔJ = J(Y) − J(y).

For quadratic problems, e.g. F = ay² + byy' + c(y')², the O_3 terms are zero and

ΔJ = J(Y) − J(y) = δ²J.

15. Quadratic Problems

Consider

(15.1)  J(Y) = ∫_a^b { (1/2)v(Y')² + (1/2)wY² − fY } dx

with

(15.2)  Y(a) = y_a,  Y(b) = y_b,

where v, w and f are given functions of x in general. The associated Euler-Lagrange equation is

(15.3)  −d/dx ( v dy/dx ) + wy = f,  a < x < b,

with

(15.4)  y(a) = y_a,  y(b) = y_b.

If y is a critical curve (solution of (15.3) and (15.4)), then from (15.1)

(15.5)  ΔJ = J(y + εη) − J(y) = (ε²/2) ∫_a^b { v(η')² + wη² } dx,

with η = 0 at x = a, b. Has this a definite sign? To see where (15.5) comes from, expand:

J(y + εη) = ∫_a^b { (1/2)v(y' + εη')² + (1/2)w(y + εη)² − f(y + εη) } dx
 = ∫_a^b { (1/2)v(y')² + (1/2)wy² − fy } dx + ε ∫_a^b { vy'η' + wyη − fη } dx
   + ε² ∫_a^b { (1/2)v(η')² + (1/2)wη² } dx
 = J(y) + δJ + δ²J,
and we have

δJ = ε ∫_a^b { vy'η' + wyη − fη } dx
   = ε ∫_a^b η { −d/dx (vy') + wy − f } dx + ε [vy'η]_a^b   (= 0)
   = 0,

since y satisfies (15.3) and η vanishes at the end points. We are therefore left with

(15.6)  ΔJ = J(y + εη) − J(y) = δ²J = (ε²/2) ∫_a^b { v(η')² + wη² } dx.

Has this a definite sign?

Special Case
Take v and w to be constants and let v > 0 (if v < 0, change the sign of J). Then

(15.7)  ΔJ = (ε²v/2) ∫_a^b { (η')² + (w/v)η² } dx.

The sign of ΔJ depends only on the sign of

K(η) = ∫_a^b { (η')² + (w/v)η² } dx,

where η(a) = 0 and η(b) = 0. Clearly K > 0 if w/v > 0. To go further, integrate the first term by parts. Then

(15.8)  K(η) = ∫_a^b η { −d²/dx² + w/v } η dx + [ηη']_a^b   (= 0).

The idea is to expand η in eigenfunctions of the operator −d²/dx²:

−d²φ_n/dx² = λ_n φ_n,  φ_n = sin( nπ(x − a)/(b − a) ),  λ_n = n²π²/(b − a)².

Expand

η = Σ_{n=1}^∞ a_n φ_n(x),

where φ_n = sin( nπ(x − a)/(b − a) ). So,


(15.9)  K(η) = ∫_a^b ( Σ_{m=1}^∞ a_m φ_m(x) ) { −d²/dx² + w/v } ( Σ_{n=1}^∞ a_n φ_n(x) ) dx,

where ∫_a^b φ_n φ_m dx = 0 for n ≠ m. Then

(15.10)  K(η) = Σ_m Σ_n a_m a_n ∫_a^b φ_m { −d²/dx² + w/v } φ_n dx,
         where { −d²/dx² + w/v } φ_n = ( n²π²/(b − a)² + w/v ) φ_n,

(15.11)  = Σ_m Σ_n a_m a_n ( n²π²/(b − a)² + w/v ) ∫_a^b φ_n φ_m dx   (= 0 for m ≠ n)

(15.12)  = Σ_{n=1}^∞ a_n² ( n²π²/(b − a)² + w/v ) ∫_a^b φ_n² dx.

Hence, in (15.12), if

(15.13)  π²/(b − a)² + w/v > 0

(the n = 1 term), then

(15.14)  K(η) > 0

and

(15.15)  ΔJ > 0,

which is the MINIMUM PRINCIPLE. Note (15.13) is equivalent to

(15.16)  w > −vπ²/(b − a)².

So, for some negative w we still get a MIN principle for J(Y). In the borderline case w/v = −π²/(b − a)², (15.12) gives K(η) ≥ 0, i.e.

∫_a^b { (η')² − (π²/(b − a)²) η² } dx ≥ 0

with η(a) = 0 = η(b). That is:

Theorem 15.1 (Wirtinger's Inequality).

(15.17)  ∫_a^b (u')² dx ≥ ( π²/(b − a)² ) ∫_a^b u² dx

for all u such that u(a) = 0 = u(b).
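A quick illustration of (15.17) (an illustrative sketch, not part of the original notes): for the trial function u = x(1 − x) on (0, 1), both sides can be integrated exactly, giving ∫(u')² = 1/3 against π²∫u² = π²/30 ≈ 0.329.

```python
import sympy as sp

x = sp.symbols('x')
a, b = 0, 1
u = x * (1 - x)                     # satisfies u(0) = 0 = u(1)

lhs = sp.integrate(sp.diff(u, x)**2, (x, a, b))               # 1/3
rhs = sp.pi**2 / (b - a)**2 * sp.integrate(u**2, (x, a, b))   # pi**2/30
print(lhs, rhs, float(lhs) >= float(rhs))                     # 1/3  pi**2/30  True
```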

16. Isoperimetric (Constraint) Problems

16.1. Lagrange Multipliers in Elementary Calculus

Find the extremum of z = f(x, y) subject to g(x, y) = 0. Suppose g(x, y) = 0 determines y = y(x). Then z = f(x, y(x)) = z(x). Make dz/dx = 0, i.e.

∂f/∂x + ∂f/∂y dy/dx = 0.

Now, from g(x, y) = 0,

∂g/∂x + ∂g/∂y dy/dx = 0.

We now eliminate dy/dx between the above equations to obtain

(16.1)  ∂f/∂x + ∂f/∂y · ( −(∂g/∂x)/(∂g/∂y) ) = 0.
Example 16.1. Let z = f(x, y) = xy with constraint g(x, y) = x + y − 1 = 0, i.e. y = 1 − x. We have discussed this problem in Example 1.4. In this case we have

z = f(x, y(x)) = x(1 − x),

which gives us

dz/dx = 1 − 2x = 0.

Going back to our theory, we obtain

y + x dy/dx = 0, i.e. 1 − x + x dy/dx = 0.

We also obtain, from the constraint,

1 + dy/dx = 0  ⟹  dy/dx = −1.

Thus y + x(−1) = 0, i.e. 1 − x − x = 0, i.e. 1 − 2x = 0. This implies

x = 1/2,  y = 1 − x = 1/2.

Thus z = 1/4 is a MAX, at (1/2, 1/2).

16.2. Lagrange Multipliers

We form

V(x, y, λ) = f(x, y) + λ g(x, y),

with no constraints on V. Make V stationary:

(a)  ∂V/∂x = ∂f/∂x + λ ∂g/∂x = 0,
(b)  ∂V/∂y = ∂f/∂y + λ ∂g/∂y = 0,
(c)  ∂V/∂λ = g(x, y) = 0.

Combining (a) and (b), eliminating λ = −(∂f/∂y)/(∂g/∂y), we obtain

∂f/∂x + ∂f/∂y · ( −(∂g/∂x)/(∂g/∂y) ) = 0,

as before in (16.1), and (c) is the constraint. We solve (a), (b) and (c) for x, y and λ.
Example 16.2. Returning to our example from before, we want to establish a maximum at (1/2, 1/2) for f. Take the points

x = 1/2 + h,  y = 1/2 + k.

Then we obtain

f(1/2 + h, 1/2 + k) = (1/2 + h)(1/2 + k) = 1/4 + (1/2)(h + k) + hk,

but this has linear terms at our stationary point. Don't forget the constraint! The point (1/2 + h, 1/2 + k) must satisfy

g = x + y − 1 = 0
⟹  1/2 + h + 1/2 + k − 1 = 0  ⟹  h + k = 0.

Hence, with k = −h,

f(1/2 + h, 1/2 + k) = 1/4 − h² ≤ 1/4 = f(1/2, 1/2):  a MAX at (1/2, 1/2).
Problem with 2 Constraints
Find an extremum of f(x, y, z) subject to g_1(x, y, z) = 0 and g_2(x, y, z) = 0. Direct elimination could be quite complicated, but Lagrange multipliers give a simple extension. Form

V(x, y, z, λ_1, λ_2) = f + λ_1 g_1 + λ_2 g_2.

We make V stationary, so

V_x = 0,  V_y = 0,  V_z = 0,  V_{λ_1} = 0,  V_{λ_2} = 0.

16.3. Isoperimetric Problems: Constraint Problems in the Calculus of Variations

Find an extremum of

J(Y) = ∫_a^b F(x, Y, Y') dx

with boundary conditions Y(a) = y_a, Y(b) = y_b, subject to the constraint

K(Y) = ∫_a^b G(x, Y, Y') dx = k,

where k is a constant. Note that g(Y) := K(Y) − k = 0. If we take varied curves Y = y + εη, then J = J(ε) and K = K(ε) are functions of one variable, but we need at least two variables! So take the two-parameter family of varied curves

Y = y + ε_1η_1 + ε_2η_2

(where y is a solution curve). Using this formula for Y, we have that J(Y) becomes some function f(ε_1, ε_2). Now,

K(Y) − k = 0  ⟹  g(ε_1, ε_2) = 0,  so ε_1, ε_2 are not independent.

Now use the method of Lagrange multipliers to form

E(ε_1, ε_2, λ) = f + λg.

Make E stationary at ε_1 = 0, ε_2 = 0 (i.e. Y = y). Then

∂f/∂ε_1 + λ ∂g/∂ε_1 = 0,
∂f/∂ε_2 + λ ∂g/∂ε_2 = 0,  at ε_1 = ε_2 = 0.

Now

f(ε_1, ε_2) = J(y + ε_1η_1 + ε_2η_2) = J(y) + ⟨ε_1η_1 + ε_2η_2, J'(y)⟩ + O_2,

and we get

g(ε_1, ε_2) = K(y + ε_1η_1 + ε_2η_2) − k
            = K(y) − k + ⟨ε_1η_1 + ε_2η_2, K'(y)⟩ + O_2
            = ⟨ε_1η_1 + ε_2η_2, K'(y)⟩ + O_2.

Taking partial derivatives at ε_1 = ε_2 = 0, we obtain

∂f/∂ε_1 = ⟨η_1, J'(y)⟩,  ∂g/∂ε_1 = ⟨η_1, K'(y)⟩,
∂f/∂ε_2 = ⟨η_2, J'(y)⟩,  ∂g/∂ε_2 = ⟨η_2, K'(y)⟩.

Hence the stationary conditions imply that

⟨η_1, J'(y) + λK'(y)⟩ = 0,
⟨η_2, J'(y) + λK'(y)⟩ = 0.

Since η_1 and η_2 are arbitrary and independent functions on (a, b), y is a solution of

J'(y) + λK'(y) = 0,

i.e.

( ∂F/∂y − d/dx ∂F/∂y' ) + λ ( ∂G/∂y − d/dx ∂G/∂y' ) = 0.

Let L = F + λG; then we have

∂L/∂y − d/dx ∂L/∂y' = 0.

This is the first necessary condition.
Euler's Rule
Make J + λK stationary and satisfy K(Y) = k. Then check for

min:  J(y) ≤ J(Y);
max:  J(y) ≥ J(Y).

Here we assume K'(y) ≠ 0, i.e. y is not an extremal of K.
Example 16.3. We want to maximise

J(y) = ∫_{−a}^{a} y dx

subject to the conditions y(−a) = y(a) = 0 and

K(y) = ∫_{−a}^{a} √(1 + (y')²) dx = L.

Now use Euler's Rule: let F = y, G = √(1 + (y')²) and form

L = F + λG = y + λ√(1 + (y')²).

This gives us

I(y) = ∫_{−a}^{a} L dx.

Make I(y) stationary and hence solve

∂L/∂y − d/dx ∂L/∂y' = 0
1 − λ d/dx ( y'/√(1 + (y')²) ) = 0
d/dx ( λy'/√(1 + (y')²) ) = 1
λy'/√(1 + (y')²) = x + c
λ²(y')²/(1 + (y')²) = (x + c)²
y' = ±(x + c)/√( λ² − (x + c)² )
y = ∓( λ² − (x + c)² )^{1/2} + c_1
(y − c_1)² = λ² − (x + c)²,

which is (part of) a circle, centred at (−c, c_1), of radius r = |λ|. Alternatively, note that L = y + λ√(1 + (y')²) has ∂L/∂x = 0, and hence there is a first integral

L − y'L_{y'} = const.

The equation above can be written

(x + c)² + (y − c_1)² = λ².

This must satisfy the conditions y(−a) = 0 = y(a) and K(y) = L.
Example 16.4. Find the extremum of

J(y) = ∫_0^1 y² dx

subject to the constraints

K(y) = ∫_0^1 (y')² dx = const. = k²,  y(0) = 0 = y(1).

We use Euler's Rule and form

I(y) = J(y) + λK(y) = ∫_0^1 L dx,

where L = y² + λ(y')². Make I(y) stationary, i.e. solve

∂L/∂y − d/dx ∂L/∂y' = 0
2y − d/dx (2λy') = 0
y'' − (1/λ) y = 0   (λ ≠ 0).

There are three cases:

(1) 1/λ > 0, say 1/λ = μ² (μ ≠ 0); then y = A cosh μx + B sinh μx.
(2) 1/λ = 0; then y = ax + b.
(3) 1/λ < 0, say 1/λ = −μ² (μ ≠ 0); then y = C cos μx + D sin μx.

Now, from the boundary conditions we have

y(0) = 0  ⟹  A = 0, b = 0, C = 0;
y(1) = 0  ⟹  B = 0, a = 0, D sin μ = 0
⟹  sin μ = 0
⟹  μ = nπ,

with n = 1, 2, …. So y = D sin nπx with n = 1, 2, 3, …. Take y_n = sin nπx. To include all these functions we take

y = Σ_{n=1}^∞ c_n y_n,  y_n = sin nπx.
Now calculate J(y) and K(y) for this y function:

J(y) = ∫_0^1 y² dx
     = ∫_0^1 ( Σ_{n=1}^∞ c_n sin nπx )( Σ_{m=1}^∞ c_m sin mπx ) dx
     = Σ_n Σ_m c_n c_m ∫_0^1 sin(nπx) sin(mπx) dx   ( = (1/2)δ_{nm} )
     = Σ_{n=1}^∞ (1/2)c_n².

Also,

k² = K(y) = ∫_0^1 (y')² dx
   = ∫_0^1 ( Σ_{n=1}^∞ c_n nπ cos(nπx) )( Σ_{m=1}^∞ c_m mπ cos(mπx) ) dx
   = Σ_n Σ_m c_n c_m nmπ² ∫_0^1 cos(nπx) cos(mπx) dx   ( = (1/2)δ_{nm} )
   = Σ_{n=1}^∞ (1/2)c_n² n²π².

Now,

J(y) = Σ_{n=1}^∞ (1/2)c_n²,
K(y) = Σ_{n=1}^∞ (1/2)c_n² n²π² = k²
⟹  (1/2)c_1² = k²/π² − Σ_{n=2}^∞ (1/2)c_n² n².

Write

J(y) = (1/2)c_1² + Σ_{n=2}^∞ (1/2)c_n²
     = k²/π² − Σ_{n=2}^∞ (1/2)c_n² n² + Σ_{n=2}^∞ (1/2)c_n²
     = k²/π² − Σ_{n=2}^∞ (1/2)c_n² (n² − 1)
     ≤ k²/π²,

with equality when c_n = 0 (n = 2, 3, 4, …), leaving c_1² = 2k²/π². Then

y = c_1 sin πx,

with c_1 = √2 k/π.
Note. We had Wirtinger's Inequality (Theorem 15.1):

∫_a^b (y')² dx ≥ ( π²/(b − a)² ) ∫_a^b y² dx,

where y(a) = 0 and y(b) = 0. Relating this to our example above we have a = 0 and b = 1. Thus

∫_0^1 (y')² dx ≥ π² ∫_0^1 y² dx,
i.e.  k² ≥ π² J(y),

or J(y) ≤ k²/π², as found above.
17. Direct Methods

Let J(Y) have a minimum for Y = y, so that J(y) ≤ J(Y) for all Y. Our approach to finding J(y) so far has been via the Euler-Lagrange equation

J'(y) = 0.

Now suppose we are unable to solve this equation for y. The idea is to approach the problem of finding J(y) directly; in this case we probably have to make do with an approximate estimate of J(y). To illustrate, consider the case

J(Y) = ∫_0^1 { (1/2)(Y')² + (1/2)Y² − qY } dx   (the braces being the integrand F)

with Y(0) = 0 and Y(1) = 0. The Euler-Lagrange equation for y is

∂F/∂y − d/dx ∂F/∂y' = 0  ⟹  y − q − y'' = 0
⟹  −y'' + y = q,  0 < x < 1,

with y(0) = 0 = y(1). For Y = y + εη,

J(Y) = J(y) + (ε²/2) ∫_0^1 { (η')² + η² } dx ≥ J(y),

a MIN principle. Suppose we cannot solve the Euler-Lagrange equation; then we try to minimise J(Y) by any route. To take a very simple approach, we choose a trial function

Y_1 = αφ,  φ = x(1 − x),

such that Y_1(0) = 0 = Y_1(1). We then calculate min_α J(Y_1). Now

J(Y_1) = J(αφ)
       = (α²/2) ∫_0^1 { (φ')² + φ² } dx − α ∫_0^1 qφ dx
       = (1/2)Aα² − Bα,

with, say,
A = ∫_0^1 { (φ')² + φ² } dx,  B = ∫_0^1 qφ dx.

We have

dJ/dα = Aα − B = 0  ⟹  α_opt = B/A.

Then

J(α_opt) = (1/2)A(B/A)² − B(B/A) = −(1/2)B²/A.

Now for q = 1 we have A = 11/30, B = 1/6, α_opt = B/A = 5/11, and thus J(α_opt) = −5/132. We note for later that this has decimal expansion

J(α_opt) = −5/132 = −0.0378787… .
We can continue by introducing more than one parameter. For example, take

Y_2 = α_1φ_1 + α_2φ_2,

with, say, φ_1 = x(1 − x) and φ_2 = x²(1 − x). We find

min_{α_1, α_2} J(Y_2) ≤ min_{α_1} J(Y_1).

We now try to solve the Euler-Lagrange equation for q = 1, which is

−y'' + y = 1  (0 < x < 1),  y(0) = 0 = y(1).

The particular integral and complementary function are

y_1 = 1,  y_2 = ae^x + be^{−x}.

This gives our general solution of the differential equation:

y = y_1 + y_2 = 1 + ae^x + be^{−x}.

Now we use the boundary conditions to solve for a and b:

y(0) = 0  ⟹  1 + a + b = 0  ⟹  b = −1 − a,
y(1) = 0  ⟹  1 + ae + be^{−1} = 0
⟹  1 + ae − (1 + a)e^{−1} = 0
⟹  a = −1/(1 + e).

Then b = −1 − a = −e/(1 + e). Hence our solution becomes
y = 1 − e^x/(1 + e) − e·e^{−x}/(1 + e)
  = 1 − ( e^x + e^{1−x} )/(1 + e).

Now calculate J(y):

J(y) = ∫_0^1 { (1/2)(y')² + (1/2)y² − y } dx   (since q = 1)
     = ∫_0^1 { −(1/2)yy'' + (1/2)y² − y } dx + (1/2)[yy']_0^1   (= 0)
     = ∫_0^1 { (1/2)y(−y'' + y) − y } dx,

but −y'' + y = 1 by the Euler-Lagrange equation, and hence

J(y) = −(1/2) ∫_0^1 y dx.

So, putting in our solution for y, we obtain

J(y) = −(1/2) ∫_0^1 { 1 − (e^x + e^{1−x})/(1 + e) } dx
     = −1/2 + (1/(2(1 + e))) { (e − 1) + (e − 1) }
     = −1/2 + (e − 1)/(e + 1)
     = −0.0378828… ,

which is very close to our approximation. In fact the difference between the two values is about 0.000004.

18. Complementary Variational Principles (CVPs) / Dual Extremum Principles (DEPs)

Now seek upper and lower bounds for J(y), so that

lower bound ≤ J(y) ≤ upper bound.

To do this we turn to the canonical formalism. Take the particular case

(18.1)  J(Y) = ∫_a^b { (1/2)(Y')² + (1/2)wY² − qY } dx,  Y(a) = 0 = Y(b).
Hence we have

(18.2)  F = (1/2)(Y')² + (1/2)wY² − qY,
(18.3)  P = ∂F/∂Y' = Y',
(18.4)  H(x, P, Y) = PY' − F = (1/2)P² − (1/2)wY² + qY.

Now replace J(Y) by

(18.5.i)  I(P, Y) = ∫_a^b { P dY/dx − H(x, P, Y) } dx − [PY]_a^b
(18.5.ii)          = ∫_a^b { PY' − H } dx,   since Y(a) = 0 = Y(b).

Then we know from section 9 that I(P, Y) is stationary at (p, y), where

(18.6)  dy/dx = ∂H/∂p  (a < x < b), with y = 0 at x = a, b,
(18.7)  dp/dx = −∂H/∂y  (a < x < b).

Now we define dual integrals J(y_1) and G(p_2). Let

(18.8)  Δ_1 = { (p, y) : dy/dx = ∂H/∂p (a < x < b), y = 0 at x = a, b },
(18.9)  Δ_2 = { (p, y) : dp/dx = −∂H/∂y (a < x < b) }.

Then we define

(18.10)  J(y_1) = I(p_1, y_1) with (p_1, y_1) ∈ Δ_1,
(18.11)  G(p_2) = I(p_2, y_2) with (p_2, y_2) ∈ Δ_2.

An extremal pair (p, y) ∈ Δ_1 ∩ Δ_2, and

(18.12)  G(p) = I(p, y) = J(y).


Case

(18.13)  H(x, P, Y) = (1/2)P² − (1/2)wY² + qY,

where w(x) and q(x) are prescribed. Then

(18.14)  Δ_1 = { (p, y) : dy/dx = ∂H/∂p = p (a < x < b), y = 0 at x = a, b },
(18.15)  Δ_2 = { (p, y) : dp/dx = −∂H/∂y = wy − q, i.e. y = (p' + q)/w (a < x < b) }.

So J(y_1) = I(p_1, y_1) with p_1 = y_1' and y_1 = 0 at x = a, b. Now

J(y_1) = ∫_a^b { (y_1')² − (1/2)(y_1')² + (1/2)wy_1² − qy_1 } dx

(18.16)  = ∫_a^b { (1/2)(y_1')² + (1/2)wy_1² − qy_1 } dx,

the original Euler-Lagrange integral J. Next, G(p_2) = I(p_2, y_2) with y_2 = (p_2' + q)/w:

G(p_2) = I(p_2, y_2) = ∫_a^b { p_2y_2' − (1/2)p_2² + (1/2)wy_2² − qy_2 } dx

(this uses (18.5.ii)); integrating p_2y_2' by parts (the boundary term being handled by (18.5.i)) gives

 = −∫_a^b { (1/2)p_2² − (1/2)wy_2² + y_2(p_2' + q) } dx,   with p_2' + q = wy_2,
 = −∫_a^b { (1/2)p_2² + (1/2)wy_2² } dx

(18.17)  = −(1/2) ∫_a^b { p_2² + (1/w)(p_2' + q)² } dx.

If we write

(18.18)  y_1 = y + εη,  p_2 = p + εζ

in (18.16) and (18.17), expand about y and p, and use the stationary property

δJ(y, η) = 0,  δG(p, ζ) = 0,

we have

(18.19)  ΔJ = J(y_1) − J(y) = (ε²/2) ∫_a^b { (η')² + wη² } dx,
(18.20)  ΔG = G(p_2) − G(p) = −(ε²/2) ∫_a^b { ζ² + (1/w)(ζ')² } dx.

Hence, if

(18.21)  w(x) > 0,

we have ΔJ ≥ 0 and ΔG ≤ 0, i.e.

(18.22)  G(p_2) ≤ G(p) = I(p, y) = J(y) ≤ J(y_1),

the complementary (dual) principles.

Example 18.1. Calculation of G in the case w = 1, q = 1 on (0, 1). For this

G(p_2) = −(1/2) ∫_0^1 { p_2² + (p_2' + 1)² } dx.

Here p_2 is any admissible function. To make a good choice for p_2 we note that the exact p and y are linked through

p = dy/dx,  y(0) = 0 = y(1).

So we follow this and choose

p_2 = d/dx (βφ) = βφ',  φ = x(1 − x).

Thus we obtain

G(p_2) = G(β)
       = −(1/2) ∫_0^1 { β²(φ')² + (βφ'' + 1)² } dx
       = −(1/2)Aβ² − Bβ − C,

where we define

A = ∫_0^1 { (φ')² + (φ'')² } dx = 13/3,
B = ∫_0^1 φ'' dx = −2,
C = 1/2.

Now compute dG/dβ = −Aβ − B = 0 ⟹ β_opt = −B/A = 6/13. So

MAX G(β) = G(β_opt) = B²/(2A) − C = −1/26 = −0.038461… .

From the previous calculation of J(Y_1) we had in section 17 that

MIN J(Y_1) = −5/132 = −0.0378787… .

Hence we have bounded I(p, y):

−0.038461 ≤ I(p, y) ≤ −0.0378787.

In this case we know that I(p, y) = −0.0378828… .
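The complementary bound can be reproduced the same way. This illustrative sketch (not part of the original notes) computes G(β) for p_2 = βφ' with φ = x(1 − x) and confirms the two-sided bracket on I(p, y).

```python
import sympy as sp

x, beta = sp.symbols('x beta')
phi = x * (1 - x)
p2 = beta * sp.diff(phi, x)          # trial momentum p2 = beta*phi'

# G(beta) = -(1/2) * Integral( p2**2 + (p2' + 1)**2 ) on (0, 1)  [w = q = 1]
G = -sp.Rational(1, 2) * sp.integrate(p2**2 + (sp.diff(p2, x) + 1)**2, (x, 0, 1))
beta_opt = sp.solve(sp.diff(G, beta), beta)[0]     # 6/13
G_max = G.subs(beta, beta_opt)                     # -1/26

exact = sp.Rational(-1, 2) + (sp.E - 1) / (sp.E + 1)
print(G_max, sp.Float(G_max, 6), sp.Float(exact, 6), sp.Float(sp.Rational(-5, 132), 6))
# -1/26 <= I(p, y) <= -5/132 :  -0.0384615 <= -0.0378828 <= -0.0378788
```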

19. Eigenvalues

(19.1)  Ly = λy.

For example,

(19.2)  −d²y/dx² = λy,  0 < x < 1,
(19.3)  y(0) = 0,  y(1) = 0.

We have that (19.2) has general solution

(19.4)  y = A sin √λ x + B cos √λ x,
(19.5)  y(0) = 0 ⟹ B = 0;  y(1) = 0 ⟹ A sin √λ = 0.

Take sin √λ = 0; then we have

(19.6)  √λ = nπ, n = 1, 2, 3, …, i.e. λ = n²π², n = 1, 2, 3, ….

Then we have

y_n = sin nπx,  λ_n = n²π²,  n = 1, 2, 3, ….
In general, eigenvalue problems (19.1) cannot be solved exactly, so we need methods for estimating eigenvalues. One method is due to Rayleigh: it provides an upper bound for λ_1, the lowest eigenvalue of the problem. Consider

(19.7)  A(Y) = ∫_0^1 Y ( −d²/dx² − λ_1 ) Y dx

for all Y ∈ { C² functions such that Y(0) = 0 = Y(1) }. We write

Y = Σ_{n=1}^∞ a_n y_n,

where {y_n} = complete set of eigenfunctions of −d²/dx² = {sin nπx}. Then (19.7) becomes

A(Y) = ∫_0^1 ( Σ_{n=1}^∞ a_n y_n ) ( −d²/dx² − λ_1 ) ( Σ_{m=1}^∞ a_m y_m ) dx
     = Σ_n Σ_m a_n a_m (λ_m − λ_1) ∫_0^1 y_n y_m dx   ( = 0 for n ≠ m )
     = Σ_{n=1}^∞ a_n² (λ_n − λ_1) ∫_0^1 y_n² dx
     ≥ 0,

where λ_n = n²π², and hence 0 < λ_1 < λ_2 < λ_3 < ⋯. So we have A(Y) ≥ 0. Rearranging (19.7), we obtain

∫_0^1 Y ( −d²Y/dx² ) dx ≥ λ_1 ∫_0^1 Y² dx,

(19.8)  λ_1 ≤ [ ∫_0^1 Y ( −d²Y/dx² ) dx ] / [ ∫_0^1 Y² dx ] = Λ(Y),

which is the Rayleigh Bound. To illustrate, take Y = x(x − 1). Then

Λ(Y) = (1/3)/(1/30) = 10.

The exact λ_1 = π² ≈ 9.87.
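The Rayleigh bound is easy to evaluate for polynomial trial functions. This illustrative sketch (not part of the original notes) reproduces Λ(Y) = 10 for Y = x(x − 1) and compares it with π².

```python
import sympy as sp

x = sp.symbols('x')
Y = x * (x - 1)     # trial function with Y(0) = 0 = Y(1)

num = sp.integrate(Y * (-sp.diff(Y, x, 2)), (x, 0, 1))   # <Y, -Y''> = 1/3
den = sp.integrate(Y**2, (x, 0, 1))                      # <Y, Y>   = 1/30
print(num / den, sp.Float(sp.pi**2, 4))                  # 10 vs 9.870
```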
Example 19.1. From Example 2 of section 16 we have

(∗)  (u', u') ≥ λ_0 (u, u),

where u(0) = 0, u'(1) = 0 and

(u, u) = ∫_0^1 u² dx.

We know that λ_0 = π²/4. We can use (∗) to obtain an upper bound for λ_0:

λ_0 ≤ (u', u')/(u, u).

Take u = 2x − x², so that u(0) = 0 and u'(1) = 0. Then

(u', u') = 4/3,  (u, u) = 8/15,

so (∗) gives λ_0 ≤ (4/3)/(8/15) = 5/2 = 2.5. Note λ_0 = π²/4 ≈ 9.87/4 ≈ 2.47.

20. Appendix: Structure

Integration by parts has been used in the work on the Euler-Lagrange and canonical equations:

∂F/∂y − d/dx ∂F/∂y' = 0,
dy/dx = ∂H/∂p,  dp/dx = −∂H/∂y.

In the simplest case the pair of derivatives d/dx and −d/dx appears. We now attempt to generalise. We take an inner product space S of functions u(x) and suppose ⟨u, v⟩ denotes the inner product. A very useful case is

⟨v, u⟩ = ∫ u v w(x) dx

with w(x) > 0, where w is a weighting function. Properties:

⟨u, v⟩ = ⟨v, u⟩,
⟨u, u⟩ ≥ 0.

Write T for a linear differential operator, e.g. T = d/dx, and define its adjoint T* by

⟨u, Tv⟩ = ⟨T*u, v⟩,

where we suppose boundary terms vanish.

Example 20.1. T = d/dx and ⟨u, v⟩ = ∫ u v w(x) dx. Then

⟨u, Tv⟩ = ∫ u (dv/dx) w(x) dx
        = (−1) ∫ ( d/dx (w(x)u) ) v dx + [ · ]   (= 0)
        = ∫ ( −(1/w) d/dx (wu) ) v w(x) dx
        = ⟨T*u, v⟩.

So,

T*u = −(1/w) d/dx ( w(x)u ),  i.e.  T* = −(1/w) d/dx ( w · ).

From T and T* we can construct L = T*T, and hence

L = T*T = −(1/w) d/dx ( w d/dx ).

Now look at

⟨u, Lu⟩ = ∫ u (T*Tu) w(x) dx
        = ⟨u, T*Tu⟩
        = ⟨Tu, Tu⟩
        = ∫ (Tu)² w(x) dx
        ≥ 0.

So L is a positive operator (e.g. w = 1: T = d/dx, T* = −d/dx, L = −d²/dx²). Also

⟨u, Lv⟩ = ⟨u, T*Tv⟩ = ⟨Tu, Tv⟩ = ⟨T*Tu, v⟩ = ⟨Lu, v⟩.

But ⟨u, Lv⟩ = ⟨L*u, v⟩ by definition of the adjoint. So L* = L: L = T*T is a self-adjoint operator.
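The adjoint relation of Example 20.1 can be checked for a concrete weight. This illustrative sketch (not part of the original notes) takes w = x on (0, 1), with functions chosen so the boundary term [w u v] vanishes, and confirms ⟨u, Tv⟩ = ⟨T*u, v⟩.

```python
import sympy as sp

x = sp.symbols('x', positive=True)
w = x                # weight function
u = 1 - x            # test functions: w*u*v vanishes at x = 0 and x = 1
v = 1 - x

Tv = sp.diff(v, x)                              # T v  = dv/dx
Tstar_u = -sp.diff(w * u, x) / w                # T* u = -(1/w) d/dx (w u)

lhs = sp.integrate(u * Tv * w, (x, 0, 1))       # <u, Tv>  = -1/6
rhs = sp.integrate(Tstar_u * v * w, (x, 0, 1))  # <T*u, v> = -1/6
print(sp.simplify(lhs - rhs))                   # -> 0
```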

The canonical equations generalise to

Ty = ∂H/∂p,  T*p = ∂H/∂y,

with Euler-Lagrange equation

∂F/∂y + T* ( ∂F/∂(Ty) ) = 0.

Eigenvalue problems:

Lu = λu

(i.e. T*Tu = λu); λ is an eigenvalue. Suppose there is a complete set of eigenfunctions {u_n}, corresponding to eigenvalues {λ_n}. Now

⟨u, Lu⟩ = ⟨u, T*Tu⟩ = ⟨Tu, Tu⟩ ≥ 0.

Let u = u_n; then

⟨u_n, Lu_n⟩ = ⟨u_n, λ_n u_n⟩ = λ_n ⟨u_n, u_n⟩ ≥ 0,

which implies that λ_n ≥ 0. Suppose the λ_n are discrete, so that if we order them

0 < λ_1 < λ_2 < λ_3 < ⋯ .
Orthogonality of the u_n. Let

Lu_n = λ_n u_n and Lu_m = λ_m u_m.

Then, taking inner products, we have

⟨u_m, Lu_n⟩ = ⟨u_m, λ_n u_n⟩ = λ_n ⟨u_m, u_n⟩,
⟨u_n, Lu_m⟩ = ⟨u_n, λ_m u_m⟩ = λ_m ⟨u_n, u_m⟩ = λ_m ⟨u_m, u_n⟩.

But L is self-adjoint, so

⟨u_m, Lu_n⟩ = ⟨Lu_m, u_n⟩ = ⟨u_n, Lu_m⟩,

using ⟨u, v⟩ = ⟨v, u⟩. So

0 = (λ_n − λ_m) ⟨u_m, u_n⟩,

which implies ⟨u_m, u_n⟩ = 0 if λ_n ≠ λ_m: orthogonality. An example of this is

w = 1:  T = d/dx,  T* = −d/dx,  L = −d²/dx²,  u_n = sin(nπx),  λ_n = n²π².

Another example is

w = x:  T = d/dx,  T* = −(1/x) d/dx (x ·),  L = −(1/x) d/dx ( x d/dx ),  u_n = J_0(a_n x),  λ_n = a_n².

Orthogonality allows us to expand functions in terms of the eigenfunctions u_n. Thus

u(x) = Σ_{n=1}^∞ c_n u_n(x),  Lu_n = λ_n u_n.

Then

⟨u_k, u⟩ = Σ_{n=1}^∞ c_n ⟨u_k, u_n⟩ = c_k ⟨u_k, u_k⟩.

Consider A(u) = ⟨Tu, Tu⟩, where for example ⟨u, u⟩ = ∫ u² w(x) dx. We have

A(u) = ⟨u, T*Tu⟩ = ⟨u, Lu⟩.

Expand u in the eigenfunctions u_n of L. Then

A(u) = ⟨ Σ_{n=1}^∞ c_n u_n, L Σ_{m=1}^∞ c_m u_m ⟩
     = Σ_n Σ_m c_n c_m ⟨u_n, Lu_m⟩,

and using Lu_m = λ_m u_m,

     = Σ_n Σ_m c_n c_m λ_m ⟨u_n, u_m⟩
     = Σ_{n=1}^∞ c_n² λ_n ⟨u_n, u_n⟩
     ≥ λ_1 Σ_{n=1}^∞ c_n² ⟨u_n, u_n⟩
     = λ_1 ⟨u, u⟩,

i.e. A(u) ≥ λ_1 ⟨u, u⟩, and so

⟨u, Lu⟩ / ⟨u, u⟩ ≥ λ_1.
