
Physics 129a

Calculus of Variations
071113 Frank Porter
Revision 081120
1 Introduction
Many problems in physics have to do with extrema. When the problem involves finding a function that satisfies some extremum criterion, we may attack it with various methods under the rubric of calculus of variations. The basic approach is analogous to that of finding the extremum of a function in ordinary calculus.
2 The Brachistochrone Problem
Historically and pedagogically, the prototype problem introducing the calculus of variations is the brachistochrone, from the Greek for "shortest time." We suppose that a particle of mass m moves along some curve under the influence of gravity. We'll assume motion in two dimensions here, and that the particle moves, starting at rest, from fixed point a to fixed point b. We could imagine that the particle is a bead that moves along a rigid wire without friction [Fig. 1(a)]. The question is: what is the shape of the wire for which the time to get from a to b is minimized?
First, it seems that such a path must exist: the two outer paths in Fig. 1(b) presumably bracket the correct path, or at least can be made to bracket the path. For example, the upper path can be adjusted to take an arbitrarily long time by making the first part more and more horizontal. The lower path can also be adjusted to take an arbitrarily long time by making the dip deeper and deeper. The straight-line path from a to b must take a shorter time than both of these alternatives, though it may not be the shortest.
It is also readily observed that the optimal path must be single-valued in x; see Fig. 1(c). A path that wiggles back and forth in x can be shortened in time simply by dropping a vertical path through the wiggles. Thus, we can describe path C as a function y(x).
Figure 1: The Brachistochrone Problem: (a) Illustration of the problem; (b) Schematic to argue that a shortest-time path must exist; (c) Schematic to argue that we needn't worry about paths folding back on themselves.
We'll choose a coordinate system with the origin at point a and the y axis directed downward (Fig. 1). We choose the zero of potential energy so that it is given by

    V(y) = -mgy.

The kinetic energy is

    T(y) = -V(y) = \frac{1}{2} m v^2,

for zero total energy. Thus, the speed of the particle is

    v(y) = \sqrt{2gy}.
An element of distance traversed is:

    ds = \sqrt{(dx)^2 + (dy)^2} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\, dx.
Thus, the element of time to traverse ds is:

    dt = \frac{ds}{v} = \frac{\sqrt{1 + (dy/dx)^2}}{\sqrt{2gy}}\, dx,

and the total time of descent is:

    T = \int_0^{x_b} \frac{\sqrt{1 + (dy/dx)^2}}{\sqrt{2gy}}\, dx.
Different functions y(x) will typically yield different values for T; we call T a functional of y. Our problem is to find the minimum of this functional with respect to possible functions y. Note that y must be continuous; it would require an infinite speed to generate a discontinuity. Also, the acceleration must exist, and hence the second derivative d^2y/dx^2. We'll proceed to formulate this problem as an example of a more general class of problems in variational calculus.
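Since T is an ordinary number once a candidate path y(x) is specified, the functional is easy to evaluate numerically. A minimal sketch in Python (the value of g and the endpoint are illustrative assumptions, not from the text):

    import numpy as np
    from scipy.integrate import quad

    g = 9.8  # illustrative value, m/s^2

    def descent_time(y, yprime, xb):
        """T[y] = integral_0^xb sqrt((1 + y'^2)/(2 g y)) dx."""
        return quad(lambda x: np.sqrt((1.0 + yprime(x)**2) / (2.0 * g * y(x))),
                    0.0, xb)[0]

    # Straight line y = x to the assumed endpoint (1, 1): T = 2/sqrt(g) ~ 0.639 s.
    print(descent_time(lambda x: x, lambda x: 1.0, 1.0))

Different candidate paths can be compared this way; the variational problem is to single out the minimizing path analytically.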
Consider all functions, y(x), with fixed values at two endpoints: y(x_0) = y_0 and y(x_1) = y_1. We wish to find that y(x) which gives an extremum for the integral:

    I(y) = \int_{x_0}^{x_1} F(y, y', x)\, dx,

where F(y, y', x) is some given function of its arguments. We'll assume good behavior as needed.
In ordinary calculus, when we want to find the extrema of a function f(x, y, ...) we proceed as follows: Start with some candidate point (x_0, y_0, \ldots), and compute the total differential, df, with respect to arbitrary infinitesimal changes in the variables, (dx, dy, \ldots):

    df = \left(\frac{\partial f}{\partial x}\right)_{x_0, y_0, \ldots} dx + \left(\frac{\partial f}{\partial y}\right)_{x_0, y_0, \ldots} dy + \cdots

Now, df must vanish at an extremum, independent of which direction we choose with our infinitesimal (dx, dy, \ldots). If (x_0, y_0, \ldots) are the coordinates of an extremal point, then

    \left(\frac{\partial f}{\partial x}\right)_{x_0, y_0, \ldots} = \left(\frac{\partial f}{\partial y}\right)_{x_0, y_0, \ldots} = \cdots = 0.
Solving these equations thus gives the coordinates of an extremum point.
Finding the extremum of a functional in variational calculus follows the same basic approach. Instead of a point (x_0, y_0, \ldots), we consider a candidate function y(x) = Y(x). This candidate must satisfy our specified behavior at the endpoints:

    Y(x_0) = y_0, \quad Y(x_1) = y_1.   (1)

We consider a small change in this function by adding some multiple of another function, h(x):

    Y(x) \to Y(x) + \epsilon h(x).
Figure 2: Variation of the function Y by a function \epsilon h.
To maintain the endpoint condition, we must have h(x_0) = h(x_1) = 0. The notation \delta Y is often used for \epsilon h(x).

A change in functional form of Y(x) yields a change in the integral I. The integrand changes at each point x according to the changes in y and y':

    y(x) = Y(x) + \epsilon h(x),
    y'(x) = Y'(x) + \epsilon h'(x).   (2)
To first order in \epsilon, the new value of F is:

    F(Y + \epsilon h, Y' + \epsilon h', x) \approx F(Y, Y', x) + \left(\frac{\partial F}{\partial y}\right)_{y=Y,\, y'=Y'} \epsilon h(x) + \left(\frac{\partial F}{\partial y'}\right)_{y=Y,\, y'=Y'} \epsilon h'(x).   (3)
We'll use \delta I to denote the change in I due to this change in functional form:

    \delta I = \int_{x_0}^{x_1} F(Y + \epsilon h, Y' + \epsilon h', x)\, dx - \int_{x_0}^{x_1} F(Y, Y', x)\, dx
             \approx \int_{x_0}^{x_1} \epsilon \left[ \left(\frac{\partial F}{\partial y}\right)_{y=Y,\, y'=Y'} h + \left(\frac{\partial F}{\partial y'}\right)_{y=Y,\, y'=Y'} h' \right] dx.   (4)
We may apply integration by parts to the second term:

    \int_{x_0}^{x_1} \frac{\partial F}{\partial y'} h'\, dx = -\int_{x_0}^{x_1} h\, \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) dx,   (5)

where we have used h(x_0) = h(x_1) = 0. Thus,

    \delta I = \epsilon \int_{x_0}^{x_1} \left[ \frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) \right]_{y=Y,\, y'=Y'} h\, dx.   (6)
When I is at a minimum, \delta I must vanish, since, if \delta I > 0 for some \epsilon, then changing the sign of \epsilon gives \delta I < 0, corresponding to a smaller value of I. A similar argument applies for \delta I < 0; hence \delta I = 0 at a minimum. This must be true for arbitrary h and for \epsilon small but finite. It seems that a necessary condition for I to be extremal is:

    \left[ \frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) \right]_{y=Y,\, y'=Y'} = 0.   (7)
This follows from the fundamental theorem:

Theorem: If f(x) is continuous in [x_0, x_1] and

    \int_{x_0}^{x_1} f(x) h(x)\, dx = 0   (8)

for every continuously differentiable h(x) in [x_0, x_1], where h(x_0) = h(x_1) = 0, then f(x) = 0 for x \in [x_0, x_1].
Proof: Imagine that f(\xi) > 0 for some x_0 < \xi < x_1. Since f is continuous, there exists \delta > 0 such that f(x) > 0 for all x \in (\xi - \delta, \xi + \delta). Let

    h(x) = \begin{cases} (x - \xi + \delta)^2 (x - \xi - \delta)^2, & \xi - \delta \le x \le \xi + \delta \\ 0, & \text{otherwise.} \end{cases}   (9)

Note that h(x) is continuously differentiable in [x_0, x_1] and vanishes at x_0 and x_1. We have that

    \int_{x_0}^{x_1} f(x) h(x)\, dx = \int_{\xi-\delta}^{\xi+\delta} f(x) (x - \xi + \delta)^2 (x - \xi - \delta)^2\, dx   (10)
                                    > 0,   (11)

since f(x) is larger than zero everywhere in this interval. This contradicts the hypothesis that the integral vanishes, so f(x) cannot be larger than zero anywhere in the interval. The parallel argument follows for f(x) < 0.
This theorem then permits the assertion that

    \left[ \frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) \right]_{y=Y,\, y'=Y'} = 0   (12)

whenever y = Y is such that I is an extremum, at least if that expression is continuous. We call this expression the Lagrangian derivative of F(y, y', x) with respect to y(x), and denote it by \delta F / \delta y.
The extremum condition, relabeling Y \to y, is then:

    \frac{\delta F}{\delta y} \equiv \frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0.   (13)

This is called the Euler-Lagrange equation.
Note that \delta I = 0 is a necessary condition for I to be an extremum, but not sufficient. By definition, the Euler-Lagrange equation determines points for which I is stationary. Further consideration is required to establish whether I is an extremum or not.
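As a quick illustration of Eq. (13) (a standard example, not part of the brachistochrone problem), take F(y, y', x) = \sqrt{1 + y'^2}, whose integral is the arc length of the path. Then \partial F/\partial y = 0 and \partial F/\partial y' = y'/\sqrt{1 + y'^2}, so the Euler-Lagrange equation reduces to

    \frac{d}{dx}\left( \frac{y'}{\sqrt{1 + y'^2}} \right) = 0,

which says y' is constant: the shortest path between two points is a straight line.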
We may write the Euler-Lagrange equation in another form. Let

    F_a(y, y', x) \equiv \frac{\partial F}{\partial y'}.   (14)
Then

    \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = \frac{dF_a}{dx} = \frac{\partial F_a}{\partial x} + \frac{\partial F_a}{\partial y} y' + \frac{\partial F_a}{\partial y'} y''   (15)
        = \frac{\partial^2 F}{\partial x \partial y'} + \frac{\partial^2 F}{\partial y \partial y'} y' + \frac{\partial^2 F}{\partial y'^2} y''.   (16)
Hence the Euler-Lagrange equation may be written:

    \frac{\partial^2 F}{\partial y'^2} y'' + \frac{\partial^2 F}{\partial y \partial y'} y' + \frac{\partial^2 F}{\partial x \partial y'} - \frac{\partial F}{\partial y} = 0.   (17)
Let us now apply this to the brachistochrone problem, finding the extremum of:

    \sqrt{2g}\, T = \int_0^{x_b} \sqrt{\frac{1 + y'^2}{y}}\, dx.   (18)

That is:

    F(y, y', x) = \sqrt{\frac{1 + y'^2}{y}}.   (19)
Notice that, in this case, F has no explicit dependence on x, and we can take a short-cut. Starting with the Euler-Lagrange equation, if F has no explicit x-dependence we find:

    0 = \left( \frac{\partial F}{\partial y} - \frac{d}{dx} \frac{\partial F}{\partial y'} \right) y'   (20)
      = y' \frac{\partial F}{\partial y} - y' \frac{d}{dx} \frac{\partial F}{\partial y'}   (21)
      = \frac{dF}{dx} - y'' \frac{\partial F}{\partial y'} - y' \frac{d}{dx} \frac{\partial F}{\partial y'}   (22)
      = \frac{d}{dx}\left( F - y' \frac{\partial F}{\partial y'} \right).   (23)

Hence,

    F - y' \frac{\partial F}{\partial y'} = \text{constant} = C.   (24)
In this case,

    y' \frac{\partial F}{\partial y'} = \frac{(y')^2}{\sqrt{y\,(1 + y'^2)}}.   (25)

Thus,

    \sqrt{\frac{1 + y'^2}{y}} - \frac{(y')^2}{\sqrt{y\,(1 + y'^2)}} = C,   (26)

or

    y \left( 1 + y'^2 \right) = \frac{1}{C^2} \equiv A.   (27)
Solving for x, we find

    x = \int \sqrt{\frac{y}{A - y}}\, dy.   (28)

We may perform this integration with the trigonometric substitution y = \frac{A}{2}(1 - \cos\theta) = A \sin^2\frac{\theta}{2}. Then,
    x = \int \sqrt{\frac{A \sin^2(\theta/2)}{A - A \sin^2(\theta/2)}}\, A \sin\frac{\theta}{2} \cos\frac{\theta}{2}\, d\theta   (29)
      = A \int \sin^2\frac{\theta}{2}\, d\theta   (30)
      = \frac{A}{2}(\theta - \sin\theta) + B.   (31)
We determine the integration constant B by letting \theta = 0 at y = 0. We chose our coordinates so that x_a = y_a = 0, and thus B = 0. Constant A is determined by requiring that the curve pass through (x_b, y_b):

    x_b = \frac{A}{2}(\theta_b - \sin\theta_b),   (32)
    y_b = \frac{A}{2}(1 - \cos\theta_b).   (33)

This pair of equations determines A and \theta_b. The brachistochrone is given parametrically by:

    x = \frac{A}{2}(\theta - \sin\theta),   (34)
    y = \frac{A}{2}(1 - \cos\theta).   (35)
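In practice, Eqs. (32) and (33) must be solved numerically for \theta_b and A. A minimal sketch in Python, with an assumed illustrative endpoint (x_b, y_b):

    import numpy as np
    from scipy.optimize import brentq

    xb, yb = 1.0, 0.65   # assumed endpoint, for illustration only

    # Dividing Eq. (32) by Eq. (33) eliminates A:
    #   xb/yb = (theta - sin theta)/(1 - cos theta),
    # and the ratio on the right increases monotonically on (0, 2 pi).
    def mismatch(theta):
        return (theta - np.sin(theta)) / (1.0 - np.cos(theta)) - xb / yb

    theta_b = brentq(mismatch, 1e-6, 2.0 * np.pi - 1e-6)
    A = 2.0 * yb / (1.0 - np.cos(theta_b))   # from Eq. (33)
    print(theta_b, A)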
In classical mechanics, Hamilton's principle for conservative systems, that the action is stationary, gives the familiar Euler-Lagrange equations of classical mechanics. For a system with generalized coordinates q_1, q_2, \ldots, q_n, the action is

    S = \int_{t_0}^{t} L(\{q_i\}, \{\dot q_i\}, t')\, dt',   (36)

where L is the Lagrangian. Requiring S to be stationary yields:

    \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = 0, \quad i = 1, 2, \ldots, n.   (37)
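For example, for a one-dimensional harmonic oscillator with L = \frac{1}{2} m \dot q^2 - \frac{1}{2} k q^2, Eq. (37) gives \frac{d}{dt}(m \dot q) + k q = 0, that is, m \ddot q = -kq, the expected equation of motion.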
3 Relation to the Sturm-Liouville Problem
Suppose we have the Sturm-Liouville operator:

    L = \frac{d}{dx} p(x) \frac{d}{dx} - q(x),   (38)

with p(x) \ge 0, q(x) \ge 0, and x \in (0, U). We are interested in solving the inhomogeneous equation Lf = g, where g is a given function.

Consider the functional

    J = \int_0^U \left( p f'^2 + q f^2 + 2gf \right) dx.   (39)
The Euler-Lagrange equation for J to be an extremum is:

    \frac{\partial F}{\partial f} - \frac{d}{dx}\left(\frac{\partial F}{\partial f'}\right) = 0,   (40)

where F = p f'^2 + q f^2 + 2gf. We have

    \frac{\partial F}{\partial f} = 2qf + 2g,   (41)

    \frac{d}{dx}\left(\frac{\partial F}{\partial f'}\right) = 2p'f' + 2pf''.   (42)

Substituting into the Euler-Lagrange equation gives

    \frac{d}{dx}\left( p(x) \frac{d}{dx} f(x) \right) - q(x) f(x) = g(x).   (43)

This is the Sturm-Liouville equation! That is, the Sturm-Liouville differential equation is just the Euler-Lagrange equation for the functional J.
We have the following theorem:

Theorem: The solution to

    \frac{d}{dx}\left( p(x) \frac{d}{dx} f(x) \right) - q(x) f(x) = g(x),   (44)

where p(x) > 0, q(x) \ge 0, with boundary conditions f(0) = a and f(U) = b, exists and is unique.
Proof: First, suppose there exist two solutions, f_1 and f_2. Then d = f_1 - f_2 must satisfy the homogeneous equation:

    \frac{d}{dx}\left( p(x) \frac{d}{dx} d(x) \right) - q(x) d(x) = 0,   (45)

with homogeneous boundary conditions d(0) = d(U) = 0. Now multiply Equation 45 by d(x) and integrate:

    \int_0^U d(x) \frac{d}{dx}\left( p(x) \frac{d}{dx} d(x) \right) dx - \int_0^U q(x)\, d(x)^2\, dx = 0.

Integrating the first term by parts,

    \int_0^U d(x) \frac{d}{dx}\left( p(x) \frac{d}{dx} d(x) \right) dx = \left. d(x)\, p(x) \frac{d d(x)}{dx} \right|_0^U - \int_0^U \left( \frac{d d(x)}{dx} \right)^2 p(x)\, dx = -\int_0^U p\, d'^2\, dx.   (46)

Thus,

    \int_0^U \left( p\, d'^2(x) + q(x)\, d(x)^2 \right) dx = 0.   (47)

Since p\, d'^2 \ge 0 and q\, d^2 \ge 0, we must thus have p\, d'^2 = 0 and q\, d^2 = 0 in order for the integral to vanish. Since p > 0 and p\, d'^2 = 0 it must be true that d' = 0, that is, d is a constant. But d(0) = 0, therefore d(x) = 0. The solution, if it exists, is unique.
The issue for existence is the boundary conditions. We presume that a solution to the differential equation exists for some boundary conditions, and must show that a solution exists for the given boundary condition. From elementary calculus we know that two linearly independent solutions to the homogeneous differential equation exist. Let h_1(x) be a non-trivial solution to the homogeneous differential equation with h_1(0) = 0. This must be possible because we can take a suitable linear combination of our two solutions. Because the solution to the inhomogeneous equation is unique, it must be true that h_1(U) \ne 0 (otherwise adding h_1 to a solution would yield a second solution). Likewise, let h_2(x) be a solution to the homogeneous equation with h_2(U) = 0 (and therefore h_2(0) \ne 0). Suppose f_0(x) is a solution to the inhomogeneous equation satisfying some boundary condition. Form the function:

    f(x) = f_0(x) + k_1 h_1(x) + k_2 h_2(x).   (48)

We adjust constants k_1 and k_2 in order to satisfy the desired boundary conditions:

    a = f_0(0) + k_2 h_2(0),   (49)
    b = f_0(U) + k_1 h_1(U).   (50)

That is,

    k_1 = \frac{b - f_0(U)}{h_1(U)},   (51)
    k_2 = \frac{a - f_0(0)}{h_2(0)}.   (52)
We have demonstrated existence of a solution.
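The construction of Eqs. (48)-(52) can be checked concretely. A minimal sketch in Python for an assumed example, p = 1, q = 1, g = 1 on (0, 1), so the equation is f'' - f = 1 and all the ingredients are known in closed form:

    import numpy as np

    a, b = 0.2, 0.5                    # assumed boundary values
    f0 = lambda x: -1.0                # a particular inhomogeneous solution
    h1 = lambda x: np.sinh(x)          # homogeneous, h1(0) = 0, h1(1) != 0
    h2 = lambda x: np.sinh(1.0 - x)    # homogeneous, h2(1) = 0, h2(0) != 0

    k1 = (b - f0(1.0)) / h1(1.0)       # Eq. (51)
    k2 = (a - f0(0.0)) / h2(0.0)       # Eq. (52)
    f = lambda x: f0(x) + k1 * h1(x) + k2 * h2(x)
    print(f(0.0), f(1.0))              # reproduces a = 0.2 and b = 0.5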
This discussion leads us to the variational calculus theorem:

Theorem: For continuously differentiable functions in (0, U) satisfying f(0) = a and f(U) = b, the functional

    J = \int_0^U \left( p f'^2 + q f^2 + 2gf \right) dx,   (53)

with p(x) > 0 and q(x) \ge 0, attains its minimum if and only if f(x) is the solution of the corresponding Sturm-Liouville equation.
Proof: Let s(x) be the unique solution to the Sturm-Liouville equation satisfying the given boundary conditions. Let f(x) be any other continuously differentiable function satisfying the boundary conditions. Then d(x) \equiv f(x) - s(x) is continuously differentiable and d(0) = d(U) = 0. Solving for f, squaring, and doing the same for the derivative equation, yields

    f^2 = d^2 + s^2 + 2sd,   (54)
    f'^2 = d'^2 + s'^2 + 2s'd'.   (55)
Let

    \Delta J \equiv J(f) - J(s)   (56)
        = \int_0^U \left( p f'^2 + q f^2 + 2gf - p s'^2 - q s^2 - 2gs \right) dx   (57)
        = \int_0^U \left[ p \left( d'^2 + 2s'd' \right) + q \left( d^2 + 2ds \right) + 2gd \right] dx   (58)
        = 2 \int_0^U \left( p s' d' + q d s + g d \right) dx + \int_0^U \left( p d'^2 + q d^2 \right) dx.   (59)
But

    \int_0^U \left( p s' d' + q d s + g d \right) dx = \left. d\, p s' \right|_0^U + \int_0^U \left[ -d(x) \frac{d}{dx}(p s') + q d s + g d \right] dx
        = \int_0^U d(x) \left[ -\frac{d}{dx}(p s') + q s + g \right] dx, \quad \text{since } d(0) = d(U) = 0,
        = 0; \text{ the integrand is zero by the differential equation.}   (60)
Thus, we have that

    \Delta J = \int_0^U \left( p d'^2 + q d^2 \right) dx \ge 0.   (61)

In other words, f does no better than s; hence s corresponds to a minimum. Furthermore, if \Delta J = 0, then d = 0: p > 0 implies d' must be zero, and therefore d is constant; but we know d(0) = 0, hence d = 0. Thus, f = s at the minimum.
4 The Rayleigh-Ritz Method
Consider the Sturm-Liouville problem:

    \frac{d}{dx}\left( p(x) \frac{d}{dx} f(x) \right) - q(x) f(x) = g(x),   (62)

with p > 0, q \ge 0, and specified boundary conditions. For simplicity here, let's assume f(0) = f(U) = 0. Imagine expanding the solution in some set of complete functions, \{\phi_n(x)\} (not necessarily eigenfunctions):

    f(x) = \sum_{n=1}^{\infty} A_n \phi_n(x).
We have just shown that our problem is equivalent to minimizing

    J = \int_0^U \left( p f'^2 + q f^2 + 2gf \right) dx.   (63)

Substitute in our expansion, noting that

    p f'^2 = \sum_m \sum_n A_m A_n\, p(x)\, \phi_m'(x)\, \phi_n'(x).   (64)
Let

    C_{mn} \equiv \int_0^U p\, \phi_m' \phi_n'\, dx,   (65)
    B_{mn} \equiv \int_0^U q\, \phi_m \phi_n\, dx,   (66)
    G_n \equiv \int_0^U g\, \phi_n\, dx.   (67)
Assume that we can interchange the sum and integral, obtaining, for example,

    \int_0^U p f'^2\, dx = \sum_m \sum_n C_{mn} A_m A_n.   (68)

Then

    J = \sum_m \sum_n \left( C_{mn} + B_{mn} \right) A_m A_n + 2 \sum_n G_n A_n.   (69)
Let D_{mn} \equiv C_{mn} + B_{mn} = D_{nm}. The D_{mn} and G_n are known, at least in principle. We wish to solve for the expansion coefficients \{A_n\}. To accomplish this, use the condition that J is a minimum, that is,

    \frac{\partial J}{\partial A_n} = 0, \quad \forall n.   (70)

Thus,

    0 = \frac{\partial J}{\partial A_n} = 2 \left( \sum_{m=1}^{\infty} D_{nm} A_m + G_n \right), \quad n = 1, 2, \ldots   (71)

This is an infinite system of coupled inhomogeneous equations. If D_{nm} is diagonal, the solution is simple:

    A_n = -G_n / D_{nn}.   (72)
The reader is encouraged to demonstrate that this occurs if the \phi_n are the eigenfunctions of the Sturm-Liouville operator.

It may be too difficult to solve the eigenvalue problem. In this case, we can look for an approximate solution via the Rayleigh-Ritz approach: Choose some finite number of linearly independent functions \{\phi_1(x), \phi_2(x), \ldots, \phi_N(x)\}.
In order to find a function

    \tilde f(x) = \sum_{n=1}^N \tilde A_n \phi_n(x)   (73)

that closely approximates f(x), we find the values for \tilde A_n that minimize

    J(\tilde f) = \sum_{n,m=1}^N \tilde D_{nm} \tilde A_m \tilde A_n + 2 \sum_{n=1}^N \tilde G_n \tilde A_n,   (74)
where now

    \tilde D_{nm} \equiv \int_0^U \left( p\, \phi_n' \phi_m' + q\, \phi_n \phi_m \right) dx,   (75)
    \tilde G_n \equiv \int_0^U g\, \phi_n\, dx.   (76)

The minimum of J(\tilde f) is at:

    \sum_{m=1}^N \tilde D_{nm} \tilde A_m + \tilde G_n = 0, \quad n = 1, 2, \ldots, N.   (77)

In this method, it is important to make a good guess for the set of functions \{\phi_n\}.
It may be remarked that the Rayleigh-Ritz method is similar in spirit to, but different from, the variational method we typically introduce in quantum mechanics, for example when attempting to compute the ground state energy of the helium atom. In that case, we adjust parameters in a non-linear function, while in the Rayleigh-Ritz method we adjust the linear coefficients in an expansion.
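To make Eqs. (73)-(77) concrete, here is a minimal numerical sketch for an assumed problem: p = 1, q = 0, g = -1 on (0, 1) with f(0) = f(1) = 0, so that f'' = -1 and the exact solution f(x) = x(1 - x)/2 is available as a check. The sine basis is an illustrative choice:

    import numpy as np
    from scipy.integrate import quad

    N = 5
    phi  = [lambda x, n=n: np.sin(n * np.pi * x) for n in range(1, N + 1)]
    dphi = [lambda x, n=n: n * np.pi * np.cos(n * np.pi * x)
            for n in range(1, N + 1)]
    g = lambda x: -1.0

    # D_nm and G_n of Eqs. (75)-(76), with p = 1 and q = 0:
    D = np.array([[quad(lambda x: dphi[n](x) * dphi[m](x), 0, 1)[0]
                   for m in range(N)] for n in range(N)])
    G = np.array([quad(lambda x: g(x) * phi[n](x), 0, 1)[0] for n in range(N)])

    A = np.linalg.solve(D, -G)   # Eq. (77): sum_m D_nm A_m + G_n = 0
    x = 0.3
    print(sum(A[n] * phi[n](x) for n in range(N)), x * (1 - x) / 2)  # ~0.105 both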
5 Adding Constraints
As in ordinary extremum problems, constraints introduce correlations, now in the possible variations of the function at different points. As with the ordinary problem, we may employ the method of Lagrange multipliers to impose the constraints.
We consider the case of the isoperimetric problem, to find the stationary points of the functional:

    J = \int_a^b F(f, f', x)\, dx,   (78)

under variations \delta f vanishing at x = a, b, with the constraint that

    C \equiv \int_a^b G(f, f', x)\, dx   (79)

is constant under variations.
We have the following theorem:

Theorem: (Euler) The function f that solves this problem also makes the functional I = J + \lambda C stationary for some \lambda, as long as \delta C / \delta f \ne 0 (i.e., f does not satisfy the Euler-Lagrange equation for C).
Proof: (partial) We make stationary the integral:

    I = J + \lambda C = \int_a^b (F + \lambda G)\, dx.   (80)

That is, f must satisfy

    \frac{\partial F}{\partial f} - \frac{d}{dx}\frac{\partial F}{\partial f'} + \lambda \left( \frac{\partial G}{\partial f} - \frac{d}{dx}\frac{\partial G}{\partial f'} \right) = 0.   (81)

Multiply by the variation \delta f(x) and integrate:

    \int_a^b \left( \frac{\partial F}{\partial f} - \frac{d}{dx}\frac{\partial F}{\partial f'} \right) \delta f(x)\, dx + \lambda \int_a^b \left( \frac{\partial G}{\partial f} - \frac{d}{dx}\frac{\partial G}{\partial f'} \right) \delta f(x)\, dx = 0.   (82)

Here, \delta f(x) is arbitrary. However, only those variations that keep C invariant are allowed (e.g., take the partial derivative with respect to \epsilon and require it to be zero):

    \delta C = \int_a^b \left( \frac{\partial G}{\partial f} - \frac{d}{dx}\frac{\partial G}{\partial f'} \right) \delta f(x)\, dx = 0.   (83)
5.1 Example: Catenary
A heavy chain is suspended from endpoints at (x_1, y_1) and (x_2, y_2). What curve describes its equilibrium position, under a uniform gravitational field?
The solution must minimize the potential energy:

    V = g \int_1^2 y\, dm   (84)
      = g \rho \int_1^2 y\, ds   (85)
      = g \rho \int_{x_1}^{x_2} y \sqrt{1 + y'^2}\, dx,   (86)

where \rho is the linear density of the chain, and the distance element along the chain is ds = \sqrt{1 + y'^2}\, dx.

We wish to minimize V, under the constraint that the length of the chain is L, a constant. We have

    L = \int_1^2 ds = \int_{x_1}^{x_2} \sqrt{1 + y'^2}\, dx.   (87)
To solve, let (we multiply L by \lambda \rho g and divide the overall factor \rho g out of the problem)

    F(y, y', x) = y \sqrt{1 + y'^2} + \lambda \sqrt{1 + y'^2},   (88)

and solve the Euler-Lagrange equation for F. Notice that F does not depend explicitly on x, so we again use our short cut that

    F - y' \frac{\partial F}{\partial y'} = \text{constant} = C.   (89)
Thus,

    C = F - y' \frac{\partial F}{\partial y'}   (90)
      = (y + \lambda) \left( \sqrt{1 + y'^2} - \frac{y'^2}{\sqrt{1 + y'^2}} \right)   (91)
      = \frac{y + \lambda}{\sqrt{1 + y'^2}}.   (92)
Some manipulation yields

    \frac{dy}{\sqrt{(y + \lambda)^2 - C^2}} = \frac{dx}{C}.   (93)

With the substitution y + \lambda = C \cosh\theta, we obtain \theta = \frac{x + k}{C}, where k is an integration constant, and thus

    y + \lambda = C \cosh\left( \frac{x + k}{C} \right).   (94)
There are three unknown constants to determine in this expression: C, k, and \lambda. We have three equations to use for this:

    y_1 + \lambda = C \cosh\left( \frac{x_1 + k}{C} \right),   (95)
    y_2 + \lambda = C \cosh\left( \frac{x_2 + k}{C} \right), \text{ and}   (96)
    L = \int_{x_1}^{x_2} \sqrt{1 + y'^2}\, dx.   (97)
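These equations generally require numerical solution. Note that, from Eq. (94), y' = \sinh((x + k)/C), so the length integral (97) evaluates to C[\sinh((x_2 + k)/C) - \sinh((x_1 + k)/C)]. A minimal sketch in Python, with assumed endpoint and length values (illustrative only):

    import numpy as np
    from scipy.optimize import fsolve

    x1, y1, x2, y2, L = 0.0, 0.0, 2.0, 0.0, 3.0   # assumed configuration

    def equations(v):
        C, k, lam = v
        return [C * np.cosh((x1 + k) / C) - (y1 + lam),   # Eq. (95)
                C * np.cosh((x2 + k) / C) - (y2 + lam),   # Eq. (96)
                C * (np.sinh((x2 + k) / C)
                     - np.sinh((x1 + k) / C)) - L]        # Eq. (97)

    C, k, lam = fsolve(equations, [1.0, -1.0, 1.0])
    print(C, k, lam)   # by symmetry the vertex is at x = 1, so k = -1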
6 Eigenvalue Problems
We may treat the eigenvalue problem as a variational problem. As an example, consider again the Sturm-Liouville eigenvalue equation:

    \frac{d}{dx}\left( p(x) \frac{df(x)}{dx} \right) - q(x) f(x) = -\lambda w(x) f(x),   (98)

with boundary conditions f(0) = f(U) = 0. This is of the form

    Lf = -\lambda w f.   (99)
Earlier, we found the desired functional to make stationary was, for Lf = 0,

    I = \int_0^U \left( p f'^2 + q f^2 \right) dx.   (100)

We modify this to the eigenvalue problem with q \to q - \lambda w, obtaining

    I = \int_0^U \left( p f'^2 + q f^2 - \lambda w f^2 \right) dx,   (101)

which possesses the Euler-Lagrange equation giving the desired Sturm-Liouville equation. Note that \lambda is an unknown parameter; we want to determine it.
It is natural to regard the eigenvalue problem as a variational problem with constraints. Thus, we wish to vary f(x) so that

    J = \int_0^U \left( p f'^2 + q f^2 \right) dx   (102)

is stationary, with the constraint

    C = \int_0^U w f^2\, dx = \text{constant}.   (103)

Notice here that we may take C = 1, corresponding to eigenfunctions f normalized with respect to the weight w.
Let's attempt to find approximate solutions using the Rayleigh-Ritz method. Expand

    f(x) = \sum_{n=1}^{\infty} A_n u_n(x),   (104)

where u_n(0) = u_n(U) = 0. The u_n are some set of expansion functions, not the eigenfunctions; if they are the eigenfunctions, then the problem is already solved! Substitute this into I, giving

    I = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \left( C_{mn} - \lambda D_{mn} \right) A_m A_n,   (105)

where

    C_{mn} \equiv \int_0^U \left( p\, u_m' u_n' + q\, u_m u_n \right) dx,   (106)
    D_{mn} \equiv \int_0^U w\, u_m u_n\, dx.   (107)
Requiring I to be stationary,

    \frac{\partial I}{\partial A_m} = 0, \quad m = 1, 2, \ldots,   (108)

yields the infinite set of coupled homogeneous equations:

    \sum_{n=1}^{\infty} \left( C_{mn} - \lambda D_{mn} \right) A_n = 0, \quad m = 1, 2, \ldots   (109)
This is perhaps no simpler to solve than the original differential equation. However, we may make approximate solutions for f(x) by selecting a finite set of linearly independent functions \phi_1, \ldots, \phi_N and letting

    \tilde f(x) = \sum_{n=1}^N \tilde A_n \phi_n(x).   (110)
Solve for the best approximation of this form by finding those \{\tilde A_n\} that satisfy

    \sum_{n=1}^N \left( \tilde C_{mn} - \tilde\lambda \tilde D_{mn} \right) \tilde A_n = 0, \quad m = 1, 2, \ldots, N,   (111)

where

    \tilde C_{mn} \equiv \int_0^U \left( p\, \phi_m' \phi_n' + q\, \phi_m \phi_n \right) dx,   (112)
    \tilde D_{mn} \equiv \int_0^U w\, \phi_m \phi_n\, dx.   (113)
This looks like N equations in the N + 1 unknowns \tilde\lambda, \{\tilde A_n\}, but the overall normalization of the \tilde A_n's is arbitrary. Hence there are enough equations in principle, and we obtain

    \tilde\lambda = \frac{ \sum_{m,n=1}^N \tilde C_{mn} \tilde A_m \tilde A_n }{ \sum_{m,n=1}^N \tilde D_{mn} \tilde A_m \tilde A_n }.   (114)
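The finite problem of Eq. (111) is a standard generalized matrix eigenvalue problem, and a numerical library returns all N estimates at once. A minimal sketch for an assumed example, f'' = -\lambda f on (0, 1) with f(0) = f(1) = 0 (p = w = 1, q = 0, exact \lambda_n = n^2\pi^2), using an illustrative polynomial basis:

    import numpy as np
    from scipy.integrate import quad
    from scipy.linalg import eigh

    phi  = [lambda x: x * (1 - x), lambda x: x**2 * (1 - x)]
    dphi = [lambda x: 1 - 2 * x,   lambda x: 2 * x - 3 * x**2]

    C = np.array([[quad(lambda x: dphi[m](x) * dphi[n](x), 0, 1)[0]
                   for n in range(2)] for m in range(2)])   # Eq. (112)
    D = np.array([[quad(lambda x: phi[m](x) * phi[n](x), 0, 1)[0]
                   for n in range(2)] for m in range(2)])   # Eq. (113)

    lam, A = eigh(C, D)   # solves (C - lam D) A = 0
    print(lam)            # [10., 42.], vs exact [9.87, 39.48]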
Notice the similarity of Eqn. 114 with

    \lambda = \frac{ \int_0^U \left( p f'^2 + q f^2 \right) dx }{ \int_0^U w f^2\, dx } = \frac{J(f)}{C(f)}.   (115)
This follows since I = 0 for f a solution to the Sturm-Liouville equation:

    I = \int_0^U \left( p f'^2 + q f^2 - \lambda w f^2 \right) dx
      = \left. p f f' \right|_0^U + \int_0^U \left( -f \frac{d}{dx}(p f') + q f^2 - \lambda w f^2 \right) dx
      = 0 + \int_0^U \left( -q f^2 + \lambda w f^2 + q f^2 - \lambda w f^2 \right) dx
      = 0,   (116)

where we have used both the boundary condition f(0) = f(U) = 0 and the Sturm-Liouville equation \frac{d}{dx}(p f') = q f - \lambda w f to obtain the third line. Also,

    \tilde\lambda = \frac{J(\tilde f)}{C(\tilde f)},   (117)
since, for example,

    J(\tilde f) = \int_0^U \left( p \tilde f'^2 + q \tilde f^2 \right) dx
                = \int_0^U \left( p \sum_{m,n} \tilde A_n \tilde A_m \phi_n' \phi_m' + q \sum_{m,n} \tilde A_n \tilde A_m \phi_n \phi_m \right) dx
                = \sum_{m,n} \tilde C_{mn} \tilde A_n \tilde A_m.   (118)
That is, if \tilde f is close to an eigenfunction f, then \tilde\lambda should be close to an eigenvalue \lambda.
Let's try an example: Find the lowest eigenvalue of f'' = -\lambda f, with boundary conditions f(\pm 1) = 0. We of course readily see that the first eigenfunction is \cos(\pi x/2) with \lambda_1 = \pi^2/4, but let's try our method to see how we do. For simplicity, we'll try a Rayleigh-Ritz approximation with only one term in the sum.

As we noted earlier, it is a good idea to pick the functions with some care. In this case, we know that the lowest eigenfunction won't wiggle much, and a good guess is that it will be symmetric with no zeros in the interval (-1, 1). Such a function, which satisfies the boundary conditions, is:

    \tilde f(x) = \tilde A \left( 1 - x^2 \right),   (119)
and we'll try it. With N = 1, we have \phi_1 = \phi = 1 - x^2, and

    \tilde C \equiv \tilde C_{11} = \int_{-1}^{1} \left( p \phi'^2 + q \phi^2 \right) dx.   (120)

In the Sturm-Liouville form, we have p(x) = 1, q(x) = 0, and w(x) = 1. Thus,

    \tilde C = \int_{-1}^{1} 4x^2\, dx = \frac{8}{3}.   (121)

Also,

    \tilde D \equiv \tilde D_{11} = \int_{-1}^{1} w \phi^2\, dx = \int_{-1}^{1} \left( 1 - x^2 \right)^2 dx = \frac{16}{15}.   (122)
The equation

    \sum_{n=1}^N \left( \tilde C_{mn} - \tilde\lambda \tilde D_{mn} \right) \tilde A_n = 0, \quad m = 1, 2, \ldots, N,   (123)

becomes

    \left( \tilde C - \tilde\lambda \tilde D \right) \tilde A = 0.   (124)

If \tilde A \ne 0, then

    \tilde\lambda = \frac{\tilde C}{\tilde D} = \frac{5}{2}.   (125)
We are within 2% of the actual lowest eigenvalue, \lambda_1 = \pi^2/4 \approx 2.467. Of course this rather good result is partly due to our good fortune at picking a close approximation to the actual eigenfunction, as may be seen in Fig. 3.
Figure 3: Rayleigh-Ritz eigenvalue estimation example, comparing the exact solution \cos(\pi x/2) with the guessed approximation 1 - x^2.
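The arithmetic of this example is easily checked numerically; a minimal sketch:

    import numpy as np
    from scipy.integrate import quad

    # One-term Rayleigh-Ritz for f'' = -lam f on (-1, 1), phi = 1 - x^2,
    # with p = 1, q = 0, w = 1 as in the text.
    phi  = lambda x: 1.0 - x**2
    dphi = lambda x: -2.0 * x

    C = quad(lambda x: dphi(x)**2, -1, 1)[0]   # = 8/3,   Eq. (121)
    D = quad(lambda x: phi(x)**2, -1, 1)[0]    # = 16/15, Eq. (122)
    print(C / D, np.pi**2 / 4)                 # 2.5 vs 2.4674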
7 Extending to Multiple Dimensions
It is possible to generalize our variational problem to multiple independent variables, e.g.,

    I(u) = \iint_D F\left( u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y \right) dx\, dy,   (126)

where u = u(x, y), and the bounded region D has u(x, y) specified on its boundary S. We wish to find u such that I is stationary with respect to variation of u.
We proceed along the same lines as before, letting

    u(x, y) \to u(x, y) + \epsilon h(x, y),   (127)

where h(x, y)|_S = 0. Look for stationary I: \left. \frac{dI}{d\epsilon} \right|_{\epsilon=0} = 0. Let

    u_x \equiv \frac{\partial u}{\partial x}, \quad u_y \equiv \frac{\partial u}{\partial y}, \quad h_x \equiv \frac{\partial h}{\partial x}, \quad \text{etc.}   (128)
Then

    \frac{dI}{d\epsilon} = \iint_D \left( \frac{\partial F}{\partial u} h + \frac{\partial F}{\partial u_x} h_x + \frac{\partial F}{\partial u_y} h_y \right) dx\, dy.   (129)
We want to integrate the last two terms by parts, in analogy with the single-variable case. Recall Green's theorem:

    \oint_S (P\, dx + Q\, dy) = \iint_D \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\, dy,   (130)

and let

    P = -h \frac{\partial F}{\partial u_y}, \quad Q = h \frac{\partial F}{\partial u_x}.   (131)

With some algebra, we find that

    \frac{dI}{d\epsilon} = \oint_S h \left( \frac{\partial F}{\partial u_x}\, dy - \frac{\partial F}{\partial u_y}\, dx \right) + \iint_D h \left[ \frac{\partial F}{\partial u} - \frac{D}{Dx}\left( \frac{\partial F}{\partial u_x} \right) - \frac{D}{Dy}\left( \frac{\partial F}{\partial u_y} \right) \right] dx\, dy,   (132)
where

    \frac{Df}{Dx} \equiv \frac{\partial f}{\partial x} + \frac{\partial f}{\partial u} \frac{\partial u}{\partial x} + \frac{\partial f}{\partial u_x} \frac{\partial^2 u}{\partial x^2} + \frac{\partial f}{\partial u_y} \frac{\partial^2 u}{\partial x \partial y}   (133)

is the total partial derivative with respect to x.
The boundary integral over S is zero, since h(x \in S) = 0. The remaining double integral over D must be zero for arbitrary functions h, and hence,

    \frac{\partial F}{\partial u} - \frac{D}{Dx}\left( \frac{\partial F}{\partial u_x} \right) - \frac{D}{Dy}\left( \frac{\partial F}{\partial u_y} \right) = 0.   (134)
This result is once again called the Euler-Lagrange equation.
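For example (a standard case, not treated above), taking F = \frac{1}{2}(u_x^2 + u_y^2) gives \partial F/\partial u = 0, \partial F/\partial u_x = u_x, and \partial F/\partial u_y = u_y, so Eq. (134) reduces to

    \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0,

Laplace's equation: the functions that make the Dirichlet integral stationary are the harmonic functions.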
8 Exercises
1. Suppose you have a string of length L. Pin one end at (x, y) = (0, 0) and the other end at (x, y) = (b, 0). Form the string into a curve such that the area between the string and the x axis is maximal. Assume that b and L are fixed, with L > b. What is the curve formed by the string?
2. We considered the application of the Rayleigh-Ritz method to finding approximate eigenvalues satisfying

       y'' = -\lambda y,   (135)

   with boundary conditions y(-1) = y(1) = 0. Repeat the method, now with two functions:

       \phi_1(x) = 1 - x^2,   (136)
       \phi_2(x) = x^2 \left( 1 - x^2 \right).   (137)

   You should get estimates for two eigenvalues. Compare with the exact eigenvalues, including a discussion of which eigenvalues you have managed to approximate and why. If the eigenvalues you obtain are not the two lowest, suggest another function you might have used to get the lowest two.
3. The Bessel differential equation is

       \frac{d^2 y}{dx^2} + \frac{1}{x} \frac{dy}{dx} + \left( k^2 - \frac{m^2}{x^2} \right) y = 0.   (138)

   A solution is y(x) = J_m(kx), the mth order Bessel function. Assume a boundary condition y(1) = 0. That is, k is a root of J_m(x). Use the Rayleigh-Ritz method to estimate the first non-zero root of J_3(x). I suggest you try to do this with one test function, rather than a sum of multiple functions. But you must choose the function with some care. In particular, note that J_3 has a third-order root at x = 0. You should compare your result with the actual value of 6.379. If you get within, say, 15% of this, declare victory.