
Physics 129a

Calculus of Variations
071113 Frank Porter
Revision 171116

1 Introduction
Many problems in physics have to do with extrema. When the problem
involves finding a function that satisfies some extremum criterion, we may
attack it with various methods under the rubric of “calculus of variations”.
The basic approach is analogous to that of finding the extremum of a
function in ordinary calculus.

2 The Brachistochrone Problem


Historically and pedagogically, the prototype problem introducing the cal-
culus of variations is the “brachistochrone”, from the Greek for “shortest
time”. We suppose that a particle of mass m moves along some curve under
the influence of gravity. We’ll assume motion in two dimensions here, and
that the particle moves, starting at rest, from fixed point a to fixed point b.
We could imagine that the particle is a bead that moves along a rigid wire
without friction [Fig. 1(a)]. The question is: what is the shape of the wire
for which the time to get from a to b is minimized?

First, it seems that such a path must exist – the two outer paths in
Fig. 1(b) presumably bracket the correct path, or at least can be made to
bracket it. For example, the upper path can be adjusted to take an
arbitrarily long time by making the first part more and more horizontal. The
lower path can also be adjusted to take an arbitrarily long time by making
the dip deeper and deeper. The straight-line path from a to b must take
a shorter time than both of these adjusted alternatives, though it may not be the
shortest.
It is also readily observed that the optimal path must be single-valued in
x; see Fig. 1(c). A path that wiggles back and forth in x can be made faster
simply by dropping vertically through the wiggles. Thus, we can
describe the path C as a function y(x).

Figure 1: The Brachistochrone Problem: (a) Illustration of the problem; (b)
Schematic to argue that a shortest-time path must exist; (c) Schematic to
argue that we needn’t worry about paths folding back on themselves.

We'll choose a coordinate system with the origin at point a and the y axis
directed downward (Fig. 1). We choose the zero of potential energy so that
it is given by
$$V(y) = -mgy.$$
The kinetic energy is
$$T(y) = -V(y) = \tfrac{1}{2}mv^2,$$
for zero total energy. Thus, the speed of the particle is
$$v(y) = \sqrt{2gy}.$$

An element of distance traversed is
$$ds = \sqrt{(dx)^2 + (dy)^2} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx.$$
Thus, the element of time to traverse ds is
$$dt = \frac{ds}{v} = \frac{\sqrt{1 + \left(\frac{dy}{dx}\right)^2}}{\sqrt{2gy}}\,dx,$$
and the total time of descent is
$$T = \int_0^{x_b} \frac{\sqrt{1 + \left(\frac{dy}{dx}\right)^2}}{\sqrt{2gy}}\,dx.$$

Different functions y(x) will typically yield different values for T ; we call
T a “functional” of y. Our problem is to find the minimum of this functional
with respect to possible functions y. Note that y must be continuous – it
would require an infinite speed to generate a discontinuity. Also, the
acceleration must exist, and hence so must the second derivative d²y/dx². We'll proceed
to formulate this problem as an example of a more general class of problems
in “variational calculus”.
Consider all functions, y(x), with fixed values at two endpoints: y(x_0) = y_0
and y(x_1) = y_1. We wish to find that y(x) which gives an extremum for
the integral
$$I(y) = \int_{x_0}^{x_1} F(y, y', x)\,dx,$$
where F(y, y', x) is some given function of its arguments. We'll assume "good
behavior" as needed.
In ordinary calculus, when we want to find the extrema of a function
f(x, y, ...), we proceed as follows: start with some candidate point (x_0, y_0, ...),
and compute the total differential, df, with respect to arbitrary infinitesimal
changes in the variables, (dx, dy, ...):
$$df = \left(\frac{\partial f}{\partial x}\right)_{x_0,y_0,\ldots} dx + \left(\frac{\partial f}{\partial y}\right)_{x_0,y_0,\ldots} dy + \ldots$$
Now, df must vanish at an extremum, independent of which direction we
choose with our infinitesimal (dx, dy, ...). If (x_0, y_0, ...) are the coordinates
of an extremal point, then
$$\left(\frac{\partial f}{\partial x}\right)_{x_0,y_0,\ldots} = \left(\frac{\partial f}{\partial y}\right)_{x_0,y_0,\ldots} = \ldots = 0.$$
Solving these equations thus gives the coordinates of an extremum point.


Finding the extremum of a functional in variational calculus follows the
same basic approach. Instead of a point (x_0, y_0, ...), we consider a candidate
function y(x) = Y(x). This candidate must satisfy our specified behavior at
the endpoints:
$$Y(x_0) = y_0, \qquad Y(x_1) = y_1. \qquad (1)$$
We consider a small change in this function by adding some multiple of
another function, h(x):
$$Y(x) \to Y(x) + \epsilon h(x).$$
Figure 2: Variation on function Y by function h.

To maintain the endpoint condition, we must have h(x_0) = h(x_1) = 0. The
notation δY is often used for εh(x).
A change in functional form of Y(x) yields a change in the integral I.
The integrand changes at each point x according to the changes in y and y':
$$y(x) = Y(x) + \epsilon h(x),$$
$$y'(x) = Y'(x) + \epsilon h'(x). \qquad (2)$$
To first order in ε, the new value of F is
$$F(Y + \epsilon h, Y' + \epsilon h', x) \approx F(Y, Y', x) + \epsilon\,\frac{\partial F}{\partial y}\bigg|_{y=Y,\,y'=Y'} h(x) + \epsilon\,\frac{\partial F}{\partial y'}\bigg|_{y=Y,\,y'=Y'} h'(x). \qquad (3)$$

We'll use "δI" to denote the change in I due to this change in functional
form:
$$\delta I = \int_{x_0}^{x_1} F(Y + \epsilon h, Y' + \epsilon h', x)\,dx - \int_{x_0}^{x_1} F(Y, Y', x)\,dx \approx \epsilon \int_{x_0}^{x_1} \left[\frac{\partial F}{\partial y} h + \frac{\partial F}{\partial y'} h'\right]_{y=Y,\,y'=Y'} dx. \qquad (4)$$
We may apply integration by parts to the second term:
$$\int_{x_0}^{x_1} \frac{\partial F}{\partial y'} h'\,dx = -\int_{x_0}^{x_1} \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) h\,dx, \qquad (5)$$

where we have used h(x_0) = h(x_1) = 0. Thus,
$$\delta I = \epsilon \int_{x_0}^{x_1} \left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right]_{y=Y,\,y'=Y'} h\,dx. \qquad (6)$$
When I is at a minimum, δI must vanish, since, if δI > 0 for some ε,
then changing the sign of ε gives δI < 0, corresponding to a smaller value of
I. A similar argument applies for δI < 0; hence δI = 0 at a minimum. This
must be true for arbitrary h and ε small but finite. It seems that a necessary
condition for I to be extremal is
$$\left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right]_{y=Y,\,y'=Y'} = 0. \qquad (7)$$

This follows from the fundamental theorem:

Theorem: If f(x) is continuous in [x_0, x_1] and
$$\int_{x_0}^{x_1} f(x)h(x)\,dx = 0 \qquad (8)$$
for every continuously differentiable h(x) in [x_0, x_1], where h(x_0) =
h(x_1) = 0, then f(x) = 0 for x ∈ [x_0, x_1].

Proof: Imagine that f(χ) > 0 for some x_0 < χ < x_1. Since f is continuous,
there exists ε > 0 such that f(x) > 0 for all x ∈ (χ − ε, χ + ε). Let
$$h(x) = \begin{cases} (x - \chi + \epsilon)^2 (x - \chi - \epsilon)^2, & \chi - \epsilon \le x \le \chi + \epsilon \\ 0, & \text{otherwise.} \end{cases} \qquad (9)$$
Note that h(x) is continuously differentiable in [x_0, x_1] and vanishes at x_0
and x_1. We have that
$$\int_{x_0}^{x_1} f(x)h(x)\,dx = \int_{\chi-\epsilon}^{\chi+\epsilon} f(x)(x - \chi + \epsilon)^2 (x - \chi - \epsilon)^2\,dx > 0, \qquad (10, 11)$$
since f(x) is larger than zero everywhere in this interval. This contradicts the
hypothesis that the integral vanishes for every admissible h; thus, f(x) cannot
be larger than zero anywhere in the interval. The parallel argument follows
for f(x) < 0.
This theorem then permits the assertion that
$$\left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right]_{y=Y,\,y'=Y'} = 0 \qquad (12)$$
whenever y = Y such that I is an extremum, at least if the expression in
brackets is continuous. We call this expression the "Lagrangian
derivative" of F(y, y', x) with respect to y(x), and denote it by δF/δy.
The extremum condition, relabeling Y → y, is then:
$$\frac{\delta F}{\delta y} \equiv \frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0. \qquad (13)$$
This is called the Euler-Lagrange equation.


Note that δI = 0 is a necessary condition for I to be an extremum, but
not sufficient. By definition, the Euler-Lagrange equation determines points
for which I is “stationary”. Further consideration is required to establish
whether I is an extremum or not.
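As a quick illustration, consider the shortest path between two fixed points in a plane, for which F = √(1 + y'²). A short sympy sketch (assuming sympy is available; this example is added here for illustration) recovers the expected result that the Euler-Lagrange equation reduces to y'' = 0, i.e., a straight line:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
y = sp.Function('y')

# Arc-length integrand: F = sqrt(1 + y'^2)
F = sp.sqrt(1 + y(x).diff(x)**2)

# euler_equations returns the Euler-Lagrange equation(s) for F
eq = euler_equations(F, [y(x)], x)[0]
print(sp.simplify(eq))  # equivalent to y''(x) = 0: the straight line
```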
We may write the Euler-Lagrange equation in another form. Let
$$F_a(y, y', x) \equiv \frac{\partial F}{\partial y'}. \qquad (14)$$
Then
$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = \frac{dF_a}{dx} = \frac{\partial F_a}{\partial x} + \frac{\partial F_a}{\partial y}\,y' + \frac{\partial F_a}{\partial y'}\,y'' = \frac{\partial^2 F}{\partial x\,\partial y'} + \frac{\partial^2 F}{\partial y\,\partial y'}\,y' + \frac{\partial^2 F}{\partial y'^2}\,y''. \qquad (15, 16)$$
Hence the Euler-Lagrange equation may be written
$$\frac{\partial^2 F}{\partial y'^2}\,y'' + \frac{\partial^2 F}{\partial y\,\partial y'}\,y' + \frac{\partial^2 F}{\partial x\,\partial y'} - \frac{\partial F}{\partial y} = 0. \qquad (17)$$
Let us now apply this to the brachistochrone problem, finding the extremum of
$$\sqrt{2g}\,T = \int_0^{x_b} \sqrt{\frac{1 + y'^2}{y}}\,dx. \qquad (18)$$
That is,
$$F(y, y', x) = \sqrt{\frac{1 + y'^2}{y}}. \qquad (19)$$
Notice that, in this case, F has no explicit dependence on x, and we can
take a short-cut: multiplying the Euler-Lagrange equation by y', we find
$$0 = \left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right] y' \qquad (20)$$
$$= \frac{\partial F}{\partial y}\,y' - y'\,\frac{d}{dx}\frac{\partial F}{\partial y'} \qquad (21)$$
$$= \frac{dF}{dx} - \frac{\partial F}{\partial y'}\,y'' - y'\,\frac{d}{dx}\frac{\partial F}{\partial y'} \qquad (22)$$
$$= \frac{d}{dx}\left(F - y'\,\frac{\partial F}{\partial y'}\right). \qquad (23)$$
Hence,
$$F - y'\,\frac{\partial F}{\partial y'} = \text{constant} = C. \qquad (24)$$
In this case,
$$y'\,\frac{\partial F}{\partial y'} = \frac{y'^2}{\sqrt{y(1 + y'^2)}}. \qquad (25)$$
Thus,
$$\sqrt{\frac{1 + y'^2}{y}} - \frac{y'^2}{\sqrt{y(1 + y'^2)}} = C, \qquad (26)$$
or
$$y\left(1 + y'^2\right) = \frac{1}{C^2} \equiv A. \qquad (27)$$
Solving for x, we find
$$x = \int \sqrt{\frac{y}{A - y}}\,dy. \qquad (28)$$
We may perform this integration with the trigonometric substitution
y = (A/2)(1 − cos θ) = A sin²(θ/2). Then,
$$x = \int \sqrt{\frac{\sin^2(\theta/2)}{1 - \sin^2(\theta/2)}}\;A\sin\frac{\theta}{2}\cos\frac{\theta}{2}\,d\theta \qquad (29)$$
$$= \int A \sin^2\frac{\theta}{2}\,d\theta \qquad (30)$$
$$= \frac{A}{2}(\theta - \sin\theta) + B. \qquad (31)$$
We determine the integration constant B by letting θ = 0 at y = 0. We
chose our coordinates so that x_a = y_a = 0, and thus B = 0. Constant A is
determined by requiring that the curve pass through (x_b, y_b):
$$x_b = \frac{A}{2}(\theta_b - \sin\theta_b), \qquad (32)$$
$$y_b = \frac{A}{2}(1 - \cos\theta_b). \qquad (33)$$

This pair of equations determines A and θ_b. The brachistochrone is given
parametrically by:
$$x = \frac{A}{2}(\theta - \sin\theta), \qquad (34)$$
$$y = \frac{A}{2}(1 - \cos\theta). \qquad (35)$$
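As a numerical aside, equations (32) and (33) give (1 − cos θ_b)/(θ_b − sin θ_b) = y_b/x_b, which can be solved by standard root finding. A sketch using scipy, with an illustrative endpoint (x_b, y_b) = (1, 0.5) (the endpoint, the descent-time comparison, and the value of g are assumptions of this example, not part of the derivation above):

```python
import numpy as np
from scipy.optimize import brentq

xb, yb = 1.0, 0.5   # illustrative endpoint (origin at the start, y downward)
g = 9.8             # m/s^2

# Solve (1 - cos t)/(t - sin t) = yb/xb for the final cycloid parameter theta_b.
def f(t):
    return (1.0 - np.cos(t)) / (t - np.sin(t)) - yb / xb

theta_b = brentq(f, 1e-6, 2.0 * np.pi - 1e-6)
A = 2.0 * yb / (1.0 - np.cos(theta_b))

# Along the cycloid, ds = A sin(theta/2) dtheta and v = sin(theta/2) sqrt(2gA),
# so dt = sqrt(A/(2g)) dtheta and the descent time is simply:
T_cycloid = theta_b * np.sqrt(A / (2.0 * g))

# Compare with the straight line y = (yb/xb) x, where v = sqrt(2g y):
m = yb / xb
T_line = np.sqrt(2.0 * (1.0 + m**2) * xb / (g * m))

print(theta_b, A, T_cycloid, T_line)  # the cycloid beats the straight line
```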
In classical mechanics, Hamilton's principle for conservative systems, which
states that the action is stationary, gives the familiar Euler-Lagrange
equations of classical mechanics. For a system with generalized coordinates
q_1, q_2, ..., q_n, the action is
$$S = \int_{t_0}^{t_1} L\left(\{q_i\}, \{\dot{q}_i\}, t\right) dt, \qquad (36)$$
where L is the Lagrangian. Requiring S to be stationary yields
$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} = 0, \qquad i = 1, 2, \ldots, n. \qquad (37)$$
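For instance, for a one-dimensional harmonic oscillator with L = ½m q̇² − ½k q², a sympy sketch (an added illustration, assuming sympy is available) recovers the expected equation of motion m q̈ = −k q:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t, m, k = sp.symbols('t m k', positive=True)
q = sp.Function('q')

# Harmonic-oscillator Lagrangian: L = (1/2) m qdot^2 - (1/2) k q^2
L = sp.Rational(1, 2) * m * q(t).diff(t)**2 - sp.Rational(1, 2) * k * q(t)**2

# Euler-Lagrange equation: dL/dq - d/dt(dL/dqdot) = 0
print(euler_equations(L, [q(t)], t))  # [Eq(-k*q(t) - m*Derivative(q(t), (t, 2)), 0)]
```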

3 Relation to the Sturm-Liouville Problem


Suppose we have the Sturm-Liouville operator
$$L = \frac{d}{dx}\,p(x)\,\frac{d}{dx} - q(x), \qquad (38)$$
with x ∈ (0, U) and Dirichlet boundary conditions. We are interested in
solving the inhomogeneous equation Lf = g, where g is a given function.
Consider the functional
$$J = \int_0^U \left(p f'^2 + q f^2 + 2gf\right) dx. \qquad (39)$$

The Euler-Lagrange equation for J to be an extremum is
$$\frac{\partial F}{\partial f} - \frac{d}{dx}\left(\frac{\partial F}{\partial f'}\right) = 0, \qquad (40)$$
where F = p f'² + q f² + 2gf. We have
$$\frac{\partial F}{\partial f} = 2qf + 2g, \qquad (41)$$
$$\frac{d}{dx}\left(\frac{\partial F}{\partial f'}\right) = 2p'f' + 2pf''. \qquad (42)$$
Substituting into the Euler-Lagrange equation gives
$$\frac{d}{dx}\left[p(x)\frac{d}{dx}f(x)\right] - q(x)f(x) = g(x). \qquad (43)$$
This is the Sturm-Liouville equation! That is, the Sturm-Liouville differential
equation is just the Euler-Lagrange equation for the functional J.
We have the following theorem:

Theorem: The solution to
$$\frac{d}{dx}\left[p(x)\frac{d}{dx}f(x)\right] - q(x)f(x) = g(x), \qquad (44)$$
where p(x) > 0, q(x) ≥ 0, with boundary conditions f(0) = a and
f(U) = b, exists and is unique.

Proof: First, suppose there exist two solutions, f_1 and f_2. Then d = f_1 − f_2
must satisfy the homogeneous equation
$$\frac{d}{dx}\left[p(x)\frac{d}{dx}d(x)\right] - q(x)d(x) = 0, \qquad (45)$$
with homogeneous boundary conditions d(0) = d(U) = 0. Now multiply
Equation 45 by d(x) and integrate. Integrating by parts,
$$\int_0^U d(x)\,\frac{d}{dx}\left[p(x)\,d'(x)\right] dx = \left[d(x)\,p(x)\,d'(x)\right]_0^U - \int_0^U p\,d'^2\,dx = -\int_0^U p\,d'^2\,dx, \qquad (46)$$
where the boundary term vanishes because d(0) = d(U) = 0. Thus,
$$\int_0^U \left(p\,d'^2 + q\,d^2\right) dx = 0. \qquad (47)$$
Since p d'² ≥ 0 and q d² ≥ 0, we must thus have p d'² = 0 and q d² = 0
in order for the integral to vanish. Since p > 0 and p d'² = 0, it must
be true that d' = 0, that is, d is a constant. But d(0) = 0, therefore
d(x) = 0. The solution, if it exists, is unique.
The issue for existence is the boundary conditions. We presume that
a solution to the differential equation exists for some boundary con-
ditions, and must show that a solution exists for the given boundary
The issue for existence is the boundary conditions. We presume that
a solution to the differential equation exists for some boundary con-
ditions, and must show that a solution exists for the given boundary

condition. From elementary calculus we know that two linearly inde-
pendent solutions to the homogeneous differential equation exist. Let
h_1(x) be a non-trivial solution to the homogeneous differential equation
with h_1(0) = 0. This must be possible because we can take a suitable
linear combination of our two solutions. Because, by the uniqueness
result just shown, the only solution of the homogeneous equation that
vanishes at both endpoints is the trivial one, it must be true that h_1(U) ≠ 0.
Likewise, let h_2(x) be a solution to the homogeneous equation with
h_2(U) = 0 (and therefore h_2(0) ≠ 0). Suppose f_0(x) is a solution to
the inhomogeneous equation satisfying some boundary condition. Form
the function:
$$f(x) = f_0(x) + k_1 h_1(x) + k_2 h_2(x). \qquad (48)$$
We adjust the constants k_1 and k_2 in order to satisfy the desired boundary
conditions:
$$a = f_0(0) + k_2 h_2(0), \qquad (49)$$
$$b = f_0(U) + k_1 h_1(U). \qquad (50)$$
That is,
$$k_1 = \frac{b - f_0(U)}{h_1(U)}, \qquad (51)$$
$$k_2 = \frac{a - f_0(0)}{h_2(0)}. \qquad (52)$$

We have demonstrated existence of a solution.

This discussion leads us to the variational calculus theorem:

Theorem: For continuously differentiable functions in (0, U) satisfying f(0) =
a and f(U) = b, the functional
$$J = \int_0^U \left(p f'^2 + q f^2 + 2gf\right) dx, \qquad (53)$$
with p(x) > 0 and q(x) ≥ 0, attains its minimum if and only if f(x) is
the solution of the corresponding Sturm-Liouville equation.

Proof: Let s(x) be the unique solution to the Sturm-Liouville equation sat-
isfying the given boundary conditions. Let f(x) be any other continu-
ously differentiable function satisfying the boundary conditions. Then
d(x) ≡ f(x) − s(x) is continuously differentiable and d(0) = d(U) = 0.
Solving for f and squaring, and doing the same for the derivative equation,
yields
$$f^2 = d^2 + s^2 + 2sd, \qquad (54)$$
$$f'^2 = d'^2 + s'^2 + 2s'd'. \qquad (55)$$

Let
$$\Delta J \equiv J(f) - J(s) \qquad (56)$$
$$= \int_0^U \left(p f'^2 + q f^2 + 2gf - p s'^2 - q s^2 - 2gs\right) dx \qquad (57)$$
$$= \int_0^U \left[p\left(d'^2 + 2s'd'\right) + q\left(d^2 + 2ds\right) + 2gd\right] dx \qquad (58)$$
$$= 2\int_0^U \left(p d's' + q ds + g d\right) dx + \int_0^U \left(p d'^2 + q d^2\right) dx. \qquad (59)$$

But
$$\int_0^U \left(p d's' + q ds + g d\right) dx = \left[d\,p s'\right]_0^U + \int_0^U d(x)\left[-\frac{d}{dx}\left(p s'\right) + qs + g\right] dx$$
$$= \int_0^U d(x)\left[-\frac{d}{dx}\left(p s'\right) + qs + g\right] dx, \quad \text{since } d(0) = d(U) = 0,$$
$$= 0, \quad \text{the integrand being zero by the differential equation.} \qquad (60)$$

Thus, we have that


Z U 
∆J = pd0 + qd2 dx ≥ 0. (61)
0

In other words, f does no better than s, hence s corresponds to a


minimum. Furthermore, if ∆J = 0, then d = 0, since p > 0 implies d0
must be zero, and therefore d is constant, but we know d(0) = 0, hence
d = 0. Thus, f = s at the minimum.
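As a numerical illustration of this theorem (a toy case chosen here, not from the discussion above), take p = 1, q = 0, g = −1 on (0, 1) with f(0) = f(1) = 0, so the Sturm-Liouville solution is s(x) = x(1 − x)/2. Adding any admissible perturbation d(x) should increase J, by an amount of order ε²:

```python
import numpy as np
from scipy.integrate import quad

g = lambda x: -1.0
s = lambda x: 0.5 * x * (1.0 - x)      # solves f'' = g with f(0) = f(1) = 0
ds = lambda x: 0.5 - x

def J(f, df):
    # J = integral of (p f'^2 + q f^2 + 2 g f) dx, with p = 1 and q = 0
    return quad(lambda x: df(x)**2 + 2.0 * g(x) * f(x), 0.0, 1.0)[0]

print(J(s, ds))                        # J at the solution, about -1/12
for eps in (0.5, 0.1, 0.01):           # admissible perturbation d = eps*sin(pi x)
    f  = lambda x, e=eps: s(x) + e * np.sin(np.pi * x)
    df = lambda x, e=eps: ds(x) + e * np.pi * np.cos(np.pi * x)
    print(eps, J(f, df) - J(s, ds))    # Delta J > 0, shrinking as eps^2
```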

4 The Rayleigh-Ritz Method


Consider the Sturm-Liouville problem:
$$\frac{d}{dx}\left[p(x)\frac{d}{dx}f(x)\right] - q(x)f(x) = g(x), \qquad (62)$$
with p > 0, q ≥ 0, and specified boundary conditions. For simplicity here,
let's assume f(0) = f(U) = 0. Imagine expanding the solution in some set
of complete functions, {β_n(x)} (not necessarily eigenfunctions):
$$f(x) = \sum_{n=1}^{\infty} A_n \beta_n(x).$$

We have just shown that our problem is equivalent to minimizing
$$J = \int_0^U \left(p f'^2 + q f^2 + 2gf\right) dx. \qquad (63)$$
Substitute in our expansion, noting that
$$p f'^2 = \sum_m \sum_n A_m A_n\, p(x)\, \beta_m'(x)\, \beta_n'(x). \qquad (64)$$
m n

Let
$$C_{mn} \equiv \int_0^U p\, \beta_m' \beta_n'\,dx, \qquad (65)$$
$$B_{mn} \equiv \int_0^U q\, \beta_m \beta_n\,dx, \qquad (66)$$
$$G_n \equiv \int_0^U g\, \beta_n\,dx. \qquad (67)$$

Assume that we can interchange the sum and integral, obtaining, for example,
$$\int_0^U p f'^2\,dx = \sum_m \sum_n C_{mn} A_m A_n. \qquad (68)$$
Then
$$J = \sum_m \sum_n \left(C_{mn} + B_{mn}\right) A_m A_n + 2\sum_n G_n A_n. \qquad (69)$$

Let D_mn ≡ C_mn + B_mn = D_nm. The D_mn and G_n are known, at least in
principle. We wish to solve for the expansion coefficients {A_n}. To accom-
plish this, use the condition that J is a minimum, that is,
$$\frac{\partial J}{\partial A_n} = 0, \quad \forall n. \qquad (70)$$
Thus,
$$0 = \frac{\partial J}{\partial A_n} = 2\left(\sum_{m=1}^{\infty} D_{nm} A_m + G_n\right), \qquad n = 1, 2, \ldots \qquad (71)$$
This is an infinite system of coupled inhomogeneous equations. If D_nm is
diagonal, the solution is simple:
$$A_n = -G_n / D_{nn}. \qquad (72)$$

The reader is encouraged to demonstrate that this occurs if the β_n are the
eigenfunctions of the Sturm-Liouville operator.
It may be too difficult to solve the eigenvalue problem. In this case, we can
look for an approximate solution via the "Rayleigh-Ritz" approach: choose
some finite number of linearly independent functions {α_1(x), α_2(x), ..., α_N(x)}.
In order to find a function
$$\bar{f}(x) = \sum_{n=1}^{N} \bar{A}_n \alpha_n(x) \qquad (73)$$
that closely approximates f(x), we find the values of the Ā_n that minimize
$$J(\bar{f}) = \sum_{n,m=1}^{N} \bar{D}_{nm} \bar{A}_m \bar{A}_n + 2\sum_{n=1}^{N} \bar{G}_n \bar{A}_n, \qquad (74)$$
where now
$$\bar{D}_{nm} \equiv \int_0^U \left(p\,\alpha_n'\alpha_m' + q\,\alpha_n\alpha_m\right) dx, \qquad (75)$$
$$\bar{G}_n \equiv \int_0^U g\,\alpha_n\,dx. \qquad (76)$$

The minimum of J(f̄) is at
$$\sum_{m=1}^{N} \bar{D}_{nm} \bar{A}_m + \bar{G}_n = 0, \qquad n = 1, 2, \ldots, N. \qquad (77)$$

In this method, it is important to make a good guess for the set of functions
{αn }.
It may be remarked that the Rayleigh-Ritz method is similar in spirit
but different from the variational method we typically introduce in quantum
mechanics, for example when attempting to compute the ground state energy
of the helium atom. In that case, we adjust parameters in a non-linear
function, while in the Rayleigh-Ritz method we adjust the linear coefficients
in an expansion.
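To make this concrete, here is a small numerical sketch of the Rayleigh-Ritz system (77), applied to the same toy problem as in the earlier sketch: p = 1, q = 0, g = −1 on (0, 1) with f(0) = f(1) = 0, whose exact solution is f(x) = x(1 − x)/2. The sine basis is an assumption of this example, chosen because it satisfies the boundary conditions:

```python
import numpy as np
from scipy.integrate import quad

U, N = 1.0, 5                       # interval (0, U), number of basis functions
p = lambda x: 1.0                   # Sturm-Liouville p(x)
q = lambda x: 0.0                   # Sturm-Liouville q(x)
g = lambda x: -1.0                  # inhomogeneous term; exact solution x(1-x)/2

# Basis alpha_n(x) = sin(n pi x / U) satisfies the boundary conditions.
alpha  = lambda n, x: np.sin(n * np.pi * x / U)
dalpha = lambda n, x: (n * np.pi / U) * np.cos(n * np.pi * x / U)

D = np.array([[quad(lambda x: p(x) * dalpha(n, x) * dalpha(m, x)
                              + q(x) * alpha(n, x) * alpha(m, x), 0, U)[0]
               for m in range(1, N + 1)] for n in range(1, N + 1)])
G = np.array([quad(lambda x: g(x) * alpha(n, x), 0, U)[0]
              for n in range(1, N + 1)])

A = np.linalg.solve(D, -G)          # solve sum_m D_nm A_m + G_n = 0

f_bar = lambda x: sum(A[n - 1] * alpha(n, x) for n in range(1, N + 1))
print(f_bar(0.5), 0.125)            # approximation vs exact value at x = 0.5
```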

5 Adding Constraints
As in ordinary extremum problems, constraints introduce correlations, now
in the possible variations of the function at different points. As with the
ordinary problem, we may employ the method of Lagrange multipliers to
impose the constraints.

We consider the case of the "isoperimetric problem": to find the stationary
points of the functional
$$J = \int_a^b F(f, f', x)\,dx, \qquad (78)$$
under variations δf vanishing at x = a, b, with the constraint that
$$C \equiv \int_a^b G(f, f', x)\,dx \qquad (79)$$
is constant under variations.


We have the following theorem:

Theorem: (Euler) The function f that solves this problem also makes the
functional I = J + λC stationary for some λ, as long as δC/δf ≠ 0 (i.e., f
does not satisfy the Euler-Lagrange equation for C).

Proof: (partial) We make stationary the integral
$$I = J + \lambda C = \int_a^b (F + \lambda G)\,dx. \qquad (80)$$
That is, f must satisfy
$$\frac{\partial F}{\partial f} - \frac{d}{dx}\left(\frac{\partial F}{\partial f'}\right) + \lambda\left[\frac{\partial G}{\partial f} - \frac{d}{dx}\left(\frac{\partial G}{\partial f'}\right)\right] = 0. \qquad (81)$$

Multiply by the variation δf(x) and integrate:
$$\int_a^b \left[\frac{\partial F}{\partial f} - \frac{d}{dx}\left(\frac{\partial F}{\partial f'}\right)\right]\delta f(x)\,dx + \lambda \int_a^b \left[\frac{\partial G}{\partial f} - \frac{d}{dx}\left(\frac{\partial G}{\partial f'}\right)\right]\delta f(x)\,dx = 0. \qquad (82)$$
Here, δf(x) is arbitrary. However, only those variations that keep C
invariant are allowed (e.g., take the partial derivative with respect to λ and
require it to be zero):
$$\delta C = \int_a^b \left[\frac{\partial G}{\partial f} - \frac{d}{dx}\left(\frac{\partial G}{\partial f'}\right)\right]\delta f(x)\,dx = 0. \qquad (83)$$

5.1 Example: Catenary


A heavy chain is suspended from endpoints at (x1 , y1 ) and (x2 , y2 ). What
curve describes its equilibrium position, under a uniform gravitational field?

The solution must minimize the potential energy:
$$V = g\int_1^2 y\,dm \qquad (84)$$
$$= \rho g \int_1^2 y\,ds \qquad (85)$$
$$= \rho g \int_{x_1}^{x_2} y\sqrt{1 + y'^2}\,dx, \qquad (86)$$
where ρ is the linear density of the chain, and the distance element along the
chain is ds = √(1 + y'²) dx.
We wish to minimize V, under the constraint that the length of the chain
is L, a constant. We have
$$L = \int_1^2 ds = \int_{x_1}^{x_2} \sqrt{1 + y'^2}\,dx. \qquad (87)$$

To solve, let (absorbing a factor of ρg into the Lagrange multiplier and
dividing ρg out of the problem)
$$F(y, y', x) = y\sqrt{1 + y'^2} + \lambda\sqrt{1 + y'^2}, \qquad (88)$$
and solve the Euler-Lagrange equation for F.
Notice that F does not depend explicitly on x, so we again use our short-
cut that
$$F - y'\,\frac{\partial F}{\partial y'} = \text{constant} = C. \qquad (89)$$
Thus,
$$C = F - y'\,\frac{\partial F}{\partial y'} \qquad (90)$$
$$= (y + \lambda)\left(\sqrt{1 + y'^2} - \frac{y'^2}{\sqrt{1 + y'^2}}\right) \qquad (91)$$
$$= \frac{y + \lambda}{\sqrt{1 + y'^2}}. \qquad (92)$$
Some manipulation yields
$$\frac{dy}{\sqrt{(y + \lambda)^2 - C^2}} = \frac{dx}{C}. \qquad (93)$$

With the substitution y + λ = C cosh θ, we obtain θ = (x + k)/C, where k is an
integration constant, and thus
$$y + \lambda = C\cosh\left(\frac{x + k}{C}\right). \qquad (94)$$

There are three unknown constants to determine in this expression: C, k,
and λ. We have three equations to use for this:
$$y_1 + \lambda = C\cosh\left(\frac{x_1 + k}{C}\right), \qquad (95)$$
$$y_2 + \lambda = C\cosh\left(\frac{x_2 + k}{C}\right), \qquad (96)$$
$$L = \int_{x_1}^{x_2} \sqrt{1 + y'^2}\,dx. \qquad (97)$$
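As an illustrative numerical sketch (the endpoints, chain length, and initial guess below are assumptions of this example), note that y' = sinh((x + k)/C), so the length integral evaluates in closed form to L = C[sinh((x_2 + k)/C) − sinh((x_1 + k)/C)], and the three conditions can be handed to a root finder:

```python
import numpy as np
from scipy.optimize import fsolve

x1, y1 = 0.0, 0.0            # illustrative endpoints and chain length
x2, y2 = 1.0, 0.0
L = 1.5                      # must exceed the straight-line distance

def equations(u):
    C, k, lam = u
    return [C * np.cosh((x1 + k) / C) - (y1 + lam),                   # Eq. (95)
            C * np.cosh((x2 + k) / C) - (y2 + lam),                   # Eq. (96)
            C * (np.sinh((x2 + k) / C) - np.sinh((x1 + k) / C)) - L]  # Eq. (97)

C, k, lam = fsolve(equations, [0.3, -0.5, 0.5])   # initial guess matters here
y = lambda x: C * np.cosh((x + k) / C) - lam
print(C, k, lam, y(0.0), y(1.0))                  # endpoints should be recovered
```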

6 Eigenvalue Problems
We may treat the eigenvalue problem as a variational problem. As an exam-
ple, consider again the Sturm-Liouville eigenvalue equation:
$$\frac{d}{dx}\left[p(x)\frac{df(x)}{dx}\right] - q(x)f(x) = -\lambda w(x)f(x), \qquad (98)$$
with boundary conditions f(0) = f(U) = 0. This is of the form
$$Lf = -\lambda w f. \qquad (99)$$

Earlier, we found that the desired functional to make stationary was, for
Lf = 0,
$$I = \int_0^U \left(p f'^2 + q f^2\right) dx. \qquad (100)$$
We modify this to the eigenvalue problem with q → q − λw, obtaining
$$I = \int_0^U \left(p f'^2 + q f^2 - \lambda w f^2\right) dx, \qquad (101)$$
whose Euler-Lagrange equation gives the desired Sturm-Liouville
equation. Note that λ is an unknown parameter, which we want to determine.
It is natural to regard the eigenvalue problem as a variational problem
with constraints. Thus, we wish to vary f(x) so that
$$J = \int_0^U \left(p f'^2 + q f^2\right) dx \qquad (102)$$
is stationary, with the constraint
$$C = \int_0^U w f^2\,dx = \text{constant}. \qquad (103)$$

Notice here that we may take C = 1, corresponding to eigenfunctions f
normalized with respect to the weight w.
Let's attempt to find approximate solutions using the Rayleigh-Ritz method.
Expand
$$f(x) = \sum_{n=1}^{\infty} A_n u_n(x), \qquad (104)$$
where u_n(0) = u_n(U) = 0. The u_n are some set of expansion functions, not the
eigenfunctions – if they were the eigenfunctions, then the problem would already be
solved! Substitute this into I, giving
$$I = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} \left(C_{mn} - \lambda D_{mn}\right) A_m A_n, \qquad (105)$$

where
$$C_{mn} \equiv \int_0^U \left(p\,u_m' u_n' + q\,u_m u_n\right) dx, \qquad (106)$$
$$D_{mn} \equiv \int_0^U w\,u_m u_n\,dx. \qquad (107)$$

Requiring I to be stationary,
$$\frac{\partial I}{\partial A_m} = 0, \qquad m = 1, 2, \ldots, \qquad (108)$$
yields the infinite set of coupled homogeneous equations
$$\sum_{n=1}^{\infty} \left(C_{mn} - \lambda D_{mn}\right) A_n = 0, \qquad m = 1, 2, \ldots \qquad (109)$$

This is perhaps no simpler to solve than the original differential equation.
However, we may make approximate solutions for f(x) by selecting a finite
set of linearly independent functions α_1, ..., α_N and letting
$$\bar{f}(x) = \sum_{n=1}^{N} \bar{A}_n \alpha_n(x). \qquad (110)$$
Solve for the "best" approximation of this form by finding those {Ā_n} that
satisfy
$$\sum_{n=1}^{N} \left(\bar{C}_{mn} - \bar{\lambda}\bar{D}_{mn}\right) \bar{A}_n = 0, \qquad m = 1, 2, \ldots, N, \qquad (111)$$
where
$$\bar{C}_{mn} \equiv \int_0^U \left(p\,\alpha_m'\alpha_n' + q\,\alpha_m\alpha_n\right) dx, \qquad (112)$$
$$\bar{D}_{mn} \equiv \int_0^U w\,\alpha_m\alpha_n\,dx. \qquad (113)$$

This looks like N equations in the N + 1 unknowns λ̄, {Ā_n}, but the overall
normalization of the Ā_n's is arbitrary. Hence there are enough equations in
principle, and we obtain
$$\bar{\lambda} = \frac{\sum_{m,n=1}^{N} \bar{C}_{mn} \bar{A}_m \bar{A}_n}{\sum_{m,n=1}^{N} \bar{D}_{mn} \bar{A}_m \bar{A}_n}. \qquad (114)$$

Notice the similarity of Eqn. 114 with
$$\lambda = \frac{\int_0^U \left(p f'^2 + q f^2\right) dx}{\int_0^U w f^2\,dx} = \frac{J(f)}{C(f)}. \qquad (115)$$

This follows since I = 0 for f a solution to the Sturm-Liouville equation:
$$I = \int_0^U \left(p f'^2 + q f^2 - \lambda w f^2\right) dx$$
$$= \left[p f f'\right]_0^U + \int_0^U \left[-f\frac{d}{dx}\left(p f'\right) + q f^2 - \lambda w f^2\right] dx$$
$$= 0 + \int_0^U \left(-q f^2 + \lambda w f^2 + q f^2 - \lambda w f^2\right) dx = 0, \qquad (116)$$
where we have used both the boundary condition f(0) = f(U) = 0 and the
Sturm-Liouville equation d/dx(p f') = q f − λ w f to obtain the third line. Also,
$$\bar{\lambda} = \frac{J(\bar{f})}{C(\bar{f})}, \qquad (117)$$

since, for example,
$$J(\bar{f}) = \int_0^U \left(p\bar{f}'^2 + q\bar{f}^2\right) dx = \int_0^U \left(p\sum_{m,n}\bar{A}_n\bar{A}_m\alpha_m'\alpha_n' + q\sum_{m,n}\bar{A}_n\bar{A}_m\alpha_m\alpha_n\right) dx = \sum_{m,n}\bar{C}_{mn}\bar{A}_n\bar{A}_m. \qquad (118)$$

That is, if f̄ is "close" to an eigenfunction f, then λ̄ should be "close" to an
eigenvalue λ.
Let's try an example: find the lowest eigenvalue of f'' = −λf, with
boundary conditions f(±1) = 0. We of course readily see that the first
eigenfunction is cos(πx/2), with λ_1 = π²/4, but let's try our method to see
how we do. For simplicity, we'll try a Rayleigh-Ritz approximation with only
one term in the sum.
As we noted earlier, it is a good idea to pick the functions with some
care. In this case, we know that the lowest eigenfunction won't wiggle much,
and a good guess is that it will be symmetric with no zeros in the interval
(−1, 1). Such a function, which satisfies the boundary conditions, is:
 
$$\bar{f}(x) = \bar{A}\left(1 - x^2\right), \qquad (119)$$
and we'll try it. In the Sturm-Liouville form, we have p(x) = 1, q(x) = 0, and
w(x) = 1. With N = 1, we have α_1 = α = 1 − x², and
$$C \equiv \bar{C}_{11} = \int_{-1}^{1} \left(p\,\alpha'^2 + q\,\alpha^2\right) dx = \int_{-1}^{1} 4x^2\,dx = \frac{8}{3}. \qquad (120, 121)$$
Also,
$$D \equiv \bar{D}_{11} = \int_{-1}^{1} w\,\alpha^2\,dx = \int_{-1}^{1} \left(1 - x^2\right)^2 dx = \frac{16}{15}. \qquad (122)$$
The equation
$$\sum_{n=1}^{N} \left(\bar{C}_{mn} - \bar{\lambda}\bar{D}_{mn}\right) \bar{A}_n = 0, \qquad m = 1, 2, \ldots, N, \qquad (123)$$
becomes
$$(C - \bar{\lambda}D)\bar{A} = 0. \qquad (124)$$
If Ā ≠ 0, then
$$\bar{\lambda} = \frac{C}{D} = \frac{5}{2}. \qquad (125)$$
We are within 2% of the actual lowest eigenvalue of λ_1 = π²/4 ≈ 2.467. Of
course, this rather good result is partly due to our good fortune in picking a
close approximation to the actual eigenfunction, as may be seen in Fig. 3.
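A small numerical check of this estimate computes C and D by quadrature; with more than one basis function, the same ingredients would feed a generalized eigenvalue solver such as scipy.linalg.eigh(C, D). A minimal sketch:

```python
import numpy as np
from scipy.integrate import quad

# Test function alpha(x) = 1 - x^2 on (-1, 1); here p = 1, q = 0, w = 1.
alpha  = lambda x: 1.0 - x**2
dalpha = lambda x: -2.0 * x

C, _ = quad(lambda x: dalpha(x)**2, -1.0, 1.0)   # C = 8/3
D, _ = quad(lambda x: alpha(x)**2, -1.0, 1.0)    # D = 16/15

lam_bar = C / D
print(lam_bar, np.pi**2 / 4)  # 2.5 vs exact 2.4674...
```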

Figure 3: Rayleigh-Ritz eigenvalue estimation example, comparing the exact
solution with the guessed approximation.

7 Extending to Multiple Dimensions


It is possible to generalize our variational problem to multiple independent
variables, e.g.,
$$I(u) = \iint_D F\left(u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, x, y\right) dx\,dy, \qquad (126)$$
where u = u(x, y), and the bounded region D has u(x, y) specified on its bound-
ary S. We wish to find u such that I is stationary with respect to variation
of u.
We proceed along the same lines as before, letting
$$u(x, y) \to u(x, y) + \epsilon h(x, y), \qquad (127)$$
where h(x, y)|_S = 0. We look for stationary I: dI/dε|_{ε=0} = 0. Let
$$u_x \equiv \frac{\partial u}{\partial x}, \quad u_y \equiv \frac{\partial u}{\partial y}, \quad h_x \equiv \frac{\partial h}{\partial x}, \quad \text{etc.} \qquad (128)$$
Then
$$\frac{dI}{d\epsilon} = \iint_D \left(\frac{\partial F}{\partial u} h + \frac{\partial F}{\partial u_x} h_x + \frac{\partial F}{\partial u_y} h_y\right) dx\,dy. \qquad (129)$$

We want to "integrate by parts" the last two terms, in analogy with the
single-variable case. Recall Green's theorem:
$$\oint_S (P\,dx + Q\,dy) = \iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy, \qquad (130)$$
and let
$$P = -h\frac{\partial F}{\partial u_y}, \qquad Q = h\frac{\partial F}{\partial u_x}. \qquad (131)$$
With some algebra, we find that
$$\frac{dI}{d\epsilon} = \oint_S h\left(\frac{\partial F}{\partial u_x}\,dy - \frac{\partial F}{\partial u_y}\,dx\right) + \iint_D h\left[\frac{\partial F}{\partial u} - \frac{D}{Dx}\left(\frac{\partial F}{\partial u_x}\right) - \frac{D}{Dy}\left(\frac{\partial F}{\partial u_y}\right)\right] dx\,dy, \qquad (132)$$
where
$$\frac{Df}{Dx} \equiv \frac{\partial f}{\partial x} + \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial u_x}\frac{\partial^2 u}{\partial x^2} + \frac{\partial f}{\partial u_y}\frac{\partial^2 u}{\partial x\,\partial y} \qquad (133)$$
is the "total partial derivative" with respect to x.
The boundary integral over S is zero, since h vanishes on S. The re-
maining double integral over D must be zero for arbitrary functions h, and
hence,
$$\frac{\partial F}{\partial u} - \frac{D}{Dx}\left(\frac{\partial F}{\partial u_x}\right) - \frac{D}{Dy}\left(\frac{\partial F}{\partial u_y}\right) = 0. \qquad (134)$$
This result is once again called the Euler-Lagrange equation.
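For example, taking F = ½(u_x² + u_y²) gives ∂F/∂u = 0, ∂F/∂u_x = u_x, and ∂F/∂u_y = u_y, so Equation 134 reduces to Laplace's equation:
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.$$
That is, harmonic functions make the Dirichlet integral stationary.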

8 Exercises
1. Suppose you have a string of length L. Pin one end at (x, y) = (0, 0)
and the other end at (x, y) = (b, 0). Form the string into a curve such
that the area between the string and the x axis is maximal. Assume
that b and L are fixed, with L > b. What is the curve formed by the
string?
2. We considered the application of the Rayleigh-Ritz method to finding
approximate eigenvalues satisfying
$$y'' = -\lambda y, \qquad (135)$$
with boundary conditions y(−1) = y(1) = 0. Repeat the method, now
with two functions:
$$\alpha_1(x) = 1 - x^2, \qquad (136)$$
$$\alpha_2(x) = x^2\left(1 - x^2\right). \qquad (137)$$

You should get estimates for two eigenvalues. Compare with the exact
eigenvalues, including a discussion of which eigenvalues you have man-
aged to approximate and why. If the eigenvalues you obtain are not
the two lowest, suggest another function you might have used to get
the lowest two.

3. The Bessel differential equation is
$$\frac{d^2 y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left(k^2 - \frac{m^2}{x^2}\right) y = 0. \qquad (138)$$
A solution is y(x) = J_m(kx), the mth-order Bessel function. Assume
a boundary condition y(1) = 0; that is, k is a root of J_m(x). Use the
Rayleigh-Ritz method to estimate the first non-zero root of J_3(x). I
suggest you try to do this with one test function, rather than a sum of
multiple functions. But you must choose the function with some care.
In particular, note that J_3 has a third-order root at x = 0. You should
compare your result with the actual value of 6.379. If you get within,
say, 15% of this, declare victory.
