Sturm-Liouville Problems
Sturm-Liouville problems are boundary-value problems that naturally arise when solving certain partial differential equation problems using a separation of variables method that will be
discussed in a later chapter. It is the theory behind Sturm-Liouville problems that, ultimately,
justifies the separation of variables method for these partial differential equation problems.
The simplest applications lead to the various Fourier series, and less simple applications lead to
generalizations of Fourier series involving Bessel functions, Hermite polynomials, etc.
Unfortunately, there are several difficulties with our studying Sturm-Liouville problems:
1. Motivation: Had we the time, we would first discuss partial differential equation problems and develop the separation of variables method for solving certain important types of problems involving partial differential equations. We would then see how these Sturm-Liouville problems arise and why they are so important. But we don't have time. Instead, I'll briefly remind you of some results from linear algebra that are analogous to the results we will, eventually, obtain.
2. Another difficulty is that the simplest examples (which are very important, since they lead to the Fourier series) are too simple to really illustrate certain elements of the theory, while the other standard examples tend to get complicated and require additional tricks which distract from illustrating the theory. We'll deal with this as well as we can.
3. The material gets theoretical. Sorry, there is no way around this. The end results, however, are very useful in computations, especially now that we have computers to do the tedious computations. I hope we get to that point.
4. Finally, I must warn you that, in most texts, the presentation of the Sturm-Liouville theory stinks. In most introductory ordinary differential equation texts, this material is usually near the end, which usually means that the author just wants to finish the damn book and get it published. In texts on partial differential equations, the results are usually quoted with some indication that the proofs can be found in a good text on differential equations.
12/7/2014
52.1 A Review of Some Linear Algebra
Let us briefly review some elements from linear algebra which we will be recreating using
functions instead of finite-dimensional vectors. But first, let me remind you of a little complex
variable algebra. This is because our column vectors and matrices may have complex values.
Let
$$\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{bmatrix}
\qquad\text{and}\qquad
\mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}$$
where the $v_k$'s and $w_k$'s are real numbers.
Recall that the norm (or length) of the above $\mathbf{v}$, denoted $\|\mathbf{v}\|$, is computed by
$$\|\mathbf{v}\| = \sqrt{|v_1|^2 + |v_2|^2 + \cdots + |v_N|^2}$$
and that, computationally, the classic dot product of the above $\mathbf{v}$ and $\mathbf{w}$ is
$$\mathbf{v}\cdot\mathbf{w} = v_1 w_1 + v_2 w_2 + \cdots + v_N w_N
= [v_1, v_2, \ldots, v_N]\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}
= \mathbf{v}^T\mathbf{w}$$
where $\mathbf{v}^T\mathbf{w}$ is the matrix product with $\mathbf{v}^T$ being the transpose of matrix $\mathbf{v}$ (i.e., the matrix constructed from $\mathbf{v}$ by switching rows with columns).
Observe that, since $v^2 = |v|^2$ when $v$ is real,
$$\mathbf{v}\cdot\mathbf{v} = v_1 v_1 + v_2 v_2 + \cdots + v_N v_N = \|\mathbf{v}\|^2 .$$
Also recall that a set of vectors $\{\mathbf{v}^1, \mathbf{v}^2, \mathbf{v}^3, \ldots\}$ is said to be orthogonal if and only if
$$\mathbf{v}^j \cdot \mathbf{v}^k = 0 \qquad\text{whenever}\qquad j \neq k .$$
And finally, recall that a matrix $A$ is symmetric if and only if $A^T = A$, and that, for symmetric matrices, we have the following theorem from linear algebra (briefly mentioned in chapter 38):
Theorem 52.1
Let $A$ be a symmetric $N\times N$ matrix (with real-valued components). Then both of the following hold:
1. All the eigenvalues of $A$ are real.
2. There is an orthogonal basis for the space of $N$-dimensional vectors consisting entirely of eigenvectors of $A$.
Ultimately, we will obtain an analog to this theorem involving a linear differential operator
instead of a matrix A .
Now if $x$ is a real number (positive, negative or zero), then $x^2 = |x|^2$. However, if $z = x + iy$ is complex with $y \neq 0$, you can easily verify that $z^2 \neq |z|^2$. Instead, we have
$$|z|^2 = x^2 + y^2 = x^2 - (iy)^2 = (x - iy)(x + iy) = z^* z$$
where $z^* = x - iy$ is the complex conjugate of $z$.
With this in mind, now let
$$\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{bmatrix}
\qquad\text{and}\qquad
\mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}$$
where the $v_k$'s and $w_k$'s are complex numbers. Since the components are complex, we define the complex conjugate of $\mathbf{v}$ in the obvious way,
$$\mathbf{v}^* = \begin{bmatrix} v_1^* \\ v_2^* \\ \vdots \\ v_N^* \end{bmatrix} ,$$
while the norm of $\mathbf{v}$ is still given by
$$\|\mathbf{v}\| = \sqrt{|v_1|^2 + |v_2|^2 + \cdots + |v_N|^2} .$$
However, when we try to relate this to the classic dot product $\mathbf{v}\cdot\mathbf{v}$, we get
$$\|\mathbf{v}\|^2 = |v_1|^2 + |v_2|^2 + \cdots + |v_N|^2
= v_1^* v_1 + v_2^* v_2 + \cdots + v_N^* v_N
= (\mathbf{v}^*)\cdot\mathbf{v} \;\neq\; \mathbf{v}\cdot\mathbf{v} .$$
This suggests that, instead of using the classic dot product, we use the (standard vector) inner product of $\mathbf{v}$ with $\mathbf{w}$, which is denoted by $\langle\, \mathbf{v} \,|\, \mathbf{w} \,\rangle$ and defined by
$$\langle\, \mathbf{v} \,|\, \mathbf{w} \,\rangle = \mathbf{v}^*\cdot\mathbf{w}
= v_1^* w_1 + v_2^* w_2 + \cdots + v_N^* w_N
= [v_1^*, v_2^*, \ldots, v_N^*]\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}
= (\mathbf{v}^*)^T\mathbf{w} .$$
Then
$$\langle\, \mathbf{v} \,|\, \mathbf{v} \,\rangle = \mathbf{v}^*\cdot\mathbf{v} = \|\mathbf{v}\|^2 .$$
We also adjust our notion of orthogonality by saying that any set $\{\mathbf{v}^1, \mathbf{v}^2, \ldots\}$ of vectors in $\mathbb{C}^N$ is orthogonal if and only if
$$\langle\, \mathbf{v}^m \,|\, \mathbf{v}^n \,\rangle = 0 \qquad\text{whenever}\qquad m \neq n .$$
The inner product for vectors with complex components is the mathematically natural extension of the standard dot product for vectors with real components. Some easily verified (and
useful) properties of this inner product are given in the next theorem. Verifying it will be left as
an exercise (see exercise 52.4).
Theorem 52.2
Let $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ be vectors in $\mathbb{C}^N$. Then:
1. $\langle\, \mathbf{v} \,|\, \mathbf{w} \,\rangle = \langle\, \mathbf{w} \,|\, \mathbf{v} \,\rangle^*$ ,
2. $\langle\, \mathbf{u} \,|\, \mathbf{v} + \mathbf{w} \,\rangle = \langle\, \mathbf{u} \,|\, \mathbf{v} \,\rangle + \langle\, \mathbf{u} \,|\, \mathbf{w} \,\rangle$ ,
3. $\langle\, \mathbf{v} + \mathbf{w} \,|\, \mathbf{u} \,\rangle = \langle\, \mathbf{v} \,|\, \mathbf{u} \,\rangle + \langle\, \mathbf{w} \,|\, \mathbf{u} \,\rangle$ , and
4. $\langle\, \mathbf{v} \,|\, \mathbf{v} \,\rangle = \|\mathbf{v}\|^2$ .
Later, we'll define other inner products for functions. These inner products will have very similar properties to those given in the last theorem.
Adjoints
The adjoint of any matrix $A$, denoted $A^*$, is the transpose of the complex conjugate of $A$,
$$A^* = \big(\overline{A}\big)^T \qquad \big(\text{equivalently, } \overline{A^T}\,\big) .$$
That is, $A^*$ is the matrix obtained from matrix $A$ by replacing each entry in $A$ with its complex conjugate, and then switching the rows and columns (or first switch the rows and columns and then replace the entries with their complex conjugates; you get the same result either way).
!Example 52.1: If
$$A = \begin{bmatrix} 1+2i & 3-4i & 5i \\ 6i & 7 & 8i \end{bmatrix}$$
then
$$A^* = \left( \begin{bmatrix} 1-2i & 3+4i & -5i \\ -6i & 7 & -8i \end{bmatrix} \right)^{\!T}
= \begin{bmatrix} 1-2i & -6i \\ 3+4i & 7 \\ -5i & -8i \end{bmatrix} .$$
This adjoint turns out to be more useful than the transpose when we allow vectors to have
complex components.
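The adjoint from example 52.1 can be checked numerically. The following is a minimal sketch assuming NumPy is available; the conjugate transpose is exactly `A.conj().T`:

```python
import numpy as np

# The 2x3 matrix A from example 52.1
A = np.array([[1 + 2j, 3 - 4j, 5j],
              [6j,     7,      8j]])

# The adjoint A* = (conjugate of A), transposed
A_star = A.conj().T

print(A_star)
```

The result is the 3x2 matrix computed by hand above.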
A matrix $A$ is self adjoint¹ if and only if $A^* = A$. Note that:
1. A self-adjoint matrix must be square, since otherwise $A^*$ and $A$ have different dimensions.
2. If $A$ is a square matrix with just real components, then $A^* = A^T$, and "$A$ is self adjoint" means the same as "$A$ is symmetric".
If you take the proof of theorem 52.1 and modify it to take into account the possibility of complex-valued components, you get

Theorem 52.3
Let $A$ be a self-adjoint $N\times N$ matrix. Then both of the following hold:
1. All the eigenvalues of $A$ are real.
2. There is an orthogonal basis for $\mathbb{C}^N$ consisting entirely of eigenvectors of $A$.
This theorem is noteworthy because it will help explain the source of some of the terminology
that we will later be using.
What is more noteworthy is what the above theorem says about computing $A\mathbf{v}$ when $A$ is self adjoint. To see this, let
$$\{ \mathbf{b}^1, \mathbf{b}^2, \mathbf{b}^3, \ldots, \mathbf{b}^N \}$$
be any orthogonal basis for $\mathbb{C}^N$ consisting of eigenvectors for $A$ (remember, the theorem says there is such a basis), and let
$$\{ \lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_N \}$$
be the corresponding set of eigenvalues (so $A\mathbf{b}^k = \lambda_k \mathbf{b}^k$ for $k = 1, 2, \ldots, N$). Since the set of $\mathbf{b}^k$'s is a basis, we can express $\mathbf{v}$ as a linear combination of these basis vectors,
$$\mathbf{v} = v_1\mathbf{b}^1 + v_2\mathbf{b}^2 + v_3\mathbf{b}^3 + \cdots + v_N\mathbf{b}^N = \sum_{k=1}^{N} v_k\mathbf{b}^k . \tag{52.1}$$
Then
$$A\mathbf{v} = \sum_{k=1}^{N} v_k A\mathbf{b}^k
= v_1\lambda_1\mathbf{b}^1 + v_2\lambda_2\mathbf{b}^2 + v_3\lambda_3\mathbf{b}^3 + \cdots + v_N\lambda_N\mathbf{b}^N .$$
If $N$ is large, this could be a lot faster than doing the basic component-by-component matrix multiplication. (And in our analog with functions, $N$ will be infinite.)

¹ The term Hermitian is also used.
Of course, to use this, we need the coefficients $v_1, v_2, \ldots, v_N$. To find $v_1$, say, take the inner product of $\mathbf{b}^1$ with $\mathbf{v}$:
$$\langle\, \mathbf{b}^1 \,|\, \mathbf{v} \,\rangle
= \langle\, \mathbf{b}^1 \,|\, v_1\mathbf{b}^1 + v_2\mathbf{b}^2 + v_3\mathbf{b}^3 + \cdots + v_N\mathbf{b}^N \,\rangle
= v_1\langle\, \mathbf{b}^1 \,|\, \mathbf{b}^1 \,\rangle + v_2\langle\, \mathbf{b}^1 \,|\, \mathbf{b}^2 \,\rangle + \cdots + v_N\langle\, \mathbf{b}^1 \,|\, \mathbf{b}^N \,\rangle .$$
By orthogonality,
$$\langle\, \mathbf{b}^1 \,|\, \mathbf{b}^k \,\rangle = 0 \qquad\text{if}\quad 1 \neq k .$$
On the other hand,
$$\langle\, \mathbf{b}^1 \,|\, \mathbf{b}^1 \,\rangle = \|\mathbf{b}^1\|^2 .$$
So
$$\langle\, \mathbf{b}^1 \,|\, \mathbf{v} \,\rangle = v_1\|\mathbf{b}^1\|^2 + v_2\cdot 0 + v_3\cdot 0 + \cdots + v_N\cdot 0 = v_1\|\mathbf{b}^1\|^2 ,$$
and thus,
$$v_1 = \frac{\langle\, \mathbf{b}^1 \,|\, \mathbf{v} \,\rangle}{\|\mathbf{b}^1\|^2} .$$
Clearly, the same computation works for each $v_k$, giving us:

Theorem 52.4
Let
$$\{ \mathbf{b}^1, \mathbf{b}^2, \mathbf{b}^3, \ldots, \mathbf{b}^N \}$$
be an orthogonal basis for $\mathbb{C}^N$. Then each $\mathbf{v}$ in $\mathbb{C}^N$ can be written as
$$\mathbf{v} = v_1\mathbf{b}^1 + v_2\mathbf{b}^2 + v_3\mathbf{b}^3 + \cdots + v_N\mathbf{b}^N = \sum_{k=1}^{N} v_k\mathbf{b}^k \tag{52.2a}$$
where
$$v_k = \frac{\langle\, \mathbf{b}^k \,|\, \mathbf{v} \,\rangle}{\|\mathbf{b}^k\|^2}
\qquad\text{for}\quad k = 1, 2, 3, \ldots, N . \tag{52.2b}$$
If you know a little about Fourier series, then you may recognize formula set (52.2) as a finite-dimensional analog of the Fourier series formulas. If you know nothing about Fourier series, then I'll tell you that a Fourier series for a function f is an infinite-dimensional analog of formula set (52.2). We will eventually see this.
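The expansion in theorem 52.4 can be checked numerically. Here is a small sketch (assuming NumPy; the particular basis vectors are made up for illustration) that computes the coefficients $v_k = \langle \mathbf{b}^k | \mathbf{v}\rangle / \|\mathbf{b}^k\|^2$ and reassembles $\mathbf{v}$:

```python
import numpy as np

def inner(v, w):
    # <v|w> = (v*)·w ; np.vdot conjugates its first argument
    return np.vdot(v, w)

# An orthogonal (not orthonormal) basis for C^2; check <b1|b2> = 0
b1 = np.array([1.0 + 1j, 2.0])
b2 = np.array([2.0, -1.0 + 1j])

v = np.array([3.0 - 2j, 1.0 + 4j])

# Coefficients from theorem 52.4
c1 = inner(b1, v) / inner(b1, b1)
c2 = inner(b2, v) / inner(b2, b2)

# Reassemble v from the basis expansion
v_rebuilt = c1 * b1 + c2 * b2
print(np.allclose(v, v_rebuilt))
```

The reassembled vector matches the original, as the theorem promises.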
In the above, we merely assumed that
$$\{ \mathbf{b}^1, \mathbf{b}^2, \mathbf{b}^3, \ldots, \mathbf{b}^N \}$$
is an orthogonal basis for $\mathbb{C}^N$. If it had been orthonormal, then we would also have had
$$\|\mathbf{b}^k\| = 1 \qquad\text{for}\quad k = 1, 2, 3, \ldots, N ,$$
and the formulas in our last theorem would have simplified somewhat. In fact, given any orthogonal set
$$\{ \mathbf{b}^1, \mathbf{b}^2, \mathbf{b}^3, \ldots, \mathbf{b}^N \} ,$$
we can construct a corresponding orthonormal set
$$\{ \mathbf{n}^1, \mathbf{n}^2, \mathbf{n}^3, \ldots, \mathbf{n}^N \}$$
by just letting
$$\mathbf{n}^k = \frac{\mathbf{b}^k}{\|\mathbf{b}^k\|} \qquad\text{for}\quad k = 1, 2, 3, \ldots, N .$$
With these observations and the definition of the adjoint, you can easily verify that
$$(AB)^* = B^* A^* \qquad\text{and}\qquad \big(A^*\big)^* = A .$$
Also observe that, if
$$\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{bmatrix}
\qquad\text{and}\qquad
\mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix} ,$$
then
$$\mathbf{v}^*\cdot\mathbf{w} = [v_1^*, v_2^*, \ldots, v_N^*]\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}
= v_1^* w_1 + v_2^* w_2 + \cdots + v_N^* w_N = \langle\, \mathbf{v} \,|\, \mathbf{w} \,\rangle .$$
Consequently, if $A$ is self adjoint (i.e., $A^* = A$), then
$$\langle\, \mathbf{v} \,|\, A\mathbf{u} \,\rangle
= \mathbf{v}^*\cdot(A\mathbf{u})
= (A^*\mathbf{v})^*\cdot\mathbf{u}
= (A\mathbf{v})^*\cdot\mathbf{u}
= \langle\, A\mathbf{v} \,|\, \mathbf{u} \,\rangle
\qquad\text{for every } \mathbf{v}, \mathbf{u} \text{ in } \mathbb{C}^N .$$
It's a simple exercise in linear algebra to show that the above completely characterizes adjointness and self adjointness for matrices. That is, you should be able to finish proving the next theorem. We'll use the results to extend these concepts to things other than matrices.

Theorem 52.5
Let $A$ and $B$ be $N\times N$ matrices. Then:
1. $B = A^*$ if and only if
$$\langle\, \mathbf{v} \,|\, B\mathbf{u} \,\rangle = \langle\, A\mathbf{v} \,|\, \mathbf{u} \,\rangle
\qquad\text{for every } \mathbf{v} \text{ and } \mathbf{u} \text{ in } \mathbb{C}^N .$$
2. $A$ is self adjoint if and only if
$$\langle\, \mathbf{v} \,|\, A\mathbf{u} \,\rangle = \langle\, A\mathbf{v} \,|\, \mathbf{u} \,\rangle
\qquad\text{for every } \mathbf{v} \text{ and } \mathbf{u} \text{ in } \mathbb{C}^N .$$
Comments
To be honest, we are not going to directly use the material we've developed over the past several pages. The reason we went over the theory of self-adjoint matrices and related material concerning vectors in $\mathbb{C}^N$ is that the Sturm-Liouville theory we'll be developing is a functional analog of what we just discussed, using differential operators and functions instead of matrices and vectors in $\mathbb{C}^N$. Understanding the theory and computations we've just developed should expedite learning the theory and computations we will be developing.
52.2 Boundary-Value Problems with a Parameter

In each boundary-value problem of interest to us, the differential equation will involve both the unknown function, which (following tradition) we will denote by $\phi$ or $\phi(x)$, and a parameter $\lambda$ (basically, $\lambda$ is just some yet-undetermined constant). We'll assume that each of these equations can be written as
$$A(x)\frac{d^2\phi}{dx^2} + B(x)\frac{d\phi}{dx} + [C(x) + \lambda]\phi = 0 \tag{52.3a}$$
or, equivalently, as
$$A(x)\frac{d^2\phi}{dx^2} + B(x)\frac{d\phi}{dx} + C(x)\phi = -\lambda\phi . \tag{52.3b}$$

!Example 52.2: Consider the boundary-value problem consisting of
$$\frac{d^2\phi}{dx^2} + \lambda\phi = 0 \tag{52.4a}$$
with the boundary conditions
$$\phi(0) = 0 \qquad\text{and}\qquad \phi(L) = 0 . \tag{52.4b}$$
The corresponding characteristic equation is
$$r^2 + \lambda = 0
\qquad\Longrightarrow\qquad
r = \pm\sqrt{-\lambda} .$$
In this example, the precise formula for $\phi(x)$, the equation's general solution corresponding to a particular value of $\lambda$, depends on whether $\lambda > 0$, $\lambda = 0$ or $\lambda < 0$. Let's go
$\lambda < 0$: Then $\sqrt{-\lambda} > 0$, and the general solution is
$$\phi(x) = c_1 e^{\sqrt{-\lambda}\,x} + c_2 e^{-\sqrt{-\lambda}\,x} .$$
The first boundary condition, $\phi(0) = 0$, requires $c_2 = -c_1$, and thus,
$$\phi(x) = c_1 e^{\sqrt{-\lambda}\,x} - c_1 e^{-\sqrt{-\lambda}\,x}
= c_1\!\left( e^{\sqrt{-\lambda}\,x} - e^{-\sqrt{-\lambda}\,x} \right) .$$
The second boundary condition, $\phi(L) = 0$, then forces $c_1 = 0$, since
$$e^{\sqrt{-\lambda}\,L} - e^{-\sqrt{-\lambda}\,L} > 0 .$$
So the only solution in this case is the trivial one, $\phi = 0$.

$\lambda = 0$: Here the differential equation reduces to $\phi'' = 0$, whose general solution is
$$\phi_0(x) = c_1 x + c_2 .$$
The condition $\phi(0) = 0$ gives $c_2 = 0$, and the condition $\phi(L) = 0$ then gives $c_1 L = 0$, which says that $c_1 = 0$ (since $L > 0$). Thus, the only solution to the differential equation that satisfies the boundary conditions when $\lambda = 0$ is
$$\phi_0(x) = 0\cdot x + 0 = 0 .$$
$\lambda > 0$: This time, it is convenient to let $\omega = \sqrt{\lambda}$, so that the solution to the characteristic equation becomes
$$r = \pm\sqrt{-\lambda} = \pm i\omega .$$
While the general formula for $\phi$ can be written in terms of complex exponentials, it is better to recall that these complex exponentials can be written in terms of sines and cosines, and that the general formula for $\phi$ can then be given as
$$\phi(x) = c_1\cos(\omega x) + c_2\sin(\omega x) .$$
The condition $\phi(0) = 0$ forces $c_1 = 0$, leaving $\phi(x) = c_2\sin(\omega x)$. The condition $\phi(L) = 0$ then requires $\sin(\omega L) = 0$, that is,
$$\omega L = \text{an integral multiple of } \pi .$$
So the nontrivial solutions arise exactly for
$$\omega = \omega_k = \frac{k\pi}{L} \qquad\text{with}\quad k = 1, 2, 3, \ldots .$$
This gives us a list of $\lambda$'s,
$$\lambda_k = (\omega_k)^2 = \left(\frac{k\pi}{L}\right)^2 \qquad\text{with}\quad k = 1, 2, 3, \ldots$$
(these are the eigenvalues), and a corresponding list of $\phi(x)$'s (the corresponding eigenfunctions),
$$\phi_k(x) = c_k\sin(\omega_k x) = c_k\sin\!\left(\frac{k\pi}{L}x\right) \qquad\text{with}\quad k = 1, 2, 3, \ldots ,$$
where the $c_k$'s are arbitrary constants. For our example, these are the only nontrivial eigenfunctions.
In summary, the eigen-pairs $(\lambda_k, \phi_k)$ for this boundary-value problem are given by
$$\lambda_k = \left(\frac{k\pi}{L}\right)^2
\qquad\text{and}\qquad
\phi_k(x) = c_k\sin(\omega_k x) = c_k\sin\!\left(\frac{k\pi}{L}x\right)
\qquad\text{for}\quad k = 1, 2, 3, \ldots .$$
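The eigenvalues $\lambda_k = (k\pi/L)^2$ can be sanity-checked numerically. A minimal sketch (assuming NumPy is available) discretizes $-\phi''$ on a grid with the boundary conditions $\phi(0) = \phi(L) = 0$ and computes the smallest eigenvalues of the resulting matrix:

```python
import numpy as np

# Central-difference discretization of phi'' + lambda*phi = 0,
# phi(0) = phi(L) = 0, on n interior grid points
L, n = 1.0, 500
h = L / (n + 1)

T = (np.diag(2.0 * np.ones(n))
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2

lam = np.sort(np.linalg.eigvalsh(T))[:4]   # four smallest eigenvalues
exact = np.array([(k * np.pi / L) ** 2 for k in (1, 2, 3, 4)])
print(lam)    # close to pi^2, 4 pi^2, 9 pi^2, 16 pi^2
```

The matrix is symmetric (a discrete self-adjoint operator), and its lowest eigenvalues agree with $(k\pi/L)^2$ to within the discretization error.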
52.3 The Sturm-Liouville Form
To solve these differential equations with parameters, you will probably want to use form (52.3a),
as we did in our example. However, to develop and use the theory that we will be developing
and using, we will want to rewrite each differential equation in its Sturm-Liouville form
$$\frac{d}{dx}\!\left[ p(x)\frac{d\phi}{dx} \right] + q(x)\phi = -\lambda w(x)\phi .$$
!Example 52.3: The equation
$$\frac{d}{dx}\!\left[ \sin(x)\frac{d\phi}{dx} \right] + \cos(x)\phi = -\lambda x^2\phi$$
is in Sturm-Liouville form, with
$$p(x) = \sin(x) , \qquad q(x) = \cos(x) \qquad\text{and}\qquad w(x) = x^2 .$$
On the other hand, the equation
$$\frac{d^2\phi}{dx^2} + \frac{2}{x}\frac{d\phi}{dx} + [\sin(x) + \lambda]\phi = 0$$
is not in Sturm-Liouville form.
Fortunately, just about any differential equation in form (52.3a) or (52.3b) can be converted to Sturm-Liouville form using a procedure similar to that used to solve first-order linear equations. To describe the procedure in general, let's assume we have at least gotten our equation to the form
$$A(x)\frac{d^2\phi}{dx^2} + B(x)\frac{d\phi}{dx} + C(x)\phi = -\lambda\phi .$$
For the second equation in the last example, this is
$$\frac{d^2\phi}{dx^2} + \frac{2}{x}\frac{d\phi}{dx} + \sin(x)\phi = -\lambda\phi .$$
1. Compute
$$p(x) = e^{\int \frac{B(x)}{A(x)}\,dx} ,$$
ignoring the arbitrary constant of integration. In our example,
$$\int \frac{B(x)}{A(x)}\,dx = \int \frac{2}{x}\,dx = 2\ln x + c .$$
So (ignoring $c$),
$$p(x) = e^{\int \frac{B}{A}\,dx} = e^{2\ln x} = e^{\ln x^2} = x^2 .$$
2. Multiply the differential equation through by $\frac{p}{A}$:
$$p\frac{d^2\phi}{dx^2} + p\frac{B}{A}\frac{d\phi}{dx} + p\frac{C}{A}\phi = -\lambda\frac{p}{A}\phi .$$
Since $\frac{dp}{dx} = \frac{B}{A}p$, the first two terms on the left are exactly the expansion of
$$\frac{d}{dx}\!\left[ p\frac{d\phi}{dx} \right]
= p\frac{d^2\phi}{dx^2} + \frac{dp}{dx}\frac{d\phi}{dx} ,$$
so the equation becomes
$$\frac{d}{dx}\!\left[ p\frac{d\phi}{dx} \right] + p\frac{C}{A}\phi = -\lambda\frac{p}{A}\phi ,$$
which is in Sturm-Liouville form with $q = p\frac{C}{A}$ and $w = \frac{p}{A}$.
In our example, multiplying through by $\frac{p}{A} = x^2$ gives
$$x^2\frac{d^2\phi}{dx^2} + 2x\frac{d\phi}{dx} + x^2\sin(x)\phi = -\lambda x^2\phi .$$
Now observe that
$$\frac{d}{dx}\!\left[ x^2\frac{d\phi}{dx} \right] = x^2\frac{d^2\phi}{dx^2} + 2x\frac{d\phi}{dx} .$$
Using this to simplify the first two terms of our last differential equation above, we get
$$\frac{d}{dx}\!\left[ x^2\frac{d\phi}{dx} \right] + x^2\sin(x)\phi = -\lambda x^2\phi .$$
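The two-step conversion just described can be sketched symbolically. This is a small illustration assuming SymPy is available (the variable names are my own, not the text's):

```python
import sympy as sp

x = sp.symbols('x', positive=True)

# Example equation: phi'' + (2/x) phi' + sin(x) phi = -lambda*phi
A = sp.Integer(1)
B = 2 / x
C = sp.sin(x)

# Step 1: p(x) = exp( integral of B/A ), ignoring the arbitrary constant
p = sp.exp(sp.integrate(B / A, x))

# Step 2: multiply through by p/A; then q = p*C/A and w = p/A
q = sp.simplify(p * C / A)
w = sp.simplify(p / A)

print(sp.simplify(p), q, w)   # p = x**2, q = x**2*sin(x), w = x**2
```

This reproduces the hand computation: $p = x^2$, $q = x^2\sin(x)$ and $w = x^2$.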
Our example equation is thus in Sturm-Liouville form, with
$$p(x) = x^2 , \qquad q(x) = x^2\sin(x) \qquad\text{and}\qquad w(x) = x^2 .$$

52.4 Sturm-Liouville Appropriate Boundary Conditions
The boundary conditions for a Sturm-Liouville problem will have to be homogeneous, as defined in the previous chapter. There is an additional condition that we'll derive in this section that will help allow us to view a certain operator as self adjoint. That operator is given by the left side of the differential equation of interest when that differential equation is written in Sturm-Liouville form,
$$\frac{d}{dx}\!\left[ p(x)\frac{d\phi}{dx} \right] + q(x)\phi = -\lambda w(x)\phi .$$
For the rest of this chapter, this operator will be denoted by $L$. That is, given any suitably differentiable function $\phi$,
$$L[\phi] = \frac{d}{dx}\!\left[ p(x)\frac{d\phi}{dx} \right] + q(x)\phi$$
where $p$ and $q$ are presumably known real-valued functions on some interval $(a, b)$. Note that our differential equation can be written as
$$L[\phi] = -\lambda w\phi .$$
Now suppose $f$ and $g$ are two suitably smooth and integrable functions on $(a, b)$, as are $p$ and $q$. Then, integrating by parts,
$$\int_a^b f\,L[g]\,dx
= \left[ p(x) f(x)\frac{dg}{dx} \right]_a^b
- \int_a^b p\,\frac{df}{dx}\frac{dg}{dx}\,dx
+ \int_a^b q f g\,dx . \tag{52.5}$$
Now suppose we have two functions $u$ and $v$ on $(a, b)$ (assumed suitably smooth and integrable, but, possibly, complex valued), and suppose we want to compare
$$\int_a^b u^*\,L[v]\,dx \qquad\text{and}\qquad \int_a^b \big(L[u]\big)^* v\,dx .$$
(Why? Because this will lead to our extending the notion of self adjointness as characterized in theorem 52.5 on page 528.)
If $p$ and $q$ are real-valued functions, then it is trivial to verify that
$$\big(L[u]\big)^* = L[u^*] .$$
Using this and the above preliminary Green's formula, we see that
$$\int_a^b u^*\,L[v]\,dx - \int_a^b \big(L[u]\big)^* v\,dx
= \int_a^b u^*\,L[v]\,dx - \int_a^b v\,L[u^*]\,dx$$
$$= \left\{ \left[ p\,u^*\frac{dv}{dx} \right]_a^b
- \int_a^b p\,\frac{du^*}{dx}\frac{dv}{dx}\,dx
+ \int_a^b q\,u^* v\,dx \right\}
- \left\{ \left[ p\,v\frac{du^*}{dx} \right]_a^b
- \int_a^b p\,\frac{dv}{dx}\frac{du^*}{dx}\,dx
+ \int_a^b q\,v u^*\,dx \right\} .$$
Nicely enough, most of the terms on the right cancel out, leaving us with:
$$\int_a^b u^*\,L[v]\,dx - \int_a^b \big(L[u]\big)^* v\,dx
= \left[ p\left( u^*\frac{dv}{dx} - v\frac{du^*}{dx} \right) \right]_a^b \tag{52.6}$$
Equation (52.6) is known as Green's formula. (Strictly speaking, the right side of the equation is the Green's formula for the left side.)
Now we can state what sort of boundary conditions are appropriate for our discussions. We will refer to a pair of homogeneous boundary conditions at $x = a$ and $x = b$ as being Sturm-Liouville appropriate if and only if
$$\left[ p\left( u^*\frac{dv}{dx} - v\frac{du^*}{dx} \right) \right]_a^b = 0 \tag{52.7}$$
for every pair of functions $u$ and $v$ satisfying those boundary conditions.
For example, consider the pair of boundary conditions
$$\phi(a) = 0 \qquad\text{and}\qquad \phi(b) = 0 .$$
If $u$ and $v$ both satisfy these conditions, then
$$u(a) = 0 \quad\text{and}\quad u(b) = 0 ,
\qquad\text{while}\qquad
v(a) = 0 \quad\text{and}\quad v(b) = 0 ,$$
so
$$\left[ p\left( u^*\frac{dv}{dx} - v\frac{du^*}{dx} \right) \right]_a^b
= p(b)\!\left( 0\cdot\frac{dv}{dx}\Big|_{x=b} - 0\cdot\frac{du^*}{dx}\Big|_{x=b} \right)
- p(a)\!\left( 0\cdot\frac{dv}{dx}\Big|_{x=a} - 0\cdot\frac{du^*}{dx}\Big|_{x=a} \right)
= 0 .$$
So
$$\phi(a) = 0 \qquad\text{and}\qquad \phi(b) = 0$$
are Sturm-Liouville appropriate boundary conditions, at least whenever $p(a)$ and $p(b)$ are finite.
Observe that, whenever $u$ and $v$ satisfy a pair of Sturm-Liouville appropriate boundary conditions, Green's formula reduces to
$$\int_a^b u^*\,L[v]\,dx - \int_a^b \big(L[u]\big)^* v\,dx = 0 ,$$
which, in turn, gives us the following lemma, which, in turn, suggests just what we will be using for inner products and self adjointness in the near future.
Lemma 52.8
Let $(a, b)$ be some interval, $p$ and $q$ any suitably smooth and integrable real-valued functions on $(a, b)$, and $L$ the operator given by
$$L[\phi] = \frac{d}{dx}\!\left[ p(x)\frac{d\phi}{dx} \right] + q(x)\phi .$$
Assume, further, that $u(x)$ and $v(x)$ satisfy Sturm-Liouville appropriate boundary conditions at $a$ and $b$. Then
$$\int_a^b u^*\,L[v]\,dx = \int_a^b \big(L[u]\big)^* v\,dx .$$
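Lemma 52.8 is easy to test numerically for real-valued functions. The sketch below (assuming SciPy; the particular $p$, $q$, $u$ and $v$ are made up for illustration, with $u$ and $v$ vanishing at both endpoints so the boundary conditions are Sturm-Liouville appropriate) expands $L[f] = p f'' + p' f' + q f$ and compares the two integrals:

```python
import numpy as np
from scipy.integrate import quad

# A sample operator L[f] = (p f')' + q f on (0, 1), p and q real valued
p, dp = (lambda x: 1 + x ** 2), (lambda x: 2 * x)
q = lambda x: np.cos(x)

# Two real functions with u(0) = u(1) = 0 and v(0) = v(1) = 0
u, du, d2u = (lambda x: x * (1 - x)), (lambda x: 1 - 2 * x), (lambda x: -2.0)
v = lambda x: np.sin(np.pi * x)
dv = lambda x: np.pi * np.cos(np.pi * x)
d2v = lambda x: -np.pi ** 2 * np.sin(np.pi * x)

def apply_L(f, df, d2f):
    # (p f')' + q f expanded as p f'' + p' f' + q f
    return lambda x: p(x) * d2f(x) + dp(x) * df(x) + q(x) * f(x)

lhs, _ = quad(lambda x: u(x) * apply_L(v, dv, d2v)(x), 0, 1)
rhs, _ = quad(lambda x: apply_L(u, du, d2u)(x) * v(x), 0, 1)
print(lhs, rhs)    # the two integrals agree
```

The boundary term in Green's formula vanishes here, so the two integrals agree to quadrature accuracy.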
Now suppose $(\lambda, \phi)$ is an eigen-pair for a Sturm-Liouville problem; that is, $L[\phi] = -\lambda w\phi$ and $\phi$ satisfies the boundary conditions. Multiplying the differential equation by $\phi^*$ and integrating gives
$$\int_a^b \phi^*\,L[\phi]\,dx
= \int_a^b \phi^*\big[-\lambda w\phi\big]\,dx
= -\lambda \int_a^b \big(\phi(x)\big)^* \phi(x)\,w(x)\,dx
= -\lambda \int_a^b |\phi|^2 w\,dx .$$
On the other hand, applying the preliminary Green's formula (52.5) with $f = \phi^*$ and $g = \phi$ gives
$$\int_a^b \phi^*\,L[\phi]\,dx
= \left[ p\,\phi^*\frac{d\phi}{dx} \right]_a^b
- \int_a^b p\left|\frac{d\phi}{dx}\right|^2 dx
+ \int_a^b q\,|\phi|^2\,dx .$$
Together, the above tells us that
$$-\lambda \int_a^b |\phi|^2 w\,dx
= \left[ p\,\phi^*\frac{d\phi}{dx} \right]_a^b
- \int_a^b \left( p\left|\frac{d\phi}{dx}\right|^2 - q\,|\phi|^2 \right) dx .$$
Cutting out the middle and solving for $\lambda$ gives us the next lemma.
Lemma 52.9
Let $(\lambda, \phi)$ be an eigen-pair for a Sturm-Liouville problem on $(a, b)$. Then
$$\lambda = \frac{ -\left[ p\,\phi^*\dfrac{d\phi}{dx} \right]_a^b
+ \displaystyle\int_a^b \left( p\left|\frac{d\phi}{dx}\right|^2 - q\,|\phi|^2 \right) dx }
{ \displaystyle\int_a^b |\phi|^2\,w\,dx } \tag{52.8}$$
Equation (52.8) is called the Rayleigh quotient. It relates the value of an eigenvalue to any corresponding eigenfunction. It is of particular interest in the Sturm-Liouville problems in which the boundary conditions ensure that
$$\left[ p\,\phi^*\frac{d\phi}{dx} \right]_a^b = 0 .$$
Then the Rayleigh quotient reduces to
$$\lambda = \frac{ \displaystyle\int_a^b \left( p\left|\frac{d\phi}{dx}\right|^2 - q\,|\phi|^2 \right) dx }
{ \displaystyle\int_a^b |\phi|^2\,w\,dx } .$$
Now look at the right side of this equation. Remember, $p$, $q$ and $w$ are all real-valued functions on $(a, b)$. It should then be clear that the integrals on the right are all real valued. Thus, $\lambda$ must be real valued.
If, in addition,
$$q(x) \leq 0 \qquad\text{for}\quad a < x < b$$
(which occurs fairly often), then, assuming, as is typical, that $p$ and $w$ are positive,
$$\int_a^b \left( p\left|\frac{d\phi}{dx}\right|^2 - q\,|\phi|^2 \right) dx \;\geq\; 0
\qquad\text{and}\qquad
\int_a^b |\phi|^2\,w\,dx \;>\; 0 ,$$
so the reduced Rayleigh quotient tells us that $\lambda \geq 0$.
Now if $q(x)$ is nonzero on some subinterval of $(a, b)$ (in addition to being nonpositive on $(a, b)$), then the integrals must be positive for any nontrivial function $\phi$, which means that $\lambda$ must always be positive. On the other hand, if $q = 0$ on $(a, b)$ (which is quite common), then the Rayleigh quotient further reduces to
$$\lambda = \frac{ \displaystyle\int_a^b p\left|\frac{d\phi}{dx}\right|^2 dx }
{ \displaystyle\int_a^b |\phi|^2\,w\,dx } .$$
With a little thought, you can then show that the only possible eigenfunctions corresponding to a zero eigenvalue, $\lambda = 0$, are nonzero constant functions, and, consequently, if nonzero constant functions do not satisfy the boundary conditions, then $\lambda$ cannot be zero; all the eigenvalues are positive.
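The reduced Rayleigh quotient can be checked against the example from section 52.2, where $p = w = 1$, $q = 0$, and $\phi_1(x) = \sin(\pi x/L)$ has eigenvalue $(\pi/L)^2$. A quick numerical sketch (assuming SciPy):

```python
import numpy as np
from scipy.integrate import quad

L = 2.0
phi  = lambda x: np.sin(np.pi * x / L)               # first eigenfunction
dphi = lambda x: (np.pi / L) * np.cos(np.pi * x / L)

# p = w = 1 and q = 0; phi(0) = phi(L) = 0, so the boundary term vanishes
num, _ = quad(lambda x: dphi(x) ** 2, 0, L)
den, _ = quad(lambda x: phi(x) ** 2, 0, L)
rayleigh = num / den

print(rayleigh)    # close to (pi/L)^2
```

As predicted, the quotient is positive and equals the first eigenvalue $\lambda_1 = (\pi/L)^2$.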
52.5 Sturm-Liouville Problems

Full Definition
Finally, we can fully define the Sturm-Liouville problem: A Sturm-Liouville problem is a
boundary-value problem consisting of both of the following:
1. A differential equation that can be written in the Sturm-Liouville form
$$\frac{d}{dx}\!\left[ p(x)\frac{d\phi}{dx} \right] + q(x)\phi = -\lambda w(x)\phi
\qquad\text{for}\quad a < x < b \tag{52.9}$$
where $p$, $q$ and $w$ are sufficiently smooth and integrable functions on the finite interval $(a, b)$, with $p$ and $q$ being real valued, and $w$ being positive on this interval. (The signs of $p$ and $w$ turn out to be relevant for many results. We will usually assume they are positive functions on $(a, b)$. This is what invariably happens in real applications.)
2. A pair of Sturm-Liouville appropriate homogeneous boundary conditions at $x = a$ and $x = b$.

A Sturm-Liouville problem is said to be regular if, in addition:
1. The functions $p$, $q$ and $w$ are all real valued and continuous on the closed interval $[a, b]$, with $p$ being differentiable on $(a, b)$, and both $p$ and $w$ being positive on the closed interval $[a, b]$.
2. Each boundary condition involves values of $\phi$ and $\phi'$ at only one of the two endpoints; that is, each is of the form $\alpha\phi(a) + \beta\phi'(a) = 0$ or $\alpha\phi(b) + \beta\phi'(b) = 0$ with $\alpha$ and $\beta$ real and not both zero.
Many, but not all, of the Sturm-Liouville problems generated in solving partial differential
equation problems are regular. For example, the problem considered in example 52.2 on page
529 is a regular Sturm-Liouville problem.
So What?
It turns out that the eigenfunctions from a Sturm-Liouville problem on an interval (a, b) can
often be used in much the same way as the eigenvectors from a self-adjoint matrix to form an
orthogonal basis for a large set of functions. Consider, for example, the eigenfunctions from example 52.2,
$$\phi_k(x) = c_k\sin\!\left(\frac{k\pi}{L}x\right) \qquad\text{for}\quad k = 1, 2, 3, \ldots .$$
If $f$ is any reasonable function on $(0, L)$ (say, any continuous function on this interval), then we will discover that there are constants $c_1, c_2, c_3, \ldots$ such that
$$f(x) = \sum_{k=1}^{\infty} c_k\sin\!\left(\frac{k\pi}{L}x\right) .$$
This infinite series, called the (Fourier) sine series for $f$ on $(0, L)$, turns out to be extremely useful in many applications. For one thing, it expresses any reasonable function on $(0, L)$ in terms of the well-understood sine functions.
Our next goal is to develop enough of the necessary theory to discover what I just said we will discover. In particular, we want to learn how to compute the $c_k$'s in the above expression for $f(x)$. It turns out to be remarkably similar to the formula for computing the $v_k$'s in theorem 52.4 on page 526. Take a look at it right now. Of course, before we can verify this claim, we will have to find out just what we are using for an inner product.
All this, alas, will take some time.
52.6 The Eigen-Spaces
Suppose we have some Sturm-Liouville problem. For convenience, let us write the differential equation in that problem as
$$L[\phi] = -\lambda w\phi$$
with, as usual,
$$L[\phi] = \frac{d}{dx}\!\left[ p(x)\frac{d\phi}{dx} \right] + q(x)\phi .$$
One observation we can immediately make is that $L$ is a linear operator. That is, if $\phi(x)$ and $\psi(x)$ are any two functions, and $c_1$ and $c_2$ are any two constants, then
$$L[c_1\phi + c_2\psi] = c_1 L[\phi] + c_2 L[\psi] .$$
Now, further suppose that both $\phi(x)$ and $\psi(x)$ are eigenfunctions corresponding to the same eigenvalue $\lambda$. Then,
$$L[c_1\phi + c_2\psi] = c_1 L[\phi] + c_2 L[\psi]
= c_1\big[-\lambda w\phi\big] + c_2\big[-\lambda w\psi\big]
= -\lambda w\big[ c_1\phi + c_2\psi \big] .$$
This shows that the linear combination $c_1\phi + c_2\psi$ also satisfies the differential equation with that particular value of $\lambda$. Does it also satisfy the boundary conditions? Of course. Remember, the set of boundary conditions in a Sturm-Liouville problem is homogeneous, meaning that, if $\phi(x)$ and $\psi(x)$ satisfy the given boundary conditions, so does any linear combination of them. Hence, any linear combination of eigenfunctions corresponding to a single eigenvalue is also an eigenfunction for our Sturm-Liouville problem, a fact significant enough to write as a lemma.
Lemma 52.10
Assume $(\lambda, \phi)$ and $(\lambda, \psi)$ are both eigen-pairs with the same eigenvalue $\lambda$ for some Sturm-Liouville problem. Then any nonzero linear combination of $\phi(x)$ and $\psi(x)$ is also an eigenfunction for the Sturm-Liouville problem corresponding to eigenvalue $\lambda$.

This lemma tells us that the set of all eigenfunctions for our Sturm-Liouville problem corresponding to any single eigenvalue $\lambda$ is a vector space of functions (after throwing in the zero function). Naturally, we call this the eigenspace corresponding to eigenvalue $\lambda$. Keep in mind that these functions are all solutions to the second-order homogeneous linear equation
$$\frac{d}{dx}\!\left[ p(x)\frac{d\phi}{dx} \right] + \big[ q(x) + \lambda w(x) \big]\phi = 0 ,$$
and that the general solution to such a differential equation can be written as
$$\phi(x) = c_1\phi_1(x) + c_2\phi_2(x)$$
where $c_1$ and $c_2$ are arbitrary constants and $\{\phi_1, \phi_2\}$ is any linearly independent pair of solutions to the differential equation. $\phi_1$ can be chosen as one of the eigenfunctions. Whether or not $\phi_2$ can be chosen to be an eigenfunction depends on whether or not there is a linearly independent pair of eigenfunctions corresponding to this $\lambda$. That gives us exactly two possibilities:
1. Every eigenfunction corresponding to this $\lambda$ is a constant multiple of $\phi_1$. In this case, we call $\lambda$ a simple eigenvalue.
2. There is a linearly independent pair of eigenfunctions $\{\phi_1, \phi_2\}$ corresponding to $\lambda$, in which case every nontrivial solution to the differential equation is an eigenfunction. In this case, we call $\lambda$ a double eigenvalue.
52.7
We must, briefly, forget about Sturm-Liouville problems, and develop the basic analog of the inner
product described for finite dimensional vectors at the beginning of this chapter. Throughout this
discussion, (a, b) is an interval, and w = w(x) is a function on (a, b) satisfying
w(x) > 0
This function, w , will be called the weight function for our inner product. Often, it is simply the
constant 1 function (i.e., w(x) = 1 for every x ).
Also, throughout this discussion, let us assume that our functions are all piecewise continuous
and sufficiently integrable for the computations being described. Well discuss what suitably
integrable means if I decide it is relevant. Otherwise, well slop over issues of integrability.
Given two functions $f$ and $g$, the inner product of $f$ with $g$ (over $(a, b)$, with respect to the weight function $w$) is
$$\langle\, f \,|\, g \,\rangle = \int_a^b f(x)^*\,g(x)\,w(x)\,dx .$$
For convenience, let's say that the standard inner product is simply the inner product with weight function $w = 1$,
$$\langle\, f \,|\, g \,\rangle = \int_a^b f(x)^*\,g(x)\,dx .$$
For now, you can consider the inner product to be simply shorthand for some integrals that will often arise in our work.
!Example 52.6: With $(a, b) = (0, 2)$ and weight function $w(x) = x^2$, the inner product is
$$\langle\, f \,|\, g \,\rangle = \int_0^2 f(x)^*\,g(x)\,x^2\,dx .$$
In particular,
$$\left\langle\, 6x + 2i \;\Big|\; \frac{1}{x} \,\right\rangle
= \int_0^2 (6x + 2i)^*\,\frac{1}{x}\,x^2\,dx
= \int_0^2 (6x - 2i)\,x\,dx
= \int_0^2 \big( 6x^2 - i\,2x \big)\,dx
= \Big[ 2x^3 - i x^2 \Big]_0^2 = 16 - 4i .$$
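The value $16 - 4i$ is easy to confirm by numerical quadrature. A minimal sketch assuming SciPy, integrating the real and imaginary parts separately:

```python
import numpy as np
from scipy.integrate import quad

w = lambda x: x ** 2          # weight function from example 52.6
f = lambda x: 6 * x + 2j      # first factor (gets conjugated)
g = lambda x: 1 / x           # second factor

integrand = lambda x: np.conj(f(x)) * g(x) * w(x)
re, _ = quad(lambda x: integrand(x).real, 0, 2)
im, _ = quad(lambda x: integrand(x).imag, 0, 2)

print(complex(re, im))    # close to (16-4j)
```

Note that the apparent singularity of $1/x$ at $0$ is cancelled by the weight $x^2$, so the integrand is perfectly tame.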
The inner product of functions just defined is, in many ways, analogous to the inner product defined for finite dimensional vectors at the beginning of this chapter. To see this, we'll verify the following properties, which are very similar to those in theorem 52.2 on page 524:
1. $\langle\, f \,|\, g \,\rangle = \langle\, g \,|\, f \,\rangle^*$ ,
2. $\langle\, h \,|\, f + g \,\rangle = \langle\, h \,|\, f \,\rangle + \langle\, h \,|\, g \,\rangle$ ,
3. $\langle\, f + g \,|\, h \,\rangle = \langle\, f \,|\, h \,\rangle + \langle\, g \,|\, h \,\rangle$ , and
4. $\langle\, f \,|\, f \,\rangle$ is real with $\langle\, f \,|\, f \,\rangle \geq 0$ .

PROOF: Each claim follows directly from the corresponding properties of integrals and complex conjugates; the details are left as an exercise.
Norms
Recall that the norm of any vector $\mathbf{v}$ in $\mathbb{C}^N$ is related to the inner product of $\mathbf{v}$ with itself by
$$\|\mathbf{v}\| = \sqrt{\langle\, \mathbf{v} \,|\, \mathbf{v} \,\rangle} .$$
In turn, for each inner product $\langle\,\cdot\,|\,\cdot\,\rangle$, we define the corresponding norm of a function $f$ by
$$\|f\| = \sqrt{\langle\, f \,|\, f \,\rangle} .$$
In general,
$$\|f\| = \sqrt{\langle\, f \,|\, f \,\rangle}
= \sqrt{ \int_a^b f(x)^* f(x)\,w(x)\,dx }
= \sqrt{ \int_a^b |f(x)|^2\,w(x)\,dx } .$$
(Roughly speaking, $\|f\|$ measures the "size" of $f$; a function is "small" when $\|f\|$ is small.)
!Example 52.7: With the inner product from example 52.6,
$$\|f\| = \sqrt{\langle\, f \,|\, f \,\rangle}
= \sqrt{ \int_0^2 f(x)^* f(x)\,x^2\,dx }
= \sqrt{ \int_0^2 |f(x)|^2 x^2\,dx } .$$
In particular,
$$\|5x + 6i\|^2 = \int_0^2 |5x + 6i|^2\,x^2\,dx
= \int_0^2 \big( 25x^2 + 36 \big)\,x^2\,dx
= \int_0^2 \big( 25x^4 + 36x^2 \big)\,dx
= \Big[ 5x^5 + 12x^3 \Big]_0^2 = 256 .$$
So
$$\|5x + 6i\| = \sqrt{256} = 16 .$$
Orthogonality
Recall that any two vectors $\mathbf{v}$ and $\mathbf{w}$ in $\mathbb{C}^N$ are orthogonal if and only if
$$\langle\, \mathbf{v} \,|\, \mathbf{w} \,\rangle = 0 .$$
Analogously, we say that any pair of functions $f$ and $g$ is orthogonal (over the interval, with respect to the inner product, or with respect to the weight function) if and only if
$$\langle\, f \,|\, g \,\rangle = 0 .$$
More generally, a set of functions $\{\phi_1, \phi_2, \phi_3, \ldots\}$ is orthogonal if and only if
$$\langle\, \phi_k \,|\, \phi_n \,\rangle = 0 \qquad\text{whenever}\qquad k \neq n .$$
If, in addition,
$$\|\phi_k\| = 1 \qquad\text{for each}\quad k ,$$
then we say the set is orthonormal. For our work, orthogonality will be important, but we won't spend time or effort making the sets orthonormal.
!Example 52.8: Consider the set
$$\left\{ \sin\!\left(\frac{k\pi}{L}x\right) : k = 1, 2, 3, \ldots \right\} ,$$
which is the set of sine functions (without the arbitrary constants) obtained as eigenfunctions in example 52.2 on page 529. The interval is $(0, L)$. For the weight function, we'll use $w(x) = 1$. Observe that, if $k$ and $n$ are two different positive integers, then, using the trigonometric identity
$$2\sin(A)\sin(B) = \cos(A - B) - \cos(A + B) ,$$
we have
$$\left\langle\, \sin\!\left(\frac{k\pi}{L}x\right) \,\Big|\, \sin\!\left(\frac{n\pi}{L}x\right) \,\right\rangle
= \int_0^L \sin\!\left(\frac{k\pi}{L}x\right)\sin\!\left(\frac{n\pi}{L}x\right) dx$$
$$= \frac{1}{2}\int_0^L \left[ \cos\!\left(\frac{(k-n)\pi}{L}x\right) - \cos\!\left(\frac{(k+n)\pi}{L}x\right) \right] dx
= \cdots = 0 .$$
So
$$\left\{ \sin\!\left(\frac{k\pi}{L}x\right) : k = 1, 2, 3, \ldots \right\}$$
is an orthogonal set of functions on $(0, L)$ with respect to the weight function $w(x) = 1$.
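The orthogonality in example 52.8 can also be checked by direct numerical integration. A small sketch (assuming SciPy) computes the full Gram matrix of inner products for the first four sine functions:

```python
import numpy as np
from scipy.integrate import quad

L = 1.0

# Gram matrix of <sin(k pi x/L) | sin(n pi x/L)>, k, n = 1..4, weight w = 1
G = np.empty((4, 4))
for k in range(1, 5):
    for n in range(1, 5):
        G[k - 1, n - 1], _ = quad(
            lambda x: np.sin(k * np.pi * x / L) * np.sin(n * np.pi * x / L),
            0, L)

print(np.round(G, 10))    # diagonal entries L/2, all others 0
```

The off-diagonal entries vanish (orthogonality), while each diagonal entry is $L/2$, a value that will reappear shortly in the coefficient formulas.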
Now suppose $\{\phi_1, \phi_2, \phi_3, \ldots\}$ is an orthogonal set of nonzero functions, and suppose $f$ is a function that can be expanded as
$$f = \sum_k c_k\phi_k$$
for some constants $c_1, c_2, c_3, \ldots$. To find each constant $c_k$, first observe what happens when we take the inner product of both sides of the above with one of the $\phi_k$'s, say, $\phi_3$. Using the linearity of the inner product and the orthogonality of our functions, we get
$$\langle\, \phi_3 \,|\, f \,\rangle
= \left\langle\, \phi_3 \,\Big|\, \sum_k c_k\phi_k \,\right\rangle
= \sum_k c_k \langle\, \phi_3 \,|\, \phi_k \,\rangle .$$
Since
$$\langle\, \phi_3 \,|\, \phi_k \,\rangle
= \begin{cases} \|\phi_3\|^2 & \text{if } k = 3 \\ 0 & \text{if } k \neq 3 \end{cases} ,$$
this reduces to
$$\langle\, \phi_3 \,|\, f \,\rangle = c_3\|\phi_3\|^2 .$$
So
$$c_3 = \frac{\langle\, \phi_3 \,|\, f \,\rangle}{\|\phi_3\|^2} .$$
Clearly, the same computation applies with any index, giving us
$$c_k = \frac{\langle\, \phi_k \,|\, f \,\rangle}{\|\phi_k\|^2} \qquad\text{for all}\quad k .$$
The $c_k$'s are called the corresponding generalized Fourier coefficients of $f$.² Don't forget:
$$\langle\, \phi_k \,|\, f \,\rangle = \int_a^b \phi_k(x)^*\,f(x)\,w(x)\,dx
\qquad\text{and}\qquad
\|\phi_k\|^2 = \int_a^b |\phi_k(x)|^2\,w(x)\,dx .$$
!Example 52.9: Again let $\{\phi_k\}$ be the set of functions
$$\phi_k(x) = \sin\!\left(\frac{k\pi}{L}x\right) \qquad\text{for}\quad k = 1, 2, 3, \ldots ,$$
which is an orthogonal set of functions on $(0, L)$ with respect to the weight function $w(x) = 1$. (Recall that this is a set of eigenfunctions for the Sturm-Liouville problem from example 52.2 on page 529.) Using this set, the generalized Fourier series for a function $f$,
$$\text{G.F.S.}[f]\big|_x = \sum_{k=1}^{\infty} c_k\phi_k(x)
\qquad\text{with}\qquad
c_k = \frac{\langle\, \phi_k \,|\, f \,\rangle}{\|\phi_k\|^2} ,$$
becomes
$$\text{G.F.S.}[f]\big|_x = \sum_{k=1}^{\infty} c_k\sin\!\left(\frac{k\pi}{L}x\right)
\qquad\text{with}\qquad
c_k = \frac{ \displaystyle\int_0^L \sin\!\left(\frac{k\pi}{L}x\right) f(x)\,dx }
{ \displaystyle\int_0^L \sin^2\!\left(\frac{k\pi}{L}x\right) dx } .$$
Since
$$\int_0^L \sin^2\!\left(\frac{k\pi}{L}x\right) dx = \cdots = \frac{L}{2} ,$$
the coefficient formula simplifies to
$$c_k = \frac{2}{L}\int_0^L f(x)\sin\!\left(\frac{k\pi}{L}x\right) dx .$$

² Compare the formula for the generalized Fourier coefficients with the formula in theorem 52.4 on page 526 for the components of a vector with respect to any orthogonal basis. They are virtually the same!
In particular, suppose $f(x) = x$ for $0 < x < L$. Then the above formula for $c_k$ yields
$$c_k = \frac{2}{L}\int_0^L x\sin\!\left(\frac{k\pi}{L}x\right) dx .$$
Integrating by parts,
$$c_k = \frac{2}{L}\left[ -\frac{Lx}{k\pi}\cos\!\left(\frac{k\pi}{L}x\right) \right]_0^L
+ \frac{2}{L}\cdot\frac{L}{k\pi}\int_0^L \cos\!\left(\frac{k\pi}{L}x\right) dx$$
$$= \frac{2}{L}\left[ 0 - \frac{L^2}{k\pi}\cos(k\pi) \right]
+ \frac{2}{k\pi}\left[ \frac{L}{k\pi}\sin\!\left(\frac{k\pi}{L}x\right) \right]_0^L
= -\frac{2L}{k\pi}(-1)^k + 0
= (-1)^{k+1}\frac{2L}{k\pi} .$$
So, using the given interval, weight function and orthogonal set, the generalized Fourier series for $f(x) = x$ is
$$\sum_{k=1}^{\infty} (-1)^{k+1}\frac{2L}{k\pi}\sin\!\left(\frac{k\pi}{L}x\right) .$$
(In fact, this is the classic Fourier sine series for $f(x) = x$ on $(0, L)$.)
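The convergence of this sine series can be observed numerically. A short sketch (assuming NumPy) sums many terms of the series for $f(x) = x$ at an interior point and compares against the function value:

```python
import numpy as np

L, N = 1.0, 2000
x0 = 0.3                          # any point strictly inside (0, L)

k = np.arange(1, N + 1)
ck = (-1.0) ** (k + 1) * 2 * L / (k * np.pi)   # coefficients derived above
partial = np.sum(ck * np.sin(k * np.pi * x0 / L))

print(partial)    # close to f(x0) = 0.3
```

Convergence is slow (the coefficients decay only like $1/k$), which is typical when $f$ does not satisfy the boundary condition at $x = L$.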
In general, given an orthogonal set of nonzero functions $\{\phi_1, \phi_2, \phi_3, \ldots\}$ on $(a, b)$ and a reasonable function $f$, the corresponding generalized Fourier series is
$$\text{G.F.S.}[f]\big|_x = \sum_{k=1}^{\infty} c_k\phi_k(x) .$$
In practice, to avoid using the entire series, we may simply wish to approximate $f$ using the $N^{\text{th}}$ partial sum,
$$f(x) \approx \sum_{k=1}^{N} c_k\phi_k(x) .$$
The orthogonal set $\{\phi_1, \phi_2, \phi_3, \ldots\}$ is said to be complete (for the functions of interest) if and only if the error in this approximation vanishes as $N\to\infty$, in the sense that
$$\lim_{N\to\infty} \int_a^b \left| f(x) - \sum_{k=1}^{N} c_k\phi_k(x) \right|^2 w(x)\,dx = 0 .$$
Equivalently,
$$\int_a^b \left| f(x) - \sum_{k=1}^{\infty} c_k\phi_k(x) \right|^2 w(x)\,dx = 0 .$$
In practice, all of this usually means that the infinite series $\sum_k c_k\phi_k(x)$ converges to $f(x)$ at every $x$ in $(a, b)$ at which $f$ is continuous. In any case, if the set $\{\phi_1, \phi_2, \phi_3, \ldots\}$ is complete, then we can view the corresponding generalized Fourier series for a function $f$ as being the same as that function, and can write
$$f(x) = \sum_k c_k\phi_k(x) \qquad\text{for}\quad a < x < b
\qquad\text{where}\qquad
c_k = \frac{\langle\, \phi_k \,|\, f \,\rangle}{\|\phi_k\|^2} .$$
In other words, a complete orthogonal set of (nonzero) functions can be viewed as a basis for the vector space of all functions of interest.
52.8 The Self Adjointness of Sturm-Liouville Problems

Now suppose we have a Sturm-Liouville problem consisting of a differential equation in Sturm-Liouville form,
$$\frac{d}{dx}\!\left[ p(x)\frac{d\phi}{dx} \right] + q(x)\phi = -\lambda w(x)\phi
\qquad\text{for}\quad a < x < b ,$$
and Sturm-Liouville appropriate boundary conditions. As usual, we'll let $L$ be the operator given by
$$L[\phi] = \frac{d}{dx}\!\left[ p(x)\frac{d\phi}{dx} \right] + q(x)\phi .$$
From lemma 52.8 we know that
$$\int_a^b u^*\,L[v]\,dx = \int_a^b \big(L[u]\big)^* v\,dx$$
whenever $u$ and $v$ satisfy the boundary conditions, which looks very similar to the equation characterizing self-adjointness for a matrix $A$ in theorem 52.5 on page 528. Because of this we often say either that Sturm-Liouville problems are self adjoint, or that the operator $L$ is self adjoint.³ The terminology is important for communication, but what is even more important is that functional analogs to the results obtained for self-adjoint matrices also hold here.
To derive two important results, let $(\lambda_1, \phi_1)$ and $(\lambda_2, \phi_2)$ be two solutions to our Sturm-Liouville problem (i.e., $\lambda_1$ and $\lambda_2$ are two eigenvalues, and $\phi_1$ and $\phi_2$ are corresponding eigenfunctions). From a corollary to Green's formula (lemma 52.8, noted just above), we know
$$\int_a^b \phi_1^*\,L[\phi_2]\,dx = \int_a^b \big(L[\phi_1]\big)^*\phi_2\,dx .$$
Now,
$$L[\phi_2] = -\lambda_2 w\phi_2
\qquad\text{and}\qquad
L[\phi_1] = -\lambda_1 w\phi_1 ,$$
and thus,
$$\big(L[\phi_1]\big)^* = \big(-\lambda_1 w\phi_1\big)^* = -\lambda_1^*\,w\,\phi_1^*$$
(remember $w$ is a positive function). So,
$$\int_a^b \phi_1^*\,L[\phi_2]\,dx = \int_a^b \big(L[\phi_1]\big)^*\phi_2\,dx$$
becomes
$$\int_a^b \phi_1^*\big(-\lambda_2 w\phi_2\big)\,dx = \int_a^b \big(-\lambda_1^* w\phi_1^*\big)\phi_2\,dx ,$$
that is,
$$\lambda_2 \int_a^b \phi_1^*\,\phi_2\,w\,dx = \lambda_1^* \int_a^b \phi_1^*\,\phi_2\,w\,dx .$$
Since the integrals on both sides of the last equation are the same, we must have either
$$\lambda_2 = \lambda_1^*
\qquad\text{or}\qquad
\int_a^b \phi_1^*\,\phi_2\,w\,dx = 0 . \tag{52.10}$$
Now, we did not necessarily assume the solutions were different. If they are the same,
$$(\lambda_1, \phi_1) = (\lambda_2, \phi_2) = (\lambda, \phi)$$
and the above reduces to
$$\lambda = \lambda^*
\qquad\text{or}\qquad
\int_a^b |\phi|^2\,w\,dx = 0 .$$
Since $\phi$ is nontrivial and $w$ is positive, the integral cannot be zero. So we must have
$$\lambda = \lambda^* ;$$
that is, every eigenvalue is real. Knowing this, condition (52.10) then tells us that, whenever $\lambda_1 \neq \lambda_2$,
$$\langle\, \phi_1 \,|\, \phi_2 \,\rangle = \int_a^b \phi_1^*\,\phi_2\,w\,dx = 0 .$$
Thus, eigenfunctions corresponding to different eigenvalues are orthogonal with respect to the weight function $w(x)$ appearing in the Sturm-Liouville problem. Accordingly, we may refer to $w(x)$ and the corresponding inner product on $(a, b)$ as the natural weight function and the natural inner product corresponding to the given Sturm-Liouville problem.

³ The correct terminology is that $L$ is a self-adjoint operator on the vector space of twice-differentiable functions satisfying the given set of Sturm-Liouville appropriate boundary conditions.
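This weighted orthogonality can be seen in a discrete analog. The following sketch (assuming SciPy; the particular $p$ and $w$ are made up for illustration) discretizes a Sturm-Liouville equation as a generalized symmetric eigenproblem and checks that the computed eigenvectors are orthonormal in the weighted inner product $\mathbf{u}^T W \mathbf{v}$:

```python
import numpy as np
from scipy.linalg import eigh

# Discretize (p phi')' = -lambda w phi on (0, 1), phi(0) = phi(1) = 0,
# with the made-up choices p(x) = 1 + x and w(x) = 1 + x**2.
n = 200
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)                    # interior grid points
pm = 1 + np.linspace(h / 2, 1 - h / 2, n + 1)   # p at the grid midpoints

K = np.zeros((n, n))
for i in range(n):
    K[i, i] = -(pm[i] + pm[i + 1]) / h ** 2
    if i < n - 1:
        K[i, i + 1] = pm[i + 1] / h ** 2
        K[i + 1, i] = pm[i + 1] / h ** 2
W = np.diag(1 + x ** 2)                         # weight matrix

lam, V = eigh(-K, W)                            # -K phi = lambda W phi

# eigh returns eigenvectors orthonormal in the weighted inner product
Gram = V.T @ W @ V
print(np.allclose(Gram, np.eye(n), atol=1e-8), np.all(lam > 0))
```

The Gram matrix is the identity: the discrete eigenfunctions are orthogonal with respect to the weight $w$, exactly as the continuous theory predicts (and, since $q = 0$ here with Dirichlet conditions, all eigenvalues are positive, matching the Rayleigh quotient discussion).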
The Eigenfunctions
Let us assume that
$$E = \{ \lambda_0, \lambda_1, \lambda_2, \lambda_3, \ldots \}$$
is the set of all distinct eigenvalues for our Sturm-Liouville problem (indexed so that $\lambda_0$ is the smallest and $\lambda_k < \lambda_{k+1}$ in general). Remember, each eigenvalue will be either a simple or a double eigenvalue. Next, choose a set of eigenfunctions
$$B = \{ \phi_0, \phi_1, \phi_2, \phi_3, \ldots \}$$
as follows:
1. For each simple eigenvalue, choose exactly one corresponding (real-valued) eigenfunction for $B$.
2. For each double eigenvalue, choose exactly one orthogonal pair of corresponding (real-valued) eigenfunctions for $B$.
Remember, this set of functions will be orthogonal with respect to the weight function $w$ from the differential equation in the Sturm-Liouville problem. (Note: Each $\phi_k$ is an eigenfunction corresponding to eigenvalue $\lambda_k$ only if all the eigenvalues are simple.)
Now let $f$ be a function on $(a, b)$. Since $B$ is an orthogonal set, we can construct the corresponding generalized Fourier series for $f$,
$$\text{G.F.S.}[f]\big|_x = \sum_{k=0}^{\infty} c_k\phi_k(x)
\qquad\text{with}\qquad
c_k = \frac{\langle\, \phi_k \,|\, f \,\rangle}{\|\phi_k\|^2}
= \frac{ \displaystyle\int_a^b \phi_k(x)^*\,f(x)\,w(x)\,dx }
{ \displaystyle\int_a^b |\phi_k(x)|^2\,w(x)\,dx } .$$
The obvious question to now ask is "Is this orthogonal set of eigenfunctions complete?" That is, can we assume
$$f = \sum_{k=0}^{\infty} c_k\phi_k \qquad\text{on}\quad (a, b) \ ?$$
The answer is yes, at least for the Sturm-Liouville problems normally encountered in practice.⁴ But you will have to trust me on this. There are some issues regarding the convergence of $\sum_{k=0}^{\infty} c_k\phi_k(x)$ when $x$ is a point at which $f$ is discontinuous, or when $x$ is an endpoint of $(a, b)$ and $f$ does not satisfy the same boundary conditions as in the Sturm-Liouville problem, but we will gloss over those issues for now.
Finally (assuming reasonable assumptions concerning the functions in the differential
equation), it can be shown that the graphs of the eigenfunctions corresponding to higher values
of the eigenvalues wiggle more than those corresponding to the lower-valued eigenvalues. To
4 One approach to proving this is to consider the problem of minimizing the Rayleigh quotient over the vector space
U of functions orthogonal to the vector space of all the eigenfunctions. You can then (I think) show that, if the
orthogonal set of eigenfunctions in not orthogonal, then U is not empty, and this minimization problem has a
solution (, ) and that this solution must also satisfy the Sturm-Liouville problem. Hence is an eigenfunction
in U , contrary to the definition of U . So U must be empty and the given set of eigenfunctions must be complete.
be precise, eigenfunctions corresponding to higher values of the eigenvalues must cross the X
axis (i.e., be zero) more often than do those corresponding to the lower-valued eigenvalues. To
see this (sort of), suppose φ₀ is never zero on (a, b) (so it hardly wiggles; this is typically the
case with φ₀ ). So φ₀ is either always positive or always negative on (a, b) . Since −φ₀ will
also be an eigenfunction, we can assume we've chosen φ₀ to always be positive on the interval.
Now let φ_k be an eigenfunction corresponding to another eigenvalue. If it, too, is never zero on
(a, b) , then, as with φ₀ , we can assume we've chosen φ_k to always be positive on (a, b) . But
then,

    ⟨φ₀ | φ_k⟩  =  ∫_a^b φ₀(x) φ_k(x) w(x) dx  >  0 ,

which is impossible since φ₀ and φ_k must be orthogonal. So φ_k must be zero somewhere in (a, b) .
52.9 A Mega-Theorem
We have just gone through a discussion of the general facts that can (often) be derived
regarding the solutions to Sturm-Liouville problems. Precisely what can be proven depends
somewhat on the problem. Here is one standard theorem that can be found (usually unproven)
in many texts on partial differential equations and mathematical physics. It concerns the regular
Sturm-Liouville problems (see page 52–20).
Assume we have a regular Sturm-Liouville problem consisting of a differential equation

    d/dx[ p dφ/dx ] + qφ  =  −λwφ

and homogeneous regular boundary conditions at the endpoints of the finite interval (a, b) .
Then, all of the following hold:

1.  All of the eigenvalues are real.

2.  There are infinitely many eigenvalues, forming an increasing sequence λ₀ < λ₁ < λ₂ < ···
    with λ_k → ∞ as k → ∞ , provided the coefficient functions are suitably continuous on
    (a, b) , with both p and w being positive on the closed interval [a, b] .

3.  Each eigenvalue is simple; that is, any two eigenfunctions corresponding to the same
    eigenvalue are constant multiples of each other.

4.  If, for each eigenvalue λ_k , we choose a corresponding eigenfunction φ_k , then the set

        B = { φ₀ , φ₁ , φ₂ , φ₃ , . . . }

    is a complete orthogonal set on (a, b) , and, for each suitably integrable f on (a, b) ,
    the generalized Fourier series

        Σ_{k=0}^∞ c_k φ_k(x)    with    c_k = ⟨φ_k | f⟩ / ‖φ_k‖²

    converges at each x in (a, b) to

    (a)  f(x) if f is continuous at x .

    (b)  (1/2)[ f(x⁺) + f(x⁻) ] if f is discontinuous at x .

5.  The eigenfunctions can all be chosen to be real valued.

6.  Each eigenfunction φ_k has exactly k zeroes in the open interval (a, b) .
Similar mega-theorems can be proven for other Sturm-Liouville problems. The main difference
occurs when we have periodic boundary conditions. Then most of the eigenvalues are
double eigenvalues, and our complete set of eigenfunctions looks like

    { . . . , φ_k , ψ_k , . . . }

where {φ_k , ψ_k} is an orthogonal pair of eigenfunctions corresponding to eigenvalue λ_k . (Typically,
though, the smallest eigenvalue is still simple.)
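The convergence claims in item 4 of the theorem can be illustrated numerically. The sketch below is my own illustration, not part of the text; it uses the cosine orthogonal set on (0, L) with w = 1 and a step function, computing the coefficients c₀ = (1/L)∫f and c_k = (2/L)∫f cos(kπx/L) dx by the midpoint rule, and shows the partial sums landing on the average value at the jump:

```python
import math

L = 1.0  # interval (0, L); the step function is f = 1 on (0, L/2) and 0 on (L/2, L)

def coeff(k, n=4000):
    # c_0 = (1/L) * integral of f, c_k = (2/L) * integral of f cos(k pi x / L);
    # since f vanishes on (L/2, L), we only integrate over (0, L/2)
    h = (L / 2) / n
    s = sum(math.cos(k * math.pi * (i + 0.5) * h / L) for i in range(n)) * h
    return s / L if k == 0 else 2 * s / L

def partial_sum(x, N):
    return coeff(0) + sum(coeff(k) * math.cos(k * math.pi * x / L)
                          for k in range(1, N + 1))

at_jump = partial_sum(L / 2, 40)      # at the discontinuity: expect (1 + 0)/2 = 1/2
inside = partial_sum(0.25 * L, 400)   # at a point of continuity: expect f = 1
```

The value at the jump is (essentially exactly) 1/2, the average of the one-sided limits, while at interior points of continuity the partial sums creep toward the actual value of f, just as the theorem asserts.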
Additional Exercises
52.1. Let λ₁ = 3 , λ₂ = −2 ,

          b₁ = [2, 3]ᵀ   and   b₂ = [−3, 2]ᵀ ,

      and let A be a matrix having each (λ_k , b_k) as an eigenpair.

      c. Express each of the following vectors in terms of b₁ and b₂ using the formulas in
         theorem 52.4 on page 52–6.

         i. u = [4, 6]ᵀ      ii. v = [12, 5]ᵀ      iii. w = [0, 1]ᵀ

      d. Compute the following using the vectors from the previous part. Leave your answers
         in terms of b₁ and b₂ .

         i. Au      ii. Av      iii. Aw
52.2. Let λ₁ = 2 , λ₂ = 4 , λ₃ = 6 ,

          b₁ = [1, 2, 1]ᵀ ,   b₂ = [2, −1, 0]ᵀ   and   b₃ = [1, 2, −5]ᵀ ,

      and let A be a matrix having each (λ_k , b_k) as an eigenpair.

      c. Express each of the following vectors in terms of b₁ , b₂ and b₃ using the formulas
         in theorem 52.4 on page 52–6.

         i. u = [4, −7, 22]ᵀ      ii. v = [1, 2, 3]ᵀ      iii. w = [0, 1, 0]ᵀ

      d. Compute the following using the vectors from the previous part. Leave your answers
         in terms of b₁ , b₂ and b₃ .

         i. Au      ii. Av      iii. Aw
52.3. Let λ₁ = −1 , λ₂ = 0 , λ₃ = 1 ,

          b₁ = [1, 1, 1]ᵀ ,   b₂ = [1, −2, 1]ᵀ   and   b₃ = [1, 0, −1]ᵀ ,

      and let A be a matrix having each (λ_k , b_k) as an eigenpair.

      c. Express each of the following vectors in terms of b₁ , b₂ and b₃ using the formulas
         in theorem 52.4 on page 52–6.

         i. u = [3, 8, −5]ᵀ      ii. v = [6, −3, 0]ᵀ      iii. w = [0, 0, 1]ᵀ

      d. Compute the following using the vectors from the previous part. Leave your answers
         in terms of b₁ , b₂ and b₃ .

         i. Au      ii. Av      iii. Aw
      Show that if ⟨A v | u⟩ = ⟨B v | u⟩ for every v and u in ℝᴺ , then B = A .
52.6. Find the general solutions for each of the following corresponding to each real value λ .
      Be sure to state the values of λ for which each general solution is valid. (Note: Some
      of these are Euler equations; see chapter 18.)

      a. φ″ + 4φ = −λφ

      b. φ″ + 2φ′ = −λφ

      c. x²φ″ + xφ′ = −λφ   for 0 < x

      d. x²φ″ + 3xφ′ = −λφ   for 0 < x

      e. φ″ − 2xφ′ = −λφ   (Hermite's equation)

      f. (1 − x²)φ″ − xφ′ = −λφ   for −1 < x < 1   (Chebyshev's equation)

      g. xφ″ + (1 − x)φ′ = −λφ   for 0 < x   (Laguerre's equation)
(where L is some finite positive length) and each of the following sets of boundary
conditions:
      b. x²φ″ + xφ′ = −λφ

      c. x²φ″ + 3xφ′ = −λφ
52.10. (Preliminary Green's formula) Assume (a, b) is some finite interval, and let L be the
       operator given by

           L[φ]  =  d/dx[ p(x) dφ/dx ]  +  q(x)φ ,

       where p and q are any suitably smooth and integrable functions on (a, b) . Using
       integration by parts, show that

           ∫_a^b f L[g] dx  =  [ p(x) f(x) dg/dx ]_a^b  −  ∫_a^b p (df/dx)(dg/dx) dx  +  ∫_a^b q f g dx .
           L[φ]  =  d/dx[ p(x) dφ/dx ]  +  q(x)φ ,

       with

           φ(b) = 0   and   φ(a) = φ(b) .
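Before proving exercise 52.10's identity, it can be spot-checked numerically. In this sketch (mine, not the text's; the particular p, q, f and g are arbitrary smooth choices on (0, 1), with derivatives written out by hand) both sides are evaluated with the midpoint rule:

```python
import math

a, b = 0.0, 1.0

# arbitrary smooth test functions and coefficients, with hand-computed derivatives
p   = lambda x: 1.0 + x * x
dp  = lambda x: 2.0 * x
q   = math.sin
f   = math.cos
df  = lambda x: -math.sin(x)
g   = lambda x: x ** 3
dg  = lambda x: 3.0 * x * x
ddg = lambda x: 6.0 * x

def integral(F, n=4000):
    # midpoint-rule integral of F over (a, b)
    h = (b - a) / n
    return sum(F(a + (i + 0.5) * h) for i in range(n)) * h

# L[g] = (p g')' + q g = p' g' + p g'' + q g
Lg = lambda x: dp(x) * dg(x) + p(x) * ddg(x) + q(x) * g(x)

lhs = integral(lambda x: f(x) * Lg(x))
rhs = (p(b) * f(b) * dg(b) - p(a) * f(a) * dg(a)
       - integral(lambda x: p(x) * df(x) * dg(x))
       + integral(lambda x: q(x) * f(x) * g(x)))
```

The two sides agree to quadrature accuracy, which is exactly what the integration-by-parts argument predicts for any smooth choices of p, q, f and g.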
52.12. Compute the following, assuming the interval is (0, 3) and the weight function is
       w(x) = 1 .

       a. ⟨ x | sin(2πx) ⟩        b. ⟨ x² | 9 + i8x ⟩

       c. ⟨ 9 + i8x | x² ⟩        d. ⟨ e^{i2πx} | x ⟩

       e. ‖x‖                     f. ‖9 + i8x‖

       g. ‖sin(2πx)‖              h. ‖e^{i2πx}‖
52.13. Compute the following, assuming the interval is (0, 1) and the weight function is
       w(x) = x .

       a. ⟨ x | sin(2πx) ⟩        b. ⟨ x² | 9 + i8x ⟩

       c. ⟨ 9 + i8x | x² ⟩        d. ⟨ e^{i2πx} | x ⟩

       e. ‖x‖                     f. ‖4 + i8x‖

       g. ‖sin(2πx)‖              h. ‖e^{i2πx}‖
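A couple of the 52.13 computations can be confirmed numerically. This sketch (mine, not the text's) uses the stated interval (0, 1) and weight w(x) = x, with the inner product convention ⟨f | g⟩ = ∫ f̄(x) g(x) w(x) dx that the chapter's formulas suggest:

```python
import math

def winner(f, g, n=20000):
    # <f | g> = integral over (0, 1) of conj(f(x)) g(x) x dx, by the midpoint rule;
    # complex values are allowed
    h = 1.0 / n
    s = 0j
    for i in range(n):
        x = (i + 0.5) * h
        s += complex(f(x)).conjugate() * g(x) * x
    return s * h

norm_x = abs(winner(lambda x: x, lambda x: x)) ** 0.5   # ||x|| for part e
ip_b = winner(lambda x: x * x, lambda x: 9 + 8j * x)    # <x^2 | 9 + i8x> for part b
```

Working the integrals by hand gives ‖x‖ = (∫₀¹ x·x·x dx)^{1/2} = 1/2 and ⟨x² | 9 + i8x⟩ = 9/4 + (8/5)i, and the numerical values match.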
52.15. Let L be a positive value. Verify that each of the following sets of functions is orthogonal
       on (0, L) with respect to the weight function w(x) = 1 .

       a. { cos(kπx/L) : k = 1, 2, 3, . . . }

       b. { e^{i2πkx/L} : k = 0, ±1, ±2, ±3, . . . }
52.16. Consider

           { φ₁ , φ₂ , φ₃ , . . . }  =  { cos(πx/L) , cos(2πx/L) , cos(3πx/L) , . . . } ,

       which, in the exercise above, you showed is an orthogonal set of functions on (0, L)
       with respect to the weight function w(x) = 1 . Using this interval, weight function and
       orthogonal set, do the following:

       a. Show that, in this case, the generalized Fourier series

              G.F.S.[ f ]  =  Σ_{k=1}^∞ c_k φ_k(x)    with    c_k = ⟨φ_k | f⟩ / ‖φ_k‖²

          is given by

              G.F.S.[ f ]  =  Σ_{k=1}^∞ c_k cos(kπx/L)    with    c_k = (2/L) ∫₀^L f(x) cos(kπx/L) dx .

       b. Find the generalized Fourier series for each of the following:

          i.   f(x) = x

          ii.  f(x) = 1 if 0 < x < L/2 , and f(x) = 0 if L/2 < x < L

          iii. f(x) = x if 0 < x < L/2 , and f(x) = L − x if L/2 < x < L

       c. Show that this orthogonal set is not complete by finding a nonzero f on (0, L) for
          which

              f(x)  ≠  Σ_{k=1}^∞ c_k cos(kπx/L)    with    c_k = (2/L) ∫₀^L f(x) cos(kπx/L) dx .
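The coefficient formula in exercise 52.16 can be cross-checked against direct integration. A sketch (mine, not the text's; L = 3 is an arbitrary choice, and the closed form below comes from integrating by parts twice, so treat it as the claim being verified):

```python
import math

L = 3.0

def ck_numeric(k, n=20000):
    # c_k = (2/L) * integral over (0, L) of x cos(k pi x / L) dx, midpoint rule,
    # i.e. the generalized Fourier coefficient of f(x) = x for the cosine set
    h = L / n
    return (2 / L) * sum((i + 0.5) * h * math.cos(k * math.pi * (i + 0.5) * h / L)
                         for i in range(n)) * h

def ck_closed(k):
    # candidate closed form from two integrations by parts:
    # c_k = 2 L ((-1)^k - 1) / (k pi)^2
    return 2 * L * ((-1) ** k - 1) / (k * math.pi) ** 2

max_diff = max(abs(ck_numeric(k) - ck_closed(k)) for k in (1, 2, 3, 4))
```

The numerical and closed-form coefficients agree to quadrature accuracy; note that the even-indexed coefficients vanish, so only odd cosine modes appear in the series for f(x) = x.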
52.17. In several exercises above, you considered either the Sturm-Liouville problem

           φ″ + λφ = 0

       or, at least, the above differential equation. You may use the results from those exercises
       to help answer the ones below:

       a. What is the natural weight function w(x) and inner product ⟨ f | g ⟩ corresponding
          to this Sturm-Liouville problem?

       b. We know that

              { λ₀ , λ₁ , λ₂ , . . . , λ_k , . . . }  =  { 0 , (π/L)² , (2π/L)² , . . . , (kπ/L)² , . . . }

          is the complete set of eigenvalues for this Sturm-Liouville problem. Now write out a
          corresponding orthogonal set of eigenfunctions (without arbitrary constants),

              { φ₀ , φ₁ , φ₂ , . . . , φ_k , . . . } .

       c. Compute the norm of each of your φ_k's from your answer to the last part.

       d. To what does each coefficient in the generalized Fourier series

              G.F.S.[ f ]  =  Σ_{k=0}^∞ c_k φ_k(x)    with    c_k = ⟨φ_k | f⟩ / ‖φ_k‖²

          reduce, using the eigenfunctions and inner products from previous parts of this
          exercise?

       e. Using the results from the last part, find the generalized Fourier series for the following:

          i.   f(x) = 1

          ii.  f(x) = x

          iii. f(x) = 1 if 0 < x < L/2 , and f(x) = 0 if L/2 < x < L
52.18. In several exercises above, you considered either the Sturm-Liouville problem

           x²φ″ + xφ′ = −λφ

       or, at least, the above differential equation. You may use the results from those exercises
       to answer the ones below:

       a. What is the natural weight function w(x) and inner product ⟨ f | g ⟩ corresponding
          to this Sturm-Liouville problem?

       b. We know that

              { λ₁ , λ₂ , λ₃ , . . . , λ_k , . . . }  =  { 1 , 4 , 9 , . . . , k² , . . . }

          is the complete set of eigenvalues for this Sturm-Liouville problem. Now write out a
          corresponding orthogonal set of eigenfunctions (without arbitrary constants),

              { φ₁ , φ₂ , φ₃ , . . . , φ_k , . . . } .

       c. Compute the norm of each of your φ_k's from your answer to the last part. (A simple
          substitution may help.)

       d. To what does each coefficient in the generalized Fourier series

              G.F.S.[ f ]  =  Σ_k c_k φ_k(x)    with    c_k = ⟨φ_k | f⟩ / ‖φ_k‖²

          reduce, using the eigenfunctions and inner products from previous parts of this
          exercise?

       e. Using the results from the last part, find the generalized Fourier series for the following:

          i.   f(x) = 1

          iii. f(x) = x
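The orthogonality underlying exercise 52.18 can be verified numerically. This sketch is mine, not the text's; it takes the eigenfunctions to be sin(k ln x) on (1, e^π) with weight w(x) = 1/x, the setup the earlier exercises point to, and the substitution u = ln x shows each inner product should reduce to ∫₀^π sin(ju) sin(ku) du:

```python
import math

a, b = 1.0, math.exp(math.pi)   # the interval (1, e^pi)

def ip(j, k, n=20000):
    # <phi_j | phi_k> = integral over (1, e^pi) of sin(j ln x) sin(k ln x) (1/x) dx,
    # approximated by the midpoint rule
    h = (b - a) / n
    s = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        s += math.sin(j * math.log(x)) * math.sin(k * math.log(x)) / x
    return s * h

off_diag = ip(1, 2)   # distinct eigenfunctions: expect 0
norm_sq = ip(3, 3)    # squared norm: expect pi/2 via the substitution u = ln x
```

The off-diagonal inner product vanishes to quadrature accuracy while each squared norm comes out π/2, matching what the substitution predicts and what part c of the exercise asks you to compute.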
52.19. In several exercises above, you considered either the Sturm-Liouville problem

           x²φ″ + 3xφ′ = −λφ

       or, at least, the above differential equation. You may use the results from those exercises
       to answer the ones below:

       a. What is the natural weight function w(x) and inner product ⟨ f | g ⟩ corresponding
          to this Sturm-Liouville problem?

       b. We know that

              { λ₁ , λ₂ , λ₃ , . . . , λ_k , . . . }  =  { 2 , 5 , 10 , . . . , k² + 1 , . . . }

          is the complete set of eigenvalues for this Sturm-Liouville problem. Now write out a
          corresponding orthogonal set of eigenfunctions (without arbitrary constants),

              { φ₁ , φ₂ , φ₃ , . . . , φ_k , . . . } .

       c. Compute the norm of each of your φ_k's from your answer to the last part. (A simple
          substitution may help.)

       d. To what does each coefficient in the generalized Fourier series

              G.F.S.[ f ]  =  Σ_k c_k φ_k(x)    with    c_k = ⟨φ_k | f⟩ / ‖φ_k‖²

          reduce, using the eigenfunctions and inner products from previous parts of this
          exercise?

       e. Using the results from the last part, find the generalized Fourier series for the following:

          i.  f(x) = 1

          ii. f(x) = x
           φ₁(x) = x ,   φ₂(x) = 2x² − 1 ,   φ₃(x) = 4x³ − 3x ,   . . .

       d. Compute each of the following norms:

          i. ‖φ₀‖      ii. ‖φ₁‖      iii. ‖φ₂‖      iv. ‖φ₃‖

       e. According to our theory, any reasonable function f(x) on (−1, 1) can be approximated
          by the Nth partial sum

              Σ_{k=0}^N c_k φ_k(x)

          where

              Σ_{k=0}^∞ c_k φ_k(x)

          is the generalized Fourier series for f(x) using the type I Chebyshev polynomials.
          For the following, let

              f(x) = x √(1 − x²)

          and find the above-mentioned Nth partial sum when

          i. N = 0      ii. N = 1      iii. N = 2      iv. N = 3
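The Chebyshev coefficients needed for these partial sums can be computed numerically. Here is a sketch (mine, not the text's); it substitutes x = cos t so the endpoint singularity of the weight (1 − x²)^{−1/2} disappears, turning each inner product into an ordinary integral over (0, π):

```python
import math

def cheb_ip(g, f, n=20000):
    # <g | f> with weight (1 - x^2)^(-1/2) on (-1, 1); substituting x = cos(t)
    # gives the ordinary integral of g(cos t) f(cos t) over (0, pi)
    h = math.pi / n
    return sum(g(math.cos((i + 0.5) * h)) * f(math.cos((i + 0.5) * h))
               for i in range(n)) * h

f = lambda x: x * math.sqrt(max(0.0, 1.0 - x * x))   # the function being expanded
T1 = lambda x: x                                     # Chebyshev phi_1
T3 = lambda x: 4 * x ** 3 - 3 * x                    # Chebyshev phi_3

c1 = cheb_ip(T1, f) / cheb_ip(T1, T1)   # expect 4/(3 pi)
c3 = cheb_ip(T3, f) / cheb_ip(T3, T3)   # expect -4/(5 pi)
```

Since f is odd, the even-indexed coefficients vanish, so the N = 1 and N = 2 partial sums coincide and the N = 3 partial sum is c₁x + c₃(4x³ − 3x) with the two coefficients computed above.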
1d i.   Au = 6b₁
1d ii.  Av = 9b₁ + 4b₂
1d iii. Aw = (9/13)b₁ − (4/13)b₂

2b. ‖b₁‖² = 6 , ‖b₂‖² = 5 and ‖b₃‖² = 30
2c i.   u = 2b₁ + 3b₂ − 4b₃
2c ii.  v = (4/3)b₁ − (1/3)b₃
2c iii. w = (1/3)b₁ − (1/5)b₂ + (1/15)b₃
2d iii. Aw = (2/3)b₁ − (4/5)b₂ + (2/5)b₃

3b. ‖b₁‖² = 3 , ‖b₂‖² = 6 and ‖b₃‖² = 2
3c i.   u = 2b₁ − 3b₂ + 4b₃
3c ii.  v = b₁ + 2b₂ + 3b₃
3c iii. w = (1/3)b₁ + (1/6)b₂ − (1/2)b₃
3d i.   Au = −2b₁ + 4b₃
3d ii.  Av = −b₁ + 3b₃
3d iii. Aw = −(1/3)b₁ − (1/2)b₃
6a. If λ < −4 , φ(x) = ae^{μx} + be^{−μx} with μ = √(−(4 + λ)) ; if λ = −4 , φ(x) = ax + b ;
    if λ > −4 , φ(x) = a cos(μx) + b sin(μx) with μ = √(4 + λ)

6b. If λ < 1 , φ(x) = ae^{(−1+μ)x} + be^{−(1+μ)x} with μ = √(1 − λ) ; if λ = 1 , φ(x) =
    ae^{−x} + bxe^{−x} ; if λ > 1 , φ(x) = ae^{−x} cos(μx) + be^{−x} sin(μx) with μ = √(λ − 1)

6c. If λ < 0 , φ(x) = ax^{μ} + bx^{−μ} with μ = √(−λ) ; if λ = 0 , φ(x) = a ln|x| + b ;
    if λ > 0 , φ(x) = a cos(√λ ln|x|) + b sin(√λ ln|x|)

6d. If λ < 1 , φ(x) = ax^{−1+μ} + bx^{−1−μ} with μ = √(1 − λ) ; if λ = 1 , φ(x) = ax^{−1} ln|x| +
    bx^{−1} ; if λ > 1 , φ(x) = ax^{−1} cos(μ ln|x|) + bx^{−1} sin(μ ln|x|) with μ = √(λ − 1)
7a. d/dx[ dφ/dx ] + 4φ = −λφ

7b. d/dx[ e^{2x} dφ/dx ] = −λ e^{2x} φ

7c. d/dx[ x dφ/dx ] = −λ (1/x) φ

7d. d/dx[ x³ dφ/dx ] = −λ x φ

7e. d/dx[ e^{−x²} dφ/dx ] = −λ e^{−x²} φ

7f. d/dx[ √(1 − x²) dφ/dx ] = −λ (1/√(1 − x²)) φ

7g. d/dx[ x e^{−x} dφ/dx ] = −λ e^{−x} φ
8a. (λ₀ , φ₀(x)) = (0 , c₀) and (λ_k , φ_k(x)) = ( (kπ/L)² , c_k cos(kπx/L) ) for k = 1, 2, 3, . . .

8b. (λ_k , φ_k(x)) = ( ((2k + 1)π/(2L))² , c_k sin((2k + 1)πx/(2L)) ) for k = 1, 2, 3, . . .

9a. (λ₀ , φ₀(x)) = (0 , c₀) and (λ_k , φ_k(x)) = ( k² , c_{k,1} cos(kx) + c_{k,2} sin(kx) ) for k = 1, 2, 3, . . .

9b. (λ_k , φ_k(x)) = ( k² , c_k sin(k ln|x|) ) for k = 1, 2, 3, . . .

9c. (λ_k , φ_k(x)) = ( k² + 1 , c_k x^{−1} sin(k ln|x|) ) for k = 1, 2, 3, . . .
11c. p(b) = p(a)
12a. −3/(2π)

12b. 81 + 162i

12c. 81 − 162i

12d. (3/(2π)) i

12e. 3

12f. √819

12g. √(3/2)

12h. √3
13a. −1/(2π)

13b. 9/4 + (8/5) i

13c. 9/4 − (8/5) i

13d. 1/(2π²) + (1/(2π)) i

13e. 1/2

13f. 2√6

13g. 1/2

13h. 1/√2
16b i.  Σ_{k=1}^∞ (2L/(kπ)²) [ (−1)^k − 1 ] cos(kπx/L)

16b ii. Σ_{k=1}^∞ (2/(kπ)) sin(kπ/2) cos(kπx/L)
17a. w(x) = 1 , ⟨ f | g ⟩ = ∫₀^L (f(x))* g(x) dx

17b. { 1 , cos(πx/L) , cos(2πx/L) , . . . , cos(kπx/L) , . . . }

17c. ‖1‖ = √L , ‖cos(kπx/L)‖ = √(L/2)

17d. c₀ = (1/L) ∫₀^L f(x) dx , c_k = (2/L) ∫₀^L f(x) cos(kπx/L) dx for k > 0

17e i.   1

17e ii.  L/2 + Σ_{k=1}^∞ (2L/(kπ)²) [ (−1)^k − 1 ] cos(kπx/L)

17e iii. 1/2 + Σ_{k=1}^∞ (2/(kπ)) sin(kπ/2) cos(kπx/L)
18a. w(x) = 1/x , ⟨ f | g ⟩ = ∫₁^{e^π} (f(x))* g(x) (1/x) dx

18e i.   Σ_{k=1}^∞ (2/(kπ)) [ 1 − (−1)^k ] sin(k ln|x|)

18e iii. Σ_{k=1}^∞ (2k/(π(1 + k²))) [ 1 − (−1)^k e^π ] sin(k ln|x|)
19a. w(x) = x , ⟨ f | g ⟩ = ∫₁^{e^π} (f(x))* g(x) x dx

19b. { x^{−1} sin(ln|x|) , x^{−1} sin(2 ln|x|) , x^{−1} sin(3 ln|x|) , . . . , x^{−1} sin(k ln|x|) , . . . }

19e i.  Σ_{k=1}^∞ (2k/(π(1 + k²))) [ 1 − (−1)^k e^π ] x^{−1} sin(k ln|x|)

19e ii. Σ_{k=1}^∞ (2k/(π(4 + k²))) [ 1 − (−1)^k e^{2π} ] x^{−1} sin(k ln|x|)
20a. λ₀ = 0 , λ₁ = 1 , λ₂ = 4 , λ₃ = 9

20b. d/dx[ √(1 − x²) dφ/dx ] = −λ (1/√(1 − x²)) φ

20c. ⟨ f | g ⟩ = ∫₋₁¹ (f(x))* g(x) (1 − x²)^{−1/2} dx

20d i.   √π

20d ii.  √(π/2)

20d iii. √(π/2)

20d iv.  √(π/2)

20e i.   0

20e ii.  (4/(3π)) x

20e iii. (4/(3π)) x

20e iv.  (4/(3π)) x − (4/(5π)) (4x³ − 3x)