

Spring 2007

Applied Mathematics III

Pei-Ming Ho
Department of Physics
National Taiwan University
Taipei 106, Taiwan, R.O.C.
February 3, 2014

Feynman: "To those who do not know mathematics, it is difficult to get across a real feeling as to the beauty, the deepest beauty, of nature... If you want to learn about nature, to appreciate nature, it is necessary to understand the language that she speaks in."

References of the course include:

Mathews and Walker: Mathematical Methods of Physics.

Trodden: Lecture notes: Methods of Mathematical Physics I, trodden/courses/mathmethods/

The materials to be covered in this course are the following:

Complex analysis and evaluation of integrals (saddle-point approx.)

Fourier transform (+ Laplace transform)

Sturm-Liouville theory, linear space of functions and special functions

Special functions

This note is provided as a supplement, not a substitute, to the references.

We will use Einstein's summation convention in this note.
Chapter 1

Complex Analysis

Reading: T: 1. Analysis of Complex Functions; 2.2 Riemann-Lebesgue Lemma and Method of Stationary Phase; 2.3 The Saddle-Point Method. MW: Appendix: Some Properties of Functions of a Complex Variable; 3.3 Contour Integration; 3.6 Saddle-Point Methods.

In this chapter we will learn how to use complex analysis, in particular the residue theorem, to evaluate integrals. We will also learn how to compute the saddle-point approximation of an integral.

holomorphic/analytic functions.

singularity, branch cut.

contour integral, residue theorem.

using the residue theorem to evaluate integrals.

saddle-point approximation.

1.1 Introduction

Complex numbers and complex functions appear in physical problems as well as in formulations of fundamental physics. Why is complex analysis useful? Are not all physical quantities real? Since z = x + iy, a complex function of z is just two real functions of (x, y):

f(z) = u(x, y) + iv(x, y). (1.1)

Why don't we just reduce every problem in complex analysis to real analysis?

(An example of the use of complex fxs appeared in your E&M course when dealing with waves: cos(kx) → e^{ikx}. In QM, complex fxs are needed at a more fundamental level.)

Holomorphic functions are rare, in the sense that there are a lot more functions on R² than holomorphic functions on C. This is why analytic continuation (f(x) → F(z)) is possible. Given a real function f(x) of x ∈ R, we can imagine that it is the restriction of a holomorphic function F(z) to the real axis of the complex plane C. Therefore, properties of F(z) can be viewed as properties of f(x). A lot of properties of f(x) are hidden until we view it in the form of F(z). These properties allow us to deal with problems involving f(x) (such as the integration of f(x)) more easily.

(If we are not restricted to holomorphic functions, there are infinitely many functions F(z, z̄) which reduce to a given f(x) when z = z̄ = x. For example, for f(x) = x², we can have F = z², |z|², z^{1/2} z̄^{3/2}, etc. Only the first choice is holomorphic.)

A theorem that demonstrates this notion is Liouville's theorem (for complex analysis, not phase space).

Another example is that z → z′ = z′(z) always defines a conformal map.

1.2 Preliminaries and Definitions

(Definitions are not just about names, which help us to communicate more efficiently. More important is the concept behind a definition.)

1. Change of variables on R²:

z = x + iy,  z̄ = x − iy, (1.2)
z = re^{iθ},  z̄ = re^{−iθ}. (1.3)

2. We consider functions that are single-valued maps from domains in C to ranges in C. A function which appears to be multiple-valued should be restricted to a smaller domain such that it is single-valued.

3. What is ∂ ≡ ∂/∂z? (What is ∂̄ ≡ ∂/∂z̄?)

∂ = (1/2)(∂_x − i∂_y),  ∂̄ = (1/2)(∂_x + i∂_y). (1.4)

4. The rules of differentiation (chain rule, Leibniz rule) are the same as for real functions.

5. Caution: ∂̄(1/z) ≠ 0.

(Recall that the electric potential V(r) of a point charge is ∝ 1/r. Gauss's law: ∇²(1/r) = −4πδ^(3)(r⃗). The analogue in 2D is V(r) ∝ log r, and we need ∇² log r ∝ δ^(2)(r⃗). Now log r = (1/2)(log z + log z̄), and ∇² = 4∂∂̄, so we need ∂̄(1/z) ∝ δ^(2)(r⃗).)

Stokes' theorem

∮(A_x dx + A_y dy) = ∫dxdy (∂_x A_y − ∂_y A_x) (1.5)

implies that

∮f(z)dz = ∮(f(z)dx + if(z)dy) = ∫dxdy (i∂_x − ∂_y)f(z) = 2i∫dxdy ∂̄f(z). (1.6)

General rule: to compute the derivative of a function ill-defined at a point, we use Gauss's law or Stokes' theorem. If f(z) is well-defined in R, ∂̄f(z) = 0, implying ∮_C f(z)dz = 0 (Cauchy's theorem). If f(z) = 1/z, for C being an infinitesimal circle around the origin, ∮_C f(z)dz = 2πi, implying that

∂̄(1/z) = (∂/∂z̄)(1/z) = πδ^(2)(x, y). (1.7)
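The statement ∮_C dz/z = 2πi for a small circle around the origin, which underlies (1.7), is easy to check numerically. A minimal sketch (the radius and number of points are arbitrary choices):

```python
import numpy as np

# Parametrize a circle of radius r around the origin: z(t) = r e^{it}, dz/dt = i z.
n = 4000
t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
dt = 2.0 * np.pi / n
z = 0.1 * np.exp(1j * t)
dz = 1j * z  # dz/dt

# Riemann sum for the contour integral of 1/z; the exact value is 2*pi*i.
integral = np.sum(dz / z) * dt
print(abs(integral - 2j * np.pi))  # essentially zero
```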
6. A function is analytic at a point if it has a derivative (it is complex-differentiable) there. That is,

f′(z) = lim_{h→0} (f(z + h) − f(z))/h  (h ∈ C) (1.8)

exists and is independent of the path by which the complex number h approaches 0.

7. f(z) = u(x, y) + iv(x, y) satisfies the Cauchy-Riemann equations

∂u/∂x = ∂v/∂y,  ∂u/∂y = −∂v/∂x. (1.9)

So u and v are harmonic functions:

∇²u = 0,  ∇²v = 0. (1.10)

8. One can use the CR-eqs to derive v if u is given (assuming that v exists). (Let u = xy. One can solve v from the CR-eqs and get v = (y² − x²)/2 + c. Thus f = −iz²/2 + ic. Alternative derivation: try z² since u is quadratic.)
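The marginal example (u = xy, v = (y² − x²)/2) can be checked with a few lines of finite-difference code; this is a sketch, with an arbitrarily chosen sample point:

```python
import numpy as np

# For u = x*y the CR equations give v = (y**2 - x**2)/2 + c, i.e. f = -i z**2/2 + i c.
def u(x, y): return x * y
def v(x, y): return (y**2 - x**2) / 2.0

h = 1e-6
x, y = 0.7, -1.3  # arbitrary sample point
ux = (u(x + h, y) - u(x - h, y)) / (2 * h)
uy = (u(x, y + h) - u(x, y - h)) / (2 * h)
vx = (v(x + h, y) - v(x - h, y)) / (2 * h)
vy = (v(x, y + h) - v(x, y - h)) / (2 * h)

# Cauchy-Riemann: u_x = v_y and u_y = -v_x.
print(abs(ux - vy), abs(uy + vx))  # both essentially zero
```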

9. Taylor expansion, analytic continuation, radius of convergence, singularity.

(Ex: f(x) = Σ_n xⁿ → f(z) = Σ_n zⁿ with rad. of conv. = 1. One can compute f^(n)(−1/2) for all n and find f(z) = Σ_n (2/3)((2/3)(z + 1/2))ⁿ with rad. of conv. = 3/2. This goes beyond the convergence region of the 1st expansion.)

10. [Holomorphic/Analytic function in R] = [function that is analytic everywhere in R].

(Theorem: The Taylor expansion of an analytic fx. around a point inside R has a finite radius of convergence.)

11. The rad. of conv. R of f(z) about z₀ is equal to the distance between z₀ and the nearest singularity of f(z).
12. Compositions of analytic functions are analytic:

af₁ + bf₂, (1.11)
f₁f₂, (1.12)
f₁/f₂ when f₂ is not 0, (1.13)
f₁(f₂(z)). (1.14)

13. [Entire function] = [function that is holomorphic on the whole complex plane]. (Examples of entire fxs: f(z) = z, f(z) = e^z, etc.)

14. [Laurent series of f(z)] = [Σ_{n=−m}^{∞} f_n zⁿ].

f has a pole (isolated singularity) of order m if 0 < m < ∞. [Simple poles] = [poles of order 1]. (Example: f(z) = z^{−n} has a pole of order n at z = 0 for n ∈ Z⁺.)

f has an essential singularity if m = ∞. (An essential singularity is a singularity more severe than poles of any order: lim_{z→z₀} (z − z₀)ⁿ f(z) = ∞ for any finite n.)

15. [Meromorphic function in R] = [analytic except at isolated points in R with singularities of finite order]. (A fx. f(z) is meromorphic iff f(z) = g(z)/h(z) for some holomorphic fxs g(z) and h(z).)

16. [Branch cut] = [curve in C taken out of the domain to avoid multiple-valuedness]. (A fx. is discontinuous across a branch cut. Example of a branch cut: f(z) = z^{1/2}.)
1.3 Residue Theorem

The residue of a function f(z) at z₀ is defined by

Res_{z→z₀} f(z) ≡ f₋₁ (1.15)

if the Laurent expansion of f around z₀ is

f(z) = ··· + f₋₂/(z − z₀)² + f₋₁/(z − z₀) + f₀ + f₁(z − z₀) + f₂(z − z₀)² + ··· . (1.16)

This number f₋₁ is special because it is singled out from all coefficients f_n if we integrate f over a small circle around z₀ (the residue is nonzero only at singular points):

∮_{S¹(z₀)} dz f(z) = 2πi f₋₁. (1.17)

The essence of the residue theorem is simply

∫₀^{2π} dθ e^{imθ} = 2π δ_{m,0}, (1.18)

which we used to derive (1.17).

Residue Theorem:
The contour integral of a fx. f(z) (analytic except for isolated singularities) is given by the sum of its residues at all singular points encircled by the contour:

∮_C dz f(z) = 2πi Σ_i Res_{z→z_i} f(z). (1.19)

(This theorem is easy to understand by replacing the contour by infinitely many infinitesimal circles.)

This is for C going around the poles once in the counterclockwise direction. In general we get an integer factor on the right hand side if C is going in the opposite direction or if C encircles the poles more than once.

To prove this theorem, we simply quote Cauchy's theorem proved in Sec. 1.2 and (1.17).
Immediate consequences of the residue theorem:

Cauchy's theorem:

∮_C f(z)dz = 0 (1.20)

if f(z) is analytic in R and C ⊂ R. (We used Cauchy's theorem to prove the residue theorem, but once you accept and memorize the residue theorem, it is convenient to view Cauchy's theorem as its consequence because there is no residue inside the contour.)

Cauchy's integral formula: For an analytic function f(z) in R,

f(z) = (1/2πi) ∮_C dζ f(ζ)/(ζ − z), (1.21)

where C is a closed contour in R encircling the point z once in the counterclockwise direction.

A contour integral of f(z) in C with endpoints z₀, z₁ is invariant under continuous deformations of the contour with fixed endpoints if the deformation never crosses over any singularity.

Liouville's theorem: A bounded entire function must be constant. (Hint of proof: consider |f(z₁) − f(z₂)| using Cauchy's integral formula.)

Liouville's theorem implies that: if f(z) → 0 as z → ∞, f(z) must have singularities (unless f(z) is identically zero). (A holomorphic function f(z) with only simple poles is uniquely determined by the locations and residues of the simple poles, and its value at infinity.)

The residue theorem not only helps us to understand analytic functions better, it is also very powerful for evaluating integrals. However it takes experience (a lot of practice) and ingenuity (which usually only comes after hard work) to invent the relevant analytic function and contour.

1.3.1 Examples

In the following, let a > 0.

1. ∫_{−∞}^{∞} dx/(x² + a²) = ? (1.22)

2. ∫₀^{∞} dx/(x² + a²)² = ? (1.23)

3. ∫₀^{∞} dx/(1 + x³) = ? (1.24)

4. For a > b > 0,

∫₀^{2π} dθ/(a + b cos θ) = ? (1.25)

5. ∫₀^{∞} dx/(1 + x²) = ? (1.26)

6. For 0 < a < 1,

∫_{−∞}^{∞} e^{ax}/(e^x + 1) dx = ? (1.27)

7. ∫_{−1}^{1} dx/(√(1 − x²)(1 + x²)) = ? (1.28)

8. For f(z) with several isolated singularities, consider the contour integral

∮ πf(z)dz/sin(πz). (1.29)

Show that

Σ_{n=−∞}^{∞} (−1)ⁿ f(n) = −π Σ_k R_k/sin(πz_k), (1.30)

where R_k is the residue of f(z) at z_k.

9. ∫_{−∞}^{∞} (x sin x)/(x² + a²) dx = ? (1.31)

10. A holomorphic function f(z) has only 2 simple poles at z = ±1 with both residues equal to 1. f(z) also approaches 1 at infinity. Find f(z) and prove that it is unique.
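For instance, for Example 1 one closes the contour in the upper half plane, picking up the pole at z = ia with residue 1/(2ia), so the integral is π/a. A quick numerical cross-check (sketch using scipy.integrate.quad; the value a = 2 is arbitrary):

```python
from scipy.integrate import quad
import numpy as np

a = 2.0
# Residue theorem: 2*pi*i * Res_{z=ia} 1/(z**2 + a**2) = 2*pi*i / (2*i*a) = pi/a.
val, err = quad(lambda x: 1.0 / (x**2 + a**2), -np.inf, np.inf)
print(val, np.pi / a)  # both 1.5707...
```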

Independence of z and z̄

Given a complex number z = x + iy ∈ C, its complex conjugate z̄ = x − iy is known. But algebraically z and z̄ are independent. Consider a general change of coordinates of R²:

x′ = ax + by,  y′ = cx + dy (1.32)

for arbitrary coefficients a, b, c, d (as long as the inverse exists). A function f(x, y) can be rewritten in terms of (x′, y′) as f′(x′, y′) such that f′(x′(x, y), y′(x, y)) = f(x, y). The new coordinates (x′, y′) are independent variables just as (x, y) are independent variables. One can consider derivatives and use the chain rule to derive

∂/∂x = (∂x′/∂x)∂/∂x′ + (∂y′/∂x)∂/∂y′,  ∂/∂y = (∂x′/∂y)∂/∂x′ + (∂y′/∂y)∂/∂y′. (1.33)

All of these algebraic relations remain the same for our special case of z = x′, z̄ = y′ (a = 1, b = i, c = 1, d = −i). In this sense, (z, z̄) are independent variables.
Analytic continuation

Analytic continuation means to extend the domain of a function from its original definition to other regions on the complex plane as much as possible, with the function's analyticity preserved. For example, if we define f(x) by the series

f(x) = 1 + x + x² + x³ + ··· = Σ_{n=0}^{∞} xⁿ, (1.34)

its domain is (−1, 1) because its radius of convergence is 1.

By analytic continuation, we extend the original definition of f(x) to everywhere on the complex plane except the point x = 1, so that

f(x) = 1/(1 − x). (1.35)

Suppose we find the expansion (1.34) in a physical problem, and we are interested in the case when x is a phase. Strictly speaking, based on the definition (1.34), the value of f(x) for x = e^{iθ} is ill-defined. One might wonder whether we can define f(e^{iθ}) as the limit

lim_{ε→0⁺} Σ_n ((1 − ε)e^{iθ})ⁿ = 1/(1 − e^{iθ}), (1.36)

but how do we justify this choice, rather than the other choice (ε → 0⁻)? In a physical problem, there should be physical reasons to justify one choice over the other. A simple criterion is that, if the quantity you are computing is physical/observable, then it must be finite and well-defined. Roughly speaking, this justifies the use of analytic continuation whenever possible.
As an example of the use of analytic continuation, imagine that in a phys-
ical problem, we need to solve the differential equation

(1 − x)f′(x) − f(x) = 0. (1.37)

One might try to solve f (x) as an expansion

f(x) = f₀ + f₁x + f₂x² + f₃x³ + ··· . (1.38)

Then we will get some recursion relations and find that (1.34), up to an overall
constant factor, is the solution. For this problem, one will not hesitate to
replace the series (1.34) by (1.35), because one can directly check that (1.35)
satisfies the differential equation (1.37). The appearance of the series (1.34)
is not a consequence of anything of physical significance, but only a result of
the mathematical technique we apply to the problem.
Residue theorem

In the above, we see that the residue theorem is very powerful. We can use it to evaluate many integrals which we do not know how to compute otherwise. Even for some of the integrals which we can directly integrate, it is much easier to use the residue theorem. It is a really elegant technique.

The use of the residue theorem always involves complex numbers in the calculation, even when the integral is real. It is important to have a rough idea about the integral, e.g. whether it is real, positive or negative, etc., before applying the residue theorem, so that you can easily detect an error in calculation. If, after using the residue theorem, you find a complex number for a real integral, or a negative number for an integral that is obviously positive definite, you must have made a mistake.

1.4 Saddle-Point Approximation

Most integrals cannot be carried out to give an exact answer. The saddle-point method is one of the most important methods to estimate an integral. The integral must have a parameter λ whose value is large. The larger the parameter, the better the approximation.

The basic idea is this. Consider a function f(x) with a maximum at x₀. The contrast between values of f(x) at different points x is magnified in

e^{λf(x)} (1.39)

for large λ. The value of the integral

∫dx e^{λf(x)} (1.40)

is expected to be dominated by the contribution from a small region around the point x = x₀. As a lowest order approximation, in the very near neighborhood of x₀ (i.e., x ≈ x₀), we can approximate f(x) by

f(x) ≃ f(x₀) + (1/2)f″(x₀)(x − x₀)². (1.41)

(We have f′(x₀) = 0 and f″(x₀) < 0 since x₀ is a maximum.) Then the integral (1.40) is approximated by a Gaussian integral

∫dx e^{λf(x)} ≃ e^{λf(x₀)} ∫dx e^{(λ/2)f″(x₀)(x−x₀)²}. (1.42)

This can be carried out easily, using

∫_{−∞}^{∞} dx e^{−x²/2} = √(2π), (1.43)

assuming that the range of integration covers the point x₀ and its neighborhood of several standard deviations. (The standard deviation of the Gaussian in (1.42) is small, ∼ 1/√λ.) In case the range of integration misses a significant portion of the Gaussian function, we can still estimate the Gaussian integral by looking it up in a table or using a computer.

Whether this is a good approximation depends on whether the integral is dominated by a small region around x₀ within which the first few terms of the Taylor expansion (1.41) are a good approximation of f(x).

Higher order approximations can be obtained by keeping more terms in the Taylor expansion

f(x) ≃ f(x₀) + (1/2)f″(x₀)(x − x₀)² + (1/3!)f‴(x₀)(x − x₀)³ + ··· , (1.44)

and Taylor-expanding the integrand apart from the Gaussian factor:

e^{λf(x)} ≃ e^{λf(x₀)} e^{(λ/2)f″(x₀)(x−x₀)²} (1 + λA(x − x₀)³ + λB(x − x₀)⁴ + ···). (1.45)

Then we can carry out the integration term by term. To do so, you should know how to derive the following (using integration by parts):

∫dx e^{−x²/2} xⁿ = 2^{(n+1)/2} Γ((n+1)/2) = √(2π)[1·3·5···(n − 1)] for n even, and 0 for n odd. (1.46)

(Check that the 2nd term λA(x − x₀)³ in the expansion above has no contribution.)
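The moment formula (1.46) can be checked directly; a sketch using numerical integration (n = 3, 4 chosen as examples):

```python
from scipy.integrate import quad
import numpy as np

def gaussian_moment(n):
    # integral of exp(-x**2/2) * x**n over the real line
    val, _ = quad(lambda x: np.exp(-x**2 / 2.0) * x**n, -np.inf, np.inf)
    return val

# Even n: sqrt(2*pi) * [1*3*5*...*(n-1)]; odd n: 0.
print(gaussian_moment(4), 3.0 * np.sqrt(2.0 * np.pi))  # both 7.519...
print(gaussian_moment(3))  # essentially 0
```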
With the help of complex analysis, we can extend the discussions above from R to C. The function f(x) can be analytically continued and the integration on the real axis can be deformed into the complex plane. The condition f′(x) = 0 which determines the maximum x₀ is replaced by

f′(z) = 0  →  z = z₀. (1.47)

This point z₀ is not a local maximum nor minimum but a saddle point. (Both the real and imaginary parts of f(z) are harmonic fxs. They have no maximum nor minimum in the bulk. They can only have saddle points.) In the two dimensional plane C, one can choose to pick the contour with steepest descent or the one with stationary phase. Formally, the two approaches look almost the same:

∫dz e^{λf(z)} ≃ ∫dz e^{λf(z₀) + (λ/2)f″(z₀)(z−z₀)²} = e^{λf(z₀)} √(2π/(−λf″(z₀))). (1.48)

The subtle difference lies in how to define the square root, which depends on how we choose the contour to pass through the saddle point z₀ (at what angle in the complex plane).

More precisely, if we choose a real parameter s on the contour so that

z ≃ z₀ + se^{iθ} (1.49)

around the point z₀ for a constant θ, we have

∫dz e^{(λ/2)f″(z₀)(z−z₀)²} = e^{iθ} ∫ds e^{(λρ/2)cos(φ+2θ)s² + i(λρ/2)sin(φ+2θ)s²}, (1.50)

where we denoted f″(z₀) = ρe^{iφ}.

In the steepest descent approach, in order for the real part of λf(z) to be a local maximum along the contour, we want to choose θ so that

cos(φ + 2θ) = −1,  sin(φ + 2θ) = 0, (1.51)

so that the imaginary part is automatically roughly constant. Then the expression above is

e^{iθ} ∫ds e^{−(λρ/2)s²} = e^{iθ} √(2π/(λρ)) = ±√(2π/(−λf″(z₀))). (1.52)

The sign depends on which of the two solutions of θ is chosen to satisfy (1.51).

For the stationary phase approach, we choose θ so that

sin(φ + 2θ) = ±1,  cos(φ + 2θ) = 0, (1.53)

so that the real part is automatically roughly constant. Then (1.50) becomes

e^{iθ} ∫ds e^{±i(λρ/2)s²} = e^{iθ} e^{±iπ/4} √(2π/(λρ)) = ±√(2π/(−λf″(z₀))). (1.54)

The sign depends on which solution of θ is chosen to satisfy (1.53).

Note that the final results (1.52) and (1.54) are exactly the same up to the ambiguity of a sign, which depends on the choice of the contour.
Refer to MW for details of
saddle-point approx.
1.4.1 Examples

1. For large x,

Γ(x + 1) = ∫₀^{∞} dt t^x e^{−t}. (1.55)

2. For large λ,

∫dx g(x)e^{λf(x)}. (1.56)

3. For large n ∈ Z,

∫₀^{2π} sin^{2n}θ dθ. (1.57)

4. For large λ,

∫dz e^{λf(z)}. (1.58)
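For Example 1, the exponent of t^x e^{−t} = e^{x ln t − t} has its saddle at t = x, with second derivative −1/x there, giving the Stirling approximation Γ(x + 1) ≃ x^x e^{−x}√(2πx). A numerical comparison (sketch; the sample values of x are arbitrary):

```python
from scipy.special import gamma
import numpy as np

# Saddle point of f(t) = x*log(t) - t: f'(t) = x/t - 1 = 0 at t = x, f''(x) = -1/x,
# so Gamma(x+1) ~ exp(f(x)) * sqrt(2*pi/(-f''(x))) = x**x * exp(-x) * sqrt(2*pi*x).
def stirling(x):
    return x**x * np.exp(-x) * np.sqrt(2.0 * np.pi * x)

for x in [5.0, 20.0, 80.0]:
    print(stirling(x) / gamma(x + 1.0))  # ratio approaches 1 as x grows
```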

Before applying the saddle point approximation, it is necessary to have a rough
idea about the behavior of the integrand. We might not want to go all the
way to figure out the shape of the contour in the complex plane, but at least
we should know, for example, whether the integral is dominated by the region
close to a boundary of integration. This may happen even when there is a
local maximum within the range of integration, and then the saddle point
approximation cannot be used without modification.
The ambiguity in choosing the direction of integration (due to different choices of the contour) when we pass over the saddle point results in the ambiguity of an overall sign of the integral. This ambiguity in the sign can also be associated with the ambiguity of the sign of the square root √(2π/(−λf″(z₀))). Usually it is not too hard to guess which choice is correct for a given integral, and so it is usually not necessary to figure out the details of the contour. It is sufficient to know that the integral is dominated by a small region somewhere along a certain contour.
In both cases (steepest descent and stationary phase), the key point is to make sure that the dominating contribution of the integral comes from a small region of the contour, so that we can approximate the integrand e^{λf(z)} by Taylor-expanding f(z) and keeping only the first few terms.
The difference in the two approaches lies in the reason why the integral is
dominated by a small region. For the steepest descent approach, the reason
is that the real part of the exponent f (z) is large in a small region (while the
imaginary part of f (z) does not change quickly within that small region, so
that there is no significant cancellation due to changes of phase).
For example, the integral of e^{−(λ²/2)x² + εx³} over the real line is dominated by the region close to x = 0, with a width of ∼ 1/λ. For larger λ, the region is smaller. But how small is small enough? The small region is small enough if the term εx³ can be ignored compared with the first term −(λ²/2)x² within the region (−1/λ, 1/λ), so that our saddle point approximation is good. That is, we want |ε(1/λ)³| ≪ |λ²(1/λ)²|, i.e., |ε| ≪ λ³. This necessary condition for the steepest descent approach can be directly written down by counting the dimension of each quantity. Imagine that this integral arises in a physical problem and the dimension of x is L. Then the dimension of λ is 1/L, and that of ε is 1/L³. Since it makes no sense to talk about the precise meaning of "large" or "small" for a quantity with dimension, we can only talk about the dimensionless quantity ε/λ³ being large or small. Since it is obvious that we want |ε| to be small in order to ignore it, what we really want must be that |ε/λ³| ≪ 1.
For the stationary phase approach, the reason why the integral is dominated by a small region is that, outside this small region, the imaginary part of f(z) changes quickly (while the real part changes slowly) when we move along the contour, corresponding to a high frequency oscillation that leads to a large cancellation in the integration.

As an example, consider the integral of e^{i((λ²/2)x² + εx³)} over the real line. If |ε/λ³| ≪ 1 so that we can ignore the 2nd term, the first term of the exponent tells us that the wavelength of the oscillation around a point x in a small region (much smaller than the wavelength) is roughly 1/(λ²x). Integration over the fast oscillating region largely cancels, and so the integration is dominated by the region (−1/λ, 1/λ), within which the wavelength is of order 1/λ or larger.

1.5 Exercises:

1. For k, a ∈ R, (ε → 0⁺ means that ε remains positive in the limit)

lim_{ε→0⁺} ∫_{−∞}^{∞} dx e^{ikx}/(x² − a² + iε) = ? (1.59)

2. For a, b ∈ R, m + n ≥ 2,

∫ dx/((x² + a²)^m (x² − b²)^n) = ? (1.60)

3. For a > b > 0,

∫_{−1}^{1} dx/(√(1 − x²)(a + bx)) = ? (1.61)

4. For a > b > 0,

∫₀^{2π} dθ/(a + b cos θ) = ? (1.62)

5. ∫₀^{∞} dx/(x³ + a³) = ? (1.63)

6. (a) Consider the contour integral

∮ dz f(z) π cot(πz) (1.64)

around a suitable large contour, and obtain thereby a formula for the sum

Σ_n f(n). (1.65)

(b) g(a) = Σ_{n=−∞}^{∞} 1/(n² + a²) = ? (1.66)

7. Evaluate

I(x) = ∫ dt e^{xt − e^t} (1.67)

approximately for large positive x.

8. (a) Check that the saddle-point approximation of the gamma function is

Γ(z) = ∫₀^{∞} dt e^{−t} t^{z−1} ≃ √(2π) z^{z−1/2} e^{−z}. (1.68)

(b) Find the saddle-point approximation of the beta function

B(a, b) = ∫₀^{1} dx x^{a−1}(1 − x)^{b−1}. (1.69)

(c) Check that the results in (a) and (b) agree with the relationship between the gamma function and the beta function

B(a, b) = Γ(a)Γ(b)/Γ(a + b). (1.70)

9. The Bessel function can be expressed as Bessel's integral

J_n(z) = (1/π) ∫₀^{π} dθ cos(nθ − z sin θ). (1.71)

Use the saddle-point approximation to find the asymptotic behavior of the Bessel function J_n(z) for large z.

10. (a) For large λ, find the saddle-point approximation of

∫₀^{1} dx x e^{−λx²/2}. (1.72)

(b) Find the exact result of the integral and compare.

Chapter 2

Fourier Transform

2.1 Motivation

Fourier transform is a rearrangement of the information contained in a function f(x). It is based on the fact that any (smooth) function f(x) can be obtained as a superposition of sinusoidal waves with suitable amplitudes and phases:

f(x) = ∫dk f̃(k)e^{ikx}, (2.1)

where

f̃(k) = |f̃(k)|e^{iφ(k)} ∈ C (2.2)

specifies the amplitude and phase of the sinusoidal wave e^{ikx} with wave number k. Fourier transform is the map from f(x) to f̃(k). There is no loss of information in this transformation. We can reconstruct f(x) from f̃(k) via the inverse Fourier transform.

(Fourier transform can be viewed as a limit of Fourier series. As we can never really measure anything with infinite spatial or temporal extension, the conceptual difference between f(x) defined on R and periodic functions defined on a finite interval is an illusion, although for most applications of Fourier transform, Fourier series is simply impractical.)

Fourier transform can be generalized to higher dimensions. It implies, for example, that we only need to consider plane waves when we learn about electromagnetic waves, because all electromagnetic waves can be viewed as superpositions of plane waves.

(Our ears work as a Fourier transformer of the sound waves, although they only keep the information of the amplitudes.)

As we will see, sometimes it is very helpful to present the information encoded in a function by its Fourier transform. In particular, it is useful for solving differential equations that are translationally invariant. In later chapters, we will also generalize the notion of Fourier transform to more general differential operators.

2.2 Dirac Delta Function and Distribution

Dirac δ-function is defined by

∫_{x₁}^{x₂} dx δ(x − x₀)f(x) = { f(x₀), x₀ ∈ (x₁, x₂); 0, otherwise }, (2.3)

where f(x) is a good (smooth) function. (Sometimes we also define the δ-function by the limit of a series of functions, e.g. δ(x) = lim_{ε→0} (1/√(2πε)) e^{−x²/(2ε)}.)

From its definition and integration by parts, one deduces that the derivative of the δ-function can be defined:

∫dx δ′(x − x₀)f(x) = −f′(x₀), (2.4)

assuming that the range of integration includes x₀.

Dirac δ-functions and their derivatives are not (good) functions. They are defined only through integrals with good functions. Such objects are called distributions. Other operations, e.g. δ²(x − x₀), or even δ(x − x₁)δ(x − x₂), are ill-defined.

Since δ-functions are defined only via integrals, we can derive all properties that can be defined for δ-functions by considering integrals with functions. (We will allow integration by parts and use the fact that δ(x) = 0 for x ≠ 0.)

2.2.1 Examples

Prove the following identities.

∫_{x₁}^{x₂} dx f(x) (d/dx)δ(x − x₀) = −f′(x₀),  x₁ < x₀ < x₂, (2.5)
f(x)δ(x − x₀) = f(x₀)δ(x − x₀), (2.6)
δ(a(x − x₀)) = (1/|a|)δ(x − x₀), (2.7)
δ(f(x)) = Σ_k (1/|f′(x_k)|)δ(x − x_k),  where the x_k are the zeros of f: f(x_k) = 0. (2.8)

Note that the coefficients on the RHS in (2.7) and (2.8) are absolute values since the δ-functions are positive-definite.
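Identity (2.8) can be illustrated numerically by replacing δ with a narrow normalized Gaussian (a sketch; the choices f(x) = x² − 1 and test function cos x are arbitrary):

```python
from scipy.integrate import quad
import numpy as np

def delta_eps(x, eps=1e-2):
    # nascent delta function: a narrow normalized Gaussian
    return np.exp(-x**2 / (2.0 * eps**2)) / (eps * np.sqrt(2.0 * np.pi))

f = lambda x: x**2 - 1.0   # zeros at x = +1 and -1, with |f'| = 2 there
g = lambda x: np.cos(x)    # a smooth test function

val, _ = quad(lambda x: delta_eps(f(x)) * g(x), -5.0, 5.0, points=[-1.0, 1.0], limit=200)
pred = np.cos(1.0) / 2.0 + np.cos(-1.0) / 2.0   # sum over zeros of g(x_k)/|f'(x_k)|
print(val, pred)  # both about 0.540
```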

2.2.2 An Identity for Delta Function

Consider the distribution

A(x) ≡ lim_{k→∞} e^{ikx}. (2.9)

Firstly it is a distribution rather than a well-defined function of x. For x ≠ 0, A(x) = 0, because

∫dx A(x)f(x) = ∫dx lim_{k→∞} e^{ikx} f(x) = 0 (2.10)

for any smooth function f(x) that vanishes in the neighborhood of x = 0. (We used this property in the stationary phase approximation.) For x = 0, it is just 1. Hence this distribution A(x) is nonzero on only one point (of measure 0). It is thus equivalent to 0.

Next we consider the distribution

B(x) = ∫_{−∞}^{∞} dk e^{ikx} = lim_{k→∞} (1/(ix))(e^{ikx} − e^{−ikx}). (2.11)

Again, it vanishes for any x ≠ 0, but it is ill-defined for x = 0. We may guess that it is proportional to δ(x), which is also 0 for x ≠ 0 and ill-defined for x = 0. We can check this by computing

∫_{−a}^{a} dx B(x) = ∫_{−a}^{a} dx ∫_{−∞}^{∞} dk e^{ikx} = ∫_{−∞}^{∞} dk (e^{ika} − e^{−ika})/(ik) = 2π (2.12)

for any a > 0. (You can use the residue theorem to evaluate the last step.) Thus, if B(x) is proportional to δ(x), it must be

B(x) = ∫_{−∞}^{∞} dk e^{ikx} = 2πδ(x). (2.13)

Let us try to be more rigorous in deriving this result (2.13). Since B(x) vanishes for all x ≠ 0, ∫dx B(x)f(x) should only depend on the behavior of f(x) at x = 0. Because ∫dx B(x)f(x) is a linear map of f(x) (to C), the most general situation is

∫dx B(x)f(x) = a₀f(0) + a₁f′(0) + a₂f″(0) + ··· (2.14)

for constant coefficients a₀, a₁, a₂, .... This means (recall ∫dx δ^(n)(x)f(x) = (−1)ⁿf^(n)(0))

B(x) = ∫dk e^{ikx} = a₀δ(x) − a₁δ′(x) + a₂δ″(x) − ··· . (2.15)

What we can learn from (2.12) is only that a₀ = 2π.

From the definition of B(x) one can check that it is an even function of x (recall δ^(n)(−x) = (−1)ⁿδ^(n)(x)). This implies that aₙ = 0 for all odd n's. We can further fix all the rest of the coefficients aₙ by considering a simple example:

∫dx B(x)e^{−(ε²/2)x²} = ∫dk ∫dx e^{−(ε²/2)(x − ik/ε²)²} e^{−k²/(2ε²)} = (√(2π)/ε) ∫dk e^{−k²/(2ε²)} = 2π. (2.16)

This should be identified with

∫dx (2πδ(x) + a₂δ″(x) + a₄δ⁽⁴⁾(x) + ···) e^{−(ε²/2)x²} = 2π − a₂ε² + 3a₄ε⁴ + ··· . (2.17)

Hence we conclude that aₙ = 0 for all n > 0.
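The result (2.13) can be visualized numerically: truncating the k-integral at ±K gives B_K(x) = 2 sin(Kx)/x, and ∫dx B_K(x)f(x) → 2πf(0) as K → ∞. A sketch (the Gaussian test function is an arbitrary choice):

```python
from scipy.integrate import quad
import numpy as np

f = lambda x: np.exp(-x**2 / 2.0)   # smooth test function with f(0) = 1

def smeared(K):
    # B_K(x) = int_{-K}^{K} dk e^{ikx} = 2*sin(K*x)/x = 2*K*sinc(K*x/pi)
    b = lambda x: 2.0 * K * np.sinc(K * x / np.pi) * f(x)
    val, _ = quad(b, -20.0, 20.0, limit=1000)
    return val

for K in [2.0, 5.0, 10.0]:
    print(smeared(K) / (2.0 * np.pi))  # approaches f(0) = 1 as K grows
```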

2.3 Review of Fourier Series

Periodic functions f(x) of period 2π can be approximated by

f(x) = a₀ + Σ_{n=1}^{∞} (aₙ cos(nx) + bₙ sin(nx)). (2.18)

Another equivalent expression is

f(x) = Σ_{n=−∞}^{∞} fₙ e^{inx}. (2.19)

If f(x) is real, aₙ, bₙ ∈ R and f₋ₙ = fₙ* ∈ C.

The equal signs in (2.18) and (2.19) are not exact. There can be points of measure 0 where the equality breaks down. In particular there is the Gibbs phenomenon: if there is a sudden jump in the value of f(x), the partial sums of the Fourier series typically overshoot near the jump.
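The Gibbs overshoot can be seen numerically with the square wave sign(x) on [−π, π), whose Fourier series is (4/π)Σ_{m≥0} sin((2m+1)x)/(2m+1); the partial sums just to the right of the jump overshoot to about 1.18 no matter how many terms are kept. A sketch:

```python
import numpy as np

# Partial sums of the Fourier series of the square wave sign(x) on [-pi, pi).
def partial_sum(x, N):
    n = 2 * np.arange(N) + 1                 # odd harmonics 1, 3, ..., 2N-1
    return (4.0 / np.pi) * np.sin(np.outer(x, n)).dot(1.0 / n)

x = np.linspace(1e-4, 0.2, 4000)             # window just to the right of the jump
for N in [10, 100, 1000]:
    print(partial_sum(x, N).max())  # stays near 1.18 instead of converging to 1
```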
The Fourier series for periodic functions is just a special case of the following general idea: For a given class of functions (e.g. periodic functions on [−π, π)), one can choose a complete basis of functions {φₙ(x)} so that it is always possible to express any function f(x) in this class as a linear superposition of them: f(x) = Σₙ fₙφₙ(x), which holds almost everywhere on the domain of f(x). (We will talk about the general theory of complete bases of functions in Chap. 3.) The Fourier expansion to be discussed below is another example. Here we give two more examples.

If periodic functions on [−π, π) can be expressed as (2.18), then even periodic functions can be expressed as

f(x) = Σ_{n=0}^{∞} aₙ cos(nx), (2.20)

and the odd ones as

f(x) = Σ_{n=1}^{∞} bₙ sin(nx). (2.21)

These are functions defined on [0, π], and their values in [−π, 0] are determined by either f(−x) = f(x) or f(−x) = −f(x). When we consider a function f(x) defined on [0, π], we can use either cos(nx) or sin(nx) to expand it, and its values in the region [−π, 0] will be determined accordingly. Note that the RHS of (2.20) has the property that its first derivative vanishes at the boundaries x = 0, π, while the RHS of (2.21) vanishes at x = 0, π.

2.4 Fourier Transform as a Limit of Fourier Series

Any periodic function f(x) of period 2π can be approximated to arbitrary accuracy by Σₙ fₙe^{inx}. By a change of variable (scaling of x) we have

f(x) = Σₙ fₙ e^{i2πnx/L},  x ∈ [−L/2, L/2). (2.22)

Using

∫_{−L/2}^{L/2} dx e^{i2πmx/L} e^{i2πnx/L} = L δ_{m+n,0}, (2.23)

we find

fₙ = (1/L) ∫_{−L/2}^{L/2} dx e^{−i2πnx/L} f(x). (2.24)

Now we take the limit L → ∞ and expect that any fx. f(x) defined on R can be approximated to arbitrary accuracy by the expansion. To do so, we imagine that the fx fₙ: Z → C defined on Z is promoted to a fx. f(n): R → C defined on R by interpolation, so that f(n) = fₙ for n ∈ Z. In the limit of large L, we focus on fxs fₙ which are very smooth, so that they do not change much when n → n + 1. That is, fₙ₊₁ − fₙ → 0 as L → ∞.

Consider the change of variables

n → k = 2πn/L, (2.25)
fₙ → f̃(k) = N f(n). (2.26)

The first equation implies that

Σₙ ≃ ∫dn = (L/2π)∫dk, (2.27)
δ_{m+n} ≃ δ(m + n) = δ((L/2π)(k + k′)) = (2π/L)δ(k + k′). (2.28)

To make the connection between the Kronecker delta and the Dirac delta fx., we examine a sum/integral of them:

Σₙ fₙ δ_{m+n} = f₋ₘ ≃ f(−m) = ∫dn f(n)δ(m + n). (2.29)

Thus eqs. (2.22), (2.23) and (2.24) become

f(x) = ∫dk (L/(2πN)) f̃(k)e^{ikx}, (2.30)
∫dx e^{ikx}e^{ik′x} = 2πδ(k + k′), (2.31)
f̃(k) = (N/L) ∫dx e^{−ikx} f(x). (2.32)

To get a well-defined limit for L → ∞, the ratio L/N should be a finite number. On the other hand, the choice of the finite number is merely a convention. An often used convention is L/N = √(2π):

f(x) = ∫ (dk/√(2π)) f̃(k)e^{ikx}, (2.33)
f̃(k) = ∫ (dx/√(2π)) e^{−ikx} f(x). (2.34)

(The constants N₁, N₂ in the Fourier transform and its inverse, f(x) = ∫(dk/N₁) f̃(k)e^{ikx} and f̃(k) = ∫(dx/N₂) e^{−ikx}f(x), always satisfy N₁N₂ = 2π.)

Repeating this decomposition for n variables, we get

f(x) = ∫ (dⁿk/(2π)^{n/2}) f̃(k)e^{ik·x}, (2.35)
f̃(k) = ∫ (dⁿx/(2π)^{n/2}) e^{−ik·x} f(x). (2.36)

2.5 Basics of Fourier Transform

Try to be familiar with the following identities:

(f̃)~(x) = f(−x), (2.37)
(f + g)~(k) = f̃(k) + g̃(k), (2.38)
(af)~(k) = af̃(k), (2.39)
(∂ᵢf)~(k) = ikᵢ f̃(k), (2.40)
(∂ᵢ₁···∂ᵢₘ f)~(k) = (ikᵢ₁)···(ikᵢₘ) f̃(k), (2.41)
Parseval: ∫dⁿk |f̃(k)|² = ∫dⁿx |f(x)|², (2.42)
f̃(0) = ∫ (dⁿx/(2π)^{n/2}) f(x), (2.43)
f(x) = f*(x)  ⟺  f̃(k) = f̃*(−k), (2.44)
convolution: (fg)~(k) = ∫ (dⁿp/(2π)^{n/2}) f̃(k − p)g̃(p). (2.45)

2.5.1 Uncertainty Relation

What Fourier transform does to a function f(x) is to decompose it into a superposition of sinusoidal waves. (This is the same as Fourier series, except that the period → ∞.)

Spiky details of f(x) correspond to the large-k dependence of f̃(k), while the large scale behavior of f(x) is encoded in the small-k dependence of f̃(k).

An important property of Fourier transform is that it demonstrates the uncertainty principle ΔxΔp ≳ ℏ of QM. If a wave function f(x) has a characteristic length scale L so that it is natural to write f(x) = F(x/L), then its Fourier transform will have a characteristic scale of L⁻¹. That is, if the wave function f(x) is mostly concentrated within a range of length L, its Fourier transform f̃(k) is mostly concentrated within a range of length 1/L.

(If we superpose two sinusoidal waves with almost the same wavelength, e^{i(k+ε)x} + e^{i(k−ε)x} = 2cos(εx)e^{ikx}, the result is a seemingly sinusoidal wave with a slowly-varying amplitude. It takes a distance of scale Δx ∼ 1/ε, which is the inverse of the uncertainty in wave number, to see the change in amplitude, and to tell its deviation from an exact sinusoidal wave.)

This can be seen by a scaling argument as follows:

f̃(k) = ∫ (dx/√(2π)) f(x)e^{−ikx}
      = ∫ (dx/√(2π)) F(x/L)e^{−iLk·x/L}
      = L ∫ (dx′/√(2π)) F(x′)e^{−iLkx′}
      = L F̃(Lk). (2.46)

In the 3rd step we let x = Lx′. (Note that the overall scaling of a wave function does not change the range of concentration.)

2.5.2 Example
The most important example is the Gaussian

    f(x) = A\, e^{-x^2/(2a^2)}.                                              (2.47)

Its Fourier transform is

    \tilde f(k) = A \int \frac{dx}{\sqrt{2\pi}}\, e^{-x^2/(2a^2) - ikx}
                = A \int \frac{a\, dx'}{\sqrt{2\pi}}\, e^{-x'^2/2}\, e^{-a^2 k^2/2}
                = a A\, e^{-a^2 k^2/2}.                                      (2.48)

In the 2nd step we used the change of variable $a x' = x + i a^2 k$ to complete the
square. The Gaussian we start with ($f(x)$) has an uncertainty of $\Delta x = a$,
and its Fourier transform is again a Gaussian, with an uncertainty of $\Delta k = 1/a$.
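A numerical check of (2.48): the Fourier transform of $A e^{-x^2/(2a^2)}$ in the symmetric convention is $aA\, e^{-a^2 k^2/2}$, so the widths indeed satisfy $\Delta x = a$, $\Delta k = 1/a$. The values of $A$ and $a$ are arbitrary illustrative choices.

```python
import numpy as np

A, a = 2.0, 1.5
x = np.linspace(-30, 30, 6001)
dx = x[1] - x[0]
f = A * np.exp(-x**2 / (2 * a**2))

# Direct quadrature of f~(k) = (2 pi)^{-1/2} integral dx e^{-ikx} f(x)
k = np.linspace(-5, 5, 401)
ft = np.exp(-1j * np.outer(k, x)) @ f * dx / np.sqrt(2 * np.pi)

expected = a * A * np.exp(-a**2 * k**2 / 2)   # eq. (2.48)
assert np.max(np.abs(ft - expected)) < 1e-8
```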
Find the Fourier transform of the following functions:

1. Identity: f(x) = 1.

2. Dirac \delta-fx.: f(x) = \delta(x - x_0).

3. f(x) = \frac{1}{x^2 + a^2}.

2.6 Applications to Differential Equations

Fourier transform can be used to reduce the action of differential operators to
multiplications by variables. For example, the ODE with constant parameters
a, b,

    \Big( \frac{d^2}{dx^2} + a \frac{d}{dx} + b \Big) f(x) = g(x),           (2.49)

is reduced to

    (-k^2 + iak + b)\, \tilde f(k) = \tilde g(k).                            (2.50)

\tilde f can be easily solved from this algebraic eq., and then f(x) is known as
the inverse Fourier transform of \tilde f. This works also for ODEs of higher orders.
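The reduction (2.49)-(2.50) can be sketched numerically: on a periodic grid the FFT turns the constant-coefficient ODE into one division per mode. The parameters and the right-hand side below are arbitrary test choices, not from the text.

```python
import numpy as np

# Periodic grid on [0, 2 pi); k then takes integer values
N, L = 256, 2 * np.pi
x = np.arange(N) * L / N
k = np.fft.fftfreq(N, d=L / N) * 2 * np.pi

a, b = 0.5, 2.0
g = np.cos(3 * x)

# (2.50): divide mode by mode, then invert the transform
f = np.real(np.fft.ifft(np.fft.fft(g) / (-k**2 + 1j * a * k + b)))

# Verify the ODE (2.49) by spectral differentiation
fk = np.fft.fft(f)
lhs = np.real(np.fft.ifft((-k**2 + 1j * a * k) * fk)) + b * f
assert np.max(np.abs(lhs - g)) < 1e-10
```

Note that the division is well-defined here because $-k^2 + iak + b$ never vanishes on the integer wavenumbers; this is exactly the resonance issue discussed in the example of sec. 2.6.1.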
Fourier transform for several variables can also be used to solve PDEs.
Fourier transform is not so useful when the differential op. involves non-
trivial fxs as coefficients. But if the coefficients are all independent of a certain
variable, it is still helpful to perform Fourier transform on that variable. For
example, for ops of the form
    D = D_0 - \partial_t^2,                                                  (2.51)

where D_0 is independent of t, we can do a Fourier transform on t so that it is
reduced to

    D = D_0 + k^2.                                                           (2.52)
Fourier transform applies best to problems with translational symmetry.

2.6.1 Example
Find the general solution to the differential equation
Recall that y(x) is in gen-
d eral a special solution plus
y(x) + a y(x) + by(x) = A cos(kx) (2.53) a general solution of the
dx dx
homogeneous equation.
for given constants a, b, A. Note that we have a quite different situation if k
happens to be a solution to the algebraic equation

k 2 + iak + b = 0.

(This happens only when a = 0 if k R.)

2.6.2 Electric Potential of a Time-Dependent Source

[Margin note: This example combines your knowledge of E&M, complex analysis and
Fourier transform.]

The electric potential V(x) in Lorentz gauge satisfies

    \partial^2 V(x) = \rho(x),                                               (2.54)

where \rho(x) is the charge density. Due to the superposition principle (the fact
that this equation is linear), the solution for V can be obtained by superposing
the electric potential generated by a point charge at an instant of time:

    \partial^2 G(x, x_0) \equiv (\nabla^2 - \partial_t^2)\, G(x, x_0) = \delta^{(4)}(x - x_0),   (2.55)

where G(x, x_0) represents the V(x) generated by a point charge at the space-time
point x_0. [Margin note: In general, a solution G of a PDE of the form
D G(x, x_0) = \delta(x - x_0) is called a Green's function.] If G(x, x_0) is known,
V(x) is just

    V(x) = \int d^4 x_0\, G(x, x_0)\, \rho(x_0).                             (2.56)

is called Greens functions.

Therefore, solving G(x, x0 ) reduces the problem of solving V (x) for all kinds
of charge distributions (x) to a problem of integration.
Due to translation symmetry, G(x, x_0) = G(x - x_0). Its Fourier transform
(in a different convention),

    G(x) = \int \frac{d^4 k}{(2\pi)^4}\, \tilde G(k)\, e^{ik\cdot x},        (2.57)

where k \cdot x \equiv k_0 t - \vec k \cdot \vec x, satisfies

    (k_0^2 - \vec k^2)\, \tilde G(k_0, \vec k) \equiv k^2\, \tilde G(k) = 1. (2.58)

So we find

    G(x) = \int \frac{d^4 k}{(2\pi)^4}\, \frac{1}{k^2}\, e^{ik\cdot x}.      (2.59)

However, this expression is ill-defined, because the integrand diverges at
k^2 \equiv k_0^2 - \vec k^2 = 0.
The ambiguity involved in giving this ill-defined quantity a precise meaning
corresponds to the ambiguity in choosing the boundary/initial condition for
the differential eq.
Let us single out the integral over k_0:

    G(x) = \int \frac{d^3\vec k}{(2\pi)^3}\, e^{i\vec k\cdot\vec x}\, g(\vec k),
    \qquad
    g(\vec k) = \int \frac{dk_0}{2\pi}\, \frac{e^{ik_0 t}}{k_0^2 - \vec k^2}.   (2.60)

The ambiguity resides in g(\vec k).

The ambiguity resides in g(~k).

It turns out that the problem is better understood in terms of complex
analysis. The divergence of the integrand can be circumvented in the complex
plane of k0 . That is, we imagine that k0 is originally a complex number,
although the integral of interest is an integral over the real line in the complex
plane of k0 . The divergences of the integrand include a simple pole at k0 =
|~k| and another simple pole at k0 = |~k|. We can circumvent the poles by
passing around them either from above or from below. There are 4 possible
combinations of choices for the 2 poles, each giving a different but well-defined
(finite) integral.
For a given choice of definition of the integral, we can evaluate the integral
using the residue theorem.
The contour in the complex plane of k0 we need here must include the
real axis of k0 . To make a complete contour we need to add half a circle at
infinity either in the upper half plane or lower half plane. At infinity in the
UHP, $k_0 = \alpha + i\beta$ with $\beta \to +\infty$, while in the LHP,
$\beta \to -\infty$. The exponential factor is
$e^{ik_0 t} = e^{i\alpha t} e^{-\beta t}$. Therefore, for $t > 0$, this exponential
factor vanishes at infinity in the UHP, and for $t < 0$, it vanishes at infinity
in the LHP.
Since what we want to compute is the integral over the real axis, the residue
theorem helps us only if the half circle at infinity does not contribute to the
integral. This criterion determines whether we should choose the half circle
in the UHP or LHP to close the contour.
For example, what will we get if we decide to circumvent both poles of k0
by passing them from below (deforming the contour from the real axis to the
LHP around the poles)? Let us consider what the Greens fx. is for t > 0 and
t < 0 separately. For t > 0, we should choose the contour to be the real axis
plus a half circle at infinity in the UHP, so that the contour integral is the
same as the integral over the real axis. Then the residue theorem tells us that
    g(\vec k) = \frac{2\pi i}{2\pi}\, \frac{1}{2|\vec k|}
                \left( e^{i|\vec k| t} - e^{-i|\vec k| t} \right)
              = \frac{i}{2|\vec k|}
                \left( e^{i|\vec k| t} - e^{-i|\vec k| t} \right).           (2.61)

Similarly, for t < 0, we choose the contour to be the real axis plus a half circle
at infinity in the LHP, so that the contour integral is the same as the integral

over the real axis. The residue theorem now gives

g(~k) = 0, (2.62)

because neither of the poles is inside the contour. The fact that G(x x0 ) = 0
for t < t0 tells us that this is the retarded Greens fx.
If we had chosen to circumvent both poles from above, we would have
gotten the advanced Green's fx. Other choices of circumventing the poles give
other types of solutions, e.g. the retarded potential for positive freq. modes and
the advanced potential for negative freq. modes, which is what Feynman chose for
the propagator in quantum field theory.
Having explained the conceptual problems, let us now evaluate the retarded
Green's fx. more explicitly. Plugging (2.61) into (2.60), we get

    G(x) = \int \frac{dk\, d\cos\theta\, d\phi\; k^2}{(2\pi)^3}\,
           e^{ikr\cos\theta}\, \frac{i}{2k}\left( e^{ikt} - e^{-ikt} \right),   (2.63)

where we denoted |\vec k| by k, and defined \theta w.r.t. \vec x. One can easily
integrate over \phi and \theta, and find

    G(x) = \int_0^\infty \frac{dk}{(2\pi)^2}\, \frac{1}{2r}\,
           (e^{ikr} - e^{-ikr})(e^{ikt} - e^{-ikt}).                          (2.64)

Finally, integrating over k, and recalling that this is for t > 0, the retarded
Green's fx. is

    G(\vec x, t) = \begin{cases}
        -\dfrac{1}{4\pi}\, \dfrac{1}{r}\, \delta(t - r)  &  (t > 0), \\
        0  &  (t < 0).
    \end{cases}                                                               (2.65)

The sol. to the PDE

    (\nabla^2 - \partial_t^2)\, \phi = \rho                                   (2.66)

is therefore

    \phi(\vec x, t) = \int d^4 x_0\, G(x - x_0)\, \rho(x_0)
      = -\frac{1}{4\pi} \int d^3\vec x'\, \frac{1}{|\vec x - \vec x'|}\,
        \rho(\vec x',\, t - |\vec x - \vec x'|).                              (2.67)

2.6.3 Diffusion Equation

The diffusion eq.

    \partial_t u = \partial_x^2 u                                             (2.68)

describes the change of temperature over space and time. For the class of
problems in which u(t = 0, x) = u0 (x) is given at the initial time t = 0, we
can first find the solution for u(t = 0, x) = (x x0 ), and then superpose the
solution for a generic initial condition u0 (x).
By Fourier transform,

    u(t, x) = \int \frac{dk}{2\pi}\, \tilde u(t, k)\, e^{ikx}.               (2.69)

The diffusion eq. implies that

    \partial_t \tilde u(t, k) = -k^2\, \tilde u(t, k),                       (2.70)

which is solved by

    \tilde u(t, k) = v(k)\, e^{-tk^2}.                                       (2.71)

At t = 0, we want

    u(0, x) = \int \frac{dk}{2\pi}\, v(k)\, e^{ikx} = \delta(x)
            = \int \frac{dk}{2\pi}\, e^{ikx}.                                (2.72)

Therefore v(k) = 1, and

    u(t, x) = \int \frac{dk}{2\pi}\, e^{-tk^2 + ikx}
            = \frac{1}{2\sqrt{\pi t}}\, e^{-x^2/(4t)}.                       (2.73)

If the source is located at x = x_0, the solution is obtained by x \to (x - x_0).
[Margin note: u(t, x) here plays a role resembling a Green's fx. It is sometimes
called a boundary Green's fx.] Hence, for a generic initial condition, the
solution is

    u(t, x) = \int dx'\, u_0(x')\, \frac{1}{2\sqrt{\pi t}}\, e^{-(x-x')^2/(4t)}
            = \frac{1}{\sqrt{\pi}} \int dy\, u_0(x + 2\sqrt{t}\, y)\, e^{-y^2}.   (2.74)

2.7 Laplace Transform

Laplace transform can be viewed as an application of Fourier transform to
functions f (t) defined for t > 0 (or t > t0 ), which is not necessarily decaying
sufficiently fast as t for f(k) to be well-defined.
For f(t) defined for t > 0 (f(t) is not defined for t < 0), such that f(t)e^{-ct}
for some c > 0 decays to 0 sufficiently fast, the combination

    f(t)\, e^{-ct}\, \theta(t), \qquad \theta(t) = \begin{cases} 1, & t > 0, \\
    0, & t < 0, \end{cases}                                                  (2.75)

is a function whose Fourier transform can be defined, while the Fourier trans-
form of f(t) itself may be ill-defined. But this combination contains as much
information as f(t) itself. There is (almost) no loss of information in using
the Fourier transform of this combination to describe f(t) (except the value
\lim_{t\to\infty} f(t)).
Applying the formulas of Fourier transform,

    f(t)\, e^{-ct}\, \theta(t) = \int \frac{dk}{2\pi}\, \tilde f(k)\, e^{ikt},   (2.76)
    \tilde f(k) = \int_0^\infty dt\, f(t)\, e^{-(c+ik)t}.                    (2.77)

Combining exponential factors in these equations, we find

    f(t)\,\theta(t) = \int_{c-i\infty}^{c+i\infty} \frac{ds}{2\pi i}\, F(s)\, e^{st},   (2.78)
    F(s) = \int_0^\infty dt\, f(t)\, e^{-st},                                (2.79)

where we renamed \tilde f(k) by F(s), and

    s = c + ik.                                                              (2.80)

These formulas (2.78) and (2.79) define the Laplace transform.

2.7.1 Example
Find the Laplace transform of f(t) = 1. (When we talk about Laplace trans-
form, we always implicitly assume that f(t) is not defined for t < 0.) Applying
(2.79), we find

    F(s) = \int_0^\infty dt\, e^{-st} = \frac{1}{s}.                         (2.81)

Let us also transform F(s) back to f(t) to check the validity of our formulas:

    f(t) = \int_{c-i\infty}^{c+i\infty} \frac{ds}{2\pi i}\, \frac{1}{s}\, e^{st}
         = 1, \qquad t > 0.                                                  (2.82)

Here we used the residue theorem, adding a large semicircle to the left in the
complex plane of s to complete the contour. (e^{st} \to 0 on the semicircle to
the left when t > 0.)
For t < 0, we must choose the semicircle on the right. Then the residue
theorem gives f(t) = 0.
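The inversion (2.82) can also be sketched by brute-force quadrature along the Bromwich line $s = c + ik$, without the residue theorem. The values of $c$, $t$ and the $k$-grid are arbitrary choices; since the integrand decays only like $1/k$, the truncated integral converges slowly.

```python
import numpy as np

c, t = 1.0, 2.0
k = np.linspace(-2000, 2000, 1_000_001)
s = c + 1j * k
integrand = np.exp(s * t) / s          # e^{st} F(s) with F(s) = 1/s

# f(t) = (1/2 pi i) integral ds e^{st}/s; ds = i dk cancels the 1/i
f_t = np.sum(integrand) * (k[1] - k[0]) / (2 * np.pi)

assert abs(f_t - 1.0) < 1e-2           # should approach 1 for t > 0
```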

2.8 Other Integral Transforms

In addition to Fourier and Laplace transforms, there are other types of integral
transforms. [Margin note: See M&W p.109, Other Transform Pairs.]
In general, an integral transform is a change of basis of the space of func-
tions. A complete basis for a space of functions V is a set of functions
\{\phi_n(x)\}, which will be denoted \{|\phi_n\rangle\}, that can be used to
express any function f(x) \in V as a linear superposition

    f(x) = \sum_n f_n\, \phi_n(x).                                           (2.83)

Typically we can find a dual basis \{\tilde\phi_n(x)\} such that

    \int dx\, \rho(x)\, \tilde\phi_m(x)\, \phi_n(x) = \delta_{mn}.           (2.84)

We call \{\phi_n\} an orthonormal basis if \tilde\phi_n = \phi_n^*.
Introducing the notation of inner product on V as

    \langle f | g \rangle \equiv \int dx\, \rho(x)\, f^*(x)\, g(x),          (2.85)

eq. (2.83) can be expressed as

    |f\rangle = \sum_n |\phi_n\rangle \langle\phi_n | f\rangle,              (2.86)

where n is summed over the basis and

    f_n = \langle\phi_n | f\rangle \equiv \int dx\, \rho(x)\, \phi_n^*(x)\, f(x).   (2.87)

2.9 Exercises
1. Prove the following identities:

    \int_{x_1}^{x_2} dx\, f(x)\, \frac{d^n}{dx^n}\delta(x - x_0)
      = (-1)^n\, \frac{d^n f}{dx^n}(x_0), \quad x_1 < x_0 < x_2,             (2.88)
    \delta^{(n)}(\vec F(\vec x)) = \sum_k \frac{1}{|\det(\partial F/\partial x)|}\,
      \delta^{(n)}(\vec x - \vec x_k),                                       (2.89)

where the \vec x_k are the zeros of \vec F(\vec x).

2. Find the Fourier transform of the following fxs:

   (a) f(x) = \frac{d}{dx}\delta(x - x_0).
   (b) f(x) = \frac{1}{x^2 + a^2}.
   (c) f(t) = e^{-t/T} \sin(\omega t) for t > 0.
   (d) f(x) = x^n e^{-x} for n = 0, 1, 2, for x > 0.
   (e) f(x) = g(ax)\, e^{ibx}, if G(k) is the Fourier transform of g(x).

3. Prove Parseval's equation (2.42) and the convolution theorem (2.45).

4. Find the retarded Green's function G which satisfies

    (-\partial_t^2 + \nabla^2 - m^2)\, G(x, x_0) = \delta^{(4)}(x - x_0),    (2.90)

where m \in R can be interpreted as the mass of the particle in propaga-
tion.
5. Find the most general solution of the differential equation

    (\partial_t^2 - \partial_x^2)\, \phi(t, x) = 0                           (2.91)

with the boundary condition

    \phi(t, 0) = \phi(t, L) = 0                                              (2.92)

for \phi(t, x) \in R.

6. Find the most general solution of the differential equation for given pa-
rameter a,

    (i\partial_t + \partial_x^2 + \partial_y^2 - a)\, \phi(t, x, y) = 0,     (2.93)

with the boundary conditions

    \phi(t, 0, y) = 0, \quad \partial_x\phi(t, L, y) = 0, \quad
    \phi(t, x, y + 2\pi R) = \phi(t, x, y).                                  (2.94)

7. If the Laplace transform of f(t) is F(s), what is the Laplace transform
of \frac{d}{dt} f(t)?

8. Use Laplace transform to find the solution of the differential eq.

    \ddot X(t) = -\omega^2 X(t).                                             (2.95)
9. Find the Laplace transform of the following functions:
= 0
(a) (t t0 ).
(b) sin t.
(c) cos t.
(d) tn .
(e) et .

10. Find the inverse Laplace transform of:

   (a) \frac{1}{(s-a)^n},
   (b) \frac{1}{(s-a)^2 - b^2},
   (c) \frac{1}{(s^2+1)(s-1)}.

Chapter 3

Sturm-Liouville Theory

[Margin quote: "Structures are the weapons of the mathematician." — Bourbaki]

3.1 Motivation

The purpose of this chapter is mainly to establish the concept of the following
table of analogy:

    Linear Algebra                  Differential Equations                    Quantum Mechanics
    vector v \in C^n                function f(x)                             state (wave fx) |\psi\rangle, \psi(x)
    linear space \{v \in C^n\}      space of functions \{f(x) : BC\}          Hilbert space \{|\psi\rangle\}
    inner product v^\dagger w \in C integration \int dx\,\rho(x) f_1^*(x) f_2(x)   inner product \langle\psi_1|\psi_2\rangle
    matrix M_{ij}                   diff. op. D = a(x)\frac{d^2}{dx^2} + b(x)\frac{d}{dx} + c(x)   operator O(x, p)
    Hermitian matrix M_{ij} = M_{ji}^*   self-adjoint operator D^\dagger = D  observable O^\dagger = O
    eigenvector M v = \lambda v     eigenfunction D f = \lambda f             eigenstate O|\psi\rangle = \lambda|\psi\rangle

Except for subtleties involved in the limit n \to \infty (n is the dimension of
the linear space), the analogy between linear algebra and (linear) differential
equations is almost exact. This analogy will be very useful for our understanding
of differential equations.


3.2 Linear Algebra: Review

Review of linear algebra:

Linear space V:
If |i, |i V, then a|i + b|i V for a, b C.

\{|\alpha\rangle\} is a basis of V if any state |\psi\rangle \in V is a
superposition

    |\psi\rangle = \sum_\alpha \psi_\alpha\, |\alpha\rangle.                 (3.1)

[Margin note: If \alpha = 1, \ldots, n and all the |\alpha\rangle's are linearly
independent, the dimension of V is n.]

Inner product of V:
The inner product I(|\alpha\rangle, |\beta\rangle), which is often denoted
\langle\alpha|\beta\rangle, is a map from two elements of V to C. It should be
linear and anti-linear in the two arguments,

    I(|\alpha\rangle, (a|\beta_1\rangle + b|\beta_2\rangle))
      = a\langle\alpha|\beta_1\rangle + b\langle\alpha|\beta_2\rangle,       (3.2)
    I((a|\alpha_1\rangle + b|\alpha_2\rangle), |\beta\rangle)
      = a^*\langle\alpha_1|\beta\rangle + b^*\langle\alpha_2|\beta\rangle,   (3.3)

and Hermitian:

    \langle\alpha|\beta\rangle = \langle\beta|\alpha\rangle^*.               (3.4)

[Margin note: The notation \langle\alpha|\beta\rangle for the inner product is
suggestive of its properties. Define a linear space V^* = \{\langle\alpha| :
|\alpha\rangle \in V\} and call \langle\alpha| the Hermitian conjugate of
|\alpha\rangle. Let the linear relations in V^* be defined from a complex
conjugation of those in V. That is, a linear relation in V,
|\gamma\rangle = a|\alpha\rangle + b|\beta\rangle, is mapped to
\langle\gamma| = a^*\langle\alpha| + b^*\langle\beta| in V^*. Then the properties
(3.2), (3.3) are naturally inferred from the notation \langle\alpha|\beta\rangle.
We can also generalize the notion of Hermitian conjugation to C as complex
conjugation. Then (3.4) also follows from the usual rule for Hermitian
conjugation of a product: (AB)^\dagger = B^\dagger A^\dagger.]

D is a linear operator on V if it is a map that maps any state |\psi\rangle \in V
to a state in V and

    D(a|\alpha\rangle + b|\beta\rangle) = aD|\alpha\rangle + bD|\beta\rangle.   (3.5)

Eigenstates and eigenvalues:
Solutions (\lambda \in C, |\psi\rangle \in V) of the equation

    D|\psi\rangle = \lambda|\psi\rangle                                      (3.6)

are called eigenvalues (\lambda) and eigenstates (|\psi\rangle) of D.

Hermitian/Self-Adjoint operators:
The adjoint op. (denoted D^\dagger) of D is defined by the requirement

    \langle\alpha|D|\beta\rangle = (D^\dagger|\alpha\rangle)^\dagger\, |\beta\rangle
      \quad \forall\, |\alpha\rangle, |\beta\rangle \in V.                   (3.7)

D is self-adjoint if D^\dagger = D.

For a finite-dimensional linear space V, the eigenvalues \lambda_n of a Hermitian
operator are real, and the eigenstates \{|n\rangle\} form a complete basis of V.

3.3 ODE: Review

An ordinary differential equation (ODE) of order n,

    f(y(x), y'(x), y''(x), \ldots, y^{(n)}(x)) = 0,                          (3.8)

typically needs n initial conditions,

    y(0), \quad y'(0), \quad \ldots, \quad y^{(n-1)}(0),                     (3.9)

to uniquely determine a solution.

[Margin note: ODEs of order 1 are of the form f(y, y') = 0. One can solve this
algebraic equation to the form y' = g(y), and then it is solved by integrating
both sides of g^{-1}(y)\, dy = dx.]

Here we will only consider linear ODEs of order 2:

    a(x)\, y''(x) + b(x)\, y'(x) + c(x)\, y(x) = 0.                          (3.10)

The linearity of the equation implies that if y_1(x) and y_2(x) are two (linearly
independent) solutions, then

    a_1 y_1(x) + a_2 y_2(x)                                                  (3.11)

is also a solution. The fact that this ODE is of order 2 implies that there are
only two independent parameters in the general solution, and so this is already
the most general solution. [Margin note: Later we will also consider inhomogeneous
equations, where the 0 on the RHS is replaced by a given function d(x).]
There are several elementary ODEs that every physicist is expected to be
able to solve.

    \frac{d^2}{dx^2} y + b \frac{d}{dx} y + c\, y = 0.                       (3.12)

This equation has translation symmetry and one should try the ansatz
[Margin note: Recall Fourier transform and Fourier series.]

    y(x) = e^{\lambda x}, \quad \lambda \in C.                               (3.13)

This includes all the following possibilities: \sin(kx), \cos(kx),
\sin(kx)\, e^{\lambda x}, etc.

    a x^2 \frac{d^2}{dx^2} y + b x \frac{d}{dx} y + c\, y = 0.               (3.14)

This equation has scaling symmetry and one should try the ansatz

    y(x) = x^\lambda, \quad \lambda \in C.                                   (3.15)

You are also expected to be able to solve those that can be put into the
forms above via simple changes of variables.

3.4 Space of Functions and Differential Operators

A space of functions as a linear space.

Differential operators as linear operators.

The first thing to note about a diff. eq. of the form

    D\phi = \rho                                                             (3.16)

is that this equation is formally the same as an equation in linear algebra, with
D = matrix (linear map), and \phi and \rho being vectors. [Margin note: Apart
from the linear-space structure, the space of functions is also equipped with the
structure of an algebra, i.e., that of multiplication: f_1(x) f_2(x) = f_3(x).
However, the latter is not relevant to the problem of solving diff. eqs.]
We will use kets to represent elements in a vector space V, so we rewrite
(3.16) as

    D|\phi\rangle = |\rho\rangle.                                            (3.17)

The space of functions is a linear space.
The differential operator D acts on a function to give another function,
and its action is linear:

D(a|f1 i + b|f2 i) = aD|f1 i + bD|f2 i. (3.18)

Thus D is a linear map acting on the linear space of functions. This is the
most salient feature of linear diff. eqs, and we will see that it is useful to view
it as a problem in linear algebra.
[Margin quote: Feynman: "The same equations have the same solutions."]

3.4.1 Functions on a Lattice
In a numerical analysis of an ODE using computer software, the continuous
domain is often approximated by a lattice. The diff. op. becomes a difference
operator. |\phi\rangle and |\rho\rangle become columns with a finite number of
elements. One can imagine that the original problem is the continuum limit of
this problem of linear algebra when the number of lattice sites goes to infinity.
The linear space of functions on a lattice has the natural basis in which each
basis vector |en i is the function which is 1 at the n-th point and 0 everywhere
else. A function can be expanded in this basis as |f i = |en ifn , where fn is the
value of the function at the n-th point. (The continuum limit of fn is f (x).)
We have the following correspondence:

    n \leftrightarrow x, \qquad f_n \leftrightarrow f(x), \qquad
    \sum_n \leftrightarrow \int dx, \qquad
    \delta_{mn} \leftrightarrow \delta(x - x'),                              (3.19)
    |f\rangle = \sum_n f_n |e_n\rangle \;\leftrightarrow\;
    f(x) = \int dx'\, f(x')\, \delta(x - x'),                                (3.20)
    \langle f|g\rangle = \sum_n f_n^* g_n \;\leftrightarrow\;
    \int dx\, f^*(x)\, g(x),                                                 (3.21)
    |f\rangle\langle g| = \sum_{mn} f_m g_n^* |e_m\rangle\langle e_n|
    \;\leftrightarrow\; f(x)\, g^*(x').                                      (3.22)

3.4.2 Change of Basis

One can also choose a different basis for the linear space V, related to the
previous basis by a linear map M: |e_n\rangle = |E_a\rangle M_{an}. In the new
basis, a function is |f\rangle = |E_a\rangle F_a, with F_a = M_{an} f_n. In the
continuum limit (n \to x, a \to k), this is

    F(k) = \int dx\, u(k, x)\, f(x).                                         (3.23)

[Margin note: This is the Fourier transform if u(k, x) \propto e^{-ikx}.]

Thus, functions do not have to be represented as f(x) (in terms of the basis
\delta(x - x')). They are vectors in a linear space, and how they look depends on
which basis you choose. (The linear space of fxs is infinite dimensional; we
will worry about convergence later.)

3.4.3 Eigenfunctions
Recall that the complete set of eigenvectors of a Hermitian matrix M consti-
tutes a basis of the linear space on which M acts. Recall also that one can
always choose this basis to be orthonormal.
Understanding that a diff. eq. is a problem in linear algebra, we can
apply techniques from linear algebra. If D is Hermitian, it is associated with a
convenient basis of the linear space V, i.e., its eigenvectors:

    D|e_n\rangle = \lambda_n |e_n\rangle.                                    (3.24)

The number of eigenvectors equals the dimension of the linear space V.

Any two eigenvectors with different eigenvalues must be orthogonal, because

    0 = \langle e_m|D|e_n\rangle - (D^\dagger|e_m\rangle)^\dagger |e_n\rangle
      = (\lambda_n - \lambda_m^*)\, \langle e_m|e_n\rangle.                  (3.25)

For the subspace of V spanned by eigenvectors with the same eigenvalue, we
can always choose the eigenvectors to be orthonormal:

    \langle e_m|e_n\rangle = \delta_{mn},                                    (3.26)

via Gram-Schmidt. Then we have the identity

    \sum_n |e_n\rangle\langle e_n| = I,                                      (3.27)

where I is the identity operator.

[Margin note: M = |e_m\rangle M_{mn} \langle e_n| is a linear operator whose
action on a state is obvious from its notation. As a matrix its elements are
M_{mn}. One can think of |e_n\rangle as the basis of columns, and
\langle e_n| as the basis of rows.]

3.5 Gram-Schmidt Orthonormalization

Gram-Schmidt orthonormalization is a way to construct a complete
basis for a linear space. The space of functions is a linear space, so we can use
this method to construct a complete basis of functions.

As a linear space, the space of functions V is equipped with an inner product

    \langle\phi_1|\phi_2\rangle = \int dx\, \rho(x)\, \phi_1^*(x)\, \phi_2(x).   (3.28)

Starting with an element in V, we can normalize it and call it \phi_0(x). It can
be chosen as the first element of the complete basis. Then we pick arbitrarily
another function \phi_1(x) which is linearly independent of \phi_0(x). We can
project out the part of \phi_1(x) that is proportional to \phi_0(x),

    \phi_1(x) \to \phi_1' \equiv \phi_1 - \phi_0\, \langle\phi_0|\phi_1\rangle,   (3.29)

so that it is now perpendicular to \phi_0: \langle\phi_0|\phi_1'\rangle = 0. We
normalize it and rename it as \phi_1(x), and include it in the orthonormal basis.
Then we pick a third function \phi_2(x) which is linearly independent of
\{\phi_0, \phi_1\}, that is,

    a\phi_0(x) + b\phi_1(x) + c\phi_2(x) = 0 \quad \forall x                 (3.30)

only if a = b = c = 0. Again we project out the components of \phi_2(x) along
\phi_0 and \phi_1:

    \phi_2(x) \to \phi_2' \equiv \phi_2 - \phi_0\langle\phi_0|\phi_2\rangle
      - \phi_1\langle\phi_1|\phi_2\rangle.                                   (3.31)

You can check that \phi_2' is perpendicular to both \phi_0 and \phi_1:

    \langle\phi_0|\phi_2'\rangle = \langle\phi_1|\phi_2'\rangle = 0.         (3.32)

Now we normalize \phi_2' and rename it as \phi_2. The first 3 elements of the
complete basis are found.
The process of finding the (n+1)-th basis element goes on like this. Find
a function \phi_n that is linearly independent of \{\phi_0, \ldots, \phi_{n-1}\}.
Define

    \phi_n' \equiv \phi_n - \sum_{k=0}^{n-1} \phi_k\, \langle\phi_k|\phi_n\rangle.   (3.33)

It is orthogonal to every element already in the basis \{\phi_0, \ldots, \phi_{n-1}\}.
We can normalize it, rename it as \phi_n(x), and include it in the basis.
The construction guarantees that the set {0 , 1 , } is orthonormal. Usu-
ally we also adopt a systematic way to pick the n-th function so that the
completeness of the basis is more manifest. For instance, often we choose
0 to be constant and n to be a polynomial of order n.
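The procedure above can be sketched numerically: orthonormalize the monomials $1, x, x^2, \ldots$ on $[-1, 1]$ with weight $\rho(x) = 1$. Up to normalization this reproduces the Legendre polynomials of chapter 4. The discretized inner product below is an illustrative choice.

```python
import numpy as np

x = np.linspace(-1, 1, 20001)
dx = x[1] - x[0]
inner = lambda f, g: np.sum(f * g) * dx      # <f|g> with rho(x) = 1

basis = []
for n in range(4):
    phi = x**n                               # pick the next monomial
    for e in basis:                          # project out earlier elements, (3.33)
        phi = phi - e * inner(e, phi)
    basis.append(phi / np.sqrt(inner(phi, phi)))

# basis[2] should be the normalized P_2 = (3x^2 - 1)/2 (norm^2 = 2/5)
P2_norm = (3 * x**2 - 1) / 2 / np.sqrt(2 / 5)
assert np.max(np.abs(basis[2] - P2_norm)) < 1e-2
assert abs(inner(basis[1], basis[3])) < 1e-8   # orthogonality by construction
```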

3.6 Sturm-Liouville Differential Operator

For the inner product defined by

    \langle f|g\rangle = \int dx\, \rho(x)\, f^*(x)\, g(x),                  (3.34)

where \rho(x) is called the weight function, a generic 2nd order ordinary differ-
ential operator is Hermitian if it can be written in the form

    D = \frac{1}{\rho(x)}\left[ \frac{d}{dx}\, p(x)\, \frac{d}{dx}
        + \frac{i}{2}\left( \frac{d}{dx}\, r(x) + r(x)\, \frac{d}{dx} \right)
        + q(x) \right],                                                      (3.35)

with real functions p(x), r(x), q(x), assuming suitable BCs. [Margin note: What
are the assumptions on the BCs and on p(x), r(x)?] This expression can be better
understood by recalling the identity (MN)^\dagger = N^\dagger M^\dagger. If V is
a space of real fxs, or fxs with Neumann BC, we need r(x) = 0, and D is called
the Sturm-Liouville differential operator.
Eigenvectors \phi_n of a Sturm-Liouville operator

    D = \frac{1}{\rho(x)}\left[ \frac{d}{dx}\, p(x)\, \frac{d}{dx} + q(x) \right]   (3.36)

constitute a complete basis of the linear space of fxs V (on which D is self-
adjoint), assuming a suitable choice of p(x), q(x), \rho(x) as well as BCs. (We
don't need to consider eigenfxs which do not belong to V.) It is complete in
the sense that any well-behaved function F (piecewise continuous with a finite
number of finite discontinuities) can be approximated to arbitrary accuracy
by a series \sum_n a_n \phi_n. That is,

    \lim_{m\to\infty} \int_{x_0}^{x_1} dx\, \rho(x)
      \left| F(x) - \sum_{n=0}^{m} a_n \phi_n(x) \right|^2 = 0.              (3.37)

For example, the periodic step function f(x) = \theta(x) on (-\pi, \pi) equals the
series

    f(x) = \frac{1}{2} + \frac{2}{\pi} \sum_{n=0}^{\infty}
           \frac{\sin((2n+1)x)}{2n+1}                                        (3.38)

up to points at the discontinuities.
For an orthonormal basis \phi_n, i.e., \langle\phi_m|\phi_n\rangle = \delta_{mn},
the coefficients a_n are

    a_n = \langle\phi_n|F\rangle \equiv \int dx\, \rho(x)\, \phi_n^*(x)\, F(x).   (3.39)

[Margin note: Recall that \sum_n |\phi_n\rangle\langle\phi_n| = 1. This can also
be written as \sum_n \phi_n(x)\, \phi_n^*(y) = \rho(x)^{-1}\, \delta(x - y),
where the Dirac delta fx is defined by \int dy\, \delta(x - y)\, f(y) = f(x) for
any well-behaved fx.]

To be more rigorous, one should make some assumptions about the Sturm-
Liouville operator. (Discussion of the properties of eigenfxs/eigenvalues of
Sturm-Liouville ops is called Sturm-Liouville theory.) A simple example
is to assume that p(x) > 0 and \rho(x) > 0, that p(x) is differentiable while
\rho(x) and q(x) are continuous, and that the space of fxs V is defined as the
set of differentiable, square-integrable fxs living on a finite interval [a, b],
either with homogeneous BCs of the form \alpha\psi(a) + \alpha'\psi'(a) = 0 and
\beta\psi(b) + \beta'\psi'(b) = 0, or with periodic BCs. [Margin note: For a
proof of the Sturm-Liouville theorem, see W. We will mention the proof in Appl.
Math. IV, after Green's fxs and calculus of variations are introduced.] In this
case it can be proven that the eigenvalues \lambda_i are real and discrete, and
that the ordered set \lambda_1 < \lambda_2 < \cdots is unbounded on one side.
Furthermore, the number of zeros of the eigenfx \phi_i is larger for those with
larger eigenvalues.

On the other hand, the nice properties (e.g. completeness of eigenfxs) are
not limited to those satisfying all the conditions above. If the space on which
fxs in V are defined is not compact, the index n is a continuous parameter and
the sum \sum_n should be replaced by an integral. The Kronecker delta above
shall be replaced by the Dirac delta fx.

3.7 Examples

Here we construct a complete basis for the op. \frac{d^2}{dx^2} for various
boundary conditions on the interval [0, 1].

3.7.1 Periodic BC
For the periodic BC

    f(1) = f(0), \quad f'(1) = f'(0),                                        (3.40)

the answer is the Fourier series

    f(x) = \sum_{n=0}^{\infty} A_n \cos(2\pi n x)
         + \sum_{n=1}^{\infty} B_n \sin(2\pi n x).                           (3.41)

The eigenfxs of \partial_x^2 have eigenvalues -(2\pi n)^2.

3.7.2 Dirichlet BC
For the Dirichlet BC

    f(0) = f(1) = 0,                                                         (3.42)

the answer is

    f(x) = \sum_{n=1}^{\infty} A_n \sin(n\pi x).                             (3.43)

The eigenvalues are -(n\pi)^2.

3.7.3 Neumann BC
For the Neumann BC

    f'(0) = f'(1) = 0,                                                       (3.44)

the answer is

    f(x) = \sum_{n=0}^{\infty} A_n \cos(n\pi x).                             (3.45)

The eigenvalues are -(n\pi)^2.
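A quick numerical sketch of the Dirichlet case: expand a function satisfying $f(0) = f(1) = 0$ in the eigenfunctions $\sin(n\pi x)$ and check that the series converges to $f$. The test function $x(1-x)$ is an arbitrary smooth choice.

```python
import numpy as np

x = np.linspace(0, 1, 2001)
dx = x[1] - x[0]
f = x * (1 - x)                                  # obeys f(0) = f(1) = 0

approx = np.zeros_like(x)
for n in range(1, 60):
    phi = np.sqrt(2) * np.sin(n * np.pi * x)     # orthonormal on [0, 1]
    a_n = np.sum(phi * f) * dx                   # a_n = <phi_n | f>, cf. (3.39)
    approx += a_n * phi

assert np.max(np.abs(approx - f)) < 1e-3
```

The coefficients here fall off like $1/n^3$, so about sixty terms already reproduce the function to better than one part in a thousand everywhere.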


3.8 Inequalities

3.8.1 Bessel's Inequality

For a positive definite inner product, \langle f|f\rangle \geq 0. If
|f\rangle = \sum_n a_n |\phi_n\rangle for an orthonormal basis
\{|\phi_n\rangle\}, then

    \langle f|f\rangle \geq {\sum_n}' |a_n|^2

for a sum \sum' over any subset of the basis. When the sum runs over the full
basis, the inequality is saturated; this equality of the inner product in
different representations is called the Parseval relation.

3.8.2 Schwarz Inequality

The Schwarz inequality is simply

    \|f\|^2\, \|g\|^2 \geq |\langle f|g\rangle|^2,                           (3.46)

where \|f\|^2 \equiv \langle f|f\rangle. This is as obvious as |\cos\theta| \leq 1.

3.9 Theorems
For the Sturm-Liouville operator

    D = \frac{1}{\rho(x)}\left[ -\frac{d}{dx}\, p(x)\, \frac{d}{dx} + q(x) \right],   (3.47)

where \rho(x) and q(x) are continuous, p(x) is continuously differentiable,
\rho(x) and p(x) are positive definite, and q(x) is non-negative on the range
\alpha < x < \beta, we have:

1. Let the eigenvalues be ordered

    \lambda_0 \leq \lambda_1 \leq \lambda_2 \leq \cdots,                     (3.48)

   then \lambda_0 \geq 0 and \lim_{n\to\infty} \lambda_n = \infty.

2. \phi_n(x) (the eigenfunction with eigenvalue \lambda_n) has n zeros in
   (\alpha, \beta).

3. Between any two consecutive zeros of \phi_n(x), there must be at least one
   zero of \phi_m(x) if m > n.

4. If we increase p or q, or decrease \rho, or reduce the range (\alpha, \beta),
   all eigenvalues increase.

3.10 Exercises:
1. The ODE

    \left[ P(x) \frac{d^2}{dx^2} + Q(x) \frac{d}{dx} + R(x)
      - \lambda S(x) \right] \psi(x) = 0

   can be viewed as an eigenvalue problem for a Sturm-Liouville op. D.
   What are the fxs \rho(x), p(x), q(x) defining D?

   [Margin note: The eigenvalue problem of an op. D is to look for solutions
   \psi(x) of the eq. (D - \lambda)\psi(x) = 0 for \lambda \in R or C. The set
   of values of \lambda is called the spectrum.]

2. For the eigenvalue problem

    y''(x) + \lambda y(x) = 0, \quad y(0) = y(L) = 0,                        (3.49)

   verify the 4 properties in sec. 3.9.

3. Find the complete set of eigenfxs and their eigenvalues for the operator
   \frac{d^2}{dx^2} with the following BCs:

   (a) f(0) = 0, f'(1) = 0.

   (b) f(0) + 2f'(0) = 0, f(1) = 0.

   Find the coefficients f_n to expand a generic function f(x) in terms of
   the eigenfunctions.

4. Find the complete set of eigenfxs for the eigenvalue problem

    (x+1)^2\, y''(x) + \lambda\, y(x) = 0                                    (3.50)

   with the boundary condition

    y(0) = y(1) = 0.                                                         (3.51)

   Find the coefficients f_n to expand a generic function f(x) in terms of
   the eigenfunctions.

5. Use the Schwarz inequality to prove the uncertainty relation

    \Delta A\, \Delta B \geq \frac{1}{2}.                                    (3.52)

   It holds for any state |\psi\rangle if A and B are Hermitian ops satisfying
   the commutation relation

    [A, B] \equiv AB - BA = i.                                               (3.53)

   Here \Delta A^2 is defined by \langle\psi|(A - \bar A)^2|\psi\rangle with
   \bar A \equiv \langle\psi|A|\psi\rangle.

   [Margin hint for Ex. 5: 1. (A - \bar A) and (B - \bar B) are both Hermitian
   and satisfy the same commutation relation. 2. AB = \frac{1}{2}(AB + BA)
   + \frac{1}{2}(AB - BA). Note that the 1st term on the RHS is Hermitian and
   the 2nd anti-Hermitian.]

Chapter 4

Special Functions

4.1 Introduction
Some functions are special and arise naturally in elementary problems. Here
are a few possible reasons why a function may be special.

- It arises as part of the eigenfxs of the Laplace op.:

    \nabla^2 \phi + \lambda \phi = 0.                                        (4.1)

  The Laplace op. in flat space, \nabla^2 = \sum_i \partial_i^2, appears in
  almost every elementary problem in physics (wave eq., diffusion eq.,
  Schrodinger eq., etc.). In Cartesian coordinates, e^{i\vec k\cdot\vec x} is
  special. (And hence sin, cos are special.) In spherical coordinates, Legendre
  polynomials are special.

  [Margin note: More generally, variations of this eq., say,
  (\nabla^2 - V(\vec r))\phi + \lambda\phi = 0 for certain V's, and for curved
  spaces that are important for physicists/mathematicians, can also lead to the
  study of special functions.]

- It has a geometrical meaning.

- It has some interesting algebraic properties.

- It forms part of a complete basis for a certain space of functions.

[Margin note: We will not expect anyone to memorize or to be able to derive all
the equations listed below. The purpose of listing all these equations is to
give you an idea of what kind of identities exist for a typical special
function. In the future, when you need to use these properties of a certain
special function, you will not panic, and you will know what kind of tools you
may have to solve the problem at hand.]

4.2 Legendre Polynomials

Orthogonality:

    \int_{-1}^{1} dx\, P_m(x)\, P_n(x) = \frac{2}{2n+1}\, \delta_{mn}.       (4.2)

Examples:

    P_0 = 1,                                                                 (4.3)
    P_1 = x,                                                                 (4.4)
    P_2 = \frac{1}{2}(3x^2 - 1),                                             (4.5)
    P_3 = \frac{1}{2}(5x^3 - 3x).                                            (4.6)

[Margin note: Boundary condition: P_n(1) = 1 \ \forall n.]

General formula:

    P_n(x) = \frac{1}{2^n n!}\, \frac{d^n}{dx^n}(x^2 - 1)^n.                 (4.7)

Generating function:

    g(t, x) = \sum_{n=0}^{\infty} P_n(x)\, t^n = \frac{1}{\sqrt{1 - 2xt + t^2}}.   (4.8)

Recurrence relations:

    (n+1)\, P_{n+1}(x) - (2n+1)\, x\, P_n(x) + n\, P_{n-1}(x) = 0,           (4.9)
    (1 - x^2)\, P_n'(x) = -n x\, P_n(x) + n\, P_{n-1}(x).                    (4.10)

Differential equation:

    (1 - x^2)\, y'' - 2x\, y' + n(n+1)\, y = 0.                              (4.11)
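The orthogonality relation (4.2) and the recurrence (4.9) can be checked with numpy's Legendre module. The degrees and grids below are arbitrary illustrative choices.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def P(n, xs):
    """Evaluate the Legendre polynomial P_n at the points xs."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return leg.legval(xs, c)

# Orthogonality (4.2), using Gauss-Legendre quadrature (exact here)
x, w = leg.leggauss(50)
assert abs(np.sum(w * P(3, x) * P(3, x)) - 2 / (2 * 3 + 1)) < 1e-12
assert abs(np.sum(w * P(3, x) * P(5, x))) < 1e-12

# Recurrence (4.9) with n = 4: 5 P_5 - 9 x P_4 + 4 P_3 = 0
t = np.linspace(-1, 1, 101)
res = 5 * P(5, t) - 9 * t * P(4, t) + 4 * P(3, t)
assert np.max(np.abs(res)) < 1e-12
```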

4.3 Hermite Polynomials

Orthogonality:

    \int_{-\infty}^{\infty} dx\, e^{-x^2}\, H_m(x)\, H_n(x)
      = 2^n\, n!\, \sqrt{\pi}\, \delta_{mn}.                                 (4.12)

[Margin note: The coefficient of the x^n term in H_n is 2^n.]

Examples:

    H_0 = 1,                                                                 (4.13)
    H_1 = 2x,                                                                (4.14)
    H_2 = 4x^2 - 2,                                                          (4.15)
    H_3 = 8x^3 - 12x,                                                        (4.16)

    H_n(-x) = (-1)^n\, H_n(x).                                               (4.17)

General formula:

    H_n(x) = (-1)^n\, e^{x^2}\, \frac{d^n}{dx^n}\, e^{-x^2}.                 (4.18)

Generating function:

    e^{2xt - t^2} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, H_n(x).             (4.19)

Recurrence relations:

    H_{n+1} = 2x\, H_n - 2n\, H_{n-1},                                       (4.20)
    H_n'(x) = 2n\, H_{n-1}(x).                                               (4.21)

Differential equation:

    y'' - 2x\, y' + 2n\, y = 0.                                              (4.22)
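A quick check of the Hermite data: build $H_n$ from the recurrence (4.20), compare with the explicit $H_2$, $H_3$ above, and verify the orthogonality normalization (4.12) with Gauss-Hermite quadrature. Degrees and grids are arbitrary illustrative choices.

```python
import numpy as np

# Build H_0 .. H_3 from the recurrence H_{n+1} = 2x H_n - 2n H_{n-1}
t = np.linspace(-2, 2, 101)
H = [np.ones_like(t), 2 * t]
for n in range(1, 3):
    H.append(2 * t * H[n] - 2 * n * H[n - 1])

assert np.max(np.abs(H[2] - (4 * t**2 - 2))) < 1e-10     # (4.15)
assert np.max(np.abs(H[3] - (8 * t**3 - 12 * t))) < 1e-10  # (4.16)

# Orthogonality (4.12): integral e^{-x^2} H_2^2 dx = 2^2 * 2! * sqrt(pi)
x, w = np.polynomial.hermite.hermgauss(40)   # weight e^{-x^2} built in
h2 = 4 * x**2 - 2
assert abs(np.sum(w * h2 * h2) - 2**2 * 2 * np.sqrt(np.pi)) < 1e-8
```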


4.4 Laguerre Polynomials

Orthogonality:

    \int_0^{\infty} dx\, e^{-x}\, L_m(x)\, L_n(x) = \delta_{mn}.             (4.23)

[Margin note: Boundary condition: L_n(0) = 1 \ \forall n.]

Examples:

    L_0 = 1,                                                                 (4.24)
    L_1 = 1 - x,                                                             (4.25)
    L_2 = 1 - 2x + \frac{1}{2}x^2,                                           (4.26)
    L_3 = 1 - 3x + \frac{3}{2}x^2 - \frac{1}{6}x^3.                          (4.27)

General formula:

    L_n = \frac{e^x}{n!}\, \frac{d^n}{dx^n}\left( x^n e^{-x} \right).        (4.28)

Generating function:

    g(x, z) = \sum_{n=0}^{\infty} z^n L_n = \frac{e^{-xz/(1-z)}}{1 - z}.     (4.29)

Recurrence relations:

    (n+1)\, L_{n+1} = (2n+1-x)\, L_n - n\, L_{n-1},                          (4.30)
    x\, L_n'(x) = n\, L_n(x) - n\, L_{n-1}(x).                               (4.31)

Differential equation:

    x\, y'' + (1 - x)\, y' + n\, y = 0.                                      (4.32)

4.5 Bessel Functions

General formula:

    J_m(x) = \sum_{\ell=0}^{\infty}
      \frac{(-1)^\ell\, x^{2\ell+m}}{2^{2\ell+m}\, \ell!\, (m+\ell)!}.       (4.33)

Generating function:

    e^{\frac{x}{2}\left(t - \frac{1}{t}\right)}
      = \sum_{n=-\infty}^{\infty} t^n\, J_n(x).                              (4.34)

[Margin note: \int_0^\infty dx\, J_n(x) = 1. From the generating function we
also have e^{ix\cos\theta} = \sum_{n=-\infty}^{\infty} i^n e^{in\theta} J_n(x).]

Recurrence relation:

    \frac{d}{dx}\left( x^m J_m(x) \right) = x^m J_{m-1}(x).                  (4.35)

Differential equation:

    x^2\, y'' + x\, y' + (x^2 - m^2)\, y = 0.                                (4.36)

Other identities:

    J_{-m}(x) = (-1)^m\, J_m(x),                                             (4.37)
    J_m(x) \to \frac{(x/2)^m}{\Gamma(m+1)}, \quad x \to 0,                   (4.38)
    J_n(x + y) = \sum_{m=-\infty}^{\infty} J_m(x)\, J_{n-m}(y),              (4.39)
    J_n(x) = \frac{1}{\pi} \int_0^{\pi} d\theta\, \cos(x \sin\theta - n\theta),   (4.40)
    J_m(x) \to \sqrt{\frac{2}{\pi x}}\,
      \cos\!\left( x - \frac{m\pi}{2} - \frac{\pi}{4} \right), \quad x \to \infty.   (4.41)

More identities:

    \sum_{m=-\infty}^{\infty} J_m(x) = 1,                                    (4.42)
    \int_0^1 dx\, x\, J_k(z_{km} x)\, J_k(z_{kn} x)
      = \frac{1}{2}\, J_{k+1}^2(z_{km})\, \delta_{mn},                       (4.43)
    \int_0^\infty dr\, r\, J_m(kr)\, J_m(k'r) = \frac{1}{k}\, \delta(k - k'),   (4.44)

where z_{km} = the m-th zero of J_k(x).

The definition of the Bessel function J_n can be extended to the case when the index is real: J_\nu, \nu \in \mathbb{R}.
These functions J (x) are sometimes called Bessel functions of the first
kind. There are also Bessel functions of the second kind Y (x), which are also
called Neumann functions N (x). They can be defined by
N_\nu(x) = \frac{J_\nu(x) \cos(\nu\pi) - J_{-\nu}(x)}{\sin(\nu\pi)}.

This is ill-defined for \nu = integer. In that case we take the limit \nu \to n. N_\nu(x)
is the other independent solution of the same differential equation (4.36) with
m \to \nu. Hankel functions are just a change of basis

H_\nu^{(1)}(x) = J_\nu(x) + i N_\nu(x), \quad H_\nu^{(2)}(x) = J_\nu(x) - i N_\nu(x).   (4.45)

The description above allows the argument x of the Bessel function J_\nu(x) to
be complex. When it is purely imaginary, we get the modified Bessel functions

I_\nu(x) = i^{-\nu} J_\nu(ix), \quad K_\nu(x) = \frac{\pi}{2}\, i^{\nu+1} H_\nu^{(1)}(ix).   (4.46)
They satisfy the differential equation

x^2 y'' + x y' - (x^2 + \nu^2) y = 0.   (4.47)

4.6 Other Special Functions

In this section we briefly introduce the gamma function \Gamma(x), the beta function B(x, y),
and hypergeometric functions.

4.6.1 Gamma Function and Beta Function

The gamma function can be defined as
\Gamma(x) = \int_0^{\infty} dt\, t^{x-1} e^{-t}.   (4.48)

Using integration by parts, one can show from this that

\Gamma(x) = (x - 1)\,\Gamma(x - 1).   (4.49)

For a positive integer n, \Gamma(n) = (n - 1)!.

Another useful property is

\Gamma(x)\,\Gamma(-x) = -\frac{\pi}{x \sin(\pi x)}.   (4.50)

The beta function is defined by

B(x, y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x + y)}.   (4.51)
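These properties are easy to check numerically with the standard-library `math.gamma` (a quick sanity check, not part of the notes):

```python
import math

# Gamma(n) = (n-1)! for positive integers, and the recursion (4.49).
assert abs(math.gamma(5) - math.factorial(4)) < 1e-9
x = 1.3
assert abs(math.gamma(x) - (x - 1) * math.gamma(x - 1)) < 1e-9

# Identity (4.50): Gamma(x) Gamma(-x) = -pi / (x sin(pi x)).
y = 0.3
lhs = math.gamma(y) * math.gamma(-y)
rhs = -math.pi / (y * math.sin(math.pi * y))
assert abs(lhs - rhs) < 1e-9

# Beta function (4.51): B(1/2, 1/2) = Gamma(1/2)^2 / Gamma(1) = pi.
beta_half = math.gamma(0.5) ** 2 / math.gamma(1.0)
assert abs(beta_half - math.pi) < 1e-9
```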

4.6.2 Hypergeometric Function

Differential equation:

x(1 - x) y'' + [c - (a + b + 1)x] y' - ab\, y = 0.   (4.52)

A regular solution is

{}_2F_1(a, b; c; x) = 1 + \frac{ab}{1!\, c}\, x + \frac{a(a+1)\, b(b+1)}{2!\, c(c+1)}\, x^2 + \cdots.   (4.53)

Another independent solution is

x^{1-c}\, {}_2F_1(a + 1 - c,\, b + 1 - c;\, 2 - c;\, x).   (4.54)


\frac{d}{dx}\, {}_2F_1(a, b; c; x) = \frac{ab}{c}\, {}_2F_1(a + 1, b + 1; c + 1; x),   (4.55)

{}_2F_1(a, b; c; x) = \frac{\Gamma(c)}{\Gamma(b)\,\Gamma(c - b)} \int_0^1 dt\, \frac{t^{b-1}(1 - t)^{c-b-1}}{(1 - tx)^a}.   (4.56)

The generalized hypergeometric functions are

{}_pF_q\!\left[\begin{matrix} a_1, a_2, \cdots, a_p \\ b_1, b_2, \cdots, b_q \end{matrix};\; x\right] = \sum_{k=0}^{\infty} \frac{(a_1)_k (a_2)_k \cdots (a_p)_k}{(b_1)_k (b_2)_k \cdots (b_q)_k} \frac{x^k}{k!},   (4.57)

where the Pochhammer symbol is

(a)_k = \frac{\Gamma(a + k)}{\Gamma(a)} = a(a+1)(a+2) \cdots (a+k-1).   (4.58)
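For |x| < 1 the series (4.53) can be summed term by term; successive coefficients differ by the factor (a+k)(b+k)/((c+k)(k+1)). A minimal Python sketch (the function name `hyp2f1` and the test values are our own); it checks the known special case 2F1(1, 1; 2; x) = -ln(1-x)/x and the derivative identity (4.55) by a central difference:

```python
import math

def hyp2f1(a, b, c, x, terms=200):
    """Partial sum of the series (4.53); converges for |x| < 1."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * x
    return total

x = 0.4
# Special case: 2F1(1, 1; 2; x) = -ln(1 - x)/x.
assert abs(hyp2f1(1, 1, 2, x) + math.log(1 - x) / x) < 1e-12

# Derivative identity (4.55) vs a central difference (step h is our choice).
a, b, c, h = 1.2, 0.7, 1.5, 1e-6
num = (hyp2f1(a, b, c, x + h) - hyp2f1(a, b, c, x - h)) / (2 * h)
exact = a * b / c * hyp2f1(a + 1, b + 1, c + 1, x)
assert abs(num - exact) < 1e-5
```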

4.7 Exercises:
1. Expand the function

   f(x) = \begin{cases} +1, & 0 < x < 1 \\ -1, & -1 < x < 0 \end{cases}   (4.59)

   as an infinite series of Legendre polynomials P_n(x).

2. Evaluate the sum

   \sum_{n=0}^{\infty} x^{n+1} P_n(x).   (4.60)

   Hint: Use the generating function.

3. Use the Gram-Schmidt orthogonalization to work out the first few
   Hermite polynomials H_n(x) for n = 0, 1, 2, assuming that H_n(x) is a
   polynomial of order n of the form H_n(x) = 2^n x^n + \cdots. (The measure of the
   integral is e^{-x^2}.)

4. (Fourier-Bessel transform)
   Using (4.44), we define the Fourier-Bessel transform (or Hankel transform)

   f(r) = \int_0^{\infty} dk\, k\, J_n(kr) F(k), \quad F(k) = \int_0^{\infty} dr\, r\, J_n(kr) f(r).   (4.61)

   Find F(k) for f(r) = e^{-ar}/r.

5. (Spherical Bessel function)
   Try to solve the following differential equation

   x^2 y'' + 2x y' + (x^2 - n(n+1)) y = 0   (4.62)

   by using the ansatz y = x^\alpha J_\nu(x) and y = x^\alpha Y_\nu(x). Show that the result is

   j_n(x) = \sqrt{\frac{\pi}{2x}}\, J_{n+1/2}(x), \quad y_n(x) = \sqrt{\frac{\pi}{2x}}\, Y_{n+1/2}(x).   (4.63)
2x 2x
6. What linear homogeneous second-order differential equation has

   x^a J_n(b x^c)   (4.64)

as solutions? Give the general solution of

   y'' + x^2 y = 0.   (4.65)

7. Find an approximate expression for the Gamma function \Gamma(x) for large
   positive x.
Use Stirling's formula and