

Undergraduate Course in Partial Differential Equations

Artem Novozhilov
Department of Mathematics, North Dakota State University

April 29, 2016


Preface

These lecture notes were written in Spring 2016 when I taught an undergraduate course in PDE.
They include exactly the material I was able to cover during the usual three credit course. The
main idea was to keep the personal and informal spirit of the well known book by Stanley J. Farlow,
Partial Differential Equations for Scientists and Engineers and at the same time to add more rigor
and computations.
These lectures supplement the official course textbook by Peter Olver, Introduction to PDE. De-
spite the fact that the book is quite thick and very detailed, these lectures not very often follow the
exact details of Olver’s book. While different in many details and approaches, I generally keep the
same order of the material, since it looks to me to be very close to an ideal one for the first PDE
course.
There are very few problems in these notes, since I mostly used the problems from Olver’s book. I
do hope in the future to supplement each section with interesting and non-trivial problems that would
expand the material in these notes.

Artem Novozhilov1
1 May 2016

1 I can be contacted by e-mail: artem.novozhilov@ndsu.edu

Contents

0.1 What are PDE? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4


0.1.1 Basic definitions and the general philosophy of the course . . . . . . . . . . . . 4
0.1.2 More examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1 Waves. Transport and wave equations 9


1.1 The linear transport equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.1.1 Derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.1.2 Solving the one-dimensional transport equation. Traveling waves . . . . . . . . 10
1.1.3 Well posed problems in PDE theory . . . . . . . . . . . . . . . . . . . . . . . . 15
1.2 Solving first order linear PDE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3 Glimpse of the nonlinear world. Hopf’s equation . . . . . . . . . . . . . . . . . . . . . 22
1.4 Deriving the wave equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.5 Solving the wave equation for the infinite string . . . . . . . . . . . . . . . . . . . . . . 27
1.5.1 The general solution to the wave equation . . . . . . . . . . . . . . . . . . . . . 28
1.5.2 d’Alembert’s formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.5.3 Two examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.6 Solving the wave equation. Some extensions . . . . . . . . . . . . . . . . . . . . . . . . 34
1.6.1 Linear versus nonlinear equations . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.6.2 Solving an inhomogeneous wave equation. Duhamel’s principle . . . . . . . . . 35
1.7 More on the wave equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.7.1 Half-infinite string . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.7.2 The initial value problem for the wave equation is well posed . . . . . . . . . . 38

2 Fourier method 42
2.1 The heat or diffusion equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.1.1 Conservation of energy plus Fourier’s law imply the heat equation . . . . . . . 42
2.1.2 Initial and boundary conditions for the heat equation . . . . . . . . . . . . . . 44
2.1.3 A microscopic derivation of the diffusion equation . . . . . . . . . . . . . . . . 45
2.2 Motivation for Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.3 Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.3.1 Formulas for the coefficients of (2.10) . . . . . . . . . . . . . . . . . . . . . . . 54
2.3.2 On the convergence of (2.10) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.3.3 Sine and cosine series. Odd and even extensions . . . . . . . . . . . . . . . . . 59
2.3.4 Differentiating Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.3.5 Final remarks and generalizations . . . . . . . . . . . . . . . . . . . . . . . . . 62

2.4 Fourier method for the heat equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.4.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
2.5 Sturm–Liouville problems. Eigenvalues and eigenfunctions . . . . . . . . . . . . . . . . 72
2.6 Solving the wave equation by Fourier method . . . . . . . . . . . . . . . . . . . . . . . 75
2.7 Solving the Laplace equation by Fourier method . . . . . . . . . . . . . . . . . . . . . 81

3 Delta function. Green’s functions. Fourier transform 88


3.1 Delta function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.2 Green’s function for a second order ODE . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.3 Green’s function for the Poisson equation . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.3.1 Fundamental solution to the Laplace equation . . . . . . . . . . . . . . . . . . . 95
3.3.2 Green’s function for a disk by the method of images . . . . . . . . . . . . . . . 97
3.4 Fourier transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
3.4.1 A first look at the Fourier transform . . . . . . . . . . . . . . . . . . . . . . . . 98
3.4.2 Properties of the Fourier transform. Convolution . . . . . . . . . . . . . . . . . 102
3.5 Applications of Fourier transform to differential equations . . . . . . . . . . . . . . . . 104
3.5.1 Space-free Green’s function for ODE . . . . . . . . . . . . . . . . . . . . . . . . 104
3.5.2 General solution to the wave equation . . . . . . . . . . . . . . . . . . . . . . . 105
3.5.3 Laplace’s equation in a half-plane . . . . . . . . . . . . . . . . . . . . . . . . . . 106
3.5.4 Fundamental solution to the heat equation . . . . . . . . . . . . . . . . . . . . 106
3.5.5 Duhamel’s principle revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
3.6 Telegrapher’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

4 Heat and wave equations in two (and three) dimensions 112


4.1 Heat and wave equations on the plane. Separation of variables revisited . . . . . . . . 112
4.1.1 Solving the Dirichlet problem for the heat equation in a rectangle . . . . . . . . 112
4.1.2 Problem for a general domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.1.3 The planar heat equation on a disc . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.2 Elements of analytic ODE theory. Bessel’s functions . . . . . . . . . . . . . . . . . . . 117
4.2.1 Elements of analytic ODE theory . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.2.2 Solving Bessel’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
4.2.3 Some facts about solutions to Bessel’s equation . . . . . . . . . . . . . . . . . . 121
4.2.4 Summary about Bessel’s functions . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.3 Solving planar heat and wave equations in polar coordinates . . . . . . . . . . . . . . . 124
4.3.1 Heat equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.3.2 Wave equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.4 Solving the wave equation in 2D and 3D space . . . . . . . . . . . . . . . . . . . . . . 133

0.1 What are PDE?
0.1.1 Basic definitions and the general philosophy of the course
Since the main prerequisite for this course is a basic course on Ordinary Differential Equations (ODE),
and everyone in class is accustomed to the idea of solving an equation where the unknown is some
function, I start directly with
Definition 0.1. A partial differential equation (abbreviated in the following as PDE in both singular
and plural usage) is an equation for an unknown function of two or more independent variables that
involves partial derivatives.
Since there is some vagueness in the given definition, I can give a mathematically more satisfactory
definition as
Definition 0.2. A PDE is an equation of the form
F (x, y, . . . , u, ux , uy , . . . , uxx , uxy , uyy , . . .) = 0
for the given function F and the unknown function u.
In Definition 0.2 I used the notation

u_x = \frac{\partial u}{\partial x}, \qquad u_{xx} = \frac{\partial^2 u}{\partial x^2}, \ldots

for the partial derivatives. Sometimes other notations are used, in particular

\partial_x u = \frac{\partial u}{\partial x}, \qquad \partial_{xx} u = \frac{\partial^2 u}{\partial x^2}, \ldots
but I will usually stick to the notation with subscripts.
Definition 0.3. The order of PDE is the order of the highest derivative.
Example 0.4. Here is an example of a second order PDE:
ut = uxx + uyy + u,
where, as should be clear from the equation itself, the unknown function u is a function of three
independent variables (t, x, y). In the following I will save variable t to denote almost exclusively time
and x, y, z to denote the Cartesian coordinates.
It is nice to have the general and mathematically rigorous Definition 0.2; however, already at this
point I would like to state, in a slightly incorrect and provocative form, that
There exists no general mathematical theory of partial differential equations.
Moreover, the historical trend for studying various problems involving PDE shows that particular
specific examples of PDE, motivated by physical (geometrical, biological, etc) situations, are the
driving force for the development of abstract mathematical theories, and this is how I would like to
proceed in my course: From specific examples to necessary mathematical tools to the properties of
the solutions. There are a lot of good reasons for picking such a modus operandi, but instead of giving my
own arguments I will present two quotations.
The first one is from the Preface to the first volume of Methods of mathematical physics by Courant
and Hilbert:

Since the seventeenth century, physical intuition has served as a vital source for math-
ematical problems and methods. Recent trends and fashions have, however, weakened
the connection between mathematics and physics; mathematicians, turning away from the
roots of mathematics in intuition, have concentrated on refinement and emphasized the
postulational side of mathematics, and at times have overlooked the unity of their sci-
ence with physics and other fields. In many cases, physicists have ceased to appreciate
the attitudes of mathematicians. This rift is unquestionably a serious threat to science
as a whole; the broad stream of scientific development may split into smaller and smaller
rivulets and dry out. It seems therefore important to direct our efforts toward reuniting
divergent trends by clarifying the common features and interconnections of many distinct
and diverse scientific facts. Only thus can the student attain some mastery of the material
and the basis be prepared for further organic development of research.
The second quotation is from the Preface to Lectures on Partial Differential Equations by Vladimir
Arnold:
In the mid-twentieth century the theory of partial differential equations was considered the
summit of mathematics, both because of the difficulty and significance of the problems it
solved and because it came into existence later than most areas of mathematics.
Nowadays many are inclined to look disparagingly at this remarkable area of mathematics
as an old-fashioned art of juggling inequalities or as a testing ground for applications of
functional analysis. Courses in this subject have even disappeared from the obligatory
program of many universities [. . .] The cause of this degeneration of an important general
mathematical theory into an endless stream of papers bearing titles like “On a property
of a solution of a boundary-value problem for an equation” is most likely the attempt to
create a unified, all-encompassing, superabstract “theory of everything.”
The principal source of partial differential equations is found in the continuous-medium
models of mathematical and theoretical physics. Attempts to extend the remarkable
achievements of mathematical physics to systems that match its models only formally
lead to complicated theories that are difficult to visualize as a whole [. . .]
At the same time, general physical principles and also general concepts such as energy, the
variational principle, Huygens’ principle, the Lagrangian, the Legendre transformation, the
Hamiltonian, eigenvalues and eigenfunctions, wave-particle duality, dispersion relations,
and fundamental solutions interact elegantly in numerous highly important problems of
mathematical physics. The study of these problems motivated the development of large
areas of mathematics such as the theory of Fourier series and integrals, functional analysis,
algebraic geometry, symplectic and contact topology, the theory of asymptotics of inte-
grals, microlocal analysis, the index theory of (pseudo-)differential operators, and so forth.
Familiarity with these fundamental mathematical ideas is, in my view, absolutely essential
for every working mathematician. The exclusion of them from the university mathemat-
ical curriculum, which has occurred and continues to occur in many Western universities
under the influence of the axiomaticist/scholastics (who know nothing about applications
and have no desire to know anything except the “abstract nonsense” of the algebraists)
seems to me to be an extremely dangerous consequence of Bourbakization of both mathe-
matics and its teaching. The effort to destroy this unnecessary scholastic pseudoscience is a

natural and proper reaction of society (including scientific society) to the irresponsible and
self-destructive aggressiveness of the “superpure” mathematicians educated in the spirit of
Hardy and Bourbaki.

Following the spirit of these two quotations (one from 1924 and the other from 2004) I will try to
use the physical intuition and concentrate on the specific examples rather than on the general theory
as much as possible.

Example 0.5. I assume that the function u of two independent variables (x, y) satisfies the PDE

ux = 0.

How can I solve it? By a simple integration, of course:


u(x, y) = \int 0 \, dx = f(y),

where f is an arbitrary function of the variable y. Hence the first conclusion: while the general solution
to an ODE usually depends on arbitrary constants (the number of which coincides with the order
of the equation), for a PDE the general solution depends on arbitrary functions. This fact alone
should convince you of the greater complexity of PDE.

The next important (and very non-obvious) point here is whether I can take any function f for my
general solution. Jumping way ahead, I would like to state that “What does it mean to solve a PDE?”
is a very difficult question. This difficulty notwithstanding, most of the time we will be content to
live with a much easier specific concept, which is called the classical solution:

Definition 0.6. The function u : D −→ R is called a classical solution to a k-th order PDE if it
satisfies this equation at every point of its domain of definition and belongs to the set C (k) (D; R).

Recall that the notation C (k) (U ; V ) means the set of functions u : U −→ V all of whose partial
derivatives up to order k are continuous (it is said that the function u is k times continuously differentiable).
Therefore (returning to Example 0.5) my general solution u(x, y) = f (y) will be a classical solution
to ux = 0 only if f ∈ C (1) (R; R).

Exercise 0.1. Can you find a classical solution to

uxy = 0?

Example 0.7. The function

u(t, x) = t + \frac{1}{2} x^2
is a classical solution to
ut = uxx
because u ∈ C (2) (R2 ; R) and satisfies the equation (check it). Q: Can you come up with other classical
solutions to this equation?
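(A quick check, not part of the original notes: the minimal sympy sketch below verifies the heat-equation residual for this solution and for one more candidate of my own choosing, u = e^{t+x}.)

import sympy as sp

t, x = sp.symbols('t x')

for u in (t + x**2/2, sp.exp(t + x)):            # the solution above and one more candidate
    residual = sp.diff(u, t) - sp.diff(u, x, 2)  # heat-equation residual u_t - u_xx
    print(u, '->', sp.simplify(residual))        # both print 0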

Four basic equations we will be studying in this course are:

• One-dimensional transport equation:
ut + cux = 0.

• Wave equation:
utt = ∆u.
Recall that ∆ is called the Laplace operator and is given by

∆u = div grad u = ∇2 u,

here ∇ is the del operator, in this particular case ∇ = (∂x , ∂y , ∂z ). In particular in the Cartesian
coordinates it is written for u : R3 −→ R as

∆u = uxx + uyy + uzz ,

it should be clear how to write the Laplace operator for the functions defined on the plane R2
and on the line R.

• Heat or diffusion equation:
ut = ∆u.

• Laplace equation:
∆u = 0.

The first one is a linear first order equation and the other three are linear second order equations.
It is simply staggering how much modern mathematics was developed in the attempts to solve these
equations (or their close relatives). We will see only a tiny, but very important, part of this.

0.1.2 More examples


Here are some more examples of PDE that appear in various applications:
• Linear transport equation:

u_t + \sum_{i=1}^{k} b_i u_{x_i} = 0.

• Helmholtz’s equation:
∆u = λu, λ ∈ R.

• Schrödinger’s equation:
iut + ∆u = 0.
Here i is the imaginary unit, i2 = −1.

• Telegraph equation:
utt + 2dut − uxx = 0, d > 0.

• Beam equation:
utt + uxxxx = 0.
All the examples above are linear. Here are some nonlinear examples:

• Hopf ’s equation:
ut + uux = 0.
• The eikonal equation (from the Greek word for image):
(ux )2 + (uy )2 = 1.

• Hamilton–Jacobi equation:
ut + H(ux , x) = 0,
where H is a given nonlinear function, which is called the Hamiltonian.
• Korteweg–de Vries (KdV) equation:
ut + uux + uxxx = 0.

• Reaction–diffusion equation:
ut = f (u) + ∆u,
where f is a given nonlinear function.
It is also often necessary and important to study systems of PDE:
• Maxwell’s equations of classical electrodynamics:
E_t = curl B,
B_t = -curl E,
div B = div E = 0.
Here E = (E_1, E_2, E_3), B = (B_1, B_2, B_3),

div F = ∇ · F = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z},

and

curl F = ∇ × F = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial_x & \partial_y & \partial_z \\ F_1 & F_2 & F_3 \end{vmatrix}.

• Navier–Stokes equations of hydrodynamics:


ut + u · ∇u − ∆u = ∇p,
div u = 0,
where u = (u1 , u2 , u3 ). Here there are four nonlinear equations for four unknown functions
u1 , u2 , u3 , p. There are still a lot of open questions about this system, most famous of which is
the existence of global solutions. This is one of the six unsolved “million” millennium problems
by Clay Institute.
Exercise 0.2. Consider a convex closed curve in the plane with coordinates (x, y). Outside the region
bounded by the curve we consider function u whose value at each point is the distance from that point
to the curve. This function is smooth. Prove that this function satisfies the eikonal equation.
Exercise 0.3. Find all solutions u = f (r) of the two-dimensional Laplace equation uxx + uyy = 0
that depend only on the radial coordinate r = \sqrt{x^2 + y^2}.

Chapter 1

Waves. Transport and wave equations

1.1 The linear transport equation


1.1.1 Derivation
Let u denote a density (i.e., mass per volume unit) of some substance and I assume that the geometry
of the problem is well approximated by a line (think about some pollutant in a river or cars on a road)
with the coordinate x. Therefore I deal with a spatially one-dimensional problem, and my density
at the point x at time t is given by u(t, x). I assume additionally that the total amount of u stays
constant (no sources or sinks) and calculate the change of the substance in some arbitrary interval
(x1 , x2 ). First, at time t I will have that the total amount of u in (x1 , x2 ) is
\int_{x_1}^{x_2} u(t, x) \, dx,

and its change at time t is, assuming that u is smooth enough,


\frac{d}{dt} \int_{x_1}^{x_2} u(t, x) \, dx = \int_{x_1}^{x_2} u_t(t, x) \, dx.

At the same time the same change is given by

q(t, x1 ) − q(t, x2 ),

where q(t, x) is the flux of the same substance, i.e., the amount of the substance passing through the
point x per time unit. Basically, q(t, x1 ) tells me how much enters (x1 , x2 ) at time t at the left boundary and q(t, x2 )
tells me how much substance leaves (x1 , x2 ) at the right boundary, hence the choice of the signs. By
the conservation law
\int_{x_1}^{x_2} u_t(t, x) \, dx = q(t, x_1) - q(t, x_2) = -\int_{x_1}^{x_2} q_x(t, x) \, dx,

where the last equality holds by the fundamental theorem of calculus. Finally, since my interval is
arbitrary, I conclude that my substance must satisfy the equation

ut + qx = 0.

To arrive at the final equation I must decide on the connection between u and q. In the simplest case

q(t, x) = cu(t, x)

for some constant c I end up with the linear one-dimensional transport equation

ut + cux = 0. (1.1)

Keeping in mind the physical interpretation of (1.1), I will need also some initial condition

u(0, x) = g(x), (1.2)

for some given function g (initial density) and, if my system is not spatially infinite, some boundary
conditions (see below). At this point I assume that it is a good approximation to let x ∈ (−∞, ∞)
and hence there is no need for boundary conditions. My next goal is to show that I can always find a solution
to the problem (1.1)–(1.2) and, even more importantly, that this solution is physically relevant (the
problem is well posed ).

1.1.2 Solving the one-dimensional transport equation. Traveling waves


The textbook provides a lot of motivation and details; I will provide a very similar argument here.
The key point here is to realize that (1.1) actually gives a lot of geometric information about the
possible solution u which is a surface in the considered case of two variables (t, x). Note that the
gradient of u is given by
grad u = ∇u = (ut , ux ),
and hence equation (1.1) can be written equivalently as

γ · ∇u = 0, γ = (1, c).

Here the dot denotes the usual dot product between two vectors. We know that if the dot product is zero
then these two vectors are orthogonal:
γ ⊥ ∇u.
At the same time, the gradient of the function points in the direction of the fastest increase and is,
hence, perpendicular to the level sets u(t, x) = const, which are curves in the plane (t, x). Finally,
I conclude that γ is parallel to the level sets u(t, x) = const, therefore the level sets are the straight
lines with the direction vector γ. I can rephrase the last sentence as follows: the sought function u
is constant along any straight line with the direction vector γ. All such straight lines can be written
as x = ct + ξ, where ξ is some constant. Now I claim that this geometric interpretation I discussed is
enough to find the solution to our problem. Namely, since I know that u is constant along x = ct + ξ
then the value of u at an arbitrary point (t∗ , x∗ ) is the same along the line x = ct + x∗ − ct∗ . At the
initial time t = 0 I will get the unique point x = x∗ − ct∗ , and the value of u(t∗ , x∗ ) must be equal to
g(x) = g(x∗ − ct∗ ). Therefore, I found, dropping the asterisks, that the unique solution to the problem
(1.1)–(1.2) is
u(t, x) = g(x − ct).
I am aware that for some people the geometric arguments I presented are not very convincing (but
please see the figures below); therefore I will present a totally algebraic argument, which will also allow

me to find a general solution to the linear transport equation. The key idea is that I should try to
consider (1.1) in the new coordinates (which I “guessed” from the geometry of the equation)
τ = t, ξ = x − ct,
(note that in the textbook t is used instead of τ ). This change of variables is clearly invertible.
I have
u(t, x) = u(τ, ct + ξ) = v(τ, ξ) = v(t, x − ct).
Now I can use the chain rule. To wit,
\frac{\partial u}{\partial t}(t, x) = \frac{\partial v}{\partial t}(t, x - ct) = \frac{\partial v}{\partial \tau}(\tau, \xi)\,\frac{\partial \tau}{\partial t} + \frac{\partial v}{\partial \xi}(\tau, \xi)\,\frac{\partial \xi}{\partial t} = \frac{\partial v}{\partial \tau}(\tau, \xi) - c\,\frac{\partial v}{\partial \xi}(\tau, \xi).
Similarly,
\frac{\partial u}{\partial x}(t, x) = \frac{\partial v}{\partial \xi}(\tau, \xi).
After plugging the found expressions into (1.1) I get
vτ − cvξ + cvξ = 0 =⇒ vτ = 0,
which we already solved in Section 0.1. Therefore,
v(τ, ξ) = F (ξ)
and hence, returning to the original coordinates,
u(t, x) = F (x − ct),
where F is an arbitrary C (1) function (recall that I am looking for a classical solution). This is the
general solution to equation (1.1).
To use the initial condition (1.2) I take t = 0 and get, as expected, that F (x) = g(x) hence F = g.
Summarizing,
Theorem 1.1. Equation (1.1) has the general solution
u(t, x) = F (x − ct)
for an arbitrary F ∈ C (1) (R; R) function.
The initial value problem (1.1), (1.2) with g ∈ C (1) has a unique classical solution
u(t, x) = g(x − ct).
Theorem 1.1 is an existence and uniqueness theorem for the initial value problem for the linear
one dimensional transport equation.
The straight lines x = ct + ξ are very important and are called the characteristics. Hence we now
know that the solutions to the transport equation are constant along the characteristics.
What is the geometric meaning of a function of the form g(x − ct)? Take, e.g., c > 0; then
for fixed time moments t0 = 0, t1 > t0 , t2 > t1 , . . . I will get the same graph of g only shifted by
0, ct1 , ct2 , . . . units to the right. What I observe is a traveling wave moving from left to right with
the speed c. If c < 0 then my wave will travel with the speed |c| from right to left. This geometric
picture should explain the title transport equation, since, according to the analysis above, the equation
describes the transportation of the substance u from the point (0, ξ) to some other point (t, x) along
the characteristic x = ct + ξ.
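(The following small symbolic check is my own addition, not from the notes: it verifies that u(t, x) = g(x − ct) satisfies the transport equation for an arbitrary smooth profile g, represented here as an undefined sympy function.)

import sympy as sp

t, x, c = sp.symbols('t x c')
g = sp.Function('g')                 # arbitrary C^1 profile

u = g(x - c*t)                       # candidate traveling-wave solution
residual = sp.diff(u, t) + c*sp.diff(u, x)
print(sp.simplify(residual))         # prints 0: u_t + c*u_x = 0 for any g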

Figure 1.1: Traveling wave solution for different time moments

Example 1.2. Here is a simple graphical illustration of solutions to the problem

ut + ux = 0,

with the initial condition


u(0, x) = e^{-x^2}.

We have that the solution is u(t, x) = e^{-(x-t)^2}, whose graphs at different time moments are given in Fig. 1.1.
In Fig. 1.2 one can see the same solution in the form of full three dimensional surface (the left
panel) and a contour plot (the right panel), on which the level sets are represented.

Figure 1.2: Traveling wave solution in the form of 3D plot (left) and contour plot (right)

Finally, we can put everything together in the same figure (Fig. 1.3), where you can see solutions at
different time moments (bold curves), the three dimensional surface, together with the characteristics
(bold dashed lines) on the plane (t, x).

Figure 1.3: Traveling wave solution from two different viewpoints

Example 1.3 (Distributed source). To practice the considered approach, consider the initial value
problem (this is Problem 2.2.10 from the textbook)
ut + cux = f (t, x), t > 0, x ∈ R,
(1.3)
u(0, x) = g(x), x ∈ R.
The physical interpretation of problem (1.3) is that now we have a distributed source of the substance
with the intensity (density per time unit) f (t, x) at the time t and the position x.
Exercise 1.1. Deduce problem (1.3) from physical principles and the conservation law.
Using the same change of variables I find

vτ = f (τ, cτ + ξ).

Integrating, I find

v(τ, ξ) = \int_0^{\tau} f(s, cs + \xi) \, ds + F(\xi),

or, returning to the original variables,

u(t, x) = \int_0^{t} f(s, x - c(t - s)) \, ds + F(x - ct).

Using the initial condition I finally get


Theorem 1.4. Let g ∈ C (1) and f, fx ∈ C. Then the unique solution to (1.3) is given by
u(t, x) = \int_0^{t} f(s, x - c(t - s)) \, ds + g(x - ct).
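(A spot-check of Theorem 1.4, added here only as an illustration: the source term f(t, x) = xt is my own choice, not from the notes, picked so that sympy can do the integral in closed form.)

import sympy as sp

t, x, s, c = sp.symbols('t x s c')
g = sp.Function('g')
f = lambda tt, xx: xx*tt                     # an illustrative source term

u = sp.integrate(f(s, x - c*(t - s)), (s, 0, t)) + g(x - c*t)
residual = sp.diff(u, t) + c*sp.diff(u, x) - f(t, x)
print(sp.simplify(residual))                 # prints 0: the PDE is satisfied
print(sp.simplify(u.subs(t, 0)))             # prints g(x): the initial condition holds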

Example 1.5 (Transport equation with decay). Recall that one of the most basic ODE is the so-called
decay equation (you could have seen it in connection with, e.g., radioactive decay; it literally says that
the rate of decay of some compound is proportional to the present amount)

u′ = −au,

for some constant a > 0. The solution to this equation is (by, e.g., the separation of variables)

u(t) = Ce−at

and is used in, e.g., radioactive dating.


Consider now the PDE
ut + cux = −u,
that is, physically, I assume that in addition to the transport process there is also decay proportional
to the present density.
By the same change of the variables I get

vτ = −v,

and hence
v(τ, ξ) = F (ξ)e−τ ,
where F is an arbitrary smooth function of ξ. Returning to the original variable,

u(t, x) = F (x − ct)e−t ,

which results in
u(t, x) = g(x − ct)e−t
for the initial condition
u(0, x) = g(x).
Now the solutions are not constant along the characteristics, but decay along them,
representing a damped traveling wave.
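(Again a small verification of my own, not from the notes: the damped traveling wave found above indeed satisfies the transport equation with decay.)

import sympy as sp

t, x, c = sp.symbols('t x c')
g = sp.Function('g')

u = g(x - c*t)*sp.exp(-t)                          # the damped traveling wave
residual = sp.diff(u, t) + c*sp.diff(u, x) + u     # left-hand side of u_t + c*u_x + u = 0
print(sp.simplify(residual))                       # prints 0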

Example 1.6 (A problem with a boundary condition). Assume that now u denotes the concentration
of some pollutant in a river and there is a source of this pollutant at the point x = 0 of the intensity
f (t) at time t (this is essentially Problem 2.2.14). Mathematically it means that I consider the
problem only for x > 0, t > 0 for the equation

ut + cux = 0

with c > 0.
Since it is natural to have the initial condition now only for x > 0, for some part of the first
quadrant I will have to use the boundary condition. The characteristic x = ct separates two regions,
where I must use either the initial or boundary condition. If x ≥ ct then I can use the usual initial
condition. If, however, x < ct then for my general solution u(t, x) = F (x − ct) I must have

F (−ct) = f (t),

Figure 1.4: Correctly (left) and incorrectly (right) stated boundary conditions for a transport equation.
The arrows indicate the direction of the transport with time, hence for the left panel I have c > 0 and
for the right one c < 0

or
F (τ ) = f (−τ /c).
Finally I obtain a unique solution
u(t, x) = \begin{cases} g(x - ct), & x \ge ct, \\ f(t - x/c), & x \le ct. \end{cases}

To guarantee that my solution is a classical one, I should request that f (0) = g(0) and −cg ′ (0) = f ′ (0)
hold.
Now assume that c < 0 in my problem. Therefore the characteristics will have a negative slope
and each characteristic will cross both x > 0 and t > 0 half-lines. This will result in a well posed
problem if and only if the initial and boundary values are coordinated; otherwise, the problem will
have no physical solution (see Fig. 1.4).
This is a first sign that in PDE a physically correct choice of initial and boundary conditions
will result in a well posed problem, whereas arbitrarily assigned initial conditions can lead to no
solutions at all.

1.1.3 Well posed problems in PDE theory


I already used the term “well posed” problems several times. It has, actually, a rigorous mathematical
meaning.

Definition 1.7. A problem is called well posed if it has a solution, this solution is unique, and
this solution depends continuously on the initial data and the parameters.

In the examples above I usually showed that a solution exists by presenting an explicit formula
for this solution. The uniqueness was guaranteed by the fact that the signals are spread along the
characteristics and the initial (and boundary) conditions uniquely prescribe values at one point of the
characteristic. Here I will give a first example of continuous dependence on the initial data.
Proposition 1.8. Consider a boundary-initial value problem for the transport equation:

ut + cux = 0, 0 < x < R, t > 0, c > 0


u(0, x) = g(x), 0 < x < R,
u(t, 0) = f (t), t > 0.

The solution to this problem depends continuously on the initial data.


Proof. First I multiply the equation by u and note that

u u_t + c u u_x = \frac{1}{2}(u^2)_t + \frac{c}{2}(u^2)_x = 0.

Integrating in x from 0 to R yields

\frac{d}{dt} \int_0^R u^2(t, x)\, dx + c\left(u^2(t, R) - u^2(t, 0)\right) = 0.

Hence, by positivity of c,

\frac{d}{dt} \int_0^R u^2(t, x)\, dx \le c f^2(t).
Now I integrate in t and use the condition u(0, x) = g(x):

\int_0^R u^2(t, x)\, dx \le \int_0^R g^2(x)\, dx + c \int_0^t f^2(s)\, ds.

Finally I note that if u1 , u2 are solutions to the problems with g1 , f1 and g2 , f2 respectively, then
u1 − u2 , due to linearity, solves the problem with the initial condition g1 − g2 , f1 − f2 . Hence I end up
with the estimate
\int_0^R \left(u_1(t, x) - u_2(t, x)\right)^2 dx \le \int_0^R \left(g_1(x) - g_2(x)\right)^2 dx + c \int_0^t \left(f_1(s) - f_2(s)\right)^2 ds,

which shows that a small change in the initial data yields a small change in the solution, which is
tantamount to saying that the solution depends continuously on the initial data. 
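(The estimate can also be illustrated numerically. The sketch below is my own construction, not from the notes: it takes zero boundary data f1 = f2 = 0, perturbs the initial profile slightly, and compares the two sides of the estimate at a fixed time using the explicit solution constructed in Example 1.6.)

import numpy as np

c, R, t_obs = 1.0, 10.0, 3.0
x = np.linspace(0.0, R, 2001)
dx = x[1] - x[0]

g1 = lambda y: np.exp(-(y - 2.0)**2)
g2 = lambda y: np.exp(-(y - 2.0)**2) + 0.01*np.sin(5.0*y)   # slightly perturbed initial data

def u(g, t, x):
    # explicit solution with zero boundary data f = 0 and c > 0
    return np.where(x >= c*t, g(x - c*t), 0.0)

lhs = np.sum((u(g1, t_obs, x) - u(g2, t_obs, x))**2) * dx
rhs = np.sum((g1(x) - g2(x))**2) * dx      # f1 = f2 = 0, so no boundary contribution
print(lhs, "<=", rhs)                      # a small change in the data gives a small change in the solution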

1.2 Solving first order linear PDE


In this lecture I will deviate slightly from the textbook and consider a general linear first order PDE of the form

a(x, y)ux + b(x, y)uy = c(x, y)u + d(x, y), (1.4)

where a, b, c, d ∈ C (1) (R2 ; R) are given functions. I will consider an initial condition

u(x, y)|(x,y)∈Γ = g(x, y), (1.5)

which says that the initial condition is prescribed along some arbitrary (well, not totally, see below)
curve Γ on the plane (x, y). The main difference from the textbook is that I allow rather
general initial conditions, contrary to the fixed value t = 0 in the textbook. I also use the variables
x and y for the independent variables to emphasize that while it is important to keep in mind the
physical description of the problem, mathematically for us both variables t (time) and x (space) are
equally important and sometimes better make them indistinguishable. Even more importantly, a lot
of first order PDE appear naturally in geometric rather than physical problems, and for this setting
x and y are our familiar Cartesian coordinates.

Remark 1.9. All I am going to present is almost equally valid for a semi-linear first order equation

a(x, y)ux + b(x, y)uy = f (x, y, u), (1.6)

where f is some, generally nonlinear, function.

Let me consider the system of ordinary differential equations


\frac{dx}{d\tau} = a(x, y),
\frac{dy}{d\tau} = b(x, y).        (1.7)

Its solutions (x(τ ), y(τ )) form a family of curves on the plane, parameterized by the variable τ . These
curves (which is a basic fact from ODE theory) do not intersect and are called (hopefully, not surprisingly)
the characteristics or characteristic curves. The key fact is that

along the characteristics problem (1.4) (or (1.6)) becomes an ordinary differential equation.

Indeed, consider the solution u(x, y) to (1.4) along (x(τ ), y(τ )). It becomes just a function of τ
alone: v(τ ) = u(x(τ ), y(τ )) (here I picked a different letter to emphasize that v depends only on τ ).
Now take the derivative with respect to τ :
\frac{dv}{d\tau} = u_x x'_\tau + u_y y'_\tau = a(\tau) u_x + b(\tau) u_y = c(\tau) v + d(\tau),

which is a linear first order ODE. To get the initial condition for this ODE I will use (1.5).
In general (several examples are given below), to solve the initial value problem (1.4)-(1.5) I proceed
in the following way. I consider a parametrization of the initial curve Γ:

Γ : x(ξ), y(ξ),

along which my initial condition becomes just a function of ξ: g(ξ).


Now, for each fixed ξ I solve problem (1.7) with the initial condition (x(ξ), y(ξ)); my unique (due to
ODE theory) solution will be
(x(τ, ξ), y(τ, ξ)).
Along this curve, as showed above, my PDE becomes the ODE

v̇ = c(τ )v + d(τ )

with the initial condition
v(0, ξ) = g(ξ).
The unique solution is v(τ, ξ) and I found a parametric representation of the solution to (1.4)-(1.5):
x(τ, ξ), y(τ, ξ), v(τ, ξ).
If I am able to express τ and ξ from the first two functions then I will finally get the unique solution
u(x, y) = v(τ (x, y), ξ(x, y)).
There is still the question of whether I will always be able to do this, but I will postpone the general
discussion and instead consider a few examples.
Example 1.10. Solve
xux + yuy = u,
(1.8)
u(x, 1) = g(x).
The curve of the initial conditions is given simply as y = 1. In this case I can always take
x = ξ, y = 1.
Hence I have for my characteristics
\frac{dx}{d\tau} = x, \quad x(0, \xi) = \xi,
\frac{dy}{d\tau} = y, \quad y(0, \xi) = 1,

which immediately implies that
x(τ, ξ) = ξeτ , y(τ, ξ) = eτ .
From the last expression I also have that
\xi = \frac{x}{y}.
Along my characteristics I have (note the initial condition)
\frac{dv}{d\tau} = v, \quad v(0, \xi) = g(\xi) \implies v(\tau, \xi) = g(\xi) e^{\tau}.

Finally, returning to the initial variables (x, y), I have
u(x, y) = g\left(\frac{x}{y}\right) y.
Note that my solution is not defined at y = 0.
To present graphs (see Fig. 1.5), I will use

g(x) = e^{-x^2},

and hence my solution is

u(x, y) = e^{-(x/y)^2} y.

Figure 1.5: Left panel: The bold solid line is the curve of the initial conditions. The dotted lines are
the characteristics. The points show where the initial condition for each characteristic is given. Note
that the characteristics are defined only for y > 0. Right panel: The surface of the solution along with
the initial condition (bold curve) and solutions of the corresponding ODE along the characteristics

Example 1.11. Solve

yux − xuy = 0,
(1.9)
u(x, 0) = g(x), x > 0.

The reason why I define the initial condition only for x > 0 will be given below.
The system for characteristics is given by
\frac{dx}{d\tau} = y, \quad x(0, \xi) = \xi,
\frac{dy}{d\tau} = -x, \quad y(0, \xi) = 0.

Probably the easiest way to solve it is to reduce this system to one second order ODE. Denoting with
prime the derivative with respect to τ I have

x′′ = y ′ = −x =⇒ x′′ + x = 0.

This is the equation for the harmonic oscillator, its general solution is

x(τ ) = C1 cos τ + C2 sin τ.

Using the initial conditions x(0) = ξ, x′ (0) = 0 I get

x(τ ) = ξ cos τ, y(τ ) = −ξ sin τ.

Squaring and adding the equations together I will find

x2 + y 2 = ξ 2 ,

hence my characteristics are circles of radius ξ. As a side remark I note that the same result can be
obtained by reducing the system to just one equation:
\frac{dx}{dy} = -\frac{y}{x}

and integrating this separable equation.


Along the characteristics I have
\frac{dv}{d\tau} = 0, \quad v(0, \xi) = g(\xi) \implies v(\tau, \xi) = g(\xi).

Returning to the original coordinates, I have
u(x, y) = g\left(\sqrt{x^2 + y^2}\right),

which gives me the solution to my problem. If I take g(x) = sin x, then the solution is drawn in Fig. 1.6.
Now we can see why the initial condition was prescribed only for x > 0. Since the characteristics are
circles in this problem, if my initial condition were given for −∞ < x < ∞ then each characteristic
would intersect it at two points. Therefore, for each ODE along this characteristic I would have two
initial conditions, which yields a contradiction (nonexistence of solution in all but very special cases).
To conclude these examples we must decide when I actually can express my two parameters τ and
ξ as functions of x, y. It turns out (this is usually not covered in Calc III, but a curious student can
look up the inverse function theorem) that it is always true if

Figure 1.6: The surface of the solution along with the initial condition (bold curve) and solutions of
the corresponding ODE (bold dashed curves) along the characteristics (thin solid curves)

the curve of the initial conditions is not tangent to any characteristic.

Summarizing,

Proposition 1.12. Problem (1.4)-(1.5) has a unique solution, which can in general be defined on some
subset of R2 , if the curve Γ on which the initial conditions are given is not tangent to a characteristic,
and if the characteristics do not intersect Γ at more than one point.

Example 1.13. Now I would like to reconcile the theory I outlined above with the approach in the
textbook, where the transport equation with a non-constant velocity is given:

ut + c(x)ux = 0,

with the initial condition


u(0, x) = g(x).
Using the parametrization as in the previous examples, I will find that
\frac{dt}{d\tau} = 1, \quad t(0, \xi) = 0 \implies t = \tau,

and hence I will only need one parameter ξ, which is introduced in the equation for the characteristics
(note I use t instead of τ )
\frac{dx}{dt} = c(x), \quad x(0) = \xi.
Along the curve defined by this equation I have an ODE
\frac{dv}{dt} = 0, \quad v(0, \xi) = g(\xi) \implies v(t, \xi) = g(\xi).
Hence, if I can express ξ from the equation of the characteristic, then I will have my unique solution
u(t, x) = g(ξ(t, x)).
To illustrate (Problem 2.2.17), let me take

c(x) = −x .

In this case the characteristics are the solutions to


\frac{dx}{dt} = -x, \quad x(0) = \xi \implies x = \xi e^{-t}.
If the initial condition is given by
u(0, x) = \frac{1}{x^2 + 1},

then my unique solution is

u(t, x) = \frac{1}{x^2 e^{2t} + 1}.
More details can be found in the textbook and in the homework problems.
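(The same example can be reproduced numerically by integrating the characteristic ODE with scipy and carrying the initial value along each characteristic; the sketch below is only an illustration of the procedure and is not part of the notes.)

import numpy as np
from scipy.integrate import solve_ivp

c = lambda x: -x                        # the variable speed from Example 1.13
g = lambda x: 1.0/(x**2 + 1.0)          # the initial condition

t_final = 1.0
for xi in np.linspace(-3.0, 3.0, 7):    # starting points of the characteristics
    sol = solve_ivp(lambda t, x: c(x), (0.0, t_final), [xi], rtol=1e-8)
    x_end = sol.y[0, -1]                # where the characteristic ends up at t = t_final
    u_char = g(xi)                      # u is constant along the characteristic
    u_exact = 1.0/(x_end**2*np.exp(2*t_final) + 1.0)
    print(f"x = {x_end: .4f}   u_char = {u_char:.6f}   u_exact = {u_exact:.6f}")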

To conclude, in the last two lectures I considered the so-called method of characteristics to solve an
initial value problem for a linear (or semi-linear) first order PDE, where the unknown function depends
on two independent variables. The key fact is that along the special curves, called the characteristic
curves or characteristics, these PDE turn into ODE, for which an extensive theory exists (from a
physical point of view this is a manifestation of particle-wave duality, when the system can be either
described by the positions of discrete particles or using a continuous representation of a force field).
This method can be immediately generalized to linear first order PDE with more than two independent
variables and also, with some modifications, to nonlinear equations. I will only touch on the latter
topic in the following lecture.

1.3 Glimpse of the nonlinear world. Hopf ’s equation


Recall that I obtained the linear transport equation ut +cux = 0 as the consequence of the conservation
law ut + qx = 0, which connects the quantity u and its flux q, and the assumption that q(u) = cu. If
I make another assumption, namely q(u) = u2 /2, I will end up with the so-called Hopf ’s equation
ut + uux = 0, (1.10)
for which I will need the initial condition
u(0, x) = g(x) (1.11)
for some given smooth function g.
Equation (1.10) is an example of a quasi-linear equation. All quasi-linear equations can be analyzed
using a slight modification of the characteristic method, but I will only look at solutions of the Hopf
equation.
To solve (1.10)-(1.11) I will use an analogy with the transport equation and assume that x = x(t)
is my characteristic (which I do not know yet) along which my (also unknown) solution is constant:
u(t, x(t)) = const.
If I am able to find x(t) then my problem is solved since at t = 0 I can use my initial condition
u(t, x(t)) = g(ξ), where ξ = x(0). I differentiate u(t, x(t)) with respect to t and get
ut + x′ (t)ux = 0.
On the other hand, equation (1.10) along the characteristic is
ut + g(ξ)ux = 0.
Subtracting and assuming that ux ̸= 0 yields
x′ (t) = g(ξ), x(0) = ξ =⇒ x(t) = g(ξ)t + ξ,
which tells me that my characteristic is a straight line with the slope g(ξ). Now I can write the solution
by using that ξ = x − g(ξ)t:
u = g(x − g(ξ)t),
or, recalling that g(ξ) = u along the characteristic:
u = g(x − ut), (1.12)
which gives me the solution to my problem in an implicit form.
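(Since (1.12) defines u only implicitly, a simple way to evaluate the solution numerically, before the wave breaks, is a fixed-point iteration on u = g(x − ut). The helper below is my own illustrative sketch, not part of the notes.)

import numpy as np

g = lambda x: np.exp(-x**2)              # the initial profile used in Example 1.14 below

def hopf_solution(t, x, n_iter=200):
    # evaluate u(t, x) from the implicit relation u = g(x - u*t) by fixed-point
    # iteration; this is reliable only before the characteristics intersect
    u = g(x)                             # initial guess
    for _ in range(n_iter):
        u = g(x - u*t)
    return u

for x in (-1.0, 0.0, 0.5, 1.0):
    print(x, hopf_solution(0.5, x))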

Figure 1.7: Nonlinear wave in the Hopf’s equation

Exercise 1.2. Check directly that u = g(x − ut) solves problem (1.10)-(1.11).
To see the possible consequence of (1.12) consider the following example.
Example 1.14. Let
g(x) = e^{-x^2}.

Then, using (1.12), I have

u = e^{-(x-ut)^2},

which is impossible to resolve for u, but I can sketch my solution for several time moments, see
Fig. 1.7. We can see that for our nonlinear problem the velocity of propagation depends on the
spatial coordinate and the initial density. Basically the equation itself says that the points with higher
concentration move faster, with time overcoming the points with lower concentration.
To understand what is happening here it is worth drawing the characteristics (see Fig. 1.8). Since
the slope of the characteristics depends on the initial concentrations, in some cases, as in my example,
the characteristics can intersect. Since my solution must be constant along the characteristics it means
that at the point where the characteristics intersect I must have more than one value of u, which is
what exactly happens in Fig. 1.7.
To find the first time moment when the characteristics intersect, I can look at the point when the
derivative of u with respect to x becomes infinite. Using the implicit differentiation, I find from (1.12)
\frac{\partial u}{\partial x} = g'(x - ut)\left(1 - \frac{\partial u}{\partial x} t\right) \implies \frac{\partial u}{\partial x} = \frac{g'(x - ut)}{1 + g'(x - ut)\, t}.
Recalling that x − ut = ξ I finally get the condition of the first time when my solution becomes
multivalued (the characteristics intersect) as
t = \min_{\xi} \left\{ -\frac{1}{g'(\xi)} \right\}.

Figure 1.8: Characteristics of the Hopf’s equation with the initial condition g(x) = e^{-x^2}

For my example, I get

t = \frac{\sqrt{2e}}{2} \approx 1.166.
Exercise 1.3. Confirm this value.
Note that if g ′ (ξ) > 0 then my solution is well defined for all t > 0.
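(The breaking time can also be confirmed numerically; the following short check, independent of Exercise 1.3, is my own addition.)

import numpy as np

g_prime = lambda xi: -2.0*xi*np.exp(-xi**2)    # derivative of g(xi) = exp(-xi^2)

xi = np.linspace(1e-3, 5.0, 100000)            # only xi with g'(xi) < 0 matter here
t_break = np.min(-1.0/g_prime(xi))
print(t_break, np.sqrt(2*np.e)/2)              # both are approximately 1.166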
Equations of the form (1.10) appear in various situations, in particular in gas dynamics and hydrodynamics,
or in the modeling of car traffic. From a physical point of view, having several values of the
solution at a point x is meaningless and hence must be avoided. The exact details of how this is done
can be found in the textbook (or, better, in the references cited in the textbook).

Figure 1.9: An example of a nonlinear wave in nature

As a take-home message from this lecture I would like all the students to remember that the main
distinction between linear (solutions to the transport equation) and nonlinear (solutions to the Hopf’s
equation) waves is that for the linear wave all the points move with the same velocity whereas for the
nonlinear waves the velocity depends on the concentration. If someone wants a more colorful
picture of this phenomenon, think of waves in the ocean, where eventually a wave breaks:
this is a nice example of a nonlinear wave.

1.4 Deriving the wave equation
From now on I consider only linear second order partial differential equations, and the first equation
I will study is the so-called wave equation which, in one spatial dimension, has the form

utt = c2 uxx , (1.13)

where c is some constant.


Contrary to the textbook, I will present one possible way to arrive at this equation. Recall that
I obtained the transport equation in the previous lectures as the consequence of the fundamental
conservation law. The wave equation is the consequence of another fundamental physical law: the
Newton’s second law, which states that the product of mass and the acceleration is equal to the net
force applied to the body. Consider, for example, the classical mechanical system of mass on a spring
(see Fig. 1.10). If the only force acting on the mass is the restoring force of the spring then, using

Figure 1.10: Mass on a spring

another physical law, namely Hooke’s law, I arrive at

mu′′ = −ku.

Here u(t) is the position of my mass at time t, u′′ (t) is the acceleration, m is the mass, ku is
Hooke’s law, which says that the force is proportional to the displacement (note that this is actually true
only for small displacements), the minus sign appears because the force points opposite to the u-axis, and k is a
parameter, the rigidity of the spring. Rewriting this as

u'' + \omega^2 u = 0, \qquad \omega^2 = \frac{k}{m},
I find that the general solution to this ODE can be written as

u(t) = C1 cos ωt + C2 sin ωt,

which represents harmonic oscillations. C1 and C2 are arbitrary constants that can be uniquely
identified using the initial position and initial velocity of the mass.
Now let me consider a more general situation, when I have k + 1 equal masses, such that the zeroth
and the last one are fixed, but all others are free to move along the axis and linearly interconnected
with the springs with the same rigidity k (see Fig. 1.11).

Figure 1.11: A system of k + 1 masses connected by identical springs. The initial and final masses are
fixed

Let ui (t) denote the displacement at time t of the i-th mass from its equilibrium position xi . Using
again the second Newton’s law I get, for the i-th mass

mu′′i = Fi,i+1 −Fi,i−1 = k(ui+1 −ui )−k(ui −ui−1 ) = k(ui+1 −2ui +ui−1 ), i = 1, . . . , k−1, u0 = 0, uk = 0.

For each ui I also should have two initial conditions: initial displacement and initial velocity. This
is still a system of (k − 1) ODE, which can be actually analyzed and solved. I, however, instead of
solving it, will consider the situation when the number of my masses grows to infinity, i.e., k → ∞.
Assuming that at equilibrium all my masses are separated by the same distance h, this is equivalent
to saying that I consider the case when h → 0. In other words I consider a continuous limit of a discrete
system!
I have that ui (t) = u(t, xi ) and the equilibrium positions of all the masses become a continuous
variable x. That is, when h → 0, I have that the vector (u1 (t), . . . , ui (t), . . . , uk−1 (t)) becomes a
function of two variables u(t, x), where now the meaning of u(t, x) is the displacement of the section
of a rod at time t that had the coordinate x at rest. Note that my discrete system of masses turns
into a rod. To carefully perform the limit I also need to understand how my two parameters behave.
It is natural to assume that
m = ρSh,
where ρ is the density of the rod and S is the area of the transverse cross-section. For the rigidity I will use the physical
fact that
k = \frac{ES}{h},
where E is the Young’s modulus. Hence I get that my system can be rewritten now as

u_i'' = \frac{E}{\rho}\, \frac{u_{i+1} - 2u_i + u_{i-1}}{h^2}.
The left hand side tends to utt as h → 0. Consider Taylor’s series for ui+1 , ui , ui−1 around h = 0:
u_{i+1} = u(x_i + h, t) = u(x_i, t) + u_x(x_i, t) h + \frac{1}{2!} u_{xx}(x_i, t) h^2 + \ldots,
u_i = u(x_i, t),
u_{i-1} = u(x_i - h, t) = u(x_i, t) - u_x(x_i, t) h + \frac{1}{2!} u_{xx}(x_i, t) h^2 + \ldots,
2! xx

where dots denote the terms of order 3 and above in h. Now if I plug these expressions into my
differential equation and cancel h2 , I can see that

\frac{u_{i+1} - 2u_i + u_{i-1}}{h^2} = u_{xx}(x_i, t) + \ldots,
where now the dots denote terms that are of order at least 1 in the variable h and hence approach 0 as h → 0. Hence

\frac{u_{i+1} - 2u_i + u_{i-1}}{h^2} \to u_{xx}(x, t), \quad h \to 0.
Finally, I can conclude that the continuous limit of my discrete system of masses on springs is
described by the wave equation

u_{tt} = c^2 u_{xx}, \qquad c = \sqrt{\frac{E}{\rho}}.
I also have, by the same limit procedure, that I need two initial conditions

u(0, x) = f (x), ut (0, x) = g(x),

and, if the ends of my rod are fixed, two boundary conditions

u(t, 0) = u(t, l) = 0,

where l is the length of the rod.
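(The key analytic step in the limit above, namely that the centered second difference converges to the second derivative, is easy to check numerically; this small test with a sine profile is my own addition, not from the notes.)

import numpy as np

u = np.sin                           # a smooth test profile
u_xx = lambda x: -np.sin(x)          # its exact second derivative
x0 = 0.7

for h in (0.1, 0.01, 0.001):
    second_diff = (u(x0 + h) - 2.0*u(x0) + u(x0 - h)) / h**2
    print(h, second_diff, abs(second_diff - u_xx(x0)))   # the error shrinks like h^2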


The wave equation appears in many situations where wave processes occur, such as sound waves,
light waves, or water waves. I basically showed that the longitudinal oscillations of a rod are described
by the wave equation. It can be shown that (small) transverse oscillations of a string (such as a guitar
or violin string) are also described by the same equation.

1.5 Solving the wave equation for the infinite string


In this lecture I assume that my string (or rod) is so long that it is reasonable to disregard the
boundary conditions, i.e., I consider an infinite space. In this case I get the initial value problem for
the wave equation
utt = c2 uxx , t > 0, −∞ < x < ∞ (1.14)
with the initial conditions

u(0, x) = f (x), ut (0, x) = g(x), −∞ < x < ∞. (1.15)

Arguably the best way to get an intuitive understanding of what is modeled by this equation is to
imagine an infinite guitar string, where u(t, x) represents the transverse displacement at time t at
position x.
Very surprisingly (do not get used to it, this is a very rare case for PDE), problem (1.14)–(1.15)
can be solved explicitly.

1.5.1 The general solution to the wave equation
First I will find the general solution to (1.14), i.e., the formula that includes all possible solutions to
the wave equation. To do this I note that I can rewrite (1.14) as
\frac{\partial^2 u}{\partial t^2} - c^2 \frac{\partial^2 u}{\partial x^2} = \left(\frac{\partial^2}{\partial t^2} - c^2 \frac{\partial^2}{\partial x^2}\right) u = \left(\frac{\partial}{\partial t} - c \frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial t} + c \frac{\partial}{\partial x}\right) u = 0.

Denoting

v = \left(\frac{\partial}{\partial t} + c \frac{\partial}{\partial x}\right) u,
I find that my wave equation (1.14) is equivalent to two first order linear PDE:

vt − cvx = 0,
ut + cux = v.

From the previous lectures we know immediately that the first one has the general solution v(t, x) =
F ∗ (x + ct), for some arbitrary F ∗ , and the second one has the solution
u(t, x) = \int_0^t v(s, x - c(t - s)) \, ds + G^*(x - ct) = \int_0^t F^*(x - ct + 2cs) \, ds + G^*(x - ct).

Making the change of the variables τ = x − ct + 2cs in the integral, I have


u(t, x) = \frac{1}{2c} \int_{x-ct}^{x+ct} F^*(\tau) \, d\tau + G^*(x - ct). \qquad (1.16)
Finally, since F ∗ , G∗ are arbitrary, by denoting
F(x + ct) = \frac{1}{2c} \int_0^{x+ct} F^*(\tau) \, d\tau, \qquad G(x - ct) = \frac{1}{2c} \int_{x-ct}^{0} F^*(\tau) \, d\tau + G^*(x - ct),
I get the final result that
u(t, x) = F (x + ct) + G(x − ct),
for arbitrary C (2) functions F and G. This expression, and the analysis from previous lectures, tell
me that the general solution to the wave equation is a sum of two linear traveling waves, one of which
moves to the left and the other moves to the right.

1.5.2 d’Alembert’s formula


Here I will solve problem (1.14)–(1.15) and reproduce a famous formula, first obtained by Jean-Baptiste
le Rond d’Alembert in 1747.
From the first condition in (1.15) I immediately get that G∗ = f . To use the second condition I
calculate
ut (0, x) = F ∗ (x) − cf ′ (x) = g(x) =⇒ F ∗ (x) = g(x) + cf ′ (x).
This yields

u(t, x) = \frac{1}{2c} \int_{x-ct}^{x+ct} g(s) \, ds + \frac{1}{2} f(x + ct) - \frac{1}{2} f(x - ct) + f(x - ct),

and finally

u(t, x) = \frac{1}{2}\left(f(x + ct) + f(x - ct)\right) + \frac{1}{2c} \int_{x-ct}^{x+ct} g(s) \, ds, \qquad (1.17)
which is d’Alembert’s formula.
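(Formula (1.17) is explicit enough to be typed in directly. The minimal numerical sketch below, with my own choice of helper names, illustrative data, and a crude quadrature for the velocity term, can be used to reproduce pictures like the ones in the next subsection.)

import numpy as np

def dalembert(f, g, c, t, x, n_quad=2001):
    # evaluate d'Alembert's formula (1.17) at a single point (t, x)
    s = np.linspace(x - c*t, x + c*t, n_quad)
    integral = np.mean(g(s)) * (2.0*c*t)            # simple quadrature of the g-term
    return 0.5*(f(x + c*t) + f(x - c*t)) + integral/(2.0*c)

# illustrative data: a Gaussian initial displacement and a "box" initial velocity
f = lambda y: np.exp(-y**2)
g = lambda y: np.where(np.abs(y) <= 0.5, 1.0, 0.0)

for x in (-1.0, 0.0, 1.0):
    print(x, dalembert(f, g, c=1.0, t=0.75, x=x))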

Remark 1.15. All the results above are obtained in a different way in the textbook. Namely, the
change of variables
η = x + ct, ξ = x − ct
reduced the wave equation to its canonical form

vηξ = 0.

I invite the students to perform these calculations on their own.

1.5.3 Two examples


To get an intuitive understanding of how formula (1.17) works, consider two examples. I assume that
c = 1 below.

Example 1.16. Let my initial condition be such that the initial velocity is zero. Then my formula
simplifies to
u(t, x) = \frac{1}{2}\left(f(x + ct) + f(x - ct)\right),
and hence my solution is a sum of two identical linear waves, each of which is exactly half of the initial
displacement. Therefore I can envision the behavior of solutions by first dividing in half the initial
displacement, then shifting one half to the left and the other one to the right, and then adding them.
Let, e.g., f be defined as

f(x) = \begin{cases} 0, & x < -0.5, \\ 1, & -0.5 \le x \le 0.5, \\ 0, & x > 0.5. \end{cases}
The form of the solution at different time moments is shown in Fig. 1.12.
Now it should be clear how the three dimensional surface looks (Fig. 1.13, left panel).
Finally, the behavior of solution should be also clear from looking at the plane (t, x) and several
straight lines of the form
x = ct + ξ, x = −ct + η,
which are also called the characteristics of the wave equation. Note that the signal spreads along the
characteristics (Fig. 1.13, right panel).

Example 1.17. For the second example I take f (x) = 0 and




g(x) = \begin{cases} 0, & x < -0.5, \\ 1, & -0.5 \le x \le 0.5, \\ 0, & x > 0.5. \end{cases}

Figure 1.12: Solution at different time moments to the initial value problem for the wave equation in
case when the initial velocity is zero

Now the solution takes the form


u(t, x) = \frac{1}{2c} \int_{x-ct}^{x+ct} g(s) \, ds = \frac{1}{2c}\left(G(x + ct) - G(x - ct)\right),

Figure 1.13: Left: Solution in the coordinates (x, t) to the initial value problem for the wave equation
in case when the initial velocity is zero. Right: Characteristics of the wave equation

where G is any antiderivative of g, e.g., I can always take


G(x) = \int_{-\infty}^{x} g(s) \, ds.

Therefore the solution now is a difference of two traveling waves, each of which is exactly half of G.
In my case

G(x) = \begin{cases} 0, & x < -0.5, \\ x + 0.5, & -0.5 \le x \le 0.5, \\ 1, & x > 0.5, \end{cases}
and my solution is given in Fig. 1.14. See also Fig. 1.15 for the three dimensional picture. Again the
overall picture can be figured out from the plane (t, x) and characteristics on it.

These two examples give a general idea of how solutions to the wave equation actually behave. Note
that in both cases I used initial conditions that are not continuously differentiable (they have “corners”)
and hence my solutions are not classical solutions. However, the notion of a solution to the wave
equation can be extended in a way that includes these nondifferentiable solutions. They are usually called
weak solutions.
Another point to note is that the characteristics of the wave equation allow one to see immediately which
initial data contribute to the solution at a given point (t, x) (this is called the domain of dependence)
and also how a given point ξ on the initial line spreads its signal with time (the range of influence),
see Fig. 1.16.

Figure 1.14: Solution for different time moments to the initial value problem for the wave equation in
case when the initial displacement is zero. The actual solution is shown with the green border

Figure 1.15: Solution in the coordinates (x, t) to the initial value problem for the wave equation in
case when the initial displacement is zero

Figure 1.16: Domain of dependence of the point with coordinates (t, x) (left, the bold line on the x
axis) and the range of influence of the point ξ along the line of the initial conditions (right, the shaded
area)

1.6 Solving the wave equation. Some extensions
1.6.1 Linear versus nonlinear equations
I would like to start this lecture with a discussion of what is called linear in mathematics. Recall that I
used the representation

\left(\frac{\partial^2}{\partial t^2} - c^2 \frac{\partial^2}{\partial x^2}\right) u = 0
to write the wave equation in a form that hints at a possible way to solve it. The expression inside
the parentheses is called a differential operator. If I denote this differential operator as L then I can
use the notation
Lu = 0
to write my wave equation in a concise way.
In general, any differential equation, either ordinary or partial, can be written in the form Lu = f,
where L is some differential operator and f some function that does not depend on u. For example, for
the simple harmonic oscillator u′′ + u = 0 I have that L = d²/dt² + 1. For the linear transport equation
with variable speed ut + c(x)ux = 0 my operator is ∂t + c(x)∂x and so on.

Definition 1.18. A (differential) operator L is called linear if for any u1 and u2 from its domain and
any constants α1 and α2 I have

L(α1 u1 + α2 u2 ) = α1 Lu1 + α2 Lu2 .

For example, the wave operator L = ∂tt −c2 ∂xx , the simple harmonic oscillator operator L = ∂tt +1,
and the operator from linear transport equation L = ∂t + c(x)∂x are linear (check this!). However,
the differential operator of the Hopf equation ut + uux = 0 is not linear. Indeed, let me apply this
operator to αu for some constant α. Then I have that αut + α2 uux , which is not equal to α(ut + uux )
as necessary for the operator to be linear.
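Such linearity checks can also be done symbolically; here is a small sanity check of mine (not from the notes) using SymPy for the wave operator and Hopf's operator:

import sympy as sp

t, x, c, a1, a2 = sp.symbols('t x c a1 a2')
u1 = sp.Function('u1')(t, x)
u2 = sp.Function('u2')(t, x)

wave = lambda u: sp.diff(u, t, 2) - c**2 * sp.diff(u, x, 2)   # wave operator
hopf = lambda u: sp.diff(u, t) + u * sp.diff(u, x)            # Hopf operator

# For a linear operator this combination must simplify to zero:
print(sp.simplify(wave(a1*u1 + a2*u2) - a1*wave(u1) - a2*wave(u2)))   # prints 0
# For Hopf's operator it does not:
print(sp.simplify(hopf(a1*u1 + a2*u2) - a1*hopf(u1) - a2*hopf(u2)))   # nonzero expression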
Now I can give a rigorous definition of a linear differential equation.

Definition 1.19. Let L be a differential operator. Then the differential equation

Lu = f, (1.18)

where f does not depend on u, is called linear if L is a linear operator. It is called linear homogeneous
if f = 0 and inhomogeneous (or nonhomogeneous) otherwise. If L is not linear then equation (1.18)
is called nonlinear.

The linearity of the equation is very important, since the so-called superposition principle holds for
linear equations; this principle is a consequence of the following simple and yet very important propo-
sition.

Proposition 1.20. Consider a linear inhomogeneous equation

Lu = f, (1.19)

and its homogeneous counterpart


Lu = 0. (1.20)

If u1 , u2 solve (1.20) then α1 u1 + α2 u2 also solves (1.20). If u1 , u2 solve (1.19) then u1 − u2 solves
(1.20). If u1 solves Lu = f1 and u2 solves Lu = f2 then α1 u1 + α2 u2 solves Lu = α1 f1 + α2 f2 . Finally,
any solution to (1.19) can be written as
u = uh + up ,
where uh is the general solution to (1.20) and up is a particular solution to (1.19).

Proof. Since most of the stated facts follow directly from the definition, I will only prove the final
statement.
Indeed, if I have some (fixed) particular solution up to (1.19), then the difference u − up, where u
is an arbitrary solution to (1.19), solves (1.20) by the second statement; that is, u − up = uh for some solution uh of (1.20),
which implies u = uh + up, which is the required form.

1.6.2 Solving an inhomogeneous wave equation. Duhamel’s principle


Consider an inhomogeneous wave equation on the infinite string:

utt = c2 uxx + F (t, x),


u(0, x) = f (x), (1.21)
ut (0, x) = g(x),

where F, f, g are given sufficiently smooth functions. First I will use the linearity of this equation to
divide it into simpler problems. To wit, consider

vtt = c2 vxx ,
v(0, x) = f (x), (1.22)
vt (0, x) = g(x),

and
wtt = c2 wxx + F (t, x),
w(0, x) = 0, (1.23)
wt (0, x) = 0.

That is, I divided my original problem into the initial value problem for the homogeneous wave
equation and inhomogeneous problem with zero initial conditions.

Lemma 1.21. Let v solve (1.22) and w solve (1.23). Then u = v + w solves (1.21).

Proof. Exercise. 

I know how to solve problem (1.22), for this I can simply use d’Alembert’s formula. Hence I
only need to figure out how to solve inhomogeneous problem (1.23). For this I will use the so-called
Duhamel’s principle, which generally works for linear differential equations. The idea is to reduce the
inhomogeneous problem to a series of homogeneous ones with specific initial conditions and then
sum (integrate) everything together by using the linearity of the equation. To perform this program
I should be able to solve, along with (1.22), the following problem

vtt = c2 vxx ,
v(τ, x) = f (x), (1.24)
vt (τ, x) = g(x),

which differs only in the initial time moment.

Lemma 1.22. Problem (1.24) has the solution (note that I include the parameter τ also in my formula)
v(t, x; τ) = ( f(x − c(t − τ)) + f(x + c(t − τ)) ) / 2 + (1/(2c)) ∫_{x−c(t−τ)}^{x+c(t−τ)} g(s) ds.

Proof. To prove the result one can use a new variable η = t − τ , for which the initial condition is
given at zero, use d’Alembert’s formula and return to the original variables. The details are left as an
(easy) exercise. 

All the main auxiliary work is done and I am ready to prove

Lemma 1.23. Consider yet another initial value problem for the wave equation:

rtt = c2 rxx ,
r(τ, x; τ ) = 0, (1.25)
rt (τ, x; τ ) = F (τ, x),

and let r(t, x; τ ) solve this problem. Then


w(t, x) = ∫_0^t r(t, x; τ) dτ

solves problem (1.23).

Proof. To prove the lemma I will use the Leibniz integral rule, which says that

d/dx ∫_{a(x)}^{b(x)} g(s, x) ds = ∫_{a(x)}^{b(x)} gx(s, x) ds + g(b(x), x) b′(x) − g(a(x), x) a′(x).

I have

wt = r(t, x; t) + ∫_0^t rt(t, x; τ) dτ = ∫_0^t rt(t, x; τ) dτ,

since r(t, x; t) = 0 by (1.25). Next

wtt = rt(t, x; t) + ∫_0^t rtt(t, x; τ) dτ = F(t, x) + ∫_0^t rtt(t, x; τ) dτ.

I also have

wxx = ∫_0^t rxx(t, x; τ) dτ = (1/c²) ∫_0^t rtt(t, x; τ) dτ.

Hence I get wtt − c2 wxx = F (t, x) and w(0, x) = wt (0, x) = 0, which concludes the proof. 

Now I can put everything together. According to Lemma 1.22, the solution to (1.25) is given by

r(t, x; τ) = (1/(2c)) ∫_{x−c(t−τ)}^{x+c(t−τ)} F(τ, η) dη.

Using d’Alembert’s formula, Lemma 1.21, and Lemma 1.23 I have that the solution to (1.21) is given
by

u(t, x) = ( f(x − ct) + f(x + ct) ) / 2 + (1/(2c)) ∫_{x−ct}^{x+ct} g(s) ds + (1/(2c)) ∫_0^t ∫_{x−c(t−τ)}^{x+c(t−τ)} F(τ, η) dη dτ,
which is the solution to the inhomogeneous wave equation with given initial conditions.
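For a concrete feel for this formula, here is a rough numerical sketch of mine (not from the notes): the data f, g, F below are sample choices of my own, and the integrals are evaluated by quadrature.

import numpy as np
from scipy.integrate import quad, dblquad

c = 1.0
f = lambda x: np.exp(-x**2)                    # initial displacement (sample choice)
g = lambda x: 0.0                              # initial velocity (sample choice)
F = lambda tau, x: np.sin(x) * np.exp(-tau)    # forcing term (sample choice)

def u(t, x):
    homogeneous = 0.5 * (f(x - c*t) + f(x + c*t)) + quad(g, x - c*t, x + c*t)[0] / (2*c)
    # Duhamel term: integrate F over the characteristic triangle with vertex (t, x)
    duhamel = dblquad(lambda eta, tau: F(tau, eta),
                      0, t,
                      lambda tau: x - c*(t - tau),
                      lambda tau: x + c*(t - tau))[0] / (2*c)
    return homogeneous + duhamel

print(u(0.5, 0.3))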

Remark 1.24. The same solution is obtained in the textbook by switching to the characteristic coor-
dinates ξ = x − ct, η = x + ct. I invite the students to read through this derivation.

Remark 1.25. I will use Duhamel’s principle again later in the course, so I summarize here the main
idea of this principle. It proceeds in two steps:

1. Construct a family of solutions of the homogeneous initial value problem with variable initial
moment τ > 0 and the initial data F (τ, x).

2. Integrate the above family with respect to the parameter τ .

Remark 1.26. The double integral in the final solution shows that the disturbance at the point (t, x)
is obtained through the disturbances from every point of the characteristic triangle (that is, of the
triangle with vertex (t, x) and two sides given by characteristics x − ct and x + ct). Hence if we have
only an initial displacement, the signal comes from just two points; if the initial velocity is nonzero,
then the signal comes from the whole interval (the domain of dependence); if the equation is nonhomogeneous,
the signal is summed (integrated) over the whole characteristic triangle.

Exercise 1.4. Show that the Duhamel principle, applied to the first order linear ODE u′ +p(t)u = q(t)
yields the variation of the constant method.

1.7 More on the wave equation


1.7.1 Half-infinite string
In this lecture I will assume that my string is half-infinite, that is, in addition to the initial conditions
on x ≥ 0 I also have a boundary condition at the point x = 0. When I derived the wave equation as
the limit of the system of masses connected by springs, I showed that the boundary condition of the
form
u(t, 0) = 0
corresponds physically to the fixed end of the string. It can be argued that the boundary condition

ux (t, 0) = 0

corresponds to a free end of the string. Let me start first with the fixed end.

I consider the problem

utt = c2 uxx , t > 0, x > 0,


u(0, x) = f (x), x > 0,
(1.26)
ut (0, x) = g(x), x > 0,
u(t, 0) = 0, t > 0.

To solve problem (1.26) I will prove the following lemma.

Lemma 1.27. If the initial conditions for the wave equation on the infinite string are odd functions
with respect to some point x0, then the corresponding solution at this point is equal to zero for all t.

Proof. Without loss of generality, assume that x0 = 0. Then I have that f (x) = −f (−x), g(x) =
−g(−x). Using d’Alembert’s formula for x = 0 yields
u(t, 0) = ( f(−ct) + f(ct) ) / 2 + (1/(2c)) ∫_{−ct}^{ct} g(s) ds = 0,

since the first term is zero because f is odd and the second one is also zero as an integral of an odd
function over a symmetric interval.

This lemma implies that to solve (1.26) all I need is to extend my initial conditions in an odd
fashion and apply d’Alembert’s formula.

Example 1.28. Consider again as the initial displacement the function that is equal to 1 on (0.5, 1.5)
and zero otherwise. If one continues this function in an odd fashion then we can use the usual d'Alembert's
formula. The details are given in Fig. 1.17.
Similarly we can treat the case when the initial displacement is zero and the initial velocity is now
given as the same function (see Fig. 1.18).
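A small computational sketch of mine of the odd-extension trick (not from the notes; c = 1 and zero initial velocity):

import numpy as np

c = 1.0
def f(x):                          # initial displacement, given only for x >= 0
    return np.where((x > 0.5) & (x < 1.5), 1.0, 0.0)

def f_odd(x):                      # odd extension to the whole line
    return np.where(x >= 0, f(x), -f(-x))

def u(t, x):                       # d'Alembert's formula for the extended data
    return 0.5 * (f_odd(x + c*t) + f_odd(x - c*t))

x = np.linspace(0, 4, 401)
for t in (0.0, 0.5, 1.0, 1.5):
    print(t, u(t, 0.0), u(t, x).min())   # u(t, 0) stays 0; a negative reflected pulse appears for larger t

For the free-end problem one would use the even extension instead of f_odd.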

How to deal with the problem with a free end? For this I can use

Lemma 1.29. If the initial conditions for the wave equation on the infinite string are even functions
with respect to some point x0, then the spatial derivative of the corresponding solution at this point is equal to
zero.

Exercise 1.5. Prove this lemma.

Hence to solve the problem for the wave equation with a free end one just needs to extend the
initial conditions, given for x > 0, to the whole axis in an even fashion, see Fig. 1.19. I will leave it as an
exercise to figure out what happens with the string if the initial velocity is given.
A similar technique can be used to solve problems with two boundary conditions.

1.7.2 The initial value problem for the wave equation is well posed
Recall that physically we are usually looking for well posed problems. It means that a solution ex-
ists, is unique, and depends continuously on the initial conditions. For the infinite string d’Alembert’s
formula proves existence of the solution and its uniqueness, since all the manipulations which I did
to obtain this formula are invertible, and hence the formula itself is equivalent to the equations and

Figure 1.17: Solution to the wave equation with one fixed end and zero initial velocity

initial conditions. The continuous dependence on the initial conditions also follows immediately from
d’Alembert’s formula. In particular, if we are given the problem with f1 , g1 and f2 , g2 respectively, I

Figure 1.18: Solution to the wave equation with one fixed end and zero initial displacement

get
|u1(t, x) − u2(t, x)| ≤ |f1(x − ct) − f2(x − ct)|/2 + |f1(x + ct) − f2(x + ct)|/2 + (1/(2c)) ∫_{x−ct}^{x+ct} |g1(s) − g2(s)| ds,
Figure 1.19: Solution to the wave equation with a free end at x = 0 and zero initial velocity

which proves that small deviations in initial conditions imply small deviations in solutions.

Chapter 2

Fourier method

2.1 The heat or diffusion equation


In this lecture I will show how the heat equation

ut = α2 ∆u, α2 ∈ R, (2.1)

where ∆ is the Laplace operator, naturally appears macroscopically, as the consequence of the con-
servation of energy and Fourier’s law. Fourier’s law also explains what is the physical meaning of
various initial conditions. I will also give a microscopic derivation of the heat equation, as the limit
of a simple random walk, thus explaining its second title — the diffusion equation.

2.1.1 Conservation of energy plus Fourier’s law imply the heat equation
In one of the first lectures I deduced the fundamental conservation law in the form ut + qx = 0 which
connects the quantity u and its flux q. Here I first generalize this equality for more than one spatial
dimension.
Let e(t, x) denote the thermal energy at time t at the point x ∈ Rk , where k = 1, 2, or 3 (straight
line, plane, or usual three dimensional space). Note that I use bold font to emphasize that x is a
vector. The law of the conservation of energy tells me that

the rate of change of the thermal energy in some domain D is equal to the flux of the
energy inside D minus the flux of the energy outside of D and plus the amount of energy
generated in D.

So in the following I will use D to denote my domain in R1 , R2 , or R3 . Can it be an arbitrary do-


main? Not really, and for the following to hold I assume that D is a domain without holes (this is called
simply connected ) and with a piecewise smooth boundary, which I will denote ∂D. Mathematically,
the law of the conservation of the energy can be written as
d/dt ∫∫∫_D e(t, x) dx = − ∮_{∂D} q(t, x) · n dS + ∫∫∫_D f∗(t, x) dx.

Here q is the flux (note that now it is a vector, naturally, since the notion of the flux assumes a
direction), n is the outward normal to the domain D, and the first integral in the right hand side is

taken along the boundary of D, the minus sign is necessary because n is the outward normal. The
dot denotes the usual dot product in Rk . Function f ∗ specifies the energy generated (or absorbed if
it is negative) inside D.
The divergence (or Gauss) theorem says that
∮_{∂D} q(t, x) · n dS = ∫∫∫_D ∇ · q(t, x) dx,

where ∇ is a differential operator, often called “del” or “nabla”, in Cartesian coordinates


∇ = (∂x , ∂y , ∂z ).
Putting everything together I get
∫∫∫_D ( et(t, x) + ∇ · q(t, x) − f∗(t, x) ) dx = 0,

which implies the fundamental conservation law in an arbitrary number of dimensions:


et + ∇ · q − f ∗ = 0. (2.2)
To proceed, I will use the relation of the temperature u(t, x) and the thermal energy as
e(t, x) = c(x)ρ(x)u(t, x),
where c(x) is the heat or thermal capacity (how much energy we must supply to raise the temperature
by one degree), and ρ(x) is the density, i.e., the mass per volume unit; for many materials I can assume
that both c and ρ are constants. Finally I will use the Fourier’s law that says that
The flux of the thermal energy is proportional to the gradient of the temperature, i.e.,
q(t, x) = −k∇u(t, x),

where k is called the thermal conductivity. The minus sign describes intuitively expected fact that
the heat energy flows from hotter to cooler regions (think about one dimensional geometry, when
∇u = ux ).
Hence
ut = (1/(cρ)) ∇ · (k∇u) + f∗/(cρ),
or, using the notations
α² = k/(cρ),    f = f∗/(cρ),    ∆ = ∇² = ∇ · ∇,
the nonhomogeneous heat equation
ut = α2 ∆u + f. (2.3)
In Cartesian coordinates in 3D I have (assuming for the moment that f = 0)
ut = α2 (uxx + uyy + uzz ) .
If I am dealing with one-dimensional geometry my equation becomes
ut = α2 uxx ,
which is a particular case of (2.1).

2.1.2 Initial and boundary conditions for the heat equation
In general, I will need the initial and boundary conditions to guarantee that my problem is well posed.
The initial condition is given by
u(0, x) = g(x), x ∈ D,
which physically means that we have an initial temperature at every point of my domain D.
To consider different types of the boundary conditions I will concentrate on the case when I deal
with D ⊆ R1 , i.e., my domain is simply an interval (0, l) on the real line. Physically one should imagine
a laterally isolated rod of length l, and I am interested in describing the changes in the temperature
profile inside this rod.

• Type I or Dirichlet boundary conditions. In this case I fix the temperature of the two ends of
my rod:
u(t, 0) = h1 (t), u(t, l) = h2 (t), t > 0.
Please note that h1 and h2 specify not the temperature of the surrounding medium around the
ends of the rod but the exact temperature of the ends themselves, which can be mechanically
achieved by using some kind of thermostats fixed at the ends.

• Type II or Neumann boundary conditions. In this case I fix the flux at the boundaries:

ux (t, 0) = g1 (t), ux (t, l) = g2 (t),

where g1 (t) > 0, g2 (t) > 0 implies that the heat goes from the right to the left, and from the
left to the right otherwise. The case g1 = g2 = 0 is very important and corresponds, clearly, to
no flux condition, or, in other words, to the insulated ends of the rod.

• Type III or Robin boundary conditions. This means that the temperature of the surrounding
medium is specified. I will use the Newton’s law of cooling, together with Fourier’s law, to obtain
in this case
ux(t, 0) = (h/k)( u(t, 0) − q1(t) ),    ux(t, l) = −(h/k)( u(t, l) − q2(t) ),
where q1, q2 are the temperatures of the surrounding medium near the left and the right ends respectively. Here k
is, as before, the thermal conductivity, and h is the so-called heat exchange coefficient (which is quite
difficult to measure in real systems). Here Newton's law of cooling appears in the form of the
difference of two temperatures. Note also that I am careful about signs in my expressions to
guarantee that the heat flows from hotter to cooler places, as intuitively expected. Sometimes
the same boundary conditions can be written in a more mathematically neutral form as

α1 u(t, 0) + β1 ux (t, 0) = q1 (t), α2 u(t, l) + β2 ux (t, l) = q2 (t),

for some constants αi , βi , i = 1, 2.

Similarly the boundary conditions for two or three dimensional spatial domains can be defined. Some-
times part of the boundary has Type I conditions and part has Type II conditions. In this case it is
said that mixed boundary conditions are set. If hi , gi , qi are identically zero then it is said that the
boundary conditions are homogeneous.

Example 2.1. Suppose we have a copper rod 200 cm long that is laterally insulated and has an initial
temperature 0◦ C. Suppose that the top of the rod (x = 0) is insulated, while the bottom (x = 200) is
immersed into moving water that has a constant temperature of q2 (t) = 20◦ C.
The mathematical model for this problem will be

PDE:  ut = α² uxx ,  0 < x < 200,  t > 0,

BC:   ux(t, 0) = 0,
      ux(t, 200) = −(h/k)( u(t, 200) − 20 ),

IC:   u(0, x) = 0,  0 ≤ x ≤ 200.
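Although we will solve such problems analytically later, a crude explicit finite-difference sketch (my own, not part of the notes; the numerical values of α², h/k and the run length are made up for illustration) already lets one watch the temperature profile evolve:

import numpy as np

alpha2, h_over_k = 1.11, 0.02        # assumed diffusivity (cm^2/s) and h/k (1/cm)
L, n = 200.0, 201
dx = L / (n - 1)
dt = 0.4 * dx**2 / alpha2            # explicit scheme needs dt <= dx^2 / (2*alpha2)

u = np.zeros(n)                      # initial temperature 0 everywhere
for _ in range(20000):
    unew = u.copy()
    unew[1:-1] = u[1:-1] + alpha2 * dt / dx**2 * (u[2:] - 2*u[1:-1] + u[:-2])
    unew[0] = unew[1]                                        # insulated top: u_x = 0
    hd = h_over_k * dx
    unew[-1] = (unew[-2] + hd * 20.0) / (1 + hd)             # Robin condition at x = 200
    u = unew
print(u[::50])                       # a sample of the temperature profile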
Exercise 2.1. Can you guess what happens with the solution to the previous problem when t → ∞?
Can you prove your expectations mathematically?

2.1.3 A microscopic derivation of the diffusion equation


Consider a one dimensional simple random walk. This means that I have, e.g., a particle that moves h
units to the right with probability p and h units to the left with probability q, p + q = 1, starting from
the origin, one step every τ units of time (see figures for some inspiration). My goal in this problem
is to determine the probability, which I denote uk,N , that after N steps (i.e., at the time t = N τ ) I
will find this particle at the position kh, −N ≤ k ≤ N . The usual notation is
uk,N = P {X = kh} ,
for the position X of the particle, which is an example of a random variable.
Exercise 2.2. Let p = q = 1/2, N = 3, h = 1. What are uk,3, −3 ≤ k ≤ 3? What is ∑_{k=−3}^{3} uk,3?

Figure 2.1: Two examples of a symmetric (p = q = 1/2) random walk with h = τ = 1, N = 50

Actually, the previous exercise can be solved exactly for arbitrary k and N. I am, however, more
interested in understanding what happens if h, τ → 0 and hence my simple random walk becomes
continuous in both time and space. I will use, taking into account my limiting procedure, the notation
uk,N = u(t, x).

Figure 2.2: Left: ten realizations of a symmetric (p = q = 1/2) random walk with h = τ = 1 and
N = 1000. Right: ten realizations of a random walk with drift (p = 0.56, q = 0.44)
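Trajectories like the ones in these figures are easy to generate; here is a minimal sketch of mine (with p the probability of a step to the right, as above):

import numpy as np

rng = np.random.default_rng(0)

def random_walk(N, p=0.5, h=1.0):
    # +h (right) with probability p, -h (left) with probability q = 1 - p
    steps = np.where(rng.random(N) < p, h, -h)
    return np.concatenate(([0.0], np.cumsum(steps)))

symmetric = [random_walk(1000) for _ in range(10)]            # p = q = 1/2
with_drift = [random_walk(1000, p=0.56) for _ in range(10)]   # walk with drift
print(symmetric[0][-1], with_drift[0][-1])                    # final positions of one sample each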

To obtain the desired result I write down the fundamental relation

u(t + τ, x) = pu(t, x − h) + qu(t, x + h),

which literally says that the probability to find the particle at the position x at time t + τ can be
found as the sum of the probability to be at the position x − h at time t times the probability move
to the right (p) and the probability to be at the position x + h at time t times the probability move
to the left (q). (In probability theory this is called the law of total probability, but do not worry if
you did not see this before).
Now I assume that if h, τ → 0 then u becomes a sufficiently smooth function of x and t, such that
I can use Taylor’s series similar to what I did when I deduced the wave equation. I have

u(t + τ, x) = u(t, x) + ut (t, x)τ + o(τ ),


u(t, x ± h) = u(t, x) ± ux(t, x)h + (1/2) uxx(t, x)h² + o(h²),

where g(x) = o(f(x)) means the terms such that lim_{x→0} g(x)/f(x) → 0. For example, τ² = o(τ), h³ = o(h²);
o(1) means any expression tending to 0 as x → 0. I plug these series into the fundamental relation,
cancel terms and find that
ut τ + o(τ) = (q − p)h ux + (h²/2) uxx + o(h²).
Dividing by τ yields

ut + o(1) = ((q − p)h/τ) ux + (h²/(2τ)) uxx + o(h²/τ).
Now to get a meaningful result, I must consider a special way when both h and τ tend to zero. First,
I assume that
lim_{h,τ→0} h²/(2τ) = α² > 0.

Second,

lim_{h,τ→0} (q − p)h/τ = lim_{h,τ→0} (q − p)h²/(hτ) = 2βα² =: c

for a constant β; this can always be achieved by taking

q = 1/2 + (β/2)h + o(h),    p = 1/2 − (β/2)h + o(h),
i.e., when the random walk is microscopically symmetric.
Now taking the limits τ, h → 0 yields the diffusion equation with drift

ut = α2 uxx + cux .

If we do not have the drift, i.e., p = q = 0.5, then we recover the familiar homogeneous heat equation

ut = α2 uxx .

Now it should be clear why α2 is often called the diffusivity or diffusion coefficient.
As a side remark I note that if α2 = 0 then I end up with the familiar linear transport equation

ut − cux = 0,

which has the general solution


u(t, x) = F (x + ct),
which is geometrically a linear traveling wave moving to the left if c > 0 (i.e., q > p) or to the right if
c < 0 (i.e., q < p), as expected.
As a final remark I note that when h, τ → 0 then uk,N ceases to be the actual probability
and becomes, in the language of probability theory, the probability density function such that the
probability to find a particle in the interval [x1 , x2 ] at time t becomes
P{x1 ≤ X ≤ x2} = ∫_{x1}^{x2} u(t, x) dx.

2.2 Motivation for Fourier series


Donkeys and scholars in the middle!

French soldiers,
forming defensive squares in Egypt,
during Napoleon’s campaign

The central topic of our course is the so-called Fourier method or method of separation of variables to
solve linear PDE with constant coefficients. This method historically required a detailed analysis of
the question when a given arbitrary function can be represented as a linear combination of trigono-
metric functions, and partial answers to this question eventually turned into a wide mathematical field
that is nowadays called Fourier Analysis. To motivate the appearance of Fourier series, i.e., infinite
linear combination of trigonometric functions, I will use exactly the same problem that is considered

in the textbook. Not to repeat the same arguments I will look at a somewhat more general picture
that involves the notion of functions of complex variable, probably unfamiliar to most of the students.
Consider an insulated circular rod with some initial temperature distribution along it. I am
interested in answering the question how the temperature changes with time at each point of the rod.
Using the experience from the previous lecture I can write that my temperature at the point x at the
time t, which as before I denote u(t, x), must satisfy the heat equation

ut = α2 uxx , t > 0, −π < x < π, (2.4)

note that I assumed that my spatial variable changes from −π to π, using the geometry of my rod.
I also have the initial condition

u(0, x) = g(x), −π ≤ x ≤ π. (2.5)

Clearly, none of the considered in the previous lecture boundary conditions would work in this
particular case. A little thought, however, shows that it is natural to set here periodic boundary
conditions
u(t, −π) = u(t, π), t > 0,
(2.6)
ux (t, −π) = ux (t, π), t > 0,

such that the profile of the temperature in my rod is continuously differentiable at every point.
To solve problem (2.4)–(2.6) I make, following the giants of 18th century, an ingenious assumption
that I can look for the solution in the form

u(t, x) = T (t)X(x),

i.e., as the product of two functions, the first one depends only on t and the second one depends only
on x (I will say more about this assumption later on in the course). Using this ansatz (“ansatz” in
mathematics is an educated guess) yields

T ′ (t)X(x) = α2 T (t)X ′′ (x),

where the prime denotes the derivative with respect to the corresponding variables. Rearranging
implies
T′(t)/(α² T(t)) = X′′(x)/X(x),
i.e., the left hand side of this equality depends only on t, and the right hand side depends only on
x. What is an immediate consequence of this fact? Both sides must be equal to the same constant!
Indeed, fix some t and hence the left hand side is constant, which implies that the right hand side is
constant for any x, now go in the opposite direction. Hence, I can write
T′(t)/(α² T(t)) = X′′(x)/X(x) = µ,
where µ (and this is a significant difference with the textbook) is some complex constant, µ ∈ C. The
consequence of the last two equalities is the following two differential equations

T ′ = α2 µT (2.7)

and
X ′′ = µX. (2.8)
Before proceeding further, let me check what my assumption on the structure of solutions of the
heat equation implies for the initial and boundary conditions. The initial condition does not give me
much insight at this point, but the boundary conditions now read

T (t)X(−π) = T (t)X(π), T (t)X ′ (−π) = T (t)X ′ (π),

which implies, naturally assuming that T (t) ̸= 0, that

X(−π) = X(π), X ′ (−π) = X ′ (π). (2.9)

Problem (2.8), (2.9) is an ODE boundary value problem, and my task is to determine for which
complex constants µ this problem has a non-zero solution (zero solution exists for any µ but it is of
no interest to me assuming that g(x) ̸= 0), and how exactly this solution looks for the given µ.
I start with equation (2.8) and use a few facts from the theory of linear ODE. I do hope that the
students can solve this problem for real values of µ.
Exercise 2.3. Solve problem (2.8), assuming that µ ∈ R. Assume first that µ > 0, then µ < 0, µ = 0.
Technically the case when µ ∈ C was not covered in the introductory ODE course. Actually, the
difference with the real case is minuscule, but let me proceed carefully in this case (still omitting
some technical steps). I will look for the solution in the form of a power series with undetermined
coefficients:


X(x) = c0 + c1 x + c2 x2 + . . . = ck xk .
k=0
Differentiating yields

X ′ (x) = c1 + 2c2 x + 3c3 x2 + . . .


X ′′ (x) = 2c2 + 3 · 2c3 x + 4 · 3c4 x2 + . . . .

Hence I must have

2c2 + 3 · 2c3 x + 4 · 3c4 x2 + . . . = µc0 + µc1 x + µc2 x2 + . . . .

Two power series are equal if and only if the coefficients at the same powers are equal, that is I have

2c2 = µc0 ,
3 · 2c3 = µc1 ,
4 · 3c4 = µc2 ,
5 · 4c5 = µc3 ,
...

It is very easy to see the pattern, which I can succinctly put as


c_{2m} = µ^m c0 / (2m)!,    c_{2m+1} = µ^m c1 / (2m + 1)!,

where
m! = 1 · 2 · . . . · (m − 1) · m.
Therefore, if I know c0 and c1 my problem is solved. Recall that the space of solutions of equation
(2.8) is two dimensional, and also note that X(0) = c0 , X ′ (0) = c1 , therefore I can pick any two
linearly independent vectors (c0 , c1 ) to produce two solutions that will form a basis of my solution

space. I am going to pick first c0 = 1, c1 = √µ, and get that

X1(x) = ∑_{k=0}^{∞} (√µ x)^k / k!,

and then c0 = 1, c1 = −√µ to have the second solution

X2(x) = ∑_{k=0}^{∞} (−√µ x)^k / k!.

Exercise 2.4. What happens if I pick c0 = 1, c1 = 0 and c0 = 0, c1 = 1, which may look like a more
natural choice? Hint: If you are completely lost at this point, it may help to read to the end of this
lecture and return to this exercise after.

Now an attentive reader should notice that the power series look similar to the ones that were
studied in Calculus under the name of Taylor's series. In particular, these are exactly the series for the
exponential function; the only problem is that such series with a complex argument were not studied. To put aside
this problem I simply (for those who would undertake the study of functions of complex variables in
the future this word “simply” should be remembered at some point :) ) put forward the following
definition.

Definition 2.2. The exponential function, exp z or ez , of the complex variable z ∈ C is defined as

exp z = e^z := 1 + z + z²/2! + . . . = ∑_{k=0}^{∞} z^k / k!.

Remark 2.3. No discussion was supplied about convergence of these power series. It can be proved
(not complicated but takes some time) that this series converges absolutely for any fixed z. For those
who want to understand the definitions literally, the undefined words “converges absolutely” should be
read “makes sense.”

Remark 2.4. The definition of the exponential function together with binomial formula yield probably
the most important property of ez :
ez1 +z2 = ez1 ez2 .
I invite a curious student to prove it.

Using my definition of the exponential function I can write my two linearly independent solutions
to (2.8) as X1(x) = e^{√µ x} and X2(x) = e^{−√µ x}, and the general solution as

X(x) = A X1(x) + B X2(x) = A e^{√µ x} + B e^{−√µ x},

where A and B are two (complex) arbitrary constants. To be precise this formula works only if µ ̸= 0,
but if µ = 0 then two integrations yield

X(x) = A + Bx.

Now it is time to look more closely at the boundary conditions (2.9).


First, let µ = 0, then I must have that

A + Bπ = A − Bπ,

which implies B = 0, and the second boundary condition is satisfied automatically. Therefore I found
that for µ = 0 I always have a nontrivial solution to my boundary value problem, which is

X0 = A,

where A is an arbitrary constant.


Now let µ ̸= 0. For future use I will define two new functions, the hyperbolic sine, sinh, and
hyperbolic cosine, cosh:

Definition 2.5.
sinh z = (e^z − e^{−z}) / 2,    cosh z = (e^z + e^{−z}) / 2.
Using the boundary conditions I find
A e^{√µ π} + B e^{−√µ π} = A e^{−√µ π} + B e^{√µ π},
√µ A e^{√µ π} − √µ B e^{−√µ π} = √µ A e^{−√µ π} − √µ B e^{√µ π},

or, using my definitions,


A sinh(√µ π) = B sinh(√µ π),
A sinh(√µ π) = −B sinh(√µ π).

These equalities can be true only if A = B = 0, which gives us a trivial zero solution, or if

sinh(√µ π) = 0.

To proceed I will define yet two more functions.

Definition 2.6.
sin z = (e^{iz} − e^{−iz}) / (2i),    cos z = (e^{iz} + e^{−iz}) / 2,
where i is the imaginary unit, i2 = −1.

Exercise 2.5. Prove that defined in this way functions of the complex variable sine and cosine coincide
with our familiar trigonometric functions for the case when z is a real variable.

Exercise 2.6. What is cos i?

Note that the last definition implies immediately the Euler’s formula

eiz = cos z + i sin z.
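These definitions are easy to test numerically; a tiny sanity check of mine (the value of z is arbitrary):

import cmath
z = 0.7 + 1.3j
print(cmath.exp(1j * z))                     # exp(iz)
print(cmath.cos(z) + 1j * cmath.sin(z))      # cos z + i sin z, the same number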



Now I am ready to tackle the solution of the equation sinh(√µ π) = 0. I rewrite it as

e^{2√µ π} = 1

and assume that √µ = α + iβ, where α, β ∈ R. Using Euler's formula, I have

e^{2απ} (cos 2πβ + i sin 2πβ) = 1 + i · 0.

Two complex numbers are equal if and only if the real and imaginary parts are equal, hence

e2απ cos 2πβ = 1, e2απ sin 2πβ = 0,

where now all the constants are real and we can use our knowledge about the trigonometric functions.
I immediately conclude that the last two equalities can be true if and only if

α = 0, β = k, k ∈ Z = {0, ±1, ±2, . . .}.

Returning to the original constant µ I get



√µ = α + iβ = ik, k ∈ Z =⇒ µ = −k², k = 1, 2, . . .

In words, my boundary value problem has a nonzero solution only if the values of the constant in
(2.8) are −k 2 , where k = 1, 2, . . .. The solutions, and I am from now on going to use the index k to
emphasize the dependence on k, are

Xk (x) = Ak eikx + Bk e−ikx , k = 1, 2, . . . ,

where still Ak , Bk are some arbitrary complex constants. Since my solution describes the temperature
I am mostly interested in a real valued solution. A complex expression M is real if and only if M = M̄,
where the bar means complex conjugate. To have all Xk(x) real I hence must have that Ak = B̄k.
Let Ak = ak/2 + i bk/2 for some real ak, bk. Then my real solutions are

Xk(x) = (ak/2 + i bk/2)e^{ikx} + (ak/2 − i bk/2)e^{−ikx} = ak cos kx + bk sin kx,

using Euler’s formula.


Now I finally can summarize that my boundary value problem (2.8), (2.9) with the complex variable
µ has nontrivial solutions only if µ = −k 2 , where k = 0, 1, 2, . . ., and these solutions are given by

Xk (x) = ak cos kx + bk sin kx,

where the constants are real. Note that the case µ = 0 is also included in these formulas.
Now I can consider solutions to (2.7):

Tk(t) = Ck e^{−k²α²t},

and hence each function

uk(t, x) = Tk(t)Xk(x) = e^{−α²k²t}(ak cos kx + bk sin kx),    k = 0, 1, 2, . . .

solves the heat equation (2.4) and satisfies the boundary conditions (2.6). What about the initial
condition? Here I will rely again, as before in Duhamel's principle, on the linearity of the equation,
and in particular on the principle of superposition: If u1 , u2 solve (2.4) then any linear combination
of these functions also solves it. Hence I can write that an infinite linear combination

∑ ∞

e−α
2 k2 t
u(t, x) = uk (t, x) = a0 + (ak cos kx + bk sin kx)
k=0 k=1

solves the heat equation. I use the initial condition to find (I replace for some notational reasons a0
with a∗0 /2)

g(x) = a∗0/2 + ∑_{k=1}^{∞} (ak cos kx + bk sin kx).
And this was Jean-Baptiste Joseph Fourier, French scholar, who was one of the scientists in the
French Army during the Egyptian expedition under Napoleon’s lead at the beginning of the 19th
century, who in 1822 declared, to a big surprise and disbelief of the mathematical community, that
any function can be represented as a linear combination of trigonometric functions. In other words, he
meant that given g, he can always find ak , bk , and the corresponding series converge (“makes sense”).
Our next task is to actually figure out how to determine ak , bk and also to contemplate a little about
the question whether the found series provides us with a legitimate classical solution to the heat
equation.

2.3 Fourier series


In this lecture I will talk about the trigonometric Fourier’s series:

g(x) = a0/2 + ∑_{k=1}^{∞} (ak cos kx + bk sin kx),    −π ≤ x ≤ π,    (2.10)

where ak , bk are real constants.


My discussion will be geared towards computational aspects, theoretical discussion can be found in
the textbook (and I also, if you somehow became interested in Fourier series, very highly recommend
the set of lecture notes by Brat Osgood, Fourier Transform and Its Applications, you can find them
on the Internet).
There are several relevant questions which need to be addressed:
• If (2.10) is true then how to find ak , bk ?

• What if my interval is different from [−π, π]?

• In what sense must the equality sign in (2.10) be understood? Note that the right hand side is
an infinite series and hence a discussion on convergence is relevant.

• Is it possible to replace {1, cos kx, sin kx} with, say, only {cos kx}? or only {sin kx}?

• These series are important for us because they represent solutions to the second order PDE and
hence in order to be classical solutions they should be differentiable. What are the conditions
for (2.10) such that I can take a derivative of the right hand side and obtain a Fourier series for
g′ ?

2.3.1 Formulas for the coefficients of (2.10)


Since I am using sines and cosines (2.10) is called a trigonometric Fourier’s series. The right hand side
of (2.10) means that I represent g as an infinite linear combination of {1, cos kx, sin kx}_{k=1}^{∞}, which I
will call the trigonometric system of functions. This system possesses a special property, which makes
computations particularly simple. To introduce it, I first introduce an inner product on the set of
functions defined on [−π, π].
Definition 2.7. Let f, g be two functions defined on [−π, π] and assume that ∫_{−π}^{π} f²(x) dx < ∞ and
∫_{−π}^{π} g²(x) dx < ∞. I will denote the set of all possible such functions as L²[−π, π]. The inner product
of f, g ∈ L²[−π, π] is

⟨f, g⟩ = ∫_{−π}^{π} f(x)g(x) dx.

Exercise 2.7. Show that for any f, g, h ∈ L2 [−π, π]

⟨f, f ⟩ ≥ 0, ⟨f, f ⟩ = 0 ⇔ f = 0,
⟨af, g⟩ = a⟨f, g⟩, a ∈ R,
⟨f, g⟩ = ⟨g, f ⟩,
⟨f + g, h⟩ = ⟨f, h⟩ + ⟨g, h⟩.

The key property I talked about above is orthogonality.

Definition 2.8. Two functions f, g ∈ L2 [−π, π] are called orthogonal if

⟨f, g⟩ = 0.

Lemma 2.9. The trigonometric system of functions is orthogonal, meaning that each function in this
set belongs to L2 [−π, π] and any pair of them is orthogonal.

Proof. First,
∫_{−π}^{π} 1² dx = 2π < ∞,    ∫_{−π}^{π} cos² kx dx = π < ∞,    ∫_{−π}^{π} sin² kx dx = π < ∞,

which proves that all my functions are in L2 . Second, to show orthogonality, I need to calculate
⟨1, cos kx⟩ = ∫_{−π}^{π} 1 · cos kx dx = (sin kx)/k |_{x=−π}^{x=π} = 0,

⟨1, sin kx⟩ = ∫_{−π}^{π} 1 · sin kx dx = −(cos kx)/k |_{x=−π}^{x=π} = 0,

⟨cos kx, sin mx⟩ = ∫_{−π}^{π} cos kx sin mx dx = 0,

⟨cos kx, cos mx⟩ = ∫_{−π}^{π} cos kx cos mx dx = 0 for k ≠ m and π for k = m,

⟨sin kx, sin mx⟩ = ∫_{−π}^{π} sin kx sin mx dx = 0 for k ≠ m and π for k = m,

where k, m = 1, 2, 3, . . .. 

Remark 2.10. To evaluate the above integrals I used


sin α sin β = ( cos(α − β) − cos(α + β) ) / 2,
cos α cos β = ( cos(α − β) + cos(α + β) ) / 2,
sin α cos β = ( sin(α + β) + sin(α − β) ) / 2.
Now I will use the orthogonality of the trigonometric system of functions to find ak , bk in (2.10).
Let me take the inner product of the left and right hand sides of (2.10) with 1:

⟨g, 1⟩ = ⟨ a0/2 + ∑_{k=1}^{∞} (ak cos kx + bk sin kx), 1 ⟩.

I use the properties of the inner product (see Exercise 2.7) to have
⟨g, 1⟩ = (a0/2) ⟨1, 1⟩ + ∑_{k=1}^{∞} ( ak ⟨cos kx, 1⟩ + bk ⟨sin kx, 1⟩ ).

Now notice that due to the orthogonality all the terms except ⟨1, 1⟩ are zero:
⟨g, 1⟩ = (a0/2) · 2π =⇒ a0 = ⟨g, 1⟩/π = (1/π) ∫_{−π}^{π} g(x) dx.

Similarly, taking the inner product with cos mx and sin mx respectively I find (switching again m
with k)
ak = ⟨g, cos kx⟩/π = (1/π) ∫_{−π}^{π} g(x) cos kx dx,    k = 0, 1, 2, . . .
bk = ⟨g, sin kx⟩/π = (1/π) ∫_{−π}^{π} g(x) sin kx dx,    k = 1, 2, . . . ,        (2.11)

note that I also included the case a0 in my formulas, and this was the reason to have a0 /2 in (2.10).
Equations (2.11) give me the Fourier coefficients of the trigonometric Fourier series (2.10).
Often I will need to consider the set of functions L2 [−l, l], where l is some constant. I can simply
introduce a new variable y in the way that
x = πy/l,
such that I am rescaling my variable, and when y = ±l then x = ±π. Using this change of variable,
and replacing again y with x I immediately conclude that
Lemma 2.11. The system of functions {1, cos(πkx/l), sin(πkx/l)} is orthogonal with respect to the inner
product

⟨f, g⟩ = ∫_{−l}^{l} f(x)g(x) dx.

If function g ∈ L2 [−l, l] can be represented by its trigonometric Fourier series


g(x) = a0/2 + ∑_{k=1}^{∞} ( ak cos(πkx/l) + bk sin(πkx/l) ),    −l ≤ x ≤ l,

then the coefficients are found as


ak = (1/l) ∫_{−l}^{l} g(x) cos(πkx/l) dx,    k = 0, 1, 2, . . .
bk = (1/l) ∫_{−l}^{l} g(x) sin(πkx/l) dx,    k = 1, 2, . . . .        (2.12)

Note that Lemma 2.11 answers the first two questions that I posed.

2.3.2 On the convergence of (2.10)


Example 2.12. To motivate the discussion of the convergence of the Fourier series and practice the
formulas (2.12) consider the following example:
g(x) = 0 for −1 ≤ x ≤ 0,    1 − x for 0 < x ≤ 1.

I find that (note that I calculate a0 and ak , k = 1, 2, . . . separately):


a0 = ∫_0^1 (1 − x) dx = 1/2,

ak = ∫_0^1 (1 − x) cos πkx dx = (1 − cos πk)/(πk)² = (1 − (−1)^k)/(πk)²,

bk = ∫_0^1 (1 − x) sin πkx dx = 1/(πk).

Figure 2.3: Comparison of the graphs of the original function g (light gray) and the partial sums Sk
(dark grey) of the corresponding Fourier series

Now I can form the partial sums

Sk(x) = a0/2 + ∑_{m=1}^{k} (am cos πmx + bm sin πmx),

and compare them with the graph of the initial function. My expectation is that the bigger the value
of k I take, the closer the graph of Sk should be to the graph of g. Indeed, my expectation is correct,
see Fig. 2.3. Moreover, for bigger k the approximation becomes better and better (Fig. 2.4).
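These pictures are easy to reproduce; here is a short sketch of mine using the coefficients computed above:

import numpy as np

def g(x):
    return np.where(x > 0, 1.0 - x, 0.0)          # the function from this example on [-1, 1]

def S(k, x):                                      # partial sum of the Fourier series
    result = 0.25 * np.ones_like(x)               # a0/2 with a0 = 1/2
    for m in range(1, k + 1):
        am = (1 - (-1)**m) / (np.pi * m)**2
        bm = 1.0 / (np.pi * m)
        result += am * np.cos(np.pi * m * x) + bm * np.sin(np.pi * m * x)
    return result

x = np.linspace(-1, 1, 401)
for k in (2, 5, 20, 100):
    print(k, np.max(np.abs(S(k, x) - g(x))))      # the deviation near the jump at x = 0 does not vanish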

Figure 2.4: Comparison of the graphs of the original function g (light gray) and the partial sums Sk
(dark grey) of the corresponding Fourier series

Before jumping to conclusions let me answer another question: What would happen if I look at
the values of x outside of the interval [−1, 1]? Both my function and the partial sums of Fourier series
are obviously defined for them. Let me take, say, x ∈ [−2, 2] (see Fig. 2.5, left panel).
As it should be expected, the partial sums of the Fourier series are periodic functions (in my case with
period 2), and the original function is not periodic. However, I can always take a periodic extension
of my function g, such that the new function becomes also periodic with period 2. Instead of writing
careful (and useless) definition, just look at Fig. 2.5, right panel, to see what a periodic extension is.

Figure 2.5: Left: Comparison of the graphs of the original function g (light gray) and the partial sum
S10 (dark grey) on the interval [−2, 2]. Right: Comparison of the graphs of the periodic extension of
the function g (light gray) and the partial sum S10 (dark grey) on the interval [−5, 5]

Finally, my function g, as well as the corresponding periodic extension, is not continuous, whereas
the partial sums, being sums of continuous functions, are continuous. In the limit k → ∞ the following
happens: at the points of continuity the series converges to the (periodically extended) function, while if the original function has a jump
discontinuity, then the corresponding Fourier series converges exactly to the middle of the interval of
discontinuity.
Let me put together all the information we gained in the previous example. I start with a definition
of a piecewise smooth function.
Definition 2.13. Function g : [a, b] −→ R is said to be piecewise smooth of the class C(k) if it belongs to
the class C(k) at any point of the interval [a, b] except, possibly, a finite number of points of discontinuity
x1 < x2 < . . . < xp. Moreover, the right and left limits at all these points exist and are finite (i.e., all the
discontinuities are of the jump type):

lim_{x→xj+0} g(x) = g(xj + 0),    lim_{x→xj−0} g(x) = g(xj − 0),    j = 1, . . . , p.

The value g(xj + 0) − g(xj − 0) is called the magnitude of the jump.


Now I can state
Theorem 2.14 (Dirichlet). Let g : [−l, l] −→ R be a piecewise smooth function of the class C (1) , and
let g̃ : R −→ R be its 2l-periodic extension. Then, at any point x ∈ R, its Fourier series converges
to the value g̃(x) if g̃ is continuous at x, or to (g̃(x + 0) + g̃(x − 0))/2 if x is a point of a jump
discontinuity.
This theorem (partially) answers the third question from my list.

2.3.3 Sine and cosine series. Odd and even extensions
We had only one example of calculating Fourier coefficients, but it should already be clear that this
is the step that sometimes requires tedious calculations. It is always nice when we can see
the answers immediately, without calculating integrals. Here is one such trick.
Recall that f is odd if f (−x) = −f (x) for all x, geometrically the graph of an odd function is
symmetric with respect to the origin; and f is even if f (−x) = f (x) for all x, the graph of the even
function is symmetric with respect to the y-axis. An example of an odd function is sine, and of an
even function is cosine.

Exercise 2.8. Show that the product of two even or two odd functions is even, and the product of
even and odd functions is odd.

Now we can use this symmetry to calculate the integrals. Note that if f is odd then the integral
over a symmetric interval is zero, and if f is even, then the integral over the
interval [−l, l] is double the integral from 0 to l.

Lemma 2.15. Let g be an odd function. Then


ak = 0,    bk = (2/l) ∫_0^l g(x) sin(πkx/l) dx,    k = 0, 1, . . . ,

Let g be an even function. Then


ak = (2/l) ∫_0^l g(x) cos(πkx/l) dx,    bk = 0,    k = 0, 1, . . . .

In other words, the trigonometric Fourier series of odd functions contain only sines, the trigono-
metric Fourier series of even functions contain only a constant and cosines. This is great news for
many calculations; however, there is a much deeper consequence of the last lemma. Note first that all
the integrals are calculated from 0 to l.
Now assume that I need to find a Fourier series of a function defined on [0, l]. I can, of course,
shift this interval by l/2 and use the formulas for the full Fourier series. I also can extend my function
evenly, for example (recall the discussion of the solution of the wave equation on a half-infinite string!),
and in this case my Fourier series will consists only of cosines and a constant. Or I can use an odd
extension, and in this case my Fourier series will contain only sine terms.
Dirichlet's theorem, together with the discussion above, thus implies that for any piecewise continu-
ous function g : [a, b] −→ R I can almost equally easily find either the full trigonometric Fourier series,
or the sine Fourier series, or the cosine Fourier series, and the convergence of these series follows from
Dirichlet's theorem.

Example 2.16. Find sine and cosine Fourier series for

g(x) = x, 0 ≤ x ≤ 1.

Let me start with the sine series. The odd extension is (this is an illustration, I never use these
extensions in my calculations!)
godd (x) = x, −1 ≤ x ≤ 1.

Since this is an odd function, ak = 0 for any k. Using the formulas above,

bk = 2 ∫_0^1 x sin πkx dx = 2(−1)^{k+1}/(kπ),

the results are shown in Figs. 2.6 and 2.7.

Figure 2.6: Sine Fourier series for g(x) = x on [0, 1]

Recall that on the whole real line my Fourier series converges to the corresponding periodic exten-
sion (see Fig. 2.7)

Figure 2.7: Sine Fourier series for g(x) = x on [0, 1]. Left: Periodic extension. Right: Just the interval
[0, 1]

Now I consider the even extension:


geven(x) = x for 0 ≤ x ≤ 1,    −x for −1 ≤ x ≤ 0.

I find

bk = 0,    a0 = 1,    ak = 2((−1)^k − 1)/(π²k²),    k = 1, 2, . . .
The results are shown in Figs. 2.8 and 2.9.

Figure 2.8: Cosine Fourier series for g(x) = x on [0, 1]

Figure 2.9: Cosine Fourier series for g(x) = x on [0, 1]. Left: Periodic extension. Right: Just the
interval [0, 1]

There is one important thing to notice in this example. The periodic extension of the odd function is
discontinuous, and my coefficients have the form C/k for some constant C. The periodic extension of
the even function is continuous (see the figures), but not continuously differentiable (it has corners),
and my coefficients are of the form C/k² for some (other) constant C. Of course the latter series
converges faster, which can also be seen directly from the figures. This is a general phenomenon, i.e.,
the smoother the periodic extension of my function, the faster my Fourier series converges.
Exercise 2.9. Can you find a cosine Fourier series of function sin on x ∈ [0, π]?
This subsection gives an answer to the fourth question.

2.3.4 Differentiating Fourier series
Finally, for future use I need some information about when we can differentiate Fourier series, so that
we can talk about classical solutions of PDE problems. Without going into any technical details, note
that if I differentiate the terms in a Fourier series, I will get a factor k in front of the terms of my series. Hence, to
be able to differentiate, I need my Fourier coefficients to be of the form C/k^α for a sufficiently
large α. I refer the curious student to the textbook and references therein, while formulating a very
vague answer to the final question from my list:
If the function g is sufficiently nice (sufficiently smooth), and its periodic extension is also
sufficiently smooth then I can differentiate my Fourier series a sufficient amount of times
(and I will be more specific when it comes to the actual equations).

2.3.5 Final remarks and generalizations


Basically, what was shown and stated (without proofs), is that any piecewise continuous function g
can be represented as its Fourier series, using either

{1, cos kx, sin kx},

or
{1, cos kx},
or
{sin kx}.
Moreover, I found the formulas that can be used to calculate the coefficients of Fourier series. The
key property that allowed me to do this is the orthogonality of the first system of functions on [−π, π]
and orthogonality of the second and third systems on [0, π].
In general, and quite abstractly, I can consider an arbitrary system of functions

{g1(x), g2(x), g3(x), . . .},

introduce an inner product ⟨gj , gk ⟩ that satisfies some natural properties and define orthogonality.
Let me assume that {gj (x)} is an orthogonal system of functions. Then I can try to represent an
arbitrary function as a generalized Fourier series

g(x) = c1 g1 (x) + c2 g2 (x) + . . . .

The orthogonality will immediately let me find my coefficients:


ck = ⟨g, gk⟩ / ⟨gk, gk⟩.
Exercise 2.10. Consider complex valued functions of the real argument and an inner product of the
form

⟨f, g⟩ = ∫_{−π}^{π} f(x) ḡ(x) dx,

where the bar denotes complex conjugation.
Show that system of functions

g0 (x) = 1, gk (x) = eikx , k = ±1, ±2, ±3, . . .

is orthogonal on [−π, π]. Find the expression for the coefficients ck in the complex Fourier’s series


g(x) = c0 + ∑_{k=−∞, k≠0}^{∞} ck e^{ikx}.

Can you see how ck are connected with ak , bk from the trigonometric Fourier series?

2.4 Fourier method for the heat equation


Now I am well prepared to work through some simple problems for a one dimensional heat equation.

Example 2.17. Assume that I need to solve the heat equation

ut = α2 uxx , 0 < x < 1, t > 0, (2.13)

with the homogeneous Dirichlet boundary conditions

u(t, 0) = u(t, 1) = 0, t>0 (2.14)

and with the initial condition


u(0, x) = g(x), 0 ≤ x ≤ 1. (2.15)
Let me start again with the ansatz

u(t, x) = T (t)X(x).

The equation implies


T ′ X = α2 T X ′′ ,
where the primes denote the derivatives with respect to the corresponding variables. Separating the
variables implies that
T′/(α²T) = X′′/X = −λ,
where I chose the minus sign for notational reasons. Therefore, I now have two ODE; moreover the
second ODE
X ′′ + λX = 0
is supplemented with the boundary conditions X(0) = X(1) = 0, which follows from (2.14).
Two lectures ago I analyzed a similar situation with periodic boundary conditions, and I considered
all possible complex values of the separation constant. Here I will consider only real values of λ, a
rigorous (and elementary!) general proof that they must be real will be given later.
I start with the case λ = 0. This means X ′′ = 0 =⇒ X(x) = Ax + B, where A and B are two
real constants. The boundary conditions imply that A = B = 0 and hence for λ = 0 my boundary
value problem has no nontrivial solution.
Now assume that λ < 0. The general solution to my ODE in this case is
X(x) = A e^{√(−λ) x} + B e^{−√(−λ) x},

and the boundary conditions imply

A + B = 0,    A e^{√(−λ)} + B e^{−√(−λ)} = 0,

or

B( e^{2√(−λ)} − 1 ) = 0,

which implies that A = B = 0, since e^{2√(−λ)} ≠ 1 for any real negative λ. (It may be nicer to start
working from the general solution X(x) = A sinh(√(−λ) x) + B cosh(√(−λ) x); I will leave it as an exercise.)
Therefore again no nontrivial solutions.
Finally, assuming that λ > 0, I get
X(x) = A cos(√λ x) + B sin(√λ x),

and the boundary conditions imply that



A = 0,    B sin √λ = 0,

which can be true if B = 0 (trivial solution), or if


sin √λ = 0 =⇒ √λ = πk, k ∈ Z.

Therefore, for any


λk = (πk)2 , k = 1, 2, 3, . . .
(note that I disregarded 0 and all negative values of k since they do not yield new solutions), I get a
nontrivial solution
Xk (x) = Bk sin πkx.
Now I can return to the ODE T′ = −α²λT, which for the admissible values of λ has the solutions

Tk(t) = Ck e^{−α²π²k²t}.

The analysis above implies (I take bk = Bk Ck ) that

uk(t, x) = bk e^{−α²π²k²t} sin πkx,    k = 1, 2, . . .

solve equation (2.13) and satisfy the boundary conditions (2.14). All we need is to satisfy the initial
condition (2.15). For this I will use the superposition principle that says that if uk solve (2.13) then
any linear combination is also a solution, i.e.,

u(t, x) = ∑_{k=1}^{∞} uk(t, x) = ∑_{k=1}^{∞} bk e^{−α²π²k²t} sin πkx

is my solution. I use (2.15) to find that

g(x) = ∑_{k=1}^{∞} uk(0, x) = ∑_{k=1}^{∞} bk sin πkx,

Figure 2.10: Solution to the heat equation with homogeneous Dirichlet boundary conditions and the
initial condition (bold curve) g(x) = x − x². Left: three dimensional plot, right: contour plot

which is exactly the sine series for my function g, the coefficients of which can be found as
bk = 2 ∫_0^1 g(x) sin πkx dx,    k = 1, 2, . . . .

As a specific example I can take


g(x) = x − x2 .
Then
bk = 4(1 − (−1)^k)/(π³k³).
The solutions are graphically represented in Fig. 2.10. We can see that, as expected, the temper-
ature in the rod approaches zero as time goes to infinity.
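This truncated series is easy to evaluate; a sketch of mine (not from the notes; I set α = 1 and keep 50 terms):

import numpy as np

alpha = 1.0
K = 50                            # number of retained terms

def u(t, x):
    total = np.zeros_like(x)
    for k in range(1, K + 1):
        bk = 4 * (1 - (-1)**k) / (np.pi * k)**3
        total += bk * np.exp(-(alpha * np.pi * k)**2 * t) * np.sin(np.pi * k * x)
    return total

x = np.linspace(0, 1, 201)
for t in (0.0, 0.01, 0.1, 1.0):
    print(t, u(t, x).max())       # the maximal temperature decays in time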
What else can be inferred from the representation of our solution as its Fourier series?
First I note that the exponents are responsible for the speed of approaching the equilibrium state;
moreover, for sufficiently large t, all the expressions of the form e^{−Ak²t} are very small compared to the
first term with bk ≠ 0. Therefore, in many practical situations it is possible to concentrate


only on the first nonzero term of the Fourier series

u(t, x) ≈ uk(t, x) = bk e^{−α²k²π²t} sin πkx,    for the first k with bk ≠ 0.

And the approximation becomes better and better as t grows. Here is the difference u1(t, x) − ∑_{k=1}^{10} uk(t, x) for my example in Fig. 2.11.
Second, and more importantly, I note that the same negative exponents in the representation of
the solution by the sine Fourier series will guarantee that any derivative of the Fourier series will
converge (it does require some proof). This is an important characterization of the solutions to the
heat equation: the solution, irrespective of the initial condition, is an infinitely differentiable function
with respect to x for any t > 0.

Figure 2.11: The difference u1(t, x) − ∑_{k=1}^{10} uk(t, x) in the example

Here is the same problem with

g(x) = 0 for 0 < x < 1/4,    1 for 1/4 < x < 3/4,    0 for 3/4 < x < 1.

You can see the smoothing effect of the heat equation on the discontinuous initial condition (see Fig.
2.12).

Figure 2.12: Solution to the heat equation with a discontinuous initial condition. For any t > 0 the
solution is an infinitely differentiable function with respect to x

I can also note that if we would like to revert the time and look to the past and not to the future,
then all the exponents would have a plus sign, which means that in general the Fourier series will diverge
for any t < 0. This is actually a manifestation of the fact that the inverse problem for the heat
equation is not well posed: the heat equation represents a meaningful mathematical model only for
t > 0 and the solutions are not reversible. (As a side remark I note that ill-posed problems are very
important and there are special methods to attack them, including solving the heat equation for t < 0;
note that this is equivalent to solving for t > 0 the equation of the form ut = −α² uxx.)

Example 2.18. Consider now the Neumann boundary value problem for the heat equation (recall
that homogeneous boundary conditions mean insulated ends, no energy flux):

ut =α2 uxx , t > 0,


ux (t, 0) = ux (t, 1) = 0, t > 0,
u(0, x) = g(x), 0 ≤ x ≤ 1.

Now, after introducing u(t, x) = T (t)X(x) I end up with the boundary value problem for X in the
form
X ′′ + λX = 0, X ′ (0) = X ′ (1) = 0.
I will leave it as an exercise to show that if λ < 0 then I do not have non-trivial solutions. If, however,
λ = 0, I have that
X(x) = A + Bx,
and the boundary conditions imply that B = 0 leaving me free variable A. Hence I conclude that for
λ = 0 my solution is X(x) = A. If λ > 0 then
X(x) = A cos(√λ x) + B sin(√λ x),

and the boundary conditions imply that B = 0 and



A sin √λ = 0,

which will be true if λ = π 2 k 2 , k = 1, 2, 3, . . .. Putting everything together I found that my ODE


boundary value problem has nontrivial solutions only if (note that I include k = 0)

λk = π 2 k 2 , k = 0, 1, 2, . . .

and these solutions are


Xk (x) = Ak cos πkx, k = 0, 1, . . . .
From the other ODE I find

Tk(t) = Ck e^{−α²k²π²t},    k = 0, 1, . . .

and therefore, due to the same superposition principle, I can represent my solution as

u(t, x) = ∑_{k=0}^{∞} Tk(t)Xk(x) = a0/2 + ∑_{k=1}^{∞} ak e^{−α²k²π²t} cos πkx.

Using the initial condition, I find that



g(x) = a0/2 + ∑_{k=1}^{∞} ak cos πkx,

which is a cosine Fourier series for g, where
ak = 2 ∫_0^1 g(x) cos πkx dx.

Note that as expected, my solution tends to


u(t, x) → a0/2 = ∫_0^1 g(x) dx,    t → ∞,

which is a mathematical description of the fact that the energy inside my rod must be conserved. The
solution that I found is also, as in the Dirichlet case, infinitely differentiable with respect to x at any
t > 0, and the problem is ill posed for t < 0.
Example 2.19. Recall the problem for the heat equation with periodic boundary conditions:

ut = α2 uxx , t > 0, −π < x < π,


u(t, −π) = u(t, π), t > 0,
ux (t, −π) = ux (t, π), t > 0,
u(0, x) = g(x), −π ≤ x ≤ π.

We found that the boundary value problem

X ′′ + λX = 0, X(−π) = X(π), X ′ (−π) = X ′ (π)

has a nontrivial solution only if


λk = k 2 , k = 0, 1, 2, . . . ,
and these solutions are
Xk (x) = Ak cos kx + Bk sin kx.
Moreover, the full solution is given by the Fourier series

u(t, x) = a0/2 + ∑_{k=1}^{∞} e^{−α²k²t} (ak cos kx + bk sin kx),

where ak , bk are the coefficients of the trigonometric Fourier series for g on −π ≤ x ≤ π (the exact
expressions are given in the previous lecture). Again, since the rod is insulated, we find that as t → ∞
u(t, x) → (1/2π) ∫_{−π}^{π} g(x) dx.
Example 2.20. Now let me consider a problem for the heat equation with Robin or Type III boundary
condition on one end. I need to solve

ut = α2 uxx , t > 0, 0 < x < 1,


u(t, 0) = 0, t > 0,
ux (t, 1) + hu(t, 1) = 0, t > 0,
u(0, x) = g(x), 0 ≤ x ≤ 1.

Here I will assume that my constant h is positive.
Again, using the usual method of the separation of variables, I end up with

T′ = −α²λT,

and

X″ + λX = 0,   X(0) = 0,   X′(1) + hX(1) = 0.
First I consider the latter problem. I will look into only real values of constant λ.
λ = 0 implies that X(x) = 0 and hence is of no interest to me. If λ < 0 then I get the system

0 = A + B,
0 = A(h e^{−√−λ} − √−λ e^{−√−λ}) + B(√−λ e^{√−λ} + h e^{√−λ}).

This is a system of linear homogeneous equations with respect to A and B, and it has a nontrivial solution if and only if the corresponding determinant of the system vanishes, which is equivalent, after some simplification, to

e^{2√−λ} = (h − √−λ)/(h + √−λ).

Note that the left hand side, as a function of −λ, has a positive derivative, while the right hand side has a negative derivative; moreover, they cross at the point λ = 0. Therefore, for −λ > 0 it is impossible to have solutions to this equation, which rules out the case λ < 0.
Finally, if λ > 0, then I get
X(x) = A cos √λ x + B sin √λ x,

and hence my boundary conditions imply that A = 0 and

B(√λ cos √λ + h sin √λ) = 0.

The last equality can be true if B = 0 (not interesting for us) or if

tan √λ = −√λ/h.
From the geometric considerations (see Fig. 2.13) it is clear that there is an infinite sequence (λk)_{k=1}^{∞}, 0 < λ1 < λ2 < . . ., such that λk → ∞ as k → ∞, and it is quite easy to find these lambdas numerically, but I do not have a convenient formula for them. My solutions hence are

Xk(x) = Bk sin √λk x,

and hence any function

uk(t, x) = bk e^{−α²λk t} sin √λk x

solves the PDE and satisfies the boundary conditions. Now I need to satisfy the initial condition. For this I will take an infinite linear combination of uk and plug in t = 0. I get

g(x) = ∑_{k=1}^{∞} bk sin µk x,

Figure 2.13: Solutions to the equation tan µ = −µ/h


where µk = √λk. This looks like a Fourier sine series, but it is not, because in the classical Fourier sine series I need µk = πk, which is not true for my example, and hence I cannot use the formulas for the coefficients. Luckily, however, it turns out that the system of functions {sin µk x} is orthogonal on [0, 1] (I leave checking this fact as an exercise), and following the same steps that were done when I derived the coefficients for the trigonometric Fourier series, I can find that

bk = (2µk / (µk − sin µk cos µk)) ∫_0^1 g(x) sin µk x dx.

Now my problem is fully solved, because for any piecewise continuous g my time dependent Fourier series is an infinitely differentiable function.
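Since the µk have no closed form, here is a small Python sketch that I add for illustration (the Robin constant h and the initial condition g are my own assumed choices): it finds the first few µk by root finding and computes the corresponding coefficients bk by quadrature.

import numpy as np
from scipy.optimize import brentq
from scipy.integrate import quad

h = 1.0                      # assumed value of the Robin constant
g = lambda x: x              # the initial condition used in the example below

# mu_k solves tan(mu) = -mu/h; there is exactly one root in (pi(k-1/2), pi k)
def mu(k):
    lo = np.pi * (k - 0.5) + 1e-9
    hi = np.pi * k - 1e-9
    return brentq(lambda m: np.tan(m) + m / h, lo, hi)

def b(k):
    m = mu(k)
    integral = quad(lambda x: g(x) * np.sin(m * x), 0, 1)[0]
    return 2 * m / (m - np.sin(m) * np.cos(m)) * integral

print([round(mu(k), 4) for k in range(1, 5)])
print([round(b(k), 4) for k in range(1, 5)])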
As a specific example let me take
g(x) = x.
Then the solution has the form as in Fig. 2.14.

Figure 2.14: Solutions to the heat equation with Robin boundary condition and the initial condition
g(x) = x

Not surprisingly, the solution approaches the trivial steady state, since the problem can be inter-
preted as the spread of the heat in an insulated rod with the fixed zero temperature at the left end and
the temperature of the surrounding medium around the right end of the rod set to zero. Eventually
the temperature evens out.
To emphasize that what I found is not a usual Fourier sine series, I will plot my infinite series for
t = 0 in the symmetric interval (−5, 5) along with a periodic extension of function x on (−1, 1):

Figure 2.15: The periodic extension (black) of g(x) = x on (−1, 1) along with its Fourier series (blue) on the interval (−5, 5)

In all the examples above the boundary conditions were homogeneous. What should we do if we are given non-homogeneous boundary conditions? The method of separation of variables will not work in this case. Sometimes, however, we can reduce a problem with non-homogeneous boundary conditions to
the problem with homogeneous ones. Consider for example the Dirichlet problem for the heat equation
with
u(t, 0) = k1 , u(t, l) = k2 .
Here k1 , k2 are two given constants. It should be clear (if not, carefully do all the math) that the
equilibrium temperature is given by
ueq(x) = k1 + (k2 − k1) x/l.
Now consider
u(t, x) = ueq (x) + v(t, x).
For the function v I will get (check!) the problem
vt = α2 vxx ,
v(t, 0) = v(t, l) = 0,
v(0, x) = g(x) − ueq (x),
where g is the initial condition from the original problem for u. Now I have homogeneous boundary
conditions and can use the Fourier method. Such approach will usually work when the boundary
conditions do not depend on t, otherwise we will end up with a nonhomogeneous heat equation, which
still can be solved using the separation of variables technique, but the solution process is slightly more
involved.

2.4.1 Conclusion
We had four examples that have a lot of common features. In particular, all of them involve solving

X ′′ + λX = 0

with some boundary conditions. By solving I mean “identifying such values of the parameter λ such
that my problem has a nontrivial solution.” In all four cases I find an infinite series of such lambdas,
all of which are real. Even more importantly, in all four cases the corresponding solutions form an
orthogonal system of functions, and hence the Fourier series technique can be applied to represent the
solution to my original PDE problem in the form of a generalized Fourier series.
Is it a coincidence? Will it be the same for different boundary conditions? I will answer these
questions in the next lecture.

2.5 Sturm–Liouville problems. Eigenvalues and eigenfunctions


In the previous lecture I gave four examples of different boundary value problems for a second order
ODE that resulted in a countable number of constants (lambdas) and a countable number of corre-
sponding solutions, which were used afterwards to build a corresponding Fourier series to represent
solutions for PDE. Not surprisingly, these four examples can be generalized in a relatively abstract
framework, which I discuss in this lecture. In the literature this framework is called Sturm–Liouville
problem after two mathematicians who first concentrated on this problem.
I start with a definition of a Sturm–Liouville differential operator L on the interval x ∈ [a, b]:

Lu := −(p(x)u′ )′ + q(x)u, (2.16)

where p, q are continuous on [a, b] and p(x) > 0 for any x ∈ [a, b]. Note that all four problems from
the previous lecture can be written as
Lu = λu, (2.17)
with p(x) = 1 and q(x) = 0 and some additional boundary conditions. Most of these boundary
conditions can be written in the general form

α1 u(a) + α2 u′(a) = 0,   α1² + α2² > 0,
β1 u(b) + β2 u′(b) = 0,   β1² + β2² > 0.     (2.18)

Quite natural (recall the definition of eigenvalues and eigenvectors for, e.g., matrices) problem
(2.17)–(2.18) is called the Sturm–Liouville eigenvalue problem. Sometimes instead of (2.18) periodic
boundary conditions are used:

u(a) = u(b), p(a)u′ (a) = p(b)u′ (b). (2.19)

To study the properties of the eigenvalues of the Sturm–Liouville problem, I start with a derivation
of Lagrange’s identity. Let u, v be two arbitrary C (2) [a, b] functions, then (check the skipped steps)
uLv − vLu = −u(pv′)′ + quv + v(pu′)′ − quv = v(pu′)′ − u(pv′)′ = ( p(vu′ − uv′) )′.

From Lagrange's identity, by integrating from a to b, I get Green's formula

∫_a^b (uLv − vLu) dx = [ p(vu′ − uv′) ]_a^b .

Both Lagrange's identity and Green's formula hold for any u and v. I claim that if u and v are such that they satisfy (2.18) or (2.19) then Green's formula becomes

∫_a^b (uLv − vLu) dx = 0.

Indeed, if u and v satisfy, e.g., (2.18) then I must have that

α1 u(a) + α2 u′(a) = 0,   α1² + α2² > 0,
α1 v(a) + α2 v′(a) = 0,

and therefore the determinant of the matrix

[ u(a)   u′(a) ]
[ v(a)   v′(a) ]

must be zero, that is


u(a)v ′ (a) − v(a)u′ (a) = 0.
Similarly,
u(b)v ′ (b) − v(b)u′ (b) = 0,
which proves the stated fact for the boundary conditions (2.18). I leave checking (2.19) as an exercise.
Now, using the notation for the inner product:
⟨u, v⟩ = ∫_a^b u(x)v(x) dx,

I can rewrite Green’s identity as


⟨u, Lv⟩ = ⟨Lu, v⟩.
An operator that satisfies such condition is called self-adjoint 1 (think about symmetric real matrices,
such that A = A⊤ ), and hence I proved that the Sturm–Liouville operator defined on the functions
that satisfy (2.18) or (2.19) is self-adjoint (note that it is important to add the boundary conditions,
without them the self-adjointness does not make any sense).
Now I am ready to prove
Lemma 2.21. Let L be a self-adjoint Sturm–Liouville operator. Then all the eigenvalues of L are
real.
Proof. I will prove this lemma by contradiction. Let λ ∈ C be my eigenvalue and u ≠ 0 a corresponding eigenfunction. By the properties of the differential operator and linearity of L, I get that λ̄ and ū must be another eigenvalue and corresponding eigenfunction. Now consider

⟨Lu, u⟩ = ⟨λu, u⟩ = λ⟨u, u⟩.


1
The real life is much more complicated, I can only refer to a proper graduate course to set the matter straight.

On the other hand, since L is self-adjoint,

⟨Lu, u⟩ = ⟨u, Lu⟩ = ⟨u, λu⟩ = λ̄⟨u, u⟩.

Therefore

(λ − λ̄)⟨u, u⟩ = (λ − λ̄) ∫_a^b |u|² dx = 0,

and since ∫_a^b |u|² dx > 0, then

λ = λ̄,

which means that λ is real.

Lemma 2.22. Let L be a self-adjoint operator. Then if λ1 and λ2 are two different eigenvalues and
u1 , u2 are two corresponding eigenfunctions then u1 and u2 are orthogonal.

Proof. I have

⟨Lu1, u2⟩ = λ1⟨u1, u2⟩   and, by self-adjointness,   ⟨Lu1, u2⟩ = ⟨u1, Lu2⟩ = λ2⟨u1, u2⟩,

or

(λ1 − λ2)⟨u1, u2⟩ = 0   =⇒   ⟨u1, u2⟩ = 0.


The two previous lemmas are very nice; however, they are true under the assumption that my operator has any eigenvalues and eigenfunctions at all. A more impressive theorem, whose proof is significantly more involved, and hence omitted here, is as follows.

Theorem 2.23. Consider the Sturm–Liouville eigenvalue problem, i.e., (2.17) plus (2.18) or (2.19).
Then there exists a countable sequence of eigenvalues

λ1 ≤ λ2 ≤ λ3 ≤ . . . ,

such that λk → ∞ as k → ∞. The corresponding system of eigenfunctions {u1 , u2 , . . .} is complete in


L2 [a, b], i.e., any function f ∈ L2 [a, b] can be represented as a convergent generalized Fourier series

f (x) = c1 u1 (x) + c2 u2 (x) + . . . ,

where the coefficients are given by

ck = ⟨f, uk⟩ / ⟨uk, uk⟩.
Proof. See, e.g., Sturm-Liouville Theory and its Applications, 2008, by Mohammed Al-Gwaiz for all
the details. 

To conclude this section let me collect together all the results for the Sturm–Liouville operator
Lu = −u′′ that we got so far (in the previous lecture and in homework problems).

Boundary conditions | Eigenvalues | Eigenfunctions
u(0) = u(1) = 0 | λk = π²k², k = 1, 2, . . . | uk(x) = B sin πkx
u(0) = u(L) = 0 | λk = π²k²/L², k = 1, 2, . . . | uk(x) = B sin(πkx/L)
u′(0) = u′(1) = 0 | λk = π²k², k = 0, 1, 2, . . . | uk(x) = A cos πkx
u(−π) = u(π), u′(−π) = u′(π) | λk = k², k = 0, 1, 2, . . . | uk(x) = A cos kx, vk(x) = B sin kx
u(0) = u′(1) + hu(1) = 0 | solutions to tan √λ = −√λ/h | uk(x) = B sin √λk x
u(0) = u′(1) = 0 | λk = (π(k − 1/2))², k = 1, 2, . . . | uk(x) = B sin π(k − 1/2)x
u′(0) = u(1) = 0 | λk = (π(k − 1/2))², k = 1, 2, . . . | uk(x) = B cos π(k − 1/2)x
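To make the orthogonality claim behind this table concrete, here is a short Python check that I add for illustration (it is not part of the theory above): it numerically verifies the orthogonality on [0, 1] of the eigenfunctions from the last row.

import numpy as np
from scipy.integrate import quad

# eigenfunctions for u'(0) = u(1) = 0: cos(pi (k - 1/2) x), k = 1, 2, ...
def u(k, x):
    return np.cos(np.pi * (k - 0.5) * x)

# inner products <u_j, u_k> on [0, 1]; off-diagonal entries should vanish
gram = [[quad(lambda x, j=j, k=k: u(j, x) * u(k, x), 0, 1)[0]
         for k in range(1, 5)] for j in range(1, 5)]

for row in gram:
    print(["%+.3f" % v for v in row])   # diagonal entries 1/2, the rest 0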

2.6 Solving the wave equation by Fourier method


In this lecture I will show how to solve an initial–boundary value problem for one dimensional wave
equation:
utt = c2 uxx , 0 < x < l, t > 0, (2.20)
with the initial conditions (recall that we need two of them, since (2.20) is a mathematical formulation
of the second Newton’s law):

u(0, x) = f (x), 0 < x < l,


(2.21)
ut (0, x) = g(x), 0 < x < l,

where f is the initial displacement and g is the initial velocity.


I start with the homogeneous boundary conditions of type I:

u(t, 0) = 0, t > 0,
(2.22)
u(t, l) = 0, t > 0,

which physically means that I am studying the oscillations of a string of length l with fixed ends.
Using the same ansatz
u(t, x) = T (t)X(x)
I find that
T″/(c²T) = X″/X = −λ,
and hence I have two ordinary differential equations

T ′′ + c2 λT = 0 (2.23)

for T , and
X ′′ + λX = 0 (2.24)
for X. Using the boundary conditions (2.22) I conclude that equation (2.24) must be supplemented
with the boundary conditions
X(0) = X(l) = 0. (2.25)

Problem (2.24)–(2.25) is a Sturm–Liouville eigenvalue problem, which we already solved several times.
In particular, we know that there is an infinite series of eigenvalues

λk = k²π²/l²,   k = 1, 2, . . .

and the corresponding eigenfunctions

Xk(x) = Ck sin √λk x = Ck sin(πkx/l),   k = 1, 2, . . . ,
moreover all the eigenfunctions are orthogonal on [0, l]. Here Ck are some arbitrary real constants.
Since I know which lambdas I can use, I now can look at the solutions to (2.23). Since all my λk > 0
then I have the general solution
Tk(t) = Ak cos(c√λk t) + Bk sin(c√λk t) = Ak cos(πckt/l) + Bk sin(πckt/l),

and hence each function

uk(t, x) = Tk(t)Xk(x) = ( ak cos(πckt/l) + bk sin(πckt/l) ) sin(πkx/l),

solves the wave equation (2.20) and satisfies the boundary conditions (2.22). Here ak = Ak Ck , bk =
Bk Ck . Since my PDE is linear I can use the superposition principle to form my solution as


u(t, x) = ∑_{k=1}^{∞} uk(t, x),

my task is to determine ak and bk. For this I will need the initial conditions. Note that using the first initial condition implies

f(x) = ∑_{k=1}^{∞} ak sin(πkx/l),

which means that

ak = (2/l) ∫_0^l f(x) sin(πkx/l) dx.
To use the second initial condition I first differentiate my series and then plug in t = 0:

g(x) = ∑_{k=1}^{∞} (bk cπk/l) sin(πkx/l),

which gives me

bk = (2/(πkc)) ∫_0^l g(x) sin(πkx/l) dx.
Hence I found a formal solution to my original problem. I am writing “formal” since I must also check
that all the series are convergent and can be differentiated twice in t and x to guarantee that what
I found is a classical solution. Note that contrary to the heat equation, my series representations of
solutions do not have quickly vanishing exponents and hence the question on differentiability is not as

simple as before. Putting some additional smoothness requirements on the initial conditions can be
used to conclude that my series are classical solutions.
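Before moving on, here is a small Python sketch that I add for illustration (the string length l, the speed c, the initial data f, g, and the truncation level N are my own assumed choices): it assembles the truncated Fourier solution from the formulas above.

import numpy as np
from scipy.integrate import quad

l, c, N = 1.0, 1.0, 50                 # assumed length, wave speed, number of terms
f = lambda x: x * (1 - x)              # assumed initial displacement
g = lambda x: 0 * x                    # assumed initial velocity

a = [2 / l * quad(lambda x, k=k: f(x) * np.sin(np.pi * k * x / l), 0, l)[0]
     for k in range(1, N + 1)]
b = [2 / (np.pi * k * c) * quad(lambda x, k=k: g(x) * np.sin(np.pi * k * x / l), 0, l)[0]
     for k in range(1, N + 1)]

def u(t, x):
    """Truncated series solution of the fixed-ends string problem."""
    s = 0.0
    for k in range(1, N + 1):
        w = np.pi * c * k / l
        s += (a[k - 1] * np.cos(w * t) + b[k - 1] * np.sin(w * t)) * np.sin(np.pi * k * x / l)
    return s

print(u(0.0, 0.5), f(0.5))             # at t = 0 the series reproduces f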
Now let me see what I can infer from the found solution.
The found functions uk are called the normal modes. Using the following trick:
A cos αt + B sin αt = √(A² + B²) ( A/√(A² + B²) cos αt + B/√(A² + B²) sin αt )
                    = √(A² + B²) (cos ϕ cos αt + sin ϕ sin αt)
                    = R cos(αt − ϕ),   ϕ = tan⁻¹(B/A),   R = √(A² + B²),

I can rewrite the normal modes in the form

uk(t, x) = Rk cos(πckt/l − ϕk) sin(πkx/l).     (2.26)
Recall that f is periodic with period T if

f (t + T ) = f (t)

for any t. The (minimal) period is the smallest T > 0 in this formula. Clearly if f is T -periodic then

f (t + kT ) = f (t), k ∈ Z.

Moreover, if f is T -periodic then f (at) has period T /a (prove it). These basic facts imply that the
normal modes are periodic functions with respect to the variable t with the periods

Tk = 2l/(ck),

because cosine is a 2π-periodic function. More importantly, all normal modes have period 2l/c (not necessarily minimal), which allows me to conclude that the solution to the problem (2.20)–(2.22) is periodic in t with the period

T = 2l/c = 2l √(ρ/E).
The angular frequencies are

ωk = 2π/Tk = πck/l.

The normal modes are also called the harmonics. The first harmonic is the normal mode of the lowest frequency, which is called the fundamental frequency

ω1 = πc/l.
And now I finally can make my first big conclusion here: all harmonics in the solution to the initial-boundary value problem for the wave equation are multiples of the fundamental frequency ω1, and this is the mathematical explanation of why we like the sound of musical instruments whose geometry is one-dimensional: violin, guitar, flute, etc.
Geometrically harmonics represent standing waves (see Fig. 2.16) Using the introduced terminol-

Figure 2.16: Standing waves for first six normal modes uk (t, x) = cos πkt sin πkx. The bold lines
represent the time moments when cos πkt = ±1, the dotted lines are the graphs of uk at intermediate
time moments. For an observer the standing wave represent a periodic vibration of the string

ogy I can conclude that the solution to the wave equation is a sum of standing waves. However, we also
know that if the wave equation has no boundary conditions then the solution to the wave equation is
a sum of traveling waves. This is still true (recall the reflection principle) if the boundary conditions are imposed. So, how can these two facts be reconciled? Do we have a contradiction, or are these two sides of the same phenomenon?
Actually for this particular example there is a very simple explanation:

cos(θ − ϕ) + cos(θ + ϕ) = 2 cos θ cos ϕ,

and hence, any standing wave can be represented as a sum of two traveling waves.
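For the normal mode (2.26) this can be written out explicitly (a short computation I add here for completeness, using the companion identity 2 sin θ cos ϕ = sin(θ + ϕ) + sin(θ − ϕ)):

cos(πckt/l) sin(πkx/l) = (1/2) [ sin(πk(x − ct)/l) + sin(πk(x + ct)/l) ],

i.e., each standing wave is the superposition of one wave traveling to the right and one traveling to the left with speed c.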

Example 2.24. To give a specific example, I assume that the initial displacement has the form shown
in Fig. 2.17, and the initial velocity is zero. I find that bk = 0 for all k and

Figure 2.17: The initial displacement for the wave equation

Figure 2.18: Solutions to the problem (2.20)–(2.22) with the initial displacement as in Fig. 2.17 and
initial velocity g(x) = 0 at different time moments. First 50 terms of the Fourier series are shown

ak = 4 ( 2 sin(πk/2) − sin(πk) ) / (πk)².

To illustrate the time dependent behavior of my solution I take first 50 terms of my Fourier series
and plot them at different time moments (see Fig. 2.18), you can observe the traveling waves and

Figure 2.19: Solutions to the problem (2.20)–(2.22) with the initial displacement as in Fig. 2.17 and
initial velocity g(x) = 0 in t, x, u(t, x) coordinates. First 50 terms of the Fourier series are shown

reflections from the boundaries for my example. A three dimensional graph of the same solution is
given in Fig. 2.19.

The standing wave solutions allow us to guess how scientists first came up with the separation of variables technique. The fact is that Joseph Fourier was not the first person to make that assumption. Before him exactly the same guess was used by Lagrange to obtain an analytical solution to the system of masses on springs (the one from which we derived the wave equation in the limit). Lagrange did not
have to deal with the questions of convergence because his “Fourier series” consisted of a finite number
of terms. Even earlier, in 1753, Daniel Bernoulli, a famous mathematician and physicist, used “Fourier
series” to represent solutions to the wave equation2 . You can see his “Fourier series” in the left panel
in Fig. 2.20. He actually did not calculate the coefficients of the series, leaving them in undetermined
form. A possible motivation for these products comes from a very careful drawing of his father, Johann
Bernoulli (one of the early developers of Calculus), which can be seen in Fig. 2.20, right panel. These
are the drawings of the observed string oscillations, and since the graphs of standing waves look the
same and can be analytically described as a product of two trigonometric functions, one depends only
on time and the other one only on the spatial variable, hence (we can only guess at this point) that
it was the original motivation to use the form

u(t, x) = T (t)X(x)

to solve partial differential equations.

Figure 2.20: Two extracts from D. Bernoulli’s 1753 paper

2
Bernoulli, D., 1753. Réflexions et éclaircissemens sur les nouvelles vibrations des cordes. Les Mémoires de l’Académie
Royale des Sciences et des Belles-Lettres de Berlin de 1747 et, 1748, pp.47–72.

2.7 Solving the Laplace equation by Fourier method
I already introduced the two and three dimensional heat equation when I derived it; recall that it takes the form

ut = α²∆u + F,     (2.27)

where u : [0, ∞) × D −→ R, D ⊆ Rᵏ is the domain in which we consider the equation, α² is the diffusivity coefficient, F : [0, ∞) × D −→ R is the function that describes the sources (F > 0) or sinks (F < 0) of thermal energy, and ∆ is the Laplace operator, which in Cartesian coordinates takes the form

∆u = uxx + uyy ,   D ⊆ R²,

or

∆u = uxx + uyy + uzz ,   D ⊆ R³,
if the processes are studied in three dimensional space. Of course we need also the boundary conditions
on ∂D and the initial conditions in D.
In a similar vein it can be proved that the wave equation in two or three dimensions can be written
as
utt = c2 ∆u + F, (2.28)
where now c is the wave velocity, and F is an external force. We also will need boundary and initial
conditions.
Very often the processes described by the heat or wave equation approach some equilibrium if
t → ∞. This means that the solution does not change with time and in particular ut or utt tend to
zero as t → ∞. Therefore equations (2.27) and (2.28) turn into

∆u = −f, (2.29)

where f = F/α2 for the heat equation and f = F/c2 for the wave equation. Equation (2.29) is called
Poisson equation, and, in case if f = 0,
∆u = 0, (2.30)
Laplace equation, one of the most important equations in mathematics.
Since I am talking about the equilibrium (stationary) problems (2.29) and (2.30), only boundary conditions are relevant; in the equilibrium state the system "forgets" about the initial conditions (it can be rigorously proved that the initial value problem for either the Poisson or the Laplace equation is ill posed).
In the following I will use the separation of variables to solve the Laplace equation (2.30), we will
look into properties of (2.29) in the forthcoming lectures.
Only for some special plane geometries of the domain D it is possible to use the separation of
variables. First of all, in Cartesian coordinates, these are various rectangles, I will leave this case for
homework problems (see the textbook). It is also possible to use separation of variables in “circular”-
based domains, such as interior of the disk, exterior of the disk, sector, annulus, and part of an annulus.
To do this I first need to rewrite the Laplace operator in polar coordinates.
Recall that Cartesian coordinates (x, y) and polar coordinates (r, θ) are connected as

x = r cos θ, y = r sin θ,

or

r² = x² + y²,   tan θ = y/x.

I have

u(x, y) = u(r cos θ, r sin θ) = v(r, θ) = v( √(x² + y²), arctan(y/x) ).
x
For the following I will need

rx = x/√(x² + y²) = cos θ,
ry = y/√(x² + y²) = sin θ,
θx = −(1/(1 + (y/x)²)) (y/x²) = −y/(x² + y²) = −sin θ / r,
θy = (1/(1 + (y/x)²)) (1/x) = x/(x² + y²) = cos θ / r.
Now I start calculating the partial derivatives using the usual chain rule

ux = vr rx + vθ θx = vr cos θ − vθ sin θ / r,
uy = vr ry + vθ θy = vr sin θ + vθ cos θ / r,

uxx = (ux)r rx + (ux)θ θx
    = ( vrr cos θ − vθr sin θ / r + vθ sin θ / r² ) cos θ + ( vrθ cos θ − vr sin θ − vθθ sin θ / r − vθ cos θ / r ) ( −sin θ / r ),

uyy = (uy)r ry + (uy)θ θy
    = ( vrr sin θ + vθr cos θ / r − vθ cos θ / r² ) sin θ + ( vrθ sin θ + vr cos θ + vθθ cos θ / r − vθ sin θ / r ) ( cos θ / r ).

Now I add the two last lines to find the Laplace operator in polar coordinates (replacing v with u)

∆u = urr + (1/r) ur + (1/r²) uθθ .
I note that in cylindrical coordinates x = r cos θ, y = r sin θ, z the Laplace operator is

∆u = urr + (1/r) ur + (1/r²) uθθ + uzz ,

whereas in the spherical coordinates

x = r sin φ cos θ,
y = r sin φ sin θ,
z = r cos φ,

the Laplace operator is

∆u = urr + (2/r) ur + (1/r²) uφφ + (cot φ / r²) uφ + (1/(r² sin² φ)) uθθ .
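Chain-rule computations of this sort are error prone, so here is a short numerical check that I add for illustration (the test profile v is my own arbitrary choice): it compares a finite-difference Cartesian Laplacian with the polar formula above.

import numpy as np

# test profile, written in polar coordinates (an arbitrary choice for the check)
v = lambda r, th: r**3 * np.cos(2 * th)
u = lambda x, y: v(np.hypot(x, y), np.arctan2(y, x))

def laplacian_cartesian(x, y, h=1e-4):
    """Second-order finite-difference approximation of uxx + uyy."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4 * u(x, y)) / h**2

def laplacian_polar(x, y):
    r, th = np.hypot(x, y), np.arctan2(y, x)
    # for v = r^3 cos 2θ:  v_rr = 6 r cos 2θ,  v_r / r = 3 r cos 2θ,  v_θθ / r^2 = -4 r cos 2θ
    return (6 * r + 3 * r - 4 * r) * np.cos(2 * th)

print(laplacian_cartesian(0.7, 0.4), laplacian_polar(0.7, 0.4))   # the two values agree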

Example 2.25. To show how the separation of variables work for the Laplace equation in polar
coordinates, consider the following boundary value problem

∆u = 0,
u(r1 , θ) = g1 (θ),
u(r2 , θ) = g2 (θ),

that is, consider the problem inside the annulus r1 < r < r2 , and on both boundaries Type I non-
homogeneous boundary conditions are given. I start with a usual assumption that

u(r, θ) = R(r)Θ(θ).

Since my equation in polar coordinates takes the form


urr + (1/r) ur + (1/r²) uθθ = 0,
I get, denoting with the prime the corresponding derivatives,

r2 R′′ Θ + rR′ Θ + RΘ′′ = 0,

or

−(r²R″ + rR′)/R = Θ″/Θ.
Since the left hand side depends only on r and the right hand side depends only on θ hence both sides
must be equal to a constant, which I will denote −λ. Using this constant I end up with two ODE

r2 R′′ + rR′ − λR = 0, (2.31)

and
Θ′′ + λΘ = 0. (2.32)
At this point I must add some boundary conditions to one of these problems so that in the end I get a
Sturm–Liouville problem, whose eigenfunctions I can use as building blocks for my generalized Fourier
series. The original boundary conditions for u are of no help here since they are non-homogeneous.
There should be something else to the problem. And indeed, after some thought, it is possible to guess that my solution must be a periodic function of θ (the solution must be continuously differentiable), which implies that

u(r, θ − π) = u(r, θ + π),   uθ(r, θ − π) = uθ(r, θ + π).

This implies that my second equation (2.32) must be supplemented with

Θ(−π) = Θ(π),   Θ′(−π) = Θ′(π).

Now we know that these periodic boundary conditions plus (2.32) is an eigenvalue Sturm–Liouville
problem with the eigenvalues λk = k 2 , k = 0, 1, . . . and eigenfunctions Θk (θ) = Ak cos kθ + Bk sin kθ.
Now I can return to (2.31), which can be written as

r2 R′′ + rR′ − k 2 R = 0.

I start with the case k = 0. Then the equation

r²R″ + rR′ = 0

can be solved by the substitution R′(r) = S(r):

r²S′ + rS = 0   =⇒   S(r) = B/r,

which finally gives me

R0(r) = a0 + b0 log r.
(I do not use the absolute value since r ≥ 0.)
If k = 1, 2, . . . I have the so-called Cauchy–Euler differential equation, which can be solved by the
ansatz R(r) = rµ . I get
µ(µ − 1) + µ − k² = 0   =⇒   µ1,2 = ±k,
and hence the general solution is given by
Rk (r) = ck rk + dk r−k , k = 1, 2, . . .
What I did is I proved that any function of the form
uk (r, θ) = Rk (r)Θk (θ)
solves the Laplace equation ∆u = 0 (such functions are called harmonic) and satisfies the periodic
boundary conditions. Since the Laplace equation is linear, I will use the principle of superposition to
argue that the function
u(r, θ) = A + B log r + ∑_{k=1}^{∞} [ (Ck r^k + Dk r^{−k}) cos kθ + (Ek r^k + Gk r^{−k}) sin kθ ]
solves the Laplace equation and satisfies the periodic conditions. It seems that I have a lot of arbitrary
constants to determine from the remaining two boundary conditions, but careful analysis shows that
I have enough. To wit, let my boundary conditions have the following Fourier series (notice that I do
not divide by 2 the first coefficient)


g1(θ) = a0^{(1)} + ∑_{k=1}^{∞} ( ak^{(1)} cos kθ + bk^{(1)} sin kθ ),
g2(θ) = a0^{(2)} + ∑_{k=1}^{∞} ( ak^{(2)} cos kθ + bk^{(2)} sin kθ ).
Now, comparing these series with the solution in the form of the series and invoking the boundary
conditions, I get
A + B log r1 = a0^{(1)} = (1/2π) ∫_{−π}^{π} g1(θ) dθ,
A + B log r2 = a0^{(2)} = (1/2π) ∫_{−π}^{π} g2(θ) dθ,

Ck r1^k + Dk r1^{−k} = ak^{(1)} = (1/π) ∫_{−π}^{π} g1(θ) cos kθ dθ,
Ck r2^k + Dk r2^{−k} = ak^{(2)} = (1/π) ∫_{−π}^{π} g2(θ) cos kθ dθ,     k = 1, 2, . . .

Ek r1^k + Gk r1^{−k} = bk^{(1)} = (1/π) ∫_{−π}^{π} g1(θ) sin kθ dθ,
Ek r2^k + Gk r2^{−k} = bk^{(2)} = (1/π) ∫_{−π}^{π} g2(θ) sin kθ dθ,     k = 1, 2, . . .

and each system for each k is a system of two equations with two unknowns, which can be always
(except for some degenerate cases) solved.
For example, assuming that g1(θ) = a, g2(θ) = b implies

B = (b − a)/log(r2/r1),   A = a − (b − a) log r1 / log(r2/r1),

and all other constants are zero. Hence the solution is

u(r, θ) = A + B log r.
If I assume that g1(θ) = a cos θ, g2(θ) = b cos θ then I end up with the system

C1 r1 + D1 r1^{−1} = a,
C1 r2 + D1 r2^{−1} = b,

which is easy to solve. The solution to the boundary value problem for the Laplace equation is hence

u(r, θ) = (C1 r + D1 r^{−1}) cos θ.
Example 2.26 (Interior Dirichlet problem for the Laplace equation and Poisson formula). Consider
now the problem
∆u = 0, 0≤r<1
u(1, θ) = g(θ), 0 ≤ θ < 2π.
To solve it I will do exactly the same steps as in the previous example (assume that the solution can
be presented as a product, get two ODE, use the periodic boundary conditions on θ, end up with the
same eigenvalues and eigenfunctions, solve the ODE for R) first. Then I note that a significant part
of the solutions to the ODE for R has no meaning since I am dealing also with the point r = 0 and
hence neither log r nor r−k make sence at this point. Since these solutions have no physical meaning
I drop them to end up with the function

u(r, θ) = a0/2 + ∑_{k=1}^{∞} r^k ( ak cos kθ + bk sin kθ ).

By using the given type I or Dirichlet boundary condition I immediately find that (note that I con-
veniently assumed that disk has radius 1, make sure that you can solve the case for an arbitrary
radius)

ak = (1/π) ∫_{−π}^{π} g(θ) cos kθ dθ,   bk = (1/π) ∫_{−π}^{π} g(θ) sin kθ dθ.
I solved my problem, but it turns out that I can rewrite this solution in a closed neat form.
Interchanging the integrals and sums in my solution I get
u(r, θ) = (1/π) ∫_{−π}^{π} g(ϕ) ( 1/2 + ∑_{k=1}^{∞} r^k (cos kϕ cos kθ + sin kϕ sin kθ) ) dϕ
        = (1/π) ∫_{−π}^{π} g(ϕ) ( 1/2 + ∑_{k=1}^{∞} r^k cos k(θ − ϕ) ) dϕ.

Now, with z = r e^{iθ},

1/2 + ∑_{k=1}^{∞} r^k cos kθ = Re( 1/2 + ∑_{k=1}^{∞} z^k )
                             = Re( 1/2 + z/(1 − z) ) = Re( (1 + z)/(2(1 − z)) )
                             = Re( (1 + z)(1 − z̄)/(2|1 − z|²) ) = Re( (1 − |z|² + z − z̄)/(2|1 − z|²) )
                             = (1 − r²)/(2(1 + r² − 2r cos θ)).
Therefore, finally, I can conclude that
u(r, θ) = (1/2π) ∫_{−π}^{π} g(ϕ) (1 − r²)/(1 + r² − 2r cos(θ − ϕ)) dϕ,     (2.33)
which is called the Poisson integral formula, and the expression
(1 − r²) / ( 2(1 + r² − 2r cos(θ − ϕ)) )
is called the Poisson kernel. To emphasize, the Poisson integral formula gives a closed form solution
for the Dirichlet boundary problem for the Laplace equation in a disk.
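To see the formula at work numerically, here is a short Python check that I add for illustration: for the boundary data g(ϕ) = cos ϕ the harmonic function in the unit disk is u(r, θ) = r cos θ, and the Poisson integral reproduces it.

import numpy as np
from scipy.integrate import quad

g = np.cos                         # boundary data g(phi) = cos(phi)

def poisson(r, theta):
    """Evaluate the Poisson integral for the unit disk by quadrature."""
    kernel = lambda phi: (1 - r**2) / (1 + r**2 - 2 * r * np.cos(theta - phi))
    return quad(lambda phi: g(phi) * kernel(phi), -np.pi, np.pi)[0] / (2 * np.pi)

r, theta = 0.6, 1.1
print(poisson(r, theta), r * np.cos(theta))   # the two numbers should agree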
There are several immediate consequences of the Poisson formula.
• The value of the solution at the center of the disk is given by the average of its boundary value.
u(0, θ) = (1/2π) ∫_{−π}^{π} g(ϕ) dϕ.
This is a particular case of the following general fact: Let u be harmonic (i.e., solves the Laplace
equation) inside the disk of radius a with the center (x0 , y0 ), then
u(x0, y0) = (1/(2πa)) ∮_γ u ds,
where γ is the boundary of the disk. This immediately follows from the Poisson formula after
shift and rescaling.

• If u is a nonconstant harmonic function defined in D then it cannot achieve its local maximum
or local minimum at any interior point in D. This is true since the average of a continuous real
function lies strictly between its minimal and maximal values, and hence, due to the previous
point, I cannot have a local minimum or local maximum at an interior point.

• Immediately from the previous I have that if u is harmonic in D and if m and M are the minimal
and maximal values of u on the boundary of D, then

m ≤ u(x, y) ≤ M

anywhere in D. This statement is called the maximum principle for the Laplace equation.

• If u1 and u2 solve the same Poisson equation −∆u = f on D with the same boundary conditions
then u1 = u2 within D, that is, the solution to the Dirichlet boundary value problem for
the Poisson equation is unique. This follows from linearity of the equation and the maximum
principle. Indeed, by linearity function v = u1 − u2 solves the Laplace equation ∆u = 0 with
homogeneous boundary conditions v = 0 on ∂D. By the maximum principle this implies that
v(x, y) = 0 for all (x, y) ∈ D and hence u1 = u2 .

Chapter 3

Delta function. Green’s functions.


Fourier transform

3.1 Delta function


§450. The following story is true. There was a little boy, and his father said, “Do try to be like
other people. Don’t frown.” And he tried and tried, but could not. So his father beat him with a
strap; and then he was eaten up by lions.
Reader, if young, take warning by his sad life and death. For though it may be an honour to be
different from other people, if Carlyle’s dictum about the 30 million be still true, yet other people
do not like it. So, if you are different, you had better hide it, and pretend to be solemn and wooden-
headed. Until you make your fortune. For most wooden-headed people worship money; and, really,
I do not see what else they can do. In particular, if you are going to write a book, remember the
wooden-headed. So be rigorous; that will cover a multitude of sins. And do not frown.

Oliver Heaviside (1850–1925)


Electromagnetic Theory, Vol. 3

My next goal is to introduce the so-called Green’s functions for solving stationary boundary value
problems. There are different ways to define these functions, I am going to pick one that makes a
heavy use of the so-called delta function, which is, strictly speaking, not a function. (Delta function is
often called Dirac’s delta function, although there are strong reasons to believe that Dirac picked delta
function from Heaviside’s work.) There exists a rigorous theory of generalized function or distributions,
of which delta function is just one example, but since my use of this theory will be quite limited I will
frequently appeal to intuition and natural properties, instead of providing mathematical proofs that
the manipulations I perform are legitimate.
Consider a mass m moving along the x-axis with constant speed v.1 At the time t = t0 an elastic
collision with a wall occurs (a collision is called elastic if the total kinetic energy of the two bodies
after the encounter is equal to their total kinetic energy before the encounter). After the collision the
mass moves in the opposite direction with the same speed. If v1 , v2 denote the speeds at times t1 , t2
1
I am copying this example from Salsa, Sandro. Partial differential equations in action: from modelling to theory.
Vol. 86. Springer, 2015, where also the theory of generalized functions can be found

then by the laws of mechanics

m(v2 − v1) = ∫_{t1}^{t2} F(t) dt,
where F denotes the intensity of the force acting on mass m. If t1 < t2 < t0 or t0 < t1 < t2 then no
problem, v1 = v2 and F = 0 since no force is acting. If, however, t1 < t0 < t2 then the left hand side
of my equality is 2mv, but the right hand side, as we all know from the calculus, must be zero, since F
is zero everywhere except t = t0 , and therefore we get a contradiction. To get rid of this contradiction
I introduce the delta-function by means of a physical definition

δ(t) = 0, t ̸= 0,

and

∫_R δ(t) dt = 1.
Then if I put
F (t) = 2mvδ(t − t0 )
then the contradiction in my reasonings evaporates. The price I pay is that now I need to be very
careful when dealing with δ(t) since no usual function has these properties and therefore I am dealing
with an unknown object. Mathematically, we still need a rigorous definition of δ(t), whereas physically
delta function is a mathematical model of something, concentrated at a point (a point mass, a unit
charge, a unit intensity of the force, etc).
There are two quite different mathematical definitions of delta function. The first one uses the so
called delta-like sequence of ordinary functions

δn (t),

which satisfies

lim_{n→∞} δn(t) = 0,   t ≠ 0,

and

∫_R δn(t) dt = 1.

Then, by definition, the delta function is formally

δ(t) = lim_{n→∞} δn(t).

I write "formally" because this limit still has no meaning among the usual functions, but the good part of this definition is that it allows us to find a solution to problems involving delta functions by first solving a sequence of problems with "usual" delta-like functions, and then passing to the limit. Here is a simple example. As a delta-like sequence I will take

δn(t) = n/2,  −1/n < t < 1/n,
        0,    otherwise.
Clearly this definition satisfies the required properties. Using this sequence I will calculate

∫_R δ(t) f(t) dt,

for an arbitrary continuous function f . I have
∫_R δ(t) f(t) dt = lim_{n→∞} ∫_R δn(t) f(t) dt
                 = lim_{n→∞} (n/2) ∫_{−1/n}^{1/n} f(t) dt
                 = lim_{n→∞} (n/2) · (2/n) f(ξ),   −1/n ≤ ξ ≤ 1/n,
                 = f(0),

where I used the integral mean value theorem. By using this calculation I now gave meaning to the
meaningless before expression ˆ
δ(t)f (t) dt = f (0).
R
By the intuitively obvious shifting property I have that the delta function concentrated at the point ξ, which is sometimes denoted as δ(t − ξ) or (do not confuse it with a delta-like sequence!) δξ(t), satisfies

∫_R δ(t − ξ) f(t) dt = f(ξ)

for any continuous f. That is (and this is very important to us), an arbitrary continuous function f can be represented as an infinite linear combination (with weights f(ξ)) of delta functions concentrated at different points ξ.
To be completely rigorous I must also prove that my result does not depend on a particular choice
of delta-like sequence, which I leave for a curious student.
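A numerical illustration of the first definition (my own addition, using the box-shaped sequence above and an arbitrary test function) shows the convergence ∫ δn(t) f(t) dt → f(0):

import numpy as np
from scipy.integrate import quad

f = np.cos                                   # an arbitrary continuous test function

def delta_n(t, n):
    """Box-shaped delta-like sequence: n/2 on (-1/n, 1/n), 0 otherwise."""
    return n / 2.0 if abs(t) < 1.0 / n else 0.0

for n in (1, 10, 100, 1000):
    val = quad(lambda t: delta_n(t, n) * f(t), -1.0 / n, 1.0 / n)[0]
    print(n, val)                            # approaches f(0) = 1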
Just few words about the second approach to define the delta-function (much more in the cited
book). It is much more mathematically pleasing to define the delta-function as a linear functional
that acts on some space of test functions such that for each test function f this functional returns the
value of this function at zero, formally
⟨δ, f ⟩ = f (0).
To play with the first definition, let me calculate
∫_{−∞}^{t} δ(t) dt.

I use again the same strategy:

∫_{−∞}^{t} δ(t) dt = lim_{n→∞} ∫_{−∞}^{t} δn(t) dt
                   = lim_{n→∞} { 0 for t < −1/n;  (n/2)(t + 1/n) for −1/n ≤ t ≤ 1/n;  1 for t > 1/n }
                   = { 0 for t < 0;  1 for t > 0 }  =: χ(t).

That is, the result is the unit step function or, how it is also frequently called, Heaviside’s function2
(please note that the textbook uses a different notation for the Heaviside function).
Now we can integrate the delta function; what about differentiation? We know from Calculus that if

∫_a^x f(t) dt = F(x)

then (by the fundamental theorem of Calculus)

F′(x) = f(x).

Therefore, I postulate that

(d/dt) χ(t − ξ) = δ(t − ξ).
That is, the delta function (which, one more time, is not a function) is the derivative of the Heaviside function. This definition allows me to find derivatives of any piecewise continuously differentiable function. For example let me find the derivative of

f(t) = −1 for t < 0,   1 for t > 0.

I can represent my function as

f(t) = 1 · χ(t) + (−1) · χ(−t).

Hence

f′(t) = 2δ(t),
since

(1)′ = (−1)′ = 0,   χ′(t) = δ(t),   (χ(−t))′ = −δ(−t) = −δ(t),

using the chain rule and the intuitively obvious fact that the delta function is even.
It is not difficult to calculate a derivative of a delta-function itself. Indeed,
δ′(t) = lim_{n→∞} (d/dt) δn(t).
I leave it as an exercise to prove that
∫_R δ′(t) f(t) dt = ⟨δ′, f⟩ = −f′(0).
Exercise 3.1. Find Fourier series of the Heaviside function and the delta function. Observe that the
Fourier series for δ can be obtained by formally differentiating the Fourier series for the Heaviside
function (keep in mind that what we find is the Fourier series for 2π periodic extensions).
In the following I will also need delta functions that depend on more than one independent variable.
In this case it is convenient to set, for, e.g., x = (x1 , x2 , x3 ), ξ = (ξ1 , ξ2 , ξ3 ),
δ(x − ξ) = δ(x1 − ξ1 )δ(x2 − ξ2 )δ(x3 − ξ3 ),
and this is a very rare case when we are allowed to multiply delta functions. In general, the operation
of multiplication is not defined.
2
In a way this is really mocking to keep the name of Heaviside to the most trivial object of the whole theory.

3.2 Green’s function for a second order ODE
The ultimate goal of this part of the course is to learn how it is possible, at least in principle, to solve
the boundary value problem for the Poisson equation

−∆u = f, x ∈ D,

with the Dirichlet boundary conditions


u|x∈∂D = h.
Here x ∈ Rᵐ, and the dimension of the space is, of course, important for how to proceed. In this section I will consider the case m = 1, and hence the domain D inevitably becomes an interval (a, b), which for simplicity I take to be (0, 1), and the Poisson equation turns into an elementary second order ODE:

−cu′′ = f, x ∈ (0, 1), (3.1)

(here c is a constant, which I keep to be consistent with the textbook) with the type I boundary
conditions, which I take to be homogeneous

u(0) = u(1) = 0. (3.2)

Clearly, we do not need any special methods to solve this problem, since we can always integrate the equation twice and determine u up to two arbitrary constants, which, in their turn, can be determined from the boundary conditions. I, however, choose a somewhat more complicated way to attack this problem, one that can be used in many other situations, as opposed to direct integration, which happens to work here.
To solve problem (3.1), (3.2) I will appeal to the principle of superposition again. Problem (3.1)
is linear (since it involves a linear differential operator), and hence I can represent the solution as a
linear combination of some other basic solutions. To figure out what these other solutions are, I recall
from the previous lecture that for any continuous f I can write
f(x) = ∫_0^1 f(ξ) δ(ξ − x) dξ = ∫_0^1 f(ξ) δ(x − ξ) dξ,

i.e., I represent f as a linear (continuous) combination of itself with delta-functions concentrated at


each point ξ on the interval (0, 1). Hence, using my intuition from the discrete case, I hope that if I
am able to solve problem (3.1), (3.2) with f (x) = δ(x − ξ), then the whole solution can be found as a
linear combination. Since the (family of) solutions to my problem with the right hand side given by
a delta function play such an important role, I introduce
Definition 3.1. The function G(x; ξ), that depends on the real variable x and parameter ξ, that solves
the problem
−cG′′ (x; ξ) = δ(x − ξ), G(0; ξ) = G(1; ξ) = 0,
is called Green’s (or response) function of the boundary value problem (3.1), (3.2).
The word “response” in the definition comes from the fact that Green’s function physically is a
reaction of the system for a unit external force applied at the point ξ ∈ (0, 1). I claim that if I know
G then I can solve (3.1), (3.2) as
u(x) = ∫_0^1 f(ξ) G(x; ξ) dξ,     (3.3)

using again the principle of superposition. As a heuristic argument consider the following line of
reasonings (not a proof!): Let (3.3) be true. Then, by plugging this expression into the equation, I
get

−c (d²/dx²) ∫_0^1 f(ξ) G(x; ξ) dξ = ∫_0^1 f(ξ) δ(x − ξ) dξ,

which implies (if I am allowed to switch the order of integration and differentiation)

∫_0^1 f(ξ) ( −cG″(x; ξ) − δ(x − ξ) ) dξ = 0,

which is true, due to the definition of G. The boundary conditions are also satisfied and hence my
representation of the solution indeed works.
Example 3.2. So, let us actually find my Green’s function in the simplest possible case. I have, once
again,
−cG′′ = δ(x − ξ), G(0; ξ) = G(1; ξ) = 0.
I get

G′(x; ξ) = −(1/c) ∫ δ(x − ξ) dx = −(1/c) χ(x − ξ) + A,

where χ is the Heaviside function. Integrating one more time, I get

G(x; ξ) = −(1/c) ρ(x − ξ) + Ax + B,

where ρ is the so-called ramp function, which is the integral of χ and defined as

ρ(x − ξ) = 0 for x ≤ ξ,   x − ξ for x ≥ ξ.

Now I can determine the constants A and B from the boundary conditions. The first boundary
condition implies that B = 0. The second one yields (note that 1 ≥ ξ)
condition implies that B = 0. The second one yields (note that 1 ≥ ξ)

−(1 − ξ)/c + A = 0   =⇒   A = (1 − ξ)/c.

Therefore, I can write my final answer as

G(x; ξ) = ( (1 − ξ)x − ρ(x − ξ) )/c = { (1 − ξ)x/c, x ≤ ξ;   (1 − x)ξ/c, x ≥ ξ }.

Now my solution for an arbitrary right hand side can be written as

u(x) = ∫_0^1 f(ξ) G(x; ξ) dξ = (1/c) ∫_0^x f(ξ)(1 − x)ξ dξ + (1/c) ∫_x^1 f(ξ)(1 − ξ)x dξ.
Let me check one more time that my formula actually works. Take, e.g., f(x) = 1; then the solution is

u(x) = ((1 − x)/c) ∫_0^x ξ dξ + (x/c) ∫_x^1 (1 − ξ) dξ = (1 − x)x²/(2c) + (1 − x)²x/(2c) = (x − x²)/(2c).
I will leave it as an exercise to find the same answer by direct integration of the problem (3.1), (3.2).
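Here is also a small Python sketch that I add for illustration (the constant c and the right hand side f are assumed just for the check): it evaluates the Green's function representation by numerical quadrature and compares it with the closed form found above for f(x) = 1.

import numpy as np
from scipy.integrate import quad

c = 2.0                       # assumed constant in -c u'' = f
f = lambda xi: 1.0            # right hand side used in the check

def G(x, xi):
    """Green's function of -c u'' = delta(x - xi), u(0) = u(1) = 0."""
    return (1 - xi) * x / c if x <= xi else (1 - x) * xi / c

def u(x):
    return quad(lambda xi: f(xi) * G(x, xi), 0, 1, points=[x])[0]

for x in (0.25, 0.5, 0.75):
    print(u(x), (x - x**2) / (2 * c))    # the two columns should coincide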

The last example shows several important properties of Green’s function. In particular, this
function is continuous, but not continuously differentiable. Its derivative has a jump of magnitude
−1/c at the point x = ξ. This function satisfies the boundary conditions and the equation at any point
except x = ξ. These properties in general are enough to determine Green’s function for more general
problems (which are beyond this course). Note also that the found Green’s function is symmetric:
G(x; ξ) = G(ξ; x), which physically means that the impulse applied at the point ξ feels the same way
at the point x as it felt at the point ξ if applied at the point x (mathematically, this is a manifestation
of the fact that my differential operator plus boundary conditions is self-adjoint).
Can we always find a Green’s function? Not really, as the following example shows.
Example 3.3. Let me change the boundary conditions for Type II:
−cG′′ = δ(x − ξ), x ∈ (0, 1), ξ ∈ (0, 1), G′ (0; ξ) = G′ (1; ξ) = 0.
Acting exactly as in the previous example I find again
G(x; ξ) = −(1/c) ρ(x − ξ) + Ax + B.
The first boundary condition implies that A = 0. However, the second boundary condition yields
−1/c + A = 0,
and hence I cannot come up with a suitable A. The reason for this to happen is that the Neumann
boundary value problem
−cu′′ = f, u′ (0) = u′ (1) = 0
does not have a unique solution. Indeed, take f (x) = 1 and conclude that there is no solution at all.
On the other hand, take f (x) = x − 1/2 and conclude that there are infinitely many solutions. This
gives at least some reason to understand why in this particular case there exists no Green’s function.
Actually, a deeper analysis of the problem shows that Green's function exists if and only if the corresponding non-homogeneous (f ≠ 0) boundary value problem has a unique solution, which in turn happens if and only if the only solution to the corresponding homogeneous (f = 0) boundary value problem is the zero function.

3.3 Green’s function for the Poisson equation


Now we have some experience working with Green’s functions in dimension 1, therefore, we are ready
to see how Green’s functions can be obtained in dimensions 2 and 3. That is, I am looking to solve
−∆u = f, x ∈ D ⊆ Rm , m = 2, 3, (3.4)
with the boundary conditions
u|_{x∈∂D} = 0.     (3.5)
To solve problem (3.4), (3.5) I need to find Green’s function G(x; ξ), i.e., the solution to
−∆G(x; ξ) = δ(x − ξ), x, ξ ∈ D ⊆ Rm , m = 2, 3
(3.6)
G(x; ξ) = 0, x ∈ ∂D.
But before attacking problem (3.6), I will look into the problem without the boundary conditions.

3.3.1 Fundamental solution to the Laplace equation
Definition 3.4. The solution G0 to the problem

−∆G0 (x; ξ) = δ(x − ξ), x, ξ ∈ Rm (3.7)

is called the fundamental solution to the Laplace equation (or free space Green’s function).

Planar case m = 2
To find G0 I will appeal to the physical interpretation of my equation. Physically to solve (3.7) means
to find a potential of the gravitational (or electrostatic) field, caused by the unit mass (unit charge)
positioned at ξ. The field itself is found as the gradient of G0 . Since I do not expect to have for
my gravitation field any preferred directions, I conclude that my potential should only depend on the
distance r = |x − ξ| between the points x and ξ and not on any angle. Next, I will use the fact that
G0 satisfies the Laplace equation ∆G = 0 at any point except ξ. Using the polar form of the Laplace
operator and the fact that my potential depends only on r, I get

rG′′0 + G′0 = 0

I solve this equation when I used the separation of variables for the Laplace equation in polar coordi-
nates. The general solution is given by

G0 (r) = A log r + B.

Now I note that constant B will not contribute to the delta function, since it is infinitely differentiable,
hence my fundamental solution has the form A log r, and I need only to determine constant A. For
this I will use the characteristic property of the delta function that
∫_{R²} δ(x − ξ) dξ = 1

and the divergence theorem that says that for a nice domain D and smooth vector field F
∫_D ∇ · F dx = ∮_{∂D} F · n̂ dS,

where n̂ is the outward normal to D.


Consider a disk Dϵ of radius ϵ around ξ. Then I have

1 = ∫_{R²} δ(x − ξ) dx = ∫_{Dϵ} δ(x − ξ) dx
  [due to (3.7)] = −A ∫_{Dϵ} ∆ log r dx = −A ∫_{Dϵ} ∇ · ∇ log r dx
  [due to the divergence theorem] = −A ∮_{∂Dϵ} ∇ log r · n̂ dS
  [why?] = −A ∮_{∂Dϵ} (d log r/dr) dS = −A ∫_0^{2π} (1/r) r dφ = −2πA,

hence

A = −1/(2π),

and therefore

G0(x; ξ) = −(1/2π) log |x − ξ| = −(1/4π) log( (x − ξ)² + (y − η)² )
is the fundamental solution to the planar Laplace equation or, physically, the potential of the gravitational (or electrostatic) field induced by the unit mass (charge). Note that for the field itself

dG0/dr = −1/(2πr),

that is, the force is inversely proportional to the distance between the points.

Case m = 3
Very briefly, and invoking exactly the same reasonings, I find that my fundamental solution must
depend only on r = |x − ξ| and solve everywhere except point ξ the equation

rG′′0 + 2G′0 = 0

(see the expression of the Laplace operator in spherical coordinates). The general solution to this equation is

A/r + B,

and therefore it is reasonable to assume that

G0(r) = A/r.
Again, using the properties of delta function and the divergence theorem I get for a sphere Dϵ with
the center at ξ
1 = ∫_{R³} δ(x − ξ) dx = ∫_{Dϵ} δ(x − ξ) dx
  = −A ∫_{Dϵ} ∆(1/r) dx = −A ∫_{Dϵ} ∇ · ∇(1/r) dx
  = −A ∮_{∂Dϵ} ∇(1/r) · n̂ dS
  = A ∮_{∂Dϵ} (1/r²) dS = (A/ϵ²) ∮_{∂Dϵ} dS = (A/ϵ²) 4πϵ² = 4πA,

since the area of the sphere of radius ϵ is 4πϵ2 . Therefore, my fundamental solution is
G0(r) = 1/(4πr),
and the gravitational (or electrostatic) field exerts the force that is inversely proportional to the square
of the distance, as we all remember from our physics classes.
Exercise 3.2. Find the fundamental solution to the Laplace equation for any dimension m.

3.3.2 Green’s function for a disk by the method of images
Now, having at my disposal the fundamental solution to the Laplace equation, namely,
G0(x; ξ) = −(1/2π) log |x − ξ|,
I am in the position to solve the Poisson equation in a disk of radius a. That is, I consider the problem

−∆u = f, x ∈ D ⊆ R2 , D = {(x, y) : x2 + y 2 < a2 } (3.8)

with the homogeneous Dirichlet or Type I boundary conditions

u|x∈∂D = 0. (3.9)

I know that to be able to write the solution to my problem, I need the Green function that solves
−∆G(x; ξ) = δ(x − ξ), x, ξ ∈ D ⊆ R2 ,
(3.10)
G(x; ξ) = 0, x ∈ ∂D.
If I am able to figure out the solution to (3.10), then (3.8), (3.9), by the principle of superposition,
has the solution

u(x) = ∬_D f(ξ) G(x; ξ) dξ.
The key idea is to replace the problem (3.10) with another problem on the whole plane R2 , with an
additional source (or sources) outside of D, such that the boundary condition (3.9) would be satisfied
automatically.
I replace my problem (3.10) with the following

−∆G(x; ξ) = δ(x − ξ) − δ(x − ξ ∗ ), x ∈ R2 , |ξ| < a, |ξ ∗ | > a. (3.11)

Since I require the coordinates of my second source to be outside of my disk, within the disk, due to the properties of the delta function, (3.11) coincides with the equation (3.10). If I am capable
to determine the coordinates of my second source as a function of the coordinates of the source inside
the disk, such that for |x| = a my solution vanishes, then it means that I solved my problem. In other
words, I am looking for the coordinates ξ∗ of the image of the point ξ, and this explains the name of
the method.
So let me try to achieve my goal. I know that solution, again by the superposition principle, to
(3.11) is given by
G(x; ξ) = −(1/2π) log |x − ξ| + (1/2π) log |x − ξ∗| + c = (1/4π) log( |x − ξ∗|²/|x − ξ|² ) + c.
Hence, for |x| = a, I must have, due to (3.10),

|x − ξ|2 = k|x − ξ ∗ |2 , k = e4πc .

To see whether the last equality can be true, I consider

|x − ξ|²  = (x − ξ) · (x − ξ)   = |x|² + |ξ|²  − 2x · ξ  = a² + r0²   − 2ar0 cos θ,
|x − ξ∗|² = (x − ξ∗) · (x − ξ∗) = |x|² + |ξ∗|² − 2x · ξ∗ = a² + γ²r0² − 2γar0 cos θ,
where I assumed, to reduce the number of free parameters, that the angle θ between x and ξ and x
and ξ∗ is the same, that is ξ ∗ = γξ.
To get the required equality I must have

a² + r0² = ka² + kγ²r0²,
ar0 = kγar0,

from the second of which kγ = 1, and hence from the first

γ = a²/r0².
Problem solved! You can see geometrically that my point ξ∗ is one of the vertices of the triangle 0xξ∗ ,
which is similar by construction to the triangle 0xξ, see the figure.

Figure 3.1: The construction of the image of the source with coordinates ξ for the disk

Now I can write, using the polar coordinates of the point x as (r, ϕ) and of ξ as (r0 , ϕ0 ), that my
solution to (3.9), (3.10) has the form
G(r, ϕ; r0, ϕ0) = (1/4π) log( (r0²r² + a⁴ − 2a²rr0 cos(ϕ − ϕ0)) / (a²(r² + r0² − 2rr0 cos(ϕ − ϕ0))) ).
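A two-line numerical check that I add here (the disk radius and the source location are arbitrary assumed values) confirms that this expression vanishes on the circle r = a and blows up at the source, as a Green's function should:

import numpy as np

def G(r, phi, r0, phi0, a):
    num = r0**2 * r**2 + a**4 - 2 * a**2 * r * r0 * np.cos(phi - phi0)
    den = a**2 * (r**2 + r0**2 - 2 * r * r0 * np.cos(phi - phi0))
    return np.log(num / den) / (4 * np.pi)

a, r0, phi0 = 2.0, 0.7, 0.3                  # assumed disk radius and source location
print(G(a, 1.0, r0, phi0, a))                # ~ 0 on the boundary
print(G(r0, phi0 + 1e-6, r0, phi0, a))       # large positive value near the source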
Exercise 3.3. Find Green’s function for the unit sphere.
Similar approach works for some other domains (see the homework problems), but the list of such
domains is quite limited. There are other methods to infer Green’s function, but they are outside of
the scope of this introductory course. Probably still the best reference for a prepared reader to read
about various methods to find Green’s functions is the first volume of Courant and Hilbert Methods
of mathematical physics.

3.4 Fourier transform


3.4.1 A first look at the Fourier transform
In Math 266 you all studied the Laplace transform, which was used to turn an ordinary differential
equation into an algebraic one. There are a number of other integral transforms, which are useful in

one or another situation. Arguably, the most ubiquitous and general is the so-called Fourier transform,
which generalizes the technique of Fourier series on non-periodic functions. For the motivation of the
Fourier transform I recommend reading the textbook or other possible references (see below), in these
notes I start with a bare and dry definition.
Definition 3.5. The Fourier transform of the real valued function f of the real argument x is the
complex valued function fˆ of the real argument k defined as
f̂(k) = (1/√2π) ∫_{−∞}^{∞} f(x) e^{−ikx} dx.     (3.12)
The inverse Fourier transform, which allows to recover f if fˆ is known, is given by
f(x) = (1/√2π) ∫_{−∞}^{∞} f̂(k) e^{ikx} dk.     (3.13)
Remark 3.6. 1. Since the Fourier transform plays a somewhat auxiliary role in my course, I will
not dwell on it for very long. There are a lot of available textbooks on Fourier Analysis, I would
like to mention two additional sources: the lecture notes by Brad Osgood The Fourier transform
and its applications (they are freely available on the web) for a thorough introduction to the
subject, including careful coverage of the relations of the delta function and Fourier transform,
and Körner, T. W. Fourier analysis. Cambridge University Press, 1989 for the already initiated.
2. There are different definitions of the Fourier transform. I use the one which is used in the textbook. You can also find in the literature

f̂(k) = (1/A) ∫_{−∞}^{∞} f(x) e^{iBkx} dx,

where the following choices are possible:

A = √2π, B = ±1;   A = 1, B = ±2π;   A = 1, B = ±1.
The only difference in the computations is the factor which appears (or does not appear) in front
of the formulas. Be careful if you use some other results from different sources.
3. The notation is a nightmare for the Fourier transform. Very often, together with the hat, the
operator notation is used
fˆ(k) = Ff (x), f (x) = F −1 fˆ(k).
Both F and F −1 are linear operators, since for them, from the properties of the integral, it holds
that
F(αf (x) + g(x)) = αFf (x) + Fg(x),
and a similar expression is true for F −1 .
To emphasize that the pair f and f̂ are related, sometimes a notation like f(x) ⟷ f̂(k) is used.

4. In the definition of the Fourier transform I have an improper integral, which means that I have
to bother about the convergence. Moreover, note that the complex exponent, by Euler’s formula,
is a linear combination of sine and cosine, then my transform should be defined only for those f
that tend sufficiently fast to zero. This is actually a very nontrivial question, what is the space of
functions, on which the Fourier transform is naturally defined, but this will not bother me in my
course. Moreover, there will be some examples, which would definitely contradict the classical
understanding of the Fourier transform. I will treat them in a heuristic way remembering that
the rigorous justification can be made within the theory of generalized functions.

Consider several examples to get a feeling about the Fourier transform.

Example 3.7 (Fourier transform of the rect (for rectangle) function). Let
Πa(x) = 1 for |x| < a,   0 otherwise.

By the definition (3.12) I have


Π̂a(k) = (1/√2π) ∫_{−a}^{a} e^{−ikx} dx = (e^{ika} − e^{−ika})/(√2π ik) = √(2/π) (sin ak)/k.

It follows from the definition of the inverse Fourier transform that


Πa(x) = (1/√2π) ∫_{−∞}^{∞} √(2/π) (sin ak)/k · e^{ikx} dk.
The last integral is quite difficult to evaluate directly; the definitions of the Fourier transform and its inverse thus often give a direct way to evaluate some nontrivial integrals. Note that the Fourier transform in this example is even, which is not a coincidence.

Example 3.8 (The exponential decay). Let


fr(x) = 0 for x ≤ 0,   e^{−ax} for x > 0,

where a is a positive constant. Then


f̂r(k) = 1/(√2π (a + ik)),

which is complex valued even if k is real. Similarly, for


fl(x) = e^{ax} for x ≤ 0,   0 for x > 0,

I find

f̂l(k) = 1/(√2π (a − ik)).

Now I can easily calculate the Fourier transform for

f (x) = e−a|x| = fl (x) + fr (x),

using the linearity of F:

f̂(k) = f̂r(k) + f̂l(k) = √(2/π) a/(k² + a²),

which is an even function if k is real.


Example 3.9 (Duality principle). Let
$$f(x) = \frac{1}{x^2 + a^2}, \quad a > 0.$$
I have
$$\hat f(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{e^{-ikx}}{x^2 + a^2}\,dx.$$
This integral is difficult (actually, impossible) to evaluate using only basic knowledge from calculus. However, from the previous example I know that
$$e^{-a|x|} = F^{-1}\left(\sqrt{\frac{2}{\pi}}\,\frac{a}{k^2 + a^2}\right) = \frac{a}{\pi}\int_{-\infty}^{\infty} \frac{e^{ikx}}{k^2 + a^2}\,dk.$$
If I replace $k$ with $x$ and $x$ with $-k$, I get (up to a multiplicative factor) exactly the integral that I know how to find. Therefore I conclude that
$$\hat f(k) = \sqrt{\frac{\pi}{2}}\,\frac{e^{-a|k|}}{a}.$$
This is actually a consequence of the striking resemblance of the Fourier and inverse Fourier
transforms, the difference being just an extra minus sign. I would like to formulate this important
fact as a theorem.
Theorem 3.10. If the Fourier transform of f (x) is fˆ(k), then the Fourier transform of fˆ(x) is f (−k).
This theorem cuts the table of Fourier transforms in half, since if I know the Fourier transform $\hat f$ of $f$, this immediately means that I know the Fourier transform $\hat g$ of the function $g = \hat f$.
To practice this theorem, convince yourself that
$$F\left(\frac{\sin ax}{x}\right) = \sqrt{\frac{\pi}{2}}\,\Pi_a(k).$$
Example 3.11 (Fourier transform of the delta function). Let $f(x) = \delta(x)$. Then
$$\hat f(k) = \hat\delta(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \delta(x)\,e^{-ikx}\,dx = \frac{1}{\sqrt{2\pi}}.$$
Hence the Fourier transform of the delta function is a constant function. From here we can immediately obtain, invoking the duality principle, that the Fourier transform of the constant function 1 is
$$F(1) = \sqrt{2\pi}\,\delta(k),$$
that is, a multiple of the delta function! But stop: if I'd like to use my definition (3.12), then the integral
$$\int_{-\infty}^{\infty} 1\cdot e^{-ikx}\,dx,$$
strictly speaking, does not exist! Well, an exact meaning can be given to this integral within the framework of generalized functions, but this will not bother us here.

3.4.2 Properties of the Fourier transform. Convolution


Direct evaluation of the Fourier transform very often becomes quite tedious. A list of properties of the Fourier transform helps evaluate it in many special cases.

1. Shift theorem. If f (x) has Fourier transform fˆ(k) then the Fourier transform of f (x − ξ) is
e−ikξ fˆ(k). A very particular example of this property is
$$F\delta(x - \xi) = \frac{1}{\sqrt{2\pi}}\,e^{-ik\xi}.$$
Using the duality, the Fourier transform of eiηx f (x) is fˆ(k − η).

2. Dilation theorem. If f (x) has Fourier transform fˆ(k) then the Fourier transform of f (cx), c ̸= 0
is
$$\frac{1}{|c|}\,\hat f\!\left(\frac{k}{c}\right).$$
To practice this theorem, let me find the Fourier transform of $e^{-ax^2}$. I start with the Fourier transform of $e^{-x^2}$:
$$\hat f(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-x^2 - ikx}\,dx = \frac{e^{-k^2/4}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-(x + ik/2)^2}\,dx = \frac{e^{-k^2/4}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-y^2}\,dy = \frac{e^{-k^2/4}}{\sqrt{2}},$$
where I used the fact that $\int_{\mathbb R} e^{-x^2}\,dx = \sqrt{\pi}$.
Now, using the dilation theorem, I find
$$F\left(e^{-ax^2}\right) = \frac{1}{\sqrt{2a}}\,e^{-\frac{k^2}{4a}}.$$

3. Derivatives and Fourier transform. If f (x) has Fourier transform fˆ(k) then the Fourier transform
of f ′ (x) is ik fˆ(k). Hence the Fourier transform turns the differentiation into an algebraic operation
of multiplication by $ik$. An immediate corollary is that the Fourier transform of $f^{(n)}(x)$ is $(ik)^n \hat f(k)$. By the duality principle, or by a direct proof, the Fourier transform of $x f(x)$ is $i\,\dfrac{d\hat f}{dk}$.

4. Integration and Fourier transform. If $f(x)$ has Fourier transform $\hat f(k)$, then the Fourier transform of its integral $g(x) = \int_{-\infty}^{x} f(s)\,ds$ is
$$\hat g(k) = -\frac{i}{k}\,\hat f(k) + \pi \hat f(0)\,\delta(k).$$
Using this property I can immediately find the Fourier transform of the Heaviside function $\chi(x)$:
$$F\chi(x) = -\frac{i}{k}\,\frac{1}{\sqrt{2\pi}} + \sqrt{\frac{\pi}{2}}\,\delta(k).$$
Since
$$\operatorname{sgn} x = \chi(x) - \chi(-x),$$
then
$$F\operatorname{sgn} x = -i\sqrt{\frac{2}{\pi}}\,\frac{1}{k}.$$
Exercise 3.4. Prove all four properties of the Fourier transform.

5. Convolution. Now let me ask the following question: if I know that $f(x) \longleftrightarrow \hat f(k)$ and $g(x) \longleftrightarrow \hat g(k)$, then which function has the Fourier transform $\hat f(k)\hat g(k)$?
I have
$$\hat f(k)\hat g(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)e^{-ikx}\,dx \int_{-\infty}^{\infty} g(y)e^{-iky}\,dy = \frac{1}{2\pi}\iint_{\mathbb R^2} f(x)g(y)e^{-ik(x+y)}\,dx\,dy$$
$$= \frac{1}{2\pi}\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty} g(y)e^{-ik(x+y)}\,dy\right) f(x)\,dx = [x + y = s] = \frac{1}{2\pi}\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty} g(s-x)e^{-iks}\,ds\right) f(x)\,dx$$
$$= \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-iks}\left(\int_{-\infty}^{\infty} g(s-x)f(x)\,dx\right) ds.$$

So, if I introduce a new function
$$h(s) = \int_{-\infty}^{\infty} g(s-x)f(x)\,dx,$$
then I have showed that
$$\sqrt{2\pi}\,\hat f(k)\hat g(k) = \hat h(k).$$
Now I can formally state my result.

Definition 3.12. The convolution of two functions $f$ and $g$ is the function $h = f * g$ defined as
$$h(s) = (f * g)(s) = \int_{-\infty}^{\infty} f(s-x)g(x)\,dx.$$

Basically, by the above reasoning I proved

Theorem 3.13. Let $h = f * g$. Then
$$\hat h(k) = \sqrt{2\pi}\,\hat f(k)\hat g(k).$$
In the opposite direction, the Fourier transform of the product of two functions $u(x) = f(x)g(x)$ is
$$\hat u = \frac{1}{\sqrt{2\pi}}\,\hat f * \hat g.$$
The second part of the theorem can be proved in a similar way or using the duality principle.
It is instructive to prove that the convolution is commutative (f ∗ g = g ∗ f ), bilinear f ∗ (ag + bh) =
af ∗ g + bf ∗ h, and associative f ∗ (g ∗ h) = (f ∗ g) ∗ h.
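The convolution theorem is also easy to test numerically. The following minimal sketch (assuming Python with NumPy and SciPy; the two Gaussians are arbitrary test functions, not taken from the course) computes $f*g$ by quadrature, transforms it, and compares with $\sqrt{2\pi}\,\hat f\hat g$.

```python
import numpy as np
from scipy.integrate import quad

def ft(func, k):
    """(1/sqrt(2*pi)) * int func(x) e^{-ikx} dx, computed by quadrature."""
    re, _ = quad(lambda x: func(x) * np.cos(k * x), -np.inf, np.inf)
    im, _ = quad(lambda x: -func(x) * np.sin(k * x), -np.inf, np.inf)
    return (re + 1j * im) / np.sqrt(2 * np.pi)

f = lambda x: np.exp(-x**2)
g = lambda x: np.exp(-(x - 1)**2)          # a shifted Gaussian, to avoid accidental symmetry

def conv(s):
    """(f*g)(s) = int f(s - x) g(x) dx."""
    val, _ = quad(lambda x: f(s - x) * g(x), -np.inf, np.inf)
    return val

for k in [0.0, 0.8, 2.0]:
    lhs = ft(conv, k)                              # Fourier transform of the convolution
    rhs = np.sqrt(2 * np.pi) * ft(f, k) * ft(g, k)
    print(k, abs(lhs - rhs))                       # differences should be tiny
```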

3.5 Applications of Fourier transform to differential equations


Now I have done all the preparatory work to be able to apply the Fourier transform to differential equations. The key property in use here is the fact that the Fourier transform turns differentiation into multiplication by $ik$.

3.5.1 Space-free Green’s function for ODE


I start with an ordinary differential equation and consider the problem

−u′′ + ω 2 u = h(x)

in an infinite interval x ∈ R. ω here is a real parameter, and h is a given function. To solve this
problem I start with a space-free Green’s function, that, as you recall from one of the previous lectures,
must satisfy
−G′′0 + ω 2 G0 = δ(x − ξ), −∞ < x, ξ < ∞.
Let $\hat G_0$ be the Fourier transform of $G_0$. Then, using the properties of the Fourier transform, I have that $\hat G_0$ must satisfy
$$k^2 \hat G_0 + \omega^2 \hat G_0 = \frac{e^{-ik\xi}}{\sqrt{2\pi}},$$
or
$$\hat G_0(k) = \frac{e^{-ik\xi}}{\sqrt{2\pi}\,(k^2 + \omega^2)}.$$
Using the table of inverse Fourier transforms, I find that
$$G_0(x;\xi) = \frac{1}{2\omega}\,e^{-\omega|x-\xi|}$$
is my free space Green’s function. Now, invoking again the superposition principle, the solution to
the original problem can be written as
$$u(x) = \int_{-\infty}^{\infty} h(\xi)G_0(x;\xi)\,d\xi = \frac{1}{2\omega}\int_{-\infty}^{\infty} h(\xi)\,e^{-\omega|x-\xi|}\,d\xi.$$

I actually never gave a proof of this formula, appealing only to linearity and our intuition about superposition. Let me get this answer from scratch, without using any delta function.
Applying the Fourier transform to the original problem I get
$$\hat u(k)(k^2 + \omega^2) = \hat h(k) \implies \hat u(k) = \frac{\hat h(k)}{k^2 + \omega^2}.$$
On the right-hand side I have a product of two Fourier transforms. To use my convolution formula I need to account for the factor $\sqrt{2\pi}$ in front. I have that the inverse Fourier transform of $\hat h(k)$ is $h(x)$, and that of $1/(k^2+\omega^2)$ is $\sqrt{\pi/2}\,e^{-\omega|x|}/\omega$, which all put together gives again the same
$$u(x) = \frac{1}{2\omega}\int_{-\infty}^{\infty} h(\xi)\,e^{-\omega|x-\xi|}\,d\xi.$$
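If you want to double-check this Green's function representation, here is a small numerical sketch (assuming NumPy/SciPy; the right-hand side $h$ and the value of $\omega$ are made up for illustration): it evaluates the convolution integral by quadrature and verifies that $-u'' + \omega^2 u \approx h$ using a finite-difference approximation of $u''$.

```python
import numpy as np
from scipy.integrate import quad

omega = 2.0
h = lambda x: np.exp(-x**2)               # a sample right-hand side

def u(x):
    """u(x) = (1/(2*omega)) * int h(xi) exp(-omega |x - xi|) d(xi)."""
    val, _ = quad(lambda xi: h(xi) * np.exp(-omega * abs(x - xi)), -np.inf, np.inf)
    return val / (2 * omega)

# check -u'' + omega^2 u = h at a few points, using a centered difference for u''
dx = 1e-3
for x in [-1.0, 0.0, 0.5, 2.0]:
    upp = (u(x + dx) - 2 * u(x) + u(x - dx)) / dx**2
    residual = -upp + omega**2 * u(x) - h(x)
    print(f"x = {x:4.1f}:  residual = {residual:.2e}")   # residuals should be small
```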

3.5.2 General solution to the wave equation


Next, I will show what has to be modified in my method if I’d like to apply Fourier transform to a
PDE. For this I consider the initial value problem for the wave equation
utt = c2 uxx , −∞ < x < ∞, t > 0,
u(0, x) = f (x), −∞ < x < ∞,
ut (0, x) = 0.
Since my variable x in this problem takes the values from the whole real line, I will apply my Fourier
transform to the function u of two real variables t and x with respect to the variable x:
F [u(t, x)] = û(t, k).
Applying my Fourier transform to the equation I get
$$\frac{d^2\hat u}{dt^2} + c^2k^2\hat u = 0, \qquad \hat u(0) = \hat f(k), \quad \hat u'(0) = 0;$$
note that I use the ordinary derivatives since only the derivatives with respect to t are involved, and
the variable k can be simply considered as a parameter. That is, instead of PDE I end up with an
ODE, which is easy to solve. In particular, I have that my general solution is
û(t, k) = C1 (k) cos ckt + C2 (k) sin ckt,
where C1 , C2 are two arbitrary functions of k. Using the initial conditions, I find
û(t, k) = fˆ(k) cos ckt.
To take the inverse Fourier transform, I note that
$$\hat u(t, k) = \frac{1}{2}\hat f(k)\left(e^{ickt} + e^{-ickt}\right),$$
and hence the inverse Fourier transform (recall the shift theorem) yields the familiar formula
$$u(t, x) = F^{-1}[\hat u(t, k)] = \frac{f(x - ct) + f(x + ct)}{2},$$
which represents two traveling waves, one is going to the left and another one going to the right.

3.5.3 Laplace’s equation in a half-plane
Consider the following problem for the Green's function:
$$G_{xx} + G_{yy} = 0, \qquad y > 0, \quad -\infty < x < \infty,$$
with the boundary condition
$$G(x, 0) = \delta(x).$$
Since the $x$ variable runs from $-\infty$ to $\infty$, I will use the Fourier transform with respect to this variable:
$$-k^2\hat G + \hat G_{yy} = 0, \quad y > 0, \qquad \hat G(k, 0) = 1/\sqrt{2\pi}.$$
The general solution to this ODE in $y$ is
$$\hat G(k, y) = C_1(k)e^{ky} + C_2(k)e^{-ky},$$
and I must also have that my Fourier transform stays bounded as $k \to \pm\infty$; therefore I choose my solution as
$$\hat G(k, y) = \frac{1}{\sqrt{2\pi}}\,e^{-|k|y},$$
which satisfies both the equation and the boundary condition. Taking the inverse Fourier transform, I find
$$G(x, y) = \frac{y}{\pi(x^2 + y^2)}.$$

This Green’s function can be used immediately to solve the general Dirichlet problem for the Laplace
equation on the half-plane.
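Concretely, if the boundary data is $u(x, 0) = f(x)$, superposition of this Green's function gives the Poisson integral $u(x, y) = \int_{-\infty}^{\infty} f(\xi)\,\frac{y}{\pi((x-\xi)^2 + y^2)}\,d\xi$. Here is a small numerical sketch of this formula (assuming NumPy/SciPy; the boundary data $f$ is a made-up example); note how the values approach $f$ as $y \to 0^+$.

```python
import numpy as np
from scipy.integrate import quad

def u_half_plane(f, x, y):
    """u(x, y) = int f(xi) * y / (pi*((x - xi)^2 + y^2)) d(xi), the superposition of G."""
    integrand = lambda xi: f(xi) * y / (np.pi * ((x - xi)**2 + y**2))
    val, _ = quad(integrand, -200, 200, points=[x], limit=200)   # large finite window
    return val

f = lambda x: 1.0 / (1.0 + x**2)           # sample boundary data u(x, 0) = f(x)

for y in [1.0, 0.2, 0.05]:
    print(y, u_half_plane(f, 0.5, y), f(0.5))   # approaches f(0.5) as y -> 0+
```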

3.5.4 Fundamental solution to the heat equation


Solution to the problem
ut = α2 uxx , −∞ < x < ∞, t>0
with the initial condition
u(0, x) = δ(x)
is called a fundamental solution to the heat equation. The solution is almost immediate using the
Fourier transform. Applying the Fourier transform with respect to $x$, I find
$$\hat u_t = -\alpha^2 k^2 \hat u, \qquad \hat u(0, k) = \frac{1}{\sqrt{2\pi}},$$
which implies, after an integration,
$$\hat u(t, k) = \frac{1}{\sqrt{2\pi}}\,e^{-\alpha^2 k^2 t},$$
which is a Gaussian function. Recall that the inverse Fourier transform of a Gaussian is again a Gaussian:
$$F\left[e^{-ax^2}\right] = \frac{1}{\sqrt{2a}}\,e^{-k^2/(4a)}.$$

Figure 3.2: Fundamental solution to the heat equation at different time moments. Note that if t → 0+
then Φ(t, x) → δ(x)

Carefully using the dilation theorem, I find that
$$u(t, x) = \Phi(t, x) = F^{-1}\hat u(t, k) = \frac{1}{2\alpha\sqrt{\pi t}}\,e^{-\frac{x^2}{4\alpha^2 t}}.$$
The graphs of this function are shown in Figure 3.2. You can convince yourself that the integral
$$\int_{\mathbb R} \Phi(t, x)\,dx = 1$$
for any time $t$, and since $\Phi(t, x) > 0$ for all $x$ and $t > 0$, $\Phi$ is, in terms of probability theory, a probability density function. This is actually the density of a normal distribution with mean zero and standard deviation $\sigma = \sqrt{2t}\,\alpha$, which connects the random walk model that leads to the diffusion equation with the solution to this equation.
One of the very important consequences of this solution is that it shows that in our model of heat spread the velocity of propagation of thermal energy is infinite. Indeed, the initial condition says that $u(0, x) = 0$ at every point except $x = 0$, and at the same time the solution shows that $u(t, x) > 0$ at any point $x$ for any time $t > 0$, which is equivalent to saying that the speed of spread of the heat is infinite. This by no means implies that the actual velocity of the spread of heat is infinite! It just shows a drawback of our model; if we have a problem at hand in which the velocity of heat spread is important, we have to replace the model.
If I change my initial condition to $u(0, x) = \delta(x - \xi)$, then I find
$$u(t, x) = \frac{1}{2\alpha\sqrt{\pi t}}\,e^{-\frac{(x-\xi)^2}{4\alpha^2 t}},$$
and hence the solution to the initial value problem for the heat equation with the initial condition
$$u(0, x) = f(x)$$
can be written, by the principle of superposition, as
$$u(t, x) = \frac{1}{2\alpha\sqrt{\pi t}}\int_{-\infty}^{\infty} e^{-\frac{(x-\xi)^2}{4\alpha^2 t}} f(\xi)\,d\xi.$$
It is instructive to obtain a proof of this formula by the Fourier transform method.
Unfortunately, it is quite difficult to evaluate the last integral for an arbitrary $f$. It can always be done, however, when $f(x) = e^{-(ax^2 + bx + c)}$, by completing the square. As a (tedious) exercise I ask you to prove that if $f(x) = e^{-x^2}$, then my solution to the initial value problem for the heat equation is (setting $\alpha = 2$ for simplicity)
$$u(t, x) = \frac{1}{\sqrt{16t + 1}}\,e^{-\frac{x^2}{16t+1}}.$$
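This closed form is convenient for checking the convolution formula numerically. The sketch below (assuming NumPy/SciPy, with $\alpha = 2$ and a time $t$ chosen arbitrarily) evaluates the integral by quadrature and compares it with the formula above.

```python
import numpy as np
from scipy.integrate import quad

alpha, t = 2.0, 0.3
f = lambda x: np.exp(-x**2)

def u_quad(x):
    """Convolution of the heat kernel with the initial data f."""
    kernel = lambda xi: np.exp(-(x - xi)**2 / (4 * alpha**2 * t)) * f(xi)
    val, _ = quad(kernel, -np.inf, np.inf)
    return val / (2 * alpha * np.sqrt(np.pi * t))

u_closed = lambda x: np.exp(-x**2 / (16 * t + 1)) / np.sqrt(16 * t + 1)

for x in [0.0, 0.5, 1.5]:
    print(x, u_quad(x), u_closed(x))       # the two columns should agree
```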

3.5.5 Duhamel’s principle revisited


Since I have a solution to the IVP for the heat equation, now I can solve the non-homogeneous problem

ut = α2 uxx + h(t, x), −∞ < x < ∞, t > 0,

with the initial condition


u(0, x) = f (x).
Let me write the solution to the homogeneous problem

vt = α2 vxx , v(0, x) = f (x)

as
$$v(t, x) = \int_{\mathbb R} G(t, x; \xi)f(\xi)\,d\xi,$$
where
$$G(t, x; \xi) = \frac{1}{2\alpha\sqrt{\pi t}}\,e^{-\frac{(x-\xi)^2}{4\alpha^2 t}}.$$
If I now consider a non-homogeneous problem with the zero initial condition:

wt = α2 wxx + h(t, x), w(0, x) = 0,

then it is a simple exercise to show that $v + w$ gives me the solution to the original problem. To solve the problem for $w$ I now recall Duhamel's principle, which we already used for solving the non-homogeneous wave equation. This principle boils down to the following:

1. Construct a family of solutions of homogeneous Cauchy problems with variable initial time τ > 0
and initial data h(τ, x).

2. Integrate the above with respect to τ over (0, t).

I will leave it as an exercise to prove the validity of this principle in this particular case. According
to this principle the solution to

qt = α2 qxx , q(τ, x) = h(τ, x)

can be used to find
$$w(t, x) = \int_0^t q(t, x; \tau)\,d\tau,$$
which finally gives me the following general solution:
$$u(t, x) = \int_{\mathbb R} G(t, x; \xi)f(\xi)\,d\xi + \int_0^t\!\!\int_{\mathbb R} G(t - \tau, x; \xi)h(\tau, \xi)\,d\xi\,d\tau.$$

3.6 Telegrapher’s equation


Information is power, and those that have access to it are powerful.

Senator Fred Thompson

In vain Whitehouse used his two thousand volt induction coils to try to push messages
through faster — after four weeks of this treatment the cable gave up the ghost; 2500 tons
of cable and £350000 of capital lay useless on the ocean floor.

T. W. Körner, Fourier Analysis.

A lot of exciting details can be found in the book by Körner; here I give only a brief synopsis of the story.
The electric telegraph was invented in 1830, and cables immediately started connecting the main cities of Europe. In 1850 a cable connected London and France (after 12 hours of functioning the cable was accidentally cut by a ship); for the second attempt a much heavier cable was used, since it was discovered that a signal cannot be transmitted as fast through an underwater cable as along a cable on the ground. Faraday predicted this effect because of the increased capacitance of undersea cables. Eventually it was understood that it was important to have a cable connecting the USA and England; a cable 2500 miles long was required to make this possible. The first attempt was made in 1857; the cable snapped after 335 miles. In 1858 a new attempt to lay the cable failed because of a storm. Finally, later in 1858 a cable was laid, and on August 16th it took 16 and a half hours to receive a 90-word greeting (stocks of the company that owned the cable went down); what happened next with the cable is described in the quotation above. In 1864 another attempt was made: the Western Union company decided to route the cable through Russia. At the same time, in 1865 another company tried again through the Atlantic ocean, and after 1250 miles the cable parted from the ship. Finally, in 1866 a new cable connected New York and London. Due to some mechanical adjustments, this time it was actually possible to send and receive signals.
Now I will try to explain what happened with the signal initially, and how engineers actually solved
the problem.
Let me introduce the following variables: i is the current, v is the potential, L is the inductance,
C is the capacitance, R is the resistance, G is the leakage conductance. If I apply the usual physical
laws that describe the change of the current and potential in my cable, I end up with the system of
first order partial differential equations:
ix + Cvt + Gv = 0,
vx + Lit + Ri = 0,

which is called the system of telegrapher’s equations. I assume that I have no boundary conditions:
−∞ < x < ∞. If I differentiate the first equation with respect to t and the second one with respect
to x, I can write that

vxx + Litx + Rix = R(−Cvt − Gv) + vxx + L(−Cvtt − Gvt ) = 0,

or, after rearranging


(RC + LG)vt + LCvtt + GRv = vxx ,
which is now a linear second order PDE that resembles both the heat and the wave equations, which we know how to solve! To analyze this equation, I first would like to get rid of one term. To this end, I introduce the new variable
$$v = ue^{-\alpha t},$$
where α will be determined later. I find

vxx = uxx e−αt ,


vt = ut e−αt − αue−αt ,
vtt = utt e−αt − 2αut e−αt + α2 ue−αt .

Plugging these expressions into my equation (and canceling the exponentials) I get

uxx = (RC + LG)(ut − αu) + RGu + LC(utt − 2αut + α2 u) =


= LCutt + ut (RC + LG − LC2α) + u(−α(RC + LG) + α2 LC).

That is, if I choose my $\alpha$ as
$$\alpha = \frac{RC + LG}{2LC} = \frac{1}{2}\left(\frac{R}{L} + \frac{G}{C}\right),$$
then my equation becomes
$$u_{tt} = a^2 u_{xx} + b^2 u,$$
where
$$a^2 = \frac{1}{LC}, \qquad b^2 = \frac{1}{4}\left(\frac{G}{C} - \frac{R}{L}\right)^2.$$
To understand what is happening with my signal, I will look for a solution to my telegrapher’s
equation in the form of a traveling wave:

u(t, x) = f (x − ct) = f (ξ).

I know that the wave equation has perfectly nice traveling wave solutions, and this is how I would like
my signal to propagate. After plugging this ansatz I have

(a2 − c2 )f ′′ + b2 f = 0,

where the derivative now is taken with respect to variable ξ.


I have several possible cases. First, b = 0 and c = ±a implies that any f can be a traveling wave
solution, and therefore one can pass any signal with the fixed speed a. In case of b > 0 I have the
characteristic polynomial
(a2 − c2 )λ2 + b2 = 0.

If (a2 − c2 ) < 0 we have unbounded solutions — physically not realistic. Hence I assume that a > |c|,
which would give me
$$\lambda_{1,2} = \pm \frac{b}{\sqrt{a^2 - c^2}}\,i.$$
This implies that my only allowable solutions are
$$f(\xi) = \cos k\xi, \qquad f(\xi) = \sin k\xi,$$
where
$$k = \frac{b}{\sqrt{a^2 - c^2}},$$

which is called the wave number. The wave length (which is the same as the wave period if my wave is regarded as a function of the time variable) is
$$\lambda = \frac{2\pi}{k} = \frac{2\pi}{b}\sqrt{a^2 - c^2},$$
which means that in this case the wave length must satisfy (and hence not every wave length can be transmitted!)
$$0 < \lambda < \frac{2\pi a}{b},$$
and each wave with a fixed length $\lambda$ must travel with its own velocity
$$c_\lambda = \pm\sqrt{a^2 - \frac{b^2\lambda^2}{4\pi^2}}.$$

This implies that in the cable with b ̸= 0 the signal must first be represented as a sum of several
harmonics (remember, only sine and cosine are traveling wave solutions), which can always be done
thanks to the Fourier series, but different harmonics move with different velocities and thus it is
almost impossible to understand what was initially sent. Hence, to fight this effect, one can choose the parameters such that $b = 0$:
$$\frac{G}{C} = \frac{R}{L},$$
and this will allow the signal to travel as a whole. Problem solved!
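To see the dispersion numerically, one can simply tabulate $c_\lambda$ for several wavelengths. The following sketch (the parameter values $L, C, R, G$ are made up purely for illustration) shows that for $b \ne 0$ different wavelengths travel at different speeds, while for a matched cable with $G/C = R/L$ all of them travel at the same speed $a$.

```python
import numpy as np

def velocities(L, C, R, G, wavelengths):
    a2 = 1.0 / (L * C)
    b2 = 0.25 * (G / C - R / L)**2
    # c_lambda = sqrt(a^2 - b^2 lambda^2 / (4 pi^2)); defined while the argument is positive
    return [np.sqrt(a2 - b2 * lam**2 / (4 * np.pi**2)) for lam in wavelengths]

wavelengths = [0.5, 1.0, 2.0, 4.0]
# made-up parameter values, chosen only to illustrate the effect
print(velocities(L=1.0, C=1.0, R=2.0, G=0.5, wavelengths=wavelengths))  # b != 0: speeds differ
print(velocities(L=1.0, C=1.0, R=2.0, G=2.0, wavelengths=wavelengths))  # G/C = R/L: all equal to a
```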

Chapter 4

Heat and wave equations in two (and


three) dimensions

4.1 Heat and wave equations on the plane. Separation of variables


revisited
Up till now we have studied mostly equations with only two independent variables. Now it is time to take a look at the case when there are at least three independent variables. Under such assumptions the heat and the wave equations describe heat transfer or wave processes in a planar medium, because one variable is time and the other two define a point in the plane. I will consider the heat equation in more detail, but basically no changes are needed to solve the wave equation.

4.1.1 Solving the Dirichlet problem for the heat equation in a rectangle
Consider the following problem:

ut = α2 (uxx + uyy ) = α2 ∆u, t > 0, (x, y) ∈ D, (4.1)

where α is a given real constant, and D is a rectangle, as in the figure. I supplement my equation
with the initial condition
u(0, x, y) = f (x, y), (x, y) ∈ D, (4.2)
and Type I or Dirichlet boundary conditions

u(t, x, y) = 0, (x, y) ∈ ∂D, t > 0, (4.3)

where ∂D is the boundary of D.


To attack this problem I first assume that my solution can be represented as

u(t, x, y) = T (t)V (x, y).

Plugging this into (4.1) yields that


T ′ V = α2 T ∆V,

Figure 4.1: The rectangular domain D

or, after rearranging,
$$\frac{T'}{\alpha^2 T} = \frac{\Delta V}{V} = -\lambda,$$
and since the left hand side depends only on t and the right hand side depends on (x, y) both fractions
must be equal to the same constant, which I, for notational reasons, denote as −λ. Now, after the
separation of variables, I end up with two differential equations, one is ordinary,

T ′ = −λα2 T, (4.4)

and the other one is still the partial differential equation

−∆V = λV. (4.5)

Equation (4.5) is called the Helmholtz equation. The boundary conditions (4.3) imply that I must
supplement problem (4.5) with the boundary conditions

$$V(x, y) = 0, \quad (x, y) \in \partial D,$$

which can be explicitly written, thanks to the simple geometry of D, as

V (0, y) = V (a, y) = V (x, 0) = V (x, b) = 0.

Helmholtz equation plus the boundary conditions constitute an eigenvalue problem for the Laplace
operator ∆, that is, I am required to find such values of the constant λ that this problem has a nonzero
solution.
To solve this eigenvalue problem I, one more time, will use the assumptions that I can separate
the variables:
V (x, y) = X(x)Y (y).
My equation becomes
X ′′ Y + XY ′′ = −λXY,
or, after rearranging,
$$\frac{X''}{X} = -\frac{Y''}{Y} - \lambda = -\mu,$$

where I used another constant −µ because the left hand side depends only on x and the right hand
side depends only on y. Hence (and using the boundary conditions), my eigenvalue problem for the
Laplace equation can be written as two boundary value problems for ordinary differential equations:

X ′′ + µX = 0, X(0) = X(a) = 0, (4.6)

and
Y ′′ + ηY = 0, Y (0) = Y (b) = 0, η = λ − µ. (4.7)
These are two Sturm–Liouville problems that we have already solved in the previous lectures; the solutions are
$$X_k(x) = A_k\sin\left(\sqrt{\mu_k}\,x\right), \quad \mu_k = \left(\frac{\pi k}{a}\right)^2, \quad k = 1, 2, \ldots,$$
$$Y_m(y) = B_m\sin\left(\sqrt{\eta_m}\,y\right), \quad \eta_m = \left(\frac{\pi m}{b}\right)^2, \quad m = 1, 2, \ldots.$$
Therefore, my eigenvalue problem for the Laplace operator has the eigenvalues
$$\lambda_{k,m} = \eta_m + \mu_k = \left(\frac{\pi k}{a}\right)^2 + \left(\frac{\pi m}{b}\right)^2, \quad k, m = 1, 2, \ldots,$$
and the eigenfunctions
$$V_{k,m}(x, y) = \sin\frac{\pi k x}{a}\,\sin\frac{\pi m y}{b}, \quad k, m = 1, 2, \ldots$$
a b
Now I can return to (4.4):
$$T_{k,m}(t) = a_{k,m}e^{-\lambda_{k,m}\alpha^2 t},$$
and
$$u_{k,m}(t, x, y) = T_{k,m}(t)V_{k,m}(x, y)$$
solves, by construction, equation (4.1) and satisfies the boundary conditions (4.3).
By the linearity of the original equation and the principle of superposition, the double infinite series
$$u(t, x, y) = \sum_{k=1}^{\infty}\sum_{m=1}^{\infty} a_{k,m}e^{-\lambda_{k,m}\alpha^2 t}\sin\frac{\pi k x}{a}\,\sin\frac{\pi m y}{b}$$

satisfies my equation and the boundary conditions. Using the initial conditions we can uniquely identify
constants ak,m , because, as before in the case of the Sturm–Liouville problem, the eigenfunctions are
orthogonal.
I introduce the following inner product of two functions defined on $D$:
$$\langle f, g\rangle = \iint_D f(x, y)g(x, y)\,dx\,dy.$$

Then it can be shown (remember that in the case of the rectangle the double integral is just a repeated
integral) that
$$\langle V_{k,m}, V_{p,q}\rangle = 0, \quad k \ne p \text{ or } m \ne q,$$
and
$$\langle V_{k,m}, V_{k,m}\rangle = \frac{ab}{4}.$$
Therefore, using the initial condition (4.2), I get that
$$f(x, y) = \sum_{k=1}^{\infty}\sum_{m=1}^{\infty} a_{k,m}V_{k,m}(x, y),$$
and hence, by orthogonality,
$$a_{k,m} = \frac{\langle f, V_{k,m}\rangle}{\langle V_{k,m}, V_{k,m}\rangle} = \frac{4}{ab}\iint_D f(x, y)\sin\frac{\pi k x}{a}\,\sin\frac{\pi m y}{b}\,dx\,dy, \quad k, m = 1, 2, \ldots

Moreover, if $a_{1,1} \ne 0$, then
$$r = \left(\frac{1}{a^2} + \frac{1}{b^2}\right)\pi^2\alpha^2$$
is the rate at which the rectangular plate approaches the obvious equilibrium $u(t, x, y) = 0$. Additionally, after some time,
$$u(t, x, y) \approx a_{1,1}e^{-rt}V_{1,1}(x, y).$$
We have solved our problem. Can we do the same calculations for some other $D$? Not so fast, even for the simplest $D$, as we will see below.
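For the reader who wants to see these formulas in action, here is a minimal numerical sketch (assuming NumPy/SciPy; the initial temperature $f$, the dimensions $a, b$, and $\alpha$ are made-up sample values): it computes the coefficients $a_{k,m}$ by numerical double integration and evaluates the truncated double series.

```python
import numpy as np
from scipy.integrate import dblquad

a, b, alpha = 1.0, 2.0, 0.5
f = lambda x, y: x * (a - x) * y * (b - y)     # sample initial temperature, zero on the boundary
K = M = 9                                      # truncation of the double series

def coeff(k, m):
    """a_{k,m} = (4/(a*b)) * double integral of f * sin(pi k x/a) * sin(pi m y/b)."""
    integrand = lambda y, x: f(x, y) * np.sin(np.pi * k * x / a) * np.sin(np.pi * m * y / b)
    val, _ = dblquad(integrand, 0, a, lambda x: 0, lambda x: b)
    return 4 * val / (a * b)

coeffs = {(k, m): coeff(k, m) for k in range(1, K + 1) for m in range(1, M + 1)}

def u(t, x, y):
    return sum(c * np.exp(-((np.pi * k / a)**2 + (np.pi * m / b)**2) * alpha**2 * t)
               * np.sin(np.pi * k * x / a) * np.sin(np.pi * m * y / b)
               for (k, m), c in coeffs.items())

print(u(0.0, 0.3, 1.1), f(0.3, 1.1))   # at t = 0 the truncated series should be close to f
print(u(0.5, 0.3, 1.1))                # decays towards the equilibrium u = 0
```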

4.1.2 Problem for a general domain


Consider now the same problem
$$u_t = \alpha^2\Delta u, \quad t > 0, \quad (x, y) \in D,$$

with the same initial condition


u(0, x, y) = f (x, y),
and the homogeneous Dirichlet boundary condition

u(t, x, y) = 0, (x, y) ∈ ∂D,

for some “nice” but otherwise arbitrary domain $D$. (By “nice” you can think of a bounded, simply connected domain with a piecewise smooth boundary.) Using the same separation of variables I again end up with an eigenvalue problem
eigenvalue problem
−∆V = λV, V (x, y) = 0, (x, y) ∈ ∂D
for the Laplace operator. A lot can be proved about the eigenvalues and eigenfunctions of this problem
without explicitly computing them. In particular,

• There are infinitely many positive eigenvalues

0 < λ1 < λ2 ≤ λ3 ≤ . . .

which tend to infinity as the index grows.

• The corresponding eigenfunctions are orthogonal, that is,
$$\langle V_k, V_m\rangle = \iint_D V_k(x, y)V_m(x, y)\,dx\,dy = 0, \quad k \ne m.$$

• The eigenfunctions form a complete system, that is, any sufficiently “nice” function can be represented as a convergent Fourier series
$$f = \sum_k c_k V_k, \qquad c_k = \frac{\langle f, V_k\rangle}{\langle V_k, V_k\rangle}.$$

Proof of these facts is well beyond the scope of the present course.

4.1.3 The planar heat equation on a disc


The question still remains: can we solve our equation explicitly in a domain that is different from a rectangle? The answer is positive, but this will require some additional work. Here I will show where the problems start.
So, consider now the same problem (4.1)-(4.2) with the difference that D = {(x, y) : x2 + y 2 < 1},
i.e., it is a unit disk. After separating the variables I will have the same problem for T and the
following Helmholtz equation
−∆V = λV,
with the homogeneous boundary condition

V (x, y) = 0, x2 + y 2 = 1.

I know that I must have infinitely many eigenvalues and eigenfunctions, the latter can be used to build
a series solution to the original problem, but how to find them analytically? A natural way to attack
this problem is to rewrite my equation in polar coordinates:
$$v_{rr} + \frac{1}{r}v_r + \frac{1}{r^2}v_{\theta\theta} + \lambda v = 0, \qquad v(1, \theta) = 0,$$
and look for a solution in the form
v(r, θ) = R(r)Θ(θ).
I get
$$R''\Theta + \frac{1}{r}R'\Theta + \frac{1}{r^2}R\Theta'' + \lambda R\Theta = 0,$$
or, after rearranging,
$$r^2\frac{R''}{R} + r\frac{R'}{R} + \lambda r^2 = -\frac{\Theta''}{\Theta} = \mu,$$
where I introduced another constant of separation because the left hand side depends only on r and
the right hand side depends only on θ.
I get now two ODE problems. The first one is

Θ′′ + µΘ = 0, Θ(−π) = Θ(π), Θ′ (−π) = Θ′ (π),

where the boundary conditions are from the periodicity requirement. This is our familiar Sturm–
Liouville problem with periodic boundary conditions, for which we know that

µ m = m2 , Θm (θ) = A cos mθ + B sin mθ, m = 0, 1, 2, . . .

Therefore for R I get

r2 R′′ + rR′ + (λr2 − m2 )R = 0, R(1) = 0, 0 ≤ r < 1.

I also supplement my boundary condition with the “physical” condition that R(r) must be bounded
for all $r$, in particular at $r = 0$: $|R(0)| < \infty$. To slightly simplify my problem I will introduce a new variable
$$z = \sqrt{\lambda}\,r,$$
and a new function
$$h(z) = h(\sqrt{\lambda}\,r) = R(r).$$
By the chain rule I have that
z 2 h′′ + zh′ + (z 2 − m2 )h = 0,
where now the derivatives are taken with respect to the new variable z. And now I am stuck since
there is no way to express a solution to this linear second order ordinary differential equation with
variable coefficients through the pool of elementary functions. I will need something else, and this
will require some facts from the so-called analytic theory of ODE.

4.2 Elements of analytic ODE theory. Bessel’s functions


Recall (I am changing the variables) that we need to solve the so-called Bessel’s equation

x2 u′′ + xu′ + (x2 − m2 )u = 0, m = 0, 1, 2, . . .

4.2.1 Elements of analytic ODE theory


Let
p(x)u′′ + q(x)u′ + r(x)u = 0 (4.8)
be a second order linear homogeneous ODE with non-constant coefficients. Recall that function f is
called analytic at x0 if it can be represented in some neighborhood of x0 by a convergent power series:

f (x) = u0 + u1 (x − x0 ) + u2 (x − x0 )2 + u3 (x − x0 )3 + . . .

The coefficients can be found by
$$u_k = \frac{f^{(k)}(x_0)}{k!},$$
where $k! = 1 \cdot 2 \cdot \ldots \cdot k$.

From the calculus course, for example, we know that the exponential, sine, and cosine are analytic everywhere in $\mathbb R$:
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots,$$
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \ldots,$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \ldots$$
Theorem 4.1. Consider problem (4.8) and assume p, q, r are analytic at x0 and p(x0 ) ̸= 0. Then
problem (4.8) with the initial conditions u(x0 ) = u0 , u′ (x0 ) = u1 has a unique analytic solution.
This theorem actually gives us a way to work through the problem. All we need to do is to look
for the solution in the form

u(x) = u0 + u1 (x − x0 ) + u2 (x − x0 )2 + . . .

and determine u2 , u3 , . . ..
Since the general solution to (4.8) is given by

u(x) = Aû(x) + B ǔ(x),

where A, B are two arbitrary constants and û and ǔ are two linearly independent solutions, we can
always use our power series method with two different (and linearly independent) initial conditions,
e.g., we can take
û(x0 ) = 1, û′ (x0 ) = 0,
and
ǔ(x0 ) = 0, ǔ′ (x0 ) = 1.
In one of the previous lectures we already saw this method applied to the equation u′′ + ω 2 u = 0.
Consider another example.
Example 4.2 (Airy equation). Consider

u′′ − xu = 0,

and take the initial conditions

u(0) = 1 = u0 , u′ (0) = 0 = u1 .

The stated theorem obviously works in this case, since $p$ is constant and $r(x) = -x$. I take

u(x) = u0 + u1 x + u2 x2 + u3 x3 + u4 x4 + . . .

and hence
u′′ (x) = 2u2 + 3 · 2u3 x + 4 · 3u4 x2 + . . . .
Plugging the obtained expressions into my equation I find

$$2u_2 + 3\cdot 2\,u_3 x + 4\cdot 3\,u_4 x^2 + 5\cdot 4\,u_5 x^3 + \ldots = u_0 x + u_1 x^2 + u_2 x^3 + u_3 x^4 + u_4 x^5 + \ldots$$

Two convergent power series are equal only if the coefficients of the same powers are equal, that is,
2u2 = 0,
6u3 = u0 ,
12u4 = u1 ,
20u5 = u2 ,
30u6 = u3 ,
...
(n + 1)(n + 2)un+2 = un−1 ,
...
Using the initial conditions I find
$$u_{3k} = \frac{u_{3k-3}}{3k(3k-1)}, \quad k = 1, 2, 3, \ldots,$$
and all other $u_i$ are zero. The last expression is enough to conclude that my first linearly independent solution to the Airy equation is
$$\hat u(x) = \sum_{k=0}^{\infty} u_{3k}x^{3k},$$
which, as can be proved, converges for any $x \in \mathbb R$. I will leave it as an exercise to show that for the initial conditions $u(0) = 0$, $u'(0) = 1$ the solution is
$$\check u(x) = \sum_{k=0}^{\infty} u_{3k+1}x^{3k+1}, \qquad u_{3k+1} = \frac{u_{3k-2}}{(3k+1)3k}.$$
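It is instructive to compare the truncated power series with a direct numerical integration of the same initial value problem. The sketch below (assuming NumPy/SciPy are available) does exactly that for the first solution $\hat u$.

```python
import numpy as np
from scipy.integrate import solve_ivp

def airy_series(x, n_terms=30):
    """Power series solution of u'' = x u with u(0) = 1, u'(0) = 0:
    u_{3k} = u_{3k-3} / (3k (3k-1)), all other coefficients vanish."""
    total, c = 1.0, 1.0
    for k in range(1, n_terms):
        c /= (3 * k) * (3 * k - 1)
        total += c * x**(3 * k)
    return total

# integrate the same ODE numerically and compare
sol = solve_ivp(lambda x, y: [y[1], x * y[0]], (0, 2), [1.0, 0.0],
                dense_output=True, rtol=1e-10, atol=1e-12)
for x in [0.5, 1.0, 2.0]:
    print(x, airy_series(x), sol.sol(x)[0])   # the two values should agree closely
```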

4.2.2 Solving Bessel’s equation


Unfortunately, the same method will not work verbatim for Bessel's equation if I'd like to build a power series solution around 0. The reason is that $x_0 = 0$ is not a regular point, meaning that $p(0) = 0$. For Bessel's equation $x_0 = 0$ is a singular point, but fortunately for us a regular singular point (a point $x_0$ is called a regular singular point if the equation can be written as $(x - x_0)^2 a(x)u'' + q(x)u' + r(x)u = 0$, which obviously holds for our equation with $x_0 = 0$ and $a(x) = 1$). In this case it turns out that the Frobenius method will work. The Frobenius method says that in this case a solution can be sought in the form
$$u(x) = (x - x_0)^{\nu}\sum_{n=0}^{\infty} u_n(x - x_0)^n,$$
where $\nu$ does not have to be an integer or positive.
I have, assuming that u0 = 1,
u(x) = xν + u1 xν+1 + u2 xν+2 + . . .
xu(x) = xν+1 + u1 xν+2 + u2 xν+3 + . . .
x2 u(x) = xν+2 + u1 xν+3 + u2 xν+4 + . . .
u′ (x) = νxν−1 + (ν + 1)u1 xν + (ν + 2)u2 xν+1 + . . .
u′′ (x) = ν(ν − 1)xν−2 + (ν + 1)νu1 xν−1 + (ν + 2)(ν + 1)u2 xν + . . .

119
The coefficient of $x^{\nu}$ must vanish:
$$\nu(\nu-1) + \nu - m^2 = 0,$$
which is true only if $\nu = \pm m$ (for $m \ne 0$).
For the degree $\nu + n$ I have, replacing $m^2$ with $\nu^2$,
$$x^{\nu+n}: \quad \left[(\nu+n)^2 - \nu^2\right]u_n + u_{n-2} = 0 \implies u_n = -\frac{1}{n(2\nu+n)}\,u_{n-2}, \quad n = 2, 3, 4, \ldots
Starting with $u_0 = 1$, $u_1 = 0$, I get that all the coefficients with odd indices are zero, whereas for even $n = 2k$
$$u_{2k} = -\frac{u_{2k-2}}{4k(k+\nu)} = \ldots = \frac{(-1)^k}{2^{2k}\,k!\,(\nu+k)(\nu+k-1)\cdots(\nu+1)},$$
and hence my solution is
$$u(x) = \sum_{k=0}^{\infty} \frac{(-1)^k x^{\nu+2k}}{2^{2k}\,k!\,(\nu+k)(\nu+k-1)\cdots(\nu+1)}.$$

In general this is not an issue, but in our case $m$ is an integer, and if $\nu = -m$, then the denominators in the series above vanish. Hence we have found only one solution to Bessel's equation:
$$u(x) = \sum_{k=0}^{\infty} \frac{(-1)^k x^{m+2k}}{2^{2k}\,k!\,(m+k)(m+k-1)\cdots(m+1)}.$$
I am allowed to multiply my solution by any constant, and I choose, by convention, to multiply the series above by $1/(2^m m!)$; in this case I have
$$J_m(x) = \sum_{k=0}^{\infty} \frac{(-1)^k x^{m+2k}}{2^{2k+m}\,k!\,(m+k)!},$$

Bessel’s function of the first kind of the m-th order. An application of the ratio test yields that the
series converges for any x ∈ R and hence Jm is analytic anywhere in R (or even in C).
Just to get a first idea about Bessel's functions, note that
$$J_0(0) = 1, \qquad J_m(0) = 0, \quad m = 1, 2, \ldots
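As a quick check, the partial sums of this series can be compared with the Bessel functions implemented in SciPy. Here is a minimal sketch (assuming NumPy/SciPy are available).

```python
import numpy as np
from scipy.special import jv
from math import factorial

def J_series(m, x, n_terms=40):
    """Partial sum of J_m(x) = sum_k (-1)^k x^{m+2k} / (2^{2k+m} k! (m+k)!)."""
    return sum((-1)**k * x**(m + 2 * k) / (2**(2 * k + m) * factorial(k) * factorial(m + k))
               for k in range(n_terms))

for m in [0, 1, 3]:
    for x in [0.5, 2.0, 5.0]:
        print(m, x, J_series(m, x), jv(m, x))   # the last two columns should agree
```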

So, I have found one independent solution and am in need of another one. Abel's formula (see Math 266) tells me that two linearly independent solutions of $u'' + a(x)u' + b(x)u = 0$ must satisfy
$$\det\begin{bmatrix} u_1 & u_2\\ u_1' & u_2'\end{bmatrix} = Ce^{-\int a(x)\,dx}.$$
In my case $a(x) = 1/x$, and hence I end up with
$$u_1u_2' - u_1'u_2 = \frac{C}{x},$$
or, after some rearranging,
$$\left(\frac{u_2}{u_1}\right)' = \frac{C}{xu_1^2}.$$

Using $u_1(x) = J_m(x)$ I have
$$\frac{u_2(x)}{J_m(x)} = \int \frac{C}{xJ_m^2(x)}\,dx.$$
Therefore,
$$u_2(x) = J_m(x)\int \frac{C}{xJ_m^2(x)}\,dx$$
is the second linearly independent solution, which is actually called Neumann's function of the second kind of the $m$-th order.

4.2.3 Some facts about solutions to Bessel’s equation


The full analysis of the solutions to Bessel’s equation is beyond the scope of this course. I, however,
would like to show how at least some of the important results can be obtained and proved.
I will start with Bessel's function of the first kind of order zero:
$$J_0(x) = 1 - \frac{x^2}{2^2} + \frac{x^4}{2^2\cdot 4^2} - \frac{x^6}{2^2\cdot 4^2\cdot 6^2} + \frac{x^8}{2^2\cdot 4^2\cdot 6^2\cdot 8^2} - \ldots$$
Since the ratio of two consecutive terms is
$$-\frac{x^2}{(2k)^2},$$
which approaches zero as $k \to \infty$ for any fixed $x$, this series converges absolutely and uniformly, and hence $J_0$ and all its derivatives are continuous.
For $J_1$ I have
$$J_1(x) = \frac{x}{2} - \frac{x^3}{2^2\cdot 4} + \frac{x^5}{2^2\cdot 4^2\cdot 6} - \frac{x^7}{2^2\cdot 4^2\cdot 6^2\cdot 8} + \ldots,$$
which immediately implies that
$$\frac{dJ_0}{dx}(x) = -J_1(x)$$
(cf. $\cos' x = -\sin x$). Analogously,
$$\frac{d}{dx}\left(xJ_1(x)\right) = xJ_0(x).$$
Let me use my formula for the second independent solution to find Neumann's function of the second kind of zero order. I have
$$N_0(x) = J_0(x)\int \frac{dx}{xJ_0^2(x)}.$$
Using the fact that (prove it)
$$\frac{1}{xJ_0^2(x)} = \frac{1}{x} + \frac{x}{2} + \frac{5x^3}{32} + \ldots,$$
and integrating term by term, I find
$$N_0(x) = J_0(x)\log x + \frac{x^2}{4} - \frac{3x^4}{128} + \ldots$$

The most important fact here is that $N_0$ is not defined at zero and approaches $-\infty$ (it behaves like $\log$ for small $x$). Using Neumann's function of the second kind I can define Bessel's function of the second kind of zero order as a special linear combination of $J_0$ and $N_0$:
$$Y_0(x) = \frac{2}{\pi}\left(N_0(x) - (\log 2 - \gamma)J_0(x)\right),$$
where $\gamma$ is the Euler constant, $\gamma = \lim_{n\to\infty}\left(\sum_{i=1}^{n}\frac{1}{i} - \log n\right) \approx 0.5772$. Hence the general solution to Bessel's equation of zero order can be written (this is the most standard form) as
$$u(x) = AJ_0(x) + BY_0(x).$$

Recall that we are mostly interested in solutions to
$$R'' + \frac{1}{r}R' + \left(\lambda - \frac{m^2}{r^2}\right)R = 0.$$
By the above, and generalizing, I showed that the general solution to this equation is given by
$$R(r) = AJ_m(\sqrt{\lambda}\,r) + BY_m(\sqrt{\lambda}\,r).$$

Now let me denote v(x) = J0 (αx) and w(x) = J0 (βx), where α and β are some constants. Due to
the above I have that v and w solve

xv ′′ + v ′ + α2 xv = 0,
xw′′ + w′ + β 2 xw = 0.

If I multiply the first equation by $w$, the second by $v$, and subtract, then I get, after simplifications,
$$\left(x(v'w - vw')\right)' = (\beta^2 - \alpha^2)xvw.$$
By integrating from 0 to 1 I proved that
$$(\beta^2 - \alpha^2)\int_0^1 xJ_0(\alpha x)J_0(\beta x)\,dx = \alpha J_0'(\alpha)J_0(\beta) - \beta J_0'(\beta)J_0(\alpha).$$

Similarly, by multiplying the equation for $v$ by $2xv'$, I can show (exercise) that
$$\int_0^1 xJ_0^2(\alpha x)\,dx = \frac{1}{2}\left(J_0^2(\alpha) + J_1^2(\alpha)\right).$$

An important corollary is as follows: if $\alpha \ne \beta$ are two roots of $J_0$, then
$$\int_0^1 xJ_0(\alpha x)J_0(\beta x)\,dx = 0$$
and
$$\int_0^1 xJ_0^2(\alpha x)\,dx = \frac{1}{2}J_1^2(\alpha).$$

The question is, of course: do we have any roots at all? To see that there are always infinitely many roots, let me make the change of variables
$$v(x) = u(x)\sqrt{x}$$
in Bessel's equation of order zero. Then, after straightforward manipulations, I find that
$$v'' = -\left(1 + \frac{1}{4x^2}\right)v,$$
that is, when $x$ is large the equation is approximately $v'' + v = 0$, and hence for large $x$ Bessel's equation has an approximate solution
$$u(x) = \frac{A\cos(x - \phi)}{\sqrt{x}},$$
for some constants A and ϕ, which indicates that Bessel’s functions approach zero as x → ∞ and that
Bessel’s functions have infinitely many real positive roots.
Let me introduce the inner product
$$\langle f, g\rangle_B = \int_0^1 xf(x)g(x)\,dx.$$

Note that the functions J0 (ζk x), k = 1, 2, 3, . . . are orthogonal on [0, 1] with respect to this inner
product. Here ζk is the k-th root of J0 (x). It can be proved that any “nice” function f can be
represented as a convergent series

f (x) = c1 J0 (ζ1 x) + c2 J0 (ζ2 x) + c3 J0 (ζ3 x) + . . . .

This expansion is called the Fourier–Bessel expansion, and the coefficients can be found, from the
proved formulas above, as
$$c_k = \frac{\langle f, J_0(\zeta_k x)\rangle_B}{\langle J_0(\zeta_k x), J_0(\zeta_k x)\rangle_B} = \frac{2}{J_1^2(\zeta_k)}\int_0^1 xf(x)J_0(\zeta_k x)\,dx.$$
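Numerically, a Fourier–Bessel expansion is straightforward to compute once the roots $\zeta_k$ are known. The sketch below (assuming NumPy/SciPy; the function $f(x) = 1 - x^2$ is a made-up example) uses SciPy's tabulated roots of $J_0$, computes the coefficients $c_k$ by quadrature, and compares the partial sum with $f$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0, j1, jn_zeros

N = 20
zeros = jn_zeros(0, N)                    # first N positive roots of J_0

f = lambda x: 1.0 - x**2                  # sample function on [0, 1]

def c(k):
    """c_k = 2 / J_1(zeta_k)^2 * int_0^1 x f(x) J_0(zeta_k x) dx."""
    val, _ = quad(lambda x: x * f(x) * j0(zeros[k - 1] * x), 0, 1)
    return 2 * val / j1(zeros[k - 1])**2

coeffs = [c(k) for k in range(1, N + 1)]

def f_series(x):
    return sum(ck * j0(zk * x) for ck, zk in zip(coeffs, zeros))

for x in [0.1, 0.5, 0.9]:
    print(x, f_series(x), f(x))           # partial sum vs. the original function
```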

4.2.4 Summary about Bessel’s functions


In a way similar to the above one can show that the following theorem holds.

Theorem 4.3. Consider Bessel’s ODE of the order m, where m = 0, 1, 2, . . .:

x2 u′′ + xu′ + (x2 − m2 )u = 0.

The general solution to this equation can be written as

u(x) = AJm (x) + BYm (x),

where Jm is Bessel’s function of the first kind of order m and Ym is Bessel’s function of the second
kind of order m. J0 (0) = 1, Jm (0) = 0, m = 1, 2, . . .. Bessel’s functions of the second kind have a
singularity at x = 0 for any m. In particular, limx→0+ Ym (x) = −∞.

Figure 4.2: Graphs of several Bessel functions of the first kind

Both $J_m$ and $Y_m$ approach zero as $x \to \infty$, and both $J_m$ and $Y_m$ have infinitely many positive roots. Let $\zeta_{k,m}$ denote the $k$-th root of $J_m$. Then
$$\int_0^1 xJ_m(\zeta_{k,m}x)J_m(\zeta_{l,m}x)\,dx = \begin{cases} 0, & l \ne k,\\[2pt] \frac{1}{2}J_{m+1}^2(\zeta_{k,m}), & l = k.\end{cases}$$
Any sufficiently nice function $f$ can be represented as a Fourier–Bessel series
$$f(x) = \sum_{k=1}^{\infty} c_k J_m(\zeta_{k,m}x),$$
where the explicit form of the coefficients can be inferred from the relation above.

4.3 Solving planar heat and wave equations in polar coordinates


Now that all the preparations are done, I can return to solving the planar heat and wave equations in
domains with rotational symmetry.

Figure 4.3: Graphs of several Bessel functions of the second kind

4.3.1 Heat equation


Recall that we are solving

ut = α2 ∆u, t > 0, x2 + y 2 < 1,


u(0, x, y) = f (x, y), x2 + y 2 < 1,
u(t, x, y) = 0, x2 + y 2 = 1.

We found, by separating the variables u(t, x, y) = T (t)V (x, y) that

T ′ = −λα2 T,

and
−∆V = λV, x2 + y 2 < 1, V (x, y) = 0, x2 + y 2 = 1.
For the latter problem we again assumed that variables separate in polar coordinates v(r, θ) =
R(r)Θ(θ). For Θ we got

Θ′′ + µΘ = 0, Θ(−π) = Θ(π), Θ′ (−π) = Θ′ (π),

which implies that

µ m = m2 , m = 0, 1, 2, . . . Θ(θ) = Am cos mθ + Bm sin mθ,

and for $R$ we got Bessel's equation
$$r^2R'' + rR' + (\lambda r^2 - m^2)R = 0, \quad m = 0, 1, \ldots, \qquad R(1) = 0, \quad |R(r)| < \infty.$$

Now finally we use the material from the previous section and state that the general solution is given by
$$R(r) = AJ_m(\sqrt{\lambda}\,r) + BY_m(\sqrt{\lambda}\,r).$$
From the condition that my solution must be bounded I have that $B = 0$, since $Y_m$ is not bounded close to zero; from the boundary condition I have
$$J_m(\sqrt{\lambda}) = 0,$$
that is, my $\lambda$ must be a root of the corresponding equation. Hence,
$$\lambda_{k,m} = \zeta_{k,m}^2,$$
where $\zeta_{k,m}$ is the $k$-th positive root of $J_m$.


To summarize, the eigenvalues of the eigenvalue problem for the Laplace operator in the unit disk with type I or Dirichlet boundary conditions are
$$\lambda_{k,m} = \zeta_{k,m}^2,$$
with the eigenfunctions
$$v_{k,0}(r, \theta) = J_0(\zeta_{k,0}r), \quad k = 1, 2, \ldots,$$
$$v_{k,m}(r, \theta) = J_m(\zeta_{k,m}r)\cos m\theta, \quad k, m = 1, 2, 3, \ldots,$$
$$\tilde v_{k,m}(r, \theta) = J_m(\zeta_{k,m}r)\sin m\theta, \quad k, m = 1, 2, 3, \ldots

Using the standard inner product in Cartesian coordinates,
$$\langle f, g\rangle = \iint_D f(x, y)g(x, y)\,dx\,dy,$$
we know, thanks to the general theory, that all these eigenfunctions are orthogonal with respect to this inner product. However, in this specific case we actually proved this fact. Indeed, in polar coordinates this inner product reads
$$\langle f, g\rangle = \int_0^1\!\!\int_{-\pi}^{\pi} f(r\cos\theta, r\sin\theta)\,g(r\cos\theta, r\sin\theta)\,r\,d\theta\,dr.$$

Hence, if we take two eigenfunctions with different $m$, we get zero due to the orthogonality of the trigonometric system on $[-\pi, \pi)$. If, however, the $m$ are the same but the $k$ are different, then we get zero due to the orthogonality of Bessel's functions; note the necessary weight $r$ in our integral. Finally, we need to calculate
$$\langle v_{k,0}, v_{k,0}\rangle = \pi J_1^2(\zeta_{k,0}), \quad k = 1, 2, \ldots,$$
$$\langle v_{k,m}, v_{k,m}\rangle = \frac{\pi}{2}J_{m+1}^2(\zeta_{k,m}), \quad k, m = 1, 2, \ldots,$$
$$\langle \tilde v_{k,m}, \tilde v_{k,m}\rangle = \frac{\pi}{2}J_{m+1}^2(\zeta_{k,m}), \quad k, m = 1, 2, \ldots.$$

Therefore, putting everything together, my solution is given by
$$u(t, r\cos\theta, r\sin\theta) = \sum_{k=1}^{\infty} a_{k,0}\,e^{-\alpha^2\zeta_{k,0}^2 t}J_0(\zeta_{k,0}r) + \sum_{m=1}^{\infty}\sum_{k=1}^{\infty} e^{-\alpha^2\zeta_{k,m}^2 t}J_m(\zeta_{k,m}r)\left(a_{k,m}\cos m\theta + b_{k,m}\sin m\theta\right).$$
The coefficients are found using the initial condition and the orthogonality of the corresponding eigenfunctions. To be precise,
$$a_{k,0} = \frac{\langle f, v_{k,0}\rangle}{\langle v_{k,0}, v_{k,0}\rangle} = \frac{\int_0^1\!\int_{-\pi}^{\pi} f(r\cos\theta, r\sin\theta)J_0(\zeta_{k,0}r)\,r\,d\theta\,dr}{\pi J_1^2(\zeta_{k,0})}, \quad k = 1, 2, \ldots,$$
$$a_{k,m} = \frac{\langle f, v_{k,m}\rangle}{\langle v_{k,m}, v_{k,m}\rangle} = \frac{2\int_0^1\!\int_{-\pi}^{\pi} f(r\cos\theta, r\sin\theta)J_m(\zeta_{k,m}r)\cos m\theta\,r\,d\theta\,dr}{\pi J_{m+1}^2(\zeta_{k,m})}, \quad k, m = 1, 2, \ldots,$$
$$b_{k,m} = \frac{\langle f, \tilde v_{k,m}\rangle}{\langle \tilde v_{k,m}, \tilde v_{k,m}\rangle} = \frac{2\int_0^1\!\int_{-\pi}^{\pi} f(r\cos\theta, r\sin\theta)J_m(\zeta_{k,m}r)\sin m\theta\,r\,d\theta\,dr}{\pi J_{m+1}^2(\zeta_{k,m})}, \quad k, m = 1, 2, \ldots.$$

In particular, for $t$ large one has that
$$u(t, r\cos\theta, r\sin\theta) \approx a_{1,0}\,e^{-\alpha^2\zeta_{1,0}^2 t}J_0(\zeta_{1,0}r),$$
since it can be proved that $\zeta_{1,0}$ is the smallest among all the roots $\zeta_{k,m}$ of the Bessel functions $J_m$.
Explicit examples will be given when I consider the wave equation below.

4.3.2 Wave equation


Consider now the wave equation

utt = c2 ∆u, t > 0, (x, y) ∈ D,


u(0, x, y) = f (x, y), (x, y) ∈ D,
ut (0, x, y) = g(x, y), (x, y) ∈ D,
u(t, x, y) = 0, (x, y) ∈ ∂D,

where D ⊂ R2 some domain.


Similarly to the heat equation, the separation of variables is possible only for some special domains. For example, you saw how to solve this problem when $D = \{0 < x < a,\ 0 < y < b\}$ in your homework problems.
The key difference from the heat equation is that for T one has

T ′′ + c2 λk,m T = 0,

which has the general solution

Tk,m (t) = Ak,m cos(ωk,m t − ϕk,m ),

where
$$\omega_{k,m} = c\sqrt{\lambda_{k,m}}$$

are the frequencies of vibration, and $\lambda_{k,m}$ are the corresponding eigenvalues of the eigenvalue problem for the Laplace operator, which is exactly the same as for the heat equation. The fundamental frequency is the smallest one. Recall that for the rectangle we found
$$\lambda_{k,m} = \left(\frac{k^2}{a^2} + \frac{m^2}{b^2}\right)\pi^2,$$
and hence
$$\omega_{k,m} = c\pi\sqrt{\frac{k^2}{a^2} + \frac{m^2}{b^2}}.$$
The solution in the form of a Fourier series reads
$$u(t, x, y) = \sum_{k,m} T_{k,m}(t)V_{k,m}(x, y),$$
where $V_{k,m}$ are the eigenfunctions of the Laplace operator with type I boundary conditions. Therefore we immediately have an important conclusion: contrary to the one-dimensional case, when the solution to the wave equation is a periodic function of $t$, the solution to the planar wave equation is not periodic, because the ratio
$$\frac{\omega_{k,m}}{\omega_{1,1}}$$
is in general not a rational number, which would be required for the solution to be periodic (compare with the one-dimensional case).
Now let me consider again $D = \{(x, y): x^2 + y^2 < 1\}$. Following exactly the same steps as for the heat equation, I will in general end up with the following solution:
$$u(t, r\cos\theta, r\sin\theta) = \sum_{k=1}^{\infty} T_{k,0}(t)v_{k,0}(r, \theta) + \sum_{m=1}^{\infty}\sum_{k=1}^{\infty}\left( T_{k,m}(t)v_{k,m} + \tilde T_{k,m}(t)\tilde v_{k,m}\right),$$
where all the constants that need to be determined are inside $T$ and $\tilde T$. By using the initial conditions and recalling that $T(t) = A\cos(c\sqrt{\lambda}\,t) + B\sin(c\sqrt{\lambda}\,t)$, we can always find explicit expressions for these coefficients. I will consider several examples without treating the most general case.
Example 4.4. Let me first consider the case
$$u(0, r\cos\theta, r\sin\theta) = f(r), \qquad u_t(0, r\cos\theta, r\sin\theta) = 0,$$
that is, the initial condition does not depend on the angle $\theta$. By symmetry arguments (or this can easily be proved rigorously) the solution will also not depend on $\theta$, and hence will have the form
$$U(t, r) = \sum_{k=1}^{\infty} a_k\cos(c\zeta_{k,0}t)J_0(\zeta_{k,0}r),$$
and I do not have sines in my solution since the initial velocity is zero. The coefficients of my Fourier–Bessel series are found as
$$a_k = \frac{2}{J_1^2(\zeta_{k,0})}\int_0^1 rf(r)J_0(\zeta_{k,0}r)\,dr.$$
J1 (ζk,0 ) 0

Figure 4.4: Graphs of J0 (r) and J0 (ζk,0 r)

To see what is actually happening with the solution, consider the graphs of $J_0(\zeta_{k,0}r)$. They have exactly $k - 1$ roots in the interval $(0, 1)$ and represent the building blocks out of which the whole solution is built. Each term of the form
$$J_0(\zeta_{k,0}r)\cos(c\zeta_{k,0}t)$$
represents a standing wave, and the whole solution is a linear combination of these standing waves; see the next figure.

Figure 4.5: Standing waves. The thick curves are for cos(ζk,0 t) = ±1 and the dashed curves are for
time moments between those

Example 4.5. As a second example, consider the problem with the initial condition given by
$$u(0, r\cos\theta, r\sin\theta) = Av_{k,m}(r, \theta),$$
again with the initial velocity equal to zero. The functions $v_{k,m}$ represent fundamental vibrations with the frequency
$$\omega_{k,m} = c\zeta_{k,m},$$
since the solution to this problem, due to the orthogonality of the $v_{k,m}$, is given by
$$U(t, r, \theta) = Av_{k,m}(r, \theta)\cos(c\zeta_{k,m}t),$$
and these are the only solutions to my problem that are periodic.
and these are the only solutions to my problem that are periodic. Here are the graphs of several
fundamental vibrations:
Similarly to the standing waves considered above, these fundamental vibrations will generate two-dimensional standing waves. Note that those sets of points for which
$$v_{k,m}(r, \theta) = 0$$
will always stay at zero. It can be proved that these sets are composed of nodal curves, which divide the circular drum into several nodal regions.

Example 4.6. In general, the solution to the wave equation on the unit disk can be represented as a linear combination of standing waves, each of which is generated by a fundamental vibration with the corresponding frequency. It can be proved (this was originally proved in 1929 by Siegel) that the ratio of these frequencies is never a rational number, and hence the solution to the wave equation with general initial conditions is not a periodic function, just as for the rectangular plate. In musical language I can rephrase this as: the higher vibrations of the drumhead are not pure overtones of the basic frequency $\omega_{1,0}$, which gives a mathematical explanation for the fact that the human ear much prefers the sound of one-dimensional instruments, such as the guitar or violin, to two-dimensional ones, such as a drum.

Figure 4.6: Fundamental vibrations (note that I am using the second index to denote the order of
Bessel’s function, and the first index to denote the k-th root, which is opposite to what is used in the
textbook)

Figure 4.7: Nodal curves

4.4 Solving the wave equation in 2D and 3D space
’No,’ replied Margarita, ’what really puzzles me is where you have found the space for all
this.’ With a wave of her hand Margarita emphasized the vastness of the hall they were in.
Koroviev smiled sweetly, wrinkling his nose. ’Easy!’ he replied. ’For anyone who knows
how to handle the fifth dimension it’s no problem to expand any place to whatever size
you please.’

Mikhail Bulgakov, The Master and Margarita

The goal of this concluding section is to find the solution to the initial value problem for the wave
equation

utt = c2 ∆u, x ∈ Rk ,
u(0, x) = g(x), (4.9)
ut (0, x) = h(x),
with $k = 2, 3$. Recall that in the case $k = 1$ we already know that the solution is given by d'Alembert's formula
$$u(t, x) = \frac{g(x - ct) + g(x + ct)}{2} + \frac{1}{2c}\int_{x-ct}^{x+ct} h(s)\,ds.$$
It turns out that for the case k = 2, 3 it is also possible to find an explicit solution.
First, I will prove an auxiliary fact, which will help to reduce the number of computations.
Lemma 4.7. Let $v_h$ denote the solution to the problem
$$v_{tt} = c^2\Delta v, \quad x \in \mathbb R^k, \qquad v(0, x) = 0, \qquad v_t(0, x) = h(x). \tag{4.10}$$
Then the function
$$w = \frac{\partial}{\partial t}v_g$$
solves
$$w_{tt} = c^2\Delta w, \quad x \in \mathbb R^k, \qquad w(0, x) = g(x), \qquad w_t(0, x) = 0. \tag{4.11}$$
Proof. Indeed, since $v_g$ satisfies the wave equation, taking derivatives with respect to time on both the left- and right-hand sides and exchanging the order of the operators implies that $w$ solves the wave equation. Moreover, $w(0, x) = \partial_t v_g(0, x) = g(x)$ due to the definition of $v_g$. Finally, I need to show that $w_t(0, x) = 0$. Consider
$$\frac{\partial}{\partial t}w(0, x) = \frac{\partial^2}{\partial t^2}v_g(0, x) = c^2\Delta v_g(0, x) = 0,$$
since, by definition, $v_g(0, x) = 0$. $\square$

Corollary 4.8. Any solution to (4.9) can be written as
$$u = \frac{\partial}{\partial t}u_g + u_h,$$
where $u_h$ solves (4.10).

Proof. The proof follows from the linearity of the original problem. $\square$

Therefore, all I need to do is solve problem (4.10). The lemma above holds for any dimension $k$. From now on, however, I will stick to the case $k = 3$, i.e., I will consider our familiar Euclidean space. To find a solution to (4.10), I will first find the fundamental solution to the three-dimensional wave equation. This solution, by definition, solves problem (4.10) with $h$ replaced by $\delta(x)$, the three-dimensional delta function, which models a unit impulse (disturbance) applied at the point $0$. To do this I will use the three-dimensional Fourier transform, which is defined for any $f(x)$, $x \in \mathbb R^3$, as
$$\hat f(k) = \frac{1}{(\sqrt{2\pi})^3}\iiint_{\mathbb R^3} f(x)\,e^{-ik\cdot x}\,dx,$$

where $k\cdot x = k_1x_1 + k_2x_2 + k_3x_3$ is the usual scalar product. The inverse Fourier transform is defined in a similar way, with the minus replaced by a plus. Denoting by $\hat v$ the Fourier transform of my unknown function, I get, similarly to the one-dimensional case,
$$\hat v_{tt} = -c^2|k|^2\hat v, \qquad \hat v(0) = 0, \qquad \hat v_t(0) = \frac{1}{(\sqrt{2\pi})^3},$$
where
$$|k|^2 = k\cdot k = k_1^2 + k_2^2 + k_3^2$$
is the usual Euclidean norm. Solving this simple ODE, I find that
$$\hat v(t, k) = \frac{1}{(\sqrt{2\pi})^3\,c|k|}\sin(ct|k|).$$

Hence my solution is given by
$$v(t, x) = \frac{1}{8\pi^3 c}\iiint_{\mathbb R^3} \sin(ct|k|)\,e^{ik\cdot x}\,\frac{dk}{|k|}.$$

To evaluate this integral I switch to spherical coordinates, chosen such that the polar axis coincides with the direction of the vector $x$. My spherical coordinates are $k = |k|$, $\theta$, $\varphi$, and I also denote $|x| = r$. Due to the choice of the polar axis I have $k\cdot x = kr\cos\varphi$. Hence my integral is now
$$v(t, x) = \frac{1}{8\pi^3 c}\int_0^{2\pi}\!\!\int_0^{\pi}\!\!\int_0^{\infty} \sin(ctk)\,e^{ikr\cos\varphi}\,k\sin\varphi\,dk\,d\varphi\,d\theta.$$

Evaluating the integrals over $\theta$ and $\varphi$, I find (left as an exercise)
$$v(t, x) = \frac{1}{2\pi^2 cr}\int_0^{\infty} \sin kct\,\sin kr\,dk.$$

The last integral should be understood in terms of generalized functions and the inverse Fourier transform. Using complex exponentials to represent the sines, I end up with
$$v(t, x) = \frac{1}{8\pi^2 cr}\int_{\mathbb R}\left(e^{ik(ct-r)} - e^{ik(ct+r)}\right)dk = \frac{1}{4\pi cr}\left(\delta(ct - r) - \delta(ct + r)\right).$$
I am looking only into the future, $t > 0$, and hence $\delta(ct + r) = 0$. Finally, I get that
$$\frac{\delta(ct - r)}{4\pi cr}$$
is the fundamental solution to the three-dimensional wave equation. By a translation argument I get that if my initial velocity were
$$v_t(0, x) = \delta(x - \xi),$$
then my solution would be
$$K(t, x, \xi) = \frac{\delta(ct - |x - \xi|)}{4\pi c|x - \xi|}.$$
Thus the fundamental solution is a traveling wave, initially concentrated at ξ and afterwards on

∂Bct (ξ) = {x : |x − ξ| = ct},

which is the boundary of the ball with center at $\xi$ and radius $ct$. This means, among other things, that the wave originating at time $t = 0$ at the point $\xi$ will be felt at the point $\eta$ only at the time $|\eta - \xi|/c$. This is called the strong Huygens' principle, and it gives a mathematical explanation of why sharp signals propagate from a point source. Using the superposition principle, I can represent the sought solution to (4.10) as
$$v(t, x) = \iiint_{\mathbb R^3} K(t, x, \xi)h(\xi)\,d\xi.$$
The last integral in spherical coordinates (centered at $x$) takes the form
$$v(t, x) = \int_0^{\infty} \frac{\delta(ct - r)}{4\pi cr}\int_0^{2\pi}\!\!\int_0^{\pi} h(r, \theta, \varphi)\,d\varphi\,d\theta\,dr.$$
The double integral inside is the integral over the surface of the sphere of radius $r$ centered at the point $x$:
$$\int_0^{2\pi}\!\!\int_0^{\pi} h(r, \theta, \varphi)\,d\varphi\,d\theta = \iint_{\partial B_r(x)} h(\sigma)\,d\sigma,$$

therefore, using the main property of the delta function, I finally get

Theorem 4.9 (Kirchhoff's formula). The unique solution to the problem (4.9) with $k = 3$ is given by
$$u(t, x) = t\bar h + \frac{\partial}{\partial t}\left(t\bar g\right),$$
where $\bar f$ denotes the average value of the function $f$ over the sphere of radius $ct$ centered at $x$,
$$\bar f(t, x) = \frac{1}{4\pi c^2 t^2}\iint_{\partial B_{ct}(x)} f(\sigma)\,d\sigma.$$
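Kirchhoff's formula lends itself to a simple Monte Carlo check: the spherical average can be estimated by sampling random points on the sphere. The sketch below (assuming NumPy; the speed $c$, the observation point, and the compactly supported initial velocity $h$ are made-up choices) shows that an observer at distance 2 from the unit impulse region $|y| \le 1/2$ feels the signal only for $1.5 \le ct \le 2.5$, illustrating the sharp leading and trailing edges.

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere_average(func, center, radius, n=200_000):
    """Monte Carlo estimate of the average of func over the sphere |y - center| = radius."""
    v = rng.normal(size=(n, 3))
    v /= np.linalg.norm(v, axis=1, keepdims=True)        # uniformly distributed directions
    return func(center + radius * v).mean()

c = 1.0
h = lambda y: (np.linalg.norm(y, axis=1) <= 0.5).astype(float)   # impulse region |y| <= 1/2

x0 = np.array([2.0, 0.0, 0.0])                            # observer at distance 2
for t in [1.0, 1.6, 2.0, 2.4, 3.0]:
    u_val = t * sphere_average(h, x0, c * t)               # Kirchhoff's formula with g = 0
    print(f"t = {t:.1f}:  u = {u_val:.4f}")
# the signal is nonzero only for 1.5 <= t <= 2.5: sharp leading *and* trailing edges
```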

Kirchhoff's formula also emphasizes the strong Huygens' principle. To see this (see the figure), assume that the initial disturbance $h$ has small compact support (that means that it is nonzero only in some small region $U \subseteq \mathbb R^3$). I am interested in observing the signal at the point $x$. Initially, for small $t_1$, the sphere $\partial B_{ct_1}$ will not touch $U$, and hence there will be no signal at $x$. At some time $t_2$ the sphere will finally touch $U$, and this means that I hear the signal. I continue to experience the signal at $x$ up to the time $t_3$, when the whole domain $U$ is inside $B_{ct_3}$. And since I am integrating over the surface of my ball, after that time, for any $t > t_3$, I will have no indication of the signal at $x$. In other words, the traveling wave in three dimensions has both a sharp leading and a sharp trailing edge.

Figure 4.8: The spread of waves in three (left) and two (right) dimensional spaces. On the left I
integrate over the surface of my ball, and on the right I integrate over the whole shaded area

Now I will use the method of descent to obtain the explicit solution in the case R2 . The key idea
is to consider the problem

utt = c2 ∆u, x ∈ R2 , ut (0, x) = h(x)

as three-dimensional, write the points in $\mathbb R^3$ as $(x, x_3)$, and set $h(x, x_3) = h(x)$. Then by Kirchhoff's formula the solution is given by
$$U(t, x, x_3) = \frac{1}{4\pi c^2 t}\iint_{\partial B_{ct}(x, x_3)} h(\sigma)\,d\sigma.$$

I claim that this solution is independent of $x_3$ and hence gives me the solution to the two-dimensional problem for any choice of $x_3$, for example $x_3 = 0$. Indeed, my surface consists of two hemispheres defined explicitly as
$$\xi_3 = x_3 \pm \sqrt{c^2t^2 - r^2} = F_{\pm}(\xi), \qquad r^2 = (\xi_1 - x_1)^2 + (\xi_2 - x_2)^2.$$
Splitting the surface integral into the two hemispheres (which give equal contributions, since $h$ does not depend on $x_3$) and projecting onto the disk $B_{ct}(x)$, I get
$$u(t, x) = \frac{1}{4\pi c^2 t}\iint_{\partial B_{ct}(x, x_3)} h(\sigma)\,d\sigma
= \frac{1}{2\pi c^2 t}\iint_{B_{ct}(x)} h(\xi)\sqrt{1 + (\partial_{\xi_1}F_{\pm})^2 + (\partial_{\xi_2}F_{\pm})^2}\,d\xi
= \frac{1}{2\pi c}\iint_{B_{ct}(x)} \frac{h(\xi)\,d\xi}{\sqrt{c^2t^2 - |x - \xi|^2}},$$
since $\sqrt{1 + (\partial_{\xi_1}F_{\pm})^2 + (\partial_{\xi_2}F_{\pm})^2} = \dfrac{ct}{\sqrt{c^2t^2 - r^2}}$.

This yields

Theorem 4.10 (Poisson's formula). The unique solution to the problem (4.9) with $k = 2$ is given by
$$u(t, x) = \frac{1}{2\pi c}\left(\frac{\partial}{\partial t}\iint_{B_{ct}(x)} \frac{g(\xi)\,d\xi}{\sqrt{c^2t^2 - |x - \xi|^2}} + \iint_{B_{ct}(x)} \frac{h(\xi)\,d\xi}{\sqrt{c^2t^2 - |x - \xi|^2}}\right).$$

The key fact here is that the integration is now not over the surface but over the whole ball itself. That is, the traveling wave has a sharp leading edge but not a sharp trailing edge: our integral will remain nonzero for any time $t > t_2$ (see the right panel in the figure), since the initial disturbance will always be inside the ball. The same holds for any space of even dimension. You can actually observe this effect experimentally by putting a cork on the water surface and dropping a stone nearby. You will see how, after some initial time, the cork will feel the disturbance, but it will not stop; it will continue to oscillate afterwards.
Returning to the quotation for this lecture, in even dimensions there is no possibility to talk, since the sound waves have no sharp trailing edge. In dimension five, however, conversations can be carried on in the same way as we talk in our familiar three dimensions.
