Artem Novozhilov
Department of Mathematics, North Dakota State University
These lecture notes were written in Spring 2016 when I taught an undergraduate course in PDE.
They include exactly the material I was able to cover during the usual three credit course. The
main idea was to keep the personal and informal spirit of the well known book by Stanley J. Farlow,
Partial Differential Equations for Scientists and Engineers and at the same time to add more rigor
and computations.
These lectures supplement the official course textbook by Peter Olver, Introduction to PDE. Despite the fact that the book is quite thick and very detailed, these lectures do not often follow the exact details of Olver's book. While different in many details and approaches, I generally keep the same order of the material, since it looks to me to be very close to ideal for a first PDE course.
There are very few problems in these notes, since I mostly used the problems from Olver’s book. I
do hope in the future to supplement each section with interesting and non-trivial problems that would
expand the material in these notes.
Artem Novozhilov¹
1 May 2016

¹ I can be contacted by e-mail: artem.novozhilov@ndsu.edu
Contents

2 Fourier method
  2.1 The heat or diffusion equation
    2.1.1 Conservation of energy plus Fourier's law imply the heat equation
    2.1.2 Initial and boundary conditions for the heat equation
    2.1.3 A microscopic derivation of the diffusion equation
  2.2 Motivation for Fourier series
  2.3 Fourier series
    2.3.1 Formulas for the coefficients of (2.10)
    2.3.2 On the convergence of (2.10)
    2.3.3 Sine and cosine series. Odd and even extensions
    2.3.4 Differentiating Fourier series
    2.3.5 Final remarks and generalizations
  2.4 Fourier method for the heat equation
    2.4.1 Conclusion
  2.5 Sturm–Liouville problems. Eigenvalues and eigenfunctions
  2.6 Solving the wave equation by Fourier method
  2.7 Solving the Laplace equation by Fourier method
0.1 What are PDE?
0.1.1 Basic definitions and the general philosophy of the course
Since the main prerequisite for this course is a basic course on Ordinary Differential Equations (ODE), and everyone in class is accustomed to the idea of solving an equation where the unknown is some function, I start directly with
Definition 0.1. A Partial Differential Equation (abbreviated in the following as PDE in both singular and plural usage) is an equation for an unknown function of two or more independent variables that involves partial derivatives.
Since there is some vagueness in the given definition, I can give a mathematically more satisfactory
definition as
Definition 0.2. A PDE is an equation of the form
F (x, y, . . . , u, ux , uy , . . . , uxx , uxy , uyy , . . .) = 0
for the given function F and the unknown function u.
In Definition 0.2 I used the notation
ux = ∂u/∂x, uxx = ∂²u/∂x², . . .
for the partial derivatives. Sometimes other notations are used, in particular
∂x u = ∂u/∂x, ∂xx u = ∂²u/∂x², . . .
but I will usually stick to the notation with subscripts.
Definition 0.3. The order of a PDE is the order of the highest partial derivative that appears in it.
Example 0.4. Here is an example of a second order PDE:
ut = uxx + uyy + u,
where, as should be clear from the equation itself, the unknown function u is a function of three
independent variables (t, x, y). In the following I will save variable t to denote almost exclusively time
and x, y, z to denote the Cartesian coordinates.
It is nice to have a general and mathematically rigorous Definition 0.2, however, already at this
point I would like to state in a slightly incorrect and provocative form that
There exists no general mathematical theory of partial differential equations.
Moreover, the historical trend for studying various problems involving PDE shows that particular
specific examples of PDE, motivated by physical (geometrical, biological, etc) situations, are the
driving force for the development of abstract mathematical theories, and this is how I would like to
proceed in my course: From specific examples to necessary mathematical tools to the properties of
the solutions. There are a lot of good reasons for picking such a modus operandi, but instead of giving my own arguments I will present two quotations.
The first one is from the Preface to the first volume of Methods of Mathematical Physics by Courant and Hilbert:
Since the seventeenth century, physical intuition has served as a vital source for math-
ematical problems and methods. Recent trends and fashions have, however, weakened
the connection between mathematics and physics; mathematicians, turning away from the
roots of mathematics in intuition, have concentrated on refinement and emphasized the
postulational side of mathematics, and at times have overlooked the unity of their sci-
ence with physics and other fields. In many cases, physicists have ceased to appreciate
the attitudes of mathematicians. This rift is unquestionably a serious threat to science
as a whole; the broad stream of scientific development may split into smaller and smaller
rivulets and dry out. It seems therefore important to direct our efforts toward reuniting
divergent trends by clarifying the common features and interconnections of many distinct
and diverse scientific facts. Only thus can the student attain some mastery of the material
and the basis be prepared for further organic development of research.
The second quotation is from Preface to Lectures on Partial Differential Equations by Vladimir
Arnold:
In the mid-twentieth century the theory of partial differential equations was considered the
summit of mathematics, both because of the difficulty and significance of the problems it
solved and because it came into existence later than most areas of mathematics.
Nowadays many are inclined to look disparagingly at this remarkable area of mathematics
as an old-fashioned art of juggling inequalities or as a testing ground for applications of
functional analysis. Courses in this subject have even disappeared from the obligatory
program of many universities [. . .] The cause of this degeneration of an important general
mathematical theory into an endless stream of papers bearing titles like “On a property
of a solution of a boundary-value problem for an equation” is most likely the attempt to
create a unified, all-encompassing, superabstract “theory of everything.”
The principal source of partial differential equations is found in the continuous-medium
models of mathematical and theoretical physics. Attempts to extend the remarkable
achievements of mathematical physics to systems that match its models only formally
lead to complicated theories that are difficult to visualize as a whole [. . .]
At the same time, general physical principles and also general concepts such as energy, the
variational principle, Huygens’ principle, the Lagrangian, the Legendre transformation, the
Hamiltonian, eigenvalues and eigenfunctions, wave-particle duality, dispersion relations,
and fundamental solutions interact elegantly in numerous highly important problems of
mathematical physics. The study of these problems motivated the development of large
areas of mathematics such as the theory of Fourier series and integrals, functional analysis,
algebraic geometry, symplectic and contact topology, the theory of asymptotics of inte-
grals, microlocal analysis, the index theory of (pseudo-)differential operators, and so forth.
Familiarity with these fundamental mathematical ideas is, in my view, absolutely essential
for every working mathematician. The exclusion of them from the university mathemat-
ical curriculum, which has occurred and continues to occur in many Western universities
under the influence of the axiomaticist/scholastics (who know nothing about applications
and have no desire to know anything except the “abstract nonsense” of the algebraists)
seems to me to be an extremely dangerous consequence of Bourbakization of both mathe-
matics and its teaching. The effort to destroy this unnecessary scholastic pseudoscience is a
natural and proper reaction of society (including scientific society) to the irresponsible and
self-destructive aggressiveness of the “superpure” mathematicians educated in the spirit of
Hardy and Bourbaki.
Following the spirit of these two citations (one is from 1924 and another is from 2004) I will try to
use the physical intuition and concentrate on the specific examples rather than on the general theory
as much as possible.
Example 0.5. I assume that the function u of two independent variables (x, y) satisfies the PDE
ux = 0.
Since the partial derivative of u with respect to x vanishes, u does not depend on x, and hence the general solution is
u(x, y) = f (y),
where f is an arbitrary function of the variable y. Hence the first conclusion: while the general solutions
to ODE usually depend on the arbitrary constants (the number of which coincides with the order
of the equations), for PDE the general solution depends on arbitrary functions. This fact alone should convince you that PDE are substantially more complex than ODE.
The next important (and very non-obvious) point here is whether I can take any function f for my general solution. Jumping way ahead, I would like to state that “What does it mean to solve a PDE?” is a very difficult question. This difficulty notwithstanding, most of the time we will be content to live with a much easier specific concept, which is called the classical solution:
Definition 0.6. The function u : D −→ R is called a classical solution to a k-th order PDE if it satisfies this equation at every point of its domain of definition D and belongs to the set C (k) (D; R).
Recall that the notation C (k) (U ; V ) means the set of functions u : U −→ V all of whose partial derivatives up to order k are continuous (it is said that the function u is k times continuously differentiable).
Therefore (returning to Example 0.5) my general solution u(x, y) = f (y) will be a classical solution
to ux = 0 only if f ∈ C (1) (R; R).
A question to think about: what is the general solution to the PDE uxy = 0? Here are some classical examples of PDE:
• One-dimensional transport equation:
ut + cux = 0.
• Wave equation:
utt = ∆u.
Recall that ∆ is called the Laplace operator and is given by
∆u = div grad u = ∇2 u,
here ∇ is the del operator, in this particular case ∇ = (∂x , ∂y , ∂z ). In particular, in the Cartesian coordinates it is written for u : R3 −→ R as
∆u = uxx + uyy + uzz ;
it should be clear how to write the Laplace operator for the functions defined on the plane R2 and on the line R.
• Heat or diffusion equation:
ut = ∆u.
• Laplace equation:
∆u = 0.
The first one is a linear first order equation and the other three are linear second order equations.
It is simply staggering how much modern mathematics was developed in the attempts to solve these
equations (or their close relatives). We will see only a tiny, but very important, part of this.
• Helmholtz’s equation:
∆u = λu, λ ∈ R.
• Schrödinger’s equation:
iut + ∆u = 0.
Here i is the imaginary unit, i2 = −1.
• Telegraph equation:
utt + 2dut − uxx = 0, d > 0.
• Beam equation:
utt + uxxxx = 0.
All the examples above are linear. Here are some nonlinear examples:
• Hopf ’s equation:
ut + uux = 0.
• The eikonal equation (from the Greek word for image):
(ux )2 + (uy )2 = 1.
• Hamilton–Jacobi equation:
ut + H(ux , x) = 0,
where H is a given nonlinear function, which is called the Hamiltonian.
• Korteweg–de Vries (KdV) equation:
ut + uux + uxxx = 0.
• Reaction–diffusion equation:
ut = f (u) + ∆u,
where f is a given nonlinear function.
It is also often necessary and important to study systems of PDE:
• Maxwell’s equations of classical electrodynamics:
E t = curl B,
B t = − curl E,
div B = div E = 0.
Here E = (E1 , E2 , E3 ), B = (B1 , B2 , B3 ),
div F = ∇ · F = ∂F1/∂x + ∂F2/∂y + ∂F3/∂z,
and curl F = ∇ × F is given by the formal determinant
| i   j   k  |
| ∂x  ∂y  ∂z | .
| F1  F2  F3 |
Chapter 1
Consider some substance with the linear density u(t, x) at time t at the point x, so that the total amount of this substance in an interval (x1 , x2 ) is the integral of u over this interval. The rate of change of this amount is
q(t, x1 ) − q(t, x2 ),
where q(t, x) is the flux of the same substance, i.e., the change of this substance per time unit through the point x. Basically, q(t, x1 ) tells me how much enters (x1 , x2 ) at time t at the left boundary and q(t, x2 ) tells me how much substance leaves (x1 , x2 ) at the right boundary, hence the choice of the signs. By
the conservation law
∫_{x1}^{x2} ut (t, x) dx = q(t, x1 ) − q(t, x2 ) = − ∫_{x1}^{x2} qx (t, x) dx,
where the last equality holds by the fundamental theorem of calculus. Finally, since my interval is
arbitrary, I conclude that my substance must satisfy the equation
ut + qx = 0.
To arrive at the final equation I must decide on the connection between u and q. In the simplest case
q(t, x) = cu(t, x)
for some constant c I end up with the linear one-dimensional transport equation
ut + cux = 0. (1.1)
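Before moving on, a quick numerical sanity check of the conservation law behind (1.1) may be instructive. A sketch in Python; the Gaussian datum and all the numbers are my illustrative choices. Since (as will be derived below) the solution is the rigidly transported profile u(t, x) = g(x − ct), the total amount of substance over a large interval must stay constant in time:

```python
import numpy as np

# Numerical sanity check of the conservation law behind (1.1): the solution
# is the traveling wave u(t, x) = g(x - c t), so the total amount of
# substance over a large interval must stay constant in time.
c = 2.0
g = lambda x: np.exp(-x**2)            # initial density (illustrative choice)

x, dx = np.linspace(-30.0, 30.0, 200001, retstep=True)
mass0 = g(x).sum() * dx                # total substance at t = 0
mass1 = g(x - c * 1.5).sum() * dx      # total substance at t = 1.5

assert abs(mass0 - mass1) < 1e-10      # pure transport conserves total mass
```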
Keeping in mind the physical interpretation of (1.1), I will also need some initial condition
u(0, x) = g(x), x ∈ R, (1.2)
for some given function g (initial density) and, if my system is not spatially infinite, some boundary conditions (see below). At this point I assume that it is a good approximation to let x ∈ (−∞, ∞)
and hence no need for boundary conditions. My next goal is to show that I can always find a solution
to the problem (1.1)–(1.2) and, even more importantly, that this solution is physically relevant (the
problem is well posed ).
Note that equation (1.1) can be rewritten as the dot product
γ · ∇u = 0, γ = (1, c),
where ∇u = (ut , ux ). Here the dot denotes the usual dot product of two vectors. We know that if the dot product of two vectors is zero then these vectors are orthogonal:
γ ⊥ ∇u.
At the same time, the gradient of a function points in the direction of its fastest increase and, hence, is perpendicular to the level sets u(t, x) = const, which are curves on the plane (t, x). Finally,
I conclude that γ is parallel to the level sets u(t, x) = const, therefore the level sets are the straight
lines with the direction vector γ. I can rephrase the last sentence as follows: the sought function u
is constant along any straight line with the direction vector γ. All such straight lines can be written
as x = ct + ξ, where ξ is some constant. Now I claim that this geometric interpretation I discussed is
enough to find the solution to our problem. Namely, since I know that u is constant along x = ct + ξ
then the value of u at an arbitrary point (t∗ , x∗ ) is the same along the line x = ct + x∗ − ct∗ . At the initial time t = 0 this line passes through the unique point x = x∗ − ct∗ , and the value of u(t∗ , x∗ ) must be equal to g(x∗ − ct∗ ). Therefore, I found, dropping the asterisks, that the unique solution to the problem
(1.1)–(1.2) is
u(t, x) = g(x − ct).
I am aware that for some people the geometric arguments I presented are not very convincing (but
please see the figures below) therefore I will present a totally algebraic argument, which also will allow
me to find a general solution to the linear transport equation. The key idea is that I should try to
consider (1.1) in the new coordinates (which I “guessed” from the geometry of the equation)
τ = t, ξ = x − ct,
(note that in the textbook t is used instead of τ ). This change of variables is clearly invertible.
I have
u(t, x) = u(τ, ct + ξ) = v(τ, ξ) = v(t, x − ct).
Now I can use the chain rule. To wit,
∂u/∂t (t, x) = ∂v/∂τ (τ, ξ) ∂τ/∂t + ∂v/∂ξ (τ, ξ) ∂ξ/∂t = ∂v/∂τ (τ, ξ) − c ∂v/∂ξ (τ, ξ).
Similarly,
∂u/∂x (t, x) = ∂v/∂ξ (τ, ξ).
After plugging the found expressions into (1.1) I get
vτ − cvξ + cvξ = 0 =⇒ vτ = 0,
which we already solved in Section 0.1. Therefore,
v(τ, ξ) = F (ξ)
and hence, returning to the original coordinates,
u(t, x) = F (x − ct),
where F is an arbitrary C (1) function (recall that I am looking for a classical solution). This is the
general solution to equation (1.1).
To use the initial condition (1.2) I take t = 0 and get, as expected, that F (x) = g(x) hence F = g.
Summarizing,
Theorem 1.1. Problem (1.1) has the general solution
u(t, x) = F (x − ct)
for an arbitrary F ∈ C (1) (R; R) function.
The initial value problem (1.1), (1.2) with g ∈ C (1) has a unique classical solution
u(t, x) = g(x − ct).
Theorem 1.1 is an existence and uniqueness theorem for the initial value problem for the linear
one dimensional transport equation.
The straight lines x = ct + ξ are very important and are called the characteristics. Hence we now
know that the solutions to the transport equation are constant along the characteristics.
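Theorem 1.1 can also be verified symbolically; a small sketch with sympy (the choice of tool is mine), checking that u(t, x) = g(x − ct) satisfies the PDE and the initial condition for an arbitrary differentiable g:

```python
import sympy as sp

# Symbolic check of Theorem 1.1: u(t, x) = g(x - c t) solves u_t + c u_x = 0
# for an arbitrary (differentiable) g, and matches the initial condition.
t, x, c = sp.symbols('t x c')
g = sp.Function('g')

u = g(x - c * t)
residual = sp.diff(u, t) + c * sp.diff(u, x)

assert sp.simplify(residual) == 0        # the PDE holds identically
assert u.subs(t, 0) == g(x)              # the initial condition holds
```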
What is the geometric meaning of the function of the form g(x − ct)? Take, e.g., c > 0 then
for fixed time moments t0 = 0, t1 > t0 , t2 > t1 , . . . I will get the same graph of g only shifted by
0, ct1 , ct2 , . . . units to the right. What I observe is a traveling wave moving from left to right with
the speed c. If c < 0 then my wave will travel with the speed |c| from right to left. This geometric
picture should explain the title transport equation, since, according to the analysis above, the equation
describes the transportation of the substance u from the point (0, ξ) to some other point (t, x) along
the characteristic x = ct + ξ.
Figure 1.1: Traveling wave solution for different time moments
Example 1.2. Consider the initial value problem
ut + ux = 0, u(0, x) = e^{−x²}.
We have that the solution is u(t, x) = e^{−(x−t)²}, whose graphs at different time moments are given in Fig. 1.1.
In Fig. 1.2 one can see the same solution in the form of full three dimensional surface (the left
panel) and a contour plot (the right panel), on which the level sets are represented.
Figure 1.2: Traveling wave solution in the form of 3D plot (left) and contour plot (right)
Finally, we can put everything together in the same figure (Fig. 1.3), where you can see solutions at
different time moments (bold curves), the three dimensional surface, together with the characteristics
(bold dashed lines) on the plane (t, x).
Example 1.3 (Distributed source). To practice the considered approach, consider the initial value
problem (this is Problem 2.2.10 from the textbook)
ut + cux = f (t, x), t > 0, x ∈ R,
u(0, x) = g(x), x ∈ R. (1.3)
The physical interpretation of problem (1.3) is that now we have a distributed source of the substance
with the intensity (density per time unit) f (t, x) at the time t and the position x.
Exercise 1.1. Deduce problem (1.3) from physical principles and the conservation law.
Using the same change of variables I find
vτ = f (τ, cτ + ξ).
Integrating, I find
v(τ, ξ) = ∫_{0}^{τ} f (s, cs + ξ) ds + F (ξ),
or, returning to the original variables,
u(t, x) = ∫_{0}^{t} f (s, x − c(t − s)) ds + F (x − ct).
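This formula can be checked symbolically for a concrete source; a sketch with sympy, where the particular f below (and F = g from the initial condition) is an illustrative choice of mine:

```python
import sympy as sp

# Check of the formula u(t,x) = ∫_0^t f(s, x - c(t-s)) ds + g(x - c t) for a
# concrete source; the particular f below is an illustrative choice.
t, x, s, c = sp.symbols('t x s c')
g = sp.Function('g')
f = lambda t_, x_: x_ * sp.exp(-t_)      # sample source intensity

u = sp.integrate(f(s, x - c * (t - s)), (s, 0, t)) + g(x - c * t)
residual = sp.diff(u, t) + c * sp.diff(u, x) - f(t, x)

assert sp.simplify(residual) == 0                  # u_t + c u_x = f holds
assert sp.simplify(u.subs(t, 0) - g(x)) == 0       # u(0, x) = g(x) holds
```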
Example 1.5 (Transport equation with decay). Recall that one of the very basic ODE is the so called
decay equation (you could have seen it with respect to, e.g., radioactive decay, it literally says that
the rate of decay of some compound is proportional to the present amount)
u′ = −au,
for some constant a > 0. The solution to this equation is (by, e.g., the separation of variables)
u(t) = Ce^{−at}.
Now consider the transport equation with a decay term (with the decay constant normalized to 1):
ut + cux = −u.
In the characteristic variables τ = t, ξ = x − ct this becomes
vτ = −v,
and hence
v(τ, ξ) = F (ξ)e^{−τ},
where F is an arbitrary smooth function of ξ. Returning to the original variables,
u(t, x) = F (x − ct)e^{−t},
which results in
u(t, x) = g(x − ct)e^{−t}
for the initial condition
u(0, x) = g(x).
Now the solutions are not constant along the characteristics; they decay along them, representing a damped traveling wave.
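The damped traveling wave corresponds to the transport equation with decay, ut + cux = −u (the decay constant normalized to 1, matching the factor e^{−t} in the solution); a symbolic check with sympy:

```python
import sympy as sp

# Check that u(t, x) = g(x - c t) e^{-t} solves the transport equation with
# decay, u_t + c u_x = -u (decay constant normalized to 1).
t, x, c = sp.symbols('t x c')
g = sp.Function('g')

u = g(x - c * t) * sp.exp(-t)
residual = sp.diff(u, t) + c * sp.diff(u, x) + u

assert sp.simplify(residual) == 0        # u_t + c u_x = -u holds
assert u.subs(t, 0) == g(x)              # the initial condition holds
```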
Example 1.6 (A problem with a boundary condition). Assume that now u denotes the concentration
of some pollutant in a river and there is a source of this pollutant at the point x = 0 of the intensity
f (t) at time t (this is essentially Problem 2.2.14). Mathematically it means that I consider only
problem for x > 0, t > 0 for the equation
ut + cux = 0
with c > 0.
Since it is natural to have the initial condition now only for x > 0, for some part of the first quadrant I will have to use the boundary condition u(t, 0) = f (t). The characteristic x = ct separates the two regions where I must use either the initial or the boundary condition. If x ≥ ct then I can use the usual initial condition. If, however, x < ct then for my general solution u(t, x) = F (x − ct) I must have
F (−ct) = f (t),
Figure 1.4: Correctly (left) and incorrectly (right) stated boundary conditions for a transport equation.
The arrows indicate the direction of the transport with time, hence for the left panel I have c > 0 and
for the right one c < 0
or
F (τ ) = f (−τ /c).
Finally I obtain a unique solution
u(t, x) = g(x − ct), x ≥ ct,
u(t, x) = f (t − x/c), x ≤ ct.
To guarantee that my solution is a classical one, I should request that f (0) = g(0) and −cg ′ (0) = f ′ (0)
hold.
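A small numeric sketch of this piecewise formula (Python; the sample g and f are my illustrative choices satisfying f(0) = g(0)); note that the chosen pair gives only a C⁰ solution, since f′(0) ≠ −cg′(0) here:

```python
import numpy as np

# Direct implementation of the piecewise solution above (x > 0, c > 0):
# initial data g on the x-axis, boundary intensity f at x = 0.
def solution(t, x, c, g, f):
    """u(t, x) = g(x - c t) for x >= c t, and f(t - x / c) otherwise."""
    return np.where(x >= c * t, g(x - c * t), f(t - x / c))

c = 2.0
g = np.cos                              # initial concentration, g(0) = 1
f = lambda t: np.exp(-t)                # boundary intensity, f(0) = 1

# In the region x >= c t the boundary plays no role:
assert abs(solution(3.0, 10.0, c, g, f) - np.cos(4.0)) < 1e-12

# The two branches meet continuously on the characteristic x = c t,
# because the compatibility condition f(0) = g(0) holds:
left = solution(3.0, 6.0 - 1e-9, c, g, f)
right = solution(3.0, 6.0 + 1e-9, c, g, f)
assert abs(left - right) < 1e-6
```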
Now assume that c < 0 in my problem. Then the characteristics have a negative slope and each characteristic crosses both the x > 0 and t > 0 half-lines. This results in a well posed problem if and only if the initial and boundary values are coordinated; otherwise, the problem has no physical solution (see Fig. 1.4).
This is a first sign that in PDE a physically correct choice of initial and boundary conditions results in a well posed problem, whereas arbitrarily assigning initial and boundary conditions can lead to no solution at all.
Definition 1.7. The problem is called well posed if it has a solution, this solution is unique, and this solution depends continuously on the initial data and the parameters.
In the examples above I usually showed that a solution exists by presenting an explicit formula
for this solution. The uniqueness was guaranteed by the fact that the signals are spread along the
characteristics and the initial (and boundary) conditions uniquely prescribe values at one point of the
characteristic. Here I will give a first example of continuous dependence on the initial data.
Proposition 1.8. Consider a boundary-initial value problem for the transport equation:
Finally I note that if u1 , u2 are solutions to the problems with g1 , f1 and g2 , f2 respectively, then
u1 − u2 , due to linearity, solves the problem with the initial condition g1 − g2 , f1 − f2 . Hence I end up
with the estimate
∫_{0}^{R} (u1 (t, x) − u2 (t, x))² dx ≤ ∫_{0}^{R} (g1 (x) − g2 (x))² dx + a ∫_{0}^{t} (f1 (s) − f2 (s))² ds,
which shows that a small change in the initial data yields a small change in the solution, which is tantamount to saying that the solution depends continuously on the initial data.
Consider now a general linear first order PDE
a(x, y)ux + b(x, y)uy = c(x, y)u + d(x, y), (1.4)
where a, b, c, d ∈ C (1) (R2 ; R) are given functions. I will consider an initial condition
u(x, y) = g(x, y), (x, y) ∈ Γ, (1.5)
which says that the initial condition is prescribed along some arbitrary (well, not totally, see below)
curve Γ on the plane (x, y). The main difference from the textbook is that I am allowing to have rather
general initial conditions contrary to the fixed value t = 0 in the textbook. I also use the variables
x and y for the independent variables to emphasize that while it is important to keep in mind the
physical description of the problem, mathematically for us both variables t (time) and x (space) are
equally important, and sometimes it is better to make them indistinguishable. Even more importantly, a lot
of first order PDE appear naturally in geometric rather than physical problems, and for this setting
x and y are our familiar Cartesian coordinates.
Remark 1.9. All I am going to present is almost equally valid for a semi-linear first order equation, in which the coefficients c and d are allowed to depend on the unknown u itself. The characteristics of (1.4) are the curves (x(τ ), y(τ )) solving x′ (τ ) = a(x, y), y ′ (τ ) = b(x, y); the key observation is that along the characteristics problem (1.4) (or (1.6)) becomes an ordinary differential equation.
Indeed, consider the solution u(x, y) to (1.4) along (x(τ ), y(τ )). It becomes just a function of τ
alone: v(τ ) = u(x(τ ), y(τ )) (here I picked a different letter to emphasize that v depends only on τ ).
Now take the derivative with respect to τ :
dv/dτ = ux x′ (τ ) + uy y ′ (τ ) = a(τ )ux + b(τ )uy = c(τ )v + d(τ ),
which is a linear first order ODE. To get the initial condition for this ODE I will use (1.5).
In general (several examples are given below), to solve the initial value problem (1.4)-(1.5) I proceed in the following way. I consider a parametrization of the initial curve Γ:
Γ : x(ξ), y(ξ),
solve the characteristic system
dx/dτ = a(x, y), x(0, ξ) = x(ξ),
dy/dτ = b(x, y), y(0, ξ) = y(ξ),
and then solve along each characteristic the linear ODE
v̇ = c(τ )v + d(τ )
with the initial condition
v(0, ξ) = g(ξ).
The unique solution is v(τ, ξ) and I found a parametric representation of the solution to (1.4)-(1.5):
x(τ, ξ), y(τ, ξ), v(τ, ξ).
If I am able to express τ and ξ from the first two functions then I will finally get the unique solution
u(x, y) = v(τ (x, y), ξ(x, y)).
There is still the question whether I will always be able to do it, but I will postpone the general discussion and instead consider a few examples.
Example 1.10. Solve
xux + yuy = u,
(1.8)
u(x, 1) = g(x).
The curve of the initial conditions is given simply as y = 1. In this case I can always take
x = ξ, y = 1.
Hence I have for my characteristics
dx/dτ = x, x(0, ξ) = ξ,
dy/dτ = y, y(0, ξ) = 1,
which immediately implies that
x(τ, ξ) = ξe^{τ}, y(τ, ξ) = e^{τ}.
From the last expression I also have that
ξ = x/y.
Along my characteristics I have (note the initial condition)
dv/dτ = v, v(0, ξ) = g(ξ) =⇒ v(τ, ξ) = g(ξ)e^{τ}.
Finally, returning to the initial variables (x, y), I have
u(x, y) = g(x/y) y.
Note that my solution is not defined at y = 0.
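The general recipe can also be carried out numerically; a sketch using scipy's ODE solver (the helper name and the tolerances are my choices), integrating one characteristic of (1.8) and comparing with the exact solution g(x/y)y found above:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Numerical version of the recipe: integrate the characteristic system
# dx/dtau = a, dy/dtau = b together with dv/dtau = c*v + d, starting from a
# point (x0, y0) of the initial curve with v0 = g there.
def characteristic(a, b, c, d, x0, y0, v0, tau_span):
    """Integrate (x, y, v) along one characteristic of a u_x + b u_y = c u + d."""
    def rhs(tau, s):
        x, y, v = s
        return [a(x, y), b(x, y), c(x, y) * v + d(x, y)]
    return solve_ivp(rhs, tau_span, [x0, y0, v0], rtol=1e-10, atol=1e-12)

# Check against Example 1.10: x u_x + y u_y = u with u(x, 1) = g(x),
# whose exact solution is u(x, y) = g(x/y) y.
g = lambda x: np.exp(-x**2)
xi = 0.7                                   # starting point (xi, 1) on the initial curve
sol = characteristic(lambda x, y: x, lambda x, y: y,
                     lambda x, y: 1.0, lambda x, y: 0.0,
                     xi, 1.0, g(xi), (0.0, 1.0))
x1, y1, v1 = sol.y[:, -1]                  # endpoint of the characteristic at tau = 1

assert abs(v1 - g(x1 / y1) * y1) < 1e-6    # matches the exact solution
```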
To present graphs (see Fig. 1.5), I will use
g(x) = e^{−x²},
Figure 1.5: Left panel: The bold solid line is the curve of the initial conditions. The dotted lines are
the characteristics. The points show where the initial condition for each characteristic is given. Note
that the characteristics are defined only for y > 0. Right panel: The surface of the solution along with
the initial condition (bold curve) and solutions of the corresponding ODE along the characteristics
Example 1.11. Solve
yux − xuy = 0,
u(x, 0) = g(x), x > 0. (1.9)
The reason why I define the initial condition only for x > 0 will be given below.
The system for characteristics is given by
dx/dτ = y, x(0, ξ) = ξ,
dy/dτ = −x, y(0, ξ) = 0.
Probably the easiest way to solve it is to reduce this system to one second order ODE. Denoting with
prime the derivative with respect to τ I have
x′′ = y ′ = −x =⇒ x′′ + x = 0.
This is the equation for the harmonic oscillator; with the given initial conditions its solution is x(τ, ξ) = ξ cos τ , y(τ, ξ) = −ξ sin τ , and therefore
x² + y² = ξ²,
hence my characteristics are circles of radius ξ. As a side remark I note that the same result can be obtained by reducing the system to just one equation:
dx/dy = (dx/dτ )/(dy/dτ ) = −y/x,
which is separable and again gives x² + y² = const. Along my characteristics v is constant (since dv/dτ = 0, v(0, ξ) = g(ξ)), and therefore
u(x, y) = g(√(x² + y²)),
which gives me the solution to my problem. If I take g(x) = sin x, then the solution is drawn in Fig. 1.6.
Now we can see why the initial condition was prescribed only for x > 0. Since the characteristics are circles in this problem, if my initial condition were given for −∞ < x < ∞ then each characteristic would intersect it at two points. Therefore, for each ODE along this characteristic I would have two initial conditions, which yields a contradiction (nonexistence of a solution in all but very special cases).
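Since u must be constant along the circular characteristics, the solution for data on the positive x-axis is u(x, y) = g(√(x² + y²)); a short sympy check of this:

```python
import sympy as sp

# Check that u(x, y) = g(sqrt(x^2 + y^2)), constant on the circular
# characteristics, solves (1.9) with the data u(x, 0) = g(x) for x > 0.
x = sp.symbols('x', positive=True)
y = sp.symbols('y', real=True)
g = sp.Function('g')

u = g(sp.sqrt(x**2 + y**2))
residual = y * sp.diff(u, x) - x * sp.diff(u, y)

assert sp.simplify(residual) == 0        # y u_x - x u_y = 0 holds
assert u.subs(y, 0) == g(x)              # the data on the positive x-axis
```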
To conclude these examples, I must decide when I can actually express my two parameters τ and ξ as functions of x, y. It turns out (this is usually not covered in Calc III, but a curious student can look up the inverse function theorem) that it is always possible if
Figure 1.6: The surface of the solution along with the initial condition (bold curve) and solutions of
the corresponding ODE (bold dashed curves) along the characteristics (thin solid curves)
the curve of the initial conditions is not tangent to any characteristic.
Summarizing,
Proposition 1.12. Problem (1.4)-(1.5) has a unique solution, which can in general be defined on some
subset of R2 , if the curve Γ on which the initial conditions are given is not tangent to a characteristic,
and if the characteristics do not intersect Γ at more than one point.
Example 1.13. Now I would like to reconcile the theory I outlined above and the approach in the
textbook, where the transport equation with a non-constant velocity is given:
ut + c(x)ux = 0,
with, e.g., c(x) = −x.
To conclude, in the last two lectures I considered the so-called method of characteristics to solve an
initial value problem for a linear (or semi-linear) first order PDE, where the unknown function depends
on two independent variables. The key fact is that along the special curves, called the characteristic
curves or characteristics, these PDE turn into ODE, for which an extensive theory exists (from a
physical point of view this is a manifestation of particle-wave duality, when the system can be either
described by the positions of discrete particles or using a continuous representation of a force field).
This method can be immediately generalized to linear first order PDE with more than two independent
variables and also, with some modifications, to nonlinear equations. I will only touch on the latter
topic in the following lecture.
Figure 1.7: Nonlinear wave in the Hopf’s equation
Exercise 1.2. Check directly that u = g(x − ut) solves problem (1.10)-(1.11).
To see the possible consequence of (1.12) consider the following example.
Example 1.14. Let
g(x) = e^{−x²}.
Then the solution is given implicitly by u = e^{−(x−ut)²}, which is impossible to resolve for u explicitly, but I can sketch my solution for several time moments, see
Fig. 1.7. We can see that for our nonlinear problem the velocity of propagation depends on the
spatial coordinate and the initial density. Basically the equation itself says that the points with higher
concentration move faster, with time overtaking the points with lower concentration.
To understand what is happening here it is worth drawing the characteristics (see Fig. 1.8). Since
the slope of the characteristics depends on the initial concentrations, in some cases, as in my example,
the characteristics can intersect. Since my solution must be constant along the characteristics, at a point where the characteristics intersect I must have more than one value of u, which is exactly what happens in Fig. 1.7.
To find the first time moment when the characteristics intersect, I can look at the point when the
derivative of u with respect to x becomes infinite. Using the implicit differentiation, I find from (1.12)
∂u/∂x = g ′ (x − ut) (1 − t ∂u/∂x) =⇒ ∂u/∂x = g ′ (x − ut) / (1 + g ′ (x − ut) t).
Recalling that x − ut = ξ I finally get the condition for the first time when my solution becomes multivalued (the characteristics intersect):
t = min_ξ { −1/g ′ (ξ) }.
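For the Gaussian datum of Example 1.14 this minimum can be evaluated directly: g′(ξ) < 0 exactly for ξ > 0, and the analytic minimum is √(e/2) ≈ 1.166, which the grid search below reproduces (a sketch; the grid bounds and resolution are my choices):

```python
import numpy as np

# Breaking time for the Gaussian datum g(x) = exp(-x^2):
# t = min over xi of -1/g'(xi); analytically the minimum is sqrt(e/2).
xi = np.linspace(1e-3, 5.0, 500001)       # g'(xi) < 0 exactly for xi > 0
gprime = -2.0 * xi * np.exp(-xi**2)
t_break = np.min(-1.0 / gprime)

assert abs(t_break - np.sqrt(np.e / 2.0)) < 1e-6
```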
Figure 1.8: Characteristics of the Hopf's equation with the initial condition g(x) = e^{−x²}
As a take-home message from this lecture I would like all the students to remember that the main distinction between linear waves (solutions to the transport equation) and nonlinear waves (solutions to the Hopf's equation) is that for a linear wave all the points move with the same velocity, whereas for a nonlinear wave the velocity depends on the concentration. If someone wants a more colorful picture of this phenomenon, think of ocean waves, which eventually break: this is a nice example of a nonlinear wave.
1.4 Deriving the wave equation
From now on I consider only linear second order partial differential equations, and the first equation I will study is the so-called wave equation which, in one spatial dimension, has the form
utt = c²uxx.
To derive it, I start with the simplest mechanical system: a single mass on a spring. By Newton's second law,
mu′′ = −ku.
Here u(t) is the position of my mass at time t, u′′ (t) is the acceleration, m is the mass, ku is Hooke's law, which says that the force is proportional to the displacement (note that this is actually true only for small displacements), the minus sign appears because the force points opposite to the u-axis, and k is a parameter, the rigidity of the spring. Rewriting this as
u′′ + ω²u = 0, ω = √(k/m),
I find that the general solution to this ODE can be written as

u(t) = C1 cos ωt + C2 sin ωt,

which represents harmonic oscillations. C1 and C2 are arbitrary constants that can be uniquely
identified using the initial position and initial velocity of the mass.
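For a quick self-check, one can verify numerically that u(t) = C1 cos ωt + C2 sin ωt, with C1 = u(0) and C2 = u′(0)/ω, really satisfies u′′ + ω²u = 0 (a Python sketch; the parameter values are my arbitrary choices):

```python
import math

omega = 2.0  # omega = sqrt(k/m)

def u(t, u0=1.0, v0=3.0):
    """u(t) = C1 cos(omega t) + C2 sin(omega t) with C1 = u(0), C2 = u'(0)/omega."""
    return u0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)

# centered second difference approximates u'' at t = 0.3
h, t = 1e-4, 0.3
u_tt = (u(t + h) - 2.0 * u(t) + u(t - h)) / h**2
print(abs(u_tt + omega**2 * u(t)))  # ≈ 0, i.e. u'' + omega^2 u = 0
```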
Now let me consider a more general situation, when I have k + 1 equal masses, such that the zeroth
and the last one are fixed, but all others are free to move along the axis and linearly interconnected
with the springs with the same rigidity k (see Fig. 1.11).
Figure 1.11: A system of k + 1 masses connected by identical springs. The initial and final masses are
fixed
Let ui (t) denote the displacement at time t of the i-th mass from its equilibrium position xi . Using
again the second Newton’s law I get, for the i-th mass
mu′′i = Fi,i+1 −Fi,i−1 = k(ui+1 −ui )−k(ui −ui−1 ) = k(ui+1 −2ui +ui−1 ), i = 1, . . . , k−1, u0 = 0, uk = 0.
For each ui I also should have two initial conditions: initial displacement and initial velocity. This
is still a system of (k − 1) ODEs, which can actually be analyzed and solved. Instead of solving it,
however, I will consider the situation when the number of my masses grows to infinity, i.e., k → ∞.
Assuming that at equilibrium all my masses are separated by the same distance h, this is equivalent
to saying that I consider the case h → 0. In other words, I consider a continuous limit of a discrete
system!
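This continuous limit can be previewed numerically. The sketch below (Python; the scheme and all parameter values are my own choices, and I write kk for the spring rigidity to avoid clashing with the number of masses) integrates the system of masses with a velocity-Verlet scheme and keeps the two end masses pinned:

```python
import numpy as np

def simulate_chain(K=50, m=1.0, kk=1.0, dt=0.01, steps=2000):
    """Integrate m u_i'' = kk (u_{i+1} - 2 u_i + u_{i-1}) with u_0 = u_K = 0."""
    u = np.sin(np.pi * np.arange(K + 1) / K)  # smooth initial displacement
    u[0] = u[-1] = 0.0                        # fixed ends
    v = np.zeros_like(u)                      # zero initial velocities

    def acc(u):
        a = np.zeros_like(u)
        a[1:-1] = (kk / m) * (u[2:] - 2.0 * u[1:-1] + u[:-2])
        return a

    a = acc(u)
    for _ in range(steps):                    # velocity-Verlet time stepping
        u[1:-1] += v[1:-1] * dt + 0.5 * a[1:-1] * dt**2
        a_new = acc(u)
        v[1:-1] += 0.5 * (a[1:-1] + a_new[1:-1]) * dt
        a = a_new
    return u

u_final = simulate_chain()
print(u_final[0], u_final[-1])  # the fixed ends never move
```

Increasing the number of masses while shrinking the spacing (with m and the rigidity scaled as described below) makes the discrete profile behave more and more like a continuous vibrating rod.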
I have that ui (t) = u(t, xi ) and the equilibrium positions of all the masses become a continuous
variable x. That is, when h → 0, I have that the vector (u1 (t), . . . , ui (t), . . . , uk−1 (t)) becomes a
function of two variables u(t, x), where now the meaning of u(t, x) is the displacement of the section
of a rod at time t that had the coordinate x at rest. Note that my discrete system of masses turns
into a rod. To carefully perform the limit I also need to understand how my two parameters behave.
It is natural to assume that
m = ρSh,
where ρ is density of the rod, S is the area of transverse section. For the rigidity I will use the physical
fact that
k = ES/h,

where E is Young's modulus. Hence I get that my system can now be rewritten as
where dots denote the terms of order 3 and above in h. Now if I plug these expressions into my
differential equation and cancel h², I can see that in the limit h → 0 I arrive at the wave equation

utt = c² uxx,  c² = E/ρ.   (1.14)

If the ends of the rod (of length l) are fixed, this equation is supplemented by the boundary conditions

u(t, 0) = u(t, l) = 0,
Arguably the best way to get an intuitive understanding of what is modeled by this equation is to
imagine an infinite guitar string, where u(t, x) represents the transverse displacement at time t at
position x.
Very surprisingly (do not get used to it, this is a very rare case for PDE), problem (1.14)–(1.15)
can be solved explicitly.
1.5.1 The general solution to the wave equation
First I will find the general solution to (1.14), i.e., the formula that includes all possible solutions to
the wave equation. To do this I note that I can rewrite (1.14) as
∂²u/∂t² − c² ∂²u/∂x² = (∂²/∂t² − c² ∂²/∂x²) u = (∂/∂t − c ∂/∂x)(∂/∂t + c ∂/∂x) u = 0.
Denoting

v = (∂/∂t + c ∂/∂x) u,
I find that my wave equation (1.14) is equivalent to two first order linear PDEs:
vt − cvx = 0,
ut + cux = v.
From the previous lectures we know immediately that the first one has the general solution v(t, x) =
F ∗ (x + ct), for some arbitrary F ∗ , and the second one has the solution
u(t, x) = ∫₀ᵗ v(s, x − c(t − s)) ds + G∗(x − ct) = ∫₀ᵗ F∗(x − ct + 2cs) ds + G∗(x − ct).
and finally

u(t, x) = ( f(x + ct) + f(x − ct) )/2 + (1/(2c)) ∫_{x−ct}^{x+ct} g(s) ds,   (1.17)
which is d’Alembert’s formula.
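D'Alembert's formula translates into a few lines of code. Here is a Python sketch (the helper names are mine; the integral is a plain trapezoid rule), checked against the exact solution u(t, x) = cos(ct) sin x that corresponds to f(x) = sin x and g = 0:

```python
import numpy as np

def trap(y, s):
    """Plain trapezoid rule on the nodes s."""
    return float(((y[:-1] + y[1:]) * 0.5 * (s[1:] - s[:-1])).sum())

def dalembert(f, g, c, t, x, n=4001):
    """u(t,x) = (f(x+ct) + f(x-ct))/2 + (1/(2c)) * integral of g over [x-ct, x+ct]."""
    s = np.linspace(x - c * t, x + c * t, n)
    return 0.5 * (f(x + c * t) + f(x - c * t)) + trap(g(s), s) / (2.0 * c)

c, t, x = 1.0, 0.7, 0.3
u = dalembert(np.sin, lambda s: np.zeros_like(s), c, t, x)
print(u, np.cos(c * t) * np.sin(x))  # the two values agree
```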
Remark 1.15. All the results above are obtained in a different way in the textbook. Namely, the
change of variables
η = x + ct, ξ = x − ct
reduced the wave equation to its canonical form
vηξ = 0.
Example 1.16. Let my initial condition be such that the initial velocity is zero. Then my formula
simplifies to
u(t, x) = ( f(x + ct) + f(x − ct) )/2,
and hence my solution is a sum of two identical linear waves, each of which is exactly half of the initial
displacement. Therefore I can envision the behavior of solutions by first dividing in half the initial
displacement, then shifting one half to the left and the other one to the right, and then adding them.
Let, e.g., f be defined as

        0,  x < −0.5,
f(x) =  1,  −0.5 ≤ x ≤ 0.5,
        0,  x > 0.5.
The form of the solution at different time moments is shown in Fig. 1.12.
Now it should be clear what the three dimensional surface looks like (Fig. 1.13, left panel).
Finally, the behavior of the solution should also be clear from looking at the plane (t, x) and several
straight lines of the form
straight lines of the form
x = ct + ξ, x = −ct + η,
which are also called the characteristics of the wave equation. Note that the signal spreads along the
characteristics (Fig. 1.13, right panel).
Figure 1.12: Solution at different time moments to the initial value problem for the wave equation in
case when the initial velocity is zero
Figure 1.13: Left: Solution in the coordinates (x, t) to the initial value problem for the wave equation
in case when the initial velocity is zero. Right: Characteristics of the wave equation
Therefore the solution now is a difference of two traveling waves, each of which is exactly half of G.
In my case
        0,        x < −0.5,
G(x) =  x + 0.5,  −0.5 ≤ x ≤ 0.5,
        1,        x > 0.5,
and my solution is given in Fig. 1.14. See also Fig. 1.15 for the three dimensional picture. Again the
overall picture can be figured out from the plane (t, x) and characteristics on it.
These two examples give a general idea of how solutions to the wave equation actually behave. Note
that in both cases I used initial conditions that are not continuously differentiable (they have “corners”),
and hence my solutions are not classical solutions. However, the notion of a solution to the wave
equation can be extended in a way that includes these nondifferentiable solutions. They are usually called
weak solutions.
Another point to note is that the characteristics of the wave equation allow one to see immediately which
initial conditions contribute to the solution at a given point (t, x) (this is called the domain of dependence)
and also how a given point ξ on the line of initial conditions spreads its signal with time (the range of influence),
see Fig. 1.16.
Figure 1.14: Solution for different time moments to the initial value problem for the wave equation in
case when the initial displacement is zero. The actual solution is shown with the green border
Figure 1.15: Solution in the coordinates (x, t) to the initial value problem for the wave equation in
case when the initial displacement is zero
Figure 1.16: Domain of dependence of the point with coordinates (t, x) (left, the bold line on the x
axis) and the range of influence of the point ξ along the line of the initial conditions (right, the shaded
area)
1.6 Solving the wave equation. Some extensions
1.6.1 Linear versus nonlinear equations
I would like to start this lecture with a discussion of what is called linear in mathematics. Recall that I
used the representation

(∂²/∂t² − c² ∂²/∂x²) u = 0
to write the wave equation in a form that hints at a possible way to solve it. The expression inside
the parentheses is called a differential operator. If I denote this differential operator by L, then I can
use the notation
Lu = 0
to write my wave equation in a concise way.
In general, any differential equation, either ordinary or partial, can be written in the form Lu = f,
where L is some differential operator and f is some function that does not depend on u. For example, for
the simple harmonic oscillator u′′ + u = 0 I have L = d²/dt² + 1. For the linear transport equation
with variable speed ut + c(x)ux = 0 my operator is ∂t + c(x)∂x, and so on.
Definition 1.18. A (differential) operator L is called linear if for any u1 and u2 from its domain and
any constants α1 and α2 I have

L(α1 u1 + α2 u2) = α1 Lu1 + α2 Lu2.
For example, the wave operator L = ∂tt −c2 ∂xx , the simple harmonic oscillator operator L = ∂tt +1,
and the operator from linear transport equation L = ∂t + c(x)∂x are linear (check this!). However,
the differential operator of the Hopf equation ut + uux = 0 is not linear. Indeed, let me apply this
operator to αu for some constant α. Then I have that αut + α2 uux , which is not equal to α(ut + uux )
as necessary for the operator to be linear.
Now I can give a rigorous definition of a linear differential equation. The equation

Lu = f,   (1.18)

where f does not depend on u, is called linear if L is a linear operator. It is called linear homogeneous
if f = 0 and inhomogeneous (or nonhomogeneous) otherwise. If L is not linear then equation (1.18)
is called nonlinear.
The linearity of the equation is very important, since the so-called superposition principle holds for
linear equations; it is a consequence of the following simple and yet very important proposition.
Consider the equation

Lu = f,   (1.19)

together with the corresponding homogeneous equation

Lu = 0.   (1.20)
If u1 , u2 solve (1.20) then α1 u1 + α2 u2 also solves (1.20). If u1 , u2 solve (1.19) then u1 − u2 solves
(1.20). If u1 solves Lu = f1 and u2 solves Lu = f2 then α1 u1 + α2 u2 solves Lu = α1 f1 + α2 f2 . Finally,
any solution to (1.19) can be written as
u = uh + up ,
where uh is the general solution to (1.20) and up is a particular solution to (1.19).
Proof. Since most of the stated facts follow directly from the definition, I will only prove the final
statement.
Indeed, if I have some (fixed) particular solution up to (1.19), then the difference u − up, where u
is an arbitrary solution to (1.19), solves the homogeneous equation (1.20) and hence coincides with some
instance of the general solution uh. That is, u − up = uh, which implies u = uh + up, which is the required form.
Consider now the initial value problem for the inhomogeneous wave equation

utt = c² uxx + F(t, x),
u(0, x) = f(x),          (1.21)
ut(0, x) = g(x),

where F, f, g are given sufficiently smooth functions. First I will use the linearity of this equation to
divide it into simpler problems. To wit, consider
vtt = c2 vxx ,
v(0, x) = f (x), (1.22)
vt (0, x) = g(x),
and
wtt = c2 wxx + F (t, x),
w(0, x) = 0, (1.23)
wt (0, x) = 0.
That is, I divided my original problem into the initial value problem for the homogeneous wave
equation and inhomogeneous problem with zero initial conditions.
Lemma 1.21. Let v solve (1.22) and w solve (1.23). Then u = v + w solves (1.21).
Proof. Exercise.
I know how to solve problem (1.22), for this I can simply use d’Alembert’s formula. Hence I
only need to figure out how to solve inhomogeneous problem (1.23). For this I will use the so-called
Duhamel’s principle, which generally works for linear differential equations. The idea is to reduce the
inhomogeneous problem to a family of homogeneous ones with specific initial conditions and afterwards
sum (integrate) everything together by using the linearity of the equation. To perform this program
I should be able to solve, along with (1.22), the following problem
vtt = c2 vxx ,
v(τ, x) = f (x), (1.24)
vt (τ, x) = g(x),
Lemma 1.22. Problem (1.24) has the solution (note that I include the parameter τ also in my formula)
v(t, x; τ) = ( f(x − c(t − τ)) + f(x + c(t − τ)) )/2 + (1/(2c)) ∫_{x−c(t−τ)}^{x+c(t−τ)} g(s) ds.
Proof. To prove the result one can use a new variable η = t − τ , for which the initial condition is
given at zero, use d’Alembert’s formula and return to the original variables. The details are left as an
(easy) exercise.
Lemma 1.23. Consider yet another initial value problem for the wave equation:
rtt = c2 rxx ,
r(τ, x; τ ) = 0, (1.25)
rt(τ, x; τ) = F(τ, x).

Then the function

w(t, x) = ∫₀ᵗ r(t, x; τ) dτ

solves problem (1.23).

Proof. To prove the lemma I will use the Leibniz integral rule, which says that
d/dx ∫_{a(x)}^{b(x)} g(s, x) ds = ∫_{a(x)}^{b(x)} (∂g/∂x)(s, x) ds + g(b(x), x) b′(x) − g(a(x), x) a′(x).
I have

wt = r(t, x; t) + ∫₀ᵗ rt(t, x; τ) dτ = ∫₀ᵗ rt(t, x; τ) dτ.
Next

wtt = rt(t, x; t) + ∫₀ᵗ rtt(t, x; τ) dτ = F(t, x) + ∫₀ᵗ rtt(t, x; τ) dτ.
I also have

wxx = ∫₀ᵗ rxx(t, x; τ) dτ = (1/c²) ∫₀ᵗ rtt(t, x; τ) dτ.
Hence I get wtt − c2 wxx = F (t, x) and w(0, x) = wt (0, x) = 0, which concludes the proof.
Now I can put everything together. According to Lemma 1.22, the solution to (1.25) is given by

r(t, x; τ) = (1/(2c)) ∫_{x−c(t−τ)}^{x+c(t−τ)} F(τ, η) dη.
Using d'Alembert's formula, Lemma 1.21, and Lemma 1.23 I have that the solution to (1.21) is given
by

u(t, x) = ( f(x − ct) + f(x + ct) )/2 + (1/(2c)) ∫_{x−ct}^{x+ct} g(s) ds + (1/(2c)) ∫₀ᵗ ∫_{x−c(t−τ)}^{x+c(t−τ)} F(τ, η) dη dτ,
which is the solution to the inhomogeneous wave equation with given initial conditions.
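The Duhamel term is straightforward to evaluate numerically. The Python sketch below (my own helper; nested trapezoid rules) checks it against the simplest exact case: F ≡ 1 with zero initial data gives u(t, x) = t²/2.

```python
import numpy as np

def trap(y, s):
    """Plain trapezoid rule on the nodes s."""
    return float(((y[:-1] + y[1:]) * 0.5 * (s[1:] - s[:-1])).sum())

def duhamel_term(F, c, t, x, n_tau=401, n_eta=401):
    """(1/(2c)) * integral of F(tau, eta) over the characteristic triangle of (t, x)."""
    taus = np.linspace(0.0, t, n_tau)
    inner = np.empty(n_tau)
    for i, tau in enumerate(taus):
        etas = np.linspace(x - c * (t - tau), x + c * (t - tau), n_eta)
        inner[i] = trap(F(tau, etas), etas)
    return trap(inner, taus) / (2.0 * c)

u = duhamel_term(lambda tau, eta: np.ones_like(eta), c=2.0, t=1.5, x=0.3)
print(u)  # ≈ 1.5**2 / 2 = 1.125
```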
Remark 1.24. The same solution is obtained in the textbook by switching to the characteristic
coordinates ξ = x − ct, η = x + ct. I invite the students to read through this derivation.
Remark 1.25. I will use Duhamel’s principle again later in the course, so I summarize here the main
idea of this principle. It proceeds in two steps:

1. Construct a family of solutions r(t, x; τ) of the homogeneous initial value problem, with a variable
initial moment τ > 0 and the initial data determined by F(τ, x).

2. Integrate this family with respect to the parameter τ over (0, t); by linearity, the result solves the
inhomogeneous problem.
Remark 1.26. The double integral in the final solution shows that the disturbance at the point (t, x)
is obtained through the disturbances from every point of the characteristic triangle (that is, of the
triangle with vertex (t, x) and two sides given by the characteristics through this point). Hence if we have
only an initial displacement, the signal comes from just two points; if the initial velocity is nonzero,
then the signal comes from an interval (the domain of dependence); and if the equation is inhomogeneous,
the signal is summed (integrated) over the whole characteristic triangle.
Exercise 1.4. Show that the Duhamel principle, applied to the first order linear ODE u′ +p(t)u = q(t)
yields the variation of the constant method.
ux(t, 0) = 0

corresponds to a free end of the string. Let me start first with the fixed end.
I consider the problem (1.26) for the wave equation on the half line x > 0, with the boundary condition u(t, 0) = 0 and the initial conditions u(0, x) = f(x), ut(0, x) = g(x) given for x > 0.
Lemma 1.27. Let the initial conditions for the wave equation on the infinite string be odd functions
with respect to some point x0. Then the corresponding solution at this point is equal to zero for all time.
Proof. Without loss of generality, assume that x0 = 0. Then I have that f (x) = −f (−x), g(x) =
−g(−x). Using d’Alembert’s formula for x = 0 yields
u(t, 0) = ( f(−ct) + f(ct) )/2 + (1/(2c)) ∫_{−ct}^{ct} g(s) ds = 0,
since the first term is zero because f is odd, and the second one is also zero as an integral of an odd
function over a symmetric interval.
This lemma implies that to solve (1.26) all I need is to extend my initial conditions in an odd
fashion and apply d’Alembert’s formula.
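In code the odd reflection is one line. A Python sketch (my own names; trapezoid quadrature) confirms the conclusion of Lemma 1.27 that the end x = 0 never moves:

```python
import numpy as np

def trap(y, s):
    """Plain trapezoid rule on the nodes s."""
    return float(((y[:-1] + y[1:]) * 0.5 * (s[1:] - s[:-1])).sum())

def fixed_end_solution(f, g, c, t, x, n=4001):
    """Extend f, g (given for x > 0) oddly and apply d'Alembert's formula."""
    f_odd = lambda s: np.sign(s) * f(np.abs(s))
    g_odd = lambda s: np.sign(s) * g(np.abs(s))
    s = np.linspace(x - c * t, x + c * t, n)
    return 0.5 * (f_odd(x + c * t) + f_odd(x - c * t)) + trap(g_odd(s), s) / (2.0 * c)

at_end = fixed_end_solution(np.sin, np.cos, c=1.0, t=0.8, x=0.0)
print(at_end)  # ≈ 0: the fixed end stays put
```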
Example 1.28. Consider again as the initial displacement the function that is equal to 1 on (0.5, 1.5)
and zero otherwise. If one extends this function in an odd fashion, then we can use the usual d'Alembert's
formula. The details are given in Fig. 1.17.
Similarly we can treat the case when the initial displacement is zero and the initial velocity is now
given as the same function (see Fig. 1.18).
How to deal with the problem with a free end? For this I can use
Lemma 1.29. Let the initial conditions for the wave equation on the infinite string be even functions
with respect to some point x0. Then the derivative of the corresponding solution at this point is equal to
zero.
Hence to solve the problem for the wave equation with a free end one just needs to extend the
initial conditions, given for x > 0, to the whole axis in an even fashion, see Fig. 1.19. I will leave it as an
exercise to figure out what happens with the string if a nonzero initial velocity is given.
A similar technique can be used to solve problems with two boundary conditions.
1.7.2 The initial value problem for the wave equation is well posed
Recall that physically we are usually looking for well posed problems. This means that the solution
exists, is unique, and depends continuously on the initial conditions. For the infinite string, d'Alembert's
formula proves existence of the solution and its uniqueness, since all the manipulations which I did
to obtain this formula are invertible, and hence the formula itself is equivalent to the equations and
Figure 1.17: Solution to the wave equation with one fixed end and zero initial velocity
initial conditions. The continuous dependence on the initial conditions also follows immediately from
d’Alembert’s formula. In particular, if we are given the problem with f1 , g1 and f2 , g2 respectively, I
Figure 1.18: Solution to the wave equation with one fixed end and zero initial displacement
get

|u1(t, x) − u2(t, x)| ≤ |f1(x − ct) − f2(x − ct)|/2 + |f1(x + ct) − f2(x + ct)|/2 + (1/(2c)) ∫_{x−ct}^{x+ct} |g1(s) − g2(s)| ds,
Figure 1.19: Solution to the wave equation with a free end at x = 0 and zero initial velocity
which proves that small deviations in initial conditions imply small deviations in solutions.
Chapter 2
Fourier method
2.1 The heat or diffusion equation

In this section I will show that the heat equation

ut = α²∆u,  α² ∈ R,   (2.1)
where ∆ is the Laplace operator, naturally appears macroscopically, as a consequence of the conservation
of energy and Fourier's law. Fourier's law also explains the physical meaning of various initial
and boundary conditions. I will also give a microscopic derivation of the heat equation, as the limit
of a simple random walk, thus explaining its second title — the diffusion equation.
2.1.1 Conservation of energy plus Fourier’s law imply the heat equation
In one of the first lectures I deduced the fundamental conservation law in the form ut + qx = 0 which
connects the quantity u and its flux q. Here I first generalize this equality for more than one spatial
dimension.
Let e(t, x) denote the thermal energy at time t at the point x ∈ Rk , where k = 1, 2, or 3 (straight
line, plane, or the usual three dimensional space). Note that I use a bold font to emphasize that x is a
vector. The law of the conservation of energy tells me that
the rate of change of the thermal energy in some domain D is equal to the flux of energy
into D minus the flux of energy out of D, plus the amount of energy
generated in D.

In integral form this reads

d/dt ∫_D e(t, x) dx = −∮_{∂D} q(t, x) · n dS + ∫_D f∗(t, x) dx.

Here q is the flux (note that now it is a vector, naturally, since the notion of the flux assumes a
direction), n is the outward normal to the domain D, and the first integral in the right hand side is
taken along the boundary of D, the minus sign is necessary because n is the outward normal. The
dot denotes the usual dot product in Rk . Function f ∗ specifies the energy generated (or absorbed if
it is negative) inside D.
The divergence (or Gauss) theorem says that
∮_{∂D} q(t, x) · n dS = ∫_D ∇ · q(t, x) dx,
To close the system, I relate the thermal energy to the temperature u by e = cρu, where c is the specific
heat and ρ is the density, and I use Fourier's law, which states that the heat flux is proportional to the
gradient of the temperature:

q = −k∇u,

where k is called the thermal conductivity. The minus sign describes the intuitively expected fact that
the heat energy flows from hotter to cooler regions (think about the one dimensional geometry, when
∇u = ux).
Hence

ut = (1/(cρ)) ∇ · (k∇u) + f∗/(cρ),
or, using the notations

α² = k/(cρ),  f = f∗/(cρ),  ∆ = ∇² = ∇ · ∇,
the nonhomogeneous heat equation
ut = α2 ∆u + f. (2.3)
In Cartesian coordinates in 3D I have (assuming for the moment that f = 0)
ut = α2 (uxx + uyy + uzz ) .
If I am dealing with one-dimensional geometry my equation becomes
ut = α2 uxx ,
which is a particular case of (2.1).
2.1.2 Initial and boundary conditions for the heat equation
In general, I will need the initial and boundary conditions to guarantee that my problem is well posed.
The initial condition is given by
u(0, x) = g(x), x ∈ D,
which physically means that we have an initial temperature at every point of my domain D.
To consider different types of the boundary conditions I will concentrate on the case when I deal
with D ⊆ R1 , i.e., my domain is simply an interval (0, l) on the real line. Physically one should imagine
a laterally isolated rod of length l, and I am interested in describing the changes in the temperature
profile inside this rod.
• Type I or Dirichlet boundary conditions. In this case I fix the temperature of the two ends of
my rod:
u(t, 0) = h1 (t), u(t, l) = h2 (t), t > 0.
Please note that h1 and h2 specify not the temperature of the surrounding medium around the
ends of the rod but the exact temperature of the ends themselves, which can be mechanically
achieved by using some kind of thermostats fixed at the ends.
• Type II or Neumann boundary conditions. In this case I fix the flux at the boundaries:

ux(t, 0) = g1(t),   ux(t, l) = g2(t),   t > 0,

where g1(t) > 0, g2(t) > 0 implies that the heat goes from the right to the left, and from the
left to the right otherwise. The case g1 = g2 = 0 is very important and corresponds, clearly, to
no flux condition, or, in other words, to the insulated ends of the rod.
• Type III or Robin boundary conditions. This means that the temperature of the surrounding
medium is specified. I will use the Newton’s law of cooling, together with Fourier’s law, to obtain
in this case
ux(t, 0) = (h/k)( u(t, 0) − q1(t) ),   ux(t, l) = −(h/k)( u(t, l) − q2(t) ),
where q1, q2 are the temperatures of the surrounding medium at the left and the right ends,
respectively. Here k, as before, is the thermal conductivity, and h is the so-called heat exchange
coefficient (which is quite
difficult to measure in real systems). Here Newton’s law of cooling appears in the form of the
difference of two temperatures. Note also that I am careful about signs in my expressions to
guarantee that the heat flows from hotter to cooler places, as intuitively expected. Sometimes
the same boundary conditions are written in a more mathematically neutral form, in which a
prescribed linear combination of u and ux at each end is set equal to a given function of time.
The boundary conditions for two or three dimensional spatial domains are defined similarly. Sometimes
part of the boundary has Type I conditions and part has Type II conditions; in this case it is
said that mixed boundary conditions are set. If hi, gi, qi are identically zero, then the
boundary conditions are said to be homogeneous.
Example 2.1. Suppose we have a copper rod 200 cm long that is laterally insulated and has an initial
temperature 0◦ C. Suppose that the top of the rod (x = 0) is insulated, while the bottom (x = 200) is
immersed into moving water that has a constant temperature of q2 (t) = 20◦ C.
The mathematical model for this problem will be
PDE:  ut = α² uxx,  0 < x < 200,  t > 0,
BC:   ux(t, 0) = 0,   ux(t, 200) = −(h/k)( u(t, 200) − 20 ),
IC:   u(0, x) = 0,  0 ≤ x ≤ 200.
Exercise 2.1. Can you guess what happens with the solution to the previous problem when t → ∞?
Can you prove your expectations mathematically?
Actually, the previous exercise can be solved exactly for arbitrary k and N. I am, however, more
interested in understanding what happens if h, τ → 0, so that my simple random walk becomes
continuous in both time and space. Taking into account this limiting procedure, I will use the notation
uk,N = u(t, x).
Figure 2.2: Left: ten realizations of a symmetric (p = q = 1/2) random walk with h = τ = 1 and
N = 1000. Right: ten realizations of a random walk with drift (p = 0.56, q = 0.44)
The fundamental relation for the probability u(t, x) to find the particle at position x at time t is

u(t + τ, x) = p u(t, x − h) + q u(t, x + h),

which literally says that the probability to find the particle at the position x at time t + τ can be
found as the sum of the probability to be at the position x − h at time t times the probability p to move
to the right, and the probability to be at the position x + h at time t times the probability q to move
to the left. (In probability theory this is called the law of total probability, but do not worry if
you have not seen it before.)
Now I assume that if h, τ → 0 then u becomes a sufficiently smooth function of x and t, such that
I can use Taylor series similarly to what I did when I deduced the wave equation. I have

u(t + τ, x) = u(t, x) + ut(t, x)τ + o(τ),
u(t, x ± h) = u(t, x) ± ux(t, x)h + uxx(t, x)h²/2 + o(h²),

where g(x) = o(f(x)) denotes terms such that lim_{x→0} g(x)/f(x) = 0. For example, τ² = o(τ) and
h³ = o(h²); o(1) means any expression tending to 0 as x → 0. I plug these series into the fundamental
relation, cancel terms, and find that
ut τ + o(τ) = (q − p)h ux + (h²/2) uxx + o(h²).
Dividing by τ yields

ut + o(1) = ((q − p)h/τ) ux + (h²/(2τ)) uxx + o(h²/τ).
Now to get a meaningful result, I must consider a special way when both h and τ tend to zero. First,
I assume that
lim_{h,τ→0} h²/(2τ) = α² > 0.
Second,
lim_{h,τ→0} (q − p)h/τ = lim_{h,τ→0} ((q − p)/h)(h²/τ) = 2βα² =: c
for a constant β; this can always be achieved by taking

q = 1/2 + (β/2)h + o(h),   p = 1/2 − (β/2)h + o(h),
i.e., when the random walk is microscopically symmetric.
Now taking the limits τ, h → 0 yields the diffusion equation with drift
ut = α2 uxx + cux .
If we do not have the drift, i.e., p = q = 0.5, then we recover the familiar homogeneous heat equation
ut = α2 uxx .
Now it should be clear why α2 is often called the diffusivity or diffusion coefficient.
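The scaling h²/(2τ) = α² is easy to see in a simulation. The Python sketch below (all parameter values are my own choices) runs many independent symmetric walks and compares the empirical variance of the final positions with the diffusive prediction 2α²t:

```python
import numpy as np

rng = np.random.default_rng(0)

alpha2 = 1.0                    # target diffusion coefficient
h = 0.05                        # space step
tau = h**2 / (2.0 * alpha2)     # time step, so that h^2/(2 tau) = alpha2
t = 1.0
N = int(round(t / tau))         # number of steps (800 here)
walkers = 50_000

# each step is +h or -h with probability 1/2
signs = rng.integers(0, 2, size=(walkers, N), dtype=np.int8) * 2 - 1
positions = h * signs.sum(axis=1, dtype=np.int64)
print(positions.var(), 2.0 * alpha2 * t)  # both ≈ 2.0
```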
As a side remark I note that if α2 = 0 then I end up with the familiar linear transport equation
ut − cux = 0,
French soldiers,
forming defensive squares in Egypt,
during Napoleon’s campaign
The central topic of our course is the so-called Fourier method or method of separation of variables to
solve linear PDEs with constant coefficients. This method historically required a detailed analysis of
the question of when a given arbitrary function can be represented as a linear combination of
trigonometric functions, and partial answers to this question eventually turned into a wide mathematical field
that is called nowadays Fourier Analysis. To motivate the appearance of Fourier series, i.e., infinite
linear combination of trigonometric functions, I will use exactly the same problem that is considered
in the textbook. So as not to repeat the same arguments, I will look at a somewhat more general picture
that involves the notion of functions of a complex variable, probably unfamiliar to most of the students.
Consider an insulated circular rod with some initial temperature distribution along it. I am
interested in answering the question how the temperature changes with time at each point of the rod.
Using the experience from the previous lecture I can write that my temperature at the point x at the
time t, which as before I denote u(t, x), must satisfy the heat equation

ut = α² uxx,  −π < x < π,  t > 0;   (2.4)

note that I assumed that my spatial variable changes from −π to π, using the geometry of my rod.
I also have the initial condition

u(0, x) = g(x),  −π ≤ x ≤ π.   (2.5)
Clearly, none of the boundary conditions considered in the previous lecture would work in this
particular case. A little thought, however, shows that it is natural to set here periodic boundary
conditions
u(t, −π) = u(t, π), t > 0,
(2.6)
ux (t, −π) = ux (t, π), t > 0,
such that the profile of the temperature in my rod is continuously differentiable at every point.
To solve problem (2.4)–(2.6) I make, following the giants of 18th century, an ingenious assumption
that I can look for the solution in the form
u(t, x) = T (t)X(x),
i.e., as the product of two functions, the first one depends only on t and the second one depends only
on x (I will say more about this assumption later on in the course). Using this ansatz (“ansatz” in
mathematics is an educated guess) yields
where the prime denotes the derivative with respect to the corresponding variables. Rearranging
implies
T′(t)/(α²T(t)) = X″(x)/X(x),
i.e., the left hand side of this equality depends only on t, and the right hand side depends only on
x. What is an immediate consequence of this fact? Both sides must be equal to the same constant!
Indeed, fix some t and hence the left hand side is constant, which implies that the right hand side is
constant for any x, now go in the opposite direction. Hence, I can write
T′(t)/(α²T(t)) = X″(x)/X(x) = µ,
where µ (and this is a significant difference with the textbook) is some complex constant, µ ∈ C. The
consequence of the last two equalities is the following two differential equations
T ′ = α2 µT (2.7)
and
X ′′ = µX. (2.8)
Before proceeding further, let me check what my assumption on the structure of solutions of the
heat equation implies for the initial and boundary conditions. The initial condition does not give me
much insight at this point, but the boundary conditions now read

X(−π) = X(π),  X′(−π) = X′(π).   (2.9)
Problem (2.8), (2.9) is an ODE boundary value problem, and my task is to determine for which
complex constants µ this problem has a non-zero solution (zero solution exists for any µ but it is of
no interest to me assuming that g(x) ̸= 0), and how exactly this solution looks for the given µ.
I start with equation (2.8) and use a few facts from the theory of linear ODE. I do hope that the
students can solve this problem for real values of µ.
Exercise 2.3. Solve problem (2.8), assuming that µ ∈ R. Assume first that µ > 0, then µ < 0, µ = 0.
Technically the case when µ ∈ C was not covered in the introductory ODE course. Actually, the
difference with the real case is minuscule, but let me proceed carefully in this case (still omitting
some technical steps). I will look for the solution in the form of a power series with undetermined
coefficients:
X(x) = c0 + c1x + c2x² + … = Σ_{k=0}^{∞} ck x^k.
Differentiating twice yields

X″(x) = 2c2 + 3 · 2 c3 x + 4 · 3 c4 x² + 5 · 4 c5 x³ + … ,

and X″ must equal µX. Two power series are equal if and only if the coefficients at the same powers are
equal; that is, I have
2c2 = µc0 ,
3 · 2c3 = µc1 ,
4 · 3c4 = µc2 ,
5 · 4c5 = µc3 ,
...
from which

c_{2m} = µ^m c0 / (2m)!,   c_{2m+1} = µ^m c1 / (2m + 1)!,

where

m! = 1 · 2 · … · (m − 1) · m.
Therefore, if I know c0 and c1 my problem is solved. Recall that the space of solutions of equation
(2.8) is two dimensional, and also note that X(0) = c0 , X ′ (0) = c1 , therefore I can pick any two
linearly independent vectors (c0 , c1 ) to produce two solutions that will form a basis of my solution
space. I am going to pick first c0 = 1, c1 = √µ, and get that
X1(x) = Σ_{k=0}^{∞} (√µ x)^k / k!,
and then c0 = 1, c1 = −√µ to have the second solution
X2(x) = Σ_{k=0}^{∞} (−√µ x)^k / k!.
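To convince yourself that these series really solve (2.8) even for complex µ, here is a quick numerical check (Python; the value of µ and all names are my arbitrary choices):

```python
import cmath

mu = 2 + 3j                       # an arbitrary complex constant
sqrt_mu = cmath.sqrt(mu)

def X1(x, terms=80):
    """Partial sum of X1(x) = sum of (sqrt(mu) x)^k / k!  (c0 = 1, c1 = sqrt(mu))."""
    total, term = 0j, 1 + 0j
    for k in range(terms):
        total += term
        term *= sqrt_mu * x / (k + 1)
    return total

# centered second difference: X1'' should equal mu * X1
h, x = 1e-3, 0.7
X_xx = (X1(x + h) - 2.0 * X1(x) + X1(x - h)) / h**2
print(abs(X_xx - mu * X1(x)))  # ≈ 0
```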
Exercise 2.4. What happens if I pick c0 = 1, c1 = 0 and c0 = 0, c1 = 1, which may look like a more
natural choice? Hint: If you are completely lost at this point, it may help to read to the end of this
lecture and return to this exercise after.
Now an attentive reader should notice that these power series look similar to the ones that were
studied in Calculus under the name of Taylor series. In particular, these are exactly the series for the
exponential function; the only problem is that series with complex coefficients were not studied there. To set
aside this problem I simply (for those who will undertake the study of functions of complex variables in
the future, this word “simply” should be remembered at some point :) ) put forward the following
definition.
Definition 2.2. The exponential function, exp z or ez , of the complex variable z ∈ C is defined as
exp z = e^z := 1 + z + z²/2! + … = Σ_{k=0}^{∞} z^k / k!.
Remark 2.3. No discussion was supplied about the convergence of this power series. It can be proved
(not complicated, but it takes some time) that this series converges absolutely for any fixed z. For those
who want to understand the definitions literally, the undefined words “converges absolutely” should be
read as “makes sense.”
Remark 2.4. The definition of the exponential function together with binomial formula yield probably
the most important property of ez :
ez1 +z2 = ez1 ez2 .
I invite a curious student to prove it.
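Both the definition and this multiplicative property are easy to probe numerically (a Python sketch; the truncation length and sample points are my choices):

```python
import cmath

def exp_series(z, terms=60):
    """Partial sum of exp z = sum over k of z^k / k!  (Definition 2.2)."""
    total, term = 0j, 1 + 0j
    for k in range(terms):
        total += term
        term *= z / (k + 1)
    return total

z1, z2 = 0.3 + 1.2j, -0.7 + 0.4j
print(abs(exp_series(z1) - cmath.exp(z1)))                          # ≈ 0
print(abs(exp_series(z1 + z2) - exp_series(z1) * exp_series(z2)))  # ≈ 0
```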
Using my definition of the exponential function I can write my two linearly independent solutions
to (2.8) as X1(x) = e^{√µ x} and X2(x) = e^{−√µ x}, and the general solution as

X(x) = AX1(x) + BX2(x) = A e^{√µ x} + B e^{−√µ x},
where A and B are two (complex) arbitrary constants. To be precise this formula works only if µ ̸= 0,
but if µ = 0 then two integrations yield
X(x) = A + Bx.
The first boundary condition X(−π) = X(π) gives

A + Bπ = A − Bπ,
which implies B = 0, and the second boundary condition is satisfied automatically. Therefore I found
that for µ = 0 I always have a nontrivial solution to my boundary value problem, which is the constant

X0 = A.

For µ ≠ 0 it is convenient to introduce the hyperbolic sine and cosine of a complex variable.
Definition 2.5.
sinh z = (e^z − e^{−z})/2,   cosh z = (e^z + e^{−z})/2.
Using the boundary conditions (2.9) I find

A e^{√µ π} + B e^{−√µ π} = A e^{−√µ π} + B e^{√µ π},
√µ A e^{√µ π} − √µ B e^{−√µ π} = √µ A e^{−√µ π} − √µ B e^{√µ π},

or, after rearranging and using Definition 2.5,

(A − B) sinh(√µ π) = 0,   √µ (A + B) sinh(√µ π) = 0.
These equalities can be true only if A = B = 0, which gives the trivial zero solution, or if

sinh(√µ π) = 0.
Definition 2.6.
sin z = (e^{iz} − e^{−iz})/(2i),   cos z = (e^{iz} + e^{−iz})/2,
where i is the imaginary unit, i2 = −1.
Exercise 2.5. Prove that the sine and cosine of a complex variable defined in this way coincide
with our familiar trigonometric functions when z is a real variable.
Note that the last definition immediately implies Euler's formula

e^{iz} = cos z + i sin z.
Writing √µ = a + ib with real a and b, I have sinh(√µ π) = sinh(aπ) cos(bπ) + i cosh(aπ) sin(bπ).
Two complex numbers are equal if and only if their real and imaginary parts are equal, hence

sinh(aπ) cos(bπ) = 0,   cosh(aπ) sin(bπ) = 0,

where now all the constants are real and we can use our knowledge about the trigonometric functions.
I immediately conclude that the last two equalities can be true if and only if a = 0 and b = k for an
integer k, i.e., √µ = ik. In words, my boundary value problem has a nonzero solution only if the values
of the constant in (2.8) are µ = −k², where k = 1, 2, . . .. The solutions, and I am from now on going
to use the index k to emphasize the dependence on k, are
where still Ak , Bk are some arbitrary complex constants. Since my solution describes the temperature
I am mostly interested in a real valued solution. A complex expression M is real if and only if M = M ,
where the bar means complex conjugate. To have all Xk (x) real I must have hence that Ak = B k .
Let Ak = ak /2 + ibk /2 for some real ak , bk . Then my real solutions are
Xk (t) = (ak /2 + ibk /2)eikx + (ak /2 − ibk /2)e−ikx = ak cos kx + bk sin kx,
where the constants are real. Note that the case µ = 0 is also included in these formulas.
Now I can consider solutions to (2.7):

    Tk(t) = Ck e^{−k²α²t},
and hence each function

    uk(t, x) = e^{−α²k²t}(ak cos kx + bk sin kx)

solves the heat equation (2.4) and satisfies the boundary conditions (2.6). What about the initial condition? Here I will rely again, as before in Duhamel's principle, on the linearity of the equation, and in particular on the principle of superposition: if u1, u2 solve (2.4) then any linear combination of these functions also solves it. Hence I can write that the infinite linear combination

    u(t, x) = ∑_{k=0}^∞ uk(t, x) = a0 + ∑_{k=1}^∞ e^{−α²k²t}(ak cos kx + bk sin kx)
solves the heat equation. I use the initial condition to find (I replace for some notational reasons a0 with a0*/2)

    g(x) = a0*/2 + ∑_{k=1}^∞ (ak cos kx + bk sin kx).
It was Jean-Baptiste Joseph Fourier, a French scholar who served as one of the scientists in the French army during Napoleon's Egyptian expedition at the beginning of the 19th century, who in 1822 declared, to the great surprise and disbelief of the mathematical community, that any function can be represented as a linear combination of trigonometric functions. In other words, he claimed that, given g, he can always find ak, bk such that the corresponding series converges (“makes sense”). Our next task is to actually figure out how to determine ak, bk and also to contemplate a little the question whether the resulting series provides us with a legitimate classical solution to the heat equation.
• In what sense must the equality sign in (2.10) be understood? Note that the right hand side is an infinite series and hence a discussion of convergence is relevant.
• Is it possible to replace {1, cos kx, sin kx} with, say, only {cos kx}? or only {sin kx}?
• These series are important for us because they represent solutions to second order PDEs, and hence, in order to be classical solutions, they should be differentiable. What are the conditions on (2.10) such that I can take a derivative of the right hand side and obtain a Fourier series for g′?
of f, g ∈ L²[−π, π] is

    ⟨f, g⟩ = ∫_{−π}^{π} f(x)g(x) dx.

The inner product has the properties

    ⟨f, f⟩ ≥ 0, and ⟨f, f⟩ = 0 ⇔ f = 0,
    ⟨af, g⟩ = a⟨f, g⟩, a ∈ R,
    ⟨f, g⟩ = ⟨g, f⟩,
    ⟨f + g, h⟩ = ⟨f, h⟩ + ⟨g, h⟩.

Two functions f and g are called orthogonal if

    ⟨f, g⟩ = 0.
Lemma 2.9. The trigonometric system of functions is orthogonal, meaning that each function in this
set belongs to L2 [−π, π] and any pair of them is orthogonal.
Proof. First,

    ∫_{−π}^{π} 1² dx = 2π < ∞,   ∫_{−π}^{π} cos² kx dx = π < ∞,   ∫_{−π}^{π} sin² kx dx = π < ∞,
which proves that all my functions are in L². Second, to show orthogonality, I need to calculate

    ⟨1, cos kx⟩ = ∫_{−π}^{π} 1 · cos kx dx = (sin kx)/k |_{x=−π}^{x=π} = 0,
    ⟨1, sin kx⟩ = ∫_{−π}^{π} 1 · sin kx dx = −(cos kx)/k |_{x=−π}^{x=π} = 0,
    ⟨cos kx, sin mx⟩ = ∫_{−π}^{π} cos kx sin mx dx = 0,
    ⟨cos kx, cos mx⟩ = ∫_{−π}^{π} cos kx cos mx dx = 0 if k ≠ m, π if k = m,
    ⟨sin kx, sin mx⟩ = ∫_{−π}^{π} sin kx sin mx dx = 0 if k ≠ m, π if k = m,

where k, m = 1, 2, 3, . . ..
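The integrals in this proof can also be corroborated numerically. A small sketch (the crude midpoint rule and the cutoff k, m ≤ 3 are my choices):

```python
import math

def quad(f, a, b, n=4000):
    # composite midpoint rule, accurate enough for these smooth integrands
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def inner(f, g):
    return quad(lambda x: f(x) * g(x), -math.pi, math.pi)

# <cos kx, cos mx> is pi when k = m and ~0 otherwise; mixed products vanish
for k in range(1, 4):
    for m in range(1, 4):
        print(k, m, round(inner(lambda x: math.cos(k * x), lambda x: math.cos(m * x)), 6))
print(inner(lambda x: 1.0, lambda x: math.sin(3 * x)))   # ~ 0
```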
I use the properties of the inner product (see Exercise 2.7) to get

    ⟨g, 1⟩ = (a0/2)⟨1, 1⟩ + ∑_{k=1}^∞ ( ak⟨cos kx, 1⟩ + bk⟨sin kx, 1⟩ ).

Now notice that due to the orthogonality all the terms except ⟨1, 1⟩ are zero:

    ⟨g, 1⟩ = (a0/2) · 2π  ⟹  a0 = ⟨g, 1⟩/π = (1/π) ∫_{−π}^{π} g(x) dx.
Similarly, taking the inner product with cos mx and sin mx respectively, I find (switching again m with k)

    ak = ⟨g, cos kx⟩/π = (1/π) ∫_{−π}^{π} g(x) cos kx dx,   k = 0, 1, 2, . . . ,
    bk = ⟨g, sin kx⟩/π = (1/π) ∫_{−π}^{π} g(x) sin kx dx,   k = 1, 2, . . . ,      (2.11)
Note that I also included the case a0 in my formulas, and this was the reason to have a0/2 in (2.10). Equations (2.11) give me the Fourier coefficients of the trigonometric Fourier series (2.10).
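As a sanity check, the formulas (2.11) can be evaluated by numerical quadrature and the partial sums compared with g. A sketch in Python, with the sample function g(x) = x² as my arbitrary choice (for it, the known coefficients are a0 = 2π²/3, ak = 4(−1)^k/k², bk = 0):

```python
import math

def quad(f, a, b, n=4000):
    # composite midpoint rule
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def fourier_coeffs(g, kmax):
    """Coefficients (2.11) computed by numerical quadrature."""
    a = [quad(lambda x: g(x) * math.cos(k * x), -math.pi, math.pi) / math.pi
         for k in range(kmax + 1)]
    b = [quad(lambda x: g(x) * math.sin(k * x), -math.pi, math.pi) / math.pi
         for k in range(kmax + 1)]
    return a, b

def partial_sum(a, b, x):
    return a[0] / 2 + sum(a[k] * math.cos(k * x) + b[k] * math.sin(k * x)
                          for k in range(1, len(a)))

g = lambda x: x * x
a, b = fourier_coeffs(g, 50)
print(a[1], a[2])              # close to -4 and 1
print(partial_sum(a, b, 1.0))  # close to g(1) = 1
```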
Often I will need to consider the set of functions L²[−l, l], where l is some constant. I can simply introduce a new variable y such that

    x = πy/l,

so that I am rescaling my variable, and when y = ±l then x = ±π. Using this change of variable, and replacing again y with x, I immediately conclude that
Lemma 2.11. The system of functions {1, cos(πkx/l), sin(πkx/l)} is orthogonal with respect to the inner product

    ⟨f, g⟩ = ∫_{−l}^{l} f(x)g(x) dx.
Note that Lemma 2.11 answers the first two questions that I posed.
Figure 2.3: Comparison of the graphs of the original function g (light gray) and the partial sums Sk
(dark grey) of the corresponding Fourier series
    Sk(x) = a0/2 + ∑_{m=1}^{k} (am cos πmx + bm sin πmx),
and compare them with the graph of the initial function. My expectation is that the bigger the value of k I take, the closer the graph of Sk should be to the graph of g. Indeed, my expectation is correct, see Fig. 2.3. Moreover, for bigger k the approximation becomes better and better (Fig. 2.4).
Figure 2.4: Comparison of the graphs of the original function g (light gray) and the partial sums Sk
(dark grey) of the corresponding Fourier series
Before jumping to conclusions let me answer another question: What would happen if I look at
the values of x outside of the interval [−1, 1]? Both my function and the partial sums of Fourier series
are obviously defined for them. Let me take, say, x ∈ [−2, 2] (see Fig. 2.5, left panel).
As should be expected, the partial sums of the Fourier series are periodic functions (in my case with period 2), while the original function is not periodic. However, I can always take a periodic extension of my function g, such that the new function is also periodic with period 2. Instead of writing a careful (and useless) definition, just look at Fig. 2.5, right panel, to see what a periodic extension is.
Figure 2.5: Left: Comparison of the graphs of the original function g (light gray) and the partial sum
S10 (dark grey) on the interval [−2, 2]. Right: Comparison of the graphs of the periodic extension of
the function g (light gray) and the partial sum S10 (dark grey) on the interval [−5, 5]
Finally, my function g, as well as its periodic extension, is not continuous, whereas the partial sums, as sums of continuous functions, are continuous. A trace of this remains even in the limit k → ∞: if the original function has a jump discontinuity, then the corresponding Fourier series converges exactly to the middle of the interval of discontinuity.
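This midpoint behavior is easy to observe numerically. As a standard illustration (not an example from these notes), the sign function on [−π, π] has the sine series (4/π)∑ sin kx/k over odd k; at the jump x = 0 every partial sum equals the midpoint value 0, while away from the jump the series converges to the function value:

```python
import math

# sign(x) on [-pi, pi] has the sine Fourier series (4/pi) * sum of sin(kx)/k over odd k
def S(n, x):
    return (4 / math.pi) * sum(math.sin(k * x) / k for k in range(1, n + 1, 2))

print(S(199, 0.0))           # exactly 0: the midpoint of the jump from -1 to 1
print(S(199, math.pi / 2))   # close to the function value 1
```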
Let me put together all the information we gained in the previous example. I start with a definition
of a piecewise smooth function.
Definition 2.13. A function g : [a, b] −→ R is said to be piecewise smooth of the class C^(k) if it belongs to the class C^(k) at every point of the interval [a, b] except, possibly, a finite number of points of discontinuity x1 < x2 < . . . < xp. Moreover, the right and left limits at all these points exist and are finite (i.e., all the discontinuities are of the jump type):

    lim_{x→xj+0} g(x) = g(xj + 0),   lim_{x→xj−0} g(x) = g(xj − 0),   j = 1, . . . , p.
2.3.3 Sine and cosine series. Odd and even extensions
We have had only one example of calculating Fourier coefficients, but it should already be clear that this is the point that sometimes requires tedious calculations. It is always nice when we can see the answer immediately, without calculating integrals. Here is one such trick.
Recall that f is odd if f (−x) = −f (x) for all x, geometrically the graph of an odd function is
symmetric with respect to the origin; and f is even if f (−x) = f (x) for all x, the graph of the even
function is symmetric with respect to the y-axis. An example of an odd function is sine, and of an
even function is cosine.
Exercise 2.8. Show that the product of two even or two odd functions is even, and the product of
even and odd functions is odd.
Now we can use this symmetry to calculate the integrals. Note that if f is odd, then its integral over a symmetric interval is zero; and if f is even, then its integral over the interval [−l, l] is double the integral from 0 to l.
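A two-line numerical illustration of these two facts (the functions x³ and x² and the interval length are my arbitrary choices):

```python
import math

def quad(f, a, b, n=4000):
    # composite midpoint rule
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

l = 2.0
odd = lambda x: x ** 3    # odd: the integral over [-l, l] vanishes
even = lambda x: x ** 2   # even: the integral over [-l, l] doubles the one over [0, l]

print(quad(odd, -l, l))                          # ~ 0
print(quad(even, -l, l), 2 * quad(even, 0, l))   # both ~ 16/3
```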
In other words, the trigonometric Fourier series of an odd function contains only sines, and the trigonometric Fourier series of an even function contains only the constant and cosines. This is great news for many calculations; however, there is a much deeper consequence of the last lemma. Note first that all the integrals are calculated from 0 to l.
Now assume that I need to find a Fourier series of a function defined on [0, l]. I can, of course, shift this interval by l/2 and use the formulas for the full Fourier series. I can also extend my function evenly (recall the discussion of the solution of the wave equation on a half-infinite string!), and in this case my Fourier series will consist only of cosines and a constant. Or I can use an odd extension, and in this case my Fourier series will contain only sine terms.
Dirichlet's theorem, together with the discussion above, thus implies that for any piecewise continuous function g : [a, b] −→ R I can almost equally easily find either the full trigonometric Fourier series, the sine Fourier series, or the cosine Fourier series, and the convergence of these series follows from Dirichlet's theorem.
g(x) = x, 0 ≤ x ≤ 1.
Let me start with the sine series. The odd extension is (this is an illustration, I never use these
extensions in my calculations!)
godd (x) = x, −1 ≤ x ≤ 1.
Since this is an odd function, ak = 0 for every k. Using the formulas above,

    bk = 2 ∫_0^1 x sin πkx dx = 2(−1)^{k+1}/(kπ),
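The coefficients bk = 2∫_0^1 x sin πkx dx can be double-checked against the closed form 2(−1)^{k+1}/(kπ) by numerical quadrature; a quick sketch:

```python
import math

def quad(f, a, b, n=4000):
    # composite midpoint rule
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# numerical b_k = 2 * integral_0^1 x sin(pi k x) dx versus 2(-1)^(k+1)/(k pi)
for k in range(1, 6):
    bk = 2 * quad(lambda x: x * math.sin(math.pi * k * x), 0, 1)
    print(k, bk, 2 * (-1) ** (k + 1) / (k * math.pi))
```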
Recall that on the whole real line my Fourier series converges to the corresponding periodic exten-
sion (see Fig. 2.7)
Figure 2.7: Sine Fourier series for g(x) = x on [0, 1]. Left: Periodic extension. Right: Just the interval
[0, 1]
I find

    bk = 0,   a0 = 1,   ak = 2((−1)^k − 1)/(π²k²),   k = 1, 2, . . .
The results are shown in Figs. 2.8 and 2.9.
Figure 2.9: Cosine Fourier series for g(x) = x on [0, 1]. Left: Periodic extension. Right: Just the
interval [0, 1]
One important thing to notice from this example: the periodic extension of the odd extension is discontinuous, and my coefficients have the form C/k for some constant C. The periodic extension of the even extension is continuous (see the figures) but not continuously differentiable (it has corners), and my coefficients are of the form C/k² for some (other) constant C. Of course the latter series converges faster, which can also be seen directly from the figures. This is a general phenomenon: the smoother the periodic extension of my function, the faster my Fourier series converges.
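This decay can be observed directly: for g(x) = x on [0, 1] the products k·bk and k²·ak stay bounded as k grows. A sketch (the odd values of k are chosen so that both coefficients are nonzero):

```python
import math

def quad(f, a, b, n=20000):
    # composite midpoint rule; a fine grid, since the integrands oscillate fast
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

g = lambda x: x
for k in (1, 11, 51):   # odd k: both coefficients are nonzero
    bk = 2 * quad(lambda x: g(x) * math.sin(math.pi * k * x), 0, 1)   # sine series
    ak = 2 * quad(lambda x: g(x) * math.cos(math.pi * k * x), 0, 1)   # cosine series
    print(k, k * bk, k * k * ak)   # k*b_k and k^2*a_k stay of order one
```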
Exercise 2.9. Can you find a cosine Fourier series of function sin on x ∈ [0, π]?
This subsection gives an answer to the fourth question.
2.3.4 Differentiating Fourier series
Finally, for future use I need some information on when we can differentiate Fourier series, so that we can talk about classical solutions of PDE problems. Without going into any technical details, note that if I differentiate the terms of a Fourier series, I will get a factor k in front of my series. Hence, to be able to differentiate, I need my Fourier coefficients to be of the form C/k^α for a sufficiently large α. I refer the curious student to the textbook and the references therein, while formulating a very vague answer to the final question from my list:
If the function g is sufficiently nice (sufficiently smooth), and its periodic extension is also
sufficiently smooth then I can differentiate my Fourier series a sufficient amount of times
(and I will be more specific when it comes to the actual equations).
or
{1, cos kx},
or
{sin kx}.
Moreover, I found the formulas that can be used to calculate the coefficients of Fourier series. The
key property that allowed me to do this is the orthogonality of the first system of functions on [−π, π]
and orthogonality of the second and third systems on [0, π].
In general, and quite abstractly, I can consider an arbitrary system of functions {gj(x)}, introduce an inner product ⟨gj, gk⟩ that satisfies some natural properties, and define orthogonality. Let me assume that {gj(x)} is an orthogonal system of functions. Then I can try to represent an arbitrary function as a generalized Fourier series.
is orthogonal on [−π, π]. Find the expression for the coefficients ck in the complex Fourier series

    g(x) = ∑_{k=−∞}^{∞} ck e^{ikx}.

Can you see how the ck are connected with ak, bk from the trigonometric Fourier series?
u(t, x) = T (t)X(x).
and the boundary conditions imply

    A + B = 0,   A e^{√−λ} + B e^{−√−λ} = 0,

or

    B(e^{2√−λ} − 1) = 0,

which implies that A = B = 0, since e^{2√−λ} ≠ 1 for any real negative λ. (It may be nicer to start working from the general solution X(x) = A sinh √−λ x + B cosh √−λ x; I will leave it as an exercise.) Therefore again there are no nontrivial solutions.
Finally, assuming that λ > 0, I get

    X(x) = A cos √λ x + B sin √λ x,

and the boundary conditions force A = 0 and λk = π²k², k = 1, 2, . . ., with the eigenfunctions Xk(x) = sin πkx. For the time component I find

    Tk(t) = Ck e^{−α²π²k²t},

and hence the functions

    uk(t, x) = bk e^{−α²π²k²t} sin πkx,   k = 1, 2, . . .
solve equation (2.13) and satisfy the boundary conditions (2.14). All I need is to satisfy the initial condition (2.15). For this I will use the superposition principle, which says that if the uk solve (2.13) then any linear combination is also a solution, i.e.,

    u(t, x) = ∑_{k=1}^∞ uk(t, x) = ∑_{k=1}^∞ bk e^{−α²π²k²t} sin πkx
Figure 2.10: Solution to the heat equation with homogeneous Dirichlet boundary conditions and the initial condition (bold curve) g(x) = x − x². Left: three dimensional plot; right: contour plot
which is exactly the sine series for my function g, the coefficients of which can be found as

    bk = 2 ∫_0^1 g(x) sin πkx dx,   k = 1, 2, . . . .
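Here is a minimal numerical sketch of the whole recipe, assuming α = 1 and the initial condition g(x) = x − x² from the figure:

```python
import math

def quad(f, a, b, n=10000):
    # composite midpoint rule
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

alpha = 1.0                    # assumed diffusivity
g = lambda x: x - x * x        # the initial condition of Fig. 2.10
N = 30
bs = [2 * quad(lambda x: g(x) * math.sin(math.pi * k * x), 0, 1) for k in range(1, N + 1)]

def u(t, x):
    # partial sum of the series solution with homogeneous Dirichlet conditions
    return sum(bs[k - 1] * math.exp(-(alpha * math.pi * k) ** 2 * t) * math.sin(math.pi * k * x)
               for k in range(1, N + 1))

print(u(0.0, 0.5))   # close to g(0.5) = 0.25
print(u(0.1, 0.5))   # much smaller already: the rod cools down
```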
And the approximation becomes better and better as t grows. The difference u1(t, x) − ∑_{k=1}^{10} uk(t, x) for my example is shown in Fig. 2.11.
Second, and more importantly, I note that the same negative exponents in the representation of the solution by the sine Fourier series guarantee that any derivative of the Fourier series will also converge (it does require some proof). This is an important characterization of the solutions to the heat equation: the solution, irrespective of the initial condition, is an infinitely differentiable function with respect to x for any t > 0.
Figure 2.11: The difference u1(t, x) − ∑_{k=1}^{10} uk(t, x) in the example
You can see the smoothing effect of the heat equation on the discontinuous initial condition (see Fig. 2.12).

Figure 2.12: Solution to the heat equation with a discontinuous initial condition. For any t > 0 the solution is an infinitely differentiable function with respect to x
I can also note that if we would like to reverse time and look into the past rather than the future, then all the exponents would have a plus sign, which means that in general the Fourier series will diverge for any t < 0. This is actually a manifestation of the fact that the inverse problem for the heat equation is not well posed: the heat equation represents a meaningful mathematical model only for t > 0, and its solutions are not reversible. (As a side remark I note that ill-posed problems are very important and there are special methods to attack them, including solving the heat equation for t < 0; note that this is equivalent to solving for t > 0 an equation of the form ut = −α²uxx.)
Example 2.18. Consider now the Neumann boundary value problem for the heat equation (recall
that homogeneous boundary conditions mean insulated ends, no energy flux):
Now, after introducing u(t, x) = T (t)X(x) I end up with the boundary value problem for X in the
form
X ′′ + λX = 0, X ′ (0) = X ′ (1) = 0.
I will leave it as an exercise to show that if λ < 0 then I do not have non-trivial solutions. If, however,
λ = 0, I have that
X(x) = A + Bx,
and the boundary conditions imply that B = 0, leaving A as a free constant. Hence I conclude that for λ = 0 my solution is X(x) = A. If λ > 0 then

    X(x) = A cos √λ x + B sin √λ x,

and the boundary conditions force B = 0 and

    λk = π²k²,   k = 0, 1, 2, . . . ,

with the eigenfunctions Xk(x) = cos πkx. The corresponding time components are

    Tk(t) = Ck e^{−α²k²π²t},   k = 0, 1, . . . ,

and therefore, due to the same superposition principle, I can represent my solution as

    u(t, x) = ∑_{k=0}^∞ Tk(t)Xk(x) = a0/2 + ∑_{k=1}^∞ ak e^{−α²k²π²t} cos πkx.
Setting t = 0, I obtain a cosine Fourier series for g, where

    ak = 2 ∫_0^1 g(x) cos πkx dx.

Note that as t → ∞,

    u(t, x) → a0/2 = ∫_0^1 g(x) dx,

which is a mathematical description of the fact that the energy inside my rod must be conserved. The solution that I found is also, as in the Dirichlet case, infinitely differentiable with respect to x at any t > 0, and the problem is ill posed for t < 0.
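The conservation of energy can be watched numerically: the integral of the (partial-sum) solution over the rod does not change in time. A sketch, with g(x) = x as my arbitrary initial condition:

```python
import math

def quad(f, a, b, n=10000):
    # composite midpoint rule
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

alpha = 1.0
g = lambda x: x          # arbitrary initial temperature
N = 30
a0 = 2 * quad(g, 0, 1)
ak = [2 * quad(lambda x: g(x) * math.cos(math.pi * k * x), 0, 1) for k in range(1, N + 1)]

def u(t, x):
    return a0 / 2 + sum(ak[k - 1] * math.exp(-(alpha * math.pi * k) ** 2 * t)
                        * math.cos(math.pi * k * x) for k in range(1, N + 1))

# the total energy integral_0^1 u(t, x) dx stays at integral_0^1 g = 1/2 for all t
for t in (0.0, 0.05, 1.0):
    print(t, quad(lambda x: u(t, x), 0, 1))
```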
Example 2.19. Recall the problem for the heat equation with periodic boundary conditions:
where ak, bk are the coefficients of the trigonometric Fourier series for g on −π ≤ x ≤ π (the exact expressions are given in the previous lecture). Again, since the rod is insulated, we find that as t → ∞

    u(t, x) → (1/(2π)) ∫_{−π}^{π} g(x) dx.
Example 2.20. Now let me consider a problem for the heat equation with Robin or Type III boundary
condition on one end. I need to solve
Here I will assume that my constant h is positive. Again, using the usual method of separation of variables, I end up with

    T′ = −α²λT,

and

    X″ + λX = 0,   X(0) = 0,   X′(1) + hX(1) = 0.
First I consider the latter problem. I will look only into real values of the constant λ. λ = 0 implies that X(x) = 0 and hence is of no interest to me. If λ < 0 then I get the system

    0 = A + B,
    0 = A(h − √−λ)e^{−√−λ} + B(h + √−λ)e^{√−λ}.

This is a system of linear homogeneous equations with respect to A and B, and it has a nontrivial solution if and only if the corresponding determinant of the system vanishes, which is equivalent, after some simplification, to

    e^{2√−λ} = (h − √−λ)/(h + √−λ).
√
Note that the left hand side as the function of −λ has a positive derivative, and the left hand side has
a negative derivative, moreover they cross at the point λ = 0. Therefore, for −λ > 0 it is impossible
to have solutions to this equations, which rules out the case λ < 0.
Finally, if λ > 0, then I get

    X(x) = A cos √λ x + B sin √λ x,

and the boundary conditions give A = 0 together with the equation

    tan √λ = −√λ/h,

whose solutions determine the eigenvalues λk (see Fig. 2.13).
Figure 2.13: Solutions to the equation tan µ = −µ/h
The solution is then sought in the form

    u(t, x) = ∑_{k=1}^∞ bk e^{−α²λk t} sin µk x,

where µk = √λk. Setting t = 0, I need g(x) = ∑_{k=1}^∞ bk sin µk x. This looks like a Fourier sine series, but it is not, because in the classical Fourier sine series I need µk = πk, which is not true for my example, and hence I cannot use the formulas for the coefficients. Luckily, however, it turns out that the system of functions {sin µk x} is orthogonal on [0, 1] (I leave checking this fact as an exercise), and following the same steps that were done when I derived the coefficients for the trigonometric Fourier series, I can find that

    bk = (2µk/(µk − sin µk cos µk)) ∫_0^1 g(x) sin µk x dx.
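The eigenvalues here have to be found numerically. A sketch using bisection on the brackets ((k − 1/2)π, kπ) that contain the roots of tan µ = −µ/h (the value h = 1 is my arbitrary choice):

```python
import math

h = 1.0   # the constant in the Robin condition X'(1) + h X(1) = 0 (assumed value)

def F(mu):
    # F(mu) = 0 is equivalent to tan(mu) = -mu/h
    return h * math.sin(mu) + mu * math.cos(mu)

def mu_k(k, iters=100):
    """Bisection on ((k - 1/2) pi, k pi), where F changes sign."""
    lo, hi = (k - 0.5) * math.pi, k * math.pi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if F(lo) * F(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

mus = [mu_k(k) for k in (1, 2, 3)]
print(mus)   # note that mu_k is NOT equal to pi k

def quad(f, n=4000):
    # midpoint rule on [0, 1]
    return sum(f((i + 0.5) / n) for i in range(n)) / n

# the eigenfunctions sin(mu_k x) are orthogonal on [0, 1]
print(quad(lambda x: math.sin(mus[0] * x) * math.sin(mus[1] * x)))   # ~ 0
```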
Now my problem is fully solved, because for any piecewise continuous g my time dependent Fourier series is an infinitely differentiable function.
As a specific example let me take
g(x) = x.
Then the solution has the form as in Fig. 2.14.
Figure 2.14: Solutions to the heat equation with Robin boundary condition and the initial condition
g(x) = x
Not surprisingly, the solution approaches the trivial steady state, since the problem can be interpreted as the spread of heat in an insulated rod with a fixed zero temperature at the left end and the temperature of the surrounding medium around the right end set to zero. Eventually the temperature evens out.
To emphasize that what I found is not a usual Fourier sine series, I will plot my infinite series for t = 0 on the symmetric interval (−5, 5) along with a periodic extension of the function x on (−1, 1):

Figure 2.15: The periodic extension (black) of g(x) = x on (−1, 1) along with its Fourier series (blue) on the interval (−5, 5)
In all the examples above the boundary conditions were homogeneous. What should we do if we are given non-homogeneous boundary conditions? The method of separation of variables will not work in this case. Sometimes, however, we can reduce a problem with non-homogeneous boundary conditions to a problem with homogeneous ones. Consider for example the Dirichlet problem for the heat equation with
u(t, 0) = k1 , u(t, l) = k2 .
Here k1 , k2 are two given constants. It should be clear (if not, carefully do all the math) that the
equilibrium temperature is given by
    ueq(x) = k1 + (x/l)(k2 − k1).
Now consider
u(t, x) = ueq (x) + v(t, x).
For the function v I will get (check!) the problem
vt = α2 vxx ,
v(t, 0) = v(t, l) = 0,
v(0, x) = g(x) − ueq (x),
where g is the initial condition from the original problem for u. Now I have homogeneous boundary conditions and can use the Fourier method. Such an approach will usually work when the boundary conditions do not depend on t; otherwise we will end up with a nonhomogeneous heat equation, which can still be solved using the separation of variables technique, but the solution process is slightly more involved.
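A small numerical sketch of this reduction, with assumed sample data k1 = 2, k2 = 5 and an initial condition chosen so that g(0) = k1, g(l) = k2 (none of these values come from the notes):

```python
import math

def quad(f, a, b, n=4000):
    # composite midpoint rule
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# assumed sample data
l, k1, k2, alpha = 1.0, 2.0, 5.0, 1.0
g = lambda x: 2.0 + 3.0 * x + math.sin(math.pi * x)   # g(0) = k1, g(l) = k2

ueq = lambda x: k1 + (x / l) * (k2 - k1)              # equilibrium temperature
v0 = lambda x: g(x) - ueq(x)                          # initial data for v: v0(0) = v0(l) = 0

N = 20
bs = [2 * quad(lambda x: v0(x) * math.sin(math.pi * k * x), 0, 1) for k in range(1, N + 1)]

def u(t, x):
    v = sum(bs[k - 1] * math.exp(-(alpha * math.pi * k) ** 2 * t) * math.sin(math.pi * k * x)
            for k in range(1, N + 1))
    return ueq(x) + v

print(u(0.3, 0.0), u(0.3, l))   # the boundary values k1 = 2 and k2 = 5 for every t
```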
2.4.1 Conclusion
We have had four examples that share a lot of common features. In particular, all of them involve solving

    X″ + λX = 0

with some boundary conditions. By “solving” I mean identifying the values of the parameter λ for which my problem has a nontrivial solution. In all four cases I find an infinite sequence of such λ's, all of which are real. Even more importantly, in all four cases the corresponding solutions form an orthogonal system of functions, and hence the Fourier series technique can be applied to represent the solution to my original PDE problem in the form of a generalized Fourier series.
Is it a coincidence? Will it be the same for different boundary conditions? I will answer these
questions in the next lecture.
where p, q are continuous on [a, b] and p(x) > 0 for every x ∈ [a, b]. Note that all four problems from the previous lecture can be written as

    Lu = λu,      (2.17)

with p(x) = 1 and q(x) = 0 and some additional boundary conditions. Most of these boundary conditions can be written in the general form
Quite naturally (recall the definition of eigenvalues and eigenvectors for, e.g., matrices), problem (2.17)–(2.18) is called the Sturm–Liouville eigenvalue problem. Sometimes instead of (2.18) periodic boundary conditions are used:
To study the properties of the eigenvalues of the Sturm–Liouville problem, I start with a derivation of Lagrange's identity. Let u, v be two arbitrary C^(2)[a, b] functions; then (check the skipped steps)

    uLv − vLu = −u(pv′)′ + quv + v(pu′)′ − quv = v(pu′)′ − u(pv′)′ = (p(vu′ − uv′))′.
From Lagrange's identity, by integrating from a to b, I get Green's formula

    ∫_a^b (uLv − vLu) dx = p(vu′ − uv′) |_a^b .
Both Lagrange's identity and Green's formula hold for any u and v. I claim that if u and v are such that they satisfy (2.18) or (2.19) then Green's formula becomes

    ∫_a^b (uLv − vLu) dx = 0.
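Green's formula is easy to test numerically for a concrete operator. A sketch with p(x) = 1 + x, q(x) = x and the sample functions u = sin x, v = x² on [0, 1] (all of these choices are my own, not from the notes):

```python
import math

def quad(f, a, b, n=4000):
    # composite midpoint rule
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

p, dp, q = (lambda x: 1.0 + x), (lambda x: 1.0), (lambda x: x)

# sample functions with their derivatives written out by hand
u, du, d2u = math.sin, math.cos, lambda x: -math.sin(x)
v, dv, d2v = (lambda x: x * x), (lambda x: 2.0 * x), (lambda x: 2.0)

def L(f, df, d2f, x):
    # Lf = -(p f')' + q f = -p' f' - p f'' + q f
    return -dp(x) * df(x) - p(x) * d2f(x) + q(x) * f(x)

lhs = quad(lambda x: u(x) * L(v, dv, d2v, x) - v(x) * L(u, du, d2u, x), 0, 1)
bdry = lambda x: p(x) * (v(x) * du(x) - u(x) * dv(x))
print(lhs, bdry(1.0) - bdry(0.0))   # the two numbers agree
```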
On the other hand, since L is self-adjoint, ⟨Lu, u⟩ = ⟨u, Lu⟩, which gives λ⟨u, u⟩ = λ̄⟨u, u⟩. Therefore

    (λ − λ̄)⟨u, u⟩ = (λ − λ̄) ∫_a^b |u|² dx = 0,

and since ∫_a^b |u|² dx > 0, it follows that

    λ = λ̄,

which means that λ is real.
Lemma 2.22. Let L be a self-adjoint operator. Then if λ1 and λ2 are two different eigenvalues and
u1 , u2 are two corresponding eigenfunctions then u1 and u2 are orthogonal.
Proof. I have

    ⟨Lu1, u2⟩ = λ1⟨u1, u2⟩   and   ⟨Lu1, u2⟩ = ⟨u1, Lu2⟩ = λ2⟨u1, u2⟩,

or

    (λ1 − λ2)⟨u1, u2⟩ = 0  ⟹  ⟨u1, u2⟩ = 0.
The two previous lemmas are very nice; however, they are true under the assumption that my operator has any eigenvalues and eigenfunctions at all. A more impressive theorem, whose proof is significantly more involved and hence omitted here, is as follows.
Theorem 2.23. Consider the Sturm–Liouville eigenvalue problem, i.e., (2.17) plus (2.18) or (2.19).
Then there exists a countable sequence of eigenvalues
λ1 ≤ λ2 ≤ λ3 ≤ . . . ,
To conclude this section let me collect together all the results for the Sturm–Liouville operator
Lu = −u′′ that we got so far (in the previous lecture and in homework problems).
Boundary conditions              | Eigenvalues                           | Eigenfunctions
u(0) = u(1) = 0                  | λk = π²k², k = 1, 2, . . .            | uk(x) = B sin πkx
u(0) = u(L) = 0                  | λk = π²k²/L², k = 1, 2, . . .         | uk(x) = B sin(πkx/L)
u′(0) = u′(1) = 0                | λk = π²k², k = 0, 1, 2, . . .         | uk(x) = A cos πkx
u(−π) = u(π), u′(−π) = u′(π)     | λk = k², k = 0, 1, 2, . . .           | uk(x) = A cos kx, vk(x) = B sin kx
u(0) = 0, u′(1) + hu(1) = 0      | solutions to tan √λ = −√λ/h           | uk(x) = B sin √λk x
u(0) = u′(1) = 0                 | λk = (π(k − 1/2))², k = 1, 2, . . .   | uk(x) = B sin π(k − 1/2)x
u′(0) = u(1) = 0                 | λk = (π(k − 1/2))², k = 1, 2, . . .   | uk(x) = B cos π(k − 1/2)x
    u(t, 0) = 0,   u(t, l) = 0,   t > 0,      (2.22)
which physically means that I am studying the oscillations of a string of length l with fixed ends.
Using the same ansatz
u(t, x) = T (t)X(x)
I find that

    T″/(c²T) = X″/X = −λ,

and hence I have two ordinary differential equations:

    T″ + c²λT = 0      (2.23)

for T, and

    X″ + λX = 0      (2.24)

for X. Using the boundary conditions (2.22) I conclude that equation (2.24) must be supplemented with the boundary conditions

    X(0) = X(l) = 0.      (2.25)
Problem (2.24)–(2.25) is a Sturm–Liouville eigenvalue problem, which we have already solved several times. In particular, we know that there is an infinite sequence of eigenvalues

    λk = k²π²/l²,   k = 1, 2, . . . ,

and the corresponding eigenfunctions

    Xk(x) = Ck sin √λk x = Ck sin(πkx/l),   k = 1, 2, . . . ;

moreover, all the eigenfunctions are orthogonal on [0, l]. Here the Ck are some arbitrary real constants.
Since I know which λ's I can use, I can now look at the solutions to (2.23). Since all my λk > 0, I have the general solution

    Tk(t) = Ak cos(c√λk t) + Bk sin(c√λk t) = Ak cos(πckt/l) + Bk sin(πckt/l),

and hence each function

    uk(t, x) = Tk(t)Xk(x) = (ak cos(πckt/l) + bk sin(πckt/l)) sin(πkx/l)
solves the wave equation (2.20) and satisfies the boundary conditions (2.22). Here ak = Ak Ck, bk = Bk Ck. Since my PDE is linear, I can use the superposition principle to form my solution as

    u(t, x) = ∑_{k=1}^∞ uk(t, x);
my task is to determine ak and bk. For this I will need the initial conditions. Using the first initial condition implies

    f(x) = ∑_{k=1}^∞ ak sin(πkx/l),

which means that

    ak = (2/l) ∫_0^l f(x) sin(πkx/l) dx.
To use the second initial condition I first differentiate my series with respect to t and then plug in t = 0:

    g(x) = ∑_{k=1}^∞ (bk cπk/l) sin(πkx/l),

which gives me

    bk = (2/(πkc)) ∫_0^l g(x) sin(πkx/l) dx.
Hence I found a formal solution to my original problem. I write “formal” since I must also check that all the series converge and can be differentiated twice in t and x to guarantee that what I found is a classical solution. Note that, contrary to the heat equation, my series representations of solutions do not have quickly vanishing exponents, and hence the question of differentiability is not as simple as before. Putting some additional smoothness requirements on the initial conditions, one can conclude that my series are classical solutions.
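Here is a numerical sketch of the formal solution, with assumed data l = 1, c = 2, f(x) = x(l − x) and zero initial velocity (so all bk = 0); note that every partial sum is exactly 2l/c-periodic in t:

```python
import math

def quad(f, a, b, n=10000):
    # composite midpoint rule
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

l, c = 1.0, 2.0                 # assumed string length and wave speed
f0 = lambda x: x * (l - x)      # assumed initial displacement; initial velocity is zero
N = 40
ak = [2 / l * quad(lambda x: f0(x) * math.sin(math.pi * k * x / l), 0, l)
      for k in range(1, N + 1)]

def u(t, x):
    return sum(ak[k - 1] * math.cos(math.pi * c * k * t / l) * math.sin(math.pi * k * x / l)
               for k in range(1, N + 1))

print(u(0.0, 0.4))                        # close to f0(0.4) = 0.24
T = 2 * l / c
print(u(0.37, 0.4) - u(0.37 + T, 0.4))    # ~ 0: the partial sum is 2l/c-periodic in t
```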
Now let me see what I can infer from the found solution.
The functions uk found above are called the normal modes. Using the trick

    A cos αt + B sin αt = √(A² + B²) ( (A/√(A² + B²)) cos αt + (B/√(A² + B²)) sin αt )
                        = √(A² + B²) (cos ϕ cos αt + sin ϕ sin αt)
                        = R cos(αt − ϕ),   ϕ = tan⁻¹(B/A),   R = √(A² + B²),

I can rewrite the normal modes in the form

    uk(t, x) = Rk cos(πckt/l − ϕk) sin(πkx/l).      (2.26)
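One caveat: the formula ϕ = tan⁻¹(B/A) is only valid as written for A > 0; numerically it is safer to use the two-argument arctangent, which handles the signs of A and B. A quick check of the identity (the values of A, B, α are arbitrary):

```python
import math

A, B, alpha = 3.0, 4.0, 2.5          # arbitrary sample values
R = math.hypot(A, B)                 # sqrt(A^2 + B^2) = 5.0
phi = math.atan2(B, A)               # sign-safe version of arctan(B/A)

for t in (0.0, 0.4, 1.3):
    lhs = A * math.cos(alpha * t) + B * math.sin(alpha * t)
    rhs = R * math.cos(alpha * t - phi)
    print(lhs - rhs)                 # ~ 0 for every t
```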
Recall that f is periodic with period T if

    f(t + T) = f(t)

for any t. The (minimal) period is the smallest such T > 0. Clearly, if f is T-periodic then

    f(t + kT) = f(t),   k ∈ Z.

Moreover, if f is T-periodic then f(at) has period T/a (prove it). These basic facts imply that the normal modes are periodic functions with respect to the variable t with the periods

    Tk = 2l/(ck),

because cosine is a 2π-periodic function. More importantly, all the normal modes have period 2l/c (not necessarily minimal), which allows me to conclude that the solution to the problem (2.20)–(2.22) is periodic with the period

    T = 2l/c = 2l √(ρ/E).

The angular frequencies are

    ωk = 2π/Tk = πck/l.
The normal modes are also called the harmonics. The first harmonic is the normal mode of the lowest frequency, which is called the fundamental frequency

    ω1 = πc/l.

And now I can finally make my first big conclusion: all harmonics in the solution to the initial-boundary value problem for the wave equation are multiples of the fundamental frequency ω1, and this is the mathematical explanation of why we like the sound of musical instruments whose geometry is one-dimensional: violin, guitar, flute, etc.
Geometrically, harmonics represent standing waves (see Fig. 2.16).

Figure 2.16: Standing waves for the first six normal modes uk(t, x) = cos πkt sin πkx. The bold lines represent the time moments when cos πkt = ±1; the dotted lines are the graphs of uk at intermediate time moments. For an observer the standing wave represents a periodic vibration of the string

Using the introduced terminology, I can conclude that the solution to the wave equation is a sum of standing waves. However, we also know that if the wave equation has no boundary conditions then the solution is a sum of traveling waves. This remains true (recall the reflection principle) when boundary conditions are imposed. So how can these two facts be reconciled? Do we have a contradiction, or are these two sides of the same phenomenon?
Actually, for this particular example there is a very simple explanation:

    2 cos(πckt/l) sin(πkx/l) = sin(πk(x + ct)/l) + sin(πk(x − ct)/l),

and hence any standing wave can be represented as a sum of two traveling waves.
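A numerical check of this product-to-sum identity (the parameters l, c, k are my arbitrary choices):

```python
import math

l, c, k = 1.0, 1.5, 3   # arbitrary sample parameters

def standing(t, x):
    return 2 * math.cos(math.pi * c * k * t / l) * math.sin(math.pi * k * x / l)

def traveling(t, x):
    return (math.sin(math.pi * k * (x + c * t) / l)
            + math.sin(math.pi * k * (x - c * t) / l))

for (t, x) in ((0.0, 0.3), (0.7, 0.9), (2.1, 0.25)):
    print(standing(t, x) - traveling(t, x))   # ~ 0 at every point
```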
Example 2.24. To give a specific example, I assume that the initial displacement has the form shown
in Fig. 2.17, and the initial velocity is zero. I find that bk = 0 for all k and
    ak = (4/(π²k²)) (2 sin(πk/2) − sin πk).

Figure 2.18: Solutions to the problem (2.20)–(2.22) with the initial displacement as in Fig. 2.17 and initial velocity g(x) = 0 at different time moments. First 50 terms of the Fourier series are shown
To illustrate the time dependent behavior of my solution I take the first 50 terms of my Fourier series and plot them at different time moments (see Fig. 2.18); you can observe the traveling waves and reflections from the boundaries in my example. A three dimensional graph of the same solution is given in Fig. 2.19.

Figure 2.19: Solutions to the problem (2.20)–(2.22) with the initial displacement as in Fig. 2.17 and initial velocity g(x) = 0 in t, x, u(t, x) coordinates. First 50 terms of the Fourier series are shown
The standing wave solutions allow us to guess how scientists came up with the separation of variables technique. The fact is that Joseph Fourier was not the first person to make this assumption. Before him exactly the same guess was used by Lagrange to obtain an analytical solution to the system of masses on springs (the one we used to derive the wave equation in a limit). Lagrange did not have to deal with questions of convergence because his “Fourier series” consisted of a finite number of terms. Even earlier, in 1753, Daniel Bernoulli, a famous mathematician and physicist, used “Fourier series” to represent solutions to the wave equation². You can see his “Fourier series” in the left panel of Fig. 2.20. He actually did not calculate the coefficients of the series, leaving them in undetermined form. A possible motivation for these products comes from a very careful drawing by his father, Johann Bernoulli (one of the early developers of Calculus), which can be seen in Fig. 2.20, right panel. These are drawings of observed string oscillations, and since the graphs of standing waves look the same and can be analytically described as a product of two trigonometric functions, one depending only on time and the other only on the spatial variable, we can only guess at this point that this was the original motivation to use the form

    u(t, x) = T(t)X(x)
² Bernoulli, D., 1753. Réflexions et éclaircissemens sur les nouvelles vibrations des cordes. Les Mémoires de l'Académie Royale des Sciences et des Belles-Lettres de Berlin de 1747 et 1748, pp. 47–72.
2.7 Solving the Laplace equation by Fourier method
I already introduced two or three dimensional heat equation, when I derived it, recall that it takes the
form
ut = α2 ∆u + F, (2.27)
where u : [0, ∞) × D −→ R, D ⊆ Rk is the domain in which we consider the equation, α2 is the
diffusivity coefficient, F : [0, ∞) × D −→ R is the function that describes the sources (F > 0) or sinks
(F < 0) of thermal energy, and ∆ is the Laplace operator, which in Cartesian coordinates takes the
form
∆u = uxx + uyy , D ⊆ R2 ,
or
∆u = uxx + uyy + xzz , D ⊆ R3 ,
if the processes are studied in three dimensional space. Of course we need also the boundary conditions
on ∂D and the initial conditions in D.
In a similar vein it can be proved that the wave equation in two or three dimensions can be written as

u_tt = c²∆u + F,  (2.28)

where now c is the wave velocity and F is an external force. We will also need boundary and initial conditions.

Very often the processes described by the heat or wave equation approach some equilibrium as t → ∞. This means that the solution does not change with time, and in particular u_t or u_tt tend to zero as t → ∞. Therefore equations (2.27) and (2.28) turn into

∆u = −f,  (2.29)

where f = F/α² for the heat equation and f = F/c² for the wave equation. Equation (2.29) is called the Poisson equation, and, in the case f = 0,

∆u = 0,  (2.30)

the Laplace equation, one of the most important equations in mathematics.
Since I am talking about the equilibrium (stationary) problems (2.29) and (2.30), only boundary conditions are relevant: in the equilibrium state the system "forgets" about the initial conditions (it can be rigorously proved that the initial value problem for either the Poisson or the Laplace equation is ill posed). In the following I will use separation of variables to solve the Laplace equation (2.30); we will look into the properties of (2.29) in the forthcoming lectures.

Separation of variables is possible only for some special plane geometries of the domain D. First of all, in Cartesian coordinates, these are various rectangles; I will leave this case for homework problems (see the textbook). It is also possible to use separation of variables in "circular"-based domains, such as the interior of a disk, the exterior of a disk, a sector, an annulus, and part of an annulus. To do this I first need to rewrite the Laplace operator in polar coordinates.
Recall that Cartesian coordinates (x, y) and polar coordinates (r, θ) are connected as
x = r cos θ, y = r sin θ,
or

r² = x² + y²,  tan θ = y/x.

I have

u(x, y) = u(r cos θ, r sin θ) = v(r, θ) = v(√(x² + y²), arctan(y/x)).
For the following I will need

r_x = x/√(x² + y²) = cos θ,

r_y = y/√(x² + y²) = sin θ,

θ_x = (1/(1 + (y/x)²))·(−y/x²) = −y/(x² + y²) = −sin θ/r,

θ_y = (1/(1 + (y/x)²))·(1/x) = x/(x² + y²) = cos θ/r.
Now I start calculating the partial derivatives using the usual chain rule:

u_x = v_r r_x + v_θ θ_x = v_r cos θ − v_θ (sin θ/r),

u_y = v_r r_y + v_θ θ_y = v_r sin θ + v_θ (cos θ/r),

u_xx = (u_x)_r r_x + (u_x)_θ θ_x
     = ( v_rr cos θ − v_θr (sin θ/r) + v_θ (sin θ/r²) ) cos θ + ( v_rθ cos θ − v_r sin θ − v_θθ (sin θ/r) − v_θ (cos θ/r) )(−sin θ/r),

u_yy = (u_y)_r r_y + (u_y)_θ θ_y
     = ( v_rr sin θ + v_θr (cos θ/r) − v_θ (cos θ/r²) ) sin θ + ( v_rθ sin θ + v_r cos θ + v_θθ (cos θ/r) − v_θ (sin θ/r) )(cos θ/r).

Now I add the two last lines to find the Laplace operator in polar coordinates (replacing v with u):

∆u = u_rr + (1/r) u_r + (1/r²) u_θθ.
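As a quick sanity check (my own illustration, not part of the original notes), the polar form of the Laplacian can be verified numerically with finite differences. The test function u = r² cos 2θ = x² − y² is a harmonic polynomial, so the expression should vanish; the step size h and the sample points are arbitrary choices.

```python
import math

def u(r, theta):
    # u = r^2 cos(2 theta) = x^2 - y^2 is harmonic
    return r**2 * math.cos(2 * theta)

def polar_laplacian(f, r, theta, h=1e-4):
    # central differences for f_rr, f_r and f_theta_theta
    f_rr = (f(r + h, theta) - 2 * f(r, theta) + f(r - h, theta)) / h**2
    f_r = (f(r + h, theta) - f(r - h, theta)) / (2 * h)
    f_tt = (f(r, theta + h) - 2 * f(r, theta) + f(r, theta - h)) / h**2
    return f_rr + f_r / r + f_tt / r**2

print(polar_laplacian(u, 1.3, 0.7))  # up to roundoff, zero
```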
I note that in cylindrrical coordinates x = r cos θ, y = r sin θ, z the Laplace operator is

∆u = u_rr + (1/r) u_r + (1/r²) u_θθ + u_zz,

whereas in the spherical coordinates

x = r sin φ cos θ,  y = r sin φ sin θ,  z = r cos φ,

the Laplace operator is

∆u = u_rr + (2/r) u_r + (1/r²) u_φφ + (cot φ/r²) u_φ + (1/(r² sin² φ)) u_θθ.
Example 2.25. To show how separation of variables works for the Laplace equation in polar coordinates, consider the following boundary value problem:

∆u = 0,  r_1 < r < r_2,
u(r_1, θ) = g_1(θ),
u(r_2, θ) = g_2(θ),

that is, consider the problem inside the annulus r_1 < r < r_2, where on both boundaries Type I non-homogeneous boundary conditions are given. I start with the usual assumption that

u(r, θ) = R(r)Θ(θ).

Plugging this product into the equation, I find R″Θ + (1/r)R′Θ + (1/r²)RΘ″ = 0, or

(r²R″ + rR′)/R = −Θ″/Θ.
Since the left hand side depends only on r and the right hand side depends only on θ, both sides must be equal to a constant, which I will denote λ. Using this constant I end up with two ODE:

r²R″ + rR′ − λR = 0,  (2.31)

and

Θ″ + λΘ = 0.  (2.32)
At this point I must add some boundary conditions to one of these problems so that in the end I get a Sturm–Liouville problem, whose eigenfunctions I can use as building blocks for my generalized Fourier series. The original boundary conditions for u are of no help here, since they are non-homogeneous. There should be something else to the problem. And indeed, after some thought, it is possible to guess that my solution must be a periodic function of θ (the solution must be continuously differentiable), which implies that

u(r, θ − π) = u(r, θ + π),  u_θ(r, θ − π) = u_θ(r, θ + π).

This implies that my second equation (2.32) must be supplemented with the periodic boundary conditions

Θ(−π) = Θ(π),  Θ′(−π) = Θ′(π).

Now we know that these periodic boundary conditions plus (2.32) form an eigenvalue Sturm–Liouville problem with the eigenvalues λ_k = k², k = 0, 1, . . . and eigenfunctions Θ_k(θ) = A_k cos kθ + B_k sin kθ. Now I can return to (2.31), which can be written as

r²R″ + rR′ − k²R = 0.
I start with the case k = 0. Then the equation

r²R″ + rR′ = 0

can be solved by the substitution R′(r) = S(r):

r²S′ + rS = 0  ⟹  S(r) = B/r,

which finally gives me

R_0(r) = a_0 + b_0 log r.
(I do not use the absolute value since r ≥ 0.)
If k = 1, 2, . . . I have the so-called Cauchy–Euler differential equation, which can be solved by the ansatz R(r) = r^µ. I get

µ(µ − 1) + µ − k² = 0  ⟹  µ_{1,2} = ±k,

and hence the general solution is given by

R_k(r) = c_k r^k + d_k r^{−k},  k = 1, 2, . . .
To summarize, I proved that any function of the form

u_k(r, θ) = R_k(r)Θ_k(θ)

solves the Laplace equation ∆u = 0 (such functions are called harmonic) and satisfies the periodic boundary conditions. Since the Laplace equation is linear, I will use the principle of superposition to argue that the function

u(r, θ) = A + B log r + Σ_{k=1}^∞ [ (C_k r^k + D_k r^{−k}) cos kθ + (E_k r^k + G_k r^{−k}) sin kθ ]
solves the Laplace equation and satisfies the periodic conditions. It seems that I have a lot of arbitrary
constants to determine from the remaining two boundary conditions, but careful analysis shows that
I have enough. To wit, let my boundary conditions have the following Fourier series (notice that I do
not divide by 2 the first coefficient)
g_1(θ) = a_0^(1) + Σ_{k=1}^∞ ( a_k^(1) cos kθ + b_k^(1) sin kθ ),

g_2(θ) = a_0^(2) + Σ_{k=1}^∞ ( a_k^(2) cos kθ + b_k^(2) sin kθ ).
Now, comparing these series with the solution in the form of the series and invoking the boundary
conditions, I get
A + B log r_1 = a_0^(1) = (1/2π) ∫_{−π}^{π} g_1(θ) dθ,
A + B log r_2 = a_0^(2) = (1/2π) ∫_{−π}^{π} g_2(θ) dθ,

C_k r_1^k + D_k r_1^{−k} = a_k^(1) = (1/π) ∫_{−π}^{π} g_1(θ) cos kθ dθ,
C_k r_2^k + D_k r_2^{−k} = a_k^(2) = (1/π) ∫_{−π}^{π} g_2(θ) cos kθ dθ,   k = 1, 2, . . .

E_k r_1^k + G_k r_1^{−k} = b_k^(1) = (1/π) ∫_{−π}^{π} g_1(θ) sin kθ dθ,
E_k r_2^k + G_k r_2^{−k} = b_k^(2) = (1/π) ∫_{−π}^{π} g_2(θ) sin kθ dθ,   k = 1, 2, . . .
and each system for each k is a system of two equations with two unknowns, which can always be solved (except for some degenerate cases).
For example, assuming that g_1(θ) = a, g_2(θ) = b implies

B = (b − a)/log(r_2/r_1),  A = a − (b − a) log r_1 / log(r_2/r_1),

and all other constants are zero. Hence the solution is

u(r, θ) = A + B log r.
If I assume that g_1(θ) = a cos θ, g_2(θ) = b cos θ, then I end up with the system

C_1 r_1 + D_1 r_1^{−1} = a,
C_1 r_2 + D_1 r_2^{−1} = b,

which is easy to solve. The solution to the boundary value problem for the Laplace equation is hence

u(r, θ) = (C_1 r + D_1 r^{−1}) cos θ.
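As an illustration (the numerical values below are my own choices, not from the notes), one can solve this 2 × 2 system for C_1, D_1 and check that the resulting function matches the boundary data:

```python
import math

# illustrative values: annulus 1 < r < 2 with g1 = a cos(theta), g2 = b cos(theta)
r1, r2, a, b = 1.0, 2.0, 1.0, 2.0

# solve C1*r1 + D1/r1 = a, C1*r2 + D1/r2 = b by Cramer's rule
det = r1 / r2 - r2 / r1
C1 = (a / r2 - b / r1) / det
D1 = (b * r1 - a * r2) / det

def u(r, theta):
    return (C1 * r + D1 / r) * math.cos(theta)

print(u(r1, 0.3) - a * math.cos(0.3))  # 0: first boundary condition holds
print(u(r2, 0.3) - b * math.cos(0.3))  # 0: second boundary condition holds
```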
Example 2.26 (Interior Dirichlet problem for the Laplace equation and Poisson formula). Consider now the problem

∆u = 0,  0 ≤ r < 1,
u(1, θ) = g(θ),  0 ≤ θ < 2π.

To solve it I first do exactly the same steps as in the previous example (assume that the solution can be represented as a product, get two ODE, use the periodic boundary conditions in θ, end up with the same eigenvalues and eigenfunctions, solve the ODE for R). Then I note that a significant part of the solutions to the ODE for R has no meaning, since I am dealing also with the point r = 0, and neither log r nor r^{−k} makes sense at this point. Since these solutions have no physical meaning, I drop them to end up with the function
u(r, θ) = a_0/2 + Σ_{k=1}^∞ r^k ( a_k cos kθ + b_k sin kθ ).
By using the given Type I or Dirichlet boundary condition I immediately find that (note that I conveniently assumed that the disk has radius 1; make sure that you can solve the case of an arbitrary radius)

a_k = (1/π) ∫_{−π}^{π} g(θ) cos kθ dθ,  b_k = (1/π) ∫_{−π}^{π} g(θ) sin kθ dθ.
I solved my problem, but it turns out that I can rewrite this solution in a neat closed form. Interchanging the integrals and sums in my solution I get

u(r, θ) = (1/π) ∫_{−π}^{π} g(φ) [ 1/2 + Σ_{k=1}^∞ r^k (cos kφ cos kθ + sin kφ sin kθ) ] dφ
        = (1/π) ∫_{−π}^{π} g(φ) [ 1/2 + Σ_{k=1}^∞ r^k cos k(θ − φ) ] dφ.
Now, with z = re^{iθ},

1/2 + Σ_{k=1}^∞ r^k cos kθ = Re( 1/2 + Σ_{k=1}^∞ z^k )
  = Re( 1/2 + z/(1 − z) ) = Re( (1 + z)/(2(1 − z)) )
  = Re( (1 + z)(1 − z̄)/(2|1 − z|²) ) = Re( (1 − |z|² + z − z̄)/(2|1 − z|²) )
  = (1 − r²)/(2(1 + r² − 2r cos θ)).
Therefore, finally, I can conclude that

u(r, θ) = (1/2π) ∫_{−π}^{π} g(φ) (1 − r²)/(1 + r² − 2r cos(θ − φ)) dφ,  (2.33)

which is called the Poisson integral formula, and the expression

(1 − r²)/(2(1 + r² − 2r cos(θ − φ)))

is called the Poisson kernel. To emphasize: the Poisson integral formula gives a closed form solution to the Dirichlet boundary value problem for the Laplace equation in a disk.
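A quick numerical check of (2.33) (my own sketch, not part of the notes): for boundary data g(φ) = cos φ the exact solution is the harmonic function u(r, θ) = r cos θ, and the midpoint-rule quadrature below reproduces it.

```python
import math

def poisson_solution(g, r, theta, n=2000):
    # midpoint rule for (1/2pi) int_{-pi}^{pi} g(phi) (1-r^2)/(1+r^2-2r cos(theta-phi)) dphi
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = -math.pi + (i + 0.5) * h
        total += g(phi) * (1 - r**2) / (1 + r**2 - 2 * r * math.cos(theta - phi))
    return total * h / (2 * math.pi)

r, theta = 0.5, 1.0
print(poisson_solution(math.cos, r, theta))  # compare with
print(r * math.cos(theta))                   # the exact value
```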
There are several immediate consequences of the Poisson formula.
• The value of the solution at the center of the disk is given by the average of its boundary values:

u(0, θ) = (1/2π) ∫_{−π}^{π} g(φ) dφ.
This is a particular case of the following general fact: let u be harmonic (i.e., solve the Laplace equation) inside the disk of radius a with center (x_0, y_0); then

u(x_0, y_0) = (1/2πa) ∮_γ u ds,

where γ is the boundary of the disk. This follows immediately from the Poisson formula after a shift and rescaling.
• If u is a nonconstant harmonic function defined in D then it cannot achieve its local maximum
or local minimum at any interior point in D. This is true since the average of a continuous real
function lies strictly between its minimal and maximal values, and hence, due to the previous
point, I cannot have a local minimum or local maximum at an interior point.
• Immediately from the previous I have that if u is harmonic in D and if m and M are the minimal
and maximal values of u on the boundary of D, then
m ≤ u(x, y) ≤ M
anywhere in D. This statement is called the maximum principle for the Laplace equation.
86
• If u_1 and u_2 solve the same Poisson equation −∆u = f on D with the same boundary conditions, then u_1 = u_2 within D; that is, the solution to the Dirichlet boundary value problem for the Poisson equation is unique. This follows from linearity of the equation and the maximum principle. Indeed, by linearity the function v = u_1 − u_2 solves the Laplace equation ∆v = 0 with the homogeneous boundary condition v = 0 on ∂D. By the maximum principle this implies that v(x, y) = 0 for all (x, y) ∈ D, and hence u_1 = u_2.
Chapter 3
My next goal is to introduce the so-called Green's functions for solving stationary boundary value problems. There are different ways to define these functions; I am going to pick one that makes heavy use of the so-called delta function, which is, strictly speaking, not a function. (The delta function is often called Dirac's delta function, although there are strong reasons to believe that Dirac picked the delta function up from Heaviside's work.) There exists a rigorous theory of generalized functions, or distributions, of which the delta function is just one example, but since my use of this theory will be quite limited, I will frequently appeal to intuition and natural properties instead of providing mathematical proofs that the manipulations I perform are legitimate.
Consider a mass m moving along the x-axis with constant speed v.¹

¹ I am copying this example from Salsa, Sandro. Partial differential equations in action: from modelling to theory. Vol. 86. Springer, 2015, where the theory of generalized functions can also be found.

At the time t = t_0 an elastic collision with a wall occurs (a collision is called elastic if the total kinetic energy of the two bodies after the encounter is equal to their total kinetic energy before the encounter). After the collision the mass moves in the opposite direction with the same speed. If v_1, v_2 denote the speeds at times t_1, t_2, then by the laws of mechanics

m(v_2 − v_1) = ∫_{t_1}^{t_2} F(t) dt,
where F denotes the intensity of the force acting on the mass m. If t_1 < t_2 < t_0 or t_0 < t_1 < t_2 then there is no problem: v_1 = v_2 and F = 0, since no force is acting. If, however, t_1 < t_0 < t_2, then the left hand side of my equality is 2mv, but the right hand side, as we all know from calculus, must be zero, since F is zero everywhere except t = t_0, and therefore we get a contradiction. To get rid of this contradiction
I introduce the delta function by means of a physical definition:

δ(t) = 0,  t ≠ 0,

and

∫_R δ(t) dt = 1.

Then if I put

F(t) = 2mv δ(t − t_0),
then the contradiction in my reasoning evaporates. The price I pay is that now I need to be very careful when dealing with δ(t), since no usual function has these properties, and therefore I am dealing with an unknown object. Mathematically, we still need a rigorous definition of δ(t), whereas physically the delta function is a mathematical model of something concentrated at a point (a point mass, a unit charge, a unit intensity of a force, etc.).
There are two quite different mathematical definitions of the delta function. The first one uses a so-called delta-like sequence of ordinary functions δ_n(t), which satisfies

lim_{n→∞} δ_n(t) = 0,  t ≠ 0,

and

∫_R δ_n(t) dt = 1.

Then, by definition, the delta function is formally

δ(t) = lim_{n→∞} δ_n(t).
I write "formally" because this limit still has no meaning among the usual functions, but the good part of this definition is that it allows us to find a solution to problems involving delta functions by first solving a sequence of problems with "usual" delta-like functions, and after that passing to the limit. Here is a simple example. As a delta-like sequence I will take

δ_n(t) = { n/2,  −1/n < t < 1/n,
           0,    otherwise.
Clearly this definition satisfies the required properties. Using this sequence I will calculate

∫_R δ(t)f(t) dt

for an arbitrary continuous function f. I have

∫_R δ(t)f(t) dt = lim_{n→∞} ∫_R δ_n(t)f(t) dt
  = lim_{n→∞} (n/2) ∫_{−1/n}^{1/n} f(t) dt
  = lim_{n→∞} (n/2) · (2/n) f(ξ_n),  −1/n ≤ ξ_n ≤ 1/n,
  = f(0),

where I used the integral mean value theorem. By this calculation I have now given meaning to the previously meaningless expression:

∫_R δ(t)f(t) dt = f(0).
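This limit is easy to watch numerically. The sketch below (my own illustration) integrates δ_n f for f = cos over the support of δ_n with the midpoint rule; the values approach f(0) = 1 as n grows.

```python
import math

def box_delta_integral(f, n, m=20000):
    # integral of delta_n(t) f(t) dt, where delta_n = n/2 on (-1/n, 1/n):
    # this is (n/2) * integral of f over (-1/n, 1/n), done by the midpoint rule
    h = 2.0 / (n * m)
    total = sum(f(-1.0 / n + (i + 0.5) * h) for i in range(m)) * h
    return total * n / 2

for n in (10, 100, 1000):
    print(n, box_delta_integral(math.cos, n))  # tends to cos(0) = 1
```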
By the intuitively obvious shifting property, the delta function concentrated at the point ξ, which is sometimes denoted δ(t − ξ) or (do not confuse it with a delta-like sequence!) δ_ξ(t), satisfies

∫_R δ(t − ξ)f(t) dt = f(ξ)

for any continuous f. That is (and this is very important to us), an arbitrary continuous function f can be represented as an infinite linear combination of delta functions concentrated at different points ξ, with weights f(ξ).

To be completely rigorous I must also prove that my result does not depend on the particular choice of delta-like sequence, which I leave for a curious student.
Just a few words about the second approach to defining the delta function (much more can be found in the cited book). It is mathematically much more pleasing to define the delta function as a linear functional that acts on some space of test functions, such that for each test function f the functional returns the value of this function at zero; formally,

⟨δ, f⟩ = f(0).
To play with the first definition, let me calculate

∫_{−∞}^{t} δ(s) ds = lim_{n→∞} ∫_{−∞}^{t} δ_n(s) ds = { 0,  t < 0,
                                                        1,  t > 0,
= χ(t).

That is, the result is the unit step function or, as it is also frequently called, Heaviside's function² (please note that the textbook uses a different notation for the Heaviside function).
Now we can integrate the delta function; what about differentiation? We know from Calculus that if

∫_a^x f(t) dt = F(x),

then (by the fundamental theorem of Calculus)

F′(x) = f(x).

Therefore, I postulate that

(d/dt) χ(t − ξ) = δ(t − ξ).

That is, the delta function (which, one more time, is not a function) is the derivative of the Heaviside function. This definition allows me to find derivatives of arbitrary piecewise continuously differentiable functions. For example, let me find the derivative of
f(t) = { −1,  t < 0,
          1,  t > 0.
I can represent my function as

f(t) = 1 · χ(t) + (−1) · χ(−t).

Hence

f′(t) = 2δ(t),

since

(1)′ = (−1)′ = 0,  χ′(t) = δ(t),  (χ(−t))′ = −δ(−t) = −δ(t),

using the chain rule and the intuitively obvious fact that the delta function is even.
It is not difficult to calculate the derivative of the delta function itself. Indeed,

δ′(t) = lim_{n→∞} (d/dt) δ_n(t).

I leave it as an exercise to prove that

∫_R δ′(t)f(t) dt = ⟨δ′, f⟩ = −f′(0).
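To see this numerically one needs a differentiable delta-like sequence; the Gaussians δ_n(t) = (n/√π)e^{−n²t²} work (this choice, and the script, are my own illustration, not from the notes). For f = sin, f′(0) = 1, so the integrals should approach −1.

```python
import math

def d_delta_n(t, n):
    # derivative of the Gaussian delta-like sequence (n/sqrt(pi)) exp(-n^2 t^2)
    return (n / math.sqrt(math.pi)) * (-2 * n**2 * t) * math.exp(-(n * t)**2)

def pairing(f, n, m=40000):
    # midpoint rule over [-10/n, 10/n], outside of which the integrand is negligible
    w = 10.0 / n
    h = 2 * w / m
    return sum(d_delta_n(-w + (i + 0.5) * h, n) * f(-w + (i + 0.5) * h)
               for i in range(m)) * h

for n in (5, 50, 500):
    print(n, pairing(math.sin, n))  # tends to -f'(0) = -1
```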
Exercise 3.1. Find Fourier series of the Heaviside function and the delta function. Observe that the
Fourier series for δ can be obtained by formally differentiating the Fourier series for the Heaviside
function (keep in mind that what we find is the Fourier series for 2π periodic extensions).
In the following I will also need delta functions that depend on more than one independent variable. In this case it is convenient to set, e.g., for x = (x_1, x_2, x_3), ξ = (ξ_1, ξ_2, ξ_3),

δ(x − ξ) = δ(x_1 − ξ_1)δ(x_2 − ξ_2)δ(x_3 − ξ_3),

and this is a very rare case when we are allowed to multiply delta functions. In general the operation of multiplication is not defined for them.
² In a way it is really mocking that Heaviside's name is attached to the most trivial object of the whole theory.
3.2 Green's function for a second order ODE

The ultimate goal of this part of the course is to learn how it is possible, at least in principle, to solve the boundary value problem for the Poisson equation

−∆u = f,  x ∈ D.

As a one dimensional model problem, consider

−cu″(x) = f(x),  0 < x < 1,  (3.1)

(here c is a constant, which I keep to be consistent with the textbook) with the Type I boundary conditions, which I take to be homogeneous:

u(0) = u(1) = 0.  (3.2)

Clearly, we do not need any special methods to solve this problem, since we can always integrate the equation twice and determine u up to two arbitrary constants, which, in their turn, can be determined from the boundary conditions. I, however, choose a somewhat more complicated way to attack this problem, which, unlike direct integration, can be used in many other situations.
To solve problem (3.1), (3.2) I will appeal to the principle of superposition again. Problem (3.1) is linear (since it involves a linear differential operator), and hence I can represent the solution as a linear combination of some other basic solutions. To figure out what these other solutions are, I recall from the previous lecture that for any continuous f I can write

f(x) = ∫_0^1 f(ξ)δ(ξ − x) dξ = ∫_0^1 f(ξ)δ(x − ξ) dξ.
Hence, if for each fixed ξ I can find a function G(x; ξ), the Green's function, that solves

−cG″(x; ξ) = δ(x − ξ),  G(0; ξ) = G(1; ξ) = 0,

then I expect the solution of (3.1), (3.2) to be

u(x) = ∫_0^1 f(ξ)G(x; ξ) dξ,  (3.3)

using again the principle of superposition. As a heuristic argument consider the following line of reasoning (not a proof!): let (3.3) be true. Then, by plugging this expression into the equation, I get

−c (d²/dx²) ∫_0^1 f(ξ)G(x; ξ) dξ = ∫_0^1 f(ξ)δ(x − ξ) dξ,

which implies (if I am allowed to switch the order of integration and differentiation)

∫_0^1 f(ξ)( −cG″(x; ξ) − δ(x − ξ) ) dξ = 0,

which is true due to the definition of G. The boundary conditions are also satisfied, and hence my representation of the solution indeed works.
Example 3.2. So, let us actually find my Green's function in this simplest possible case. I have, once again,

−cG″ = δ(x − ξ),  G(0; ξ) = G(1; ξ) = 0.

Integrating once, I get

G′(x; ξ) = −(1/c) ∫ δ(x − ξ) dx = −(1/c) χ(x − ξ) + A,
where χ is the Heaviside function. Integrating one more time, I get

G(x; ξ) = −(1/c) ρ(x − ξ) + Ax + B,

where ρ is the so-called ramp function, the integral of χ, defined as

ρ(x − ξ) = { 0,      x ≤ ξ,
             x − ξ,  x ≥ ξ.
Now I can determine the constants A and B from the boundary conditions. The first boundary condition implies that B = 0. The second one yields (note that 1 ≥ ξ)

−(1 − ξ)/c + A = 0  ⟹  A = (1 − ξ)/c.

Therefore, I can write my final answer as

G(x; ξ) = ( (1 − ξ)x − ρ(x − ξ) )/c = { (1 − ξ)x/c,  x ≤ ξ,
                                        (1 − x)ξ/c,  x ≥ ξ.
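This Green's function is easy to test numerically (my own sketch, not part of the notes): with c = 1 and f(x) = 1 the exact solution of −u″ = 1, u(0) = u(1) = 0 is u(x) = x(1 − x)/2, and the quadrature below reproduces it.

```python
def G(x, xi, c=1.0):
    # Green's function of -c u'' = f, u(0) = u(1) = 0
    return (1 - xi) * x / c if x <= xi else (1 - x) * xi / c

def solve(x, f, c=1.0, n=20000):
    # u(x) = integral_0^1 G(x; xi) f(xi) d xi, midpoint rule
    h = 1.0 / n
    return sum(G(x, (i + 0.5) * h, c) * f((i + 0.5) * h) for i in range(n)) * h

x = 0.3
print(solve(x, lambda s: 1.0))  # compare with
print(x * (1 - x) / 2)          # the exact solution
```

Note also that the symmetry G(x; ξ) = G(ξ; x) discussed below can be checked the same way.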
The last example shows several important properties of Green's functions. In particular, this function is continuous but not continuously differentiable: its derivative has a jump of magnitude −1/c at the point x = ξ. The function satisfies the boundary conditions, and the equation at every point except x = ξ. These properties are in general enough to determine the Green's function for more general problems (which are beyond this course). Note also that the Green's function we found is symmetric: G(x; ξ) = G(ξ; x), which physically means that an impulse applied at the point ξ is felt at the point x in the same way as an impulse applied at x is felt at ξ (mathematically, this is a manifestation of the fact that my differential operator together with the boundary conditions is self-adjoint).
Can we always find a Green’s function? Not really, as the following example shows.
Example 3.3. Let me change the boundary conditions for Type II:
−cG′′ = δ(x − ξ), x ∈ (0, 1), ξ ∈ (0, 1), G′ (0; ξ) = G′ (1; ξ) = 0.
Acting exactly as in the previous example I find again

G(x; ξ) = −(1/c) ρ(x − ξ) + Ax + B.

The first boundary condition implies that A = 0. However, the second boundary condition yields

−1/c + A = 0,
and hence I cannot come up with a suitable A. The reason for this to happen is that the Neumann
boundary value problem
−cu′′ = f, u′ (0) = u′ (1) = 0
does not have a unique solution. Indeed, take f (x) = 1 and conclude that there is no solution at all.
On the other hand, take f (x) = x − 1/2 and conclude that there are infinitely many solutions. This
gives at least some reason to understand why in this particular case there exists no Green’s function.
Actually, a deeper analysis of the problem shows that the Green's function exists if and only if the corresponding nonhomogeneous (f ≠ 0) boundary value problem has a unique solution, if and only if the only solution to the corresponding homogeneous (f = 0) boundary value problem is the zero function.
3.3.1 Fundamental solution to the Laplace equation

Definition 3.4. The solution G_0 to the problem

−∆G_0(x; ξ) = δ(x − ξ),  x, ξ ∈ R^m,  (3.7)

is called the fundamental solution to the Laplace equation (or the free space Green's function).
Planar case m = 2

To find G_0 I will appeal to the physical interpretation of my equation. Physically, to solve (3.7) means to find the potential of the gravitational (or electrostatic) field caused by a unit mass (unit charge) positioned at ξ. The field itself is found as the gradient of G_0. Since I do not expect my gravitational field to have any preferred directions, I conclude that my potential should depend only on the distance r = |x − ξ| between the points x and ξ, and not on any angle. Next, I will use the fact that G_0 satisfies the Laplace equation ∆G_0 = 0 at any point except ξ. Using the polar form of the Laplace operator and the fact that my potential depends only on r, I get

rG_0″ + G_0′ = 0.

I already solved this equation when I used separation of variables for the Laplace equation in polar coordinates. The general solution is given by

G_0(r) = A log r + B.
Now I note that the constant B will not contribute to the delta function, since a constant is infinitely differentiable; hence my fundamental solution has the form A log r, and I only need to determine the constant A. For this I will use the characteristic property of the delta function,

∫_{R²} δ(x − ξ) dx = 1,

and the divergence theorem, which says that for a nice domain D and a smooth vector field F,

∫_D ∇ · F dx = ∮_{∂D} F · n̂ dS;
hence

A = −1/(2π),

and therefore

G_0(x; ξ) = −(1/2π) log |x − ξ| = −(1/4π) log( (x − ξ)² + (y − η)² )

is the fundamental solution to the planar Laplace equation or, physically, the potential of the gravitational (or electrostatic) field induced by a unit mass (charge). Note that for the field itself

dG_0/dr = −1/(2πr) ∝ 1/r,

that is, the force is inversely proportional to the distance between the points.
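One can check by finite differences (my own sketch, with an arbitrary step size and sample points) that −(1/2π) log |x| is indeed harmonic away from the origin:

```python
import math

def G0(x, y):
    # fundamental solution of the planar Laplace equation
    return -math.log(math.hypot(x, y)) / (2 * math.pi)

def laplacian(f, x, y, h=1e-4):
    # standard five point stencil for u_xx + u_yy
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4 * f(x, y)) / h**2

print(laplacian(G0, 0.7, -0.4))  # up to roundoff, zero
```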
Case m = 3

Very briefly, and invoking exactly the same reasoning, I find that my fundamental solution must depend only on r = |x − ξ| and solve, everywhere except the point ξ, the equation

rG_0″ + 2G_0′ = 0

(see the expression of the Laplace operator in spherical coordinates). The general solution to this equation is

A/r + B,

and therefore it is reasonable to assume that

G_0(r) = A/r.
Again, using the properties of the delta function and the divergence theorem, I get for a ball D_ε with center at ξ

1 = ∫_{R³} δ(x − ξ) dx = ∫_{D_ε} δ(x − ξ) dx
  = −A ∫_{D_ε} ∆(1/r) dx = −A ∫_{D_ε} ∇ · ∇(1/r) dx
  = −A ∮_{∂D_ε} ∇(1/r) · n̂ dS
  = A ∮_{∂D_ε} (1/r²) dS = (A/ε²) ∮_{∂D_ε} dS = (A/ε²) 4πε² = 4πA,

since the area of the sphere of radius ε is 4πε². Therefore, my fundamental solution is

G_0(r) = 1/(4πr),
and the gravitational (or electrostatic) field exerts the force that is inversely proportional to the square
of the distance, as we all remember from our physics classes.
Exercise 3.2. Find the fundamental solution to the Laplace equation for any dimension m.
3.3.2 Green's function for a disk by the method of images

Now, having at my disposal the fundamental solution to the Laplace equation, namely

G_0(x; ξ) = −(1/2π) log |x − ξ|,

I am in a position to solve the Poisson equation in a disk of radius a. That is, I consider the problem

−∆u = f,  x ∈ D = {x ∈ R² : |x| < a},  (3.8)
u|_{x∈∂D} = 0.  (3.9)

I know that to be able to write the solution to my problem, I need the Green's function that solves

−∆G(x; ξ) = δ(x − ξ),  x, ξ ∈ D ⊆ R²,
G(x; ξ) = 0,  x ∈ ∂D.  (3.10)

If I am able to figure out the solution to (3.10), then (3.8), (3.9), by the principle of superposition, has the solution

u(x) = ∬_D f(ξ)G(x; ξ) dξ.

The key idea is to replace the problem (3.10) with another problem on the whole plane R², with an additional source (or sources) outside of D, such that the boundary condition (3.9) is satisfied automatically.
I replace my problem (3.10) with the following:

−∆G(x; ξ) = δ(x − ξ) − δ(x − ξ*),  x ∈ R²,  (3.11)

where ξ ∈ D and ξ* lies outside the closed disk. Since I require the coordinates of my second source to be outside of my disk, within the disk, due to the properties of the delta function, (3.11) coincides with equation (3.10). If I can determine the coordinates of my second source as a function of the coordinates of the source inside the disk, such that for |x| = a my solution vanishes, then my problem is solved. In other words, I am looking for the coordinates ξ* of the image of the point ξ, and this explains the name of the method.
So let me try to achieve my goal. I know that the solution, again by the superposition principle, to (3.11) is given by

G(x; ξ) = −(1/2π) log |x − ξ| + (1/2π) log |x − ξ*| + c = (1/4π) log( |x − ξ*|²/|x − ξ|² ) + c.
Hence, for |x| = a, I must have, due to (3.10),

a² + r_0² − 2ar_0 cos θ = k( a² + γ²r_0² − 2aγr_0 cos θ )

for some constant k > 0, where r_0 = |ξ| and I assumed, to reduce the number of free parameters, that the angle θ between x and ξ is the same as the angle between x and ξ*, that is, ξ* = γξ.
To get the required equality I must have

a² + r_0² = ka² + kγ²r_0²,
ar_0 = kγar_0,

from the second of which kγ = 1, and hence from the first

γ = a²/r_0².
Problem solved! You can see geometrically that the triangle 0xξ* is similar by construction to the triangle 0xξ, see the figure.

Figure 3.1: The construction of the image of the source with coordinates ξ for the disk
Now I can write, using the polar coordinates (r, ϕ) of the point x and (r_0, ϕ_0) of ξ, that my solution to (3.9), (3.10) has the form

G(r, ϕ; r_0, ϕ_0) = (1/4π) log( ( r²r_0² + a⁴ − 2a²rr_0 cos(ϕ − ϕ_0) ) / ( a²( r² + r_0² − 2rr_0 cos(ϕ − ϕ_0) ) ) ).
Exercise 3.3. Find Green’s function for the unit sphere.
A similar approach works for some other domains (see the homework problems), but the list of such domains is quite limited. There are other methods to derive Green's functions, but they are outside the scope of this introductory course. Probably the best reference for a prepared reader on the various methods of finding Green's functions is still the first volume of Courant and Hilbert, Methods of Mathematical Physics.
There are several integral transforms that turn out to be useful in one or another situation. Arguably, the most ubiquitous and general is the so-called Fourier transform, which generalizes the technique of Fourier series to non-periodic functions. For the motivation of the Fourier transform I recommend reading the textbook or other references (see below); in these notes I start with a bare and dry definition.
Definition 3.5. The Fourier transform of a real valued function f of the real argument x is the complex valued function f̂ of the real argument k defined as

f̂(k) = (1/√(2π)) ∫_{−∞}^{∞} f(x) e^{−ikx} dx.  (3.12)

The inverse Fourier transform, which allows one to recover f if f̂ is known, is given by

f(x) = (1/√(2π)) ∫_{−∞}^{∞} f̂(k) e^{ikx} dk.  (3.13)
Remark 3.6. 1. Since the Fourier transform plays a somewhat auxiliary role in my course, I will not dwell on it for very long. There are a lot of available textbooks on Fourier analysis; I would like to mention two additional sources: the lecture notes by Brad Osgood, The Fourier transform and its applications (freely available on the web), for a thorough introduction to the subject, including careful coverage of the relation between the delta function and the Fourier transform, and Körner, T. W., Fourier analysis, Cambridge University Press, 1989, for the already initiated.

2. There are different definitions of the Fourier transform. I use the one which is used in the textbook. You can also find in the literature

f̂(k) = (1/A) ∫_{−∞}^{∞} f(x) e^{iBkx} dx,

where the following choices are possible:

A = √(2π),  B = ±1,
A = 1,  B = ±2π,
A = 1,  B = ±1.

The only difference in the computations is the factor which appears (or does not appear) in front of the formulas. Be careful if you use results from different sources.
3. The notation for the Fourier transform is a nightmare. Very often, together with the hat, the operator notation is used:

f̂(k) = F f(x),  f(x) = F⁻¹ f̂(k).

Both F and F⁻¹ are linear operators, since, from the properties of the integral,

F(αf(x) + g(x)) = αF f(x) + F g(x),

and a similar identity is true for F⁻¹. To emphasize that f and f̂ form a pair, sometimes a notation like f(x) ⇌ f̂(k) is used.
4. In the definition of the Fourier transform I have an improper integral, which means that I have to worry about convergence. Since the complex exponential, by Euler's formula, is a linear combination of sine and cosine, my transform is classically defined only for those f that tend to zero sufficiently fast. What exactly the space of functions is on which the Fourier transform is naturally defined is actually a very nontrivial question, but it will not bother me in my course. Moreover, there will be some examples which definitely contradict the classical understanding of the Fourier transform; I will treat them in a heuristic way, remembering that a rigorous justification can be made within the theory of generalized functions.
Example 3.7 (Fourier transform of the rect (for rectangle) function). Let

Π_a(x) = { 1,  |x| < a,
           0,  otherwise.

I find

Π̂_a(k) = (1/√(2π)) ∫_{−a}^{a} e^{−ikx} dx = (1/√(2π)) (e^{ika} − e^{−ika})/(ik) = √(2/π) sin(ak)/k.

Similarly, for the left half f_l(x) = e^{ax} for x < 0 (and 0 otherwise), a > 0, of the two-sided exponential e^{−a|x|}, a direct computation gives

f̂_l(k) = 1/(√(2π)(a − ik)).
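The transform of the rect function, Π̂_a(k) = √(2/π) sin(ak)/k, is easy to confirm by direct numerical integration (a sketch of mine, not part of the notes; the imaginary part of the integral cancels by symmetry, so only cos(kx) needs to be integrated):

```python
import math

def ft_rect_numeric(k, a=1.0, n=20000):
    # (1/sqrt(2 pi)) int_{-a}^{a} cos(k x) dx by the midpoint rule
    h = 2 * a / n
    total = sum(math.cos(k * (-a + (i + 0.5) * h)) for i in range(n)) * h
    return total / math.sqrt(2 * math.pi)

def ft_rect_exact(k, a=1.0):
    return math.sqrt(2 / math.pi) * math.sin(a * k) / k

for k in (0.5, 1.7, 3.0):
    print(k, ft_rect_numeric(k), ft_rect_exact(k))
```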
Now I can easily calculate the Fourier transform of

f(x) = 1/(a² + x²).

If I replace k with x and x with −k in (3.12), I get (up to a multiplicative factor) an integral that I already know how to find. Therefore I conclude that

f̂(k) = √(π/2) e^{−a|k|}/a.
This is actually a consequence of the striking resemblance between the Fourier and inverse Fourier transforms, the difference being just an extra minus sign. I would like to formulate this important fact as a theorem.

Theorem 3.10 (Duality). If the Fourier transform of f(x) is f̂(k), then the Fourier transform of f̂(x) is f(−k).

This theorem cuts the table of Fourier transforms in half, since if I know the Fourier transform f̂ of f, this immediately means that I know the Fourier transform ĝ of the function g = f̂. To practice this theorem, convince yourself that

F( sin(ax)/x ) = √(π/2) Π_a(k).
Example 3.11 (Fourier transform of the delta function). Let f(x) = δ(x). Then
$$\hat f(k) = \hat\delta(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\delta(x)e^{-ikx}\,dx = \frac{1}{\sqrt{2\pi}}.$$
Hence the Fourier transform of the delta function is a constant function. From here we can immediately obtain, invoking the duality principle, that the Fourier transform of the constant 1 is
$$\mathcal F(1) = \sqrt{2\pi}\,\delta(k),$$
a multiple of the delta function! But stop: if I'd like to use my definition (3.12), then the integral
$$\int_{-\infty}^{\infty} 1\cdot e^{-ikx}\,dx,$$
strictly speaking, does not exist! Well, an exact meaning can be given to this integral within the framework of generalized functions, but this will not bother us here.
1. Shift theorem. If f (x) has Fourier transform fˆ(k) then the Fourier transform of f (x − ξ) is
e−ikξ fˆ(k). A very particular example of this property is
$$\mathcal F\,\delta(x-\xi) = \frac{1}{\sqrt{2\pi}}\,e^{-ik\xi}.$$
Using the duality, the Fourier transform of eiηx f (x) is fˆ(k − η).
2. Dilation theorem. If f(x) has Fourier transform f̂(k), then the Fourier transform of f(cx), c ≠ 0, is
$$\frac{1}{|c|}\,\hat f\!\left(\frac{k}{c}\right).$$
To practice this theorem, let me find the Fourier transform of $e^{-ax^2}$. I start with the Fourier transform of $e^{-x^2}$:
$$\hat f(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-x^2}e^{-ikx}\,dx = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-(x+ik/2)^2 - k^2/4}\,dx = \frac{e^{-k^2/4}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-y^2}\,dy = \frac{e^{-k^2/4}}{\sqrt{2}},$$
where I used the fact that $\int_{\mathbb R} e^{-x^2}\,dx = \sqrt{\pi}$. Now the dilation theorem with $c = \sqrt{a}$ gives
$$\mathcal F\left(e^{-ax^2}\right) = \frac{1}{\sqrt{2a}}\,e^{-k^2/(4a)}.$$
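A quick numerical check of this Gaussian transform pair (my own Python sketch with scipy, not part of the notes):

```python
import numpy as np
from scipy.integrate import quad

def gauss_hat(a, k):
    # (1/sqrt(2*pi)) * integral of e^{-a x^2} e^{-ikx}; the imaginary part vanishes by symmetry
    return quad(lambda x: np.exp(-a * x * x) * np.cos(k * x), -np.inf, np.inf)[0] / np.sqrt(2 * np.pi)

for a, k in [(1.0, 0.7), (0.5, 0.0), (3.0, 2.0)]:
    assert abs(gauss_hat(a, k) - np.exp(-k * k / (4 * a)) / np.sqrt(2 * a)) < 1e-8
```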
3. Derivatives and the Fourier transform. If f(x) has Fourier transform f̂(k), then the Fourier transform of f′(x) is ik f̂(k). Hence the Fourier transform turns differentiation into the algebraic operation of multiplication by ik. An immediate corollary is that the Fourier transform of f⁽ⁿ⁾(x) is (ik)ⁿ f̂(k). By the duality principle, or by a direct proof, the Fourier transform of xf(x) is $i\frac{d\hat f}{dk}$.
4. Integration and the Fourier transform. If f(x) has Fourier transform f̂(k), then the Fourier transform of its integral $g(x) = \int_{-\infty}^{x} f(s)\,ds$ is
$$\hat g(k) = -\frac{i}{k}\,\hat f(k) + \pi\hat f(0)\delta(k).$$
Using this property I can immediately find the Fourier transform of the Heaviside function χ(x):
$$\mathcal F\chi(x) = -\frac{i}{k\sqrt{2\pi}} + \sqrt{\frac{\pi}{2}}\,\delta(k).$$
Since
$$\operatorname{sgn} x = \chi(x) - \chi(-x),$$
then
$$\mathcal F\operatorname{sgn} x = -i\sqrt{\frac{2}{\pi}}\,\frac{1}{k}.$$
Exercise 3.4. Prove all four properties of the Fourier transform.
5. Convolution. Now let me ask the following question: if I know that $f(x)\rightleftharpoons\hat f(k)$ and $g(x)\rightleftharpoons\hat g(k)$, then which function has the Fourier transform $\hat f(k)\hat g(k)$? I have
$$\hat f(k)\hat g(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)e^{-ikx}\,dx\int_{-\infty}^{\infty} g(y)e^{-iky}\,dy = \frac{1}{2\pi}\iint_{\mathbb R^2} f(x)g(y)e^{-ik(x+y)}\,dx\,dy,$$
and, making the substitution $x+y = s$ in the inner integral,
$$\hat f(k)\hat g(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty} g(s-x)e^{-iks}\,ds\right)f(x)\,dx = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-iks}\left(\int_{-\infty}^{\infty} g(s-x)f(x)\,dx\right)ds.$$
Basically, by the above reasoning I proved that the sought function is $\frac{1}{\sqrt{2\pi}}(f*g)$, where the convolution is defined by
$$(f*g)(s) = \int_{-\infty}^{\infty} f(x)g(s-x)\,dx, \qquad \mathcal F(f*g) = \sqrt{2\pi}\,\hat f\,\hat g.$$
In the opposite direction, the Fourier transform of the product of two functions u(x) = f(x)g(x) is
$$\hat u = \frac{1}{\sqrt{2\pi}}\,\hat f * \hat g.$$
The second part of the theorem can be proved in a similar way or by using the duality principle.
The second part of the theorem can be proved in a similar way or using the duality principle.
It is instructive to prove that the convolution is commutative ($f*g = g*f$), bilinear ($f*(ag+bh) = a\,f*g + b\,f*h$), and associative ($f*(g*h) = (f*g)*h$).
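To see the convolution in action, take f(x) = e^{−|x|}; a short calculation (consistent with the convolution theorem, since here f̂(k) = √(2/π)/(1+k²)) gives (f ∗ f)(x) = (1 + |x|)e^{−|x|}. A grid check of this identity (my own sketch):

```python
import numpy as np

x = np.linspace(-20, 20, 4001)   # odd number of points keeps 'same' mode aligned with the grid
dx = x[1] - x[0]
f = np.exp(-np.abs(x))
conv = np.convolve(f, f, mode="same") * dx        # Riemann-sum approximation of (f*f)(x)
exact = (1 + np.abs(x)) * np.exp(-np.abs(x))      # closed form of the convolution
err = np.max(np.abs(conv - exact))
assert err < 1e-3
```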
Consider the boundary value problem
$$-u'' + \omega^2 u = h(x)$$
on the infinite interval x ∈ ℝ. Here ω is a real parameter and h is a given function. To solve this problem I start with a free-space Green's function that, as you recall from one of the previous lectures, must satisfy
$$-G_0'' + \omega^2 G_0 = \delta(x-\xi), \qquad -\infty < x, \xi < \infty.$$
Let Ĝ₀ be the Fourier transform of G₀ in the x variable. Then, using the properties of the Fourier transform, Ĝ₀ must satisfy
$$k^2\hat G_0 + \omega^2\hat G_0 = \frac{e^{-ik\xi}}{\sqrt{2\pi}},$$
or
$$\hat G_0(k) = \frac{e^{-ik\xi}}{\sqrt{2\pi}\,(k^2+\omega^2)}.$$
Using the table of Fourier transforms, I find that
$$G_0(x;\xi) = \frac{1}{2\omega}\,e^{-\omega|x-\xi|}$$
is my free-space Green's function. Now, invoking again the superposition principle, the solution to the original problem can be written as
$$u(x) = \int_{-\infty}^{\infty} h(\xi)G_0(x;\xi)\,d\xi = \frac{1}{2\omega}\int_{-\infty}^{\infty} h(\xi)e^{-\omega|x-\xi|}\,d\xi.$$
I actually never gave a proof of this formula; I appealed only to linearity and our intuition about it. Let me get the same answer from scratch, without using any delta function. Applying the Fourier transform to the original problem, I get
$$\hat u(k)(k^2+\omega^2) = \hat h(k) \ \Longrightarrow\ \hat u(k) = \frac{\hat h(k)}{k^2+\omega^2}.$$
On the right-hand side I have a product of two Fourier transforms, so to use my convolution formula I need to account for the factor $\sqrt{2\pi}$ in front. The inverse Fourier transform of $\hat h(k)$ is $h(x)$, and that of $1/(k^2+\omega^2)$ is $\sqrt{\pi/2}\,e^{-\omega|x|}/\omega$, which all put together gives again the same
$$u(x) = \frac{1}{2\omega}\int_{-\infty}^{\infty} h(\xi)e^{-\omega|x-\xi|}\,d\xi.$$
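The Green's-function formula can be tested with a manufactured solution: pick a u, compute h = −u″ + ω²u, and check that the integral gives u back. A Python sketch (my own check, not from the notes):

```python
import numpy as np
from scipy.integrate import quad

omega = 1.0
u_exact = lambda x: np.exp(-x * x)                               # manufactured solution
h = lambda x: (2 - 4 * x * x + omega**2) * np.exp(-x * x)        # h = -u'' + omega^2 u

def u_green(x):
    g = lambda xi: h(xi) * np.exp(-omega * abs(x - xi)) / (2 * omega)
    # split the integral at xi = x, where |x - xi| has a kink
    return quad(g, -np.inf, x)[0] + quad(g, x, np.inf)[0]

for x in [0.0, 0.5, -1.3]:
    assert abs(u_green(x) - u_exact(x)) < 1e-8
```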
3.5.3 Laplace’s equation in a half-plane
Consider the following problem for Green’s function to
and I must also have that my Fourier transform would be bounded for k → ±∞, therefore I choose
my solution as
1
Ĝ(k, y) = √ e−|k|y ,
2π
which both satisfies the equation and the boundary condition. Taking the inverse Fourier transform,
I find
y
G(x, y) = .
π(x + y 2 )
2
This Green’s function can be used immediately to solve the general Dirichlet problem for the Laplace
equation on the half-plane.
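Two properties of this kernel are easy to confirm numerically: it has total mass one for every y > 0 (so it acts as an approximate identity on the boundary data), and it is harmonic. A sketch (mine, not from the notes):

```python
import numpy as np
from scipy.integrate import quad

G = lambda x, y: y / (np.pi * (x * x + y * y))

# unit mass for every fixed y > 0
for y in [0.1, 1.0, 10.0]:
    assert abs(quad(lambda x: G(x, y), -np.inf, np.inf)[0] - 1.0) < 1e-8

# harmonicity G_xx + G_yy = 0, via central differences at a sample interior point
d = 1e-3
x0, y0 = 0.7, 1.3
lap = (G(x0 + d, y0) + G(x0 - d, y0) + G(x0, y0 + d) + G(x0, y0 - d) - 4 * G(x0, y0)) / d**2
assert abs(lap) < 1e-6
```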
Applying the Fourier transform in x to the heat equation $u_t = \alpha^2 u_{xx}$ with the initial condition $u(0,x) = \delta(x)$, I get $\hat u_t = -\alpha^2k^2\hat u$, $\hat u(0,k) = 1/\sqrt{2\pi}$, and hence
$$\hat u(t,k) = \frac{1}{\sqrt{2\pi}}\,e^{-\alpha^2k^2t},$$
which is the Gaussian function. Recall that the inverse Fourier transform of the Gaussian function is the Gaussian again:
$$\mathcal F\left(e^{-ax^2}\right) = \frac{1}{\sqrt{2a}}\,e^{-k^2/(4a)}.$$
Figure 3.2: Fundamental solution to the heat equation at different time moments. Note that if t → 0+
then Φ(t, x) → δ(x)
Note that
$$\int_{\mathbb R}\Phi(t,x)\,dx = 1$$
for any time t, and since Φ(t,x) > 0 for all x and t > 0, Φ is, in the language of probability theory, a probability density function. It is actually the density with mean zero and standard deviation $\sigma = \alpha\sqrt{2t}$, which connects the random walk model that leads to the diffusion equation with the solution to this equation.
One of the very important consequences of this solution is that it shows that in our model of
the heat spread the velocity of the movement of the thermal energy is infinite. Indeed, the initial
condition says that u(0, x) = 0 at every point except x = 0, and at the same time the solution shows
that u(t, x) > 0 at any point x and any time t, which is equivalent to saying that the speed of spread
of the heat is infinite. This by no means implies that the actual velocity of the spread of the heat is
infinite! It just shows one of the drawbacks of our model; if we have a problem at hand in which the velocity of the heat spread is important, we have to replace the model.
If I change my initial condition to u(0, x) = δ(x − ξ), then I find
$$u(t,x) = \frac{1}{2\alpha\sqrt{\pi t}}\,e^{-\frac{(x-\xi)^2}{4\alpha^2t}},$$
and hence the solution to the initial value problem for the heat equation with the initial condition u(0, x) = f(x) can be written, by the principle of superposition, as
$$u(t,x) = \frac{1}{2\alpha\sqrt{\pi t}}\int_{-\infty}^{\infty} e^{-\frac{(x-\xi)^2}{4\alpha^2t}}f(\xi)\,d\xi.$$
It is instructive to obtain a proof of this formula by the Fourier transform method.
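One case where the integral can be checked end to end is Gaussian initial data: a Gaussian stays Gaussian under the heat flow, and with f(ξ) = e^{−ξ²} the formula should give e^{−x²/(1+4α²t)}/√(1+4α²t). A numerical sketch (my own check):

```python
import numpy as np
from scipy.integrate import quad

alpha, t = 0.8, 0.5

def u(x):
    # heat-kernel integral with f(xi) = e^{-xi^2}
    Phi = lambda xi: np.exp(-(x - xi)**2 / (4 * alpha**2 * t)) / (2 * alpha * np.sqrt(np.pi * t))
    return quad(lambda xi: Phi(xi) * np.exp(-xi * xi), -np.inf, np.inf)[0]

s = 1 + 4 * alpha**2 * t
for x in [0.0, 0.7, 2.0]:
    assert abs(u(x) - np.exp(-x * x / s) / np.sqrt(s)) < 1e-8
```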
Unfortunately, it is quite difficult to evaluate the last integral for an arbitrary f. It can always be done, however, when $f(x) = e^{-(ax^2+bx+c)}$, by completing the square. As a (tedious) exercise I ask you to prove that if $f(x) = e^{-x^2}$, then the solution to the initial value problem for the heat equation is
$$u(t,x) = \frac{1}{\sqrt{1+4\alpha^2t}}\,\exp\left(-\frac{x^2}{1+4\alpha^2t}\right).$$
I can rewrite the solution of the homogeneous problem with the initial condition v(0, x) = f(x) as
$$v(t,x) = \int_{\mathbb R} G(t,x;\xi)f(\xi)\,d\xi,$$
where
$$G(t,x;\xi) = \frac{1}{2\alpha\sqrt{\pi t}}\,e^{-\frac{(x-\xi)^2}{4\alpha^2t}}.$$
If I now consider the non-homogeneous problem with the zero initial condition,
$$w_t = \alpha^2w_{xx} + h(t,x), \qquad w(0,x) = 0,$$
then it is a simple exercise to show that v + w gives me the solution to the original problem. To solve the problem for w I now recall Duhamel's principle, which we already used for solving the non-homogeneous wave equation. This principle boils down to the following:
1. Construct a family of solutions q(t, x; τ) of homogeneous Cauchy problems with variable initial time τ > 0 and initial data h(τ, x); that is,
$$q_t = \alpha^2q_{xx}, \quad t > \tau, \qquad q(\tau, x;\tau) = h(\tau, x).$$
I will leave it as an exercise to prove the validity of this principle in this particular case. According to this principle, the family q can be used to find
$$w(t,x) = \int_0^t q(t,x;\tau)\,d\tau,$$
which finally gives me the following general solution:
$$u(t,x) = \int_{\mathbb R} G(t,x;\xi)f(\xi)\,d\xi + \int_0^t\int_{\mathbb R} G(t-\tau,x;\xi)h(\tau,\xi)\,d\xi\,d\tau.$$
In vain Whitehouse used his two thousand volt induction coils to try to push messages
through faster — after four weeks of this treatment the cable gave up the ghost; 2500 tons
of cable and £350000 of capital lay useless on the ocean floor.
A lot of exciting details can be found in the book by Körner; here I give only a brief synopsis of the story.
The electric telegraph was invented in 1830, and cables immediately started connecting the main cities of Europe. In 1850 a cable connected England and France (after 12 hours of functioning the cable was accidentally cut by a ship); for the second attempt a much heavier cable was used, since it was discovered that a signal cannot be transmitted as fast through an underwater cable as along a cable on the ground. Faraday predicted this effect because of the increased capacitance of undersea cables. Eventually it was understood that it was important to have a cable connecting the USA and England, and a 2500 mile long cable was required to make this possible. The first attempt was made in 1857; the cable snapped after 335 miles. In 1858 a new attempt to lay the cable failed because of a storm. Finally, later in 1858, a cable was laid, and on August 16th it took 16 and a half hours to receive a 90-word greeting (stocks of the company that owned the cable went down); what happened next with the cable is described in the quotation. In 1864 another attempt was made: the Western Union company decided to put the cable through Russia. At the same time, in 1865, another company tried again through the Atlantic ocean, and after 1250 miles the cable parted from the ship. Finally, in 1866 a new cable connected New York and London. Due to some mechanical adjustments, this time it was actually possible to send and receive signals.
Now I will try to explain what happened with the signal initially, and how engineers actually solved
the problem.
Let me introduce the following variables: i is the current, v is the potential, L is the inductance,
C is the capacitance, R is the resistance, G is the leakage conductance. If I apply the usual physical
laws that describe the change of the current and potential in my cable, I end up with the system of
first order partial differential equations:
ix + Cvt + Gv = 0,
vx + Lit + Ri = 0,
which is called the system of telegrapher's equations. I assume that I have no boundary conditions: −∞ < x < ∞. If I differentiate the first equation with respect to t and the second one with respect to x, I can eliminate the current and write that
$$v_{xx} = LC\,v_{tt} + (RC+GL)\,v_t + RG\,v.$$
The substitution $v(t,x) = e^{-\gamma t}u(t,x)$ with $\gamma = \frac12\left(\frac RL + \frac GC\right)$ removes the first-order term and leaves
$$u_{tt} = a^2u_{xx} + b^2u, \qquad a^2 = \frac{1}{LC}, \quad b = \frac12\left|\frac RL - \frac GC\right|.$$
I know that the wave equation has perfectly nice traveling wave solutions, and this is how I would like my signal to propagate. After plugging the traveling wave ansatz u(t, x) = f(x − ct) into this equation, I have
$$(a^2 - c^2)f'' + b^2f = 0,$$
If a² − c² < 0, we get unbounded solutions, which are physically not realistic. Hence I assume that a > |c|, which gives me the characteristic roots
$$\lambda_{1,2} = \pm\frac{b}{\sqrt{a^2-c^2}}\,i.$$
This implies that my only allowable solutions are
$$f(s) = A\cos ks + B\sin ks, \qquad k = \frac{b}{\sqrt{a^2-c^2}},$$
where k is called the wave number. The wave length (this is the same as the wave period if my wave depends on the time variable) is
$$\lambda = \frac{2\pi}{k} = \frac{2\pi}{b}\sqrt{a^2-c^2},$$
which means that in this case the wave length must satisfy
$$0 < \lambda < \frac{2\pi a}{b}$$
(and hence not every wave length can be transmitted!), and each wave with a fixed length λ must travel with its own velocity
$$c_\lambda = \pm\sqrt{a^2 - \frac{b^2\lambda^2}{4\pi^2}}.$$
This implies that in the cable with b ̸= 0 the signal must first be represented as a sum of several
harmonics (remember, only sine and cosine are traveling wave solutions), which can always be done
thanks to the Fourier series, but different harmonics move with different velocities and thus it is
almost impossible to understand what was initially sent. Hence, to fight this effect, one can choose
the parameters such that b = 0:
$$\frac GC = \frac RL,$$
and this will allow the signal to travel as a whole. Problem solved!
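The engineering point is visible numerically: with b ≠ 0 the phase velocity c_λ depends on the wave length, so harmonics drift apart, while b = 0 makes every harmonic travel at the same speed a. A tiny illustration of mine:

```python
import numpy as np

def c_lambda(lam, a, b):
    # phase velocity of a harmonic of wave length lam (requires lam < 2*pi*a/b when b > 0)
    return np.sqrt(a**2 - b**2 * lam**2 / (4 * np.pi**2))

lams = np.linspace(0.1, 2.0, 50)
a = 1.0
spread_dispersive = np.ptp(c_lambda(lams, a, b=2.0))   # speeds differ: the signal distorts
spread_tuned = np.ptp(c_lambda(lams, a, b=0.0))        # b = 0: all harmonics travel at speed a
assert spread_dispersive > 0.05
assert spread_tuned == 0.0
```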
Chapter 4
4.1.1 Solving the Dirichlet problem for the heat equation in a rectangle
Consider the following problem:
$$u_t = \alpha^2\Delta u = \alpha^2(u_{xx} + u_{yy}), \quad t > 0, \ (x,y)\in D, \tag{4.1}$$
where α is a given real constant, and D = (0, a) × (0, b) is a rectangle, as in the figure. I supplement my equation with the initial condition
$$u(0,x,y) = f(x,y), \quad (x,y)\in D, \tag{4.2}$$
and Type I or Dirichlet boundary conditions
$$u(t,x,y) = 0, \quad (x,y)\in\partial D. \tag{4.3}$$
Figure 4.1: The rectangular domain D
Looking for solutions with separated variables, u(t, x, y) = T(t)V(x, y), I get
$$T' = -\lambda\alpha^2T, \tag{4.4}$$
$$-\Delta V = \lambda V. \tag{4.5}$$
Equation (4.5) is called the Helmholtz equation. The boundary conditions (4.3) imply that I must supplement problem (4.5) with the boundary conditions
$$V(x,y) = 0, \quad (x,y)\in\partial D.$$
The Helmholtz equation plus the boundary conditions constitute an eigenvalue problem for the Laplace operator ∆; that is, I am required to find the values of the constant λ for which this problem has a nonzero solution.
To solve this eigenvalue problem I, one more time, will use the assumptions that I can separate
the variables:
V (x, y) = X(x)Y (y).
My equation becomes
$$X''Y + XY'' = -\lambda XY,$$
or, after rearranging,
$$\frac{X''}{X} = -\frac{Y''}{Y} - \lambda = -\mu,$$
where I used another constant −µ because the left hand side depends only on x and the right hand
side depends only on y. Hence (and using the boundary conditions), my eigenvalue problem for the
Laplace operator can be written as two boundary value problems for ordinary differential equations:
$$X'' + \mu X = 0, \quad X(0) = X(a) = 0, \tag{4.6}$$
and
$$Y'' + \eta Y = 0, \quad Y(0) = Y(b) = 0, \quad \eta = \lambda - \mu. \tag{4.7}$$
These are two Sturm–Liouville problems that we already solved in the previous lectures; the solutions are
$$X_k(x) = A_k\sin\sqrt{\mu_k}\,x, \qquad \mu_k = \left(\frac{\pi k}{a}\right)^2, \quad k = 1,2,\dots,$$
$$Y_m(y) = B_m\sin\sqrt{\eta_m}\,y, \qquad \eta_m = \left(\frac{\pi m}{b}\right)^2, \quad m = 1,2,\dots.$$
Therefore, my eigenvalue problem for the Laplace operator has the eigenvalues
$$\lambda_{k,m} = \mu_k + \eta_m = \left(\frac{\pi k}{a}\right)^2 + \left(\frac{\pi m}{b}\right)^2, \quad k,m = 1,2,\dots,$$
with the eigenfunctions
$$V_{k,m}(x,y) = \sin\frac{\pi kx}{a}\,\sin\frac{\pi my}{b},$$
and, with $T_{k,m}(t) = e^{-\lambda_{k,m}\alpha^2t}$ solving (4.4),
$$u_{k,m}(t,x,y) = T_{k,m}(t)V_{k,m}(x,y)$$
solves by construction equation (4.1) and satisfies the boundary conditions (4.3).
By the linearity of the original equation and the principle of superposition the double infinite series
$$u(t,x,y) = \sum_{k=1}^{\infty}\sum_{m=1}^{\infty} a_{k,m}\,e^{-\lambda_{k,m}\alpha^2t}\sin\frac{\pi kx}{a}\,\sin\frac{\pi my}{b}$$
satisfies my equation and the boundary conditions. Using the initial condition we can uniquely identify the constants a_{k,m}, because, as before in the case of the Sturm–Liouville problem, the eigenfunctions are orthogonal. I introduce the inner product of two functions defined on D as
$$\langle f,g\rangle = \iint_D f(x,y)g(x,y)\,dx\,dy.$$
Then it can be shown (remember that in the case of the rectangle the double integral is just a repeated integral) that
$$\langle V_{k,m}, V_{p,q}\rangle = 0, \quad k\neq p \text{ or } m\neq q,$$
and
$$\langle V_{k,m}, V_{k,m}\rangle = \frac{ab}{4}.$$
Therefore, using the initial condition (4.2), I get that
$$f(x,y) = \sum_{k=1}^{\infty}\sum_{m=1}^{\infty} a_{k,m}V_{k,m}(x,y), \qquad a_{k,m} = \frac{4}{ab}\iint_D f(x,y)V_{k,m}(x,y)\,dx\,dy.$$
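The coefficient formula can be exercised numerically: if the initial datum is itself an eigenfunction, the expansion must pick out exactly one coefficient. A sketch with scipy's dblquad (my own check):

```python
import numpy as np
from scipy.integrate import dblquad

a, b = 2.0, 3.0
V = lambda k, m, x, y: np.sin(np.pi * k * x / a) * np.sin(np.pi * m * y / b)

def coeff(f, k, m):
    # a_{k,m} = (4/(a*b)) * integral over the rectangle of f * V_{k,m}
    val, _ = dblquad(lambda y, x: f(x, y) * V(k, m, x, y), 0, a, 0, b)
    return 4 / (a * b) * val

f = lambda x, y: V(1, 2, x, y)      # initial datum equal to one eigenfunction
assert abs(coeff(f, 1, 2) - 1.0) < 1e-6
assert abs(coeff(f, 1, 1)) < 1e-6
assert abs(coeff(f, 2, 2)) < 1e-6
```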
$$u_t = \alpha^2\Delta u, \quad t > 0, \ (x,y)\in D,$$
for some “nice” but arbitrary domain D. (By “nice” you can think of a bounded, simply connected,
with piecewise-smooth boundary.) Using the same separation of variables I again end up with an
eigenvalue problem
−∆V = λV, V (x, y) = 0, (x, y) ∈ ∂D
for the Laplace operator. A lot can be proved about the eigenvalues and eigenfunctions of this problem
without explicitly computing them. In particular,
• The eigenvalues are real and positive and can be ordered as 0 < λ₁ < λ₂ ≤ λ₃ ≤ …, with λ_k → ∞ as k → ∞.
• The corresponding eigenfunctions are orthogonal, that is,
$$\langle V_k, V_m\rangle = \iint_D V_k(x,y)V_m(x,y)\,dx\,dy = 0, \quad k\neq m.$$
• The eigenfunctions form a complete system, that is, any sufficiently "nice" function can be represented as a convergent Fourier series
$$f = \sum_k c_kV_k, \qquad c_k = \frac{\langle f, V_k\rangle}{\langle V_k, V_k\rangle}.$$
Proof of these facts is well beyond the scope of the present course.
Now consider this eigenvalue problem on the unit disk:
$$-\Delta V = \lambda V, \quad x^2+y^2 < 1, \qquad V(x,y) = 0, \quad x^2+y^2 = 1.$$
I know that I must have infinitely many eigenvalues and eigenfunctions, and the latter can be used to build a series solution to the original problem, but how do I find them analytically? A natural way to attack this problem is to rewrite my equation in polar coordinates:
$$v_{rr} + \frac1r v_r + \frac{1}{r^2}v_{\theta\theta} + \lambda v = 0, \qquad v(1,\theta) = 0,$$
r r
and look for a solution in the form
v(r, θ) = R(r)Θ(θ).
I get
$$R''\Theta + \frac1r R'\Theta + \frac{1}{r^2}R\Theta'' + \lambda R\Theta = 0,$$
or, after rearranging,
$$r^2\frac{R''}{R} + r\frac{R'}{R} + \lambda r^2 = -\frac{\Theta''}{\Theta} = \mu,$$
where I introduced another constant of separation because the left hand side depends only on r and
the right hand side depends only on θ.
I now get two ODE problems. The first one is
$$\Theta'' + \mu\Theta = 0, \qquad \Theta(-\pi) = \Theta(\pi), \quad \Theta'(-\pi) = \Theta'(\pi),$$
where the boundary conditions are from the periodicity requirement. This is our familiar Sturm–Liouville problem with periodic boundary conditions, for which we know that the eigenvalues are µ_m = m², m = 0, 1, 2, …, with the eigenfunctions cos mθ and sin mθ. The second problem is
$$r^2R'' + rR' + (\lambda r^2 - m^2)R = 0, \qquad R(1) = 0.$$
I also supplement my boundary condition with the "physical" condition that R(r) must be bounded for all r, in particular at r = 0: |R(0)| < ∞. To slightly simplify my problem I will introduce a new variable
$$z = \sqrt{\lambda}\,r$$
and a new function
$$h(z) = h(\sqrt{\lambda}\,r) = R(r).$$
By the chain rule I have that
z 2 h′′ + zh′ + (z 2 − m2 )h = 0,
where now the derivatives are taken with respect to the new variable z. And now I am stuck, since there is no way to express a solution to this linear second order ordinary differential equation with variable coefficients in terms of elementary functions. I will need something else, and this will require some facts from the so-called analytic theory of ODEs.
A function f is called analytic at the point x₀ if in a neighborhood of x₀ it is the sum of a convergent power series
$$f(x) = u_0 + u_1(x-x_0) + u_2(x-x_0)^2 + u_3(x-x_0)^3 + \dots$$
From the calculus course, for example, we know that the exponential, sine, and cosine are analytic everywhere in ℝ:
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots,$$
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots,$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots$$
Theorem 4.1. Consider the second order linear differential equation
$$p(x)u'' + q(x)u' + r(x)u = 0 \tag{4.8}$$
and assume that p, q, r are analytic at x₀ and p(x₀) ≠ 0. Then problem (4.8) with the initial conditions u(x₀) = u₀, u′(x₀) = u₁ has a unique analytic solution.
This theorem actually gives us a way to work through the problem. All we need to do is to look
for the solution in the form
u(x) = u0 + u1 (x − x0 ) + u2 (x − x0 )2 + . . .
and determine u2 , u3 , . . ..
Since the general solution to (4.8) is given by
$$u(x) = A\hat u(x) + B\check u(x),$$
where A, B are two arbitrary constants and û and ǔ are two linearly independent solutions, we can
always use our power series method with two different (and linearly independent) initial conditions,
e.g., we can take
û(x0 ) = 1, û′ (x0 ) = 0,
and
ǔ(x0 ) = 0, ǔ′ (x0 ) = 1.
In one of the previous lectures we already saw this method applied to the equation u′′ + ω 2 u = 0.
Consider another example.
Example 4.2 (Airy equation). Consider
u′′ − xu = 0,
u(0) = 1 = u0 , u′ (0) = 0 = u1 .
The stated theorem obviously works in this case since p(x) = 1 is constant and r(x) = −x is clearly analytic. I take
u(x) = u0 + u1 x + u2 x2 + u3 x3 + u4 x4 + . . .
and hence
u′′ (x) = 2u2 + 3 · 2u3 x + 4 · 3u4 x2 + . . . .
Plugging the obtained expressions into my equation, I find
$$2u_2 + 3\cdot 2\,u_3x + 4\cdot 3\,u_4x^2 + \dots = u_0x + u_1x^2 + u_2x^3 + \dots$$
Two convergent power series are equal only if the coefficients of the same powers are equal, that is,
2u2 = 0,
6u3 = u0 ,
12u4 = u1 ,
20u5 = u2 ,
30u6 = u3 ,
...
(n + 1)(n + 2)un+2 = un−1 ,
...
Using the initial conditions I find
$$u_{3k} = \frac{u_{3k-3}}{3k(3k-1)}, \quad k = 1,2,3,\dots,$$
and all other ui are zero. The last expression is enough to write that my first linearly independent
solution to the Airy equation is
$$\hat u(x) = \sum_{k=0}^{\infty} u_{3k}x^{3k},$$
which, as can be proved, converges for any x ∈ R. I will leave it as an exercise to show that for the
initial conditions u(0) = 0, u′ (0) = 1 the solution is
$$\check u(x) = \sum_{k=0}^{\infty} u_{3k+1}x^{3k+1}, \qquad u_{3k+1} = \frac{u_{3k-2}}{(3k+1)\,3k}.$$
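The recurrence is easy to run on a computer, and the truncated series can be compared against scipy's Airy functions: the solution with u(0) = 1, u′(0) = 0 must equal the combination c₁Ai + c₂Bi with matching data at 0. My own sketch:

```python
import numpy as np
from scipy.special import airy

def series_solution(u0, u1, x, N=60):
    # coefficients of u'' = x u from (n+1)(n+2) u_{n+2} = u_{n-1}; u_2 = 0 automatically
    u = np.zeros(N)
    u[0], u[1] = u0, u1
    for n in range(1, N - 2):
        u[n + 2] = u[n - 1] / ((n + 1) * (n + 2))
    return sum(u[n] * x**n for n in range(N))

Ai0, Aip0, Bi0, Bip0 = airy(0.0)
c1, c2 = np.linalg.solve([[Ai0, Bi0], [Aip0, Bip0]], [1.0, 0.0])
for x in [0.5, 1.0, 2.0]:
    Ai, _, Bi, _ = airy(x)
    assert abs(series_solution(1.0, 0.0, x) - (c1 * Ai + c2 * Bi)) < 1e-10
```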
Now back to Bessel's equation
$$x^2u'' + xu' + (x^2 - m^2)u = 0.$$
Here p(x) = x² vanishes at x = 0, so Theorem 4.1 does not apply, and I look for a solution in the form of a generalized power series
$$u(x) = x^\nu\sum_{n=0}^{\infty} u_nx^n, \qquad u_0\neq 0.$$
The coefficient of x^ν must satisfy
$$\nu(\nu-1) + \nu - m^2 = 0,$$
which is true only if ν = ±m; I take ν = m. For the degree ν + n I have, replacing m² with ν²,
$$x^{\nu+n}:\quad \left[(\nu+n)^2 - \nu^2\right]u_n + u_{n-2} = 0 \ \Longrightarrow\ u_n = -\frac{u_{n-2}}{n(2\nu+n)}, \quad n = 2,3,4,\dots$$
Starting with u₀ = 1 and u₁ = 0 (the coefficient of x^{ν+1} forces u₁ = 0), I get that all coefficients with odd indices are zero, whereas for even n = 2k
$$u_{2k} = -\frac{u_{2k-2}}{4k(k+\nu)} = \dots = \frac{(-1)^k}{2^{2k}\,k!\,(\nu+k)(\nu+k-1)\cdots(\nu+1)},$$
and hence my solution is
$$u(x) = \sum_{k=0}^{\infty}\frac{(-1)^kx^{\nu+2k}}{2^{2k}\,k!\,(\nu+k)(\nu+k-1)\cdots(\nu+1)}.$$
In general ν need not be an integer, but in our case m is an integer, and if ν = −m then the denominator in the series above vanishes. Hence we found only one solution to Bessel's equation:
$$u(x) = \sum_{k=0}^{\infty}\frac{(-1)^kx^{m+2k}}{2^{2k}\,k!\,(m+k)(m+k-1)\cdots(m+1)}.$$
I am allowed to multiply my solution by any constant, and I choose, by convention, to multiply the series above by 1/(2^m m!); in this case I have
$$J_m(x) = \sum_{k=0}^{\infty}\frac{(-1)^k\,x^{m+2k}}{2^{2k+m}\,k!\,(m+k)!},$$
Bessel’s function of the first kind of the m-th order. An application of the ratio test yields that the
series converges for any x ∈ R and hence Jm is analytic anywhere in R (or even in C).
Just to get a first idea of Bessel's functions, note that
$$J_0(0) = 1, \qquad J_m(0) = 0, \quad m = 1,2,\dots$$
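The series converges fast and can be compared directly against scipy's implementation of J_m (a sketch of mine, not from the notes):

```python
import numpy as np
from math import factorial
from scipy.special import jv

def J_series(m, x, N=40):
    # J_m(x) = sum_k (-1)^k x^{m+2k} / (2^{2k+m} k! (m+k)!)
    return sum((-1)**k * x**(m + 2 * k) / (2**(2 * k + m) * factorial(k) * factorial(m + k))
               for k in range(N))

for m in [0, 1, 3]:
    for x in [0.5, 2.0, 5.0]:
        assert abs(J_series(m, x) - jv(m, x)) < 1e-10
```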
So, I have found one solution and am in need of another one. Abel's formula (see Math 266) tells me that for two linearly independent solutions of u″ + a(x)u′ + b(x)u = 0 I must have
$$\det\begin{pmatrix} u_1 & u_2\\ u_1' & u_2'\end{pmatrix} = Ce^{-\int a(x)\,dx}.$$
For Bessel's equation in the form u″ + u′/x + (1 − m²/x²)u = 0 I have a(x) = 1/x, so the right-hand side is C/x. Using u₁(x) = J_m(x), I have
$$\left(\frac{u_2(x)}{J_m(x)}\right)' = \frac{C}{x\,J_m^2(x)}.$$
Therefore,
$$u_2(x) = J_m(x)\int\frac{C\,dx}{x\,J_m^2(x)}$$
is the second linearly independent solution, which is actually called Neumann's function of the second kind of the m-th order.
The most important fact here is that N₀ is not defined at zero and approaches −∞ (it behaves like log x for small x). Using Neumann's function I can define Bessel's function of the second kind of order zero as a special linear combination of J₀ and N₀:
$$Y_0(x) = \frac{2}{\pi}\bigl(N_0(x) - (\log 2 - \gamma)J_0(x)\bigr),$$
where γ is the Euler constant, $\gamma = \lim_{n\to\infty}\left(\sum_{i=1}^{n}\frac{1}{i} - \log n\right) \approx 0.5772$. Hence the general solution to Bessel's equation of order zero can be written (this is the most standard form) as
$$u(x) = AJ_0(x) + BY_0(x).$$
Arguing as above and generalizing to arbitrary order, the general solution to the radial equation is given by
$$R(r) = AJ_m(\sqrt{\lambda}\,r) + BY_m(\sqrt{\lambda}\,r).$$
Now let me denote v(x) = J0 (αx) and w(x) = J0 (βx), where α and β are some constants. Due to
the above I have that v and w solve
xv ′′ + v ′ + α2 xv = 0,
xw′′ + w′ + β 2 xw = 0.
If I multiply the first equation by w and the second by v and subtract, then I get, after simplifications,
$$\bigl(x(v'w - vw')\bigr)' = (\beta^2 - \alpha^2)\,xvw.$$
Integrating this identity from 0 to 1 shows that J₀(αx) and J₀(βx) are orthogonal with weight x whenever α ≠ β are roots of J₀. Similarly, by multiplying the equation for v by 2xv′, I can show (exercise) that
$$\int_0^1 xJ_0^2(\alpha x)\,dx = \frac12\left(J_0^2(\alpha) + J_1^2(\alpha)\right),$$
and, in particular, if α is a root of J₀,
$$\int_0^1 xJ_0^2(\alpha x)\,dx = \frac12 J_1^2(\alpha).$$
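Both integrals can be confirmed by direct quadrature using scipy's Bessel routines (my own sketch; jn_zeros returns the positive roots of J₀):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import jv, jn_zeros

z = jn_zeros(0, 3)      # first three positive roots of J_0

def ip(al, be):
    # weighted inner product int_0^1 x J_0(al x) J_0(be x) dx
    return quad(lambda x: x * jv(0, al * x) * jv(0, be * x), 0, 1)[0]

assert abs(ip(z[0], z[1])) < 1e-10                        # orthogonality for distinct roots
assert abs(ip(z[0], z[0]) - jv(1, z[0])**2 / 2) < 1e-10   # normalization at a root of J_0
```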
The question is, of course, do we have any roots at all? To see that there are always infinitely many roots, let me make the change of variables
$$v(x) = u(x)\sqrt{x}$$
in Bessel's equation of order zero. Then, after straightforward manipulations, I find that
$$v'' = -\left(1 + \frac{1}{4x^2}\right)v,$$
that is, when x is large, the equation is approximately v″ + v = 0, and hence for large x Bessel's equation has the approximate solution
$$u(x) = \frac{A\cos(x-\phi)}{\sqrt{x}}$$
for some constants A and ϕ, which indicates that Bessel’s functions approach zero as x → ∞ and that
Bessel’s functions have infinitely many real positive roots.
Let me introduce the inner product
$$\langle f,g\rangle_B = \int_0^1 xf(x)g(x)\,dx.$$
Note that the functions J₀(ζ_kx), k = 1, 2, 3, …, where ζ_k is the k-th positive root of J₀, are orthogonal on [0, 1] with respect to this inner product. It can be proved that any "nice" function f can be represented as a convergent series
$$f(x) = \sum_{k=1}^{\infty} c_kJ_0(\zeta_kx).$$
This expansion is called the Fourier–Bessel expansion, and the coefficients can be found, from the formulas proved above, as
$$c_k = \frac{\langle f, J_0(\zeta_kx)\rangle_B}{\langle J_0(\zeta_kx), J_0(\zeta_kx)\rangle_B} = \frac{2}{J_1^2(\zeta_k)}\int_0^1 xf(x)J_0(\zeta_kx)\,dx.$$
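As an illustration, for f(x) = 1 the numerator integral is ∫₀¹ xJ₀(ζ_kx) dx = J₁(ζ_k)/ζ_k, so c_k = 2/(ζ_kJ₁(ζ_k)), and a partial sum of the expansion should be close to 1 inside the interval. My numerical sketch:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import jv, jn_zeros

zeta = jn_zeros(0, 400)              # positive roots of J_0
c = 2.0 / (zeta * jv(1, zeta))       # c_k for f(x) = 1

# spot-check the numerator integral against direct quadrature for the first root
num = quad(lambda x: x * jv(0, zeta[0] * x), 0, 1)[0]
assert abs(num - jv(1, zeta[0]) / zeta[0]) < 1e-10

# partial sum of the Fourier-Bessel series of 1 at an interior point
S = sum(c[k] * jv(0, zeta[k] * 0.5) for k in range(400))
assert abs(S - 1.0) < 0.02
```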
where Jm is Bessel’s function of the first kind of order m and Ym is Bessel’s function of the second
kind of order m. J0 (0) = 1, Jm (0) = 0, m = 1, 2, . . .. Bessel’s functions of the second kind have a
singularity at x = 0 for any m. In particular, limx→0+ Ym (x) = −∞.
Figure 4.2: Graphs of several Bessel's functions of the first kind
Both Jm and Ym approach zero as x → ∞, both Jm and Ym have infinitely many positive roots.
Let ζ_{k,m} denote the k-th positive root of J_m. Then
$$\int_0^1 xJ_m(\zeta_{k,m}x)J_m(\zeta_{l,m}x)\,dx = \begin{cases} 0, & l\neq k,\\ \tfrac12 J_{m+1}^2(\zeta_{k,m}), & l = k,\end{cases}$$
so any "nice" function again has a convergent Fourier–Bessel expansion in the functions J_m(ζ_{k,m}x), where the explicit form of the coefficients can be inferred from the relation above.
Figure 4.3: Graphs of several Bessel's functions of the second kind
Now let me return to the heat equation on the unit disk. Separation of variables gives, as before,
$$T' = -\lambda\alpha^2T,$$
and
−∆V = λV, x2 + y 2 < 1, V (x, y) = 0, x2 + y 2 = 1.
For the latter problem we again assumed that the variables separate in polar coordinates, v(r, θ) = R(r)Θ(θ). For Θ we got, as before, µ = m² with the eigenfunctions cos mθ and sin mθ, m = 0, 1, 2, …. Now, finally, we use the material from the previous section and state that the general solution of the radial equation is given by
$$R(r) = AJ_m(\sqrt{\lambda}\,r) + BY_m(\sqrt{\lambda}\,r).$$
From the condition that my solution must be bounded I have B = 0, since Y_m is not bounded close to zero; from the boundary condition I have
$$J_m(\sqrt{\lambda}) = 0,$$
that is, √λ must be a root of the corresponding Bessel function. Hence
$$\lambda_{k,m} = \zeta_{k,m}^2,$$
we know, thanks to the general theory, that all these eigenfunctions are orthogonal with respect to
this inner product. However, in this specific case we actually proved this fact. Indeed, in the polar
coordinates this inner product reads
$$\langle f,g\rangle = \int_0^1\int_{-\pi}^{\pi} f(r\cos\theta, r\sin\theta)g(r\cos\theta, r\sin\theta)\,r\,d\theta\,dr.$$
Hence if we take two eigenfunctions with different m, then we get zero due to the orthogonality of the trigonometric system on [−π, π). If, however, the m are the same but the k are different, then we get zero due to the orthogonality of Bessel's functions (note the necessary factor r in our integral). Finally, we need to calculate the norms ⟨v_{k,m}, v_{k,m}⟩, which follow from the normalization integrals of the previous section.
Therefore, putting everything together, my solution is given by
$$u(t, r\cos\theta, r\sin\theta) = \sum_{k=1}^{\infty} a_{k,0}\,e^{-\alpha^2\zeta_{k,0}^2t}J_0(\zeta_{k,0}r) + \sum_{m=1}^{\infty}\sum_{k=1}^{\infty} e^{-\alpha^2\zeta_{k,m}^2t}J_m(\zeta_{k,m}r)\bigl(a_{k,m}\cos m\theta + b_{k,m}\sin m\theta\bigr).$$
The coefficients are found using the initial condition and the orthogonality of the corresponding eigenfunctions. To be precise,
$$a_{k,0} = \frac{\langle f, v_{k,0}\rangle}{\langle v_{k,0}, v_{k,0}\rangle} = \frac{\int_0^1\int_{-\pi}^{\pi} f(r\cos\theta, r\sin\theta)J_0(\zeta_{k,0}r)\,r\,d\theta\,dr}{\pi J_1^2(\zeta_{k,0})}, \quad k = 1,2,\dots,$$
$$a_{k,m} = \frac{\langle f, v_{k,m}\rangle}{\langle v_{k,m}, v_{k,m}\rangle} = \frac{2\int_0^1\int_{-\pi}^{\pi} f(r\cos\theta, r\sin\theta)J_m(\zeta_{k,m}r)\cos m\theta\,r\,d\theta\,dr}{\pi J_{m+1}^2(\zeta_{k,m})}, \quad k,m = 1,2,\dots,$$
$$b_{k,m} = \frac{\langle f, \tilde v_{k,m}\rangle}{\langle \tilde v_{k,m}, \tilde v_{k,m}\rangle} = \frac{2\int_0^1\int_{-\pi}^{\pi} f(r\cos\theta, r\sin\theta)J_m(\zeta_{k,m}r)\sin m\theta\,r\,d\theta\,dr}{\pi J_{m+1}^2(\zeta_{k,m})}, \quad k,m = 1,2,\dots.$$
For large times the solution is dominated by the very first term, since it can be proved that ζ_{1,0} is the smallest among all the roots ζ_{k,m}.
The explicit examples will be given when I consider the wave equation below.
For the wave equation on the same domains the separation of variables leads to the same eigenvalue problem for V, while the time equation becomes
$$T'' + c^2\lambda_{k,m}T = 0,$$
so that T_{k,m} is a linear combination of cos ω_{k,m}t and sin ω_{k,m}t, where
$$\omega_{k,m} = c\sqrt{\lambda_{k,m}}$$
are the frequencies of vibration, and λ_{k,m} are the corresponding eigenvalues of the eigenvalue problem for the Laplace operator, which is exactly the same as for the heat equation. The fundamental frequency is the smallest one. Recall that for the rectangle we found that
$$\lambda_{k,m} = \left(\frac{k^2}{a^2} + \frac{m^2}{b^2}\right)\pi^2,$$
and hence
$$\omega_{k,m} = c\pi\sqrt{\frac{k^2}{a^2} + \frac{m^2}{b^2}}.$$
The solution in the form of a Fourier series reads
$$u(t,x,y) = \sum_{k,m} T_{k,m}(t)V_{k,m}(x,y),$$
where V_{k,m} are the eigenfunctions of the Laplace operator with Type I boundary conditions. Therefore we immediately have an important conclusion: contrary to the one dimensional case, when the solution to the wave equation is a periodic function of t, the solution to the planar wave equation is in general not periodic, because the ratios
$$\frac{\omega_{k,m}}{\omega_{1,1}}$$
are not all rational numbers, which would be required for the solution to be periodic (compare with the one dimensional case).
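The arithmetic behind this conclusion is easy to display for the square membrane a = b = 1 (my own illustration, in units with cπ = 1): a few overtones are rational multiples of ω₁,₁, but not all of them, and one irrational ratio already destroys periodicity.

```python
import numpy as np

omega = lambda k, m: np.sqrt(k**2 + m**2)   # omega_{k,m} up to the common factor c*pi

r22 = omega(2, 2) / omega(1, 1)   # = 2: this particular overtone is harmonic
r12 = omega(1, 2) / omega(1, 1)   # = sqrt(5/2): irrational, spoils periodicity
assert abs(r22 - 2.0) < 1e-12
assert abs(r12 - np.sqrt(2.5)) < 1e-12
assert abs(r12 - round(r12)) > 0.3          # nowhere near an integer multiple
```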
Now let me consider again D = {(x, y) : x2 + y 2 < 1}. Following exactly the same steps that I did
for the heat equation, I will end up in general with the following solution:
$$u(t, r\cos\theta, r\sin\theta) = \sum_{k=1}^{\infty} T_{k,0}(t)v_{k,0} + \sum_{m=1}^{\infty}\sum_{k=1}^{\infty}\left(T_{k,m}(t)v_{k,m} + \tilde T_{k,m}(t)\tilde v_{k,m}\right),$$
Suppose first that the initial displacement is a function of r alone, u(0, x, y) = f(r); that is, the initial condition does not depend on the angle θ.
can be easily proved rigorously) the solution will not also depend on θ, and hence will have the form
$$U(t,r) = \sum_{k=1}^{\infty} a_k\cos(c\zeta_{k,0}t)\,J_0(\zeta_{k,0}r),$$
and I do not have sines in my solution since the initial velocity is zero. The coefficients of my Fourier–Bessel series are found as
$$a_k = \frac{2}{J_1^2(\zeta_{k,0})}\int_0^1 rf(r)J_0(\zeta_{k,0}r)\,dr.$$
Figure 4.4: Graphs of J0 (r) and J0 (ζk,0 r)
To see what is actually happening with the solution, consider the graphs of J₀(ζ_{k,0}r). They have exactly k − 1 roots in the interval (0, 1) and represent the building blocks out of which the whole solution is built. Each term of the form
$$J_0(\zeta_{k,0}r)\cos(c\zeta_{k,0}t)$$
represents a standing wave, and the whole solution is a linear combination of these standing waves; see the next figure.
Figure 4.5: Standing waves. The thick curves are for cos(ζk,0 t) = ±1 and the dashed curves are for
time moments between those
Example 4.5. As a second example, consider the problem with the initial condition given by one of the eigenfunctions v_{k,m}, with the initial velocity again equal to zero. The functions v_{k,m} represent fundamental vibrations with the frequency
$$\omega_{k,m} = c\zeta_{k,m},$$
since the solution to this problem, due to the orthogonality of the v_{k,m}, is given by
$$u(t, r\cos\theta, r\sin\theta) = \cos(c\zeta_{k,m}t)\,v_{k,m}(r,\theta),$$
and these are the only solutions to my problem that are periodic.
fundamental vibrations:
Similarly to the standing waves considered above, these fundamental vibrations will generate the
two dimensional standing waves. Note that those sets of points for which
vk,m (r, θ) = 0
will always stay zero. It can be proved that these sets are composed of nodal curves, which divide the circular drum into several nodal regions.
Example 4.6. In general, the solution to the wave equation on a unit disk can be represented as a linear combination of standing waves, each of which is generated by a fundamental vibration with the corresponding frequency. It can be proved (this was originally done by Siegel in 1929) that the ratios of these frequencies are irrational, and hence the solution to the wave equation with general initial conditions is not a periodic function, just as for the rectangular plate. In musical language, the higher vibrations of the drumhead are not pure overtones of the basic frequency ω_{1,0}, which gives a mathematical explanation of the fact that the human ear much prefers the sound of one-dimensional instruments, such as the guitar or the violin, to two-dimensional ones, such as the drum.
Figure 4.6: Fundamental vibrations (note that I am using the second index to denote the order of
Bessel’s function, and the first index to denote the k-th root, which is opposite to what is used in the
textbook)
Figure 4.7: Nodal curves
4.4 Solving the wave equation in 2D and 3D space
’No,’ replied Margarita, ’what really puzzles me is where you have found the space for all
this.’ With a wave of her hand Margarita emphasized the vastness of the hall they were in.
Koroviev smiled sweetly, wrinkling his nose. ’Easy!’ he replied. ’For anyone who knows
how to handle the fifth dimension it’s no problem to expand any place to whatever size
you please.’
The goal of this concluding section is to find the solution to the initial value problem for the wave
equation
utt = c2 ∆u, x ∈ Rk ,
u(0, x) = g(x), (4.9)
ut (0, x) = h(x),
with k = 2, 3. Recall that in the case k = 1 we already know that the solution is given by d’Alembert’s
formula
$$u(t,x) = \frac{g(x-ct) + g(x+ct)}{2} + \frac{1}{2c}\int_{x-ct}^{x+ct} h(s)\,ds.$$
It turns out that for the case k = 2, 3 it is also possible to find an explicit solution.
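Before moving to higher dimensions, d'Alembert's formula itself can be checked numerically: it reproduces the initial displacement exactly, and a central difference in t recovers the initial velocity. A sketch (mine, with scipy):

```python
import numpy as np
from scipy.integrate import quad

c = 2.0
g = lambda x: np.exp(-x * x)          # sample initial displacement
h = lambda x: x * np.exp(-x * x)      # sample initial velocity

def u(t, x):
    # d'Alembert's formula
    integral = quad(h, x - c * t, x + c * t)[0]
    return 0.5 * (g(x - c * t) + g(x + c * t)) + integral / (2 * c)

assert abs(u(0.0, 0.3) - g(0.3)) < 1e-14          # u(0, x) = g(x)
dt = 1e-5
for x in [0.0, 0.7]:
    ut = (u(dt, x) - u(-dt, x)) / (2 * dt)        # u_t(0, x) ~ h(x)
    assert abs(ut - h(x)) < 1e-8
```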
First, I will prove an auxiliary fact, which will help to reduce the number of computations.
Lemma 4.7. Let vh denote the solution to the problem.
vtt = c2 ∆v, x ∈ Rk ,
v(0, x) = 0, (4.10)
vt (0, x) = h(x).
Then the function
$$w = \frac{\partial}{\partial t}v_g$$
(where v_g denotes the solution of (4.10) with h replaced by g)
solves
wtt = c2 ∆w, x ∈ Rk ,
w(0, x) = g(x), (4.11)
wt (0, x) = 0.
Proof. Indeed, since vg satisfies the wave equation, taking the derivatives with respect to time on
both left and right hand sides and exchanging the order of operators implies that w solves the wave
equation. Moreover, $w(0,x) = \frac{\partial}{\partial t}v_g(0,x) = g(x)$ due to the definition of v_g. Finally, I need to show
that w_t(0, x) = 0. Consider
$$w_t(0,x) = \frac{\partial^2}{\partial t^2}v_g(0,x) = c^2\Delta v_g(0,x) = 0,$$
since, by definition vg (0, x) = 0.
Corollary 4.8. Any solution to (4.9) can be written as
$$u = \frac{\partial}{\partial t}u_g + u_h,$$
where u_h solves (4.10) and u_g solves (4.10) with h replaced by g.
Proof. The proof follows from the linearity of the original problem.
Therefore, all I need to do is to solve problem (4.10). The lemma above holds for any dimension k.
From now on, however, I will stick to the case k = 3, i.e., I will consider our familiar Euclidean space.
To find a solution to (4.10), I first will find the fundamental solution to the three dimensional wave
equation. This solution, by definition, solves problem (4.10) with h replaced by δ(x), which is three
dimensional delta-function, and which models an unit impulse (disturbance) applied at the point 0.
To do this I will use the three-dimensional Fourier transform, which is defined for any $f(x)$, $x \in \mathbf{R}^3$, as
$$\hat{f}(\mathbf{k}) = \frac{1}{(\sqrt{2\pi})^3} \iiint_{\mathbf{R}^3} f(x)\, e^{-i\mathbf{k}\cdot x}\, dx,$$
where $\mathbf{k}\cdot x = k_1 x_1 + k_2 x_2 + k_3 x_3$ is the usual scalar product. The inverse Fourier transform is defined in a similar way, with the minus replaced by a plus. Denoting by $\hat{v}$ the Fourier transform of my unknown function, I get, similarly to the one-dimensional case,
$$\hat{v}_{tt} = -c^2 |\mathbf{k}|^2 \hat{v}, \qquad \hat{v}(0) = 0, \quad \hat{v}_t(0) = \frac{1}{(\sqrt{2\pi})^3},$$
where
$$|\mathbf{k}|^2 = \mathbf{k}\cdot\mathbf{k} = k_1^2 + k_2^2 + k_3^2$$
is the square of the usual Euclidean norm. Solving this simple ODE, I find that
$$\hat{v}(t,\mathbf{k}) = \frac{\sin(ct|\mathbf{k}|)}{(\sqrt{2\pi})^3\, c|\mathbf{k}|}.$$
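This ODE step can be confirmed with a computer algebra system: for each fixed wave vector, the transformed equation is just a harmonic oscillator in $t$ with frequency $c|\mathbf{k}|$. In the sketch below `K` is my notation for $|\mathbf{k}|$.

```python
import sympy as sp

# Symbolic check of the ODE step (a sketch; K stands for |k|).
t, c, K = sp.symbols('t c K', positive=True)
v = sp.Function('v')

ode = sp.Eq(v(t).diff(t, 2), -c**2 * K**2 * v(t))
ics = {v(0): 0, v(t).diff(t).subs(t, 0): 1 / (2 * sp.pi) ** sp.Rational(3, 2)}
sol = sp.dsolve(ode, v(t), ics=ics)
# sol.rhs equals sin(c*K*t) / ((2*pi)**(3/2) * c * K), possibly rewritten
print(sol.rhs)
```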
The solution $v$ itself is now given by the inverse Fourier transform of $\hat{v}$. To evaluate this integral I switch to spherical coordinates, chosen such that the polar axis coincides with the direction of the vector $x$. My spherical coordinates are $k = |\mathbf{k}|$, $\theta$, $\varphi$; I also denote $|x| = r$, so that $\mathbf{k}\cdot x = kr\cos\varphi$ due to the choice of the polar axis. Hence my integral is now
$$v(t,x) = \frac{1}{8\pi^3 c} \int_0^{2\pi}\!\!\int_0^{\pi}\!\!\int_0^{\infty} \frac{\sin(ctk)}{k}\, e^{ikr\cos\varphi}\, k^2 \sin\varphi\, dk\, d\varphi\, d\theta.$$
The last integral should be understood in terms of generalized functions and the inverse Fourier transform. Evaluating the integrals in $\theta$ and $\varphi$ and using complex exponentials to represent the sines, I end up with
$$v(t,x) = \frac{1}{8\pi^2 cr} \int_{\mathbf{R}} \left( e^{ik(ct-r)} - e^{ik(ct+r)} \right) dk = \frac{1}{4\pi cr} \left( \delta(ct-r) - \delta(ct+r) \right).$$
I am looking only into the future, $t > 0$, and hence $\delta(ct+r) = 0$. Finally, I get that
$$\frac{\delta(ct-r)}{4\pi cr}$$
is the fundamental solution to the three-dimensional wave equation. By a translation argument, if my initial velocity were
$$v_t(0,x) = \delta(x - \xi),$$
then my solution would be
$$K(t,x,\xi) = \frac{\delta(ct - |x-\xi|)}{4\pi c|x-\xi|}.$$
Thus the fundamental solution is a traveling wave, initially concentrated at $\xi$ and afterwards concentrated on the sphere $|x - \xi| = ct$, which is the boundary of the ball with center at $\xi$ and radius $ct$. This means, among other things, that the wave originating at time $t = 0$ at the point $\xi$ will be felt at the point $\eta$ only at the time $|\eta - \xi|/c$. This is called the strong Huygens' principle, and it gives a mathematical explanation of why sharp signals propagate from a point source. Using the superposition principle, I can represent the sought solution to (4.10) as
$$v(t,x) = \iiint_{\mathbf{R}^3} K(t,x,\xi)\, h(\xi)\, d\xi.$$
In spherical coordinates centered at $x$ the last integral takes the form
$$v(t,x) = \int_0^{\infty} \frac{\delta(ct-r)}{4\pi cr} \int_0^{2\pi}\!\!\int_0^{\pi} h(r,\theta,\varphi)\, r^2 \sin\varphi\, d\varphi\, d\theta\, dr.$$
The double integral inside is the integral over the surface of the sphere of radius $r$ with center at the point $x$:
$$\int_0^{2\pi}\!\!\int_0^{\pi} h(r,\theta,\varphi)\, r^2 \sin\varphi\, d\varphi\, d\theta = \iint_{\partial B_r(x)} h(\sigma)\, d\sigma;$$
therefore, using the main property of the delta function, I finally get

Theorem 4.9 (Kirchhoff's formula). The unique solution to the problem (4.9) with $k = 3$ is given by
$$u(t,x) = t\,\overline{h} + \frac{\partial}{\partial t}\left( t\,\overline{g} \right),$$
where $\overline{f}$ is the average value of the function $f$ over the sphere of radius $ct$ centered at $x$,
$$\overline{f}(t,x) = \frac{1}{4\pi c^2 t^2} \iint_{\partial B_{ct}(x)} f(\sigma)\, d\sigma.$$
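Kirchhoff's formula can also be tested numerically. In the sketch below the data are my illustrative choices, not from the text: for $g(x) = \sin x_1$ and $h = 0$ the exact solution is $u(t,x) = \sin x_1 \cos ct$, since the average of $\sin \xi_1$ over a sphere of radius $R$ centered at $x$ equals $\sin x_1 \cdot \sin R / R$. The spherical average is approximated by a midpoint rule in the angles, and the time derivative by a centered difference.

```python
import numpy as np

# Numerical check of Kirchhoff's formula (a sketch with illustrative data).
c = 1.0

def sphere_average(f, center, R, n=200):
    # average of f over the sphere of radius R centered at `center`,
    # midpoint rule in the spherical angles (surface element ~ sin(theta))
    th = (np.arange(n) + 0.5) * np.pi / n        # polar angle midpoints
    ph = (np.arange(2 * n) + 0.5) * np.pi / n    # azimuth midpoints on [0, 2*pi]
    TH, PH = np.meshgrid(th, ph, indexing='ij')
    pts = center[:, None, None] + R * np.array([
        np.sin(TH) * np.cos(PH),
        np.sin(TH) * np.sin(PH),
        np.cos(TH)])
    w = np.sin(TH)
    return (f(pts) * w).sum() / w.sum()

def kirchhoff(t, x, g, dt=1e-4):
    # u = d/dt ( t * average of g over the sphere of radius c*t ),
    # with the derivative replaced by a centered finite difference
    F = lambda tau: tau * sphere_average(g, x, c * tau)
    return (F(t + dt) - F(t - dt)) / (2 * dt)

g = lambda p: np.sin(p[0])                       # initial displacement
t, x = 0.8, np.array([0.3, -0.5, 1.1])
exact = np.sin(x[0]) * np.cos(c * t)
print(abs(kirchhoff(t, x, g) - exact))           # small
```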
Kirchhoff's formula also emphasizes the strong Huygens' principle. To see this (see the figure), assume that the initial disturbance $h$ has a small compact support (that means that it is nonzero only in some small region $U \subseteq \mathbf{R}^3$). I am interested in observing the signal at the point $x$. Initially, for small $t_1$, the sphere $\partial B_{ct_1}(x)$ does not touch $U$, and hence there is no signal at $x$. At some time $t_2$ the sphere finally touches $U$, and this means that I hear the signal. I continue to experience the signal at $x$ up to the time $t_3$, when the whole domain $U$ is inside $B_{ct_3}(x)$. And since I am integrating over the surface of my ball, after that time, for any $t > t_3$, I will have no indication of the signal at $x$. In other words, the traveling wave in three dimensions has both a sharp leading edge and a sharp trailing edge.
Figure 4.8: The spread of waves in three (left) and two (right) dimensional spaces. On the left I integrate over the surface of my ball, and on the right I integrate over the whole shaded area.
Now I will use the method of descent to obtain the explicit solution in the case $\mathbf{R}^2$. The key idea is to consider the two-dimensional problem (4.10) as a three-dimensional one: write the points in $\mathbf{R}^3$ as $(x, x_3)$ and set $h(x, x_3) = h(x)$. Then by Kirchhoff's formula the solution is given by
$$U(t,x,x_3) = \frac{1}{4\pi c^2 t} \iint_{\partial B_{ct}(x,x_3)} h(\sigma)\, d\sigma.$$
I claim that this solution is independent of x3 and hence gives me the solution to the two-dimensional
problem for any choice of x3 , for example, x3 = 0. Indeed, I have that my surface consists of two
hemispheres, defined explicitly as
$$\xi_3 = x_3 \pm \sqrt{c^2t^2 - r^2} = F_\pm(\xi), \qquad r^2 = (\xi_1 - x_1)^2 + (\xi_2 - x_2)^2.$$
The surface area element on either hemisphere is
$$d\sigma = \sqrt{1 + (\partial_{\xi_1} F_\pm)^2 + (\partial_{\xi_2} F_\pm)^2}\, d\xi = \frac{ct}{\sqrt{c^2t^2 - r^2}}\, d\xi,$$
since $(\partial_{\xi_1} F_\pm)^2 + (\partial_{\xi_2} F_\pm)^2 = r^2/(c^2t^2 - r^2)$. Taking the integral over the two hemispheres and using the fact that $h$ does not depend on $x_3$, I get
$$u(t,x) = \frac{1}{4\pi c^2 t} \iint_{\partial B_{ct}(x,x_3)} h(\sigma)\, d\sigma = \frac{1}{2\pi c^2 t} \iint_{B_{ct}(x)} h(\xi)\, \frac{ct}{\sqrt{c^2t^2 - r^2}}\, d\xi = \frac{1}{2\pi c} \iint_{B_{ct}(x)} \frac{h(\xi)\, d\xi}{\sqrt{c^2t^2 - |x-\xi|^2}}.$$
This yields

Theorem 4.10 (Poisson's formula). The unique solution to the problem (4.9) with $k = 2$ is given by
$$u(t,x) = \frac{1}{2\pi c} \left( \frac{\partial}{\partial t} \iint_{B_{ct}(x)} \frac{g(\xi)\, d\xi}{\sqrt{c^2t^2 - |x-\xi|^2}} + \iint_{B_{ct}(x)} \frac{h(\xi)\, d\xi}{\sqrt{c^2t^2 - |x-\xi|^2}} \right).$$
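Poisson's formula admits a quick sanity check in the special case $g = 0$, $h \equiv 1$ (my illustrative choice), for which the unique solution must be $u(t,x) = t$ for every $x$. In polar coordinates around $x$ the integral reduces to a single radial integral, and the substitution $r = ct\sin\theta$ removes the square-root singularity at the edge of the ball.

```python
import numpy as np

# Sanity check of Poisson's formula for g = 0, h = 1 (illustrative data):
#   u(t, x) = (1/(2*pi*c)) * Integral_0^{ct} 2*pi*r / sqrt(c^2 t^2 - r^2) dr,
# and after r = c*t*sin(theta) the integrand becomes 2*pi*c*t*sin(theta).
c, t = 1.5, 0.9
n = 100_000
step = (np.pi / 2) / n
theta = (np.arange(n) + 0.5) * step             # midpoint rule on [0, pi/2]
u = (2 * np.pi * c * t * np.sin(theta)).sum() * step / (2 * np.pi * c)
print(u)  # approximately equal to t = 0.9
```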
The key fact here is that the integration is now not over the surface but over the whole ball itself. That is, the traveling wave has a sharp leading edge but not a sharp trailing edge: the integral will remain nonzero for all times $t > t_2$ (see the right panel in the figure), since the initial disturbance will always stay inside the ball. The same holds for any space of even dimension. You can actually observe this effect experimentally by putting a cork on a water surface and dropping a stone nearby. You will see that, after some initial time, the cork feels the disturbance, but it does not stop: it continues to oscillate afterwards.
Returning to the quotation for this lecture: in even dimensions there is no possibility to talk, since the sound waves have no sharp trailing edge. In dimension five, however, conversations can be carried on in the same way as we talk in our familiar three dimensions.