
Michaelmas Term 2008

Mathematical Methods I

Dr Gordon Ogilvie

Schedule

Vector calculus. Suffix notation. Contractions using $\delta_{ij}$ and $\epsilon_{ijk}$. Reminder of vector products, grad, div, curl, $\nabla^2$ and their representations using suffix notation. Divergence theorem and Stokes's theorem. Vector differential operators in orthogonal curvilinear coordinates, e.g. cylindrical and spherical polar coordinates. Jacobians. [6 lectures]

Partial differential equations. Linear second-order partial differential equations; physical examples of occurrence, the method of separation of variables (Cartesian coordinates only). [2 lectures]

Green's functions. Response to impulses, delta function (treated heuristically), Green's functions for initial and boundary value problems. [3 lectures]

Fourier transform. Fourier transforms; relation to Fourier series, simple properties and examples, convolution theorem, correlation functions, Parseval's theorem and power spectra. [2 lectures]

Matrices. N-dimensional vector spaces, matrices, scalar product, transformation of basis vectors. Eigenvalues and eigenvectors of a matrix; degenerate case, stationary property of eigenvalues. Orthogonal and unitary transformations. Quadratic and Hermitian forms, quadric surfaces. [5 lectures]

Elementary analysis. Idea of convergence and limits. O notation. Statement of Taylor's theorem with discussion of remainder. Convergence of series; comparison and ratio tests. Power series of a complex variable; circle of convergence. Analytic functions: Cauchy–Riemann equations, rational functions and exp(z). Zeros, poles and essential singularities. [3 lectures]

Series solutions of ordinary differential equations. Homogeneous equations; solution by series (without full discussion of logarithmic singularities), exemplified by Legendre's equation. Classification of singular points. Indicial equations and local behaviour of solutions near singular points. [3 lectures]

Course website

camtools.caret.cam.ac.uk Maths : NSTIB 08 09

Assumed knowledge

I will assume familiarity with the following topics at the level of Course A of Part IA Mathematics for Natural Sciences.

• Algebra of complex numbers
• Algebra of vectors (including scalar and vector products)
• Algebra of matrices
• Eigenvalues and eigenvectors of matrices
• Taylor series and the geometric series
• Calculus of functions of several variables
• Line, surface and volume integrals
• The Gaussian integral
• First-order ordinary differential equations
• Second-order linear ODEs with constant coefficients
• Fourier series

Textbooks

The following textbooks are particularly recommended.

• K. F. Riley, M. P. Hobson and S. J. Bence (2006). Mathematical Methods for Physics and Engineering, 3rd edition. Cambridge University Press
• G. Arfken and H. J. Weber (2005). Mathematical Methods for Physicists, 6th edition. Academic Press

1 Vector calculus

1.1 Motivation

Scientific quantities are of different kinds:

• scalars have only magnitude (and sign), e.g. mass, electric charge, energy, temperature
• vectors have magnitude and direction, e.g. velocity, electric field, temperature gradient

A field is a quantity that depends continuously on position (and possibly on time). Examples:

• air pressure in this room (scalar field)
• electric field in this room (vector field)

Vector calculus is concerned with scalar and vector fields. The spatial variation of fields is described by vector differential operators, which appear in the partial differential equations governing the fields.

Vector calculus is most easily done in Cartesian coordinates, but other systems (curvilinear coordinates) are better suited for many problems because of symmetries or boundary conditions.

1.2 Suffix notation and Cartesian coordinates

1.2.1 Three-dimensional Euclidean space

This is a close approximation to our physical space:

• points are the elements of the space
• vectors are translatable, directed line segments
• Euclidean means that lengths and angles obey the classical results of geometry

Points and vectors have a geometrical existence without reference to any coordinate system. For definite calculations, however, we must introduce coordinates.

Cartesian coordinates:

• measured with respect to an origin O and a system of orthogonal axes Oxyz
• points have coordinates $(x, y, z) = (x_1, x_2, x_3)$
• unit vectors $(\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z) = (\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$ in the three coordinate directions, also called $(\hat\imath, \hat\jmath, \hat{k})$ or $(\hat{x}, \hat{y}, \hat{z})$

The position vector of a point P is
$$\overrightarrow{OP} = \mathbf{r} = \mathbf{e}_x x + \mathbf{e}_y y + \mathbf{e}_z z = \sum_{i=1}^{3} \mathbf{e}_i x_i$$

1.2.2 Properties of the Cartesian unit vectors

The unit vectors form a basis for the space. Any vector $\mathbf{a}$ belonging to the space can be written uniquely in the form
$$\mathbf{a} = \mathbf{e}_x a_x + \mathbf{e}_y a_y + \mathbf{e}_z a_z = \sum_{i=1}^{3} \mathbf{e}_i a_i$$
where $a_i$ are the Cartesian components of the vector $\mathbf{a}$.

The basis is orthonormal (orthogonal and normalized):
$$\mathbf{e}_1 \cdot \mathbf{e}_1 = \mathbf{e}_2 \cdot \mathbf{e}_2 = \mathbf{e}_3 \cdot \mathbf{e}_3 = 1$$
$$\mathbf{e}_1 \cdot \mathbf{e}_2 = \mathbf{e}_1 \cdot \mathbf{e}_3 = \mathbf{e}_2 \cdot \mathbf{e}_3 = 0$$
and right-handed:
$$[\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3] \equiv \mathbf{e}_1 \cdot \mathbf{e}_2 \times \mathbf{e}_3 = +1$$
This means that
$$\mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3, \qquad \mathbf{e}_2 \times \mathbf{e}_3 = \mathbf{e}_1, \qquad \mathbf{e}_3 \times \mathbf{e}_1 = \mathbf{e}_2$$
The choice of basis is not unique. Two different bases, if both orthonormal and right-handed, are simply related by a rotation. The Cartesian components of a vector are different with respect to two different bases.

1.2.3 Suffix notation

• $x_i$ for a coordinate, $a_i$ (e.g.) for a vector component, $\mathbf{e}_i$ for a unit vector
• in three-dimensional space the symbolic index $i$ can have the values 1, 2 or 3
• quantities (tensors) with more than one index, such as $a_{ij}$ or $b_{ijk}$, can also be defined

Scalar and vector products can be evaluated in the following way:
$$\mathbf{a} \cdot \mathbf{b} = (\mathbf{e}_1 a_1 + \mathbf{e}_2 a_2 + \mathbf{e}_3 a_3) \cdot (\mathbf{e}_1 b_1 + \mathbf{e}_2 b_2 + \mathbf{e}_3 b_3) = a_1 b_1 + a_2 b_2 + a_3 b_3$$
$$\mathbf{a} \times \mathbf{b} = (\mathbf{e}_1 a_1 + \mathbf{e}_2 a_2 + \mathbf{e}_3 a_3) \times (\mathbf{e}_1 b_1 + \mathbf{e}_2 b_2 + \mathbf{e}_3 b_3)$$
$$= \mathbf{e}_1 (a_2 b_3 - a_3 b_2) + \mathbf{e}_2 (a_3 b_1 - a_1 b_3) + \mathbf{e}_3 (a_1 b_2 - a_2 b_1) = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}$$

1.2.4 Delta and epsilon

Suffix notation is made easier by defining two symbols. The Kronecker delta symbol is
$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$
In detail:
$$\delta_{11} = \delta_{22} = \delta_{33} = 1, \qquad \text{all others, e.g. } \delta_{12} = 0$$
$\delta_{ij}$ gives the components of the unit matrix. It is symmetric:
$$\delta_{ji} = \delta_{ij}$$
and has the substitution property:
$$\sum_{j=1}^{3} \delta_{ij} a_j = a_i \qquad \text{(in matrix notation, } \mathsf{1}\mathbf{a} = \mathbf{a}\text{)}$$
The Levi-Civita permutation symbol is
$$\epsilon_{ijk} = \begin{cases} 1 & \text{if } (i,j,k) \text{ is an even permutation of } (1,2,3) \\ -1 & \text{if } (i,j,k) \text{ is an odd permutation of } (1,2,3) \\ 0 & \text{otherwise} \end{cases}$$
In detail:
$$\epsilon_{123} = \epsilon_{231} = \epsilon_{312} = 1, \qquad \epsilon_{132} = \epsilon_{213} = \epsilon_{321} = -1, \qquad \text{all others, e.g. } \epsilon_{112} = 0$$
An even (odd) permutation is one consisting of an even (odd) number of transpositions (interchanges of two objects). Therefore $\epsilon_{ijk}$ is totally antisymmetric: it changes sign when any two indices are interchanged, e.g. $\epsilon_{jik} = -\epsilon_{ijk}$. It arises in vector products and determinants.

$\epsilon_{ijk}$ has three indices because the space has three dimensions. The even permutations of $(1,2,3)$ are $(1,2,3)$, $(2,3,1)$ and $(3,1,2)$, which also happen to be the cyclic permutations. So $\epsilon_{ijk} = \epsilon_{jki} = \epsilon_{kij}$.

Then we can write
$$\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{3} \sum_{j=1}^{3} \delta_{ij} a_i b_j = \sum_{i=1}^{3} a_i b_i$$
$$\mathbf{a} \times \mathbf{b} = \sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{k=1}^{3} \epsilon_{ijk} \mathbf{e}_i a_j b_k$$
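These component sums translate directly into array code. A minimal numerical sketch (assuming NumPy; the arrays and test vectors are illustrative, not from the notes) builds $\delta_{ij}$ and $\epsilon_{ijk}$ explicitly and checks both formulas against NumPy's built-in products:

```python
import numpy as np

# Kronecker delta and Levi-Civita symbol as explicit arrays
# (indices run over 0,1,2 rather than 1,2,3).
delta = np.eye(3)
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0   # even permutations
    eps[i, k, j] = -1.0  # odd permutations

a = np.array([1.0, 2.0, 3.0])
b = np.array([-4.0, 0.5, 2.0])

dot = np.einsum('ij,i,j->', delta, a, b)    # a.b = delta_ij a_i b_j
cross = np.einsum('ijk,j,k->i', eps, a, b)  # (a x b)_i = eps_ijk a_j b_k

assert np.isclose(dot, a @ b)
assert np.allclose(cross, np.cross(a, b))
```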

1.2.5 Einstein summation convention

The summation sign is conventionally omitted in expressions of this type. It is implicit that a repeated index is to be summed over. Thus
$$\mathbf{a} \cdot \mathbf{b} = a_i b_i, \qquad \mathbf{a} \times \mathbf{b} = \epsilon_{ijk} \mathbf{e}_i a_j b_k \quad\text{or}\quad (\mathbf{a} \times \mathbf{b})_i = \epsilon_{ijk} a_j b_k$$
The repeated index should occur exactly twice in any term. Examples of invalid notation are:
$$\mathbf{a} \cdot \mathbf{a} = a_i^2 \quad (\text{should be } a_i a_i)$$
$$(\mathbf{a} \cdot \mathbf{b})(\mathbf{c} \cdot \mathbf{d}) = a_i b_i c_i d_i \quad (\text{should be } a_i b_i c_j d_j)$$
The repeated index is a dummy index and can be relabelled at will. Other indices in an expression are free indices. The free indices in each term in an equation must agree.

Examples:
$$\mathbf{a} = \mathbf{b} \quad\Leftrightarrow\quad a_i = b_i$$
$$\mathbf{a} = \mathbf{b} \times \mathbf{c} \quad\Leftrightarrow\quad a_i = \epsilon_{ijk} b_j c_k$$
$$|\mathbf{a}|^2 = \mathbf{b} \cdot \mathbf{c} \quad\Leftrightarrow\quad a_i a_i = b_i c_i$$
$$\mathbf{a} = (\mathbf{b} \cdot \mathbf{c})\mathbf{d} + \mathbf{e} \times \mathbf{f} \quad\Leftrightarrow\quad a_i = b_j c_j d_i + \epsilon_{ijk} e_j f_k$$
Contraction means to set one free index equal to another, so that it is summed over. The contraction of $a_{ij}$ is $a_{ii}$. Contraction is equivalent to multiplication by a Kronecker delta:
$$a_{ij} \delta_{ij} = a_{11} + a_{22} + a_{33} = a_{ii}$$
The contraction of $\delta_{ij}$ is $\delta_{ii} = 1 + 1 + 1 = 3$ (in three dimensions).

If the summation convention is not being used, this should be noted explicitly.

1.2.6 Matrices and suffix notation

Matrix (A) times vector (x):
$$\mathbf{y} = \mathsf{A}\mathbf{x} \quad\Leftrightarrow\quad y_i = A_{ij} x_j$$
Matrix times matrix:
$$\mathsf{A} = \mathsf{B}\mathsf{C} \quad\Leftrightarrow\quad A_{ij} = B_{ik} C_{kj}$$
Transpose of a matrix:
$$(\mathsf{A}^{\mathrm{T}})_{ij} = A_{ji}$$
Trace of a matrix:
$$\operatorname{tr} \mathsf{A} = A_{ii}$$
Determinant of a $3 \times 3$ matrix:
$$\det \mathsf{A} = \epsilon_{ijk} A_{1i} A_{2j} A_{3k}$$
(or many equivalent expressions).
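The same einsum approach reproduces these formulas numerically; a short sketch with an arbitrary test matrix (NumPy assumed, illustration only):

```python
import numpy as np

eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0

A = np.array([[2.0, 1.0, 0.0],   # an arbitrary test matrix
              [0.5, 3.0, 1.0],
              [1.0, 0.0, 4.0]])
x = np.array([1.0, -1.0, 2.0])

y = np.einsum('ij,j->i', A, x)   # y_i = A_ij x_j
tr = np.einsum('ii->', A)        # tr A = A_ii
det = np.einsum('ijk,i,j,k->', eps, A[0], A[1], A[2])  # eps_ijk A_1i A_2j A_3k

assert np.allclose(y, A @ x)
assert np.isclose(tr, np.trace(A))
assert np.isclose(det, np.linalg.det(A))
```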

1.2.7 Product of two epsilons

The general identity
$$\epsilon_{ijk} \epsilon_{lmn} = \begin{vmatrix} \delta_{il} & \delta_{im} & \delta_{in} \\ \delta_{jl} & \delta_{jm} & \delta_{jn} \\ \delta_{kl} & \delta_{km} & \delta_{kn} \end{vmatrix}$$
can be established by the following argument. The value of both the LHS and the RHS:

• is 0 when any of $(i,j,k)$ are equal or when any of $(l,m,n)$ are equal
• is 1 when $(i,j,k) = (l,m,n) = (1,2,3)$
• changes sign when any of $(i,j,k)$ are interchanged or when any of $(l,m,n)$ are interchanged

These properties ensure that the LHS and RHS are equal for any choices of the indices. Note that the first property is in fact implied by the third.

We contract the identity once by setting $l = i$:
$$\epsilon_{ijk} \epsilon_{imn} = \begin{vmatrix} \delta_{ii} & \delta_{im} & \delta_{in} \\ \delta_{ji} & \delta_{jm} & \delta_{jn} \\ \delta_{ki} & \delta_{km} & \delta_{kn} \end{vmatrix}$$
$$= \delta_{ii}(\delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}) + \delta_{im}(\delta_{jn}\delta_{ki} - \delta_{ji}\delta_{kn}) + \delta_{in}(\delta_{ji}\delta_{km} - \delta_{jm}\delta_{ki})$$
$$= 3(\delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}) + (\delta_{jn}\delta_{km} - \delta_{jm}\delta_{kn}) + (\delta_{jn}\delta_{km} - \delta_{jm}\delta_{kn})$$
$$= \delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}$$
This is the most useful form to remember:
$$\epsilon_{ijk} \epsilon_{imn} = \delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}$$
Given any product of two epsilons with one common index, the indices can be permuted cyclically into this form, e.g.:
$$\epsilon_{\alpha\beta\gamma} \epsilon_{\mu\nu\beta} = \epsilon_{\beta\gamma\alpha} \epsilon_{\beta\mu\nu} = \delta_{\gamma\mu}\delta_{\alpha\nu} - \delta_{\gamma\nu}\delta_{\alpha\mu}$$
Further contractions of the identity:
$$\epsilon_{ijk} \epsilon_{ijn} = \delta_{jj}\delta_{kn} - \delta_{jn}\delta_{kj} = 3\delta_{kn} - \delta_{kn} = 2\delta_{kn}$$
$$\epsilon_{ijk} \epsilon_{ijk} = 6$$
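Because every index runs over only three values, the identity and its contractions can be verified by brute force. A sketch (NumPy assumed, not part of the original notes):

```python
import numpy as np

delta = np.eye(3)
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0

# eps_ijk eps_imn = delta_jm delta_kn - delta_jn delta_km, for all j,k,m,n
lhs = np.einsum('ijk,imn->jkmn', eps, eps)
rhs = (np.einsum('jm,kn->jkmn', delta, delta)
       - np.einsum('jn,km->jkmn', delta, delta))
assert np.allclose(lhs, rhs)

# Further contractions: eps_ijk eps_ijn = 2 delta_kn and eps_ijk eps_ijk = 6
assert np.allclose(np.einsum('ijk,ijn->kn', eps, eps), 2 * delta)
assert np.isclose(np.einsum('ijk,ijk->', eps, eps), 6.0)
```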

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Show that $(\mathbf{a} \times \mathbf{b}) \cdot (\mathbf{c} \times \mathbf{d}) = (\mathbf{a} \cdot \mathbf{c})(\mathbf{b} \cdot \mathbf{d}) - (\mathbf{a} \cdot \mathbf{d})(\mathbf{b} \cdot \mathbf{c})$.
$$(\mathbf{a} \times \mathbf{b}) \cdot (\mathbf{c} \times \mathbf{d}) = (\mathbf{a} \times \mathbf{b})_i (\mathbf{c} \times \mathbf{d})_i = (\epsilon_{ijk} a_j b_k)(\epsilon_{ilm} c_l d_m)$$
$$= \epsilon_{ijk} \epsilon_{ilm} a_j b_k c_l d_m = (\delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl}) a_j b_k c_l d_m$$
$$= a_j b_k c_j d_k - a_j b_k c_k d_j = (a_j c_j)(b_k d_k) - (a_j d_j)(b_k c_k)$$
$$= (\mathbf{a} \cdot \mathbf{c})(\mathbf{b} \cdot \mathbf{d}) - (\mathbf{a} \cdot \mathbf{d})(\mathbf{b} \cdot \mathbf{c})$$
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.3 Vector differential operators

1.3.1 The gradient operator

We consider a scalar field (function of position, e.g. temperature)
$$\Phi(x, y, z) = \Phi(\mathbf{r})$$
Taylor's theorem for a function of three variables:
$$\Phi(x + \delta x, y + \delta y, z + \delta z) = \Phi(x, y, z) + \frac{\partial \Phi}{\partial x}\delta x + \frac{\partial \Phi}{\partial y}\delta y + \frac{\partial \Phi}{\partial z}\delta z + O(\delta x^2, \delta x\,\delta y, \dots)$$
Equivalently
$$\Phi(\mathbf{r} + \delta\mathbf{r}) = \Phi(\mathbf{r}) + (\nabla\Phi) \cdot \delta\mathbf{r} + O(|\delta\mathbf{r}|^2)$$
where the gradient of the scalar field is
$$\nabla\Phi = \mathbf{e}_x \frac{\partial \Phi}{\partial x} + \mathbf{e}_y \frac{\partial \Phi}{\partial y} + \mathbf{e}_z \frac{\partial \Phi}{\partial z}$$
also written $\operatorname{grad} \Phi$. In suffix notation
$$\nabla\Phi = \mathbf{e}_i \frac{\partial \Phi}{\partial x_i}$$
For an infinitesimal increment we can write
$$\mathrm{d}\Phi = (\nabla\Phi) \cdot \mathrm{d}\mathbf{r}$$
We can consider $\nabla$ (del, grad or nabla) as a vector differential operator
$$\nabla = \mathbf{e}_x \frac{\partial}{\partial x} + \mathbf{e}_y \frac{\partial}{\partial y} + \mathbf{e}_z \frac{\partial}{\partial z} = \mathbf{e}_i \frac{\partial}{\partial x_i}$$
which acts on $\Phi$ to produce $\nabla\Phi$.

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Find $\nabla f$, where $f(r)$ is a function of $r = |\mathbf{r}|$.
$$\nabla f = \mathbf{e}_x \frac{\partial f}{\partial x} + \mathbf{e}_y \frac{\partial f}{\partial y} + \mathbf{e}_z \frac{\partial f}{\partial z}, \qquad \frac{\partial f}{\partial x} = \frac{\mathrm{d}f}{\mathrm{d}r} \frac{\partial r}{\partial x}$$
$$r^2 = x^2 + y^2 + z^2 \quad\Rightarrow\quad 2r \frac{\partial r}{\partial x} = 2x \quad\Rightarrow\quad \frac{\partial f}{\partial x} = \frac{\mathrm{d}f}{\mathrm{d}r} \frac{x}{r}, \text{ etc.}$$
$$\nabla f = \frac{\mathrm{d}f}{\mathrm{d}r} \left( \frac{x}{r}, \frac{y}{r}, \frac{z}{r} \right) = \frac{\mathrm{d}f}{\mathrm{d}r} \frac{\mathbf{r}}{r}$$

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.3.2 Geometrical meaning of the gradient

The rate of change of $\Phi$ with distance $s$ in the direction of the unit vector $\mathbf{t}$, evaluated at the point $\mathbf{r}$, is
$$\frac{\mathrm{d}}{\mathrm{d}s} \left[ \Phi(\mathbf{r} + \mathbf{t}s) - \Phi(\mathbf{r}) \right] \bigg|_{s=0} = \frac{\mathrm{d}}{\mathrm{d}s} \left[ (\nabla\Phi) \cdot (\mathbf{t}s) + O(s^2) \right] \bigg|_{s=0} = \mathbf{t} \cdot \nabla\Phi$$
This is known as a directional derivative.

Notes:

• the derivative is maximal in the direction $\mathbf{t} \parallel \nabla\Phi$
• the derivative is zero in directions such that $\mathbf{t} \perp \nabla\Phi$
• the directions $\mathbf{t} \perp \nabla\Phi$ therefore lie in the plane tangent to the surface $\Phi = \text{constant}$

We conclude that:

• $\nabla\Phi$ is a vector field, pointing in the direction of increasing $\Phi$
• the unit vector normal to the surface $\Phi = \text{constant}$ is $\mathbf{n} = \nabla\Phi / |\nabla\Phi|$
• the rate of change of $\Phi$ with arclength $s$ along a curve is $\mathrm{d}\Phi/\mathrm{d}s = \mathbf{t} \cdot \nabla\Phi$, where $\mathbf{t} = \mathrm{d}\mathbf{r}/\mathrm{d}s$ is the unit tangent vector to the curve

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Find the unit normal at the point $\mathbf{r} = (x, y, z)$ to the surface
$$\Phi(\mathbf{r}) \equiv xy + yz + zx = -c^2$$
where $c$ is a constant. Hence find the points where the plane tangent to the surface is parallel to the $(x, y)$ plane.
$$\nabla\Phi = (y + z, \; x + z, \; y + x)$$
$$\mathbf{n} = \frac{\nabla\Phi}{|\nabla\Phi|} = \frac{(y + z, \; x + z, \; y + x)}{\sqrt{2(x^2 + y^2 + z^2 + xy + xz + yz)}}$$
$$\mathbf{n} \parallel \mathbf{e}_z \quad\text{when}\quad y + z = x + z = 0 \quad\Rightarrow\quad -z^2 = -c^2 \quad\Rightarrow\quad z = \pm c$$
solutions $(-c, -c, c)$, $(c, c, -c)$
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.3.3 Related vector differential operators

We now consider a general vector field (e.g. electric field)
$$\mathbf{F}(\mathbf{r}) = \mathbf{e}_x F_x(\mathbf{r}) + \mathbf{e}_y F_y(\mathbf{r}) + \mathbf{e}_z F_z(\mathbf{r}) = \mathbf{e}_i F_i(\mathbf{r})$$
The divergence of a vector field is the scalar field
$$\nabla \cdot \mathbf{F} = \left( \mathbf{e}_x \frac{\partial}{\partial x} + \mathbf{e}_y \frac{\partial}{\partial y} + \mathbf{e}_z \frac{\partial}{\partial z} \right) \cdot (\mathbf{e}_x F_x + \mathbf{e}_y F_y + \mathbf{e}_z F_z) = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}$$
also written $\operatorname{div} \mathbf{F}$. Note that the Cartesian unit vectors are independent of position and do not need to be differentiated. In suffix notation
$$\nabla \cdot \mathbf{F} = \frac{\partial F_i}{\partial x_i}$$
The curl of a vector field is the vector field
$$\nabla \times \mathbf{F} = \left( \mathbf{e}_x \frac{\partial}{\partial x} + \mathbf{e}_y \frac{\partial}{\partial y} + \mathbf{e}_z \frac{\partial}{\partial z} \right) \times (\mathbf{e}_x F_x + \mathbf{e}_y F_y + \mathbf{e}_z F_z)$$
$$= \mathbf{e}_x \left( \frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z} \right) + \mathbf{e}_y \left( \frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x} \right) + \mathbf{e}_z \left( \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} \right) = \begin{vmatrix} \mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ F_x & F_y & F_z \end{vmatrix}$$
also written $\operatorname{curl} \mathbf{F}$. In suffix notation
$$\nabla \times \mathbf{F} = \mathbf{e}_i \epsilon_{ijk} \frac{\partial F_k}{\partial x_j} \quad\text{or}\quad (\nabla \times \mathbf{F})_i = \epsilon_{ijk} \frac{\partial F_k}{\partial x_j}$$

The Laplacian of a scalar field is the scalar field
$$\nabla^2 \Phi = \nabla \cdot (\nabla\Phi) = \frac{\partial^2 \Phi}{\partial x^2} + \frac{\partial^2 \Phi}{\partial y^2} + \frac{\partial^2 \Phi}{\partial z^2} = \frac{\partial^2 \Phi}{\partial x_i \partial x_i}$$
The Laplacian differential operator (del squared) is
$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} = \frac{\partial^2}{\partial x_i \partial x_i}$$
It appears very commonly in partial differential equations. The Laplacian of a vector field can also be defined. In suffix notation
$$\nabla^2 \mathbf{F} = \mathbf{e}_i \frac{\partial^2 F_i}{\partial x_j \partial x_j}$$
because the Cartesian unit vectors are independent of position.

The directional derivative operator is
$$\mathbf{t} \cdot \nabla = t_x \frac{\partial}{\partial x} + t_y \frac{\partial}{\partial y} + t_z \frac{\partial}{\partial z} = t_i \frac{\partial}{\partial x_i}$$
$\mathbf{t} \cdot \nabla\Phi$ is the rate of change of $\Phi$ with distance in the direction of the unit vector $\mathbf{t}$. This can be thought of either as $\mathbf{t} \cdot (\nabla\Phi)$ or as $(\mathbf{t} \cdot \nabla)\Phi$.

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Find the divergence and curl of the vector field $\mathbf{F} = (x^2 y, \; y^2 z, \; z^2 x)$.
$$\nabla \cdot \mathbf{F} = \frac{\partial}{\partial x}(x^2 y) + \frac{\partial}{\partial y}(y^2 z) + \frac{\partial}{\partial z}(z^2 x) = 2(xy + yz + zx)$$
$$\nabla \times \mathbf{F} = \begin{vmatrix} \mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ x^2 y & y^2 z & z^2 x \end{vmatrix} = (-y^2, \; -z^2, \; -x^2)$$
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
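A symbolic cross-check of this example, assuming SymPy is available (illustration only, not part of the notes):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
F = [x**2 * y, y**2 * z, z**2 * x]

# Divergence and curl from their Cartesian definitions
div = sp.diff(F[0], x) + sp.diff(F[1], y) + sp.diff(F[2], z)
curl = [sp.diff(F[2], y) - sp.diff(F[1], z),
        sp.diff(F[0], z) - sp.diff(F[2], x),
        sp.diff(F[1], x) - sp.diff(F[0], y)]

assert sp.expand(div - 2*(x*y + y*z + z*x)) == 0
assert curl == [-y**2, -z**2, -x**2]
```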

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Find $\nabla^2 r^n$.
$$\nabla r^n = \left( \frac{\partial r^n}{\partial x}, \frac{\partial r^n}{\partial y}, \frac{\partial r^n}{\partial z} \right) = n r^{n-2} (x, y, z) \qquad \left(\text{recall } \frac{\partial r}{\partial x} = \frac{x}{r}, \text{ etc.}\right)$$
$$\nabla^2 r^n = \nabla \cdot \nabla r^n = \frac{\partial}{\partial x}(n r^{n-2} x) + \frac{\partial}{\partial y}(n r^{n-2} y) + \frac{\partial}{\partial z}(n r^{n-2} z)$$
$$= n r^{n-2} + n(n-2) r^{n-4} x^2 + \cdots = 3n r^{n-2} + n(n-2) r^{n-2} = n(n+1) r^{n-2}$$
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
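The same result can be confirmed symbolically with $n$ kept as a symbol; a sketch assuming SymPy:

```python
import sympy as sp

x, y, z, n = sp.symbols('x y z n')
r = sp.sqrt(x**2 + y**2 + z**2)

# Laplacian of r**n computed term by term in Cartesian coordinates
lap = sum(sp.diff(r**n, v, 2) for v in (x, y, z))
assert sp.simplify(lap - n*(n + 1)*r**(n - 2)) == 0
```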

1.3.4 Vector invariance

A scalar is invariant under a change of basis, while vector components transform in a particular way under a rotation.

Fields constructed using $\nabla$ share these properties, e.g.:

• $\nabla\Phi$ and $\nabla \times \mathbf{F}$ are vector fields and their components depend on the basis
• $\nabla \cdot \mathbf{F}$ and $\nabla^2 \Phi$ are scalar fields and are invariant under a rotation

grad, div and $\nabla^2$ can be defined in spaces of any dimension, but curl (like the vector product) is a three-dimensional concept.

1.3.5 Vector differential identities

Here $\Phi$ and $\Psi$ are arbitrary scalar fields, and $\mathbf{F}$ and $\mathbf{G}$ are arbitrary vector fields.

Two operators, one field:
$$\nabla \cdot (\nabla\Phi) = \nabla^2 \Phi$$
$$\nabla \cdot (\nabla \times \mathbf{F}) = 0$$
$$\nabla \times (\nabla\Phi) = \mathbf{0}$$
$$\nabla \times (\nabla \times \mathbf{F}) = \nabla(\nabla \cdot \mathbf{F}) - \nabla^2 \mathbf{F}$$
One operator, two fields:
$$\nabla(\Phi\Psi) = \Psi \nabla\Phi + \Phi \nabla\Psi$$
$$\nabla(\mathbf{F} \cdot \mathbf{G}) = (\mathbf{G} \cdot \nabla)\mathbf{F} + \mathbf{G} \times (\nabla \times \mathbf{F}) + (\mathbf{F} \cdot \nabla)\mathbf{G} + \mathbf{F} \times (\nabla \times \mathbf{G})$$
$$\nabla \cdot (\Phi\mathbf{F}) = (\nabla\Phi) \cdot \mathbf{F} + \Phi \nabla \cdot \mathbf{F}$$
$$\nabla \cdot (\mathbf{F} \times \mathbf{G}) = \mathbf{G} \cdot (\nabla \times \mathbf{F}) - \mathbf{F} \cdot (\nabla \times \mathbf{G})$$
$$\nabla \times (\Phi\mathbf{F}) = (\nabla\Phi) \times \mathbf{F} + \Phi \nabla \times \mathbf{F}$$
$$\nabla \times (\mathbf{F} \times \mathbf{G}) = (\mathbf{G} \cdot \nabla)\mathbf{F} - \mathbf{G}(\nabla \cdot \mathbf{F}) - (\mathbf{F} \cdot \nabla)\mathbf{G} + \mathbf{F}(\nabla \cdot \mathbf{G})$$

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Show that $\nabla \cdot (\nabla \times \mathbf{F}) = 0$ for any (twice-differentiable) vector field $\mathbf{F}$.
$$\nabla \cdot (\nabla \times \mathbf{F}) = \frac{\partial}{\partial x_i} \left( \epsilon_{ijk} \frac{\partial F_k}{\partial x_j} \right) = \epsilon_{ijk} \frac{\partial^2 F_k}{\partial x_i \partial x_j} = 0$$
(since $\epsilon_{ijk}$ is antisymmetric on $(i, j)$ while $\partial^2 F_k / \partial x_i \partial x_j$ is symmetric)

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Show that $\nabla \times (\mathbf{F} \times \mathbf{G}) = (\mathbf{G} \cdot \nabla)\mathbf{F} - \mathbf{G}(\nabla \cdot \mathbf{F}) - (\mathbf{F} \cdot \nabla)\mathbf{G} + \mathbf{F}(\nabla \cdot \mathbf{G})$.
$$\nabla \times (\mathbf{F} \times \mathbf{G}) = \mathbf{e}_i \epsilon_{ijk} \frac{\partial}{\partial x_j} (\epsilon_{klm} F_l G_m) = \mathbf{e}_i \epsilon_{kij} \epsilon_{klm} \frac{\partial}{\partial x_j} (F_l G_m)$$
$$= \mathbf{e}_i (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}) \left( G_m \frac{\partial F_l}{\partial x_j} + F_l \frac{\partial G_m}{\partial x_j} \right)$$
$$= \mathbf{e}_i G_j \frac{\partial F_i}{\partial x_j} - \mathbf{e}_i G_i \frac{\partial F_j}{\partial x_j} + \mathbf{e}_i F_i \frac{\partial G_j}{\partial x_j} - \mathbf{e}_i F_j \frac{\partial G_i}{\partial x_j}$$
$$= (\mathbf{G} \cdot \nabla)\mathbf{F} - \mathbf{G}(\nabla \cdot \mathbf{F}) + \mathbf{F}(\nabla \cdot \mathbf{G}) - (\mathbf{F} \cdot \nabla)\mathbf{G}$$
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Notes:

• be clear about which terms to the right an operator is acting on (use brackets if necessary)
• you cannot simply apply standard vector identities to expressions involving $\nabla$, e.g. $\nabla \cdot (\mathbf{F} \times \mathbf{G}) \neq (\nabla \times \mathbf{F}) \cdot \mathbf{G}$

Related results:

• if a vector field $\mathbf{F}$ is irrotational ($\nabla \times \mathbf{F} = \mathbf{0}$) in a region of space, it can be written as the gradient of a scalar potential: $\mathbf{F} = \nabla\Phi$, e.g. a 'conservative' force field such as gravity
• if a vector field $\mathbf{F}$ is solenoidal ($\nabla \cdot \mathbf{F} = 0$) in a region of space, it can be written as the curl of a vector potential: $\mathbf{F} = \nabla \times \mathbf{G}$, e.g. the magnetic field

1.4 Integral theorems

These very important results derive from the fundamental theorem of calculus (integration is the inverse of differentiation):
$$\int_a^b \frac{\mathrm{d}f}{\mathrm{d}x} \, \mathrm{d}x = f(b) - f(a)$$

1.4.1 The gradient theorem

$$\int_C (\nabla\Phi) \cdot \mathrm{d}\mathbf{r} = \Phi(\mathbf{r}_2) - \Phi(\mathbf{r}_1)$$
where $C$ is any curve from $\mathbf{r}_1$ to $\mathbf{r}_2$.

Outline proof:
$$\int_C (\nabla\Phi) \cdot \mathrm{d}\mathbf{r} = \int_C \mathrm{d}\Phi = \Phi(\mathbf{r}_2) - \Phi(\mathbf{r}_1)$$

1.4.2 The divergence theorem (Gauss’s theorem)

$$\int_V (\nabla \cdot \mathbf{F}) \, \mathrm{d}V = \int_S \mathbf{F} \cdot \mathrm{d}\mathbf{S}$$
where $V$ is a volume bounded by the closed surface $S$ (also called $\partial V$). The right-hand side is the flux of $\mathbf{F}$ through the surface $S$. The vector surface element is $\mathrm{d}\mathbf{S} = \mathbf{n} \, \mathrm{d}S$, where $\mathbf{n}$ is the outward unit normal vector.

Outline proof: first prove for a cuboid:
$$\int_V (\nabla \cdot \mathbf{F}) \, \mathrm{d}V = \int_{z_1}^{z_2} \int_{y_1}^{y_2} \int_{x_1}^{x_2} \left( \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \right) \mathrm{d}x \, \mathrm{d}y \, \mathrm{d}z$$
$$= \int_{z_1}^{z_2} \int_{y_1}^{y_2} \left[ F_x(x_2, y, z) - F_x(x_1, y, z) \right] \mathrm{d}y \, \mathrm{d}z + \text{two similar terms} = \int_S \mathbf{F} \cdot \mathrm{d}\mathbf{S}$$
An arbitrary volume $V$ can be subdivided into small cuboids to any desired accuracy. When the integrals are added together, the fluxes through internal surfaces cancel out, leaving only the flux through $S$.

A simply connected volume (e.g. a ball) is one with no holes. It has only an outer surface. For a multiply connected volume (e.g. a spherical shell), all the surfaces must be considered.

Related results:
$$\int_V (\nabla\Phi) \, \mathrm{d}V = \int_S \Phi \, \mathrm{d}\mathbf{S}$$
$$\int_V (\nabla \times \mathbf{F}) \, \mathrm{d}V = \int_S \mathrm{d}\mathbf{S} \times \mathbf{F}$$
The rule is to replace $\nabla$ in the volume integral with $\mathbf{n}$ in the surface integral, and $\mathrm{d}V$ with $\mathrm{d}S$ (note that $\mathrm{d}\mathbf{S} = \mathbf{n} \, \mathrm{d}S$).

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ A submerged body is acted on by a hydrostatic pressure $p = -\rho g z$, where $\rho$ is the density of the fluid, $g$ is the gravitational acceleration and $z$ is the vertical coordinate. Find a simplified expression for the pressure force acting on the body.
$$\mathbf{F} = -\int_S p \, \mathrm{d}\mathbf{S}$$
$$F_z = \mathbf{e}_z \cdot \mathbf{F} = \int_S (-\mathbf{e}_z p) \cdot \mathrm{d}\mathbf{S} = \int_V \nabla \cdot (-\mathbf{e}_z p) \, \mathrm{d}V = \int_V \frac{\partial}{\partial z}(\rho g z) \, \mathrm{d}V = \rho g \int_V \mathrm{d}V = Mg$$
($M$ is the mass of fluid displaced by the body). Similarly
$$F_x = \int_V \frac{\partial}{\partial x}(\rho g z) \, \mathrm{d}V = 0, \qquad F_y = 0$$
$$\mathbf{F} = \mathbf{e}_z Mg$$
Archimedes' principle: the buoyancy force equals the weight of the displaced fluid.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
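A numerical sanity check of the divergence theorem (illustrative choices of field and volume, assuming SciPy): for $\mathbf{F} = (x, y, z)$ on the unit cube, $\nabla \cdot \mathbf{F} = 3$ everywhere, so the total outward flux must equal 3.

```python
import numpy as np
from scipy.integrate import dblquad

# Sum the outward flux of F = (x, y, z) over the six faces of [0,1]^3.
flux = 0.0
for axis in range(3):                      # faces perpendicular to x, y, z
    for side, sign in ((0.0, -1.0), (1.0, +1.0)):
        def integrand(u, v, axis=axis, side=side, sign=sign):
            point = [u, v]
            point.insert(axis, side)       # place the fixed coordinate
            return sign * point[axis]      # n.F = +/- F_axis on this face
        val, _ = dblquad(integrand, 0.0, 1.0, lambda t: 0.0, lambda t: 1.0)
        flux += val

assert np.isclose(flux, 3.0)               # equals the volume integral of div F
```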

1.4.3 The curl theorem (Stokes’s theorem)

$$\int_S (\nabla \times \mathbf{F}) \cdot \mathrm{d}\mathbf{S} = \int_C \mathbf{F} \cdot \mathrm{d}\mathbf{r}$$
where $S$ is an open surface bounded by the closed curve $C$ (also called $\partial S$). The right-hand side is the circulation of $\mathbf{F}$ around the curve $C$. Whichever way the unit normal $\mathbf{n}$ is defined on $S$, the line integral follows the direction of a right-handed screw around $\mathbf{n}$.

Special case: for a planar surface in the $(x, y)$ plane, we have Green's theorem in the plane:
$$\iint_A \left( \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} \right) \mathrm{d}x \, \mathrm{d}y = \int_C (F_x \, \mathrm{d}x + F_y \, \mathrm{d}y)$$
where $A$ is a region of the plane bounded by the curve $C$, and the line integral follows a positive sense.

Outline proof: first prove Green's theorem for a rectangle:
$$\int_{y_1}^{y_2} \int_{x_1}^{x_2} \left( \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} \right) \mathrm{d}x \, \mathrm{d}y = \int_{y_1}^{y_2} \left[ F_y(x_2, y) - F_y(x_1, y) \right] \mathrm{d}y - \int_{x_1}^{x_2} \left[ F_x(x, y_2) - F_x(x, y_1) \right] \mathrm{d}x$$
$$= \int_{x_1}^{x_2} F_x(x, y_1) \, \mathrm{d}x + \int_{y_1}^{y_2} F_y(x_2, y) \, \mathrm{d}y + \int_{x_2}^{x_1} F_x(x, y_2) \, \mathrm{d}x + \int_{y_2}^{y_1} F_y(x_1, y) \, \mathrm{d}y = \int_C (F_x \, \mathrm{d}x + F_y \, \mathrm{d}y)$$
An arbitrary surface $S$ can be subdivided into small planar rectangles to any desired accuracy. When the integrals are added together, the circulations along internal curve segments cancel out, leaving only the circulation around $C$.

Notes:

• in Stokes's theorem $S$ is an open surface, while in Gauss's theorem it is closed
• many different surfaces are bounded by the same closed curve, while only one volume is bounded by a closed surface
• a multiply connected surface (e.g. an annulus) may have more than one bounding curve

1.4.4 Geometrical definitions of grad, div and curl

The integral theorems can be used to assign coordinate-independent meanings to grad, div and curl.

Apply the gradient theorem to an arbitrarily small line segment $\delta\mathbf{r} = \mathbf{t} \, \delta s$ in the direction of any unit vector $\mathbf{t}$. Since the variation of $\nabla\Phi$ and $\mathbf{t}$ along the line segment is negligible,
$$(\nabla\Phi) \cdot \mathbf{t} \, \delta s \approx \delta\Phi$$
and so
$$\mathbf{t} \cdot (\nabla\Phi) = \lim_{\delta s \to 0} \frac{\delta\Phi}{\delta s}$$
This definition can be used to determine the component of $\nabla\Phi$ in any desired direction.

Similarly, by applying the divergence theorem to an arbitrarily small volume $\delta V$ bounded by a surface $\delta S$, we find that
$$\nabla \cdot \mathbf{F} = \lim_{\delta V \to 0} \frac{1}{\delta V} \int_{\delta S} \mathbf{F} \cdot \mathrm{d}\mathbf{S}$$
Finally, by applying the curl theorem to an arbitrarily small open surface $\delta S$ with unit normal vector $\mathbf{n}$ and bounded by a curve $\delta C$, we find that
$$\mathbf{n} \cdot (\nabla \times \mathbf{F}) = \lim_{\delta S \to 0} \frac{1}{\delta S} \int_{\delta C} \mathbf{F} \cdot \mathrm{d}\mathbf{r}$$
The gradient therefore describes the rate of change of a scalar field with distance. The divergence describes the net source or efflux of a vector field per unit volume. The curl describes the circulation or rotation of a vector field per unit area.

1.5 Orthogonal curvilinear coordinates

Cartesian coordinates can be replaced with any independent set of coordinates $q_1(x_1, x_2, x_3)$, $q_2(x_1, x_2, x_3)$, $q_3(x_1, x_2, x_3)$, e.g. cylindrical or spherical polar coordinates.

Curvilinear (as opposed to rectilinear) means that the coordinate 'axes' are curves. Curvilinear coordinates are useful for solving problems in curved geometry (e.g. geophysics).

1.5.1 Line element

The infinitesimal line element in Cartesian coordinates is
$$\mathrm{d}\mathbf{r} = \mathbf{e}_x \, \mathrm{d}x + \mathbf{e}_y \, \mathrm{d}y + \mathbf{e}_z \, \mathrm{d}z$$
In general curvilinear coordinates we have
$$\mathrm{d}\mathbf{r} = \mathbf{h}_1 \, \mathrm{d}q_1 + \mathbf{h}_2 \, \mathrm{d}q_2 + \mathbf{h}_3 \, \mathrm{d}q_3$$
where
$$\mathbf{h}_i = \mathbf{e}_i h_i = \frac{\partial \mathbf{r}}{\partial q_i} \quad \text{(no sum)}$$
determines the displacement associated with an increment of the coordinate $q_i$.

$h_i = |\mathbf{h}_i|$ is the scale factor (or metric coefficient) associated with the coordinate $q_i$. It converts a coordinate increment into a length. Any point at which $h_i = 0$ is a coordinate singularity at which the coordinate system breaks down.

$\mathbf{e}_i$ is the corresponding unit vector. This notation generalizes the use of $\mathbf{e}_i$ for a Cartesian unit vector. For Cartesian coordinates, $h_i = 1$ and $\mathbf{e}_i$ are constant, but in general both $h_i$ and $\mathbf{e}_i$ depend on position.

The summation convention does not work well with orthogonal curvilinear coordinates.

1.5.2 The Jacobian

The Jacobian of $(x, y, z)$ with respect to $(q_1, q_2, q_3)$ is defined as
$$J = \frac{\partial(x, y, z)}{\partial(q_1, q_2, q_3)} = \begin{vmatrix} \partial x/\partial q_1 & \partial x/\partial q_2 & \partial x/\partial q_3 \\ \partial y/\partial q_1 & \partial y/\partial q_2 & \partial y/\partial q_3 \\ \partial z/\partial q_1 & \partial z/\partial q_2 & \partial z/\partial q_3 \end{vmatrix}$$
This is the determinant of the Jacobian matrix of the transformation from coordinates $(q_1, q_2, q_3)$ to $(x_1, x_2, x_3)$. The columns of the above matrix are the vectors $\mathbf{h}_i$ defined above. Therefore the Jacobian is equal to the scalar triple product
$$J = [\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3] = \mathbf{h}_1 \cdot \mathbf{h}_2 \times \mathbf{h}_3$$
Given a point with curvilinear coordinates $(q_1, q_2, q_3)$, consider three small displacements $\delta\mathbf{r}_1 = \mathbf{h}_1 \, \delta q_1$, $\delta\mathbf{r}_2 = \mathbf{h}_2 \, \delta q_2$ and $\delta\mathbf{r}_3 = \mathbf{h}_3 \, \delta q_3$ along the three coordinate directions. They span a parallelepiped of volume
$$\delta V = |[\delta\mathbf{r}_1, \delta\mathbf{r}_2, \delta\mathbf{r}_3]| = |J| \, \delta q_1 \, \delta q_2 \, \delta q_3$$
Hence the volume element in a general curvilinear coordinate system is
$$\mathrm{d}V = \left| \frac{\partial(x, y, z)}{\partial(q_1, q_2, q_3)} \right| \mathrm{d}q_1 \, \mathrm{d}q_2 \, \mathrm{d}q_3$$
The Jacobian therefore appears whenever changing variables in a multiple integral:
$$\int \Phi(\mathbf{r}) \, \mathrm{d}V = \iiint \Phi \, \mathrm{d}x \, \mathrm{d}y \, \mathrm{d}z = \iiint \Phi \left| \frac{\partial(x, y, z)}{\partial(q_1, q_2, q_3)} \right| \mathrm{d}q_1 \, \mathrm{d}q_2 \, \mathrm{d}q_3$$
The limits on the integrals also need to be considered.

Jacobians are defined similarly for transformations in any number of dimensions. If curvilinear coordinates $(q_1, q_2)$ are introduced in the $(x, y)$-plane, the area element is
$$\mathrm{d}A = |J| \, \mathrm{d}q_1 \, \mathrm{d}q_2 \quad\text{with}\quad J = \frac{\partial(x, y)}{\partial(q_1, q_2)} = \begin{vmatrix} \partial x/\partial q_1 & \partial x/\partial q_2 \\ \partial y/\partial q_1 & \partial y/\partial q_2 \end{vmatrix}$$
The equivalent rule for a one-dimensional integral is
$$\int f(x) \, \mathrm{d}x = \int f(x(q)) \left| \frac{\mathrm{d}x}{\mathrm{d}q} \right| \mathrm{d}q$$
The modulus sign appears here if the integrals are carried out over a positive range (the upper limits are greater than the lower limits).
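As a concrete illustration (a sketch assuming SymPy; it borrows the spherical polar map defined in section 1.5.5 below), the Jacobian $\partial(x, y, z)/\partial(r, \theta, \phi)$ reproduces the familiar volume-element factor $r^2 \sin\theta$:

```python
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)

# Spherical polar map (x, y, z) as functions of (r, theta, phi)
X = sp.Matrix([r*sp.sin(th)*sp.cos(ph),
               r*sp.sin(th)*sp.sin(ph),
               r*sp.cos(th)])

J = X.jacobian([r, th, ph]).det()
assert sp.simplify(J - r**2*sp.sin(th)) == 0
```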

1.5.3 Properties of Jacobians

Consider now three sets of $n$ variables $\alpha_i$, $\beta_i$ and $\gamma_i$, with $1 \le i \le n$, none of which need be Cartesian coordinates. According to the chain rule of partial differentiation,
$$\frac{\partial \alpha_i}{\partial \gamma_j} = \sum_{k=1}^{n} \frac{\partial \alpha_i}{\partial \beta_k} \frac{\partial \beta_k}{\partial \gamma_j}$$
(Under the summation convention we may omit the $\Sigma$ sign.) The left-hand side is the $ij$-component of the Jacobian matrix of the transformation from $\gamma_i$ to $\alpha_i$, and the equation states that this matrix is the product of the Jacobian matrices of the transformations from $\gamma_i$ to $\beta_i$ and from $\beta_i$ to $\alpha_i$. Taking the determinant of this matrix equation, we find
$$\frac{\partial(\alpha_1, \dots, \alpha_n)}{\partial(\gamma_1, \dots, \gamma_n)} = \frac{\partial(\alpha_1, \dots, \alpha_n)}{\partial(\beta_1, \dots, \beta_n)} \, \frac{\partial(\beta_1, \dots, \beta_n)}{\partial(\gamma_1, \dots, \gamma_n)}$$
This is the chain rule for Jacobians: the Jacobian of a composite transformation is the product of the Jacobians of the transformations of which it is composed.

In the special case in which $\gamma_i = \alpha_i$ for all $i$, the left-hand side is 1 (the determinant of the unit matrix) and we obtain
$$\frac{\partial(\alpha_1, \dots, \alpha_n)}{\partial(\beta_1, \dots, \beta_n)} = \left[ \frac{\partial(\beta_1, \dots, \beta_n)}{\partial(\alpha_1, \dots, \alpha_n)} \right]^{-1}$$
The Jacobian of an inverse transformation is therefore the reciprocal of that of the forward transformation. This is a multidimensional generalization of the result $\mathrm{d}x/\mathrm{d}y = (\mathrm{d}y/\mathrm{d}x)^{-1}$.

1.5.4 Orthogonality of coordinates

Calculus in general curvilinear coordinates is difficult. We can make things easier by choosing the coordinates to be orthogonal:
$$\mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}$$
and right-handed:
$$\mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3$$
The squared line element is then
$$|\mathrm{d}\mathbf{r}|^2 = |\mathbf{e}_1 h_1 \, \mathrm{d}q_1 + \mathbf{e}_2 h_2 \, \mathrm{d}q_2 + \mathbf{e}_3 h_3 \, \mathrm{d}q_3|^2 = h_1^2 \, \mathrm{d}q_1^2 + h_2^2 \, \mathrm{d}q_2^2 + h_3^2 \, \mathrm{d}q_3^2$$
There are no cross terms such as $\mathrm{d}q_1 \, \mathrm{d}q_2$.

When oriented along the coordinate directions:

• line element $\mathrm{d}\mathbf{r} = \mathbf{e}_1 h_1 \, \mathrm{d}q_1$
• surface element $\mathrm{d}\mathbf{S} = \mathbf{e}_3 h_1 h_2 \, \mathrm{d}q_1 \, \mathrm{d}q_2$
• volume element $\mathrm{d}V = h_1 h_2 h_3 \, \mathrm{d}q_1 \, \mathrm{d}q_2 \, \mathrm{d}q_3$

Note that, for orthogonal coordinates, $J = \mathbf{h}_1 \cdot \mathbf{h}_2 \times \mathbf{h}_3 = h_1 h_2 h_3$.

1.5.5 Commonly used orthogonal coordinate systems

Cartesian coordinates $(x, y, z)$:
$$-\infty < x < \infty, \quad -\infty < y < \infty, \quad -\infty < z < \infty$$
$$\mathbf{r} = (x, y, z)$$
$$\mathbf{h}_x = \frac{\partial \mathbf{r}}{\partial x} = (1, 0, 0), \quad \mathbf{h}_y = \frac{\partial \mathbf{r}}{\partial y} = (0, 1, 0), \quad \mathbf{h}_z = \frac{\partial \mathbf{r}}{\partial z} = (0, 0, 1)$$
$$h_x = 1, \; \mathbf{e}_x = (1, 0, 0); \qquad h_y = 1, \; \mathbf{e}_y = (0, 1, 0); \qquad h_z = 1, \; \mathbf{e}_z = (0, 0, 1)$$
$$\mathbf{r} = x \, \mathbf{e}_x + y \, \mathbf{e}_y + z \, \mathbf{e}_z, \qquad \mathrm{d}V = \mathrm{d}x \, \mathrm{d}y \, \mathrm{d}z$$
Orthogonal. No singularities.

Cylindrical polar coordinates $(\rho, \phi, z)$:
$$0 < \rho < \infty, \quad 0 \le \phi < 2\pi, \quad -\infty < z < \infty$$
$$\mathbf{r} = (x, y, z) = (\rho\cos\phi, \; \rho\sin\phi, \; z)$$
$$\mathbf{h}_\rho = \frac{\partial \mathbf{r}}{\partial \rho} = (\cos\phi, \sin\phi, 0), \quad \mathbf{h}_\phi = \frac{\partial \mathbf{r}}{\partial \phi} = (-\rho\sin\phi, \rho\cos\phi, 0), \quad \mathbf{h}_z = \frac{\partial \mathbf{r}}{\partial z} = (0, 0, 1)$$
$$h_\rho = 1, \; \mathbf{e}_\rho = (\cos\phi, \sin\phi, 0); \qquad h_\phi = \rho, \; \mathbf{e}_\phi = (-\sin\phi, \cos\phi, 0); \qquad h_z = 1, \; \mathbf{e}_z = (0, 0, 1)$$
$$\mathbf{r} = \rho \, \mathbf{e}_\rho + z \, \mathbf{e}_z, \qquad \mathrm{d}V = \rho \, \mathrm{d}\rho \, \mathrm{d}\phi \, \mathrm{d}z$$
Orthogonal. Singular on the axis $\rho = 0$.

Warning. Many authors use $r$ for $\rho$ and $\theta$ for $\phi$. This is confusing because $r$ and $\theta$ then have different meanings in cylindrical and spherical polar coordinates. Instead of $\rho$, which is useful for other things, some authors use $R$, $s$ or $\varpi$.

Spherical polar coordinates $(r, \theta, \phi)$:
$$0 < r < \infty, \quad 0 < \theta < \pi, \quad 0 \le \phi < 2\pi$$
$$\mathbf{r} = (x, y, z) = (r\sin\theta\cos\phi, \; r\sin\theta\sin\phi, \; r\cos\theta)$$
$$\mathbf{h}_r = \frac{\partial \mathbf{r}}{\partial r} = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$$
$$\mathbf{h}_\theta = \frac{\partial \mathbf{r}}{\partial \theta} = (r\cos\theta\cos\phi, r\cos\theta\sin\phi, -r\sin\theta)$$
$$\mathbf{h}_\phi = \frac{\partial \mathbf{r}}{\partial \phi} = (-r\sin\theta\sin\phi, r\sin\theta\cos\phi, 0)$$
$$h_r = 1, \; \mathbf{e}_r = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$$
$$h_\theta = r, \; \mathbf{e}_\theta = (\cos\theta\cos\phi, \cos\theta\sin\phi, -\sin\theta)$$
$$h_\phi = r\sin\theta, \; \mathbf{e}_\phi = (-\sin\phi, \cos\phi, 0)$$
$$\mathbf{r} = r \, \mathbf{e}_r, \qquad \mathrm{d}V = r^2 \sin\theta \, \mathrm{d}r \, \mathrm{d}\theta \, \mathrm{d}\phi$$
Orthogonal. Singular on the axis $r = 0$, $\theta = 0$ and $\theta = \pi$.

Notes:

• cylindrical and spherical are related by $\rho = r\sin\theta$, $z = r\cos\theta$
• plane polar coordinates are the restriction of cylindrical coordinates to a plane $z = \text{constant}$

1.5.6 Vector calculus in orthogonal coordinates

A scalar field $\Phi(\mathbf{r})$ can be regarded as a function of $(q_1, q_2, q_3)$:
$$\mathrm{d}\Phi = \frac{\partial \Phi}{\partial q_1} \mathrm{d}q_1 + \frac{\partial \Phi}{\partial q_2} \mathrm{d}q_2 + \frac{\partial \Phi}{\partial q_3} \mathrm{d}q_3$$
$$= \left( \frac{\mathbf{e}_1}{h_1} \frac{\partial \Phi}{\partial q_1} + \frac{\mathbf{e}_2}{h_2} \frac{\partial \Phi}{\partial q_2} + \frac{\mathbf{e}_3}{h_3} \frac{\partial \Phi}{\partial q_3} \right) \cdot (\mathbf{e}_1 h_1 \, \mathrm{d}q_1 + \mathbf{e}_2 h_2 \, \mathrm{d}q_2 + \mathbf{e}_3 h_3 \, \mathrm{d}q_3) = (\nabla\Phi) \cdot \mathrm{d}\mathbf{r}$$
We identify
$$\nabla\Phi = \frac{\mathbf{e}_1}{h_1} \frac{\partial \Phi}{\partial q_1} + \frac{\mathbf{e}_2}{h_2} \frac{\partial \Phi}{\partial q_2} + \frac{\mathbf{e}_3}{h_3} \frac{\partial \Phi}{\partial q_3}$$
Thus
$$\nabla q_i = \frac{\mathbf{e}_i}{h_i} \quad \text{(no sum)}$$
We now consider a vector field in orthogonal coordinates:
$$\mathbf{F} = \mathbf{e}_1 F_1 + \mathbf{e}_2 F_2 + \mathbf{e}_3 F_3$$
Finding the divergence and curl are non-trivial because both $F_i$ and $\mathbf{e}_i$ depend on position. Consider
$$\nabla \times (q_2 \nabla q_3) = (\nabla q_2) \times (\nabla q_3) = \frac{\mathbf{e}_2}{h_2} \times \frac{\mathbf{e}_3}{h_3} = \frac{\mathbf{e}_1}{h_2 h_3}$$
which implies
$$\nabla \cdot \left( \frac{\mathbf{e}_1}{h_2 h_3} \right) = 0$$
as well as cyclic permutations of this result. Also
$$\nabla \times \left( \frac{\mathbf{e}_1}{h_1} \right) = \nabla \times (\nabla q_1) = \mathbf{0}, \quad \text{etc.}$$
To work out $\nabla \cdot \mathbf{F}$, write
$$\mathbf{F} = \left( \frac{\mathbf{e}_1}{h_2 h_3} \right) (h_2 h_3 F_1) + \cdots$$
$$\nabla \cdot \mathbf{F} = \left( \frac{\mathbf{e}_1}{h_2 h_3} \right) \cdot \nabla(h_2 h_3 F_1) + \cdots$$
$$= \left( \frac{\mathbf{e}_1}{h_2 h_3} \right) \cdot \left[ \frac{\mathbf{e}_1}{h_1} \frac{\partial}{\partial q_1}(h_2 h_3 F_1) + \frac{\mathbf{e}_2}{h_2} \frac{\partial}{\partial q_2}(h_2 h_3 F_1) + \frac{\mathbf{e}_3}{h_3} \frac{\partial}{\partial q_3}(h_2 h_3 F_1) \right] + \cdots$$
$$= \frac{1}{h_1 h_2 h_3} \frac{\partial}{\partial q_1}(h_2 h_3 F_1) + \cdots$$
Similarly, to work out $\nabla \times \mathbf{F}$, write
$$\mathbf{F} = \left( \frac{\mathbf{e}_1}{h_1} \right) (h_1 F_1) + \cdots$$
$$\nabla \times \mathbf{F} = \nabla(h_1 F_1) \times \left( \frac{\mathbf{e}_1}{h_1} \right) + \cdots$$
$$= \left[ \frac{\mathbf{e}_1}{h_1} \frac{\partial}{\partial q_1}(h_1 F_1) + \frac{\mathbf{e}_2}{h_2} \frac{\partial}{\partial q_2}(h_1 F_1) + \frac{\mathbf{e}_3}{h_3} \frac{\partial}{\partial q_3}(h_1 F_1) \right] \times \left( \frac{\mathbf{e}_1}{h_1} \right) + \cdots$$
$$= \left[ \frac{\mathbf{e}_2}{h_1 h_3} \frac{\partial}{\partial q_3}(h_1 F_1) - \frac{\mathbf{e}_3}{h_1 h_2} \frac{\partial}{\partial q_2}(h_1 F_1) \right] + \cdots$$
The appearance of the scale factors inside the derivatives can be understood with reference to the geometrical definitions of grad, div and curl:
$$\mathbf{t} \cdot (\nabla\Phi) = \lim_{\delta s \to 0} \frac{\delta\Phi}{\delta s}, \qquad \nabla \cdot \mathbf{F} = \lim_{\delta V \to 0} \frac{1}{\delta V} \int_{\delta S} \mathbf{F} \cdot \mathrm{d}\mathbf{S}, \qquad \mathbf{n} \cdot (\nabla \times \mathbf{F}) = \lim_{\delta S \to 0} \frac{1}{\delta S} \int_{\delta C} \mathbf{F} \cdot \mathrm{d}\mathbf{r}$$
To summarize:
$$\nabla\Phi = \frac{\mathbf{e}_1}{h_1} \frac{\partial \Phi}{\partial q_1} + \frac{\mathbf{e}_2}{h_2} \frac{\partial \Phi}{\partial q_2} + \frac{\mathbf{e}_3}{h_3} \frac{\partial \Phi}{\partial q_3}$$
$$\nabla \cdot \mathbf{F} = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial}{\partial q_1}(h_2 h_3 F_1) + \frac{\partial}{\partial q_2}(h_3 h_1 F_2) + \frac{\partial}{\partial q_3}(h_1 h_2 F_3) \right]$$
$$\nabla \times \mathbf{F} = \frac{\mathbf{e}_1}{h_2 h_3} \left[ \frac{\partial}{\partial q_2}(h_3 F_3) - \frac{\partial}{\partial q_3}(h_2 F_2) \right] + \frac{\mathbf{e}_2}{h_3 h_1} \left[ \frac{\partial}{\partial q_3}(h_1 F_1) - \frac{\partial}{\partial q_1}(h_3 F_3) \right] + \frac{\mathbf{e}_3}{h_1 h_2} \left[ \frac{\partial}{\partial q_1}(h_2 F_2) - \frac{\partial}{\partial q_2}(h_1 F_1) \right]$$
$$= \frac{1}{h_1 h_2 h_3} \begin{vmatrix} h_1 \mathbf{e}_1 & h_2 \mathbf{e}_2 & h_3 \mathbf{e}_3 \\ \partial/\partial q_1 & \partial/\partial q_2 & \partial/\partial q_3 \\ h_1 F_1 & h_2 F_2 & h_3 F_3 \end{vmatrix}$$
$$\nabla^2 \Phi = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial}{\partial q_1}\left( \frac{h_2 h_3}{h_1} \frac{\partial \Phi}{\partial q_1} \right) + \frac{\partial}{\partial q_2}\left( \frac{h_3 h_1}{h_2} \frac{\partial \Phi}{\partial q_2} \right) + \frac{\partial}{\partial q_3}\left( \frac{h_1 h_2}{h_3} \frac{\partial \Phi}{\partial q_3} \right) \right]$$

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Determine the Laplacian operator in spherical polar coordinates.
$$h_r = 1, \quad h_\theta = r, \quad h_\phi = r\sin\theta$$
$$\nabla^2 \Phi = \frac{1}{r^2 \sin\theta} \left[ \frac{\partial}{\partial r}\left( r^2 \sin\theta \frac{\partial \Phi}{\partial r} \right) + \frac{\partial}{\partial \theta}\left( \sin\theta \frac{\partial \Phi}{\partial \theta} \right) + \frac{\partial}{\partial \phi}\left( \frac{1}{\sin\theta} \frac{\partial \Phi}{\partial \phi} \right) \right]$$
$$= \frac{1}{r^2} \frac{\partial}{\partial r}\left( r^2 \frac{\partial \Phi}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta}\left( \sin\theta \frac{\partial \Phi}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2 \Phi}{\partial \phi^2}$$
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.5.7 Grad, div, curl and ∇² in cylindrical and spherical polar coordinates

Cylindrical polar coordinates:
$$\nabla\Phi = \mathbf{e}_\rho \frac{\partial \Phi}{\partial \rho} + \frac{\mathbf{e}_\phi}{\rho} \frac{\partial \Phi}{\partial \phi} + \mathbf{e}_z \frac{\partial \Phi}{\partial z}$$
$$\nabla \cdot \mathbf{F} = \frac{1}{\rho} \frac{\partial}{\partial \rho}(\rho F_\rho) + \frac{1}{\rho} \frac{\partial F_\phi}{\partial \phi} + \frac{\partial F_z}{\partial z}$$
$$\nabla \times \mathbf{F} = \frac{1}{\rho} \begin{vmatrix} \mathbf{e}_\rho & \rho \mathbf{e}_\phi & \mathbf{e}_z \\ \partial/\partial \rho & \partial/\partial \phi & \partial/\partial z \\ F_\rho & \rho F_\phi & F_z \end{vmatrix}$$
$$\nabla^2 \Phi = \frac{1}{\rho} \frac{\partial}{\partial \rho}\left( \rho \frac{\partial \Phi}{\partial \rho} \right) + \frac{1}{\rho^2} \frac{\partial^2 \Phi}{\partial \phi^2} + \frac{\partial^2 \Phi}{\partial z^2}$$
Spherical polar coordinates:
$$\nabla\Phi = \mathbf{e}_r \frac{\partial \Phi}{\partial r} + \frac{\mathbf{e}_\theta}{r} \frac{\partial \Phi}{\partial \theta} + \frac{\mathbf{e}_\phi}{r\sin\theta} \frac{\partial \Phi}{\partial \phi}$$
$$\nabla \cdot \mathbf{F} = \frac{1}{r^2} \frac{\partial}{\partial r}(r^2 F_r) + \frac{1}{r\sin\theta} \frac{\partial}{\partial \theta}(\sin\theta \, F_\theta) + \frac{1}{r\sin\theta} \frac{\partial F_\phi}{\partial \phi}$$
$$\nabla \times \mathbf{F} = \frac{1}{r^2 \sin\theta} \begin{vmatrix} \mathbf{e}_r & r \mathbf{e}_\theta & r\sin\theta \, \mathbf{e}_\phi \\ \partial/\partial r & \partial/\partial \theta & \partial/\partial \phi \\ F_r & r F_\theta & r\sin\theta \, F_\phi \end{vmatrix}$$
$$\nabla^2 \Phi = \frac{1}{r^2} \frac{\partial}{\partial r}\left( r^2 \frac{\partial \Phi}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta}\left( \sin\theta \frac{\partial \Phi}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2 \Phi}{\partial \phi^2}$$

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Evaluate $\nabla \cdot \mathbf{r}$, $\nabla \times \mathbf{r}$ and $\nabla^2 r^n$ using spherical polar coordinates.
$$\mathbf{r} = \mathbf{e}_r r$$
$$\nabla \cdot \mathbf{r} = \frac{1}{r^2} \frac{\partial}{\partial r}(r^2 \cdot r) = 3$$
$$\nabla \times \mathbf{r} = \frac{1}{r^2 \sin\theta} \begin{vmatrix} \mathbf{e}_r & r \mathbf{e}_\theta & r\sin\theta \, \mathbf{e}_\phi \\ \partial/\partial r & \partial/\partial \theta & \partial/\partial \phi \\ r & 0 & 0 \end{vmatrix} = \mathbf{0}$$
$$\nabla^2 r^n = \frac{1}{r^2} \frac{\partial}{\partial r}\left( r^2 \frac{\partial r^n}{\partial r} \right) = n(n+1) r^{n-2}$$
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2 Partial differential equations

2.1 Motivation

The variation in space and time of scientific quantities is usually described by differential equations. If the quantities depend on space and time, or on more than one spatial coordinate, then the governing equations are partial differential equations (PDEs).

Many of the most important PDEs are linear, and classical methods of analysis can be applied. The techniques developed for linear equations are also useful in the study of nonlinear PDEs.

2.2 Linear PDEs of second order

2.2.1 Definition

We consider an unknown function $u(x, y)$ (the dependent variable) of two independent variables $x$ and $y$. A partial differential equation is any equation of the form
$$F\left( u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, \frac{\partial^2 u}{\partial x^2}, \frac{\partial^2 u}{\partial x \partial y}, \frac{\partial^2 u}{\partial y^2}, \dots, x, y \right) = 0$$
involving $u$ and any of its derivatives evaluated at the same point.

• the order of the equation is the highest order of derivative that appears
• the equation is linear if $F$ depends linearly on $u$ and its derivatives

A linear PDE of second order has the general form
$$Lu = f$$
where $L$ is a differential operator such that
$$Lu = a \frac{\partial^2 u}{\partial x^2} + b \frac{\partial^2 u}{\partial x \partial y} + c \frac{\partial^2 u}{\partial y^2} + d \frac{\partial u}{\partial x} + e \frac{\partial u}{\partial y} + g u$$
where $a, b, c, d, e, f, g$ are functions of $x$ and $y$.

• if $f = 0$ the equation is said to be homogeneous
• if $a, b, c, d, e, g$ are independent of $x$ and $y$ the equation is said to have constant coefficients

These ideas can be generalized to more than two independent variables, or to systems of PDEs with more than one dependent variable.

2.2.2 Principle of superposition

$L$ defined above is an example of a linear operator:
$$L(\alpha u) = \alpha L u, \qquad L(u + v) = L u + L v$$
where $u$ and $v$ are any functions of $x$ and $y$, and $\alpha$ is any constant.

The principle of superposition:

• if $u$ and $v$ satisfy the homogeneous equation $Lu = 0$, then $\alpha u$ and $u + v$ also satisfy the homogeneous equation
• similarly, any linear combination of solutions of the homogeneous equation is also a solution
• if the particular integral $u_{\mathrm{p}}$ satisfies the inhomogeneous equation $Lu = f$ and the complementary function $u_{\mathrm{c}}$ satisfies the homogeneous equation $Lu = 0$, then $u_{\mathrm{p}} + u_{\mathrm{c}}$ also satisfies the inhomogeneous equation: $L(u_{\mathrm{p}} + u_{\mathrm{c}}) = L u_{\mathrm{p}} + L u_{\mathrm{c}} = f + 0 = f$

The same principle applies to any other type of linear equation (e.g. algebraic, ordinary differential, integral).

2.2.3 Classic examples

Laplace's equation:
$$\nabla^2 u = 0$$
Diffusion equation (heat equation):
$$\frac{\partial u}{\partial t} = \lambda \nabla^2 u, \qquad \lambda = \text{diffusion coefficient (or diffusivity)}$$
Wave equation:
$$\frac{\partial^2 u}{\partial t^2} = c^2 \nabla^2 u, \qquad c = \text{wave speed}$$
All involve the vector-invariant operator $\nabla^2$, the form of which depends on the number of active dimensions.

Inhomogeneous versions of these equations are also found. The inhomogeneous term usually represents a 'source' or 'force' that generates or drives the quantity $u$.

2.3 Physical examples of occurrence

2.3.1 Examples of Laplace’s (or Poisson’s) equation

Gravitational acceleration is related to gravitational potential by
$$\mathbf{g} = -\nabla\Phi$$
The source (divergence) of the gravitational field is mass:
$$\nabla \cdot \mathbf{g} = -4\pi G \rho$$
where $G$ is Newton's constant and $\rho$ is the mass density. Combine:
$$\nabla^2 \Phi = 4\pi G \rho$$
External to the mass distribution, we have Laplace's equation $\nabla^2 \Phi = 0$. With the source term the inhomogeneous equation is called Poisson's equation.

Analogously, in electrostatics, the electric field is related to the electrostatic potential by
$$\mathbf{E} = -\nabla\Phi$$
The source of the electric field is electric charge:
$$\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0}$$
where $\rho$ is the charge density and $\epsilon_0$ is the permittivity of free space. Combine to obtain Poisson's equation:
$$\nabla^2 \Phi = -\frac{\rho}{\epsilon_0}$$
The vector field ($\mathbf{g}$ or $\mathbf{E}$) is said to be generated by the potential $\Phi$. A scalar potential is easier to work with because it does not have multiple components and its value is independent of the coordinate system. The potential is also directly related to the energy of the system.

2.3.2 Examples of the diffusion equation

A conserved quantity with density (amount per unit volume) $Q$ and flux density (flux per unit area) $\mathbf{F}$ satisfies the conservation equation:
$$\frac{\partial Q}{\partial t} + \nabla \cdot \mathbf{F} = 0$$
This is more easily understood when integrated over the volume $V$ bounded by any (fixed) surface $S$:
$$\frac{\mathrm{d}}{\mathrm{d}t} \int_V Q \, \mathrm{d}V = \int_V \frac{\partial Q}{\partial t} \, \mathrm{d}V = -\int_V \nabla \cdot \mathbf{F} \, \mathrm{d}V = -\int_S \mathbf{F} \cdot \mathrm{d}\mathbf{S}$$
The amount of $Q$ in the volume $V$ changes only as the result of a net flux through $S$.

Often the flux of $Q$ is directed down the gradient of $Q$ through the linear relation (Fick's law)
$$\mathbf{F} = -\lambda \nabla Q$$
Combine to obtain the diffusion equation (if $\lambda$ is independent of $\mathbf{r}$):
$$\frac{\partial Q}{\partial t} = \lambda \nabla^2 Q$$
In a steady state $Q$ satisfies Laplace's equation.

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Heat conduction in a solid. Conserved quantity: energy. Heat per unit volume: $CT$ ($C$ is heat capacity per unit volume, $T$ is temperature). Heat flux density: $-K\nabla T$ ($K$ is thermal conductivity). Thus
$$\frac{\partial}{\partial t}(CT) + \nabla \cdot (-K\nabla T) = 0 \quad\Rightarrow\quad \frac{\partial T}{\partial t} = \lambda \nabla^2 T, \qquad \lambda = \frac{K}{C}$$
e.g. copper: $\lambda \approx 1\ \mathrm{cm^2\,s^{-1}}$
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Further example: concentration of a contaminant in a gas, $\lambda \approx 0.2\ \mathrm{cm^2\,s^{-1}}$.

2.3.3 Examples of the wave equation

Waves on a string. The string has uniform mass per unit length $\rho$ and uniform tension $T$. The transverse displacement $y(x, t)$ is small ($|\partial y/\partial x| \ll 1$) and satisfies Newton's second law for an element $\delta x$ of the string:
$$(\rho \, \delta x) \frac{\partial^2 y}{\partial t^2} = \delta F_y \approx \frac{\partial F_y}{\partial x} \delta x$$
Now
$$F_y = T \sin\theta \approx T \frac{\partial y}{\partial x}$$
where $\theta$ is the angle between the $x$-axis and the tangent to the string. Combine to obtain the one-dimensional wave equation
$$\frac{\partial^2 y}{\partial t^2} = c^2 \frac{\partial^2 y}{\partial x^2}, \qquad c = \sqrt{\frac{T}{\rho}}$$
e.g. violin (D-)string: $T \approx 40\ \mathrm{N}$, $\rho \approx 1\ \mathrm{g\,m^{-1}}$: $c \approx 200\ \mathrm{m\,s^{-1}}$

Electromagnetic waves. Maxwell's equations for the electromagnetic field in a vacuum:
$$\nabla \cdot \mathbf{E} = 0, \qquad \nabla \cdot \mathbf{B} = 0$$
$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = \mathbf{0}, \qquad \frac{1}{\mu_0} \nabla \times \mathbf{B} - \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} = \mathbf{0}$$
where $\mathbf{E}$ is electric field, $\mathbf{B}$ is magnetic field, $\mu_0$ is the permeability of free space and $\epsilon_0$ is the permittivity of free space. Eliminate $\mathbf{E}$:
$$\frac{\partial^2 \mathbf{B}}{\partial t^2} = -\nabla \times \frac{\partial \mathbf{E}}{\partial t} = -\frac{1}{\mu_0 \epsilon_0} \nabla \times (\nabla \times \mathbf{B})$$
Now use the identity $\nabla \times (\nabla \times \mathbf{B}) = \nabla(\nabla \cdot \mathbf{B}) - \nabla^2 \mathbf{B}$ and Maxwell's equation $\nabla \cdot \mathbf{B} = 0$ to obtain the (vector) wave equation
$$\frac{\partial^2 \mathbf{B}}{\partial t^2} = c^2 \nabla^2 \mathbf{B}, \qquad c = \sqrt{\frac{1}{\mu_0 \epsilon_0}}$$
$\mathbf{E}$ obeys the same equation. $c$ is the speed of light, $c \approx 3 \times 10^8\ \mathrm{m\,s^{-1}}$.

Further example: sound waves in a gas, $c \approx 300\ \mathrm{m\,s^{-1}}$.

2.3.4 Examples of other second-order linear PDEs

Schrödinger's equation (quantum-mechanical wavefunction of a particle of mass $m$ in a potential $V$):
$$\mathrm{i}\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \psi + V(\mathbf{r})\psi$$
Helmholtz equation (arises in wave problems):
$$\nabla^2 u + k^2 u = 0$$
Klein–Gordon equation (arises in quantum mechanics):
$$\frac{\partial^2 u}{\partial t^2} = c^2 (\nabla^2 u - m^2 u)$$

2.3.5 Examples of nonlinear PDEs

Burgers' equation (describes shock waves):
$$\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = \lambda \nabla^2 u$$
Nonlinear Schrödinger equation (describes solitons, e.g. in optical fibre communication):
$$\mathrm{i} \frac{\partial \psi}{\partial t} = -\nabla^2 \psi - |\psi|^2 \psi$$
These equations require different methods of analysis.

2.4 Separation of variables (Cartesian coordinates)

2.4.1 Diffusion equation

We consider the one-dimensional diffusion equation (e.g. conduction of heat along a metal bar):
$$\frac{\partial u}{\partial t} = \lambda \frac{\partial^2 u}{\partial x^2}$$
The bar could be considered finite (having two ends), semi-infinite (having one end) or infinite (having no end). Typical boundary conditions at an end are:

• $u$ is specified (Dirichlet boundary condition)
• $\partial u/\partial x$ is specified (Neumann boundary condition)

If a boundary is removed to infinity, we usually require instead that $u$ be bounded (i.e. remain finite) as $x \to \pm\infty$. This condition is needed to eliminate unphysical solutions.

To determine the solution fully we also require initial conditions. For the diffusion equation this means specifying $u$ as a function of $x$ at some initial time (usually $t = 0$).

Try a solution in which the variables appear in separate factors:
$$u(x, t) = X(x)T(t)$$
Substitute this into the PDE:
$$X(x)T'(t) = \lambda X''(x)T(t)$$
(recall that a prime denotes differentiation of a function with respect to its argument). Divide through by $\lambda X T$ to separate the variables:
$$\frac{T'(t)}{\lambda T(t)} = \frac{X''(x)}{X(x)}$$
The LHS depends only on $t$, while the RHS depends only on $x$. Both must therefore equal a constant, which we call $-k^2$ (for later convenience). The PDE is separated into two ordinary differential equations (ODEs)
$$T' + \lambda k^2 T = 0, \qquad X'' + k^2 X = 0$$
with general solutions
$$T = A\exp(-\lambda k^2 t), \qquad X = B\sin(kx) + C\cos(kx)$$
Finite bar: suppose the Neumann boundary conditions $\partial u/\partial x = 0$ (i.e. zero heat flux) apply at the two ends $x = 0, L$. Then the admissible solutions of the $X$ equation are
$$X = C\cos(kx), \qquad k = \frac{n\pi}{L}, \quad n = 0, 1, 2, \dots$$
Combine the factors to obtain the elementary solution (let $C = 1$ WLOG)
$$u = A\cos\left(\frac{n\pi x}{L}\right) \exp\left(-\frac{n^2\pi^2\lambda t}{L^2}\right)$$
Each elementary solution represents a 'decay mode' of the bar. The decay rate is proportional to $n^2$. The $n = 0$ mode represents a uniform temperature distribution and does not decay.

The principle of superposition allows us to construct a general solution as a linear combination of elementary solutions:
$$u(x, t) = \sum_{n=0}^{\infty} A_n \cos\left(\frac{n\pi x}{L}\right) \exp\left(-\frac{n^2\pi^2\lambda t}{L^2}\right)$$
The coefficients $A_n$ can be determined from the initial conditions. If the temperature at time $t = 0$ is specified, then
$$u(x, 0) = \sum_{n=0}^{\infty} A_n \cos\left(\frac{n\pi x}{L}\right)$$
is known. The coefficients $A_n$ are just the Fourier coefficients of the initial temperature distribution.

Semi-infinite bar: the solution $X \propto \cos(kx)$ is valid for any real value of $k$, so that $u$ is bounded as $x \to \pm\infty$. We may assume $k \ge 0$ WLOG. The solution decays in time unless $k = 0$. The general solution is a linear combination in the form of an integral:
$$u = \int_0^\infty A(k) \cos(kx) \exp(-\lambda k^2 t) \, \mathrm{d}k$$
This is an example of a Fourier integral, studied in section 4.

Notes:

• separation of variables doesn't work for all linear equations, by any means, but it does for some of the most important examples
• it is not supposed to be obvious that the most general solution can be written as a linear combination of separated solutions
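A numerical sketch of the finite-bar construction above (NumPy assumed; the step-function initial profile and mode count are illustrative choices, not from the notes): project $u(x, 0)$ onto the cosine modes and let each mode decay at its own rate.

```python
import numpy as np

L, lam, N = 1.0, 0.1, 64                 # bar length, diffusivity, mode count
M = 400
dx = L / M
x = np.linspace(0.0, L, M, endpoint=False) + dx / 2   # midpoint grid
u0 = np.where(x < 0.5 * L, 1.0, 0.0)     # initial temperature: a step

# Fourier cosine coefficients A_n of u0 (A_0 is the mean value)
A = [np.sum(u0) * dx / L]
A += [2.0 / L * np.sum(u0 * np.cos(n * np.pi * x / L)) * dx
      for n in range(1, N)]

def u(x, t):
    """u(x,t) = sum_n A_n cos(n pi x / L) exp(-n^2 pi^2 lam t / L^2)."""
    return sum(A[n] * np.cos(n * np.pi * x / L)
               * np.exp(-(n * np.pi / L)**2 * lam * t) for n in range(N))

# The profile relaxes toward the uniform state A_0 = 0.5 as t grows.
print(np.ptp(u(x, 0.0)), np.ptp(u(x, 1.0)))
```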

2.4.2 Wave equation (example)

$$\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}, \qquad 0 < x < L$$
Boundary conditions:
$$u = 0 \quad\text{at}\quad x = 0, L$$
Initial conditions:
$$u, \; \frac{\partial u}{\partial t} \quad\text{specified at}\quad t = 0$$
Trial solution:
$$u(x, t) = X(x)T(t)$$
$$X(x)T''(t) = c^2 X''(x)T(t)$$
$$\frac{T''(t)}{c^2 T(t)} = \frac{X''(x)}{X(x)} = -k^2$$
$$T'' + c^2 k^2 T = 0, \qquad X'' + k^2 X = 0$$
$$T = A\sin(ckt) + B\cos(ckt)$$
$$X = C\sin(kx), \qquad k = \frac{n\pi}{L}, \quad n = 1, 2, 3, \dots$$
Elementary solution:
$$u = \left[ A\sin\left(\frac{n\pi ct}{L}\right) + B\cos\left(\frac{n\pi ct}{L}\right) \right] \sin\left(\frac{n\pi x}{L}\right)$$
General solution:
$$u = \sum_{n=1}^{\infty} \left[ A_n \sin\left(\frac{n\pi ct}{L}\right) + B_n \cos\left(\frac{n\pi ct}{L}\right) \right] \sin\left(\frac{n\pi x}{L}\right)$$
At $t = 0$:
$$u = \sum_{n=1}^{\infty} B_n \sin\left(\frac{n\pi x}{L}\right), \qquad \frac{\partial u}{\partial t} = \sum_{n=1}^{\infty} \left(\frac{n\pi c}{L}\right) A_n \sin\left(\frac{n\pi x}{L}\right)$$
The Fourier series for the initial conditions determine all the coefficients $A_n$, $B_n$.

2.4.3 Helmholtz equation (example)

A three-dimensional example in a cube:
$$\nabla^2 u + k^2 u = 0, \qquad 0 < x, y, z < L$$
Boundary conditions:
$$u = 0 \quad\text{on the boundaries}$$
Trial solution:
$$u(x, y, z) = X(x)Y(y)Z(z)$$
$$X''(x)Y(y)Z(z) + X(x)Y''(y)Z(z) + X(x)Y(y)Z''(z) + k^2 X(x)Y(y)Z(z) = 0$$
$$\frac{X''(x)}{X(x)} + \frac{Y''(y)}{Y(y)} + \frac{Z''(z)}{Z(z)} + k^2 = 0$$
The variables are separated: each term must equal a constant. Call the constants $-k_x^2$, $-k_y^2$, $-k_z^2$:
$$X'' + k_x^2 X = 0, \qquad Y'' + k_y^2 Y = 0, \qquad Z'' + k_z^2 Z = 0, \qquad k^2 = k_x^2 + k_y^2 + k_z^2$$
$$X \propto \sin(k_x x), \qquad k_x = \frac{n_x \pi}{L}, \quad n_x = 1, 2, 3, \dots$$
$$Y \propto \sin(k_y y), \qquad k_y = \frac{n_y \pi}{L}, \quad n_y = 1, 2, 3, \dots$$
$$Z \propto \sin(k_z z), \qquad k_z = \frac{n_z \pi}{L}, \quad n_z = 1, 2, 3, \dots$$
Elementary solution:
$$u = A \sin\left(\frac{n_x \pi x}{L}\right) \sin\left(\frac{n_y \pi y}{L}\right) \sin\left(\frac{n_z \pi z}{L}\right)$$
The solution is possible only if $k^2$ is one of the discrete values
$$k^2 = (n_x^2 + n_y^2 + n_z^2) \frac{\pi^2}{L^2}$$
These are the eigenvalues of the problem.

3 Green’s functions

3.1 Impulses and the delta function

3.1.1 Physical motivation

Newton's second law for a particle of mass $m$ moving in one dimension subject to a force $F(t)$ is
$$\frac{\mathrm{d}p}{\mathrm{d}t} = F$$
where
$$p = m \frac{\mathrm{d}x}{\mathrm{d}t}$$
is the momentum. Suppose that the force is applied only in the time interval $0 < t < \delta t$. The total change in momentum is
$$\delta p = \int_0^{\delta t} F(t) \, \mathrm{d}t = I$$
and is called the impulse.

We may wish to represent mathematically a situation in which the momentum is changed instantaneously, e.g. if the particle experiences a collision. To achieve this, $F$ must tend to infinity while $\delta t$ tends to zero, in such a way that its integral $I$ is finite and non-zero.

In other applications we may wish to represent an idealized point charge or point mass, or a localized source of heat, waves, etc. Here we need a mathematical object of infinite density and zero spatial extension but having a non-zero integral effect.

The delta function is introduced to meet these requirements.

3.1.2 Step function and delta function

We start by defining the Heaviside unit step function
$$H(x) = \begin{cases} 0, & x < 0 \\ 1, & x > 0 \end{cases}$$
The value of $H(0)$ does not matter for most purposes. It is sometimes taken to be $1/2$. An alternative notation for $H(x)$ is $\theta(x)$.

$H(x)$ can be used to construct other discontinuous functions. Consider the particular 'top-hat' function
$$\delta_\epsilon(x) = \begin{cases} 1/\epsilon, & 0 < x < \epsilon \\ 0, & \text{otherwise} \end{cases}$$
where $\epsilon$ is a positive parameter. This function can also be written as
$$\delta_\epsilon(x) = \frac{H(x) - H(x - \epsilon)}{\epsilon}$$
The area under the curve is equal to one. In the limit $\epsilon \to 0$, we obtain a 'spike' of infinite height, vanishing width and unit area localized at $x = 0$. This limit is the Dirac delta function, $\delta(x)$.

The indefinite integral of $\delta_\epsilon(x)$ is
$$\int_{-\infty}^{x} \delta_\epsilon(\xi) \, \mathrm{d}\xi = \begin{cases} 0, & x \le 0 \\ x/\epsilon, & 0 \le x \le \epsilon \\ 1, & x \ge \epsilon \end{cases}$$
In the limit $\epsilon \to 0$, we obtain
$$\int_{-\infty}^{x} \delta(\xi) \, \mathrm{d}\xi = H(x)$$
or, equivalently,
$$\delta(x) = H'(x)$$
Our idealized impulsive force (section 3.1.1) can be represented as
$$F(t) = I\,\delta(t)$$
which represents a spike of strength $I$ localized at $t = 0$. If the particle is at rest before the impulse, the solution for its momentum is
$$p = I\,H(t)$$
In other physical applications $\delta(x)$ is used to represent an idealized point charge or localized source. It can be placed anywhere: $q\,\delta(x - \xi)$ represents a 'spike' of strength $q$ located at $x = \xi$.

3.2 Other definitions

We might think of defining the delta function as
$$\delta(x) = \begin{cases} \infty, & x = 0 \\ 0, & x \ne 0 \end{cases}$$
but this is not specific enough to describe it. It is not a function in the usual sense but a 'generalized function' or 'distribution'.

Instead, the defining property of $\delta(x)$ can be taken to be
$$\int_{-\infty}^{\infty} f(x)\,\delta(x) \, \mathrm{d}x = f(0)$$
where $f(x)$ is any continuous function. This is because the unit spike picks out the value of the function at the location of the spike. It also follows that
$$\int_{-\infty}^{\infty} f(x)\,\delta(x - \xi) \, \mathrm{d}x = f(\xi)$$
Since $\delta(x - \xi) = 0$ for $x \ne \xi$, the integral can be taken over any interval that includes the point $x = \xi$.

One way to justify this result is as follows. Consider a continuous function $f(x)$ with indefinite integral $g(x)$, i.e. $f(x) = g'(x)$. Then
$$\int_{-\infty}^{\infty} f(x)\,\delta_\epsilon(x - \xi) \, \mathrm{d}x = \frac{1}{\epsilon} \int_{\xi}^{\xi + \epsilon} f(x) \, \mathrm{d}x = \frac{g(\xi + \epsilon) - g(\xi)}{\epsilon}$$
From the definition of the derivative,
$$\lim_{\epsilon \to 0} \int_{-\infty}^{\infty} f(x)\,\delta_\epsilon(x - \xi) \, \mathrm{d}x = g'(\xi) = f(\xi)$$
as required.

This result (the boxed formula above) is equivalent to the substitution property of the Kronecker delta:
$$\sum_{j=1}^{3} a_j \delta_{ij} = a_i$$

The Dirac delta function can be understood as the equivalent of the Kronecker delta symbol for functions of a continuous variable.

$\delta(x)$ can also be seen as the limit of localized functions other than our top-hat example. Alternative, smooth choices for $\delta_\epsilon(x)$ include
$$\delta_\epsilon(x) = \frac{\epsilon}{\pi(x^2 + \epsilon^2)}, \qquad \delta_\epsilon(x) = (2\pi\epsilon^2)^{-1/2} \exp\left(-\frac{x^2}{2\epsilon^2}\right)$$
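A numerical illustration of the sifting property using the Gaussian choice of $\delta_\epsilon(x)$ above (NumPy assumed; the test function is an arbitrary choice, not from the notes):

```python
import numpy as np

def delta_eps(x, eps):
    # Gaussian approximation to delta(x); unit area for every eps
    return np.exp(-x**2 / (2 * eps**2)) / np.sqrt(2 * np.pi * eps**2)

x, dx = np.linspace(-10.0, 10.0, 200001, retstep=True)
f, xi = np.cos, 1.0                      # test function and spike location

for eps in (1.0, 0.1, 0.01):
    approx = np.sum(f(x) * delta_eps(x - xi, eps)) * dx
    print(eps, abs(approx - f(xi)))      # error shrinks as eps -> 0
```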

3.3 More on generalized functions

Derivatives of the delta function can also be defined as the limits of sequences of functions. The generating functions for $\delta'(x)$ are the derivatives of (smooth) functions (e.g. Gaussians) that generate $\delta(x)$, and have both positive and negative 'spikes' localized at $x = 0$. The defining property of $\delta'(x)$ can be taken to be
$$\int_{-\infty}^{\infty} f(x)\,\delta'(x - \xi) \, \mathrm{d}x = -\int_{-\infty}^{\infty} f'(x)\,\delta(x - \xi) \, \mathrm{d}x = -f'(\xi)$$
where $f(x)$ is any differentiable function. This follows from an integration by parts before the limit is taken.

Not all operations are permitted on generalized functions. In particular, two generalized functions of the same variable cannot be multiplied together, e.g. $H(x)\delta(x)$ is meaningless. However $\delta(x)\delta(y)$ is permissible and represents a point source in a two-dimensional space.

3.4 Differential equations containing delta functions

If a differential equation involves a step function or delta function, this generally implies a lack of smoothness in the solution. The equation can be solved separately on either side of the discontinuity and the two parts of the solution connected by applying the appropriate matching conditions.

Consider, as an example, the linear second-order ODE
$$\frac{\mathrm{d}^2 y}{\mathrm{d}x^2} + y = \delta(x) \tag{1}$$
If $x$ represents time, this equation could represent the behaviour of a simple harmonic oscillator in response to an impulsive force.

In each of the regions $x < 0$ and $x > 0$ separately, the right-hand side vanishes and the general solution is a linear combination of $\cos x$ and $\sin x$. We may write
$$y = \begin{cases} A\cos x + B\sin x, & x < 0 \\ C\cos x + D\sin x, & x > 0 \end{cases}$$
Since the general solution of a second-order ODE should contain only two arbitrary constants, it must be possible to relate $C$ and $D$ to $A$ and $B$.

What is the nature of the non-smoothness in $y$? It can be helpful to think of a sequence of functions related by differentiation, such as $xH(x) \to H(x) \to \delta(x)$.

$y$ must be of the same class as $xH(x)$: it is continuous but has a jump in its first derivative. The size of the jump can be found by integrating equation (1) from $x = -\epsilon$ to $x = \epsilon$ and letting $\epsilon \to 0$:
$$\left[ \frac{\mathrm{d}y}{\mathrm{d}x} \right] \equiv \lim_{\epsilon \to 0} \left[ \frac{\mathrm{d}y}{\mathrm{d}x} \right]_{x=-\epsilon}^{x=\epsilon} = 1$$
Since $y$ is bounded it makes no contribution under this procedure. Since $y$ is continuous, the jump conditions are
$$[y] = 0, \qquad \left[ \frac{\mathrm{d}y}{\mathrm{d}x} \right] = 1 \quad\text{at}\quad x = 0$$
Applying these, we obtain
$$C - A = 0, \qquad D - B = 1$$
and so the general solution is
$$y = \begin{cases} A\cos x + B\sin x, & x < 0 \\ A\cos x + (B + 1)\sin x, & x > 0 \end{cases}$$
In particular, if the oscillator is at rest before the impulse occurs, then $A = B = 0$ and the solution is $y = H(x)\sin x$.
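A numerical cross-check of this result (a sketch assuming SciPy; the Gaussian width is an illustrative choice): replace $\delta(x)$ by a narrow Gaussian $\delta_\epsilon(x)$, integrate the oscillator from rest, and compare with $H(x)\sin x$.

```python
import numpy as np
from scipy.integrate import solve_ivp

eps = 1e-2                               # width of the smoothed impulse

def rhs(x, s):
    y, dy = s
    force = np.exp(-x**2 / (2 * eps**2)) / np.sqrt(2 * np.pi * eps**2)
    return [dy, force - y]               # y'' = delta_eps(x) - y

sol = solve_ivp(rhs, [-1.0, 10.0], [0.0, 0.0], max_step=1e-3,
                rtol=1e-8, atol=1e-10, dense_output=True)
xs = np.linspace(0.1, 10.0, 100)
err = np.max(np.abs(sol.sol(xs)[0] - np.sin(xs)))
print(err)                               # small, of order eps**2
```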

3.5 Inhomogeneous linear second-order ODEs

3.5.1 Complementary functions and particular integral

The general linear second-order ODE with constant coefficients has the form
$$y''(x) + p y'(x) + q y(x) = f(x) \quad\text{or}\quad Ly = f$$
where $L$ is a linear operator such that $Ly = y'' + p y' + q y$.

The equation is homogeneous (unforced) if $f = 0$, otherwise it is inhomogeneous (forced).

The principle of superposition applies to linear ODEs as to all linear equations.

Suppose that $y_1(x)$ and $y_2(x)$ are linearly independent solutions of the homogeneous equation, i.e. $Ly_1 = Ly_2 = 0$ and $y_2$ is not simply a constant multiple of $y_1$. Then the general solution of the homogeneous equation is $Ay_1 + By_2$.

If $y_{\mathrm{p}}(x)$ is any solution of the inhomogeneous equation, i.e. $Ly_{\mathrm{p}} = f$, then the general solution of the inhomogeneous equation is
$$y = Ay_1 + By_2 + y_{\mathrm{p}}$$
Here $y_1$ and $y_2$ are known as complementary functions and $y_{\mathrm{p}}$ as a particular integral.

3.5.2 Initial-value and boundary-value problems

Two boundary conditions (BCs) must be specified to determine fully the solution of a second-order ODE. A boundary condition is usually an equation relating the values of $y$ and $y'$ at one point. (The ODE allows $y''$ and higher derivatives to be expressed in terms of $y$ and $y'$.)

The general form of a linear BC at a point $x = a$ is
$$\alpha_1 y'(a) + \alpha_2 y(a) = \alpha_3$$
where $\alpha_1, \alpha_2, \alpha_3$ are constants and $\alpha_1, \alpha_2$ are not both zero. If $\alpha_3 = 0$ the BC is homogeneous.

If both BCs are specified at the same point we have an initial-value problem, e.g. to solve
$$m \frac{\mathrm{d}^2 x}{\mathrm{d}t^2} = F(t) \quad\text{for}\quad t \ge 0 \quad\text{subject to}\quad x = \frac{\mathrm{d}x}{\mathrm{d}t} = 0 \quad\text{at}\quad t = 0$$
If the BCs are specified at different points we have a two-point boundary-value problem, e.g. to solve
$$y''(x) + y(x) = f(x) \quad\text{for}\quad a \le x \le b \quad\text{subject to}\quad y(a) = y(b) = 0$$

3.5.3 Green's function for an initial-value problem

Suppose we want to solve the inhomogeneous ODE

    y′′(x) + p y′(x) + q y(x) = f(x)  for x ≥ 0    (1)

subject to the homogeneous BCs

    y(0) = y′(0) = 0    (2)

Green's function G(x, ξ) for this problem is the solution of

    \frac{∂^2 G}{∂x^2} + p \frac{∂G}{∂x} + qG = δ(x − ξ)    (3)

subject to the homogeneous BCs

    G(0, ξ) = \frac{∂G}{∂x}(0, ξ) = 0    (4)

Notes:

• G(x, ξ) is defined for x ≥ 0 and ξ ≥ 0

• G(x, ξ) satisfies the same equation and boundary conditions with respect to x as y does

• however, it is the response to forcing that is localized at a point x = ξ, rather than a distributed forcing f(x)

If Green's function can be found, the solution of equation (1) is then

    y(x) = ∫_0^∞ G(x, ξ) f(ξ) dξ    (5)

To verify this, let L be the differential operator

    L = \frac{∂^2}{∂x^2} + p \frac{∂}{∂x} + q

Then equations (1) and (3) read Ly = f and LG = δ(x − ξ) respectively. Applying L to equation (5) gives

    Ly(x) = ∫_0^∞ LG f(ξ) dξ = ∫_0^∞ δ(x − ξ) f(ξ) dξ = f(x)

as required. It also follows from equation (4) that y satisfies the boundary conditions (2) as required.

The meaning of equation (5) is that the response to distributed forcing (i.e. the solution of Ly = f) is obtained by summing the responses to forcing at individual points, weighted by the force distribution. This works because the ODE is linear and the BCs are homogeneous.

To find Green's function, note that equation (3) is just an ODE involving a delta function, in which ξ appears as a parameter. To satisfy this equation, G must be continuous but have a discontinuous first derivative. The jump conditions can be found by integrating equation (3) from x = ξ − ǫ to x = ξ + ǫ and letting ǫ → 0:

    \left[\frac{∂G}{∂x}\right] ≡ \lim_{ǫ→0} \left[\frac{∂G}{∂x}\right]_{x=ξ−ǫ}^{x=ξ+ǫ} = 1

Since p ∂G/∂x and qG are bounded they make no contribution under this procedure. Since G is continuous, the jump conditions are

    [G] = 0,    \left[\frac{∂G}{∂x}\right] = 1    at x = ξ    (6)

Suppose that two complementary functions y_1, y_2 are known. The Wronskian W(x) of two solutions y_1(x) and y_2(x) of a second-order ODE is the determinant of the Wronskian matrix:

    W[y_1, y_2] = \begin{vmatrix} y_1 & y_2 \\ y_1′ & y_2′ \end{vmatrix} = y_1 y_2′ − y_2 y_1′

The Wronskian is non-zero unless y_1 and y_2 are linearly dependent (one is a constant multiple of the other).

Since the right-hand side of equation (3) vanishes for x < ξ and x > ξ separately, the solution must be of the form

    G(x, ξ) = \begin{cases} A(ξ) y_1(x) + B(ξ) y_2(x), & 0 ≤ x < ξ \\ C(ξ) y_1(x) + D(ξ) y_2(x), & x > ξ \end{cases}

To determine A, B, C, D we apply the boundary conditions (4) and the jump conditions (6).

Boundary conditions at x = 0:

    A(ξ) y_1(0) + B(ξ) y_2(0) = 0
    A(ξ) y_1′(0) + B(ξ) y_2′(0) = 0

In matrix form:

    \begin{pmatrix} y_1(0) & y_2(0) \\ y_1′(0) & y_2′(0) \end{pmatrix} \begin{pmatrix} A(ξ) \\ B(ξ) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}

Since the determinant W(0) of the matrix is non-zero, the only solution is A(ξ) = B(ξ) = 0.

Jump conditions at x = ξ:

    C(ξ) y_1(ξ) + D(ξ) y_2(ξ) = 0
    C(ξ) y_1′(ξ) + D(ξ) y_2′(ξ) = 1

In matrix form:

    \begin{pmatrix} y_1(ξ) & y_2(ξ) \\ y_1′(ξ) & y_2′(ξ) \end{pmatrix} \begin{pmatrix} C(ξ) \\ D(ξ) \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}

Solution:

    \begin{pmatrix} C(ξ) \\ D(ξ) \end{pmatrix} = \frac{1}{W(ξ)} \begin{pmatrix} y_2′(ξ) & −y_2(ξ) \\ −y_1′(ξ) & y_1(ξ) \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} −y_2(ξ)/W(ξ) \\ y_1(ξ)/W(ξ) \end{pmatrix}

Green's function is therefore

    G(x, ξ) = \begin{cases} 0, & 0 ≤ x ≤ ξ \\ \dfrac{y_1(ξ) y_2(x) − y_1(x) y_2(ξ)}{W(ξ)}, & x ≥ ξ \end{cases}

3.5.4 Green's function for a boundary-value problem

We now consider a similar equation

    Ly = f

for a ≤ x ≤ b, subject to the two-point homogeneous BCs

    α_1 y′(a) + α_2 y(a) = 0    (1)
    β_1 y′(b) + β_2 y(b) = 0    (2)

Green's function G(x, ξ) for this problem is the solution of

    LG = δ(x − ξ)    (3)

subject to the homogeneous BCs

    α_1 \frac{∂G}{∂x}(a, ξ) + α_2 G(a, ξ) = 0    (4)
    β_1 \frac{∂G}{∂x}(b, ξ) + β_2 G(b, ξ) = 0    (5)

and is defined for a ≤ x ≤ b and a ≤ ξ ≤ b.

By a similar argument, the solution of Ly = f subject to the BCs (1) and (2) is then

    y(x) = ∫_a^b G(x, ξ) f(ξ) dξ

We find Green's function by a similar method. Let y_a(x) be a complementary function satisfying the left-hand BC (1), and let y_b(x) be a complementary function satisfying the right-hand BC (2). (These can always be found.) Since the right-hand side of equation (3) vanishes for x < ξ and x > ξ separately, the solution must be of the form

    G(x, ξ) = \begin{cases} A(ξ) y_a(x), & a ≤ x < ξ \\ B(ξ) y_b(x), & ξ < x ≤ b \end{cases}

satisfying the BCs (4) and (5). To determine A, B we apply the jump conditions [G] = 0 and [∂G/∂x] = 1 at x = ξ:

    B(ξ) y_b(ξ) − A(ξ) y_a(ξ) = 0
    B(ξ) y_b′(ξ) − A(ξ) y_a′(ξ) = 1

In matrix form:

    \begin{pmatrix} y_a(ξ) & y_b(ξ) \\ y_a′(ξ) & y_b′(ξ) \end{pmatrix} \begin{pmatrix} −A(ξ) \\ B(ξ) \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}

Solution:

    \begin{pmatrix} −A(ξ) \\ B(ξ) \end{pmatrix} = \frac{1}{W(ξ)} \begin{pmatrix} y_b′(ξ) & −y_b(ξ) \\ −y_a′(ξ) & y_a(ξ) \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} −y_b(ξ)/W(ξ) \\ y_a(ξ)/W(ξ) \end{pmatrix}

Green's function is therefore

    G(x, ξ) = \begin{cases} \dfrac{y_a(x) y_b(ξ)}{W(ξ)}, & a ≤ x ≤ ξ \\ \dfrac{y_a(ξ) y_b(x)}{W(ξ)}, & ξ ≤ x ≤ b \end{cases}

This method fails if the Wronskian W[y_a, y_b] vanishes. This happens if y_a is proportional to y_b, i.e. if there is a complementary function that happens to satisfy both homogeneous BCs. In this (exceptional) case the equation Ly = f may not have a solution satisfying the BCs; if it does, the solution will not be unique.

3.5.5 Examples of Green's function

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Find Green's function for the initial-value problem

    y′′(x) + y(x) = f(x),    y(0) = y′(0) = 0

Complementary functions y_1 = cos x, y_2 = sin x.

Wronskian:

    W = y_1 y_2′ − y_2 y_1′ = cos²x + sin²x = 1

Now

    y_1(ξ) y_2(x) − y_1(x) y_2(ξ) = cos ξ sin x − cos x sin ξ = sin(x − ξ)

Thus

    G(x, ξ) = \begin{cases} 0, & 0 ≤ x ≤ ξ \\ \sin(x − ξ), & x ≥ ξ \end{cases}

So

    y(x) = ∫_0^x \sin(x − ξ) f(ξ) dξ

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

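As a quick numerical sanity check (not from the notes), the integral above can be evaluated by quadrature and compared with a closed-form solution. The forcing f(x) = cos 3x and its exact response y = (cos x − cos 3x)/8 are assumptions chosen for this test.

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: np.cos(3 * x)                            # sample forcing (assumed)
exact = lambda x: (np.cos(x) - np.cos(3 * x)) / 8.0    # closed form for this forcing

def y(x):
    # y(x) = integral from 0 to x of sin(x - xi) f(xi) d(xi)
    val, _ = quad(lambda xi: np.sin(x - xi) * f(xi), 0.0, x)
    return val

for x in [0.5, 1.0, 2.0, 5.0]:
    print(x, y(x), exact(x))   # the two columns agree
```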

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Find Green's function for the two-point boundary-value problem

    y′′(x) + y(x) = f(x),    y(0) = y(1) = 0

Complementary functions y_a = sin x, y_b = sin(x − 1), satisfying the left and right BCs respectively.

Wronskian:

    W = y_a y_b′ − y_b y_a′ = \sin x \cos(x − 1) − \sin(x − 1) \cos x = \sin 1

Thus

    G(x, ξ) = \begin{cases} \sin x \sin(ξ − 1)/\sin 1, & 0 ≤ x ≤ ξ \\ \sin ξ \sin(x − 1)/\sin 1, & ξ ≤ x ≤ 1 \end{cases}

So

    y(x) = ∫_0^1 G(x, ξ) f(ξ) dξ
         = \frac{\sin(x − 1)}{\sin 1} ∫_0^x \sin ξ \, f(ξ) dξ + \frac{\sin x}{\sin 1} ∫_x^1 \sin(ξ − 1) f(ξ) dξ

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

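The same kind of check works for the boundary-value problem. In the sketch below (an illustration, not part of the notes) the forcing f = 1 is assumed, for which the exact solution y = 1 − cos x − [(1 − cos 1)/sin 1] sin x satisfies both BCs.

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: 1.0   # sample forcing (assumed)

def y(x):
    # Two-term formula obtained from the Green's function above
    left, _ = quad(lambda xi: np.sin(xi) * f(xi), 0.0, x)
    right, _ = quad(lambda xi: np.sin(xi - 1.0) * f(xi), x, 1.0)
    return (np.sin(x - 1.0) * left + np.sin(x) * right) / np.sin(1.0)

C = (1.0 - np.cos(1.0)) / np.sin(1.0)
exact = lambda x: 1.0 - np.cos(x) - C * np.sin(x)   # closed form for f = 1

for x in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print(x, y(x), exact(x))   # agrees, and vanishes at both endpoints
```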

4 The Fourier transform

4.1 Motivation

A periodic signal can be analysed into its harmonic components by calculating its Fourier series. If the period is P, then the harmonics have frequencies n/P where n is an integer.

The Fourier transform generalizes this idea to functions that are not periodic. The 'harmonics' can then have any frequency.

The Fourier transform provides a complementary way of looking at a function. Certain operations on a function are more easily computed 'in the Fourier domain'. This idea is particularly useful in solving certain kinds of differential equation.

Furthermore, the Fourier transform has innumerable applications in diverse fields such as astronomy, optics, signal processing, data analysis, statistics and number theory.

4.2 Relation to Fourier series

A function f(x) has period P if f(x + P) = f(x) for all x. It can then be written as a Fourier series

    f(x) = \frac{1}{2} a_0 + \sum_{n=1}^∞ a_n \cos(k_n x) + \sum_{n=1}^∞ b_n \sin(k_n x)

where

    k_n = \frac{2πn}{P}

is the wavenumber of the nth harmonic.

Such a series is also used to write any function that is defined only on an interval of length P, e.g. −P/2 < x < P/2. The Fourier series gives the extension of the function by periodic repetition.

The Fourier coefficients are found from

    a_n = \frac{2}{P} ∫_{−P/2}^{P/2} f(x) \cos(k_n x) dx

    b_n = \frac{2}{P} ∫_{−P/2}^{P/2} f(x) \sin(k_n x) dx

Define

    c_n = \begin{cases} (a_{−n} + i b_{−n})/2, & n < 0 \\ a_0/2, & n = 0 \\ (a_n − i b_n)/2, & n > 0 \end{cases}

Then the same result can be expressed more simply and compactly in the notation of the complex Fourier series

    f(x) = \sum_{n=−∞}^∞ c_n e^{i k_n x},    c_n = \frac{1}{P} ∫_{−P/2}^{P/2} f(x) e^{−i k_n x} dx

This expression for c_n can be verified using the orthogonality relation

    \frac{1}{P} ∫_{−P/2}^{P/2} e^{i(k_n − k_m)x} dx = δ_{mn}

which follows from an elementary integration.

To approach the Fourier transform, let P → ∞ so that the function is defined on the entire real line without any periodicity. The wavenumbers k_n are replaced by a continuous variable k. We define

    f̃(k) = ∫_{−∞}^∞ f(x) e^{−ikx} dx

corresponding to P c_n. The Fourier sum becomes an integral

    f(x) = ∫_{−∞}^∞ \frac{1}{P} f̃(k) e^{ikx} \frac{dn}{dk} dk = \frac{1}{2π} ∫_{−∞}^∞ f̃(k) e^{ikx} dk

(since the density of harmonics is dn/dk = P/2π). We therefore have the forward Fourier transform (Fourier analysis)

    f̃(k) = ∫_{−∞}^∞ f(x) e^{−ikx} dx

and the inverse Fourier transform (Fourier synthesis)

    f(x) = \frac{1}{2π} ∫_{−∞}^∞ f̃(k) e^{ikx} dk

Notes:

• the Fourier transform operation is sometimes denoted by f̃(k) = \mathcal{F}[f(x)], f(x) = \mathcal{F}^{−1}[f̃(k)]

• the variables are often called t and ω rather than x and k

• it is sometimes useful to consider complex values of k

• for a rigorous proof, certain technical conditions on f(x) are required

A necessary condition for f̃(k) to exist for all real values of k (in the sense of an ordinary function) is that f(x) → 0 as x → ±∞. Otherwise the Fourier integral does not converge (e.g. for k = 0).

A set of sufficient conditions for f̃(k) to exist is that f(x) have 'bounded variation', have a finite number of discontinuities and be 'absolutely integrable', i.e.

    ∫_{−∞}^∞ |f(x)| dx < ∞

However, we will see that Fourier transforms can be assigned in a wider sense to some functions that do not satisfy all of these conditions, e.g. f(x) = 1.

Warning. Several different definitions of the Fourier transform are in use. They differ in the placement of the 2π factor and in the signs of the exponents. The definition used here is probably the most common.

How to remember this definition:

• the sign of the exponent is different in the forward and inverse transforms

• the inverse transform means that the function f(x) is synthesized from a linear combination of basis functions e^{ikx}

• the division by 2π always accompanies an integration with respect to k

4.3 Examples

Example (1): top-hat function:

    f(x) = \begin{cases} c, & a < x < b \\ 0, & \text{otherwise} \end{cases}

    f̃(k) = ∫_a^b c \, e^{−ikx} dx = \frac{ic}{k} \left( e^{−ikb} − e^{−ika} \right)

e.g. if a = −1, b = 1 and c = 1:

    f̃(k) = \frac{i}{k} \left( e^{−ik} − e^{ik} \right) = \frac{2 \sin k}{k}
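This analytic result is easy to confirm by direct quadrature. The following sketch (not from the notes) approximates the Fourier integral numerically over the support of the top-hat and compares it with 2 sin k / k.

```python
import numpy as np
from scipy.integrate import quad

def ft_tophat(k):
    # Fourier integral of the top-hat with a = -1, b = 1, c = 1:
    # e^{-ikx} = cos(kx) - i sin(kx)
    re, _ = quad(lambda x: np.cos(k * x), -1.0, 1.0)
    im, _ = quad(lambda x: -np.sin(k * x), -1.0, 1.0)
    return re + 1j * im

for k in [0.5, 1.0, 2.0, 5.0]:
    print(k, ft_tophat(k), 2 * np.sin(k) / k)   # imaginary part ~ 0, real part matches
```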

Example (2):

    f(x) = e^{−|x|}

    f̃(k) = ∫_{−∞}^0 e^x e^{−ikx} dx + ∫_0^∞ e^{−x} e^{−ikx} dx
          = \frac{1}{1 − ik} \left[ e^{(1−ik)x} \right]_{−∞}^0 − \frac{1}{1 + ik} \left[ e^{−(1+ik)x} \right]_0^∞
          = \frac{1}{1 − ik} + \frac{1}{1 + ik} = \frac{2}{1 + k^2}

Example (3): Gaussian function (normal distribution):

    f(x) = (2πσ_x^2)^{−1/2} \exp\left( −\frac{x^2}{2σ_x^2} \right)

    f̃(k) = (2πσ_x^2)^{−1/2} ∫_{−∞}^∞ \exp\left( −\frac{x^2}{2σ_x^2} − ikx \right) dx

Change variable to

    z = \frac{x}{σ_x} + iσ_x k

so that

    −\frac{z^2}{2} = −\frac{x^2}{2σ_x^2} − ikx + \frac{σ_x^2 k^2}{2}

Then

    f̃(k) = (2πσ_x^2)^{−1/2} ∫_{−∞}^∞ \exp\left( −\frac{z^2}{2} \right) dz \; σ_x \exp\left( −\frac{σ_x^2 k^2}{2} \right) = \exp\left( −\frac{σ_x^2 k^2}{2} \right)

where we use the standard Gaussian integral

    ∫_{−∞}^∞ \exp\left( −\frac{z^2}{2} \right) dz = (2π)^{1/2}

Actually there is a slight cheat here because z has an imaginary part. This will be explained next term.

The result is proportional to a standard Gaussian function of k:

    f̃(k) ∝ (2πσ_k^2)^{−1/2} \exp\left( −\frac{k^2}{2σ_k^2} \right)

of width (standard deviation) σ_k related to σ_x by

    σ_x σ_k = 1

This illustrates a property of the Fourier transform: the narrower the function of x, the wider the function of k.

4.4 Basic properties of the Fourier transform

Linearity:

    g(x) = αf(x) ⇔ g̃(k) = α f̃(k)    (1)
    h(x) = f(x) + g(x) ⇔ h̃(k) = f̃(k) + g̃(k)    (2)

Rescaling (for real α):

    g(x) = f(αx) ⇔ g̃(k) = \frac{1}{|α|} f̃\left( \frac{k}{α} \right)    (3)

Shift/exponential (for real α):

    g(x) = f(x − α) ⇔ g̃(k) = e^{−ikα} f̃(k)    (4)
    g(x) = e^{iαx} f(x) ⇔ g̃(k) = f̃(k − α)    (5)

Differentiation/multiplication:

    g(x) = f′(x) ⇔ g̃(k) = ik f̃(k)    (6)
    g(x) = x f(x) ⇔ g̃(k) = i f̃′(k)    (7)

Duality:

    g(x) = f̃(x) ⇔ g̃(k) = 2π f(−k)    (8)

i.e. transforming twice returns (almost) the same function

Complex conjugation and parity inversion (for real x and k):

    g(x) = [f(x)]^* ⇔ g̃(k) = [f̃(−k)]^*    (9)

Symmetry:

    f(−x) = ±f(x) ⇔ f̃(−k) = ±f̃(k)    (10)
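Properties like these can be spot-checked numerically. The sketch below (an illustration, not part of the notes) verifies the shift property (4) for f(x) = e^{−x²/2}, whose transform √(2π) e^{−k²/2} is known in closed form; the shift α and the truncation of the integration range are assumptions of the test.

```python
import numpy as np
from scipy.integrate import quad

alpha = 0.7
f = lambda x: np.exp(-x**2 / 2.0)
ft_f = lambda k: np.sqrt(2 * np.pi) * np.exp(-k**2 / 2.0)   # known transform of f

def ft(g, k):
    # Numerical forward transform: integral of g(x) e^{-ikx} dx (range truncated)
    re, _ = quad(lambda x: g(x) * np.cos(k * x), -20.0, 20.0)
    im, _ = quad(lambda x: -g(x) * np.sin(k * x), -20.0, 20.0)
    return re + 1j * im

g = lambda x: f(x - alpha)   # shifted function
for k in [0.0, 0.5, 1.5]:
    print(k, ft(g, k), np.exp(-1j * k * alpha) * ft_f(k))   # property (4)
```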

Sample derivations. Property (3):

    g(x) = f(αx)
    g̃(k) = ∫_{−∞}^∞ f(αx) e^{−ikx} dx
          = \mathrm{sgn}(α) ∫_{−∞}^∞ f(y) e^{−iky/α} \frac{dy}{α}
          = \frac{1}{|α|} ∫_{−∞}^∞ f(y) e^{−i(k/α)y} dy
          = \frac{1}{|α|} f̃\left( \frac{k}{α} \right)

Property (4):

    g(x) = f(x − α)
    g̃(k) = ∫_{−∞}^∞ f(x − α) e^{−ikx} dx
          = ∫_{−∞}^∞ f(y) e^{−ik(y+α)} dy
          = e^{−ikα} f̃(k)

Property (6):

    g(x) = f′(x)
    g̃(k) = ∫_{−∞}^∞ f′(x) e^{−ikx} dx
          = \left[ f(x) e^{−ikx} \right]_{−∞}^∞ − ∫_{−∞}^∞ f(x)(−ik) e^{−ikx} dx
          = ik f̃(k)

The integrated part vanishes because f(x) must tend to zero as x → ±∞ in order to possess a Fourier transform.

Property (7):

    g(x) = x f(x)
    g̃(k) = ∫_{−∞}^∞ x f(x) e^{−ikx} dx
          = i ∫_{−∞}^∞ f(x)(−ix) e^{−ikx} dx
          = i \frac{d}{dk} ∫_{−∞}^∞ f(x) e^{−ikx} dx
          = i f̃′(k)

Property (8):

    g(x) = f̃(x)
    g̃(k) = ∫_{−∞}^∞ f̃(x) e^{−ikx} dx = 2π f(−k)

(this is 2π times the inverse transform evaluated at −k).

Property (10): if f(−x) = ±f(x), i.e. f is even or odd, then

    f̃(−k) = ∫_{−∞}^∞ f(x) e^{+ikx} dx
           = ∫_{−∞}^∞ ±f(−x) e^{ikx} dx
           = ± ∫_{−∞}^∞ f(y) e^{−iky} dy
           = ± f̃(k)

4.5 The delta function and the Fourier transform

Consider the Gaussian function of example (3):

    f(x) = (2πσ_x^2)^{−1/2} \exp\left( −\frac{x^2}{2σ_x^2} \right),    f̃(k) = \exp\left( −\frac{σ_x^2 k^2}{2} \right)

The Gaussian is normalized such that

    ∫_{−∞}^∞ f(x) dx = 1

As the width σ_x tends to zero, the Gaussian becomes taller and narrower, but the area under the curve remains the same. The value of f(x) tends to zero for any non-zero value of x. At the same time, the value of f̃(k) tends to unity for any finite value of k.

In this limit we approach the Dirac delta function, δ(x).

The substitution property allows us to verify the Fourier transform of the delta function:

    δ̃(k) = ∫_{−∞}^∞ δ(x) e^{−ikx} dx = e^{−ik·0} = 1

Now formally apply the inverse transform:

    δ(x) = \frac{1}{2π} ∫_{−∞}^∞ 1 \, e^{ikx} dk

Relabel the variables and rearrange (the exponent can have either sign):

    ∫_{−∞}^∞ e^{±ikx} dx = 2π δ(k)

Therefore the Fourier transform of a unit constant (1) is 2πδ(k). Note that a constant function does not satisfy the necessary condition for the existence of a (regular) Fourier transform. But it does have a Fourier transform in the space of generalized functions.

4.6 The convolution theorem

4.6.1 Definition of convolution

The convolution of two functions, h = f ∗ g, is defined by

    h(x) = ∫_{−∞}^∞ f(y) g(x − y) dy

Note that the sum of the arguments of f and g is the argument of h.

Convolution is a symmetric operation (substitute z = x − y):

    [g ∗ f](x) = ∫_{−∞}^∞ g(y) f(x − y) dy = ∫_{−∞}^∞ f(z) g(x − z) dz = [f ∗ g](x)

4.6.2 Interpretation and examples

In statistics, a continuous random variable x (e.g. the height of a person drawn at random from the population) has a probability distribution (or density) function f(x). The probability of x lying in the range x_0 < x < x_0 + δx is f(x_0) δx, in the limit of small δx.

If x and y are independent random variables with distribution functions f(x) and g(y), then let the distribution function of their sum, z = x + y, be h(z).

Now, for any given value of x, the probability that z lies in the range

    z_0 < z < z_0 + δz

is just the probability that y lies in the range

    z_0 − x < y < z_0 − x + δz

which is g(z_0 − x) δz. Therefore

    h(z_0) δz = ∫_{−∞}^∞ f(x) g(z_0 − x) δz \, dx

which implies

    h = f ∗ g

The effect of measuring, observing or processing scientific data can often be described as a convolution of the data with a certain function. e.g. when a point source is observed by a telescope, a broadened image is seen, known as the point spread function of the telescope. When an extended source is observed, the image that is seen is the convolution of the source with the point spread function.

In this sense convolution corresponds to a broadening or distortion of the original data.

A point mass M at position R gives rise to a gravitational potential Φ(r) = −GM/|r − R|. A continuous mass density ρ(r) can be thought of as a sum of infinitely many point masses ρ(R) d³R at positions R. The resulting gravitational potential is

    Φ(r) = −G ∫ \frac{ρ(R)}{|r − R|} d^3 R

which is the (3D) convolution of the mass density ρ(r) with the potential of a unit point mass at the origin, −G/|r|.

4.6.3 The convolution theorem

The Fourier transform of a convolution is

    h̃(k) = ∫_{−∞}^∞ \left[ ∫_{−∞}^∞ f(y) g(x − y) dy \right] e^{−ikx} dx
          = ∫_{−∞}^∞ ∫_{−∞}^∞ f(y) g(x − y) e^{−ikx} dx \, dy
          = ∫_{−∞}^∞ ∫_{−∞}^∞ f(y) g(z) e^{−iky} e^{−ikz} dz \, dy    (z = x − y)
          = ∫_{−∞}^∞ f(y) e^{−iky} dy ∫_{−∞}^∞ g(z) e^{−ikz} dz
          = f̃(k) g̃(k)

Similarly, the Fourier transform of f(x) g(x) is \frac{1}{2π} [f̃ ∗ g̃](k).

This means that:

• convolution is an operation best carried out as a multiplication in the Fourier domain

• the Fourier transform of a product is a complicated object

• convolution can be undone (deconvolution) by a division in the Fourier domain. If g is known and f ∗ g is measured, then f can be obtained, in principle.
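A discrete analogue of the convolution theorem is built into the FFT: for periodic sequences, circular convolution corresponds to pointwise multiplication of DFTs. The sketch below (not part of the notes) checks this with numpy.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 256
f = rng.standard_normal(N)
g = rng.standard_normal(N)

# Circular convolution computed directly from its definition...
direct = np.array([sum(f[m] * g[(n - m) % N] for m in range(N)) for n in range(N)])

# ...and via the convolution theorem: multiply in the Fourier domain
via_fft = np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real

print(np.max(np.abs(direct - via_fft)))   # agreement to rounding error
```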

4.6.4 Correlation

The correlation of two functions, h = f ⊗ g, is defined by

    h(x) = ∫_{−∞}^∞ [f(y)]^* g(x + y) dy

Now the argument of h is the shift between the arguments of f and g.

Correlation is a way of quantifying the relationship between two (typically oscillatory) functions. If two signals (oscillating about an average value of zero) oscillate in phase with each other, their correlation will be positive. If they are out of phase, the correlation will be negative. If they are completely unrelated, their correlation will be zero.

The Fourier transform of a correlation is

    h̃(k) = ∫_{−∞}^∞ \left[ ∫_{−∞}^∞ [f(y)]^* g(x + y) dy \right] e^{−ikx} dx
          = ∫_{−∞}^∞ ∫_{−∞}^∞ [f(y)]^* g(x + y) e^{−ikx} dx \, dy
          = ∫_{−∞}^∞ ∫_{−∞}^∞ [f(y)]^* g(z) e^{iky} e^{−ikz} dz \, dy    (z = x + y)
          = \left[ ∫_{−∞}^∞ f(y) e^{−iky} dy \right]^* ∫_{−∞}^∞ g(z) e^{−ikz} dz
          = [f̃(k)]^* g̃(k)

This result (or the special case g = f) is the Wiener–Khinchin theorem.

The autoconvolution and autocorrelation of f are f ∗ f and f ⊗ f. Their Fourier transforms are f̃^2 and |f̃|^2, respectively.

4.7 Parseval's theorem

If we apply the inverse transform to the WK theorem we find

    ∫_{−∞}^∞ [f(y)]^* g(x + y) dy = \frac{1}{2π} ∫_{−∞}^∞ [f̃(k)]^* g̃(k) e^{ikx} dk

Now set x = 0 and relabel y → x to obtain Parseval's theorem

    ∫_{−∞}^∞ [f(x)]^* g(x) dx = \frac{1}{2π} ∫_{−∞}^∞ [f̃(k)]^* g̃(k) dk

The special case used most frequently is when g = f:

    ∫_{−∞}^∞ |f(x)|^2 dx = \frac{1}{2π} ∫_{−∞}^∞ |f̃(k)|^2 dk

Note that division by 2π accompanies the integration with respect to k.

Parseval's theorem means that the Fourier transform is a 'unitary transformation' that preserves the 'inner product' between two functions (see later!), in the same way that a rotation preserves lengths and angles.
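The discrete Fourier transform obeys the same identity, with the 1/2π replaced by 1/N. A quick numpy check (an illustration, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)
F = np.fft.fft(f)

# Discrete Parseval: sum |f|^2 = (1/N) sum |F|^2
print(np.sum(np.abs(f)**2), np.sum(np.abs(F)**2) / len(f))
```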

Alternative derivation using the delta function:

    ∫_{−∞}^∞ [f(x)]^* g(x) dx
    = ∫_{−∞}^∞ \left[ \frac{1}{2π} ∫_{−∞}^∞ f̃(k) e^{ikx} dk \right]^* \frac{1}{2π} ∫_{−∞}^∞ g̃(k′) e^{ik′x} dk′ \, dx
    = \frac{1}{(2π)^2} ∫_{−∞}^∞ ∫_{−∞}^∞ ∫_{−∞}^∞ [f̃(k)]^* g̃(k′) e^{i(k′−k)x} dx \, dk′ \, dk
    = \frac{1}{(2π)^2} ∫_{−∞}^∞ ∫_{−∞}^∞ [f̃(k)]^* g̃(k′) \, 2πδ(k′ − k) dk′ \, dk
    = \frac{1}{2π} ∫_{−∞}^∞ [f̃(k)]^* g̃(k) dk

4.8 Power spectra

The quantity

    Φ(k) = |f̃(k)|^2

appearing in the Wiener–Khinchin theorem and Parseval's theorem is the (power) spectrum or (power) spectral density of the function f(x). The WK theorem states that the FT of the autocorrelation function is the power spectrum.

This concept is often used to quantify the spectral content (as a function of angular frequency ω) of a signal f(t).

The spectrum of a perfectly periodic signal consists of a series of delta functions at the principal frequency and its harmonics, if present. Its autocorrelation function does not decay as t → ∞.

White noise is an ideal random signal with autocorrelation function proportional to δ(t): the signal is perfectly decorrelated. It therefore has a flat spectrum (Φ = constant).

Less idealized signals may have spectra that are peaked at certain frequencies but also contain a general noise component.

5 Matrices and linear algebra

5.1 Motivation

Many scientific quantities are vectors. A linear relationship between two vectors is described by a matrix. This could be either

• a physical relationship, e.g. that between the angular velocity and angular momentum vectors of a rotating body

• a relationship between the components of (physically) the same vector in different coordinate systems

Linear algebra deals with the addition and multiplication of scalars, vectors and matrices.

Eigenvalues and eigenvectors are the characteristic numbers and directions associated with matrices, which allow them to be expressed in the simplest form. The matrices that occur in scientific applications usually have special symmetries that impose conditions on their eigenvalues and eigenvectors.

Vectors do not necessarily live in physical space. In some applications (notably quantum mechanics) we have to deal with complex spaces of various dimensions.

5.2 Vector spaces

5.2.1 Abstract definition of scalars and vectors

We are used to thinking of scalars as numbers, and vectors as directed line segments. For many purposes it is useful to define scalars and vectors in a more general (and abstract) way.

Scalars are the elements of a number field, of which the most important examples are:

• R, the set of real numbers

• C, the set of complex numbers

A number field F:

• is a set of elements on which the operations of addition and multiplication are defined and satisfy the usual laws of arithmetic (commutative, associative and distributive)

• is closed under addition and multiplication

• includes identity elements (the numbers 0 and 1) for addition and multiplication

• includes inverses (negatives and reciprocals) for addition and multiplication for every element, except that 0 has no reciprocal

Vectors are the elements of a vector space (or linear space) defined over a number field F.

A vector space V:

• is a set of elements on which the operations of vector addition and scalar multiplication are defined and satisfy certain axioms

• is closed under these operations

• includes an identity element (the vector 0) for vector addition

We will write vectors in bold italic face (or underlined in handwriting). Scalar multiplication means multiplying a vector x by a scalar α to obtain the vector αx. Note that 0x = 0 and 1x = x.

The basic example of a vector space is F^n. An element of F^n is a list of n scalars, (x_1, . . . , x_n), where x_i ∈ F. This is called an n-tuple. Vector addition and scalar multiplication are defined componentwise:

    (x_1, . . . , x_n) + (y_1, . . . , y_n) = (x_1 + y_1, . . . , x_n + y_n)
    α(x_1, . . . , x_n) = (αx_1, . . . , αx_n)

Hence we have R^n and C^n.

Notes:

• vector multiplication is not defined in general

• R^2 is not quite the same as C because C has a rule for multiplication

• R^3 is not quite the same as physical space because physical space has a rule for the distance between two points (i.e. Pythagoras's theorem, if physical space is approximated as Euclidean)

The formal axioms of number fields and vector spaces can be found in books on linear algebra.

5.2.2 Span and linear dependence

Let S = {e_1, e_2, . . . , e_m} be a subset of vectors in V.

A linear combination of S is any vector of the form e_1 x_1 + e_2 x_2 + ··· + e_m x_m, where x_1, x_2, . . . , x_m are scalars.

The span of S is the set of all vectors that are linear combinations of S. If the span of S is the entire vector space V, then S is said to span V.

The vectors of S are said to be linearly independent if no non-trivial linear combination of S equals zero, i.e. if

    e_1 x_1 + e_2 x_2 + ··· + e_m x_m = 0    implies    x_1 = x_2 = ··· = x_m = 0

If, on the other hand, the vectors are linearly dependent, then such an equation holds for some non-trivial values of the coefficients x_1, . . . , x_m. Suppose that x_k is one of the non-zero coefficients. Then e_k can be expressed as a linear combination of the other vectors:

    e_k = −\sum_{i≠k} e_i \frac{x_i}{x_k}

Linear independence therefore means that none of the vectors is a linear combination of the others.

Notes:

• if an additional vector is added to a spanning set, it remains a spanning set

• if a vector is removed from a linearly independent set, the set remains linearly independent

5.2.3 Basis and dimension

A basis for a vector space V is a subset of vectors {e_1, e_2, . . . , e_n} that spans V and is linearly independent. The properties of bases (proved in books on linear algebra) are:

• all bases have the same number of elements, n, which is called the dimension of V

• any n linearly independent vectors in V form a basis for V

• any vector x ∈ V can be written in a unique way as

    x = e_1 x_1 + ··· + e_n x_n

and the scalars x_i are called the components of x with respect to the basis {e_1, e_2, . . . , e_n}

Vector spaces can have infinite dimension, e.g. the set of functions defined on the interval 0 < x < 2π and having Fourier series

    f(x) = \sum_{n=−∞}^∞ f_n e^{inx}

Here f(x) is the 'vector' and f_n are its 'components' with respect to the 'basis' of functions e^{inx}. Functional analysis deals with such infinite-dimensional vector spaces.

5.2.4 Examples

⊲ Example (1): three-dimensional real space, R^3:

vectors: triples (x, y, z) of real numbers

possible basis: {(1, 0, 0), (0, 1, 0), (0, 0, 1)}

    (x, y, z) = (1, 0, 0)x + (0, 1, 0)y + (0, 0, 1)z

⊲ Example (2): the complex plane, C:

vectors: complex numbers z

EITHER (a) a one-dimensional vector space over C

possible basis: {1}

    z = 1 · z

OR (b) a two-dimensional vector space over R (supplemented by a multiplication rule)

possible basis: {1, i}

    z = 1 · x + i · y

⊲ Example (3): real 2 × 2 symmetric matrices:

vectors: matrices of the form

    \begin{pmatrix} x & y \\ y & z \end{pmatrix},    x, y, z ∈ R

a three-dimensional vector space over R

possible basis:

    \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \right\}

    \begin{pmatrix} x & y \\ y & z \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} x + \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} y + \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} z

5.2.5 Change of basis

The same vector has different components with respect to different bases.

Consider two bases {e_i} and {e′_i}. Since they are bases, we can write the vectors of one set in terms of the other, using the summation convention (sum over i from 1 to n):

    e_j = e′_i R_{ij}

The n × n array of numbers R_{ij} is the transformation matrix between the two bases. R_{ij} is the ith component of the vector e_j with respect to the primed basis.

The representation of a vector x in the two bases is

    x = e_j x_j = e′_i x′_i

where x_i and x′_i are the components. Thus

    e′_i x′_i = e_j x_j = e′_i R_{ij} x_j

from which we deduce the transformation law for vector components:

    x′_i = R_{ij} x_j

Note that the transformation laws for basis vectors and vector components are 'opposite' so that the vector x is unchanged by the transformation.

Example: in R^3, two bases are {e_i} = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} and {e′_i} = {(0, 1, 0), (0, 0, 1), (1, 0, 0)}. Then

    R = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}

There is no need for a basis to be either orthogonal or normalized. In fact, we have not yet defined what these terms mean.
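The transformation law is easy to check numerically. The sketch below (not part of the notes) uses the permutation example above: with the primed basis vectors stored as the columns of a matrix, e′_i x′_i reproduces the original vector.

```python
import numpy as np

# Columns are the primed basis vectors e'_1, e'_2, e'_3 from the example
E_prime = np.array([[0.0, 0.0, 1.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])

R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])

x = np.array([1.0, 2.0, 3.0])   # components in the unprimed (standard) basis
x_prime = R @ x                 # transformation law x'_i = R_ij x_j

# The vector itself is unchanged: e'_i x'_i recovers x
print(E_prime @ x_prime)        # [1. 2. 3.]
```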

5.3 Matrices

5.3.1 Array viewpoint

A matrix can be regarded simply as a rectangular array of numbers:

    A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{pmatrix}

This viewpoint is equivalent to thinking of a vector as an n-tuple, or, more correctly, as either a column matrix or a row matrix:

    x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix},    x^T = ( x_1 \; x_2 \; \cdots \; x_n )

We will write matrices and vectors in sans serif face when adopting this viewpoint. The superscript 'T' denotes the transpose.

The transformation matrix R_{ij} described above is an example of a square matrix (m = n).

Using the suffix notation and the summation convention, the familiar rules for multiplying a matrix by a vector on the right or left are

    (Ax)_i = A_{ij} x_j,    (x^T A)_j = x_i A_{ij}

and the rules for matrix addition and multiplication are

    (A + B)_{ij} = A_{ij} + B_{ij},    (AB)_{ij} = A_{ik} B_{kj}

The transpose of an m × n matrix is the n × m matrix

    (A^T)_{ij} = A_{ji}

5.3.2 Linear operators

A linear operator A on a vector space V acts on elements of V to produce other elements of V. The action of A on x is written A(x) or just Ax. The property of linearity means:

    A(αx) = αAx  for scalar α
    A(x + y) = Ax + Ay

Notes:

• a linear operator has an existence without reference to any basis

• the operation can be thought of as a linear transformation or mapping of the space V (a simple example is a rotation of three-dimensional space)

• a more general idea, not considered here, is that a linear operator can act on vectors of one space V to produce vectors of another space V′, possibly of a different dimension

The components of A with respect to a basis {e_i} are defined by the action of A on the basis vectors:

    A e_j = e_i A_{ij}

The components form a square matrix. We write (A)_{ij} = A_{ij} and (x)_i = x_i.

Since A is a linear operator, a knowledge of its action on a basis is sufficient to determine its action on any vector x:

    Ax = A(e_j x_j) = x_j (A e_j) = x_j (e_i A_{ij}) = e_i A_{ij} x_j

or

    (Ax)_i = A_{ij} x_j

This corresponds to the rule for multiplying a matrix by a vector.

The sum of two linear operators is defined by

    (A + B)x = Ax + Bx = e_i (A_{ij} + B_{ij}) x_j

The product, or composition, of two linear operators has the action

    (AB)x = A(Bx) = A(e_i B_{ij} x_j) = (A e_k) B_{kj} x_j = e_i A_{ik} B_{kj} x_j

The components therefore satisfy the rules of matrix addition and multiplication:

    (A + B)_{ij} = A_{ij} + B_{ij},    (AB)_{ij} = A_{ik} B_{kj}

Recall that matrix multiplication is not commutative, so BA ≠ AB in general.

Therefore a matrix can be thought of as the components of a linear operator with respect to a given basis, just as a column matrix or n-tuple can be thought of as the components of a vector with respect to a given basis.

5.3.3 Change of basis again

When changing basis, we wrote one set of basis vectors in terms of the other:

    e_j = e′_i R_{ij}

We could equally have written

    e′_j = e_i S_{ij}

where S is the matrix of the inverse transformation. Substituting one relation into the other (and relabelling indices where required), we obtain

    e′_j = e′_k R_{ki} S_{ij}
    e_j = e_k S_{ki} R_{ij}

This can only be true if

    R_{ki} S_{ij} = S_{ki} R_{ij} = δ_{kj}

or, in matrix notation,

    RS = SR = 1

Therefore R and S are inverses: R = S^{−1} and S = R^{−1}.

The transformation law for vector components, x′_i = R_{ij} x_j, can be written in matrix notation as

    x′ = Rx

with the inverse relation

    x = R^{−1} x′

How do the components of a linear operator A transform under a change of basis? We require, for any vector x,

    Ax = e_i A_{ij} x_j = e′_i A′_{ij} x′_j

Using e_j = e′_i R_{ij} and relabelling indices, we find

    e′_k R_{ki} A_{ij} x_j = e′_k A′_{kj} x′_j
    R_{ki} A_{ij} x_j = A′_{kj} x′_j
    RA(R^{−1} x′) = A′ x′

Since this is true for any x′, we have

    A′ = R A R^{−1}

These transformation laws are consistent, e.g.

    A′ x′ = R A R^{−1} R x = R A x = (Ax)′
    A′ B′ = R A R^{−1} R B R^{−1} = R A B R^{−1} = (AB)′

5.4 Inner product (scalar product)

This is used to give a meaning to lengths and angles in a vector space.

5.4.1 Definition

An inner product (or scalar product) is a scalar function ⟨x|y⟩ of two vectors x and y. Other common notations are (x, y), ⟨x, y⟩ and x · y.

An inner product must:

• be linear in the second argument:

    ⟨x|αy⟩ = α⟨x|y⟩  for scalar α
    ⟨x|y + z⟩ = ⟨x|y⟩ + ⟨x|z⟩

• have Hermitian symmetry:

    ⟨y|x⟩ = ⟨x|y⟩^*

• be positive definite:

    ⟨x|x⟩ ≥ 0, with equality if and only if x = 0

Notes:

• an inner product has an existence without reference to any basis

• the star is needed to ensure that ⟨x|x⟩ is real

• the star is not needed in a real vector space

• it follows that the inner product is antilinear in the first argument:

    ⟨αx|y⟩ = α^*⟨x|y⟩  for scalar α
    ⟨x + y|z⟩ = ⟨x|z⟩ + ⟨y|z⟩

The standard (Euclidean) inner product on R^n is the 'dot product'

    ⟨x|y⟩ = x · y = x_i y_i

which is generalized to C^n as

    ⟨x|y⟩ = x^*_i y_i

The star is needed for Hermitian symmetry. We will see later that any other inner product on R^n or C^n can be reduced to this one by a suitable choice of basis.

⟨x|x⟩^{1/2} is the 'length' or 'norm' of the vector x, written |x| or ‖x‖. In one dimension this agrees with the meaning of 'absolute value' if we are using the dot product.

5.4.2 The Cauchy–Schwarz inequality

    |⟨x|y⟩|^2 ≤ ⟨x|x⟩⟨y|y⟩

or equivalently

    |⟨x|y⟩| ≤ |x||y|

with equality if and only if x = αy (i.e. the vectors are parallel or antiparallel).

Proof: We assume that x ≠ 0 and y ≠ 0, otherwise the inequality is trivial. We consider the non-negative quantity

    ⟨x − αy|x − αy⟩ = ⟨x − αy|x⟩ − α⟨x − αy|y⟩
                    = ⟨x|x⟩ − α^*⟨y|x⟩ − α⟨x|y⟩ + αα^*⟨y|y⟩
                    = ⟨x|x⟩ + αα^*⟨y|y⟩ − α⟨x|y⟩ − α^*⟨x|y⟩^*
                    = |x|^2 + |α|^2|y|^2 − 2 Re(α⟨x|y⟩)

This result holds for any scalar α. First choose the phase of α such that α⟨x|y⟩ is real and non-negative and therefore equal to |α||⟨x|y⟩|. Then

    |x|^2 + |α|^2|y|^2 − 2|α||⟨x|y⟩| ≥ 0
    (|x| − |α||y|)^2 + 2|α||x||y| − 2|α||⟨x|y⟩| ≥ 0

Now choose the modulus of α such that |α| = |x|/|y|, which is finite and non-zero. Then divide the inequality by 2|α| to obtain

    |x||y| − |⟨x|y⟩| ≥ 0

as required.

In a real vector space, the Cauchy–Schwarz inequality allows us to define the angle θ between two vectors through

    ⟨x|y⟩ = |x||y| cos θ

If ⟨x|y⟩ = 0 (in any vector space) the vectors are said to be orthogonal.

5.4.3 Inner product and bases

The inner product is bilinear: it is antilinear in the first argument and linear in the second argument.

Warning. This is the usual convention in physics and applied mathematics. In pure mathematics it is usually the other way around.

A knowledge of the inner product of the basis vectors is therefore sufficient to determine the inner product of any two vectors x and y. Let

    ⟨e_i|e_j⟩ = G_{ij}

Then

    ⟨x|y⟩ = ⟨e_i x_i|e_j y_j⟩ = x^*_i y_j ⟨e_i|e_j⟩ = G_{ij} x^*_i y_j

The n × n array of numbers G_{ij} are the metric coefficients of the basis {e_i}. The Hermitian symmetry of the inner product implies that

    G_{ij} y^*_i x_j = (G_{ij} x^*_i y_j)^* = G^*_{ij} x_i y^*_j = G^*_{ji} y^*_i x_j

for all vectors x and y, and therefore

    G_{ji} = G^*_{ij}

We say that the matrix G is Hermitian.

An orthonormal basis is one for which

    ⟨e_i|e_j⟩ = δ_{ij}

in which case the inner product is the standard one

    ⟨x|y⟩ = x^*_i y_i

We will see later that it is possible to transform any basis into an orthonormal one.

5.5 Hermitian conjugate

5.5.1 Definition and simple properties

We define the Hermitian conjugate of a matrix to be the complex conjugate of its transpose:

    A^† = (A^T)^*,    (A^†)_{ij} = A^*_{ji}

This applies to general rectangular matrices, e.g.

    \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \end{pmatrix}^† = \begin{pmatrix} A^*_{11} & A^*_{21} \\ A^*_{12} & A^*_{22} \\ A^*_{13} & A^*_{23} \end{pmatrix}

The Hermitian conjugate of a column matrix is

    x^† = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}^† = ( x^*_1 \; x^*_2 \; \cdots \; x^*_n )

Note that

    (A^†)^† = A

The Hermitian conjugate of a product of matrices is

    (AB)^† = B^† A^†

because

    (AB)^†_{ij} = (AB)^*_{ji} = (A_{jk} B_{ki})^* = B^*_{ki} A^*_{jk} = (B^†)_{ik} (A^†)_{kj} = (B^† A^†)_{ij}

Note the reversal of the order of the product, as also occurs with the transpose or inverse of a product of matrices. This result extends to arbitrary products of matrices and vectors, e.g.

    (ABCx)^† = x^† C^† B^† A^†
    (x^† A y)^† = y^† A^† x

In the latter example, if x and y are vectors, each side of the equation is a scalar (a complex number). The Hermitian conjugate of a scalar is just the complex conjugate.

5.5.2 Relationship with inner product

We have seen that the inner product of two vectors is

    ⟨x|y⟩ = G_{ij} x^*_i y_j = x^*_i G_{ij} y_j

where G_{ij} are the metric coefficients. This can also be written

    ⟨x|y⟩ = x^† G y

and the Hermitian conjugate of this equation is

    ⟨x|y⟩^* = y^† G^† x

Similarly

    ⟨y|x⟩ = y^† G x

The Hermitian symmetry of the inner product requires ⟨y|x⟩ = ⟨x|y⟩^*, i.e.

    y^† G x = y^† G^† x

which is satisfied for all vectors x and y provided that G is an Hermitian matrix:

    G^† = G

If the basis is orthonormal, then G_{ij} = δ_{ij} and we have simply

    ⟨x|y⟩ = x^† y

5.5.3 Adjoint operator

A further relationship between the Hermitian conjugate and the inner product is as follows.

The adjoint of a linear operator A with respect to a given inner product is a linear operator A^† satisfying

    ⟨A^†x|y⟩ = ⟨x|Ay⟩

for all vectors x and y. With the standard inner product this implies

    [(A^†)_{ij} x_j]^* y_i = x^*_i A_{ij} y_j
    (A^†)^*_{ij} x^*_j y_i = A_{ji} x^*_j y_i
    (A^†)^*_{ij} = A_{ji}
    (A^†)_{ij} = A^*_{ji}

The components of the operator A with respect to a basis are given by a square matrix A. The components of the adjoint operator (with respect to the standard inner product) are given by the Hermitian conjugate matrix A^†.

5.5.4 Special square matrices

The following types of special matrices arise commonly in scientific applications.

A symmetric matrix is equal to its transpose:

    A^T = A    or    A_{ji} = A_{ij}

An antisymmetric (or skew-symmetric) matrix satisfies

    A^T = −A    or    A_{ji} = −A_{ij}

An orthogonal matrix is one whose transpose is equal to its inverse:

    A^T = A^{−1}    or    A A^T = A^T A = 1

These ideas generalize to a complex vector space as follows.

An Hermitian matrix is equal to its Hermitian conjugate:

    A^† = A    or    A^*_{ji} = A_{ij}

An anti-Hermitian (or skew-Hermitian) matrix satisfies

    A^† = −A    or    A^*_{ji} = −A_{ij}

A unitary matrix is one whose Hermitian conjugate is equal to its inverse:

    A^† = A^{−1}    or    A A^† = A^† A = 1

In addition, a normal matrix is one that commutes with its Hermitian conjugate:

    A A^† = A^† A

It is easy to verify that Hermitian, anti-Hermitian and unitary matrices are all normal.

5.6 Eigenvalues and eigenvectors

5.6.1 Basic properties

An eigenvector of a linear operator A is a non-zero vector x satisfying

    Ax = λx

for some scalar λ, the corresponding eigenvalue.

The equivalent statement in a matrix representation is

    Ax = λx    or    (A − λ1)x = 0

where A is an n × n square matrix.

This equation states that a non-trivial linear combination of the columns of the matrix (A − λ1) is equal to zero, i.e. that the columns of the matrix are linearly dependent. This is equivalent to the statement

    det(A − λ1) = 0

which is the characteristic equation of the matrix A.

To compute the eigenvalues and eigenvectors, we first solve the characteristic equation. This is a polynomial equation of degree n in λ and therefore has n roots, although not necessarily distinct. These are the eigenvalues of A, and are complex in general. The corresponding eigenvectors are obtained by solving the simultaneous equations found in the rows of Ax = λx.

If the n roots are distinct, then there are n linearly independent eigenvectors, each of which is determined uniquely up to an arbitrary multiplicative constant.

If the roots are not all distinct, the repeated eigenvalues are said to be degenerate. If an eigenvalue λ occurs m times, there may be any number between 1 and m of linearly independent eigenvectors corresponding to it. Any linear combination of these is also an eigenvector and the space spanned by such vectors is called an eigenspace.

Example (1):

    \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}

characteristic equation

    \begin{vmatrix} −λ & 0 \\ 0 & −λ \end{vmatrix} = λ^2 = 0

eigenvalues 0, 0

eigenvectors:

    \begin{pmatrix} −λ & 0 \\ 0 & −λ \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}  ⇒  \begin{pmatrix} 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}

two linearly independent eigenvectors, e.g.

    \begin{pmatrix} 1 \\ 0 \end{pmatrix},  \begin{pmatrix} 0 \\ 1 \end{pmatrix}

two-dimensional eigenspace corresponding to eigenvalue 0

Example (2):

    \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}

characteristic equation

    \begin{vmatrix} −λ & 1 \\ 0 & −λ \end{vmatrix} = λ^2 = 0

eigenvalues 0, 0

eigenvectors:

    \begin{pmatrix} −λ & 1 \\ 0 & −λ \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}  ⇒  \begin{pmatrix} y \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}

only one linearly independent eigenvector

    \begin{pmatrix} 1 \\ 0 \end{pmatrix}

one-dimensional eigenspace corresponding to eigenvalue 0

5.6.2 Eigenvalues and eigenvectors of Hermitian matrices

The matrices most often encountered in physical applications are real symmetric or, more generally, Hermitian matrices satisfying A^† = A. In quantum mechanics, Hermitian matrices (or operators) represent observable quantities.

We consider two eigenvectors x and y corresponding to eigenvalues λ and µ:

    Ax = λx    (1)
    Ay = µy    (2)

The Hermitian conjugate of equation (2) is (since A^† = A)

    y^† A = µ^* y^†    (3)

Using equations (1) and (3) we can construct two expressions for y^†Ax:

    y^† A x = λ y^† x = µ^* y^† x
    (λ − µ^*) y^† x = 0    (4)

First suppose that x and y are the same eigenvector. Then y = x and µ = λ, so equation (4) becomes

    (λ − λ^*) x^† x = 0

Since x ≠ 0, x^†x ≠ 0 and so λ^* = λ. Therefore the eigenvalues of an Hermitian matrix are real.

Equation (4) simplifies to

    (λ − µ) y^† x = 0

If x and y are now different eigenvectors, we deduce that y^†x = 0, provided that µ ≠ λ. Therefore the eigenvectors of an Hermitian matrix corresponding to distinct eigenvalues are orthogonal (in the standard inner product on C^n).
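Both results are visible numerically. The sketch below (not from the notes) builds a random Hermitian matrix and uses numpy's eigh routine, which is specialized to Hermitian matrices: the eigenvalues come out real and the eigenvectors orthonormal.

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T                  # A is Hermitian by construction

w, V = np.linalg.eigh(A)            # eigenvalues w, eigenvector columns V

print(w)                                              # real eigenvalues
print(np.max(np.abs(V.conj().T @ V - np.eye(4))))     # columns are orthonormal
```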

5.6.3 Related results

Normal matrices (including all Hermitian, anti-Hermitian and unitary matrices) satisfy AA^† = A^†A. It can be shown that:

• the eigenvectors of normal matrices corresponding to distinct eigenvalues are orthogonal

• the eigenvalues of Hermitian, anti-Hermitian and unitary matrices are real, imaginary and of unit modulus, respectively

These results can all be proved in a similar way.

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Show that the eigenvalues of a unitary matrix are of unit modulus and the eigenvectors corresponding to distinct eigenvalues are orthogonal.

Let A be a unitary matrix: A^† = A^{−1}.

    Ax = λx
    Ay = µy
    y^† A^† = µ^* y^†
    y^† A^† A x = µ^* λ y^† x
    (µ^* λ − 1) y^† x = 0

y = x:    |λ|^2 − 1 = 0  since x ≠ 0,  ⇒ |λ| = 1

y ≠ x:    \left( \frac{λ}{µ} − 1 \right) y^† x = 0

µ ≠ λ:    y^† x = 0

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Note that:

• real symmetric matrices are Hermitian

• real antisymmetric matrices are anti-Hermitian

• real orthogonal matrices are unitary

Note also that a 1 × 1 matrix is just a number, λ, which is the eigenvalue of the matrix. To be Hermitian, anti-Hermitian or unitary, λ must satisfy

    λ^* = λ,    λ^* = −λ    or    λ^* = λ^{−1}

respectively. Hermitian, anti-Hermitian and unitary matrices therefore correspond to real, imaginary and unit-modulus numbers, respectively.

There are direct correspondences between Hermitian, anti-Hermitian and unitary matrices:

• if A is Hermitian then iA is anti-Hermitian (and vice versa)

• if A is Hermitian then

    \exp(iA) = \sum_{n=0}^∞ \frac{(iA)^n}{n!}

is unitary

[Compare the following two statements: If z is a real number then iz is imaginary (and vice versa). If z is a real number then exp(iz) is of unit modulus.]

5.6.4 The degenerate case

If a repeated eigenvalue λ occurs m times, it can be shown (with some difficulty) that there are exactly m corresponding linearly independent eigenvectors if the matrix is normal.

It is always possible to construct an orthogonal basis within this m-dimensional eigenspace (e.g. by the Gram–Schmidt procedure: see Example Sheet 3, Question 2). Therefore, even if the eigenvalues are degenerate, it is always possible to find n mutually orthogonal eigenvectors, which form a basis for the vector space. In fact, this is possible if and only if the matrix is normal.

Orthogonal eigenvectors can be normalized (divide by their norm) to make them orthonormal. Therefore an orthonormal basis can always be constructed from the eigenvectors of a normal matrix. This is called an eigenvector basis.

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Find an orthonormal set of eigenvectors of the Hermitian matrix

    \begin{pmatrix} 0 & i & 0 \\ −i & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}

Characteristic equation:

    \begin{vmatrix} −λ & i & 0 \\ −i & −λ & 0 \\ 0 & 0 & 1 − λ \end{vmatrix} = 0

    (λ^2 − 1)(1 − λ) = 0
    λ = 1, −1, 1

Eigenvector for λ = −1:

    \begin{pmatrix} 1 & i & 0 \\ −i & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}

    x + iy = 0,    z = 0

Normalized eigenvector:

    \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix}

Eigenvectors for λ = 1:

    \begin{pmatrix} −1 & i & 0 \\ −i & −1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}

    −x + iy = 0

Normalized eigenvectors:

    \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ −i \\ 0 \end{pmatrix},    \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5.7 Diagonalization of a matrix

5.7.1 Similarity

We have seen that the matrices A and A′ representing a linear operator in two different bases are related by

    A′ = R A R^{−1}    or equivalently    A = R^{−1} A′ R

where R is the transformation matrix between the two bases.

Two square matrices A and B are said to be similar if they are related by

    B = S^{−1} A S    (1)

where S is some invertible matrix. This means that A and B are representations of the same linear operator in different bases. The relation (1) is called a similarity transformation. S is called the similarity matrix.

A square matrix A is said to be diagonalizable if it is similar to a diagonal matrix, i.e. if

    S^{−1} A S = Λ

for some invertible matrix S and some diagonal matrix

    Λ = \begin{pmatrix} Λ_1 & 0 & \cdots & 0 \\ 0 & Λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Λ_n \end{pmatrix}

5.7.2 Diagonalization

An n × n matrix A can be diagonalized if and only if it has n linearly independent eigenvectors. The columns of the similarity matrix S are just the eigenvectors of A, and the diagonal entries of the matrix Λ are just the eigenvalues of A: writing S = (x_1 x_2 ··· x_n), the relation AS = SΛ holds column by column because A x_i = Λ_i x_i.

The meaning of diagonalization is that the matrix is expressed in its simplest form by transforming it to its eigenvector basis.

5.7.3 Diagonalization of a normal matrix

An n × n normal matrix always has n linearly independent eigenvectors and is therefore diagonalizable. Furthermore, the eigenvectors can be chosen to be orthonormal. In this case the similarity matrix is unitary, because a unitary matrix is precisely one whose columns are orthonormal vectors (consider U^†U = 1).

Therefore a normal matrix can be diagonalized by a unitary transformation:

    U^† A U = Λ

where U is a unitary matrix whose columns are the orthonormal eigenvectors of A, and Λ is a diagonal matrix whose entries are the eigenvalues of A.

In particular, this result applies to the important cases of real symmetric and, more generally, Hermitian matrices.

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Diagonalize the Hermitian matrix

    \begin{pmatrix} 0 & i & 0 \\ −i & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}

Using the previously obtained eigenvalues and eigenvectors,

    \begin{pmatrix} 1/\sqrt{2} & −i/\sqrt{2} & 0 \\ 1/\sqrt{2} & i/\sqrt{2} & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & i & 0 \\ −i & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} & 0 \\ i/\sqrt{2} & −i/\sqrt{2} & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} −1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

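The same computation in numpy (an illustration, not part of the notes): building U from the orthonormal eigenvectors found earlier and forming U†AU reproduces the diagonal matrix of eigenvalues.

```python
import numpy as np

A = np.array([[0, 1j, 0],
              [-1j, 0, 0],
              [0, 0, 1]])

s = 1 / np.sqrt(2)
# Columns are the orthonormal eigenvectors for lambda = -1, 1, 1
U = np.array([[s, s, 0],
              [1j * s, -1j * s, 0],
              [0, 0, 1]])

Lam = U.conj().T @ A @ U
print(np.round(Lam.real, 12))   # diag(-1, 1, 1)
```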

5.7.4 Orthogonal and unitary transformations

The transformation matrix between two bases {e_i} and {e′_i} has components R_{ij} defined by

    e_j = e′_i R_{ij}

The condition for the first basis {e_i} to be orthonormal is

    e^†_i e_j = δ_{ij}
    (e′_k R_{ki})^† e′_l R_{lj} = δ_{ij}
    R^*_{ki} R_{lj} e′^†_k e′_l = δ_{ij}

If the second basis is also orthonormal, then e′^†_k e′_l = δ_{kl} and the condition becomes

    R^*_{ki} R_{kj} = δ_{ij}    or    R^† R = 1

Therefore the transformation between orthonormal bases is described by a unitary matrix.

In a real vector space, an orthogonal matrix performs this task. In R^2 or R^3 an orthogonal matrix corresponds to a rotation and/or reflection.

A real symmetric matrix is normal and has real eigenvalues and real orthogonal eigenvectors. Therefore a real symmetric matrix can be diagonalized by a real orthogonal transformation.

Note that this is not generally true of real antisymmetric or orthogonal matrices because their eigenvalues and eigenvectors are not generally real. They can, however, be diagonalized by complex unitary transformations.

5.7.5 Uses of diagonalization

Certain operations on (diagonalizable) matrices are more easily carried out using the representation

    S^{−1} A S = Λ    or    A = S Λ S^{−1}

Examples:

    A^m = (S Λ S^{−1})(S Λ S^{−1}) \cdots (S Λ S^{−1}) = S Λ^m S^{−1}

    det(A) = det(S Λ S^{−1}) = det(S) det(Λ) det(S^{−1}) = det(Λ)

    tr(A) = tr(S Λ S^{−1}) = tr(Λ S^{−1} S) = tr(Λ)

    tr(A^m) = tr(S Λ^m S^{−1}) = tr(Λ^m S^{−1} S) = tr(Λ^m)

Here we use the properties of the determinant and trace:

    det(AB) = det(A) det(B)
    tr(AB) = (AB)_{ii} = A_{ij} B_{ji} = B_{ji} A_{ij} = (BA)_{jj} = tr(BA)

Note that diagonal matrices are multiplied very easily (i.e. componentwise). Also the determinant and trace of a diagonal matrix are just the product and sum, respectively, of its elements. Therefore

    det(A) = \prod_{i=1}^n λ_i,    tr(A) = \sum_{i=1}^n λ_i

In fact these two statements are true for all matrices (whether or not they are diagonalizable), as follows from the product and sum of roots in the characteristic equation

    det(A − λ1) = det(A) + \cdots + tr(A)(−λ)^{n−1} + (−λ)^n = 0
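These identities can be confirmed with a small numerical experiment (not part of the notes); the symmetric test matrix here is arbitrary.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])   # arbitrary symmetric (hence diagonalizable) matrix

w, S = np.linalg.eigh(A)          # A = S Lam S^T with orthonormal columns

print(np.linalg.det(A), np.prod(w))   # det(A) = product of eigenvalues
print(np.trace(A), np.sum(w))         # tr(A)  = sum of eigenvalues

# A^5 via S Lam^5 S^{-1} (here S^{-1} = S^T)
A5 = S @ np.diag(w**5) @ S.T
print(np.max(np.abs(A5 - np.linalg.matrix_power(A, 5))))   # ~0
```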

5.8 Quadratic and Hermitian forms

5.8.1 Quadratic form

The quadratic form associated with a real symmetric matrix A is

    Q(x) = x^T A x = x_i A_{ij} x_j = A_{ij} x_i x_j

Q is a homogeneous quadratic function of (x_1, x_2, . . . , x_n), i.e. Q(αx) = α^2 Q(x). In fact, any homogeneous quadratic function is the quadratic form of a symmetric matrix, e.g.

    Q = 2x^2 + 4xy + 5y^2 = ( x \; y ) \begin{pmatrix} 2 & 2 \\ 2 & 5 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}

The xy term is split equally between the off-diagonal matrix elements.

A can be diagonalized by a real orthogonal transformation:

    S^T A S = Λ    with    S^T = S^{−1}

The vector x transforms according to x = S x′, so

    Q = x^T A x = (x′^T S^T)(S Λ S^T)(S x′) = x′^T Λ x′

The quadratic form is therefore reduced to a sum of squares,

    Q = \sum_{i=1}^n λ_i x′^2_i

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Diagonalize the quadratic form Q = 2x^2 + 4xy + 5y^2.

    Q = ( x \; y ) \begin{pmatrix} 2 & 2 \\ 2 & 5 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = x^T A x

The eigenvalues of A are 1 and 6 and the corresponding normalized eigenvectors are

    \frac{1}{\sqrt{5}} \begin{pmatrix} 2 \\ −1 \end{pmatrix},    \frac{1}{\sqrt{5}} \begin{pmatrix} 1 \\ 2 \end{pmatrix}

(calculation omitted). The diagonalization of A is S^T A S = Λ with

    S = \begin{pmatrix} 2/\sqrt{5} & 1/\sqrt{5} \\ −1/\sqrt{5} & 2/\sqrt{5} \end{pmatrix},    Λ = \begin{pmatrix} 1 & 0 \\ 0 & 6 \end{pmatrix}

Therefore

    Q = x′^2 + 6y′^2 = \left[ \frac{1}{\sqrt{5}} (2x − y) \right]^2 + 6 \left[ \frac{1}{\sqrt{5}} (x + 2y) \right]^2

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
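The reduction to a sum of squares can be checked numerically (a sketch, not from the notes): for random x, Q(x) computed directly agrees with Σ λ_i x′_i² in the eigenvector basis.

```python
import numpy as np

A = np.array([[2.0, 2.0],
              [2.0, 5.0]])
w, S = np.linalg.eigh(A)           # eigenvalues [1, 6]; columns of S are principal axes

rng = np.random.default_rng(3)
for _ in range(3):
    x = rng.standard_normal(2)
    xp = S.T @ x                   # components in the eigenvector basis, x' = S^T x
    print(x @ A @ x, w @ xp**2)    # Q(x) = lambda_1 x'_1^2 + lambda_2 x'_2^2
```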

The eigenvectors of A define the principal axes of the quadratic form. In diagonalizing A by transforming to its eigenvector basis, we are rotating the coordinates to reduce the quadratic form to its simplest form.

A positive definite matrix is one for which all the eigenvalues are positive (λ > 0). Similarly:

• negative definite means λ < 0

• positive (negative) semi-definite means λ ≥ 0 (λ ≤ 0)

• definite means positive definite or negative definite

In the above example, the diagonalization shows that Q(x) > 0 for all x ≠ 0, and we say that the quadratic form is positive definite.

5.8.2 Quadratic surfaces

The quadratic surfaces (or quadrics) associated with a real quadratic form in three dimensions are the family of surfaces

    Q(x) = k = constant

In the eigenvector basis this equation simplifies to

    λ_1 x′^2_1 + λ_2 x′^2_2 + λ_3 x′^2_3 = k

The equivalent equation in two dimensions is related to the standard form for a conic section, i.e. an ellipse (if λ_1 λ_2 > 0)

    \frac{x′^2}{a^2} + \frac{y′^2}{b^2} = 1

or a hyperbola (if λ_1 λ_2 < 0)

    \frac{x′^2}{a^2} − \frac{y′^2}{b^2} = ±1

The semi-axes (distances of the curve from the origin along the principal axes) are a = \sqrt{|k/λ_1|} and b = \sqrt{|k/λ_2|}.

Notes:

• the scale of the ellipse (e.g.) is determined by the constant k

• the shape of the ellipse is determined by the eigenvalues

• the orientation of the ellipse is determined by the eigenvectors

In three dimensions, the quadratic surfaces are ellipsoids (if the eigenvalues all have the same sign) of standard form

    \frac{x′^2}{a^2} + \frac{y′^2}{b^2} + \frac{z′^2}{c^2} = 1

or hyperboloids (if the eigenvalues differ in sign) of standard form

    \frac{x′^2}{a^2} + \frac{y′^2}{b^2} − \frac{z′^2}{c^2} = ±1

The quadrics of a definite quadratic form are therefore ellipsoids.

Some special cases:

• if λ_1 = λ_2 = λ_3 we have a sphere

• if (e.g.) λ_1 = λ_2 we have a surface of revolution about the z′-axis

• if (e.g.) λ_3 = 0 we have the translation of a conic section along the z′-axis (an elliptic or hyperbolic cylinder)

[Figure: conic sections and quadric surfaces. Ellipse: x^2/a^2 + y^2/b^2 = 1. Hyperbola: x^2/a^2 − y^2/b^2 = 1. Ellipsoid: x^2/a^2 + y^2/b^2 + z^2/c^2 = 1. Hyperboloid of one sheet: x^2/a^2 + y^2/b^2 − z^2/c^2 = 1. Hyperboloid of two sheets: x^2/a^2 + y^2/b^2 − z^2/c^2 = −1.]

Example:

Let V(r) be a potential with a stationary point at the origin r = 0. Then its Taylor expansion has the form

    V(r) = V(0) + V_{ij} x_i x_j + O(x^3)

where

    V_{ij} = \frac{1}{2} \left. \frac{∂^2 V}{∂x_i ∂x_j} \right|_{r=0}

The equipotential surfaces near the origin are therefore given approximately by the quadrics V_{ij} x_i x_j = constant. These are ellipsoids or hyperboloids with principal axes given by the eigenvectors of the symmetric matrix V_{ij}.

5.8.3 Hermitian form

In a complex vector space, the Hermitian form associated with an Hermitian matrix A is

    H(x) = x^† A x = x^*_i A_{ij} x_j

H is a real scalar because

    H^* = (x^† A x)^† = x^† A^† x = x^† A x = H

We have seen that A can be diagonalized by a unitary transformation:

    U^† A U = Λ    with    U^† = U^{−1}

and so

    H = x^† (U Λ U^†) x = (U^† x)^† Λ (U^† x) = x′^† Λ x′ = \sum_{i=1}^n λ_i |x′_i|^2

Therefore an Hermitian form can be reduced to a real quadratic form by transforming to the eigenvector basis.

5.9 Stationary property of the eigenvalues

The Rayleigh quotient associated with an Hermitian matrix A is the normalized Hermitian form

λ(x) = x†Ax/(x†x)

Notes:

• λ(x) is a real scalar
• λ(αx) = λ(x)
• if x is an eigenvector of A, then λ(x) is the corresponding eigenvalue

In fact, the eigenvalues of A are the stationary values of the function λ(x). This is the Rayleigh–Ritz variational principle.

Proof:

δλ = λ(x + δx) − λ(x)

   = (x + δx)†A(x + δx)/[(x + δx)†(x + δx)] − λ(x)

   = [x†Ax + (δx)†Ax + x†A(δx) + O(δx²)]/[x†x + (δx)†x + x†(δx) + O(δx²)] − λ(x)

   = {(δx)†Ax + x†A(δx) − λ(x)[(δx)†x + x†(δx)] + O(δx²)}/[x†x + (δx)†x + x†(δx) + O(δx²)]

   = {(δx)†[Ax − λ(x)x] + [x†A − λ(x)x†](δx)}/(x†x) + O(δx²)

   = (δx)†[Ax − λ(x)x]/(x†x) + c.c. + O(δx²)

where ‘c.c.’ denotes the complex conjugate. Therefore the first-order variation of λ(x) vanishes for all perturbations δx if and only if Ax = λ(x)x. In that case x is an eigenvector of A, and the value of λ(x) is the corresponding eigenvalue.
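A numerical illustration of the stationary property (a sketch, assuming any real symmetric matrix A): perturbing an eigenvector changes the Rayleigh quotient only at second order in the perturbation.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])          # any real symmetric matrix will do

    def rayleigh(x):
        return (x @ A @ x) / (x @ x)

    w, V = np.linalg.eigh(A)
    x = V[:, 0]                          # an eigenvector: rayleigh(x) equals w[0]
    d = np.array([0.3, -0.7])            # arbitrary perturbation direction

    for eps in (1e-2, 1e-3, 1e-4):
        print(eps, rayleigh(x + eps*d) - w[0])   # shrinks like eps**2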

6 Elementary analysis

6.1 Motivation

Analysis is the careful study of infinite processes such as limits, convergence, differential and integral calculus, and is one of the foundations of mathematics. This section covers some of the basic concepts including the important problem of the convergence of infinite series. It also introduces the remarkable properties of analytic functions of a complex variable.

6.2 Sequences

6.2.1 Limits of sequences

We first consider a sequence of real or complex numbers fₙ, defined for all integers n ≥ n₀. Possible behaviours as n increases are:

• fₙ tends towards a particular value
• fₙ does not tend to any value but remains limited in magnitude
• fₙ is unlimited in magnitude

Definition. The sequence fₙ converges to the limit L as n → ∞ if, for any positive number ǫ, |fₙ − L| < ǫ for sufficiently large n.

In other words the members of the sequence are eventually contained within an arbitrarily small disk centred on L. We write this as

lim_(n→∞) fₙ = L  or  fₙ → L as n → ∞

Note that L here is a finite number.

To say that a property holds for sufficiently large n means that there exists an integer N such that the property is true for all n ≥ N.

Example:

lim_(n→∞) n^(−α) = 0 for any α > 0

Proof: |n^(−α) − 0| < ǫ holds for all n > ǫ^(−1/α).

If fₙ does not tend to a limit it may nevertheless be bounded.

Definition. The sequence fₙ is bounded as n → ∞ if there exists a positive number K such that |fₙ| < K for sufficiently large n.

Example:

[(n + 1)/n] e^(inα) is bounded as n → ∞ for any real number α

Proof:

|[(n + 1)/n] e^(inα)| = (n + 1)/n < 2 for all n ≥ 2

6.2.2 Cauchy’s principle of convergence

A necessary and sufficient condition for the sequence fₙ to converge is that, for any positive number ǫ, |fₙ₊ₘ − fₙ| < ǫ for all positive integers m, for sufficiently large n. Note that this condition does not require a knowledge of the value of the limit L.

6.3 Convergence of series

6.3.1 Meaning of convergence

What is the meaning of an infinite series such as

Σ_(n=n₀)^∞ uₙ

involving the addition of an infinite number of terms?

We define the partial sum

S_N = Σ_(n=n₀)^N uₙ

The infinite series Σuₙ is said to converge if the sequence of partial sums S_N tends to a limit S as N → ∞. The value of the series is then S. Otherwise the series diverges.

Note that whether a series converges or diverges does not depend on the value of n₀ (i.e. on when the series begins) but only on the behaviour of the terms for large n.

According to Cauchy’s principle of convergence, a necessary and sufficient condition for Σuₙ to converge is that, for any positive number ǫ, |uₙ₊₁ + uₙ₊₂ + ··· + uₙ₊ₘ| < ǫ for all positive integers m, for sufficiently large n.

6.3.2 Classic examples

The geometric series Σzⁿ has the partial sum

Σ_(n=0)^N zⁿ = (1 − z^(N+1))/(1 − z) for z ≠ 1, and = N + 1 for z = 1

Therefore Σzⁿ converges if |z| < 1, and the sum is 1/(1 − z). If |z| ≥ 1 the series diverges.

The harmonic series Σn⁻¹ diverges. Consider the partial sum

S_N = Σ_(n=1)^N 1/n > ∫_1^(N+1) dx/x = ln(N + 1)

Therefore S_N increases without bound and does not tend to a limit as N → ∞.
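A short numerical illustration (a sketch, not in the notes): partial sums of the geometric series with z = 1/2 settle to 2, while those of the harmonic series track ln(N + 1) and grow without bound.

    import math

    def geometric_partial(z, N):
        return sum(z**n for n in range(N + 1))

    def harmonic_partial(N):
        return sum(1.0/n for n in range(1, N + 1))

    for N in (10, 100, 1000):
        print(N, geometric_partial(0.5, N),             # -> 2 = 1/(1 - 1/2)
              harmonic_partial(N), math.log(N + 1))     # keeps growing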

6.3.3 Absolute and conditional convergence

If Σ|uₙ| converges, then Σuₙ also converges. Σuₙ is then said to converge absolutely.

If Σ|uₙ| diverges, then Σuₙ may or may not converge. If it does, it is said to converge conditionally.

[Proof of the first statement above: If Σ|uₙ| converges then, for any positive number ǫ,

|uₙ₊₁| + |uₙ₊₂| + ··· + |uₙ₊ₘ| < ǫ

for all positive integers m, for sufficiently large n. But then

|uₙ₊₁ + uₙ₊₂ + ··· + uₙ₊ₘ| ≤ |uₙ₊₁| + |uₙ₊₂| + ··· + |uₙ₊ₘ| < ǫ

and so Σuₙ also converges.]

6.3.4 Necessary condition for convergence

A necessary condition for Σuₙ to converge is that uₙ → 0 as n → ∞. Formally, this can be shown by noting that uₙ = Sₙ − Sₙ₋₁. If the series converges then lim Sₙ = lim Sₙ₋₁ = S, and therefore lim uₙ = 0.

This condition is not sufficient for convergence, as exemplified by the harmonic series.

6.3.5 The comparison test

This refers to series of non-negative real numbers. We write these as Σ|uₙ| because the comparison test is most often applied in assessing the absolute convergence of a series of real or complex numbers.

If Σ|vₙ| converges and |uₙ| ≤ |vₙ| for all n, then Σ|uₙ| also converges. This follows from the fact that

Σ_(n=n₀)^N |uₙ| ≤ Σ_(n=n₀)^N |vₙ|

and each partial sum is a non-decreasing sequence, which must either tend to a limit or increase without bound.

More generally, if Σ|vₙ| converges and |uₙ| ≤ K|vₙ| for sufficiently large n, where K is a constant, then Σ|uₙ| also converges.

Conversely, if Σ|vₙ| diverges and |uₙ| ≥ K|vₙ| for sufficiently large n, where K is a positive constant, then Σ|uₙ| also diverges.

In particular, if Σ|vₙ| converges (diverges) and |uₙ|/|vₙ| tends to a non-zero limit as n → ∞, then Σ|uₙ| also converges (diverges).

6.3.6 D’Alembert’s ratio test

This uses a comparison between a given series Σuₙ of complex terms and a geometric series Σvₙ = Σrⁿ, where r > 0.

The absolute ratio of successive terms is

rₙ = |uₙ₊₁/uₙ|

Suppose that rₙ tends to a limit r as n → ∞. Then

• if r < 1, Σuₙ converges absolutely
• if r > 1, Σuₙ diverges (uₙ does not tend to zero)
• if r = 1, a different test is required

Even if rₙ does not tend to a limit, if rₙ ≤ r for sufficiently large n, where r < 1 is a constant, then Σuₙ converges absolutely.

Example: for the harmonic series Σn⁻¹, rₙ = n/(n + 1) → 1 as n → ∞. A different test is required, such as the integral comparison test used above.

The ratio test is useless for series in which some of the terms are zero. However, it can easily be adapted by relabelling the series to remove the vanishing terms.

6.3.7 Cauchy’s root test

The same conclusions as for the ratio test apply when instead

rₙ = |uₙ|^(1/n)

This result also follows from a comparison with a geometric series. It is more powerful than the ratio test but usually harder to apply.

6.4 Functions of a continuous variable

6.4.1 Limits and continuity

We now consider how a real or complex function f(z) of a real or complex variable z behaves near a point z₀.

Definition. The function f(z) tends to the limit L as z → z₀ if, for any positive number ǫ, there exists a positive number δ, depending on ǫ, such that |f(z) − L| < ǫ for all z such that |z − z₀| < δ.

We write this as

lim_(z→z₀) f(z) = L  or  f(z) → L as z → z₀

The value of L would normally be f(z₀). However, cases such as

lim_(z→0) (sin z)/z = 1

must be expressed as limits because sin 0/0 = 0/0 is indeterminate.

Definition. The function f(z) is continuous at the point z = z₀ if f(z) → f(z₀) as z → z₀.

Definition. The function f(z) is bounded as z → z₀ if there exist positive numbers K and δ such that |f(z)| < K for all z with |z − z₀| < δ.

Definition. The function f(z) tends to the limit L as z → ∞ if, for any positive number ǫ, there exists a positive number R, depending on ǫ, such that |f(z) − L| < ǫ for all z with |z| > R.

We write this as

lim_(z→∞) f(z) = L  or  f(z) → L as z → ∞

Definition. The function f(z) is bounded as z → ∞ if there exist positive numbers K and R such that |f(z)| < K for all z with |z| > R.

There are different ways in which z can approach z₀ or ∞, especially in the complex plane. Sometimes the limit or bound applies only if approached in a particular way, e.g.

lim_(x→+∞) tanh x = 1,  lim_(x→−∞) tanh x = −1

This notation implies that x is approaching positive or negative real infinity along the real axis. In the context of real variables x → ∞ usually means specifically x → +∞. A related notation for one-sided limits is exemplified by

lim_(x→0+) x(1 + x)/|x| = 1,  lim_(x→0−) x(1 + x)/|x| = −1

6.4.2 The O notation

The useful symbols O, o and ∼ are used to compare the rates of growth or decay of different functions.

• f(z) = O(g(z)) as z → z₀ means that f(z)/g(z) is bounded as z → z₀
• f(z) = o(g(z)) as z → z₀ means that f(z)/g(z) → 0 as z → z₀
• f(z) ∼ g(z) as z → z₀ means that f(z)/g(z) → 1 as z → z₀

If f(z) ∼ g(z) we say that f is asymptotically equal to g. This should not be written as f(z) → g(z).

Notes:

• these definitions also apply when z₀ = ∞
• f(z) = O(1) means that f(z) is bounded
• either f(z) = o(g(z)) or f(z) ∼ g(z) implies f(z) = O(g(z))
• only f(z) ∼ g(z) is a symmetric relation

Examples:

A cos z = O(1) as z → 0
A sin z = O(z) = o(1) as z → 0
ln x = o(x) as x → +∞
cosh x ∼ (1/2)e^x as x → +∞

6.5 Taylor’s theorem for functions of a real variable

Let f(x) be a (real or complex) function of a real variable x, which is differentiable at least n times in the interval x₀ ≤ x ≤ x₀ + h. Then

f(x₀ + h) = f(x₀) + hf′(x₀) + (h²/2!)f″(x₀) + ··· + [h^(n−1)/(n − 1)!]f^(n−1)(x₀) + Rₙ

where

Rₙ = ∫_(x₀)^(x₀+h) [(x₀ + h − x)^(n−1)/(n − 1)!] f^(n)(x) dx

is the remainder after n terms of the Taylor series.

[Proof: integrate Rₙ by parts n times.]

The remainder term can be expressed in various ways. Lagrange’s expression for the remainder is

Rₙ = (hⁿ/n!) f^(n)(ξ)

where ξ is an unknown number in the interval x₀ < ξ < x₀ + h. So

Rₙ = O(hⁿ)

If f(x) is infinitely differentiable in x₀ ≤ x ≤ x₀ + h (it is a smooth function) we can write an infinite Taylor series

f(x₀ + h) = Σ_(n=0)^∞ (hⁿ/n!) f^(n)(x₀)

which converges for sufficiently small h (as discussed below).
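A numerical check of Rₙ = O(hⁿ) (a sketch, not in the notes): truncating the Taylor series of sin x about x₀ = 0 after the h³ term leaves a remainder that scales like h⁵, the order of the next non-zero term.

    import math

    def sin_taylor(h):
        return h - h**3/6          # Taylor series of sin about 0, truncated

    for h in (0.1, 0.01, 0.001):
        err = abs(math.sin(h) - sin_taylor(h))
        print(h, err, err/h**5)    # err/h**5 is roughly constant (about 1/120)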

6.6 Analytic functions of a complex variable

6.6.1 Complex differentiability

Definition. The derivative of the function f(z) at the point z = z₀ is

f′(z₀) = lim_(z→z₀) [f(z) − f(z₀)]/(z − z₀)

If this exists, the function f(z) is differentiable at z = z₀.

Another way to write this is

df/dz ≡ f′(z) = lim_(δz→0) [f(z + δz) − f(z)]/δz

Requiring a function of a complex variable to be differentiable is a surprisingly strong constraint. The limit must be the same when δz → 0 in any direction in the complex plane.

6.6.2 The Cauchy–Riemann equations

Separate f = u + iv and z = x + iy into their real and imaginary parts:

f(z) = u(x, y) + i v(x, y)

If f′(z) exists we can calculate it by assuming that δz = δx + i δy approaches 0 along the real axis, so δy = 0:

f′(z) = lim_(δx→0) [f(z + δx) − f(z)]/δx

      = lim_(δx→0) [u(x + δx, y) + i v(x + δx, y) − u(x, y) − i v(x, y)]/δx

      = lim_(δx→0) [u(x + δx, y) − u(x, y)]/δx + i lim_(δx→0) [v(x + δx, y) − v(x, y)]/δx

      = ∂u/∂x + i ∂v/∂x

The derivative should have the same value if δz approaches 0 along the imaginary axis, so δx = 0:

f′(z) = lim_(δy→0) [f(z + i δy) − f(z)]/(i δy)

      = lim_(δy→0) [u(x, y + δy) + i v(x, y + δy) − u(x, y) − i v(x, y)]/(i δy)

      = −i lim_(δy→0) [u(x, y + δy) − u(x, y)]/δy + lim_(δy→0) [v(x, y + δy) − v(x, y)]/δy

      = −i ∂u/∂y + ∂v/∂y

Comparing the real and imaginary parts of these expressions, we deduce the Cauchy–Riemann equations

∂u/∂x = ∂v/∂y,  ∂v/∂x = −∂u/∂y

These are necessary conditions for f(z) to have a complex derivative. They are also sufficient conditions, provided that the partial derivatives are also continuous.

6.6.3 Analytic functions

If a function f(z) has a complex derivative at every point z in a region R of the complex plane, it is said to be analytic in R. To be analytic at a point z = z₀, f(z) must be differentiable throughout some neighbourhood |z − z₀| < ǫ of that point.

Examples of functions that are analytic in the whole complex plane (known as entire functions):

• f(z) = c, a complex constant
• f(z) = z, for which u = x and v = y, and we easily verify the CR equations
• f(z) = zⁿ, where n is a positive integer
• f(z) = P(z) = cₙzⁿ + cₙ₋₁z^(n−1) + ··· + c₀, a general polynomial function with complex coefficients
• f(z) = exp(z)

In the case of the exponential function we have

f(z) = e^z = e^x e^(iy) = e^x cos y + i e^x sin y = u + iv

The CR equations are satisfied for all x and y:

∂u/∂x = e^x cos y = ∂v/∂y
∂v/∂x = e^x sin y = −∂u/∂y

The derivative of the exponential function is

f′(z) = ∂u/∂x + i ∂v/∂x = e^x cos y + i e^x sin y = e^z

as expected.

Sums, products and compositions of analytic functions are also analytic, e.g.

f(z) = z exp(iz²) + z³

The usual product, quotient and chain rules apply to complex derivatives of analytic functions. Familiar relations such as

d(zⁿ)/dz = nz^(n−1),  d(sin z)/dz = cos z,  d(cosh z)/dz = sinh z

apply as usual.

Many complex functions are analytic everywhere in the complex plane except at isolated points, which are called the singular points or singularities of the function.

Examples:

• f(z) = P(z)/Q(z), where P(z) and Q(z) are polynomials. This is called a rational function and is analytic except at points where Q(z) = 0.
• f(z) = z^c, where c is a complex constant, is analytic except at z = 0 (unless c is a non-negative integer)
• f(z) = ln z is also analytic except at z = 0

The last two examples are in fact multiple-valued functions, which require special treatment (see next term).

Examples of non-analytic functions:

• f(z) = Re(z), for which u = x and v = 0, so the CR equations are not satisfied anywhere
• f(z) = z*, for which u = x and v = −y
• f(z) = |z|, for which u = (x² + y²)^(1/2) and v = 0
• f(z) = |z|², for which u = x² + y² and v = 0

In the last case the CR equations are satisfied only at x = y = 0 and we can say that f′(0) = 0. However, f(z) is not analytic even at z = 0 because it is not differentiable throughout any neighbourhood |z| < ǫ of 0.

6.6.4 Consequences of the Cauchy–Riemann equations

If we know the real part of an analytic function in some region, we can find its imaginary part (or vice versa) up to an additive constant by integrating the CR equations.

Example:

u(x, y) = x² − y²

∂v/∂y = ∂u/∂x = 2x  ⇒  v = 2xy + g(x)

∂v/∂x = −∂u/∂y  ⇒  2y + g′(x) = 2y  ⇒  g′(x) = 0

Therefore v(x, y) = 2xy + c, where c is a real constant, and we recognize

f(z) = x² − y² + i(2xy + c) = (x + iy)² + ic = z² + ic

The real and imaginary parts of an analytic function satisfy Laplace’s equation (they are harmonic functions):

∂²u/∂x² + ∂²u/∂y² = ∂/∂x(∂u/∂x) + ∂/∂y(∂u/∂y) = ∂/∂x(∂v/∂y) + ∂/∂y(−∂v/∂x) = 0

The proof that ∇²v = 0 is similar. This provides a useful method for solving Laplace’s equation in two dimensions. Furthermore,

∇u · ∇v = (∂u/∂x)(∂v/∂x) + (∂u/∂y)(∂v/∂y) = (∂v/∂y)(∂v/∂x) − (∂v/∂x)(∂v/∂y) = 0

and so the curves of constant u and those of constant v intersect at right angles. u and v are said to be conjugate harmonic functions.
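The integration in this example is easy to mechanize; the following sympy sketch (an added illustration, not from the notes) checks that u is harmonic and reconstructs its conjugate v up to the additive constant:

    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    u = x**2 - y**2

    # u must satisfy Laplace's equation
    assert sp.simplify(sp.diff(u, x, 2) + sp.diff(u, y, 2)) == 0

    # integrate dv/dy = du/dx, then fix g(x) from dv/dx = -du/dy
    v = sp.integrate(sp.diff(u, x), y)        # gives 2*x*y, up to g(x)
    gprime = -sp.diff(u, y) - sp.diff(v, x)   # g'(x); independent of y
    assert sp.simplify(sp.diff(gprime, y)) == 0
    print(v + sp.integrate(gprime, x))        # 2*x*y (+ arbitrary constant)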

6.7 Taylor series for analytic functions

If a function of a complex variable is analytic in a region R of the complex plane, not only is it differentiable everywhere in R, it is also differentiable any number of times. If f(z) is analytic at z = z₀, it has an infinite Taylor series

f(z) = Σ_(n=0)^∞ aₙ(z − z₀)ⁿ,  aₙ = f^(n)(z₀)/n!

which converges within some neighbourhood of z₀ (as discussed below). In fact this can be taken as a definition of analyticity.

6.8 Zeros, poles and essential singularities

6.8.1 Zeros of complex functions

The zeros of f(z) are the points z = z₀ in the complex plane where f(z₀) = 0. A zero is of order N if

f(z₀) = f′(z₀) = f″(z₀) = ··· = f^(N−1)(z₀) = 0 but f^(N)(z₀) ≠ 0

The first non-zero term in the Taylor series of f(z) about z = z₀ is then proportional to (z − z₀)^N. Indeed

f(z) ∼ a_N(z − z₀)^N as z → z₀

A simple zero is a zero of order 1. A double zero is one of order 2, etc.

Examples:

• f(z) = z has a simple zero at z = 0
• f(z) = (z − i)² has a double zero at z = i
• f(z) = z² − 1 = (z − 1)(z + 1) has simple zeros at z = ±1

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Find and classify the zeros of f(z) = sinh z.

0 = sinh z = (1/2)(e^z − e^(−z))

e^z = e^(−z)

e^(2z) = 1

2z = 2nπi

z = nπi, n ∈ Z

f′(z) = cosh z = cos(nπ) ≠ 0 at these points ⇒ all simple zeros

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6.8.2 Poles

If g(z) is analytic and non-zero at z = z₀ then it has an expansion (locally, at least) of the form

g(z) = Σ_(n=0)^∞ bₙ(z − z₀)ⁿ with b₀ ≠ 0

Consider the function

f(z) = (z − z₀)^(−N) g(z) = Σ_(n=−N)^∞ aₙ(z − z₀)ⁿ

with aₙ = bₙ₊N and so a₋N ≠ 0. This is not a Taylor series because it includes negative powers of z − z₀, and f(z) is not analytic at z = z₀. We say that f(z) has a pole of order N. Note that

f(z) ∼ a₋N(z − z₀)^(−N) as z → z₀

A simple pole is a pole of order 1. A double pole is one of order 2, etc.

Notes:

• if f(z) has a zero of order N at z = z₀, then 1/f(z) has a pole of order N there, and vice versa
• if f(z) is analytic and non-zero at z = z₀ and g(z) has a zero of order N there, then f(z)/g(z) has a pole of order N there

Example:

f(z) = 2z/[(z + 1)(z − i)²]

has a simple pole at z = −1 and a double pole at z = i (as well as a simple zero at z = 0). The expansion about the double pole can be carried out by letting z = i + w and expanding in w:

f(z) = 2(i + w)/[(i + w + 1)w²]

     = 2i(1 − iw)/{(i + 1)[1 + (1/2)(1 − i)w]w²}

     = [2i/((i + 1)w²)] (1 − iw)[1 − (1/2)(1 − i)w + O(w²)]

     = (1 + i)w⁻²[1 − (1/2)(1 + i)w + O(w²)]

     = (1 + i)(z − i)⁻² − i(z − i)⁻¹ + O(1) as z → i
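The expansion can be checked with sympy (an added sketch): substitute z = i + w and expand in the local variable w.

    import sympy as sp

    z, w = sp.symbols('z w')
    f = 2*z / ((z + 1)*(z - sp.I)**2)

    # Laurent-type expansion about the double pole z = i
    print(sp.series(f.subs(z, sp.I + w), w, 0, 1))
    # the leading terms (1 + I)/w**2 - I/w agree with the hand calculation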

6.8.3 Essential singularities

It can be shown that any function that is analytic (and single-valued) throughout an annulus a < |z − z₀| < b centred on a point z = z₀ has a unique Laurent series

f(z) = Σ_(n=−∞)^∞ aₙ(z − z₀)ⁿ

which converges for all values of z within the annulus.

If a = 0, then f(z) is analytic throughout the disk |z − z₀| < b except possibly at z = z₀ itself, and the Laurent series determines the behaviour of f(z) near z = z₀. There are three possibilities:

• if the first non-zero term in the Laurent series has n ≥ 0, then f(z) is analytic at z = z₀ and the series is just a Taylor series
• if the first non-zero term in the Laurent series has n = −N < 0, then f(z) has a pole of order N at z = z₀
• otherwise, if the Laurent series involves an infinite number of terms with n < 0, then f(z) has an essential singularity at z = z₀

A classic example of an essential singularity is f(z) = e^(1/z) at z = 0. Here we can generate the Laurent series from a Taylor series in 1/z:

e^(1/z) = Σ_(n=0)^∞ (1/n!)(1/z)ⁿ = Σ_(n=−∞)^0 zⁿ/(−n)!

The behaviour of a function near an essential singularity is remarkably complicated. Picard’s theorem states that, in any neighbourhood of an essential singularity, the function takes all possible complex values (possibly with one exception) at infinitely many points. (In the case of f(z) = e^(1/z), the exceptional value 0 is never attained.)

6.8.4 Behaviour at infinity

We can examine the behaviour of a function f(z) as z → ∞ by defining a new variable ζ = 1/z and a new function g(ζ) = f(z). Then z = ∞ maps to a single point ζ = 0, the point at infinity.

If g(ζ) has a zero, pole or essential singularity at ζ = 0, then we can say that f(z) has the corresponding property at z = ∞.

Examples: f₁(z) = e^z = e^(1/ζ) = g₁(ζ) has an essential singularity at z = ∞. f₂(z) = z² = 1/ζ² = g₂(ζ) has a double pole at z = ∞. f₃(z) = e^(1/z) = e^ζ = g₃(ζ) is analytic at z = ∞.

6.9 Convergence of power series

6.9.1 Circle of convergence

If the power series

f(z) = Σ_(n=0)^∞ aₙ(z − z₀)ⁿ

converges for z = z₁, then it converges absolutely for all z such that |z − z₀| < |z₁ − z₀|.

[Proof: The necessary condition for convergence, lim aₙ(z₁ − z₀)ⁿ = 0, implies that |aₙ(z₁ − z₀)ⁿ| < ǫ for sufficiently large n, for any ǫ > 0. Therefore |aₙ(z − z₀)ⁿ| < ǫrⁿ for sufficiently large n, with r = |(z − z₀)/(z₁ − z₀)| < 1. By comparison with the geometric series Σrⁿ, Σ|aₙ(z − z₀)ⁿ| converges.]

It follows that, if the power series diverges for z = z₂, then it diverges for all z such that |z − z₀| > |z₂ − z₀|.

Therefore the series converges for |z − z₀| < R and diverges for |z − z₀| > R. R is called the radius of convergence and may be zero (exceptionally), positive or infinite.

|z − z₀| = R is the circle of convergence. The series converges inside it and diverges outside. On the circle, it may either converge or diverge.

6.9.2 Determination of the radius of convergence

The absolute ratio of successive terms in a power series is

rₙ = |aₙ₊₁/aₙ| |z − z₀|

Suppose that |aₙ₊₁/aₙ| → L as n → ∞. Then rₙ → r = L|z − z₀|. According to the ratio test, the series converges for L|z − z₀| < 1 and diverges for L|z − z₀| > 1. The radius of convergence is R = 1/L.

The same result, R = 1/L, follows from Cauchy’s root test if instead |aₙ|^(1/n) → L as n → ∞.

The radius of convergence of the Taylor series of a function f(z) about the point z = z₀ is equal to the distance of the nearest singular point of the function f(z) from z₀. Since a convergent power series defines an analytic function, no singularity can lie inside the circle of convergence.

6.9.3 Examples

The following examples are generated from familiar Taylor series.

ln(1 − z) = −z − z²/2 − z³/3 − ··· = −Σ_(n=1)^∞ zⁿ/n

Here |aₙ₊₁/aₙ| = n/(n + 1) → 1 as n → ∞, so R = 1. The series converges for |z| < 1 and diverges for |z| > 1. (In fact, on the circle |z| = 1, the series converges except at the point z = 1.) The function has a singularity at z = 1 that limits the radius of convergence.

arctan z = z − z³/3 + z⁵/5 − z⁷/7 + ··· = z Σ_(n=0)^∞ (−z²)ⁿ/(2n + 1)

Thought of as a power series in (−z²), this has |aₙ₊₁/aₙ| = (2n + 1)/(2n + 3) → 1 as n → ∞. Therefore R = 1 in terms of (−z²). But since |−z²| = 1 is equivalent to |z| = 1, the series converges for |z| < 1 and diverges for |z| > 1.

e^z = 1 + z + z²/2 + z³/6 + ··· = Σ_(n=0)^∞ zⁿ/n!

Here |aₙ₊₁/aₙ| = 1/(n + 1) → 0 as n → ∞, so R = ∞. The series converges for all z; this is an entire function.
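The coefficient-ratio criterion is easy to apply numerically (a sketch, not from the notes): R is estimated by |aₙ/aₙ₊₁| for large n.

    import math

    def radius_estimate(a, n):
        """Estimate the radius of convergence as |a(n)/a(n+1)|."""
        return abs(a(n) / a(n + 1))

    print(radius_estimate(lambda n: 1/n, 1000))                # ln(1-z): -> 1
    print(radius_estimate(lambda n: 1/math.factorial(n), 20))  # e^z: 21, and it
                                                               # grows without bound
                                                               # (R = infinity)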

7 Ordinary differential equations

7.1 Motivation

Very many scientific problems are described by differential equations. Even if these are partial differential equations, they can often be reduced to ordinary differential equations (ODEs), e.g. by the method of separation of variables.

The ODEs encountered most frequently are linear and of either first or second order. In particular, second-order equations describe oscillatory phenomena.

Part IA dealt with first-order ODEs and also with linear second-order ODEs with constant coefficients. Here we deal with general linear second-order ODEs.

The general linear inhomogeneous first-order ODE

y′(x) + p(x)y(x) = f(x)

can be solved using the integrating factor g = exp(∫p dx) to obtain the general solution

y = (1/g) ∫gf dx

Provided that the integrals can be evaluated, the problem is completely solved. An equivalent method does not exist for second-order ODEs, but an extensive theory can still be developed.

7.2 Classification

The general second-order ODE is an equation of the form

F(y″, y′, y, x) = 0

for an unknown function y(x), where y′ = dy/dx and y″ = d²y/dx².

The general linear second-order ODE has the form

Ly = f

where L is a linear operator such that

Ly = ay″ + by′ + cy

where a, b, c, f are functions of x.

The equation is homogeneous (unforced) if f = 0, otherwise it is inhomogeneous (forced).

The principle of superposition applies to linear ODEs as to all linear equations.

Although the solution may be of interest only for real x, it is often informative to analyse the solution in the complex domain.

7.3 Homogeneous linear second-order ODEs

7.3.1 Linearly independent solutions

Divide through by the coefficient of y″ to obtain a standard form

y″(x) + p(x)y′(x) + q(x)y(x) = 0

Suppose that y₁(x) and y₂(x) are two solutions of this equation. They are linearly independent if

Ay₁(x) + By₂(x) = 0 (for all x) implies A = B = 0

i.e. if one is not simply a constant multiple of the other.

If y₁(x) and y₂(x) are linearly independent solutions, then the general solution of the ODE is

y(x) = Ay₁(x) + By₂(x)

where A and B are arbitrary constants. There are two arbitrary constants because the equation is of second order.

7.3.2 The Wronskian

The Wronskian W(x) of two solutions y₁(x) and y₂(x) of a second-order ODE is the determinant of the Wronskian matrix:

W[y₁, y₂] = | y₁  y₂  | = y₁y₂′ − y₂y₁′
            | y₁′ y₂′ |

Suppose that Ay₁(x) + By₂(x) = 0 in some interval of x. Then also Ay₁′(x) + By₂′(x) = 0, and so

( y₁  y₂  ) ( A )   ( 0 )
( y₁′ y₂′ ) ( B ) = ( 0 )

If this is satisfied for non-trivial A, B then W = 0 (in that interval of x). Therefore the solutions are linearly independent if W ≠ 0.

7.3.3 Calculation of the Wronskian

Consider

W′ = y₁y₂″ − y₂y₁″ = y₁(−py₂′ − qy₂) − y₂(−py₁′ − qy₁) = −pW

since both y₁ and y₂ solve the ODE. This first-order ODE for W has the solution

W = exp(−∫p dx)

This expression involves an indefinite integral and could be written more fussily as

W(x) = exp[−∫^x p(ξ) dξ]

Notes:

• the indefinite integral involves an arbitrary additive constant, so W involves an arbitrary multiplicative constant
• apart from that, W is the same for any two solutions y₁ and y₂
• W is therefore an intrinsic property of the ODE
• if W ≠ 0 for one value of x (and p is integrable) then W ≠ 0 for all x, so linear independence need be checked at only one value of x

7.3.4 Finding a second solution

Suppose that one solution y₁(x) is known. Then a second solution y₂(x) can be found as follows.

First find W as described above. The definition of W then provides a first-order linear ODE for the unknown y₂:

y₁y₂′ − y₂y₁′ = W

y₂′/y₁ − y₂y₁′/y₁² = W/y₁²

d(y₂/y₁)/dx = W/y₁²

y₂ = y₁ ∫(W/y₁²) dx

Again, this result involves an indefinite integral and could be written more fussily as

y₂(x) = y₁(x) ∫^x W(ξ)/[y₁(ξ)]² dξ

Notes:

• the indefinite integral involves an arbitrary additive constant, since any amount of y₁ can be added to y₂
• W involves an arbitrary multiplicative constant, since y₂ can be multiplied by any constant
• this expression for y₂ therefore provides the general solution of the ODE

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Given that y = xⁿ is a solution of x²y″ − (2n − 1)xy′ + n²y = 0, find the general solution.

Standard form:

y″ − [(2n − 1)/x]y′ + (n²/x²)y = 0

Wronskian:

W = exp(−∫p dx) = exp{∫[(2n − 1)/x] dx} = exp[(2n − 1) ln x + constant] = Ax^(2n−1)

Second solution:

y₂ = y₁ ∫(W/y₁²) dx = xⁿ ∫Ax⁻¹ dx = Axⁿ ln x + Bxⁿ

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

The same result can be obtained by writing y₂(x) = y₁(x)u(x) and obtaining a first-order linear ODE for u′. This method applies to higher-order linear ODEs and is reminiscent of the factorization of polynomial equations.
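A sympy check of this example (an added sketch): substitute the general solution back into the ODE and confirm that it vanishes identically.

    import sympy as sp

    x = sp.Symbol('x', positive=True)
    A, B, n = sp.symbols('A B n')

    y = A*x**n*sp.log(x) + B*x**n   # the general solution found above
    ode = x**2*sp.diff(y, x, 2) - (2*n - 1)*x*sp.diff(y, x) + n**2*y
    print(sp.simplify(sp.expand(ode)))   # 0: the solution satisfies the ODE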

7.4 Series solutions

7.4.1 Ordinary and singular points

We consider a homogeneous linear second-order ODE in standard form:

y″(x) + p(x)y′(x) + q(x)y(x) = 0

A point x = x₀ is an ordinary point of the ODE if:

p(x) and q(x) are both analytic at x = x₀

Otherwise it is a singular point.

A singular point x = x₀ is regular if:

(x − x₀)p(x) and (x − x₀)²q(x) are both analytic at x = x₀

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Identify the singular points of Legendre’s equation

(1 − x²)y″ − 2xy′ + ℓ(ℓ + 1)y = 0

where ℓ is a constant, and determine their nature. Divide through by (1 − x²) to obtain the standard form with

p(x) = −2x/(1 − x²),  q(x) = ℓ(ℓ + 1)/(1 − x²)

Both p(x) and q(x) are analytic for all x except x = ±1. These are the singular points. They are both regular:

(x − 1)p(x) = 2x/(1 + x),  (x − 1)²q(x) = ℓ(ℓ + 1)(1 − x)/(1 + x)

are both analytic at x = 1, and similarly for x = −1. . . . . . . . . . . . . . . . .

7.4.2 Series solutions about an ordinary point

Wherever p(x) and q(x) are analytic, the ODE has two linearly independent solutions that are also analytic. If x = x₀ is an ordinary point, the ODE has two linearly independent solutions of the form

y = Σ_(n=0)^∞ aₙ(x − x₀)ⁿ

The coefficients aₙ can be determined by substituting the series into the ODE and comparing powers of (x − x₀). The radius of convergence of the series solutions is the distance to the nearest singular point of the ODE in the complex plane.

Since p(x) and q(x) are analytic at x = x₀,

p(x) = Σ_(n=0)^∞ pₙ(x − x₀)ⁿ,  q(x) = Σ_(n=0)^∞ qₙ(x − x₀)ⁿ

inside some circle centred on x = x₀. Let

y = Σ_(n=0)^∞ aₙ(x − x₀)ⁿ

y′ = Σ_(n=0)^∞ naₙ(x − x₀)^(n−1) = Σ_(n=0)^∞ (n + 1)aₙ₊₁(x − x₀)ⁿ

y″ = Σ_(n=0)^∞ n(n − 1)aₙ(x − x₀)^(n−2) = Σ_(n=0)^∞ (n + 2)(n + 1)aₙ₊₂(x − x₀)ⁿ

Note the following rule for multiplying power series:

pq = Σ_(ℓ=0)^∞ pₗ(x − x₀)^ℓ Σ_(m=0)^∞ qₘ(x − x₀)^m = Σ_(n=0)^∞ [Σ_(r=0)^n pₙ₋ᵣqᵣ](x − x₀)ⁿ

Thus

py′ = Σ_(n=0)^∞ [Σ_(r=0)^n pₙ₋ᵣ(r + 1)aᵣ₊₁](x − x₀)ⁿ

qy = Σ_(n=0)^∞ [Σ_(r=0)^n qₙ₋ᵣaᵣ](x − x₀)ⁿ

The coefficient of (x − x₀)ⁿ in the ODE y″ + py′ + qy = 0 is therefore

(n + 2)(n + 1)aₙ₊₂ + Σ_(r=0)^n pₙ₋ᵣ(r + 1)aᵣ₊₁ + Σ_(r=0)^n qₙ₋ᵣaᵣ = 0

This is a recurrence relation that determines aₙ₊₂ (for n ≥ 0) in terms of the preceding coefficients a₀, a₁, . . . , aₙ₊₁. The first two coefficients a₀ and a₁ are not determined: they are the two arbitrary constants in the general solution.

The above procedure is rarely followed in practice. If p and q are rational functions (i.e. ratios of polynomials) it is a much better idea to multiply the ODE through by a suitable factor to clear denominators before substituting in the power series for y, y′ and y″.

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Find series solutions about x = 0 of Legendre’s equation

(1 − x²)y″ − 2xy′ + ℓ(ℓ + 1)y = 0

x = 0 is an ordinary point, so let

y = Σ_(n=0)^∞ aₙxⁿ,  y′ = Σ_(n=0)^∞ naₙx^(n−1),  y″ = Σ_(n=0)^∞ n(n − 1)aₙx^(n−2)

Substitute these expressions into the ODE to obtain

Σ_(n=0)^∞ n(n − 1)aₙx^(n−2) + Σ_(n=0)^∞ [−n(n − 1) − 2n + ℓ(ℓ + 1)]aₙxⁿ = 0

The coefficient of xⁿ (for n ≥ 0) is

(n + 1)(n + 2)aₙ₊₂ + [−n(n + 1) + ℓ(ℓ + 1)]aₙ = 0

The recurrence relation is therefore

aₙ₊₂/aₙ = [n(n + 1) − ℓ(ℓ + 1)]/[(n + 1)(n + 2)] = (n − ℓ)(n + ℓ + 1)/[(n + 1)(n + 2)]

a₀ and a₁ are the arbitrary constants. The other coefficients follow from the recurrence relation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Notes:

• an even solution is obtained by choosing a₀ = 1 and a₁ = 0
• an odd solution is obtained by choosing a₀ = 0 and a₁ = 1
• these solutions are obviously linearly independent since one is not a constant multiple of the other
• since |aₙ₊₂/aₙ| → 1 as n → ∞, the even and odd series converge for |x²| < 1, i.e. for |x| < 1
• the radius of convergence is the distance to the singular points of Legendre’s equation at x = ±1
• if ℓ ≥ 0 is an even integer, then a_(ℓ+2) and all subsequent even coefficients vanish, so the even solution is a polynomial (terminating power series) of degree ℓ
• if ℓ ≥ 1 is an odd integer, then a_(ℓ+2) and all subsequent odd coefficients vanish, so the odd solution is a polynomial of degree ℓ

The polynomial solutions are called Legendre polynomials, P_ℓ(x). They are conventionally normalized (i.e. a₀ or a₁ is chosen) such that P_ℓ(1) = 1, e.g.

P₀(x) = 1,  P₁(x) = x,  P₂(x) = (1/2)(3x² − 1)
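The recurrence can be iterated mechanically; the following sympy sketch (an added illustration) rebuilds the even solution for ℓ = 2 and confirms that it terminates:

    import sympy as sp

    x = sp.Symbol('x')
    ell = 2                    # an even integer, so the even series terminates

    a = {0: sp.Integer(1), 1: sp.Integer(0)}   # even solution: a0 = 1, a1 = 0
    for n in range(8):
        a[n + 2] = a[n] * sp.Rational((n - ell)*(n + ell + 1), (n + 1)*(n + 2))

    y = sp.expand(sum(c*x**k for k, c in a.items()))
    print(y)    # 1 - 3*x**2: a polynomial of degree 2, proportional to
                # P2(x) = (3*x**2 - 1)/2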

7.4.3 Series solutions about a regular singular point

If x = x₀ is a regular singular point, Fuchs’s theorem guarantees that the ODE has at least one solution of the form

y = Σ_(n=0)^∞ aₙ(x − x₀)^(n+σ),  a₀ ≠ 0

i.e. a Taylor series multiplied by a power (x − x₀)^σ, where the index σ is a (generally complex) number to be determined.

Notes:

• this is a Taylor series only if σ is a non-negative integer
• there may be one or two solutions of this form (see below)
• the condition a₀ ≠ 0 is required to define σ uniquely

To understand in simple terms why the solutions behave in this way, recall that

P(x) ≡ (x − x₀)p(x) = Σ_(n=0)^∞ Pₙ(x − x₀)ⁿ

Q(x) ≡ (x − x₀)²q(x) = Σ_(n=0)^∞ Qₙ(x − x₀)ⁿ

are analytic at the regular singular point x = x₀. Near x = x₀ the ODE can be approximated using the leading approximations to p and q:

y″ + P₀y′/(x − x₀) + Q₀y/(x − x₀)² ≈ 0

The exact solutions of this approximate equation are of the form y = (x − x₀)^σ, where σ satisfies the indicial equation

σ(σ − 1) + P₀σ + Q₀ = 0

with two (generally complex) roots. [If the roots are equal, the solutions are (x − x₀)^σ and (x − x₀)^σ ln(x − x₀).] It is reasonable that the solutions of the full ODE should resemble the solutions of the approximate ODE near the singular point.

Frobenius’s method is used to find the series solutions about a regular singular point. This is best demonstrated by example.

Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

⊲ Find series solutions about x = 0 of Bessel’s equation

x²y″ + xy′ + (x² − ν²)y = 0

where ν is a constant. x = 0 is a regular singular point because p = 1/x and q = 1 − ν²/x² are singular there but xp = 1 and x²q = x² − ν² are both analytic there.

Seek a solution of the form

y = Σ_(n=0)^∞ aₙx^(n+σ),  a₀ ≠ 0

Then

y′ = Σ_(n=0)^∞ (n + σ)aₙx^(n+σ−1),  y″ = Σ_(n=0)^∞ (n + σ)(n + σ − 1)aₙx^(n+σ−2)

Bessel’s equation requires

Σ_(n=0)^∞ [(n + σ)(n + σ − 1) + (n + σ) − ν²]aₙx^(n+σ) + Σ_(n=0)^∞ aₙx^(n+σ+2) = 0

Relabel the second series:

Σ_(n=0)^∞ [(n + σ)² − ν²]aₙx^(n+σ) + Σ_(n=2)^∞ aₙ₋₂x^(n+σ) = 0

Now compare powers of x^(n+σ):

n = 0 :  [σ² − ν²]a₀ = 0    (1)
n = 1 :  [(1 + σ)² − ν²]a₁ = 0    (2)
n ≥ 2 :  [(n + σ)² − ν²]aₙ + aₙ₋₂ = 0    (3)

Since a₀ ≠ 0 by assumption, equation (1) provides the indicial equation

σ² − ν² = 0

with roots σ = ±ν. Equation (2) then requires that a₁ = 0 (except possibly in the case ν = ±1/2, but even then a₁ can be chosen to be 0). Equation (3) provides the recurrence relation

aₙ = −aₙ₋₂/[n(n + 2σ)]

Since a₁ = 0, all the odd coefficients vanish. Since |aₙ/aₙ₋₂| → 0 as n → ∞, the radius of convergence of the series is infinite.

For most values of ν we therefore obtain two linearly independent solutions (choosing a₀ = 1):

y₁ = x^ν [1 − x²/(4(1 + ν)) + x⁴/(32(1 + ν)(2 + ν)) + ···]

y₂ = x^(−ν) [1 − x²/(4(1 − ν)) + x⁴/(32(1 − ν)(2 − ν)) + ···]

However, if ν = 0 there is clearly only one solution of this form. Furthermore, if ν is a non-zero integer one of the recurrence relations will fail at some point and the corresponding series is invalid. In these cases the second solution is of a different form (see below). . . . . . . . . . . . . . . . .
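For ν = 0 and a₀ = 1, y₁ is exactly the Bessel function J₀ in the standard normalization, so the series can be checked against scipy’s j0 (an added sketch):

    from scipy.special import j0

    def y1_series(x, terms=20):
        # recurrence a_n = -a_{n-2} / (n*(n + 2*sigma)) with sigma = nu = 0
        a, total = 1.0, 1.0
        for n in range(2, 2*terms, 2):
            a = -a / (n*n)
            total += a * x**n
        return total

    for x in (0.5, 1.0, 5.0):
        print(x, y1_series(x), j0(x))   # the two values agree closely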

A general analysis shows that:

• if the roots of the indicial equation are equal, there is only one solution of the form Σaₙ(x − x₀)^(n+σ)
• if the roots differ by an integer, there is generally only one solution of this form because the recurrence relation for the smaller value of Re(σ) will usually (but not always) fail
• otherwise, there are two solutions of the form Σaₙ(x − x₀)^(n+σ)
• the radius of convergence of the series is the distance from the point of expansion to the nearest singular point of the ODE

If the roots σ₁, σ₂ of the indicial equation are equal or differ by an integer, one solution is of the form

y₁ = Σ_(n=0)^∞ aₙ(x − x₀)^(n+σ₁),  Re(σ₁) ≥ Re(σ₂)

and the other is of the form

y₂ = Σ_(n=0)^∞ bₙ(x − x₀)^(n+σ₂) + cy₁ ln(x − x₀)

The coefficients bₙ and c can be determined (with some difficulty) by substituting this form into the ODE and comparing coefficients of (x − x₀)ⁿ and (x − x₀)ⁿ ln(x − x₀). In exceptional cases c may vanish.

Alternatively, y₂ can be found (also with some difficulty) using the Wronskian method (section 7.3.4).

Example: Bessel’s equation of order ν = 0:

y₁ = 1 − x²/4 + x⁴/64 + ···

y₂ = y₁ ln x + x²/4 − 3x⁴/128 + ···

Example: Bessel’s equation of order ν = 1:

y₁ = x − x³/8 + x⁵/192 + ···

y₂ = y₁ ln x − 2/x + 3x³/32 + ···

These examples illustrate a feature that is commonly encountered in scientific applications: one solution is regular (i.e. analytic) and the other is singular. Usually only the regular solution is an acceptable solution of the scientific problem.

7.4.4 Irregular singular points

If either (x − x₀)p(x) or (x − x₀)²q(x) is not analytic at the point x = x₀, it is an irregular singular point of the ODE. The solutions can have worse kinds of singular behaviour there.

Example: the equation x⁴y″ + 2x³y′ − y = 0 has an irregular singular point at x = 0. Its solutions are exp(±x⁻¹), both of which have an essential singularity at x = 0.

In fact this example is just the familiar equation d²y/dz² = y with the substitution x = 1/z. Even this simple ODE has an irregular singular point at z = ∞.

Schedule Vector calculus. Suﬃx notation. Contractions using δij and ǫijk . Reminder of vector products, grad, div, curl, ∇2 and their representations using suﬃx notation. Divergence theorem and Stokes’s theorem. Vector diﬀerential operators in orthogonal curvilinear coordinates, e.g. cylindrical and spherical polar coordinates. Jacobians. [6 lectures] Partial diﬀerential equations. Linear second-order partial diﬀerential equations; physical examples of occurrence, the method of separation of variables (Cartesian coordinates only). [2 lectures] Green’s functions. Response to impulses, delta function (treated heuristically), Green’s functions for initial and boundary value problems. [3 lectures] Fourier transform. Fourier transforms; relation to Fourier series, simple properties and examples, convolution theorem, correlation fnuctions, Parseval’s theorem and power spectra. [2 lectures] Matrices. N -dimensional vector spaces, matrices, scalar product, transformation of basis vectors. Eigenvalues and eigenvectors of a matrix; degenerate case, stationary property of eigenvalues. Orthogonal and unitary transformations. Quadratic and Hermitian forms, quadric surfaces. [5 lectures] Elementary analysis. Idea of convergence and limits. O notation. Statement of Taylor’s theorem with discussion of remainder. Convergence of series; comparison and ratio tests. Power series of a complex variable; circle of convergence. Analytic functions: Cauchy–Riemann equations, rational functions and exp(z). Zeros, poles and essential singularities. [3 lectures] Series solutions of ordinary diﬀerential equations. Homogeneous equations; solution by series (without full discussion of logarithmic singularities), exempliﬁed by Legendre’s equation. Classiﬁcation of singular points. Indicial equations and local behaviour of solutions near singular points. [3 lectures]

Course website camtools.caret.cam.ac.uk Maths : NSTIB 08 09

Assumed knowledge I will assume familiarity with the following topics at the level of Course A of Part IA Mathematics for Natural Sciences. • Algebra of complex numbers • Algebra of vectors (including scalar and vector products) • Algebra of matrices • Eigenvalues and eigenvectors of matrices • Taylor series and the geometric series • Calculus of functions of several variables • Line, surface and volume integrals • The Gaussian integral • First-order ordinary diﬀerential equations • Second-order linear ODEs with constant coeﬃcients • Fourier series Textbooks The following textbooks are particularly recommended. • K. F. Riley, M. P. Hobson and S. J. Bence (2006). Mathematical Methods for Physics and Engineering, 3rd edition. Cambridge University Press • G. Arfken and H. J. Weber (2005). Mathematical Methods for Physicists, 6th edition. Academic Press

1

1.1

Vector calculus

Motivation

Points and vectors have a geometrical existence without reference to any coordinate system. For deﬁnite calculations, however, we must introduce coordinates. Cartesian coordinates:

Scientiﬁc quantities are of diﬀerent kinds: • scalars have only magnitude (and sign), e.g. mass, electric charge, energy, temperature • vectors have magnitude and direction, e.g. velocity, electric ﬁeld, temperature gradient A ﬁeld is a quantity that depends continuously on position (and possibly on time). Examples: • air pressure in this room (scalar ﬁeld) • electric ﬁeld in this room (vector ﬁeld) Vector calculus is concerned with scalar and vector ﬁelds. The spatial variation of ﬁelds is described by vector diﬀerential operators, which appear in the partial diﬀerential equations governing the ﬁelds. Vector calculus is most easily done in Cartesian coordinates, but other systems (curvilinear coordinates) are better suited for many problems because of symmetries or boundary conditions.

• measured with respect to an origin O and a system of orthogonal axes Oxyz • points have coordinates (x, y, z) = (x1 , x2 , x3 ) • unit vectors (ex , ey , ez ) = (e1 , e2 , e3 ) in the three coordinate diˆ ˆ ˆ rections, also called (ˆ, , k) or (x, y, z) ı ˆˆ

**The position vector of a point P is − − → OP = r = ex x + ey y + ez z =
**

3

ei xi

i=1

1.2.2

Properties of the Cartesian unit vectors

1.2

1.2.1

**Suﬃx notation and Cartesian coordinates
**

Three-dimensional Euclidean space

The unit vectors form a basis for the space. Any vector a belonging to the space can be written uniquely in the form

3

a = ex ax + ey ay + ez az =

i=1

ei ai

This is a close approximation to our physical space: • points are the elements of the space • vectors are translatable, directed line segments • Euclidean means that lengths and angles obey the classical results of geometry 1

where ai are the Cartesian components of the vector a. The basis is orthonormal (orthogonal and normalized): e1 · e1 = e2 · e2 = e3 · e3 = 1 e1 · e2 = e1 · e3 = e2 · e3 = 0

2

1a = a) The Levi-Civita permutation symbol is 1 = −1 0 if (i. e. The Cartesian components of a vector are diﬀerent with respect to two diﬀerent bases. if both orthonormal and right-handed. 2. e.g. e3 ] ≡ e1 · e2 × e3 = +1 This means that e1 × e2 = e3 e2 × e3 = e1 e3 × e1 = e2 1.g. 1.) for a vector component.and right-handed : [e1 .2. e2 . δ12 = 0 δij gives the components of the unit matrix. k) is an odd permutation of (1. j. ei for a unit vector • in three-dimensional space the symbolic index i can have the values 1.3 Suﬃx notation In detail: δ11 = δ22 = δ33 = 1 all others. can also be deﬁned Scalar and vector products can be evaluated in the following way: a · b = (e1 a1 + e2 a2 + e3 a3 ) · (e1 b1 + e2 b2 + e3 b3 ) = a1 b1 + a2 b2 + a3 b3 a × b = (e1 a1 + e2 a2 + e3 a3 ) × (e1 b1 + e2 b2 + e3 b3 ) e1 e2 e3 = a1 a2 a3 b1 b2 b3 δij aj = ai j=1 (in matrix notation. The Kronecker delta symbol is 1 0 if i = j otherwise δij = The choice of basis is not unique. are simply related by a rotation. 2 or 3 • quantities (tensors) with more than one index. Two diﬀerent bases. ǫ112 = 0 3 4 .4 Delta and epsilon Suﬃx notation is made easier by deﬁning two symbols.g. j. 3) if (i. 2.2. 3) otherwise ǫijk = e1 (a2 b3 − a3 b2 ) + e2 (a3 b1 − a1 b3 ) + e3 (a1 b2 − a2 b1 ) In detail: ǫ123 = ǫ231 = ǫ312 = 1 ǫ132 = ǫ213 = ǫ321 = −1 all others. It is symmetric: δji = δij and has the substitution property: 3 • xi for a coordinate. ai (e. k) is an even permutation of (1. such as aij or bijk .

3). It arises in vector products and determinants. so that it is summed over. It is implicit that a repeated index is to be summed over. 3) are (1. The contraction of aij is aii . 1) and (3. this should be noted explicitly. (2.2. which also happen to be the cyclic permutations. Examples of invalid notation are: a · a = a2 i (a · b)(c · d) = ai bi ci di (should be ai ai ) (should be ai bi cj dj ) The repeated index is a dummy index and can be relabelled at will.An even (odd) permutation is one consisting of an even (odd) number of transpositions (interchanges of two objects).6 Matrices and suﬃx notation 1. The free indices in each term in an equation must agree. So ǫijk = ǫjki = ǫkij . 3. e.g. a·b= i=1 j=1 3 3 δij ai bj = i=1 3 ai bi a×b= ǫijk ei aj bk i=1 j=1 k=1 If the summation convention is not being used. 2). Therefore ǫijk is totally antisymmetric: it changes sign when any two indices are interchanged.2. Other indices in an expression are free indices. Thus a · b = ai bi a × b = ǫijk ei aj bk or (a × b)i = ǫijk aj bk Matrix (A) times vector (x): y = Ax yi = Aij xj Matrix times matrix: A = BC Aij = Bik Ckj Transpose of a matrix: (AT )ij = Aji Trace of a matrix: tr A = Aii Determinant of a (3 × 3) matrix: det A = ǫijk A1i A2j A3k (or many equivalent expressions). 2. Then we can write 3 3 3 Examples: a=b |a|2 = b · c a=b×c ai = bi ai = ǫijk bj ck ai ai = bi ci ai = bj cj di + ǫijk ej fk a = (b · c)d + e × f Contraction means to set one free index equal to another. 2. 1. 6 The repeated index should occur exactly twice in any term. 1. The even permutations of (1. ǫjik = −ǫijk . ǫijk has three indices because the space has three dimensions. 5 . Contraction is equivalent to multiplication by a Kronecker delta: aij δij = a11 + a22 + a33 = aii The contraction of δij is δii = 1 + 1 + 1 = 3 (in three dimensions).5 Einstein summation convention The summation sign is conventionally omitted in expressions of this type.

. . k) = (l...... ⊲ Show that (a × b) · (c × d) = (a · c)(b · d) − (a · d)(b · c)...2. k) are equal or when any of (l. e. ... . . . . n) = (1. .. .. 2.... ...... . ...... . . .. .. . m. . = (a · c)(b · d) − (a · d)(b · c) This is the most useful form to remember: ǫijk ǫimn = δjm δkn − δjn δkm 7 8 . . . We contract the identity once by setting l = i: ǫijk ǫimn δii δim δin = δji δjm δjn δki δkm δkn = δii (δjm δkn − δjn δkm ) + δim (δjn δki − δji δkn ) = 3(δjm δkn − δjn δkm ) + (δjn δkm − δjm δkn ) = δjm δkn − δjn δkm + (δjn δkm − δjm δkn ) + δin (δji δkm − δjm δki ) ǫijk ǫijk = 6 Example . ... . k) are interchanged or when any of (l.. . .... n) are interchanged These properties ensure that the LHS and RHS are equal for any choices of the indices.. . . ....... . . . .. . .. Note that the ﬁrst property is in fact implied by the third.. m. .g. .. 3) • changes sign when any of (i. .. . ..... .. . ... m... . (a × b) · (c × d) = (a × b)i (c × d)i = (ǫijk aj bk )(ǫilm cl dm ) = ǫijk ǫilm aj bk cl dm = (δjl δkm − δjm δkl )aj bk cl dm = aj bk cj dk − aj bk ck dj = (aj cj )(bk dk ) − (aj dj )(bk ck ) . n) are equal • is 1 when (i.. The value of both the LHS and the RHS: • is 0 when any of (i. j. .. . . .. the indices can be permuted cyclically into this form. . .. .1.: ǫαβγ ǫµνβ = ǫβγα ǫβµν = δγµ δαν − δγν δαµ Further contractions of the identity: ǫijk ǫijn = δjj δkn − δjn δkj = 3δkn − δkn = 2δkn The general identity ǫijk ǫlmn δil δim δin = δjl δjm δjn δkl δkm δkn can be established by the following argument. .. . . . . . .. ... j... j..7 Product of two epsilons Given any product of two epsilons with one common index.... .. .. .

..3. y + δy.. . . ..g. evaluated at the point r... . .... . . .... . . .. .. . . . is d [Φ(r + ts) − Φ(r)] ds = s=0 For an inﬁnitesimal increment we can write dΦ = (∇Φ) · dr We can consider ∇ (del. . . z) = Φ(r) Taylor’s theorem for a function of three variables: Φ(x + δx. .. . .. .1 Vector diﬀerential operators The gradient operator Example .3 1.. ..1... etc. . . .. . e. . . ∂x dr r df x y z df r ∇f = . ...... .. .. . temperature) Φ(x.. .. . .. .... ⊲ Find ∇f .. where f (r) is a function of r = |r|.. . . . . .... y...... . . grad or nabla) as a vector diﬀerential operator ∇ = ex ∂ ∂ ∂ ∂ + ey + ez = ei ∂x ∂y ∂z ∂xi d (∇Φ) · (ts) + O(s2 ) ds s=0 = t · ∇Φ This is known as a directional derivative. . .. . Notes: • the derivative is maximal in the direction t ∇Φ • the derivative is zero in directions such that t ⊥ ∇Φ 9 10 which acts on Φ to produce ∇Φ....... .. . . y.. = dr r r r dr r ..... . 2r Geometrical meaning of the gradient also written grad Φ. . ... . .. z) ∂Φ ∂Φ ∂Φ δx + δy + δz + O(δx2 . z + δz) = Φ(x. . .. . ) + ∂x ∂y ∂z Equivalently Φ(r + δr) = Φ(r) + (∇Φ) · δr + O(|δr|2 ) where the gradient of the scalar ﬁeld is ∇Φ = ex ∂Φ ∂Φ ∂Φ + ey + ez ∂x ∂y ∂z 1. .2 ∂f df ∂r = ∂x dr ∂x r 2 = x2 + y 2 + z 2 ∂r = 2x ∂x df x ∂f = ..... ..3. . δxδy. . .. . ∇f = ex ∂f ∂f ∂f + ey + ez ∂x ∂y ∂z We consider a scalar ﬁeld (function of position. In suﬃx notation ∇Φ = ei ∂Φ ∂xi The rate of change of Φ with distance s in the direction of the unit vector t. . ...

. . . ... electric ﬁeld) F (r) = ex Fx (r) + ey Fy (r) + ez Fz (r) = ei Fi (r) The divergence of a vector ﬁeld is the scalar ﬁeld ∇·F = ∂ ∂ ∂ + ey + ez ∂x ∂y ∂z ∂Fz ∂Fx ∂Fy + + = ∂x ∂y ∂z ex · (ex Fx + ey Fy + ez Fz ) also written div F . .. . y.. .. x + z.... . .. ... .. . .. where t = dr/ds is the unit tangent vector to the curve 1. ..... . ⊲ Find the unit normal at the point r = (x...3 Related vector diﬀerential operators We now consider a general vector ﬁeld (e. . . .. .. −c. . ... ........ x + z. In suﬃx notation ∇ × F = ei ǫijk ∂Fk ∂xj or (∇ × F )i = ǫijk ∂Fk ∂xj ⇒ −z 2 = −c2 solutions (−c. 11 12 .. . .. . pointing in the direction of increasing Φ • the unit vector normal to the surface Φ = constant is n = ∇Φ/|∇Φ| • the rate of change of Φ with arclength s along a curve is dΦ/ds = t · ∇Φ... . .. .....g. .... .. y + x) 2(x2 + y 2 + z 2 + xy + xz + yz) y+z =x+z =0 ⇒ z = ±c (c... Hence ﬁnd the points where the plane tangent to the surface is parallel to the (x. −c) The curl of a vector ﬁeld is the vector ﬁeld ∇×F = ∂ ∂ ∂ + ey + ez × (ex Fx + ey Fy + ez Fz ) ∂x ∂y ∂z ∂Fy ∂Fz ∂Fx ∂Fz − − = ex + ey ∂y ∂z ∂z ∂x ∂Fy ∂Fx + ez − ∂x ∂y ex ey ez = ∂/∂x ∂/∂y ∂/∂z Fx Fy Fz ex n ez also written curl F . c)....3. ... . y + x) n= ∇Φ = |∇Φ| when (y + z. .. c. . .• the directions t ⊥ ∇Φ therefore lie in the plane tangent to the surface Φ = constant We conclude that: • ∇Φ is a vector ﬁeld. In suﬃx notation ∇·F = ∂Fi ∂xi Example . . .. ∇Φ = (y + z.. . . ..... .. . Note that the Cartesian unit vectors are independent of position and do not need to be diﬀerentiated. . ... . z) to the surface Φ(r) ≡ xy + yz + zx = −c2 where c is a constant.. . . ... . .. . y) plane. ... . . . . . .. . .. . .

. ... 1. .... ... .. . . . Two operators.. .... .. .... . . . .. . .. . .. .. . . y.. z) 14 . . .. Example . .. ... while vector components transform in a particular way under a rotation....g.. etc. −z 2 .. . . .... .. . . . . . ...3. . .. . .... .... ∂ 2 ∂ 2 ∂ 2 ∇·F = (x y) + (y z) + (z x) = 2(xy + yz + zx) ∂x ∂y ∂z ex ey ez ∇ × F = ∂/∂x ∂/∂y ∂/∂z = (−y 2 .. Fields constructed using ∇ share these properties. . div and ∇2 can be deﬁned in spaces of any dimension. ....: • ∇Φ and ∇ × F are vector ﬁelds and their components depend on the basis • ∇ · F and ∇2 Φ are scalar ﬁelds and are invariant under a rotation grad. .. ... ⊲ Find the divergence and curl of the vector ﬁeld F = (x2 y. .... ... ... . .. . ....... . .... . ∇r n = ∂r n ∂r n ∂r n . ⊲ Find ∇2 r n . . one ﬁeld: ∇ · (∇Φ) = ∇2 Φ ∇ · (∇ × F ) = 0 ∇ × (∇ × F ) = ∇(∇ · F ) − ∇2 F ∇ × (∇Φ) = 0 = nr n−2 (x.4 Vector invariance = 3nr n−2 + n(n − 2)r n−2 ∂ ∂ ∂ (nr n−2 x) + (nr n−2 y) + (nr n−2 z) ∂x ∂y ∂z = nr n−2 + n(n − 2)r n−4 x2 + · · · A scalar is invariant under a change of basis..... ..... . ..... . ... . . . In suﬃx notation ∇2 F = ei ∂ 2 Fi ∂xj ∂xj = n(n + 1)r n−2 . .....5 Vector diﬀerential identities because the Cartesian unit vectors are independent of position. y 2 z. .. . . z 2 x). . ... −x2 ) x2 y y2z z2x . . . and F and G are arbitrary vector ﬁelds.... . . ... . .. .. . ... . ∂x ∂y ∂z (recall 13 ∂r x = . . e. .. The directional derivative operator is ∂ ∂ ∂ ∂ + ty + tz = ti t · ∇ = tx ∂x ∂y ∂z ∂xi t · ∇Φ is the rate of change of Φ with distance in the direction of the unit vector t. . . . .... . . .. ... .. Example .. . ...... . ..... ... The Laplacian of a vector ﬁeld can also be deﬁned. . . . but curl (like the vector product) is a three-dimensional concept.. This can be thought of either as t · (∇Φ) or as (t · ∇)Φ.. . . .. .. ...... 1. . . .. . .The Laplacian of a scalar ﬁeld is the scalar ﬁeld ∇2 Φ = ∇ · (∇Φ) = ∂2Φ ∂2Φ ∂2Φ ∂2Φ + + = ∂x2 ∂y 2 ∂z 2 ∂xi ∂xi ∇2 r n = ∇ · ∇r n = The Laplacian diﬀerential operator (del squared ) is ∂2 ∂2 ∂2 ∂2 + 2+ 2 = ∇2 = ∂x2 ∂y ∂z ∂xi ∂xi It appears very commonly in partial diﬀerential equations. . . .) ∂x r Here Φ and Ψ are arbitrary scalar ﬁelds. .3. ... .. .... . .... ... .

. . .. . .. .. Notes: • be clear about which terms to the right an operator is acting on (use brackets if necessary) 15 1. .. . . .. ... . . ... . .. . ... . . . . .. . it can be written as the gradient of a scalar potential : F = ∇Φ. . . ⊲ Show that ∇×(F ×G) = (G·∇)F −G(∇·F )−(F · ∇)G+F (∇·G).. . .. . . e. ... . .. . a ‘conservative’ force ﬁeld such as gravity • if a vector ﬁeld F is solenoidal (∇ · F = 0) in a region of space.. the magnetic ﬁeld 1. .. .. . .. .. .... . j) while ∂ 2 Fk /∂xi ∂xj is symmetric) Example . .1 df dx = f (b) − f (a) dx The gradient theorem (∇Φ) · dr = Φ(r2 ) − Φ(r1 ) C where C is any curve from r1 to r2 . . .. . . . .. .. ... .. . .. . . ∂ ∇ × (F × G) = ei ǫijk (ǫklm Fl Gm ) ∂xj ∂ (Fl Gm ) = ei ǫkij ǫklm ∂xj ∂Fl ∂Gm + Fl ∂xj ∂xj ∂Fj ∂Gj ∂Fi ∂Gi = ei Gj − ei Gi + ei Fi − ei Fj ∂xj ∂xj ∂xj ∂xj = ei (δil δjm − δim δjl ) Gm = (G · ∇)F − G(∇ · F ) + F (∇ · G) − (F · ∇)G . . .. . e. .. .. . ...g.. two ﬁelds: ∇(ΦΨ) = Ψ∇Φ + Φ∇Ψ ∇(F · G) = (G · ∇)F + G × (∇ × F ) + (F · ∇)G + F × (∇ × G) ∇ · (ΦF ) = (∇Φ) · F + Φ∇ · F ∇ · (F × G) = G · (∇ × F ) − F · (∇ × G) ∇ × (F × G) = (G · ∇)F − G(∇ · F ) − (F · ∇)G + F (∇ · G) Example .. . ..One operator. .... . . . .. . . . ⊲ Show that ∇ · (∇ × F ) = 0 for any (twice-diﬀerentiable) vector ﬁeld F. ∇ · (F × G) = (∇ × F ) · G Related results: • if a vector ﬁeld F is irrotational (∇ × F = 0) in a region of space. .. . . . . .. .. . . ..g.. .g. . . . . .... ... .. . . . . . . Outline proof: (∇Φ) · dr = C C dΦ = Φ(r2 ) − Φ(r1 ) 16 . . .. . .4. . . . .4 Integral theorems These very important results derive from the fundamental theorem of calculus (integration is the inverse of diﬀerentiation): b a (since ǫijk is antisymmetric on (i. . ... . .. . . .... . ∇ · (∇ × F ) = ∂ ∂xi ǫijk ∂Fk ∂xj = ǫijk ∂ 2 Fk =0 ∂xi ∂xj ∇ × (ΦF ) = (∇Φ) × F + Φ∇ × F • you cannot simply apply standard vector identities to expressions involving ∇.. . .. ... . . e. . . . . it can be written as the curl of a vector potential : F = ∇ × G..

1.4.2 The divergence theorem (Gauss's theorem)

∫_V (∇·F) dV = ∫_S F·dS

where V is a volume bounded by the closed surface S (also called ∂V). The right-hand side is the flux of F through the surface S. The vector surface element is dS = n dS, where n is the outward unit normal vector.

Outline proof: first prove for a cuboid:

∫_V (∇·F) dV = ∫_{z1}^{z2} ∫_{y1}^{y2} ∫_{x1}^{x2} ( ∂Fx/∂x + ∂Fy/∂y + ∂Fz/∂z ) dx dy dz
             = ∫_{z1}^{z2} ∫_{y1}^{y2} [ Fx(x2, y, z) − Fx(x1, y, z) ] dy dz + two similar terms
             = ∫_S F·dS

An arbitrary volume V can be subdivided into small cuboids to any desired accuracy. When the integrals are added together, the fluxes through internal surfaces cancel out, leaving only the flux through S.

A simply connected volume (e.g. a ball) is one with no holes; it has only an outer surface. For a multiply connected volume (e.g. a spherical shell), all the surfaces must be considered.

Related results:

∫_V (∇Φ) dV = ∫_S Φ dS
∫_V (∇×F) dV = ∫_S dS × F

The rule is to replace ∇ in the volume integral with n in the surface integral, and dV with dS (note that dS = n dS).
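As a worked check of the divergence theorem, the sketch below (assuming sympy) compares the two sides for the earlier test field F = (x²y, y²z, z²x) on the unit cube; this particular F and volume are our own choice:

import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
Fx, Fy, Fz = x**2*y, y**2*z, z**2*x

divF = sp.diff(Fx, x) + sp.diff(Fy, y) + sp.diff(Fz, z)
volume = sp.integrate(divF, (x, 0, 1), (y, 0, 1), (z, 0, 1))

# Outward flux through the six faces (n = ±ex, ±ey, ±ez)
flux = (sp.integrate(Fx.subs(x, 1) - Fx.subs(x, 0), (y, 0, 1), (z, 0, 1))
      + sp.integrate(Fy.subs(y, 1) - Fy.subs(y, 0), (x, 0, 1), (z, 0, 1))
      + sp.integrate(Fz.subs(z, 1) - Fz.subs(z, 0), (x, 0, 1), (y, 0, 1)))

print(volume, flux)   # both 3/2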

Example. ⊲ A submerged body is acted on by a hydrostatic pressure p = −ρgz, where ρ is the density of the fluid, g is the gravitational acceleration and z is the vertical coordinate. Find a simplified expression for the pressure force acting on the body.

F = −∫_S p dS

Fz = ez·F = ∫_S (−ez p)·dS = ∫_V ∇·(−ez p) dV = ∫_V ∂(ρgz)/∂z dV = ρg ∫_V dV = Mg

(M is the mass of fluid displaced by the body). Similarly

Fx = ∫_V ∂(ρgz)/∂x dV = 0,    Fy = 0

so

F = ez Mg

Archimedes' principle: the buoyancy force equals the weight of the displaced fluid.

1.4.3 The curl theorem (Stokes's theorem)

∫_S (∇×F)·dS = ∫_C F·dr

where S is an open surface bounded by the closed curve C (also called ∂S). The right-hand side is the circulation of F around the curve C. Whichever way the unit normal n is defined on S, the line integral follows the direction of a right-handed screw around n, i.e. it follows a positive sense.

Special case: for a planar surface in the (x, y) plane, we have Green's theorem in the plane:

∫∫_A ( ∂Fy/∂x − ∂Fx/∂y ) dx dy = ∫_C (Fx dx + Fy dy)

where A is a region of the plane bounded by the curve C.
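Green's theorem in the plane can be checked symbolically. The sketch below (assuming sympy) uses F = (−y, x) on the unit square, both our own choices; any smooth field and region would do:

import sympy as sp

x, y, t = sp.symbols('x y t', real=True)
Fx, Fy = -y, x

# Left-hand side: area integral of ∂Fy/∂x − ∂Fx/∂y over the unit square
lhs = sp.integrate(sp.diff(Fy, x) - sp.diff(Fx, y), (x, 0, 1), (y, 0, 1))

# Right-hand side: circulation around the boundary, traversed anticlockwise
edges = [((t, 0), (1, 0)),        # bottom: r = (t, 0), dr = (1, 0) dt
         ((1, t), (0, 1)),        # right
         ((1 - t, 1), (-1, 0)),   # top
         ((0, 1 - t), (0, -1))]   # left
rhs = sum(sp.integrate(Fx.subs({x: px, y: py})*dxe + Fy.subs({x: px, y: py})*dye,
                       (t, 0, 1))
          for (px, py), (dxe, dye) in edges)

print(lhs, rhs)   # both 2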

Outline proof: first prove Green's theorem for a rectangle:

∫_{y1}^{y2} ∫_{x1}^{x2} ( ∂Fy/∂x − ∂Fx/∂y ) dx dy
  = ∫_{y1}^{y2} [ Fy(x2, y) − Fy(x1, y) ] dy − ∫_{x1}^{x2} [ Fx(x, y2) − Fx(x, y1) ] dx
  = ∫_{x1}^{x2} Fx(x, y1) dx + ∫_{y1}^{y2} Fy(x2, y) dy − ∫_{x1}^{x2} Fx(x, y2) dx − ∫_{y1}^{y2} Fy(x1, y) dy
  = ∫_C (Fx dx + Fy dy)

An arbitrary surface S can be subdivided into small planar rectangles to any desired accuracy. When the integrals are added together, the circulations along internal curve segments cancel out, leaving only the circulation around C.

Notes:

• in Stokes's theorem S is an open surface, while in Gauss's theorem it is closed
• many different surfaces are bounded by the same closed curve, while only one volume is bounded by a closed surface
• a multiply connected surface (e.g. an annulus) may have more than one bounding curve

1.4.4 Geometrical definitions of grad, div and curl

The integral theorems can be used to assign coordinate-independent meanings to grad, div and curl.

Apply the gradient theorem to an arbitrarily small line segment δr = t δs in the direction of any unit vector t. Since the variation of ∇Φ and t along the line segment is negligible,

(∇Φ)·t δs ≈ δΦ

and so

t·(∇Φ) = lim_{δs→0} δΦ/δs

This definition can be used to determine the component of ∇Φ in any desired direction. The gradient therefore describes the rate of change of a scalar field with distance.

Similarly, by applying the divergence theorem to an arbitrarily small volume δV bounded by a surface δS, we find that

∇·F = lim_{δV→0} (1/δV) ∫_{δS} F·dS

The divergence describes the net source or efflux of a vector field per unit volume.

Finally, by applying the curl theorem to an arbitrarily small open surface δS with unit normal vector n and bounded by a curve δC, we find that

n·(∇×F) = lim_{δS→0} (1/δS) ∫_{δC} F·dr

The curl describes the circulation or rotation of a vector field per unit area.

1.5 Orthogonal curvilinear coordinates

Cartesian coordinates can be replaced with any independent set of coordinates q1(x1, x2, x3), q2(x1, x2, x3), q3(x1, x2, x3). Curvilinear (as opposed to rectilinear) means that the coordinate 'axes' are curves. Curvilinear coordinates are useful for solving problems in curved geometry (e.g. geophysics), e.g. cylindrical and spherical polar coordinates.

1.5.1 Line element

The infinitesimal line element in Cartesian coordinates is

dr = ex dx + ey dy + ez dz

In general curvilinear coordinates we have

dr = h1 dq1 + h2 dq2 + h3 dq3

where

hi = ei hi = ∂r/∂qi    (no sum)

determines the displacement associated with an increment of the coordinate qi. Here hi = |hi| is the scale factor (or metric coefficient) associated with the coordinate qi: it converts a coordinate increment into a length. ei is the corresponding unit vector. This notation generalizes the use of ei for a Cartesian unit vector. For Cartesian coordinates, hi = 1 and ei are constant, but in general both hi and ei depend on position. Any point at which hi = 0 is a coordinate singularity at which the coordinate system breaks down.

The summation convention does not work well with orthogonal curvilinear coordinates.

1.5.2 The Jacobian

The Jacobian of (x, y, z) with respect to (q1, q2, q3) is defined as

                              | ∂x/∂q1  ∂x/∂q2  ∂x/∂q3 |
J = ∂(x, y, z)/∂(q1, q2, q3) = | ∂y/∂q1  ∂y/∂q2  ∂y/∂q3 |
                              | ∂z/∂q1  ∂z/∂q2  ∂z/∂q3 |

This is the determinant of the Jacobian matrix of the transformation from coordinates (q1, q2, q3) to (x1, x2, x3). The columns of the above matrix are the vectors hi defined above. Therefore the Jacobian is equal to the scalar triple product

J = [h1, h2, h3] = h1 · h2 × h3
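The scale factors and the Jacobian can be computed mechanically. The sketch below, assuming sympy, does this for spherical polar coordinates (treated in detail later in this section):

import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
pos = sp.Matrix([r*sp.sin(th)*sp.cos(ph),
                 r*sp.sin(th)*sp.sin(ph),
                 r*sp.cos(th)])

h_vecs = [pos.diff(q) for q in (r, th, ph)]     # h_i = ∂r/∂q_i
h = [sp.simplify(v.norm()) for v in h_vecs]     # scale factors h_i = |h_i|
J = sp.simplify(h_vecs[0].dot(h_vecs[1].cross(h_vecs[2])))

print(h)   # [1, r, r*Abs(sin(theta))]; Abs(sin θ) = sin θ for 0 < θ < π
print(J)   # r**2*sin(theta)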

Given a point with curvilinear coordinates (q1, q2, q3), consider three small displacements δr1 = h1 δq1, δr2 = h2 δq2 and δr3 = h3 δq3 along the three coordinate directions. They span a parallelepiped of volume

δV = |[δr1, δr2, δr3]| = |J| δq1 δq2 δq3

Hence the volume element in a general curvilinear coordinate system is

dV = | ∂(x, y, z)/∂(q1, q2, q3) | dq1 dq2 dq3

The Jacobian therefore appears whenever changing variables in a multiple integral:

∫ Φ(r) dV = ∫ Φ dx dy dz = ∫ Φ | ∂(x, y, z)/∂(q1, q2, q3) | dq1 dq2 dq3

The limits on the integrals also need to be considered. The modulus sign appears here if the integrals are carried out over a positive range (the upper limits are greater than the lower limits). The equivalent rule for a one-dimensional integral is

∫ f(x) dx = ∫ f(x(q)) (dx/dq) dq

If curvilinear coordinates (q1, q2) are introduced in the (x, y)-plane, the area element is dA = |J| dq1 dq2 with

J = ∂(x, y)/∂(q1, q2) = | ∂x/∂q1  ∂x/∂q2 |
                        | ∂y/∂q1  ∂y/∂q2 |

Jacobians are defined similarly for transformations in any number of dimensions.

1.5.3 Properties of Jacobians

Consider now three sets of n variables αi, βi and γi, with 1 ≤ i ≤ n, none of which need be Cartesian coordinates. According to the chain rule of partial differentiation,

∂αi/∂γj = Σ_{k=1}^{n} (∂αi/∂βk)(∂βk/∂γj)

(Under the summation convention we may omit the Σ sign.) The left-hand side is the ij-component of the Jacobian matrix of the transformation from γi to αi, and the equation states that this matrix is the product of the Jacobian matrices of the transformations from γi to βi and from βi to αi. Taking the determinant of this matrix equation, we find

∂(α1, ..., αn)/∂(γ1, ..., γn) = [∂(α1, ..., αn)/∂(β1, ..., βn)] [∂(β1, ..., βn)/∂(γ1, ..., γn)]

This is the chain rule for Jacobians: the Jacobian of a composite transformation is the product of the Jacobians of the transformations of which it is composed.

In the special case in which γi = αi for all i, the left-hand side is 1 (the determinant of the unit matrix) and we obtain

∂(α1, ..., αn)/∂(β1, ..., βn) = [ ∂(β1, ..., βn)/∂(α1, ..., αn) ]^(−1)

The Jacobian of an inverse transformation is therefore the reciprocal of that of the forward transformation. This is a multidimensional generalization of the result dx/dy = (dy/dx)^(−1).
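The inverse-Jacobian property can be confirmed in a simple case. The sketch below (assuming sympy) checks ∂(x, y)/∂(r, θ) = r and ∂(r, θ)/∂(x, y) = 1/r for plane polar coordinates, our own worked example:

import sympy as sp

x, y, r, th = sp.symbols('x y r theta', positive=True)

J_fwd = sp.Matrix([[sp.cos(th), -r*sp.sin(th)],
                   [sp.sin(th),  r*sp.cos(th)]]).det()     # ∂(x,y)/∂(r,θ)

r_xy = sp.sqrt(x**2 + y**2)
th_xy = sp.atan2(y, x)
J_inv = sp.Matrix([[r_xy.diff(x),  r_xy.diff(y)],
                   [th_xy.diff(x), th_xy.diff(y)]]).det()  # ∂(r,θ)/∂(x,y)

print(sp.simplify(J_fwd))           # r
print(sp.simplify(J_inv * r_xy))    # 1, i.e. J_inv = 1/r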

1.5.4 Orthogonality of coordinates

Calculus in general curvilinear coordinates is difficult. We can make things easier by choosing the coordinates to be orthogonal:

ei · ej = δij

and right-handed:

e1 × e2 = e3

The squared line element is then

|dr|² = |e1 h1 dq1 + e2 h2 dq2 + e3 h3 dq3|² = h1² dq1² + h2² dq2² + h3² dq3²

There are no cross terms such as dq1 dq2. Note that, for orthogonal coordinates,

J = h1 · h2 × h3 = h1 h2 h3

When oriented along the coordinate directions:

• line element dr = e1 h1 dq1
• surface element dS = e3 h1 h2 dq1 dq2
• volume element dV = h1 h2 h3 dq1 dq2 dq3

1.5.5 Commonly used orthogonal coordinate systems

Cartesian coordinates (x, y, z):

r = x ex + y ey + z ez,    −∞ < x < ∞, −∞ < y < ∞, −∞ < z < ∞

hx = ∂r/∂x = (1, 0, 0),  hy = ∂r/∂y = (0, 1, 0),  hz = ∂r/∂z = (0, 0, 1)

hx = hy = hz = 1

ex = (1, 0, 0),  ey = (0, 1, 0),  ez = (0, 0, 1)

dV = dx dy dz

Orthogonal. No singularities.

Cylindrical polar coordinates (ρ, φ, z):

r = (x, y, z) = (ρ cos φ, ρ sin φ, z),    0 < ρ < ∞, 0 ≤ φ < 2π, −∞ < z < ∞

hρ = ∂r/∂ρ = (cos φ, sin φ, 0)
hφ = ∂r/∂φ = (−ρ sin φ, ρ cos φ, 0)
hz = ∂r/∂z = (0, 0, 1)

hρ = 1,  hφ = ρ,  hz = 1

eρ = (cos φ, sin φ, 0),  eφ = (−sin φ, cos φ, 0),  ez = (0, 0, 1)

r = ρ eρ + z ez

dV = ρ dρ dφ dz

Orthogonal. Singular on the axis ρ = 0.

Warning. Many authors use r for ρ and θ for φ. This is confusing because r and θ then have different meanings in cylindrical and spherical polar coordinates. Instead of ρ, some authors use R, s or ̟, which is useful for other things.

Spherical polar coordinates (r, θ, φ):

r = (x, y, z) = (r sin θ cos φ, r sin θ sin φ, r cos θ),    0 < r < ∞, 0 < θ < π, 0 ≤ φ < 2π

hr = ∂r/∂r = (sin θ cos φ, sin θ sin φ, cos θ)
hθ = ∂r/∂θ = (r cos θ cos φ, r cos θ sin φ, −r sin θ)
hφ = ∂r/∂φ = (−r sin θ sin φ, r sin θ cos φ, 0)

hr = 1,  hθ = r,  hφ = r sin θ

er = (sin θ cos φ, sin θ sin φ, cos θ)
eθ = (cos θ cos φ, cos θ sin φ, −sin θ)
eφ = (−sin φ, cos φ, 0)

r = r er

dV = r² sin θ dr dθ dφ

Orthogonal. Singular on the axis r = 0, θ = 0 and θ = π.

Notes:

• cylindrical and spherical are related by ρ = r sin θ, z = r cos θ
• plane polar coordinates are the restriction of cylindrical coordinates to a plane z = constant

1.5.6 Vector calculus in orthogonal coordinates

A scalar field Φ(r) can be regarded as a function of (q1, q2, q3):

dΦ = (∂Φ/∂q1) dq1 + (∂Φ/∂q2) dq2 + (∂Φ/∂q3) dq3
   = [ (e1/h1) ∂Φ/∂q1 + (e2/h2) ∂Φ/∂q2 + (e3/h3) ∂Φ/∂q3 ] · (e1 h1 dq1 + e2 h2 dq2 + e3 h3 dq3)
   = (∇Φ)·dr

We identify

∇Φ = (e1/h1) ∂Φ/∂q1 + (e2/h2) ∂Φ/∂q2 + (e3/h3) ∂Φ/∂q3

Thus

∇qi = ei/hi    (no sum)

We now consider a vector field in orthogonal coordinates:

F = e1 F1 + e2 F2 + e3 F3

Finding the divergence and curl are non-trivial because both Fi and ei depend on position.

Note that

∇×(e1/h1) = ∇×(∇q1) = 0,    etc.

Also consider

∇×(q2 ∇q3) = (∇q2)×(∇q3) = (e2/h2)×(e3/h3) = e1/(h2 h3)

which, since the divergence of a curl vanishes, implies

∇·(e1/(h2 h3)) = 0,    etc.

To work out ∇·F, write

F = (e1/(h2 h3)) (h2 h3 F1) + · · ·

Then, because ∇·(e1/(h2 h3)) = 0,

∇·F = (e1/(h2 h3)) · ∇(h2 h3 F1) + · · ·
    = (e1/(h2 h3)) · [ (e1/h1) ∂(h2 h3 F1)/∂q1 + (e2/h2) ∂(h2 h3 F1)/∂q2 + (e3/h3) ∂(h2 h3 F1)/∂q3 ] + · · ·
    = (1/(h1 h2 h3)) ∂(h2 h3 F1)/∂q1 + · · ·

Similarly, to work out ∇×F, write

F = (e1/h1) (h1 F1) + · · ·

Then, because ∇×(e1/h1) = 0,

∇×F = ∇(h1 F1) × (e1/h1) + · · ·
    = [ (e1/h1) ∂(h1 F1)/∂q1 + (e2/h2) ∂(h1 F1)/∂q2 + (e3/h3) ∂(h1 F1)/∂q3 ] × (e1/h1) + · · ·
    = (e2/(h3 h1)) ∂(h1 F1)/∂q3 − (e3/(h1 h2)) ∂(h1 F1)/∂q2 + · · ·

as well as cyclic permutations of this result.

The appearance of the scale factors inside the derivatives can be understood with reference to the geometrical definitions of grad, div and curl:

t·(∇Φ) = lim_{δs→0} δΦ/δs
∇·F = lim_{δV→0} (1/δV) ∫_{δS} F·dS
n·(∇×F) = lim_{δS→0} (1/δS) ∫_{δC} F·dr

To summarize:

∇Φ = (e1/h1) ∂Φ/∂q1 + (e2/h2) ∂Φ/∂q2 + (e3/h3) ∂Φ/∂q3

∇·F = (1/(h1 h2 h3)) [ ∂(h2 h3 F1)/∂q1 + ∂(h3 h1 F2)/∂q2 + ∂(h1 h2 F3)/∂q3 ]

∇×F = (e1/(h2 h3)) [ ∂(h3 F3)/∂q2 − ∂(h2 F2)/∂q3 ]
    + (e2/(h3 h1)) [ ∂(h1 F1)/∂q3 − ∂(h3 F3)/∂q1 ]
    + (e3/(h1 h2)) [ ∂(h2 F2)/∂q1 − ∂(h1 F1)/∂q2 ]

                        | h1 e1    h2 e2    h3 e3   |
     = (1/(h1 h2 h3))   | ∂/∂q1    ∂/∂q2    ∂/∂q3   |
                        | h1 F1    h2 F2    h3 F3   |

∇²Φ = (1/(h1 h2 h3)) [ ∂/∂q1 ( (h2 h3/h1) ∂Φ/∂q1 ) + ∂/∂q2 ( (h3 h1/h2) ∂Φ/∂q2 ) + ∂/∂q3 ( (h1 h2/h3) ∂Φ/∂q3 ) ]

Example. ⊲ Determine the Laplacian operator in spherical polar coordinates.

hr = 1,  hθ = r,  hφ = r sin θ

∇²Φ = (1/(r² sin θ)) [ ∂/∂r ( r² sin θ ∂Φ/∂r ) + ∂/∂θ ( sin θ ∂Φ/∂θ ) + ∂/∂φ ( (1/sin θ) ∂Φ/∂φ ) ]
    = (1/r²) ∂/∂r ( r² ∂Φ/∂r ) + (1/(r² sin θ)) ∂/∂θ ( sin θ ∂Φ/∂θ ) + (1/(r² sin²θ)) ∂²Φ/∂φ²
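As a consistency check, the spherical-polar Laplacian applied to Φ = rⁿ must reproduce the Cartesian result ∇²rⁿ = n(n+1)r^(n−2) found earlier. A minimal sketch, assuming sympy:

import sympy as sp

r, th, ph, n = sp.symbols('r theta phi n', positive=True)
Phi = r**n

lap = (sp.diff(r**2*sp.diff(Phi, r), r)/r**2
       + sp.diff(sp.sin(th)*sp.diff(Phi, th), th)/(r**2*sp.sin(th))
       + sp.diff(Phi, ph, 2)/(r**2*sp.sin(th)**2))

print(sp.simplify(lap / r**(n - 2)))   # n*(n + 1), possibly printed as n**2 + n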

1.5.7 grad, div, curl and ∇² in cylindrical and spherical polar coordinates

Cylindrical polar coordinates:

∇Φ = eρ ∂Φ/∂ρ + (eφ/ρ) ∂Φ/∂φ + ez ∂Φ/∂z

∇·F = (1/ρ) ∂(ρFρ)/∂ρ + (1/ρ) ∂Fφ/∂φ + ∂Fz/∂z

              | eρ      ρ eφ     ez    |
∇×F = (1/ρ)   | ∂/∂ρ    ∂/∂φ     ∂/∂z  |
              | Fρ      ρFφ      Fz    |

∇²Φ = (1/ρ) ∂/∂ρ ( ρ ∂Φ/∂ρ ) + (1/ρ²) ∂²Φ/∂φ² + ∂²Φ/∂z²

Spherical polar coordinates:

∇Φ = er ∂Φ/∂r + (eθ/r) ∂Φ/∂θ + (eφ/(r sin θ)) ∂Φ/∂φ

∇·F = (1/r²) ∂(r²Fr)/∂r + (1/(r sin θ)) ∂(sin θ Fθ)/∂θ + (1/(r sin θ)) ∂Fφ/∂φ

                        | er      r eθ     r sin θ eφ  |
∇×F = (1/(r² sin θ))    | ∂/∂r    ∂/∂θ     ∂/∂φ        |
                        | Fr      rFθ      r sin θ Fφ  |

∇²Φ = (1/r²) ∂/∂r ( r² ∂Φ/∂r ) + (1/(r² sin θ)) ∂/∂θ ( sin θ ∂Φ/∂θ ) + (1/(r² sin²θ)) ∂²Φ/∂φ²

Example. ⊲ Evaluate ∇·r, ∇×r and ∇²rⁿ using spherical polar coordinates.

r = er r

∇·r = (1/r²) ∂(r² · r)/∂r = 3

                        | er      r eθ     r sin θ eφ  |
∇×r = (1/(r² sin θ))    | ∂/∂r    ∂/∂θ     ∂/∂φ        | = 0
                        | r       0        0           |

∇²rⁿ = (1/r²) ∂/∂r ( r² ∂rⁿ/∂r ) = n(n+1) r^(n−2)

2 Partial differential equations

2.1 Motivation

The variation in space and time of scientific quantities is usually described by differential equations. If the quantities depend on space and time, or on more than one spatial coordinate, then the governing equations are partial differential equations (PDEs).

Many of the most important PDEs are linear and classical methods of analysis can be applied. The techniques developed for linear equations are also useful in the study of nonlinear PDEs.

2.2 Linear PDEs of second order

2.2.1 Definition

We consider an unknown function u(x, y) (the dependent variable) of two independent variables x and y. A partial differential equation is any equation of the form

F( u, ∂u/∂x, ∂u/∂y, ∂²u/∂x², ∂²u/∂x∂y, ∂²u/∂y², ..., x, y ) = 0

• the order of the equation is the highest order of derivative that appears
• the equation is linear if F depends linearly on u and its derivatives

A linear PDE of second order has the general form

Lu = f

where L is a differential operator such that

Lu = a ∂²u/∂x² + b ∂²u/∂x∂y + c ∂²u/∂y² + d ∂u/∂x + e ∂u/∂y + gu

where a, b, c, d, e, g, f are functions of x and y.

• if f = 0 the equation is said to be homogeneous
• if a, b, c, d, e, g are independent of x and y the equation is said to have constant coefficients

These ideas can be generalized to more than two independent variables, or to systems of PDEs with more than one dependent variable.

2.2.2 Principle of superposition

L defined above is an example of a linear operator:

L(αu) = αLu
L(u + v) = Lu + Lv

where u and v are any functions of x and y, and α is any constant.

The principle of superposition:

• if u and v satisfy the homogeneous equation Lu = 0, then αu and u + v also satisfy the homogeneous equation
• similarly, any linear combination of solutions of the homogeneous equation is also a solution
• if the particular integral up satisfies the inhomogeneous equation Lu = f and the complementary function uc satisfies the homogeneous equation Lu = 0, then up + uc also satisfies the inhomogeneous equation:

L(up + uc) = Lup + Luc = f + 0 = f

The same principle applies to any other type of linear equation (e.g. algebraic, ordinary differential, integral).

2.3 Classic examples

Laplace's equation:

∇²u = 0

Diffusion equation (heat equation):

∂u/∂t = λ∇²u,    λ = diffusion coefficient (or diffusivity)

Wave equation:

∂²u/∂t² = c²∇²u,    c = wave speed

All involve the vector-invariant operator ∇², the form of which depends on the number of active dimensions.

Inhomogeneous versions of these equations are also found. The inhomogeneous term usually represents a 'source' or 'force' that generates or drives the quantity u.

2.3.1 Examples of Laplace's (or Poisson's) equation

Gravitational acceleration is related to the gravitational potential by

g = −∇Φ

The source (divergence) of the gravitational field is mass:

∇·g = −4πGρ

where G is Newton's constant and ρ is the mass density. Combine:

∇²Φ = 4πGρ

External to the mass distribution, we have Laplace's equation ∇²Φ = 0. With the source term the inhomogeneous equation is called Poisson's equation.

Analogously, in electrostatics, the electric field is related to the electrostatic potential by

E = −∇Φ

The source of the electric field is electric charge:

∇·E = ρ/ε0

where ρ is the charge density and ε0 is the permittivity of free space. Combine to obtain Poisson's equation:

∇²Φ = −ρ/ε0

The vector field (g or E) is said to be generated by the potential Φ. A scalar potential is easier to work with because it does not have multiple components and its value is independent of the coordinate system. The potential is also directly related to the energy of the system.

2.3.2 Examples of the diffusion equation

A conserved quantity with density (amount per unit volume) Q and flux density (flux per unit area) F satisfies the conservation equation:

∂Q/∂t + ∇·F = 0

This is more easily understood when integrated over the volume V bounded by any (fixed) surface S:

d/dt ∫_V Q dV = ∫_V ∂Q/∂t dV = −∫_V ∇·F dV = −∫_S F·dS

The amount of Q in the volume V changes only as the result of a net flux through S.

Often the flux of Q is directed down the gradient of Q through the linear relation (Fick's law)

F = −λ∇Q

Combine to obtain the diffusion equation (if λ is independent of r):

∂Q/∂t = λ∇²Q

In a steady state Q satisfies Laplace's equation.

Example. ⊲ Heat conduction in a solid. Conserved quantity: energy. Heat per unit volume: CT (C is heat capacity per unit volume, T is temperature). Heat flux density: −K∇T (K is thermal conductivity). Thus

∂(CT)/∂t + ∇·(−K∇T) = 0

∂T/∂t = λ∇²T,    λ = K/C

e.g. copper: λ ≈ 1 cm² s⁻¹

Further example: concentration of a contaminant in a gas, λ ≈ 0.2 cm² s⁻¹.

2.3.3 Examples of the wave equation

Waves on a string. The string has uniform mass per unit length ρ and uniform tension T. The transverse displacement y(x, t) is small (|∂y/∂x| ≪ 1) and satisfies Newton's second law for an element δx of the string:

(ρ δx) ∂²y/∂t² = δFy ≈ (∂Fy/∂x) δx

Now

Fy = T sin θ ≈ T ∂y/∂x

where θ is the angle between the x-axis and the tangent to the string. Combine to obtain the one-dimensional wave equation

∂²y/∂t² = c² ∂²y/∂x²,    c = (T/ρ)^(1/2)

e.g. violin (D-)string: T ≈ 40 N, ρ ≈ 1 g m⁻¹: c ≈ 200 m s⁻¹

Further example: sound waves in a gas, c ≈ 300 m s⁻¹.

Electromagnetic waves. Maxwell's equations for the electromagnetic field in a vacuum:

∇·E = 0
∇·B = 0
∇×E + ∂B/∂t = 0
∇×B − µ0 ε0 ∂E/∂t = 0

where E is the electric field, B is the magnetic field, µ0 is the permeability of free space and ε0 is the permittivity of free space.

Eliminate E:

∂²B/∂t² = −∇×(∂E/∂t) = −(1/(µ0 ε0)) ∇×(∇×B)

Now use the identity ∇×(∇×B) = ∇(∇·B) − ∇²B and Maxwell's equation ∇·B = 0 to obtain the (vector) wave equation

∂²B/∂t² = c²∇²B,    c = 1/(µ0 ε0)^(1/2)

E obeys the same equation. c is the speed of light, c ≈ 3 × 10⁸ m s⁻¹.

2.3.4 Examples of other second-order linear PDEs

Schrödinger's equation (quantum-mechanical wavefunction of a particle of mass m in a potential V):

iℏ ∂ψ/∂t = −(ℏ²/2m)∇²ψ + V(r)ψ

Helmholtz equation (arises in wave problems):

∇²u + k²u = 0

Klein–Gordon equation (arises in quantum mechanics):

∂²u/∂t² = c²(∇²u − m²u)

2.3.5 Examples of nonlinear PDEs

Burgers' equation (describes shock waves):

∂u/∂t + u ∂u/∂x = λ∇²u

Nonlinear Schrödinger equation (describes solitons, e.g. in optical fibre communication):

i ∂ψ/∂t = −∇²ψ − |ψ|²ψ

These equations require different methods of analysis.

2.4 Separation of variables (Cartesian coordinates)

2.4.1 Diffusion equation

We consider the one-dimensional diffusion equation (e.g. conduction of heat along a metal bar):

∂u/∂t = λ ∂²u/∂x²

The bar could be considered finite (having two ends), semi-infinite (having one end) or infinite (having no end). Typical boundary conditions at an end are:

• u is specified (Dirichlet boundary condition)
• ∂u/∂x is specified (Neumann boundary condition)

If a boundary is removed to infinity, we usually require instead that u be bounded (i.e. remain finite) as x → ±∞. This condition is needed to eliminate unphysical solutions.

To determine the solution fully we also require initial conditions. For the diffusion equation this means specifying u as a function of x at some initial time (usually t = 0).

Try a solution in which the variables appear in separate factors:

u(x, t) = X(x)T(t)

Substitute this into the PDE:

X(x)T′(t) = λX″(x)T(t)

(recall that a prime denotes differentiation of a function with respect to its argument). Divide through by λXT to separate the variables:

T′(t)/λT(t) = X″(x)/X(x)

The LHS depends only on t, while the RHS depends only on x. Both must therefore equal a constant, which we call −k² (for later convenience). The PDE is separated into two ordinary differential equations (ODEs)

T′ + λk²T = 0
X″ + k²X = 0

with general solutions

T = A exp(−λk²t)

n = 0. by any means. . ∂u ∂t speciﬁed at t = 0 u(x. L Combine the factors to obtain the elementary solution (let C = 1 WLOG) u = A cos n2 π 2 λt nπx exp − L L2 This is an example of a Fourier integral. 2. zero heat ﬂux) apply at the two ends x = 0. 2 ∂t ∂x Boundary conditions: u=0 at x = 0. . L. L Initial conditions: u. We may assume k 0 WLOG. k= . Then the admissible solutions of the X equation are nπ X = C cos(kx). . The decay rate is proportional to n2 . 3. The general solution is a linear combination in the form of an integral: ∞ n = 1. studied in section 4. then ∞ Trial solution: u(x.e. If the temperature at time t = 0 is speciﬁed. t) = X(x)T (t) X(x)T ′′ (t) = c2 X ′′ (x)T (t) X ′′ (x) T ′′ (t) = = −k 2 c2 T (t) X(x) T ′′ + c2 k 2 T = 0 X ′′ + k 2 X = 0 T = A sin(ckt) + B cos(ckt) nπ . k= L u(x.2 Wave equation (example) 0<x<L Each elementary solution represents a ‘decay mode’ of the bar. . The coeﬃcients An are just the Fourier coeﬃcients of the initial temperature distribution. 0) = n=0 An cos nπx L is known. . 1. 2.4. but it does for some of the most important examples • it is not supposed to be obvious that the most general solution can be written as a linear combination of separated solutions 2. u= 0 A(k) cos(kx) exp(−λk t) dk 45 2 46 . X = C sin(kx). The principle of superposition allows us to construct a general solution as a linear combination of elementary solutions: ∞ ∂2u ∂2u = c2 2 .X = B sin(kx) + C cos(kx) Finite bar : suppose the Neumann boundary conditions ∂u/∂x = 0 (i. The n = 0 mode represents a uniform temperature distribution and does not decay. . t) = n=0 An cos nπx n2 π 2 λt exp − L L2 The coeﬃcients An can be determined from the initial conditions. Notes: • separation of variables doesn’t work for all linear equations. Semi-inﬁnite bar : The solution X ∝ cos(kx) is valid for any real value of k so that u is bounded as x → ±∞. The solution decays in time unless k = 0.

2.4.2 Wave equation (example)

∂²u/∂t² = c² ∂²u/∂x²,    0 < x < L

Boundary conditions:

u = 0    at x = 0, L

Initial conditions: u, ∂u/∂t specified at t = 0.

Trial solution:

u(x, t) = X(x)T(t)
X(x)T″(t) = c²X″(x)T(t)
T″(t)/c²T(t) = X″(x)/X(x) = −k²
T″ + c²k²T = 0,    X″ + k²X = 0
T = A sin(ckt) + B cos(ckt),    X = C sin(kx),    k = nπ/L,    n = 1, 2, 3, ...

Elementary solution:

u = [ A sin(nπct/L) + B cos(nπct/L) ] sin(nπx/L)

General solution:

u = Σ_{n=1}^{∞} [ An sin(nπct/L) + Bn cos(nπct/L) ] sin(nπx/L)

At t = 0:

u = Σ_{n=1}^{∞} Bn sin(nπx/L)
∂u/∂t = Σ_{n=1}^{∞} An (nπc/L) sin(nπx/L)

The Fourier series for the initial conditions determine all the coefficients An, Bn.

2.4.3 Helmholtz equation (example)

A three-dimensional example in a cube:

∇²u + k²u = 0,    0 < x, y, z < L

Boundary conditions:

u = 0    on the boundaries

Trial solution:

u(x, y, z) = X(x)Y(y)Z(z)

X″(x)Y(y)Z(z) + X(x)Y″(y)Z(z) + X(x)Y(y)Z″(z) + k²X(x)Y(y)Z(z) = 0

X″(x)/X(x) + Y″(y)/Y(y) + Z″(z)/Z(z) + k² = 0

The variables are separated: each term must equal a constant. Call the constants −kx², −ky², −kz²:

X″ + kx²X = 0,    Y″ + ky²Y = 0,    Z″ + kz²Z = 0
k² = kx² + ky² + kz²

X ∝ sin(kx x),    kx = nxπ/L,    nx = 1, 2, 3, ...
Y ∝ sin(ky y),    ky = nyπ/L,    ny = 1, 2, 3, ...
Z ∝ sin(kz z),    kz = nzπ/L,    nz = 1, 2, 3, ...

u ∝ sin(nxπx/L) sin(nyπy/L) sin(nzπz/L)

The solution is possible only if k² is one of the discrete values

k² = (nx² + ny² + nz²) π²/L²

These are the eigenvalues of the problem.
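Several different triples (nx, ny, nz) can give the same eigenvalue, so eigenvalues can be degenerate. A short sketch (plain Python, no external packages) counts the lowest values of k²L²/π²; the cutoff range(1, 6) is our own choice and is large enough for the values printed:

from collections import Counter
from itertools import product

counts = Counter(nx*nx + ny*ny + nz*nz
                 for nx, ny, nz in product(range(1, 6), repeat=3))
for val, mult in sorted(counts.items())[:5]:
    print(val, mult)   # 3 1, 6 3, 9 3, 11 3, 12 1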

3 Green's functions

3.1 Impulses and the delta function

3.1.1 Physical motivation

Newton's second law for a particle of mass m moving in one dimension subject to a force F(t) is

dp/dt = F

where

p = m dx/dt

is the momentum. Suppose that the force is applied only in the time interval 0 < t < δt. The total change in momentum is

δp = ∫_0^{δt} F(t) dt = I

and is called the impulse.

We may wish to represent mathematically a situation in which the momentum is changed instantaneously, e.g. if the particle experiences a collision. To achieve this, F must tend to infinity while δt tends to zero, in such a way that its integral I is finite and non-zero.

In other applications we may wish to represent an idealized point charge or point mass, or a localized source of heat, waves, etc. Here we need a mathematical object of infinite density and zero spatial extension but having a non-zero integral effect. The delta function is introduced to meet these requirements.

3.1.2 Step function and delta function

We start by defining the Heaviside unit step function

H(x) = 0,  x < 0
H(x) = 1,  x > 0

The value of H(0) does not matter for most purposes. It is sometimes taken to be 1/2. An alternative notation for H(x) is θ(x). H(x) can be used to construct other discontinuous functions.

Consider the particular 'top-hat' function

δε(x) = 1/ε,  0 < x < ε
δε(x) = 0,    otherwise

where ε is a positive parameter. This function can also be written as

δε(x) = [ H(x) − H(x − ε) ] / ε

The area under the curve is equal to one. In the limit ε → 0, we obtain a 'spike' of infinite height, vanishing width and unit area localized at x = 0. This limit is the Dirac delta function, δ(x).

We might think of defining the delta function as

δ(x) = ∞,  x = 0
δ(x) = 0,  x ≠ 0

but this is not specific enough to describe it. It is not a function in the usual sense but a 'generalized function' or 'distribution'.

The indefinite integral of δε(x) is

∫_{−∞}^{x} δε(ξ) dξ = 0 for x ≤ 0,  x/ε for 0 ≤ x ≤ ε,  1 for x ≥ ε

In the limit ε → 0, we obtain

∫_{−∞}^{x} δ(ξ) dξ = H(x)

or, equivalently,

δ(x) = H′(x)

Our idealized impulsive force (section 3.1.1) can be represented as

F(t) = I δ(t)

which represents a spike of strength I localized at t = 0. If the particle is at rest before the impulse, the solution for its momentum is p = I H(t).

3.2 Other definitions

Instead, the defining property of δ(x) can be taken to be

∫_{−∞}^{∞} f(x)δ(x) dx = f(0)

where f(x) is any continuous function. This is because the unit spike picks out the value of the function at the location of the spike.

One way to justify this result is as follows. Consider a continuous function f(x) with indefinite integral g(x), i.e. f(x) = g′(x). Then

∫_{−∞}^{∞} f(x)δε(x − ξ) dx = (1/ε) ∫_ξ^{ξ+ε} f(x) dx = [ g(ξ + ε) − g(ξ) ] / ε

From the definition of the derivative,

lim_{ε→0} ∫_{−∞}^{∞} f(x)δε(x − ξ) dx = g′(ξ) = f(ξ)

The delta function can be placed anywhere: q δ(x − ξ) represents a 'spike' of strength q located at x = ξ, as required. Thus

∫_{−∞}^{∞} f(x)δ(x − ξ) dx = f(ξ)

Since δ(x − ξ) = 0 for x ≠ ξ, the integral can be taken over any interval that includes the point x = ξ.

This result (the boxed formula above) is equivalent to the substitution property of the Kronecker delta:

Σ_{j=1}^{n} aj δij = ai

In other physical applications δ(x) is used to represent an idealized point charge or localized source.
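The 'sifting' limit above can be seen numerically. A minimal sketch, assuming numpy, with the top-hat δε and the arbitrary test function f(x) = cos x at ξ = 0.3 (both our own choices):

import numpy as np

def sift(f, xi, eps, n=100000):
    x = xi + eps*(np.arange(n) + 0.5)/n   # midpoint rule on (ξ, ξ+ε)
    return f(x).mean()                    # equals (1/ε) ∫_ξ^{ξ+ε} f dx

for eps in (1.0, 0.1, 0.01):
    print(eps, sift(np.cos, 0.3, eps))    # tends to cos(0.3) ≈ 0.95534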

3.3 More on generalized functions

δ(x) can also be seen as the limit of localized functions other than our top-hat example. Alternative, smooth choices for δε(x) include

δε(x) = ε / [ π(x² + ε²) ]
δε(x) = (2πε²)^(−1/2) exp( −x²/2ε² )

Derivatives of the delta function can also be defined as the limits of sequences of functions. The generating functions for δ′(x) are the derivatives of (smooth) functions (e.g. Gaussians) that generate δ(x), and have both positive and negative 'spikes' localized at x = 0.

The defining property of δ′(x) can be taken to be

∫_{−∞}^{∞} f(x)δ′(x − ξ) dx = −∫_{−∞}^{∞} f′(x)δ(x − ξ) dx = −f′(ξ)

where f(x) is any differentiable function. This follows from an integration by parts before the limit is taken.

Not all operations are permitted on generalized functions. In particular, two generalized functions of the same variable cannot be multiplied together, e.g. H(x)δ(x) is meaningless. However δ(x)δ(y) is permissible and represents a point source in a two-dimensional space.

3.4 Differential equations containing delta functions

If a differential equation involves a step function or delta function, this generally implies a lack of smoothness in the solution. The equation can be solved separately on either side of the discontinuity and the two parts of the solution connected by applying the appropriate matching conditions.

Consider, as an example, the linear second-order ODE

d²y/dx² + y = δ(x)    (1)

If x represents time, this equation could represent the behaviour of a simple harmonic oscillator in response to an impulsive force.

In each of the regions x < 0 and x > 0 separately, the right-hand side vanishes and the general solution is a linear combination of cos x and sin x. We may write

y = A cos x + B sin x,    x < 0
y = C cos x + D sin x,    x > 0

Since the general solution of a second-order ODE should contain only two arbitrary constants, it must be possible to relate C and D to A and B.

What is the nature of the non-smoothness in y? It can be helpful to think of a sequence of functions related by differentiation:

[Figure: the functions xH(x), H(x) and δ(x); each is the derivative of the one before.]

y must be of the same class as xH(x): it is continuous but has a jump in its first derivative. The size of the jump can be found by integrating equation (1) from x = −ε to x = ε and letting ε → 0:

[dy/dx] ≡ lim_{ε→0} ( dy/dx |_{x=ε} − dy/dx |_{x=−ε} ) = 1

Since y is bounded it makes no contribution under this procedure. Since y is continuous, the jump conditions are

[y] = 0,    [dy/dx] = 1    at x = 0

Applying these, we obtain

C − A = 0,    D − B = 1

and so the general solution is

y = A cos x + B sin x,          x < 0
y = A cos x + (B + 1) sin x,    x > 0

In particular, if the oscillator is at rest before the impulse occurs, then A = B = 0 and the solution is y = H(x) sin x.

ξ) = A(ξ)y1 (x) + B(ξ)y2 (x). the solution must be of the form G(x. This works because the ODE is linear and the BCs are homogeneous. The jump conditions can be found by integrating equation (3) from x = ξ − ǫ to x = ξ + ǫ and letting ǫ → 0: ∂G ∂G ≡ lim ǫ→0 ∂x ∂x x=ξ+ǫ To determine A. the solution of equation (1) is then ∞ Since p ∂G/∂x and qG are bounded they make no contribution under this procedure. B. The Wronskian W (x) of two solutions y1 (x) and y2 (x) of a second-order ODE is the determinant of the Wronskian matrix: W [y1 . G must be continuous but have a discontinuous ﬁrst derivative. ∂G =1 ∂x at x = ξ (6) y(x) = 0 G(x. ξ) satisﬁes the same equation and boundary conditions with respect to x as y does • however. rather than a distributed forcing f (x) If Green’s function can be found. The meaning of equation (5) is that the response to distributed forcing (i. the solution of Ly = f ) is obtained by summing the responses to forcing at individual points. ξ)f (ξ) dξ (5) Suppose that two complementary functions y1 .e. 0 x<ξ x>ξ Ly(x) = 0 LG f (ξ) dξ = 0 δ(x − ξ)f (ξ) dξ = f (x) as required. C(ξ)y1 (x) + D(ξ)y2 (x). Applying L to equation (5) gives ∞ ∞ The Wronskian is non-zero unless y1 and y2 are linearly dependent (one is a constant multiple of the other).• G(x. weighted by the force distribution. Since the right-hand side of equation (3) vanishes for x < ξ and x > ξ separately. Boundary conditions at x = 0: A(ξ)y1 (0) + B(ξ)y2 (0) = 0 ′ ′ A(ξ)y1 (0) + B(ξ)y2 (0) = 0 In matrix form: y1 (0) y2 (0) ′ ′ y1 (0) y2 (0) 0 A(ξ) = 0 B(ξ) =1 x=ξ−ǫ Since the determinant W (0) of the matrix is non-zero the only solution is A(ξ) = B(ξ) = 0. 60 59 . note that equation (3) is just an ODE involving a delta function. let L be the diﬀerential operator L= ∂2 ∂ +p +q 2 ∂x ∂x Then equations (1) and (3) read Ly = f and LG = δ(x−ξ) respectively. the jump conditions are [G] = 0. y2 are known. in which ξ appears as a parameter. It also follows from equation (4) that y satisﬁes the boundary conditions (2) as required. C. Since G is continuous. To satisfy this equation. y2 ] = y1 y2 ′ ′ ′ ′ = y 1 y2 − y2 y1 y1 y2 To verify this. To ﬁnd Green’s function. it is the response to forcing that is localized at a point x = ξ. D we apply the boundary conditions (4) and the jump conditions (6).

subject to the two-point homogeneous BCs (1) (2) In matrix form: ya (ξ) yb (ξ) ′ ′ ya (ξ) yb (ξ) Solution: ′ 1 yb (ξ) −yb (ξ) −A(ξ) = ′ B(ξ) W (ξ) −ya (ξ) ya (ξ) α1 y ′ (a) + α2 y(a) = 0 β1 y ′ (b) + β2 y(b) = 0 Green’s function G(x. (These can always be found. B we apply the jump conditions [G] = 0 and [∂G/∂x] = 1 at x = ξ: B(ξ)yb (ξ) − A(ξ)ya (ξ) = 0 ′ ′ B(ξ)yb (ξ) − A(ξ)ya (ξ) = 1 We now consider a similar equation Ly = f for a x b. ξ)f (ξ) dξ −y2 (ξ)/W (ξ) 0 = y1 (ξ)/W (ξ) 1 Green’s function is therefore G(x. ξ) = 0 ∂x and is deﬁned for a x b and a β1 (5) ξ b. Let ya (x) be a complementary function satisfying the left-hand BC (1). ya (ξ)yb (x) W (ξ) . the solution of Ly = f subject to the BCs (1) and (2) is then b 0 C(ξ) = 1 D(ξ) y(x) = a G(x. ξ) = A(ξ)ya (x). To determine A. ξ) = 0 α1 ∂x 61 −A(ξ) 0 = B(ξ) 1 (3) 0 −yb (ξ)/W (ξ) = 1 ya (ξ)/W (ξ) Green’s function is therefore (4) G(x.4 Green’s function for a boundary-value problem satisfying the BCs (4) and (5). In matrix form: y1 (ξ) y2 (ξ) ′ ′ y1 (ξ) y2 (ξ) Solution: ′ 1 y2 (ξ) −y2 (ξ) C(ξ) = ′ D(ξ) W (ξ) −y1 (ξ) y1 (ξ) By a similar argument. ξ) = ya (x)yb (ξ) W (ξ) . B(ξ)yb (x). a ξ x x 62 ξ b .) Since the right-hand side of equation (3) vanishes for x < ξ and x > ξ separately. ξ) + α2 G(a. W (ξ) We ﬁnd Green’s function by a similar method. y1 (ξ)y2 (x)−y1 (x)y2 (ξ) . ξ) for this problem is the solution of LG = δ(x − ξ) subject to the homogeneous BCs ∂G (a. and let yb (x) be a complementary function satisfying the right-hand BC (2). the solution must be of the form G(x.5. a x<ξ ξ<x b 0 x x ξ ξ 3.Jump conditions at x = ξ: C(ξ)y1 (ξ) + D(ξ)y2 (ξ) = 0 ′ ′ C(ξ)y1 (ξ) + D(ξ)y2 (ξ) = 1 ∂G (b. ξ) = 0. ξ) + β2 G(b.

... Wronskian ′ ′ W = ya yb − yb ya = sin x cos(x − 1) − sin(x − 1) cos x = sin 1 Example . . ... . . ... .. . .. . . .. .. ... . .. . ... . 63 64 ... yb ] vanishes.. . .. . ..5 Examples of Green’s function Example .. . ..... .... ... ... ... . .. .. . . y2 = sin x... .. . .. ... . This happens if ya is proportional to yb .. . . . . . . 0 sin ξ sin(x − 1)/ sin 1. . = 0... .. . .. .. . . . . . the solution will not be unique..... . .. . . .... ... .. . .. . . . . Wronskian ′ ′ W = y1 y2 − y2 y1 = cos2 x + sin2 x = 1 sin x sin(ξ − 1)/ sin 1... .5... .. . i. .... .... . .. .. .. .... ... ... .. ... . 0 x x ξ ξ y(x) = 0 sin(x − ξ)f (ξ) dξ . yb = sin(x − 1) satisfying left and right BCs respectively. y(0) = y(1) = 0 Complementary functions ya = sin x. ... . ξ)f (ξ) dξ sin x 1 sin(x − 1) x sin ξ f (ξ) dξ + sin(ξ − 1)f (ξ) dξ sin 1 sin 1 x 0 . .. .. In this (exceptional) case the equation Ly = f may not have a solution satisfying the BCs.... ...... .... . . ... . . ⊲ Find Green’s function for the two-point boundary-value problem y ′′ (x) + y(x) = f (x)... sin(x − ξ)... . .. .. . .. . . if it does. . .. .... ....e.. ⊲ Find Green’s function for the initial-value problem y ′′ (x) + y(x) = f (x). if there is a complementary function that happens to satisfy both homogeneous BCs..... .. .. .... . . ξ) = So 1 Complementary functions y1 = cos x. .. .. .. . .. 3. .... ..This method fails if the Wronskian W [ya . .. ξ) = So x y(x) = 0 G(x. y(0) = y ′ (0) = 0 Thus G(x. ..... . ξ x x ξ 1 Now y1 (ξ)y2 (x) − y1 (x)y2 (ξ) = cos ξ sin x − cos x sin ξ = sin(x − ξ) Thus G(x.

The Fourier coeﬃcients are found from an = 2 P 2 P P/2 f (x) cos(kn x) dx −P/2 P/2 bn = f (x) sin(kn x) dx −P/2 4.g. data analysis. statistics and number theory. Certain operations on a function are more easily computed ‘in the Fourier domain’. then the harmonics have frequencies n/P where n is an integer. The Fourier series gives the extension of the function by periodic repetition. Such a series is also used to write any function that is deﬁned only on an interval of length P . The Fourier transform provides a complementary way of looking at a function.4 4. e. the Fourier transform has innumerable applications in diverse ﬁelds such as astronomy. If the period is P . (an − ibn )/2. It can then be written as a Fourier series 1 f (x) = a0 + an cos(kn x) + bn sin(kn x) 2 n=1 n=1 65 ∞ ∞ 66 . signal processing. This idea is particularly useful in solving certain kinds of diﬀerential equation.2 Relation to Fourier series Deﬁne (a−n + ib−n )/2. The Fourier transform generalizes this idea to functions that are not periodic. cn = a0 /2. −P/2 < x < P/2. optics. is the wavenumber of the nth harmonic. n<0 n=0 n>0 A function f (x) has period P if f (x + P ) = f (x) for all x. Furthermore. The ‘harmonics’ can then have any frequency.1 The Fourier transform Motivation where kn = 2πn P A periodic signal can be analysed into its harmonic components by calculating its Fourier series.

g. They diﬀer in the placement of the 2π factor and in the signs of the exponents. i. Several diﬀerent deﬁnitions of the Fourier transform are in use.e.Then the same result can be expressed more simply and compactly in the notation of the complex Fourier series ∞ Notes: • the Fourier transform operation is sometimes denoted by ˜ f (k) = F[f (x)].g. have a ﬁnite number of discontinuities and be ‘absolutely integrable’. let P → ∞ so that the function is deﬁned on the entire real line without any periodicity. The Fourier sum becomes an integral ∞ f (x) = −∞ dn 1 ˜ f (k) eikx dk P dkn ∞ −∞ However. To approach the Fourier transform. e. certain technical conditions on f (x) are required ˜ A necessary condition for f (k) to exist for all real values of k (in the sense of an ordinary function) is that f (x) → 0 as x → ±∞. The deﬁnition used here is probably the most common. How to remember this deﬁnition: • the sign of the exponent is diﬀerent in the forward and inverse transforms • the inverse transform means that the function f (x) is synthesized from a linear combination of basis functions eikx • the division by 2π always accompanies an integration with respect to k 1 = 2π ˜ f (k) eikx dk We therefore have the forward Fourier transform (Fourier analysis) ˜ f (k) = ∞ −∞ f (x) e−ikx dx and the inverse Fourier transform (Fourier synthesis) f (x) = 1 2π ∞ −∞ ˜ f (k) eikx dk 67 68 . we will see that Fourier transforms can be assigned in a wider sense to some functions that do not satisfy all of these conditions. corresponding to P cn . The wavenumbers kn are replaced by a continuous variable k. We deﬁne ˜ f (k) = ∞ −∞ f (x) e−ikx dx |f (x)| dx < ∞. ∞ −∞ f (x) = n=−∞ cn eikn x P/2 −P/2 1 cn = P P/2 −P/2 f (x) e−ikn x dx This expression for cn can be veriﬁed using the orthogonality relation 1 P ei(kn −km )x dx = δmn which follows from an elementary integration. Warning. Otherwise the Fourier integral does not converge (e. ˜ A set of suﬃcient conditions for f (k) to exist is that f (x) have ‘bounded variation’. ˜ f (x) = F −1 [f (k)] • the variables are often called t and ω rather than x and k • it is sometimes useful to consider complex values of k • for a rigorous proof. for k = 0). f (x) = 1.

3 Examples Example (1): top-hat function: f (x) = ˜ f (k) = c.g. if a = −1. b a<x<b otherwise ic −ikb e − e−ika k a e.4. 0. b = 1 and c = 1: 2 sin k i −ik ˜ e − eik = f (k) = k k c e−ikx dx = Example (3): Gaussian function (normal distribution): 2 f (x) = (2πσx )−1/2 exp − 2 ˜ f (k) = (2πσx )−1/2 ∞ −∞ x2 2 2σx x2 − ikx 2 2σx dx exp − Change variable to z= so that − e−x e−ikx dx 1 e−(1+ik)x 1 + ik ∞ 0 x + iσx k σx Example (2): f (x) = e−|x| ˜ f (k) = = 0 −∞ ∞ 0 0 −∞ σ2 k2 x2 z2 = − 2 − ikx + x 2 2σx 2 ex e−ikx dx + Then 2 ˜ f (k) = (2πσx )−1/2 ∞ −∞ 1 e(1−ik)x 1 − ik 1 1 = + 1 − ik 1 + ik 2 = 1 + k2 − exp − z2 2 dz σx exp − 2 σx k 2 2 σ2 k2 = exp − x 2 69 70 .

e. The result is proportional to a standard Gaussian function of k: k2 2 ˜ f (k) ∝ (2πσk )−1/2 exp − 2 2σk of width (standard deviation) σk related to σx by σx σk = 1 This illustrates a property of the Fourier transform: the narrower the function of x. Diﬀerentiation/multiplication: g(x) = f ′ (x) g(x) = xf (x) Duality: ˜ g(x) = f (x) i. transforming twice returns (almost) the same function Complex conjugation and parity inversion (for real x and k): ˜ g(x) = [f (x)]∗ ⇔ g (k) = [f (−k)]∗ ˜ Symmetry: f (−x) = ±f (x) ⇔ ˜ ˜ f (−k) = ±f (k) (10) ⇔ g (k) = 2πf (−k) ˜ (8) ⇔ ⇔ ˜ g (k) = ik f (k) ˜ ˜ g (k) = if ′ (k) ˜ (6) (7) (9) 71 72 .4 Basic properties of the Fourier transform Linearity: g(x) = αf (x) h(x) = f (x) + g(x) Rescaling (for real α): g(x) = f (αx) ⇔ g (k) = ˜ 1 ˜ k f |α| α (3) ⇔ ⇔ ˜ g (k) = αf (k) ˜ ˜ ˜ h(k) = f (k) + g (k) ˜ (1) (2) where we use the standard Gaussian integral ∞ −∞ Shift/exponential (for real α): g(x) = e g(x) = f (x − α) iαx exp − z2 2 dz = (2π)1/2 f (x) ⇔ ⇔ ˜ g (k) = e−ikα f (k) ˜ ˜ g (k) = f (k − α) ˜ (4) (5) Actually there is a slight cheat here because z has an imaginary part. the wider the function of k.4. This will be explained next term.

i.e. f is even or odd. then ˜ f (−k) = ∞ −∞ ∞ −∞ −ikα f (x) e+ikx dx ±f (−x) eikx dx ∞ g(x) = f (x) ∞ ′ = −∞ g (k) = ˜ −∞ f ′ (x) e−ikx dx ∞ −∞ ∞ =± f (x)(−ik)e−ikx dx f (y) e−iky dy −∞ = f (x) e−ikx ˜ = ik f (k) − −∞ ˜ = ±f (k) The integrated part vanishes because f (x) must tend to zero as x → ±∞ in order to possess a Fourier transform. 73 74 .Sample derivations: property (3): g(x) = f (αx) ∞ Property (7): g(x) = xf (x) ∞ g (k) = ˜ −∞ f (αx) e−ikx dx ∞ g(k) = ˜ dy α =i xf (x) e−ikx dx f (x)(−ix)e−ikx dx = sgn(α) −∞ f (y) e−iky/α −∞ ∞ −∞ ∞ 1 = f (y) e−i(k/α)y dy |α| −∞ 1 ˜ k f = |α| α ∞ d =i f (x) e−ikx dx dk −∞ ˜ = if ′ (k) Property (4): g(x) = f (x − α) ∞ Property (8): ˜ g(x) = f (x) ∞ g (k) = ˜ −∞ ∞ f (x − α) e −ikx dx g(k) = ˜ −∞ ˜ f (x) e−ikx dx = =e Property (6): f (y) e−ik(y+α) dy ˜ f (k) = 2πf (−k) Property (10): if f (−x) = ±f (x).

Sample derivations:

Property (3): g(x) = f(αx):

g̃(k) = ∫_{−∞}^{∞} f(αx) e^{−ikx} dx = sgn(α) ∫_{−∞}^{∞} f(y) e^{−iky/α} dy/α
     = (1/|α|) ∫_{−∞}^{∞} f(y) e^{−i(k/α)y} dy = (1/|α|) f̃(k/α)

Property (4): g(x) = f(x − α):

g̃(k) = ∫_{−∞}^{∞} f(x − α) e^{−ikx} dx = ∫_{−∞}^{∞} f(y) e^{−ik(y+α)} dy = e^{−ikα} f̃(k)

Property (6): g(x) = f′(x):

g̃(k) = ∫_{−∞}^{∞} f′(x) e^{−ikx} dx = [ f(x) e^{−ikx} ]_{−∞}^{∞} − ∫_{−∞}^{∞} f(x)(−ik) e^{−ikx} dx = ik f̃(k)

The integrated part vanishes because f(x) must tend to zero as x → ±∞ in order to possess a Fourier transform.

Property (7): g(x) = x f(x):

g̃(k) = ∫_{−∞}^{∞} x f(x) e^{−ikx} dx = i ∫_{−∞}^{∞} f(x)(−ix) e^{−ikx} dx = i (d/dk) ∫_{−∞}^{∞} f(x) e^{−ikx} dx = i f̃′(k)

Property (8): g(x) = f̃(x):

g̃(k) = ∫_{−∞}^{∞} f̃(x) e^{−ikx} dx = 2πf(−k)

Property (10): if f(−x) = ±f(x), then

f̃(−k) = ∫_{−∞}^{∞} f(x) e^{+ikx} dx = ∫_{−∞}^{∞} ±f(−x) e^{ikx} dx = ± ∫_{−∞}^{∞} f(y) e^{−iky} dy = ±f̃(k)

4.5 The delta function and the Fourier transform

Consider the Gaussian function of example (3):

f(x) = (2πσx²)^(−1/2) exp( −x²/2σx² ),    f̃(k) = exp( −σx²k²/2 )

The Gaussian is normalized such that

∫_{−∞}^{∞} f(x) dx = 1

As the width σx tends to zero, the Gaussian becomes taller and narrower, but the area under the curve remains the same. The value of f(x) tends to zero for any non-zero value of x. At the same time, the value of f̃(k) tends to unity for any finite value of k.

In this limit we approach the Dirac delta function, δ(x). The substitution property allows us to verify the Fourier transform of the delta function:

δ̃(k) = ∫_{−∞}^{∞} δ(x) e^{−ikx} dx = e^{−ik·0} = 1

Now formally apply the inverse transform:

δ(x) = (1/2π) ∫_{−∞}^{∞} e^{ikx} dk

Relabel the variables and rearrange (the exponent can have either sign):

∫_{−∞}^{∞} e^{±ikx} dx = 2πδ(k)

Therefore the Fourier transform of a unit constant (1) is 2πδ(k). Note that a constant function does not satisfy the necessary condition for the existence of a (regular) Fourier transform. But it does have a Fourier transform in the space of generalized functions.

4.6 The convolution theorem

4.6.1 Definition of convolution

The convolution of two functions, h = f ∗ g, is defined by

h(x) = ∫_{−∞}^{∞} f(y) g(x − y) dy

Note that the sum of the arguments of f and g is the argument of h. Convolution is a symmetric operation:

[g ∗ f](x) = ∫_{−∞}^{∞} g(y) f(x − y) dy = ∫_{−∞}^{∞} f(z) g(x − z) dz = [f ∗ g](x)
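A discrete approximation makes the definition concrete. The sketch below (assuming numpy) convolves two unit top-hats on (−1/2, 1/2), our own test choice; the result is the triangle function supported on (−1, 1):

import numpy as np

dx = 0.001
x = np.arange(-2.0, 2.0, dx)
f = np.where(np.abs(x) < 0.5, 1.0, 0.0)
g = f.copy()

h = np.convolve(f, g, mode='same')*dx     # ≈ ∫ f(y) g(x−y) dy on the same grid
print(h.max())                            # ≈ 1, attained at x = 0
print(h[np.abs(x) > 1.0].max())           # ≈ 0 outside (−1, 1)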

4.6.2 Interpretation and examples

The effect of measuring, observing or processing scientific data can often be described as a convolution of the data with a certain function, e.g. when a point source is observed by a telescope, a broadened image is seen, known as the point spread function of the telescope. When an extended source is observed, the image that is seen is the convolution of the source with the point spread function. In this sense convolution corresponds to a broadening or distortion of the original data.

In statistics, a continuous random variable x (e.g. the height of a person drawn at random from the population) has a probability distribution (or density) function f(x). The probability of x lying in the range x0 < x < x0 + δx is f(x0)δx, in the limit of small δx.

If x and y are independent random variables with distribution functions f(x) and g(y), then let the distribution function of their sum, z = x + y, be h(z). Now, for any given value of x, the probability that z lies in the range z0 < z < z0 + δz is just the probability that y lies in the range z0 − x < y < z0 − x + δz, which is g(z0 − x)δz. Therefore

h(z0)δz = ∫_{−∞}^{∞} f(x) g(z0 − x)δz dx

which implies h = f ∗ g.

A point mass M at position R gives rise to a gravitational potential Φ(r) = −GM/|r − R|. A continuous mass density ρ(r) can be thought of as a sum of infinitely many point masses ρ(R) d³R at positions R. The resulting gravitational potential is

Φ(r) = −G ∫ [ ρ(R)/|r − R| ] d³R

which is the (3D) convolution of the mass density ρ(r) with the potential of a unit point mass at the origin, −G/|r|.

4.6.3 The convolution theorem

The Fourier transform of a convolution is

h̃(k) = ∫_{−∞}^{∞} [ ∫_{−∞}^{∞} f(y) g(x − y) dy ] e^{−ikx} dx
      = ∫∫ f(y) g(x − y) e^{−ikx} dx dy
      = ∫∫ f(y) g(z) e^{−iky} e^{−ikz} dz dy    (z = x − y)
      = [ ∫_{−∞}^{∞} f(y) e^{−iky} dy ] [ ∫_{−∞}^{∞} g(z) e^{−ikz} dz ]
      = f̃(k) g̃(k)

Similarly, the Fourier transform of f(x)g(x) is (1/2π)[f̃ ∗ g̃](k).

This means that:

• convolution is an operation best carried out as a multiplication in the Fourier domain
• the Fourier transform of a product is a complicated object
• convolution can be undone (deconvolution) by a division in the Fourier domain. If g is known and f ∗ g is measured, then f can be obtained, in principle.

4.6.4 Correlation

The correlation of two functions, h = f ⊗ g, is defined by

h(x) = ∫_{−∞}^{∞} [f(y)]* g(x + y) dy

Now the argument of h is the shift between the arguments of f and g. Correlation is a way of quantifying the relationship between two (typically oscillatory) functions. If two signals (oscillating about an average value of zero) oscillate in phase with each other, their correlation will be positive. If they are out of phase, the correlation will be negative. If they are completely unrelated, their correlation will be zero.

The Fourier transform of a correlation is

h̃(k) = ∫_{−∞}^{∞} [ ∫_{−∞}^{∞} [f(y)]* g(x + y) dy ] e^{−ikx} dx
      = ∫∫ [f(y)]* g(x + y) e^{−ikx} dx dy
      = ∫∫ [f(y)]* g(z) e^{iky} e^{−ikz} dz dy    (z = x + y)
      = [ ∫_{−∞}^{∞} f(y) e^{−iky} dy ]* [ ∫_{−∞}^{∞} g(z) e^{−ikz} dz ]
      = [f̃(k)]* g̃(k)

This result (or the special case g = f) is the Wiener–Khinchin theorem.

The autoconvolution and autocorrelation of f are f ∗ f and f ⊗ f. Their Fourier transforms are f̃² and |f̃|², respectively.

4.7 Parseval's theorem

If we apply the inverse transform to the WK theorem we find

∫_{−∞}^{∞} [f(y)]* g(x + y) dy = (1/2π) ∫_{−∞}^{∞} [f̃(k)]* g̃(k) e^{ikx} dk

Now set x = 0 and relabel y → x to obtain Parseval's theorem

∫_{−∞}^{∞} [f(x)]* g(x) dx = (1/2π) ∫_{−∞}^{∞} [f̃(k)]* g̃(k) dk

The special case used most frequently is when g = f:

∫_{−∞}^{∞} |f(x)|² dx = (1/2π) ∫_{−∞}^{∞} |f̃(k)|² dk

Note that division by 2π accompanies the integration with respect to k. Parseval's theorem means that the Fourier transform is a 'unitary transformation' that preserves the 'inner product' between two functions (see later!), in the same way that a rotation preserves lengths and angles.

Alternative derivation using the delta function:

∫_{−∞}^{∞} [f(x)]* g(x) dx
  = ∫_{−∞}^{∞} [ (1/2π) ∫_{−∞}^{∞} f̃(k) e^{ikx} dk ]* [ (1/2π) ∫_{−∞}^{∞} g̃(k′) e^{ik′x} dk′ ] dx
  = (1/(2π)²) ∫∫∫ [f̃(k)]* g̃(k′) e^{i(k′−k)x} dx dk′ dk
  = (1/(2π)²) ∫∫ [f̃(k)]* g̃(k′) 2πδ(k′ − k) dk′ dk
  = (1/2π) ∫_{−∞}^{∞} [f̃(k)]* g̃(k) dk

4.8 Power spectra

The quantity

Φ(k) = |f̃(k)|²

appearing in the Wiener–Khinchin theorem and Parseval's theorem is the (power) spectrum or (power) spectral density of the function f(x). The WK theorem states that the FT of the autocorrelation function is the power spectrum.

This concept is often used to quantify the spectral content (as a function of angular frequency ω) of a signal f(t). The spectrum of a perfectly periodic signal consists of a series of delta functions at the principal frequency and its harmonics, if present. Its autocorrelation function does not decay as t → ∞.

White noise is an ideal random signal with autocorrelation function proportional to δ(t): the signal is perfectly decorrelated. It therefore has a flat spectrum (Φ = constant). Less idealized signals may have spectra that are peaked at certain frequencies but also contain a general noise component.
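Such a mixed spectrum is easy to produce. The sketch below (assuming numpy) estimates the power spectrum of a sinusoid plus white noise with the discrete Fourier transform; the 5 Hz frequency, noise level and grid are all our own choices:

import numpy as np

rng = np.random.default_rng(0)
n, dt = 4096, 0.01
t = dt*np.arange(n)
f = np.sin(2*np.pi*5.0*t) + 0.5*rng.standard_normal(n)   # 5 Hz signal + noise

spec = np.abs(np.fft.rfft(f))**2       # discrete analogue of |f̃|²
freq = np.fft.rfftfreq(n, dt)
print(freq[np.argmax(spec)])           # ≈ 5.0: a sharp peak on a flat noise floor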
5 Matrices and linear algebra

5.1 Motivation

Many scientific quantities are vectors. A linear relationship between two vectors is described by a matrix. This could be either

• a physical relationship, e.g. that between the angular velocity and angular momentum vectors of a rotating body
• a relationship between the components of (physically) the same vector in different coordinate systems

Linear algebra deals with the addition and multiplication of scalars, vectors and matrices. Eigenvalues and eigenvectors are the characteristic numbers and directions associated with matrices, which allow them to be expressed in the simplest form. The matrices that occur in scientific applications usually have special symmetries that impose conditions on their eigenvalues and eigenvectors.

Vectors do not necessarily live in physical space. In some applications (notably quantum mechanics) we have to deal with complex spaces of various dimensions.

5.2 Vector spaces

5.2.1 Abstract definition of scalars and vectors

We are used to thinking of scalars as numbers, and vectors as directed line segments. For many purposes it is useful to define scalars and vectors in a more general (and abstract) way.

Scalars are the elements of a number field, of which the most important examples are:

• R, the set of real numbers
• C, the set of complex numbers

A number field F:

• is a set of elements on which the operations of addition and multiplication are defined and satisfy the usual laws of arithmetic (commutative, associative and distributive)
• is closed under addition and multiplication
• includes identity elements (the numbers 0 and 1) for addition and multiplication
• includes inverses (negatives and reciprocals) for addition and multiplication for every element, except that 0 has no reciprocal

Vectors are the elements of a vector space (or linear space) defined over a number field F. A vector space V:

• is a set of elements on which the operations of vector addition and scalar multiplication are defined and satisfy certain axioms
• is closed under these operations
• includes an identity element (the vector 0) for vector addition

We will write vectors in bold italic face (or underlined in handwriting). Scalar multiplication means multiplying a vector x by a scalar α to obtain the vector αx. Note that 0x = 0 and 1x = x.

The basic example of a vector space is Fⁿ. An element of Fⁿ is a list of n scalars, (x1, ..., xn), where xi ∈ F. This is called an n-tuple. Vector addition and scalar multiplication are defined componentwise:

(x1, ..., xn) + (y1, ..., yn) = (x1 + y1, ..., xn + yn)
α(x1, ..., xn) = (αx1, ..., αxn)

Hence we have Rⁿ and Cⁿ.

Notes:

• vector multiplication is not defined in general
• R² is not quite the same as C because C has a rule for multiplication
• R³ is not quite the same as physical space because physical space has a rule for the distance between two points (i.e. Pythagoras's theorem, if physical space is approximated as Euclidean)

The formal axioms of number fields and vector spaces can be found in books on linear algebra.

5.2.2 Span and linear dependence

Let S = {e1, e2, ..., em} be a subset of vectors in V. A linear combination of S is any vector of the form

e1 x1 + e2 x2 + · · · + em xm

where x1, ..., xm are scalars.

The span of S is the set of all vectors that are linear combinations of S. If the span of S is the entire vector space V, then S is said to span V.

The vectors of S are said to be linearly independent if no non-trivial linear combination of S equals zero, i.e. if

e1 x1 + e2 x2 + · · · + em xm = 0    implies    x1 = x2 = · · · = xm = 0

If, on the other hand, the vectors are linearly dependent, then such an equation holds for some non-trivial values of the coefficients x1, ..., xm. Suppose that xk is one of the non-zero coefficients. Then ek can be expressed as a linear combination of the other vectors:

ek = −Σ_{i≠k} ei (xi/xk)

Linear independence therefore means that none of the vectors is a linear combination of the others.

Notes:

• if an additional vector is added to a spanning set, it remains a spanning set
• if a vector is removed from a linearly independent set, the set remains linearly independent

5.2.3 Basis and dimension

A basis for a vector space V is a subset of vectors {e1, e2, ..., en} that spans V and is linearly independent.

The properties of bases (proved in books on linear algebra) are:

• all bases have the same number of elements, n, which is called the dimension of V
• any n linearly independent vectors in V form a basis for V
• any vector x ∈ V can be written in a unique way as x = e1 x1 + · · · + en xn, and the scalars xi are called the components of x with respect to the basis {e1, e2, ..., en}

Vector spaces can have infinite dimension, e.g. the set of functions defined on the interval 0 < x < 2π and having Fourier series

f(x) = Σ_{n=−∞}^{∞} fn e^{inx}

Here f(x) is the 'vector' and fn are its 'components' with respect to the 'basis' of functions e^{inx}. Functional analysis deals with such infinite-dimensional vector spaces.

5.2.4 Examples

⊲ Example (1): three-dimensional real space, R³:
vectors: triples (x, y, z) of real numbers
possible basis: {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
(x, y, z) = (1, 0, 0)x + (0, 1, 0)y + (0, 0, 1)z

⊲ Example (2): the complex plane, C:
vectors: complex numbers z
EITHER (a) a one-dimensional vector space over C
possible basis: {1}
z = 1 · z
OR (b) a two-dimensional vector space over R (supplemented by a multiplication rule)
possible basis: {1, i}
z = 1 · x + i · y

⊲ Example (3): real 2 × 2 symmetric matrices:
vectors: matrices of the form (x y; y z), with x, y, z ∈ R
a three-dimensional vector space over R
possible basis: (1 0; 0 0), (0 1; 1 0), (0 0; 0 1)
(x y; y z) = (1 0; 0 0)x + (0 1; 1 0)y + (0 0; 0 1)z
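Components with respect to a given basis can be found by solving a linear system. A minimal sketch, assuming numpy, with a deliberately non-orthogonal basis of R³ (all three vectors are our own example):

import numpy as np

E = np.column_stack([[1., 0., 0.], [1., 1., 0.], [0., 1., 1.]])  # columns e1, e2, e3
x = np.array([2., 3., 4.])

comps = np.linalg.solve(E, x)   # components xj such that e_j x_j = x
print(comps)                    # [3. -1. 4.]
print(E @ comps)                # reproduces x, confirming uniqueness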

5.2.5 Change of basis

The same vector has different components with respect to different bases. Consider two bases $\{e_i\}$ and $\{e'_i\}$. Since they are bases, we can write the vectors of one set in terms of the other:
$$e_j = e'_i R_{ij}$$
using the summation convention (sum over $i$ from 1 to $n$). The $n \times n$ array of numbers $R_{ij}$ is the transformation matrix between the two bases. $R_{ij}$ is the $i$th component of the vector $e_j$ with respect to the primed basis.

There is no need for a basis to be either orthogonal or normalized; in fact, we have not yet defined what these terms mean.

Example: in $\mathbb{R}^3$, two bases are $\{e_i\} = \{(1,0,0), (0,1,0), (0,0,1)\}$ and $\{e'_i\} = \{(0,1,0), (0,0,1), (1,0,0)\}$. Then
$$R = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$$

The representation of a vector $x$ in the two bases is
$$x = e_j x_j = e'_i x'_i$$
where $x_j$ and $x'_i$ are the components. Thus
$$e'_i x'_i = e_j x_j = e'_i R_{ij} x_j$$
from which we deduce the transformation law for vector components:
$$x'_i = R_{ij} x_j$$
Note that the transformation laws for basis vectors and vector components are 'opposite' so that the vector $x$ itself is unchanged by the transformation.

5.3 Matrices

5.3.1 Array viewpoint

A matrix can be regarded simply as a rectangular array of numbers:
$$\mathsf{A} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{pmatrix}$$
We will write matrices and vectors in sans serif face when adopting this viewpoint. The rules for matrix addition and multiplication are
$$(\mathsf{A} + \mathsf{B})_{ij} = A_{ij} + B_{ij}, \qquad (\mathsf{A}\mathsf{B})_{ij} = A_{ik} B_{kj}$$
The transpose of an $m \times n$ matrix is the $n \times m$ matrix
$$(\mathsf{A}^{\mathrm{T}})_{ij} = A_{ji}$$
The transformation matrix $R_{ij}$ described above is an example of a square matrix ($m = n$).

This viewpoint is equivalent to thinking of a vector as an $n$-tuple, or, more correctly, as either a column matrix or a row matrix:
$$\mathsf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad \mathsf{x}^{\mathrm{T}} = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix}$$
The superscript 'T' denotes the transpose. Using the suffix notation and the summation convention, the familiar rules for multiplying a matrix by a vector on the right or left are
$$(\mathsf{A}\mathsf{x})_i = A_{ij} x_j, \qquad (\mathsf{x}^{\mathrm{T}}\mathsf{A})_j = x_i A_{ij}$$
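To make the index bookkeeping concrete, here is a minimal numerical sketch of these relations for the $\mathbb{R}^3$ example above. The use of numpy and the array names `E`, `Ep`, `R` are our additions, not part of the notes.

```python
import numpy as np

# Columns of E are the unprimed basis vectors e_j; columns of Ep are the
# primed basis vectors e'_i, as in the R^3 example in the text.
E = np.eye(3)
Ep = np.array([[0., 0., 1.],
               [1., 0., 0.],
               [0., 1., 0.]])

# e_j = e'_i R_ij is the matrix equation E = Ep @ R, so R solves this system.
R = np.linalg.solve(Ep, E)
print(R)   # [[0,1,0],[0,0,1],[1,0,0]], as found in the text

# Components transform 'oppositely' to basis vectors: x'_i = R_ij x_j,
# while the vector itself, e_j x_j = e'_i x'_i, is unchanged.
x = np.array([1., 2., 3.])
xp = R @ x
assert np.allclose(E @ x, Ep @ xp)
```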

5.3.2 Linear operators

A linear operator $A$ on a vector space $V$ acts on elements of $V$ to produce other elements of $V$. The action of $A$ on $x$ is written $A(x)$ or just $Ax$. The property of linearity means:
$$A(\alpha x) = \alpha Ax \quad\text{for scalar } \alpha$$
$$A(x + y) = Ax + Ay$$

Notes:
• a linear operator has an existence without reference to any basis
• the operation can be thought of as a linear transformation or mapping of the space $V$ (a simple example is a rotation of three-dimensional space)
• a more general idea, not considered here, is that a linear operator can act on vectors of one space $V$ to produce vectors of another space $V'$, possibly of a different dimension

The components of $A$ with respect to a basis $\{e_i\}$ are defined by the action of $A$ on the basis vectors:
$$A e_j = e_i A_{ij}$$
The components form a square matrix. Since $A$ is a linear operator, a knowledge of its action on a basis is sufficient to determine its action on any vector $x$:
$$Ax = A(e_j x_j) = x_j (A e_j) = x_j (e_i A_{ij}) = e_i A_{ij} x_j$$
or $(Ax)_i = A_{ij} x_j$. This corresponds to the rule for multiplying a matrix by a vector. Therefore a matrix can be thought of as the components of a linear operator with respect to a given basis, just as a column matrix or $n$-tuple can be thought of as the components of a vector with respect to a given basis. We write $(\mathsf{A})_{ij} = A_{ij}$ and $(\mathsf{x})_i = x_i$.

The sum of two linear operators is defined by
$$(A + B)x = Ax + Bx = e_i (A_{ij} + B_{ij}) x_j$$
The product, or composition, of two linear operators has the action
$$(AB)x = A(Bx) = A(e_k B_{kj} x_j) = (A e_k) B_{kj} x_j = e_i A_{ik} B_{kj} x_j$$
The components therefore satisfy the rules of matrix addition and multiplication:
$$(A + B)_{ij} = A_{ij} + B_{ij}, \qquad (AB)_{ij} = A_{ik} B_{kj}$$
Recall that matrix multiplication is not commutative, so $BA \ne AB$ in general.

5.3.3 Change of basis again

When changing basis, we wrote one set of basis vectors in terms of the other:
$$e_j = e'_i R_{ij}$$
We could equally have written
$$e'_j = e_i S_{ij}$$
where $S$ is the matrix of the inverse transformation. Substituting one relation into the other (and relabelling indices where required), we obtain
$$e'_j = e'_k R_{ki} S_{ij}$$

and similarly
$$e_j = e_k S_{ki} R_{ij}$$
These can only be true if
$$R_{ki} S_{ij} = S_{ki} R_{ij} = \delta_{kj}$$
or, in matrix notation,
$$\mathsf{R}\mathsf{S} = \mathsf{S}\mathsf{R} = \mathsf{1}$$
Therefore $\mathsf{R}$ and $\mathsf{S}$ are inverses: $\mathsf{R} = \mathsf{S}^{-1}$ and $\mathsf{S} = \mathsf{R}^{-1}$. The transformation law for vector components, $x'_i = R_{ij} x_j$, can be written in matrix notation as
$$\mathsf{x}' = \mathsf{R}\mathsf{x}$$
with the inverse relation
$$\mathsf{x} = \mathsf{R}^{-1}\mathsf{x}'$$

How do the components of a linear operator $A$ transform under a change of basis? We require, for any vector $x$,
$$Ax = e_i A_{ij} x_j = e'_i A'_{ij} x'_j$$
Using $e_j = e'_i R_{ij}$ and relabelling indices, we find
$$e'_k R_{ki} A_{ij} x_j = e'_k A'_{kj} x'_j$$
$$R_{ki} A_{ij} x_j = A'_{kj} x'_j$$
$$\mathsf{R}\mathsf{A}(\mathsf{R}^{-1}\mathsf{x}') = \mathsf{A}'\mathsf{x}'$$
Since this is true for any $\mathsf{x}'$, we have
$$\mathsf{A}' = \mathsf{R}\mathsf{A}\mathsf{R}^{-1}$$
These transformation laws are consistent, e.g.
$$\mathsf{A}'\mathsf{x}' = \mathsf{R}\mathsf{A}\mathsf{R}^{-1}\mathsf{R}\mathsf{x} = \mathsf{R}\mathsf{A}\mathsf{x} = (\mathsf{A}\mathsf{x})'$$
$$\mathsf{A}'\mathsf{B}' = \mathsf{R}\mathsf{A}\mathsf{R}^{-1}\mathsf{R}\mathsf{B}\mathsf{R}^{-1} = \mathsf{R}\mathsf{A}\mathsf{B}\mathsf{R}^{-1} = (\mathsf{A}\mathsf{B})'$$
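A quick numerical sanity check of these consistency relations, using a random invertible change of basis (numpy and the variable names are our additions):

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.standard_normal((3, 3))   # a generic (invertible) transformation matrix
A = rng.standard_normal((3, 3))   # operator components in the old basis
B = rng.standard_normal((3, 3))
x = rng.standard_normal(3)

Rinv = np.linalg.inv(R)
Ap, Bp, xp = R @ A @ Rinv, R @ B @ Rinv, R @ x   # A' = R A R^-1, x' = R x

assert np.allclose(Ap @ xp, R @ (A @ x))         # (Ax)' = A'x'
assert np.allclose(Ap @ Bp, R @ (A @ B) @ Rinv)  # (AB)' = A'B'
```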

5.4 Inner product (scalar product)

This is used to give a meaning to lengths and angles in a vector space.

5.4.1 Definition

An inner product (or scalar product) is a scalar function $\langle x|y\rangle$ of two vectors $x$ and $y$. Other common notations are $(x, y)$, $\langle x, y\rangle$ and $x \cdot y$. An inner product must:

• be linear in the second argument:
$$\langle x|\alpha y\rangle = \alpha\langle x|y\rangle \quad\text{for scalar } \alpha$$
$$\langle x|y + z\rangle = \langle x|y\rangle + \langle x|z\rangle$$
• have Hermitian symmetry:
$$\langle y|x\rangle = \langle x|y\rangle^*$$
• be positive definite:
$$\langle x|x\rangle \geq 0 \quad\text{with equality if and only if } x = 0$$

Notes:
• an inner product has an existence without reference to any basis
• the star is needed to ensure that $\langle x|x\rangle$ is real
• the star is not needed in a real vector space
• it follows that the inner product is antilinear in the first argument:
$$\langle \alpha x|y\rangle = \alpha^*\langle x|y\rangle \quad\text{for scalar } \alpha, \qquad \langle x + y|z\rangle = \langle x|z\rangle + \langle y|z\rangle$$

The standard (Euclidean) inner product on $\mathbb{R}^n$ is the 'dot product'
$$\langle x|y\rangle = x \cdot y = x_i y_i$$
which is generalized to $\mathbb{C}^n$ as
$$\langle x|y\rangle = x_i^* y_i$$
The star is needed for Hermitian symmetry. We will see later that any other inner product on $\mathbb{R}^n$ or $\mathbb{C}^n$ can be reduced to this one by a suitable choice of basis.

$\langle x|x\rangle^{1/2}$ is the 'length' or 'norm' of the vector $x$, written $|x|$ or $\|x\|$. In one dimension this agrees with the meaning of 'absolute value' if we are using the dot product.

Warning. The convention that the inner product is linear in the second argument and antilinear in the first is the usual convention in physics and applied mathematics. In pure mathematics it is usually the other way around.

5.4.2 The Cauchy–Schwarz inequality

$$|\langle x|y\rangle|^2 \leq \langle x|x\rangle\langle y|y\rangle$$
or equivalently $|\langle x|y\rangle| \leq |x||y|$, with equality if and only if $x = \alpha y$ (i.e. the vectors are parallel or antiparallel).

Proof: We assume that $x \ne 0$ and $y \ne 0$, otherwise the inequality is trivial. We consider the non-negative quantity
$$0 \leq \langle x - \alpha y|x - \alpha y\rangle = \langle x|x\rangle - \alpha^*\langle y|x\rangle - \alpha\langle x|y\rangle + \alpha\alpha^*\langle y|y\rangle = |x|^2 + |\alpha|^2|y|^2 - 2\,\mathrm{Re}\,(\alpha\langle x|y\rangle)$$
This result holds for any scalar $\alpha$. First choose the phase of $\alpha$ such that $\alpha\langle x|y\rangle$ is real and non-negative and therefore equal to $|\alpha||\langle x|y\rangle|$:
$$|x|^2 + |\alpha|^2|y|^2 - 2|\alpha||\langle x|y\rangle| \geq 0$$
i.e.
$$(|x| - |\alpha||y|)^2 + 2|\alpha||x||y| - 2|\alpha||\langle x|y\rangle| \geq 0$$
Now choose the modulus of $\alpha$ such that $|\alpha| = |x|/|y|$, which is finite and non-zero; the first term then vanishes. Divide the inequality by $2|\alpha|$ to obtain
$$|x||y| - |\langle x|y\rangle| \geq 0$$
as required.

In a real vector space, the Cauchy–Schwarz inequality allows us to define the angle $\theta$ between two vectors through
$$\langle x|y\rangle = |x||y|\cos\theta$$
If $\langle x|y\rangle = 0$ (in any vector space) the vectors are said to be orthogonal.

5.4.3 Inner product and bases

The inner product is antilinear in the first argument and linear in the second. Let
$$\langle e_i|e_j\rangle = G_{ij}$$
Then
$$\langle x|y\rangle = \langle e_i x_i|e_j y_j\rangle = x_i^* y_j \langle e_i|e_j\rangle = G_{ij} x_i^* y_j$$
A knowledge of the inner product of the basis vectors is therefore sufficient to determine the inner product of any two vectors $x$ and $y$. The $n \times n$ array of numbers $G_{ij}$ are the metric coefficients of the basis $\{e_i\}$.

The Hermitian symmetry of the inner product implies that
$$G_{ij} y_i^* x_j = (G_{ij} x_i^* y_j)^* = G_{ij}^* x_i y_j^* = G_{ji}^* y_i^* x_j$$
for all vectors $x$ and $y$, and therefore
$$G_{ji} = G_{ij}^*$$
We say that the matrix $\mathsf{G}$ is Hermitian.

An orthonormal basis is one for which $\langle e_i|e_j\rangle = \delta_{ij}$, in which case the inner product is the standard one
$$\langle x|y\rangle = x_i^* y_i$$
We will see later that it is possible to transform any basis into an orthonormal one.
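Before moving on, a quick numerical check of the Cauchy–Schwarz inequality for the standard inner product on $\mathbb{C}^n$ (numpy is our addition; `np.vdot` conjugates its first argument, which matches the convention used here):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)

# |<x|y>| <= |x||y| for generic vectors ...
assert abs(np.vdot(x, y)) <= np.linalg.norm(x) * np.linalg.norm(y)

# ... with equality when y is parallel to x, e.g. y = 2i x.
y = 2j * x
assert np.isclose(abs(np.vdot(x, y)), np.linalg.norm(x) * np.linalg.norm(y))
```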

5.5 Hermitian conjugate

5.5.1 Definition and simple properties

We define the Hermitian conjugate of a matrix to be the complex conjugate of its transpose:
$$\mathsf{A}^{\dagger} = (\mathsf{A}^{\mathrm{T}})^*, \qquad (\mathsf{A}^{\dagger})_{ij} = A_{ji}^*$$
This applies to general rectangular matrices, e.g.
$$\begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \end{pmatrix}^{\dagger} = \begin{pmatrix} A_{11}^* & A_{21}^* \\ A_{12}^* & A_{22}^* \\ A_{13}^* & A_{23}^* \end{pmatrix}$$
The Hermitian conjugate of a column matrix is a row matrix:
$$\mathsf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad \mathsf{x}^{\dagger} = \begin{pmatrix} x_1^* & x_2^* & \cdots & x_n^* \end{pmatrix}$$
Note that $(\mathsf{A}^{\dagger})^{\dagger} = \mathsf{A}$. The Hermitian conjugate of a product of matrices is
$$(\mathsf{A}\mathsf{B})^{\dagger} = \mathsf{B}^{\dagger}\mathsf{A}^{\dagger}$$
because
$$((\mathsf{A}\mathsf{B})^{\dagger})_{ij} = (\mathsf{A}\mathsf{B})_{ji}^* = (A_{jk} B_{ki})^* = B_{ki}^* A_{jk}^* = (\mathsf{B}^{\dagger})_{ik}(\mathsf{A}^{\dagger})_{kj} = (\mathsf{B}^{\dagger}\mathsf{A}^{\dagger})_{ij}$$
Note the reversal of the order of the product, as also occurs with the transpose or inverse of a product of matrices. This result extends to arbitrary products of matrices and vectors, e.g.
$$(\mathsf{A}\mathsf{B}\mathsf{C}\mathsf{x})^{\dagger} = \mathsf{x}^{\dagger}\mathsf{C}^{\dagger}\mathsf{B}^{\dagger}\mathsf{A}^{\dagger}, \qquad (\mathsf{x}^{\dagger}\mathsf{A}\mathsf{y})^{\dagger} = \mathsf{y}^{\dagger}\mathsf{A}^{\dagger}\mathsf{x}$$
In the latter example, if $\mathsf{x}$ and $\mathsf{y}$ are vectors, each side of the equation is a scalar (a complex number). The Hermitian conjugate of a scalar is just the complex conjugate.
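A minimal check of these rules for rectangular complex matrices (numpy and the helper name `dag` are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
B = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))

dag = lambda M: M.conj().T   # Hermitian conjugate: conjugate of the transpose

assert np.allclose(dag(dag(A)), A)               # (A†)† = A
assert np.allclose(dag(A @ B), dag(B) @ dag(A))  # (AB)† = B†A†: order reversed
```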

5.5.2 Relationship with inner product

We have seen that the inner product of two vectors is
$$\langle x|y\rangle = G_{ij} x_i^* y_j = x_i^* G_{ij} y_j$$
where $G_{ij}$ are the metric coefficients. This can also be written
$$\langle x|y\rangle = \mathsf{x}^{\dagger}\mathsf{G}\mathsf{y}$$
and the Hermitian conjugate of this equation is
$$\langle x|y\rangle^* = \mathsf{y}^{\dagger}\mathsf{G}^{\dagger}\mathsf{x}$$
Similarly $\langle y|x\rangle = \mathsf{y}^{\dagger}\mathsf{G}\mathsf{x}$. The Hermitian symmetry of the inner product requires $\langle y|x\rangle = \langle x|y\rangle^*$, i.e.
$$\mathsf{y}^{\dagger}\mathsf{G}\mathsf{x} = \mathsf{y}^{\dagger}\mathsf{G}^{\dagger}\mathsf{x}$$
which is satisfied for all vectors $\mathsf{x}$ and $\mathsf{y}$ provided that $\mathsf{G}$ is an Hermitian matrix: $\mathsf{G}^{\dagger} = \mathsf{G}$. If the basis is orthonormal, then $G_{ij} = \delta_{ij}$ and we have simply
$$\langle x|y\rangle = \mathsf{x}^{\dagger}\mathsf{y}$$

5.5.3 Adjoint operator

The adjoint of a linear operator $A$ with respect to a given inner product is a linear operator $A^{\dagger}$ satisfying
$$\langle A^{\dagger}x|y\rangle = \langle x|Ay\rangle$$
for all vectors $x$ and $y$. The components of the operator $A$ with respect to a basis are given by a square matrix $\mathsf{A}$. With the standard inner product this implies
$$\left((A^{\dagger})_{ij} x_j\right)^* y_i = x_i^* A_{ij} y_j$$
i.e. (relabelling indices on the right-hand side)
$$(A^{\dagger})_{ij}^* x_j^* y_i = A_{ji} x_j^* y_i \quad\Rightarrow\quad (A^{\dagger})_{ij}^* = A_{ji} \quad\Rightarrow\quad (A^{\dagger})_{ij} = A_{ji}^*$$
The components of the adjoint operator (with respect to the standard inner product) are therefore given by the Hermitian conjugate matrix $\mathsf{A}^{\dagger}$.

5.5.4 Special square matrices

The following types of special matrices arise commonly in scientific applications. A symmetric matrix is equal to its transpose:
$$\mathsf{A}^{\mathrm{T}} = \mathsf{A} \quad\text{or}\quad A_{ji} = A_{ij}$$
An antisymmetric (or skew-symmetric) matrix satisfies
$$\mathsf{A}^{\mathrm{T}} = -\mathsf{A} \quad\text{or}\quad A_{ji} = -A_{ij}$$
An orthogonal matrix is one whose transpose is equal to its inverse:
$$\mathsf{A}^{\mathrm{T}} = \mathsf{A}^{-1} \quad\text{or}\quad \mathsf{A}\mathsf{A}^{\mathrm{T}} = \mathsf{A}^{\mathrm{T}}\mathsf{A} = \mathsf{1}$$

These ideas generalize to a complex vector space as follows. An Hermitian matrix is equal to its Hermitian conjugate:
$$\mathsf{A}^{\dagger} = \mathsf{A} \quad\text{or}\quad A_{ji}^* = A_{ij}$$
An anti-Hermitian (or skew-Hermitian) matrix satisfies
$$\mathsf{A}^{\dagger} = -\mathsf{A} \quad\text{or}\quad A_{ji}^* = -A_{ij}$$
A unitary matrix is one whose Hermitian conjugate is equal to its inverse:
$$\mathsf{A}^{\dagger} = \mathsf{A}^{-1} \quad\text{or}\quad \mathsf{A}\mathsf{A}^{\dagger} = \mathsf{A}^{\dagger}\mathsf{A} = \mathsf{1}$$
In addition, a normal matrix is one that commutes with its Hermitian conjugate:
$$\mathsf{A}\mathsf{A}^{\dagger} = \mathsf{A}^{\dagger}\mathsf{A}$$
It is easy to verify that Hermitian, anti-Hermitian and unitary matrices are all normal.

5.6 Eigenvalues and eigenvectors

5.6.1 Basic properties

An eigenvector of a linear operator $A$ is a non-zero vector $x$ satisfying
$$Ax = \lambda x$$
for some scalar $\lambda$, the corresponding eigenvalue. The equivalent statement in a matrix representation is
$$\mathsf{A}\mathsf{x} = \lambda\mathsf{x} \quad\text{or}\quad (\mathsf{A} - \lambda\mathsf{1})\mathsf{x} = \mathsf{0}$$
where $\mathsf{A}$ is an $n \times n$ square matrix.

This equation states that a non-trivial linear combination of the columns of the matrix $(\mathsf{A} - \lambda\mathsf{1})$ is equal to zero, i.e. that the columns of the matrix are linearly dependent. This is equivalent to the statement
$$\det(\mathsf{A} - \lambda\mathsf{1}) = 0$$
which is the characteristic equation of the matrix $\mathsf{A}$.

To compute the eigenvalues and eigenvectors, we first solve the characteristic equation. This is a polynomial equation of degree $n$ in $\lambda$ and therefore has $n$ roots, although not necessarily distinct, and these are complex in general. The roots are the eigenvalues of $\mathsf{A}$. The corresponding eigenvectors are obtained by solving the simultaneous equations found in the rows of $\mathsf{A}\mathsf{x} = \lambda\mathsf{x}$.

If the $n$ roots are distinct, then there are $n$ linearly independent eigenvectors, each of which is determined uniquely up to an arbitrary multiplicative constant. If the roots are not all distinct, the repeated eigenvalues are said to be degenerate. If an eigenvalue $\lambda$ occurs $m$ times, there may be any number between 1 and $m$ of linearly independent eigenvectors corresponding to it. Any linear combination of these is also an eigenvector, and the space spanned by such vectors is called an eigenspace.

Example (1):
$$\mathsf{A} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$
characteristic equation:
$$\begin{vmatrix} 0-\lambda & 0 \\ 0 & 0-\lambda \end{vmatrix} = \lambda^2 = 0$$
eigenvalues $0, 0$; eigenvectors:
$$\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \quad\Rightarrow\quad \text{any non-zero } \begin{pmatrix} x \\ y \end{pmatrix}, \text{ e.g. } \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$
two linearly independent eigenvectors: a two-dimensional eigenspace corresponding to eigenvalue 0.

Example (2):
$$\mathsf{A} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$
characteristic equation:
$$\begin{vmatrix} 0-\lambda & 1 \\ 0 & 0-\lambda \end{vmatrix} = \lambda^2 = 0$$
eigenvalues $0, 0$; eigenvectors:
$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \quad\Rightarrow\quad y = 0 \quad\Rightarrow\quad \begin{pmatrix} x \\ y \end{pmatrix} \propto \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$
only one linearly independent eigenvector: a one-dimensional eigenspace corresponding to eigenvalue 0.
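Example (2) can be reproduced numerically; the sketch below (numpy is our addition) shows how a so-called defective matrix reveals itself through the eigenvectors that a numerical routine returns:

```python
import numpy as np

# Example (2): repeated eigenvalue 0 with only one independent eigenvector.
A = np.array([[0., 1.],
              [0., 0.]])
lam, V = np.linalg.eig(A)
print(lam)   # [0. 0.]: a degenerate eigenvalue
print(V)     # both columns are (numerically) proportional to (1, 0)
```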

5.6.2 Eigenvalues and eigenvectors of Hermitian matrices

We consider two eigenvectors $\mathsf{x}$ and $\mathsf{y}$ of an Hermitian matrix $\mathsf{A}$, corresponding to eigenvalues $\lambda$ and $\mu$:
$$\mathsf{A}\mathsf{x} = \lambda\mathsf{x} \qquad (1)$$
$$\mathsf{A}\mathsf{y} = \mu\mathsf{y} \qquad (2)$$
The Hermitian conjugate of equation (2) is (since $\mathsf{A}^{\dagger} = \mathsf{A}$)
$$\mathsf{y}^{\dagger}\mathsf{A} = \mu^*\mathsf{y}^{\dagger} \qquad (3)$$
Using equations (1) and (3) we can construct two expressions for $\mathsf{y}^{\dagger}\mathsf{A}\mathsf{x}$:
$$\mathsf{y}^{\dagger}\mathsf{A}\mathsf{x} = \lambda\,\mathsf{y}^{\dagger}\mathsf{x} = \mu^*\,\mathsf{y}^{\dagger}\mathsf{x} \quad\Rightarrow\quad (\lambda - \mu^*)\,\mathsf{y}^{\dagger}\mathsf{x} = 0 \qquad (4)$$

First suppose that $\mathsf{x}$ and $\mathsf{y}$ are the same eigenvector. Then $\mathsf{y} = \mathsf{x}$ and $\mu = \lambda$, so equation (4) becomes
$$(\lambda - \lambda^*)\,\mathsf{x}^{\dagger}\mathsf{x} = 0$$
Since $\mathsf{x} \ne 0$, $\mathsf{x}^{\dagger}\mathsf{x} \ne 0$ and so $\lambda^* = \lambda$. Therefore the eigenvalues of an Hermitian matrix are real. In quantum mechanics, Hermitian matrices (or operators) represent observable quantities.

Since $\mu^* = \mu$, equation (4) simplifies to
$$(\lambda - \mu)\,\mathsf{y}^{\dagger}\mathsf{x} = 0$$
If $\mathsf{x}$ and $\mathsf{y}$ are now different eigenvectors, we deduce that $\mathsf{y}^{\dagger}\mathsf{x} = 0$, provided that $\mu \ne \lambda$. Therefore the eigenvectors of an Hermitian matrix corresponding to distinct eigenvalues are orthogonal (in the standard inner product on $\mathbb{C}^n$).

Example: ⊲ Show that the eigenvalues of a unitary matrix are of unit modulus and the eigenvectors corresponding to distinct eigenvalues are orthogonal.

Let $\mathsf{A}$ be a unitary matrix: $\mathsf{A}^{\dagger} = \mathsf{A}^{-1}$. With
$$\mathsf{A}\mathsf{x} = \lambda\mathsf{x}, \qquad \mathsf{A}\mathsf{y} = \mu\mathsf{y}, \qquad \mathsf{y}^{\dagger}\mathsf{A}^{\dagger} = \mu^*\mathsf{y}^{\dagger}$$
we find
$$\mathsf{y}^{\dagger}\mathsf{A}^{\dagger}\mathsf{A}\mathsf{x} = \mu^*\lambda\,\mathsf{y}^{\dagger}\mathsf{x} = \mathsf{y}^{\dagger}\mathsf{x} \quad\Rightarrow\quad (\mu^*\lambda - 1)\,\mathsf{y}^{\dagger}\mathsf{x} = 0$$
For $\mathsf{y} = \mathsf{x}$, $\mu = \lambda$: $(|\lambda|^2 - 1)\,\mathsf{x}^{\dagger}\mathsf{x} = 0$, and since $\mathsf{x} \ne 0$ we have $|\lambda| = 1$. For different eigenvectors with $\mu \ne \lambda$: using $\mu^* = 1/\mu$,
$$\left(\frac{\lambda}{\mu} - 1\right)\mathsf{y}^{\dagger}\mathsf{x} = 0 \quad\Rightarrow\quad \mathsf{y}^{\dagger}\mathsf{x} = 0$$

5.6.3 Related results

Normal matrices (including all Hermitian, anti-Hermitian and unitary matrices) satisfy $\mathsf{A}\mathsf{A}^{\dagger} = \mathsf{A}^{\dagger}\mathsf{A}$. It can be shown that:
• the eigenvectors of normal matrices corresponding to distinct eigenvalues are orthogonal
• the eigenvalues of Hermitian, anti-Hermitian and unitary matrices are real, imaginary and of unit modulus, respectively
These results can all be proved in a similar way.

The matrices most often encountered in physical applications are real symmetric or, more generally, Hermitian matrices satisfying $\mathsf{A}^{\dagger} = \mathsf{A}$.

Note that:
• real symmetric matrices are Hermitian
• real antisymmetric matrices are anti-Hermitian
• real orthogonal matrices are unitary

Note also that a $1 \times 1$ matrix is just a number, $\lambda$, which is the eigenvalue of the matrix. To be Hermitian, anti-Hermitian or unitary, $\lambda$ must satisfy $\lambda^* = \lambda$, $\lambda^* = -\lambda$ or $\lambda^* = \lambda^{-1}$, respectively. Hermitian, anti-Hermitian and unitary matrices therefore correspond to real, imaginary and unit-modulus numbers, respectively.

There are direct correspondences between Hermitian, anti-Hermitian and unitary matrices:
• if $\mathsf{A}$ is Hermitian then $\mathrm{i}\mathsf{A}$ is anti-Hermitian (and vice versa)
• if $\mathsf{A}$ is Hermitian then
$$\exp(\mathrm{i}\mathsf{A}) = \sum_{n=0}^{\infty} \frac{(\mathrm{i}\mathsf{A})^n}{n!}$$
is unitary
[Compare the following two statements: If $z$ is a real number then $\mathrm{i}z$ is imaginary (and vice versa). If $z$ is a real number then $\exp(\mathrm{i}z)$ is of unit modulus.]

5.6.4 The degenerate case

If a repeated eigenvalue $\lambda$ occurs $m$ times, it can be shown (with some difficulty) that there are exactly $m$ corresponding linearly independent eigenvectors if the matrix is normal. It is always possible to construct an orthogonal basis within this $m$-dimensional eigenspace (e.g. by the Gram–Schmidt procedure: see Example Sheet 3, Question 2, and the sketch below). Therefore, even if the eigenvalues are degenerate, it is always possible to find $n$ mutually orthogonal eigenvectors, which form a basis for the vector space. This is called an eigenvector basis. In fact, this is possible if and only if the matrix is normal. Orthogonal eigenvectors can be normalized (divide by their norm) to make them orthonormal. Therefore an orthonormal basis can always be constructed from the eigenvectors of a normal matrix.
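A minimal sketch of the Gram–Schmidt procedure (our code, not the Example Sheet solution); the two input vectors both lie in the two-dimensional $\lambda = 1$ eigenspace of the worked example that follows:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize linearly independent complex vectors, in order."""
    basis = []
    for v in vectors:
        w = np.asarray(v, dtype=complex)
        for u in basis:
            w = w - np.vdot(u, w) * u   # remove the component along u
        basis.append(w / np.linalg.norm(w))
    return basis

e1, e2 = gram_schmidt([np.array([1., -1j, 0.]), np.array([1., -1j, 1.])])
assert abs(np.vdot(e1, e2)) < 1e-12     # the output vectors are orthonormal
```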

...1 Diagonalization of a matrix Similarity We have seen that the matrices A and A′ representing a linear operator in two diﬀerent bases are related by A′ = RAR−1 or equivalently A = R−1 A′ R where R is the transformation matrix between the two bases...7..... The columns of the similarity matrix S are just the eigenvectors of A.... ...e..7 5. This means that A and B are representations of the same linear operator in diﬀerent bases.2 Diagonalization S and some diagonal matrix 0 0 .. . .. .... i. An n × n matrix A can be diagonalized if and only if it has n linearly independent eigenvectors. The diagonal entries of the matrix Λ are just the eigenvalues of A: 5......... 2 0 1 for some invertible matrix Λ1 0 · · · 0 Λ2 · · · Λ= ..... if S−1 AS = Λ 105 106 The meaning of diagonalization is that the matrix is expressed in its simplest form by transforming it to its eigenvector basis. 0 0 ··· 5..............7. .Eigenvectors for −1 i −i −1 0 0 −x + iy = 0 λ = 1: 0 0 x y = 0 0 0 z 0 Normalized eigenvectors: 0 1 1 0 √ −i ........ S is called the similarity matrix. The relation (1) is called a similarity transformation. Λn ... . ..... . .. .......... A square matrix A is said to be diagonalizable if it is similar to a diagonal matrix. Two square matrices A and B are said to be similar if they are related by B = S−1 AS (1) where S is some invertible matrix...

5.7.3 Diagonalization of a normal matrix

An $n \times n$ normal matrix always has $n$ linearly independent eigenvectors and is therefore diagonalizable. Furthermore, the eigenvectors can be chosen to be orthonormal. In this case the similarity matrix is unitary, because a unitary matrix is precisely one whose columns are orthonormal vectors (consider $\mathsf{U}^{\dagger}\mathsf{U} = \mathsf{1}$). Therefore a normal matrix can be diagonalized by a unitary transformation:
$$\mathsf{U}^{\dagger}\mathsf{A}\mathsf{U} = \mathsf{\Lambda}$$
where $\mathsf{U}$ is a unitary matrix whose columns are the orthonormal eigenvectors of $\mathsf{A}$, and $\mathsf{\Lambda}$ is a diagonal matrix whose entries are the eigenvalues of $\mathsf{A}$.

In particular, this result applies to the important cases of real symmetric and, more generally, Hermitian matrices. A real symmetric matrix is normal and has real eigenvalues and real orthogonal eigenvectors. Therefore a real symmetric matrix can be diagonalized by a real orthogonal transformation. Note that this is not generally true of real antisymmetric or orthogonal matrices, because their eigenvalues and eigenvectors are not generally real. They can, however, be diagonalized by complex unitary transformations.

Example: ⊲ Diagonalize the Hermitian matrix
$$\begin{pmatrix} 0 & \mathrm{i} & 0 \\ -\mathrm{i} & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
Using the previously obtained eigenvalues and eigenvectors:
$$\begin{pmatrix} 1/\sqrt{2} & -\mathrm{i}/\sqrt{2} & 0 \\ 1/\sqrt{2} & \mathrm{i}/\sqrt{2} & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & \mathrm{i} & 0 \\ -\mathrm{i} & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} & 0 \\ \mathrm{i}/\sqrt{2} & -\mathrm{i}/\sqrt{2} & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

5.7.4 Orthogonal and unitary transformations

The transformation matrix between two bases $\{e_i\}$ and $\{e'_i\}$ has components $R_{ij}$ defined by
$$e_j = e'_i R_{ij}$$
The condition for the first basis $\{e_i\}$ to be orthonormal is $\mathsf{e}_i^{\dagger}\mathsf{e}_j = \delta_{ij}$:
$$(\mathsf{e}'_k R_{ki})^{\dagger}(\mathsf{e}'_l R_{lj}) = \delta_{ij}$$
$$R_{ki}^* R_{lj}\,\mathsf{e}'^{\dagger}_k\mathsf{e}'_l = \delta_{ij}$$
If the second basis is also orthonormal, then $\mathsf{e}'^{\dagger}_k\mathsf{e}'_l = \delta_{kl}$ and the condition becomes
$$R_{ki}^* R_{kj} = \delta_{ij} \quad\text{i.e.}\quad \mathsf{R}^{\dagger}\mathsf{R} = \mathsf{1}$$
Therefore the transformation between orthonormal bases is described by a unitary matrix. In a real vector space, an orthogonal matrix performs this task. In $\mathbb{R}^2$ or $\mathbb{R}^3$ an orthogonal matrix corresponds to a rotation and/or reflection.
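The unitary diagonalization in the example above can be verified numerically; `np.linalg.eigh`, which is designed for Hermitian matrices, returns the orthonormal eigenvectors directly (numpy is our addition):

```python
import numpy as np

# The Hermitian matrix from the example above.
A = np.array([[0., 1j, 0.],
              [-1j, 0., 0.],
              [0., 0., 1.]])
lam, U = np.linalg.eigh(A)   # eigenvalues in ascending order: [-1, 1, 1]

assert np.allclose(U.conj().T @ U, np.eye(3))         # U is unitary
assert np.allclose(U.conj().T @ A @ U, np.diag(lam))  # U†AU = diag(-1, 1, 1)
```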

5.7.5 Uses of diagonalization

Certain operations on (diagonalizable) matrices are more easily carried out using the representation
$$\mathsf{S}^{-1}\mathsf{A}\mathsf{S} = \mathsf{\Lambda} \quad\text{or}\quad \mathsf{A} = \mathsf{S}\mathsf{\Lambda}\mathsf{S}^{-1}$$
Examples:
$$\mathsf{A}^m = \mathsf{S}\mathsf{\Lambda}\mathsf{S}^{-1}\,\mathsf{S}\mathsf{\Lambda}\mathsf{S}^{-1}\cdots\mathsf{S}\mathsf{\Lambda}\mathsf{S}^{-1} = \mathsf{S}\mathsf{\Lambda}^m\mathsf{S}^{-1}$$
$$\det(\mathsf{A}) = \det(\mathsf{S}\mathsf{\Lambda}\mathsf{S}^{-1}) = \det(\mathsf{S})\det(\mathsf{\Lambda})\det(\mathsf{S}^{-1}) = \det(\mathsf{\Lambda})$$
$$\mathrm{tr}(\mathsf{A}) = \mathrm{tr}(\mathsf{S}\mathsf{\Lambda}\mathsf{S}^{-1}) = \mathrm{tr}(\mathsf{\Lambda}\mathsf{S}^{-1}\mathsf{S}) = \mathrm{tr}(\mathsf{\Lambda})$$
$$\mathrm{tr}(\mathsf{A}^m) = \mathrm{tr}(\mathsf{S}\mathsf{\Lambda}^m\mathsf{S}^{-1}) = \mathrm{tr}(\mathsf{\Lambda}^m\mathsf{S}^{-1}\mathsf{S}) = \mathrm{tr}(\mathsf{\Lambda}^m)$$
Here we use the properties of the determinant and trace:
$$\det(\mathsf{A}\mathsf{B}) = \det(\mathsf{A})\det(\mathsf{B})$$
$$\mathrm{tr}(\mathsf{A}\mathsf{B}) = (\mathsf{A}\mathsf{B})_{ii} = A_{ij}B_{ji} = B_{ji}A_{ij} = (\mathsf{B}\mathsf{A})_{jj} = \mathrm{tr}(\mathsf{B}\mathsf{A})$$
Note that diagonal matrices are multiplied very easily (i.e. componentwise). Also, the determinant and trace of a diagonal matrix are just the product and sum, respectively, of its elements. Therefore
$$\det(\mathsf{A}) = \prod_{i=1}^{n} \lambda_i, \qquad \mathrm{tr}(\mathsf{A}) = \sum_{i=1}^{n} \lambda_i$$
In fact these two statements are true for all matrices (whether or not they are diagonalizable), as follows from the product and sum of roots in the characteristic equation
$$\det(\mathsf{A} - \lambda\mathsf{1}) = \det(\mathsf{A}) + \cdots + \mathrm{tr}(\mathsf{A})(-\lambda)^{n-1} + (-\lambda)^n = 0$$

5.8 Quadratic and Hermitian forms

5.8.1 Quadratic form

The quadratic form associated with a real symmetric matrix $\mathsf{A}$ is
$$Q(\mathsf{x}) = \mathsf{x}^{\mathrm{T}}\mathsf{A}\mathsf{x} = x_i A_{ij} x_j = A_{ij} x_i x_j$$
$Q$ is a homogeneous quadratic function of $(x_1, x_2, \dots, x_n)$, i.e. $Q(\alpha\mathsf{x}) = \alpha^2 Q(\mathsf{x})$. In fact, any homogeneous quadratic function is the quadratic form of a symmetric matrix, e.g.
$$Q = 2x^2 + 4xy + 5y^2 = \begin{pmatrix} x & y \end{pmatrix}\begin{pmatrix} 2 & 2 \\ 2 & 5 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \mathsf{x}^{\mathrm{T}}\mathsf{A}\mathsf{x}$$
The $xy$ term is split equally between the off-diagonal matrix elements.

$\mathsf{A}$ can be diagonalized by a real orthogonal transformation: $\mathsf{S}^{\mathrm{T}}\mathsf{A}\mathsf{S} = \mathsf{\Lambda}$ with $\mathsf{S}^{\mathrm{T}} = \mathsf{S}^{-1}$. The vector $\mathsf{x}$ transforms according to $\mathsf{x} = \mathsf{S}\mathsf{x}'$, so
$$Q = \mathsf{x}^{\mathrm{T}}\mathsf{A}\mathsf{x} = (\mathsf{x}'^{\mathrm{T}}\mathsf{S}^{\mathrm{T}})(\mathsf{S}\mathsf{\Lambda}\mathsf{S}^{\mathrm{T}})(\mathsf{S}\mathsf{x}') = \mathsf{x}'^{\mathrm{T}}\mathsf{\Lambda}\mathsf{x}'$$
The quadratic form is therefore reduced to a sum of squares:
$$Q = \sum_{i=1}^{n} \lambda_i x_i'^2$$

Example: ⊲ Diagonalize the quadratic form $Q = 2x^2 + 4xy + 5y^2$.

The eigenvalues of $\mathsf{A}$ are 1 and 6, and the corresponding normalized eigenvectors are
$$\frac{1}{\sqrt{5}}\begin{pmatrix} 2 \\ -1 \end{pmatrix}, \qquad \frac{1}{\sqrt{5}}\begin{pmatrix} 1 \\ 2 \end{pmatrix}$$

(calculation omitted). The diagonalization of $\mathsf{A}$ is $\mathsf{S}^{\mathrm{T}}\mathsf{A}\mathsf{S} = \mathsf{\Lambda}$ with
$$\mathsf{S} = \begin{pmatrix} 2/\sqrt{5} & 1/\sqrt{5} \\ -1/\sqrt{5} & 2/\sqrt{5} \end{pmatrix}, \qquad \mathsf{\Lambda} = \begin{pmatrix} 1 & 0 \\ 0 & 6 \end{pmatrix}$$
Therefore
$$Q = x'^2 + 6y'^2 = \left[\frac{1}{\sqrt{5}}(2x - y)\right]^2 + 6\left[\frac{1}{\sqrt{5}}(x + 2y)\right]^2$$
In diagonalizing $\mathsf{A}$ by transforming to its eigenvector basis, we are rotating the coordinates to reduce the quadratic form to its simplest form.

A positive definite matrix is one for which all the eigenvalues are positive ($\lambda > 0$). Similarly:
• negative definite means $\lambda < 0$
• positive (negative) semi-definite means $\lambda \geq 0$ ($\lambda \leq 0$)
• definite means positive definite or negative definite
In the above example, the diagonalization shows that $Q(\mathsf{x}) > 0$ for all $\mathsf{x} \ne 0$, and we say that the quadratic form is positive definite.
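The reduction to a sum of squares in this example can be checked numerically (numpy is our addition):

```python
import numpy as np

A = np.array([[2., 2.],
              [2., 5.]])      # Q = 2x^2 + 4xy + 5y^2
lam, S = np.linalg.eigh(A)   # S^T A S = diag(1, 6); S is real orthogonal
print(lam)                   # [1. 6.]: all positive, so Q is positive definite

x = np.array([0.3, -1.2])
xp = S.T @ x                 # x = S x'  =>  x' = S^T x
assert np.isclose(x @ A @ x, np.sum(lam * xp**2))   # Q = sum_i lambda_i x'_i^2
```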

114 113 .3 Hermitian form In a complex vector space. These are ellipsoids or hyperboloids with principal axes given by the eigenvectors of the symmetric matrix Vij .Conic sections and quadric surfaces Example: Let V (r) be a potential with a stationary point at the origin r = 0. the Hermitian form associated with an Hermitian matrix A is ellipsoid x2 a2 hyperboloid of one sheet =1 x2 a2 H(x) = x† Ax = x∗ Aij xj i H is a real scalar because H ∗ = (x† Ax)† = x† A† x = x† Ax = H We have seen that A can be diagonalized by a unitary transformation: U† AU = Λ and so n + y2 b2 + z2 c2 + y2 b2 − z2 c2 =1 with U† = U−1 H = x† (UΛU† )x = (U† x)† Λ(U† x) = x′† Λx′ = i=1 λi x′2 i hyperboloid of two sheets x2 a2 + y2 b2 − z2 c2 = −1 Therefore an Hermitian form can be reduced to a real quadratic form by transforming to the eigenvector basis. 5.8. Then its Taylor expansion has the form V (r) = V (0) + Vij xi xj + O(x3 ) where Vij = 1 ∂2V 2 ∂xi ∂xj ellipse x2 a2 + y2 b2 =1 hyperbola x2 a2 − y2 b2 =1 r=0 The equipotential surfaces near the origin are therefore given approximately by the quadrics Vij xi xj = constant.

5.9 Stationary property of the eigenvalues

The Rayleigh quotient associated with an Hermitian matrix $\mathsf{A}$ is the normalized Hermitian form
$$\lambda(\mathsf{x}) = \frac{\mathsf{x}^{\dagger}\mathsf{A}\mathsf{x}}{\mathsf{x}^{\dagger}\mathsf{x}}$$
Notes:
• $\lambda(\mathsf{x})$ is a real scalar
• $\lambda(\alpha\mathsf{x}) = \lambda(\mathsf{x})$
• if $\mathsf{x}$ is an eigenvector of $\mathsf{A}$, then $\lambda(\mathsf{x})$ is the corresponding eigenvalue

In fact, the eigenvalues of $\mathsf{A}$ are the stationary values of the function $\lambda(\mathsf{x})$, and the value of $\lambda(\mathsf{x})$ at each stationary point is the corresponding eigenvalue. This is the Rayleigh–Ritz variational principle.

Proof:
$$\delta\lambda = \lambda(\mathsf{x} + \delta\mathsf{x}) - \lambda(\mathsf{x}) = \frac{(\mathsf{x} + \delta\mathsf{x})^{\dagger}\mathsf{A}(\mathsf{x} + \delta\mathsf{x})}{(\mathsf{x} + \delta\mathsf{x})^{\dagger}(\mathsf{x} + \delta\mathsf{x})} - \lambda(\mathsf{x})$$
$$= \frac{\mathsf{x}^{\dagger}\mathsf{A}\mathsf{x} + (\delta\mathsf{x})^{\dagger}\mathsf{A}\mathsf{x} + \mathsf{x}^{\dagger}\mathsf{A}\,\delta\mathsf{x} + O(\delta x^2)}{\mathsf{x}^{\dagger}\mathsf{x} + (\delta\mathsf{x})^{\dagger}\mathsf{x} + \mathsf{x}^{\dagger}\delta\mathsf{x} + O(\delta x^2)} - \lambda(\mathsf{x})$$
$$= \frac{(\delta\mathsf{x})^{\dagger}\mathsf{A}\mathsf{x} + \mathsf{x}^{\dagger}\mathsf{A}\,\delta\mathsf{x} - \lambda(\mathsf{x})\left[(\delta\mathsf{x})^{\dagger}\mathsf{x} + \mathsf{x}^{\dagger}\delta\mathsf{x}\right] + O(\delta x^2)}{\mathsf{x}^{\dagger}\mathsf{x} + (\delta\mathsf{x})^{\dagger}\mathsf{x} + \mathsf{x}^{\dagger}\delta\mathsf{x} + O(\delta x^2)}$$
$$= \frac{(\delta\mathsf{x})^{\dagger}\left[\mathsf{A}\mathsf{x} - \lambda(\mathsf{x})\mathsf{x}\right] + \left[\mathsf{x}^{\dagger}\mathsf{A} - \lambda(\mathsf{x})\mathsf{x}^{\dagger}\right]\delta\mathsf{x}}{\mathsf{x}^{\dagger}\mathsf{x}} + O(\delta x^2)$$
$$= \frac{(\delta\mathsf{x})^{\dagger}\left[\mathsf{A}\mathsf{x} - \lambda(\mathsf{x})\mathsf{x}\right] + \text{c.c.}}{\mathsf{x}^{\dagger}\mathsf{x}} + O(\delta x^2)$$
where 'c.c.' denotes the complex conjugate. Therefore the first-order variation of $\lambda(\mathsf{x})$ vanishes for all perturbations $\delta\mathsf{x}$ if and only if $\mathsf{A}\mathsf{x} = \lambda(\mathsf{x})\mathsf{x}$. In that case $\mathsf{x}$ is an eigenvector of $\mathsf{A}$, and the value of $\lambda(\mathsf{x})$ is the corresponding eigenvalue.
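The stationary property is easy to see numerically: perturbing an eigenvector changes the Rayleigh quotient only at second order. A minimal sketch, with a randomly generated Hermitian matrix (numpy and the names are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T                      # a random Hermitian matrix

def rayleigh(x):
    return (x.conj() @ A @ x).real / (x.conj() @ x).real

lam, U = np.linalg.eigh(A)
x = U[:, 0]                             # an eigenvector of A
dx = 1e-5 * rng.standard_normal(4)      # a small perturbation

print(abs(rayleigh(x) - lam[0]))        # ~ 0: lambda(x) equals the eigenvalue
print(abs(rayleigh(x + dx) - lam[0]))   # ~ 1e-10: second order in |dx|
```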

6 Elementary analysis

6.1 Motivation

Analysis is the careful study of infinite processes such as limits, convergence, and differential and integral calculus, and is one of the foundations of mathematics. This section covers some of the basic concepts, including the important problem of the convergence of infinite series. It also introduces the remarkable properties of analytic functions of a complex variable.

6.2 Sequences

6.2.1 Limits of sequences

We first consider a sequence of real or complex numbers $f_n$, defined for all integers $n \geq n_0$. Possible behaviours as $n$ increases are:
• $f_n$ tends towards a particular value
• $f_n$ does not tend to any value but remains limited in magnitude
• $f_n$ is unlimited in magnitude

Definition. The sequence $f_n$ converges to the limit $L$ as $n \to \infty$ if, for any positive number $\epsilon$, $|f_n - L| < \epsilon$ for sufficiently large $n$. We write this as
$$\lim_{n\to\infty} f_n = L \quad\text{or}\quad f_n \to L \text{ as } n \to \infty$$
Note that $L$ here is a finite number. To say that a property holds for sufficiently large $n$ means that there exists an integer $N$ such that the property is true for all $n \geq N$. In other words, the members of the sequence are eventually contained within an arbitrarily small disk centred on $L$.

Example:
$$\lim_{n\to\infty} n^{-\alpha} = 0 \quad\text{for any } \alpha > 0$$
Proof: $|n^{-\alpha} - 0| < \epsilon$ holds for all $n > \epsilon^{-1/\alpha}$.

Definition. The sequence $f_n$ is bounded as $n \to \infty$ if there exists a positive number $K$ such that $|f_n| < K$ for sufficiently large $n$. If $f_n$ does not tend to a limit it may nevertheless be bounded.

Example:
$$f_n = \left(\frac{n+1}{n}\right)\mathrm{e}^{\mathrm{i}n\alpha} \quad\text{is bounded as } n \to \infty \text{ for any real number } \alpha$$
Proof: $|f_n| = (n+1)/n < 2$ for all $n \geq 2$.

6.2.2 Cauchy's principle of convergence

A necessary and sufficient condition for the sequence $f_n$ to converge is that, for any positive number $\epsilon$, $|f_{n+m} - f_n| < \epsilon$ for all positive integers $m$, for sufficiently large $n$. Note that this condition does not require a knowledge of the value of the limit $L$.

6.3 Convergence of series

6.3.1 Meaning of convergence

What is the meaning of an infinite series such as
$$\sum_{n=n_0}^{\infty} u_n$$
involving the addition of an infinite number of terms? We define the partial sum
$$S_N = \sum_{n=n_0}^{N} u_n$$
The infinite series $\sum u_n$ is said to converge if the sequence of partial sums $S_N$ tends to a limit $S$ as $N \to \infty$. The value of the series is then $S$. Otherwise the series diverges. Note that whether a series converges or diverges does not depend on the value of $n_0$ (i.e. on when the series begins) but only on the behaviour of the terms for large $n$.

According to Cauchy's principle of convergence, a necessary and sufficient condition for $\sum u_n$ to converge is that, for any positive number $\epsilon$,
$$|u_{n+1} + u_{n+2} + \cdots + u_{n+m}| < \epsilon$$
for all positive integers $m$, for sufficiently large $n$.

6.3.2 Classic examples

The geometric series $\sum z^n$ has the partial sum
$$S_N = \sum_{n=0}^{N} z^n = \begin{cases} \dfrac{1 - z^{N+1}}{1 - z}, & z \ne 1 \\[1ex] N + 1, & z = 1 \end{cases}$$

Therefore $\sum z^n$ converges if $|z| < 1$, and the sum is $1/(1-z)$. If $|z| \geq 1$ the series diverges.

The harmonic series $\sum n^{-1}$ diverges. Consider the partial sum
$$S_N = \sum_{n=1}^{N} \frac{1}{n} > \int_1^{N+1} \frac{\mathrm{d}x}{x} = \ln(N+1)$$
Therefore $S_N$ increases without bound and does not tend to a limit as $N \to \infty$.
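Both classic examples are easy to watch numerically (numpy is our addition):

```python
import numpy as np

# Geometric series: converges to 1/(1-z) for |z| < 1.
z = 0.5 + 0.3j
print(sum(z**n for n in range(200)), 1.0 / (1.0 - z))

# Harmonic series: S_N keeps pace with ln(N+1) and never settles down.
for N in (10**2, 10**4, 10**6):
    SN = np.sum(1.0 / np.arange(1, N + 1))
    print(N, SN, np.log(N + 1.0))
```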

such that |f (z) − L| < ǫ for all z such that |z − z0 | < δ. there exists a positive number R. f (z) → L as z → ∞ 6. x→−∞ lim tanh x = −1 122 . Deﬁnition.g.The absolute ratio of successive terms is rn = un+1 un Deﬁnition. depending on ǫ. cases such as sin z =1 z→0 z lim must be expressed as limits because sin 0/0 = 0/0 is indeterminate. Deﬁnition. for any positive number ǫ. The function f (z) tends to the limit L as z → z0 if. for any positive number ǫ. It is more powerful than the ratio test but usually harder to apply. Example: for the harmonic series n−1 . a diﬀerent test is required Even if rn does not tend to a limit. there exists a positive number δ.1 Functions of a continuous variable Limits and continuity Deﬁnition. Sometimes the limit or bound applies only if approached in a particular way. The function f (z) tends to the limit L as z → ∞ if. • if r > 1. We write this as z→∞ The same conclusions as for the ratio test apply when instead rn = |un | 1/n lim f (z) = L or This result also follows from a comparison with a geometric series.7 Cauchy’s root test The value of L would normally be f (z0 ). especially in the complex plane. The function f (z) is continuous at the point z = z0 if f (z) → f (z0 ) as z → z0 . However. such that |f (z) − L| < ǫ for all z with |z| > R. it can easily be adapted by relabelling the series to remove the vanishing terms. We write this as z→z0 Suppose that rn tends to a limit r as n → ∞. e. un converges absolutely un diverges (un does not tend to zero) lim f (z) = L or f (z) → L as z → z0 • if r = 1. depending on ǫ. if rn r for suﬃciently large n. 121 lim tanh x = 1. The ratio test is useless for series in which some of the terms are zero.4 6.4. x→+∞ We now consider how a real or complex function f (z) of a real or complex variable z behaves near a point z0 . A diﬀerent test is required. then un converges absolutely. The function f (z) is bounded as z → ∞ if there exist positive numbers K and R such that |f (z)| < K for all z with |z| > R. However. Then • if r < 1. The function f (z) is bounded as z → z0 if there exist positive numbers K and δ such that |f (z)| < K for all z with |z − z0 | < δ. 6. such as the integral comparison test used above. Deﬁnition. where r < 1 is a constant.3. rn = n/(n+1) → 1 as n → ∞. There are diﬀerent ways in which z can approach z0 or ∞.

This notation implies that $x$ is approaching positive or negative real infinity along the real axis. In the context of real variables, $x \to \infty$ usually means specifically $x \to +\infty$. A related notation for one-sided limits is exemplified by
$$\lim_{x\to 0+} \frac{x(1+x)}{|x|} = 1, \qquad \lim_{x\to 0-} \frac{x(1+x)}{|x|} = -1$$

6.4.2 The O notation

The useful symbols $O$, $o$ and $\sim$ are used to compare the rates of growth or decay of different functions.

• $f(z) = O(g(z))$ as $z \to z_0$ means that $\dfrac{f(z)}{g(z)}$ is bounded as $z \to z_0$
• $f(z) = o(g(z))$ as $z \to z_0$ means that $\dfrac{f(z)}{g(z)} \to 0$ as $z \to z_0$
• $f(z) \sim g(z)$ as $z \to z_0$ means that $\dfrac{f(z)}{g(z)} \to 1$ as $z \to z_0$

If $f(z) \sim g(z)$ we say that $f$ is asymptotically equal to $g$. This should not be written as $f(z) \to g(z)$.

Notes:
• these definitions also apply when $z_0 = \infty$
• $f(z) = O(1)$ means that $f(z)$ is bounded
• either $f(z) = o(g(z))$ or $f(z) \sim g(z)$ implies $f(z) = O(g(z))$
• only $f(z) \sim g(z)$ is a symmetric relation

Examples:
$$\cos z = O(1) \quad\text{as } z \to 0$$
$$\sin z = O(z) = o(1) \quad\text{as } z \to 0$$
$$\ln x = o(x) \quad\text{as } x \to +\infty$$
$$\cosh x \sim \tfrac{1}{2}\mathrm{e}^x \quad\text{as } x \to +\infty$$

6.5 Taylor's theorem for functions of a real variable

Let $f(x)$ be a (real or complex) function of a real variable $x$, which is differentiable at least $n$ times in the interval $x_0 \leq x \leq x_0 + h$. Then
$$f(x_0 + h) = f(x_0) + h f'(x_0) + \frac{h^2}{2!} f''(x_0) + \cdots + \frac{h^{n-1}}{(n-1)!} f^{(n-1)}(x_0) + R_n$$
where
$$R_n = \int_{x_0}^{x_0+h} \frac{(x_0 + h - x)^{n-1}}{(n-1)!} f^{(n)}(x)\,\mathrm{d}x$$
is the remainder after $n$ terms of the Taylor series. [Proof: integrate $R_n$ by parts $n$ times.]

The remainder term can be expressed in various ways. Lagrange's expression for the remainder is
$$R_n = \frac{h^n}{n!} f^{(n)}(\xi)$$
where $\xi$ is an unknown number in the interval $x_0 < \xi < x_0 + h$. So
$$R_n = O(h^n)$$
If $f(x)$ is infinitely differentiable in $x_0 \leq x \leq x_0 + h$ (it is a smooth function) we can write an infinite Taylor series
$$f(x_0 + h) = \sum_{n=0}^{\infty} \frac{h^n}{n!} f^{(n)}(x_0)$$
which converges for sufficiently small $h$ (as discussed below).

6. To be analytic at a point z = z0 . it is said to be analytic in R. so δx = 0: f ′ (z) = lim f (z + i δy) − f (z) δy→0 i δy u(x. y) u(x + δx. f (z) must be diﬀerentiable throughout some neighbourhood |z − z0 | < ǫ of that point. so δy = 0: f ′ (z) = lim f (z + δx) − f (z) δx u(x + δx. y + δy) − u(x. 6.1 Analytic functions of a complex variable Complex diﬀerentiability The derivative should have the same value if δz approaches 0 along the imaginary axis. The limit must be the same when δz → 0 in any direction in the complex plane. y + δy) − u(x. a complex constant • f (z) = z. y) − v(x.6.2 The Cauchy–Riemann equations Comparing the real and imaginary parts of these expressions. Examples of functions that are analytic in the whole complex plane (known as entire functions): • f (z) = c. They are also suﬃcient conditions. y + δy) + iv(x. for which u = x and v = y. y) If f ′ (z) exists we can calculate it by assuming that δz = δx + i δy approaches 0 along the real axis.6 6. y) = lim δy→0 i δy v(x.6. y) − u(x. y) + iv(x + δx. The derivative of the function f (z) at the point z = z0 is f ′ (z0 ) = lim z→z0 f (z) − f (z0 ) z − z0 If this exists. ∂x ∂y ∂v ∂u =− ∂x ∂y These are necessary conditions for f (z) to have a complex derivative. y) − iv(x. y) − u(x. y) + lim = −i lim δy→0 δy→0 δy δy ∂u ∂v + = −i ∂y ∂y Deﬁnition. y) − iv(x. y + δy) − v(x. y) + i lim = lim δx→0 δx→0 δx δx ∂u ∂v = +i ∂x ∂x δx→0 If a function f (z) has a complex derivative at every point z in a region R of the complex plane. y) = lim δx→0 δx v(x + δx. y) u(x.6. and we easily verify the CR equations 126 125 . we deduce the Cauchy–Riemann equations ∂u ∂v = . the function f (z) is diﬀerentiable at z = z0 .3 Analytic functions Separate f = u + iv and z = x + iy into their real and imaginary parts: f (z) = u(x. provided that the partial derivatives are also continuous.6. Another way to write this is f (z + δz) − f (z) df ≡ f ′ (z) = lim δz→0 dz δz Requiring a function of a complex variable to be diﬀerentiable is a surprisingly strong constraint. y) + iv(x.

6.6.3 Analytic functions

Definition. If a function $f(z)$ has a complex derivative at every point $z$ in a region $R$ of the complex plane, it is said to be analytic in $R$. To be analytic at a point $z = z_0$, $f(z)$ must be differentiable throughout some neighbourhood $|z - z_0| < \epsilon$ of that point.

Examples of functions that are analytic in the whole complex plane (known as entire functions):
• $f(z) = c$, a complex constant
• $f(z) = z$, for which $u = x$ and $v = y$, and we easily verify the CR equations
• $f(z) = z^n$, where $n$ is a positive integer
• $f(z) = P(z) = c_n z^n + c_{n-1} z^{n-1} + \cdots + c_0$, a general polynomial function with complex coefficients
• $f(z) = \exp(z)$

In the case of the exponential function we have
$$f(z) = \mathrm{e}^z = \mathrm{e}^x \mathrm{e}^{\mathrm{i}y} = \mathrm{e}^x\cos y + \mathrm{i}\,\mathrm{e}^x\sin y = u + \mathrm{i}v$$
The CR equations are satisfied for all $x$ and $y$:
$$\frac{\partial u}{\partial x} = \mathrm{e}^x\cos y = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = \mathrm{e}^x\sin y = -\frac{\partial u}{\partial y}$$
The derivative of the exponential function is
$$f'(z) = \frac{\partial u}{\partial x} + \mathrm{i}\frac{\partial v}{\partial x} = \mathrm{e}^x\cos y + \mathrm{i}\,\mathrm{e}^x\sin y = \mathrm{e}^z$$
as expected.

Sums, products and compositions of analytic functions are also analytic, e.g.
$$f(z) = z\exp(\mathrm{i}z^2) + z^3$$
The usual product, quotient and chain rules apply to complex derivatives of analytic functions. Familiar relations such as
$$\frac{\mathrm{d}}{\mathrm{d}z} z^n = nz^{n-1}, \qquad \frac{\mathrm{d}}{\mathrm{d}z}\sin z = \cos z, \qquad \frac{\mathrm{d}}{\mathrm{d}z}\cosh z = \sinh z$$
apply as usual.

Many complex functions are analytic everywhere in the complex plane except at isolated points, which are called the singular points or singularities of the function. Examples:
• $f(z) = P(z)/Q(z)$, where $P(z)$ and $Q(z)$ are polynomials. This is called a rational function and is analytic except at points where $Q(z) = 0$.
• $f(z) = z^c$, where $c$ is a complex constant, is analytic except at $z = 0$ (unless $c$ is a non-negative integer)
• $f(z) = \ln z$ is also analytic except at $z = 0$
The last two examples are in fact multiple-valued functions, which require special treatment (see next term).

Examples of non-analytic functions:
• $f(z) = \mathrm{Re}(z)$, for which $u = x$ and $v = 0$, so the CR equations are not satisfied anywhere
• $f(z) = z^*$, for which $u = x$ and $v = -y$
• $f(z) = |z|$, for which $u = (x^2 + y^2)^{1/2}$ and $v = 0$
• $f(z) = |z|^2$, for which $u = x^2 + y^2$ and $v = 0$
In the last case the CR equations are satisfied only at $x = y = 0$ and we can say that $f'(0) = 0$. However, $f(z)$ is not analytic even at $z = 0$ because it is not differentiable throughout any neighbourhood $|z| < \epsilon$ of 0.

6.6.4 Consequences of the Cauchy–Riemann equations

If we know the real part of an analytic function in some region, we can find its imaginary part (or vice versa) up to an additive constant by integrating the CR equations.

Example: $u(x, y) = x^2 - y^2$.

$$\frac{\partial v}{\partial y} = \frac{\partial u}{\partial x} = 2x \quad\Rightarrow\quad v = 2xy + g(x)$$
$$\frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} \quad\Rightarrow\quad 2y + g'(x) = 2y \quad\Rightarrow\quad g'(x) = 0$$
Therefore $v(x, y) = 2xy + c$, where $c$ is a real constant, and we recognize
$$f(z) = x^2 - y^2 + \mathrm{i}(2xy + c) = (x + \mathrm{i}y)^2 + \mathrm{i}c = z^2 + \mathrm{i}c$$

The real and imaginary parts of an analytic function satisfy Laplace's equation (they are harmonic functions):
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\frac{\partial u}{\partial y}\right) = \frac{\partial}{\partial x}\left(\frac{\partial v}{\partial y}\right) + \frac{\partial}{\partial y}\left(-\frac{\partial v}{\partial x}\right) = 0$$
The proof that $\nabla^2 v = 0$ is similar. This provides a useful method for solving Laplace's equation in two dimensions. $u$ and $v$ are said to be conjugate harmonic functions. Furthermore,
$$\nabla u \cdot \nabla v = \frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} = \frac{\partial v}{\partial y}\frac{\partial v}{\partial x} - \frac{\partial v}{\partial x}\frac{\partial v}{\partial y} = 0$$
and so the curves of constant $u$ and those of constant $v$ intersect at right angles.

6.7 Taylor series for analytic functions

If a function of a complex variable is analytic in a region $R$ of the complex plane, not only is it differentiable everywhere in $R$, it is also differentiable any number of times. If $f(z)$ is analytic at $z = z_0$, it has an infinite Taylor series
$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n, \qquad a_n = \frac{f^{(n)}(z_0)}{n!}$$
which converges within some neighbourhood of $z_0$ (as discussed below). In fact this can be taken as a definition of analyticity.

6.8 Zeros, poles and essential singularities

6.8.1 Zeros of complex functions

The zeros of $f(z)$ are the points $z = z_0$ in the complex plane where $f(z_0) = 0$. A zero is of order $N$ if
$$f(z_0) = f'(z_0) = f''(z_0) = \cdots = f^{(N-1)}(z_0) = 0 \quad\text{but}\quad f^{(N)}(z_0) \ne 0$$
The first non-zero term in the Taylor series of $f(z)$ about $z = z_0$ is then proportional to $(z - z_0)^N$. Indeed
$$f(z) \sim a_N (z - z_0)^N \quad\text{as } z \to z_0$$
A simple zero is a zero of order 1. A double zero is one of order 2, etc.

Examples:
• $f(z) = z$ has a simple zero at $z = 0$
• $f(z) = (z - \mathrm{i})^2$ has a double zero at $z = \mathrm{i}$
• $f(z) = z^2 - 1 = (z - 1)(z + 1)$ has simple zeros at $z = \pm 1$

Example: ⊲ Find and classify the zeros of $f(z) = \sinh z$.
$$0 = \sinh z = \tfrac{1}{2}(\mathrm{e}^z - \mathrm{e}^{-z}) \quad\Rightarrow\quad \mathrm{e}^z = \mathrm{e}^{-z} \quad\Rightarrow\quad \mathrm{e}^{2z} = 1 \quad\Rightarrow\quad 2z = 2n\pi\mathrm{i} \quad\Rightarrow\quad z = n\pi\mathrm{i}, \quad n \in \mathbb{Z}$$
At these points
$$f'(z) = \cosh z = \cos(n\pi) \ne 0$$
so they are all simple zeros.

6.8.2 Poles

If $g(z)$ is analytic and non-zero at $z = z_0$ then it has an expansion (locally, at least) of the form
$$g(z) = \sum_{n=0}^{\infty} b_n (z - z_0)^n \quad\text{with } b_0 \ne 0$$
Consider the function
$$f(z) = (z - z_0)^{-N} g(z) = \sum_{n=-N}^{\infty} a_n (z - z_0)^n$$
with $a_n = b_{n+N}$, and so $a_{-N} \ne 0$. We say that $f(z)$ has a pole of order $N$, and $f(z)$ is not analytic at $z = z_0$. Note that
$$f(z) \sim a_{-N} (z - z_0)^{-N} \quad\text{as } z \to z_0$$
A simple pole is a pole of order 1. A double pole is one of order 2, etc.

This is not a Taylor series because it includes negative powers of $z - z_0$. It can be shown that any function that is analytic (and single-valued) throughout an annulus $a < |z - z_0| < b$ centred on a point $z = z_0$ has a unique Laurent series
$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n$$
which converges for all values of $z$ within the annulus. If $a = 0$, then $f(z)$ is analytic throughout the disk $|z - z_0| < b$ except possibly at $z = z_0$ itself, and the Laurent series determines the behaviour of $f(z)$ near $z = z_0$.

Notes:
• if $f(z)$ has a zero of order $N$ at $z = z_0$, then $1/f(z)$ has a pole of order $N$ there, and vice versa
• if $f(z)$ is analytic and non-zero at $z = z_0$ and $g(z)$ has a zero of order $N$ there, then $f(z)/g(z)$ has a pole of order $N$ there

Example:
$$f(z) = \frac{2z}{(z + 1)(z - \mathrm{i})^2}$$
has a simple pole at $z = -1$ and a double pole at $z = \mathrm{i}$ (as well as a simple zero at $z = 0$). The expansion about the double pole can be carried out by letting $z = \mathrm{i} + w$ and expanding in $w$:
$$f(z) = \frac{2(\mathrm{i} + w)}{(\mathrm{i} + 1 + w)w^2} = \frac{2\mathrm{i}(1 - \mathrm{i}w)}{(\mathrm{i} + 1)\left[1 + \tfrac{1}{2}(1 - \mathrm{i})w\right]w^2}$$
$$= \frac{2\mathrm{i}}{(\mathrm{i} + 1)w^2}(1 - \mathrm{i}w)\left[1 - \tfrac{1}{2}(1 - \mathrm{i})w + O(w^2)\right]$$
$$= (1 + \mathrm{i})w^{-2}\left[1 - \tfrac{1}{2}(1 + \mathrm{i})w + O(w^2)\right]$$
$$= (1 + \mathrm{i})(z - \mathrm{i})^{-2} - \mathrm{i}(z - \mathrm{i})^{-1} + O(1) \quad\text{as } z \to \mathrm{i}$$

6.8.3 Essential singularities

The Laurent series of a function about a point $z = z_0$ determines its behaviour there. There are three possibilities:

• if the first non-zero term in the Laurent series has $n \geq 0$, then $f(z)$ is analytic at $z = z_0$ and the series is just a Taylor series
• if the first non-zero term in the Laurent series has $n = -N < 0$, then $f(z)$ has a pole of order $N$ at $z = z_0$
• otherwise, if the Laurent series involves an infinite number of terms with $n < 0$, then $f(z)$ has an essential singularity at $z = z_0$

A classic example of an essential singularity is $f(z) = \mathrm{e}^{1/z}$ at $z = 0$. Here we can generate the Laurent series from a Taylor series in $1/z$:
$$\mathrm{e}^{1/z} = \sum_{n=0}^{\infty} \frac{1}{n!} z^{-n} = \sum_{n=-\infty}^{0} \frac{1}{(-n)!} z^n$$

The behaviour of a function near an essential singularity is remarkably complicated. Picard's theorem states that, in any neighbourhood of an essential singularity, the function takes all possible complex values (possibly with one exception) at infinitely many points. (In the case of $f(z) = \mathrm{e}^{1/z}$, the exceptional value 0 is never attained.)

6.8.4 Behaviour at infinity

We can examine the behaviour of a function $f(z)$ as $z \to \infty$ by defining a new variable $\zeta = 1/z$ and a new function $g(\zeta) = f(z)$. Then $z = \infty$ maps to a single point $\zeta = 0$, the point at infinity. If $g(\zeta)$ has a zero, pole or essential singularity at $\zeta = 0$, then we can say that $f(z)$ has the corresponding property at $z = \infty$.

Examples:
$$f_1(z) = \mathrm{e}^z = \mathrm{e}^{1/\zeta} = g_1(\zeta) \quad\text{has an essential singularity at } z = \infty$$
$$f_2(z) = z^2 = 1/\zeta^2 = g_2(\zeta) \quad\text{has a double pole at } z = \infty$$
$$f_3(z) = \mathrm{e}^{1/z} = \mathrm{e}^{\zeta} = g_3(\zeta) \quad\text{is analytic at } z = \infty$$

6.9 Convergence of power series

6.9.1 Circle of convergence

If the power series
$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n$$
converges for $z = z_1$, then it converges absolutely for all $z$ such that $|z - z_0| < |z_1 - z_0|$. [Proof: The necessary condition for convergence, $\lim a_n (z_1 - z_0)^n = 0$, implies that $|a_n (z_1 - z_0)^n| < \epsilon$ for sufficiently large $n$. Therefore $|a_n (z - z_0)^n| < \epsilon r^n$, with $r = |(z - z_0)/(z_1 - z_0)| < 1$. By comparison with the geometric series $\sum r^n$, $\sum |a_n (z - z_0)^n|$ converges.]

It follows that, if the power series diverges for $z = z_2$, then it diverges for all $z$ such that $|z - z_0| > |z_2 - z_0|$.

Therefore the series converges for $|z - z_0| < R$ and diverges for $|z - z_0| > R$. $R$ is called the radius of convergence and may be zero (exceptionally), positive or infinite. $|z - z_0| = R$ is the circle of convergence: the series converges inside it and diverges outside. On the circle itself, it may either converge or diverge.

6.9.2 Determination of the radius of convergence

The absolute ratio of successive terms in a power series is
$$r_n = \left|\frac{a_{n+1}}{a_n}\right| |z - z_0|$$
Suppose that $|a_{n+1}/a_n| \to L$ as $n \to \infty$. Then $r_n \to r = L|z - z_0|$. According to the ratio test, the series converges for $L|z - z_0| < 1$ and diverges for $L|z - z_0| > 1$. The radius of convergence is
$$R = 1/L$$
The same result follows from Cauchy's root test if instead $|a_n|^{1/n} \to L$ as $n \to \infty$.

The radius of convergence of the Taylor series of a function $f(z)$ about the point $z = z_0$ is equal to the distance of the nearest singular point of the function $f(z)$ from $z_0$. Since a convergent power series defines an analytic function, no singularity can lie inside the circle of convergence.

6.9.3 Examples

The following examples are generated from familiar Taylor series.

$$\mathrm{e}^z = 1 + z + \frac{z^2}{2} + \frac{z^3}{6} + \cdots = \sum_{n=0}^{\infty} \frac{z^n}{n!}$$
Here $|a_{n+1}/a_n| = 1/(n+1) \to 0$ as $n \to \infty$, so $R = \infty$. The series converges for all $z$. (In fact, this is an entire function.)

$$\ln(1 - z) = -z - \frac{z^2}{2} - \frac{z^3}{3} - \cdots = -\sum_{n=1}^{\infty} \frac{z^n}{n}$$
Here $|a_{n+1}/a_n| = n/(n+1) \to 1$ as $n \to \infty$, so $R = 1$. The series converges for $|z| < 1$ and diverges for $|z| > 1$. (In fact, on the circle $|z| = 1$, the series converges except at the point $z = 1$.) The function has a singularity at $z = 1$ that limits the radius of convergence.

$$\arctan z = z - \frac{z^3}{3} + \frac{z^5}{5} - \frac{z^7}{7} + \cdots = z\sum_{n=0}^{\infty} \frac{(-z^2)^n}{2n + 1}$$
Thought of as a power series in $(-z^2)$, this has $|a_{n+1}/a_n| = (2n+1)/(2n+3) \to 1$ as $n \to \infty$. Therefore $R = 1$ in terms of $(-z^2)$. But since $|-z^2| = 1$ is equivalent to $|z| = 1$, the series converges for $|z| < 1$ and diverges for $|z| > 1$ (consistent with the singularities of $\arctan z$ at $z = \pm\mathrm{i}$).
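A quick numerical look at the second example, watching partial sums inside and outside the circle of convergence $|z| = 1$ (numpy and the sampling points are our choices):

```python
import numpy as np

# ln(1 - z) = -sum_{n>=1} z^n / n: partial sums for |z| < 1 and |z| > 1.
for z in (0.9, 1.1):
    sums = [-sum(z**n / n for n in range(1, N)) for N in (10, 100, 1000)]
    print(z, sums, np.log(1 - z) if z < 1 else 'diverges')
```

For $z = 0.9$ the partial sums settle towards $\ln 0.1 \approx -2.303$; for $z = 1.1$ they grow without bound.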

7 Ordinary differential equations

7.1 Motivation

Very many scientific problems are described by differential equations. Even if these are partial differential equations, they can often be reduced to ordinary differential equations (ODEs), e.g. by the method of separation of variables.

The ODEs encountered most frequently are linear and of either first or second order. In particular, second-order equations describe oscillatory phenomena. Part IA dealt with first-order ODEs and also with linear second-order ODEs with constant coefficients. Here we deal with general linear second-order ODEs.

The general linear inhomogeneous first-order ODE
$$y'(x) + p(x)y(x) = f(x)$$
can be solved using the integrating factor $g = \exp\left(\int p\,\mathrm{d}x\right)$ to obtain the general solution
$$y = \frac{1}{g}\int g f\,\mathrm{d}x$$
Provided that the integrals can be evaluated, the problem is completely solved. An equivalent method does not exist for second-order ODEs, but an extensive theory can still be developed.

Although the solution may be of interest only for real $x$, it is often informative to analyse the solution in the complex domain.

7.2 Classification

The general second-order ODE is an equation of the form
$$F(y'', y', y, x) = 0$$
for an unknown function $y(x)$, where $y' = \mathrm{d}y/\mathrm{d}x$ and $y'' = \mathrm{d}^2y/\mathrm{d}x^2$.

The general linear second-order ODE has the form
$$Ly = f$$
where $L$ is a linear operator such that $Ly = ay'' + by' + cy$, where $a$, $b$, $c$, $f$ are functions of $x$. The equation is homogeneous (unforced) if $f = 0$, otherwise it is inhomogeneous (forced). The principle of superposition applies to linear ODEs as to all linear equations.

7.3 Homogeneous linear second-order ODEs

7.3.1 Linearly independent solutions

Divide through by the coefficient of $y''$ to obtain a standard form
$$y''(x) + p(x)y'(x) + q(x)y(x) = 0$$
Suppose that $y_1(x)$ and $y_2(x)$ are two solutions of this equation. They are linearly independent if
$$A y_1(x) + B y_2(x) = 0 \text{ (for all } x\text{)} \quad\text{implies}\quad A = B = 0$$
i.e. if one is not simply a constant multiple of the other.

If $y_1(x)$ and $y_2(x)$ are linearly independent solutions, then the general solution of the ODE is
$$y(x) = A y_1(x) + B y_2(x)$$
where $A$ and $B$ are arbitrary constants. There are two arbitrary constants because the equation is of second order.

7.3.2 The Wronskian

The Wronskian $W(x)$ of two solutions $y_1(x)$ and $y_2(x)$ of a second-order ODE is the determinant of the Wronskian matrix:
$$W[y_1, y_2] = \begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix} = y_1 y_2' - y_2 y_1'$$
Suppose that $A y_1(x) + B y_2(x) = 0$ in some interval of $x$. Then also $A y_1'(x) + B y_2'(x) = 0$, and so
$$\begin{pmatrix} y_1 & y_2 \\ y_1' & y_2' \end{pmatrix}\begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
If this is satisfied for non-trivial $A$, $B$ then $W = 0$ (in that interval of $x$). Therefore the solutions are linearly independent if $W \ne 0$.

7.3.3 Calculation of the Wronskian

Consider
$$W' = y_1 y_2'' - y_2 y_1'' = y_1(-p y_2' - q y_2) - y_2(-p y_1' - q y_1) = -pW$$
since both $y_1$ and $y_2$ solve the ODE. This first-order ODE for $W$ has the solution
$$W = \exp\left(-\int p\,\mathrm{d}x\right)$$
This result involves an indefinite integral and could be written more fussily as
$$W(x) = \exp\left(-\int^x p(\xi)\,\mathrm{d}\xi\right)$$

Notes:
• the indefinite integral involves an arbitrary additive constant, so $W$ involves an arbitrary multiplicative constant
• apart from that, $W$ is the same for any two solutions $y_1$ and $y_2$
• $W$ is therefore an intrinsic property of the ODE
• if $W \ne 0$ for one value of $x$ (and $p$ is integrable) then $W \ne 0$ for all $x$, so linear independence need be checked at only one value of $x$

7.3.4 Finding a second solution

Suppose that one solution $y_1(x)$ is known. Then a second solution $y_2(x)$ can be found as follows. First find $W$ as described above. The definition of $W$ then provides a first-order linear ODE for the unknown $y_2$:
$$y_1 y_2' - y_2 y_1' = W$$
$$\frac{\mathrm{d}}{\mathrm{d}x}\left(\frac{y_2}{y_1}\right) = \frac{y_2'}{y_1} - \frac{y_2 y_1'}{y_1^2} = \frac{W}{y_1^2}$$
$$y_2 = y_1 \int \frac{W}{y_1^2}\,\mathrm{d}x$$
Again, this expression involves an indefinite integral and could be written more fussily as
$$y_2(x) = y_1(x)\int^x \frac{W(\xi)}{[y_1(\xi)]^2}\,\mathrm{d}\xi$$

Notes:
• the indefinite integral involves an arbitrary additive constant, since any amount of $y_1$ can be added to $y_2$
• $W$ involves an arbitrary multiplicative constant, since $y_2$ can be multiplied by any constant
• this expression for $y_2$ therefore provides the general solution of the ODE

The same result can be obtained by writing $y_2(x) = y_1(x)u(x)$ and obtaining a first-order linear ODE for $u'$. This method applies to higher-order linear ODEs and is reminiscent of the factorization of polynomial equations.

Example: ⊲ Given that $y = x^n$ is a solution of
$$x^2 y'' - (2n - 1)xy' + n^2 y = 0$$
find the general solution.

Standard form:
$$y'' - \frac{2n - 1}{x}y' + \frac{n^2}{x^2}y = 0$$
Wronskian:
$$W = \exp\left(-\int p\,\mathrm{d}x\right) = \exp\left(\int \frac{2n - 1}{x}\,\mathrm{d}x\right) = \exp\left[(2n - 1)\ln x + \text{constant}\right] = A x^{2n-1}$$
Second solution:
$$y_2 = y_1 \int \frac{W}{y_1^2}\,\mathrm{d}x = x^n \int \frac{A x^{2n-1}}{x^{2n}}\,\mathrm{d}x = x^n \int A x^{-1}\,\mathrm{d}x = A x^n \ln x + B x^n$$
which is the general solution.
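The second solution found in this example is easy to verify numerically with finite differences (numpy and the test values are our choices):

```python
import numpy as np

# Check that y2 = x^n ln x satisfies x^2 y'' - (2n-1) x y' + n^2 y = 0.
n, h, x = 3, 1e-5, 1.7
y = lambda t: t**n * np.log(t)

yp = (y(x + h) - y(x - h)) / (2 * h)              # central-difference y'
ypp = (y(x + h) - 2 * y(x) + y(x - h)) / h**2     # central-difference y''
print(x**2 * ypp - (2 * n - 1) * x * yp + n**2 * y(x))   # ~ 0
```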

7.4 Series solutions

7.4.1 Ordinary and singular points

We consider a homogeneous linear second-order ODE in standard form:
$$y''(x) + p(x)y'(x) + q(x)y(x) = 0$$
A point $x = x_0$ is an ordinary point of the ODE if:
$$p(x) \text{ and } q(x) \text{ are both analytic at } x = x_0$$
Otherwise it is a singular point. A singular point $x = x_0$ is regular if:
$$(x - x_0)p(x) \text{ and } (x - x_0)^2 q(x) \text{ are both analytic at } x = x_0$$

Example: ⊲ Identify the singular points of Legendre's equation
$$(1 - x^2)y'' - 2xy' + \ell(\ell + 1)y = 0$$
where $\ell$ is a constant, and determine their nature.

Divide through by $(1 - x^2)$ to obtain the standard form with
$$p(x) = -\frac{2x}{1 - x^2}, \qquad q(x) = \frac{\ell(\ell + 1)}{1 - x^2}$$
Both $p(x)$ and $q(x)$ are analytic for all $x$ except $x = \pm 1$. These are the singular points. They are both regular: at $x = 1$,
$$(x - 1)p(x) = \frac{2x}{1 + x}, \qquad (x - 1)^2 q(x) = \ell(\ell + 1)\frac{1 - x}{1 + x}$$
are both analytic at $x = 1$, and similarly for $x = -1$.

7.4.2 Series solutions about an ordinary point

Wherever $p(x)$ and $q(x)$ are analytic, the ODE has two linearly independent solutions that are also analytic. If $x = x_0$ is an ordinary point, the ODE has two linearly independent solutions of the form
$$y = \sum_{n=0}^{\infty} a_n (x - x_0)^n$$
The coefficients $a_n$ can be determined by substituting the series into the ODE and comparing powers of $(x - x_0)$. The radius of convergence of the series solutions is the distance to the nearest singular point of the ODE in the complex plane.

Since $p(x)$ and $q(x)$ are analytic at $x = x_0$,
$$p(x) = \sum_{n=0}^{\infty} p_n (x - x_0)^n, \qquad q(x) = \sum_{n=0}^{\infty} q_n (x - x_0)^n$$
inside some circle centred on $x = x_0$. Also
$$y' = \sum_{n=0}^{\infty} n a_n (x - x_0)^{n-1} = \sum_{n=0}^{\infty} (n + 1) a_{n+1} (x - x_0)^n$$
$$y'' = \sum_{n=0}^{\infty} n(n - 1) a_n (x - x_0)^{n-2} = \sum_{n=0}^{\infty} (n + 2)(n + 1) a_{n+2} (x - x_0)^n$$
Note the following rule for multiplying power series:
$$pq = \left(\sum_{\ell=0}^{\infty} p_\ell (x - x_0)^\ell\right)\left(\sum_{m=0}^{\infty} q_m (x - x_0)^m\right) = \sum_{n=0}^{\infty}\left(\sum_{r=0}^{n} p_{n-r} q_r\right)(x - x_0)^n$$
Thus
$$py' = \sum_{n=0}^{\infty}\left(\sum_{r=0}^{n} p_{n-r}(r + 1)a_{r+1}\right)(x - x_0)^n, \qquad qy = \sum_{n=0}^{\infty}\left(\sum_{r=0}^{n} q_{n-r} a_r\right)(x - x_0)^n$$
The coefficient of $(x - x_0)^n$ in the ODE $y'' + py' + qy = 0$ is therefore
$$(n + 2)(n + 1)a_{n+2} + \sum_{r=0}^{n} p_{n-r}(r + 1)a_{r+1} + \sum_{r=0}^{n} q_{n-r} a_r = 0$$
This is a recurrence relation that determines $a_{n+2}$ (for $n \geq 0$) in terms of the preceding coefficients $a_0, a_1, \dots, a_{n+1}$. The first two coefficients $a_0$ and $a_1$ are not determined: they are the two arbitrary constants in the general solution.

The above procedure is rarely followed in practice. If $p$ and $q$ are rational functions (i.e. ratios of polynomials) it is a much better idea to multiply the ODE through by a suitable factor to clear denominators before substituting in the power series for $y$.

Example: ⊲ Find series solutions about $x = 0$ of Legendre's equation
$$(1 - x^2)y'' - 2xy' + \ell(\ell + 1)y = 0$$
$x = 0$ is an ordinary point, so let
$$y = \sum_{n=0}^{\infty} a_n x^n, \qquad y' = \sum_{n=0}^{\infty} n a_n x^{n-1}, \qquad y'' = \sum_{n=0}^{\infty} n(n - 1) a_n x^{n-2}$$
Substitute these expressions into the ODE to obtain
$$\sum_{n=0}^{\infty} n(n - 1)a_n x^{n-2} + \sum_{n=0}^{\infty} \left[-n(n - 1) - 2n + \ell(\ell + 1)\right] a_n x^n = 0$$

e. a Taylor series multiplied by a power (x − x0 )σ . .The coeﬃcient of xn (for n 0) is (n + 1)(n + 2)an+2 + [−n(n + 1) + ℓ(ℓ + 1)] an = 0 The recurrence relation is therefore n(n + 1) − ℓ(ℓ + 1) (n − ℓ)(n + ℓ + 1) an+2 = = an (n + 1)(n + 2) (n + 1)(n + 2) a0 and a1 are the arbitrary constants. where the index σ is a (generally complex) number to be determined. . . . . the even and odd series converge for |x2 | < 1. . . . . Notes: • this is a Taylor series only if σ is a non-negative integer • there may be one or two solutions of this form (see below) • the condition a0 = 0 is required to deﬁne σ uniquely 145 146 . . . . . . . . . for |x| < 1 • the radius of convergence is the distance to the singular points of Legendre’s equation at x = ±1 • if ℓ 0 is an even integer. i. e. 1 P2 (x) = 2 (3x2 − 1) 7. . . P1 (x) = x. . . . so the even solution is a polynomial (terminating power series) of degree ℓ • if ℓ 1 is an odd integer. . . . Fuchs’s theorem guarantees that the ODE has at least one solution of the form ∞ y= n=0 an (x − x0 )n+σ .g. . . . . Pℓ (x). . They are conventionally normalized (i. . .4. .e. so the odd solution is a polynomial of degree ℓ The polynomial solutions are called Legendre polynomials.3 Series solutions about a regular singular point If x = x0 is a regular singular point. then aℓ+2 and all subsequent odd coeﬃcients vanish. then aℓ+2 and all subsequent even coefﬁcients vanish. . The other coeﬃcients follow from the recurrence relation. . a0 or a1 is chosen) such that Pℓ (1) = 1. . Notes: • an even solution is obtained by choosing a0 = 1 and a1 = 0 • an odd solution is obtained by choosing a0 = 0 and a1 = 1 • these solutions are obviously linearly independent since one is not a constant multiple of the other • since |an+2 /an | → 1 as n → ∞. . . . . a0 = 0 i. . . .e. . . P0 (x) = 1. .

7.4.3 Series solutions about a regular singular point

If x = x_0 is a regular singular point, Fuchs's theorem guarantees that the ODE has at least one solution of the form

    y = Σ_{n=0}^∞ a_n (x − x_0)^{n+σ},   a_0 ≠ 0

i.e. a Taylor series multiplied by a power (x − x_0)^σ, where the index σ is a (generally complex) number to be determined.

Notes:
• this is a Taylor series only if σ is a non-negative integer
• there may be one or two solutions of this form (see below)
• the condition a_0 ≠ 0 is required to define σ uniquely

To understand in simple terms why the solutions behave in this way, recall that

    P(x) ≡ (x − x_0) p(x) = Σ_{n=0}^∞ P_n (x − x_0)^n,   Q(x) ≡ (x − x_0)^2 q(x) = Σ_{n=0}^∞ Q_n (x − x_0)^n

are analytic at the regular singular point x = x_0. Near x = x_0 the ODE can be approximated using the leading approximations to p and q:

    y″ + P_0 y′ / (x − x_0) + Q_0 y / (x − x_0)^2 ≈ 0

The exact solutions of this approximate equation are of the form y = (x − x_0)^σ, where σ satisfies the indicial equation

    σ(σ − 1) + P_0 σ + Q_0 = 0

with two (generally complex) roots. [If the roots are equal, the solutions are (x − x_0)^σ and (x − x_0)^σ ln(x − x_0).] It is reasonable that the solutions of the full ODE should resemble the solutions of the approximate ODE near the singular point.

Frobenius's method is used to find the series solutions about a regular singular point. This is best demonstrated by example.

Example. Find series solutions about x = 0 of Bessel's equation

    x^2 y″ + x y′ + (x^2 − ν^2) y = 0

where ν is a constant. x = 0 is a regular singular point because p = 1/x and q = 1 − ν^2/x^2 are singular there, but xp = 1 and x^2 q = x^2 − ν^2 are both analytic there. Seek a solution of the form

    y = Σ_{n=0}^∞ a_n x^{n+σ},   a_0 ≠ 0

Then

    y′ = Σ_{n=0}^∞ (n+σ) a_n x^{n+σ−1},   y″ = Σ_{n=0}^∞ (n+σ)(n+σ−1) a_n x^{n+σ−2}

Bessel's equation requires

    Σ_{n=0}^∞ [(n+σ)(n+σ−1) + (n+σ) − ν^2] a_n x^{n+σ} + Σ_{n=0}^∞ a_n x^{n+σ+2} = 0

Relabel the second series:

    Σ_{n=0}^∞ [(n+σ)^2 − ν^2] a_n x^{n+σ} + Σ_{n=2}^∞ a_{n−2} x^{n+σ} = 0

Now compare powers of x^{n+σ}:

    n = 0:   (σ^2 − ν^2) a_0 = 0            (1)
    n = 1:   [(1+σ)^2 − ν^2] a_1 = 0        (2)
    n ≥ 2:   [(n+σ)^2 − ν^2] a_n + a_{n−2} = 0    (3)

Since a_0 ≠ 0 by assumption, equation (1) provides the indicial equation

    σ^2 − ν^2 = 0

with roots σ = ±ν. Equation (2) then requires that a_1 = 0 (except possibly in the case ν = ±1/2, but even then a_1 can be chosen to be 0). Equation (3) provides the recurrence relation

    a_n = − a_{n−2} / [n(n + 2σ)]

Since a_1 = 0, all the odd coefficients vanish. Since |a_n/a_{n−2}| → 0 as n → ∞, the radius of convergence of the series is infinite.
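The same kind of computational sketch applies to this recurrence (the function name bessel_frobenius is ours, for illustration only). Note how it fails, by division by zero, precisely when n + 2σ = 0, which is relevant to the discussion that follows:

    from fractions import Fraction

    def bessel_frobenius(sigma, N):
        # a_0..a_N from a_n = -a_{n-2}/(n(n + 2 sigma)), a_0 = 1, a_1 = 0.
        # Raises ZeroDivisionError if n + 2 sigma = 0 for some n >= 2,
        # i.e. for the smaller root sigma = -nu when the roots differ by
        # an integer. (For half-integer nu this is a removable 0/0,
        # since the odd coefficients all vanish; see the text below.)
        a = [Fraction(1), Fraction(0)]
        for n in range(2, N + 1):
            a.append(-a[n - 2] / (n * (n + 2 * sigma)))
        return a

    # sigma = nu = 1: bessel_frobenius(1, 5) -> [1, 0, -1/8, 0, 1/192, 0],
    # i.e. y1 = x - x^3/8 + x^5/192 - ...
    # sigma = -1 fails at n = 2, where n + 2 sigma = 0.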

For most values of ν we therefore obtain two linearly independent solutions (choosing a_0 = 1):

    y_1 = x^ν [1 − x^2/(4(1+ν)) + x^4/(32(1+ν)(2+ν)) + ···]

    y_2 = x^{−ν} [1 − x^2/(4(1−ν)) + x^4/(32(1−ν)(2−ν)) + ···]

However, if ν = 0 there is clearly only one solution of this form. Furthermore, if ν is a non-zero integer one of the recurrence relations will fail at some point and the corresponding series is invalid. In these cases the second solution is of a different form (see below).

A general analysis shows that:
• if the roots of the indicial equation are equal, there is only one solution of the form Σ a_n (x − x_0)^{n+σ}
• if the roots differ by an integer, there is generally only one solution of this form, because the recurrence relation for the smaller value of Re(σ) will usually (but not always) fail
• otherwise, there are two solutions of the form Σ a_n (x − x_0)^{n+σ}
• the radius of convergence of the series is the distance from the point of expansion to the nearest singular point of the ODE

If the roots σ_1, σ_2 of the indicial equation are equal or differ by an integer, with Re(σ_1) ≥ Re(σ_2), one solution is of the form

    y_1 = Σ_{n=0}^∞ a_n (x − x_0)^{n+σ_1}

and the other is of the form

    y_2 = Σ_{n=0}^∞ b_n (x − x_0)^{n+σ_2} + c y_1 ln(x − x_0)

The coefficients b_n and c can be determined (with some difficulty) by substituting this form into the ODE and comparing coefficients of (x − x_0)^n and (x − x_0)^n ln(x − x_0). In exceptional cases c may vanish. Alternatively, y_2 can be found (also with some difficulty) using the Wronskian method (section 7.4).

Example: Bessel's equation of order ν = 0:

    y_1 = 1 − x^2/4 + x^4/64 + ···

    y_2 = y_1 ln x + x^2/4 − 3x^4/128 + ···

Example: Bessel's equation of order ν = 1:

    y_1 = x − x^3/8 + x^5/192 + ···

    y_2 = y_1 ln x − 2/x + 3x^3/32 + ···

These examples illustrate a feature that is commonly encountered in scientific applications: one solution is regular (i.e. analytic) and the other is singular. Usually only the regular solution is an acceptable solution of the scientific problem.

7.4.4 Irregular singular points

If either (x − x_0) p(x) or (x − x_0)^2 q(x) is not analytic at the point x = x_0, it is an irregular singular point of the ODE. The solutions can have worse kinds of singular behaviour there.

Example: the equation x^4 y″ + 2x^3 y′ − y = 0 has an irregular singular point at x = 0. Its solutions are exp(±x^{−1}), both of which have an essential singularity at x = 0. In fact this example is just the familiar equation d^2 y/dz^2 = y with the substitution x = 1/z. Even this simple ODE has an irregular singular point at z = ∞.
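This example can be checked symbolically; the following snippet, using the sympy library, is an illustration added here rather than part of the worked example:

    import sympy as sp

    x = sp.symbols('x')
    for s in (1, -1):
        y = sp.exp(s / x)                              # candidate solution exp(+/- 1/x)
        residual = x**4 * y.diff(x, 2) + 2 * x**3 * y.diff(x) - y
        print(sp.simplify(residual))                   # -> 0 for both signs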
